ir.amu.ac.in/11964/1/t10316.pdf · 2018-11-22

Acknowledgement

"He who does not thank people, does not thank ALLAH" – Prophet Muhammad (peace and blessings be upon him).

Unending glory is due to the Almighty, the Exalted, who granted me the inspiration and stamina to complete this humble work. This small contribution, if just and correct, is only a drop of appreciation for His ocean of munificence; without His blessings the work would not have seen the light of day.

Completing a research work is never a ‘one man show’ but the collective effort of many well-wishers. This thesis has been very exciting and challenging for me, and I have been accompanied by a great number of people whose contributions deserve to be acknowledged. As Dwight Frindt has said, “Acknowledgment and celebration are essential to fuelling passion making people feel valid and valuable and giving the team real sense of progress and makes it all worthwhile.”

There are no proper words to convey my deep gratitude and respect for my thesis supervisor and mentor, Prof. Jamshed Siddiqui, Professor and Chairman, Department of Computer Science. I can never fully express my thanks for the kind support he has provided throughout my Ph.D. I admire his courage, respect his decisions, learn from his knowledge and revere his personality. The thesis would not have been completed successfully without his kind support and sincere attention.

I am very much indebted to my co-supervisor, Dr. Rashid Ali, Associate Professor, Department of Computer Engineering, Aligarh Muslim University, for inspiring me to become an independent researcher and helping me realize the power of critical reasoning. He also demonstrated what a brilliant and hard-working researcher can accomplish. His advice on my research work has been invaluable.

My supervisors taught me not only to be a good scholar but also to be a good person in life. I am truly grateful to them for giving the Midas touch to my thesis.

I express my sincere gratitude to Prof. Mohammad Ubaidullah Bokhari and Mr. Suhel Mustajab, ex-Chairmen, Department of Computer Science, for providing me with all the necessary research facilities in the Department. I would like to convey my heartfelt thanks to the faculty members of the Department, Prof. Rafiqul Zaman Khan, Mr. S. Maheshwari, Ms. P. Bala, Dr. Asim Zafar, Dr. Tamanna Siddiqui, Mr. Shahid Masood, Dr. Arman Rasool Faridi, Ms. Sehba Masood, Mr. Faisal Anwer, Dr. Swaleha, Dr. Sajid, and especially Dr. Mohammad Nadeem, for their endorsements. I also express my deepest gratitude to the staff of our lab.

I am immensely grateful to all members of the research lab of the Department: Dr. Nazir Ahmad, Dr. Suby Khanam, Dr. Yahya, Dr. Haider Khalaf Jabbar, Mr. Oqail Ahmad, Kashif, Riyaz, Ausaf, Parvej, Rizwan, Shabbir, and Mr. Mayank Srivastava. They generously gave their time to offer me valuable comments toward improving my work. In particular, Dr. Zaki Ahmad Khan and Mr. Faraz Hasan inspired me with their innovative abilities.

It is a delight to thank my seniors, Dr. Shadab Alam, Dr. Shamsh Tabrez Siddiqui, Dr. Javed Ali, and Dr. Hattem, for their valuable advice.

It would be an injustice not to remember my dear friends at this juncture for being with me in all my good and bad times during my long stay at the Aligarh Muslim University. I feel very blessed to count among them Dr. Kamran Ahsan, Dr. Tanweer Khan, Dr. Tariq Sheikh, Dr. Imran Hussain, Mr. Humayun, Ahmad Danish, Faheemuddin Malik, Fahim Akhtar, Saifullah, Shamsuddin and Hasnain. I have been residing in the halls of residence for the last 13 years and have been loved equally by the residents of the respective hostels: Mumtaz, Morison, Aftab and Mac Donnell. My juniors at the hostels and the Department are also worth mentioning here. To name a few: Shuez, Faizan, Nafees, Faisal, Anas, Aziz, Zeyaullah, Farhan, Huzaifa, Ahmar, Abdul Salam, Rashid, Abuzar, Adil, Haris, and my room partner Shanu.

Apart from the above, I would like to extend my heartfelt thanks to my childhood friends Adnan Arif, Syed Tanweer, Shahab Akhtar and Meraj. I am also thankful to Mobashir and Shadab for their technical support, which helped me in accomplishing my research work.

I deeply thank the faculty members of other departments of our University for their support and kind attention in my time of need. In particular, Prof. Sufyan Beg helped me a great deal, and I must mention his full support during my visit to UC Berkeley to present my paper at an international conference. Dr. Faiza Abbasi encouraged me and provided various platforms to develop my personality. Dr. Sagheeruddin, Prof. Abdul Mannan, Dr. Obaidullah Khan and Professor Javed Arif have always remained a source of encouragement and stimulation for me.

Words cannot repay their kind attention, care, guidance and love. I wish long life and health to the sources of my spiritual strength and inspiration, Prof. Sanaullah Khan and Prof. Nadir Ali Khan, who are a light for me in the dark night.

I am deeply grateful to the University Grants Commission, New Delhi, for providing me financial assistance in the form of “MANF” during my research, and to the Vice Chancellor of the Aligarh Muslim University, who worked extraordinarily hard to maintain peace on the campus and provided a more competitive and educational environment in the University.

I offer my immense gratitude to my family members, whose Dua has always remained a source of courage and inspiration for me, especially my elder brother Asif Sohail, who has sacrificed a great deal for this degree. I also thank my brothers, Mr. Akif, Mr. Absar, Mr. Aehtasham, Jami and Sharjeel; my brothers-in-law, Mr. Fakhre Alam and Nizamuddin; and my loving sisters and sisters-in-law for their love, care and Dua.

Lastly, I would like to offer my sincere thanks to the most influential person in my life, my mother, whose constant encouragement and support acted as an impetus for working hard and completing the work with sincerity. No words can express my gratitude for her love, affection and patience. She always stood by my side, had faith in my work and always prayed for my success.

Shahab Saquib Sohail

Abstract

The proliferation of the Internet has attracted the masses with its wide range of applications, and people have assimilated it into their lives. The strong inclination of people towards using the Internet and other technologies for daily activities such as online shopping has created the problem of data overload. The growing volume of data makes it difficult for users to select the exact item of their preference from the large number of available products. Recommender Systems (RS) come into the picture to provide users with personalized, best-suited items, saving the time spent searching for the desired product and reducing the complexity of using modern tools.

RS emerged as a field in their own right in the mid-1990s; before that, these systems were treated as an Information Retrieval (IR) approach. Collaborative Filtering (CF) is the most widely used recommendation technique. The other well-known techniques used for developing recommender systems are mainly Content Based (CB) or Reclusive Methods (RM), the Knowledge Based approach (KB), Demographic Filtering (DF), the Hybrid Approach (HA) and the Context Aware approach for Recommender Systems (CARS).

These techniques have been used by researchers, and research on RS has increased significantly in recent years. The collaborative filtering (CF) based recommendation approach derives recommendations from other customers whose choices are similar to those of the target customer (i.e. the customer for whom the recommendation is made). Unlike collaborative filtering, the reclusive approach finds similarities between items without any collaboration among users. Recommender systems based on demographic filtering also use similarity measures, but instead of finding items rated similarly by neighboring users, they look for similarity between users’ demographic information such as age, sex and occupation.
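
As a rough illustration of the user-based collaborative filtering idea described above, the following minimal Python sketch scores unseen items for a target user by the similarity-weighted ratings of other users. The toy ratings, the item names, the choice of cosine similarity and the function names are all assumptions made for illustration; they are not taken from the thesis.

```python
from math import sqrt

# Hypothetical toy ratings (user -> {item: rating}); not data from the thesis.
ratings = {
    "target": {"book_a": 5, "book_b": 3},
    "u1": {"book_a": 5, "book_b": 3, "book_c": 4},
    "u2": {"book_a": 1, "book_b": 5, "book_d": 5},
}


def cosine_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)


def recommend(target, ratings, top_n=1):
    """Rank items the target has not rated by the sum of
    (similarity to the rating user) * (that user's rating)."""
    seen = ratings[target]
    scores = {}
    for user, items in ratings.items():
        if user == target:
            continue
        sim = cosine_similarity(seen, items)
        for item, rating in items.items():
            if item not in seen:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# The neighbour whose tastes match the target more closely (u1) dominates.
print(recommend("target", ratings))
```

The same similarity function could, under the same assumptions, be applied to numeric vectors of demographic attributes instead of ratings, which is the substitution that demographic filtering makes.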

The knowledge based system has been seen as a better substitute for the approaches discussed above. What differentiates knowledge based systems from other systems is the degree of importance they give to the following two domains:

a) the user’s requirements

b) the characteristics of the recommended items.

These two areas of expertise help achieve user satisfaction by fulfilling users’ needs. An approach for building a recommender system that needs either an explicitly defined set of recommendation rules or some sort of similarity measure derived from the users’ prior purchase history is regarded as a knowledge based approach.

There are several issues with the existing approaches. One of the most severe is the cold start problem. We propose a solution to the cold start problem based on a consensus ranking of the items that suit the majority of the group to which the user belongs. For this approach, however, we need to know which group of similar users the user belongs to. The approach may save time and ease the complexity involved in recommendation. In this work, we consider the problem of book recommendation for computer science graduate students from an Indian perspective. Finding each user’s preferences and providing personalized recommendations to everyone is time-consuming and requires extra effort, and the cold start issue always remains a threat. As a solution, all graduate students of Indian universities are treated as members of the same group. The top N books across these universities are obtained by observing a) what the best universities recommend and b) what opinions the students hold about these books. By finding the best books with the approaches tested in this work, we may provide good recommendations to a large number of users without running into the unknown prior preference (UPP) problem. The results of the suggested approaches are discussed in this work. The proposed approaches are compared on the basis of 8 different parameters: P@10, FPR@10, FNR@10, Mean Average Precision (MAP), Mean Reciprocal Rank (MRR), Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and the Spearman rank correlation coefficient. The proposed Opinion Mining Technique performs well and produces better results for each parameter.
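
For readers unfamiliar with the evaluation parameters named above, the sketch below shows the standard textbook definitions of several of them in Python. It is only an illustration under stated assumptions: the function names and toy rankings are hypothetical, the Spearman formula assumes rankings without ties, and FPR@10, FNR@10 and the thesis's modified Spearman coefficient are omitted because their exact definitions are not given in this section. MRR and MAP are obtained by averaging the per-list reciprocal rank and average precision over several lists.

```python
from math import sqrt


def precision_at_k(ranked, relevant, k=10):
    """Fraction of the top-k recommended items that are relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k


def reciprocal_rank(ranked, relevant):
    """1 / position of the first relevant item, or 0.0 if none appears."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def mean_absolute_error(predicted, actual):
    """Average absolute difference between paired values."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def root_mean_square_error(predicted, actual):
    """Square root of the average squared difference between paired values."""
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


def spearman_rho(rank_x, rank_y):
    """Spearman rank correlation for two tie-free rankings of the same n items."""
    n = len(rank_x)
    d_squared = sum((x - y) ** 2 for x, y in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical example: a system ranking of five books against an expert ranking.
system_ranks = [1, 2, 3, 4, 5]
expert_ranks = [2, 1, 3, 5, 4]
ranked_books = ["b2", "b1", "b3", "b4", "b5"]
relevant_books = {"b1", "b3"}

print(precision_at_k(ranked_books, relevant_books, k=5))
print(reciprocal_rank(ranked_books, relevant_books))
print(mean_absolute_error(system_ranks, expert_ranks))
print(root_mean_square_error(system_ranks, expert_ranks))
print(spearman_rho(system_ranks, expert_ranks))
```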

TABLE OF CONTENTS

Certificates and Declarations
Dedication
Acknowledgement
Abstract
Table of Contents
List of Tables
List of Figures
List of Abbreviations

Chapter 1: Introduction
1.1 Recommender Systems: An Introduction
1.2 Application of Recommender Systems
1.3 Problems and Issues in Recommendation Approaches
1.3.1 Cold Start
1.3.2 Missing of Absolute Ranking
1.3.3 Personalization for Community Recommendation
1.4 Web Mining Techniques
1.4.1 Web Mining Techniques as a Solution to the Existing Problems of Recommender Systems
1.5 Organization of the Thesis

Chapter 2: An Overview of Recommender Systems
2.1 Introduction
2.2 Previous Review Studies
2.3 Types of Recommender Systems
2.3.1 Collaborative Filtering based Recommender Systems
2.3.1.1 Item based and User based CF techniques
2.3.1.1.1 Association rule mining between preferences of neighbor of users
2.3.1.1.2 Rating based recommendation
2.3.1.1.3 Choice based recommendation
2.3.1.1.4 Recommendation based on similarity in the users’ preferences for common items
2.3.1.1.5 Tagging based recommendation
2.3.1.2 Model based CF techniques
2.3.2 Reclusive Methods based Recommender Systems
2.3.2.1 Heuristic based Reclusive Recommendation
2.3.2.2 Model based Reclusive Recommendation
2.3.2.3 Web Mining based Recommendation
2.3.3 Demographic Filtering based Recommender Systems
2.3.4 Knowledge based Recommender Systems
2.3.4.1 Case-based Recommendation
2.3.4.2 Constraint based Recommendation
2.3.5 Hybrid Recommender Systems
2.3.5.1 Hybrid Recommender Systems based on Collaborative Filtering dominated Reclusive Method
2.3.5.2 Hybrid Recommender Systems based on Reclusive Method dominated Collaborative Filtering Techniques
2.3.5.3 Hybrid Recommender Systems based on unified Reclusive Method and Collaborative Filtering Techniques
2.3.5.4 Hybrid Recommender Systems based on Subsequent Integration of separately applied Collaborative Filtering Techniques and Reclusive Method
2.3.5.5 Hybrid Recommender Systems based on Integration of Collaborative Filtering and Reclusive Method with Knowledge based System
2.3.5.6 Other Hybrid Recommender Systems using Collaborative Filtering Techniques
2.3.5.7 Other Hybrid Recommender Systems using Reclusive Method
2.3.6 Context Aware Recommender Systems
2.3.7 Social Network based Recommender Systems
2.3.8 Soft Computing Techniques based Recommender Systems
2.4 Summary

Chapter 3: Link Mining based Book Recommendation Approach
3.1 Introduction
3.2 Book Recommendation using Positional Aggregation based Scoring Technique
3.2.1 Positional Aggregation based Scoring Technique
3.2.2 Book Recommendation Approach using Positional Aggregation Scoring
3.3 Results and Discussions
3.3.1 Dataset
3.3.1.1 Selection of top Universities
3.3.1.2 Courses included from top Universities
3.3.1.3 Prescribed books by top Universities
3.3.2 Experimental Results
3.4 Summary

Chapter 4: Book Recommendation based on Soft Computing Approaches
4.1 Introduction
4.2 Ordered Weighted Aggregation
4.3 Book Recommendation based on Ordered Weighted Aggregation
4.4 Book Recommendation Approach using Ordered Ranked Weighted Aggregation
4.4.1 Ordered Ranked Weighted Aggregation
4.4.2 Book Recommendation based on Ordered Ranked Weighted Aggregation
4.5 Results and Discussions
4.5.1 Dataset
4.5.2 Experimental Results
4.6 Summary

Chapter 5: Feature based Opinion Mining Approaches for Book Recommendation
5.1 Introduction
5.2 Customer Reviews
5.2.1 Issues while handling Online Reviews
5.3 Feature Extraction and Selection
5.4 Scoring Technique for Extracted Feature
5.4.1 Opinion Score Calculation
5.4.1.1 Positive words
5.4.1.2 Negative words
5.4.1.3 Reciprocal terms
5.4.1.4 Highly expressible words
5.4.2 Weight Assignment to Features
5.5 Results and Discussions
5.6 Summary

Chapter 6: Evaluation of Recommender Systems
6.1 Introduction
6.2 Previous Evaluation Studies
6.3 Evaluation Metrics
6.3.1 P@10
6.3.2 FPR@10
6.3.3 FNR@10
6.3.4 Mean Average Precision
6.3.5 Mean Absolute Error
6.3.6 Mean Reciprocal Rank
6.3.7 Root Mean Square Error
6.3.8 Spearman rank Correlation Coefficient
6.3.9 Modified Spearman rank Correlation Coefficient
6.4 Evaluation based on Experts’ Ranking using Explicit Feedback
6.4.1 Evaluation Results based on Different Evaluation Metrics
6.4.1.1 Evaluation Results using Explicit Feedback based on Root Mean Square Error
6.4.1.2 Evaluation Results using Explicit Feedback based on Mean Absolute Error
6.4.1.3 Evaluation Results using Explicit Feedback based on P@10
6.4.1.4 Evaluation Results using Explicit Feedback based on Mean Average Precision
6.4.1.5 Evaluation Results using Explicit Feedback based on FPR@10
6.4.1.6 Evaluation Results using Explicit Feedback based on FNR@10
6.4.1.7 Evaluation Results using Explicit Feedback based on Modified Spearman Rank Correlation Coefficient
6.4.1.8 Evaluation Results using Explicit Feedback based on Mean Reciprocal Rank
6.4.2 Comprehensive Evaluation Measure
6.5 Evaluation based on Implicit User Feedback
6.6 Architecture for Evaluation Scheme based on Implicit Feedback
6.6.1 Vector Component of User feedback
6.6.2 User Feedback based Scoring of Products
6.6.3 User’s Sincerity Measure
6.6.4 Product Preference Score
6.6.5 User Personalized Ranking
6.7 Results and Discussions
6.7.1 Mean Reciprocal Rank obtained using Comprehensive Approach
6.7.2 Precision@10 obtained using Comprehensive Approach
6.7.3 Mean Average Precision obtained using Comprehensive Approach
6.7.4 FPR@10 obtained using Comprehensive Approach
6.7.5 FNR@10 obtained using Comprehensive Approach
6.7.6 Spearman Correlation value using Comprehensive Approach
6.8 Relative Performance of the Recommender System using Proposed Comprehensive Approach and other Existing Evaluation Approaches
6.8.1 Comparison of Proposed Comprehensive Approach with Existing Evaluation Strategies
6.9 Summary

Chapter 7: Conclusion and Future Direction
7.1 Introduction
7.2 Conclusion
7.3 Future Directions

References

List of Publications

LIST OF TABLES

Table 1. 1: A summary of results contained in this thesis .................................................................. 10 Table 2. 1:A glance of the review studies on Recommender Systems ............................................. 14 Table 2. 2 : Collaborative Approach illustration.................................................................................... 16 Table 2. 3: Recommender Systems, Categories and Techniques ....................................................... 50 Table 3. 1: Top 4 ranked books by 5 universities .................................................................................. 58 Table 3. 2: : Conversion of Rank into Scores ........................................................................................... 58 Table 3. 3: : Pairwise comparison of books .............................................................................................. 59 Table 3. 4: Normalized preference score of books ................................................................................ 59 Table 3. 5: Positional Aggregated scores of books ................................................................................. 62 Table 3. 6: Preference score of books......................................................................................................... 62 Table 3. 7: Ranked books based on Positional Aggregation Scoring technique .......................... 62 Table 3. 8: Top 7 Indian Universities in QS ranking [244] ................................................................ 64 Table 3. 9: Syllabus of various Courses, offered at top Universities. ............................................... 65 Table 3. 10: Total number of books in the syllabus of corresponding courses in top

Universities .............................................................................................................................................. 66 Table 3. 11: Code and details for books on Compiler Design ........................................................... 67 Table 3. 12: Ranked list of book ‘compiler design’ by top universities .......................................... 68 Table 3. 13: Compiler design ranked books by top 7 Universities .................................................. 68 Table 3. 14: Rank to Score conversion of book Compiler Design .................................................... 69 Table 3. 15: Positional Score for book Compiler Design ..................................................................... 69 Table 3. 16: Ranking of book ‘compiler design’ using Positional Aggregation Scoring ........... 69 Table 3. 17: PAS based Ranking of different books ............................................................................... 70 Table 4. 1: Ranked books using relative quantifier most .................................................................... 76 Table 4. 2: Ranked books using relative quantifier As many as possible ....................................... 77 Table 4. 3: Ranked books using relative quantifier At least half ...................................................... 77 Table 4. 4: Ranked books based on Ordered Ranked Weighted Aggregation technique for

example 3.1 ............................................................................................................................................. 81 Table 4. 5: Code and details for books on Compiler Design .............................................................. 82 Table 4. 6: Ranked list of book ‘compiler design’ by top universities ............................................. 83 Table 4. 7: Compiler design ranked books by top 7 Universities ..................................................... 84 Table 4. 8: Rank to Score conversion of book Compiler Design ....................................................... 84

xiii

Table 4.9: Positional Score for book Compiler Design
Table 4.10: Score obtained by recommendation approaches for compiler design
Table 4.11: Five different ranking of book ‘compiler design’
Table 4.12: Five different ranking of book ‘Discrete Mathematics’
Table 4.13: Five different ranking of book ‘Artificial Intelligence’
Table 4.14: Five different ranking of book ‘Data Structure’
Table 4.15: Five different ranking of book ‘Principal of Data Base’
Table 4.16: Five different ranking of book ‘Computer Graphics’
Table 4.17: Five different ranking of book ‘Software Engineering’
Table 4.18: Five different ranking of book ‘Operating System’
Table 4.19: Five different ranking of book ‘Computer Network’
Table 4.20: Five different ranking of book ‘Theory of Computation’
Table 5.1: Features and related review terms
Table 5.2: Precision of extracted features
Table 5.3: weights distribution of features
Table 5.4: Score calculation example
Table 5.5: Example of Final Score Calculation
Table 5.6: Top 10 ranked books of all the courses using Opinion Mining Technique
Table 6.1: Root Mean Square Error of all books by different approaches
Table 6.2: Mean Absolute Error of all books for different approaches
Table 6.3: P@10 for all approaches
Table 6.4: Mean Average Precision of different approaches.
Table 6.5: FPR@10 for all techniques.
Table 6.6: FNR@10 of all books
Table 6.7: Modified Spearman Rank Correlation Coefficient by different approaches
Table 6.8: Mean Reciprocal Rank of all techniques for different Courses
Table 6.9: Final values of parameters used to find error
Table 6.10: Final values of parameters used to find precisions and correlation
Table 6.11: Comprehensive evaluation measure
Table 6.12: Illustration for the calculation of Normalized Products Importance Score ‘δ’ for user 1.
Table 6.13: Correlation values of different products of Laptop
Table 6.14: Correlation values of different products of Printer
Table 6.15: Correlation values of different products of Head Phone
Table 6.16: Correlation values of different products of Tablet


Table 6.17: Correlation values of different products of Smart Phone
Table 6.18: List of users which are excluded after user’s sincerity analysis
Table 6.19: Criteria of preference for a product to be preferred by a user
Table 6.20: Normalized Products Importance Score for Laptop
Table 6.21: Ranking of laptop by different users based on product preference score
Table 6.22: Mean Reciprocal Rank of first ranked product of different items
Table 6.23: values of precision at k, for different items
Table 6.24: Mean Average Precision for different products
Table 6.25: values of FPR@10 for different products
Table 6.26: Avg. FPR@5 and Avg. FPR@10 for all the items
Table 6.27: Values of FNR@10 for different products
Table 6.28: Avg. FNR@5 and Avg. FNR@10 for all the items
Table 6.29: Spearman correlation coefficient for different products
Table 6.30: Mean Reciprocal Rank, P@5, Mean Average Precision and Spearman Correlation Coefficient for different approaches
Table 6.31: FPR@5 and FNR@5 for different approaches
Table 6.32: Comparison of proposed Comprehensive Approach with existing evaluation strategies


LIST OF FIGURES

Figure 2.1: Collaborative Filtering Approach
Figure 2.2: Reclusive Approach for Recommendation
Figure 2.3: Demographic Filtering based Recommendation Approach
Figure 2.4: Context Aware Recommender Systems Overview
Figure 2.5: Example for Context Aware Recommender Systems using season based clothes
Figure 3.1: An overview of link mining approach
Figure 3.2: Positional Aggregation Scoring based Book Recommendation System
Figure 4.1: Most Quantifier
Figure 4.2: As many as possible quantifier
Figure 4.3: At least half quantifier
Figure 4.4: Ordered Ranked Weighted Aggregation based Book Recommendation System
Figure 5.1: Demonstration of a review in Spanish
Figure 5.2: Demonstration of a review in Russian
Figure 5.3: Demonstration of a review in Portuguese
Figure 5.4: Demonstration of a review in Greek
Figure 5.5: Screenshot displaying no reviews
Figure 5.6: Architecture of Book Recommendation using meta searching
Figure 5.7: Screenshot of Search Engine Result Page for books on Artificial Intelligence
Figure 5.8: Customer review expressing the views about content
Figure 5.9: Review example of ‘understandability’ feature.
Figure 5.10: Review representing importance of physical attributes
Figure 5.11: Review representing importance of Price
Figure 5.12: Precision of Extracted Features
Figure 6.1: Block diagram for Evaluation of Book Recommendation Approaches
Figure 6.2: Average Root Mean Square Error for all techniques
Figure 6.3: Average of Mean Absolute Error of all the books for different quantifiers
Figure 6.4: Average P@10 for all books
Figure 6.5: Mean Average Precision of different approaches
Figure 6.6: Average FPR@10 for all books using different book recommender approaches
Figure 6.7: Average FPR@10 for all books using different book recommender approaches
Figure 6.8: Average of Modified Spearman rank correlation coefficient
Figure 6.9: Average Mean Reciprocal Rank of all the books for different techniques


Figure 6.10: Block diagram of implicit User Feedback based Evaluation of Recommender Systems
Figure 6.11: Mean Reciprocal Rank of top rank-position for respective items using Comprehensive Approach.
Figure 6.12: P@k for different items using Comprehensive Approach
Figure 6.13: Mean Average Precision using Comprehensive Approach
Figure 6.14: Average FPR@5 using Comprehensive Approach
Figure 6.15: Average FPR@10 using Comprehensive Approach
Figure 6.16: Average FNR@5 using Comprehensive Approach
Figure 6.17: Average FNR@10 using Comprehensive Approach
Figure 6.18: Spearman correlation coefficient between system ranking and Comprehensive Approach based ranking
Figure 6.19: Comprehensive Veracity Measure of different approaches
Figure 6.20: Average FPR@5 for all items
Figure 6.21: Average FNR@5 for all items


LIST OF ABBREVIATIONS

Acronym Full form

CARS Context Aware Recommender Systems

CB Content Based

CEM Comprehensive Evaluation Measure

CF Collaborative Filtering

DF Demographic Filtering

FNR False Negative Rate

FPR False Positive Rate

HA Hybrid Approach

HRS Hybrid Recommender Systems

KBS Knowledge based Systems

MAE Mean Absolute Error

MAP Mean Average Precision

OMT Opinion Mining Technique

ORWA Ordered Ranked Weighted Aggregation

OWA Ordered Weighted Aggregation

P@k Precision at top K position

PAS Positional Aggregation based Scoring

RMSE Root Mean Square Error

SMRCC Spearman Rank Correlation Coefficient


Chapter 1

Introduction

1.1 Recommender Systems: An Introduction

"Necessity is the mother of invention," the famous proverb is practically experienced

in daily life as technology advances. New technologies give rise to modern tools and techniques that fulfill our daily needs. Today a huge number of people use the Internet. In developed countries such as Germany and the U.K., approximately 83% of the population uses the Internet, whereas China makes the largest single contribution to the world's Internet users, at 22.4%. In the USA, 78.1% of the population uses the Internet, which amounts to 10.2% of all users in the world [1]. This accelerated growth in Internet use in recent years has changed the way people live, think and work.

With changing trends in technology, the daily life of an individual has also changed at a very fast pace. People increasingly prefer online shopping for their needs. To make online shopping easy and reliable, many product recommendation techniques have been proposed by researchers in recent decades [2], [3], [4]. Recommender systems (RS) try to identify the needs and preferences of users, filter the huge collection of data accordingly, and present the best-suited options to the users by means of some well-defined mechanism.

There are several well-known and frequently used techniques for recommending products, including Collaborative Filtering (CF), Content Based (CB) or Reclusive Methods (RM), the Knowledge Based approach (KB), Demographic Filtering (DF), the Hybrid Approach (HA) and the Context Aware approach for Recommender Systems (CARS).

1.2 Application of Recommender Systems

The popularity of recommender systems is evident from the fast and vast development of these systems for various applications. At present, when researchers deal with big data, the prime focus of recommender systems research is the study of applications. The application areas of RS have spread over various domains of daily life, which include:

Academics

Business

Entertainment

Health and care

Sports

In academics, RS have been widely used for book recommendations [5], [6], [7], institute recommendations [8], research paper recommendations [9], [10], conference recommendations [11] and course recommendations [12], [13]. Recommender systems have gained much popularity as a means of entertainment owing to early music and movie recommendations [14], [15], [16]. The sub-domains of entertainment [17], [18] where RS have been used include music, movies, TV programs and tourism [19]–[21].

The application of recommender systems to health has grown rapidly owing to the increasing demand for health information systems. Various health-related recommendation methods have been proposed [22], [23], [24], [25], [26].

Numerous recommender systems have been developed for various e-commerce applications. These application-oriented RS help business users to obtain information about products and services. Online shopping, interchangeably termed e-shopping, has achieved a high rate of growth in recent years. Many portals have been designed for e-commerce applications based on e-shopping [27], [28], [29], [30].

1.3 Problems and Issues in Recommendation Approaches

The different techniques used in designing RS have their own advantages and limitations. Collaborative filtering, the reclusive approach and demographic filtering are basically learning-based techniques, and all of them exhibit the cold start problem in some form. In the following sections we explain the major issues with these leading recommender system technologies, illustrated by suitable examples, along with other issues that exist in the remaining techniques.



1.3.1 Cold Start

The cold start problem occurs for new users as well as for new items. We call it the unknown prior preference (UPP) problem. Although the reclusive approach does not need other users' preferences and purchase details, it can recommend the best match to a user's preferences only if it knows what that user has rated previously and how. The problem arises when a new user comes to shop and the system, lacking any past experience of that user, fails to recommend an item that matches his or her choice: the reclusive approach cannot identify what to recommend. Similarly, if a user who has been purchasing a few specific types of products from a merchandiser, say clothes, starts to look for electronic gadgets, the reclusive approach alone is unable to make a good recommendation.

The collaborative approach needs a good number of ratings from the neighbors of a user; it also requires a good number of ratings to identify neighboring users. Therefore, newly launched items and items that do not have many ratings receive weaker recommendations, however well they may suit the user. For example, a person who lives in a cool place, such as Paris or New York, would usually choose warm fabrics when shopping for clothes. If that person moves to a relatively hot place, say Chennai or Mumbai (India), for business or tourism, it would be difficult to recommend cotton clothes to the user, because the established profile has high ratings for warm clothes only. Thus the UPP problem persists in the CF approach. In the same way, the CF approach cannot make good recommendations for a new user. The same issue arises for demographic filtering based recommendation.

Researchers have suggested the knowledge based approach as a solution to the cold start problem. In the knowledge based approach, a knowledge engineer learns the user's immediate requirements and matches them to product features without any historical data about the product or the users. The knowledge based approach does not rely upon any prior information about users. However, it is difficult to find dedicated users who are willing to devote enough time for the system to learn their preferences.

1.3.2 Missing of Absolute Ranking

Usually, the CF approach needs ratings. Merchandisers and users have their own rating scales and their own perception and understanding of those scales. Hence there is no standard rating parameter, and the lack of a rating standard badly affects recommendation. Some sites use a 5-point rating scale whereas others use a 10-point scale, so considering only how many stars a user has awarded to a particular item would be misleading. If one product is rated 6 out of 10 and another is rated 4 out of 5, the latter is obviously the better rated, but a system that looks only at the number of rating points would opt for the former. Thus we need an aggregation operator that can reconcile such differences in a single view. Moreover, a rating scale gives an idea of relative preference, not an absolute ranking. A user may give a book 4 stars on Amazon while the user's comments indicate that the book is acceptable but not up to the mark. That is, 4 stars mean different things to different users, who perceive ratings differently. Sometimes a user gives 3 stars to the best book he or she has ever read, whereas another user might give 4 stars to an average book. Thus there is a need for an operator that can eliminate these differences and express ratings absolutely rather than relatively.
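The scale mismatch in the 6-out-of-10 versus 4-out-of-5 example can be illustrated with a minimal Python sketch. It assumes the simplest possible remedy, dividing each rating by the maximum of its scale; the function name `normalize_rating` and the product labels are hypothetical and are not taken from the thesis, whose own aggregation operators are presented in later chapters.

```python
def normalize_rating(rating, scale_max):
    """Map a rating given on a 0..scale_max scale to the interval [0, 1]."""
    if scale_max <= 0:
        raise ValueError("scale_max must be positive")
    if not 0 <= rating <= scale_max:
        raise ValueError("rating must lie within the scale")
    return rating / scale_max


# The two products from the text: 6 on a 10-point scale, 4 on a 5-point scale.
raw = {"product_a": (6, 10), "product_b": (4, 5)}
normalized = {name: normalize_rating(r, m) for name, (r, m) in raw.items()}

# Comparing raw rating points favors product_a; comparing normalized
# ratings favors product_b.
best_by_raw_points = max(raw, key=lambda name: raw[name][0])
best_by_normalized = max(normalized, key=normalized.get)
print(best_by_raw_points, best_by_normalized)
```

Rescaling of this kind only removes the difference between scales. It does not address the second concern raised above, namely that two users may mean different things by the same number of stars.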

1.3.3 Personalization for Community Recommendation

Generally, the leading RS filtering techniques such as CF and RM favor the philosophy of personalization. Although personalized recommendation seems a great idea when predicting an item for a particular user, making recommendations for users who belong to more or less the same group and have similar requirements costs extra time and effort and repeats the same process for different users. Consider the book recommendation problem for students of the same course and year. The preferences of the students may vary, but their needs are similar and equally important. Also, there is a specified collection of books for students of the same course. Thus, instead of applying a personalized recommendation approach, it seems adequate to use a group recommendation technology, so that a single ranked recommendation can answer several simultaneous queries [31].

1.4 Web Mining Techniques

Web mining is the application of data mining techniques to extract knowledge from Web data [32]. In this work, web mining techniques are proposed as a solution to the problems of existing recommender system technologies discussed above, and they have been utilized for the recommendation of several electronic products and books. Web mining techniques include (a) web usage mining, (b) web structure mining and (c) web content mining.

Web usage mining makes use of the log information of users on the web; items that match users' preferences, as inferred from their activity, are recommended to them. Web structure mining, also referred to as 'link mining', takes the links available in a web page and explores them to find the best options for a user. Link mining is a new and challenging area in which statistical modeling is performed for relational learning [33]. It estimates the importance of a website from its backward links and forward links. A forward link of a web page 'A' is a recommendation of another web page by page A, whereas the backward links of A indicate how many different pages recommend, i.e. link to, page A. The importance of a page is determined by the number of backward links it has: the more backward links, the higher the page's importance. For a page of high value, all its forward links are considered highly valuable. With this concept in mind, we have chosen top universities among the Indian universities and examined their recommendations for different computer science courses; the recommendation of a book by a highly ranked university increases the importance of the recommended book.
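The backward-link idea can be made concrete with a small sketch. The link graph below is invented purely for illustration, and the function name `backward_link_counts` is an assumption; the sketch only counts how many distinct pages link to each page, which is the plain notion of importance described in the text and not the full procedure developed in Chapter 3.

```python
from collections import Counter

# Hypothetical link graph: each page maps to the pages it links to
# (its forward links).
forward_links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C", "B"],
}


def backward_link_counts(graph):
    """Count, for every page, how many distinct other pages link to it."""
    counts = Counter({page: 0 for page in graph})
    for source, targets in graph.items():
        for target in set(targets):   # ignore duplicate links from one page
            if target != source:      # ignore self-links
                counts[target] += 1
    return counts


counts = backward_link_counts(forward_links)
# Pages with more backward links come first; ties are broken by name.
ranking = sorted(counts, key=lambda page: (-counts[page], page))
print(ranking)
```

In this toy graph page C is linked from three other pages and therefore ranks first, while page D, which nobody links to, ranks last.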

Web content mining is the process of extracting useful information from the content of the web. This content may include the details of items available on the web, the opinions of users, and so on. As far as online purchasing is concerned, customers' opinions are seen as a basis for analyzing the features of a product and assessing the requirements of a user. Customers' reviews are the basis of the opinion mining technique. Finding and summarizing opinions from a huge number of customer reviews is very tedious for a business, so a summary of the reviews is worth the effort.

For researchers, opinion mining is a very active topic in the field of data mining. The main issues are (a) finding product features and (b) analyzing comments as positive or negative, as described in [5, 6, 7]. Comments are generally interpreted as positive or negative by looking for pre-determined terms: for instance, better, good, nice, well written and highly recommend are treated as positive terms, whereas worst, time consuming, bad and not recommended are treated as negative terms. In [12], opinion retrieval is perceived as a two-step task: finding relevant documents and re-ranking these documents by opinion scores. Reviews are written by humans, so to understand a review one should read it as a human being would. Finding the opinion terms is not sufficient, because things are sometimes different from what they appear to be. Let us consider the following example:

"I highly recommend this book for those who want to waste their time and money.

If you are really sincere to get some knowledge into your bucket, another one is the

better option"

Although the sentences above contain terms such as highly recommend and better, both terms are used in a negative sense about the book in question. Considering only the positive and negative polarity of terms and processing on that basis is therefore not sufficient to extract opinions reliably.
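The failure described here is easy to reproduce. The following sketch is a deliberately naive term-counting scorer, written only to illustrate the problem; the term lists are the examples quoted in this section, and the function name `naive_polarity` and the substring-matching strategy are assumptions rather than any algorithm proposed in the thesis.

```python
# Example terms quoted in this section; real lexicons are far larger.
POSITIVE_TERMS = ["highly recommend", "well written", "better", "good", "nice"]
NEGATIVE_TERMS = ["not recommended", "time consuming", "worst", "bad"]


def naive_polarity(review):
    """Return (positive term hits) minus (negative term hits).

    Matching is by plain substring search on the lower-cased text, so the
    scorer has no notion of context, negation or sarcasm.
    """
    text = review.lower()
    positives = sum(text.count(term) for term in POSITIVE_TERMS)
    negatives = sum(text.count(term) for term in NEGATIVE_TERMS)
    return positives - negatives


review = (
    "I highly recommend this book for those who want to waste their time "
    "and money. If you are really sincere to get some knowledge into your "
    "bucket, another one is the better option"
)
score = naive_polarity(review)
print(score)
```

The scorer finds two positive terms and no negative ones, so it labels a clearly negative review as positive. This is the gap that context-aware opinion extraction has to close.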

1.4.1 Web Mining Techniques as a Solution to the Existing Problems of Recommender Systems

In this work, we have tried to employ all three web mining techniques discussed above in designing the recommendation methodology. The details of the mathematical verification of the procedure are elaborated in the respective chapters of the thesis. Here, we have recommended top books in different disciplines of computer science by using different web mining techniques. For opinion mining, we have suggested various algorithms that take the above-mentioned issues into account when finding the orientation of users in their opinions or reviews. For web structure mining, we have tried to weight the importance of the most valuable universities (which may be considered valuable links in link/structure mining) for recommendation; for web usage mining, the validation of recommender systems is performed on the basis of users' behavior towards the reviews of a product.

We propose a solution to the cold start problem that is based on a consensus ranking of the items that suit the majority of the group to which the user belongs. There are two important aspects to note: first, the group of users similar to the given user must be known, and second, the recommendation may not be personalized. However, this approach may save time and reduce the complexity involved in recommendation. Consider the book recommendation problem for graduate students of a university. Finding each user's preferences and providing personalized recommendations is time consuming and effortful, and the cold start issue would remain a permanent threat. As a solution, all the graduate students of the same course in a university can be considered members of one group. The top N books across several universities can be obtained by observing (a) what the best universities recommend and (b) what opinions students hold about these books. By finding the best books with a suitable, experimentally tested approach, a good recommendation can be provided to a large number of users without the unknown prior preference (UPP) problem. We have tried to incorporate this approach in the recommendation of books. The detailed procedures for the different techniques used are elaborated in the subsequent chapters.

Ordered Weighted Aggregation (OWA) and Ordered Ranked Weighted Aggregation (ORWA) are used to obtain an absolute ranking, since, as discussed in Section 1.3.2, the lack of absolute ranking is a major concern. These techniques help in aggregating users' heterogeneous rankings to obtain a final aggregated result. We have utilized OWA and incorporated the proposed ORWA for aggregation.
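As a rough illustration of OWA, the sketch below implements the standard ordered weighted averaging operator, in which weights are attached to positions in the sorted list of values rather than to particular sources. The weights are derived from a piecewise-linear fuzzy quantifier with parameters (a, b). The pair (0, 0.5) for "at least half" is a value commonly quoted in the fuzzy-quantifier literature and is assumed here, since this chapter does not state the parameters used in the thesis; the scores are invented, and ORWA, the thesis's own proposal, is not sketched.

```python
def quantifier_weights(n, a, b):
    """OWA weights w_j = Q(j/n) - Q((j-1)/n) for a piecewise-linear Q.

    Q(r) is 0 below a, 1 above b, and rises linearly in between.
    """
    if not 0 <= a < b <= 1:
        raise ValueError("require 0 <= a < b <= 1")

    def q(r):
        if r < a:
            return 0.0
        if r > b:
            return 1.0
        return (r - a) / (b - a)

    return [q(j / n) - q((j - 1) / n) for j in range(1, n + 1)]


def owa(values, weights):
    """Ordered weighted average: weights apply to values sorted descending."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


# Hypothetical scores given to one book by four universities.
scores = [0.9, 0.4, 0.7, 0.6]
weights = quantifier_weights(len(scores), a=0.0, b=0.5)  # "at least half"
aggregate = owa(scores, weights)
print(weights, aggregate)
```

With these assumed parameters all the weight falls on the two highest scores, so the aggregate is their mean. Choosing a different quantifier shifts the weight towards other positions in the ordering without changing the operator itself.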

In view of the above discussion, we have also suggested a ranked recommendation approach for books that aggregates the rankings of the top universities (which are considered authorities) and employs the link mining approach in the recommendation process. On the one hand it handles the cold start issue, and on the other hand it reduces the complexity of personalized recommendation to a huge number of users by replacing it with a single ranked recommendation. Chapter 3 and Chapter 4 deal with these issues and discuss a comprehensive approach based on soft computing and link mining.
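To give a feel for how several ranked lists can be merged into a single ranked recommendation, the sketch below uses a Borda-style positional score. This is an assumption made for illustration only: the exact positional aggregation scoring used in the thesis is defined in Chapter 3 and is not reproduced in this chapter, and the university names, book labels and the function name `positional_scores` are hypothetical.

```python
from collections import defaultdict

# Hypothetical ranked book lists (best first) from three universities.
university_rankings = {
    "univ_1": ["book_a", "book_b", "book_c"],
    "univ_2": ["book_b", "book_a", "book_d"],
    "univ_3": ["book_b", "book_c", "book_a"],
}


def positional_scores(rankings):
    """Borda-style scoring: position p (0-based) in a list of length n
    earns n - p points; books absent from a list earn nothing from it."""
    scores = defaultdict(int)
    for ranked_list in rankings.values():
        n = len(ranked_list)
        for position, item in enumerate(ranked_list):
            scores[item] += n - position
    return dict(scores)


scores = positional_scores(university_rankings)
# Highest total first; ties are broken alphabetically for determinism.
consensus = sorted(scores, key=lambda item: (-scores[item], item))
print(consensus)
```

Because the consensus list is computed once per course rather than once per student, a newly enrolled student with no rating history can receive the same list, which is how a group-level ranking sidesteps the cold start problem.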

Opinion mining avoids users' ratings and instead emphasizes users' reviews. Thus, opinion mining can be a good solution to the issues encountered with rating-based recommendation. The issue with opinion mining discussed in Section 1.4 has not yet been adequately addressed. We have tried to propose a solution to this problem. Several algorithms are discussed in Chapter 5 which we believe overcome the prevailing issues to some extent.



1.5 Organization of the Thesis

The thesis contains seven chapters. Chapter 1 introduces the thesis and gives an overview of the problem formulated and the proposed solutions. Chapter 2 reviews the literature extensively and presents the possible categories of recommender systems. Chapter 3 and Chapter 4 present link mining and soft computing approaches, respectively, for the recommendation of items in general and books in particular. Chapter 5 gives feature-based recommendation of books, which exploits opinion mining techniques. Chapter 6 discusses the evaluation of recommender systems based on explicit as well as implicit feedback, and the need for both kinds of feedback is discussed in detail. Finally, Chapter 7 concludes the complete work. The chapter-wise description is as follows:

Chapter 1: Recommender systems, their need and their applications are introduced. The issues encountered in existing techniques are highlighted. The proposed techniques that could help in solving these issues are briefly described. The main idea of this research work is also presented in this chapter.

Chapter 2: A review of recommender systems is presented. The literature survey is carried out by studying more than 200 recent research papers on the topic, published in reputed conferences and peer-reviewed journals. The various limitations and shortcomings of the existing techniques are discussed.

Chapter 3: This chapter presents the proposed link mining techniques, supported by several architectures, which incorporate positional aggregation to aggregate the books recommended by the top universities and provide students with the best books for their syllabus. The results of the recommendations made by the proposed approaches are shown at the end of the chapter. The comparison of the approaches, however, is discussed in Chapter 6.

Chapter 4: Soft computing techniques are used, and fuzzy aggregation is incorporated for the aggregation purpose. The results of the recommendations made by the proposed approaches are shown at the end of the chapter. The comparison of these approaches with positional aggregation is discussed in Chapter 6.

Chapter 5: The main aim of this chapter is to introduce opinion mining for product recommendation, taking books as the product. The recommendation of books is discussed with the help of opinion extraction and feature selection algorithms. These algorithms are designed with a view to considerations that are not well studied in the recommender systems literature.

Chapter 6: A Comprehensive Evaluation Approach is discussed. The proposed approaches are compared on the basis of explicit feedback, using eight (8) different parameters: P@10, FPR@10, FNR@10, Mean Average Precision (MAP), Mean Reciprocal Rank (MRR), Mean Absolute Error (MAE), Root Mean Square Error (RMSE) and the Spearman rank order correlation coefficient. An additional comprehensive evaluation approach based on implicit user feedback is also suggested, and an existing RS has been evaluated using it. This evaluation approach may help in assessing the performance of any recommender system. The suggested approach uses implicit user feedback and recommends only those products which are preferred by the users.

Chapter 7: This chapter concludes the overall work and emphasizes the contributions of the research carried out in the thesis. It also outlines the scope for future extension of the work.

The main contributions of the thesis are summarized and presented in Table 1.1.



Table 1.1: A summary of results contained in this thesis

| Serial No. | Recommendation based on Web Mining Techniques | Contribution | Chapter | Publications |
|---|---|---|---|---|
| 1. | Survey of the existing literature | Literature Survey of RS from Ecommerce perspective | 2 | [28], [32] |
| 2. | Link Mining based Book Recommender Systems | Incorporated Positional aggregation technique for the recommendation of books. | 3 | [34] |
| 3. | Soft Computing based Book Recommender Systems | ORWA is proposed and utilized for the recommendation of books. | 4 | [5] |
| 4. | Fuzzy technique based Book Recommender Systems | OWA is employed for book recommendation | 4 | [35], [36] |
| 5. | Opinion Mining based Book Recommender Systems | Feature based opinion mining technique | 5 | [1], [37], [36] |
| 6. | Evaluation Strategy for Recommender Systems | User feedback based evaluation of recommender system | 6 | [38], [39], [40] |


Chapter 2

An Overview of Recommender Systems

2.1 Introduction:

Recommender systems (RS) have grown exponentially in the last few years, and their applications have spread over various domains of life, including online shopping for books, home appliances, movies and electronic gadgets, recommendation of doctors and hospitals for patients, institute recommendation for students and teachers, hotel recommendations for tourists, and so forth. The philosophy behind the success of recommendation technology is that people tend to rely upon the experiences of their neighbors and friends before making a decision of any kind, especially regarding purchasing an item, taking admission to an institute for higher education, renting or buying an apartment, spending a weekend at a holiday destination, etc.

The advancement of Internet technologies has caused data overload, due to which buyers face greater difficulty in finding the exact choice that meets their needs out of a huge collection of available options. If a student wishes to spend his/her vacation at a hill station and would like to stay in a peaceful and calm hotel, there would be thousands of places all around the world which might come to him/her as options. In such a situation, recommender systems can provide a better option according to the needs and requirements of the user and depending upon his/her prior preferences.

Although there are several definitions which researchers have suggested for

recommender systems, we define recommender systems as –

“Recommender systems try to identify the need and preferences of users, filter the

huge collection of data accordingly and present the best suited option before the users

by using some well-defined mechanism.”

In this chapter, we have reviewed more than 100 articles related to recommender systems, including the manuscripts in which collaborative filtering was first reported in the mid-1990s [41], [42].


2.2 Previous Review Studies

The first papers on collaborative filtering (CF) appeared in the mid-1990s [42], [43]. The proposed CF technique provided a platform for designing recommender systems and laid a strong foundation for their development. The work in this area has been reviewed extensively in the literature. Studying the surveys and reviews of recommender systems helps in establishing a better understanding of the subject and gives a holistic picture of the technology used in the field, along with various aspects related to the topic. In this section, we have tried to include the major review/survey papers on the related work and have discussed their contributions. As recommendation techniques originated in the mid-1990s, it seems adequate to include papers from 2000 onwards.

In 2000, B. Sarwar et al. [44] analyzed the effectiveness of recommender systems on actual customer data from an e-commerce site and compared several recommender algorithms with respect to their performance [45]. In 2001, Schafer et al. [46] examined traditional marketing methods and provided a foundation for the growth of recommender systems as a marketing tool for e-commerce. They also presented a taxonomy for recommender systems and identified five models of recommender applications. One of the excellent contributions of Schafer et al. was their exploration of four domains for future study, based on the taxonomy, that had not been adequately explored by the applications existing at that time. They suggested the following four areas of research for recommender systems: non-personalized, attribute based, item-to-item correlations, and people-to-people correlations.

In 2002, R. Burke [47] investigated the possible extent of hybrid recommender systems and provided quantitative results for relative comparisons. Burke [48] has also contributed by surveying hybrid recommender systems. He compared different recommendation techniques and hybridization strategies; four recommendation techniques and seven hybridization strategies were considered. He also included 41 hybrids, with some combinations that were new at that time. The interest of researchers in recommender systems increased rapidly in the early decade of the millennium. The generations of recommender systems up to the early decade of the millennium are reported in [42]. The authors have also presented an overview of recommender systems, with a discussion of the limitations and possible enhancements for solving the existing issues.

In 2007, Candillier et al. [49] reviewed the primary collaborative filtering based systems and carried out an extensive comparison using the MovieLens data set. Their study identifies the advantages and drawbacks of the approaches under evaluation. However, there was not much discussion of the various issues encountered in collaborative filtering based approaches. Issues like data sparsity, shilling attacks, synonymy, scalability, etc. are discussed comprehensively by X. Su and Khoshgoftaar [50], who have proposed possible solutions for the existing issues as well. The authors have also presented a comprehensive survey of collaborative filtering techniques, categorized collaborative filtering algorithms, and analyzed their predictive performance in addressing these issues. The evaluation of recommender systems is discussed in [51]. The authors have discussed ways to compare recommenders on the basis of a set of properties and described how the performance of recommender systems can be compared for the relevant area of application. They have described an experimental background suitable for deciding preferences between several algorithms. They have also discussed how to draw reliable conclusions from the conducted experiments.

In 2012, Park et al. [52] and Zhou et al. [53] published notable reviews. Park et al. reviewed 210 research articles related to recommender systems and examined the research trends in the area by observing the publication of papers year-wise and journal-wise. The effort provides interested readers with insight into future research directions. Zhou et al. presented, in the same year, an overview of the state of the art in developing personalized recommender systems in social networking environments. The work provides a research direction for addressing user profiling and cold start problems.

The largest number of research papers for a survey has been included by Bobadilla et al. in their extensive work [54]. They have proposed a method which gives a criterion for the inclusion of research papers of the concerned field. They have discussed the overview of recommender systems and collaborative filtering methods. They have also provided an original classification of recommender systems and suggested areas of future research, including bio-inspired approaches for recommender systems.



Table 2.1: A glance at the review studies on Recommender Systems

| Serial no. | Author & Year | Primary Contribution | Citations on Google Scholar as of February 2017 |
|---|---|---|---|
| 1 | B. Sarwar et al. (2000) | Analysis of effectiveness of RS; comparison of recommendation algorithms. | 2235 |
| 2 | J. B. Schafer et al. (2001) | Provided a foundation for the growth of RS as a marketing tool in e-commerce. Five models of applications and four domains of future work are explored. | 1954 |
| 3 | R. Burke (2002) | Investigated possible extent of hybrid recommender systems. Provided quantitative results for relative comparison. | 3153 |
| 4 | G. Adomavicius and A. Tuzhilin (2005) | Generation of RS is discussed. Limitations and possible enhancements are mathematically modeled. | 7777 (Most cited article on RS) |
| 5 | L. Candillier et al. (2007) | Reviewed the primary collaborative filtering based systems. An extensive comparison using MovieLens data set. | 150 |
| 6 | X. Su & T. M. Khoshgoftaar (2009) | Issues like data sparsity, shilling attacks, synonymy, scalability, etc. are discussed, and their possible solutions are proposed. Comprehensive survey for CF techniques is performed, categorized CF algorithms and analyzed their predictive performance. | 1974 |
| 7 | G. Shani & A. Gunawardana (2011) | Compared RS on the basis of both characteristics and application. Described experimental background suitable for deciding preferences between several algorithms. Discussed method of drawing reliable conclusions from the conducted experiments. | 709 |
| 8 | D. H. Park et al. (2012) | Reviewed 210 research articles. Examined the research trends in the concerned area year-wise and journal-wise. The effort helps the interested people with insight for future research direction. | 284 |
| 9 | X. Zhou et al. (2012) | An overview of state-of-the-art for developing personalized recommender systems in social networking environment. The work provides a research direction to address user profiling and cold start problems. | 124 |
| 10 | J. Bobadilla et al. (2013) | An overview of recommender systems and collaborative filtering methods are discussed over 253 articles. Provided original classification of recommender systems. Suggested area of future research including bio inspired approaches for recommender system. | 683 (Most cited article since 2011, Elsevier) |
| 11 | J. Lu et al. (2015) | It systematically examines the reported recommender systems through four dimensions: recommendation methods, recommender systems software, real-world application domains and application platforms. Provides an understanding of developments in recommender system applications. | 102 |



However, they have not discussed the timing factor in recommendation, and fuzzy approaches to recommendation were touched upon only briefly. J. Lu et al., in 2015, systematically examined the reported recommender systems through four dimensions: recommendation methods, recommender systems software, real-world application domains, and application platforms. Their work provides an understanding of developments in recommender system applications.

We have tried to include a discussion of the issue of time constraints for recommender systems and have also discussed fuzzy approaches for the recommendation of items. The reviewed work is summarized in tabular form in Table 2.1.

2.3 Types of Recommender Systems

Recommender systems can be categorized on several bases. In the literature, recommender systems are usually categorized [42] on the following bases:

Approaches used

Area of application for which recommendation is made

Data mining techniques applied, etc.

In [42], RS are divided into three categories based on the approach used: 1) content-based recommendations, 2) collaborative recommendations, and 3) hybrid recommendations. Bobadilla et al. [54] have suggested four categories on the basis of filtering algorithms: content-based filtering, collaborative filtering, hybrid filtering, and demographic filtering. Burke [47] has identified five types of recommender systems based on the approaches: collaborative, content-based, demographic, utility-based, and knowledge-based recommendations.

We have categorized recommender systems (RS) into 8 types. These categories broadly cover the techniques which are widely used or which current researchers frequently apply.

1. Collaborative Filtering based recommender systems (C.F)

2. Reclusive methods based recommender systems (R.M)

3. Demographic Filtering based recommender systems (D.F)



4. Knowledge based recommender systems (K.B)

5. Hybrid Recommender systems (H.R)

6. Context Aware Recommendation System (CARS)

7. Social network based recommender systems

8. Soft Computing techniques based Recommender Systems

2.3.1 Collaborative Filtering based Recommender Systems

Collaborative filtering is the most successful and most frequently used recommendation technique discussed in the literature [55], [44], [56] since the appearance of the first recommender systems in the mid-1990s. The collaborative approach makes use of recommendations from other customers whose choices are similar to those of the target customer (i.e. the customer for whom the recommendation is made). Customers with similar choices are termed neighbors.

Table 2.2: Collaborative Approach illustration

| Users | Items | Purchase |
|---|---|---|
| User1 | Tv1, Tv2, Tv3, Tv4, Tv5 | |
| User2 | Tv1, Tv2, Tv3, Tv4, Tv5 | |
| User3 | Tv1, Tv2, Tv3, Tv4, Tv5 | |
| User4 | Tv1, Tv2, Tv3, Tv4, Tv5 | Tv5: recommended |



Thus, two major tasks are performed in collaborative filtering: 1) finding the neighbors of a customer and 2) exploring the preferences of the neighbors of a target customer or user. The neighborhood of a user can be formed by analysing the past purchasing behavior of users and calculating the similarity scores between their choices. The recommendations of the neighboring customers can be obtained either explicitly, in terms of ratings, which are numerical values within a specified range, or implicitly, with some defined measures. Implicit recommendations also involve customers' feedback. This feedback can be the customers' behavior as recorded in the users' log information, or it can be the users' sentiments expressed in their reviews.

To better understand how items are recommended using C.F, we state a basic assumption, which is supported by the diagram presented in Figure 2.1 and illustrated in Table 2.2.

Figure 2.1: Collaborative Filtering Approach



Assumption for C.F: if user1 and user2 have similar ratings for item1, item2, …, item 'n', they are expected to have similar ratings for item 'n+1' as well. In other words, if user1 has high ratings for item1, item2 and item3, and user2 also has high ratings for item1 and item2, then user2 is expected to rate item3 highly as well.
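The assumption above can be illustrated with a minimal sketch. The purchase histories, the choice of Jaccard similarity, and the function names are assumptions made purely for illustration; they are not taken from Table 2.2 or from any of the cited systems.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of purchased items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, purchases, k=2):
    """Recommend items bought by the k most similar users but not by the target."""
    own = purchases[target]
    # Rank the other users by similarity to the target user (the "neighbors").
    neighbors = sorted(
        (u for u in purchases if u != target),
        key=lambda u: jaccard(own, purchases[u]),
        reverse=True,
    )[:k]
    # Score each unseen item by the summed similarity of the neighbors who bought it.
    scores = {}
    for u in neighbors:
        sim = jaccard(own, purchases[u])
        for item in purchases[u] - own:
            scores[item] = scores.get(item, 0.0) + sim
    return sorted(scores, key=lambda i: (-scores[i], i))


# Hypothetical purchase histories (illustrative only).
purchases = {
    "User1": {"Tv1", "Tv2", "Tv5"},
    "User2": {"Tv1", "Tv3", "Tv5"},
    "User3": {"Tv2", "Tv4"},
    "User4": {"Tv1", "Tv2"},
}
print(recommend("User4", purchases))
```

In this toy data, User1 is the closest neighbor of User4, so the item that User1 bought and User4 has not yet bought is ranked first.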

Researchers have defined C.F differently and categorized it according to the approaches and algorithms used. Adomavicius and Tuzhilin [42] expressed C.F in terms of a utility function which tries to predict the utility of an item based on the ratings given to the item by other customers having preferences similar to those of the target user. They divided C.F algorithms into two categories: 1) model-based and 2) heuristic-based. The same categorization is reported in [57]. However, Candillier et al. [49] have given three categories of collaborative approaches: a) user-based, b) model-based and c) item-based.

In the user-based approach, a set of nearest neighbors is associated with each user, and the user's rating for an item is predicted from the nearest neighbors' ratings of that item. In the model-based approach, a set of user groups is constructed, and a user's rating for an item is predicted from the ratings given to that item by the members of the user's group. Usually in this CF technique, models are created for recommendation, and these models are designed to produce accurate predictions on real data. In item-based approaches, in contrast, a set of nearest neighbors is associated with each item, and a user's rating for an item is predicted from the user's ratings of that item's nearest neighbors.
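As a rough sketch of the user-based case, the following code predicts a missing rating from the ratings of the most similar users. The use of Pearson correlation, the mean-centered weighted average, the neighborhood size `k`, and the sample ratings are all assumptions chosen for illustration and are not prescribed by the works cited above.

```python
from math import sqrt


def pearson(ra, rb):
    """Pearson correlation over the items both users have rated."""
    common = set(ra) & set(rb)
    if len(common) < 2:
        return 0.0
    ma = sum(ra[i] for i in common) / len(common)
    mb = sum(rb[i] for i in common) / len(common)
    num = sum((ra[i] - ma) * (rb[i] - mb) for i in common)
    den = sqrt(sum((ra[i] - ma) ** 2 for i in common)) * sqrt(
        sum((rb[i] - mb) ** 2 for i in common)
    )
    return num / den if den else 0.0


def predict_user_based(ratings, user, item, k=2):
    """Predict a rating from the k most similar users who rated the item."""
    mean_u = sum(ratings[user].values()) / len(ratings[user])
    candidates = [
        (pearson(ratings[user], ratings[v]), v)
        for v in ratings
        if v != user and item in ratings[v]
    ]
    # Keep only positively correlated neighbors, strongest first.
    neighbors = sorted((c for c in candidates if c[0] > 0), reverse=True)[:k]
    if not neighbors:
        return mean_u  # fall back to the user's own mean rating
    num = 0.0
    for sim, v in neighbors:
        mean_v = sum(ratings[v].values()) / len(ratings[v])
        num += sim * (ratings[v][item] - mean_v)
    return mean_u + num / sum(sim for sim, _ in neighbors)


# Hypothetical ratings on a 1-5 scale (illustrative only).
ratings = {
    "u1": {"a": 5, "b": 3, "c": 4},
    "u2": {"a": 5, "b": 3, "c": 4, "d": 5},
    "u3": {"a": 1, "b": 5, "c": 2, "d": 1},
}
print(round(predict_user_based(ratings, "u1", "d"), 2))
```

An item-based variant would follow the same pattern with the roles of users and items exchanged, computing similarities between item rating vectors instead of user rating vectors.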

Researchers have applied these C.F approaches to design RS for various applications, such as recommending music, movies, web pages, articles, and products for online shopping [58], [59]. Further, there are several techniques within the above three categories on which researchers have worked. The work can be classified further on the basis of different methods and algorithms. The respective criteria and related work are described in the following sections.

2.3.1.1 Item based and User based CF techniques

Item based and User based recommendations are usually performed by exploiting:

Association rule mining between preferences of neighbor of users

Rating

Choice of individuals for varied items



Similarity in the preferences of different users for common items

Tagging

2.3.1.1.1 Association rule mining between preferences of neighbor of users

Association rule mining has been used extensively in collaborative recommendation. An association rule based recommendation technique was proposed by Sarwar et al. [44]. The authors suggested association rules for exploring the associations in users' purchase behavior towards items, and the items are recommended to users accordingly. The authors in [60] investigated the possibility of including association rule mining in collaborative filtering based recommendations. Since collaborative recommenders exploit how similar the customers' preferences are, it is easy to make personalized recommendations.

However, association rule mining algorithms are designed with market basket analysis in mind. Such algorithms are not directly useful for collaborative recommendation, as they need to mine a large number of rules, which may or may not be fruitful for the user. Also, other criteria of association rule mining often lead to the creation of a huge number of rules, or sometimes very few rules, which has a negative impact on the performance of the system. The authors have designed a collaborative recommendation technique to mine association rules for this purpose. Associations between users as well as associations between items are both considered. In [61], the authors have proposed scalable techniques based on association rules. The rules are discovered from usage data for web personalization. Sandvig et al. presented a collaborative recommendation algorithm based on association rule mining in 2007 [62]. They have used the k-NN algorithm to prevent profile injection attacks. Their results indicate that the proposed methods show significantly improved performance.
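To make the notion of an association rule concrete, the sketch below mines simple one-to-one rules ("buyers of x also buy y") with support and confidence thresholds. The baskets, the threshold values, and the function name are hypothetical; the cited works use more elaborate mining procedures, so this is only an illustration of the underlying measures.

```python
from itertools import permutations


def pair_rules(baskets, min_support=0.5, min_confidence=0.6):
    """Return rules (x, y, support, confidence) meaning 'buyers of x also buy y'."""
    n = len(baskets)
    items = sorted(set().union(*baskets))
    rules = []
    for x, y in permutations(items, 2):
        both = sum(1 for b in baskets if x in b and y in b)
        has_x = sum(1 for b in baskets if x in b)
        support = both / n  # fraction of baskets containing both items
        confidence = both / has_x if has_x else 0.0  # share of x-buyers who also buy y
        if support >= min_support and confidence >= min_confidence:
            rules.append((x, y, support, confidence))
    return rules


# Hypothetical transaction data (illustrative only).
baskets = [
    {"book_a", "book_b"},
    {"book_a", "book_b", "book_c"},
    {"book_a", "book_c"},
    {"book_b"},
]
for x, y, s, c in pair_rules(baskets):
    print(f"{x} -> {y}: support={s:.2f}, confidence={c:.2f}")
```

Lowering the thresholds quickly increases the number of rules, while raising them may leave very few, which is the sensitivity described in the paragraph above.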

2.3.1.1.2 Rating based recommendation

A general trend in recommendation is to collect ratings from users for the available items, which in turn helps other users find better items. This approach is simply termed rating based collaborative filtering. However, rating based recommendation is used in model based recommendation as well, which is discussed in its appropriate place (see Section 2.3.1.2).



PolyLens [63], an extended version of MovieLens, supports group creation and management. PolyLens is designed to recommend movies to small groups. Several factors were considered while designing PolyLens, such as generating group recommendations, the formation and evolution of groups, and the nature of the group to which a user belongs. It uses nearest neighbor methods and presents a list sorted according to the lowest ratings.

RACOFI (Rule Applying Collaborative Filtering), a multi-dimensional rating system, is proposed in [64]. The authors implemented RACOFI Music to assist users who prefer listening to music on-line. Their implementation helps in recommending and rating audio. The authors have identified five features of music which generally have an impact on users. Their system has been available on-line since August 2003 at [http://racofi.elg.ca].

A rating system is also used in TiVo [65], which uses 100 million ratings. These ratings are provided by approximately 30,000 users of different TV shows and movies. TiVo recommends TV programs to its viewers.

The authors of [66] have presented a database-driven approach which makes use of ratings in an item-to-item CF technique. They claim ease of implementation and applicability to a wide range of settings.

2.3.1.1.3 Choice based recommendation

In choice based recommendation, items are recommended by using the similarity in the preferences of a single user for different items. Hayes and Cunningham [67] developed a music application, 'Smart Radio', at Trinity College, Dublin in 2001. It is a web-based application which allows users to share music programs. The authors used collaborative recommendation techniques and applied streaming audio technology. The controlled distribution of music on the web by operators is studied in their work, and Smart Radio is designed to personalize the music programs. Collaborative filtering is used to exchange music programs between users by observing the similarities between the users' choices. At the time of writing, Smart Radio was operating with permission from the Irish Music Rights Organisation (IMRO).

Iman et al. [68] presented a choice based technique that makes use of a CF method, extracts latent knowledge from user ratings, and iteratively asks the user to choose between two sample items presented to them. The technique tries to place the user in the latent factor space, and the items nearest to the user's position are selected for recommendation. The authors showed that their approach produces better recommendations. Since the authors have used latent factors as well, this CF technique can also be regarded as model based recommendation.
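The idea of placing a user in a latent factor space through pairwise choices can be sketched as follows. This is not the algorithm of [68]: the item factor vectors, the simple update rule that moves the user part of the way toward each chosen item, the `step` parameter, and the item names are all assumptions made for illustration.

```python
from math import dist


def place_user(item_factors, choices, step=0.5):
    """Estimate a user's latent position from pairwise choices.

    choices is a list of (chosen, rejected) item pairs; each choice moves the
    position part of the way toward the chosen item (an assumed update rule).
    """
    dims = len(next(iter(item_factors.values())))
    position = [0.0] * dims
    for chosen, _rejected in choices:
        target = item_factors[chosen]
        position = [p + step * (t - p) for p, t in zip(position, target)]
    return position


def nearest_items(item_factors, position, exclude=(), n=2):
    """Return the n items closest to the position in the latent space."""
    candidates = [i for i in item_factors if i not in exclude]
    return sorted(candidates, key=lambda i: (dist(item_factors[i], position), i))[:n]


# Hypothetical two-dimensional item factors (illustrative only).
item_factors = {
    "rock1": (1.0, 0.0),
    "rock2": (0.9, 0.1),
    "jazz1": (0.0, 1.0),
    "jazz2": (0.1, 0.9),
}
choices = [("rock1", "jazz1"), ("rock1", "jazz2")]
position = place_user(item_factors, choices)
print(nearest_items(item_factors, position, exclude={"rock1"}))
```

In practice the item factors would be learned from the rating data, for example by matrix factorization, before any choices are presented to the user.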

As online radio has become popular, the authors of [69] have designed a mechanism by which a playlist can be tailored in real time to the musical tastes of the listener. The authors used CF techniques to generate the playlist in real time. A listener usually has a listening history before listening to a particular track; on the basis of this history, a playlist is recommended to the listener. They have also described the details of the implementation of the technique.

A choice-based interface for preference elicitation during the cold start phase is studied in [70]. The interface is compared with an existing rating-based system. The authors' results indicate that the rating-based interface takes more effort, whereas the choice-based system provides more satisfying recommendations.

2.3.1.1.4 Recommendation based on similarity in the users' preferences for common items

GroupLens [58] is one of the earliest collaborative filtering based systems; it provides filtered online news to the members of a group. It eases the process of finding news articles which a user might like from the huge number of available news articles.

In 1998, Pazzani [71] discussed how to learn a profile of user interests and how it could help in the recommendation of web pages or news articles. The author described collaborative approaches and their pros and cons in recommending information sources to users, taking restaurants as an example.

In 2001, G. Karypis [72] suggested an item based personalized information filtering technique to identify a set of N items that match the interests of a given user. The author presented a method that first determines the similarities between the various items and then uses these similarities for the final recommendation of items. The experimental evaluation on five different datasets shows that the quality of the recommendations is up to 27% better than that of traditional user-based approaches. Standard collaborative filtering techniques face great challenges in terms of scalability and performance, especially when there is a lack of explicit user ratings. Web usage mining techniques can be used to improve the scalability of collaborative filtering; however, this affects the recommendation accuracy.
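The two-step scheme described above (first compute item-to-item similarities, then use them to select N items) can be sketched as follows. The cosine measure on binary purchase data, the scoring by summed similarity, and the sample data are assumptions for illustration rather than the exact procedure of [72].

```python
from math import sqrt


def item_similarities(purchases):
    """Cosine similarity between items, computed from binary user-item data."""
    users_of = {}
    for user, items in purchases.items():
        for item in items:
            users_of.setdefault(item, set()).add(user)
    sims = {}
    for i in users_of:
        for j in users_of:
            if i != j:
                overlap = len(users_of[i] & users_of[j])
                sims[(i, j)] = overlap / sqrt(len(users_of[i]) * len(users_of[j]))
    return sims


def top_n(purchases, user, n=2):
    """Recommend the n unseen items most similar to the user's own items."""
    sims = item_similarities(purchases)
    owned = purchases[user]
    all_items = set().union(*purchases.values())
    # Score each candidate by its summed similarity to the items the user owns.
    scores = {
        cand: sum(sims[(cand, own)] for own in owned)
        for cand in all_items - owned
    }
    ranked = sorted(scores, key=lambda i: (-scores[i], i))
    return [i for i in ranked if scores[i] > 0][:n]


# Hypothetical purchase data (illustrative only).
purchases = {
    "u1": {"a", "b", "c"},
    "u2": {"a", "b"},
    "u3": {"b", "c", "d"},
    "u4": {"a"},
}
print(top_n(purchases, "u4"))
```

Because the item similarities do not depend on the target user, they can be computed once offline and reused for every request, which is one reason item-based schemes tend to scale well.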

An improved FolkRank using an item based CF method is proposed by Gemmell et al. [73]. They concluded that item-based CF, when combined with the traditional graph-based approach, could enhance the performance of FolkRank. Thus, the work indicates that CF, especially item-based collaborative filtering, can be an excellent way to enhance the performance of a recommender system [74].

2.3.1.1.5 Tagging based recommendation

A recommendation approach based on tagging, 'FolkRank', was proposed in [75], [76]. The authors calculated distances from the uploaded resource; these distances serve as a basis for exploring tag recommendations.

Another tagging based recommendation approach is presented by Zheng and Li [77], [78]. The system is based on CF. Their research highlights the importance of tags and time in the process of recommendation. In general, rating matrices are used in traditional CF-based systems; in contrast, they used matrices based on tag and time relations. The similarities are obtained by calculating a tag weight and a time weight. The similarity index helps in identifying new neighbors, which in turn provide predictions on the basis of the recommendations they made.

2.3.1.2 Model based CF techniques

Model based CF techniques, as described earlier, develop models using several techniques, including machine learning, Bayesian classification, ordering, clustering, latent information utilization, graph models, etc. Goldberg [41] presented a model based personalized book recommendation technique. The authors applied association rule mining and Bayesian networks (BNs) for personalized book recommendation. Association rule mining is used for exploring the associations between users' preferences by observing the borrowed books. The BNs are used in designing the personalization of the RS.

Ratings are also used in model based recommendation. A User Rating Profile (URP) model for rating-based collaborative filtering is presented in [79]. The URP is designed to assign one rating to each item for each user. The author introduced a generative latent variable model in which each user is represented as a mixture of the user's activities. The rating for each item is generated by observing the activity of the user towards that item. A preference pattern is associated with each activity of the user, which supports the rating of the items.

In 2004, Marlin [80] analyzed existing rating prediction methods from a machine learning perspective. The author showed that many existing methods designed for this task are simply modified machine learning techniques. Basic operations such as dimensionality reduction, clustering, classification, regression, and density estimation are performed. The author developed new prediction methods and introduced a new experimental procedure which had not been used previously.

Kim et al. [81] proposed a machine learning technique to extract marketing rules for personalized recommendation. They used tree induction techniques, which can be combined with data mining techniques to match the customer's demographic details. The proposed methodology helps in deriving rules for personalizing advertisements for a buyer shopping on the Internet.

One of the issues with collaborative filtering techniques is that they are not portable and are successful mainly in an Internet environment with large computers. Miller et al. [82] presented 'PocketLens', a promising collaborative system that works even on a palmtop connected to servers, with results no worse than those of competing techniques. PocketLens is based on a CF algorithm which finds neighbors by using five peer-to-peer architectures. A shopbot is presented in [83]. A shopbot is basically a comparison shopping search engine; this one is designed in such a way that it can offer freebies to consumers without any extra fee. The authors have suggested an item-item similarity method using CF techniques. In making recommendations, they additionally consider the cost of the product as well as the savings it offers to customers.

Bayesian networks (BNs) are used as a classifier for CF in [84]. Binary-class data were the major focus of researchers in earlier models for the CF task; the authors, however, applied an advanced classifier based on BNs. Moreover, they did not work on the traditional synthetic binary data; instead, they used real-world multi-class CF data. Their experimental results show that the proposed CF model performs better than the traditional CF algorithm, especially when the rating data have relatively high missing rates. BNs based CF is also robust, as it does not degrade as sparseness increases.

One of the fastest methods for improving prediction accuracy without affecting the running time is presented by Bell and Koren [85]. Previously adopted approaches computed the interpolation weights separately; Bell and Koren instead formulated an optimization problem in which the interpolation weights for all neighbors are computed simultaneously. This method can generate a prediction in about 0.2 milliseconds and is equally efficient for large scale applications. The Netflix dataset is used for evaluation.

In 2012, Sahoo et al. [86] developed a personalized recommendation approach for users whose preferences may change with time. The authors argued that a user's behavior is not static and changes over time, and proposed a hidden Markov model in which the preference of a user is modeled as a hidden Markov sequence. The model makes personalized recommendations by interpreting the behavior of a user in selecting products. The authors claim that the proposed model outperforms the existing algorithms when the data is less sparse and the user preference is changing.

In 2013, Yue Shi et al. [31] discussed ranking in recommendation. With the rise of collaborative filtering (CF), the need for learning to rank has emerged, since ranking methods can contribute significantly to improving the ranking of the top-N recommendations. The authors presented the key ideas of different categories of learning to rank approaches and illustrated how these techniques can be extended to specific CF methods.

CF techniques follow a one-to-one philosophy, i.e. every user is independent and uses a
single account. However, when multiple users share the same account, CF-based
recommendation may run into trouble. If context is available, then CARS could solve the
issue, but it needs the context to be illustrated and explained [87]. The authors proposed a
solution that works without being aware of the context, using an item-based top-N
collaborative filtering recommender system for shared accounts. The method gives
recommendations according to binary positive feedback. The experimental results show that
their technique can tackle the issues regarding shared accounts on various datasets.
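For orientation, a plain item-based top-N recommender on binary positive feedback can be sketched as follows. This is the generic baseline only, not the shared-account method of [87]; the feedback matrix, the cosine similarity measure and the function names are assumptions made for illustration.

```python
import numpy as np

def item_similarity(feedback):
    """Cosine similarity between the item columns of a binary matrix."""
    norms = np.linalg.norm(feedback, axis=0)
    norms[norms == 0] = 1.0          # items with no feedback: avoid dividing by zero
    sim = (feedback.T @ feedback) / np.outer(norms, norms)
    np.fill_diagonal(sim, 0.0)       # an item must not vote for itself
    return sim

def top_n(feedback, account, n=2):
    """Return up to n unseen items, ranked by summed similarity to the account's items."""
    feedback = np.asarray(feedback, dtype=float)
    scores = item_similarity(feedback) @ feedback[account]
    scores[feedback[account] > 0] = 0.0   # never re-recommend known items
    ranked = np.argsort(-scores, kind="stable")
    return [int(i) for i in ranked[:n] if scores[i] > 0]

# Toy data (invented): rows are accounts, columns are items 0..4;
# 1 means positive feedback, 0 means no feedback.
R = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1],
])
print(top_n(R, account=0, n=2))
```

When several people share one account, its row mixes their tastes, which is the situation the cited work sets out to handle.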


ExcUseMe [88] is the only pure CF-based recommender system that tries to avoid the cold
start problem without combining content filtering or context details. The authors assume
that users arrive for purchase in a random sequence, and the system decides whether each
new user may participate in the exploration of newly launched items. ExcUseMe gradually
reveals the users who are possibly interested in new items, and the new items are modeled
according to the users' preferences. A provable guarantee for the cold start problem is
given in [68]. The authors use matrix factorization. A theoretical proof of the error
estimate is also given [88].

2.3.2 Reclusive Methods based Recommender Systems

It is clear from the above discussion that collaborative filtering is based upon finding
similarities between users; it does not need any representation of the objects to be
recommended. In contrast, the reclusive approach exploits the features of the objects and
requires their representation [89]. Reclusive methods are considered complementary to
collaborative techniques: they emphasize finding similarities between objects, i.e. items,
rather than between users.

Let us consider the example illustrated in Figure 2.2, which describes the reclusive
approach for one user and five different TVs. The user has shown a preference for TV1, TV2
and TV3, either by purchasing them or by putting them into the cart. TV4 and TV5 are newly
launched items. The features of TV5 are similar to those of TV1, whereas TV4 has different
characteristics. Thus the reclusive approach, which is also referred to as 'content-based'
or 'feature-based' recommendation, would recommend TV5 to the user and not TV4.
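The TV example can be expressed as a minimal sketch, assuming that each TV is described by a small numeric feature vector and that the user profile is the average of the preferred items' vectors. The feature names, the feature values and the choice of cosine similarity are hypothetical and serve only to illustrate the idea.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

# Hypothetical features: [scaled screen size, smart TV, 4K, OLED]
tv_features = {
    "TV1": [0.9, 1, 1, 1],
    "TV2": [0.8, 1, 1, 0],
    "TV3": [0.9, 1, 0, 1],
    "TV4": [0.3, 0, 0, 0],   # newly launched, unlike the preferred TVs
    "TV5": [0.9, 1, 1, 1],   # newly launched, same features as TV1
}
preferred = ["TV1", "TV2", "TV3"]
new_items = ["TV4", "TV5"]

# The user profile is the mean feature vector of the preferred items.
profile = np.mean([tv_features[t] for t in preferred], axis=0)

# Score each new item against the profile and pick the closest one.
scores = {t: cosine(profile, tv_features[t]) for t in new_items}
best = max(scores, key=scores.get)
print(best, scores)
```

Only the user's own history and the item descriptions are used; no other user's choices enter the computation.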

Reclusive recommendation, or content-based recommendation [90], originated mainly from the
field of information access. It provides recommendations by comparing users' preferences
with the contents of items and by associating the contents of different items. The
content-based method is also called feature-based recommendation [91]: it identifies items
that users are possibly interested in by analyzing item attributes and characteristics
against the user profile, and the resulting items are then recommended to users. It can
further assign different weights [92], [41] based on the degree of association between a
user's preferences and the targeted contents, in order to better fit users' requirements
[93].


Figure 2.2: Reclusive Approach for Recommendation

Like CF techniques, reclusive techniques can also be categorized into three types:
1) heuristic-based, 2) model-based and 3) web mining based. Model-based approaches exploit
different machine learning algorithms, such as classification techniques like Bayesian
networks (BNs) and probabilistic approaches, to group the preferences of users based on the
content of the items purchased. Heuristic approaches, in contrast, use different data
mining techniques like clustering, decision trees and rule induction to fetch the products'
features and recommend the product closest to the preferences of a user. Opinion mining is
treated explicitly as a category because it has been used frequently in characterizing
items' features. Customers' reviews and their log information help in reaching a consensus
on whether the features of an item suit a user's preferences.

2.3.2.1 Heuristic based Reclusive Recommendation

Kim et al. [81] used a decision tree to personalize web advertisements for a user. The
authors proposed personalized recommendation techniques for customers based on their past
purchasing behavior. A user profile is maintained to observe the attitude of a user towards
similar products. The authors of [94] presented a


technique that combines classification, user-based collaborative filtering and association
rule mining. The classification technique is used to mine books with respect to their
features, while the latter two techniques are used to determine the user's requirements for
recommending highly rated books. A book recommendation system based on a digital signage
system has been proposed in [95]. Books are recommended to individuals by identifying the
age and sex of the users. Here the book recommendation approach is confined and very
limited: it cannot be extended to a large community or to universities, and covers only a
few magazines for users aged 19-21 at schools in the same location.

In [96], James and Nick developed a recommender agent for movie recommendation (available
at www.filmrecommendations.co.uk). The proposed approach makes predictions based on
content, relating features of a movie such as actors, directors and stories. New movies are
included when making recommendations to users. The authors improved the accuracy with their
pure reclusive approach.

Kazai et al. presented a mobile app that is intelligent enough to learn a user's interests
from the past purchase history or from knowledge of the user's activity on social network
sites [97]. The app provides users with crowd-curated content. It can also inform users
about content liked by the Twitter users they follow.

2.3.2.2 Model based Reclusive Recommendation

Researchers have usually modeled user profiles to store users' preferences. These
preferences are matched with the features or contents of the items; if a user's interests
match a product's features, the item is recommended to the user. K. Lang [98] addressed the
user dependence problem in profiling users' preferences. Lang proposed 'Newsweeder', a
technique that lets users rate the news they have read on a scale of 1 to 5, which helps in
recommending the next news items to the user. Pazzani [71] proposed a model-based reclusive
approach for building the user profile from purchases. The author also suggested CF and
demographic techniques, as well as a combination of the three, for better recommendation.

Recommendations of books, journals and research papers have greatly helped people to
fulfill their needs and to benefit in their course of study.


Mooney [99] proposed a content-based book recommendation technique called LIBRA (Learning
Intelligent Book Recommending Agent) that utilizes information extraction and a machine
learning algorithm to explore the features of books for recommendation. Jomsri [100]
proposes a library book recommendation system based on user loan profiles and association
rules. This system is useful only for users who reside in the same institute and share the
same library and campus; the experiment is performed for one specific university only.

A user interface for wireless information devices is designed in [101] using user feedback.
Models that learn users' interests in current events are built from news, and a machine
learning methodology based on the reclusive approach is developed. The authors claim that
their system can adapt to the interests shown by the users. The methods also reduce the
amount of information delivered; as a consequence, users can save effort in obtaining the
relevant information.

The reclusive approach tends to recommend items that the user is already aware of, which
leads to the problem of overspecialization. The authors of [102] presented a mechanism that
overcomes overspecialization by first exploring knowledge of the user's preferences and
then matching these preferences with launched items at shopping sites.

Bansal proposed content-driven user profiling [103]. The system provides recommendations
for news and blog articles, supported by a comment-valued approach using topic modeling. A
novel hierarchical Bayesian modeling approach is combined with classical recommendation
techniques. The content-based solution also exploits user profiles, which are influential
enough to provide users with a personalized ranking of comment-worthy articles. The system
handles the cold-start issue without requiring extra metadata.

2.3.2.3 Web Mining based Recommendation

As discussed in Chapter 1, web mining techniques are useful in processing web data to
extract the desired information and to perform operations according to the needs of the
problem. All three leading web mining techniques, namely web usage mining, web content
mining and link mining (i.e. web structure mining), have recently been used in recommender
technology. The reclusive approach, mostly referred to as the content-based approach,
exploits user profiles and item descriptions to guess what


a user could like in future, depending upon the past preferences of that user, irrespective
of the choices made by other users. Most content-based recommender systems encounter the
ambiguities from which natural language usually suffers. The authors of [104] presented a
comprehensive methodology to overcome the issues associated with keyword-based approaches.

Cho et al. [61] proposed a personalized recommendation system based on Web usage mining.
They suggested an improved collaborative recommendation methodology that can enhance the
quality of recommendation for an Internet shopping mall. Further, sparsity and scalability
are addressed to overcome the problem of poor recommendations. Another personalized
recommendation approach based on Web usage mining is proposed by Kim et al. [105]. Their
method mainly targets the problem of helping customers receive recommendations only about
the products they wish to purchase. Kim et al. experimentally evaluated the proposed method
by applying it to a shopping mall in Korea.

A detailed discussion of the development of a personalized product recommendation system
based on customers' click streams is given in [106]. The authors proposed a recommender
system based on web mining to overcome the problem of data overload, so that satisfactory
recommendations can be made for users. Web mining techniques are used to observe the
purchase behavior of the users and to adapt dynamically to changes in the users'
preferences.

Although there have been a good number of studies on opinion mining, few of them lead to
product recommendation. User feedback based recommendation for electronic items is
performed in [38]–[40]. Liu et al. [107] proposed a novel product recommendation
methodology combining group decision making and data mining techniques; it addresses the
customer lifetime value (CLV) to a firm. The authors in [1] recommended books for online
shopping using a web mining technique: they categorized features from users' online reviews
and recommended top computer science books by assigning weights to these features and
scoring their values. They searched for books on a specific topic using Google search,
stored the top links, and assessed the readers' reviews for all the stored results. The
features are extracted from the users' reviews, and the books are ranked accordingly.
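Ranking by weighted feature scores can be sketched in a few lines. The feature names, weights and per-book scores below are invented, and the weighted sum is only one plausible way of combining them; the scoring scheme actually used in [1] is not reproduced here.

```python
def rank_books(feature_scores, weights):
    """Rank books by the weighted sum of their per-feature review scores."""
    totals = {
        book: sum(weights[f] * score for f, score in feats.items())
        for book, feats in feature_scores.items()
    }
    ranking = sorted(totals, key=totals.get, reverse=True)
    return ranking, totals

# Hypothetical feature weights (summing to 1) and scores in [0, 1]
# as they might be extracted from reader reviews.
weights = {"clarity": 0.5, "examples": 0.3, "coverage": 0.2}
feature_scores = {
    "Book A": {"clarity": 0.9, "examples": 0.6, "coverage": 0.7},
    "Book B": {"clarity": 0.6, "examples": 0.9, "coverage": 0.8},
    "Book C": {"clarity": 0.4, "examples": 0.5, "coverage": 0.9},
}

ranking, totals = rank_books(feature_scores, weights)
print(ranking)
```

Changing the weights changes the ranking, so the weight assignment is where domain judgment enters.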


Reclusive methods are very effective in recommending TV programs [78], as the content of a
TV program can easily be traced through features such as the time of telecast and the
characters involved. Reclusive approaches can also be a solution to sparsity and cold start
problems to an extent, and authors have suggested them in music recommendation to overcome
these issues: in [108] reclusive approaches are proposed to overcome sparsity, while the
authors of [96] used reclusive methods to solve cold start issues. There are few music
systems developed to recommend music to a particular group [15].

2.3.3 Demographic Filtering based Recommender Systems

Recommender systems based on demographic filtering also use similarity measures. However,
instead of finding items rated similarly by neighboring users, they look for similarity
between users' demographic information such as age, sex and occupation. In this approach,
the system stores the demographic information of its customers, and whenever a new user
visits the merchandiser's site to purchase a product, the system computes the similarity
between the new user's demographic information and that of the stored customers. Based on
the preferences of a customer, the system recommends similar items to a new user whose age,
sex, occupation, etc. are similar to those of that customer. A typical recommendation
approach of a demographic filtering based recommender system is shown in Figure 2.3.

In the figure, four different users are shown. User 1 and user 2 are from the same region:
both are teenage students from France, i.e. the two users have almost the same demographic
values. User 3 and user 4, in contrast, are from different regions, have different
occupations and belong to different age groups; however, both are female. Thus user 3 and
user 4 differ significantly from each other as well as from user 1 and user 2. Hence, once
the purchase and demographic record of user 1 is stored, the system is likely to recommend
the same item to a new user (say user 2) who shares various or all aspects with user 1. It
is also important to decide which types of similarity between users are desirable. As seen
above, there is a significant difference between user 3 and user 4; however, both are
female, and hence they may make similar choices when buying certain products (like clothes,
food products, etc.).
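The four-user example can be turned into a minimal sketch, assuming that demographic similarity is simply the fraction of matching attributes. The concrete ages, occupations, countries and purchases are invented (the figure fixes only that users 1 and 2 are teenage students from France and that users 3 and 4 are female), and the age thresholds are arbitrary.

```python
def age_group(age):
    """Bucket an age into a coarse group (thresholds are arbitrary)."""
    if age < 20:
        return "teen"
    if age < 40:
        return "adult"
    return "senior"

def demographic_similarity(u, v):
    """Fraction of demographic attributes on which two users agree."""
    matches = [
        age_group(u["age"]) == age_group(v["age"]),
        u["sex"] == v["sex"],
        u["occupation"] == v["occupation"],
        u["country"] == v["country"],
    ]
    return sum(matches) / len(matches)

def recommend_for_new_user(new_user, customers):
    """Suggest the purchases of the stored customer most similar to new_user."""
    best = max(customers, key=lambda c: demographic_similarity(new_user, c))
    return best["purchases"]

# Stored customers with invented attribute values.
customers = [
    {"name": "user1", "age": 17, "sex": "M", "occupation": "student",
     "country": "France", "purchases": ["headphones"]},
    {"name": "user3", "age": 34, "sex": "F", "occupation": "engineer",
     "country": "India", "purchases": ["handbag"]},
    {"name": "user4", "age": 52, "sex": "F", "occupation": "doctor",
     "country": "Brazil", "purchases": ["cookbook"]},
]
# A new user with no purchase history.
user2 = {"name": "user2", "age": 16, "sex": "M", "occupation": "student",
         "country": "France"}

print(recommend_for_new_user(user2, customers))
print(demographic_similarity(customers[1], customers[2]))
```

Users 3 and 4 agree on one attribute out of four, which is the kind of partial similarity that may still be useful for some product categories.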


Figure 2.3: Demographic Filtering based Recommendation Approach

Thus, the choice of one user can be recommended to another on the basis of partial
demographic similarities. Demographic information can be useful in finding the category of
users whose choices are similar for certain objects. Krulwich et al. proposed
LifeStyleFinder [109], which uses 62 pre-existing clusters of users and makes
recommendations to a user on the basis of the other users who belong to the same cluster.
Pazzani [71] attempted to collect user information with minimal effort and classified users
using text classification. The author used hybrid techniques including collaborative,
content and demographic information for recommending restaurants, and concluded that
demographic methods can help in finding regularities in the descriptions of users who have
similar choices of restaurants. Content-based methods find regularities among the details
of the restaurants preferred by a particular user, whereas collaborative methods help in
finding correlations between users' ratings of a particular restaurant and their ratings of
other restaurants. The experiment demonstrated that the consensus-based method is more
effective than any of the individual methods discussed above. Usually, a demographic
filtering based recommendation technique, when hybridized with a collaborative or reclusive
recommender approach, is found to be


more effective. Such combinations are called hybrid techniques and are discussed later in
this section. They can work well and may solve cold start problems to an extent.

Laila et al. [110] presented a solution to the cold start problems that occur when the
rating history of the user is used as the basis for recommendation. They combined users'
demographic details with reclusive and collaborative approaches to provide recommendations
for new users with no prior preference and rating details. A significant impact of users'
demographic information on recommending research papers has been reported in [10]. Kim et
al. [81] suggested a demographic filtering based recommender system in which the filtering
is based on decision tree induction and machine learning techniques.

2.3.4 Knowledge based Recommender Systems

Recommender systems owe much of their emergence to the early use of collaborative filtering
methods; later, a good amount of work was contributed using reclusive methods too. The
early application of collaborative and reclusive approaches to recommendation technology
has given these two techniques a distinct identity in the categorization of recommender
systems. Since a recommender system is itself a knowledge-based approach, all the different
categories are based on knowledge filtering techniques. Reclusive and collaborative methods
are kept as separate categories because of their familiarity and dominance since the early
days of recommendation technology.

Apart from the above two techniques, collaborative and reclusive, any recommender technique
may by default be regarded as a knowledge-based approach. However, since demographic
filtering is based on the collaboration of users' demographic knowledge, it seems adequate
to keep it as a separate category. What differentiates knowledge-based systems from other
systems is the degree of importance they give to the following two domains:

a) the user's requirements

b) the characteristics of the recommended items

Expertise in these two areas helps in achieving users' satisfaction by fulfilling their
needs. An approach for building a recommender system that needs either an explicitly
defined set of recommendation rules or some sort of similarity measure derived from the
prior purchase history of the users is perceived as a knowledge-based approach for


recommender systems. It is important to know the knowledge sources when categorizing a
recommender system. It is more difficult to typify knowledge-based systems concisely than
collaborative or reclusive systems. However, recommender systems that use supplementary
knowledge sources not exploited by collaborative and reclusive recommendation can be
characterized as 'knowledge-based systems'. These systems depend heavily upon such
knowledge sources, while the other frequently used techniques do not.

Towle and Quinn [111] argued that additional information provided by the user could help in
overcoming sparsity-related problems as well as cold start problems. Hence, instead of
'rating'-based recommendation, which is an implicit approach, they suggested an explicit
model for recommendation. The authors identified three major obstacles to the success of
recommender systems: first, customers become reluctant to receive recommendations if the
recommendations are not consistently up to the mark; second, new items arrive constantly;
and third, not all products have the same characteristics. Thus, explicitly asking users
for their requirements and choices allows the system to be trained according to the users'
needs.

A knowledge-based approach is applied in [112] for recommending TV programs to a group.
Most RS need explicit rankings from the users, and merging these individual rankings into
one consensus ranking that suits all members of a group well is a tough job. The authors
proposed a method that learns the preferences of each family separately. On the one hand,
the method preserves the privacy of the family preferences; on the other hand, it adapts to
the changed preferences of a family. A classifier is applied to adapt to the preferences of
each family separately. The proposed method achieved a recall of 0.57 and a precision of
0.30, although not much description of the metadata was provided.

A similar approach was presented by Yu et al. in 2006 [18]. The authors provided a
recommendation mechanism for TV programs for a group by exploiting user profiles. The
selected strategy first merges all user profiles to construct a common user profile, and
then uses a recommendation approach to generate a common program recommendation list for
the group according to the merged user profile. Total distance minimization is used for
evaluation of the results. The system works well for a group of users viewing TV together.
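The merge-then-recommend strategy can be illustrated with a minimal sketch, assuming that each profile is a vector of genre preferences. The merge step used here is a plain average, chosen because the mean minimizes the sum of squared Euclidean distances to the individual profiles; it stands in for, and is not claimed to be, the merging procedure of [18]. Genres, profiles and program names are invented.

```python
import numpy as np

genres = ["news", "sports", "drama", "cartoon"]

# One row per group member: preference for each genre in [0, 1] (invented).
profiles = np.array([
    [0.9, 0.8, 0.1, 0.0],
    [0.2, 0.1, 0.9, 0.3],
    [0.1, 0.6, 0.4, 0.6],
])

# Step 1: merge the individual profiles into one common profile.
merged = profiles.mean(axis=0)

# Step 2: score programs against the merged profile and rank them.
programs = {
    "Evening News":   [1, 0, 0, 0],
    "Football Final": [0, 1, 0, 0],
    "Family Drama":   [0, 0, 1, 0],
    "Cartoon Hour":   [0, 0, 0, 1],
}
scores = {name: float(np.dot(merged, vec)) for name, vec in programs.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

print(dict(zip(genres, np.round(merged, 3))))
print(ranking)
```

The single merged profile means that one recommendation list serves the whole group, at the cost of hiding strong individual dislikes.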


For tourism recommendation, aspects like the charm of a place, ease of accessibility and
accommodation, and well-furnished restaurants are often seen as important factors. Entrée,
a FindMe-driven system, was proposed by Burke et al. [113], [114] to recommend restaurants
using knowledge-based approaches. The authors combined concepts from several
knowledge-based retrieval strategies to fetch the desired information. The RentMe system,
which follows the guidelines of the FindMe systems, was designed for recommending rental
apartments in Chicago.

To explore the best-suited restaurant locations for a group of people, a recommender system
called 'Pocket-Restaurant-finder' is suggested in [115]. It incorporates the choices of the
members of the group. Furthermore, the application can help group members in real life and
has been designed to run at any kiosk to help a group find a restaurant that suits their
mood. Another system, the Collaborative Advisory Travel System (CATS), has been presented
as a solution for the recommendation of holidays. The system also indicates the area where
these holidays can be taken.

SPETA [21] is a recommender system that behaves like a guide, providing services to
tourists by observing their past preferences and locations. The system makes use of
user-associated information such as current and past locations and preferences. The
information extracted for the user is integrated with innovative techniques to provide
pleasant experiences to tourists. E-learning courses have been recommended with different
techniques in [12], [13], [116]–[118]. In these works, the authors proposed methodologies
for recommending courses to graduate students at university level and in online learning
environments. A course recommendation approach for the Open University of China is proposed
in [117]. In [119], the authors used a machine learning technique to recommend courses for
newly enrolled students.

Further, two different types of knowledge-based systems are reported in the literature
[120], [121], [122], [114], [123]:

a) Case-based recommender systems

b) Constraint-based recommender systems

These recommender systems are described below.


2.3.4.1. Case-based Recommendation:

In the case-based approach, recommendation is largely perceived as a problem of evaluating
the resemblance of a product to the user's preferences. The approach is somewhat similar to
the reclusive approach in the sense that both need detailed descriptions of the products'
features. These features are, in turn, matched with the user's preferences to best suit the
user's requirements and provide a high level of user satisfaction. Since the requirements
and preferences of users are often not well defined, similarity assessment helps in raising
the standard of the recommendations; this is why the case-based approach has gained great
success in e-commerce [124].

Let us consider an example [125]. If I go to a market to buy a refrigerator, the seller may
or may not be acquainted with my preferences, depending upon whether I have made a purchase
there before. Since a refrigerator is rarely bought twice, the seller is most likely not
aware of my preferences. Now, if descriptions of the products are available, such as the
manufacturer, the size of the refrigerator, its color, power consumption and warranty
duration, they help the seller in offering the item closest to the customer's choice.
Suppose I am offered an item whose similarity to my preferences is high, but I dislike the
color. 'Everything is fine, but could you please show me a blue one?' would be my request
to the shopkeeper, asking for an item with similar features but in blue. With this
additional explicit knowledge, an exact recommendation can be made with less effort and
time. This is what case-based recommendation does in recommending items to users: it treats
recommendation primarily as a similarity-assessment problem. How can the system find the
product that is most similar to what the user has in mind, with the understanding that what
counts as similar will often involve domain-specific knowledge and considerations?
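The refrigerator dialogue can be sketched as similarity-based retrieval followed by a critique. The case library, the attribute names, the normalization ranges and the equal weighting of attributes are all assumptions made for illustration; a real system would encode domain-specific similarity knowledge.

```python
# Assumed value ranges used to normalize numeric differences (hypothetical).
NUMERIC_RANGES = {"capacity_l": 400.0, "power_kwh": 300.0, "warranty_years": 5.0}

def similarity(query, case):
    """Average per-attribute similarity over the attributes named in the query."""
    total = 0.0
    for attr, wanted in query.items():
        if attr in NUMERIC_RANGES:
            diff = abs(case[attr] - wanted) / NUMERIC_RANGES[attr]
            total += max(0.0, 1.0 - diff)
        else:
            total += 1.0 if case[attr] == wanted else 0.0
    return total / len(query)

def recommend(query, cases, critique=None):
    """Return the most similar case that also satisfies the critique, if any."""
    critique = critique or {}
    candidates = [c for c in cases if all(c[k] == v for k, v in critique.items())]
    if not candidates:
        return None
    return max(candidates, key=lambda c: similarity(query, c))

# Invented case library of refrigerators.
cases = [
    {"id": "F1", "brand": "A", "capacity_l": 300, "color": "silver",
     "power_kwh": 250, "warranty_years": 2},
    {"id": "F2", "brand": "A", "capacity_l": 310, "color": "blue",
     "power_kwh": 260, "warranty_years": 2},
    {"id": "F3", "brand": "B", "capacity_l": 500, "color": "blue",
     "power_kwh": 400, "warranty_years": 1},
]
query = {"brand": "A", "capacity_l": 300, "power_kwh": 250, "warranty_years": 2}

first = recommend(query, cases)                             # closest overall
second = recommend(query, cases, critique={"color": "blue"})  # "show me a blue one"
print(first["id"], second["id"])
```

The critique narrows the candidates without forcing the customer to restate the whole query, which mirrors the shopkeeper exchange above.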

Decision trees have been used extensively in product recommendation. McSherry [126]
proposed treating the decision tree as an identification method that identifies an item as
an object for recommendation and stores it in the case library as a single case. The author
tried to reduce the complexity of acquiring explicit knowledge from the user for case-based
recommendation. In another work [127], McSherry discussed how recommender systems are
affected by


incremental query elicitation. Generally, obtaining additional knowledge from the users
hinders obtaining a quality solution. The author explains the conditions under which the
dialogue can be stopped without any loss in solution quality. It is suggested that a
destination-oriented technique, in which a number of cases gets dominance over the target
case, would provide a better solution. Further, it is noticed that the strategy also costs
less in computation. The results are evaluated on the Travel case library (TCL), a standard
benchmark that contains more than 1,000 cases, and the method is found to reduce the
average length of the dialogue more than other methods.

An explanation that tells the user why recommendations have been made attracts users and
may satisfy them to a good extent [128]. With this principle in mind, the authors do not
try to justify a specific suggestion but rather explain the reason for the suggestion. This
philosophy also helps users discover further opportunities when the recommended items
dissatisfy them. Compound critiques are designed to serve as a form of feedback. The
authors claim that explanation-rich critiques improve recommendations for users.

2.3.4.2. Constraint based Recommendation:

To understand constraint-based recommendation, let us consider an example of how
recommendations are made for web hosting services [121]. Users are required to provide
their personal preferences regarding cost, bandwidth, visitor count, etc. The recommender
makes suggestions to the users and explains the reasons for each recommendation on the
basis of the stated preferences. If the recommender cannot find any solution, an
alternative must be offered in order to save the users from reaching a dead end. The above
example illustrates constraint-based recommendation well [129]. In these recommender
systems, both the features of the products and the association of the user's requirements
with these features are modeled in the form of constraints. Constraint-based approaches
help in purchasing items that are not bought frequently. Constraint-based recommenders also
support customers in dead-end situations where no solution exists, by automatically
suggesting repair options and explaining technical details of the items' features.
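The web hosting example, including the dead-end case, can be sketched as constraint filtering with a simple repair step. The hosting plans, the requirement thresholds and the repair strategy of dropping one constraint at a time are assumptions made for illustration; they are not taken from [121] or [129].

```python
# Invented product catalogue.
plans = [
    {"name": "Basic",    "cost": 5,  "bandwidth_gb": 50},
    {"name": "Pro",      "cost": 15, "bandwidth_gb": 500},
    {"name": "Business", "cost": 40, "bandwidth_gb": 2000},
]

# User requirements expressed as named constraints over a plan.
requirements = {
    "max_cost":      lambda p: p["cost"] <= 10,
    "min_bandwidth": lambda p: p["bandwidth_gb"] >= 400,
}

def satisfying(items, constraints):
    """Items that fulfil every constraint."""
    return [i for i in items if all(check(i) for check in constraints.values())]

def recommend(items, constraints):
    """Return (matches, repairs). Repairs map a dropped constraint to the
    items that become available without it; None if matches exist."""
    matches = satisfying(items, constraints)
    if matches:
        return matches, None
    repairs = {}
    for name in constraints:
        relaxed = {k: c for k, c in constraints.items() if k != name}
        found = satisfying(items, relaxed)
        if found:
            repairs[name] = found
    return [], repairs

matches, repairs = recommend(plans, requirements)
for dropped, options in (repairs or {}).items():
    print(f"No plan fits; without '{dropped}':", [p["name"] for p in options])
```

Reporting which requirement has to be given up doubles as an explanation of why no plan matched.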


The application of constraint-based recommenders to financial services is presented in
[130]. Another financial application of constraint-based recommendation is reported in
[131]. The authors of [132] present an approach to enhance recommendation for multimedia:
an additional component visualization feature is combined with constraint-based
recommendation, which enables users to interact with the virtual product directly.
Visualization functionalities contribute substantially to user-friendly interfaces,
boosting the acceptance of recommenders.

Knowledge-based recommender technologies [130] enable customers and sales executives to
identify appropriate products and services. These knowledge engineering techniques are also
useful for complex, high-involvement products such as cars, computers or financial
services. The authors presented the VITA (Virtualis Tanacsado) financial services
recommendation environment, which has been deployed for the Fundamenta building and loan
association in Hungary.

The effective integration of configuration system development with industrial

software development is crucial for a successful implementation of a mass

customization strategy. On the one hand, configuration knowledge bases must be easy

to develop and maintain due to continuously changing product assortments. On the

other hand, flexible integrations into existing enterprise applications, e-marketplaces

and different facets of supply chain settings must be supported. The authors of [131]
designed a model-driven architecture (MDA) for model development and interchange, and
argued how the industrial configuration can serve as a foundation for standardized
configuration knowledge representation, thus enabling knowledge sharing in heterogeneous
environments.

The problems of recommendation for DVR and catch-up TV are addressed by the methods
proposed in [133]. The challenges and solutions regarding personalization in this setting
are explained in this work with illustrations. The author concludes that some content is
consumed sequentially and that seasonal dynamics are observed for such content. If new
content arrives just after the broadcast of other content, it leads to a dynamic stream of
data, and similar data may be repeated across different services simultaneously.

2.3.5 Hybrid Recommender Systems

Though Collaborative Filtering (CF) and Reclusive Methods (RM) are the most frequently
used techniques in designing Recommender Systems (RS), they do not adequately explain
why specific recommendations have been made to a particular user, and hence they fall
short in the various scenarios where an explanation is required. These shortcomings of
the two leading technologies can be overcome by combining them. Various combinations
of these techniques have been presented in the literature and are termed
'hybrid techniques'. We have categorized hybrid recommender systems into seven types
based on the different combinations:

i) Hybrid Recommender Systems based on Collaborative Filtering (CF) dominated Reclusive Method (RM)

ii) Hybrid Recommender Systems based on RM dominated CF techniques

iii) Hybrid Recommender Systems based on unified RM and CF techniques

iv) Hybrid Recommender Systems based on Subsequent Integration of separately applied CF techniques and RM

v) Hybrid Recommender Systems based on Integration of CF and RM with knowledge based system (KBS)

vi) Other Hybrid Recommender Systems using CF techniques

vii) Other Hybrid Recommender Systems using RM

Work incorporating these combinations is wide-ranging and has been applied in various
domains. The hybridization techniques are described in the following sections.

2.3.5.1. Hybrid Recommender Systems based on Collaborative Filtering dominated

Reclusive Method

Incorporating components from CF and RM yields a hybrid recommender system, which
helps in dealing with the shortcomings mentioned above. Researchers have explored the
problems that occur frequently with these two techniques, namely overspecialization
and cold start, and have proposed hybrid techniques that combine them in various
suitable ways, giving rise to a new approach to recommender systems. It has been
noticed that various aspects which should be retained when designing recommender
systems are ignored; neglecting these aspects may dissatisfy users, so that the
ultimate goal of the recommender system is not achieved. The advantage of a hybrid
system is that it integrates the collaborative and reclusive approaches, taking the
best of the two while avoiding the drawbacks of either [134].

Mooney et al. [99] have presented an effective methodology combining content
and collaboration. Content is used to enhance the user data, whereas the
recommendations are personalized through collaborative filtering. The hybrid system
performs better than a pure CF technique or a pure Reclusive Method [108].

Content and collaboration are elegantly combined in [108]. The authors use a
reclusive approach to design a feature-based predictor that boosts the user profile,
and then apply CF techniques to provide personalized recommendations. They show that
the Collaborative Filtering dominated reclusive method outperforms pure reclusive
recommendation, pure collaborative filtering, and a simple hybrid approach.
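The two-stage combination reported in [108] can be illustrated with a small sketch. It is not the authors' implementation: the toy ratings, the binary item features, the similarity-weighted content predictor and every function name below are assumptions made here purely for illustration. A reclusive (content-based) predictor first fills the unrated entries of each user profile, and a CF step then predicts from the resulting dense profiles.

```python
from math import sqrt

# Toy data (hypothetical): ratings[user][item] on a 1-5 scale; absent = unrated.
ratings = {
    "u1": {"i1": 5, "i2": 4},
    "u2": {"i1": 4, "i3": 2},
    "u3": {"i2": 5, "i3": 1, "i4": 4},
}
# Hypothetical binary content features per item (e.g. genre flags).
features = {
    "i1": [1, 0, 1],
    "i2": [1, 0, 0],
    "i3": [0, 1, 0],
    "i4": [1, 1, 1],
}


def cosine(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = sqrt(sum(x * x for x in a)), sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def content_predict(user, item):
    """Reclusive step: feature-similarity-weighted mean of the user's own ratings."""
    pairs = [(cosine(features[item], features[j]), r)
             for j, r in ratings[user].items()]
    total = sum(w for w, _ in pairs)
    if total == 0:  # no similar rated item: fall back to the user's mean rating
        return sum(r for _, r in pairs) / len(pairs)
    return sum(w * r for w, r in pairs) / total


def boosted_profile(user):
    """Keep actual ratings; fill every unrated item with the content prediction."""
    return {i: ratings[user].get(i, content_predict(user, i)) for i in features}


def cf_predict(user, item):
    """CF step: similarity-weighted mean over the other users' boosted profiles."""
    profiles = {u: boosted_profile(u) for u in ratings}
    items = sorted(features)
    target = [profiles[user][i] for i in items]
    num = den = 0.0
    for other, prof in profiles.items():
        if other == user:
            continue
        w = cosine(target, [prof[i] for i in items])
        num += w * prof[item]
        den += w
    return num / den if den else profiles[user][item]


print(round(cf_predict("u1", "i4"), 2))
```

Because every profile is dense after the content step, the CF step can compare any two users even when they have no co-rated items, which is the motivation for boosting the profile before collaborating.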

Good et al. conclude that CF techniques can be combined with content-based agents,
which in turn gives better recommendations than any other combination or either
technique alone would produce. They designed the system so that users need not choose
the best agents themselves; instead, the CF framework recommends the best ones for
them [45].

A clustering technique is presented in [135] as a solution to the cold start
problem. Clustering strategies are applied within item-based CF, and the idea of
integrating content information into CF is explained in detail. The authors used
MovieLens data for their experiments, and the results showed an improvement on the
cold start problem.

A method for incorporating features of human personality into the traditional
rating-based approach of CF systems is presented by Hu and Pu [136]. A rating-based
system usually computes the similarity of a user's preferences with those of its
neighbors, so a new user may not receive good recommendations because too little is
known about their past preferences. Combining human characteristics with CF
techniques provides better recommendations for new users whose past rating
preferences are not well formed.



2.3.5.2. Hybrid Recommender Systems based on Reclusive Method dominated Collaborative

Filtering Techniques

RM dominated CF techniques are those hybrid systems which incorporate CF
techniques into reclusive approaches: the basic philosophy of the reclusive approach
is retained and collaborative techniques are applied on top of it. A text filtering
technique which combines collaborative and content methods is presented in [137],
where a latent semantic technique is used for storing user profiles. RM dominated CF
techniques perform better than simple reclusive approaches.

Collaborative filtering methods are treated as a base technology in recommendation;
they make recommendations based on other users' preferences. By contrast, the
reclusive approach makes recommendations from details about the item itself, and can
therefore recommend items which have not previously been rated by users. Combining
the two techniques lets CF strengthen the user profile and thereby boost the
recommendation process [108]. The authors have presented results demonstrating that
RM dominated CF techniques can give correct recommendations [99].

A combination of RM and CF [138] is used to recommend TV programs to viewers
in Ireland and Britain by collecting their ratings and reviews. The authors discuss
a content personalization system which selects the most suitable content for an
individual using an RM dominated CF approach. The key to addressing the issue is the
use of learned user profiles. The combination provides a robust personalization
solution.

2.3.5.3. Hybrid Recommender Systems based on unified Reclusive Method and Collaborative Filtering Techniques

Several authors have integrated CF and RM in many ways [54]. Coalescing the two
methods into one, however, is proposed by Ansari et al. [139]. As major online
retailers such as Amazon, eBay and Yahoo! use CF or Reclusive Methods to recommend
various products and services to their users, unifying the two could enhance their
recommendation strategies [66]. The authors describe a Bayesian model that sorts out
preferences by categorizing the information into five different types of knowledge
associated with the characteristics of the recommender system concerned. Markov chain
Monte Carlo methods are used to recommend movies, and the model works in all
circumstances, whether or not CF techniques can be employed. Thus, in general, a
recommender system can predict whether a particular product or service falls into a
user's preference category, and can in addition predict the movies a user would be
interested in. An inductive learning method which incorporates the characteristics of
the artifact, used by the recommender system in making predictions, is proposed
in [141].

A novel ontology-based approach to recommending on-line academic research papers,
which helps to improve user profiling, is discussed in [142]. The authors collect
feedback on users' profiles through profile visualization. The ontology of research
paper topics supports the categorization of the papers, which in turn serves as a
base for collaborative recommendation. Users with similar preferences in browsing
research papers of the same interest are recorded, and recommendations are made
accordingly.

A unified probabilistic framework for collaborative and reclusive recommendation,
which extends Hofmann's aspect model [144], is discussed in [143]. The method
combines item content with the user and item data generated by the data source
itself, and provides a solution for recommendation when the data are sparse.

2.3.5.4. Hybrid Recommender Systems based on Subsequent Integration of separately

applied Collaborative Filtering Techniques and Reclusive Methods

Separate implementations of RM and CF are applied in [137] and [145], where the
authors discuss the improvement in the quality of recommendation. Kim et al. [146]
proposed a book recommender system, designed for an online community, to validate
their method. They tried to satisfy the minority members of a group, who are left
unsatisfied because their preferences differ even though the majority may be
satisfied.

Recommendation technologies have been useful for recommending courses as well as
for utilizing course data in other library management programs. In [147] the authors
use academic courses to generate data for library planning purposes.

Billsus and Pazzani proposed the induction of hybrid user models, in which the hybrid
model comprises separate models for RM and CF techniques. A detailed description of
the implementation of these algorithms, addressing the issues emerging in
recommendation technology, is given in [148]. CF and RM are also integrated and
discussed from the perspective of restaurant recommendation in [71], which
illustrates the advantages of the combination over a separate implementation of
either technique.

2.3.5.5. Hybrid Recommender Systems based on Integration of Collaborative Filtering and Reclusive Methods with Knowledge based System

The combination of CF techniques and RM with knowledge based systems (KBS) has been
reported in the literature. The authors of [149] have suggested a recommendation
technique which identifies sets of rules and deduces recommendations from these
rules; this recommender system provides accurate and cheap clinical examinations to
patients. A recommender system which provides personal health information to users is
designed by Lee et al. [26]. It uses the profiles of users and provides information
accordingly, for better services to patients. Wiesner et al. [150], noting that
physical activity is very important for fitness and health, have designed a physical
activity recommender system that suggests useful exercise times. Recommendation of
nutritious diets is suggested in [25], where user ratings for the nutrition needed
are used to provide nutritious diets to users accordingly. The usage of RS in health
has been explored by Fernandez et al. [151], who discuss RS and their extensive uses
in the domain of health and care in detail.

A recommender system for e-business which introduces computational ecologies is
suggested in [84]. The system supports negotiation-based recommendation, which has
also inspired an ecosystem monitor [152].

For private banking, a multi-investment recommender system, 'PB-ADVISOR', has been
developed. The system addresses recommendation with explanation; in addition, it
generates several packages and is able to suggest the best services for customers
with appropriate explanations [153].

2.3.5.6. Other Hybrid Recommender Systems using Collaborative Filtering Techniques

Knowledge-based (KB) and collaborative-filtering (CF) recommender systems have both
contributed to online recommendation, helping users find products close to their
choices among a huge and varied volume of data. Burke (1999) has described the pros
and cons of the two in detail [123] and outlined the prospects of a hybrid
collaborative and knowledge-based recommender system. In the suggested methodology,
the knowledge-based and CF techniques complement each other: the KB technique
bootstraps the CF engine, and the CF engine filters the KB recommendations.
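The division of labour between the two components can be sketched as a simple cascade. This is a minimal illustration under stated assumptions, not the system described in [123]: the restaurant catalogue, the precomputed CF scores, and the function names `kb_candidates` and `recommend` are all hypothetical.

```python
# Hypothetical catalogue: item -> attributes the knowledge base reasons over.
catalog = {
    "bistro": {"cuisine": "french", "price": 3},
    "trattoria": {"cuisine": "italian", "price": 2},
    "pizzeria": {"cuisine": "italian", "price": 1},
    "brasserie": {"cuisine": "french", "price": 2},
}
# Hypothetical CF output (predicted ratings) per user; new users have none.
cf_scores = {"alice": {"trattoria": 4.6, "pizzeria": 3.1, "bistro": 2.0}}


def kb_candidates(cuisine, max_price):
    """Knowledge-based step: keep items meeting the stated requirements, cheapest first."""
    hits = [name for name, attrs in catalog.items()
            if attrs["cuisine"] == cuisine and attrs["price"] <= max_price]
    return sorted(hits, key=lambda n: (catalog[n]["price"], n))


def recommend(user, cuisine, max_price):
    """Cascade: KB produces candidates, CF re-ranks them when it has data."""
    candidates = kb_candidates(cuisine, max_price)
    scores = cf_scores.get(user)
    if not scores:
        # Cold start: the KB result alone bootstraps the recommendation list.
        return candidates
    # CF filters the KB output; the stable sort keeps KB order for unscored items.
    return sorted(candidates, key=lambda n: -scores.get(n, 0.0))


print(recommend("alice", "italian", 2))  # CF re-ranked
print(recommend("bob", "italian", 2))    # KB order only (new user)
```

A new user such as the hypothetical "bob" still receives a list, because the knowledge-based step needs no rating history; once ratings accumulate, the CF scores take over the ordering.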

A film recommender agent expands and fine-tunes collaborative-filtering results

according to filtered content elements - namely, actors, directors, and genres. This

approach supports recommendations for newly released, previously unrated titles.

Directing users to relevant content is increasingly important in today's society with its

ever-growing information mass [96]. Tang et al. [154] have suggested QoS-based
service recommendation using a hybrid technique which combines the CF method with a
location aware approach.

The use of CF techniques with temporal dynamics is studied in [155]. A hybrid
recommender system comprising CF techniques and a graph based model is presented
in [156]. According to the experimental results, the graph-based approach is superior
to other methods; in other words, the information conveyed by other techniques is
already captured by the graph based model. The authors have performed an extensive
evaluation.

2.3.5.7. Other Hybrid Recommender Systems using Reclusive Method

CF and RM are the base technologies in recommendation. Most systems either use both
techniques in some combination or combine one of them with other methods such as
knowledge based or context aware approaches. Since the reclusive approach uses
content to extract the features of the items under consideration, combining
reclusive methods with knowledge based approaches makes a powerful combination.
Personalized recommendation can help provide users with items matching their
preferences out of the huge volume of data available. A hybrid method which combines
the reclusive approach with knowledge-based methods to enhance recommendation
performance is presented in [157]. Explicit and implicit feedback is taken from the
users for the recommendation process. Optimized weight vectors and a preference
matrix (PM) are used to exploit implicit and explicit attributes, respectively. The
hybrid system gives better results in reducing the cold-start and sparsity problems.

The reclusive approach can recommend products related to what users have already
searched for or visited, but cannot make predictions about products with no past
record. In many cases, however, users may wish to purchase a new item they have
never seen before, as unfamiliar items may be of interest to them. This is termed
serendipity. Incorporating a serendipitous recommendation strategy into reclusive
methods alleviates the overspecialization problem in recommendation [158]. The
authors suggest a hybrid recommendation approach, comprising reclusive and
serendipitous approaches, to recommend surprising new items to users.

In [23] the authors use semantic web structure and text mining techniques to inform
users of the risks that may arise if they remain unaware of them; these risks are
publicized on social networks and similar channels. A hybrid music recommendation
system which handles the issues encountered with collaborative and reclusive
approaches is reported in [159]. The authors utilize both the ratings and the content
of the data by means of a Bayesian network. The approach solves the collaborative
approach's inability to recommend music for which no ratings have been recorded, and
also resolves the issue of artist variety. Latent variables are used to explore the
solutions [16].

2.3.6 Context Aware Recommender Systems

A context aware recommender system (CARS) can be perceived as a special kind of
knowledge based system in which context is the knowledge required for
recommendation. However, the strong inclination of the recommender system research
community towards recommender systems for learning compels us to keep CARS as a
separate category rather than a type of KBS.

The ultimate goal of a recommender system is to achieve user satisfaction, and users
can only be satisfied if they are given exactly the recommendations that meet their
needs. A user's requirements are not static and may vary from time to time depending
on various social and other factors affecting their purchasing trends. Hence,
Figure 2.4 shows a context aware recommender system (CARS), which takes into account
the context in which a user looks for a specific item. Different varieties of items
can also affect the user's demand significantly.

We illustrate this with the example in Figure 2.5. Suppose a user needs to buy
clothes from a clothing merchant; the type of clothing demanded obviously depends on
the season and the weather. In winter the user will ask for woolen clothes. A
reclusive approach would then always recommend woolen-like clothes to that user,
irrespective of the context. A collaborative approach would look at the preferences
of the user's neighbors, and in this scenario the probable recommendation would again
be clothes similar to woolen ones.

Figure 2.4: Context Aware Recommender Systems Overview

Figure 2.5: Example for Context Aware Recommender Systems using season based clothes

Context aware recommendation is necessary to understand the user's subtle
preferences and to handle the complications in their requirements explicitly.
Figure 2.5 shows how a CARS would take care of the user's choices. A context aware
system explores the situation in which the user's purchase is observed and filters
the recommendations accordingly. Thus, in summer, a CARS would not recommend woolen
clothes to the user, whereas in the same scenario the other systems may exhibit a
false positive error. A false positive is the recommendation of an item that did not
need to be recommended and is not preferred by the user.
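The seasonal clothing example can be made concrete with a small sketch of contextual pre-filtering, one common way of using context in a recommender. The candidate list, the scores, the season tags and the function names below are hypothetical and serve only to show how ignoring context produces false positives while filtering on context removes them.

```python
# Hypothetical output of a context-free recommender:
# (item, predicted score, seasons in which the item is appropriate).
candidates = [
    ("woolen sweater", 0.92, {"winter"}),
    ("wool scarf", 0.88, {"winter"}),
    ("cotton shirt", 0.75, {"summer", "spring"}),
    ("linen trousers", 0.60, {"summer"}),
]


def context_free(top_n):
    """Rank purely by score, as a reclusive or collaborative system would."""
    ranked = sorted(candidates, key=lambda c: -c[1])
    return [name for name, _, _ in ranked[:top_n]]


def context_aware(season, top_n):
    """Contextual pre-filtering: drop items unsuited to the context, then rank."""
    fitting = [c for c in candidates if season in c[2]]
    ranked = sorted(fitting, key=lambda c: -c[1])
    return [name for name, _, _ in ranked[:top_n]]


def false_positives(recommended, season):
    """Recommended items that do not suit the current season."""
    suits = {name: seasons for name, _, seasons in candidates}
    return [name for name in recommended if season not in suits[name]]


print(false_positives(context_free(2), "summer"))
print(false_positives(context_aware("summer", 2), "summer"))
```

In this toy setting the context-free list for a summer shopper consists entirely of woolen items, both of which are false positives, whereas the context-aware list contains none.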

In mobile environments, various contexts need to be considered when making
recommendations, such as weather, time, route, location, and means of
transportation. In such scenarios, recommendations should be designed to be context
aware in order to guide users on the move. Context-based travel-related information
for mobile systems is proposed in [160], where restaurants in Taipei are recommended
on a mobile device.

Woerndl et al. [161] incorporate context into recommender systems to make them
applicable in the mobile domain. The approach makes users aware of what their
neighbors have installed on their mobile devices, so that they may receive
recommendations for their own devices accordingly.

Although various techniques have been classified as separate categories of RS, we
classify the following recommendation procedures as part of context aware
recommendation, since they are contexts which may affect the recommendations from
both the seller's and the buyer's point of view.

Location aware RS

Trust aware RS

Temporal RS



Brunato and Battiti [162] recognized the needs of pilgrims and suggested a
mobility-aware recommendation system that uses the location of the users. The
authors compute a preference metric which answers users' queries about the resources
they need while making a pilgrimage. Mobility scenarios are introduced to obtain more
appropriate and more reliable predictions of user requirements.

Levandoski et al. [163] utilized location based ratings for recommendation and
presented 'LARS', a location-aware recommender system. User partitioning is used to
exploit the location based ratings. The technique produces quality recommendations
and, in addition, maximizes the scalability of the system [164].

Yang et al. [165] identified the needs of both customers and sellers in promotional
selling and presented a location based recommender system for online shopping, which
gives the best recommendations by fetching location-dependent sales and promotions.
Tang et al. [154] presented a location aware system which also incorporates
collaborative techniques to recommend QoS web based services. The web
recommendations are made by collaboration based on users' locations.

A map-based customized RS built on Bayesian Networks (BN) is proposed in [166]. The
system utilizes contextual knowledge including location and time. Contexts such as
weather and user requests, collected automatically from mobile devices, are used to
recommend appropriate items which match users' preferences.

Temporal recommender systems recommend items to users when time must be kept as an
essential component of the decision-making process. A system is designed in [167] to
recommend ranked cafes to customers according to their preferences, derived from
knowledge of those preferences, the characteristics of the cafes, specific
situations and requirements, as well as the time of the intended recommendation.

Queue Lee et al. [168] suggested a collaborative filtering-based recommender
system using implicit feedback. Since the system does not use explicit feedback, it
relies upon pseudo ratings derived from implicit feedback. The time of a user's
purchase and the launch time of an item are used to construct pseudo rating
matrices, which in turn increase recommendation accuracy.
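How purchase and launch times might be turned into pseudo ratings can be sketched as follows. The exact weighting used in [168] is not reproduced here: the rule that more recent purchases and more recently launched items receive higher pseudo ratings, the 180-day buckets, the toy purchase log and all names are assumptions introduced for illustration only.

```python
from datetime import date

# Hypothetical purchase log (user, item, purchase date); no explicit ratings exist.
purchases = [
    ("u1", "phoneA", date(2018, 3, 1)),
    ("u1", "phoneB", date(2018, 9, 1)),
    ("u2", "phoneA", date(2018, 8, 15)),
]
# Hypothetical launch dates of the items.
launch_dates = {"phoneA": date(2017, 10, 1), "phoneB": date(2018, 7, 1)}
TODAY = date(2018, 10, 1)  # fixed reference date so the sketch is reproducible


def recency_bucket(day, today=TODAY, period_days=180, buckets=3):
    """Map a date to 1..buckets; the highest bucket is the most recent period (assumed rule)."""
    age = (today - day).days // period_days
    return max(1, buckets - age)


def pseudo_rating_matrix():
    """Assumed scoring: purchase-recency bucket plus launch-recency bucket."""
    matrix = {}
    for user, item, bought in purchases:
        score = recency_bucket(bought) + recency_bucket(launch_dates[item])
        matrix.setdefault(user, {})[item] = score
    return matrix


print(pseudo_rating_matrix())
```

The resulting user-item matrix can then be fed to any standard CF algorithm in place of explicit ratings.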

Lathia et al. [169] have shown how temporal diversity can affect recommendation,
especially the behavior of CF techniques. Since users' ratings serve as the base of
CF techniques, their work shows that CF data changes over time and that a user may
not rate an item each time he/she comes to shop online.

The authors in [170] presented a hybrid recommender system that incorporates not
only the demographic details of users but also temporal information. The results of
the experiments support the claim that temporal knowledge may enhance performance.

2.3.7 Social Network based Recommender Systems

RS applied in social networking environments have been extensively studied and
presented in detail by Zhou et al. [53]. The authors explore the pros, cons and
opportunities of social network based RS.

An overview of the Foafing the Music system is presented in [14], [171]. The system
uses text from RDF Site Summary (RSS) and Friend of a Friend (FOAF) data, and
recommends music that matches a user's listening tastes. Music information is
collected from RSS feeds, music related blogs, upcoming albums and 'mp3' audio files
at different music sites. The system discovers music with the help of user profiling
and of context-based information and descriptions supported by ontological details
of the music domain.

Hu has presented a new paradigm of recommender systems in which the RS makes use of
social network (SN) based information. This information can be the preferences
observed for users, the usual inclination of users towards a product or service, and
influenced and influencing entities such as friends and acquaintances. A
probabilistic model is designed to personalize suggestions from these inferences,
and real data are extracted from online SNs. From the experiment, the author
concludes that there is a strong similarity in the preferences of friends.
Experimental results on this dataset show that the proposed system improves
performance [172].

The use of social networks, where the private or personalized data of an individual
is easily accessible, for recommending items to new users is presented in [136].
Human personality characteristics are integrated with the users' ratings in the
recommendation process.

A social network based recommender system which exploits trading relationships has
been proposed in [173]. The system provides ways to compute the degree of
recommendation for trusted online auction sellers. The authors utilize the network
structure formed by the history of the transactions performed by users.



2.3.8 Soft Computing Techniques based Recommender Systems

Soft computing techniques are increasingly used in recommender systems for
collaborative, reclusive and hybrid recommendations. To deal with the uncertainty in
various business marketing affairs, Cornelis et al. [174] use fuzzy relations to
model the degree of similarity between items and users. They also proposed a novel
hybrid CF-CB approach whose rationale is concisely summed up as “recommending future
items if they are similar to past items that similar users have liked”. A hybrid
fuzzy logic-based recommendation framework [175] was then developed to improve the
trade exhibition recommender system for e-government. Zhang et al. [176] have
developed a telecom recommender system, referred to as the Fuzzy-based Telecom
Product Recommender System (FTCP-RS), which applies fuzzy set techniques to item
based similarity approaches for recommending mobile products and services.

A soft computing technique is applied in [35] to the recommendation of books for
university graduates; the authors incorporate the vagueness in book preferences and
aggregate the scores of the books using the ordered weighted averaging (OWA)
technique. Similar work using an ordered ranked approach is suggested in [5].
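For reference, the OWA operator aggregates n scores by sorting them in descending order and taking a weighted sum in which each weight is attached to a rank position rather than to a particular source. The sketch below illustrates the operator only; the sample book scores and the quantifier Q(r) = r^alpha used to generate the weights are assumptions made here, not the settings used in [35].

```python
def owa(scores, weights):
    """Ordered weighted average: weights apply to positions after a descending sort."""
    if len(scores) != len(weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("need one weight per score, with weights summing to 1")
    ordered = sorted(scores, reverse=True)
    return sum(w * s for w, s in zip(weights, ordered))


def quantifier_weights(n, alpha):
    """Weights from the quantifier Q(r) = r**alpha: w_j = Q(j/n) - Q((j-1)/n)."""
    return [(j / n) ** alpha - ((j - 1) / n) ** alpha for j in range(1, n + 1)]


# Hypothetical normalized scores that one book received from four sources.
book_scores = [0.9, 0.4, 0.7, 0.6]

print(owa(book_scores, [1, 0, 0, 0]))               # behaves like max
print(owa(book_scores, [0, 0, 0, 1]))               # behaves like min
print(owa(book_scores, [0.25] * 4))                 # plain arithmetic mean
print(owa(book_scores, quantifier_weights(4, 2)))   # emphasis on the lower scores
```

Choosing the weight vector moves the aggregate between the maximum and the minimum of the scores, which is how the operator can express how optimistic or pessimistic the aggregation should be.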

A hybrid fuzzy-genetic approach is proposed in [177] to address sparsity and
scalability. CF techniques face issues of both accuracy and scalability, which arise
with memory based and model based CF techniques, respectively. The proposed system
reduces sparsity and complexity while retaining the neighbor-based recommendation
perspective.

A fuzzy logic based RS is presented in [175] as a solution to issues that CF
techniques encounter with items which are brought to market rarely and are not
necessarily put on sale repeatedly. The fuzzy technique employed captures the
uncertainty in the information. The method can be helpful in scenarios such as trade
exhibition recommendation.



Table 2.3: Recommender Systems, Categories and Techniques

| S.No | Type of Recommender System (RS) | Sub-category | Techniques | Research Papers |
|---|---|---|---|---|
| 1 | Collaborative Filtering (CF) based RS | Item based; User based | Association rule mining between preferences of neighbor of users, Rating, Choice of individuals for varied items, Similarity in the preferences of different users for common items, Tagging | [178], [42], [57], [148], [179], [180], [181], [94], [85], [96], [179], [182]–[185], [8], [9], [30], [43], [165] |
| | | Model based | Bayesian networks, clustering, Machine learning, Graph modeling | [88], [68], [88], [41], [79], [80], [81], [84], [86], [83], [85], [31], [82], [87] |
| 2 | Reclusive Methods (RM) based RS | Heuristic method | Rule induction, nearest neighborhood, Rocchio's algorithm, tagging, rating, etc. | [98], [134], [178], [55], [101], [187]–[191], [81], [94], [95], [96], [97] |
| | | Model based techniques | Bayesian networks, clustering, Machine learning, Graph modeling | [189], [89], [192]–[194], [195], [91], [196], [98], [71], [99], [100], [101], [102], [103] |
| | | Web mining | Opinion mining, web usage mining, etc. | [104], [61], [105], [106], [39], [38], [40], [107], [1], [78], [108], [96], [15] |
| 3 | Hybrid recommender systems | CF dominated RM; RM dominated CF; CF and RM coalesced into one; Subsequent Integration of separately applied CF and RM | Techniques of CF, RM applied with each other in different combinations | [47], [197], [135], [48], [198], [199], [200], [201], [202], [203], [204], [96], [145], [183] |
| | | Integration of CF and RM with (KBS); Integration of CF with other than RM; Integration of RM with other than CF | Techniques of CF and RM are applied with KBS, and other fuzzy, social network, etc. | [205], [123], [206]–[209], [202], [19], [8], [183], [197], [199] |
| 4 | Demographic filtering based RS | - | Correlation, similarity measures, etc. | [109], [209], [210], [211], [109], [212], [71], [110], [213], [214] |
| 5 | Knowledge based Recommender System (KBS) | Constraint based; Case based | Machine learning, Bayesian network, AI, etc. | [121], [215], [125], [216], [126], [217], [122], [218], [127], [128], [111], [114], [219]–[221], [120], [7], [222], [220], [119] |
| 6 | Context Aware Recommender System | Location aware, Temporal, Trust aware | User feedback, AI techniques, machine learning, etc. | [223], [224], [225], [226], [161], [8], [163], [154], [164], [227], [228], [167], [170], [168], [155], [229]–[231] |
| 7 | Social network based RS | Foafing, trade relationship, etc. | Similarities measures, user profiling, etc. | [232], [233], [234], [18], [172] |
| 8 | Soft Computing techniques based RS | Fuzzy genetics, fuzzy linguistics | OWA, ORWA, fuzzy model, etc. | [5], [35], [235], [174], [236], [237], [25], [177] |

The problem with CF techniques and RM is that both fail to represent and explain the
relationship between users' feedback and the features of items, as these are
subjective and uncertain. The authors of [237] have presented a fuzzy set theoretic
method (FTM) which applies the fuzzy method presented by Yager [89]. The FTM makes
use of aggregation to find a confidence score for each recommendation. The technique
also utilizes various statistical measures to evaluate the RS.

The authors of [238] suggest how to automatically recommend newly launched items,
which have no prior ratings, to users. The new items are recommended using only the
users' past purchase history. A combination of Bayesian networks and fuzzy set
theory is used to enhance system performance.



2.4 Summary

A comprehensive survey of recommender systems is presented in this chapter. From the study conducted in the chapter, we conclude that there has been an exponential growth of research in the field of RS and that researchers have shown great interest in this area. The applications of RS cover diverse fields of daily life, including academia, health care, and business through e-commerce and e-shopping sites.

Various techniques have been used to meet the demands of these applications. We have categorized 8 different types of RS, which are further broken into 19 sub-categories based on the techniques and filtering algorithms used. Collaborative Filtering (CF), the most influential recommender technique, has been widely used by researchers but still fails to produce satisfying solutions due to major drawbacks such as the cold start problem for new users and sparsity, as discussed earlier in this chapter. The next most widely used technique in the literature is the 'Reclusive Method' (RM), or 'Content based filtering'; this technique suffers from the same complications. No single technique can sensibly be considered a solution to these problems; instead, a hybrid approach may fulfill the requirement. Thus, a more robust hybrid method which incorporates the best of these techniques without being affected by their worst may produce satisfying results.

It is evident that, prior to designing a recommender system, one must understand the characteristics of recommendations that please the users. Users' feedback directly reflects their priorities, likes and dislikes. Therefore, explicit or implicit feedback from the users is of practical importance, both for learning the characteristics of their past preferences and for predicting their future behavior. A recommendation procedure for books, which exploits feedback from users (experts from the universities are considered as users), is discussed extensively in Chapter 3. The procedure provides a recommendation on a consensus basis, which overcomes the prevailing issues with the RM and CF techniques.

In section 2.3.8, it is discussed how soft computing has emerged and how its employment in recommendation technology is increasing rapidly. In Chapter 4, a method is discussed which utilizes soft computing techniques to raise the satisfaction level of users.

It is also revealed that the rating scale proves to be a handicap on several occasions, for example when no ratings are available or when the rating scale lacks a standard. Opinion mining could give better recommendations where a rating scale might not do well. It could provide recommendations by finding users' requirements from their reviews and matching them with the characteristics of the product, hence recommending suitable items to users. An opinion mining based recommendation approach is proposed and described in Chapter 5. The proposed approach is believed to be a realistic one, adequate for the users' satisfaction.


Chapter 3

Link Mining based Book Recommendation

Approach

3.1 Introduction

The development of modern tools and technologies, and the inclination of the new generation towards education, have made the demand for information exchange very high. The huge amount of information available on a specific topic creates confusion for people who are seeking the right source to obtain correct information. Books and research articles, whether online or offline, are the sources for obtaining information. Hence, it is an important task to filter the sources to find the desired books. There are millions of e-books available on the Internet and the numbers are increasing rapidly; this rapid increase has created a high demand for a recommendation technique that finds the exact, desired book.

There are a good number of works in the area of product recommendation [2], and various methods are used in recommendation techniques. Collaborative filtering and content-based recommendation are the most frequently used recommendation techniques found in the literature. Due to some serious problems that collaborative filtering faces, researchers switched to Web mining techniques for product recommendation problems. Web usage mining, a process of extracting useful patterns from web usage data, is considered the most applied branch of web mining and has attracted researchers in the recent decade [239].

Generally, recommender systems use customers' preferences and assume that several customers have the same taste and may like similar products; however, this is not always the case. Where academic books are concerned, the selection is not a matter of fun and must not depend upon students' choice; it should be handled with utmost care and decided by experts. Therefore, a specialized way of recommendation, in which authorities' recommendations are considered, would be advisable and fruitful. While recommending books, it seems adequate to ask for the opinions of experts at universities instead of common people, so that the above issues can be avoided.

Thus, instead of applying a personalized recommendation approach, it seems adequate to make use of group recommendation technology, in which a single ranked recommendation can be one answer to several simultaneous queries [31]. With the above discussion in consideration, we have suggested a ranked recommendation approach for books which aggregates the rankings of the top universities (which are considered as authorities) and employs a link mining approach in the recommendation process. On the one hand it handles the cold start issue, and on the other hand it eases the complexity of personalized recommendation to a huge number of users by replacing it with a single ranked recommendation.

In this chapter, we have recommended top books for university students in the Indian scenario. That is why we have chosen top ranked universities from India. The selection of top ranked universities is based upon the QS World University Rankings; QS is one of the leading rankers of academic institutions. Once the top institutions are identified, their syllabi for the particular subject are searched, and these serve as the base for the recommendation of books.

Figure 3.1: An overview of link mining approach

[Figure: web pages A, B, C, E, F, G, H and I connected by forward links and backward links]



The Positional Aggregation Scoring (PAS) technique [34] is highly advisable for aggregating a final result from these types of data. Instead of searching for desired books for a user among thousands of available books, it seems more appealing to find the best amongst the books prescribed by ranked universities, as this reduces the data overload, reduces the complexity and increases the authenticity of the products.

We have chosen the top universities amongst the Indian universities and checked their recommendations for different courses of computer science; it is evident that the recommendation of a book by a highly ranked university increases the importance of the recommended book. In this way the philosophy of link mining is incorporated in this chapter. Apart from the PAS technique, there are other aggregation operators which may prove effective. The primary advantage of the adopted technique is that it includes the recommendations of top ranked universities as well as the rank of the rankers, i.e. the universities, which are the authorities for the academic program, to recommend books for university students. Figure 3.1 depicts an example of link mining.

The main contributions of this chapter are as follows:

We have presented a book recommendation method based on the aggregation of experts' decisions. Thus the problem of book recommendation is converted into a decision making problem.

The Positional Aggregation based Scoring technique is implemented for the recommendation process. To the best of our knowledge, we are the first to use this concept for book recommendation.

Section 3.3 deals with the details of data collection, the types of dataset and the experimental results, with a comprehensive discussion highlighting the various aspects of the results. Finally, we summarize in section 3.4.

3.2 Book Recommendation using Positional Aggregation based Scoring

Technique

We are concerned with the different books prescribed in the syllabi of top ranked universities. The prescribed books may be considered as the ranking of the books by that particular university. As discussed in section 3.3.1, the syllabi of the respective universities differ significantly. Thus, each university gives us a partial list of the books recommended in its syllabus. We have aggregated the ranked books to obtain a full list of books with an aggregated ranking. For full lists, there are several well-known methods such as Borda's method [240], Markov chain based methods [241] and soft computing based methods, but these techniques work for full lists only [240], [242]. Therefore we have applied the Positional Aggregation Score (PAS) based technique [34], [243], which works better for recommending the top books from partial lists.
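To make the full-list restriction concrete, the following is a minimal sketch of a Borda count in Python. It is an illustration only, not the formulation of [240]: the function name `borda_scores`, the point scheme (an item earns one point for every item ranked below it) and the three example rankings are assumptions made for this sketch.

```python
def borda_scores(rankings):
    """Borda count over full rankings (each list is best-first).

    Every ranking must contain exactly the same items; an item earns
    one point for each item placed below it in a ranking.
    """
    items = set(rankings[0])
    m = len(items)
    scores = {item: 0 for item in items}
    for ranking in rankings:
        # A partial list has no defined position for the missing items,
        # so this sketch refuses it instead of guessing.
        if len(ranking) != m or set(ranking) != items:
            raise ValueError("Borda count needs full lists over the same items")
        for position, item in enumerate(ranking):
            scores[item] += m - 1 - position
    return scores


# Hypothetical full rankings of three books by three rankers.
full_lists = [["B1", "B2", "B3"], ["B2", "B1", "B3"], ["B1", "B3", "B2"]]
borda = borda_scores(full_lists)
```

With partial lists, such as syllabi that name only a few of the books, the position of an unlisted book is undefined, which is the situation the PAS technique is meant to handle.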

3.2.1 Positional Aggregation based Scoring Technique

The PAS based technique is used to aggregate data which have been ranked by several users and which hence contain different positions for the same item in different rankings. R. Ali [243] has used it to evaluate a RS designed for product recommendation. The PAS technique works as follows:

Suppose 'm' different books are recommended by 'n' different universities. First we find the rank of each book 'Bi' for every university. We assign the maximum value (Vmax = -1) to the book which is best ranked, i.e. the first ranked book is assigned the value '-1'. The idea behind assigning '-1' to the best ranked book is to give it the highest value, with the values of all ranked books following the order of their ranking, i.e. a better ranked book must have a higher numerical value. For the next values, we assign (Vmax - i) to the (i+1)th best ranked book. These steps are repeated until all the ranked books are assigned a value. If a book is not ranked, it is assigned the value '-(m+1)', where m is the total number of books ranked by the different universities.

Now, for each university, we compare each book 'Bi' with each of the other 'm-1' books 'Bj'. If the value of Bi is greater than the value of Bj, the comparison (Bi, Bj) is assigned 1; otherwise it is assigned 0. If the value of Bi is -(m+1), i.e. Bi is not ranked, every comparison of Bi is again assigned 0. The sum of the comparison values of Bi for a university is the preference score of Bi for that university; dividing it by (m-1), the number of comparisons, gives the normalized score. Thus we get n normalized scores for every book, one per university; we call their sum S. The final score 'FS' is given by (S / n).

3.2.2 Book Recommendation Approach using Positional Aggregation Scoring

The working of PAS is described in section 3.2.1. The same procedure is used for recommending books. The following example illustrates the approach in detail.

Example 3.1: We take an example with four books and five universities to illustrate the above procedure, i.e. m = 4 and n = 5. U_1, U_2, U_3, U_4 and U_5 are five different universities and B1, B2, B3 and B4 are four different books. The sequence 1, 2, 3 and 4 gives the ranking positions of the books, i.e. the row with '1' in the first column gives the first ranked book of each university. All the universities have their own rankings of the books; these rankings are given in Table 3.1.

Let a cell 'z' be represented by z (r, c), where r and c represent the rth row and cth column respectively. The value of z (1, 1) is B1, i.e. B1 is ranked first by University U_1. z (3, 4) is '-', which implies that no book except B2 is recommended by University U_4 in the first four positions. In Table 3.2, the rank to score conversion is illustrated, where the best ranked book is assigned '-1'. As B1 is the first ranked book of University U_1, the cell corresponding to B1 and S (U_1) has -1, where S (U_1) denotes the scores assigned to the respective books by U_1. A book which is not ranked by a university is assigned the value '-5', i.e. -(m+1), for that university.

Table 3.3 gives the pairwise comparison of the respective books. A pair (Bi, Bj) = 1 implies that book Bi is preferred over book Bj by the university. If (Bi, Bj) = 0, book Bi is not preferred over book Bj by the university concerned, i.e. either Bj is preferred or Bi is not ranked. The column PwC (U_1) gives the pairwise comparison of the books for U_1. The sum of the values of all the comparisons of each book, for every university, is shown in Table 3.6. This value is termed the preference score, hence the notation PS (U_1) in the first column, i.e. the preference of a book by U_1, and so on.

Table 3.1: Top 4 ranked books by 5 universities

Rank positions U_1 U_2 U_3 U_4 U_5

1 B1 B2 B3 B2 B4

2 B2 B4 B1 - B1

3 B3 - B4 - -

4 - - - - -

Table 3.2: Conversion of Rank into Scores

S (U_1) S (U_2) S (U_3) S (U_4) S (U_5)

B1 -1 -5 -2 -5 -2

B2 -2 -1 -5 -1 -5

B3 -3 -5 -1 -5 -5

B4 -5 -2 -3 -5 -1
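The rank to score conversion of Table 3.2 can be sketched in a few lines of Python. This is a minimal illustration under the rules stated in section 3.2.1 (rank 1 maps to -1, rank 2 to -2, and an unranked book maps to -(m+1)); the function name `ranks_to_scores` and the dictionary layout are assumptions made for the sketch.

```python
def ranks_to_scores(ranked_books, all_books):
    """Convert one university's partial ranking (best first) to scores.

    The book at position p (1-based) gets -p; a book the university
    did not rank gets -(m + 1), where m is the total number of books.
    """
    m = len(all_books)
    scores = {book: -(m + 1) for book in all_books}
    for position, book in enumerate(ranked_books, start=1):
        scores[book] = -position
    return scores


books = ["B1", "B2", "B3", "B4"]

# Rankings of Table 3.1, best first; '-' cells are simply omitted.
table_3_1 = {
    "U_1": ["B1", "B2", "B3"],
    "U_2": ["B2", "B4"],
    "U_3": ["B3", "B1", "B4"],
    "U_4": ["B2"],
    "U_5": ["B4", "B1"],
}

score_table = {u: ranks_to_scores(r, books) for u, r in table_3_1.items()}
```

Each entry of `score_table` corresponds to one S (U_k) column of Table 3.2.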



Table 3.3: Pairwise comparison of books

Pair (Bi, Bj) PwC (U_1) PwC (U_2) PwC (U_3) PwC (U_4) PwC (U_5)

B1,B2 1 0 1 0 1

B1,B3 1 0 0 0 1

B1,B4 1 0 1 0 0

B2,B1 0 1 0 1 0

B2,B3 1 1 0 1 0

B2,B4 1 1 0 1 0

B3,B1 0 0 1 0 0

B3,B2 0 0 1 0 0

B3,B4 1 0 1 0 0

B4,B1 0 1 0 0 1

B4,B2 0 0 1 0 1

B4,B3 0 1 0 0 1

The value of B1 for U_1 comes out to be 3, which means university 'U_1' prefers book 'B1' over the remaining three books. There are (m-1) comparisons for each book; hence the values obtained in Table 3.6 can be normalized by dividing by (m-1) = 3.
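The pairwise comparison and normalization steps can be sketched as follows. This is a minimal Python illustration that takes the S (U_1) and S (U_2) columns of Table 3.2 as input; the function names `preference_scores` and `normalize` are assumptions made for the sketch.

```python
def preference_scores(scores):
    """Count, for each book, how many other books it strictly beats.

    scores maps a book to its score for one university. A comparison
    (Bi, Bj) is 1 only when the score of Bi is strictly greater, so an
    unranked book (lowest score) always ends up with 0.
    """
    books = list(scores)
    return {
        b: sum(1 for other in books if other != b and scores[b] > scores[other])
        for b in books
    }


def normalize(pref):
    """Divide each preference score by the number of comparisons, m - 1."""
    m = len(pref)
    return {book: value / (m - 1) for book, value in pref.items()}


# Score columns S (U_1) and S (U_2) from Table 3.2.
s_u1 = {"B1": -1, "B2": -2, "B3": -3, "B4": -5}
s_u2 = {"B1": -5, "B2": -1, "B3": -5, "B4": -2}

ps_u1 = preference_scores(s_u1)
ps_u2 = preference_scores(s_u2)
nps_u1 = normalize(ps_u1)
```

Here `ps_u1` and `ps_u2` correspond to the PS (U_1) and PS (U_2) columns of Table 3.6. Note that `nps_u1` holds exact fractions such as 2/3, whereas Table 3.4 lists them truncated to two decimals (0.66).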

Table 3.4: Normalized preference score of books

NPS (U_1) NPS (U_2) NPS (U_3) NPS (U_4) NPS (U_5)

B1 1 0 0.66 0 0.66

B2 0.66 1 0 1 0

B3 0.33 0 1 0 0

B4 0 0.66 0.33 0 1



Algorithm 3.1: Positional Aggregation based Scoring of books

Preliminaries:

Total number of books is 'm'.

Total number of different universities is 'n', hence a total of 'n' rankings is available.

For each book Bi, 1≤i≤m, we have the ranked position of Bi in every ranking Rk, 1≤k≤n, i.e. we have a matrix with m rows and n columns, represented as R[i,k] where 1≤i≤m and 1≤k≤n.

Steps:

I: Repeat steps 1 to 7 for every ranking Rk

1: Find the rank of each book 'Bi', where 1≤i≤m

2: Assign the maximum value (Vmax = -1) to the book which is best ranked

3: For the next values, assign (Vmax - i) to the (i+1)th best ranked book

4: If a book is not ranked, assign it the value -(m+1)

Repeat steps 2 to 4 until all the books are assigned a value; store these values in a matrix SM[i,k]

5: Compare each book 'Bi' with each of the remaining 'm-1' books 'Bj', 1≤j≤m, j≠i, where PC[i,j,k] denotes the pairwise comparison (Bi, Bj) for ranking Rk:

If SM[i,k] > SM[j,k]

PC[i,j,k] = 1;

else

PC[i,j,k] = 0;

6: Find the preference score matrix PSM[i,k] such that

$PSM[i,k] = \sum_{j=1,\, j \neq i}^{m} PC[i,j,k]$

7: Create the normalized matrix NM[i,k] = PSM[i,k] / (m-1)

8: Find the positional aggregation based score of each book Bi, 1≤i≤m, as:

$PAS[i] = \frac{1}{n} \sum_{k=1}^{n} NM[i,k]$



Figure 3.2: Positional Aggregation Scoring based Book Recommendation System



Table 3.5: Positional Aggregated scores of books

Book Positional Aggregation Score

B1 0.464

B2 0.532

B3 0.266

B4 0.398

Table 3.6: Preference score of books

PS (U_1) PS (U_2) PS (U_3) PS (U_4) PS (U_5)

B1 3 0 2 0 2

B2 2 3 0 3 0

B3 1 0 3 0 0

B4 0 2 1 0 3

Table 3.7: Ranked books based on Positional Aggregation Scoring technique

Ranked position Book

1 B2

2 B1

3 B4

4 B3

The normalized score is given in Table 3.4; we call it the normalized preference score. Finally we get the Positional Aggregated Score of each book by summing its values in Table 3.4 and dividing by the number of universities 'n', here n = 5. The values are given in Table 3.5. The PAS values are then sorted to find the top books, as shown in Table 3.7. The above calculation is summarized in Algorithm 3.1. A pictorial representation of the working of book recommendation using the PAS based technique is presented in Figure 3.2.
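The final aggregation step can be checked with a short Python sketch. It takes the normalized preference scores exactly as printed in Table 3.4 (i.e. truncated to two decimals), averages them over the n = 5 universities and sorts the result; the names `nps`, `aggregate` and `ranking` are assumptions made for this illustration.

```python
# Normalized preference scores as printed in Table 3.4,
# one list per book, ordered U_1 ... U_5.
nps = {
    "B1": [1, 0, 0.66, 0, 0.66],
    "B2": [0.66, 1, 0, 1, 0],
    "B3": [0.33, 0, 1, 0, 0],
    "B4": [0, 0.66, 0.33, 0, 1],
}


def aggregate(normalized):
    """Average each book's normalized scores over all universities."""
    return {book: sum(values) / len(values) for book, values in normalized.items()}


pas = aggregate(nps)

# Sort books by descending PAS to obtain the recommended order.
ranking = sorted(pas, key=pas.get, reverse=True)
```

Under these inputs, `pas` reproduces the values of Table 3.5 and `ranking` gives the order of Table 3.7 (B2, B1, B4, B3). Using exact fractions (2/3 instead of 0.66) changes the scores slightly in the third decimal place but not the order.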



3.3 Results and Discussions

In the previous sections of this chapter, we have described the recommendation procedure and presented the generalized steps for implementing the recommendation scheme. In this section, the results of the recommendation by the techniques applied in the previous discussions are presented and discussed in detail. The methods and steps of data collection, data filtering, the datasets, and the pros and cons of the methods are also discussed.

3.3.1 Dataset

We are concerned with recommending books for university graduates of Indian universities. Initially, all the different books that could be a part of the curriculum of the universities were taken, with neither any criteria nor any limit on the inclusion of books. We then decided to filter the data in a way that could fulfil our objective, i.e. presenting the top ranked books of a course to the students. For this, it seems adequate to include the top ranked institutions and their recommended books. This step filters the data and also helps in explaining why the methodology (section 3.2) is chosen for the recommendation process. Also, only 'computer science' is selected as a subject from these universities/institutions, because once we have a method for presenting the top books for one specific subject, it can easily be extended to all other subjects. Hence, different courses of computer science such as Discrete Mathematics, Data Structure, etc., which are offered at almost all the top institutions, have been included. The methods of selection of universities and courses, and the final recommendation by the different techniques, are discussed in the subsequent sub-sections.

3.3.1.1 Selection of top Universities

The selection of top universities serves the purpose of recommending appropriate books for graduate students as far as our methodology is concerned. Although the basic idea is simple, it is very useful: the books recommended by top institutions should be more reliable for students than the books recommended by corporate recommender sites. Thus, we have chosen top universities from India and explored their recommended books. The recommended books of these universities are quantified and then aggregated using the proposed technique, with the help of some strong aggregating operators, which gives the top N books to be suggested.

Table 3.8: Top 7 Indian Universities in QS ranking [244]

Rank Position University Name

1 Indian Institute of Technology, Bombay

2 Indian Institute of Technology, Delhi

3 Indian Institute of Technology, Kanpur

4 Indian Institute of Technology, Madras

5 Indian Institute of Science, Bangalore

6 Indian Institute of Technology, Kharagpur

7 Indian Institute of Technology, Roorkee

There are several ranking sites and authorities which suggest top universities, courses and places for higher studies. Amongst all these rankers, the QS World University Rankings, in collaboration with Elsevier, is one of the leading and highly reliable sources of university rankings, covering about 40 subjects all around the globe. It ranks the institutions subject-wise, region-wise, course-wise, etc. The QS World University Rankings for computer science & information systems has only 7 institutions from India [244]. We have selected these institutions for the inclusion of books in our procedure. The top Indian institutions for 2015 are listed in Table 3.8.

3.3.1.2 Courses included from top Universities

The different institutions offer different courses to their enrolled students. There are very few courses having exactly the same title at these top institutions. However, we have tried our best to identify the courses that can be categorized as common to these institutions. Although there are subjects that differ across these universities and academic institutions, we have taken only those courses which are offered by all or most of the universities and whose syllabi are published on the respective websites or made available to their students. The list of the courses included in the proposed work, and the corresponding offering universities, is given in Table 3.9. The columns U_1, U_2 to U_7 represent the universities sorted by rank, i.e. U_1 is the best ranked university. Discrete Mathematics is offered at every university except the university ranked 3rd. More specifically, except IIT Kanpur, all the other top universities offer the course with a similar title, "Discrete Mathematics". In the same way, the table shows for each institute whether the courses are offered there or not. The marks in the table represent whether or not a syllabus is available for the course with the same title at the corresponding university. However, a few universities do not show the entire syllabus on their web site. We had to make an extra effort by contacting the faculty and students of these institutions; some of them were contacted in person, while others were asked by email, etc.

Table 3.9: Syllabus of various Courses, offered at top Universities.

Sequence Course Title U_1 U_2 U_3 U_4 U_5 U_6 U_7

1. Discrete Mathematics

2. Operating Systems

3. Theory of Computation

4. Computer Networks

5. Software Engineering

6. Compiler Design

7. Principles of Database Systems

8. Artificial Intelligence

9. Data Structure

10. Graphics

3.3.1.3 Prescribed books by top Universities:

Ten courses are listed in Table 3.9. The universities publish syllabi in which they explicitly list the prescribed books for students. Different courses have different numbers of books. The list of courses and the respective number of books, obtained by combining all the universities, is given in Table 3.10.

Table 3.10: Total number of books in the syllabus of corresponding courses in top Universities

Sequence Course Title Course Code Number of books

1. Discrete Mathematics DM 17

2. Operating Systems OS 15

3. Theory of Computation ToC 11

4. Computer Networks CN 18

5. Software Engineering SE 19

6. Compiler Design CD 10

7. Principles of Database Systems DB 13

8. Artificial Intelligence AI 20

9. Data Structure DS 15

10. Computer Graphics CG 20

In Table 3.10, the total number of books is given. Discrete Mathematics, represented by the code 'DM', has a total of 17 books from all the selected top 7 universities. It is also evident from Table 3.9 that only six universities have shown the details of the syllabus for the course 'DM' on their websites. Thus, these 17 books are from 6 universities. In the same way, the total books for all courses are collected from the universities which have made their syllabi available, the details of which are presented in Table 3.9. The maximum number of books is for Artificial Intelligence 'AI' and Computer Graphics 'CG'; both courses have a total collection of 20 books from these universities. However, not all 7 universities have published the syllabi of both courses on their websites: six and five universities are involved for the two courses respectively. The minimum number of books is for Compiler Design 'CD'.

Only 10 books are available, although 5 universities have made their syllabus available. The number is low because most of the recommended books are common to these universities, whereas in the case of CG there is a significant difference in the recommended books. Thus 158 different books for 10 different courses are included in our procedure.

3.3.2 Experimental Results

In this section, we discuss the final recommendations by all 5 different techniques implemented, which include PAS, OWA with the quantifiers 'at least half', 'as many as possible' and 'most', and ORWA. We have applied all the techniques which are discussed above.

Different ranked lists of books are obtained by using these techniques. The books are represented by unique course codes, e.g. the code 'CD' is used to refer to books on Compiler Design (see Table 3.10). In a similar way, the books of other courses have their own notation. Within each course, the books are numbered in sequence; for 'Compiler Design' the codes are CD1, CD2, etc. The details of the books on compiler design for which syllabi are available, including code, author, title and publisher, are listed in Table 3.11. The codes of the books and the corresponding ranks given by the respective universities are given in Table 3.12.

Table 3.11: Code and details for books on Compiler Design

Course Code | Author | Title | Publisher

CD.1. | Alfred V. Aho, Monica S. Lam, Ravi Sethi and Jeffrey D. Ullman | Compilers: Principles, Techniques, and Tools | 2/E, Addison-Wesley, 2007

CD.2. | Andrew Appel | Modern Compiler Implementation in C/ML/Java | Cambridge University Press, 2004

CD.3. | Dick Grune, Henri E. Bal, Ceriel J.H. Jacobs and Koen G. Langendoen | Modern Compiler Design | John Wiley & Sons, Inc., 2012

CD.4. | S. Muchnick | Advanced Compiler Design & Implementation | Indian Reprint, 2002

CD.5. | K Cooper, L Torczon | Engineering a Compiler | 2nd Ed., Morgan Kaufmann, 2011

CD.6. | KC Louden | Compiler Construction: Principles and Practice | Cengage Learning, 1997

CD.7. | D Grune et al. | Modern Compiler Design | Wiley, 2000

CD.8. | Michael L Scott | Programming Language Pragmatics | 3rd Ed., Morgan Kaufmann, 2009

CD.9. | Tremblay, J.P. and Sorenson, P.G. | Theory and Practice of Compiler Writing | SR Publications, 2005

CD.10. | Tremblay, J.P. and Sorenson, P.G. | Parsing Techniques: A Practical Guide | Ellis Horwood, 1998



Table 3.12: Ranked list of books on 'compiler design' by top universities

Rank Position U_1 U_2 U_3 U_4 U_5 U_6 U_7

1. CD1 CD2 CD1 CD1 CD1 - CD1

2. CD2 CD4 CD2 - CD4 - CD9

3. CD3 CD1 CD5 - - - CD5

4. - - CD6 - - - CD6

5. - - CD7 - - - CD10

6. - - CD8 - - - -

7. - - CD4 - - - -

8. - - - - - - -

9. - - - - - - -

10. - - - - - - -

From the table it is evident that U_1 has ranked book CD1 1st; CD2 and CD3 are ranked 2nd and 3rd respectively. As U_1 is the 1st ranked university, this implies that the top university has recommended only three books on compiler design for its students. The book CD1 is ranked by all the universities except U_6, which has not issued a list of books for this course. U_2 has ranked CD1 3rd and CD2 1st. However, U_3, U_4, U_5 and U_7 have all ranked CD1 1st. It is also observed that U_4 has recommended only one book. The PAS based ranking of books is obtained by applying the procedure illustrated in Example 3.1. To get the ranking of books for a course, say compiler design, we have considered all 10 books recommended by the top universities. Their ranks are numerically represented in Table 3.13, where R (U_1) indicates the rank given by U_1. These ranks are converted into scores and shown in Table 3.14. As described in Table 3.2 of Example 3.1, the best rank, i.e. '1', is assigned '-1', rank 2 is assigned '-2', and so on. A book which is not ranked by a university is assigned the lowest value, which is '-8' here. Thus the cells with value '0' in Table 3.13 are changed to '-8' in Table 3.14.

Table 3.13: Compiler design ranked books by top 7 Universities

Course Code R (U_1) R (U_2) R (U_3) R (U_4) R (U_5) R (U_6) R (U_7)

CD.1. 1 3 1 1 1 0 1

CD.2. 2 1 2 0 0 0 0

CD.3. 3 0 0 0 0 0 0

CD.4. 0 2 7 0 2 0 0

CD.5. 0 0 3 0 0 0 3

CD.6. 0 0 4 0 0 0 4

CD.7. 0 0 5 0 0 0 0

CD.8. 0 0 6 0 0 0 0

CD.9. 0 0 0 0 0 0 2

CD.10. 0 0 0 0 0 0 5



Table 3.14: Rank to Score conversion of book Compiler Design

Course Code S (U_1) S (U_2) S (U_3) S (U_4) S (U_5) S (U_6) S (U_7)

CD.1. -1 -3 -1 -1 -1 -8 -1

CD.2. -2 -1 -2 -8 -8 -8 -8

CD.3. -3 -8 -8 -8 -8 -8 -8

CD.4. -8 -2 -7 -8 -2 -8 -8

CD.5. -8 -8 -3 -8 -8 -8 -3

CD.6. -8 -8 -4 -8 -8 -8 -4

CD.7. -8 -8 -5 -8 -8 -8 -8

CD.8. -8 -8 -6 -8 -8 -8 -8

CD.9. -8 -8 -8 -8 -8 -8 -2

CD.10. -8 -8 -8 -8 -8 -8 -5

Table 3.15: Positional Score for book Compiler Design

Course Code PS (U_1) PS (U_2) PS (U_3) PS (U_4) PS (U_5) PS (U_6) PS (U_7) PAS

CD.1. 1 0.777 1 1 1 0 1 0.8252

CD.2. 0.888 1 0.888 0 0 0 0 0.3965

CD.3. 0.777 0 0 0 0 0 0 0.111

CD.4. 0 0.888 0.333 0 0.888 0 0 0.3012

CD.5. 0 0 0.777 0 0 0 0.777 0.222

CD.6. 0 0 0.666 0 0 0 0.666 0.1902

CD.7. 0 0 0.555 0 0 0 0 0.0792

CD.8. 0 0 0.444 0 0 0 0 0.0634

CD.9. 0 0 0 0 0 0 0.888 0.1268

CD.10. 0 0 0 0 0 0 0.555 0.0792

Table 3.16: Ranking of books on 'compiler design' using Positional Aggregation Scoring

Rank position PAS based ranking

1 CD.1.

2 CD.2.

3 CD.4.

4 CD.5.

5 CD.6.

6 CD.9.

7 CD.3.

8 CD.7.

9 CD.10.

10 CD.8.
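The compiler design results can be re-derived from the rank matrix of Table 3.13 with the following Python sketch of the whole PAS pipeline. It is a minimal illustration, not the implementation used in the thesis: the function name `pas_from_rank_matrix` and the data layout are assumptions, 0 marks an unranked book as in Table 3.13, and the unranked score of -8 is the value stated in the text (any value below the lowest ranked score gives the same comparisons).

```python
def pas_from_rank_matrix(rank_matrix, unranked_score):
    """Compute PAS per book from a rank matrix (0 means 'not ranked')."""
    codes = list(rank_matrix)
    m = len(codes)                      # number of books
    n = len(rank_matrix[codes[0]])      # number of universities
    totals = {code: 0.0 for code in codes}
    for k in range(n):
        # Rank r becomes score -r; unranked books get the lowest score.
        score = {
            c: (-rank_matrix[c][k] if rank_matrix[c][k] > 0 else unranked_score)
            for c in codes
        }
        for c in codes:
            # Preference score: number of books strictly beaten by c.
            wins = sum(1 for o in codes if o != c and score[c] > score[o])
            totals[c] += wins / (m - 1)  # normalized preference score
    return {c: totals[c] / n for c in codes}


# Rank matrix of Table 3.13, columns ordered U_1 ... U_7.
table_3_13 = {
    "CD.1": [1, 3, 1, 1, 1, 0, 1],
    "CD.2": [2, 1, 2, 0, 0, 0, 0],
    "CD.3": [3, 0, 0, 0, 0, 0, 0],
    "CD.4": [0, 2, 7, 0, 2, 0, 0],
    "CD.5": [0, 0, 3, 0, 0, 0, 3],
    "CD.6": [0, 0, 4, 0, 0, 0, 4],
    "CD.7": [0, 0, 5, 0, 0, 0, 0],
    "CD.8": [0, 0, 6, 0, 0, 0, 0],
    "CD.9": [0, 0, 0, 0, 0, 0, 2],
    "CD.10": [0, 0, 0, 0, 0, 0, 5],
}

cd_pas = pas_from_rank_matrix(table_3_13, unranked_score=-8)

# sorted() is stable, so books with equal PAS keep their table order.
cd_ranking = sorted(cd_pas, key=cd_pas.get, reverse=True)
```

The resulting order matches Table 3.16. The scores agree with Table 3.15 to about three decimal places; the small differences arise because the table works with truncated fractions such as 0.777 instead of 7/9.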



Table 3.17: PAS based Ranking of different books

Rank Position PAS based Ranking of different books

1 DM.9. AI.2. DS.1. DB.2. CG.2. SE.2. OS.2. CN.2. TOC.1.

5 DM.2. AI.11. DS.12 DB.3. CG.14. SE.18 OS.13. CN.7. TOC.3.

7 DM.12. AI.6. DS.7. DB.6. CG.4. SE.11 OS.10. CN.13. TOC.7.

10 DM.5. AI.15. DS.13 DB.13. CG.15. SE.12 OS.8. CN.18. TOC.10.

The final PAS is given in Table 3.15. The scores obtained by PAS lead to the ranking of books in the different disciplines. The ranking of books on compiler design is given in Table 3.16; CD1 has attained close to the maximum score from all the adopted methods. Its score by PAS is 0.8252.

In the ranking of books on 'Discrete Mathematics', DM.9 is placed at the 1st position and DM.1 at the second position, whereas DM.10 is ranked 3rd and DM.3 is ranked 9th. AI.1 is not even in the top 3 and is ranked 8th, whereas AI.2 is at the top position. If we observe the rankings of data structure books, the top 2 positions are acquired by the first two books ranked by top universities, i.e. DS.1 and DS.2. This variation indicates the importance of the authoritative recommendations. In most of the cases it is observed that the universities' recommendations have influenced the final recommendation significantly.

The books on database follow the same trend under the PAS technique: the top two
positions are taken by the first two books ranked by the top universities, i.e. DB.1
and DB.2. Similarly, the computer graphics (CG) books show variations similar to those
of the data structure books; however, the top 3 books recommended by the authorities
are also within the top 4 positions of the PAS ranking. In addition, three books that
lie beyond the top 10 positions in the universities’ ranking have secured positions
within the top 10 of the PAS ranking. Interestingly, for Software Engineering, SE.1 is
not ranked in any of the top 5 positions, whereas SE.2 is ranked at the top. Likewise,
for Operating Systems, OS.1 is not ranked in any of the top 5 positions, whereas OS.2
is ranked at the top.

Unlike the above two rankings, the ranking of books on Computer Networks
has CN.1 and CN.2 within the top three positions, which again suggests the importance of
authorities in ranking.

3.4 Summary

We have incorporated a link mining approach for the recommendation of books. The
syllabi of the top ranked universities are collected, and an aggregated ranking is
obtained with the help of a positional aggregation scheme, in which the recommendations
of the most valuable universities are treated as more preferred.

A rank aggregation algorithm, termed the Positional Aggregation based
Scoring (PAS) technique, is used for the recommendation of books. We believe the
proposed technique may meet the users’ needs and provide them with the books they
need. For the sake of illustration and ease of experiments, we have shown the
procedure considering books from top institutions. However, the procedure can be
generalized to the recommendation of any kind of item. The robustness of the
procedure may lead to a novel approach in the field of recommendation and may
serve a large number of users.


Chapter 4

Book Recommendation based on Soft

Computing Approaches

4.1 Introduction

The Internet is a gift of the modern era and a consequence of the proliferation
of modern technologies. The growth of the Internet has also boosted e-commerce, and
online shopping has become much more popular. Today it is common for people
to shop online using portals such as www.amazon.com. The boom in the Internet has
also caused data overload. The huge volume of data on the World Wide Web makes it
harder for users to extract the exact information they need, and a buyer finds it
extremely tough to locate the exact product he or she is looking for. While browsing
online shopping portals, multiple options are presented; however, picking the right
item is an arduous job. Researchers have proposed different recommendation techniques
to help customers purchase the right item.

Various efforts have been made towards easy and effective online shopping. In the last
few years, researchers have proposed a good number of recommendation techniques [2],
[3], [4]. An increase in the use of soft computing methods in recommendation
technologies, especially fuzzy based recommendation, has been recorded [177]. To solve
the various issues encountered in leading recommender systems, researchers have also
frequently used web mining. Link mining, regarded as a subset of web mining, is an
emerging research area [245], [246].

In this chapter, we propose a recommendation methodology that incorporates both
soft computing techniques and link mining to rank products and recommend them to
users. As in the previous chapter, we include the importance of the recommending
universities, i.e. important links are given priority, to incorporate link mining.
The soft computing methods are applied over the link mining results to obtain
recommendations that are precise and close to the users’ preferences.


Two soft computing based averaging mechanisms are applied: Ordered Weighted
Aggregation (OWA) and Ordered Ranked Weighted Aggregation (ORWA), a
modified OWA [5], [35]. OWA is a fuzzy based averaging operator which, in
combination with linguistic quantifiers, offers a range of options for decision
making problems. However, it does not take the voters or rankers into consideration.
By voters or rankers we mean the entities that recommend or suggest the items under
consideration; in our case, the items are books. The motive behind introducing ORWA is
to add the value of the rankers, in support of the philosophy that a recommendation by
a higher authority must be more valuable than one by a lower authority. In ORWA, a
specific weight is assigned to each ranking agent (ranker), in our case the
universities. The weight assignment method gives high weightage to the best ranked
university, and hence its rankings are evaluated with a higher degree of preference
than those of institutions with lower ranking. The primary advantage of the adopted
technique is that it includes the recommendations of the top ranked universities as
well as the rank of the rankers, i.e. the universities, which are the authorities for
recommending books to university students.

On the one hand, OWA utilizes different linguistic quantifiers to overcome the
problem of vagueness in the recommendation; on the other hand, ORWA
considers the top ranked universities by assigning them higher weights and makes
recommendations on the basis of a ‘value to voter’ mechanism. Using these two
soft computing mechanisms, the contributions of this chapter are as follows:

We have presented a book recommendation method based on the aggregation of
experts’ decisions. Thus the problem of book recommendation is converted into a
decision making problem.

A fuzzy technique based method using Ordered Weighted Aggregation
(OWA) is employed for book recommendation. As far as our search is
concerned, we did not find any book recommendation approach employing
this technique.

We have proposed an aggregation technique, Ordered Ranked Weighted

Aggregation (ORWA) which uses the rank of the rankers. The proposed

technique may be very useful for decision making problems where rankers



need to be taken into consideration. The ORWA is used for the

recommendation of books.

The rest of the chapter is organized as follows. Section 4.2 describes ordered weighted
aggregation (OWA) and its applications. Section 4.3 states the importance of the
technique, followed by the procedure of book recommendation using OWA.
The recommendation technique based on ORWA and its advantages are discussed in
Section 4.4 with suitable diagrams and examples. Section 4.5 deals with the
experimental results, with a comprehensive discussion highlighting various aspects of
the results obtained from the different experiments. Finally, the chapter is
summarized in Section 4.6.

4.2 Ordered Weighted Aggregation

Ordered Weighted Aggregation (OWA) is a fuzzy based aggregation approach for
handling uncertainty. Fuzzy techniques have been widely used for various
scientific and daily life problems. The Ordered Weighted Averaging (OWA) operator is a
well-known fuzzy based averaging operator introduced by R. Yager [247].
A variety of its applications have been presented in the literature [248], [249].
The author of [250] used OWA operator based fuzzy queries for web searching.
Researchers have also applied the OWA operator in several GIS environments [251],
[252], [253].

The ordered weighted aggregation operator is very useful for aggregating multiple
criteria [235]. Mathematically, OWA is given as:

$$\mathrm{OWA}(x_1, x_2, \ldots, x_n) = \sum_{k=1}^{n} W_k Z_k \qquad (4.1)$$

where $Z_k$ is the k-th largest of the values: re-ordering x1, x2, …, xn in descending
order gives the sequence z1, z2, …, zn, i.e. z1 ≥ z2 ≥ … ≥ zn-1 ≥ zn. The weights ‘Wk’
of the OWA operator are calculated using the following equation [33, 25]:

$$W_k = Q\left(\frac{k}{m}\right) - Q\left(\frac{k-1}{m}\right) \qquad (4.2)$$

where k = 1, 2, …, m.

Function Q(r) for relative quantifier can be calculated as:

$$Q(r) = \begin{cases} 0 & \text{if } r < a \\ \dfrac{r - a}{b - a} & \text{if } a \le r \le b \\ 1 & \text{if } r > b \end{cases} \qquad (4.3)$$

where Q(0) = 0, ∃ r ∈ [0, 1] such that Q(r) = 1, and a, b, r ∈ [0, 1]. By using
different linguistic quantifiers, i.e. different values of a and b, we obtain different
weights. For example, ‘most’ is a linguistic quantifier for which a = 0.3 and b = 0.8;
with this quantifier, those books are preferred which are recommended by most of the
universities. Each book is thus assigned a value, and the books are sorted and ranked
on these values. Similarly, ‘as many as possible’ and ‘at least half’ are other
quantifiers, for which the values of (a, b) are (0.5, 1) and (0, 0.5) respectively.
Graphical representations of these fuzzy linguistic quantifiers are shown in Figure 4.1,
Figure 4.2 and Figure 4.3 for ‘most’, ‘as many as possible’ and ‘at least half’,
respectively.

Figure 4.1: Most Quantifier

Figure 4.2: As many as possible quantifier



Figure 4.3: At least half quantifier

4.3 Book Recommendation based on Ordered Weighted Aggregation

This section illustrates the book recommendation approach based on OWA. Example 3.1
is considered, and the weights obtained in Example 4.1 are applied to it.

Example 4.1: For number of criteria m = 5 and parametric values a = 0 and b = 0.5,
the corresponding OWA weights are:

w(1) = 0.4, w(2) = 0.4, w(3) = 0.2, w(4) = 0.0, w(5) = 0.0.

In the same way, for a = 0.3 and b = 0.8 the weights obtained are: w(1) = 0.0,
w(2) = 0.2, w(3) = 0.4, w(4) = 0.4, w(5) = 0.0.

For a = 0.5 and b = 1.0 the weights obtained are: w(1) = 0.0, w(2) = 0.0,
w(3) = 0.2, w(4) = 0.4, w(5) = 0.4.
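The weights of Example 4.1 can be checked numerically. The following Python sketch is an illustration only, not the implementation used in this work. It assumes that the relative quantifier is linear between a and b, i.e. Q(r) = (r − a)/(b − a) for a ≤ r ≤ b, and the names `relative_quantifier`, `owa_weights` and `QUANTIFIERS` are hypothetical.

```python
def relative_quantifier(r, a, b):
    """Relative quantifier Q(r): 0 below a, 1 above b, linear in between (assumed form)."""
    if r < a:
        return 0.0
    if r > b:
        return 1.0
    return (r - a) / (b - a)


def owa_weights(m, a, b):
    """OWA weights W_k = Q(k/m) - Q((k-1)/m) for k = 1..m, as in equation (4.2)."""
    return [
        relative_quantifier(k / m, a, b) - relative_quantifier((k - 1) / m, a, b)
        for k in range(1, m + 1)
    ]


# (a, b) parameters of the three linguistic quantifiers named in the text.
QUANTIFIERS = {
    "at least half": (0.0, 0.5),
    "most": (0.3, 0.8),
    "as many as possible": (0.5, 1.0),
}

for name, (a, b) in QUANTIFIERS.items():
    # Rounded to 4 decimals to hide floating point noise.
    print(name, [round(w, 4) for w in owa_weights(5, a, b)])
```

For m = 5, the three printed weight vectors agree with the three cases listed in Example 4.1.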

To obtain the top ranked recommended books using relative quantifiers
with OWA as the operator, the above weights are applied to Example 3.1.
Using the weights obtained in Example 4.1, equation (4.1), and Table 3.5 of Chapter 3,
the ranked books for these quantifiers can be obtained. The rankings are shown in Table
4.1, Table 4.2 and Table 4.3, respectively.

Table 4.1: Ranked books using relative quantifier ‘most’

Rank Position Ranked Books

1 B2

2 B3

3 B1

4 B4



Table 4.2: Ranked books using relative quantifier ‘as many as possible’

Rank Position Ranked Books

1 B4

2 B2

3 B1

4 B3

Table 4.3: Ranked books using relative quantifier ‘at least half’

Rank Position Ranked Books

1 B2

2 B1

3 B3

4 B4

The whole recommendation process is based on the top universities in
the QS ranking and the books they recommend for students enrolled at their respective
campuses. In the PAS technique, we have not assigned any weights to the universities.
Hence, PAS can be perceived as an un-weighted aggregation of the scores assigned to
books using Algorithm 3.1. The books of each course are assigned scores using PAS.
By using OWA with different linguistic quantifiers, we can assign weights to the
universities.

4.4 Book Recommendation based on Ordered Ranked Weighted Aggregation (ORWA)

This section discusses the modification of OWA for situations where it is useful to
incorporate the value of the voters, and elaborates the detailed procedure of book
recommendation.

4.4.1 Ordered Ranked Weighted Aggregation (ORWA)

Ordered weighted aggregation (OWA) has been widely used in computational
intelligence because of its strength in modeling multi-criteria decision making
problems [253]. The OWA operator in which the weight assignment is guided by a
quantifier [254] is found to be effective for problems where the criteria are well
defined and the voters need not be taken into consideration. However, consider a case
where the value of the voters or rankers matters, i.e. a ranker A has some preference
over a ranker B. Their respective rankings should then be weighted accordingly, i.e.
the weight assigned to A should be higher than the weight assigned to B. For this
situation, we introduce a weight assignment formula for OWA and modify the basic OWA
formula to obtain the ordered ranked weighted aggregation (ORWA) operator, which
takes into account the importance of the ranking agents.

ORWA may be applied to several real life decision making
problems. In fact, ORWA may be useful in any decision making problem
where the recommendations are given by experts and the experts or rankers need to be
weighted. It could be helpful for aggregating voting results and for recommending
sports teams, universities and web sites, etc.

We have proposed the Ordered Ranked Weighted Aggregation operator to
recommend books; in it, the weight assignment procedure of OWA is modified so
that it makes use of the positional rank of the recommending agents. Keeping this
concept in consideration, we have chosen the top universities among the Indian
universities and investigated their recommendations for different computer science
courses. A recommendation of a book by a highly ranked university increases the
importance of the recommended book.

Aggregation weights ‘v’ are assigned to the different universities using the formula:

$$v_i = \frac{n - i + 1}{N} \qquad (4.4)$$

where n is the number of universities and N is given by
$N = \sum_{i=1}^{n} i = \frac{n(n+1)}{2}$, and i = 1, 2, 3, …, n. Here ‘i’ indicates
the i-th ranked university, i.e. i = 1 means the first ranked university, i = 2 the
second ranked university, and so on. The weights ‘vi’ fulfil the following conditions:

i. $v_i \in [0, 1]$

ii. $\sum_{i=1}^{n} v_i = 1$.

Further, vi is the weight assigned to the i-th ranked university, i.e. the best
ranked university is associated with the maximum weight and is hence preferred over
the lower ranked universities. Thus vi > vj for i < j; e.g. for five universities
U_1, U_2, U_3, U_4 and U_5, ordered from best to least ranked, we have
v1 > v2 > v3 > v4 > v5.



4.4.2 Book Recommendation based on ORWA

The book recommendation approach based on ORWA is illustrated in this section.
Example 3.1 is considered, and the weights obtained in Example 4.2 are applied to it.

Example 4.2: For five different universities we have n = 5, which gives N = 15; thus we
get five weights: v1 = 1/3 = 0.3333, v2 = 4/15 = 0.2666, v3 = 1/5 = 0.20,
v4 = 2/15 = 0.1333, v5 = 1/15 = 0.0666.
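The weights of Example 4.2 can be reproduced with exact fractions. The sketch below is illustrative and assumes that the weight of the i-th ranked university is v_i = (n − i + 1)/N with N = 1 + 2 + … + n, which is the form consistent with the values listed in the example; the function name `orwa_weights` is hypothetical.

```python
from fractions import Fraction


def orwa_weights(n):
    """Weights v_1..v_n for n ranked universities (assumed form (n - i + 1) / N)."""
    total = n * (n + 1) // 2  # N = 1 + 2 + ... + n
    return [Fraction(n - i + 1, total) for i in range(1, n + 1)]


weights = orwa_weights(5)
for i, v in enumerate(weights, start=1):
    # Print the exact fraction and its decimal value.
    print(f"v{i} = {v} = {float(v):.4f}")

# The weights sum to 1 and decrease strictly with the rank of the university.
print(sum(weights) == 1, all(x > y for x, y in zip(weights, weights[1:])))
```

Exact fractions avoid rounding questions: for example 4/15 is 0.2666… and prints as 0.2667 when rounded to four decimals, whereas the example lists the truncated value 0.2666.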

The formula for ORWA is:

$$\mathrm{ORWA} = \sum_{i=1}^{n} v_i y_i \qquad (4.5)$$

where vi is given by equation (4.4) and yi is the score given to a book by the i-th
ranked university.

Referring to Table 3.5, we have the preference scores of books for five ranked
universities. Considering equation (4.5), we get y1 = 1.0, y2 = 0.0, y3 = 0.66,
y4 = 0.0 and y5 = 0.66.

ORWA = (0.3333 × 1) + (0.2666 × 0) + (0.20 × 0.66) + (0.1333 × 0) + (0.0666 ×
0.66)

≈ 0.509

In a similar way, we get values for books B2, B3 and B4. The
values obtained are as follows: B1 = 0.4439292, B2 = 0.619878, B3 = 0.309989,
B4 = 0.308556. The ranking of books for Example 3.1 is tabulated in Table
4.4.

The difference between the weights obtained by the OWA operator and the
ORWA operator, as calculated in Sections 4.3 and 4.4 respectively, is easy to see. In
several cases the OWA operator has W1 equal to 0; this weight is associated with the
highest ranked ranker, whose contribution to the final value thus becomes zero, i.e.
the most valuable ranker may get ‘0’ weight. The ORWA operator, which modifies the way
weights are assigned in OWA, follows the strategy that the highest weight should be
assigned to the most valuable ranker, in our case the best ranked university. A specific
weight is associated with each university that recommends a book, and the ORWA
technique is used as described by equation (4.5). A block diagram of the whole
procedure is given in Figure 4.4. The results are discussed in detail in Section 4.5.

Figure 4.4: Ordered Ranked Weighted Aggregation based Book Recommendation System



Table 4.4: Ranked books based on Ordered Ranked Weighted Aggregation technique for example 3.1

Rank Position Ranked Books

1 B2

2 B1

3 B3

4 B4


In the previous sections of this chapter, we have described the recommendation
procedure and presented generalized steps for implementing the recommendation
scheme. In this section, the results of the recommendations by the different
techniques applied in the previous discussions are presented and discussed in detail.
The methods and steps of data collection and data filtering, the datasets, and the
pros and cons of the methods are also discussed.

4.5.1 Dataset

We are concerned with recommending books for university graduates of
Indian universities. Initially, those books were taken that could be a part of
the curriculum of the universities, with neither any criteria for nor any limit on
the inclusion of books. We then decided to filter the data in a way that fulfils our
objective, i.e. to present the top ranked books of a course to the students.
For this, it seems adequate to include the top ranked institutions and their
recommended books. This step filters the data and also helps explain why
the methodology (section 3.4) is chosen for the recommendation process. Also, only
‘computer science’ is selected as a subject from these universities/institutions
because, once we have a method for presenting the top books for one specific
subject, it can easily be extended to all other subjects. Hence, different courses of
computer science, such as Discrete Mathematics, Data Structure, etc., which are
considered in almost all top institutions, have been included. The methods of
selection of universities and courses, and the final recommendations by the different
techniques, are discussed in the subsequent subsections.

4.5.2 Experimental Results

In this section, we discuss the final recommendations by all 5 implemented
techniques: PAS, OWA with the quantifiers ‘at least half’, ‘as many as possible’
and ‘most’, and ORWA. Different rankings of books are obtained by using these
techniques. The books are represented by unique course codes, e.g. the code ‘CD’ is
used to refer to books on Compiler Design (see Table 3.14). Similarly, the books of
other courses have their own notations. Within each course, the books are numbered in
sequence according to their ranking; for ‘Compiler Design’ the codes are CD1, CD2,
etc. The details of the books on compiler design for which syllabi are available,
including code, author, title and publisher, are listed in Table 4.5.

Table 4.5: Code and details for books on Compiler Design

Course Code | Author | Title | Publisher

CD.1. | Alfred V. Aho, Monica S. Lam, Ravi Sethi and Jeffrey D. Ullman | Compilers: Principles, Techniques, and Tools | 2/E, Addison-Wesley, 2007

CD.2. | Andrew Appel | Modern Compiler Implementation in C/ML/Java | Cambridge University Press, 2004

CD.3. | Dick Grune, Henri E. Bal, Ceriel J.H. Jacobs and Koen G. Langendoen | Modern Compiler Design | John Wiley & Sons, Inc., 2012

CD.4. | S. Muchnick | Advanced Compiler Design & Implementation | Indian Reprint, 2002

CD.5. | K Cooper, L Torczon | Engineering a Compiler | 2nd Ed., Morgan Kaufmann, 2011

CD.6. | KC Louden | Compiler Construction: Principles and Practice | Cengage Learning, 1997

CD.7. | D Grune et al. | Modern Compiler Design | Wiley, 2000

CD.8. | Michael L Scott | Programming Language Pragmatics | 3rd Ed., Morgan Kaufmann, 2009

CD.9. | Tremblay, J.P. and Sorenson, P.G. | Theory and Practice of Compiler Writing | SR Publications, 2005

CD.10 | Tremblay, J.P. and Sorenson, P.G. | Parsing Techniques: A Practical Guide | Ellis Horwood, 1998

The course codes of the books and the corresponding ranks given by the respective
universities are given in Table 4.6.



Table 4.6: Ranked list of books on ‘compiler design’ by top universities

Rank Position U_1 U_2 U_3 U_4 U_5 U_6 U_7

1 CD1 CD2 CD1 CD1 CD1 - CD1

2 CD2 CD4 CD2 - CD4 - CD9

3 CD3 CD1 CD5 - - - CD5

4 - - CD6 - - - CD6

5 - - CD7 - - - CD10

6 - - CD8 - - - -

7 - - CD4 - - - -

8 - - - - - - -

9 - - - - - - -

10 - - - - - - -

From the table it is evident that U_1 has ranked book CD1 1st, while CD2 and CD3 are
ranked 2nd and 3rd respectively. As U_1 is the 1st ranked university, this implies that
the top university has recommended only three books on compiler design for its
students. The book CD1 is ranked by all the universities except U_6, which has
not issued a list of books for this course. U_2 has ranked CD1 3rd and CD2 1st,
whereas U_3, U_4, U_5 and U_7 have all ranked CD1 1st. It is also
observed that U_4 has recommended only one book. The PAS based ranking of books
is obtained by applying the procedure illustrated in Example 3.1. To get the ranking
of books on, say, compiler design, we have considered all 10 books recommended by the
top universities. These ranks are numerically represented in Table 4.7. In the table,
R(U_1) indicates the rank given by U_1. These ranks are converted into scores and shown
in Table 4.8. As described in Table 3.2 of Example 3.1 in the previous chapter, the best
rank, i.e. ‘1’, is assigned ‘-1’, rank 2 is assigned ‘-2’, and so on. A book which is
not ranked by a university is assigned the lowest value, which is ‘-8’ here. Thus the
cell values ‘0’ of Table 4.7 are changed to ‘-8’ in Table 4.8.

The final PAS is given in Table 4.9. The methods of finding ranks using OWA and ORWA
are discussed in Sections 4.3 and 4.4. With the help of these procedures,
we can get different scores for the books, and accordingly different rankings may be
obtained. The details of these scores, i.e. the scores for books using PAS, ORWA and
OWA with three different quantifiers, namely ‘at least half’, ‘as many as possible’
and ‘most’, are given in Table 4.10.

CD1 attains the highest score under all the adopted methods. As listed in Table 4.10,
its score is 0.8252 by PAS, 0.8805 by ORWA, and 0.9361, 0.7142 and 0.8285 by OWA with
the quantifiers ‘at least half’, ‘as many as possible’ and ‘most’ respectively. For
the 2nd position, CD2 has the maximum value for three techniques,
whereas the other two methods have CD4 in the second rank.

Table 4.7: Compiler design ranked books by top 7 Universities

Course Code R(U_1) R(U_2) R(U_3) R(U_4) R(U_5) R(U_6) R(U_7)

CD.1. 1 3 1 1 1 0 1

CD.2. 2 1 2 0 0 0 0

CD.3. 3 0 0 0 0 0 0

CD.4. 0 2 7 0 2 0 0

CD.5. 0 0 3 0 0 0 3

CD.6. 0 0 4 0 0 0 4

CD.7. 0 0 5 0 0 0 0

CD.8. 0 0 6 0 0 0 0

CD.9. 0 0 0 0 0 0 2

CD.10. 0 0 0 0 0 0 5

Table 4.8: Rank to Score conversion of book Compiler Design

Course Code S(U_1) S(U_2) S(U_3) S(U_4) S(U_5) S(U_6) S(U_7)

CD.1. -1 -3 -1 -1 -1 -8 -1

CD.2. -2 -1 -2 -8 -8 -8 -8

CD.3. -3 -8 -8 -8 -8 -8 -8

CD.4. -8 -2 -7 -8 -2 -8 -8

CD.5. -8 -8 -3 -8 -8 -8 -3

CD.6. -8 -8 -4 -8 -8 -8 -4

CD.7. -8 -8 -5 -8 -8 -8 -8

CD.8. -8 -8 -6 -8 -8 -8 -8

CD.9. -8 -8 -8 -8 -8 -8 -2

CD.10. -8 -8 -8 -8 -8 -8 -5
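The rank-to-score conversion between Table 4.7 and Table 4.8 can be sketched in a few lines of Python. This is an illustrative sketch only: the names `UNRANKED_SCORE` and `rank_to_score` are hypothetical, and the value -8 for unranked books is the one stated in the text for this course.

```python
UNRANKED_SCORE = -8  # lowest value, used when a university did not rank the book


def rank_to_score(rank):
    """Map rank r >= 1 to score -r; rank 0 (not ranked) maps to the lowest value."""
    return -rank if rank > 0 else UNRANKED_SCORE


# Ranks of CD.4 given by U_1..U_7, copied from Table 4.7 (0 means not ranked).
ranks_cd4 = [0, 2, 7, 0, 2, 0, 0]
scores_cd4 = [rank_to_score(r) for r in ranks_cd4]
print(scores_cd4)
```

The resulting list corresponds to the CD.4 row of Table 4.8.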

The scores obtained using the above techniques give the corresponding rankings of the
books. Thus, we have 5 different rankings of the books on ‘compiler design’. These
rankings, with the method adopted, are given in Table 4.11.



Table 4.9: Positional Score for book Compiler Design

Course Code PS(U_1) PS(U_2) PS(U_3) PS(U_4) PS(U_5) PS(U_6) PS(U_7) PAS

CD.1. 1 0.777 1 1 1 0 1 0.825

CD.2. 0.888 1 0.888 0 0 0 0 0.396

CD.3. 0.777 0 0 0 0 0 0 0.111

CD.4. 0 0.888 0.333 0 0.888 0 0 0.301

CD.5. 0 0 0.777 0 0 0 0.777 0.222

CD.6. 0 0 0.666 0 0 0 0.666 0.190

CD.7. 0 0 0.555 0 0 0 0 0.079

CD.8. 0 0 0.444 0 0 0 0 0.063

CD.9. 0 0 0 0 0 0 0.888 0.126

CD.10. 0 0 0 0 0 0 0.555 0.079

Table 4.10: Score obtained by recommendation approaches for compiler design

Course Code | PAS | ORWA | OWA (At least half) | OWA (As many as possible) | OWA (most)

CD.1. 0.8252 0.8805 0.9361 0.7142 0.8285

CD.2. 0.3965 0.5947 0.7931 0 0.2283

CD.3. 0.111 0.1942 0.2219 0 0

CD.4. 0.3012 0.3447 0.3488 0.2537 0.3393

CD.5. 0.222 0.1664 0.2219 0.2219 0.1997

CD.6. 0.1902 0.1426 0.1902 0.1902 0.1712

CD.7. 0.0792 0.0990 0.1585 0 0.1426

CD.8. 0.0634 0.0792 0.1268 0 0.1141

CD.9. 0.1268 0.0317 0 0.2537 0

CD.10. 0.0792 0.0198 0 0.1585 0
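As a cross-check of Table 4.10, the following Python sketch recomputes the scores of three books from the positional scores of Table 4.9. It is an illustration under stated assumptions, not the original implementation. PAS is taken as the plain mean of the positional scores; the ORWA weights are assumed to be v_i = (n − i + 1)/N with N = n(n + 1)/2; the relative quantifier is assumed linear between a and b; and the OWA weights are assumed to be paired with the scores in university order U_1 … U_7, without re-sorting the scores, because this pairing is the one that agrees with the tabulated values to about three decimal places. All function names are hypothetical.

```python
# Positional scores PS(U_1)..PS(U_7), copied from Table 4.9.
POSITIONAL_SCORES = {
    "CD.1": [1, 0.777, 1, 1, 1, 0, 1],
    "CD.2": [0.888, 1, 0.888, 0, 0, 0, 0],
    "CD.4": [0, 0.888, 0.333, 0, 0.888, 0, 0],
}

# (a, b) parameters of the three relative quantifiers.
QUANTIFIERS = {
    "OWA (at least half)": (0.0, 0.5),
    "OWA (as many as possible)": (0.5, 1.0),
    "OWA (most)": (0.3, 0.8),
}


def relative_quantifier(r, a, b):
    """Q(r): 0 below a, 1 above b, linear in between (assumed form)."""
    if r < a:
        return 0.0
    if r > b:
        return 1.0
    return (r - a) / (b - a)


def owa_weights(m, a, b):
    """W_k = Q(k/m) - Q((k-1)/m) for k = 1..m."""
    return [
        relative_quantifier(k / m, a, b) - relative_quantifier((k - 1) / m, a, b)
        for k in range(1, m + 1)
    ]


def orwa_weights(n):
    """Assumed form v_i = (n - i + 1) / N with N = n(n + 1) / 2."""
    total = n * (n + 1) / 2
    return [(n - i + 1) / total for i in range(1, n + 1)]


def weighted_sum(weights, scores):
    return sum(w * s for w, s in zip(weights, scores))


def all_scores(scores):
    """Scores of one book under the five techniques."""
    n = len(scores)
    result = {
        "PAS": sum(scores) / n,
        "ORWA": weighted_sum(orwa_weights(n), scores),
    }
    for name, (a, b) in QUANTIFIERS.items():
        # Assumption: the k-th weight is paired with the k-th ranked university.
        result[name] = weighted_sum(owa_weights(n, a, b), scores)
    return result


for code, scores in POSITIONAL_SCORES.items():
    print(code, {k: round(v, 4) for k, v in all_scores(scores).items()})
```

Small differences in the fourth decimal place are expected, because the positional scores in Table 4.9 are given to three decimals only.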



Table 4.11: Five different rankings of books on ‘compiler design’

Rank Position | PAS based ranking | ORWA based ranking | OWA (At least half) based ranking | OWA (As many as possible) based ranking | OWA (most) based ranking

1 CD.1. CD.1. CD.1. CD.1. CD.1.

2 CD.2. CD.2. CD.2. CD.4. CD.4.

3 CD.4. CD.4. CD.4. CD.9. CD.2.

4 CD.5. CD.3. CD.3. CD.5. CD.5.

5 CD.6. CD.5. CD.5. CD.6. CD.6.

6 CD.9. CD.6. CD.6. CD.10. CD.7.

7 CD.3. CD.7. CD.7. CD.2. CD.8.

8 CD.7. CD.8. CD.8. CD.3. CD.9.

9 CD.10. CD.9. CD.9. CD.7. CD.10.

10 CD.8. CD.10. CD.10. CD.8. CD.3.
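Each column of Table 4.11 follows from sorting the books by the corresponding score column of Table 4.10 in descending order. The Python sketch below illustrates this for the PAS and ORWA columns. It is illustrative only: the helper `rank_by_score` is hypothetical, and since the text does not state how ties are broken, books with equal scores are simply kept in their input order.

```python
# PAS and ORWA scores for compiler design, copied from Table 4.10.
PAS_SCORES = {
    "CD.1": 0.8252, "CD.2": 0.3965, "CD.3": 0.111, "CD.4": 0.3012, "CD.5": 0.222,
    "CD.6": 0.1902, "CD.7": 0.0792, "CD.8": 0.0634, "CD.9": 0.1268, "CD.10": 0.0792,
}
ORWA_SCORES = {
    "CD.1": 0.8805, "CD.2": 0.5947, "CD.3": 0.1942, "CD.4": 0.3447, "CD.5": 0.1664,
    "CD.6": 0.1426, "CD.7": 0.0990, "CD.8": 0.0792, "CD.9": 0.0317, "CD.10": 0.0198,
}


def rank_by_score(scores):
    """Return book codes ordered from highest to lowest score.

    sorted() is stable, so books with equal scores keep their input order
    (an assumption; the tie-breaking rule is not stated in the text).
    """
    return sorted(scores, key=scores.get, reverse=True)


for name, scores in (("PAS", PAS_SCORES), ("ORWA", ORWA_SCORES)):
    for position, code in enumerate(rank_by_score(scores), start=1):
        print(name, position, code)
```

For these two columns the computed order agrees with the PAS based and ORWA based rankings of Table 4.11.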

Table 4.12: Five different rankings of books on ‘Discrete Mathematics’

Rank Position | PAS based ranking | ORWA based ranking | OWA (At least half) based ranking | OWA (As many as possible) based ranking | OWA (most) based ranking

1 DM.9. DM.1. DM.1. DM.9. DM.8.

2 DM.1. DM.8. DM.2. DM.10. DM.9.

3 DM.10. DM.9. DM.4. DM.12. DM.10.

4 DM.8. DM.2. DM.3. DM.16. DM.12.

5 DM.2. DM.3. DM.5. DM.8. DM.1.

6 DM.4. DM.10. DM.6. DM.15. DM.4.

7 DM.12. DM.4. DM.7. DM.17. DM.13.

8 DM.16. DM.5. DM.8. DM.13. DM.5.

9 DM.3. DM.6. DM.9. DM.18. DM.14.

10 DM.5. DM.7. DM.10. DM.14. DM.11.



Table 4.13: Five different rankings of books on ‘Artificial Intelligence’

Rank Position | PAS based ranking | ORWA based ranking | OWA (At least half) based ranking | OWA (As many as possible) based ranking | OWA (most) based ranking

1 AI.2. AI.2. AI.2. AI.2. AI.2.

2 AI.5. AI.7. AI.7. AI.21. AI.14.

3 AI.21. AI.5. AI.5. AI.5. AI.20.

4 AI.7. AI.4. AI.4. AI.20. AI.15.

5 AI.11. AI.6. AI.6. AI.11. AI.21.

6 AI.4. AI.1. AI.1. AI.14. AI.16.

7 AI.6. AI.3. AI.3. AI.15. AI.7.

8 AI.1. AI.11. AI.8. AI.16. AI.17.

9 AI.14. AI.8. AI.9. AI.17. AI.8.

10 AI.15. AI.9. AI.10. AI.18. AI.18.

Table 4.14: Five different rankings of books on ‘Data Structure’

Rank Position | PAS based ranking | ORWA based ranking | OWA (At least half) based ranking | OWA (As many as possible) based ranking | OWA (most) based ranking

1 DS.1. DS.1. DS.1. DS.12. DS.6.

2 DS.2. DS.2. DS.3. DS.6. DS.4.

3 DS.4. DS.4. DS.2. DS.4. DS.1.

4 DS.6. DS.6. DS.4. DS.2. DS.5.

5 DS.12. DS.3. DS.5. DS.7. DS.7.

6 DS.3. DS.5. DS.6. DS.1. DS.8.

7 DS.7. DS.12. DS.12. DS.8. DS.9.

8 DS.5. DS.7. DS.7. DS.13. DS.10.

9 DS.8. DS.8. DS.8. DS.9. DS.11.

10 DS.13. DS.9. DS.9. DS.10. DS.12.

Table 4.15: Five different rankings of books on ‘Principles of Data Base’

Rank Position | PAS based ranking | ORWA based ranking | OWA (At least half) based ranking | OWA (As many as possible) based ranking | OWA (most) based ranking

1 DB.2. DB.2. DB.1. DB.2. DB.2.

2 DB.1. DB.1. DB.2. DB.4. DB.4.

3 DB.4. DB.4. DB.3. DB.1. DB.1.

4 DB.7. DB.3. DB.4. DB.7. DB.3.

5 DB.3. DB.7. DB.7. DB.5. DB.5.

6 DB.5. DB.5. DB.5. DB.6. DB.6.

7 DB.6. DB.6. DB.6. DB.8. DB.7.

8 DB.8. DB.8. DB.8. DB.9. DB.8.

9 DB.9. DB.9. DB.9. DB.13. DB.9.

10 DB.13. DB.10 DB.10 DB.10 DB.10



Table 4.16: Five different rankings of books on ‘Computer Graphics’

Rank Position | PAS based ranking | ORWA based ranking | OWA (At least half) based ranking | OWA (As many as possible) based ranking | OWA (most) based ranking

1 CG.2. CG.2. CG.2. CG.14. CG.14.

2 CG.7. CG.7. CG.1. CG.16. CG.7

3 CG.1. CG.1. CG.3. CG.7. CG.15.

4 CG.3. CG.3. CG.4. CG.17. CG.16.

5 CG.14. CG.4. CG.5. CG.15. CG.17.

6 CG.16. CG.5. CG.6. CG.18. CG.18.

7 CG.4. CG.6. CG.7. CG.19. CG.19.

8 CG.17. CG.8. CG.8. CG.20. CG.20.

9 CG.5. CG.9. CG.9. CG.2. CG.2.

10 CG.15. CG.10. CG.10. CG.1. CG.1.

Table 4.17: Five different rankings of books on ‘Software Engineering’

Rank Position | PAS based ranking | ORWA based ranking | OWA (At least half) based ranking | OWA (As many as possible) based ranking | OWA (most) based ranking

1 SE.2. SE.2. SE.2. SE.2. SE.2.

2 SE.4. SE.3. SE.1. SE.18 SE.9.

3 SE.9. SE.1. SE.4. SE.9. SE.11

4 SE.3. SE.4. SE.5. SE.11 SE.12

5 SE.18 SE.9. SE.3. SE.4. SE.8.

6 SE.1. SE.5. SE.6. SE.12 SE.4.

7 SE.11 SE.6. SE.7. SE.15 SE.13

8 SE.5. SE.7. SE.8. SE.13 SE.5.

9 SE.8. SE.8. SE.9. SE.16 SE.14

10 SE.12 SE.10 SE.10 SE.14 SE.10

Table 4.18: Five different rankings of books on ‘Operating System’

Rank Position | PAS based ranking | ORWA based ranking | OWA (At least half) based ranking | OWA (As many as possible) based ranking | OWA (most) based ranking

1 OS.2. OS.2. OS.2. OS.4. OS.4.

2 OS.4. OS.3. OS.3. OS.2. OS.2.

3 OS.3. OS.4. OS.1. OS.13. OS.3.

4 OS.5. OS.1. OS.4. OS.10. OS.5.

5 OS.13. OS.5. OS.5. OS.7. OS.7.

6 OS.1. OS.6. OS.6. OS.11. OS.8.

7 OS.10. OS.7. OS.7. OS.8. OS.9.

8 OS.7. OS.8. OS.8. OS.12. OS.6.

9 OS.11. OS.9. OS.9. OS.5. OS.10.

10 OS.8. OS.13. OS.13. OS.9. OS.11.



Table 4.19: Five different rankings of books on ‘Computer Network’

Rank Position | PAS based ranking | ORWA based ranking | OWA (At least half) based ranking | OWA (As many as possible) based ranking | OWA (most) based ranking

1 CN.2. CN.2. CN.2. CN.2. CN.12.

2 CN.17. CN.1. CN.1. CN.17. CN.2.

3 CN.1. CN.3. CN.7. CN.12. CN.13.

4 CN.12. CN.12. CN.8. CN.13. CN.11.

5 CN.7. CN.4. CN.3. CN.14. CN.14.

6 CN.11. CN.5. CN.9. CN.18. CN.7.

7 CN.13. CN.6. CN.4. CN.1. CN.8.

8 CN.8. CN.7. CN.10. CN.15. CN.15.

9 CN.14. CN.8. CN.5. CN.19. CN.9.

10 CN.18. CN.9. CN.6. CN.16. CN.16.

Table 4.20: Five different rankings of books on ‘Theory of Computation’

Rank Position | PAS based ranking | ORWA based ranking | OWA (At least half) based ranking | OWA (As many as possible) based ranking | OWA (most) based ranking

1 TOC.1. TOC.1. TOC.1. TOC.1. TOC.1.










CD1 takes the first rank under all five methods of recommendation. Five of the seven universities also rank CD1 1st, so for about 71% of the universities the best book on the topic is the same as the overall recommendation of the proposed approaches. CD2 obtains the 2nd position under three rankings, namely PAS, ORWA and OWA with the ‘at least half’ quantifier, whereas the remaining two methods recommend CD4 in 2nd position. Although only three universities include CD2 in their rankings, it attains a good position because all three are universities of high repute that hold better positions in the ranking of universities than the others. Apart from CD1 and CD2, only CD4 is recommended by more than two universities, hence its high position is expected. All three universities that place CD2 in 2nd position recommend CD4 in third position. OWA with the quantifier ‘most’ ranks CD2 in third position, as it has already assigned CD4 to the 2nd position in its recommendations.

CD3 is recommended only by U_1, yet the ORWA method places it at rank 4, whereas the other techniques, except the ‘at least half’ quantifier, award it a lower rank. U_1 is the first-ranked university, and ORWA follows the simple philosophy of assigning a higher value to the best voters; the inclusion of CD3 in a higher position is therefore consistent with its approach. The other techniques, however, rank CD3 below CD5. CD9 is ranked only by U_7, and no other university includes it in its ranking; still, ‘as many as possible’ places CD9 in 3rd position. Judging from this performance, ORWA and ‘at least half’ give more appealing and appropriate recommendations than the other techniques. CD5 and CD6 are each included in two of the 7 rankings, and all of them place CD5 above CD6; this order is therefore kept in the final recommendation of the proposed approaches. CD7, CD8 and CD10 each have only one inclusion and are displaced to the last two positions by both ORWA and the ‘at least half’ quantifier. The last positions differ for the other techniques: PAS and ‘as many as possible’ recommend CD8 in the last position, while the ‘most’ quantifier has CD3 in its last position.

The above procedure is applied to all courses, which yields 5 different rankings of the books for each course. These rankings are obtained in the same way as the ranking of books on Compiler Design and are presented in Table 4.11 to Table 4.20.

These tables list the ranked books under the different approaches. The books are represented by their course codes. The rankings are useful for finding the top 10 or top 5 books on a specified topic, and they also give the best book, i.e. the 1st ranked book of the course concerned. For a few courses the 1st ranked book is common to all the methods, as they all recommend the same book in 1st position. Software Engineering, Compiler Design and Artificial Intelligence are the courses for which the same first-ranked book is recommended. For most of the other courses the top-ranked book is agreed upon by a majority, i.e. 3 or 4 of the 5 approaches coincide in their first-position ranking. This consistency in the first-ranked position may help readers choose the desired books comfortably. The tables also show that the top 10 ranked books do not simply mirror the top universities’ recommendations: if we sort the books directly by university rank, so that the books of the top university come first, the order of these books need not remain the same in the final recommendation. The ranking given by ORWA, however, follows the university ranks in most cases.
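The agreement in first position can be checked directly from the first rows of the tables. The following minimal sketch uses only the first-position entries of Tables 4.18 to 4.20 (book codes written without the trailing period) and counts how many of the five approaches coincide for each course. The function name top1_agreement is introduced here for illustration only.

```python
from collections import Counter

# First-position entries copied from Tables 4.18-4.20, in the column order
# PAS, ORWA, OWA (at least half), OWA (as many as), OWA (most).
first_ranked = {
    "Operating System": ["OS.2", "OS.2", "OS.2", "OS.4", "OS.4"],
    "Computer Network": ["CN.2", "CN.2", "CN.2", "CN.2", "CN.12"],
    "Theory of Computation": ["TOC.1", "TOC.1", "TOC.1", "TOC.1", "TOC.1"],
}


def top1_agreement(picks):
    """Return the most frequent first-ranked book and the number of methods choosing it."""
    book, votes = Counter(picks).most_common(1)[0]
    return book, votes


for course, picks in first_ranked.items():
    book, votes = top1_agreement(picks)
    print(f"{course}: {book} ranked first by {votes} of {len(picks)} methods")
```

For these three courses the count is 3, 4 and 5 of the 5 approaches respectively.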

The ORWA method also takes into account how many universities have ranked a book. If a book is ranked first by the 1st ranked university but recommended by no other university, books that are recommended by most of the universities are preferred and ranked higher. Consider Table 3.21, which lists the ranked books on ‘Artificial Intelligence’ (AI). AI.1 is ranked first by the 1st ranked university, yet none of the methods places it in the top 3 positions. Instead, AI.2 obtains the 1st position in the ranking because it is recommended by all the universities except U_6.
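The effect described above can be illustrated with a small sketch. It is not the ORWA operator itself: it assumes a hypothetical score in which every university listing a book contributes its own weight (higher for better ranked universities) multiplied by a positional score. The two books and their positions are invented for illustration and are not taken from the tables.

```python
def weighted_score(listings, num_universities=7, list_length=10):
    """Hypothetical score: sum of (university weight x positional score).

    `listings` maps a university's rank (1 = best) to the position at which
    that university lists the book (1 = top of its list).
    """
    total = 0.0
    for uni_rank, position in listings.items():
        # Better ranked universities receive a larger weight (assumed linear).
        uni_weight = (num_universities - uni_rank + 1) / num_universities
        # Higher list positions receive a larger positional score (assumed linear).
        positional = (list_length - position + 1) / list_length
        total += uni_weight * positional
    return total


# Invented example: book A is listed first, but only by the 1st ranked university;
# book B is listed slightly lower by six of the seven universities.
book_a = {1: 1}
book_b = {1: 3, 2: 2, 3: 2, 4: 3, 5: 2, 7: 4}

print(weighted_score(book_a), weighted_score(book_b))
```

Under these assumptions book B scores higher than book A, which mirrors the behaviour reported for AI.2 and AI.1: wide support outweighs a single first place from the best-ranked university.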

4.6 Summary

We have introduced two different schemes to recommend books. First, we have implemented a fuzzy aggregation scheme known as Ordered Weighted Aggregation (OWA), which has been used in various domains but, to the best of our knowledge, never for the recommendation of books. Secondly, we have proposed an aggregation operator, ‘Ordered Ranked Weighted Aggregation’ (ORWA), and suggested a recommendation technique which exploits it.
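As an illustration of quantifier-guided OWA, the sketch below derives the weights from a relative linguistic quantifier Q(r) and aggregates a list of scores. The parameter pairs (a, b) for ‘at least half’, ‘most’ and ‘as many as possible’ are the values commonly quoted in the fuzzy literature and are assumed here, as are all function names; the sketch is not the exact implementation used in this work.

```python
def quantifier(r, a, b):
    """Relative linguistic quantifier Q(r): 0 below a, 1 above b, linear in between."""
    if r < a:
        return 0.0
    if r > b:
        return 1.0
    return (r - a) / (b - a)


# Assumed (a, b) parameters for the three quantifiers named in this chapter.
QUANTIFIERS = {
    "at least half": (0.0, 0.5),
    "most": (0.3, 0.8),
    "as many as possible": (0.5, 1.0),
}


def owa_weights(n, name):
    """Weights w_i = Q(i/n) - Q((i-1)/n) for i = 1..n."""
    a, b = QUANTIFIERS[name]
    return [quantifier(i / n, a, b) - quantifier((i - 1) / n, a, b)
            for i in range(1, n + 1)]


def owa(values, name):
    """Sort the values in descending order, then take the weighted sum."""
    ordered = sorted(values, reverse=True)
    weights = owa_weights(len(ordered), name)
    return sum(w * v for w, v in zip(weights, ordered))


scores = [0.9, 0.2, 0.6, 0.4]  # invented scores given to one book by four voters
for name in QUANTIFIERS:
    print(name, owa_weights(len(scores), name), round(owa(scores, name), 3))
```

With four voters, ‘at least half’ puts all the weight on the two highest scores and ‘as many as possible’ on the two lowest, which is why the quantifiers can order the same books differently.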

The Ordered Ranked Weighted Aggregation incorporates the rank of the rankers to emphasize their importance, because we believe a book recommended by the best-ranked institution must get higher preference than a book recommended by a lower-ranked institution. ORWA gives the ranking positions of the recommended books, along with the total number of recommended books. Assigning weights to the rankers is the strength of ORWA and leads to a better recommendation.



Since we do not have any benchmark for ranking the books, we rely on the recommendations (syllabi) of the best-ranked universities. The recommendations produced by all the above techniques have been listed and presented. The results show that the top-ranked books acquire different positions in the different rankings, with slight differences in the first-ranked books. Which of the techniques performs better is discussed in Chapter 6.

We believe the proposed technique may meet users’ needs and provide them with the books they are looking for. For the sake of illustration and ease of experimentation, we have shown the procedure using books from top institutions; however, the procedure can be generalized to the recommendation of any kind of item. Its robustness may open a novel direction in the field of recommendation and could serve a very large number of users.

174 | P a g e

References:

[1] S. S. Sohail, J. Siddiqui, and R. Ali, “Book recommendation system using

opinion mining technique,” in International Conference on Advances in

Computing, Communications and Informatics (ICACCI), pp. 1609–1614,

2013.

[2] M. Hu, B. Liu, and S. M. Street, “Mining and Summarizing Customer

Reviews,” Proceedings of the tenth ACM SIGKDD international conference

on Knowledge discovery and data mining, USA. pp. 168-177, 2004.

[3] A. Andreevskaia and S. Bergler, “Mining WordNet for a Fuzzy Sentiment:

Sentiment Tag Extraction from WordNet Glosses.,” in EACL, vol. 6, pp. 209–

216, 2006.

[4] G. Carenini, R. T. Ng, and A. Pauls, “Interactive multimedia summaries of

evaluative text,” in Proceedings of the 11th international conference on

Intelligent user interfaces, pp. 124–131, 2006.

[5] S. S. Sohail, J. Siddiqui, and R. Ali, “Ordered Ranked Weighted Aggregation

based Book Recommendation Technique : A Link Mining Approach,” 14th

International Conference on Hybrid Intelligent Systems (HIS), IEEE, pp. 309–

314, 2014.

[6] H. K. Kim, H. Y. Oh, J. C. Gu, and J. K. Kim, “Commenders: A

recommendation procedure for online book communities,” Electron. Commer.

Res. Appl., vol. 10, no. 5, pp. 501–509, Sep. 2011.

[7] S.T. Yang and M.C. Hung, “A model for book inquiry history analysis and

book-acquisition recommendation of libraries,” Libr. Collect. Acquis. Tech.

Serv., vol. 36, no. 3–4, pp. 127–142, 2012.

[8] S. Pulakhandam and N. Patil, “Recommendation of Optimal Locations for

Government Funded Educational Institutes in Urban India Using a Hybrid

Data Mining Technique,” in Second International Conference on Advances in

Computing and Communication Engineering (ICACCE), pp. 560–567, 2015.

[9] J. Beel and S. Langer, “Research Paper Recommender Systems: A Literature

Survey,” Int. J. Digit. Libr., vol.17, no.4, 305-338, 2015.

References

175 | P a g e

[10] J. Beel, S. Langer, and M. Genzmehr, “Research paper recommender system

evaluation: A quantitative literature survey,” Proceedings of the Workshop on

Reproducibility and Replication in Recommender Systems Evaluation (RepSys)

at the ACM Recommender System Conference (RecSys), pp. 15–22, 2013.

[11] X. Wang and F. Yuan, “Course Recommendation by Improving BM25 to

Identity Students‟ Different Levels of Interests in Courses,” in International

Conference on New Trends in Information and Service Science, ( NISS), pp.

1372–1377, 2009.

[12] C.-Y. Huang, R.-C. Chen, and L.-S. Chen, “Course-recommendation system

based on ontology,” in 2013 International Conference on Machine Learning

and Cybernetics, vol. 3, pp. 1168-1173. IEEE, 2013.

[13] K. H. Tsai, T. C. Hsieh, T. K. Chiu, M. C. Lee, and T. I. Wang, “Automated

course composition and recommendation based on a learner intention,” in

Seventh IEEE International Conference on Advanced Learning Technologies

pp. 274–278, 2007.

[14] Ò. Celma, M. Ramírez, and P. Herrera, “Foafing the music: A music

recommendation system based on RSS feeds and user preferences,” In: 6th

International Conference on Music Information Retrieval (ISMIR), pp. 464–

467, 2005.

[15] M. T. Group and U. P. Fabra, “Foafing the Music : Bridging the semantic gap

in music recommendation,” Web Semantics: Science, Services and Agents on

the World Wide Web, vol. 6, no. 4, pp. 250-256, 2008:

[16] K. Yoshii, M. Goto, K. Komatani, T. Ogata, and H. G. Okuno, “Hybrid

Collaborative and Content-based Music Recommendation Using Probabilistic

Model with Latent User Preferences,” in ISMIR, vol. 6, pp. 296–301, 2006.

[17] F. Peleja and P. Dias, “A recommender system for the TV on the web :

integrating unrated reviews and movie ratings,” pp. 543–558, 2013.

[18] Z. Yu, X. Zhou, Y. Hao, and J. Gu, “TV program recommendation for

multiple viewers based on user profile merging,” User Model. User-adapt.

Interact., vol. 16, no. 1, pp. 63–82, 2006.

References

176 | P a g e

[19] J. M. Noguera, M. J. Barranco, R. J. Segura, and L. MartíNez, “A mobile 3D-

GIS hybrid recommender system for tourism,” Inf. Sci. (Ny)., vol. 215, pp. 37–

52, 2012.

[20] D. Gavalas, C. Konstantopoulos, K. Mastakas, and G. Pantziou, “Mobile

recommender systems in tourism,” J. Netw. Comput. Appl., vol. 39, pp. 319–

333, 2014.

[21] A. García-crespo, J. Chamizo, I. Rivera, M. Mencke, R. Colomo-palacios, and

J. M. Gómez-berbís, “SPETA : Social pervasive eTourism advisor,”

Telematics and Informatics, vol. 26, pp. 306–315, 2009.

[22] J. Shi, Y. Peng, and E. Erdem, “Simulation analysis on patient visit efficiency

of a typical VA primary care clinic with complex characteristics,” Simul.

Model. Pract. Theory, vol. 47, pp. 165–181, Sep. 2014.

[23] J.-H. Kim, J.-H. Lee, J.-S. Park, Y.-H. Lee, and K.-W. Rim, “Design of diet

recommendation system for healthcare service based on user information,” in

Fourth International Conference on Computer Sciences and Convergence

Information Technology, pp. 516–518, 2009.

[24] S. Saint-André, W. NeiraZalentein, D. Robin, and A. Lazartigues, “La

télépsychiatrie au service de l‟autisme,” Encephale., vol. 37, no. 1, pp. 18–24,

Feb. 2011.

[25] P. Pattaraintakorn, G. M. Zaverucha, and N. Cercone, “Web based health

recommender system using rough sets, survival analysis and rule-based expert

systems,” in International Workshop on Rough Sets, Fuzzy Sets, Data Mining,

and Granular-Soft Computing, pp. 491–499, 2007.

[26] C. Lee, M. Lee, D. Han, S. Jung, and J. Cho, “A framework for personalized

healthcare service recommendation,” in 10th International Conference on

ehealth Networking, Applications and Services, pp. 90–95, 2008.

[27] V. Kumar, K. M. P. D. Shrivastva, and S. Singh, “Cross Domain

Recommendation Using Semantic Similarity and Tensor Decomposition,”

Procedia Comput.Sci., vol. 85, pp. 317–324, 2016.

[28] S. S. Sohail, J. Siddiqui, and R. Ali, “Product Recommendation Techniques for

Ecommerce - past , present and future,” IJARCET, vol. 1, no. 9, pp. 219–225,

2012.

References

177 | P a g e

[29] I. Bose and R. K. Mahapatra, “Business data mining: a machine learning

perspective,” Information and Management, vol. 39, no.3, pp. 211–225, 2001.

[30] A. A. Shaikh and H. Karjaluoto, “Making the most of information technology

& systems usage: A literature review, framework and future research agenda,”

Comput. Human Behav., vol. 49, pp. 541–566, Aug. 2015.

[31] A. Karatzoglou, L. Baltrunas, and Y. Shi, “Learning to Rank for

Recommender Systems,” Proceedings of the 7th ACM conference on

Recommender systems. ACM, pp. 493–494, 2013.

[32] S. S. Sohail, J. Siddiqui, and R. Ali, “Recommender Systems for E-commerce :

In perspective of Business Strategies,” Shodh-Pioneer Journal of IT &

Management, vol.8, no.2, pp. 165-169, 2012.

[33] L. Getoor, “Link Mining : A New Data Mining Challenge,” ACM SIGKDD

Explorations Newsletter, vol. 5, no. 1, pp. 84–89, 2003.

[34] S. S. Sohail, J. Siddiqui, and R. Ali, “Book Recommendation Technique Using

Rank Based Scoring Method Abstract ,” National Conference on RIAIT, pp.

140-146, 2014.

[35] S. S. Sohail, J. Siddiqui, and R. Ali, “OWA based Book Recommendation

Technique,” Procedia Computer Science, vol. 62, pp. 126–133, 2015.

[36] S. S. Sohail, J. Siddiqui, and R. Ali, “Book Recommender System using Fuzzy

Linguistic Quantifier and Opinion Mining,” in The International Symposium

on Intelligent Systems Technologies and Applications, pp. 573–583, 2016.

[37] S. S. Sohail, J. Siddiqui, and R. Ali, “Feature extraction and analysis of online

reviews for the recommendation of books using opinion mining technique,”

Perspect. Sci., vol. 8, pp. 754–756, 2016.

[38] S. S. Sohail, J. Siddiqui, and R. Ali, “On Enhancement in Wearable Devices

using User Feed-back based Model,” International Journal of Computer

Science and Innovation, vol. 2016, no. 1, pp. 31–37, 2016.

[39] S. S. Sohail, J. Siddiqui, and R. Ali, “User feedback based evaluation of a

product recommendation system using rank aggregation method,” in Advances

in Intelligent Informatics, Springer, pp. 349–358, 2015.

References

178 | P a g e

[40] S. S. Sohail, J. Siddiqui, and R. Ali, “User Feedback Scoring and Evaluation of

a Product Recommendation System,” In Seventh International Conference on

Contemporary Computing (IC3), IEEE, pp. 525–530, 2014.

[41] D. Goldberg, D. Nichols, B. M. Oki, and D. Terry, “Using collaborative

filtering to weave an information tapestry,” Commun. ACM, vol. 35, no. 12,

pp. 61–70, 1992.

[42] G. Adomavicius and A. Tuzhilin, “Toward the next generation of

recommender systems: A survey of the state-of-the-art and possible

extensions,” IEEE Transactions on Knowledge and Data Engineering, vol. 17,

no. 6.pp. 734– 749, 2005.

[43] W. Hill, L. Stead, M. Rosenstein, and G. Furnas, “Recommending and

evaluating choices in a virtual community of use,” in Proceedings of the

SIGCHI conference on Human factors in computing systems, pp. 194–201,

1995.

[44] B. M. Sarwar, G. Karypis, J. a Konstan, and J. T. Riedl, “Application of

Dimensionality Reduction in Recommender System -- A Case Study,”

Architecture, vol. 1625, pp. 264–8, 2000.

[45] N. Good, J. Ben Schafer, J. A. Konstan, A. Borchers, B. Sarwar, J. Herlocker,

and J. Riedl, “Combining Collaborative Filtering with Personal Agents for

Better Recommendations,” In AAAI/IAAI, pp. 439-446, 1999.

[46] J.B. Schafer, J. A. Konstan, and J. Riedl. "E-commerce recommendation

applications." Applications of Data Mining to Electronic Commerce. Springer

US, pp. 115–153, 2001.

[47] R. Burke, “Hybrid Recommender Systems : Survey and,” User Model. User

Adapted Interact., vol. 12, no. 4, pp. 331–370, 2002.

[48] R. Burke, “Hybrid Web Recommender Systems,” LNCS 4321 - pp. 377–408,

2007.

[49] L. Candillier, F. Meyer, and M. Boullé, “Comparing state-of-the-art

collaborative filtering systems,” Lect. Notes Comput. Sci., vol. 4571, p. 548,

2007.

[50] X. Su and T. M. Khoshgoftaar, “A survey of collaborative filtering

techniques,” Adv. Artif. Intell., vol. 2009, no. Section 3, pp. 1–19, 2009.

References

179 | P a g e

[51] G. Shani and A. Gunawardana, “Evaluating recommendation systems,”

Recomm. Syst. Handb., pp. 257–298, 2011.

[52] D. H. Park, H. K. Kim, I. Y. Choi, and J. K. Kim, “Expert Systems with

Applications A literature review and classification of recommender systems

research,” Expert System with Applications., vol. 39, no. 11, pp. 10059–10072,

2012.

[53] X. Zhou, Y. Xu, Y. Li, A. Josang, and C. Cox, “The state-of-the-art in

personalized recommender systems for social networking,” Artificial

Intelligence Review vol.37, no. 2, pp. 119–132, 2012.

[54] J. Bobadilla, F. Ortega, A. Hernando, and A. Gutiérrez, “Recommender

systems survey,” Knowledge-based Syst., vol. 46, pp. 109–132, 2013.

[55] Y. Yang and X. Liu, “A re-examination of text categorization methods,” in

Proceedings of the 22nd annual international ACM SIGIR conference on

Research and development in information retrieval, pp. 42–49, 1999.

[56] B. Sarwar, G. Karypis, J. Konstan, and J. Riedl, “Analysis of recommendation

algorithms for e-commerce." In Proceedings of the 2nd ACM conference on

Electronic commerce, pp. 158-167, 2000.

[57] J. S. Breese, D. Heckerman, and C. Kadie, “Empirical analysis of predictive

algorithms for collaborative filtering,” in Proceedings of the Fourteenth

conference on Uncertainty in artificial intelligence, pp. 43–52, 1998.

[58] P. Resnick, P. Bergstrom, and J. Riedl, “GroupLens : An Open Architecture

for Collaborative Filtering of Netnews,” In Proceedings of the ACM

conference on Computer supported cooperative work, pp. 175–186, 1994.

[59] P. Resnick, H. R. Varian, and G. Editors, “Recommender Systems,”

Communications of the ACM, vol. 40, no. 3, pp. 56–58, 1997.

[60] W. Lin, “Efficient Adaptive-Support Association Rule Mining for

Recommender Systems,” Data mining and knowledge discovery vol.6, no.1

pp. 83–105, 2002.

[61] S. W. Ave, “Effective Personalization Based on Association Rule Discovery

from Web Usage Data,” In Proceedings of the 3rd international workshop on

Web information and data management, pp. 9–15, 2001.

References

180 | P a g e

[62] J. J. Sandvig, B. Mobasher, and R. Burke, “Robustness of Collaborative

Recommendation Based On Association Rule Mining,” In Proceedings of the

ACM conference on Recommender systems, pp. 105–111, 2007.

[63] M. O‟connor, D. Cosley, J. A. Konstan, and J. Riedl, “PolyLens: a

recommender system for groups of users,” in ECSCW, pp. 199–218, 2001.

[64] M. Anderson, M. Ball, H. Boley, S. Greene, N. Howse, D. Lemire, S.

McGrath, “RACOFI: A Rule-Applying Collaborative Filtering System,” Paper

presented at the conference IEEE/WIC COLA, Halifax, Canada. 13 October,

2003.

[65] K. Ali and W. Van Stam, “TiVo: making show recommendations using a

distributed collaborative filtering architecture,” in Proceedings of the tenth

ACM SIGKDD international conference on Knowledge discovery and data

mining, pp. 394–401, 2004.

[66] D. Lemire and S. Mcgrath, “Implementing a Rating-Based Item-to-Item

Recommender System in PHP / SQL,” D-01, On delette.com, pp. 84-98, 2005.

[67] Â. Cunningham and C. Hayes, “Smart radio - community based music radio,”

Knowledge-Based Systems vol.14, no. 3, pp.197-201, 2001.

[68] I. Barjasteh and R. Forsati, “Cold-Start Item and User Recommendation with

Decoupled Completion and Transduction,” In Proceedings of the 9th ACM

Conference on Recommender Systems, pp. 91-98, 2015.

[69] D. Library, I. Spectrum, M. Sites, C. Account, P. Sign, and R. Articles,

“Flycasting : using collaborative filtering to generate a playlist for online

radio,” In Proceedings of First International Conference on Web Delivering of

Music, pp. 123-130, 2001.

[70] M. P. Graus, “Improving the User Experience during Cold Start through

Choice-Based Preference Elicitation,” In Proceedings of the 9th ACM

Conference on Recommender Systems, pp. 273-276, 2015.

[71] M. J. Pazzani, “A framework for collaborative, content-based and

demographic filtering,” Artif. Intell. Rev., vol. 13, no. 5, pp. 393–408, 1999.

[72] G. Karypis, “Evaluation of Item-Based Top-N Recommendation Algorithms,”

Proc. tenth Int. Conf., no. TR 00--046, pp. 247–254, 2001.

References

181 | P a g e

[73] J. Gemmell, T. Schimoler, M. Ramezani, L. Christiansen, and B. Mobasher,

“Improving folkrank with item-based collaborative filtering,” Recomm. Syst.

Soc. Web, 2009.

[74] J. Gemmell, T. Schimoler, B. Mobasher, and R. Burke, “Recommendation by

Example in Social Annotation Systems,” In International Conference on

Electronic Commerce and Web Technologies, pp. 209–220, 2011.

[75] R. Jäschke, L. Marinho, A. Hotho, L. Schmidt-Thieme, and G. Stumme, “Tag

recommendations in folksonomies,” in European Conference on Principles of

Data Mining and Knowledge Discovery, pp. 506–514, 2007.

[76] A. Hotho, R. Jäschke, C. Schmitz, and G. Stumme, “Information retrieval in

folksonomies: Search and ranking,” in European Semantic Web conference,

pp. 411–426, 2006.

[77] N. Zheng and Q. Li, “A recommender system based on tag and time

information for social tagging systems,” Expert System with Applications., vol.

38, no. 4, pp. 4575–4587, 2011.

[78] J. Lu, D. Wu, M. Mao, W. Wang, and G. Zhang, “Recommender system

application developments : A survey,” Decision Support Systems, vol. 74, pp.

12–32, 2015.

[79] B. Marlin, “Modeling User Rating Profiles For Collaborative Filtering.” In

NIPS, pp. 627-634. 2003

[80] B. Marlin, “Collaborative filtering: A machine learning perspective.” Doctoral

dissertation, University of Toronto, 2004.

[81] J. W. Kim, B. H. Lee, M. J. Shaw, H.-L. Chang, and M. Nelson, “Application

of decision-tree induction techniques to personalized advertisements on

internet storefronts,” Int. J. Electron. Commer., vol. 5, no. 3, pp. 45–62, 2001.

[82] B. N. Miller, J. A. Konstan, and J. Riedl, “PocketLens: Toward a personal

recommender system,” ACM Trans. Inf. Syst., vol. 22, no. 3, pp. 437–476,

2004.

[83] R. Garfinkel, R. Gopal, A. Tripathi, and F. Yin, “Design of a shopbot and

recommender system for bundle purchases,” Decision Support Systems, vol.

42, pp. 1974–1986, 2006.

References

182 | P a g e

[84] X. Su and T. M. Khoshgoftaar, “Collaborative filtering for multi-class data

using belief nets algorithms,” in 18th IEEE International Conference on Tools

with Artificial Intelligence, pp. 497–504, 2006.

[85] R. M. Bell, Y. Koren, P. Ave, and F. Park, “Improved Neighborhood-based

Collaborative Filtering,” pp. 7–14, 2007.

[86] N. Sahoo, V. P. Singh, and T. Mukhopadhyay, "A hidden Markov model for

collaborative filtering," MIS Quarterly, vol. 6, no. 4, pp. 1329–1356, 2012.

[87] K. Verstrepen, “Top-N Recommendation for Shared Accounts,” In

Proceedings of the 9th ACM Conference on Recommender Systems pp. 59–

66, 2015.

[88] M. Aharon, D. Drachsler-cohen, O. Anava, N. Avigdor-elgrabli, and O.

Somekh, “ExcUseMe : Asking Users to Help in Item Cold-Start

Recommendations,” In Proceedings of the 9th ACM Conference on

Recommender Systems, pp. 83–90, 2015.

[89] R. R. Yager, “Fuzzy logic methods in recommender systems,” Fuzzy Sets

Syst., vol. 136, no. 2, pp. 133–149, 2003.

[90] S. Russell, P. Norvig, and A. Intelligence, “A modern approach,” Artif. Intell.

Prentice-Hall, Egnlewood Cliffs, vol. 25, p. 27, 1995.

[91] S. Weng and M. Liu, “Feature-based recommendations for one-to-one

marketing,” Expert Systems with Applications, vol. 26, pp. 493–508, 2004.

[92] K. W. Cheung, K. C. Tsui, and J. Liu, “Extended latent class models for

collaborative recommendation,” IEEE Trans. Syst. Man, Cybern. A Syst.

Humans, vol. 34, no. 1, pp. 143–148, 2004.

[93] R. S. Chen, Y. S. Tsai, K. C. Yeh, D. H. Yu, and Y. Bak-Sau, “Using data

mining to provide recommendation service,” WSEAS Trans. Inf. Sci. Appl.,

vol. 5, no. 4, pp. 459–474, 2008.

[94] A. S. Tewari and K. Priyanka, “Book recommendation system based on

collaborative filtering and association rule mining for college students,” in

International Conference on Contemporary Computing and Informatics

(IC3I), pp. 135–138, 2014.

[95] M. Mikawa, S. Izumi, and K. Tanaka, “Book recommendation signage system

using silhouette-based gait classification,” in 10th International Conference on

References

183 | P a g e

Machine Learning and Applications and Workshops (ICMLA), vol. 1, pp. 416–

419, 2011.

[96] J. Salter and N. Antonopoulos, “CinemaScreen recommender agent:

combining collaborative and content-based filtering,” IEEE Intell. Syst., vol.

21, no. 1, pp. 35–41, 2006.

[97] G. Kazai, D. Clarke, and M. Venanzi, “A Personalised Reader for Crowd

Curated Content,” In Proceedings of the 9th ACM Conference on

Recommender Systems, pp. 325–326, 2015.

[98] K. Lang, “Newsweeder: Learning to filter netnews,” in Proceedings of the

12th international conference on machine learning, pp. 331–339, 1995.

[99] R. J. M. and L. Roy, “Content-based book recommendation using learning for

text categorization,” In Proceedings of the 5th ACM Conference on Digital

Libraries, pp. 195-204, 2000.

[100] P. Jomsri, “Book recommendation system for digital library based on user

profiles by using association rule,” in Fourth International Conference on

Innovative Computing Technology (INTECH), pp. 130–134, 2014.

[101] D. Billsus, M. J. Pazzani, and J. Chen, “A Learning Agent for Wireless News

Access,” In Proceedings of the 5th international conference on Intelligent user

interfaces, pp. 33–36, 2000.

[102] Y. Blanco-fernández, J. J. Pazos-arias, A. Gil-solla, and M. Ramos-cabrer,

“Providing Entertainment by Content-based Filtering and Semantic Reasoning

in Intelligent Recommender Systems,” IEEE Transactions on Consumer

Electronics, vol.54, no.2, May 2008.

[103] T. Bansal, “Content Driven User Profiling for

Comment-Worthy Recommendations of News and Blog Articles Categories

and Subject Descriptors,” RecSys‟15, Vienna, Austria, pp. 195–202, 2015.

[104] M. de Gemmis, P. Lops, C. Musto, F. Narducci, and G. Semeraro,

“Semanticsaware content-based recommender systems,” in Recommender

Systems Handbook, Springer, pp. 119–159, 2015.

[105] J. Kyeong, Y. Ho, W. Ju, J. Ran, and J. Hae, “A personalized recommendation

procedure for Internet shopping support,” Electronic Commerce Research and

Applications, vol. 1, pp. 301–313, 2002.

References

184 | P a g e

[106] Z. Zeng, “An Intelligent E-commerce Recommender System Based on Web

Mining,” International Journal of Business and Management, vol. 4, no. 7, pp.

10–14, 2009.

[107] D. R. Liu and Y. Y. Shih, “Integrating AHP and data mining for product

recommendation based on customer lifetime value,” Inf. Manag., vol. 42, no.

3, pp. 387–400, 2005.

[108] P. Melville, R. J. Mooney, and R. Nagarajan, “Content-Boosted Collaborative

Filtering for Improved Recommendations,” In Proceedings of the Eighteenth

National Conference on Artificial Intelligence(AAAI-2002), pp. 187-192,

Edmonton, Canada, July, pp. 187–192, 2002.

[109] B. Krulwich, “Lifestyle finder: Intelligent user profiling using large-scale

demographic data,” AI Mag., vol. 18, no. 2, p. 37, 1997.

[110] L. Safoury and A. Salah, “Exploiting User Demographic Attributes for Solving

Cold-Start Problem in Recommender System,” Lecture Notes on Software

Engineering, vol. 1, no. 3, pp. 1–5, August 2013.

[111] B. Towle and C. Quinn, “Knowledge Based Recommender Systems Using

Explicit user models,” In Proceedings of the AAAI Workshop on

KnowledgeBased Electronic Markets pp. 74–77, 2000.

[112] E. Vildjiounaite, V. Kyllönen, T. Hannula, and P. Alahuhta, “Unobtrusive

dynamic modelling of tvprogramme preferences in a finnish household,”

Multimed. Syst., vol. 15, no. 3, pp. 143–157, 2009.

[113] R. D. Burke, K. J. Hammond, and B. C. Yound, “The FindMe approach to

assisted browsing,” IEEE Expert, vol. 12, no. 4, pp. 32–40, 1997.

[114] R. D. Burke, K. J. Hammond, and B. C. Young, “Knowledge-based navigation

of complex information spaces,” in Proceedings of the national conference on

artificial intelligence, vol. 462, p. 468, 1996.

[115] J. F. McCarthy, “Pocket restaurantfinder: A situated recommender system for

groups,” in Workshop on Mobile Ad-Hoc Communication at the 2002 ACM

Conference on Human Factors in Computer Systems, 2002.

[116] M. Biňas and E. Pietriková, “Useful recommendations for successful

implementation of programming courses,” in IEEE 12th International

References

185 | P a g e

Conference on Emerging eLearning Technologies and Applications (ICETA),

pp. 397–401, 2014.

[117] Y. Jing, “A Study of the Mass Customization-Based Strategy for the

Recommendation of Online Course Resources of the Open University of

China,” in 8th International Symposium on Computational Intelligence and

Design (ISCID), vol. 2, pp. 311–314, 2015.

[118] G. Koutrika, B. Bercovitz, R. Ikeda, F. Kaliszan, H. Liou, and H. Garcia-

Molina, “Flexible recommendations for course planning,” 2008.

[119] S. B. Aher and L. M. R. J. Lobo, “Combination of machine learning

algorithms for recommendation of courses in E-Learning System based on

historical data,” Knowledge-Based Systems, vol. 51, pp. 1–14, 2013.

[120] R. Burke, “Knowledge-based recommender systems,” Encycl. Libr. Inf. Syst.,

vol. 69, no. Supplement 32, pp. 175–186, 2000.

[121] A. Felfernig and R. Burke, “Constraint-based recommender systems:

technologies and research issues,” in Proceedings of the 10th international

conference on Electronic commerce, p. 3, 2008.

[122] D. Bridge, M. H. Göker, L. McGinty, and B. Smyth, “Case-based

recommender systems,” Knowl. Eng. Rev., vol. 20, no. 3, pp. 315–320, 2005.

[123] R. Burke, “Integrating knowledge-based and collaborative-filtering

recommender systems,” in Proceedings of the Workshop on AI and Electronic

Commerce, pp. 69–72, 1999.

[124] A. Felfernig and R. Burke, “Constraint-based Recommender Systems :

Technologies and Research Issues,” In Proceedings of the 10th international

conference on Electronic commerce, pp. 3, 2008.

[125] B. Smyth, “Case-Based Recommendation,” The Adaptive Web, LNCS 4321,

pp. 342–376, 2007.

[126] D. McSherry, “Minimizing dialog length in interactive case-based reasoning,”

in Proceedings of the 17th international joint conference on Artificial

intelligence-Volume 2, pp. 993–998, 2001.

[127] D. McSherry, “Increasing dialogue efficiency in case-based reasoning without

loss of solution quality,” in IJCAI, pp. 121–126, 2003.

References

186 | P a g e

[128] K. McCarthy, J. Reilly, L. McGinty, and B. Smyth, “Thinking

positivelyexplanatory feedback for conversational recommender systems,” in

Proceedings of the European Conference on Case-Based Reasoning

(ECCBR‟04) Explanation Workshop, pp. 115–124, 2004.

[129] A. Felfernig, G. Friedrich, K. Isak, K. Shchekotykhin, E. Teppan, and

D. Jannach, “Automated debugging of recommender user interface

descriptions,” Appl. Intell., vol. 31, no. 1, pp. 1–14, 2009.

[130] A. Felfernig, K. Isak, K. Szabo, and P. Zachar, “The VITA Financial Services

Sales Support Environment,” Association for the Advancement of Artificial

Intelligence, pp. 1692–1699, 2007.

[131] A. Felfernig, “Standardized configuration knowledge representations as

technological foundation for mass customization,” IEEE Trans. Eng. Manag.,

vol. 54, no. 1, pp. 41–56, 2007.

[132] Z. Jiang, W. Wang, and I. Benbasat, “Multimedia-based interactive advising

technology for online consumer decision support,” Commun. ACM, vol. 48,

no. 9, pp. 92–98, 2005.

[133] P. Auteri and R. Turrin, “Personalized Catch-up & DVR: VOD or Linear, That

is the Question,” in Proceedings of the 9th ACM Conference on Recommender

Systems, p. 227, 2015.

[134] M. Balabanović and Y. Shoham, “Fab: content-based, collaborative

recommendation,” Commun. ACM, vol. 40, no. 3, pp. 66–72, 1997.

[135] Q. Li and B. M. Kim, “Clustering approach for hybrid recommender system,”

in Proceedings of IEEE/WIC International Conference on Web Intelligence,

pp. 33–38, 2003.

[136] R. Hu and P. Pu, “Using personality information in collaborative filtering for

new users,” Recomm. Syst. Soc. Web, vol. 17, 2010.

[137] I. Soboroff and C. Nicholas, “Combining content and collaboration in text

filtering,” in Proceedings of the IJCAI, vol. 99, pp. 86–91, 1999.

[138] B. Smyth and P. Cotter, “A personalized television listings service,” Commun.

ACM, vol. 43, no. 8, pp. 107–111, 2000.

[139] A. Ansari, S. Essegaier, and R. Kohli, “Internet recommendation systems,”

American Marketing Association, pp. 363-375, 2000.

[140] F. Ortega, J. Sánchez, J. Bobadilla, and A. Gutiérrez, “Improving collaborative

filtering-based recommender systems results using Pareto dominance,”

Information Sciences, vol. 239, pp. 50–61, 2013.

[141] C. Basu, H. Hirsh, and W. Cohen, “Recommendation as classification: Using

social and content-based information in recommendation,” In Technical

Report, AAAI Press, Menlo Park, pp. 714-720, 1998.

[142] S. E. Middleton, N. R. Shadbolt, and D. C. De Roure, “Ontological

user profiling in recommender systems,” ACM Trans. Inf. Syst., vol. 22, no. 1,

pp. 54–88, 2004.

[143] A. Popescul, D. M. Pennock, and S. Lawrence, “Probabilistic models for

unified collaborative and content-based recommendation in sparse-data

environments,” in Proceedings of the Seventeenth conference on Uncertainty

in artificial intelligence, pp. 437–444, 2001.

[144] T. Hofmann and J. Puzicha, “Latent class models for collaborative filtering,”

in IJCAI, 1999, vol. 99, no. 1999.

[145] M. Claypool, A. Gokhale, T. Miranda, P. Murnikov, D. Netes, and M. Sartin,

“Combining content-based and collaborative filters in an online newspaper,” in

Proceedings of ACM SIGIR workshop on recommender systems, vol. 60, 1999.

[146] J. K. Kim, H. K. Kim, H. Y. Oh, and Y. U. Ryu, “A group recommendation

system for online communities,” Int. J. Inf. Manage., vol. 30, no. 3, pp. 212–

219, 2010.

[147] M. Brown-Sica, “Using academic courses to generate data for use in evidence

based library planning,” The Journal of Academic Librarianship, vol. 39, no. 3,

pp. 275–287, 2013.

[148] D. Billsus and M. J. Pazzani, “User modeling for adaptive news access,” User

Model. User-adapt. Interact., vol. 10, no. 2–3, pp. 147–180, 2000.

[149] A. Sami, R. Nagatomi, M. Terabe, and K. Hashimoto, “Design of physical

activity recommendation system,” in Multi Conference on Informatics and

Computer Science and Information Systems, (MCCSIS-IADIS) Data Mining,

pp. 148-152, 2008.

[150] M. Wiesner and D. Pfeifer, “Adapting recommender systems to the

requirements of personal health record systems,” in Proceedings of the 1st

ACM International Health Informatics Symposium, pp. 410–414, 2010.

[151] L. Fernandez-Luque, R. Karlsen, and L. K. Vognild, “Challenges and

opportunities of using recommender systems for personalized health

education,” in MIE, pp. 903–907, 2009.

[152] L. C. Braat and R. de Groot, “The ecosystem services agenda: bridging the

worlds of natural science and economics, conservation and development, and

public and private policy,” Ecosyst. Serv., vol. 1, no. 1, pp. 4–15, Jul. 2012.

[153] I. Gonzalez-Carrasco, R. Colomo-Palacios, J. L. Lopez-Cuadrado, Á. Garcı,

and B. Ruiz-Mezcua, “PB-ADVISOR: A private banking multi-investment

portfolio advisor,” Inf. Sci. (Ny)., vol. 206, pp. 63–82, 2012.

[154] M. Tang, Y. Jiang, J. Liu, and X. Liu, “Location-aware collaborative filtering

for QoS-based service recommendation,” in 19th IEEE International

Conference on Web Services (ICWS), pp. 202–209, 2012.

[155] Y. Koren, “Collaborative filtering with temporal dynamics,” Commun. ACM,

vol. 53, no. 4, pp. 89–97, 2010.

[156] J. Gemmell, T. Schimoler, M. Ramezani, L. Christiansen, and B. Mobasher,

“Improving FolkRank With Item-Based Collaborative Filtering,”

Recommender Systems & the Social Web, Oct 25 2009.

[157] D. Godoy and A. Amandi, “Hybrid content and tag-based profiles for

recommendation in collaborative tagging systems,” in Latin American Web

Conference (LA-WEB'08), pp. 58–65, 2008.

[158] L. Iaquinta, M. De Gemmis, P. Lops, G. Semeraro, M. Filannino, and P.

Molino, “Introducing serendipity in a content-based recommender system,” in

Eighth International Conference on Hybrid Intelligent Systems (HIS'08), pp.

168–173, 2008.

[159] A. Nanopoulos, D. Rafailidis, P. Symeonidis, and Y. Manolopoulos,

“Musicbox: Personalized music recommendation based on cubic analysis of

social tags,” IEEE Trans. Audio. Speech. Lang. Processing, vol. 18, no. 2, pp.

407–412, 2010.

[160] H. W. Tung and V. W. Soo, “A personalized restaurant recommender agent for

mobile e-service,” in IEEE International Conference on e-Technology,

eCommerce and e-Service (EEE'04), pp. 259–262, 2004.

[161] W. Woerndl, C. Schueller, and R. Wojtech, “A hybrid recommender system

for context-aware recommendations of mobile applications,” in 23rd

International Conference on Data Engineering Workshop, pp. 871–878, 2007.

[162] M. Brunato and R. Battiti, “PILGRIM: A location broker and mobility-aware

recommendation system,” in Proceedings of the First IEEE International

Conference on Pervasive Computing and Communications, pp. 265–272,

2003.

[163] M. Sarwat, J. J. Levandoski, A. Eldawy, and M. F. Mokbel, “LARS*:

An efficient and scalable location-aware recommender system,” IEEE Trans.

Knowl. Data Eng., vol. 26, no. 6, pp. 1384–1399, 2014.

[164] J. J. Levandoski, M. Sarwat, A. Eldawy, and M. F. Mokbel, “Lars: A

location-aware recommender system,” in IEEE 28th International Conference

on Data Engineering (ICDE), pp. 450–461, 2012.

[165] W. S. Yang, H. C. Cheng, and J. B. Dia, “A location-aware recommender

system for mobile shopping environments,” Expert Systems with Applications,

vol. 34, no. 1, pp. 437–445, 2008.

[166] M. H. Park, J. H. Hong, and S. B. Cho, “Location-based recommendation

system using Bayesian user's preference model in mobile devices,” in

International Conference on Ubiquitous Intelligence and Computing, pp.

1130–1139, 2007.

[167] K. Božidar, O. Dijana, and B. Nina, “Temporal recommender systems,” in

Proceedings of the 10th WSEAS international conference on Applied computer

and applied computational science, pp. 248–253, 2011.

[168] T. Q. Lee, Y. Park, and Y.-T. Park, “A time-based approach to effective

recommender systems using implicit feedback,” Expert Systems with

Applications, vol. 34, no. 4, pp. 3055–3062, 2008.

[169] N. Lathia, S. Hailes, L. Capra, and X. Amatriain, “Temporal diversity in

recommender systems,” in Proceedings of the 33rd international ACM SIGIR

conference on Research and development in information retrieval, pp. 210–

217, 2010.

[170] F. Ullah, G. Sarwar, S. C. Lee, Y. K. Park, K. D. Moon, and J. T. Kim,

“Hybrid recommender system with temporal information,” in International

Conference on Information Networking (ICOIN), pp. 421–425, 2012.

[171] Ò. Celma, “Foafing the music: Bridging the semantic gap in music

recommendation,” in International Semantic Web Conference, pp. 927–934,

2006.

[172] J. He and W. W. Chu, “A social network-based recommender system (SNRS),”

Physics Reports, Springer, vol.519, no.1, pp.1-49, 2010.

[173] J.-C. Wang and C.-C. Chiu, “Recommending trusted online auction sellers

using social network analysis,” Expert Systems with Applications, vol. 34, no.

3, pp. 1666–1679, 2008.

[174] C. Cornelis, X. Guo, J. Lu, and G. Zhang, “A Fuzzy Relational Approach to

Event Recommendation,” 2nd Indian International Conference on Artificial

Intelligence (IICAI), pp. 2231–2242, 2007.

[175] C. Cornelis, J. Lu, X. Guo, and G. Zhang, “One-and-only item

recommendation with fuzzy logic techniques,” Information Sciences, vol. 177,

pp. 4906–4921, 2007.

[176] Z. Zhang, H. Lin, K. Liu, D. Wu, G. Zhang, and J. Lu, “A hybrid fuzzy-based

personalized recommender system for telecom products/services,” Inf. Sci.

(Ny)., vol. 235, pp. 117–129, 2013.

[177] M. Y. H. Al-Shamri and K. K. Bharadwaj, “Fuzzy-genetic approach to

recommender systems based on a novel hybrid user model,” Expert Systems

with Applications, vol. 35, pp. 1386–1399, 2008.

[178] M. J. Pazzani, “A framework for collaborative, content-based and

demographic filtering,” Artif. Intell. Rev., vol. 13, no. 5–6, pp. 393–408, 1999.

[179] M. Deshpande and G. Karypis, “Item-Based Top-N Recommendation

Algorithms,” vol. 22, no. 1, pp. 143–177, 2004.

[180] G. Karypis, “Evaluation of item-based top-n recommendation algorithms,” in

Proceedings of the tenth international conference on Information and

knowledge management, pp. 247–254, 2001.

[181] G. Shani and A. Gunawardana, “Evaluating Recommendation Systems,” In

Recommender systems handbook, pp. 257-297, 2011.

[182] W. Lin, S. A. Alvarez, and C. Ruiz, “Collaborative recommendation via

adaptive association rule mining,” Data Mining and Knowledge Discovery,

vol. 6, pp. 83–105, 2000.

[183] P. Symeonidis, A. Nanopoulos, A. N. Papadopoulos, and Y. Manolopoulos,

“Collaborative recommender systems: Combining effectiveness and

efficiency,” Expert Systems with Applications, vol. 34, no. 4, pp. 2995–3013,

May 2008.

[184] S. Gong, “A collaborative filtering recommendation algorithm based on user

clustering and item clustering,” J. Softw., vol. 5, no. 7, pp. 745–752, 2010.

[185] B. Sarwar, G. Karypis, J. Konstan, and J. Riedl, “Item-Based Collaborative Filtering

Recommendation Algorithms,” in Proceedings of the 10th international

conference on World Wide Web, pp. 285–295, 2001.

[186] J. L. Herlocker, J. A. Konstan, L. G. Terveen, and J. T. Riedl, “Evaluating

Collaborative Filtering Recommender Systems,” vol. 22, no. 1, pp. 5–53,

2004.

[187] R. Garcia and X. Amatriain, “Weighted content based methods for

recommending connections in online social networks,” in Workshop on

Recommender Systems and the Social Web, pp. 68–71, 2010.

[188] M. J. Pazzani and D. Billsus, “Content-Based Recommendation Systems,”

Adapt. Web, vol. 4321, pp. 325–341, 2007.

[189] T. Joachims, “Text categorization with support vector machines: Learning

with many relevant features,” in European conference on machine learning,

pp. 137–142, 1998.

[190] W. Huang, Y. Zhao, S. Yang, and Y. Lu, “Analysis of the user behavior and

opinion classification based on the BBS,” Appl. Math. Comput., vol. 205, no.

2, pp. 668–676, 2008.

[191] P. Lops, M. De Gemmis, and G. Semeraro, “Content-based Recommender

Systems: State of the Art and Trends,” in Recommender systems handbook,

pp. 73-105, Springer, US, 2011.

[192] R. J. Mooney and L. Roy, “Content-based book recommending using learning

for text categorization,” in Proceedings of the fifth ACM conference on Digital

libraries, pp. 195–204, 2000.

[193] C. Romero and S. Ventura, “Educational data mining: A survey from 1995 to

2005,” Expert Systems with Applications, vol. 33, no. 1, pp. 135–146, 2007.

[194] P. Taylor and R. R. Yager, “Multicriteria decision-making using fuzzy

measures,” Cybernetics and Systems, vol. 46, no. 3–4, pp. 150–171, August

2015.

[195] D. Anand and B. S. Mampilli, “Folksonomy-based fuzzy user profiling for

improved recommendations,” Expert Systems with Applications, vol. 41, no. 5,

pp. 2424–2436, Apr. 2014.

[196] E. S. Han and G. Karypis, “Feature-Based Recommendation System,” In

Proceedings of the 14th ACM international conference on Information and

knowledge management, pp. 446-452, 2005.

[197] K. Choi, D. Yoo, G. Kim, and Y. Suh, “A hybrid online-product

recommendation system: Combining implicit rating-based collaborative

filtering and sequential pattern analysis,” Electron. Commer. Res. Appl., vol.

11, no. 4, pp. 309–317, 2012.

[198] N. Pellas and I. Kazanidis, “Online and hybrid university-level courses with

the utilization of Second Life: Investigating the factors that predict student

choice in Second Life supported online and hybrid university-level courses,”

Computers in Human Behavior, vol. 40, pp. 31–43, 2014.

[199] P. C. Vaz, D. M. De Matos, B. Martins, and P. Calado, “Improving a Hybrid

Literary Book Recommendation System through Author Ranking Categories

and Subject Descriptors,” In Proceedings of the 12th ACM/IEEE-CS joint

conference on Digital, pp. 387–388, 2012.

[200] S. Maneeroj and A. Takasu, “Hybrid recommender system using latent

features,” in International Conference on Advanced Information Networking

and Applications Workshops (WAINA'09), pp. 661–666, 2009.

[201] S. K. Shinde and U. Kulkarni, “Hybrid personalized recommender system

using centering-bunching based clustering algorithm,” Expert Systems with

Applications, vol. 39, no. 1, pp. 1381–1387, 2012.

[202] T. Tran and R. Cohen, “Hybrid recommender systems for electronic

commerce,” in Proc. Knowledge-Based Electronic Markets, Papers from the

AAAI Workshop, Technical Report WS-00-04, AAAI Press, 2000.

[203] R. E. Krainer, “Towards a program for financial stability,” J. Econ. Behav.

Organ., vol. 85, pp. 207–218, Jan. 2013.

[204] L. M. De Campos, J. M. Fernández-Luna, J. F. Huete, and M. A.

Rueda-Morales, “Combining content-based and collaborative

recommendations: A hybrid approach based on Bayesian networks,” Int. J.

Approx. Reason., vol. 51, no. 7, pp. 785–799, 2010.

[205] R. Burke, “Hybrid Web Recommender Systems,” The Adaptive Web, LNCS

4321, pp. 377–408, 2007.

[206] O. Kaššák, M. Kompan, and M. Bieliková, “Personalized hybrid

recommendation for group of users: Top-N multimedia recommender,” Inf.

Process. Manag., vol. 52, no. 3, pp. 459–477, 2016.

[207] G. Badaro, H. Hajj, W. El-Hajj, and L. Nachman, “A hybrid approach with

collaborative filtering for recommender systems,” in 9th International

Wireless Communications and Mobile Computing Conference (IWCMC), pp.

349–354, 2013.

[208] A. C. Bukhari and Y.-G. Kim, “Integration of a secure type-2 fuzzy ontology

with a multi-agent platform: A proposal to automate the personalized flight

ticket booking domain,” Inf. Sci. (Ny)., vol. 198, pp. 24–47, Sep. 2012.

[209] C. Porcel, A. Tejeda-Lorente, M. A. Martínez, and E. Herrera-Viedma, “A

hybrid recommender system for the selective dissemination of research

resources in a technology transfer office,” Inf. Sci. (Ny)., vol. 184, no. 1, pp. 1–

19, 2012.

[210] L. Qiu and I. Benbasat, “A study of demographic embodiments of product

recommendation agents in electronic commerce,” J. Hum. Comput. Stud., vol.

68, no. 10, pp. 669–688, 2010.

[211] N. Korfiatis and M. Poulos, “Using online consumer reviews as a source for

demographic recommendations: A case study using online travel reviews,”

Expert Systems with Applications, vol. 40, no. 14, pp. 5507–5515, Oct. 2013.

[212] B. Krulwich, “Using Large-Scale Demographic Data,” AI magazine, vol. 18,

no. 2, pp. 37–46, 1997.

[213] J. Beel, S. Langer, A. Nürnberger, and M. Genzmehr, “The Impact of

Demographics (Age and Gender) and Other User-Characteristics on Evaluating

Recommender Systems,” pp. 400–404, 2013.

[214] S. Loh, F. Lorenzi, R. Granada, D. Lichtnow, L. K. Wives, and J. P. M. de

Oliveira, “Identifying Similar Users by their Scientific Publications to Reduce

Cold Start in Recommender Systems,” in WEBIST, vol. 9, pp. 593–600, 2009.

[215] R. R. Yager and N. Rochelle, “Constraint Satisfaction Using Soft Quantifiers,”

Intelligent Systems in Accounting, Finance and Management, vol. 12, no. 3,

pp. 177–186, 2004.

[216] J. A. Recio-garc and D. Bridge, “CBR for CBR: A Case-Based Template

Recommender System for Building Case-Based Systems,” in European

Conference on Case-Based Reasoning, pp. 459–473, 2008.

[217] D. Lee, “Case-based Recommendation Case-based Reasoning,” pp. 1–17,

2011.

[218] F. Lorenzi and F. Ricci, “Case-Based Recommender Systems: A Unifying

View,” in Intelligent Techniques for Web Personalization, pp. 89–113, 2005.

[219] J. Bobadilla, F. Ortega, A. Hernando, and A. Gutiérrez, “Recommender systems

survey,” Knowledge-Based Syst., vol. 46, pp. 109–132, 2013.

[220] F. Chaumartin, L. Talana, and U. Paris, “UPAR7: A knowledge-based system

for headline sentiment tagging,” Proceedings of the 4th International

Workshop on Semantic Evaluations (SemEval), pp. 422–425, Prague, June

2007.

[221] T. C.-K. Huang, “Recommendations of closed consensus temporal

patterns by group decision making,” Knowledge-Based Syst., vol. 54, pp. 318–

328, Dec. 2013.

[222] S. S. Anand, D. A. Bell, and J. G. Hughes, “EDM: A general framework for

Data Mining based on Evidence Theory,” Data & Knowledge Engineering,

vol. 18, pp. 89–223, 1996.

[223] G. Adomavicius and F. Ricci, “RecSys'09 workshop 3: workshop on

context-aware recommender systems (CARS-2009),” RecSys, p. 60558, 2009.

[224] G. Adomavicius and A. Tuzhilin, “Context-aware recommender systems,”

Recomm. Syst. Handbook, Second Ed., pp. 67–80, 2015.

[225] K. Verbert, N. Manouselis, X. Ochoa, M. Wolpers, H. Drachsler, I. Bosnic, S.

Member, and E. Duval, “Context-Aware Recommender Systems for Learning:

A Survey and Future Challenges,” vol. 5, no. 4, pp. 318–335, 2012.

[226] A. Q. Macedo, C. Grande, and C. Grande, “Context-Aware Event

Recommendation in Event-based Social Networks,” In Proceedings of the 9th

ACM Conference on Recommender Systems, pp. 123–130, 2015.

[227] J. Bao, Y. Zheng, and M. F. Mokbel, “Location-based and preference-aware

recommendation using sparse geo-social networking data,” in Proceedings of

the 20th international conference on advances in geographic information

systems, pp. 199–208, 2012.

[228] P. Heshi, G. Kulkarni, K. Bidkar, and A. Oswal, “Location Aware

Recommendation System,” Imp. J. Interdiscip. Res., vol. 2, no. 6, 2016.

[229] P. Bedi, H. Kaur, and S. Marwaha, “Trust Based Recommender System for

Semantic Web,” in IJCAI, vol. 7, pp. 2677–2682.

[230] C. Chen, H. Yin, J. Yao, and B. Cui, “Terec: A temporal recommender system

over tweet stream,” Proc. VLDB Endow., vol. 6, no. 12, pp. 1254–1257, 2013.

[231] D. A. DellaSala, C. T. Hanson, W. L. Baker, R. L. Hutto, R. W. Halsey, D. C.

Odion, L. E. Berry, R. W. Abrams, P. Heneberg, and H. Sitters, “Chapter 13 -

Flight of the Phoenix: Coexisting with Mixed-Severity Fires BT - The

Ecological Importance of Mixed-Severity Fires,” Elsevier, pp. 372–396, 2015.

[232] X. Zhou, Y. Xu, Y. Li, A. Josang, and C. Cox, “The state-of-the-art in

personalized recommender systems for social networking,” Artif. Intell. Rev.,

vol. 37, no. C, pp. 119–132, 2012.

[233] L. Lü, M. Medo, C. H. Yeung, Y.-C. Zhang, Z.-K. Zhang, and T. Zhou,

“Recommender systems,” Phys. Rep., vol. 519, no. 1, pp. 1–49, 2012.

[234] S. N. Revolution, “The Present and Future of Recommender Systems,”

Summer Schools Lecture, Spain, 2006.

[235] R. R. Yager, “Quantifier Guided Aggregation Using OWA Operators,” IJIS,

Wiley, vol. 11, pp. 49–73, 1996.

[236] J. M. Merigo and M. Casanovas, “Generalized aggregation operators and fuzzy

numbers in a unified model between the weighted average and the OWA

operator,” International Journal of Intelligent Systems, vol. 24, pp. 934–954,

2008.

[237] A. Zenebe and A. F. Norcio, “Representation, similarity measures and

aggregation methods using fuzzy sets for content-based recommender

systems,” Fuzzy Sets and Systems, vol. 160, pp. 76–94, 2009.

[238] L. M. De Campos, J. M. Fernández-Luna, and J. F. Huete, “A collaborative

recommender system based on probabilistic inference from fuzzy

observations,” Fuzzy Sets and Systems, vol. 159, no. 12, pp. 1554–1576, 2008.

[239] B. Mobasher, R. Cooley, and J. Srivastava, “Automatic personalization based

on web usage mining,” Commun. ACM, vol. 43, no. 8, pp. 142–151, 2000.

[240] J. C. de Borda, “Mémoire sur les élections au scrutin,” 1781.

[241] C. Dwork, R. Kumar, M. Naor, and D. Sivakumar, “Rank aggregation methods

for the web,” in Proceedings of the 10th international conference on World

Wide Web, pp. 613–622, 2001.

[242] M. M. S. Beg, “User feedback based enhancement in web search quality,”

Information Sciences, vol. 170, no. 2, pp.153-172, 2005.

[243] R. Ali, “Pro-Mining: Product recommendation using web-based opinion

mining,” IJCET, vol. 4, no. 6, pp. 299–313, 2013.

[244] “QS World University Ranking.” [Online]. Available:

http://www.topuniversities.com/university-rankings/university-subjectrankings/2015/computer-science-informationsystems#sorting=rank+region=+country=96+faculty=+stars=false+search=.

[245] L. Getoor, C. Park, and C. P. Diehl, “Link Mining: A Survey,” ACM

SIGKDD Explorations Newsletter, vol. 7, no. 2, pp. 3–12, 2005.

[246] T. E. Senator and N. F. Drive, “Link Mining Applications: Progress and

Challenges,” ACM SIGKDD Explorations Newsletter, vol. 7, no. 2, pp. 76–83,

2005.

[247] R. R. Yager, “On ordered weighted averaging aggregation operators in

multicriteria decisionmaking,” IEEE Trans. Syst. Man. Cybern., vol. 18, no. 1,

pp. 183–190, 1988.

[248] G. Beliakov, A. Pradera, and T. Calvo, “Aggregation functions: A guide for

practitioners,” Heidelberg, Springer, vol. 221, 2007.

[249] M. M. S. Beg, “User feedback based enhancement in web search quality,” Inf.

Sci. (Ny)., vol. 170, no. 2, pp. 153–172, 2005.

[250] M. M. S. Beg, “A subjective measure of web search quality,” Inf. Sci. (Ny).,

vol. 169, no. 3, pp. 365–381, 2005.

[251] J. Malczewski, “Ordered weighted averaging with fuzzy quantifiers: GIS-based

multicriteria evaluation for land-use suitability analysis,” International

Journal of Applied Earth Observation and Geo Information, vol. 8, pp. 270–

277, 2006.

[252] C. K. Makropoulos and D. Butler, “Spatial ordered weighted averaging:

incorporating spatially variable attitude towards risk in spatial,” Environmental

Modelling & Software, vol. 21, pp. 69–84, 2006.

[253] B. M. Rasmussen, B. Melgaard, and B. Kristensen, “GIS for decision

support - designation of potential wetlands,” in 3rd International Conference on

Geospatial Information in Agriculture and Forestry, 2001.

[254] M. Lee, J. Chang, and J. Chen, “Fuzzy Preference Relations in Group Decision

Making Problems Based on Ordered Weighted Averaging Operators,”

International Journal of Artificial Intelligence and Applications for Smart

Devices, vol. 2, no. 1, pp. 11–22, 2014.

[255] N. Kaji and K. Masaru, “Automatic Construction of Polarity-tagged

Corpus from HTML Documents,” COLING-ACL '06 Proc. COLING/ACL

Main Conf. poster Sess., no. July, pp. 452–459, 2006.

[256] A.-M. Popescu, B. Nguyen, and O. Etzioni, “OPINE: Extracting product

features and opinions from reviews,” in Proceedings of HLT/EMNLP on

interactive demonstrations, pp. 32–33, 2005.

[257] L. Zhuang, F. Jing, and X. Zhu, “Movie Review Mining and Summarization,”

in Proceedings of the 15th ACM international conference on Information

and knowledge management, pp. 43-50, 2006.

[258] J. Beel, S. Langer, and M. Genzmehr, “Sponsored vs. organic (research paper)

recommendations and the impact of labeling,” in International Conference on

Theory and Practice of Digital Libraries, pp. 391–395, 2013.

[259] M. M. S. Beg, “Novel Fuzzy Queries for Searching the World Wide Web,” In

Proc. Int. Conf. on High Performance Computing (HiPC 2002), Workshop on

Soft Computing (WOSCO 2002), Bangalore, 2002.

[260] M. Ge, C. Delgado-Battenfeld, and D. Jannach, “Beyond accuracy: evaluating

recommender systems by coverage and serendipity,” in Proceedings of the

fourth ACM conference on Recommender systems, pp. 257–260, 2010.

[261] A. Gunawardana, “A Survey of Accuracy Evaluation Metrics of

Recommendation Tasks,” Journal of Machine Learning Research, vol. 10, pp.

2935–2962, 2009.

[262] W. Hersh, A. Turpin, S. Price, B. Chan, D. Kramer, L. Sacherek, and D.

Olson, “Do batch and user evaluations give the same results?,” in Proceedings

of the 23rd annual international ACM SIGIR conference on Research and

development in information retrieval, pp. 17–24, 2000.

[263] D. Jannach, L. Lerche, F. Gedikli, and G. Bonnin, “What recommenders

recommend–an analysis of accuracy, popularity, and sales diversity effects,” in

International Conference on User Modeling, Adaptation, and Personalization,

pp. 25–37, 2013.

[264] A. H. Turpin and W. Hersh, “Why batch and user evaluations do not give the

same results,” in Proceedings of the 24th annual international ACM SIGIR

conference on Research and development in information retrieval, pp. 225–

231, 2001.

[265] S. M. McNee, I. Albert, D. Cosley, P. Gopalkrishnan, S. K. Lam, A.

M. Rashid, J. A. Konstan, and J. Riedl, “On the recommending of citations for

research papers,” in Proceedings of the 2002 ACM conference on Computer

supported cooperative work, pp. 116–125, 2002.

[266] F. Ricci, L. Rokach, B. Shapira, and P. B. Kantor, Recommender

systems handbook, Springer, pp. 1-35, 2011.

[267] G. Schröder, M. Thiele, and W. Lehner, “Setting goals and choosing metrics

for recommender system evaluations,” in UCERSTI2 Workshop at the 5th

ACM Conference on Recommender Systems, Chicago, USA, vol. 23, p. 53,

2011.

[268] F. Hernández del Olmo and E. Gaudioso, “Evaluation of recommender

systems: A new approach,” Expert Systems with Applications, vol. 35, no. 3,

pp. 790–804, 2008.

[269] P. Cremonesi, R. Turrin, E. Lentini, and M. Matteucci, “An evaluation

methodology for collaborative recommender systems,” in International

Conference on Automated solutions for Cross Media Content and Multi-

channel Distribution, AXMEDIS, pp. 224–231, 2008.

[270] J. Vinagre, A. M. Jorge, J. Gama, "Evaluation of recommender systems in

streaming environments," arXiv preprint arXiv:1504.08175. Apr 30, 2015.

[271] J. Thom and F. Scholer, “A comparison of evaluation measures given how

users perform on search tasks,” in Proceedings of the 12th Australasian

Document Computing Symposium, Melbourne, Australia, December 10,

(ADCS), pp. 100-103, 2007.

[272] A. C. Nielsen, “Nielsen: Global consumers' trust in 'earned' advertising grows

in importance,” Bus. Wire, 2012.

[273] R. Ali and M. S. Beg, "Modified rough set based aggregation for effective

evaluation of web search systems," In Fuzzy Information Processing Society,

NAFIPS, Annual Meeting of the North American, pp. 1-6, 2009.

[274] M. M. S. Beg and N. Ahmad, “Subjective Enhancement and Measurement of

Web Search Quality,” In Enhancing the Power of the Internet, pp. 95-129,

2004.

Chapter 5

Feature based Opinion Mining Approaches for Book Recommendation

5.1 Introduction:

The previous chapter dealt with recommending the top books to university
students using the suggestions made by their institutions, with the
university-prescribed syllabus as the foundation. The dataset consisted of the
books recommended in top Indian universities. The main idea was to combine link
mining techniques with robust aggregation methods to rank the top books. In this
chapter the dataset of books is kept as it is, but the approach to recommending
the top books is modified. Instead of directly applying a weighted or unweighted
aggregation method to the books prescribed in the university syllabi, we have
collected the customer reviews available at different online book retailer
sites. Thus opinion mining, which is a type of web content mining, is exploited
in addition to the link mining used in the previous chapter. Collecting these
reviews is the first step in opinion mining. The reviews are found in different
forms; usually a review is a user's open view in any language, unformatted and
unstructured [255]. Hence, deciding about a product from users' unstructured
online opinions is a tough task, though a very interesting one [256].

Once the opinions or reviews of the users are obtained, the next step is to
apply a methodology for converting these reviews into an operational form, so
that we may process them and become acquainted with the features of the item
about which the opinions are given. Feature extraction is a very important
aspect of the opinion mining process. If appropriate features are extracted from
the variety of options for the targeted item, i.e. the item whose top products
are to be recommended, the reviews can be assessed from the perspective of these
features. This may then lead us to decide about the products and to understand
the users' feedback according to their experiences with them.

The sentiments of the users are attached to the reviews. From the reviews, one
can learn the experiences, emotions and sentiments hidden in the users' words. A
review may contain positive or negative words about a product. The main task is
to find the sense of the users from the reviews about the extracted features of
the items, whether it is positive or negative [255], [257]. Generally, to
analyze the positivity or negativity of a review, a predetermined set of words
is examined. Words and phrases like “well, fantastic, written by an experienced
team, exactly what I needed, good job, covering everything you need to be aware
of, especially appreciated, awesome book, classic, well written, highly
recommend” etc. are treated as positive terms. Terms such as “real
disappointment, Not a general discussion, fluffy, worst business, copied,
biggest failing, worst, time consuming, bad, not recommended” etc. are treated
as negative terms. It is important to examine the reviews with human
intelligence to interpret the true sense of the reviewers, as reviews are
representations of human emotions. Sometimes things appear to be different from
what they are. Let us consider the following examples:

Example 2.1:

Example 2.2:

In Example 2.1 and Example 2.2 above, there are positive words that could be
read as positive sentiment of the user in favor of the books. Terms like ‘do
your students a favor’ in the first example, and ‘highly recommend’ and ‘better
option’ in the second example, seem to represent a positive sense; however, this
is not the case, as all of these positive terms are used in a negative sense.
Thus, keeping only the positive and negative aspects of the terms and processing
on this much knowledge may lead to a wrong conclusion. This point is not
addressed by several opinion mining algorithms. We have proposed algorithms that
take into account terms whose meaning is reversed in context (reciprocal-meaning
terms) and employ human intelligence for the final decision.
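
A minimal sketch, in Python, of why a purely lexicon-based check can misread such reviews. The term lists reuse a few of the sample phrases quoted earlier in this section; the function names and the sample review are hypothetical illustrations. They are neither the algorithms proposed in this thesis nor the text of Example 2.1 or Example 2.2.

```python
import re

# Small illustrative lexicons taken from the sample terms quoted in the text.
POSITIVE_TERMS = ["fantastic", "good job", "awesome book", "classic",
                  "well written", "highly recommend", "exactly what i needed"]
NEGATIVE_TERMS = ["real disappointment", "fluffy", "worst", "copied",
                  "biggest failing", "time consuming", "bad", "not recommended"]


def count_terms(text, terms):
    """Count whole-word occurrences of each lexicon term in the text."""
    lowered = text.lower()
    return sum(len(re.findall(r"\b" + re.escape(term) + r"\b", lowered))
               for term in terms)


def naive_polarity(review):
    """Label a review by comparing positive and negative term counts only."""
    positives = count_terms(review, POSITIVE_TERMS)
    negatives = count_terms(review, NEGATIVE_TERMS)
    if positives > negatives:
        return "positive"
    if negatives > positives:
        return "negative"
    return "neutral"


# Hypothetical review: a positive phrase is used to steer readers away.
sarcastic_review = ("I highly recommend choosing another text; "
                    "almost anything is a better option.")
print(naive_polarity(sarcastic_review))
```

The term count labels this review as positive although the reviewer is advising against the book, which is the kind of case where the final decision needs human interpretation.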

5.2 Customer Reviews

As customer reviews serve as the basis for recommending books in the feature
based opinion mining approach, we have collected reviews from highly rated
online book merchandisers worldwide and from different sites that allow users to
post online reviews. The sites from which the reviews were obtained are listed
below.

1. http://diestel-graph-theory.com/reviews.html

2. http://www.amazon.in/Concrete-Mathematics-Foundation-Computer-Science/dp/0201558025

3. http://www-cs-faculty.stanford.edu/~uno/gkp.html

4. http://www.flipkart.com/concrete-mathematics-foundation-computer-science-2nd-english/p/itmdx9se8fvvfph8

5. http://www.goodreads.com/book/show/1041923.Discrete_Mathematics_for_Computer_Scientists_and_Mathematicians

6. http://www-groups.dcs.st-and.ac.uk/history/Extras/Harary_books.html

7. https://dzone.com/articles/compilers-principles

8. http://www.cambridge.org/ae/academic/subjects/computer-science/programming-languages-and-applied-logic/logic-computer-science-modelling-and-reasoning-about-systems-2nd-edition

9. http://shop.oreilly.com/product/9781565924536.do#PowerReview

In various cases, authors have their own web pages that link to sources of user reviews. Although online shopping portals carry user reviews for a good number of books, some reviews are published by the publisher and not by an ordinary user. Since we are interested in user reviews, reviews from merchandisers, magazine editors, writers and any other sources that may be biased have not been considered.

5.2.1 Issues while handling Online Reviews:

Several problems are encountered while handling online reviews. One of them is that some reviews are available in languages other than English. A few examples of non-English reviews found at several sites are shown in Figures 5.1, 5.2, 5.3 and 5.4. To handle these reviews, Google Translate is used to translate them into English so that they can be processed by the proposed algorithms.

Figure 5.1: Demonstration of a review in Spanish

Figure 5.2: Demonstration of a review in Russian

Figure 5.3: Demonstration of a review in Portuguese

Figure 5.4: Demonstration of a review in Greek

However, there are various books for which not enough reviews are found, and there are even books that do not have a single customer review. An example is shown in Figure 5.5.

Figure 5.5: Screenshot displaying no reviews



Figure 5.6: Architecture of Book Recommendation using meta searching

5.3 Feature Extraction and Selection

In [1], the authors used meta searching to categorize the features of books for recommendation, as shown in Figure 5.6. They queried different search engines with key words to find the top books in the discipline concerned. Key words of the form “books on the ‘specific course’” were passed for more than 10 disciplines of computer science. The query is formed systematically for every course; e.g., for compiler design, the query is "books on compiler design". The names of the books that appear in the top 100 links are stored with the help of Search Engine Optimization (SEO) tools.



The authors then extract the features from the opinions. When the opinion extraction is complete, the items (books in this case) are ranked on the basis of their scores. Feedback from customers can be positive, negative, or neutral, i.e., containing neither positive nor negative terms. We have analyzed the reviews and categorized several features to give customers a better understanding of the books. The features are extracted from reviews on the basis of the adjectives and indicative terms used by the users. Let us take an example.

“The book has plenty of material to explore, written in a good manner but the

cost is too much to buy.”

The text written by the reader describes features of the book. From this review we can extract that the user is talking about the ‘study material of the book’, the ‘understandability of the contents’ and the ‘price’. Thus, three features are extracted. In this manner we have found seven important features, which are considered in the recommendation procedure. These features were presented to users [37], who were asked to give their feedback on the importance of each feature. The users’ feedback was stored and analyzed to create a basis for feature-based opinion mining.

The precision of each extracted feature is calculated, and the observed precision is considerably high, which indicates that a major proportion of the users agreed on these features. The precision is the proportion of users favoring a feature to the total number of users involved in the procedure.

The categorized seven features play a vital role and serve as a foundation for the

process of feature based opinion extraction in the book recommendation approach. The

features and their explanations are discussed below.

Frequency of Occurrence in SERP: Occurrence means multiple appearances of the same book for a single query. When users would like to see the available books on a particular topic, they usually turn to search engines. Various books occur multiple times, in the form of different links, within one search engine. There may also be books that occur in more than one search engine on the very first Search Engine Results Page (SERP). This characteristic is termed ‘frequency of occurrence in SERP’. Figure 5.7 shows an example of a SERP.

Figure 5.7: Screenshot of Search Engine Results Page for books on Artificial Intelligence

Useful Content: Sometimes books are well written, but the content is either not sufficient or the authors have merged all the content together, which irritates the reader. On the contrary, if the contents are useful, even though short and precise, readers are attracted to the book and eventually grow fond of it. An example of a user review from amazon.com that supports this argument is shown in Figure 5.8.

Figure 5.8: Customer review expressing the views about content



Rating: Users rate books through various websites. The collective rating of a particular book reflects the users’ preference for it. A few of the sources listed in the previous section provide ratings along with reviews. A higher user rating indicates the popularity of a book amongst readers. Most recommendation approaches rely on ratings alone. We, however, have incorporated the rating together with six other characteristics in our recommendation process.

Understandability: The content of a book, however rich, should be written in an easy-to-understand way. Users often write in their reviews that a book has a lot to give but that they are unable to follow the way the writer has presented it. Example words for this feature, as well as for the other features, are listed in Table 5.1, and Figure 5.9 shows its importance as interpreted from a user review.

Figure 5.9: Review example of the ‘understandability’ feature.

Physical Attributes: The physical attributes concern the quality of the pages, the hard cover, the print quality, etc. of the book. Sometimes a book, in spite of being brilliantly written, has low value in the eyes of the customer because of its physical attributes. An example, taken from a screenshot of a user review available at amazon.com, is shown in Figure 5.10.



Figure 5.10: Review representing importance of physical attributes

Market Availability: The services associated with the book's publishing house, its market strategies and other related aspects that are reflected in user reviews are placed under this feature. It also covers the availability of reviews. As discussed in the previous section, there may be some books that do not have even one review (Figure 5.5); their market availability will therefore have a lower value.

Price: This is the obvious feature of a book that a student is interested in. An example of a real user review emphasizing the importance of this feature is shown in Figure 5.11.

Figure 5.11: Review representing importance of Price

Table 5.1: Features and related review terms

| Symbol | Feature | Adjectives indicating the feature |
|---|---|---|
| SUC | Useful Content | Content, material, helpful, useful, advantageous, cover, time waste, creeping all together |
| SU | Understandability | Hard to understand, easy to understand, convey, fantastic written, explained very well |
| SA | Market Availability | Available, sold, quick delivery |
| SR | Rating | Star, full rating, rating, rated |
| SP | Price | Cost, price, worth, cheaper, costly, not worthy for the amount |
| SPA | Physical Attributes | Cover page, page quality, “The cover of the newer edition (2006) is pretty dull”, etc. |

http://www.amazon.in/product-reviews/0321455363/ref=cm_cr_dp_synop?ie=UTF8&showViewpoints=0&sortBy=recent#R1GO8LDMZ7YTPA




The above seven features can be expressed in a number of ways and by hundreds of English words. The specific terms used to express the sentiments have been highlighted in the above examples. The common terms that users typically employ to convey the characteristics of a book are listed in Table 5.1.
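A minimal sketch of how features could be spotted from indicative terms, assuming simple case-insensitive substring matching against an abridged version of Table 5.1. The dictionary `FEATURE_TERMS` and the function `extract_features` are hypothetical names introduced only for illustration; the term lists are shortened, and substring matching is a simplification of the manual analysis described in the text.

```python
from __future__ import annotations

# Hypothetical, abridged term lists based on Table 5.1; a real lexicon
# would be considerably larger.
FEATURE_TERMS = {
    "Useful Content": ["content", "material", "helpful", "useful"],
    "Understandability": ["easy to understand", "hard to understand",
                          "explained", "written"],
    "Market Availability": ["available", "sold", "delivery"],
    "Rating": ["star", "rating", "rated"],
    "Price": ["cost", "price", "worth", "cheaper", "costly"],
    "Physical Attributes": ["cover page", "page quality"],
}


def extract_features(review: str) -> list[str]:
    """Return the features whose indicative terms occur in the review."""
    text = review.lower()
    return [feature for feature, terms in FEATURE_TERMS.items()
            if any(term in text for term in terms)]


review = ("The book has plenty of material to explore, written in a good "
          "manner but the cost is too much to buy.")
print(extract_features(review))
```

For the example review quoted in Section 5.3, this sketch reports the same three features named in the text: useful content, understandability and price.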

5.4 Scoring Technique for Extracted Feature

The recommendation approach is based upon a two-step scoring process. The first step is to find the score of each of the seven features discussed in Section 5.3 for each book; the second is to assign weights to the features. Section 5.4.1 describes the calculation of the opinion score, which is aided by several algorithms.

5.4.1 Opinion Score Calculation

Opinions are a combination of positive and negative sentiments, expressed by words conveying similar meanings. Researchers have suggested various detailed lists that identify whether a word in a sentence has a positive or a negative sense.

However, as discussed earlier, merely counting positive and negative words may be misleading about what a sentence actually means. That is why we have designed a calculation method that involves the positive and negative words along with those words that seem to be positive but, in association with some other reciprocal phrase, convey almost the opposite meaning. These terms are called ‘reciprocal terms’. S_pw and S_nw represent the scores for positive and negative words, respectively.

Algorithm 5.1: Score calculation of Positive Words

$$S_{pw} = \big((N_{pw} - N_{pr}) + 1.5 \times N_{hp}\big) \times r_i$$

N_pw = no. of positive words

N_pr = no. of positive words with reciprocal meaning

N_hp = no. of highly expressible positive words

$$r_i = \frac{N_{R_i}}{N}$$

N_Ri = no. of reviews considering feature ‘i’

N = total no. of reviews extracted



Algorithm 5.2: Score calculation of Negative Words

$$S_{nw} = \big((N_{nw} - N_{nr}) + 1.5 \times N_{hn}\big) \times r_i$$

N_nw = no. of negative words

N_nr = no. of negative words with reciprocal meaning

N_hn = no. of highly expressible negative words
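The two score calculations can be sketched as a single function. This is an illustration under a stated assumption: the score is read as ((N_w − N_r) + 1.5 × N_h) × r_i with r_i = N_Ri / N, a reading that reproduces the values of the worked example in Table 5.4. The function and parameter names are hypothetical.

```python
def word_score(n_words: int, n_reciprocal: int, n_high: int,
               n_feature_reviews: int, n_reviews: int) -> float:
    """Score for positive (or negative) words of one feature.

    Assumed reading of Algorithms 5.1 and 5.2:
        ((N_w - N_r) + 1.5 * N_h) * r_i,  with r_i = N_Ri / N
    """
    r_i = n_feature_reviews / n_reviews  # share of reviews mentioning the feature
    return ((n_words - n_reciprocal) + 1.5 * n_high) * r_i


# Values from the 'Understandability' column of Table 5.4:
# N_pw = 12, N_pr = 0, N_hp = 3, N_nw = 3, N_nr = 0, N_hn = 2, N_ri = 9, N = 84
s_pw = word_score(12, 0, 3, 9, 84)
s_nw = word_score(3, 0, 2, 9, 84)
print(round(s_pw, 2), round(s_nw, 2))  # 1.77 0.64
```

The same function serves both algorithms because only the word counts passed in differ.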

Positive words, negative words, reciprocal terms used with positive and negative words, and highly expressible positive and negative words are explained, with examples, in the following sections.

5.4.1.1 Positive words:

Sentiments that are straightforward and convey a positive meaning are categorized as positive words. The number of positive words is denoted by N_pw. Examples of positive reviews are given below.

Example 5.1a “Positive review”: “I taught a couple of classes from the first edition

of this textbook, and my students did fairly well. On the whole, they were able to

understand the material and solve the homework problems. I certainly wouldn't mind

teaching a class on this subject from the second edition as well, which I feel is a mild

improvement over the first one.”

Example 5.1b “Positive review”: “The chapter on finite automata is excellent. And the material on context-free languages is thorough and well written. So is the introduction to Turing machines. Of course, the book then spends a fair amount of time on recursive function theory. That is exactly what I want it to do. And I think the chapter on unsolvability, starting with the Halting Problem, is excellent.”

5.4.1.2 Negative words:

Sentiments that directly convey a negative opinion of the customer are categorized as negative words. The number of negative words is denoted by N_nw. A few examples of negative reviews are given below.



Example 5.2a “Negative review”: “Apparently, the only way to understand this book

is by having gotten your PhD in the 1950's. Completely incomprehensible, stilted, and

pompous, this book is the long sought after cure for insomnia. If you are a professor,

please do not choose this book for your class. If you are a student, pray”.

Example 5.2b “Negative review”: “So, you're basically paying anywhere from $100 to 150 for the newest cover art and 25 pages. Don't waste the money.”

Example 5.2c “Negative review”: “This book is horrible. Please read some other

book which explains data structures in plain English. This book has two sections

called array data types and array data structure and both of them have pretty much

the same stuff written in a different, complicated way. Each is four pages long as well.

Not a good read and I will definitely not recommend it. If you want to memorize

definitions this might be the book but if you want to understand concepts, read

something else.”

We have proposed an algorithm that scores a review according to its positive and negative words. The method is shown in Algorithm 5.3. This procedure helps in finding the orientation, positive or negative, contained in a review. The text is analysed, and products are scored depending upon the orientation of the sentiments expressed by the users.

Algorithm 5.3:

Search the positive words in the document (opinion).

N_pw ← Count (positive words)

Search the negative words in the opinion.

N_nw ← Count (negative words)

If (N_pw > N_nw)

{

    OO ← positive

}

else

{

    OO ← negative

}
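The counting step of Algorithm 5.3 can be sketched as follows. The two word sets are small hypothetical lexicons assembled from terms that appear in the example reviews, and the tokenizer is a plain regular expression; both are assumptions made only to keep the sketch self-contained.

```python
import re

# Hypothetical mini-lexicons for illustration; a real system would use a
# much larger sentiment word list.
POSITIVE = {"excellent", "thorough", "good", "well"}
NEGATIVE = {"horrible", "incomprehensible", "pompous", "waste", "worst", "bad"}


def count_words(opinion: str, lexicon: set) -> int:
    """Count how many tokens of the opinion belong to the lexicon."""
    tokens = re.findall(r"[a-z']+", opinion.lower())
    return sum(token in lexicon for token in tokens)


def orientation(opinion: str) -> str:
    """Opinion orientation (OO) from word counts, as in Algorithm 5.3."""
    n_pw = count_words(opinion, POSITIVE)
    n_nw = count_words(opinion, NEGATIVE)
    # As in the pseudocode, a tie falls into the 'else' branch.
    return "positive" if n_pw > n_nw else "negative"


print(orientation("The chapter on finite automata is excellent."))
print(orientation("This book is horrible. Don't waste the money."))
```

Note that a tie between the two counts is labelled negative, which follows the pseudocode literally; a neutral class would need an extra branch.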



5.4.1.3 Reciprocal terms

In Examples 5.1a to 5.2c above, the sentences clearly state positive or negative aspects of the features associated with the books. However, there are situations in which a word that would be grouped with the positive words is used to convey its reciprocal, i.e. negative, sense, and vice versa. These terms, as discussed in Section 5.4.1, are treated separately when calculating feature scores and are called ‘reciprocal terms’. Let us consider an example.

“The derivations, definitions and everything are written in a very easy and perceivable way. But again the same warning, don't buy if you are a beginner”.

In the above example, terms like ‘very easy’ and ‘perceivable’ are positive words; however, the user is clearly advising against purchasing the book. A few other examples of reciprocal terms include:

Example 5.3a “Reciprocal terms”: “I would say it is a fine undergrad book, but

probably not the best choice for grad level studies.”

Example 5.3b “Reciprocal terms”: “This is a nice read for someone entering the

software architecture domain. However, I would not recommend it as a reference

book for a software architect.”

Example 5.3c “Reciprocal terms”: “Deserve one of the best poorly written for CS theory.”

Example 5.3d “Reciprocal terms”: “Nearly all of the writer's explanations are

lacking at best. There are practically no examples to help you understand what the

writer is trying to convey. The answers in the back of the book seem to only be for the

easiest questions. Many of the proofs are incomplete as the writer intends for you to

come up with them in exercises without adequate explanation.

In short, if you have to use this book, I'm sorry”

Example 5.3e “Reciprocal terms”: “Not a book I would recommend.”

In the above examples, one can easily recognize supportive words that may seem to favor the book but in fact do not. In Example 5.3e, for instance, ‘I would recommend’ is an obviously positive and strong phrase in favor of the book; however, it is used in an entirely negative sense because of the ‘not’ at the start of the sentence. Understanding reciprocal terms requires a good grounding in English grammar. There are terms that confuse the meaning of a sentence: a sentence may start with a negative word and yet convey the reciprocal meaning, and vice versa. Let us consider an example with ‘not only ... but’, as it may convey a dual meaning: “He’s not only funny, but also he’s intelligent.” The word ‘not’ is usually classified as a negative word; however, ‘not only’ is used to express the reverse.

Example 5.3f “Reciprocal terms”: “The book is very comprehensive and explains the

needed background to allow readers to not only use metrics well, but to understand

the limitations of metric”.

Sometimes readers compare a book with another one and use terms that discuss the differences; if this is not taken into account, the opinion interpreter will convey the meaning incorrectly.

Example 5.3g “Reciprocal terms”: “What a terrible book. Though it's the cornerstone of many CS undergrad algorithm courses, this book fails in every way. In almost every way, Dasgupta and Papadimitriou's "Algorithms" is a much better choice.”

Algorithm 5.4: Finding Opinion Orientation “OO” using reciprocal terms

Search for the reciprocal terms (rt) in the sentence.

If (rt ∈ Sentence)

{

    Find the OO of the sentence before rt;

    If (OO before ‘rt’ is positive)

    {

        OO ← negative

        N_pr ← Count (rt)

    }

    else

    {

        OO ← positive

        N_nr ← Count (rt)

    }

}
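A rough sketch of Algorithm 5.4 is given below. It assumes that the text before the first reciprocal term is scored with the simple word-count orientation of Algorithm 5.3, and that Count (rt) is the number of reciprocal terms found in the sentence. The lexicons and the names `orientation` and `reciprocal_orientation` are hypothetical and serve only this illustration.

```python
import re

# Reciprocal terms listed in the text.
RECIPROCAL_TERMS = ["although", "however", "nevertheless", "on the other hand",
                    "still", "though", "yet", "but"]
# Hypothetical mini-lexicons for illustration only.
POSITIVE = {"fine", "nice", "easy", "best"}
NEGATIVE = {"terrible", "horrible", "poor"}


def orientation(text):
    """Word-count orientation, in the spirit of Algorithm 5.3."""
    tokens = re.findall(r"[a-z']+", text.lower())
    n_pw = sum(t in POSITIVE for t in tokens)
    n_nw = sum(t in NEGATIVE for t in tokens)
    return "positive" if n_pw > n_nw else "negative"


def reciprocal_orientation(sentence):
    """Return OO and the reciprocal counts, or None if no reciprocal term occurs."""
    lowered = sentence.lower()
    # Start positions of every reciprocal term, matched on word boundaries
    # so that 'though' is not found inside 'although'.
    hits = [m.start() for term in RECIPROCAL_TERMS
            for m in re.finditer(r"\b" + re.escape(term) + r"\b", lowered)]
    if not hits:
        return None
    before = lowered[:min(hits)]  # text preceding the first reciprocal term
    if orientation(before) == "positive":
        return {"OO": "negative", "N_pr": len(hits), "N_nr": 0}
    return {"OO": "positive", "N_pr": 0, "N_nr": len(hits)}


print(reciprocal_orientation(
    "I would say it is a fine undergrad book, but probably not the best "
    "choice for grad level studies."))
```

One limitation of this sketch: when a sentence opens with the reciprocal term (as in Example 5.3g, which starts with "Though"), the text before it is empty and the result is not meaningful, so such cases would need to look at the preceding sentence instead.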



Algorithm 5.5: Finding Opinion Orientation for ‘not only’ phrase

If (((not only) ∈ (Sentence)) ∧ ((but also) ∈ (adjac. (not only) ∨ Near5 (not only))))

{

    If (pw ∈ ((adjac. (but also)) ∨ Near2 (but also)))

        OO ← positive

    Else if (nw ∈ ((adjac. (but also)) ∨ Near2 (but also)))

        OO ← negative

}
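The proximity test of Algorithm 5.5 can be sketched as below. Several points are assumptions made for the illustration: Near5 is read as "at most five words between ‘not only’ and ‘but also’", Near2 as "within the two words that follow ‘but also’", and the lexicons are hypothetical. The function names are not from the source.

```python
import re

# Hypothetical mini-lexicons for illustration only.
POSITIVE = {"intelligent", "funny", "useful", "clear"}
NEGATIVE = {"boring", "confusing", "expensive"}


def find_pair(tokens, first, second, start=0):
    """Index of the first occurrence of the word pair at or after start, else -1."""
    for k in range(start, len(tokens) - 1):
        if tokens[k] == first and tokens[k + 1] == second:
            return k
    return -1


def not_only_orientation(sentence, near_pair=5, near_word=2):
    """OO for a 'not only ... but also' sentence, or None if the pattern is absent."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    i = find_pair(tokens, "not", "only")
    if i < 0:
        return None
    j = find_pair(tokens, "but", "also", i + 2)
    # 'but also' must be adjacent to 'not only' or at most near_pair words away.
    if j < 0 or j - (i + 2) > near_pair:
        return None
    following = tokens[j + 2: j + 2 + near_word]  # words right after 'but also'
    if any(t in POSITIVE for t in following):
        return "positive"
    if any(t in NEGATIVE for t in following):
        return "negative"
    return None


print(not_only_orientation("He's not only funny, but also he's intelligent."))
```

Looking only at the words after ‘but also’ is a design choice of this sketch; a symmetric window around the phrase would be an equally plausible reading of the Near operator.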

The term ‘better choice’ in Example 5.3g is used not for the book under review but for another one, and observing such terms is very important when dealing with users’ sentiments through their reviews. We have proposed a reciprocal-term finding algorithm to enrich the process with a correct measure for understanding sentiments. A few words are often used as reciprocal terms: although, however, nevertheless, on the other hand, still, though, yet, but, etc.

As in Examples 5.3a and 5.3b, the meaning of the text before a reciprocal term is changed by the text that follows it. Keeping in view the issues illustrated by the above examples, we have proposed algorithms to tackle them. The Opinion Orientation ‘OO’ in the presence of reciprocal terms is found by the proposed procedure, which improves the interpretation of reviews. Algorithm 5.4 accounts for reciprocal meanings in a text, and Algorithm 5.5 deals with the ‘not only’ phrase.

5.4.1.4 Highly expressible words

By highly expressible words we mean words that carry extra emphasis; they are assigned a higher score than simple affirmative words. E.g. “One of the finest computer science textbooks I've ever read, and I've read hundreds.”



Algorithm 5.6: Counting Highly Expressible positive and negative words

If (highly expressible words)

{

    If (OO positive)

    {

        N_hp ← Count (positive words)

    }

    Else (OO negative)

    {

        N_hn ← Count (negative words)

    }

}

With the help of the six algorithms presented above, sentiments are associated with scores. The sentiments that users express through their reviews are assigned numerical values, which we call the opinion score. After finding the respective scores for positive and negative words, the conclusive score ‘S’ is calculated by

$$S = S_{pw} - S_{nw} \qquad (5.1)$$

5.4.2 Weight Assignment to Features

Section 5.3 discussed how the features are selected; these features serve as the basis for finding the opinion orientation towards them and scoring the books accordingly. As already discussed in Section 5.1, the important tasks to be performed are score calculation and weight assignment to features. Section 5.4.1 describes how the scores are calculated for the features associated with books. Here, we give the details of the scheme for distributing weights to the extracted features.

We consider W_O, W_UC, W_R, W_U, W_A, W_P and W_PA as the weights assigned to Frequency of Occurrence in SERP, Useful Content, Rating, Understandability, Market Availability, Price and Physical Attributes, respectively. In the previous work [1], the authors used different ranges of weights depending upon the use of the features. However, they selected the features without any scheme and did not have a defined method for assigning weights.



Algorithm 5.7: Weight assignment to features

n ← number of users who participated in the feedback

r ← number of users with ‘true positive’ feedback for feature ‘i’

Tf ← total number of features extracted; Tf = 7

fi ← i-th extracted feature

Sfi ← (r/n)*100

Find (maximum amongst Sfi, for 1 ≤ i ≤ Tf)

Smax ← Max (Sfi)

Wfi ← Sfi / Smax
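The normalization in Algorithm 5.7 can be sketched with the precision values reported in Table 5.2. Since each S_fi is the precision multiplied by 100, the factor cancels in S_fi / S_max, so the sketch divides the precisions directly. The names `PRECISION` and `assign_weights` are hypothetical.

```python
# Precision of each feature as reported in Table 5.2.
PRECISION = {
    "Frequency of Occurrence in SERP": 0.84,
    "Useful Content": 0.92,
    "Rating": 0.78,
    "Understandability": 0.79,
    "Physical Attributes": 0.67,
    "Market Availability": 0.73,
    "Price": 0.71,
}


def assign_weights(precision):
    """Weight of each feature: its precision divided by the maximum precision."""
    s_max = max(precision.values())
    return {feature: p / s_max for feature, p in precision.items()}


weights = assign_weights(PRECISION)
for feature, w in weights.items():
    print(f"{feature}: {w:.6f}")
```

The resulting values agree with the six-decimal weights used in Table 5.5, for example 0.858696 for understandability and 1 for useful content.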

Here, the weights are assigned according to the importance that the users have indicated in their feedback, as discussed in Section 5.3. The procedure for weight assignment is given in Algorithm 5.7. The final score of a book is calculated as follows.

Let s_i be the score of the i-th feature associated with a book, and w_i the weight assigned to feature ‘i’. The final score (FS) of the book is calculated as

$$FS = \sum_{i=1}^{7} w_i \, s_i \qquad (5.2)$$

The books are sorted according to their FS values, which gives a ranked list of books. The top books are then recommended to users.


In this chapter, the direct opinions of users are taken from online review sites, and these reviews are processed to ascertain the preferences of the users. The books are recommended according to the preferences the users express. We have a total of 10 different courses comprising 158 distinct books, as discussed in Chapter 3. A total of 0.4587 × 10^4 reviews are obtained for these books from the above sources. Seven features of books have been extracted from the reviews, and a total of 100 users are considered for feedback collection.



Table 5.2: Precision of extracted features

| Extracted feature | Frequency of Occurrence in SERP | Useful Content | Rating | Understandability | Physical Attributes | Market Availability | Price |
|---|---|---|---|---|---|---|---|
| Precision | 0.84 | 0.92 | 0.78 | 0.79 | 0.67 | 0.73 | 0.71 |

Figure 5.12: Precision of Extracted Features

The precision of the extracted features is also calculated. The precision is the proportion of users favoring a feature to the total number of users involved in the procedure. The precision values obtained from the user feedback are tabulated in Table 5.2 and represented pictorially in Figure 5.12.

It is evident from the table that all the features have a high precision. The lowest precision, 0.67, is recorded for the feature ‘physical attributes’.

The weights are assigned to these seven features as described in Algorithm 5.7. The extracted features are presented to users, who give feedback on whether or not they consider each feature important. The user feedback helps us decide on which features the book recommendations are to be made. To understand how weights are assigned to features, let us take the precision of useful content. From Table 5.2, we have p(useful content) = 0.92, which is also the maximum precision p(max). The weight assigned to useful content, ‘W_uc’, is given by

$$W_{uc} = \frac{p(\text{useful content})}{p(\max)} = \frac{0.92}{0.92} = 1$$

It is evident from the table that ‘useful content’ is the most valuable feature, obtaining the maximum weight of 1; i.e., if the opinion score of the feature useful content, ‘S_uc’, is high for a book, it will have a greater impact on its recommendation. Similarly, W_U = 0.858696, which implies that the opinion score of ‘understandability’ carries a weight of about 86% in the scoring of users’ sentiments.

The minimum weight is obtained by ‘physical attributes’, which has the lowest precision, 0.67. Thus, if a book is reviewed best from the point of view of its physical attributes but does not attract users on other highly weighted features such as ‘useful content’, the book will have a lower score and hence a lower ranking. The weights obtained for the respective features are given in Table 5.3.

A special feature, “Occurrence in SERP”, is also included, which captures the importance of a book in search engines. The frequent and top-ordered appearance of a book on a SERP supports its high value in the eyes of users, as most hits are made on it. Overall, the books are sorted and ranked so that only those with high values on all aspects reach the top. These ranked books are recommended to users.

Table 5.3: Weights distribution of features

| Features | Weights |
|---|---|
| Frequency of Occurrence in SERP | 0.913 |
| Useful Content | 1 |
| Rating | 0.8478 |
| Understandability | 0.8586 |
| Physical Attributes | 0.7282 |
| Market Availability | 0.7934 |
| Price | 0.7717 |

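Table 5.4: Score calculation example

| | Frequency of Occurrence in SERP | Content | Understandability | Rating | Physical Attributes | Market Availability | Price |
|---|---|---|---|---|---|---|---|
| N_pw | 4 | 9 | 12 | 45 | 0 | 7 | 5 |
| N_pr | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| N_hp | 0 | 0 | 3 | 0 | 0 | 0 | 0 |
| N_nw | 0 | 5 | 3 | 39 | 0 | 2 | 4 |
| N_nr | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| N_hn | 0 | 0 | 2 | 0 | 0 | 0 | 2 |
| N_ri | 1 | 11 | 9 | 84 | 52 | 9 | 9 |
| N | 1 | 84 | 84 | 84 | 84 | 84 | 84 |
| ri | 1 | 0.13 | 0.11 | 1 | 0.62 | 0.11 | 0.11 |
| S_pw | 4 | 1.18 | 1.77 | 45 | 0 | 0.75 | 0.54 |
| S_nw | 0 | 0.65 | 0.64 | 39 | 0 | 0.21 | 0.75 |
| S | 4 | 0.52 | 1.13 | 6 | 0 | 0.54 | 0.21 |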



The number of different books for each course has been discussed in Chapter 3; the details are given in Table 3.14. We have 17 different books on discrete mathematics, for which a total of 364 reviews are obtained. This implies that a book has about 21 reviews on average. The calculation shown in Table 5.5 is performed on all the reviews of each book. The final score (FS) is then calculated, and the books are ranked in order of the scores obtained from the opinions.

Table 5.5: Example of Final Score Calculation

| Features | Weights | Score ‘S’ |
|---|---|---|
| Frequency of Occurrence in SERP | 0.913043 | 4 |
| Useful Content | 1 | 0.52 |
| Rating | 0.847826 | 1.13 |
| Understandability | 0.858696 | 6 |
| Physical Attributes | 0.728261 | 0 |
| Market Availability | 0.793478 | 0.54 |
| Price | 0.771739 | 0.21 |

$$FS = 0.913043 \times 4 + 1 \times 0.52 + 0.847826 \times 1.13 + 0.858696 \times 6 + 0.728261 \times 0 + 0.793478 \times 0.54 + 0.771739 \times 0.21$$

$$FS = 10.8729 \qquad (5.3)$$
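The final score computation can be checked with a short sketch that uses the weights and scores of Table 5.5. The function names and the second book in the ranking step (`book_b` with a made-up score) are hypothetical and only illustrate how sorting by FS yields a ranked list.

```python
# Weights and scores 'S' as listed in Table 5.5.
WEIGHTS = {
    "Frequency of Occurrence in SERP": 0.913043,
    "Useful Content": 1.0,
    "Rating": 0.847826,
    "Understandability": 0.858696,
    "Physical Attributes": 0.728261,
    "Market Availability": 0.793478,
    "Price": 0.771739,
}
SCORES = {
    "Frequency of Occurrence in SERP": 4,
    "Useful Content": 0.52,
    "Rating": 1.13,
    "Understandability": 6,
    "Physical Attributes": 0,
    "Market Availability": 0.54,
    "Price": 0.21,
}


def final_score(weights, scores):
    """FS = sum over all features of w_i * s_i."""
    return sum(weights[f] * scores[f] for f in weights)


def rank_books(final_scores):
    """Book identifiers sorted by descending final score."""
    return sorted(final_scores, key=final_scores.get, reverse=True)


fs = final_score(WEIGHTS, SCORES)
print(round(fs, 4))  # 10.8729

# Hypothetical second book, included only to show the ranking step.
print(rank_books({"book_a": fs, "book_b": 7.5}))
```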

Table 5.6: Top 10 ranked books of all the courses using Opinion Mining Technique

Rank

position Books of different courses

1 TOC.1 CD.1 AI.2 SE.2 DB.1 DS.1 CN.16 DM.1 CG.7 OS.2












On the basis of the scores, the ranking of books for all the courses is shown in Table 5.6. The rankings of these books based on different soft computing techniques are given in Tables 4.21 to 4.29, where they are discussed in detail. In the next chapter, the different soft computing techniques will be discussed and their methodology will also be described. Here, we just highlight how the recommended books behave with respect to the different approaches used for book recommendation. On comparing the ranking based on the opinion mining technique (OMT) with those of soft computing, it is observed that the books on theory of computation receive the same recommendation from all the techniques.

Further, TOC.11 is ranked in the top position using OMT; however, the PAS and ORWA techniques do not recommend it in the top positions. SE.2 is ranked first by all the procedures adopted in Chapter 3, and it has also secured the first position using the opinion mining technique presented in this chapter. The similarity of OMT with ORWA and PAS can easily be noticed, as OS.2 has secured the first rank with all these techniques. OWA with the quantifier ‘at least half’ gives the same recommendation. However, OWA with the ‘as many as possible’ and ‘most’ quantifiers differs in its final recommendation for the first position.

The various parameters for these recommended books are discussed in detail in Chapter 6, where the results are also interpreted and analyzed exhaustively from various aspects.

5.6 Summary

In this chapter, a book recommendation method based on opinion mining is discussed. The techniques for extracting opinions and selecting features are discussed for various leading book retailers. Feature-based recommendation helps users explore items of their preference, as it eases the process of selecting the desired products when users are aware of their requirements and of the specifications of the items.

It has also been shown that the extracted book features have high precision. The high precision values show the strength of the feature selection procedure.

A recommendation technique influenced by experts’ opinions has been presented in Chapters 3 and 4. The same data set is used in this chapter, but users’ reviews are considered instead of experts’ suggestions of books. The rankings obtained from the different approaches are discussed. The details of the data set and the selection of books are discussed extensively in Chapters 3 and 4.

The method is based on opinion mining, as discussed above. Users’ reviews help in recommendation, as most users seek the opinion of other users before buying a product online. To identify the best approach, a comparison on different parameters and an exhaustive evaluation are needed. In Chapter 6, a comprehensive approach is discussed that provides a guideline for the evaluation of recommender systems (RS) and has a provision to exclude insincere users from the evaluation process while taking feedback from them. Chapter 6 also compares the results of the explicit feedback recommendation approaches proposed in Chapter 3, Chapter 4 and Chapter 5.


Chapter 6

Evaluation of Recommender Systems

6.1 Introduction

The proliferation of the Internet has enlivened online shopping, and the growing
interest in it has led to the emergence of a large number of online merchandisers.
With this increase in merchandisers, the number and variety of products sold on the
Web have grown enormously [3], [4]. Consequently, finding a product of one's choice
has become a tough and tedious job for a buyer. Merchandisers provide
recommendations to help buyers find such products. A recommender system helps
customers by reducing the time spent exploring different products. It also serves the
merchandisers' interest in staying ahead of competitors, since a good recommender
system is likely to enhance their marketing strategy and help them attract
customers [2], [258].

The ultimate goal of a recommender system (also termed a recommendation system)
is to satisfy the user [213]. User satisfaction depends on the fulfilment of users'
needs, and different users have different requirements. We should therefore opt for a
recommender system that can identify the requirements of different users and predict
items for them accordingly. However, user satisfaction is not the only important
aspect in designing a recommender system. The system should not exhibit errors that
irritate users while they purchase; otherwise a user may never return. The evaluation
of a recommender system helps in judging these errors and in designing an application
that satisfies the user. While the recommender system provides users with better
options for their purchases, its evaluation judges its performance and guides
modification of the system according to the users' needs and the shortcomings
encountered during performance assessment. Thus, an adequate way of assessing
recommendation approaches is necessary, and it can be achieved through a sound
evaluation process.

This chapter aims at exploring techniques to reduce the flaws in Recommender
Systems (RS) so that users get the most appropriate recommendations for their online
shopping. Recommendations based on fake reviews are usually biased. Biased
recommendations made by online shopping portals may have a negative impact on
customers and make them reluctant to purchase again from the same site. Thus, an RS
designed to reduce the factors that negatively affect users' purchases will help
online merchandisers boost their business and will provide customers with suitable
products of their choice. Since RS are meant to provide customers with ease and
satisfaction, user feedback can be a key component in evaluating an RS and checking
whether it is up to the mark. The proposed approach provides a platform to evaluate
RS on the basis of user feedback. In addition, it examines users' sincerity and
applies preference criteria, which are intended to overcome the bias and fake review
problems in designing RS.

User feedback is one of the strongest bases for evaluating a recommender system.
There are two types of user feedback:

Implicit user feedback

Explicit user feedback

Implicit user feedback requires details of user behavior, which reveal users' trends
and inclinations towards purchase. Explicit user feedback is feedback that users state
explicitly. In this chapter, we discuss evaluation measures based on both types of
feedback. A recommender system designed to recommend electronic items such as
laptops, tablets, smart phones, headphones and printers [243] is evaluated using
implicit user feedback. Because these items carry reviews from common users, which
may be biased or casual, we suggest an evaluation approach that checks user sincerity
and follows an implicit feedback strategy. In this work, we have also recommended
books and evaluated the recommendation technique. The difference lies in the sources
from which feedback is taken. For books, we have explicitly taken feedback from
experts, who are computer professionals and graduates. The pros and cons of books on
a specific topic are best judged by experts rather than by a common reader. Thus, the
experts' ranking becomes the basis for evaluating the book recommendation
approaches. Hence, there is no need for sincerity checking or for finding preference
criteria; instead, we can simply ask the experts for a direct ranking. This may be
treated as a standard ranking and serves as a basis for the evaluation of the system.
The explicit feedback based evaluation is elaborated in the next chapter; a brief
discussion of the topic is also given later in this chapter.

From the above discussion of explicit feedback and its differences from implicit
feedback, it follows that explicit feedback is preferred when the users are authentic,
reliable and experts in the field concerned. In Chapter 5, we discussed
comprehensively the evaluation of a recommender system in which feedback is taken
implicitly from the users. The users considered in that evaluation were common people
who use the products and need not be experts to give their opinion on the items
concerned. However, there are events or items, such as books, institutes and
conferences, for which the purchasing behavior of general users may not lead to a
reliable conclusion. In the proposed work, we have specifically chosen books for
recommendation, although the approach can be extended to products of other domains.
Assessing a book requires expertise, and an expert's opinion can be relied upon. Thus,
experts on different subjects from across the globe who are connected in some way to
the Indian education system were approached for their recommendations and reviews of
the books on the specified topics. Their feedback serves as the basis for evaluating
the proposed systems presented in Chapters 3, 4 and 5.

Once the ranking is obtained from the experts, evaluation metrics are needed to
assess the performance of the proposed work by comparing the rankings produced by the
suggested book recommendation approaches (BRA) with the ranking given by the experts.
P@k, Mean Average Precision (MAP), FPR@k, FNR@k, Root Mean Square Error (RMSE),
Modified Spearman Rank Correlation Coefficient and Mean Absolute Error (MAE) are used
as evaluation metrics for the assessment of the approach. These metrics are explained
briefly in Section 6.3, and the corresponding values, with discussion, are presented
in Section 6.4.

6.2 Previous Evaluation Studies

As approaches for recommending products, books, research articles, etc. become
increasingly popular, the evaluation of recommender systems is becoming very
important, so that one can assess which approach should be adopted according to the
needs of the users.

Before evaluating a recommender system, it is necessary to know how a recommender
system can be measured. However, consensus is the basis for judging what makes a good
recommender system and which methods are adopted for its evaluation [51].

Initially, researchers emphasized accuracy as the measure for evaluating
recommender systems, using accuracy based on MAE [259], [260], [186]. Later, aspects
other than accuracy were suggested, since an accurate system is not necessarily a good
one [186], [261], [262]. The author in [259] focused on the quality of the recommender
system instead of only the accuracy of the mathematical approach or algorithm. The
quality of a recommender system cannot be judged by predictive accuracy alone; the
system should also achieve user satisfaction, which is its ultimate goal. Several
features that constitute a good recommender system are discussed in the literature.
The authors in [9] discuss user satisfaction and the satisfaction of the
recommendation provider, along with accuracy, as the main features of an adequately
reliable recommender system. In [42], [51], [259] the authors suggest that coverage
and serendipity are also important factors. In [186], [263] the response time of
recommendation is considered an important factor. Presentation is another important
factor that can enhance user satisfaction [264], [265].

The goals of RS and the choice of evaluation metrics are studied in [266], which
presents an extensive discussion of the advantages and shortcomings of previous
evaluation approaches. Olmo and Gaudioso tried to treat the presentation and the
calculation of the RS separately [267]. Cremonesi and Lentini have suggested a
procedure to evaluate collaborative filtering (CF) based recommender systems
(RS) [268]. They use 7 evaluation metrics to evaluate the system.



Three approaches to recommender system evaluation are usually stated in the
literature: offline evaluation, online evaluation and user studies [1], [10]. The
author of [269] argues that online experiments are very difficult to evaluate because
they keep generating data, and proposes a prequential evaluation protocol that is
useful for both online and offline evaluation.

As far as evaluation metrics are concerned, Mean Average Precision (MAP) is regarded
as the de facto standard, while Mean Reciprocal Rank (MRR) is considered best for
measuring recommendations of top-ranked products [38], [40]. In [39], [270] the
authors discussed the idea of finding errors in recommendation using false positives
and false negatives, and illustrated how these help in determining the accuracy of
the system.

In previous work [39], [40], the authors proposed an evaluation process, but no
criterion of preference is assured. They quantified the user feedback and selected all
items with a score greater than zero. However, every item whose reviews a user visits
receives a score greater than zero, although an item that was merely visited is not
guaranteed to be the user's preference. Here, the criteria of preference are defined
mathematically, and they determine whether an item can be treated as a preferred
product.

A survey of more than 28,000 Internet users across 56 countries, conducted by Nielsen
in 2012, suggests that online customer reviews are the second most trusted source of
brand information [271]. However, bias and casual reviews remain a concern for online
reviews [29]. In various evaluation studies [267], [51], the integrity of user
feedback is not well explored, and the consequences of fake, casual or biased feedback
are not discussed. In this work, a check for bias and casualness is proposed by
measuring users' sincerity. The approach strengthens the feedback, which in turn is
the basis of the evaluation study.

6.3 Evaluation Metrics

How well a recommender system performs depends on the task it needs to achieve.
There are no absolute guidelines for designing recommender systems; instead, a system
has to be designed so that it satisfies the users and meets their needs. Users' needs
vary with the situation and with their requirements. Thus, the evaluation of such
systems is relative rather than absolute.



In the literature, prediction accuracy is the most frequently considered property of
a recommender system [51]. It refers to how accurately the recommender system
predicts items for the users. It is assumed that users prefer systems that predict
items for them more accurately. Prediction accuracy can be classified into the
following categories.

Accuracy of ratings predictions

Accuracy of usage predictions

Accuracy of rankings of items

Different evaluation metrics have been used to measure these accuracies. Some of the
most frequently used metrics are:

i. P@10

ii. FPR@10

iii. FNR@10

iv. Mean Average Precision (MAP)

v. Mean Absolute Error (MAE)

vi. Mean Reciprocal Rank (MRR)

vii. Root Mean Square Error (RMSE)

viii. Spearman rank Correlation Coefficient

ix. Modified Spearman rank Correlation Coefficient

The calculation method of the above metrics and their details are illustrated below.

6.3.1 P@10

We define the precision at top-k positions as P@k and it is given as:

$$P@k = \frac{\text{Number of products recommended by the system ranking in the top } k \text{ positions that are also endorsed in the user's ranking}}{k}$$

For different values of k we may obtain different P@k. For k = 10, we define the
precision at top-10 positions as P@10 and it is given as:

$$P@10 = \frac{\text{Number of products recommended by the system ranking in the top 10 positions that are also endorsed in the user's ranking}}{10}$$
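As an illustration, the definition of P@k may be sketched in Python. This is only a minimal sketch under stated assumptions: the function name `precision_at_k` and the book identifiers are hypothetical, and "endorsed in the user's ranking" is assumed here to mean that the product appears within the top k positions of the user's ranking.

```python
def precision_at_k(system_ranking, user_ranking, k=10):
    """Share of the system's top-k products that also occur in the user's top-k.

    Assumption: a product counts as endorsed if it lies in the user's top k.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    user_top = set(user_ranking[:k])
    hits = sum(1 for product in system_ranking[:k] if product in user_top)
    return hits / k


# Hypothetical rankings of four books, best first.
system = ["b1", "b2", "b3", "b4"]
user = ["b2", "b1", "b5", "b6"]
p_at_4 = precision_at_k(system, user, k=4)  # b1 and b2 are shared: 2 / 4
```

The order of the products within the top k does not affect the value; only membership in both top-k lists is counted.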



6.3.2 FPR@10

We denote the false positive rate for the top 10 positions as FPR@10. We define
FPR@10 as follows:

$$FPR@10 = \frac{\text{Number of products recommended in the top 10 positions but not preferred by the customer}}{10}$$

A "false positive" is a situation in which a recommended product is not preferred by
the customer. This situation is considered the worst, as customers get irritated and
may not make any further purchase.

6.3.3 FNR@10

The false negative error refers to a situation in which items preferred by customers
are missing from the recommendation. We denote the false negative rate for the top 10
positions as FNR@10. We define FNR@10 as follows:

$$FNR@10 = \frac{\text{Number of products missing in the recommendation but preferred by the customer in the top 10 positions}}{10}$$
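The two error rates defined in Sections 6.3.2 and 6.3.3 may be sketched in Python as follows. The function names and the sample identifiers are hypothetical, and it is assumed that "preferred by the customer" means membership in the top k positions of the customer's ranking.

```python
def fpr_at_k(system_ranking, preferred_ranking, k=10):
    """Share of the system's top-k products that the customer does not prefer."""
    preferred_top = set(preferred_ranking[:k])
    false_positives = [p for p in system_ranking[:k] if p not in preferred_top]
    return len(false_positives) / k


def fnr_at_k(system_ranking, preferred_ranking, k=10):
    """Share of the customer's top-k products that the system failed to recommend."""
    system_top = set(system_ranking[:k])
    false_negatives = [p for p in preferred_ranking[:k] if p not in system_top]
    return len(false_negatives) / k


# Hypothetical rankings of four books, best first.
system = ["b1", "b2", "b3", "b4"]
preferred = ["b2", "b1", "b5", "b6"]
fpr = fpr_at_k(system, preferred, k=4)  # b3 and b4 are not preferred
fnr = fnr_at_k(system, preferred, k=4)  # b5 and b6 are missing
```

Under this assumption, when both rankings contain at least k distinct products, the two rates coincide and each equals 1 minus the precision at k.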

6.3.4 Mean Average Precision

Mean Average Precision (MAP) is given by:

$$MAP = \frac{1}{n} \sum_{i=1}^{n} P(U_i) \qquad (6.1)$$

where $P(U_i)$ is the precision of the $i$-th user, and $n$ is the number of users.
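Equation 6.1 averages one precision value per user. A minimal Python sketch is given below; the function names are hypothetical, and it is assumed that the precision of a user is the P@k of the system ranking against that user's ranking.

```python
def precision_at_k(system_ranking, user_ranking, k):
    """P@k of the system ranking against one user's ranking (assumed P(U_i))."""
    user_top = set(user_ranking[:k])
    return sum(1 for p in system_ranking[:k] if p in user_top) / k


def mean_average_precision(system_ranking, user_rankings, k=10):
    """Equation 6.1: the mean of the per-user precision values."""
    if not user_rankings:
        raise ValueError("at least one user ranking is required")
    precisions = [precision_at_k(system_ranking, u, k) for u in user_rankings]
    return sum(precisions) / len(precisions)


# Hypothetical data: one system ranking compared with three users at k = 2.
system = ["a", "b", "c", "d"]
users = [
    ["a", "b", "c", "d"],  # precision 1.0
    ["a", "c", "b", "d"],  # precision 0.5
    ["c", "d", "a", "b"],  # precision 0.0
]
map_value = mean_average_precision(system, users, k=2)
```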

6.3.5 Mean Absolute Error

The Mean Absolute Error (MAE) measures the closeness of the outcomes to the actual
results. It is given by:

$$MAE = \frac{1}{n} \sum_{i=1}^{n} \left| x_i - x \right| \qquad (6.2)$$

where $x_i$ and $x$ are the outcomes and actual values respectively, and $n$ is the
total number of observations.
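A minimal Python sketch of equation 6.2 follows. The function name and the sample rank positions are hypothetical, and it is assumed that each outcome is compared with its own actual value, i.e. the two sequences are aligned item by item.

```python
def mean_absolute_error(outcomes, actual_values):
    """Equation 6.2: average absolute difference between outcomes and actual values."""
    if len(outcomes) != len(actual_values) or not outcomes:
        raise ValueError("both sequences must be non-empty and of equal length")
    total = sum(abs(x_i - x) for x_i, x in zip(outcomes, actual_values))
    return total / len(outcomes)


# Hypothetical rank positions of four books: system outcome vs. actual (expert) rank.
system_positions = [1, 2, 3, 4]
expert_positions = [2, 1, 3, 6]
mae = mean_absolute_error(system_positions, expert_positions)  # (1 + 1 + 0 + 2) / 4
```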

6.3.6 Mean Reciprocal Rank

For each item, we take the product that is ranked first in the system ranking and
find its position in the ranking produced by the comprehensive approach. Let $r$
denote its position in the comprehensive ranking. The Reciprocal Rank (RR) is given
as:

$$RR = \frac{1}{r}$$



The Mean Reciprocal Rank (MRR) over all the items, for their respective first-ranked
products, is given by:

$$MRR = \frac{1}{n} \sum_{i=1}^{n} RR_i \qquad (6.3)$$

where $n$ is the total number of different items and $i$ denotes the item's sequence.
MRR gives the degree of relevance of a particular product in the eyes of a customer.
If the first-ranked product of every item in the system ranking is also ranked first
in the comprehensive ranking, MRR will be 1, i.e. the best case.
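The computation of RR and MRR may be sketched in Python as follows. The function names and the sample data are hypothetical; each pair holds the system ranking and the comprehensive ranking of the products of one item, best first.

```python
def reciprocal_rank(system_ranking, comprehensive_ranking):
    """RR = 1 / r, where r is the comprehensive position of the system's top product."""
    top_product = system_ranking[0]
    # list.index raises ValueError if the product is absent from the comprehensive ranking.
    r = comprehensive_ranking.index(top_product) + 1
    return 1 / r


def mean_reciprocal_rank(ranking_pairs):
    """Equation 6.3: the mean of RR over all items."""
    if not ranking_pairs:
        raise ValueError("at least one item is required")
    return sum(reciprocal_rank(s, c) for s, c in ranking_pairs) / len(ranking_pairs)


# Hypothetical data for two items.
pairs = [
    (["a", "b", "c"], ["a", "c", "b"]),  # top product "a" is first: RR = 1
    (["x", "y", "z"], ["y", "x", "z"]),  # top product "x" is second: RR = 1/2
]
mrr = mean_reciprocal_rank(pairs)
```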

6.3.7 Root Mean Square Error

The root mean square error gives the error value of the data. It is given as:

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( Y_i - y_i \right)^2} \qquad (6.4)$$

where $Y_i$ is the actual ranking, i.e. the expert's recommendation, and $y_i$ is the
ranking predicted by the system.

The lower the value of RMSE, the more accurate the ranking.
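A minimal Python sketch of equation 6.4 is given below. The function name and the sample rank positions are hypothetical; the two sequences are assumed to list the expert's and the system's rank positions for the same books in the same order.

```python
import math


def root_mean_square_error(expert_ranks, system_ranks):
    """Equation 6.4: square root of the mean squared rank difference."""
    if len(expert_ranks) != len(system_ranks) or not expert_ranks:
        raise ValueError("both sequences must be non-empty and of equal length")
    squared = sum((Y - y) ** 2 for Y, y in zip(expert_ranks, system_ranks))
    return math.sqrt(squared / len(expert_ranks))


# Hypothetical rank positions of four books.
expert = [1, 2, 3, 4]
system = [2, 1, 4, 3]
rmse = root_mean_square_error(expert, system)  # every difference is 1
```

Because the differences are squared before averaging, a few large rank disagreements raise RMSE more than they raise MAE.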

6.3.8 Spearman rank Correlation Coefficient

Spearman's rank correlation coefficient is used to identify and test the strength of
a relationship between two sets of data.

The Spearman rank correlation coefficient $r'$ is mathematically given by:

$$r' = 1 - \frac{6 \sum d^2}{N (N^2 - 1)} \qquad (6.5)$$

where $d$ is the difference in statistical rank of corresponding variables and $N$ is
the number of ranked pairs; $r'$ is an approximation to the exact correlation
coefficient.
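Equation 6.5 may be sketched in Python as follows. The function name is hypothetical, and the sketch assumes two complete rankings of the same N items without ties.

```python
def spearman_rank_correlation(ranks_a, ranks_b):
    """Equation 6.5 for two full rankings of the same items (no ties assumed)."""
    n = len(ranks_a)
    if n != len(ranks_b) or n < 2:
        raise ValueError("rankings must have equal length of at least 2")
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))


identical = spearman_rank_correlation([1, 2, 3, 4, 5], [1, 2, 3, 4, 5])
reversed_order = spearman_rank_correlation([1, 2, 3, 4, 5], [5, 4, 3, 2, 1])
```

Identical rankings give the value 1 and exactly reversed rankings give -1, the two extremes of the coefficient.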



6.3.9 Modified Spearman rank Correlation Coefficient

Researchers have used the Spearman rank correlation coefficient to measure the
similarity between different rankings [272]. Its drawback is that it cannot produce a
correct correlation for a partial list. Beg [273] has suggested a modified version of
this coefficient. The formal definition of the modified Spearman rank correlation
coefficient is given as:

“If full list is given as [1, 2,…, n] and let the partial list be given as [v1, v2,…,vm].

Then Without loss of generality, Modified Spearman rank order correlation

coefficient (rs׳) between these two rankings is given as follows”;

------------------------------ (6.6)

6.4 Evaluation based on Experts' Ranking using Explicit Feedback

The evaluation of recommender systems (RS) is very important in deciding the
procedure and parameters to be considered while designing an RS. A quality evaluation
mitigates the issues encountered in recommendation and reinforces it by placing
before users the products that match their preferences. The evaluation studies are
discussed in detail in Section 6.2, and the related evaluation metrics are described
in Section 6.3.

The present study aims to help academicians decide on the books for the syllabus of
university graduates in India. On the one hand, it helps university experts and
administrators in designing the course curriculum for their graduate students; on the
other hand, it offers students a platform to explore challenging books for their
studies. Thus, assessing whether the recommendation approach performs well and
fulfils the requirement has great significance. Therefore, we have evaluated the
adopted procedure with the help of experts. The term 'expert' refers to a specialist
in computer science. These specialists were selected from different locations around
the globe; their only common characteristic is a direct or indirect connection with
the Indian education system that involves computer science to some extent.



The lists of books recommended by the book recommendation approaches (BRA)
discussed in Chapters 3, 4 and 5 are provided to the specialists. The specialists are
computer science graduates, researchers, corporate officials and academicians. They
include both Indians and foreigners, but all have a link with computer science
education in India. As described in Chapter 3, a total of 10 different courses are
considered, and 10 different specialists are approached for each course. They were
identified from their profiles, university databases and Google Scholar. In
particular, faculty members of Computer Science departments of central universities
of India, research scholars in Computer Science departments at Indian universities,
and IT industry engineers are included. Experts from India, Iraq, Iran, USA, KSA,
Jordan, Yemen, UK and Australia are approached. The specialists from abroad are
researchers and IT specialists working in India, or Indian computer science graduates
working in those countries. The ranked books from the BRA are compared with the
ranking of books provided by the experts, and the recommendation approaches are
evaluated on the basis of several evaluation measures. A block diagram illustrating
the evaluation scheme is shown in Figure 6.1.

The block labeled 'prescribed books by top ranked universities' shows the list of
books prescribed in the universities' curricula. These books are used by all the
different recommendation approaches (BRA). The details of these approaches have been
discussed extensively in the respective sections of Chapter 3 and Chapter 4. The final
rankings produced by the different approaches are stored, and these lists of ranked
books are considered the final recommendations of the respective book recommenders.

As shown in Figure 6.1, the lists of books are given to the experts. There are 10
different courses with varying numbers of books, 158 books in total. The books of each
course are presented to the corresponding experts. Consider, for example, the books
on "operating systems" (OS). There are 15 different books on OS. These 15 books are
given to the experts, and their explicit feedback on the ranking of these books is
recorded.



Figure 6.1: Block diagram for Evaluation of Book Recommendation Approaches

The final recommendations of books by each recommendation approach are compared
with the ranking given by the experts. The recommendations are evaluated on the basis
of evaluation measures that include Mean Reciprocal Rank (MRR), P@k, Mean Average
Precision (MAP), FPR@k (false positive rate), FNR@k (false negative rate), Mean
Absolute Error (MAE), Root Mean Square Error (RMSE) and the Modified Spearman rank
correlation coefficient.

6.4.1 Evaluation Results based on Different Evaluation Metrics

The different evaluation metrics have been discussed in Section 6.3. These metrics
are frequently used to evaluate the performance of recommender systems. We have used
eight (8) of the metrics discussed above to evaluate our proposed recommender system
against the experts' ranking. The different approaches are compared, and the results
are shown and discussed from various aspects.

All the parameters included in the evaluation process are discussed one by one to
show their values for each technique. Hence, a relative comparison of the
recommendation approaches is presented for each parameter.



6.4.1.1 Evaluation Results using Explicit Feedback based on Root Mean Square Error

The Root Mean Square Error (RMSE) of the rankings of books provided in Chapter 3,
Chapter 4 and Chapter 5 is computed against the experts' ranking. For all 10 courses,
the values of RMSE for OMT, PAS, OWA with different quantifiers and ORWA are shown in
Table 6.1. The RMSE is obtained using equation 6.4.

The RMSE for the books on 'Theory of Computation' (TOC) is least for the Opinion
Mining Technique (OMT), at 0.645. OWA (at least half) performs next best, with an RMSE
of 0.936. The maximum RMSE, i.e. the worst performance, is recorded by OWA (as many as
possible); however, the value is still considerably low. This variation indicates that
the recommendation made by the technique based on users' opinions has the least error
when compared with the experts' ranking for the books on TOC. A similar trend is
observed for books on Data Base (DB). This again emphasizes that the recommendation
technique based on online opinions is the most similar to the experts' rankings. It
suggests that the adopted method is the most accurate and that the experts' ranking
differs least from readers' opinions. Surprisingly, Ordered Ranked Weighted
Aggregation (ORWA) has the minimum RMSE for books on Compiler Design (CD), and its
value is the same as the RMSE obtained for OWA (at least half). The reason for the
similar values of ORWA and OWA (at least half) is the weight assignment methodology
of ORWA.

Table 6.1: Root Mean Square Error of all books by different approaches

| Courses \ Recommendation Approaches | PAS | ORWA | OWA (At least half) | OWA (As many as possible) | OWA (Most) | OMT |
|---|---|---|---|---|---|---|
| TOC | 0.97 | 0.96 | 0.94 | 1.14 | 1.01 | 0.65 |
| DB | 0.86 | 0.86 | 0.81 | 1.11 | 0.91 | 0.74 |
| CD | 0.91 | 0.73 | 0.73 | 1.06 | 0.91 | 0.82 |
| OS | 0.86 | 0.87 | 0.88 | 1.32 | 1.06 | 0.84 |
| DS | 0.95 | 0.90 | 0.91 | 1.11 | 1.17 | 0.96 |
| AI | 2.22 | 2.16 | 2.34 | 2.46 | 2.23 | 2.00 |
| CN | 1.40 | 1.46 | 1.36 | 1.74 | 1.80 | 1.17 |
| SE | 1.08 | 1.00 | 1.10 | 1.48 | 1.61 | 1.17 |
| DM | 0.99 | 0.80 | 1.25 | 1.61 | 1.37 | 1.08 |
| CG | 2.14 | 1.52 | 1.74 | 2.58 | 2.59 | 0.29 |
| Average RMSE for all books | 1.24 | 1.13 | 1.21 | 1.56 | 1.47 | 0.97 |



When weights are assigned using ORWA, the ranking of the rankers is taken into
consideration. OWA (at least half) likewise considers the upper half of the
best-ranked institutions. Therefore, in most cases these two methods have similar
values of the metric. The maximum difference between the values of ORWA and OWA (at
least half) is observed for books on 'Computer Network' (CN) and 'Discrete
Mathematics' (DM). A possible reason is the variation in the CN and DM books that
universities recommend to their students. Because the books prescribed in the syllabi
vary across universities, fewer common books are found, which in turn produces the
difference in RMSE between ORWA and OWA (at least half).

The RMSE is found to be least for OMT in most of the cases, and the books on AI have
the highest error for all the methods. OMT has the least error in 6 out of 10 cases,
whereas ORWA performs better than the other techniques in 4 cases. OWA (at least
half) has the second-best performance in two cases and the Positional Aggregation
Scheme (PAS) in three cases. However, OWA (as many as possible) has the maximum RMSE
in most of the cases. The largest difference among the values of the respective
techniques is observed for books on 'Computer Graphics' (CG), where the experts'
recommendation and users' opinions have the least root mean square error. This implies
that the techniques adopted for opinion extraction and product recommendation are up
to the mark, and it shows the strong similarity between the experts' ranking and
users' reviews. The results also indicate that the opinions are interpreted
successfully, as they are highly similar to the recommendations of the experts.
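The counts and averages quoted for Table 6.1 can be re-derived from the tabulated values. The following Python sketch is only illustrative: the values are copied from Table 6.1, while the names `APPROACHES`, `RMSE_TABLE`, `column_averages` and `best_counts` are hypothetical. A tie for the lowest RMSE is credited to every approach that attains it.

```python
APPROACHES = ["PAS", "ORWA", "OWA (at least half)",
              "OWA (as many as possible)", "OWA (most)", "OMT"]

# RMSE per course, in the column order of Table 6.1.
RMSE_TABLE = {
    "TOC": [0.97, 0.96, 0.94, 1.14, 1.01, 0.65],
    "DB":  [0.86, 0.86, 0.81, 1.11, 0.91, 0.74],
    "CD":  [0.91, 0.73, 0.73, 1.06, 0.91, 0.82],
    "OS":  [0.86, 0.87, 0.88, 1.32, 1.06, 0.84],
    "DS":  [0.95, 0.90, 0.91, 1.11, 1.17, 0.96],
    "AI":  [2.22, 2.16, 2.34, 2.46, 2.23, 2.00],
    "CN":  [1.40, 1.46, 1.36, 1.74, 1.80, 1.17],
    "SE":  [1.08, 1.00, 1.10, 1.48, 1.61, 1.17],
    "DM":  [0.99, 0.80, 1.25, 1.61, 1.37, 1.08],
    "CG":  [2.14, 1.52, 1.74, 2.58, 2.59, 0.29],
}


def column_averages(table):
    """Average RMSE of each approach over all courses, rounded to two decimals."""
    rows = list(table.values())
    return [round(sum(row[j] for row in rows) / len(rows), 2)
            for j in range(len(APPROACHES))]


def best_counts(table):
    """Number of courses in which each approach attains the lowest RMSE."""
    counts = {name: 0 for name in APPROACHES}
    for row in table.values():
        lowest = min(row)
        for name, value in zip(APPROACHES, row):
            if value == lowest:
                counts[name] += 1
    return counts


averages = column_averages(RMSE_TABLE)
wins = best_counts(RMSE_TABLE)
```

With the tabulated values, OMT attains the lowest RMSE for six courses and ORWA for four, with ORWA and OWA (at least half) tied for CD.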

The average values of the RMSE for all the books are represented pictorially in
Figure 6.2. The result clearly indicates that OMT outperforms the other techniques on
this parameter, as it has the least error. The average RMSE is maximum for OWA (as
many as possible) and minimum for OMT. ORWA is the second-best performer as far as
RMSE is concerned. Unlike ORWA, OWA (at least half) has not obtained an equally low
RMSE, since it includes half of the rankers, i.e. universities, in the recommendation
process, and the books recommended by the lowest-ranked universities among them pull
down the ranking of this method. Thus, the results show the significance of
authoritative recommendation, since ORWA, OWA with all quantifiers and PAS are
authoritative recommendation techniques that consider the suggestions of university
authorities when recommending the best books to students.

On observing the RMSE for the different techniques, one interesting point is that
the maximum RMSE attained by OMT is 2.0. Thus, OMT not only has the least error for
most of the books but also has the smallest worst-case RMSE among all the techniques.

Figure 6.2: Average Root Mean Square Error for all techniques

6.4.1.2 Evaluation Results using Explicit Feedback based on Mean Absolute Error

The Mean Absolute Error (MAE) of the rankings of books provided in Chapter 3,
Chapter 4 and Chapter 5 is computed against the experts' ranking. For all 10 courses,
the values of MAE for OMT, PAS, OWA with different quantifiers and ORWA are shown in
Table 6.2.

The MAE for the books on 'Computer Network' (CN) is least for ORWA, at 4.54. The
maximum MAE, i.e. the worst performance, is recorded by OWA (as many as possible).
This variation indicates that the recommendation made by ordered ranked weighted
aggregation, applied over authoritative recommendations, has the least error when
compared with the experts' ranking for the books on CN. Unlike RMSE, MAE is lowest for
all books using ORWA. Consistent performance is observed for the related approaches;
hence, PAS and OWA (at least half) have also performed well.

The results follow almost the same trend in every technique for all the books,
although the differences between the ORWA and OMT values of MAE vary. For books on
TOC, DB, AI and CD the differences are minor, whereas for the other books they are
slightly larger. This suggests that the adopted method of aggregation for the
authoritative recommendations has the best performance as far as MAE is concerned.
Still, the OWA (as many as possible) and OWA (most) recommendations lag behind. The
reason is that these two methods consider the lower half of the recommendations while
performing aggregation; hence, they include the lower-ranked recommendations instead
of those of the highly ranked rankers. Also, Ordered Ranked Weighted Aggregation
(ORWA) has the minimum MAE for books on Compiler Design (CD), and its value is the
same as the MAE obtained for OWA (at least half). The reason for the similar values of
ORWA and OWA (at least half) is the weight assignment methodology of ORWA.

Table 6.2: Mean Absolute Error of all books for different approaches

| Courses | PAS | ORWA | OWA (At least half) | OWA (As many as possible) | OWA (Most) | OMT |
|---|---|---|---|---|---|---|
| CN | 5.91 | 4.54 | 4.25 | 6.84 | 6.06 | 5.52 |
| DM | 3.83 | 2.81 | 3.94 | 6.22 | 4.89 | 4.338 |
| OS | 2.73 | 2.76 | 2.77 | 4.19 | 3.35 | 3.16 |
| TOC | 3.05 | 2.91 | 2.89 | 3.51 | 3.13 | 3.098 |
| SE | 4.41 | 3.23 | 3.41 | 6.23 | 5.59 | 4.574 |
| DBMS | 2.72 | 2.73 | 2.57 | 3.5 | 2.89 | 2.882 |
| DS | 2.98 | 2.81 | 2.8 | 3.54 | 3.71 | 3.168 |
| AI | 6.7 | 6.45 | 6.99 | 7.21 | 6.75 | 6.82 |
| CD | 2.88 | 2.3 | 2.3 | 3.36 | 2.88 | 2.744 |
| CG | 4.41 | 3.23 | 3.41 | 6.23 | 5.59 | 4.574 |
| Average MAE for all the books | 3.962 | 3.377 | 3.533 | 5.083 | 4.484 | 4.0878 |

The MAE is found to be least for ORWA in all the cases. As with RMSE, the books on AI
have the highest error for all the methods. However, the least MAE is obtained for CD
in all the cases. As the total number of books on CD is only 10, the overall
recommendations show fewer variations. The largest difference among the values of the
respective techniques is observed for books on 'Computer Graphics' (CG).



Figure 6.3: Average of Mean Absolute Error of all the books for different quantifiers

6.4.1.3 Evaluation Results using Explicit Feedback based on P@10

P@10 measures the precision of a recommendation: the fraction of the books placed in the
top 10 positions by the experts that are also recommended in the top 10 by the Book
Recommendation Approaches (BRA). It therefore indicates how accurate the adopted
recommender technique and its recommendations are.
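As a minimal sketch of the measure described above, the following Python function computes P@k as the share of a technique's top-k books that also appear in the experts' top-k. The function name, the book identifiers and both rankings are hypothetical and serve only as an illustration; they are not taken from the thesis data.

```python
def precision_at_k(recommended, expert, k=10):
    """Share of the technique's top-k books that also appear in the experts' top-k."""
    system_top = recommended[:k]
    expert_top = set(expert[:k])
    hits = sum(1 for book in system_top if book in expert_top)
    return hits / k


# Hypothetical course with 15 books; the experts rank them B1 (best) to B15.
expert_ranking = [f"B{i}" for i in range(1, 16)]
system_ranking = ["B1", "B3", "B2", "B12", "B5", "B4", "B14", "B7", "B6", "B11",
                  "B8", "B9", "B10", "B13", "B15"]

# Seven of the system's top 10 books are also in the experts' top 10.
print(precision_at_k(system_ranking, expert_ranking))

# With only 10 books in total, any ordering gives P@10 = 1.
ten_books = [f"C{i}" for i in range(1, 11)]
print(precision_at_k(list(reversed(ten_books)), ten_books))
```

The second call illustrates why a course with exactly 10 books always yields P@10 = 1, whatever the ordering.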

The values of P@10 for the respective books under all the recommendation techniques are
given in Table 6.3. ORWA has the maximum P@10 for the books of four courses, whereas OMT
has the highest P@10 for five courses. Thus the recommendations that best match the
experts' ranking are made by OMT, ORWA and OWA (at least half); these are the best
performers as far as P@10 is concerned. OWA (at least half) obtains values of P@10 as good
as those of ORWA, and for almost every book the values are the same for both methods.

For books on 'Compiler Design' (CD), P@10 is 1 for all the techniques. This is again
because CD has a total of only 10 books, and the experts have ranked all of them in order,
so every book necessarily falls within the experts' recommendation list. Similarly, all 10
books are ordered from best to least by each technique, which in turn means that all 10
books are recommended. The measure would therefore be truly meaningful for a large number
of books, where the top 10 out of a large collection would represent genuinely scrutinized
results.



Thus, the results show the significance of authoritative recommendation: ORWA, OWA with
all quantifiers, and PAS are authoritative recommendation techniques, which consider the
suggestions of university authorities while recommending the best books to students.

The average P@10 for the different techniques is shown in Figure 6.4. One interesting
observation is that the average values of P@10 for the different approaches differ only
slightly and are very close to each other. ORWA and OWA (at least half) have an average
P@10 of about 0.77 (0.769 and 0.766 in Table 6.3). PAS and OWA (most) have 0.72 and 0.70,
respectively. The maximum average P@10 over all the books is 0.78, which is attained by the
Opinion Mining Technique (OMT).

Table 6.3: P@10 for all approaches

Courses                            PAS     ORWA    OWA (At       OWA (As many   OWA      OMT
                                                   least half)   as possible)   (Most)
CN                                 0.53    0.67    0.65          0.45           0.57     0.73
DM                                 0.75    0.77    0.77          0.45           0.65     0.68
OS                                 0.74    0.72    0.72          0.7            0.74     0.78
TOC                                0.92    0.92    0.92          0.91           0.9      0.91
SE                                 0.6     0.76    0.76          0.44           0.56     0.75
DBMS                               0.84    0.81    0.81          0.78           0.81     0.82
DS                                 0.7     0.74    0.74          0.69           0.69     0.78
AI                                 0.53    0.54    0.53          0.5            0.53     0.61
CD                                 1       1       1             1              1        1
CG                                 0.6     0.76    0.76          0.44           0.56     0.75
Average P@10 for all the books     0.721   0.769   0.766         0.636          0.701    0.781



Figure 6.4: Average P@10 for all books

Thus, again it is evident that the experts' suggestions and the users' opinions coincide.
On the one hand, this can be interpreted as the proficiency of the adopted technique, which
has been designed so that users' sentiments are converted to ranks using feature-based
extraction and evaluation of reviews; on the other hand, it supports the philosophy of
selecting the experts' ranking as the base for the evaluation of the systems.

6.4.1.4 Evaluation Results using Explicit Feedback based on Mean Average Precision

The MAP of the respective techniques is presented in Table 6.4. OMT performed relatively
well on P@10; in the same way, Mean Average Precision (MAP) is highest for OMT. The MAP for
the Opinion Mining Technique (OMT), which is discussed in Chapter 5, is 0.6397, while
Table 6.4 gives 0.5733 for ORWA and 0.5571 for OWA (at least half). The difference in the
values arises from the variation in precision for different 'k' in P@k: at the 10th
position the precision may be high, whereas for other values of k it may change
significantly. MAP incorporates the precision at each position and hence gives a more
holistic representation of precision.
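The way MAP folds the precision at every position into one number can be sketched as follows. This is an illustrative implementation of the standard definition of average precision, assuming that a book counts as relevant when it belongs to a given reference set (for example the experts' top 10); the exact relevance criterion used in the thesis is not restated in this section, and all names and data below are hypothetical.

```python
def average_precision(recommended, relevant):
    """Mean of the precision values at each position where a relevant book occurs."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for position, book in enumerate(recommended, start=1):
        if book in relevant:
            hits += 1
            precision_sum += hits / position  # precision at this cut-off
    return precision_sum / len(relevant)


def mean_average_precision(cases):
    """Average AP over several (recommended, relevant) pairs, e.g. one per course."""
    return sum(average_precision(rec, rel) for rec, rel in cases) / len(cases)


# Hypothetical data: relevant books found at positions 1, 3 and 4 in course A,
# and at position 2 in course B.
course_a = (["a", "x", "b", "c"], ["a", "b", "c"])
course_b = (["x", "a"], ["a"])
print(round(mean_average_precision([course_a, course_b]), 4))
```

Two rankings with the same P@10 can therefore have different MAP values when their relevant books sit at different positions.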

Table 6.4: Mean Average Precision of different approaches.

                                   PAS     ORWA    OWA (At       OWA (As many   OWA      OMT
                                                   least half)   as possible)   (Most)
MAP for all courses altogether     0.5257  0.5733  0.5571        0.4011         0.4593   0.6397



Figure 6.5: Mean Average Precision of different approaches

6.4.1.5 Evaluation Results using Explicit Feedback based on FPR@10

The FPR@10 for all the techniques is given in Table 6.5. FPR@10 measures the imprecision
of a recommendation: the fraction of the books recommended in the top 10 positions by the
Book Recommendation Approaches (BRA) that are not recommended in the top 10 by the experts.
This measure tells how accurate the adopted recommender technique is and what degree of
false positives appears in the recommendation.


The value of FPR@10 for the respective books across the recommendation techniques is least
for ORWA for the books of four courses, whereas OMT has the minimum error for five courses.
Thus the recommendations that best match the experts' ranking are made by OMT and ORWA;
these are the best performers as far as the false positive error for the top 10 positions,
i.e. FPR@10, is concerned. OWA (at least half) obtains values as good as those of ORWA, and
for almost every book the values are the same for both methods.

The false positive value for the different books is found to be least for OMT in most
cases, and the books on AI have the highest error for all the methods. OMT has the least
error in 5 out of 10 cases, whereas ORWA performs better than the other techniques in 4
cases. OWA (at least half) performs similarly to ORWA, and the Positional Aggregation
Scheme (PAS) performs better than any other technique for books on 'data base' (DB).
However, on analyzing the average FPR@10, OWA (as many as possible) has the maximum value,
as it does in most of the individual cases. The major difference in the values of the
respective techniques is observed for books on 'Computer Graphics' (CG) and 'Discrete
Mathematics' (DM). Also, the results indicate that the interpretation of opinions is framed
successfully, in that it has high similarity with the recommendations of experts.

Table 6.5: FPR@10 for all techniques.

Courses                            PAS     ORWA    OWA (At       OWA (As many   OWA      OMT
                                                   least half)   as possible)   (Most)
CN                                 0.47    0.33    0.35          0.55           0.43     0.27
DM                                 0.25    0.23    0.23          0.55           0.35     0.32
OS                                 0.26    0.28    0.28          0.3            0.26     0.22
TOC                                0.08    0.08    0.08          0.09           0.1      0.09
SE                                 0.4     0.24    0.24          0.56           0.44     0.25
DBMS                               0.16    0.19    0.19          0.22           0.19     0.18
DS                                 0.3     0.26    0.26          0.31           0.31     0.22
AI                                 0.47    0.46    0.47          0.5            0.47     0.39
CD                                 0       0       0             0              0        0
CG                                 0.4     0.24    0.24          0.56           0.44     0.25
Average FPR@10 for all the books   0.279   0.231   0.234         0.364          0.299    0.219

Figure 6.6: Average FPR@10 for all books using different book recommender approaches

6.4.1.6 Evaluation Results using Explicit Feedback based on FNR@10

The FNR@10 for all the techniques is given in Table 6.6. FNR@10 measures the imprecision
of a recommendation: the fraction of the books recommended by the experts in the top 10
positions that are not recommended in the top 10 by the Book Recommendation Approaches
(BRA). This measure tells how accurate the adopted recommender technique is and what degree
of false negatives appears in the recommendation.


The value of FNR@10 for the respective books across the recommendation techniques is least
for ORWA for the books of four courses, whereas OMT has the minimum error for five courses.
Thus the recommendations that best match the experts' ranking are made by OMT and ORWA;
these are the best performers as far as the false negative error for the top 10 positions,
i.e. FNR@10, is concerned. OWA (at least half) obtains values as good as those of ORWA, and
for almost every book the values are the same for both methods.

The false negative value for the different books is found to be least for OMT in most
cases, and the books on AI have the highest error for all the methods. OMT has the least
error in 5 out of 10 cases, whereas ORWA performs better than the other techniques in 4
cases. OWA (at least half) performs similarly to ORWA, and the Positional Aggregation
Scheme (PAS) performs better than any other technique for books on 'data base' (DB).

Table 6.6: FNR@10 of all books

Courses                            PAS     ORWA    OWA (At       OWA (As many   OWA      OMT
                                                   least half)   as possible)   (Most)
CN                                 0.47    0.33    0.35          0.55           0.43     0.27
DM                                 0.25    0.23    0.23          0.55           0.35     0.32
OS                                 0.26    0.28    0.28          0.3            0.26     0.22
TOC                                0.08    0.08    0.08          0.09           0.1      0.09
SE                                 0.4     0.24    0.24          0.56           0.44     0.25
DBMS                               0.16    0.19    0.19          0.22           0.19     0.18
DS                                 0.3     0.26    0.26          0.31           0.31     0.22
AI                                 0.47    0.46    0.47          0.5            0.47     0.39
CD                                 0       0       0             0              0        0
CG                                 0.4     0.24    0.24          0.56           0.44     0.25
Average FNR@10 for all the books   0.279   0.231   0.234         0.364          0.299    0.219



Figure 6.7: Average FNR@10 for all books using different book recommender approaches

However, on analyzing the average FNR@10, OWA (as many as possible) has the maximum value,
as it does in most of the individual cases. The major difference in the values of the
respective techniques is observed for books on 'Computer Graphics' (CG) and 'Discrete
Mathematics' (DM). Also, the results indicate that the interpretation of opinions is framed
successfully, in that it has high similarity with the recommendations of experts.
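The two error measures can be illustrated with a small sketch. It assumes, for illustration only, that a false positive is a book in the technique's top 10 that is missing from the experts' top 10, that a false negative is the reverse, and that both counts are divided by k. Under that assumption the two rates are necessarily equal whenever both lists hold k distinct books, and both equal 1 − P@k, which is consistent with Tables 6.5 and 6.6 carrying identical values that complement Table 6.3. The function name and the rankings are hypothetical.

```python
def top_k_error_rates(recommended, expert, k=10):
    """Return (false positive rate, false negative rate) for the top-k lists."""
    system_top = set(recommended[:k])
    expert_top = set(expert[:k])
    false_positives = len(system_top - expert_top)  # recommended, but not by experts
    false_negatives = len(expert_top - system_top)  # expert picks the system missed
    return false_positives / k, false_negatives / k


# Hypothetical course with 15 books; the experts rank them B1 (best) to B15.
expert_ranking = [f"B{i}" for i in range(1, 16)]
system_ranking = ["B1", "B3", "B2", "B12", "B5", "B4", "B14", "B7", "B6", "B11",
                  "B8", "B9", "B10", "B13", "B15"]

fpr, fnr = top_k_error_rates(system_ranking, expert_ranking)
print(fpr, fnr)  # three books differ in each direction
```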

6.4.1.7 Evaluation Results using Explicit Feedback based on Modified Spearman Rank Correlation Coefficient

Researchers have used the Spearman rank correlation coefficient to measure the similarity
between different rankings [272]. The problem with this coefficient is its inability to
produce a correct correlation for partial lists. Beg [273] has suggested a modified version
of this coefficient. The modified rank correlation gives the degree of similarity between
two different rankings. Here, we have proposed 6 different book recommendation approaches,
which lead to 6 different rankings of books. These 6 rankings are compared with the ranking
of books collected from experts, which is considered the standard ranking. The modified
Spearman correlation coefficients are given in Table 6.7.
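For orientation, the sketch below computes the standard (unmodified) Spearman rank correlation between two full rankings of the same items. It is not the modified coefficient of [273], whose handling of partial lists is not reproduced in this section; the function name and the example rankings are hypothetical.

```python
def spearman_rho(ranking_a, ranking_b):
    """Standard Spearman coefficient for two full rankings without ties."""
    if set(ranking_a) != set(ranking_b) or len(ranking_a) != len(ranking_b):
        raise ValueError("both rankings must contain exactly the same items")
    n = len(ranking_a)
    if n < 2:
        raise ValueError("at least two items are required")
    position_b = {item: rank for rank, item in enumerate(ranking_b, start=1)}
    squared_diff = sum((rank - position_b[item]) ** 2
                       for rank, item in enumerate(ranking_a, start=1))
    return 1 - 6 * squared_diff / (n * (n * n - 1))


experts = ["b1", "b2", "b3", "b4", "b5"]
technique = ["b2", "b1", "b3", "b4", "b5"]  # one adjacent swap
print(spearman_rho(experts, technique))
```

A value of 1 means identical rankings and −1 means exactly reversed rankings; the requirement that both lists contain the same items is the limitation for partial lists that the text refers to.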



Table 6.7: Modified Spearman Rank Correlation Coefficient by different approaches

Courses                            PAS     ORWA    OWA (At       OWA (As many   OWA      OMT
                                                   least half)   as possible)   (Most)
TOC                                0.87    0.88    0.88          0.85           0.87     0.93
DB                                 0.91    0.92    0.92          0.88           0.91     0.93
CD                                 0.87    0.90    0.90          0.83           0.86     0.88
OS                                 0.93    0.93    0.93          0.87           0.90     0.93
dS                                 0.93    0.93    0.92          0.89           0.89     0.93
AI                                 0.82    0.83    0.86          0.80           0.82     0.82
CN                                 0.88    0.88    0.88          0.81           0.82     0.92
SE                                 0.92    0.91    0.90          0.87           0.84     0.90
DS                                 0.93    0.93    0.88          0.86           0.89     0.92
CG                                 0.80    0.89    0.87          0.76           0.76     0.98
Average MSRCC                      0.89    0.90    0.89          0.84           0.86     0.91

Figure 6.8: Average of Modified Spearman rank correlation coefficient

The minimum average Spearman rank correlation coefficient is 0.842, which is observed for
OWA (as many as possible). The individual correlation scores for each book are very close
across all techniques, and the differences are insignificant; the maximum difference
between the techniques is recorded for books on 'Computer Graphics' (CG). The highest
correlation with the experts' rankings is recorded for OMT: its average correlation over
all the books is 0.91, which is the maximum average correlation. Considering the books
separately, the highest correlation is seen for books on CG with OMT, which is 0.97.



The high correlation of each technique with the experts' rankings clearly indicates that
all the adopted techniques are close to what can help users, and that they can be used to
satisfy the users' needs and provide them with accurate recommendations.

6.4.1.8 Evaluation Results using Explicit Feedback based on Mean Reciprocal Rank

The average MRR for all the techniques is presented in Table 6.8 and shown in Figure 6.9.
The average MRR for ORWA is about 0.64 (0.637 in Table 6.8). Loosely, this means that in
about 64% of cases the first-ranked book recommended by ORWA is also ranked first by the
experts. Basically, MRR aims to make users aware of the best products and to recommend them
along with a complete ranked list of products.
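A minimal sketch of the standard Mean Reciprocal Rank follows. It assumes, purely for illustration, that for each course the "target" is the book ranked first by the experts and that the reciprocal of its position in the technique's ranking is averaged over courses; the thesis does not restate its exact definition in this section. All names and rankings are hypothetical.

```python
def reciprocal_rank(recommended, target):
    """1 / position of the target in the recommended list, or 0 if it is absent."""
    for position, book in enumerate(recommended, start=1):
        if book == target:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(cases):
    """Average reciprocal rank over (recommended list, target) pairs."""
    return sum(reciprocal_rank(rec, target) for rec, target in cases) / len(cases)


# Hypothetical rankings for three courses; "b1" is the experts' first choice in each.
cases = [
    (["b1", "b2", "b3"], "b1"),  # found at position 1
    (["b2", "b1", "b3"], "b1"),  # found at position 2
    (["b3", "b2", "b1"], "b1"),  # found at position 3
]
print(round(mean_reciprocal_rank(cases), 3))
```

Under this definition a value of 1 is reached only when the target is at the first position in every course, so an MRR between 0.5 and 1 indicates that the target is usually, but not always, at the top.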

Table 6.8: Mean Reciprocal Rank of all techniques for different Courses

Courses                            PAS     ORWA    OWA (At       OWA (As many   OWA      OMT
                                                   least half)   as possible)   (Most)
CN                                 0.50    0.58    0.55          0.46           0.25     0.49
DS                                 0.83    0.84    0.87          0.15           0.30     0.86
OS                                 0.85    0.88    0.88          0.42           0.44     0.88
TOC                                0.59    0.57    0.58          0.57           0.60     0.66
SE                                 0.65    0.73    0.61          0.55           0.54     0.59
DBMS                               0.62    0.63    0.78          0.51           0.53     0.58
dS                                 0.58    0.58    0.49          0.23           0.25     0.45
AI                                 0.30    0.27    0.25          0.33           0.29     0.26
CD                                 0.56    0.56    0.56          0.56           0.58     0.57
CG                                 0.65    0.73    0.61          0.55           0.54     0.58
Average MRR for all books          0.613   0.637   0.618         0.433          0.432    0.592

The MRR for PAS and OWA (at least half) is closer to the experts' suggestion than that of
OMT: OMT has an average MRR of 0.59, whereas PAS and OWA (at least half) have approximately
the same value, about 0.61 (0.613 and 0.618 in Table 6.8).



Figure 6.9: Average Mean Reciprocal Rank of all the books for different techniques

6.4.2 Comprehensive Evaluation Measure

Several parameters have been used in this section and discussed in detail. To determine
which of the adopted recommendation approaches performs best, we suggest a comprehensive
evaluation measure which aggregates all the above evaluation metrics uniformly. Two
different types of metrics have been used in this chapter. One type measures error, which
we call a fallacy metric, and the other measures precision, which can be termed a veracity
metric. For a good recommender system, the veracity measures should be high and the fallacy
measures should be low. These two types of metrics are shown in Table 6.9 and Table 6.10,
and their aggregated comprehensive value is shown in Table 6.11.

The comprehensive evaluation measure (CEM) is calculated as:

\[
\text{Comprehensive Evaluation Measure} = \frac{(\text{sum of values of veracity metrics}) + (\text{sum of values of fallacy metrics})}{\text{number of metrics}} \tag{6.7}
\]

Table 6.9: Final values of parameters used to find error

Metrics                            PAS     ORWA    OWA (At       OWA (As many   OWA      OMT
                                                   least half)   as possible)   (Most)
RMSE                               1.23    1.12    1.2           1.56           1.46     0.97
FPR@10                             0.28    0.23    0.23          0.36           0.3      0.22
FNR@10                             0.28    0.23    0.23          0.36           0.3      0.22
MAE                                3.96    3.38    3.53          5.08           4.48     3.29



Table 6.10: Final values of parameters used to find precisions and correlation

Metrics                            PAS     ORWA    OWA (At       OWA (As many   OWA      OMT
                                                   least half)   as possible)   (Most)
P@10                               0.72    0.77    0.77          0.64           0.7      0.78
MAP                                0.55    0.61    0.6           0.42           0.47     0.62
Correlation                        0.34    0.41    0.39          0.14           0.14     0.43
MRR                                0.61    0.63    0.62          0.43           0.43     0.59

Table 6.11: Comprehensive evaluation measure

                                   PAS     ORWA    OWA (At       OWA (As many   OWA      OMT
                                                   least half)   as possible)   (Most)
Comprehensive Evaluation Measure   1.304   1.538   1.524         1.003          1.164    1.606

The Comprehensive Evaluation Measure (CEM) gives an aggregated interpretation of all the
techniques used. Across all the different metrics and recommendation approaches, the final
recommendation of the proposed evaluation approach is the feature-extraction-based opinion
mining technique (OMT). The CEM for OMT is 1.606, which is marginally ahead of ORWA (1.538)
and OWA (at least half) (1.524), and well ahead of OWA (as many as possible) and OWA (most)
(Table 6.11).
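The aggregation can be checked numerically with a short sketch. One observation, which is an inference from the tabulated numbers and not something the text states: the values of Table 6.11 are reproduced when each fallacy metric of Table 6.9 enters the sum through its reciprocal (so that lower error raises the score), added to the veracity metrics of Table 6.10 and divided by the total of 8 metrics. The variable and function names are illustrative.

```python
# Column order: PAS, ORWA, OWA (at least half), OWA (as many as possible), OWA (most), OMT
APPROACHES = ["PAS", "ORWA", "OWA-ALH", "OWA-AMAP", "OWA-Most", "OMT"]

# Table 6.9 (fallacy metrics): RMSE, FPR@10, FNR@10, MAE
FALLACY = {
    "PAS": [1.23, 0.28, 0.28, 3.96],
    "ORWA": [1.12, 0.23, 0.23, 3.38],
    "OWA-ALH": [1.2, 0.23, 0.23, 3.53],
    "OWA-AMAP": [1.56, 0.36, 0.36, 5.08],
    "OWA-Most": [1.46, 0.3, 0.3, 4.48],
    "OMT": [0.97, 0.22, 0.22, 3.29],
}

# Table 6.10 (veracity metrics): P@10, MAP, Correlation, MRR
VERACITY = {
    "PAS": [0.72, 0.55, 0.34, 0.61],
    "ORWA": [0.77, 0.61, 0.41, 0.63],
    "OWA-ALH": [0.77, 0.6, 0.39, 0.62],
    "OWA-AMAP": [0.64, 0.42, 0.14, 0.43],
    "OWA-Most": [0.7, 0.47, 0.14, 0.43],
    "OMT": [0.78, 0.62, 0.43, 0.59],
}


def cem(veracity, fallacy):
    """Assumed form: veracity values plus reciprocals of fallacy values, averaged."""
    total = sum(veracity) + sum(1.0 / value for value in fallacy)
    return total / (len(veracity) + len(fallacy))


for name in APPROACHES:
    print(name, round(cem(VERACITY[name], FALLACY[name]), 3))
```

Rounded to three decimals, the six results match the row of Table 6.11, which supports the reciprocal reading of the fallacy terms.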

6.5 Evaluation based on Implicit User Feedback

In this section, we evaluate a recommender system and suggest a comprehensive approach for
the evaluation of a recommender system so that user satisfaction can be assured. We perform
user studies to evaluate the recommender system. We present to users the different
recommended items, and links to their reviews, from the recommender system proposed in [6].
Implicit user feedback is collected, which captures the behavior of the users over the
items recommended by the system. We quantify the feedback and associate a score with each
product for each user. The sincerity of users is measured quantitatively, and only feedback
from sincere users is considered in the evaluation process. We use a threshold to classify
the preference of a user, and only the items preferred by each user are stored. This gives
a ranking of preferred products for every user. We obtain the users' aggregated ranking of
products by applying a rank aggregation algorithm to the rankings of all users. We call the
ranking presented by the system under evaluation the 'system ranking'. The system ranking
is evaluated on the basis of the aggregated ranking of products, using several evaluation
metrics. A relative comparison with a related approach is also made.

We can summarize our main contributions for Implicit evaluation of RS as follows:

I) Although user feedback has been quantified before [274], no sincerity measure
for the users was provided. We have introduced a user's sincerity measure
that strengthens the procedure and hence makes the evaluation process robust.

II) While the sincerity measure improves the fairness of the evaluation process, we
also set a criterion of preference to classify the user's preferred products. We
put a threshold value on the product importance score that reinforces the
criterion of preference.

III) Accuracy alone may not decide the quality of a recommender system; the
user's interaction and satisfaction are important. We suggest an approach
that interacts with users and takes feedback from them to evaluate the
recommender system.

IV) The proposed approach fulfils two aspects simultaneously. First, it
gives a comprehensive methodology to evaluate a recommender system that
can be generalized to any product and any marketing portal. Second, the
methodology is implemented to evaluate a recommender system and tells
how that system performs.

V) Bias and casual reviews are major concerns in user feedback. We
have explicitly defined and described a procedure to trace insincerity when
exploiting user feedback.

6.6 Architecture for Evaluation Scheme based on Implicit Feedback

We give an architecture (Figure 6.10) for the evaluation method of a recommender system.
The evaluation procedure is basically a two-step process. The first step is to lay down a
recommendation approach, and the second is to compare its recommended items with those
recommended by the system under evaluation. The recommendation approach consists of several
steps. Initially, the final recommendation given by a recommender system [243] is presented
to users. The recommender system recommends the top 10 products for each of 5 different
items. The list of all ranked products and links to the reviews of the respective products
are given to 10 different users. All the users are either graduate students or
professionals familiar with the use of the Internet.

Figure 6.10: Block diagram of implicit User Feedback based Evaluation of Recommender Systems



We take implicit user feedback (see section 6.6.1) for these reviews and quantify the
feedback (see section 6.6.2). The quantification of the feedback gives us the product
importance score (PIS) of each product for the respective users. Outlier analysis is
performed to check the sincerity of the users (see section 6.6.3); it gives the user's
sincerity measure. The sincerity measure strengthens the ranking process, as it allows only
sincere and substantial users to be included in the ranking. A threshold value for PIS is
suggested to set the criterion of preference (see section 6.6.4). Only those products that
satisfy the criterion of preference are considered, i.e. products whose PIS is higher than
the threshold value. By sorting these products, we get a ranking of the products for each
user. We apply a rank aggregation algorithm to aggregate the rankings of all users into a
final ranking. This final ranking is the aggregated ranking of products by the users. We
evaluate the system ranking on the basis of the users' aggregated ranking of products.
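The thresholding, sorting and aggregation steps can be sketched as below. Two elements are assumptions made only for illustration: the threshold value of 0.4, and the use of a Borda count as the rank aggregation algorithm, since this section does not name the algorithm or the threshold actually used. The scores and function names are hypothetical.

```python
def preferred_ranking(score_by_product, threshold):
    """Keep products scoring above the threshold, ordered from highest to lowest."""
    kept = {p: s for p, s in score_by_product.items() if s > threshold}
    return sorted(kept, key=lambda p: (-kept[p], p))


def borda_aggregate(rankings, all_products):
    """Borda count (a stand-in aggregator): higher positions earn more points."""
    n = len(all_products)
    points = {p: 0 for p in all_products}
    for ranking in rankings:
        for position, product in enumerate(ranking):
            points[product] += n - position
    return sorted(all_products, key=lambda p: (-points[p], p))


# Hypothetical normalized scores of four products for two sincere users.
user_scores = {
    "u1": {"L1": 0.92, "L2": 0.53, "L3": 0.58, "L4": 0.346},
    "u2": {"L1": 0.60, "L2": 0.70, "L3": 0.20, "L4": 0.50},
}
THRESHOLD = 0.4  # assumed value for illustration

per_user = [preferred_ranking(scores, THRESHOLD) for scores in user_scores.values()]
aggregated = borda_aggregate(per_user, ["L1", "L2", "L3", "L4"])
print(per_user)
print(aggregated)
```

The aggregated list plays the role of the users' aggregated ranking against which the system ranking is then compared.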

6.6.1 Vector Component of User feedback

To observe user behavior, we take implicit user feedback in vector form. We use five
vectors, namely E, P, S, T and V, to observe the user's behavior; thus we call the feedback
vector feedback. The five vectors are explained below.

(a) V: the sequence in which a user visits the review sites of the products. Let there
be 'n' products whose reviews are available to 'm' users, and let the ith user (i≤m)
visit the review site of the jth product (j≤n) as the kth site visited by that user.
We assign vij = k, where k≤n. A snippet about the product is attached to each link,
so users would first visit the pages they find closest to their choices. Hence, the
sequence in which a page is visited shows the significance of the product in the eyes
of a user.

(b) T: the time that the ith user spends exploring the reviews of product 'j' is
denoted by Tij. The value of T is set to 0 for a product whose review is not
visited by the user.

(c) P: a Boolean denoting whether or not a user prints the review of a product. The
Boolean vector P is denoted by 'Pij', where 'i' denotes the ith user and 'j' denotes
the jth product.

(d) S: a Boolean denoting whether or not a user saves the review of a product. The
Boolean vector S is denoted by 'Sij', where 'i' denotes the ith user and 'j' denotes
the jth product.

(e) E: a Boolean denoting whether or not a user e-mails the review of a product. The
Boolean vector E is denoted by 'Eij', where 'i' denotes the ith user and 'j' denotes
the jth product.

The idea behind compiling this feedback is the assumption that an intelligent user is
likely to visit the more appealing products early in the discovery process. The snippets
attached to the links draw the user's attention to the contents of the items; hence, it is
probable that users click the links of items in the order of their preference. Similarly,
the time that a user invests in exploring a review, whether or not the user saves it to
their computer, and whether the user prints it or e-mails it to someone else reveal the
degree of importance that a product holds for that particular user [273].

Example 6.1: Let 10 documents d1, d2, ..., d10 be presented to a user. On the very first
visit the user explores d3, spends 40 seconds traversing the document, and e-mails it to a
friend. The user neither saves the document nor prints it. The values of the vector
components of d3 for this user would be v=1, t=40 sec, e=1, p=s=0.

6.6.2 User Feedback based Scoring of Products

To assess the user's behavior, we give a formula for the quantification of vector
feedback. The quantification helps in scoring the products and ranking them accordingly
[20]. We denote the product's importance score (PIS) by ϕ. The PIS of the jth product by
the ith user is written as ϕij.

We give ϕij as:

\[
\phi_{ij} = \sum_{i=1}^{n} \sum_{j=1}^{m} \left\{ E_{ij} + P_{ij} + S_{ij} + T_{ij} + (1/V_{ij}) \right\}, \qquad V_{ij} \neq 0 \tag{6.8}
\]

where E, P and S are Boolean vectors that take the value 0 or 1 only. As described in
section 6.6.1, if a user prints, saves or e-mails a review to someone else, the respective
Boolean measure is set to 1, else 0. The importance of each component may differ, and can
be captured by associating with each component a weight according to its significance as
suggested by the user's behaviour. The weighted sum is given by equation 6.9.

\[
\text{Weighted Sum} = \sum_{i=1}^{n} \sum_{j=1}^{m} \left\{ W_{E_{ij}} E_{ij} + W_{P_{ij}} P_{ij} + W_{S_{ij}} S_{ij} + W_{T_{ij}} T_{ij} + W_{V_{ij}} (1/V_{ij}) \right\} \tag{6.9}
\]

For the sake of simplicity in the procedure, we take all weights as 1. Hence, equation 6.9
again reduces to equation 6.8. For the calculation of time 'T', we find that the average
length of the reviews is about 400 bytes, i.e. 400 characters. The reading speed of a user
is taken as 10 bytes/second. Thus, we classify the time spent into intervals to calculate
'T'. For the ith user and jth product we have [273]:

tij = 0 if the user does not visit the page.
tij = 1 if the time 't' spent investigating the review lies between 1 second and 39
seconds, i.e. 1≤t≤39
tij = 2 if 40≤t≤79
tij = 3 if 80≤t≤119
tij = 4 if 120≤t≤159
tij = 5 if t ≥160 seconds.

Finally we compute Tij as
Tij = tij/tmax;
Here tmax = 5.

If the user browses a review site for 45 seconds, we assign tij = 2, so
Tij = 2/5 = 0.4.

To calculate ϕ for the above case, we use Tij as 0.4.
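The interval classification above can be written compactly, since each class spans 40 seconds (400 characters read at 10 characters per second). The sketch below is illustrative; the function and constant names are not from the thesis, and durations between 0 and 1 second, which the text does not cover, are placed in class 1 by assumption.

```python
T_MAX = 5              # highest time class
INTERVAL_SECONDS = 40  # width of one class: 400 characters at 10 characters/second


def time_class(seconds):
    """Map the time spent on a review to the class t in {0, 1, ..., 5}."""
    if seconds <= 0:
        return 0  # review not visited
    return min(int(seconds // INTERVAL_SECONDS) + 1, T_MAX)


def normalized_time(seconds):
    """T = t / t_max, a value between 0 and 1."""
    return time_class(seconds) / T_MAX


for duration in (0, 39, 45, 119, 160):
    print(duration, time_class(duration), normalized_time(duration))
```

For the 45-second example in the text, the function returns class 2 and a normalized value of 0.4.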

A detailed review with more characters, whether positive or negative, could take a longer
time to read. In this case it may seem that a higher value of 'T' does not exactly reflect
a high affinity of the user for the reviewed item. However, we emphasize that a user would
read a detailed negative review only for a highly desirable item. Thus, irrespective of the
polarity of the review, the product is given more weight and is considered a preferable one
for the user.

We set the value of 'V' according to the sequence in which the review sites are hit, i.e.
if user 'i' visits the jth product on the very first click we set Vij = 1, which gives
1/Vij = 1. If a user does not visit a particular product's review, we set ϕ = 0.

As we use five vectors, we give a normalized product's importance score (NPIS). We denote
the NPIS of the jth product by the ith user as δij, and it is given by:

δij = ϕij / 5 --------------- (6.10)

We have calculated the NPIS, 'δ', of each product for each user. We have five different
items: Laptop, Head Phone, Smart Phone, Printer and Tablet. Each item has 10 different
products. In Table 6.12, the symbols L1, L2, ..., L10 represent the 10 ranked Laptop
products. These products are recommended by the recommender system under evaluation. An
illustration of the calculation of NPIS is summarized in Table 6.12. The values of the
vector components of products L1, L2, etc. are obtained for user 1, and the calculation is
performed as in the above equations. The δ of L1 is 0.92, i.e. the normalized product's
importance score of product L1 for user 1 is 0.92. The value of all the components for
product L9 is 0; this shows that the review of the product is not visited by user 1 at all.
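The per-product calculation can be sketched in a few lines. The sketch evaluates the bracketed term of equation (6.8) for a single user and product with all weights equal to 1, then normalizes by the five components as in equation (6.10); the function names are illustrative, and the three input rows are the L1, L2 and L9 entries reported for user 1.

```python
def product_importance_score(e, p, s, t_norm, v):
    """phi for one user and product: E + P + S + T + 1/V, or 0 if never visited."""
    if v == 0:
        return 0.0  # review not visited, so phi is set to 0
    return e + p + s + t_norm + 1.0 / v


def normalized_score(phi, components=5):
    """delta = phi / 5, since five feedback components are used."""
    return phi / components


# (E, P, S, T, V) for user 1, taken from the L1, L2 and L9 rows.
rows = {
    "L1": (1, 1, 1, 0.6, 1),
    "L2": (0, 1, 1, 0.4, 4),
    "L9": (0, 0, 0, 0.0, 0),
}
for product, (e, p, s, t_norm, v) in rows.items():
    phi = product_importance_score(e, p, s, t_norm, v)
    print(product, round(phi, 2), round(normalized_score(phi), 3))
```

For L1 this gives ϕ = 4.6 and δ = 0.92, and for L2 it gives ϕ = 2.65 and δ = 0.53, in agreement with the reported values.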

Table 6.12: Illustration for the calculation of Normalized Products Importance Score 'δ' for user 1.

Products  E   P   S   t   T     V   1/V     ϕ       δ
L1        1   1   1   3   0.6   1   1       4.6     0.92
L2        0   1   1   2   0.4   4   0.25    2.65    0.53
L3        0   1   1   2   0.4   2   0.5     2.90    0.58
L4        0   0   1   2   0.4   3   0.33    1.73    0.346
L5        1   0   0   3   0.6   5   0.2     1.80    0.36
L6        0   0   1   2   0.4   9   0.11    1.51    0.302
L7        0   1   0   1   0.2   6   0.16    1.36    0.272
L8        1   0   0   2   0.4   7   0.14    1.54    0.308
L9        0   0   0   0   0     0   0       0       0
L10       0   1   1   2   0.4   8   0.125   2.525   0.509



6.6.3 User's Sincerity Measure

Our approach to evaluating recommender systems is based on user studies. We have 10
different users for each product. It is necessary to check whether the users are sincere,
as the sincerity of a user plays an important role in any decision based on the feedback
taken from users. Offhand feedback from insincere users may lead to misguided conclusions.
Therefore we measure each user's sincerity, to know whether the user is sincere enough for
their feedback to be used as a base in the recommender system's evaluation process.

We find the correlation of all the users among themselves, represented as corr(ui,uj),
where ui and uj are users, over all the different products, and consider as insincere those
users for whom we get a negative correlation. Identifying offhand feedback strengthens the
dataset and reduces the chances of data discrepancy, and it gives a true measure of the
user's sincerity. The correlation values of all the users for Laptop are shown in
Table 6.13; user 5 has a negative correlation, and we excluded this user's feedback from
our recommendation process.

Let F be the function defining the sincerity (s) of user 'ui'; we have F(s,ui) = corr
(ui,uj); 1≤j≤10, j≠i

Thus, formally, the sincerity of a user 'ui' is defined as:

\[
F(s, u_i) \begin{cases} \geq 0, & \text{Sincere user} \\ < 0, & \text{Insincere user} \end{cases}
\]

A user's approach may vary with time, and hence it is important to check the user's
sincerity for each item explicitly. The separate test of sincerity for each item ensures
genuine user feedback for the respective products of all the items. We find the user's
sincerity for all the items in the same way as stated above for Laptop. The correlation
values for the items Printer, Head Phone, Tablet and Smart Phone are shown in Table 6.14 to
Table 6.17, respectively. A user with negative correlation, i.e. F(s,ui) < 0, is considered
insincere and is shown in bold. The items and the corresponding users whose offhand
feedback is excluded from the evaluation process are listed in Table 6.18.
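The sincerity test can be sketched as follows. Two points are assumptions for illustration: the correlation is taken to be Pearson's coefficient over the users' per-product scores, since the text only says "correlation", and a user's sincerity value is the average of that user's correlations with all users including the self-correlation of 1, which is how the Average column of Table 6.13 appears to be computed. The score vectors and names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long score vectors (0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def sincerity_scores(scores_by_user):
    """Average correlation of each user with every user (self included)."""
    users = list(scores_by_user)
    return {
        u: sum(pearson(scores_by_user[u], scores_by_user[w]) for w in users) / len(users)
        for u in users
    }


def sincere_users(scores_by_user):
    """Users whose average correlation is non-negative."""
    return [u for u, f in sincerity_scores(scores_by_user).items() if f >= 0]


# Hypothetical normalized scores of five products for four users.
feedback = {
    "u1": [0.9, 0.7, 0.5, 0.3, 0.1],
    "u2": [0.8, 0.6, 0.5, 0.2, 0.1],
    "u3": [0.9, 0.5, 0.6, 0.3, 0.2],
    "u4": [0.1, 0.2, 0.5, 0.8, 0.9],  # scores the products in the opposite order
}
print(sincerity_scores(feedback))
print(sincere_users(feedback))
```

In this toy data the fourth user disagrees with everyone else, receives a negative average correlation, and is therefore dropped before the rankings are aggregated.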



Table 6.13: Correlation values of different products of Laptop

User 1 user 2 user 3 user 4 user 5 user 6 user 7 user 8 user 9 user 10 Average

User 1 1.00 0.38 0.09 0.68 -0.04 0.53 0.79 0.72 0.76 0.65 0.56

user 2 0.38 1.00 0.43 0.85 -0.62 0.56 0.31 0.85 0.84 0.68 0.53

user 3 0.09 0.43 1.00 0.35 -0.70 0.28 0.00 0.35 0.34 0.35 0.25

user 4 0.68 0.85 0.35 1.00 -0.55 0.52 0.56 1.00 0.99 0.77 0.62

user 5 -0.04 -0.62 -0.70 -0.55 1.00 -0.24 0.22 -0.54 -0.54 -0.21 -0.22

user 6 0.53 0.56 0.28 0.52 -0.24 1.00 0.32 0.50 0.55 0.65 0.47

user 7 0.79 0.31 0.00 0.56 0.22 0.32 1.00 0.57 0.58 0.78 0.51

user 8 0.72 0.85 0.35 1.00 -0.54 0.50 0.57 1.00 1.00 0.77 0.62

user 9 0.76 0.84 0.34 0.99 -0.54 0.55 0.58 1.00 1.00 0.76 0.63

user 10 0.65 0.68 0.35 0.77 -0.21 0.65 0.78 0.77 0.76 1.00 0.62

Table 6.14: Correlation values of different products of Printer


User 1 1.00 0.77 0.77 0.60 0.95 0.88 0.02 0.89 0.81 0.82 0.75

user 2 0.77 1.00 0.90 0.47 0.82 0.67 -0.48 0.76 0.88 0.87 0.67

user 3 0.77 0.90 1.00 0.61 0.83 0.70 -0.42 0.77 0.93 0.92 0.70

user 4 0.60 0.47 0.61 1.00 0.70 0.62 -0.29 0.56 0.47 0.45 0.52

user 5 0.95 0.82 0.83 0.70 1.00 0.93 -0.03 0.89 0.85 0.84 0.78

user 6 0.88 0.67 0.70 0.62 0.93 1.00 0.16 0.76 0.68 0.67 0.71

user 7 0.02 -0.48 -0.42 -0.29 -0.03 0.16 1.00 -0.09 -0.25 -0.24 -0.06

user 8 0.89 0.76 0.77 0.56 0.89 0.76 -0.09 1.00 0.87 0.83 0.72

user 9 0.81 0.88 0.93 0.47 0.85 0.68 -0.25 0.87 1.00 0.99 0.72

user 10 0.82 0.87 0.92 0.45 0.84 0.67 -0.24 0.83 0.99 1.00 0.72



Table 6.15: Correlation values of different products of Head Phone

User 1 user 2 user 3 user 4 user 5 user 6 user 7 user 8 user 9 user 10 Average


User 1 1.00 0.68 0.67 0.87 0.87 0.96 -0.87 0.81 0.85 0.83 0.67

user 2 0.68 1.00 0.90 0.84 0.84 0.67 -0.48 0.76 0.88 0.87 0.70

user 3 0.67 0.90 1.00 0.88 0.88 0.76 -0.88 0.75 0.85 0.92 0.67

user 4 0.87 0.84 0.88 1.00 1.00 0.90 -1.00 0.89 0.98 0.94 0.73

user 5 0.87 0.84 0.88 1.00 1.00 0.90 -1.00 0.89 0.98 0.94 0.73

user 6 0.96 0.67 0.76 0.90 0.90 1.00 -0.90 0.77 0.85 0.82 0.67

user 7 -0.87 -0.48 -0.88 -1.00 -1.00 -0.90 1.00 -0.89 -0.98 -0.94 -0.69

user 8 0.81 0.76 0.75 0.89 0.89 0.77 -0.89 1.00 0.89 0.87 0.67

user 9 0.85 0.88 0.85 0.98 0.98 0.85 -0.98 0.89 1.00 0.94 0.72

user 10 0.83 0.87 0.92 0.94 0.94 0.82 -0.94 0.87 0.94 1.00 0.72

Table 6.16: Correlation values of different products of Tablet

User 1 user 2 user 3 user 4 user 5 user 6 user 7 user 8 user 9 user 10 Average


User 1 1.00 -0.02 -0.03 0.03 0.23 0.22 0.04 0.12 0.00 -0.07 0.15

user 2 -0.02 1.00 0.73 0.38 0.81 0.61 -0.92 0.62 0.89 0.93 0.50

user 3 -0.03 0.73 1.00 0.31 0.76 0.70 -0.87 0.83 0.92 0.89 0.52

user 4 0.03 0.38 0.31 1.00 0.70 0.43 -0.54 0.47 0.39 0.43 0.36

user 5 0.23 0.81 0.76 0.70 1.00 0.64 -0.89 0.64 0.83 0.84 0.55

user 6 0.22 0.61 0.70 0.43 0.64 1.00 -0.75 0.71 0.79 0.65 0.50

user 7 0.04 -0.92 -0.87 -0.54 -0.89 -0.75 1.00 -0.79 -0.96 -0.98 -0.57

user 8 0.12 0.62 0.83 0.47 0.64 0.71 -0.79 1.00 0.87 0.77 0.52

user 9 0.00 0.89 0.92 0.39 0.83 0.79 -0.96 0.87 1.00 0.95 0.57

user 10 -0.07 0.93 0.89 0.43 0.84 0.65 -0.98 0.77 0.95 1.00 0.54

Table 6.17: Correlation values of different products of Smart Phone

User 1 user 2 user 3 user 4 user 5 user 6 user 7 user 8 user 9 user 10 Average


User 1 1.00 0.75 0.77 0.66 0.96 0.98 -0.01 0.98 0.87 0.84 0.78

user 2 0.75 1.00 0.73 0.75 0.73 0.72 -0.31 0.75 0.81 0.83 0.68

user 3 0.77 0.73 1.00 0.56 0.82 0.77 -0.42 0.83 0.90 0.92 0.69

user 4 0.66 0.75 0.56 1.00 0.72 0.64 -0.13 0.73 0.47 0.54 0.59

user 5 0.96 0.73 0.82 0.72 1.00 0.95 -0.02 0.99 0.83 0.85 0.78

user 6 0.98 0.72 0.77 0.64 0.95 1.00 0.01 0.96 0.84 0.81 0.77

user 7 -0.01 -0.31 -0.42 -0.13 -0.02 0.01 1.00 -0.03 -0.24 -0.24 -0.04

user 8 0.98 0.75 0.83 0.73 0.99 0.96 -0.03 1.00 0.84 0.84 0.79

user 9 0.87 0.81 0.90 0.47 0.83 0.84 -0.24 0.84 1.00 0.98 0.73

user 10 0.84 0.83 0.92 0.54 0.85 0.81 -0.24 0.84 0.98 1.00 0.74



Table 6.18: List of users excluded after the user's sincerity analysis

Items User's list

Tablet User 7

Laptop User 5

Smart Phone User 7

Printer User 7

Head Phone User 7
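The exclusion rule can be illustrated with a minimal Python sketch. It assumes that the sincerity value F(s,ui) is the row average of the correlation matrix, which is consistent with the "Average" column of Tables 6.13 to 6.17; the names `AVERAGE_CORRELATION` and `insincere_users` are hypothetical and used only for illustration.

```python
# "Average" column of Tables 6.13 to 6.17 (index 0 is User 1, index 9 is User 10).
AVERAGE_CORRELATION = {
    "Laptop":      [0.56, 0.53, 0.25, 0.62, -0.22, 0.47, 0.51, 0.62, 0.63, 0.62],
    "Printer":     [0.75, 0.67, 0.70, 0.52, 0.78, 0.71, -0.06, 0.72, 0.72, 0.72],
    "Head Phone":  [0.67, 0.70, 0.67, 0.73, 0.73, 0.67, -0.69, 0.67, 0.72, 0.72],
    "Tablet":      [0.15, 0.50, 0.52, 0.36, 0.55, 0.50, -0.57, 0.52, 0.57, 0.54],
    "Smart Phone": [0.78, 0.68, 0.69, 0.59, 0.78, 0.77, -0.04, 0.79, 0.73, 0.74],
}

# Row of User 5 in Table 6.13 (Laptop); its mean is assumed to be F(s, u5).
LAPTOP_USER5_ROW = [-0.04, -0.62, -0.70, -0.55, 1.00, -0.24, 0.22, -0.54, -0.54, -0.21]
user5_average = sum(LAPTOP_USER5_ROW) / len(LAPTOP_USER5_ROW)


def insincere_users(averages):
    """Return the users whose average correlation is negative."""
    return [f"User {i}" for i, value in enumerate(averages, start=1) if value < 0]


# One list of excluded users per item.
excluded = {item: insincere_users(avgs) for item, avgs in AVERAGE_CORRELATION.items()}
```

Under this assumption the sketch flags User 5 for Laptop and User 7 for the other four items, in agreement with Table 6.18.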

6.6.4 Product Preference Score

The user's sincerity measure removes inconsistent feedback from the data, if any is
present. Next, we set a criterion for the preference of the user. The value of ϕ obtained
in equation 6.2 gives the importance of a product in the eyes of a user. The author of
[19] used a similar formula but did not set any criterion of preference: every product
whose review page the user merely visited was treated as a preferred product. However,
a user may visit a page for various reasons, and a visit does not necessarily indicate
that the product conforms to the user's choice. Therefore, we formulate the
criteria of preference as follows.

We make the following assumptions to set the criteria of preference:

1) A user may have no need to save or email the reviews, and the non-availability
of a printer may mean that no document is ever printed. Thus the Boolean
variables E, P and S used in equation 6.2 may all be zero for the review of a
product that is nevertheless preferred by the user.

2) If a user visits the review link of a product within the first six clicks, i.e. the
position of the visit in the click sequence does not exceed 6 out of the 10
products of an item (a 60% chance of being visited), we say that the product
has importance in the eyes of the user. In other words, Vij may take the values
1, 2, 3, 4, 5 or 6 only, which means (with 1/6 ≈ 0.167 taken as 0.16)

(1/Vij) ≥ 0.16 ----------------- (A)

3) The time taken to read a review should be at least 40 seconds, as we consider
that a review consists of 400 bytes on average and we assume that the reading
speed of a user is 10 bytes/second [22]. Referring to equation (6.2), t ≥ 40
corresponds to tij ≥ 2.

And Tij ≥ tij/tmax;

Tij ≥ 2/5;

Tij ≥ 0.4 ---------- (B)

Putting the values from inequalities (A) and (B) in equation (6.2), and considering
P = S = E = 0, we get

ϕij ≥ 0 + 0 + 0 + 0.16 + 0.4;

ϕij ≥ 0.56

From equation (6.3), we get:

δij ≥ 0.112 ---------- (C)

Thus we define a function preferred(i, j) to set the criteria of preference as

preferred(i, j) = 1, if δij ≥ 0.112

preferred(i, j) = 0, otherwise

The criteria of preference may be written as: if preferred(i, j) = 1, the product is
preferred by the customer; if preferred(i, j) = 0, the product is not preferred by
the customer. Thus, a product whose normalized quantified vector score is at least
0.112 is considered the customer's choice; otherwise it is neglected. We tabulate
the criteria of preference in Table 6.19.

Table 6.19: Criteria of preference for a product to be preferred by a user

User's choice Value of preferred(i, j)

Preferred 1

Not preferred 0
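The threshold derivation and the function preferred(i, j) can be sketched in Python as follows. The sketch assumes that equation (6.3) normalizes ϕ by its maximum value of 5 (which reproduces 0.56/5 = 0.112); that normalization is inferred from the numbers above and is not confirmed here. The names `PHI_MAX`, `DELTA_THRESHOLD` and `preferred` are illustrative.

```python
PHI_MAX = 5.0            # assumption: divisor used by equation (6.3)
DELTA_THRESHOLD = 0.112  # inequality (C)

# Boundary case from the text: E = P = S = 0, 1/V = 0.16 (A), T = 0.4 (B).
phi_min = 0 + 0 + 0 + 0.16 + 0.4
delta_min = phi_min / PHI_MAX


def preferred(delta):
    """Return 1 if the normalized score meets the threshold, otherwise 0."""
    return 1 if delta >= DELTA_THRESHOLD else 0


# Two scores of the same user for two laptops: one at the threshold, one below it.
decisions = [preferred(0.112), preferred(0.108)]
```

A score exactly at the threshold counts as preferred, while a score just below it does not.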

Table 6.20 shows the Normalized Products Importance Score (NPIS) for Laptop. User 5
is excluded from the process, as this user was identified as insincere. The NPIS values
that are less than the threshold value, i.e. 0.112, are marked in bold.

Table 6.20: Normalized Products Importance Score for Laptop

Average Score of Products User 1 User 2 User 3 User 4 User 6 User 7 User 8 User 9 User 10

L1 0.92 0.62 0.186 0.96 0.62 0.72 0.72 0.72 0.56

L2 0.53 0.68 0.52 0.62 0.76 0.46 0.62 0.66 0.66

L3 0.58 0.36 0.38 0.386 0.186 0.385 0.386 0.386 0.36

L4 0.346 0.53 0.53 0.33 0.57 0.348 0.37 0.37 0.386

L5 0.36 0.508 0.352 0.32 0.08 0.545 0.32 0.36 0.37

L6 0.302 0.512 0.52 0.312 0.305 0.302 0.312 0.112 0.312

L7 0.272 0.386 0.105 0.268 0.102 0.3 0.108 0.108 0.108

L8 0.308 0.305 0.348 0.265 0.1 0.312 0.105 0.105 0.105

L9 0 0.102 0.302 0.062 0.16 0.36 0 0 0.342

L10 0.509 0.1 0.3 0.06 0.352 0.37 0 0.102 0.34

6.6.5 User Personalized Ranking

Once the criteria of preference are set, we obtain for each user a list of preferred
products by arranging the products for which preferred(i, j) = 1 in descending order
of their scores. Since products below the threshold are dropped, this may lead to a
partial list. The ranking of laptops based on the product preference score (PPS), i.e.
the NPIS values that meet the threshold, for all the concerned users is given in
Table 6.21. It is a partial list.

Table 6.21: Ranking of laptop by different users based on product preference score

Ranked position User 1 User 2 User 3 User 4 User 6 User 7 User 8 User 9 User 10

1 L1 L2 L4 L1 L2 L1 L1 L1 L2

2 L3 L1 L2 L2 L1 L5 L2 L2 L1

3 L2 L4 L6 L3 L4 L2 L3 L3 L4

4 L10 L6 L3 L4 L10 L3 L4 L4 L5

5 L5 L5 L5 L5 L6 L10 L5 L5 L3

6 L4 L7 L8 L6 L3 L9 L6 L6 L9

7 L8 L3 L9 L7 L9 L4 L7 - L10

8 L6 L8 L10 L8 L5 L8 L8 - L6

9 L7 - L1 - - L6 - - -

10 - - - - - L7 - - -
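As an illustration, the partial rankings of Table 6.21 can be reproduced from the scores of Table 6.20 with a short Python sketch. Only the columns of User 1 and User 9 are copied here; the function name `personalized_ranking` is hypothetical.

```python
THRESHOLD = 0.112  # preference threshold from inequality (C)

# NPIS values of User 1 and User 9 for the ten laptops, taken from Table 6.20.
NPIS = {
    "User 1": {"L1": 0.92, "L2": 0.53, "L3": 0.58, "L4": 0.346, "L5": 0.36,
               "L6": 0.302, "L7": 0.272, "L8": 0.308, "L9": 0.0, "L10": 0.509},
    "User 9": {"L1": 0.72, "L2": 0.66, "L3": 0.386, "L4": 0.37, "L5": 0.36,
               "L6": 0.112, "L7": 0.108, "L8": 0.105, "L9": 0.0, "L10": 0.102},
}


def personalized_ranking(scores, threshold=THRESHOLD):
    """Keep products meeting the threshold and order them by descending score."""
    kept = [(product, score) for product, score in scores.items() if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [product for product, _ in kept]


rankings = {user: personalized_ranking(scores) for user, scores in NPIS.items()}
```

User 9 obtains a list of only six laptops because four scores fall below the threshold, which is why the lists in Table 6.21 are partial.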



Algorithm 6.1: Positional Rank Aggregation

1: Repeat for x = 1 to m

{

2: Repeat for y = 1 to n

{

3: find the ranked position of product 'x' in the ranking of user 'y', say it is 'k'; k ∈ [0, m]

4: If (k != 0) /* ranking of product 'x' is present for user 'y' */

{

5: compute score 'S(x,y)' for product 'x' by user 'y':

S(x,y) = [(m+k) - {(2*k) - 1}];

}

6: else /* product 'x' is missing in the ranking of user 'y', i.e. k = 0 */

{

S(x,y) = 0;

}

}

7: S_x = \sum_{y=1}^{n} S(x,y)

}

8: Sort the products 'x', x ∈ [1, m], in descending value of S_x; this arrangement gives the
aggregated ranked list of products over all users

We apply a rank aggregation algorithm to obtain a single list that may be considered
the final ranking by the users. Suppose there are 'm' different products, which acquire
different positions in the rankings given by the respective users, and 'n' users
involved in the ranking process. The procedure for finding the aggregated ranking of
the products is given in Algorithm 6.1.
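A minimal Python sketch of Algorithm 6.1 is given below, under the reading that the per-user scores are summed for each product and the products are then sorted once by their totals. The example rankings and the names `positional_score` and `aggregate_rankings` are hypothetical.

```python
def positional_score(m, k):
    """Score S(x, y) of step 5; k = 0 means the product is missing for that user."""
    if k == 0:
        return 0
    return (m + k) - ((2 * k) - 1)  # simplifies to m - k + 1


def aggregate_rankings(products, user_rankings):
    """Sum the positional scores over all users and sort products by the total."""
    m = len(products)
    totals = {}
    for x in products:
        total = 0
        for ranking in user_rankings:
            # 1-based position of x in this user's (possibly partial) ranking.
            k = ranking.index(x) + 1 if x in ranking else 0
            total += positional_score(m, k)
        totals[x] = total
    ordered = sorted(products, key=lambda x: totals[x], reverse=True)
    return ordered, totals


# Hypothetical data: four products, three users, two of them with partial lists.
products = ["A", "B", "C", "D"]
user_rankings = [["A", "B", "C", "D"], ["B", "A", "D"], ["A", "C"]]
ordered, totals = aggregate_rankings(products, user_rankings)
```

With m products, the first position earns m points and each lower position earns one point less, so a product missing from a partial list simply contributes nothing to its total.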


We have two different rankings for the products of the 5 different items. The first is
the system ranking and the other is the users' aggregated ranking. The system ranking
is the final ranking recommended in [6], and the users' aggregated ranking is obtained
by the proposed comprehensive approach discussed in section 6.5. We evaluate the
system ranking on the basis of the users' aggregated ranking. The values of the
different measures for the comprehensive approach are discussed in section 6.7, and
the results of the comparison of the proposed approach with other existing
techniques are discussed in section 6.8.

We have adopted various measures to evaluate the system using the proposed
comprehensive approach (C.A). The evaluation measures frequently used to
evaluate recommender systems are MRR, MAP, P@k, FPR@k, FNR@k and the
Spearman rank correlation. These measures are employed to evaluate the
recommender system proposed in [6]. Each measure is discussed separately in the
subsequent sections. The values obtained for these measures are tabulated and also
shown graphically in the respective subsections.

6.7.1 Mean Reciprocal Rank obtained using Comprehensive Approach

The Mean Reciprocal Rank (MRR) is discussed in section 6.3.6. For each item, we find
the reciprocal rank of the product ranked first by the system, using the comprehensive
ranking of that item. The value of the Reciprocal Rank (RR) is 1 for all the items except
Laptop, which implies that the first ranked product of every item (except Laptop) in
the system ranking is also ranked first by the comprehensive ranking. For Laptop, the
first ranked product of the system ranking takes the second position in the
comprehensive ranking, which gives RR = 0.5. The mean reciprocal rank (MRR) comes
out to be 0.9. The values are shown in Table 6.22 and depicted in Figure 6.11.

Table 6.22: Mean Reciprocal Rank of first ranked product of different items

Products RR

Laptop 0.5

Head Phone 1

Smart Phone 1

Tablet 1

Printer 1

MRR 0.9
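The values of Table 6.22 can be checked with a small Python sketch. The positions are those described in the text (the system's first ranked laptop takes the second position in the comprehensive ranking, all others the first); the variable names are illustrative.

```python
# Position, in the comprehensive ranking, of the product ranked first by the system.
POSITION_OF_SYSTEM_TOP_PRODUCT = {
    "Laptop": 2,
    "Head Phone": 1,
    "Smart Phone": 1,
    "Tablet": 1,
    "Printer": 1,
}

# Reciprocal rank per item, then the mean over the five items.
reciprocal_rank = {item: 1.0 / pos for item, pos in POSITION_OF_SYSTEM_TOP_PRODUCT.items()}
mrr = sum(reciprocal_rank.values()) / len(reciprocal_rank)
```

The mean of 0.5 and four values of 1 is 4.5/5, i.e. the MRR of 0.9 reported in the table.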



Figure 6.11: Mean Reciprocal Rank of top rank-position for respective items using

Comprehensive Approach.

6.7.2 Precision@10 obtained using Comprehensive Approach

The values of P@1 to P@10 of the user's ranking for the respective items are given in
Table 6.23, which shows how the precision varies across the top positions. For
example, P@1 for Laptop is 0, because the first ranked product in the system ranking
gets the second position, and not the first, in the user's ranking. In contrast, P@1 for
Head Phone, Smart Phone, Tablet and Printer is 1.

Table 6.23: Values of precision at k for different items

P@1 P@2 P@3 P@4 P@5 P@6 P@7 P@8 P@9 P@10

Laptop 0 1 1 1 1 1 0.86 0.875 0.89 1

Head Phone 1 1 1 0.75 1 1 1 1 1 1

Smart Phone 1 0.5 1 1 1 1 1 0.875 0.89 1

Tablet 1 1 1 0.75 1 1 1 0.875 0.89 1

Printer 1 1 1 1 1 1 1 0.875 0.89 1



Figure 6.12: P@k for different items using Comprehensive Approach

Figure 6.12 depicts P@k graphically for k = 1 to k = 10, to illustrate the precision
values for the top 10 positions. It is clear that the recommender system gives
100% precise recommendations for the top 3 and top 5 positions.
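The behaviour of P@k can be illustrated with a short Python sketch. It assumes that P@k is the share of the system's top k products that also appear in the user's top k, which is consistent with the Laptop values (P@1 = 0, P@2 = 1) but is not the definition quoted from section 6.3. The two rankings and the name `precision_at_k` are hypothetical.

```python
def precision_at_k(system_ranking, user_ranking, k):
    """Fraction of the system's top-k products that are also in the user's top k."""
    top_system = set(system_ranking[:k])
    top_user = set(user_ranking[:k])
    return len(top_system & top_user) / k


# Hypothetical rankings: the first two and the last two products are swapped.
system_ranking = ["P1", "P2", "P3", "P4", "P5"]
user_ranking = ["P2", "P1", "P3", "P5", "P4"]

precisions = [precision_at_k(system_ranking, user_ranking, k) for k in range(1, 6)]
```

A swap of two neighbouring products lowers the precision only at the position where it occurs; at the next position both products are inside the top k and the precision returns to 1.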

6.7.3 Mean Average Precision obtained using Comprehensive Approach

The value of the MAP is given in Table 6.24 and a graphical representation is
shown in Figure 6.13. The high values of MAP for the respective items indicate the
good quality of the recommender system.

Table 6.24: Mean Average Precision for different products

Products MAP

Laptop 0.8625

Head Phone 0.975

Smart Phone 0.9265

Tablet 0.9515

Printer 0.9765
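The tabulated MAP values coincide with the mean of P@1 to P@10 of each item in Table 6.23. The following Python sketch reproduces them under that reading, which is inferred from the numbers rather than taken from the definition in section 6.3; the variable names are illustrative.

```python
# P@1 to P@10 per item, copied from Table 6.23.
PRECISION_AT_K = {
    "Laptop":      [0, 1, 1, 1, 1, 1, 0.86, 0.875, 0.89, 1],
    "Head Phone":  [1, 1, 1, 0.75, 1, 1, 1, 1, 1, 1],
    "Smart Phone": [1, 0.5, 1, 1, 1, 1, 1, 0.875, 0.89, 1],
    "Tablet":      [1, 1, 1, 0.75, 1, 1, 1, 0.875, 0.89, 1],
    "Printer":     [1, 1, 1, 1, 1, 1, 1, 0.875, 0.89, 1],
}

# Mean of the ten precision values per item.
mean_precision = {item: sum(values) / len(values) for item, values in PRECISION_AT_K.items()}

# Mean over the five items.
overall_map = sum(mean_precision.values()) / len(mean_precision)
```

The per-item means match Table 6.24, and their mean over the five items is about 0.9384.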

Figure 6.13: Mean Average Precision using Comprehensive Approach



6.7.4 FPR@10 obtained using Comprehensive Approach

The values of FPR@1 to FPR@10 are shown in Table 6.25. FPR@3 and FPR@5 come
out to be 0 for all the items, i.e. for the top 3 and top 5 positions the recommendation
has no false positive error. The zero value of FPR@k for k = 3 and k = 5 does not mean
that the system is free from error; it simply indicates that, given the changes in the
ranked positions of the products, the value of FPR@k comes out to be zero at these
positions. For other values of k we get non-zero values of FPR@k, which shows that the
system is not biased, as it exhibits error for other values of k. The zero values of the
False Positive Rate at different ranking positions also represent the degree of
preciseness of the system.

We define the average of FPR@k as follows:

\text{Avg. FPR@}k = \frac{1}{k} \sum_{i=1}^{k} \text{FPR@}i ---------------------------------- (6.11)

The Avg. FPR@k gives a more accurate measure of the fallacy of the system. Avg.
FPR@5 and Avg. FPR@10 for the system are presented in Table 6.26.

Table 6.25: Values of FPR@10 for different products

Laptop Head Phone Smart Phone Tablet Printer

FPR@1 1 0 0 0 0

FPR@2 0 0 0.5 0 0

FPR@3 0 0 0 0 0

FPR@4 0 0.25 0 0.25 0

FPR@5 0 0 0 0 0

FPR@6 0 0 0 0 0

FPR@7 0.14 0 0 0 0

FPR@8 0.125 0 0.125 0.125 0.125

FPR@9 0.11 0 0.11 0.11 0.11

FPR@10 0 0 0 0 0



Table 6.26: Avg. FPR@5 and Avg. FPR@10 for all the items

Laptop Head Phone Smart Phone Tablet Printer Mean of Average FPR for all products

Avg FPR@5 0.2 0.05 0.1 0.05 0 0.08

Avg FPR@10 0.1375 0.025 0.0735 0.0485 0.0235 0.0616
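Equation (6.11) can be checked against the tables with a brief Python sketch that averages the FPR@i values of Table 6.25; the names `FPR_AT_K` and `average_fpr` are illustrative.

```python
# FPR@1 to FPR@10 per item, copied from Table 6.25.
FPR_AT_K = {
    "Laptop":      [1, 0, 0, 0, 0, 0, 0.14, 0.125, 0.11, 0],
    "Head Phone":  [0, 0, 0, 0.25, 0, 0, 0, 0, 0, 0],
    "Smart Phone": [0, 0.5, 0, 0, 0, 0, 0, 0.125, 0.11, 0],
    "Tablet":      [0, 0, 0, 0.25, 0, 0, 0, 0.125, 0.11, 0],
    "Printer":     [0, 0, 0, 0, 0, 0, 0, 0.125, 0.11, 0],
}


def average_fpr(values, k):
    """Equation (6.11): mean of FPR@1 ... FPR@k."""
    return sum(values[:k]) / k


avg_at_5 = {item: average_fpr(values, 5) for item, values in FPR_AT_K.items()}
avg_at_10 = {item: average_fpr(values, 10) for item, values in FPR_AT_K.items()}

# Mean over the five items (last column of Table 6.26).
mean_avg_at_5 = sum(avg_at_5.values()) / len(avg_at_5)
mean_avg_at_10 = sum(avg_at_10.values()) / len(avg_at_10)
```

The results agree with Table 6.26: the means over the five items are 0.08 for k = 5 and 0.0616 for k = 10.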

Figure 6.14: Average FPR@5 using Comprehensive Approach

Figure 6.15: Average FPR@10 using Comprehensive Approach

The average FPR values above are depicted in Figure 6.14 and Figure 6.15. The values
indicate the performance of the system being evaluated using the comprehensive
approach.

6.7.5 FNR@10 obtained using Comprehensive Approach

Table 6.27 shows the values of FNR at different positions. FNR@3 and FNR@5 are
zero, which shows that the recommender system has no false negative error for the
top 3 and top 5 positions, respectively.

We define the average of FNR@k as follows:

\text{Avg. FNR@}k = \frac{1}{k} \sum_{i=1}^{k} \text{FNR@}i ----------------------------------------- (6.12)

FNR@k works similarly to FPR@k in assessing the system's accuracy in terms of its
prediction of the user's choices. The values of Avg. FNR@k for k = 5 and k = 10 are
given in Table 6.28 and shown in Figure 6.16 and Figure 6.17, respectively.

Table 6.27: Values of FNR@10 for different products

Laptop Head Phone Smart Phone Tablet Printer

FNR@1 1 0 0 0 0

FNR@2 0 0 0.5 0 0

FNR@3 0 0 0 0 0

FNR@4 0 0.25 0 0.25 0

FNR@5 0 0 0 0 0

FNR@6 0 0 0 0 0

FNR@7 0.14 0 0 0 0

FNR@8 0.125 0 0.125 0.125 0.125

FNR@9 0.11 0 0.11 0.11 0.11

FNR@10 0 0 0 0 0

Table 6.28: Avg. FNR@5 and Avg. FNR@10 for all the items

Laptop Head Phone Smart Phone Tablet Printer Mean of Average FNR for all products

Avg FNR@5 0.2 0.05 0.1 0.05 0 0.08

Avg FNR@10 0.1375 0.025 0.0735 0.0485 0.0235 0.0616

Figure 6.16: Average FNR@5 using Comprehensive Approach

Figure 6.17: Average FNR@10 using Comprehensive Approach

6.7.6 Spearman Correlation value using Comprehensive Approach

We find the Spearman correlation coefficient between the system ranking and the
user ranking. The values are shown in Table 6.29 and depicted in Figure 6.18. It is
evident from the obtained values that the two rankings are highly correlated.

Table 6.29: Spearman correlation coefficient for different products

Products Spearman Correlation Coefficient value

Laptop 0.9030

Head Phone 0.9878

Smart Phone 0.9515

Tablet 0.9515

Printer 0.9636
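For two rankings of the same n products without ties, the Spearman coefficient can be computed from the squared position differences. The Python sketch below uses this standard formula on hypothetical positions in which only the first two of ten products are swapped; the actual rankings behind Table 6.29 are not reproduced here.

```python
def spearman_rho(positions_a, positions_b):
    """Spearman rank correlation for two tie-free rankings of the same items."""
    n = len(positions_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(positions_a, positions_b))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))


# Hypothetical example with ten products: the first two positions are swapped.
system_positions = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
user_positions = [2, 1, 3, 4, 5, 6, 7, 8, 9, 10]

rho = spearman_rho(system_positions, user_positions)
```

With n = 10 the denominator is 990, so a single swap of neighbouring products (sum of squared differences 2) gives a coefficient of about 0.988, close to the value reported for Head Phone.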



Figure 6.18: Spearman correlation coefficient between system ranking and Comprehensive

Approach based ranking

6.8 Relative Performance of the Recommender Systems using Proposed

Comprehensive Approach and other Existing Evaluation Approaches

In this section we discuss the results obtained by comparing the proposed
comprehensive approach with the existing Average Scoring based technique [40] and
the Rank Aggregation based technique [39]. Section 6.7 gives the values of the metrics
calculated using the comprehensive approach. The relative performance of the system
under evaluation, using the proposed comprehensive approach and the related work
[39], [40], is analyzed and the corresponding values are tabulated below.

Table 6.30 lists the metrics which measure accuracy. Higher values of these
evaluation metrics imply that the examined system performs better. Table 6.31
presents the evaluation parameters which measure error. Lower values of these
parameters indicate less error in the examined system.

We obtain a Comprehensive Veracity Measure (CVM) for the system using all three
approaches, which helps in assessing the performance of the system under different
scenarios. CVM is given by

\text{CVM} = \frac{\text{Sum of values of different evaluation metrics}}{\text{Total number of metrics}} ----------------- (6.13)

\text{CVM} = \frac{\text{MRR} + \text{P@5} + \text{MAP} + \text{Spearman correlation coefficient}}{4} --- (6.14)



Thus, the CVM for the Comprehensive Approach (C.A) is referred to as 'CVM (CA)' and
is calculated as

CVM (CA) = (0.9 + 1 + 0.9384 + 0.9514)/4

CVM (CA) = 0.94745
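Equation (6.14) can be applied to all three approaches with a short Python sketch. For the two existing techniques, the P@5 values 0.856 and 0.96 are paired with the approach for which they reproduce the tabulated CVM; this pairing is an assumption of the sketch, and the names `METRICS` and `cvm` are illustrative.

```python
# (MRR, P@5, MAP, Spearman correlation coefficient) per evaluation approach.
METRICS = {
    "Proposed Comprehensive Approach": (0.9, 1.0, 0.9384, 0.9514),
    # Assumption: P@5 values paired so that the tabulated CVM is reproduced.
    "Average Scoring based technique [40]": (1.0, 0.856, 0.9501, 0.9030),
    "Rank Aggregation based technique [39]": (1.0, 0.96, 0.866, 0.9363),
}


def cvm(values):
    """Equation (6.14): mean of the four accuracy metrics."""
    return sum(values) / len(values)


cvm_values = {approach: cvm(values) for approach, values in METRICS.items()}
```

Because the CVM is a plain mean, each metric carries the same weight, so an approach with a lower MRR can still obtain the highest CVM if its other three metrics are higher.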

Table 6.30: Mean Reciprocal Rank, P@5, Mean Average Precision and Spearman correlation
coefficient for different approaches

Approach                                 Mean Reciprocal Rank (MRR)   P@5     Mean Average Precision (MAP)   Spearman Correlation Coefficient   Comprehensive Veracity Measure (CVM)
Proposed Comprehensive Approach          0.9                          1       0.9384                         0.9514                             0.94745
Average Scoring based technique [40]     1                            0.856   0.9501                         0.9030                             0.927275
Rank Aggregation based technique [39]    1                            0.96    0.866                          0.9363                             0.940575

Figure 6.19: Comprehensive Veracity Measure of different approaches

Similarly, we find the CVM for the previous approaches. The values of CVM for the
average scoring based technique, the rank aggregation approach and the comprehensive
approach are shown in Table 6.30 and represented in Figure 6.19.

The MRR values for the existing evaluation approaches indicate that the system
has recommended all the first ranked products of the different items exactly as these
approaches suggest. However, with the comprehensive approach we get an MRR of 0.9:
the first ranked product is the same in the system ranking and the comprehensive
ranking for every item except Laptop, where it takes the second position. The average
P@5 over all the items for the different evaluation approaches is also shown in
Table 6.30. The maximum value of P@5 for the recommender system is obtained by the
comprehensive approach. However, the minimum value of P@5 is 0.86, which is good
enough for the recommender system to be considered accurate in terms of precision.

The Mean Average Precision (MAP) over all ranking positions is shown for the
different evaluation approaches. Interestingly, the system has a very good MAP under
all the evaluation approaches, with the maximum MAP obtained by the Average
Scoring based approach.

We have also shown the Spearman correlation of the rankings of all the evaluation
approaches with that of the recommender system under study. It is evident from the
values that all the approaches are highly correlated with the system ranking; however,
the comprehensive approach is more correlated than the other approaches. The high
correlation of the system ranking with the rankings of the evaluation approaches
indicates that the system ranking is very close to the user's choice, and that the
system is good enough to be chosen as a recommender system for the
recommendation of various products.

Also, observing the 'comprehensive veracity measure', it is found that all 4 metrics
measuring the veracity of the system have close values. The proposed approach and
the 'rank aggregation based approach' show very similar results, whereas the average
scoring based technique differs slightly. The reason is that both of the former methods
incorporate an aggregation algorithm over the ranked items of the users, whereas the
average scoring technique relies upon numerical values assigned to the different
products. Since a small data set is used in this work, we expect that on a larger data
set the comprehensive approach would evaluate the system veracity more accurately,
as insincere users can be detected, whereas the other two methods would not be able
to trace these issues.

Similarly, the measures of fallacy, FPR@k and FNR@k, indicate how well the system
can guard against the user's dissatisfaction and avoid recommendations that the
user is reluctant to accept.

In Table 6.31, the values of FPR@5 and FNR@5 for the different approaches are
listed. Since the preference criteria and the user's sincerity are not included in the
two studies [40], [39], those evaluation approaches are error prone and report errors
which are basically due to the existence of outliers in the data. Consequently, the
average scoring based technique encounters the maximum FPR@5, which is 0.144;
interestingly, the veracity measure for this technique is also high, which is a
contradiction. These measures highlight the loophole in the previous approaches,
which is overcome by the proposed comprehensive approach. Thus, it can be
concluded that the proposed comprehensive approach outperforms the other existing
techniques.

Also, FPR@5 is not the overall error score; it just indicates how precise the
recommendation is for the top k positions. Thus, the value of FPR@k for different k
represents the degree of bias that the system has. If FPR@k remains the same for
each k, it implies that the system is biased and does not perform consistently. The
values are shown in Figure 6.20 and Figure 6.21.

Table 6.31: FPR@5 and FNR@5 for different approaches

Approach                                 Average FPR@5 for all items   Average FNR@5 for all items
Proposed Comprehensive Approach          0.08                          0.08
Average Scoring based technique [40]     0.144                         0.144
Rank Aggregation based technique [39]    0.04                          0.04

Figure 6.20: Average FPR@5 for all items



Figure 6.21: Average FNR@5 for all items

6.8.1 Comparison of Proposed Comprehensive Approach with Existing

Evaluation Strategies

The present work suggests an approach to evaluate a recommender system. The
evaluation is based on implicit feedback. A local dataset is used, which is created by
observing the users' behavior and the implicit record of their activity over the
provided links and reviews of the items. The users' behavior is observed for those
selected items which were sent to them, taken from the dataset described by
the author of [243].

Different evaluation studies have been reported in the literature, and the
respective strategies are discussed in section 6.2. We do not have any mathematical
model or simulation technique to compare evaluation approaches that use such diverse
data and extensive methods. Thus, we have chosen different factors which have been
incorporated in the proposed evaluation system and which have been used by other
evaluation approaches as well. The advantages of employing these techniques, and
why the proposed approach should be preferred, are discussed below. Table 6.32
presents a comparative study which illustrates how the proposed approach has
advantages over the evaluation techniques existing in the literature. Seven different
factors have been considered. From the table it is evident that nearly all the existing
evaluation studies have proposed a new framework for evaluation or have suggested a
new metric to evaluate the system. Only Herlocker et al. [186] have not proposed a new
framework; however, they have studied the evaluation schemes for RS thoroughly and
have provided a detailed discussion of these approaches, including more than 6
metrics for the purpose.

Table 6.32: Comparison of proposed Comprehensive Approach with existing evaluation
strategies

Evaluation approaches   New metric/framework proposed   No. of metrics used ≥ 6   Sincerity check of users   Mathematical formulation for criteria of preference   Explicit/Implicit feedback from users   Experimental analysis with existing RS

Schroder et al. [266]

Herlocker et al. [186]

Sohail et al. [39], [40]

Olmo & Gaudioso [267]

Cremonesi et al. [268]

Shani and Gunawardana [51]

Proposed Comprehensive Approach

Almost all the approaches have used six or more metrics in the evaluation
process. However, Sohail et al. [40] and Olmo & Gaudioso [267] have used fewer than
six metrics. We have used 6 metrics to evaluate the system with the proposed
approach. Five of the seven approaches have also employed explicit/implicit feedback
from users: either they describe in detail how the feedback is incorporated, or they
have collected the feedback from users themselves.

In the same way, not all the techniques have experimentally evaluated a system
with their proposed approach. In the previous work, the authors have evaluated the
system but have not discussed the differences between their adopted approach and
the approaches used by other researchers in the literature. The proposed approach
takes advantage over the others through the sincerity check of the users and the
well-defined criteria for the preference of the users. The proposed Comprehensive
Approach (CA) includes all the factors, whereas no other evaluation strategy provides
a user sincerity check and only one explicitly defines the criteria of selection and
advises a mechanism to decide the threshold. This indicates the superiority of the
proposed approach.



6.9 Summary

In this chapter, we have put forward two different user feedback based frameworks for
the evaluation of Recommender Systems (RS). The first framework depends upon
explicit feedback, whereas the other utilizes implicit feedback. The evaluation of
RS based on explicit feedback is used to evaluate the book recommendation
techniques employed in this work and discussed in chapters 3, 4 and 5. The implicit
feedback based evaluation mechanism is used to evaluate the RS presented in [243].
The reason behind choosing explicit feedback for the evaluation of the book
recommender systems proposed in this work is the integrity of the experts whose
feedback is taken as the base of the evaluation process, whereas the system proposed
in [243] is meant for general-purpose, daily-need commodities, and hence the user's
sincerity and authenticity must be examined. This is why the implicit feedback
mechanism is applied to it.

Through implicit feedback, a comprehensive approach for the evaluation of

recommender systems is suggested. The proposed methodology tries to measure

sincerity of the users who provide their feedback for the evaluation of the

recommender system. The computation of the user's sincerity measure eliminates the

feedback of insincere users and in turn, makes the evaluation process reliable. Further,

the methodology outlines a procedure to decide whether a specific product, whose

review site is visited by the user, is to be considered as a product preferred by the user

or not. Hence, the proposed evaluation approach is poised to be a fairly realistic

approach and better than the other related evaluation techniques, which do not have

any provision to measure the user's sincerity and which consider any product, whose

review site is visited by the user, as a product preferred by the user.

The proposed comprehensive approach is used to evaluate the performance of a

recommender system and the result of the evaluation is presented. We compare the

aggregated ranking of products obtained in the comprehensive evaluation approach

with the recommender system's ranking of the products, and compute the values of a

good number of evaluation metrics, namely Spearman correlation coefficient,

precision at k, Mean Average Precision (MAP), Mean Reciprocal Rank (MRR), false

positive rate (FPR) and false negative rate (FNR). We also evaluate the performance

of the recommender system using two other related evaluation techniques proposed in

[19, 18] and compute the values of the same set of evaluation metrics. Since we do
not have the true ranking of the products recommended by the recommender system,
we cannot decide objectively which of the three evaluation techniques is able to
evaluate the recommender system most accurately. Hence, the values of the
evaluation metrics obtained using the three evaluation techniques are compared and
the results of this comparison are pictorially represented.

Since the comprehensive evaluation approach is a fairly realistic approach as

discussed above, the high values of Spearman correlation coefficient, precision at k,

Mean Average Precision (MAP), Mean Reciprocal Rank (MRR) and low values of

false positive rate and false negative rate, obtained for the comprehensive evaluation

approach, clearly indicate that the recommender system under evaluation performs

well. Hence, the products recommended by the recommender system may satisfy the

user and the user may purchase them.

More than 75 experts were contacted for 100 books, and a few of them were given
more than one course. These experts are in several universities in India, KSA, Iraq,
Iran, Jordan and the USA, and a few of them are in leading tech companies in these
countries. That is why we have selected the explicit feedback evaluation scheme for
examining the proposed book recommendation techniques.

The final result for the methodology which utilizes explicit feedback is
summarized in the respective tables. Table 6.16 shows the values of the parameters
which estimate errors, and Table 6.17 has the values of the parameters indicating
precision. The technique with higher values of the precision parameters is treated
as the better one.

The results show that amongst the different approaches used for the
recommendation of books, namely PAS, OWA with quantifiers 'most', 'at least half'
and 'as many as possible', ORWA and opinion mining, the approach most preferred in
the evaluation by experts is book recommendation using the opinion mining technique.


Chapter 7

Conclusion and Future Direction

7.1 Introduction

The research work carried out in this thesis aimed at exploring the role of opinion

mining, a sub-branch of web mining, in recommender systems and how it can

overcome the prevailing issues in the concerned techniques. In this last chapter, the

research findings and our contributions are summarized, and future directions for the

research are highlighted. In Section 7.2, the concluding remarks on the work

carried out in the respective chapters are discussed. The section describes the pros and

cons of the adopted approaches and addresses the variation in the recommendation

results due to the change in technique. In Section 7.3, the limitations of the adopted

techniques and the scope of future work are discussed.

7.2 Conclusion

We have reviewed the state of the art in recommender systems. A detailed description of

the various techniques, with diagrammatic representations and common examples,

is given in Chapter 2. These details make it easy to understand the approaches adopted

in recommender system design. The contributions of researchers on the topic are

highlighted and compared. The study also reveals the major

flaws in the leading existing techniques.

To overcome the drawbacks identified during the study of the literature, we have

proposed opinion mining as a solution. In addition to the recommendation method,

we have also suggested a comprehensive approach

for the evaluation of recommender systems. In Chapter 3, we have introduced a rank

aggregation algorithm based recommendation of books, which we call the ‘Positional

Aggregation based Scoring (PAS) technique’. Link mining concepts are

incorporated to find the recommendations of top-ranked universities for the books in

their prescribed syllabi, and the aggregation scheme is employed to aggregate these

rankings and recommend the top books to students.


A fuzzy aggregation operator, OWA (Ordered Weighted Aggregation), has

been utilized and discussed in Chapter 4. OWA is implemented to support a variety of

experiments and to verify the proposed techniques. In order to rely more upon the

prestige of voters and rankers, an ‘Ordered Ranked Weighted Aggregation (ORWA)’ is

suggested for book recommendation.
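
To make the role of the linguistic quantifiers more concrete, the following Python sketch shows one common way in which OWA weights are derived from a quantifier and then applied to a set of scores. It is only an illustration: the (a, b) parameter pairs are the values commonly cited in the fuzzy-quantifier literature and are assumed here rather than taken from Chapter 4, and the names `QUANTIFIERS`, `owa_weights` and `owa`, as well as the sample scores, are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed (a, b) parameters of the piecewise-linear quantifier Q(r);
# these are commonly cited values, not values confirmed by the thesis.
QUANTIFIERS: Dict[str, Tuple[float, float]] = {
    "at least half": (0.0, 0.5),
    "most": (0.3, 0.8),
    "as many as possible": (0.5, 1.0),
}


def quantifier(r: float, a: float, b: float) -> float:
    """Piecewise-linear quantifier: 0 below a, 1 above b, linear in between."""
    if r < a:
        return 0.0
    if r > b:
        return 1.0
    return (r - a) / (b - a)


def owa_weights(n: int, name: str) -> List[float]:
    """Weights w_i = Q(i/n) - Q((i-1)/n) for i = 1..n."""
    a, b = QUANTIFIERS[name]
    return [quantifier(i / n, a, b) - quantifier((i - 1) / n, a, b)
            for i in range(1, n + 1)]


def owa(scores: Sequence[float], name: str) -> float:
    """Apply the weights to the scores sorted in descending order."""
    ordered = sorted(scores, reverse=True)
    weights = owa_weights(len(ordered), name)
    return sum(w * s for w, s in zip(weights, ordered))


# Hypothetical scores given to one book by four universities.
book_scores = [0.9, 0.4, 0.7, 0.2]
for label in QUANTIFIERS:
    print(label, round(owa(book_scores, label), 3))
```

In this sketch the weights attach to positions in the sorted list rather than to particular rankers, which is the property that distinguishes OWA from an ordinary weighted mean.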

The Ordered Ranked Weighted Aggregation incorporates the rank of the rankers to

emphasize their importance, as a book recommended by a top-ranked institution

must get higher preference than a book recommended by a lower-ranked

institution. The ORWA gives the ranking positions of the recommended

books, along with the total number of recommended books. Assigning weights to

the rankers in the ORWA provides a better recommendation. We believe the

proposed technique may meet users’ needs and provide them with the books they

need.

An extensive approach based on opinion mining is also proposed. The OWA and

ORWA used recommendations from university authorities, i.e. experts. It seems

appropriate to involve the users rather than only the experts, for a better understanding

of their preferences and of what they actually want. To observe users’

requirements and their like or dislike opinions about an item, particularly books,

opinion mining techniques have been applied. The recommendation of books based

on opinion mining is presented to the users. All the suggested approaches are

evaluated to identify the best among them.

We have put forward two different user-feedback-based frameworks for the

evaluation of Recommender Systems (RS). The first framework depends upon

explicit feedback, whereas the second utilizes implicit feedback. The evaluation of

RS based on explicit feedback is used to evaluate the book recommendation

techniques employed in this work and discussed in Chapters 3, 4 and 5. The implicit

feedback based evaluation mechanism is used to evaluate the RS presented in [243]. The

reason for choosing explicit feedback for the evaluation of the book recommender

systems proposed in this work is the integrity of the experts whose feedback is

taken as the basis of the evaluation process, whereas the system proposed

in [243] is meant for general-purpose, daily-need commodities, and hence users’

sincerity and authenticity must be examined. This is why the implicit feedback

mechanism is applied there.



The comprehensive approach to evaluate recommender systems which is based on

user feedback tries to measure sincerity of the users who provide their feedback for

the evaluation of the recommender system. The computation of the user’s sincerity

measure eliminates the feedback of insincere users and in turn, makes the evaluation

process reliable. Further, the methodology outlines a procedure to decide whether a

specific product, whose review site is visited by the user, is to be considered as a

product preferred by the user or not. Hence, the proposed evaluation approach is

poised to be a fairly realistic approach and better than the other related evaluation

techniques, which do not have any provision to measure the user’s sincerity and which

consider any product, whose review site is visited by the user, as a product preferred

by the user.

The proposed comprehensive approach is used to evaluate the performance of a

recommender system and the result of the evaluation is presented. We compare the

aggregated ranking of products obtained in the comprehensive evaluation approach

with the recommender system’s ranking of the products, and compute the values of a

good number of evaluation metrics, namely Spearman correlation coefficient,

precision at k, Mean Average Precision (MAP), Mean Reciprocal Rank (MRR), false

positive rate (FPR) and false negative rate (FNR). We also evaluate the performance

of the recommender system using two other related evaluation techniques proposed in

[19, 18] and compute the values of the same set of evaluation metrics. Since we do

not have the true ranking of the products recommended by the recommender system,

we cannot decide objectively which of the three evaluation techniques is able to

evaluate the recommender system most accurately. Hence, the values of the

evaluation metrics obtained using the three evaluation techniques are compared and

the results of this comparison are represented pictorially. The said method can

serve as a base for evaluating the performance of recommender systems based on explicit

feedback.
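
As a rough illustration of how the metrics named above can be computed when a recommender system's ranked list is compared against a reference ranking or a set of preferred products, the following Python sketch may help. The function names, the toy data and the exact definitions used (the tie-free Spearman formula, a relevant set treated as ground truth, and error rates computed over a small hypothetical catalogue) are assumptions made for illustration and are not taken from the implementation used in this work.

```python
from typing import Sequence, Set, Tuple


def spearman_rho(rank_a: Sequence[int], rank_b: Sequence[int]) -> float:
    """Spearman correlation of two tie-free rankings of the same n items."""
    n = len(rank_a)
    if n != len(rank_b) or n < 2:
        raise ValueError("rankings must have equal length of at least 2")
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def precision_at_k(recommended: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k recommended items that are relevant."""
    return sum(1 for item in recommended[:k] if item in relevant) / k


def average_precision(recommended: Sequence[str], relevant: Set[str]) -> float:
    """Mean of precision values at each position holding a relevant item."""
    hits, total = 0, 0.0
    for pos, item in enumerate(recommended, start=1):
        if item in relevant:
            hits += 1
            total += hits / pos
    return total / len(relevant) if relevant else 0.0


def reciprocal_rank(recommended: Sequence[str], relevant: Set[str]) -> float:
    """1 / position of the first relevant item, or 0 if none is found."""
    for pos, item in enumerate(recommended, start=1):
        if item in relevant:
            return 1.0 / pos
    return 0.0


def false_rates(recommended: Set[str], relevant: Set[str],
                catalogue: Set[str]) -> Tuple[float, float]:
    """Return (FPR, FNR) with FPR = FP/(FP+TN) and FNR = FN/(FN+TP)."""
    fp = len(recommended - relevant)
    fn = len(relevant - recommended)
    tp = len(recommended & relevant)
    tn = len(catalogue - recommended - relevant)
    fpr = fp / (fp + tn) if fp + tn else 0.0
    fnr = fn / (fn + tp) if fn + tp else 0.0
    return fpr, fnr


# Hypothetical data for a single user.
ranked = ["b1", "b2", "b3", "b4", "b5"]
liked = {"b1", "b3", "b4"}
print(spearman_rho([1, 2, 3, 4, 5], [2, 1, 3, 5, 4]))
print(precision_at_k(ranked, liked, 3))
print(average_precision(ranked, liked), reciprocal_rank(ranked, liked))
```

MAP and MRR are then the means of `average_precision` and `reciprocal_rank` over all users or queries considered in the evaluation.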

The explicit-feedback-based evaluation of the present study has been performed. The

results of all the proposed schemes, i.e. PAS, OWA with the quantifiers ‘at least half’,

‘most’ and ‘as many as possible’, ORWA and opinion mining, are shown and

compared. The comparison suggests the advantage of using opinion mining

techniques over the other approaches. Also, user feedback is the basis of the evaluation of

the proposed schemes; hence the work also gives a direction for utilizing feedback from



users in the evaluation process. There is considerable scope for future enhancement of the work,

which is discussed in the next section.

7.3 Future Directions

The work presented in this thesis has comprehensively covered web mining and soft

computing techniques for both the recommendation of books and the evaluation of

recommender systems. Still, there is a lot of scope to enhance the work, and

there are several areas that the present work gives direction to explore.

These future directions are listed below.

i. The different approaches discussed in this thesis are specific to the selected

domain of books and products. In future, the work can easily be extended to

a larger dataset.

ii. The present work is designed from an Indian perspective. Hence, the syllabi for the

books are taken from top universities of India only. In future, the approach can

readily be applied to any institute and any country by considering

universities around the world.

iii. The proposed opinion mining techniques exploit feature-based

recommendations. We have selected different features of books depending

upon users’ interests. For simplicity, a relatively small number of features is

selected. In future, the feature selection procedure can be modified to

increase or decrease the total number of features according to both the interests of

the users and the item types.

iv. Our procedure for checking users’ sincerity is based solely upon the

assumption that the majority of the users are sincere. However, there could be cases

with no genuine users or with few sincere users. In such a case, our method cannot

do much. In future, sincerity checking for such situations can

be formulated and applied.



v. Our focus in this work is to provide a framework for the recommendation of

online items, especially books. Thus, instead of emphasizing spam

detection in reviews, we concentrated on how these reviews can be

used to make appropriate recommendations. Hence, spam detection

is not well studied here. In future, it would be interesting to see the effect of

checking customers’ reviews for spam and involving only the spam-filtered

reviews in the recommendation process.

vi. Further, the user’s emotional state can be applied to weight the features according

to the emotional condition of the user. The emotions can be observed by

capturing facial expressions, prior events happening to the user, etc. This

emotional state would help in analyzing the genuineness of a review and how

much it can be relied upon.