A Potential Approach to Overcome Data Limitation in Scientific Publication Recommendation
Hung Nghiep Tran, Tin Huynh, Kiem Hoang
University of Information Technology
Vietnam
Original paper: Hung Nghiep Tran, Tin Huynh, Kiem Hoang. A Potential Approach to Overcome Data Limitation in Scientific Publication Recommendation. KSE 2015. Resource: see the last slide (SlideShare convention).
Introduction
Relevant paper recommendation: what papers are relevant to a researcher’s interests.
▪ Note: Different from citation recommendation.
Image source: Sugiyama and Kan, Exploiting potential citation papers in scholarly paper recommendation, JCDL ’13.
→ Problems in experiments and evaluations.
Introduction
Offline evaluation: the most popular approach.
▪ Based on ground truth data.
  • Recommended papers are compared to the ones known to be relevant to each researcher.
→ Building ground truth data is usually difficult and expensive.
Introduction
• An approach to build ground truth:
  ▪ Natural thinking: references are relevant to researchers’ interests.
→ Intuitively, we can build ground truth data based on reference data.
→ But this approach has not been explored in the literature.
Our Goal
To systematically study the approach that builds ground truth data based on reference data for evaluation of relevant paper recommendation.
Related Work
• Relevant paper recommendation is an emerging research area:
  ▪ [Sugiyama & Kan, JCDL ’10].
  ▪ [Le et al., ICCCI ’14].
  ▪ [Ohta et al., ICADIWT ’11].
• Lack of ground truth data is a recognized problem:
  ▪ [Beel et al., RepSys Workshop ’13].
• Some approaches have been tried to build data:
  ▪ Manually built:
    • [Sugiyama & Kan, JCDL ’10].
  ▪ Adapted from reference management software:
    • Mendeley.
    • Docear.
Our Approach
• Theoretical analysis:
  ▪ Propose and analyze the hypotheses supporting the approach.
• Empirical analysis:
  ▪ Evaluate the approach’s capability of evaluating recommendation methods.
Theoretical Analysis
Hypotheses:
• Hypothesis 1. In the context of doing research and writing scientific publications, there are many levels of exposing a researcher’s information needs, of which citing is the highest level.
• Hypothesis 2. References made by researchers are their relevant publications. Moreover, they are the most important ones.
• Hypothesis 3. Future references could be used as ground truth data in the evaluation of recommending relevant publications for researchers.
→ The approach is reasonable.
Empirical Analysis
How to evaluate the approach’s capability of evaluating recommendation methods?
→ The trick: two-layer evaluation.
1. Evaluate different recommendation methods and get the recommendation evaluation results.
2. Evaluate the consistency of the above evaluation results on two datasets: the one built based on the approach and the one built manually.
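The second layer can be sketched as follows. This is a minimal illustration in plain Python, assuming each recommendation method has already received one evaluation score on each dataset. The scores are made-up placeholders (not results from the paper), the function names are hypothetical, and Kendall's coefficient is computed in its tau-b form, which is an assumption.

```python
import math

def pearson(x, y):
    """Pearson's linear correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's coefficient: Pearson's correlation computed on ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def kendall_tau_b(x, y):
    """Kendall's tau-b, counting concordant and discordant pairs."""
    concordant = discordant = ties_x = ties_y = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both datasets: excluded from every count
            if dx == 0:
                ties_x += 1
            elif dy == 0:
                ties_y += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    pairs = concordant + discordant
    return (concordant - discordant) / math.sqrt((pairs + ties_x) * (pairs + ties_y))

# Placeholder scores: one entry per recommendation method (illustrative only).
scores_on_d = [0.30, 0.42, 0.35, 0.50]        # e.g. NDCG@5 on the automatically built dataset
scores_on_d_prime = [0.28, 0.40, 0.45, 0.47]  # same methods on the manually built dataset

for name, coefficient in [("Pearson", pearson), ("Spearman", spearman), ("Kendall", kendall_tau_b)]:
    print(name, round(coefficient(scores_on_d, scores_on_d_prime), 3))
```

A coefficient close to 1 would mean both datasets rank the methods in nearly the same order, which is the kind of agreement the second layer looks for.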
Experiments Plan
1. Build a dataset D with ground truth data based on the approach.
2. Get a manually built dataset D’ for comparison.
3. Recommend by different methods on D and D’.
4. Evaluate the recommendation methods’ results on D and D’.
5. Evaluate the consistency of the evaluation results on D and D’.
Different Content-based Filtering (CBF) recommendation methods, formed by feature vector combination.
Image source: Sugiyama and Kan, Exploiting potential citation papers in scholarly paper recommendation, JCDL ’13.
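One way to read "feature vector combination" is that a researcher's profile is built by combining the feature vectors of their papers, and candidate papers are then ranked by similarity to that profile. The sketch below assumes sparse term-weight vectors stored as dicts, a plain weighted sum as the combination, and cosine similarity for ranking. All of these choices, along with the function names and toy data, are illustrative assumptions and not the configurations used in the paper.

```python
import math

def combine(vectors, weights=None):
    """Weighted sum of sparse feature vectors (dicts mapping term -> weight)."""
    if weights is None:
        weights = [1.0] * len(vectors)
    profile = {}
    for vector, weight in zip(vectors, weights):
        for term, value in vector.items():
            profile[term] = profile.get(term, 0.0) + weight * value
    return profile

def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(value * v.get(term, 0.0) for term, value in u.items())
    norm_u = math.sqrt(sum(value * value for value in u.values()))
    norm_v = math.sqrt(sum(value * value for value in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def recommend(profile, candidates, k=10):
    """Return the ids of the k candidate papers most similar to the profile."""
    ranked = sorted(candidates, key=lambda pid: cosine(profile, candidates[pid]), reverse=True)
    return ranked[:k]

# Toy data (hypothetical): two papers by the researcher and three candidates.
researcher_papers = [
    {"recommendation": 1.0, "citation": 0.5},
    {"recommendation": 0.8, "evaluation": 0.6},
]
candidates = {
    "p1": {"recommendation": 1.0, "evaluation": 0.4},
    "p2": {"database": 1.0},
    "p3": {"citation": 1.0},
}
profile = combine(researcher_papers)
print(recommend(profile, candidates, k=2))
```

Varying the `weights` argument, or which papers feed into `combine`, yields different CBF variants that can then be compared on D and D’.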
Evaluation
Layer 1: To evaluate recommendation results:
▪ Order-aware: NDCG@5, NDCG@10.
▪ First relevant item-aware: MRR.
Layer 2: To measure the consistency of evaluation results:
▪ Pearson’s coefficient.
▪ Spearman’s coefficient.
▪ Kendall’s coefficient.
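The Layer 1 metrics can be sketched as follows, assuming binary relevance (a recommended paper is either in the ground truth or not) and the common log2(rank + 1) discount for DCG. The exact NDCG variant used in the paper may differ, and the function names are hypothetical.

```python
import math

def dcg_at_k(gains, k):
    """Discounted cumulative gain over the first k positions."""
    return sum(gain / math.log2(rank + 1) for rank, gain in enumerate(gains[:k], start=1))

def ndcg_at_k(ranked, relevant, k):
    """NDCG@k with binary gains: 1 if the item is in the ground truth, else 0."""
    gains = [1.0 if item in relevant else 0.0 for item in ranked]
    ideal = [1.0] * min(len(relevant), k)  # best case: all relevant items ranked first
    ideal_dcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0

def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is recommended."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(rankings, ground_truths):
    """Average reciprocal rank over all target researchers."""
    values = [reciprocal_rank(r, g) for r, g in zip(rankings, ground_truths)]
    return sum(values) / len(values) if values else 0.0

# Toy example (hypothetical): two researchers, each with a ranked recommendation list.
rankings = [["a", "b", "c"], ["b", "a", "c"]]
ground_truths = [{"a", "c"}, {"a"}]
print(ndcg_at_k(rankings[0], ground_truths[0], k=5))
print(mean_reciprocal_rank(rankings, ground_truths))
```

NDCG rewards placing every relevant paper near the top, whereas MRR depends only on the position of the first relevant paper, so the two metrics can react quite differently to the same ground truth.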
The Process to Build Dataset
• Timeline: Past, Present (defined), Future; from the first published year to the last published year.
• Target Researcher: those for whom recommendations are generated.
• Future Reference: those papers cited in the Future but not in the Past.
• Ground Truth Data: future references cited by each target researcher.
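The definitions above can be sketched in a few lines. This is a minimal illustration, assuming citation records are available as (researcher, year of the citing paper, cited paper id) tuples and that the Present year itself counts as part of the Past. The record layout, the boundary choice, and the function name are assumptions, not the authors' released code.

```python
def build_ground_truth(citations, present_year):
    """Map each target researcher to their future references.

    citations: iterable of (researcher, citing_paper_year, cited_paper_id).
    A future reference is cited after present_year but never up to it.
    """
    past, future = {}, {}
    for researcher, year, cited in citations:
        bucket = past if year <= present_year else future
        bucket.setdefault(researcher, set()).add(cited)

    ground_truth = {}
    for researcher, cited_later in future.items():
        new_references = cited_later - past.get(researcher, set())
        if new_references:  # keep only researchers with at least one future reference
            ground_truth[researcher] = new_references
    return ground_truth

# Toy records (hypothetical).
citations = [
    ("alice", 2008, "p1"),
    ("alice", 2009, "p2"),
    ("alice", 2011, "p2"),  # already cited in the Past, so not a future reference
    ("alice", 2012, "p3"),
    ("bob", 2012, "p1"),
]
print(build_ground_truth(citations, present_year=2010))
```

Papers written up to the Present would serve as the input for building each researcher's profile, while the returned sets play the role of the relevant papers during evaluation.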
Experimental Data
• D: Automatically built dataset.
  ▪ To be released from https://sites.google.com/site/tranhungnghiep
• D’: Manually built dataset.
  ▪ From Sugiyama and Kan, JCDL ’10.
Layer 1: Recommendation Evaluation Results
[Table: evaluation results of CBF methods on the two datasets D and D’.]
Layer 2: Evaluation Results Consistency
• In general: statistically significant, strongly positively correlated results.
→ Applicable for roughly comparing methods before online evaluation.
• For specific metrics:
  ▪ For NDCG@10: less correlation.
  ▪ For MRR: no correlation.
→ Not suitable for measuring the first relevant recommended item.
[Table: correlations between evaluation results on D and D’.]
Conclusion
This study is the first one to:
• Assess the approach of building ground truth data based on reference data for evaluating relevant paper recommendation.
→ We showed that this approach is promising.
• Propose a process to build ground truth data from bibliographic data.
→ We built and published a dataset to help advance other research (*).
(*) To be released from https://sites.google.com/site/tranhungnghiep
Future Work
• Should focus on extensive assessment of this approach.
→ Especially by comparison with online evaluation.
Thank you very much!
Original paper: Hung Nghiep Tran, Tin Huynh, Kiem Hoang. A Potential Approach to Overcome Data Limitation in Scientific Publication Recommendation. KSE 2015.
Code & Data: https://github.com/tranhungnghiep/PaperRecommender
Other resource: https://sites.google.com/site/tranhungnghiep/code-data/paper-recommender-systems