Building Recommender Systems - Mendeley and Science Direct
TRANSCRIPT
![Page 1: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/1.jpg)
| 0
Daniel Kershaw (@danjamker)
Building Recommenders
20th September 2017
![Page 2: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/2.jpg)
| 1
Mendeley
• Reference Manager
• Social Network
• Publication Catalogue
![Page 3: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/3.jpg)
| 2
Science Direct
• Scientific publication database
• Used by the majority of
university and research
institutions
• Contains 12 million articles of
content from 3,500 academic
journals and 34,000 e-books
![Page 5: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/5.jpg)
| 4
Why Recommendations
Pull
Allow users to discover more content
Make it easier to navigate catalogue
Push
Highlight new content to users
Bring users back to service
![Page 6: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/6.jpg)
| 5
The five core components
Data Collection
Recommender Model
Recommendation Post Processing
Online Modules
User Interface
![Page 7: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/7.jpg)
| 6
Outline
Developed Algorithms – keeping it simple
Practical Considerations – don’t look stupid
Implementation – how to scale a system
Evaluation – what is good enough
Evolution – what’s changed over time
Future Direction – the future’s bright, the future’s deep
![Page 8: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/8.jpg)
| 7
Developed Algorithms
![Page 9: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/9.jpg)
| 8
Available Data
Implicit
User libraries (Mendeley)
User article interactions (Science Direct)
Content
Abstracts
Titles
References
![Page 10: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/10.jpg)
| 9
Potential Methods
Content Based
Similarity between what users have read
Similarity in references
Collaborative Filtering
Matrix Factorization
KNN
LDA
![Page 11: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/11.jpg)
| 10
User-based CF (kNN)
User-item interaction matrix
https://buildingrecommenders.wordpress.com/2015/11/18/overview-of-recommender-algorithms-part-2/
![Page 12: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/12.jpg)
| 11
User-based CF (kNN)
Similarity between the query user and other readers
https://buildingrecommenders.wordpress.com/2015/11/18/overview-of-recommender-algorithms-part-2/
![Page 13: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/13.jpg)
| 12
User-based CF (kNN)
Similarity between all users
https://buildingrecommenders.wordpress.com/2015/11/18/overview-of-recommender-algorithms-part-2/
![Page 14: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/14.jpg)
| 13
User-based CF (kNN)
Generating recommendations for the user
https://buildingrecommenders.wordpress.com/2015/11/18/overview-of-recommender-algorithms-part-2/
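The user-based kNN steps on the slides above (interaction matrix, similarity to other readers, scoring unseen items) can be sketched as follows. This is a minimal illustration, not the production implementation: the cosine similarity over item sets, the function names, and the toy libraries are all assumptions made for the example.

```python
from collections import defaultdict
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two item sets (binary implicit feedback)."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def recommend_user_knn(libraries, user, k=2, n=3):
    """Rank items the query user lacks by the summed similarity
    of the k nearest users who have them."""
    own = libraries[user]
    sims = [(cosine(own, items), other)
            for other, items in libraries.items() if other != user]
    neighbours = sorted(sims, key=lambda pair: pair[0], reverse=True)[:k]
    scores = defaultdict(float)
    for sim, other in neighbours:
        if sim <= 0:
            continue  # a neighbour with no overlap carries no signal
        for item in libraries[other] - own:
            scores[item] += sim
    # Highest score first; ties broken by item id for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]


# Hypothetical user libraries (user -> set of article ids).
libraries = {
    "alice": {"p1", "p2"},
    "bob": {"p1", "p2", "p3"},
    "carol": {"p4"},
    "dave": {"p1", "p3", "p5"},
}
print(recommend_user_knn(libraries, "alice"))
```

Here the two nearest readers to "alice" are "bob" and "dave", so the article both of them hold ranks ahead of the one only "dave" holds.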
![Page 15: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/15.jpg)
| 14
Why not Matrix Factorization?
• Ability to scale
• Matrix is incredibly sparse
![Page 16: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/16.jpg)
| 15
Practical Considerations
![Page 17: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/17.jpg)
| 16
Explore/Exploit (Dithering)
Recommendations generated in batch
Users want an interactive experience
Slight shuffles give the impression of
freshness
Allow for the exploration of the list if only
a proportion shown
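$$\mathrm{score}_{\mathrm{dithered}} = \log(\mathrm{rank}) + \mathcal{N}(0, \log \varepsilon)$$
where $\varepsilon = \Delta\mathrm{rank} / \mathrm{rank}$ and typically $\varepsilon \in [1.5, 2]$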
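A minimal sketch of dithering as described on this slide: add Gaussian noise to the log of each rank and re-sort. The slide does not say whether log ε is a variance or a standard deviation; the code below assumes it is the variance. The function name and signature are hypothetical.

```python
import math
import random


def dither(ranked_items, epsilon=1.5, rng=None):
    """Return a lightly shuffled copy of a ranked list (best item first)."""
    rng = rng or random.Random()
    # Assumption: log(epsilon) is the variance of the noise term.
    sigma = math.sqrt(math.log(epsilon))
    noisy = [(math.log(rank) + rng.gauss(0.0, sigma), item)
             for rank, item in enumerate(ranked_items, start=1)]
    noisy.sort(key=lambda pair: pair[0])  # lower noisy score = better position
    return [item for _, item in noisy]


batch = ["a", "b", "c", "d", "e", "f"]
print(dither(batch, epsilon=1.5, rng=random.Random(42)))
```

Because the noise is added to the log of the rank, swaps are most likely between neighbouring positions near the top, while items deep in the list can occasionally surface. Re-seeding the generator periodically would give the "fresh but stable" effect the slide describes.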
![Page 18: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/18.jpg)
| 17
Impression Discounting
• Experience deteriorates if exposed to the same information
• Push recommendations seen before down the list
[Figure: rank plotted against impressions]
![Page 19: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/19.jpg)
| 18
Impression Discounting
$$\mathrm{score}_{\mathrm{new}} = \mathrm{score}_{\mathrm{original}} \cdot \left( w_1 \cdot g(\mathrm{impCount}) + w_2 \cdot g(\mathrm{lastSeen}) \right)$$
See Lee, P. et al. (2014)
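The discounting formula above leaves the function g unspecified. The sketch below fills it in with two illustrative choices that are assumptions, not the forms used in the talk or in Lee et al.: a hyperbolic decay in the impression count and an exponential recovery in the days since the item was last seen. The weights and the `recovery_days` parameter are likewise hypothetical.

```python
import math


def discounted_score(score, imp_count, days_since_seen,
                     w1=0.5, w2=0.5, recovery_days=7.0):
    """Scale a score down for items the user has already been shown."""
    # Assumed g(impCount): 1 for a never-shown item, shrinking with each impression.
    g_count = 1.0 / (1.0 + imp_count)
    # Assumed g(lastSeen): 0 right after an impression, recovering towards 1 over time.
    if days_since_seen is None:
        g_recency = 1.0  # never shown, so no recency penalty
    else:
        g_recency = 1.0 - math.exp(-days_since_seen / recovery_days)
    return score * (w1 * g_count + w2 * g_recency)


print(discounted_score(1.0, imp_count=0, days_since_seen=None))  # unseen item
print(discounted_score(1.0, imp_count=3, days_since_seen=0.0))   # just shown 3 times
```

With w1 + w2 = 1, an item that has never been shown keeps its original score, and repeated or recent impressions push it down the list.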
![Page 20: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/20.jpg)
| 19
Business Logic (Pre and Post Filtering)
Don’t show items they already have (bought, added, consumed)
Don’t feed the recommender positive feedback from recommender
Don’t recommend out of stock items
• A bad recommendation has a cost
- The cost can be greater than that of not receiving a recommendation at all
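The filtering rules on this slide can be sketched as two small helpers: a post-filter applied to candidate lists, and a pre-filter applied to the training events. The function names and the `source` field on events are assumptions made for the example.

```python
def post_filter(candidates, owned, unavailable):
    """Drop candidates the user already has or that cannot be served, keeping order."""
    blocked = set(owned) | set(unavailable)
    return [item for item in candidates if item not in blocked]


def training_events(events):
    """Keep only interactions that did not originate from a recommendation,
    so the recommender is not trained on its own output."""
    return [e for e in events if e.get("source") != "recommender"]


candidates = ["p1", "p2", "p3", "p4"]
print(post_filter(candidates, owned={"p2"}, unavailable={"p4"}))

events = [
    {"user": "u1", "item": "p1", "source": "search"},
    {"user": "u1", "item": "p3", "source": "recommender"},
]
print(training_events(events))
```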
![Page 21: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/21.jpg)
| 20
Implementation
![Page 22: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/22.jpg)
| 21
Systems Architecture
[Architecture diagram. Component labels: Front End, API, Impression Discounting, Dithering, Candidate Selection, Content Based, Item2Item, CF, Logs. Region labels: Online, Offline, AWS.]
![Page 23: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/23.jpg)
| 22
The unbundled mess
![Page 24: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/24.jpg)
| 23
Logging
Used for both debugging and feeding information to the recommender
System
• Which run generated the recommendation
• What was served to the user
• How was the score modified
• What was removed from the recommendations
User (Feedback loop)
• What was displayed
• What was clicked
• When were they served
• Where were the recommendations displayed
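One way to capture the system and user fields listed on this slide is a single structured record per served list. The sketch below is hypothetical: the class name, field names, and JSON encoding are assumptions chosen for illustration, not the schema used in the talk.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class RecommendationLog:
    run_id: str            # which batch run generated the recommendations
    user_id: str
    served_at: str         # when they were served (ISO 8601 timestamp)
    placement: str         # where they were displayed, e.g. "email"
    served: list           # what was served to the user, in order
    removed: list = field(default_factory=list)        # items filtered out
    score_changes: dict = field(default_factory=dict)  # item -> [before, after]
    clicked: list = field(default_factory=list)        # feedback from the user

    def to_json(self):
        """Serialise to one JSON line, suitable for an append-only log."""
        return json.dumps(asdict(self), sort_keys=True)


record = RecommendationLog(
    run_id="run-1",
    user_id="u1",
    served_at="2017-09-20T10:00:00Z",
    placement="email",
    served=["p3", "p5"],
    removed=["p2"],
    score_changes={"p3": [0.9, 0.45]},
)
print(record.to_json())
```

Keeping system decisions and user feedback in the same record makes it possible to answer both "why was this shown?" and "what happened next?" from one log line.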
![Page 25: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/25.jpg)
| 24
Evolution
![Page 26: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/26.jpg)
| 25
Mendeley – Desktop Application
• User to Item CF
• Impression Discounting
![Page 27: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/27.jpg)
| 26
Mendeley – Online
Recommendation types, ordered from most personalized to least personalized:
• Implicit – serves recommendations based on user libraries
• Recent Activity – based on recent additions to a user’s library
• Research Interests – based on user-generated tags
• Discipline – based on the user’s self-identified discipline
See Hristakeva, M. et al. (2017)
![Page 28: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/28.jpg)
| 27
Mendeley – Online
• Remove carousels
• Focus on implicit recommendations
• Fall back to content-based solution
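The fallback described on this slide can be sketched as a simple lookup order: serve implicit (collaborative filtering) recommendations when the user has any, otherwise serve content-based ones. The function name and the dictionary-based stores are assumptions made for the example.

```python
def recommend_with_fallback(user, implicit_recs, content_recs, n=10):
    """Prefer implicit (CF) recommendations; fall back to content-based ones."""
    recs = implicit_recs.get(user) or content_recs.get(user) or []
    return list(recs)[:n]


# Hypothetical precomputed recommendation stores (user -> ranked item ids).
implicit_recs = {"u1": ["a", "b"]}
content_recs = {"u1": ["c"], "u2": ["d", "e"]}

print(recommend_with_fallback("u1", implicit_recs, content_recs))  # CF list
print(recommend_with_fallback("u2", implicit_recs, content_recs))  # content-based list
```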
![Page 29: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/29.jpg)
| 28
Mendeley – Email
• Recommendations based on the complete library of the user
• Don’t send the same recommendations twice
![Page 30: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/30.jpg)
| 29
Science Direct – Email
• Item to Item
• Take user reading history
• Get recommendations for each item
• Interleave recommendations
• Don’t send the same recommendations twice
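The email steps above can be sketched as a round-robin interleave over the per-item recommendation lists, skipping anything the user has already read or already been sent. The slide does not specify the interleaving strategy, so round-robin is an assumption, as are the function name and argument names.

```python
from itertools import zip_longest


def interleave(per_item_recs, history, already_sent, n=5):
    """Round-robin merge of item-to-item lists, one list per article in the
    user's reading history, without repeats."""
    seen = set(history) | set(already_sent)
    merged = []
    # zip_longest walks position 1 of every list, then position 2, and so on.
    for tier in zip_longest(*per_item_recs):
        for rec in tier:
            if rec is None or rec in seen:
                continue
            seen.add(rec)
            merged.append(rec)
            if len(merged) == n:
                return merged
    return merged


per_item_recs = [["a", "b", "c"], ["b", "d"], ["x"]]
print(interleave(per_item_recs, history=["x"], already_sent=["d"]))
```

Taking the top recommendation of every read article before any second-place item keeps the email from being dominated by a single article in the history.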
![Page 31: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/31.jpg)
| 30
Science Direct – Article Page
Item to Item
Dither recommendations every 30 minutes
![Page 32: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/32.jpg)
| 31
Evaluation
![Page 33: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/33.jpg)
| 32
Off-line Methodology
[Diagram: a timeline of user interactions split into segments labelled Train model, Query and Ground truth, with a Test label.]
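One reading of the off-line methodology diagram is a time-based split: interactions before a first cut-off train the model, those in a middle window form the query profile, and those after a second cut-off are the ground truth. The sketch below follows that reading; the two cut-offs, the function names, and the choice of precision@k as the metric are assumptions.

```python
def temporal_split(events, train_end, query_end):
    """Split (user, item, timestamp) events into train, query and ground truth."""
    train, query, truth = [], {}, {}
    for user, item, ts in events:
        if ts < train_end:
            train.append((user, item))
        elif ts < query_end:
            query.setdefault(user, set()).add(item)
        else:
            truth.setdefault(user, set()).add(item)
    return train, query, truth


def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations found in the ground truth."""
    if k <= 0:
        return 0.0
    return sum(1 for item in recommended[:k] if item in relevant) / k


# Hypothetical interaction log with integer timestamps.
events = [("u1", "a", 1), ("u1", "b", 5), ("u1", "c", 9),
          ("u2", "a", 2), ("u2", "d", 8)]
train, query, truth = temporal_split(events, train_end=4, query_end=7)
print(train, query, truth)
print(precision_at_k(["c", "x"], truth["u1"], k=2))
```

Splitting by time rather than at random avoids training on interactions that happened after the ones being predicted.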
![Page 34: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/34.jpg)
| 33
Off-line evaluation - Mendeley
From Hristakeva, M. et al. (2017)
![Page 35: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/35.jpg)
| 34
Science Direct – Item-to-item
![Page 36: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/36.jpg)
| 35
Static Recommendations for quick learnings
• Infrastructure takes a long time to build
• Need feedback from users to learn
1. Generate recommendations off-line
2. Send to users via email (A/A)
3. Modify method based on feedback
4. Send to a second set of users split into A/B buckets
[Diagram: Email to users → Modify Recommender → Email to users]
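Splitting users into A/B buckets for the email test needs an assignment that is stable across sends. A common approach, shown here as an assumption rather than the method used in the talk, is to hash the user id together with an experiment name. The function and experiment names are hypothetical.

```python
import hashlib


def assign_bucket(user_id, experiment, buckets=("A", "B")):
    """Deterministically map a user to a bucket for a given experiment."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return buckets[int(digest, 16) % len(buckets)]


for user_id in ["u1", "u2", "u3", "u4"]:
    print(user_id, assign_bucket(user_id, "email-test-2"))
```

Including the experiment name in the hash means a user's bucket in one test does not determine their bucket in the next. For an A/A run, both buckets simply receive the same recommender, and any measured difference indicates noise or a logging problem.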
![Page 37: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/37.jpg)
| 36
Future Direction
![Page 38: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/38.jpg)
| 37
Learning to rank (LtR)
Currently only using implicit feedback
No content used
Use CF as candidate selection
Re-rank results based on a learnt model optimised for CTR
Use item and user features
![Page 39: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/39.jpg)
| 38
Deep Learning
Use to learn more complex features
Use as features in LtR
Build on the existing framework developed
Use pre-trained models before developing own
![Page 40: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/40.jpg)
| 39
Conclusion (Take Homes)
• Log EVERYTHING
• Start Simple
• Iterate quickly
• Get recommendations out quickly to learn
• Don’t look stupid
• CTR ≇ Off-line Evaluation
![Page 41: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/41.jpg)
| 40
www.elsevier.com/rd-solutions
Thank you,
Book chapter being written based on the content in this presentation
![Page 42: Building Recommender Systems - Mendeley and Science Direct](https://reader034.vdocument.in/reader034/viewer/2022051710/5a6e99777f8b9ae8728b4a2f/html5/thumbnails/42.jpg)
| 41
References
Hristakeva, M., Kershaw, D., Rossetti, M., Knoth, P., Pettit, B., Vargas, S., & Jack, K. (2017). Building recommender systems for scholarly information. In Proceedings of the 1st Workshop on Scholarly Web Mining (pp. 25–32). New York, NY, USA: ACM. http://doi.org/10.1145/3057148.3057152
Rossetti, M., Stella, F., & Zanker, M. (2016). Contrasting offline and online results when evaluating recommendation algorithms. In Proceedings of the 10th ACM Conference on Recommender Systems (pp. 31–34). New York, NY, USA: ACM. http://doi.org/10.1145/2959100.2959176
Lee, P., Lakshmanan, L. V. S., Tiwari, M., & Shah, S. (2014). Modeling impression discounting in large-scale recommender systems. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1837–1846). New York, NY, USA: ACM. http://doi.org/10.1145/2623330.2623356
Koren, Y. (2010). Collaborative filtering with temporal dynamics. Communications of the ACM, 53(4), 89–97. http://doi.org/10.1145/1721654.1721677