![Page 1: Parameter-Efficient Transfer from Sequential Behaviors for](https://reader031.vdocument.in/reader031/viewer/2022012417/61723675a542ce2810513849/html5/thumbnails/1.jpg)
Fajie Yuan (Tencent); Xiangnan He (University of Science and Technology of China); Alexandros Karatzoglou (Google Research); Liguang Zhang (Tencent)
Parameter-Efficient Transfer from Sequential Behaviors for User Modeling and Recommendation (SIGIR 2020)
PeterRec Data&Code: https://github.com/fajieyuan/sigir2020_peterrec
![Page 2: Parameter-Efficient Transfer from Sequential Behaviors for](https://reader031.vdocument.in/reader031/viewer/2022012417/61723675a542ce2810513849/html5/thumbnails/2.jpg)
Outline
➢ Motivation
➢ Related Work
➢ PeterRec
➢ Experiments
![Page 3: Parameter-Efficient Transfer from Sequential Behaviors for](https://reader031.vdocument.in/reader031/viewer/2022012417/61723675a542ce2810513849/html5/thumbnails/3.jpg)
Users engage with recommender systems and leave feedback.
Figure: https://www.researchgate.net/figure/The-sequential-recommendation-process-After-the-RS-recommends-an-item-the-user-gives_fig4_311513879
![Page 4: Parameter-Efficient Transfer from Sequential Behaviors for](https://reader031.vdocument.in/reader031/viewer/2022012417/61723675a542ce2810513849/html5/thumbnails/4.jpg)
Our Motivation
A user has different roles to play in life!
[Figure: a user ("Me") at the center, surrounded by the services they use: car, news, music, and video recommendation, social apps, a search engine, a browser, and a map app.]
![Page 5: Parameter-Efficient Transfer from Sequential Behaviors for](https://reader031.vdocument.in/reader031/viewer/2022012417/61723675a542ce2810513849/html5/thumbnails/5.jpg)
Our Motivation
A user has different roles to play in life!
About myself:
• Male
• Age: 30
• PhD, University of Glasgow
• Researcher at Tencent
• Married
• Hot user on TikTok
• Cold user on Amazon
• New user on Netflix
My user model
![Page 6: Parameter-Efficient Transfer from Sequential Behaviors for](https://reader031.vdocument.in/reader031/viewer/2022012417/61723675a542ce2810513849/html5/thumbnails/6.jpg)
Our Motivation
A user has different roles to play in life!
About myself:
• Male
• Age: 30
• PhD, University of Glasgow
• Researcher at Tencent
• Married
• Hot user on TikTok
• Cold user on Amazon
• New user on Netflix
My user model
Our PeterRec
![Page 7: Parameter-Efficient Transfer from Sequential Behaviors for](https://reader031.vdocument.in/reader031/viewer/2022012417/61723675a542ce2810513849/html5/thumbnails/7.jpg)
Outline
➢ Motivation
➢ Related Work
➢ PeterRec
➢ Experiments
![Page 8: Parameter-Efficient Transfer from Sequential Behaviors for](https://reader031.vdocument.in/reader031/viewer/2022012417/61723675a542ce2810513849/html5/thumbnails/8.jpg)
• Recommendation Background:
(1) Content & Context Recommendation
(2) Session-based Recommendation: recommending the next item based on previously recorded user interactions.
[Figure, left: a DSSM (non-sequential) RS model, trained with supervised learning. The inputs $x_1, x_2, \ldots, x_n + C$ and $y_i$ each pass through an embedding layer and a DNN, producing a user vector and an item vector.]
[Figure, right: the sequential NextItNet model, trained with self-supervised learning.]
![Page 9: Parameter-Efficient Transfer from Sequential Behaviors for](https://reader031.vdocument.in/reader031/viewer/2022012417/61723675a542ce2810513849/html5/thumbnails/9.jpg)
Why sequential recommendation?
• Short videos (TikTok, Weishi, Kuaishou)
• Music (Tencent Music, Yahoo! Music) & news
• Movie clips (YouTube, Netflix)
Non-sequential vs. sequential recommendation:
• Static preference only vs. dynamic preference
• Manual feature engineering vs. no manual feature engineering
• Supervised learning vs. unsupervised (self-supervised) learning
![Page 10: Parameter-Efficient Transfer from Sequential Behaviors for](https://reader031.vdocument.in/reader031/viewer/2022012417/61723675a542ce2810513849/html5/thumbnails/10.jpg)
Transfer Learning Background
TL aims to extract knowledge from one or more source tasks and apply it to a target task.
Figure: A Comprehensive Hands-on Guide to Transfer Learning with Real-World Applications in Deep Learning, online
![Page 11: Parameter-Efficient Transfer from Sequential Behaviors for](https://reader031.vdocument.in/reader031/viewer/2022012417/61723675a542ce2810513849/html5/thumbnails/11.jpg)
Transfer Learning (TL) vs Multi-task Learning (MTL)
[1] ICML 2018: Advances in transfer, multitask, and semi-supervised learning, online
TL vs. MTL:
• Two-stage training vs. joint training
• One objective vs. multiple objectives
• Cares only about the target task vs. cares about all objectives
![Page 12: Parameter-Efficient Transfer from Sequential Behaviors for](https://reader031.vdocument.in/reader031/viewer/2022012417/61723675a542ce2810513849/html5/thumbnails/12.jpg)
Transfer Learning (TL) for Recommender System (RS)
Motivation:
• User representations may be generic, since a user's preferences tend to be similar across different recommendation tasks. That is, a user's engagement on previous platforms may provide important training signals for other systems.
• Traditional ML models usually fail when modeling new or cold users, due to a lack of interaction data.
[1] The figure is from an online source; the URL is missing.
![Page 13: Parameter-Efficient Transfer from Sequential Behaviors for](https://reader031.vdocument.in/reader031/viewer/2022012417/61723675a542ce2810513849/html5/thumbnails/13.jpg)
Outline
➢ Motivation
➢ Related Work
➢ PeterRec
➢ Experiments
![Page 14: Parameter-Efficient Transfer from Sequential Behaviors for](https://reader031.vdocument.in/reader031/viewer/2022012417/61723675a542ce2810513849/html5/thumbnails/14.jpg)
Transfer Learning (TL) for Recommender System (RS)
Task description
Source data: $(u, x^u)$, where $x^u = \{x_1^u, x_2^u, \ldots, x_n^u\}$ and $x_t^u$ denotes the $t$-th item that user $u$ interacted with.
Target data: $(u, y)$, where $y$ is the supervised label in the target dataset.
Example:
• Source data: users' watching activities in Tencent QQ Browser.
• Target data: users' watching activities in Kandian, where these users are cold or new; or users' profile labels, e.g., age, gender, life status.
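As a minimal sketch of the two data formats in the task description, the snippet below stores the source data as one item sequence per user and the target data as (user, label) pairs. The user IDs, item IDs, and labels are invented for illustration, and the helper name `join_source_target` is hypothetical; it is not taken from the PeterRec code.

```python
# Hypothetical toy data: all IDs and labels are invented for illustration.
# Source data: user -> ordered list of interacted item IDs (x_1^u ... x_n^u).
source_data = {
    "u1": [12, 7, 33, 7, 5],
    "u2": [4, 4, 19],
    "u3": [8, 21],
}

# Target data: (user, label) pairs, here with a profile label as y.
target_data = [("u1", "male"), ("u3", "female"), ("u4", "male")]


def join_source_target(source, target):
    """Pair each labeled target user with their source-domain sequence.

    Target users with no source sequence are skipped, because there is
    nothing to transfer for them.
    """
    return [(source[user], label) for user, label in target if user in source]


examples = join_source_target(source_data, target_data)
for sequence, label in examples:
    print(sequence, "->", label)
```

Each resulting pair is one fine-tuning example: the model reads the source-domain sequence and is asked to predict the target-domain label.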
![Page 15: Parameter-Efficient Transfer from Sequential Behaviors for](https://reader031.vdocument.in/reader031/viewer/2022012417/61723675a542ce2810513849/html5/thumbnails/15.jpg)
PeterRec Architecture
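(a) Pre-training on QQ Browser data: a NextItNet-style neural network, labeled $\mathcal{H}(\tilde{\Theta})$ with output layer $w(\Theta)$, is trained autoregressively on user clicking behaviors. The input is $x_1^u, x_2^u, x_3^u, x_4^u$ and the prediction target is the same sequence shifted by one position, $x_2^u, x_3^u, x_4^u, x_5^u$.
(b) Fine-tuning on the user profile dataset: the network $\tilde{\mathcal{H}}(\tilde{\Theta})$ with classifier $\pi(\nu)$ reads $x_1^u, x_2^u, x_3^u, x_4^u$ together with the [TCL] token and predicts the label (e.g., gender).
NextItNet: A Simple Convolutional Generative Network for Next Item Recommendation. WSDM 2019, Yuan et al.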
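The autoregressive pre-training setup, in which the inputs x1 to x4 are paired with the targets x2 to x5, can be sketched in a few lines. This is only an illustration of how self-supervised training pairs are derived from one click sequence; the function names `next_item_pairs` and `prefix_examples` are hypothetical and do not come from the NextItNet or PeterRec code.

```python
def next_item_pairs(sequence):
    """Return (inputs, targets), where targets is inputs shifted by one step."""
    if len(sequence) < 2:
        raise ValueError("need at least two items to form a training pair")
    return sequence[:-1], sequence[1:]


def prefix_examples(sequence):
    """Expand a sequence into (prefix, next_item) pairs, one per position."""
    return [(sequence[:t], sequence[t]) for t in range(1, len(sequence))]


clicks = ["x1", "x2", "x3", "x4", "x5"]
inputs, targets = next_item_pairs(clicks)
print(inputs)   # the model input
print(targets)  # what the model should predict at each position

for prefix, next_item in prefix_examples(clicks):
    print(prefix, "->", next_item)
```

No manual labels are needed: every position in the sequence supplies its own target, which is why the slides describe this kind of training as self-supervised.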
![Page 16: Parameter-Efficient Transfer from Sequential Behaviors for](https://reader031.vdocument.in/reader031/viewer/2022012417/61723675a542ce2810513849/html5/thumbnails/16.jpg)
What can be done by PeterRec
[Figure: the PeterRec architecture, with (a) pre-training and (b) fine-tuning.]
• Cold-start problem, e.g., ads recommendation
• User profile prediction, e.g., gender prediction
![Page 17: Parameter-Efficient Transfer from Sequential Behaviors for](https://reader031.vdocument.in/reader031/viewer/2022012417/61723675a542ce2810513849/html5/thumbnails/17.jpg)
Problems we meet when many tasks are required.
[Figure: one pre-trained model and six fine-tuning models, one per task: age, life status, education, gender, profession, and ads. Fine-tuning model $i$ has its own classifier $\pi_i(\nu)$ and its own network parameters $\tilde{\mathcal{H}}(\tilde{\Theta}_i)$, and reads $x_1^u, x_2^u, x_3^u, x_4^u$ together with the [TCL] token.]
• Training a separate model for each downstream task is parameter-inefficient, since both the pre-trained and the fine-tuned models are very large.
• The number of fine-tuned models equals the number of downstream tasks: 100 tasks = 100 fine-tuned models.
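To make the cost concrete, the back-of-the-envelope sketch below compares storing one full fine-tuned copy per task with keeping a single shared pre-trained network plus a small task-specific part per task. All parameter counts are invented for illustration and are not taken from the paper; `total_parameters` is a hypothetical helper.

```python
def total_parameters(backbone, per_task_extra, num_tasks, share_backbone):
    """Count stored parameters across all downstream tasks."""
    if share_backbone:
        # One copy of the backbone, plus a small task-specific part per task.
        return backbone + num_tasks * per_task_extra
    # A full copy of the backbone (and its extras) for every task.
    return num_tasks * (backbone + per_task_extra)


BACKBONE = 10_000_000  # assumed size of the pre-trained network
HEAD = 10_000          # assumed size of one task classifier
SMALL_PART = 200_000   # assumed size of one small task-specific module
NUM_TASKS = 100

separate = total_parameters(BACKBONE, HEAD, NUM_TASKS, share_backbone=False)
shared = total_parameters(BACKBONE, HEAD + SMALL_PART, NUM_TASKS, share_backbone=True)

print(f"separate fine-tuned models: {separate:,}")
print(f"shared backbone:            {shared:,}")
print(f"ratio:                      {separate / shared:.1f}x")
```

Under these assumed sizes, the per-task cost is dominated by the backbone copy, so sharing the backbone is where the savings come from.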
![Page 18: Parameter-Efficient Transfer from Sequential Behaviors for](https://reader031.vdocument.in/reader031/viewer/2022012417/61723675a542ce2810513849/html5/thumbnails/18.jpg)
Taking inspiration from grafting
A: branch of a plum tree
B: peach tree
C: insertion
D: grow together
The pre-trained model is treated as the peach tree.
![Page 19: Parameter-Efficient Transfer from Sequential Behaviors for](https://reader031.vdocument.in/reader031/viewer/2022012417/61723675a542ce2810513849/html5/thumbnails/19.jpg)
Grafting for plants.
[Figure: an original ResNet block, and the same block after insertion of the model patch (MP).]
A: branch of plum vs. MP
B: peach tree vs. pre-trained model
C: insertion vs. insertion
D: grow together vs. fine-tuning
Parameter-Efficient Transfer from Sequential Behaviors for User Modeling and Recommendation, Yuan et al SIGIR 2020
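The grafting analogy can be illustrated with a toy sketch in plain Python. It assumes the MP is a small bottleneck module (down-projection, ReLU, up-projection, skip connection) placed after a frozen pre-trained layer inside a residual block; the exact MP design and placement should be checked against the paper. The names `ModelPatch`, `frozen_layer`, and `patched_block`, the sizes, and the zero initialization are all illustrative assumptions.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def relu(vector):
    return [max(0.0, v) for v in vector]


class ModelPatch:
    """Hypothetical bottleneck patch: down-project, ReLU, up-project, skip."""

    def __init__(self, hidden_size, bottleneck_size, seed=0):
        rng = random.Random(seed)
        self.down = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]
                     for _ in range(bottleneck_size)]
        # Assumption: the up-projection starts at zero, so a freshly inserted
        # patch is an identity map and leaves the pre-trained model unchanged.
        self.up = [[0.0] * bottleneck_size for _ in range(hidden_size)]

    def __call__(self, hidden):
        update = matvec(self.up, relu(matvec(self.down, hidden)))
        return [h + u for h, u in zip(hidden, update)]

    def num_parameters(self):
        return 2 * len(self.down) * len(self.down[0])


def frozen_layer(hidden):
    """Stand-in for a frozen pre-trained layer; it is never updated."""
    return [2.0 * h for h in hidden]


def patched_block(hidden, patch):
    """Residual block with the patch inserted after the frozen layer."""
    return [h + p for h, p in zip(hidden, patch(frozen_layer(hidden)))]


patch = ModelPatch(hidden_size=8, bottleneck_size=2)
hidden = [1.0] * 8
output = patched_block(hidden, patch)
print(output)
print("patch parameters:", patch.num_parameters(), "vs. full 8x8 layer:", 8 * 8)
```

During fine-tuning, only the patch (and the task classifier) would be trained while `frozen_layer` stays fixed, which corresponds to the "grow together" step in the analogy.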
![Page 20: Parameter-Efficient Transfer from Sequential Behaviors for](https://reader031.vdocument.in/reader031/viewer/2022012417/61723675a542ce2810513849/html5/thumbnails/20.jpg)
Outline
➢ Motivation
➢ Related Work
➢ PeterRec
➢ Experiments
![Page 21: Parameter-Efficient Transfer from Sequential Behaviors for](https://reader031.vdocument.in/reader031/viewer/2022012417/61723675a542ce2810513849/html5/thumbnails/21.jpg)
Results.
Is pre-training necessary?
• PeterZero: no pre-training
• PeterRec: with pre-training
![Page 22: Parameter-Efficient Transfer from Sequential Behaviors for](https://reader031.vdocument.in/reader031/viewer/2022012417/61723675a542ce2810513849/html5/thumbnails/22.jpg)
Results.
![Page 23: Parameter-Efficient Transfer from Sequential Behaviors for](https://reader031.vdocument.in/reader031/viewer/2022012417/61723675a542ce2810513849/html5/thumbnails/23.jpg)
What more can be done by PeterRec
• Adolescent mental health: for parents
• Payment capacity: for banks
• Advertising: for companies
Example: given a teenager's video-watching behaviors, PeterRec may be able to indicate whether they have depression or a propensity for violence, without resorting to much feature engineering or human-labeled data.