Parameter-Efficient Transfer from Sequential Behaviors for User Modeling and Recommendation (SIGIR 2020)

Fajie Yuan (Tencent); Xiangnan He (University of Science and Technology of China); Alexandros Karatzoglou (Google Research); Liguang Zhang (Tencent)

PeterRec Data & Code: https://github.com/fajieyuan/sigir2020_peterrec

Post on 22-Oct-2021


Page 1: Parameter-Efficient Transfer from Sequential Behaviors for

Fajie Yuan (Tencent); Xiangnan He (University of Science and Technology of China); Alexandros Karatzoglou (Google Research); Liguang Zhang (Tencent)

Parameter-Efficient Transfer from Sequential Behaviors for User Modeling and Recommendation

SIGIR 2020

PeterRec Data&Code: https://github.com/fajieyuan/sigir2020_peterrec

Page 2: Parameter-Efficient Transfer from Sequential Behaviors for

Outline

➢ Motivation
➢ Related Work
➢ PeterRec
➢ Experiments

Page 3: Parameter-Efficient Transfer from Sequential Behaviors for

Users engage with recommender systems and leave feedback.

Figure: https://www.researchgate.net/figure/The-sequential-recommendation-process-After-the-RS-recommends-an-item-the-user-gives_fig4_311513879

Page 4: Parameter-Efficient Transfer from Sequential Behaviors for

Our Motivation

A user has different roles to play in life!

[Figure: one user ("Me") at the center of the services they use: car, news, music and video recommendation, social apps, a search engine, a browser and a map app.]

Page 5: Parameter-Efficient Transfer from Sequential Behaviors for

Our Motivation

A user has different roles to play in life!

About myself (my user model):
• Male
• Age: 30
• PhD, University of Glasgow
• Researcher at Tencent
• Married
• Hot user in TikTok
• Cold user in Amazon
• New user in Netflix

Page 6: Parameter-Efficient Transfer from Sequential Behaviors for

Our Motivation

A user has different roles to play in life!

Our PeterRec

About myself (my user model):
• Male
• Age: 30
• PhD, University of Glasgow
• Researcher at Tencent
• Married
• Hot user in TikTok
• Cold user in Amazon
• New user in Netflix

Page 7: Parameter-Efficient Transfer from Sequential Behaviors for

Outline

➢ Motivation
➢ Related Work
➢ PeterRec
➢ Experiments

Page 8: Parameter-Efficient Transfer from Sequential Behaviors for

Recommendation Background

• (1) Content & Context Recommendation

• (2) Session-based Recommendation: recommending the next item based on previously recorded user interactions.

[Figure: left, a DSSM (non-sequential) RS model trained with supervised learning. A user vector is built from $x_1, x_2, \ldots, x_n$ plus context $C$, an item vector from $y_i$, and each side passes through an embedding layer and a DNN. Right, the sequential NextItNet model trained with self-supervised learning.]

Page 9: Parameter-Efficient Transfer from Sequential Behaviors for

Why sequential recommendation?

• Short videos (TikTok, Weishi, Kuaishou)

• Music (Tencent Music, Yahoo! Music) & news

• Movie clips (YouTube, Netflix)

NonSeq Rec vs. Seq Rec:

• Static preference only vs. dynamic preference

• Manual feature engineering vs. manual-free features

• Supervised learning vs. unsupervised (self-supervised) learning

Page 10: Parameter-Efficient Transfer from Sequential Behaviors for

Transfer Learning Background

TL aims to extract knowledge from one or more source tasks and apply that knowledge to a target task.

Figure: A Comprehensive Hands-on Guide to Transfer Learning with Real-World Applications in Deep Learning, online

Page 11: Parameter-Efficient Transfer from Sequential Behaviors for

Transfer Learning (TL) vs Multi-task Learning (MTL)

TL vs MTL
• Two-stage training vs. joint training
• One objective vs. multiple objectives
• Care only about the target vs. care about all objectives

[1] ICML 2018: Advances in transfer, multitask, and semi-supervised learning, online

Page 12: Parameter-Efficient Transfer from Sequential Behaviors for

Transfer Learning (TL) for Recommender System (RS)

Motivation:
• User representations may be generic, since users' preferences tend to be similar across different recommendation tasks. That is, a user's engagement on previous platforms may provide important training signals for other systems.

• Traditional ML models usually fail when modeling new or cold users due to a lack of interaction data.

[Figure [1]: taken from an online source; the URL is missing.]

Page 13: Parameter-Efficient Transfer from Sequential Behaviors for

Outline

➢ Motivation
➢ Related Work
➢ PeterRec
➢ Experiments

Page 14: Parameter-Efficient Transfer from Sequential Behaviors for

Transfer Learning (TL) for Recommender System (RS)

Task description

Source data: $(u, x^u)$, where $x^u = \{x_1^u, x_2^u, \ldots, x_n^u\}$ and $x_t^u$ denotes the $t$-th interacted item of user $u$.

Target data: $(u, y)$, where $y$ is the supervised label in the target dataset.

Example
Source data: users' watching activities in Tencent QQ Browser.
Target data: users' watching activities in Kandian, where these users are cold or new; or users' profile labels, e.g. age, gender, life status, etc.
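To make the task description concrete, here is a minimal sketch of how the two datasets could be represented and linked by user ID. The class names, field names and the `join_on_user` helper are hypothetical and chosen only for illustration; they are not taken from the PeterRec code.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class SourceExample:
    """One source record (u, x^u): a user and their ordered item interactions."""
    user_id: str
    items: List[int]  # x_1^u, ..., x_n^u in interaction order


@dataclass
class TargetExample:
    """One target record (u, y): a user and a supervised label."""
    user_id: str
    label: Union[int, str]  # e.g. an item ID, or a profile label such as gender


def join_on_user(
    source: List[SourceExample], target: List[TargetExample]
) -> List[Tuple[List[int], Union[int, str]]]:
    """Pair each target label with the same user's source sequence.

    Users that have no source sequence are skipped (an assumption of this sketch).
    """
    seq_by_user = {s.user_id: s.items for s in source}
    return [
        (seq_by_user[t.user_id], t.label)
        for t in target
        if t.user_id in seq_by_user
    ]


source = [SourceExample("u1", [3, 7, 7, 2]), SourceExample("u2", [5, 1])]
target = [TargetExample("u1", "male"), TargetExample("u3", "female")]
pairs = join_on_user(source, target)
```

Here only `u1` appears in both datasets, so `pairs` holds a single (sequence, label) example for fine-tuning.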

Page 15: Parameter-Efficient Transfer from Sequential Behaviors for

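PeterRec Architecture

[Figure: (a) Pre-training on QQ Browser data: a NextItNet-style autoregressive neural network over user clicking behaviors, with inputs $x_1^u, x_2^u, x_3^u, x_4^u$ and targets $x_2^u, x_3^u, x_4^u, x_5^u$. (b) Fine-tuning on the user profile dataset: inputs $x_1^u, x_2^u, x_3^u, x_4^u$ with a [TCL] token, predicting a label (e.g., gender). Symbols shown in the panels: $\mathcal{H}$, $\tilde{\Theta}$, $w$, $\Theta$, $\pi$, $\nu$, $\tilde{\mathcal{H}}$.]

NextItNet: A Simple Convolutional Generative Network for Next Item Recommendation. Yuan et al., WSDM 2019.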
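The autoregressive pre-training setup in panel (a), where the model reads a click sequence and predicts the same sequence shifted by one position, can be sketched as follows. This shows only how the training pairs are built, not NextItNet itself, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple, TypeVar

Item = TypeVar("Item")


def autoregressive_pairs(sequence: Sequence[Item]) -> Tuple[List[Item], List[Item]]:
    """Split a click sequence into model inputs and next-item targets.

    The target at position t is the item the user clicked at position t + 1,
    so no manual labels are needed (self-supervised learning).
    """
    if len(sequence) < 2:
        raise ValueError("need at least two interactions to form a pair")
    return list(sequence[:-1]), list(sequence[1:])


def prefix_targets(sequence: Sequence[Item]) -> List[Tuple[List[Item], Item]]:
    """Enumerate every (history, next item) example contained in one sequence."""
    return [(list(sequence[:t]), sequence[t]) for t in range(1, len(sequence))]


clicks = ["x1", "x2", "x3", "x4", "x5"]
inputs, targets = autoregressive_pairs(clicks)
examples = prefix_targets(clicks)
```

With five clicks, `inputs` covers x1 to x4 and `targets` covers x2 to x5, matching the two rows of the figure.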

Page 16: Parameter-Efficient Transfer from Sequential Behaviors for

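What can be done by PeterRec

[Figure: the same two panels as the architecture slide. (a) Pre-training: inputs $x_1^u, x_2^u, x_3^u, x_4^u$, targets $x_2^u, x_3^u, x_4^u, x_5^u$. (b) Fine-tuning: inputs $x_1^u, x_2^u, x_3^u, x_4^u$ with a [TCL] token, predicting a label.]

• Cold-start problem, e.g., ads rec
• User profile prediction, e.g., gender prediction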

Page 17: Parameter-Efficient Transfer from Sequential Behaviors for

Problems we meet when a number of tasks are required

[Figure: one pretraining model and six fine-tuning models. Each fine-tuning model takes the behavior sequence $x_1^u, x_2^u, x_3^u, x_4^u$ with a [TCL] token as input, has its own classifier (labelled π1 ν to π6 ν) and its own fine-tuned copy of the pretrained network, and predicts one label: Age, Life status, Education, Gender, Profession, Ads.]

Training a separate model for each downstream task is parameter-inefficient, since both the pretraining and the fine-tuning models are very large.

The number of fine-tuned models equals the number of downstream tasks: 100 tasks = 100 fine-tuned models.
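A rough back-of-the-envelope calculation shows why this matters. The sketch below compares one full model copy per task against a single shared, frozen pretrained network with a small task-specific add-on per task. All sizes are made-up illustrative numbers, not figures from the slides or the paper, and the function names are hypothetical.

```python
def full_finetuning_params(backbone: int, head: int, num_tasks: int) -> int:
    """Every task stores its own fine-tuned copy of the whole network."""
    return num_tasks * (backbone + head)


def shared_backbone_params(backbone: int, patch: int, head: int, num_tasks: int) -> int:
    """One frozen backbone is stored once; each task adds only a small patch and head."""
    return backbone + num_tasks * (patch + head)


# Illustrative sizes only (assumptions, not measured values).
BACKBONE = 10_000_000   # parameters in the pretrained network
PATCH = 500_000         # task-specific parameters added per task
HEAD = 10_000           # classifier layer per task
TASKS = 100

full = full_finetuning_params(BACKBONE, HEAD, TASKS)
shared = shared_backbone_params(BACKBONE, PATCH, HEAD, TASKS)
ratio = full / shared
```

Under these assumed sizes, storing 100 separately fine-tuned models takes roughly 16 times as many parameters as sharing one backbone.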

Page 18: Parameter-Efficient Transfer from Sequential Behaviors for

Taking inspiration from grafting

A: branch of plum
B: tree of peach
C: insertion
D: grow together

The pretrained model is treated as the peach tree.

Page 19: Parameter-Efficient Transfer from Sequential Behaviors for

Grafting for plants vs. grafting for models

[Figure: an original ResNet block, and the same block after insertion of the model patch (MP).]

A: branch of plum vs. MP
B: tree of peach vs. pretrained model
C: insertion vs. insertion
D: grow together vs. fine-tuning

Parameter-Efficient Transfer from Sequential Behaviors for User Modeling and Recommendation, Yuan et al SIGIR 2020
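The grafting idea can be illustrated with a small, dependency-free sketch: a tiny trainable patch is added on top of a hidden vector produced by the frozen pretrained network, and only the patch would be updated during fine-tuning. The bottleneck shape, the skip connection and the zero initialization are assumptions made for this illustration. The slides do not specify the internals of the MP, and `ModelPatch` is a hypothetical name.

```python
import random
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matmul_vec(vec: Vector, matrix: Matrix) -> Vector:
    """Multiply a length-d vector by a d x k matrix, giving a length-k vector."""
    return [sum(v * row[j] for v, row in zip(vec, matrix)) for j in range(len(matrix[0]))]


def relu(vec: Vector) -> Vector:
    return [max(0.0, v) for v in vec]


class ModelPatch:
    """Bottleneck patch: project down, apply ReLU, project up, add a skip connection."""

    def __init__(self, hidden_dim: int, bottleneck_dim: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        self.bottleneck_dim = bottleneck_dim
        # Small random down-projection (hidden_dim x bottleneck_dim).
        self.w_down: Matrix = [
            [rng.gauss(0.0, 0.01) for _ in range(bottleneck_dim)]
            for _ in range(hidden_dim)
        ]
        # Zero up-projection, so a fresh patch leaves the pretrained output unchanged.
        self.w_up: Matrix = [[0.0] * hidden_dim for _ in range(bottleneck_dim)]

    def __call__(self, hidden: Vector) -> Vector:
        squeezed = relu(matmul_vec(hidden, self.w_down))
        expanded = matmul_vec(squeezed, self.w_up)
        return [h + e for h, e in zip(hidden, expanded)]

    def num_params(self) -> int:
        return 2 * self.hidden_dim * self.bottleneck_dim


patch = ModelPatch(hidden_dim=8, bottleneck_dim=2)
hidden = [0.5, -1.0, 2.0, 0.0, 1.5, -0.5, 3.0, 1.0]  # stand-in for a frozen layer's output
patched = patch(hidden)
```

Because the up-projection starts at zero, the patched network initially behaves exactly like the pretrained one, and each new task adds only `patch.num_params()` weights plus its classifier.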

Page 20: Parameter-Efficient Transfer from Sequential Behaviors for

Outline

➢ Motivation
➢ Related Work
➢ PeterRec
➢ Experiments

Page 21: Parameter-Efficient Transfer from Sequential Behaviors for

Results: is pretraining necessary?

PeterZero: no pretraining

PeterRec: with pretraining

Page 22: Parameter-Efficient Transfer from Sequential Behaviors for

Results.

Page 23: Parameter-Efficient Transfer from Sequential Behaviors for

What can be done by PeterRec

More:
• Adolescent mental health: for parents
• Payment capacity: for banks
• Advertising: for companies

Example: if we have the video watching behaviors of a teenager, PeterRec may let us infer whether they have depression or a propensity for violence, without resorting to much feature engineering or human-labeled data.

Page 24: Parameter-Efficient Transfer from Sequential Behaviors for