

RecSys 2016 Boston

Modelling Contextual Information in Session-Aware Recommender Systems with Neural Networks

Bartłomiej Twardowski

Warsaw University of Technology

18 August 2016

Bartłomiej Twardowski RecSys2016 Boston

1 Problem Definition and Motivation

2 Explicit Session Modeling with Matrix Factorization

3 Automatic Session Modeling with NN

4 Experiments and Evaluation


Research Motivation

Traditional RS require an identified user and persistent items.

Context-Aware RS (CARS) are not prepared to handle user session data directly.

Time in recommender systems is discretized (CARS Tensor Factorization) or used as a bias.

Successes of using Neural Networks in other fields:

FF/Conv NN in vision and image processing

RNN for Natural Language Processing


Industry Motivation

Main objective

Capture a user's short-term goals as fast as possible.

57.06% of all sessions belong to non-logged-in users, whose fingerprint, in the form of an HTTP cookie or device hash, does not allow us to identify the user.¹

Only 2.53% of all sessions converted to a transaction. Most sessions are window-shopping ones.

20.98% of all page views are interactions with the search engine.

35.80% of all sessions used search to find the right offer.

¹ The presented statistics are based on a sample of 3228M page views from the Polish e-marketplace allegro.pl in January 2016, in which 310M sessions were identified by HTTP cookie or mobile device hash.


Implicit Feedback: User-System Interactions

Sequence of nodes in the diagram, in session order, with the attributes shown next to each node:

· source: direct, ua: chrome1, cookie, user-id
S1 search-terms: iPhone
S2 sort: price, search-terms: iPhone 5 black
V1 id1, name2, desc1, seller1, attrib1: value1, ...
V2
V3
S3 category: Accessories, terms: iPhone 5 black
S4 sort: price, location: Warsaw
C+1 add-item: id2
C+2 add-item: id3
C−3 remove-item: id2
V4
V5
B1 item: id3

Figure: Navigational representation of the session for a sample e-marketplace session


Research Problem - Key Assumptions

The user is not identified by a known id, only by the current behaviour.

All user activities are available in the form of sessions.

Items are ephemeral, described by a set of attributes.

RS output is Top-N new, recommended items.


User Session Definition

User sessions in this work are defined as uninterrupted sequences of activity in the system. The session ends when the user is inactive for more than a predefined number of minutes [1].

All sessions form a set $S = \{s_1, \dots, s_m\}$, where each session is represented as a set of events ordered in time: $s_m = \{e_m^{(1)}, \dots, e_m^{(t)}\}$, where $t$ is the time of the event occurrence in session $m$. In turn, each event is described by contextual information: $e_m^{(t)} \in C^E_1 \times C^E_2 \times \cdots \times C^E_k$, where the number of attributes $k$ depends on the type of the collected event $e_m^{(t)}$.
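The inactivity rule in this definition can be sketched in a few lines of Python. This is only an illustration, not the author's implementation: the function name `split_sessions`, the `(timestamp, payload)` event layout, and the 30-minute default are assumptions, since the slides only speak of "a predefined number of minutes".

```python
from datetime import datetime, timedelta

def split_sessions(events, max_gap=timedelta(minutes=30)):
    """Group one visitor's (timestamp, payload) events into sessions.

    A new session starts whenever the gap since the previous event
    exceeds max_gap. The 30-minute default is an assumed value.
    """
    sessions = []
    previous_time = None
    for timestamp, payload in sorted(events, key=lambda e: e[0]):
        if previous_time is None or timestamp - previous_time > max_gap:
            sessions.append([])  # inactivity gap exceeded: open a new session
        sessions[-1].append((timestamp, payload))
        previous_time = timestamp
    return sessions

start = datetime(2016, 1, 10, 12, 0)
events = [
    (start, "search: iPhone"),
    (start + timedelta(minutes=2), "view: id1"),
    (start + timedelta(minutes=50), "search: iPhone 5 black"),
]
sessions = split_sessions(events)
print([len(s) for s in sessions])  # [2, 1]
```

The third event arrives 48 minutes after the second, so it opens a second session.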


Ephemeral Items and Item Representation

The recommended items are ephemeral, i.e. the item life-cycle is too short or the availability is too dynamic to identify an item only by a unique id, e.g. news, online auctions.

In such settings, the known workarounds are Content-Based filtering and item clustering.

Assuming that items will never come back means dealing with a constant cold-start problem.

The items form a set $I = \{i_1, \dots, i_n\}$. Each item is described by a set of defined attributes $i_n \in C^I_1 \times C^I_2 \times \cdots \times C^I_p$, where the number of all item attributes is $p$.


Item and Event Encoding

All methods presented in this work require items and events to be represented by real-valued vectors.

Item encoding: a mapping $C^I_1 \times C^I_2 \times \cdots \times C^I_p \to x_i$, $x_i \in \mathbb{R}^{d_I}$ should exist, where $d_I$ is the number of real values in the encoded item representation.

Similarly, for the session event: $C^E_1 \times C^E_2 \times \cdots \times C^E_k \to x_e$, $x_e \in \mathbb{R}^{d_E}$, where $d_E$ is the dimension of the encoded event vector.
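One simple way to obtain such vectors is a binary bag-of-words over attribute tokens with a minimum-frequency cut-off. The sketch below is an assumption-laden illustration: the helper names `build_vocabulary` and `encode`, the `key:value` token format, and counting frequency as the number of records containing a token are all hypothetical choices, not details taken from the slides.

```python
from collections import Counter

def build_vocabulary(records, min_frequency=1):
    """Map each token found in at least min_frequency records to an index."""
    counts = Counter(token for record in records for token in set(record))
    kept = sorted(token for token, n in counts.items() if n >= min_frequency)
    return {token: index for index, token in enumerate(kept)}

def encode(record, vocabulary):
    """Return a real-valued vector whose length is the vocabulary size."""
    vector = [0.0] * len(vocabulary)
    for token in record:
        if token in vocabulary:
            vector[vocabulary[token]] = 1.0  # binary bag-of-words indicator
    return vector

items = [
    ["category:phones", "name:iphone", "name:black"],
    ["category:phones", "name:iphone", "name:case"],
    ["category:accessories", "name:case"],
]
vocabulary = build_vocabulary(items, min_frequency=2)
x_i = encode(items[0], vocabulary)
print(sorted(vocabulary))  # ['category:phones', 'name:case', 'name:iphone']
print(x_i)                 # [1.0, 0.0, 1.0]
```

Here the encoded dimension is 3 because only three tokens occur in at least two records; rare tokens such as `name:black` are dropped.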


Matrix Factorization Model

The final estimation is:

$y(x_s, x_i) = (x_s Q)(x_i P)^\top + x_s b_p^\top + x_i b_q$

where $Q \in \mathbb{R}^{d_S \times d}$ and $P \in \mathbb{R}^{d_I \times d}$ are matrices with $d$-dimensional latent features for session and item variables, respectively.
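The estimation can be evaluated directly for a toy example. The sketch below uses plain Python lists to stay dependency-free. It assumes that `b_p` and `b_q` are bias weight vectors of length d_S and d_I, which the slide does not define, and all numeric values are made up for illustration.

```python
def vec_mat(x, M):
    """Row vector x (length n) times matrix M (n rows), giving a row vector."""
    return [sum(x[r] * M[r][c] for r in range(len(x))) for c in range(len(M[0]))]

def dot(a, b):
    """Inner product of two equally long vectors."""
    return sum(u * v for u, v in zip(a, b))

def mf_score(x_s, x_i, Q, P, b_p, b_q):
    """Latent interaction term plus the two bias terms of the estimation."""
    session_latent = vec_mat(x_s, Q)  # length d
    item_latent = vec_mat(x_i, P)     # length d
    return dot(session_latent, item_latent) + dot(x_s, b_p) + dot(x_i, b_q)

# Toy sizes (assumed): d_S = 2, d_I = 3, d = 2.
Q = [[1.0, 0.0], [0.0, 1.0]]
P = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b_p = [0.1, 0.2]
b_q = [0.0, 0.5, 0.0]
y = mf_score([1.0, 2.0], [1.0, 0.0, 1.0], Q, P, b_p, b_q)
print(round(y, 6))  # 4.5
```

The session projects to [1, 2], the item to [2, 1], so the interaction term is 4; the session bias adds 0.5 and the item bias adds 0.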


Explicit Session Modeling

The session vector $x_s$ aggregates variables from all events within the session. Because assumptions have to be made about how the information from all events should be encoded into a single session vector, this method is considered explicit session modeling.

One solution, which gives good results and is used in this work, is to aggregate event data in a time-decaying way:

$x_{s_m} = \sum_{j=1}^{t} \frac{1}{1+t-j} \, x^{(j)}_{e_{s,m}}$, where the session vector size is $d_S = d_E$.
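A small worked example of the time-decayed aggregation: with t events, event j receives weight 1 / (1 + t - j), so the latest event has weight 1 and older events count progressively less. The function name `aggregate_session` and the toy vectors are assumptions for illustration.

```python
def aggregate_session(event_vectors):
    """Time-decayed sum: event j of t gets weight 1 / (1 + t - j), j = 1..t."""
    t = len(event_vectors)
    session = [0.0] * len(event_vectors[0])
    for j, event in enumerate(event_vectors, start=1):
        weight = 1.0 / (1 + t - j)
        for k, value in enumerate(event):
            session[k] += weight * value
    return session

# Three encoded events (oldest first), each of dimension d_E = 2.
x_s = aggregate_session([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print([round(v, 4) for v in x_s])  # [1.3333, 1.5]
```

The weights are 1/3, 1/2 and 1, giving 1/3 + 1 in the first dimension and 1/2 + 1 in the second.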


Recurrent NN and Feed Forward NN

Both a Recurrent Neural Network (RNN) and a Feed Forward Neural Network (FFNN) are used to predict Top-N recommendations for the session.

The RNN is used to capture data dependency between session events in time. It uses a hidden state as memory to handle variable-length data, in this case the sequence of events in a session.

The FFNN is used as a ranking score estimator. It takes as input the representation of the session context returned by the RNN and the data of new items.

Pairwise ranking loss functions: BPR [3], TOP-1 [2], WARP [4], k-os WARP.
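Two of the listed losses, BPR and TOP-1, have compact closed forms over the score of the positive item and the scores of sampled negative items. The sketch below follows the formulations commonly given for them in [3] and [2]; the function names and the per-example averaging over negatives are assumptions, and WARP and k-os WARP are not covered.

```python
import math

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def softplus(x):
    """log(1 + exp(x)) computed without overflow."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def bpr_loss(positive, negatives):
    """BPR: mean of -log(sigmoid(positive - negative)) over the negatives."""
    return sum(softplus(n - positive) for n in negatives) / len(negatives)

def top1_loss(positive, negatives):
    """TOP-1: mean of sigmoid(negative - positive) + sigmoid(negative ** 2)."""
    return sum(sigmoid(n - positive) + sigmoid(n * n) for n in negatives) / len(negatives)

print(round(bpr_loss(0.0, [0.0]), 4))   # 0.6931
print(round(top1_loss(0.0, [0.0]), 4))  # 1.0
```

Both losses decrease as the positive item's score rises above the negatives; TOP-1 additionally penalises large negative scores through the squared term.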


NN Architecture

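Layers shown in the diagram, from input to output:

Event Embedding (input $x^{(t)}_{e_s}$) → RNN Layers
Item Embedding (input $x_i$)
RNN Layers + Item Embedding → Merge/Dropout Layer → FF Layers → output $y^{(t)}_{s,i}$

Figure: Neural Network Layers Architecture.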
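The data flow through these layers can be illustrated with an untrained forward pass in plain Python. Everything concrete here is an assumption made for the sketch: a GRU as the recurrent cell with one common gate convention, no bias terms, ReLU in the feed-forward layer, random weights, tiny layer sizes, and no embedding layers. Dropout is left out because it is only active during training.

```python
import math
import random

def matvec(W, v):
    """Matrix (list of rows) times column vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class GRUCell:
    def __init__(self, d_in, d_hidden, rng):
        self.d_hidden = d_hidden
        # One (W, U) pair per gate: update z, reset r, candidate c.
        self.W = {g: rand_matrix(d_hidden, d_in, rng) for g in "zrc"}
        self.U = {g: rand_matrix(d_hidden, d_hidden, rng) for g in "zrc"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        c = [math.tanh(a + b) for a, b in zip(matvec(self.W["c"], x), matvec(self.U["c"], rh))]
        # New state: interpolate between the old state and the candidate.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, c)]

def encode_session(event_vectors, gru):
    """Run the RNN over the events; the last hidden state is the session context."""
    h = [0.0] * gru.d_hidden
    for x in event_vectors:
        h = gru.step(x, h)
    return h

def score(event_vectors, item_vector, gru, W_hidden, w_out):
    """Merge session context with the item vector, then apply the FF scorer."""
    merged = encode_session(event_vectors, gru) + item_vector  # concatenation
    hidden = [max(0.0, a) for a in matvec(W_hidden, merged)]   # ReLU (assumed)
    return sum(w * a for w, a in zip(w_out, hidden))

rng = random.Random(0)
d_e, d_i, d_h, d_ff = 4, 3, 5, 6  # toy sizes, not the sizes used in the talk
gru = GRUCell(d_e, d_h, rng)
W_hidden = rand_matrix(d_ff, d_h + d_i, rng)
w_out = [rng.uniform(-0.1, 0.1) for _ in range(d_ff)]
events = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
s = score(events, [1.0, 0.0, 1.0], gru, W_hidden, w_out)
```

Because the recurrent state is updated once per event, the same code handles sessions of any length, and a score can be produced after every event by calling `score` on a prefix of the session.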


Datasets

dataset         ALLEGRO    AVITO
items             24360     4374
sessions          20904    31826
events           535871   767550
searches         366874   110204
density %         0.028    0.464
s. len. mean     25.634   24.117
s. len. std      30.282   14.877

Table: Dataset statistics

For further processing, the events and items in each dataset are encoded. The bag-of-words minimal frequency is set to 10. This results in $d_I$ = 473/2710 and $d_E$ = 716/5523 for the AVITO/ALLEGRO datasets, respectively.


Evaluated Methods

Methods used and their training parameters:

POP

CB: cos

MF: d=100, adagrad

NN: no embedding, single GRU layer with 200 units, MLP 400, dropout 0.3, rmsprop
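As an illustration of the content-based baseline with cosine similarity, candidates can be ranked by their similarity to a profile vector. How the profile is built is not stated in the slides; the sketch below simply assumes a profile vector in the same space as the encoded items, and the names `cosine` and `rank_by_cosine` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0

def rank_by_cosine(profile, candidates, n):
    """Return the ids of the n candidates most similar to the profile."""
    ranked = sorted(candidates.items(),
                    key=lambda pair: cosine(profile, pair[1]),
                    reverse=True)
    return [item_id for item_id, _ in ranked[:n]]

candidates = {"a": [1.0, 1.0, 0.0], "b": [1.0, 0.0, 0.0], "c": [0.0, 0.0, 1.0]}
print(rank_by_cosine([1.0, 1.0, 0.0], candidates, n=2))  # ['a', 'b']
```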


Experiments Results

              ALLEGRO                     AVITO
Method        REC@20        MRR@20        REC@20        MRR@20
NN-BPR-IE     0.4131±.0051  0.1328±.0023  0.1601±.0028  0.0469±.0009
MF-BPR-IE     0.3849±.0031  0.1150±.0003  0.1894±.0008  0.0404±.0001
NN-TOPK-IE    0.3367±.0057  0.0885±.0020  0.1985±.0031  0.0579±.0004
MF-TOPK-IE    0.2862±.0024  0.0773±.0017  0.2699±.0018  0.0658±.0005
NN-TOPK-I     0.2863±.0048  0.0709±.0025  0.2003±.0012  0.0579±.0004
MF-BPR-I      0.3080±.0013  0.0864±.0012  0.1883±.0015  0.0408±.0000
MF-TOPK-I     0.2353±.0025  0.0586±.0008  0.2691±.0001  0.0660±.0004
CB            0.1858        0.0354        0.1528        0.0243
POP           0.0499        0.0129        0.0193        0.0037

Table: Experiment results for Recall and Mean Reciprocal Rank for Top-20 recommendations. A row label describes the algorithm used, the loss function, and the contextual information (I: only items data, IE: items and events context). The mean values with 95% CI are given.
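For reference, Recall@N and MRR@N can be computed from ranked recommendation lists as sketched below. The sketch assumes a single held-out target item per test case, which the slides do not spell out; the function names and the toy data are illustrative only.

```python
def recall_at_n(ranked, target, n=20):
    """1.0 if the held-out target is among the top-n ranked items, else 0.0."""
    return 1.0 if target in ranked[:n] else 0.0

def reciprocal_rank_at_n(ranked, target, n=20):
    """1 / rank of the target if it is in the top-n, else 0.0."""
    top = ranked[:n]
    return 1.0 / (top.index(target) + 1) if target in top else 0.0

def evaluate(cases, n=20):
    """Average both metrics over (ranked_list, target) test cases."""
    recall = sum(recall_at_n(r, t, n) for r, t in cases) / len(cases)
    mrr = sum(reciprocal_rank_at_n(r, t, n) for r, t in cases) / len(cases)
    return recall, mrr

cases = [
    (["a", "b", "c"], "b"),  # hit at rank 2
    (["c", "a", "b"], "c"),  # hit at rank 1
    (["a", "b", "c"], "z"),  # miss
]
recall, mrr = evaluate(cases, n=2)
print(round(recall, 4), round(mrr, 4))  # 0.6667 0.5
```

Two of the three targets appear in the top 2, so recall is 2/3; the reciprocal ranks 1/2, 1 and 0 average to 0.5.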


Experiments Results

[Four line plots: REC@N and MRR@N against N (5, 10, 15, 20), one pair of panels for Dataset: ALLEGRO and one for Dataset: AVITO. ALLEGRO legend: NN-BPR-IE, MF-BPR-IE, NN-TOPK-IE, MF-BPR-I, NN-TOPK-I, CB, POP. AVITO legend: MF-TOPK-I, NN-TOPK-I, NN-BPR-IE, MF-BPR-IE, CB, POP.]

Figure: Results in Recall and MRR@Top-N Recommendations.


Thank You. Q&A.

Bartłomiej Twardowski, @btwardow


References I

[1] Gayo-Avello, D. A survey on session detection methods in query logs and a proposal for future evaluation. Information Sciences 179, 12 (2009).

[2] Hidasi, B., Karatzoglou, A., Baltrunas, L., and Tikk, D. Session-based Recommendations with Recurrent Neural Networks.

[3] Rendle, S., Freudenthaler, C., Gantner, Z., and Schmidt-Thieme, L. BPR: Bayesian Personalized Ranking from Implicit Feedback. In Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence (2009).

[4] Weston, J., Bengio, S., and Usunier, N. WSABIE: Scaling up to large vocabulary image annotation. IJCAI International Joint Conference on Artificial Intelligence (2011).
