Nuanced graph representation to improve recommendation: the case of browsing and social networks

Post on 17-Nov-2014

Category:

Data & Analytics

DESCRIPTION

Graphs are ubiquitous representations of a wide range of online traces generated by user activities including browsing, messaging, social linking, and many more. For their simplicity and power, graphs (like other similar representations of relational data) have been used in a plethora of applications, most of them falling under the umbrella of recommendation and personalization. However, very often the notion of graph and its atomic components (nodes and edges) are adopted uncritically, without giving much thought to their nature or meaning. In real-world scenarios the meaning of a link can vary broadly even within the same system or interaction type. We study browsing and social graphs and show how to obtain a more nuanced representation of their links, both to gain a deeper understanding of their nature and, in turn, to properly exploit the information about link type in recommendation tasks. First, we present the use of the BrowseGraph and its decomposition into ReferrerGraphs for image and news recommendation. Last, we show how conversation graphs can be decomposed into subgraphs carrying different information about the type of resources exchanged between peers, providing an overview of the potential that such a nuanced representation can have in the field of recommendation. Our analysis is conducted on large datasets extracted from Yahoo News, Flickr, and aNobii.

TRANSCRIPT

Nuanced graph representation to improve recommendation
The case of browsing and social networks

1st International Workshop on Social Personalisation (SP 2014)

Luca Maria Aiello

Who’s this guy?

Network analysis

Roadmap

Part I: Browsing graphs in context
To surface interesting content and address the cold-start scenario

Part II: Pragmatics of communication graphs
To decompose the dyadic interaction and profile user-to-user ties

Browsing graphs

Team

Luca Maria Aiello

Michele Trevisiol

Alejandro Jaimes

Luca Chiarandini

Rossano Schifanella

Browse Graph

• Nodes are pages
• Edges are aggregated browsing transitions

Trevisiol et al. “Image Ranking Based on User Browsing Behaviour” SIGIR 2012

• Centrality is a “good” indicator of content interestingness

• External layers add useful information
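A minimal sketch of the BrowseGraph idea, assuming browsing sessions are available as ordered lists of page ids. The function names, the toy sessions, and the choice of weighted PageRank as the centrality measure are assumptions for illustration, not the pipeline from the cited paper.

```python
from collections import Counter, defaultdict


def build_browse_graph(sessions):
    """Aggregate page-to-page transitions over all sessions into weighted edges."""
    edges = Counter()
    for pages in sessions:
        for src, dst in zip(pages, pages[1:]):
            if src != dst:  # ignore page reloads
                edges[(src, dst)] += 1
    return edges


def pagerank(edges, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration; dangling mass is spread uniformly."""
    nodes = sorted({node for edge in edges for node in edge})
    out_weight = defaultdict(float)
    for (src, _dst), weight in edges.items():
        out_weight[src] += weight
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0)
        base = (1 - damping) / len(nodes) + damping * dangling / len(nodes)
        new_rank = {node: base for node in nodes}
        for (src, dst), weight in edges.items():
            new_rank[dst] += damping * rank[src] * weight / out_weight[src]
        rank = new_rank
    return rank


# Hypothetical sessions: "ext" stands for an external entry page.
sessions = [["home", "a", "b"], ["home", "a"], ["ext", "a"]]
graph = build_browse_graph(sessions)
scores = pagerank(graph)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

Dropping the "ext" session before building the graph gives the variant without external nodes, which is one way to see how much the external layer shifts the ranking.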

Flickr BrowseGraph

• Flickr browsing data
– 2 months, 10M users, 50M nodes, 300M pageviews

Most central nodes in Flickr BrowseGraph

• Comparison with PageRank (no external nodes), Favorites, Clicks, View time
– High quality
– Higher topical variety
– Surfaces photos related to real-world events or interesting but not popular

Top 10 photos

Art Series Odd Events

Referrer Graph

• External accesses come from heterogeneous environments

Trevisiol et al. “Cold-start News Recommendation with Domain-dependent Browse Graph” RecSys 2014

• Extract subgraphs induced by the browsing traces from the same entry point

• Study their structural differences
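A minimal sketch of the ReferrerGraph decomposition, assuming each session is logged as a pair (referrer domain, ordered list of pages). The record layout, the function name, and the sample domains are hypothetical.

```python
from collections import Counter, defaultdict


def build_referrer_graphs(sessions):
    """Return one weighted edge set per referrer domain.

    sessions: iterable of (referrer_domain, [page, page, ...]) pairs.
    """
    graphs = defaultdict(Counter)
    for referrer, pages in sessions:
        for src, dst in zip(pages, pages[1:]):
            graphs[referrer][(src, dst)] += 1
    return graphs


# Hypothetical log entries: the same site is browsed differently
# depending on where the visit started.
log = [
    ("search", ["politics-1", "politics-2"]),
    ("search", ["politics-1", "politics-2", "economy-1"]),
    ("social", ["celebrity-1", "celebrity-2"]),
]
for referrer, edges in build_referrer_graphs(log).items():
    print(referrer, dict(edges))
```

Each per-referrer edge set can then be analysed on its own (node set, centrality ranking) to study structural differences.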

Browsing in News

Yahoo News ReferrerGraphs

• 1 month of Yahoo News browsing log
– 0.5B entries

• Avg. number of hops per session ≈ 2

Domain-dependent consumption

[Figure panels: Jaccard similarity of node sets; Kendall tau of node PageRanks]
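The two comparisons named here can be sketched as follows, assuming each ReferrerGraph is summarised by a dict mapping page id to PageRank score. The function names are illustrative, and the Kendall tau shown is the plain tau-a variant computed over the pages both graphs share, which may differ from the variant used in the talk.

```python
from itertools import combinations


def jaccard(nodes_a, nodes_b):
    """Jaccard similarity of two node sets: |A ∩ B| / |A ∪ B|."""
    a, b = set(nodes_a), set(nodes_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def kendall_tau(scores_x, scores_y):
    """Kendall tau-a over nodes present in both dicts (ties count as neither)."""
    common = sorted(set(scores_x) & set(scores_y))
    pairs = list(combinations(common, 2))
    if not pairs:
        return 0.0
    concordant = discordant = 0
    for u, v in pairs:
        sign = (scores_x[u] - scores_x[v]) * (scores_y[u] - scores_y[v])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


# Hypothetical PageRank scores of the same pages in two ReferrerGraphs.
search = {"a": 0.5, "b": 0.3, "c": 0.2}
social = {"a": 0.6, "b": 0.1, "c": 0.3}
print(jaccard(search, social), kendall_tau(search, social))
```

The quadratic pair loop is fine for a sketch; on graphs of the size reported in the talk an O(n log n) implementation would be needed.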

News consumption in time

[Figure: PDF(views) vs. normalized article lifespan]

Cold start recommendation

• Fingerprint of traffic depends on the referrer domain
• Can we use this for recommendation?

• Random
• Most popular
• Edge-based
• Content-based
• Cosine sim + TF-IDF
• (Full and mix graph variants)
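The content-based baseline is listed as cosine similarity over TF-IDF. A minimal sketch of that combination, assuming whitespace tokenisation and the textbook weighting tf × log(N / df); the function names and the sample headlines are hypothetical, and the talk does not specify which TF-IDF variant was used.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical article texts; the current article is index 0.
articles = [
    "election vote senate",
    "election vote results",
    "football match score",
]
vecs = tfidf_vectors(articles)
ranked = sorted(range(1, len(articles)),
                key=lambda i: cosine(vecs[0], vecs[i]), reverse=True)
print([articles[i] for i in ranked])
```

A baseline like this needs only article text, which is what makes it usable in a cold-start setting where no history exists for the user.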

Benchmark
Averaged over 1,438 hourly graphs (~350k users per hour)

Takeaways

• Graph structure can be more useful than other simple indicators of user feedback to surface interesting content

• Browsing structure changes radically with the referrer domain

• Historical browsing information is more effective than other cold-start indicators to predict next view (surprising?)

Conversation graphs

Team

Luca Maria Aiello

Rossano Schifanella

Bogdan State

Aiello et al. “Reading the source code of social ties” WebSci 2014

Conversation graph

Beyond simple edges

• Structure
• Content
– Syntactics
– Semantics

• Pragmatics (beyond saying)
– Communication acts that define the type of social relationship

hello!

Topic modeling, sentiment analysis, NLP, …

What is the “nature” of a social tie?

Beyond simple edges

• Blau’s Social Exchange Theory
– Exchange of non-material resources

• Objective: Label message with resources it conveys

Peter Blau “Exchange and power in social life” 1964

User profiling, link profiling, visualization, …

How?

1. Preprocessing
– Stopwords, stemming

2. Message bucketing
– NMF, LDA, …

3. Transition graph
– Buckets as nodes, transitions as edges
• Intuition: conversations tend to stick to the same resource (“You’re very good at it” “You are pretty good as well”)

4. Resource extraction
– Community detection on transition graph

Input: directed communication multigraph, arcs labeled with time and text
Output: (probabilistic) assignment message → resource. DISCOVERY!
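Steps 3 and 4 can be sketched as follows, assuming steps 1 and 2 have already mapped every message to a bucket id, so that a conversation is a time-ordered list of bucket ids. The function names and the toy conversations are hypothetical. The grouping step is a deliberately simple stand-in, connected components over frequently traversed transitions, and not the community detection algorithm used in the cited work.

```python
from collections import Counter, defaultdict


def build_transition_graph(conversations):
    """Count transitions between the buckets of consecutive messages."""
    transitions = Counter()
    for buckets in conversations:
        for src, dst in zip(buckets, buckets[1:]):
            transitions[(src, dst)] += 1
    return transitions


def group_buckets(transitions, min_weight=2):
    """Stand-in for community detection: connected components over pairs of
    buckets whose transitions (both directions summed) reach min_weight."""
    nodes = {node for edge in transitions for node in edge}
    adjacency = defaultdict(set)
    for (src, dst), weight in transitions.items():
        if src != dst and weight + transitions.get((dst, src), 0) >= min_weight:
            adjacency[src].add(dst)
            adjacency[dst].add(src)
    groups, seen = [], set()
    for start in sorted(nodes):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        groups.append(component)
    return groups


# Hypothetical bucketed conversations (bucket ids 0..3).
conversations = [[0, 1, 0, 1], [1, 0], [2, 3, 2], [3, 2, 0]]
graph = build_transition_graph(conversations)
print(group_buckets(graph))
```

Each resulting group of buckets plays the role of one candidate resource; a message is then assigned to the group its bucket belongs to.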

Experiments

Dataset (anobii.com)

Aiello et al. “People are Strange when you're a Stranger: Impact and Influence of Bots on Social Networks” ICWSM'12

Anobii transition graph

[Figure: transition graph with groups labeled Status, Knowledge, Support]

Status exchange
• Expression of admiration or esteem
• Recognition of the partner’s higher status
• “You are very smart!”

Knowledge exchange
• Technical knowledge of a domain (stackoverflow)
• Request for knowledge
• “I read a very good review of that book”

Social support
• Emotional valuation
• Everyday minute exchanges
• “Hope your dad is feeling better now”

80% of messages are correctly assigned (human coders)

Gilbert, Karahalios “Predicting tie strength with social media” CHI 2009

Tie composition and strength

Communication networks induced by the exchange of a single resource

• Status: highly reciprocal, short lived, pervasive
• Support: sentiment involved, long lived, between similar actors
• Knowledge: long messages, between similar actors

Inequality

• Gini coefficient ~0.7 for all networks, higher for status

[Figure panels: Lorenz curve; Assortativity]

• People receive status from people with lower status

Indegree (/instrength) = amount of resource owned
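A minimal sketch of the inequality measure, assuming a single-resource network given as a dict mapping (sender, receiver) to the number of messages. Instrength is used as the amount of resource owned, as stated above; the function names and the sample network are hypothetical.

```python
from collections import Counter


def instrength(edges):
    """Total weight of incoming messages per user (senders with none get 0)."""
    received = Counter()
    for (sender, receiver), weight in edges.items():
        received[receiver] += weight
        received[sender] += 0  # make sure pure senders appear in the result
    return received


def gini(values):
    """Gini coefficient: 0 for perfect equality, (n - 1) / n when one holds all."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical status-exchange network: everyone praises the same user.
status_edges = {("u1", "star"): 4, ("u2", "star"): 3, ("u3", "star"): 3}
owned = instrength(status_edges)
print(dict(owned), gini(owned.values()))
```

The sorted values used here are also what a Lorenz curve is drawn from: the cumulative share of the resource held by the poorest fraction of users.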

Tie evolution

• Knowledge prevails after three exchanges
• Support increases steadily
• Status-exchange fades away quickly

[Figure: ratio of resource in conversation vs. conversation length]
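One way to compute the quantity on these axes, assuming every message already carries a resource label and a conversation is the time-ordered list of its labels. The function name and the toy conversations are hypothetical, and grouping by total conversation length is a reading of the axis labels, not a documented procedure.

```python
from collections import Counter, defaultdict


def resource_ratio_by_length(conversations):
    """For each conversation length, the share of messages per resource."""
    counts = defaultdict(Counter)
    for labels in conversations:
        counts[len(labels)].update(labels)
    return {
        length: {res: n / sum(ctr.values()) for res, n in ctr.items()}
        for length, ctr in sorted(counts.items())
    }


# Hypothetical labeled conversations.
conversations = [
    ["status"],
    ["status"],
    ["knowledge"],
    ["status", "knowledge", "knowledge", "support"],
]
for length, shares in resource_ratio_by_length(conversations).items():
    print(length, shares)
```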

Generality? (Flickr!)

[Figure, Flickr: ratio of resource in conversation vs. conversation length]

Takeaways

• Need for a description of social interaction that goes beyond topics/sentiment/etc.

• Big potential impact on related fields of network studies (e.g., information propagation)

• Social tie as a sequence of individual exchanges: computational properties of social rituals, a “grammar of society”

Conclusion

Graphs are usually not isolated, homogeneous entities. Do not oversimplify when possible.

Quick announcement (I’ll be ready for questions in a few seconds!)

BARCELONA, 10-13 November 2014
www.socinfo2014.org
@socinfo2014

Thank you! Questions?

@lajello
www.lajello.com

alucca@yahoo-inc.com
