Smart Canvas @ Large Scale Recommender Systems Workshop 2015

Smart Canvas ® Hybrid Recommender System Architecture and Evaluation Gilmar Souza Gabriel Moreira

Upload: gilmar-souza

Post on 14-Feb-2017


Smart Canvas® Hybrid Recommender System: Architecture and Evaluation

Gilmar Souza

Gabriel Moreira

We are global with people in Brazil, USA, Europe, Australia, Japan and China

Our excellence has been recognized by the market with awards since our foundation in 1995

We are truly multi-cultural, an army of over 2000 talented employees and their great stories

Strong presence in strategic regions

USA

• Atlanta
• Philadelphia
• Houston
• San Francisco
• Somerset NJ
• New York

Brazil

• Campinas (headquarters)
• Belo Horizonte
• Rio de Janeiro
• São Paulo

UK

• London

China

• Ningbo
• Shanghai

Japan

• Tokyo


Australia

• Sydney

Smart Canvas is a platform that delivers web and mobile user experiences using machine learning algorithms.

Big Picture: 1. aggregate -> 2. cardify -> 3. curate

[Diagram: content from your environment (internal and external), including public data, RSS, social networks, all of your portals, apps and people, and 3rd-party systems through custom connections, forms a card deck (chronological content inventory). Algorithms, together with metrics and insights, turn it into Smart Canvas reordered stacks for desktop, mobile sites and mobile apps.]

Context

● Cloud-based SaaS, hosted on Google
● Multi-tenant
● Used in different domains (websites, intranets, collaboration portals)
● Content produced by the users or ingested from external services
● Hybrid recommender system
● Implicit and explicit feedback (touchpoints)

Some Context

Following Material Design concepts, the interface presents content as cards and allows like, dislike, pin, and share interactions. Curation merges search relevance with personalized recommendations.

Smart Canvas User Interface

Websites: Ancar Ivanhoe Shopping Malls. Digital strategy revamp

Collaboration Portal: Whirlpool Winning Workplace Transformation. The place to learn about Google Apps

Collaboration Portal: SulAmérica Seguros Innovation Portal. Capturing ideas and inspiration to overcome new market challenges

Collaboration Portal: Motorola Smart Analytics Portal. Reports repository reimagined

[Screenshots: Motorola portal. The "Inside Motorola" stack shows content cards such as "Designing for Choice: The Making of the Wood-Back Moto X [VIDEO]" and "Get to Know Rhea Jeong, a Motorola Designer of Moto Hint". The "Smart Analytics" stack shows report cards such as "Moto X Device Activations: Latin America North", "Customer Satisfaction 4Q2014", "Model Sales per region", and "Number of support calls per region". Navigation: What's new, Communities, Analytics, People, Tools, Events, Search.]

● 23 tenants (customer portals)
● 33K cards ingested
● 8.5M users (anonymous and logged in)
● 30M touchpoints (user events) captured
● 14M recommendations provided
● 300 ms average recommendation response time

Last 12 months In Numbers

Architecture

[Diagram: three layers]
● Data Layer: Cards, Touchpoints, Users
● Batch Layer: Recommenders Offline Components, Similarity Algorithms
● Serving Layer: Recommenders Online Components, Hybrid Composition

Recommendation Layers

[Diagram: the Search Engine, the Dynamic Recommender Strategy, and the Recommenders sit between your environment (internal and external) and all of your portals and apps. The figure numbers the following flow labels from 1 to 6:]
● Search by terms (context: URL, locale, device, and person ID)
● Cards satisfying the search criteria
● Recommenders are created and configured based on the context pattern (locale, device, time)
● Cards reordered by relevance
● Search results curated by personalized recommenders
● User clickstream (touchpoints) is used to refine recommendations

Recommendation Process
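One way to picture "search results curated by personalized recommenders" is a simple score blend. This is a minimal sketch, not the Smart Canvas implementation: the function name `curate`, the blending weight `alpha`, and the assumption that both scores are normalized to [0, 1] are all hypothetical.

```python
def curate(search_hits, rec_scores, alpha=0.5):
    """Reorder search hits by a blend of search relevance and recommender score.

    search_hits: dict card_id -> search relevance in [0, 1]
    rec_scores:  dict card_id -> personalized score in [0, 1]
    alpha:       hypothetical blending weight (1.0 = search relevance only)
    """
    blended = {
        card: alpha * relevance + (1 - alpha) * rec_scores.get(card, 0.0)
        for card, relevance in search_hits.items()
    }
    # Highest blended score first; card id breaks ties deterministically.
    return sorted(blended, key=lambda card: (-blended[card], card))


hits = {"card-a": 0.9, "card-b": 0.6, "card-c": 0.5}
recs = {"card-b": 0.8, "card-c": 0.1}
ranking = curate(hits, recs)
```

Only cards returned by the search are ranked, so the recommenders reorder the result set rather than extend it.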

The recommender system is implemented on Google Cloud Platform and the Hadoop ecosystem to ensure scalability and performance at large scale.

[Diagram: Smart Canvas users reach the Online Serving tier on Google App Engine (Java), which contains the Dynamic Recommender Factory and the Recommenders, backed by Memcache and Datastore (NoSQL). The Batch Processing tier on Google Compute Engine runs on-demand jobs with Pig, Mahout, and Python over HDFS. Google Cloud Storage also appears in the diagram.]

Technology Components

Algorithms

HYBRID RECOMMENDER STRATEGY IS AN AGGREGATION OF THESE ALGORITHMS

Personalized

USER-BASED CF - Find similar users and recommend cards those users liked

CONTENT-BASED FILTERING - Recommends cards with content similar to the user's preference vector (weighted average of the most relevant words in the content of the cards user interacted with).

ITEM-ITEM FREQUENCY - Recommends X cards frequently read together with the last Y cards user has read.

DISLIKE FILTER - Reduces relevance of cards user has disliked
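The content-based idea above can be sketched as follows. This is an illustration under stated assumptions: raw term frequencies stand in for "most relevant words" (a real system would more likely use something like TF-IDF), and the interaction weights, card texts, and function names are hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term frequencies (a stand-in for real term weighting)."""
    return Counter(text.lower().split())


def preference_vector(interactions):
    """Weighted average of the term vectors of cards the user interacted with.

    interactions: list of (card_text, interaction_weight) pairs.
    """
    total = sum(weight for _, weight in interactions)
    profile = Counter()
    for text, weight in interactions:
        for term, freq in term_vector(text).items():
            profile[term] += weight * freq / total
    return profile


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


history = [("moto x design wood", 2.0), ("moto hint designer", 1.0)]
profile = preference_vector(history)
candidates = {
    "c1": "new moto x design video",
    "c2": "customer satisfaction report",
}
scores = {cid: cosine(profile, term_vector(text)) for cid, text in candidates.items()}
best = max(scores, key=scores.get)
```

Candidate cards are then ranked by their similarity to the preference vector; a card sharing no terms with the user's history scores zero.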

Non-Personalized

POPULARITY - Recommends the most popular cards, decreasing their relevance as they become older

RECENCY - Recommends most recent cards

FIXED PACK - Human curation, where cards are fixed on top by the admin users (marketers)

ESSENTIALS PACK - Recommends a defined set of cards for new users (< Z interactions)
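The slides do not say how popularity is discounted by age. As a sketch, one common choice is exponential decay with a half-life; the decay form, the `half_life_days` parameter, and the sample numbers below are assumptions.

```python
def popularity_score(interaction_count, age_days, half_life_days=7.0):
    """Interaction count discounted by card age (exponential decay, an assumed form)."""
    return interaction_count * 0.5 ** (age_days / half_life_days)


cards = {
    "old-hit": (100, 28),  # (interactions, age in days)
    "fresh": (20, 0),
    "recent": (50, 7),
}
ranked = sorted(cards, key=lambda c: popularity_score(*cards[c]), reverse=True)
```

With a 7-day half-life, a card with 100 interactions that is four weeks old ranks below a brand-new card with 20.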

Recommender Algorithms Overview
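The slides describe the hybrid strategy as an aggregation of these algorithms without giving the formula. A minimal sketch, assuming a weighted sum of normalized per-recommender scores followed by a multiplicative dislike penalty; the weights, the penalty value, and all names are hypothetical.

```python
def hybrid_scores(recommender_outputs, weights, disliked=(), dislike_penalty=0.5):
    """Weighted sum of per-recommender scores, then a dislike penalty.

    recommender_outputs: dict recommender_name -> {card_id: score in [0, 1]}
    weights:             dict recommender_name -> weight (hypothetical values)
    disliked:            card ids the user has disliked
    """
    combined = {}
    for name, scores in recommender_outputs.items():
        for card, score in scores.items():
            combined[card] = combined.get(card, 0.0) + weights.get(name, 0.0) * score
    # Dislike filter: reduce relevance instead of removing the card outright.
    for card in disliked:
        if card in combined:
            combined[card] *= dislike_penalty
    return combined


outputs = {
    "user_based_cf": {"a": 0.9, "b": 0.4},
    "content_based": {"b": 0.8, "c": 0.6},
    "popularity": {"a": 0.2, "c": 1.0},
}
weights = {"user_based_cf": 0.5, "content_based": 0.3, "popularity": 0.2}
scores = hybrid_scores(outputs, weights, disliked={"a"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

Card "a" would rank first on the weighted sum alone, but the dislike penalty moves it to last place.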

Evaluation Framework

● Normalized Discounted Cumulative Gain (nDCG): verifies whether the cards are ranked according to their relevance for the users
● Top-N Accuracy: verifies whether the most relevant cards for the user are among the Top-N
  ○ Precision: how many Top-N cards are relevant?
  ○ Recall: how many relevant cards are among the Top-N?

Offline Metrics
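The offline metrics above can be sketched with their standard definitions. The sample ranking and relevance values are made up for illustration, and using interaction strength as graded relevance is an assumption.

```python
import math


def precision_recall_at_n(ranked, relevant, n):
    """Precision and recall of the top-n cards against a set of relevant cards."""
    hits = sum(1 for card in ranked[:n] if card in relevant)
    precision = hits / n
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def ndcg_at_n(ranked, relevance, n):
    """nDCG@n; relevance maps card_id -> graded relevance (e.g. interaction strength)."""

    def dcg(gains):
        # Position 0 is discounted by log2(2) = 1, position 1 by log2(3), and so on.
        return sum(g / math.log2(pos + 2) for pos, g in enumerate(gains))

    actual = dcg([relevance.get(card, 0.0) for card in ranked[:n]])
    ideal = dcg(sorted(relevance.values(), reverse=True)[:n])
    return actual / ideal if ideal else 0.0


ranked = ["a", "b", "c", "d"]
relevance = {"a": 1.0, "c": 1.0, "e": 1.0}
precision, recall = precision_recall_at_n(ranked, set(relevance), n=4)
ndcg = ndcg_at_n(ranked, relevance, n=4)
```

Here two of the four recommended cards are relevant (precision 0.5), and two of the three relevant cards were recommended (recall 2/3).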

Offline Evaluation workflow

1 - Filters touchpoints from the train and test set periods
2 - Runs the cognition job
3 - Gets recommendations for each user using a Smart Canvas web service
4 - Compares the recommendation ranking with real user interactions in the test set, and calculates offline metrics and statistical significance
5 - Stores results in BigQuery tables, which can be accessed by a Tableau dashboard

[Diagram components: Offline Evaluator; Cognition Trainer; Cognition WebService; Users, Items, Interactions; Utility Matrix; Pre-generated recommendations; Offline Metrics Database.]
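Steps 1 and 4 of the offline workflow can be sketched as a time-based split of touchpoints. The record layout, the field names, and the dates below are hypothetical; the slides do not describe the actual touchpoint schema.

```python
from datetime import date


def split_touchpoints(touchpoints, train_start, train_end, test_end):
    """Split into train [train_start, train_end) and test [train_end, test_end)."""
    train = [t for t in touchpoints if train_start <= t["day"] < train_end]
    test = [t for t in touchpoints if train_end <= t["day"] < test_end]
    return train, test


touchpoints = [
    {"user": "u1", "card": "a", "day": date(2015, 8, 1)},
    {"user": "u1", "card": "b", "day": date(2015, 8, 20)},
    {"user": "u2", "card": "a", "day": date(2015, 9, 2)},
    {"user": "u1", "card": "c", "day": date(2015, 9, 5)},
]
train, test = split_touchpoints(
    touchpoints, date(2015, 8, 1), date(2015, 9, 1), date(2015, 9, 8)
)

# Ground truth for the comparison step: cards each user interacted with in the test period.
truth = {}
for t in test:
    truth.setdefault(t["user"], set()).add(t["card"])
```

Recommendations generated from the train period are then compared against `truth` to compute the offline metrics.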

Online metrics extraction workflow

1 - User sessions are split among a control group and variants of hybrid recommender settings
2 - User interactions and recommendation logs are recorded as touchpoints in BigQuery
3 - The Online Evaluator Job calculates engagement, accuracy, and coverage metrics
4 - Results are stored in BigQuery and accessible from Tableau dashboards

[Diagram components: Traffic Split (A/B Tests); Touchpoints; BigQuery; Online Evaluator Job.]
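The slides do not describe how sessions are assigned to groups in step 1. A common approach, shown here as an assumption, is to hash the session identifier so that the same session always lands in the same group; the experiment name and variant labels are hypothetical.

```python
import hashlib


def assign_variant(session_id, variants, experiment="hybrid-settings"):
    """Deterministically map a session to one variant using a stable hash."""
    digest = hashlib.sha256(f"{experiment}:{session_id}".encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]


variants = ["control", "variant_a", "variant_b"]
assignments = {sid: assign_variant(sid, variants) for sid in ("s-1", "s-2", "s-3")}
```

Including the experiment name in the hash input keeps assignments independent across experiments.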

Online Metrics - Engagement

Objective: Improve user engagement with the best recommendations
Aggregation: daily and weekly

1. Average sessions by user: considers only the engagement of active users, those with at least one visit (session) in the period.
2. Average page views by session.
3. Average session strength, where S are the sessions and w is the interaction strength.
4. Average session minimum navigation time: the minimum navigation time is the elapsed seconds between the first and last touchpoints in the session. When there is only one touchpoint in the session, 60 seconds is used as the navigation time.
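Two of the engagement metrics can be sketched in a few lines. The navigation-time rule follows the definition on the slide. The formula for session strength is not preserved in the transcript, so the form used here (sum of interaction strengths per session, averaged over sessions) is an assumption, as are the sample values.

```python
def min_navigation_seconds(timestamps, single_touchpoint_default=60):
    """Elapsed seconds between first and last touchpoint; 60 s for a single touchpoint."""
    if len(timestamps) == 1:
        return single_touchpoint_default
    return max(timestamps) - min(timestamps)


def average_session_strength(sessions):
    """Assumed form: sum of interaction strengths w per session, averaged over sessions S."""
    return sum(sum(ws) for ws in sessions) / len(sessions)


# Each session: touchpoint timestamps (epoch seconds) and interaction strengths.
session_times = [[0, 30, 200], [500], [1000, 1090]]
session_weights = [[1.0, 1.0, 3.0], [1.0], [1.0, 2.0]]

avg_nav = sum(min_navigation_seconds(ts) for ts in session_times) / len(session_times)
avg_strength = average_session_strength(session_weights)
```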

Online Metrics - Conversion

Objective: Increase the click-through rate (CTR) for cards recommended to the user

1. CTR by recommender: #interactions / #visualizations, where #visualizations assumes that all cards were viewed up to at least the last card position (ranking order) the user interacted with on the page, or up to a minimum position of 10, on the assumption that the user scrolled at least once on a desktop browser.
2. CTR for personalized and non-personalized recommendations
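The visualization rule can be sketched as follows. The input layout (one list of interacted ranking positions per page view) is hypothetical, and counting a page view with no interactions as 10 visualizations is one reading of the "minimum position of 10" rule.

```python
def visualizations(interacted_positions, min_position=10):
    """Cards assumed seen: up to the last interacted position, or at least min_position."""
    last = max(interacted_positions, default=0)
    return max(last, min_position)


def ctr(page_views):
    """page_views: list of lists of 1-based ranking positions the user interacted with."""
    interactions = sum(len(positions) for positions in page_views)
    views = sum(visualizations(positions) for positions in page_views)
    return interactions / views if views else 0.0


pages = [[2, 14], [3], []]
rate = ctr(pages)
```

In this sample, three interactions over 14 + 10 + 10 assumed visualizations give a CTR of 3/34.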

Online Metrics - Coverage

Objective: Increase the number of users and cards with personalized recommendations

1. Users Coverage by Recommender: #unique_users_with_at_least_a_personalized_recommendation / #unique_users_accesses
2. Cards Coverage by Recommender: #cards_with_at_least_a_personalized_recommendation / #active_recommendable_cards, where #active_recommendable_cards counts all cards that were active (status=CREATED, approvalStatus=APPROVED) at a given date
3. General Users Coverage: % of the users that receive at least one personalized recommendation during a given period
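The two coverage ratios can be sketched with sets. The card record layout is hypothetical; `CREATED` and `APPROVED` come from the slide, while `PENDING` and `DELETED` are invented values used only to show cards that do not count as active.

```python
def users_coverage(users_with_personalized, users_with_access):
    """Share of users with access who received at least one personalized recommendation."""
    return len(users_with_personalized & users_with_access) / len(users_with_access)


def cards_coverage(recommended_cards, cards):
    """cards: dict card_id -> {"status": ..., "approvalStatus": ...}"""
    active = {
        cid
        for cid, card in cards.items()
        if card["status"] == "CREATED" and card["approvalStatus"] == "APPROVED"
    }
    return len(recommended_cards & active) / len(active)


cards = {
    "a": {"status": "CREATED", "approvalStatus": "APPROVED"},
    "b": {"status": "CREATED", "approvalStatus": "APPROVED"},
    "c": {"status": "CREATED", "approvalStatus": "PENDING"},   # hypothetical value
    "d": {"status": "DELETED", "approvalStatus": "APPROVED"},  # hypothetical value
}
u_cov = users_coverage({"u1", "u3"}, {"u1", "u2", "u3", "u4"})
c_cov = cards_coverage({"a", "c"}, cards)
```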

A/B Testing

● Recommendations embedded in the search results using Elasticsearch

● Real-time and near-real-time recommendations delivered via a message bus

● New product focused on enterprise collaboration

Under Construction

● Cloud computing allows us to scale up and down easily

● A hybrid approach helped us to deal with a multi-domain scenario and to balance the algorithms' pros and cons.

● Our evaluation framework (offline and online) allows us to assess hypotheses and tune hyperparameters, and provides a deep understanding of recommender effectiveness.

Conclusions

Gilmar Souza

Gabriel Moreira

Thank you!

www.smartcanvas.com