Recommendations with Neo4j (FOSDEM 2015)


Post on 14-Jul-2015


GraphAware™

Michal Bachman

graphaware.com

@graph_aware

Recommendations with Neo4j: Building a high-performance recommendation engine

About This Talk

Quick Intro

Why Graphs?

Business and Technical Challenges

GraphAware Recommendation Engine

Recommendation Engines

News you should read

Books you should buy

People you may know

People you should date

People you should market a product to

Recommendation Engines

Content-based (features)

Collaborative filtering (user <-> item relationships)

Good News

Features as well as relationships can be naturally represented as a graph.

Example

[Figure: example graph with User nodes (name: “Bob”, name: “Alice”), Movie nodes (title: “Love Actually”, title: “American Pie”) and Genre nodes (name: “Comedy”, name: “Romance”), connected by RATED (rating: 5), IS_OF_GENRE and INTERESTED_IN relationships.]

Graphs (Neo4j)

Easy to understand

Natural to model

Flexible (schema-free)

Fast to query

Cypher

Great for a quick PoC

Great for smaller data sets

Great for relatively simple logic

Cypher

MATCH (u:User)-[:LIKED]->(m:Movie),
      (m)<-[:LIKED]-(another:User),
      (another)-[:LIKED]->(reco:Movie)
WHERE NOT (u)-[:LIKED|DISLIKED]->(reco)
RETURN reco;
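The Cypher query can also be read as plain set logic: find other users who liked something I liked, then collect what else they liked, skipping anything I already liked or disliked. The following is a minimal Java sketch of that logic over in-memory maps, assuming Java 9 or later. The class `LikedInCommonSketch`, its method and the sample data are hypothetical illustrations and are not part of Neo4j or GraphAware.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for the graph; not a Neo4j or GraphAware API.
final class LikedInCommonSketch {

    // Movies liked by users who share at least one liked movie with `user`,
    // excluding movies `user` has already liked or disliked.
    static Set<String> recommend(String user,
                                 Map<String, Set<String>> liked,
                                 Map<String, Set<String>> disliked) {
        Set<String> mine = liked.getOrDefault(user, Set.of());
        Set<String> rejected = disliked.getOrDefault(user, Set.of());
        Set<String> result = new LinkedHashSet<>();
        for (Map.Entry<String, Set<String>> other : liked.entrySet()) {
            if (other.getKey().equals(user)) {
                continue; // skip the user we are recommending for
            }
            boolean sharesALike = other.getValue().stream().anyMatch(mine::contains);
            if (!sharesALike) {
                continue; // no liked movie in common with this user
            }
            for (String movie : other.getValue()) {
                if (!mine.contains(movie) && !rejected.contains(movie)) {
                    result.add(movie);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> liked = Map.of(
                "Alice", Set.of("Love Actually"),
                "Bob", Set.of("Love Actually", "American Pie"));
        System.out.println(recommend("Alice", liked, Map.of())); // prints [American Pie]
    }
}
```

Like the query, the sketch returns every candidate with equal weight; it does not rank them.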

The Reality

Requirements of real-world recommendation engines are often much more complex.

Example

Imagine you’re building the “people you may know” feature on LinkedIn.

Example

After a brainstorming session, your team came up with the following ways of finding people one may know:

People You May Know

Common contacts

Facebook friends in common

Email / mobile contacts in common

Each other’s email / mobile contact

Worked for the same company

Studied at the same school

Share the same interest

Live in the same city

Example

But that’s just the beginning! Let’s go back and re-visit.

People You May Know

More contacts in common = better chance?

Same city / school / company = does size matter?

What about emails that don’t represent a person?

What about people already connected?

And pending…

And rejected…

And repeatedly ignored…

Business Challenges

Finding things to recommend

Serving the most relevant recommendations

Measuring the quality of recommendations

Time to market / cost of development

Technical Challenges

Performance (real-time!)

Simplicity

Flexibility

We’ve done it before

So we came up with an open-source recommendation engine skeleton that will help you solve all the challenges.

Recommendation Engine

plugin to Neo4j (uses GraphAware Framework)

you have to use a JVM language

opinionated architecture

very fast

very flexible

handles all the plumbing

Design Decisions

Engine per recommendation “reason” (core logic)

Engine executes a graph traversal to find items

Engines are assembled into higher-level engines

Items discovered multiple times are more relevant

Relevance depends on how the item was discovered

Items not to be recommended: “cross-cutting” concern
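Two of these decisions (repeated discovery raises relevance, and relevance depends on the reason for discovery) suggest keeping one partial score per reason for every item and summing them. The following is a minimal Java sketch of that idea, assuming Java 9 or later. `ScoreSketch` and its methods are hypothetical names chosen for illustration; they are not the GraphAware API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical score book-keeping; not the GraphAware API.
final class ScoreSketch {

    // item -> (reason -> partial score accumulated for that reason)
    private final Map<String, Map<String, Double>> partials = new LinkedHashMap<>();

    // Record that `item` was discovered for `reason`; repeated discoveries add up.
    void add(String item, String reason, double score) {
        partials.computeIfAbsent(item, key -> new LinkedHashMap<>())
                .merge(reason, score, Double::sum);
    }

    // Partial score of `item` for a single reason (0 if never discovered that way).
    double partial(String item, String reason) {
        return partials.getOrDefault(item, Map.of()).getOrDefault(reason, 0.0);
    }

    // Total relevance of `item`: the sum of its partial scores over all reasons.
    double total(String item) {
        return partials.getOrDefault(item, Map.of()).values().stream()
                .mapToDouble(Double::doubleValue)
                .sum();
    }

    public static void main(String[] args) {
        ScoreSketch scores = new ScoreSketch();
        scores.add("Vince", "friendsInCommon", 1); // found via one common friend
        scores.add("Vince", "friendsInCommon", 1); // found again via another
        scores.add("Vince", "sameLocation", 10);   // a different reason, a different weight
        System.out.println(scores.total("Vince")); // prints 12.0
    }
}
```

Keeping the partial scores separate also makes a recommendation explainable: each item carries the list of reasons that produced it.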

Architecture

Input -> Engine -> Recommendations

Scores and Score Transformers

Blacklists

Filters

Post-processors

Context (how many, how fast,…?)

Loggers
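To make the flow between these parts concrete, the following minimal Java sketch strings them together: candidates from an engine pass a blacklist and a filter, receive a post-processing adjustment, and the best `limit` items are returned. It is a sketch under stated assumptions: `PipelineSketch`, the use of plain strings for nodes, and the order of the steps are illustrative choices, not the library's actual classes or execution order.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BiPredicate;
import java.util.function.ToDoubleBiFunction;
import java.util.stream.Collectors;

// Hypothetical pipeline; names and step order are assumptions for illustration.
final class PipelineSketch {

    static List<String> recommend(String input,
                                  Map<String, Double> candidates,           // engine output: item -> score
                                  Set<String> blacklist,                    // items never to recommend
                                  BiPredicate<String, String> filter,       // (input, item) -> keep?
                                  ToDoubleBiFunction<String, String> bonus, // post-processor: (input, item) -> adjustment
                                  int limit) {                              // context: how many results
        Map<String, Double> scored = new HashMap<>();
        for (Map.Entry<String, Double> candidate : candidates.entrySet()) {
            String item = candidate.getKey();
            if (blacklist.contains(item) || !filter.test(input, item)) {
                continue; // removed by the blacklist or the filter
            }
            scored.put(item, candidate.getValue() + bonus.applyAsDouble(input, item));
        }
        return scored.entrySet().stream()
                .sorted(Map.Entry.<String, Double>comparingByValue(Comparator.reverseOrder()))
                .limit(limit)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Double> candidates = Map.of("Adam", 50.0, "Vince", 14.0, "Luanne", 15.0, "Bob", 30.0);
        List<String> result = recommend("Adam", candidates, Set.of("Bob"),
                (in, item) -> !in.equals(item),              // exclude the input itself
                (in, item) -> item.equals("Vince") ? 10 : 0, // reward one candidate
                2);
        System.out.println(result); // prints [Vince, Luanne]
    }
}
```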

Let’s Build Something

In 5 minutes, we’ll build a simple engine that recommends who you should be friends with.


0) Model


1) Discover

public class FriendsInCommon extends SomethingInCommon {

    // Name under which this engine's partial score is reported
    @Override
    public String name() {
        return "friendsInCommon";
    }

    // Look for people sharing a FRIEND_OF relationship, in either direction
    @Override
    protected RelationshipType getType() {
        return FRIEND_OF;
    }

    @Override
    protected Direction getDirection() {
        return BOTH;
    }
}


2) Score

public class FriendsInCommon extends SomethingInCommon {

    // Transform the raw score instead of using it directly
    @Override
    protected ScoreTransformer scoreTransformer() {
        return new ParetoScoreTransformer(100, 10);
    }

    @Override
    public String name() {
        return "friendsInCommon";
    }

    @Override
    protected RelationshipType getType() {
        return FRIEND_OF;
    }

    @Override
    protected Direction getDirection() {
        return BOTH;
    }
}
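The slide does not say what the two constructor arguments mean. One plausible reading, stated here as an assumption rather than taken from the library source, is (maximum score, raw score at which 80% of the maximum is reached), with a saturating exponential curve in between. The following minimal Java sketch implements that reading; `ParetoSketch` is a hypothetical class, not the library's `ParetoScoreTransformer`.

```java
// Hypothetical re-implementation under the assumption described above;
// not the library's ParetoScoreTransformer.
final class ParetoSketch {

    private final double maxScore; // upper bound the transformed score approaches
    private final double alpha;    // steepness, chosen so that 80% is reached at the given raw score

    ParetoSketch(double maxScore, double eightyPercentLevel) {
        this.maxScore = maxScore;
        // Solve 1 - exp(-alpha * level) = 0.8 for alpha: alpha = ln(5) / level
        this.alpha = Math.log(5) / eightyPercentLevel;
    }

    // Saturating transform: grows quickly at first, then flattens towards maxScore.
    double transform(double rawScore) {
        return maxScore * (1 - Math.exp(-alpha * rawScore));
    }

    public static void main(String[] args) {
        ParetoSketch pareto = new ParetoSketch(100, 10);
        for (int friendsInCommon : new int[] {1, 5, 10, 50}) {
            System.out.println(friendsInCommon + " -> " + pareto.transform(friendsInCommon));
        }
    }
}
```

Under this reading a raw score of 1 maps to roughly 14.866, which agrees with the friendsInCommon value of 14.866008 in the test slide later in the deck, if that value came from a single friend in common. The curve answers the earlier question "more contacts in common = better chance?" with yes, but with diminishing returns.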


3) Post-Process

public class RewardSameLocation extends RewardSomethingShared {

    // The shared thing is found by following outgoing LIVES_IN relationships
    @Override
    protected RelationshipType type() {
        return LIVES_IN;
    }

    @Override
    protected Direction direction() {
        return OUTGOING;
    }

    // Fixed reward added when the shared thing is found
    @Override
    protected float scoreValue(Node reco, Node in, Node shared) {
        return 10;
    }

    @Override
    protected String scoreName() {
        return "sameLocation";
    }
}

public class RewardSameLabels implements PostProcessor<Node, Node> {

    // Add 10 points to every recommended node whose labels equal those of the input node
    @Override
    public void postProcess(Recommendations<Node> out, Node in) {
        Label[] inLabels = toArray(in.getLabels());

        for (Recommendation<Node> reco : out.get()) {
            if (Arrays.equals(inLabels, toArray(reco.getItem().getLabels()))) {
                reco.add("sameGender", 10);
            }
        }
    }
}


4) Filter

public final class FriendsContextFactory extends Neo4jContextFactory {

    // Items that must never be recommended: the person themselves and existing friends
    @Override
    protected List<BlacklistBuilder<Node, Node>> blacklistBuilders() {
        return asList(
                new ExcludeSelf(),
                new ExistingRelationshipBlacklistBuilder(FRIEND_OF, BOTH)
        );
    }

    @Override
    protected List<Filter<Node, Node>> filters() {
        return asList(
                new ExcludeSelf()
        );
    }
}


5) Assemble

public final class FriendsComputingEngine extends Neo4jTopLevelDelegatingEngine {

    public FriendsComputingEngine() {
        super(new FriendsContextFactory());
    }

    // Engines that discover candidates
    @Override
    protected List<RecommendationEngine<Node, Node>> engines() {
        return asList(
                new FriendsInCommon(),
                new RandomPeople()
        );
    }

    // Post-processors that adjust the scores of discovered candidates
    @Override
    protected List<PostProcessor<Node, Node>> postProcessors() {
        return asList(
                new RewardSameLabels(),
                new RewardSameLocation(),
                new PenalizeAgeDifference()
        );
    }
}


?) Precompute

public final class FriendsComputingEngine extends Neo4jTopLevelDelegatingEngine {

    public FriendsComputingEngine() {
        super(new FriendsContextFactory());
    }

    @Override
    protected List<RecommendationEngine<Node, Node>> engines() {
        return asList(
                new FriendsInCommon(),
                new RandomPeople()
        );
    }

    @Override
    protected List<PostProcessor<Node, Node>> postProcessors() {
        return asList(
                new RewardSameLabels(),
                new RewardSameLocation(),
                new PenalizeAgeDifference()
        );
    }

    // Only compute in real time when other engines have not produced enough results
    @Override
    public ParticipationPolicy<Node, Node> participationPolicy(Context<Node, Node> context) {
        return ParticipationPolicy.IF_MORE_RESULTS_NEEDED;
    }
}

public final class FriendsRecommendationEngine extends Neo4jTopLevelDelegatingEngine {

    public FriendsRecommendationEngine() {
        super(new FriendsContextFactory());
    }

    // Pre-computed recommendations are listed first, the computing engine second
    @Override
    protected List<RecommendationEngine<Node, Node>> engines() {
        return asList(
                new Neo4jPrecomputedEngine(),
                new FriendsComputingEngine()
        );
    }
}
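The effect of combining a pre-computed engine with a computing engine that participates only if more results are needed can be illustrated without Neo4j. The following minimal Java sketch assumes engines are consulted in the order they are listed and that every engine after the first is skipped once the limit is reached; `DelegationSketch` and the use of `IntFunction` to stand in for an engine are hypothetical simplifications, not the library's API.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.IntFunction;

// Hypothetical delegation logic; a simplification, not the GraphAware API.
final class DelegationSketch {

    // Each engine takes the number of wanted results and returns its recommendations.
    static Set<String> recommend(List<IntFunction<List<String>>> engines, int limit) {
        Set<String> result = new LinkedHashSet<>();
        for (IntFunction<List<String>> engine : engines) {
            if (result.size() >= limit) {
                break; // enough results already: remaining engines do not participate
            }
            for (String item : engine.apply(limit)) {
                if (result.size() < limit) {
                    result.add(item); // the set silently ignores items found earlier
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        IntFunction<List<String>> precomputed = n -> List.of("Daniela");
        IntFunction<List<String>> computing = n -> List.of("Adam", "Vince", "Bob");
        System.out.println(recommend(List.of(precomputed, computing), 2)); // prints [Daniela, Adam]
    }
}
```

The design choice is a latency trade-off: cheap stored results are served first, and the expensive traversal runs only to fill the gap.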


6) Log

public final class FriendsRecommendationEngine extends Neo4jTopLevelDelegatingEngine {

    public FriendsRecommendationEngine() {
        super(new FriendsContextFactory());
    }

    @Override
    protected List<RecommendationEngine<Node, Node>> engines() {
        return asList(
                new Neo4jPrecomputedEngine(),
                new FriendsComputingEngine()
        );
    }

    // Log the produced recommendations and statistics via SLF4J
    @Override
    protected List<Logger<Node, Node>> loggers() {
        return asList(
                new Slf4jRecommendationLogger<Node, Node>(),
                new Slf4jStatisticsLogger<Node, Node>()
        );
    }
}


7) Test

List<Recommendation<Node>> reco = recommendationEngine.recommend(getPersonByName("Adam"), Mode.REAL_TIME, 2);

// Each total is the sum of the partial scores listed after it
String expected = "(Vince {total:19.338144," +
        "ageDifference:-5.527864," +
        "friendsInCommon:14.866008," +
        "sameGender:10.0})," +
        "(Luanne {total:11.553411," +
        "ageDifference:-3.312597," +
        "friendsInCommon:14.866008})";

assertEquals(expected, toString(reco));

List<Recommendation<Node>> reco = recommendationEngine.recommend(getPersonByName("Luanne"), REAL_TIME, 4);

// Check names exactly and total scores with a tolerance of 0.5
assertEquals("Daniela", reco.get(0).getItem().getProperty("name"));
assertEquals(22, reco.get(0).getScore().getTotalScore(), 0.5);

assertEquals("Adam", reco.get(1).getItem().getProperty("name"));
assertEquals(12, reco.get(1).getScore().getTotalScore(), 0.5);

assertEquals("Vince", reco.get(2).getItem().getProperty("name"));
assertEquals(8, reco.get(2).getScore().getTotalScore(), 0.5);

assertEquals("Bob", reco.get(3).getItem().getProperty("name"));
assertEquals(-9, reco.get(3).getScore().getTotalScore(), 0.5);

Business Challenges

Finding things to recommend

Serving the most relevant recommendations

Measuring the quality of recommendations

Time to market / cost of development

Technical Challenges

Performance (real-time!)

Simplicity

Flexibility

Getting Started

<dependencies>
    ...
    <dependency>
        <groupId>com.graphaware.neo4j</groupId>
        <artifactId>recommendation-engine</artifactId>
        <version>2.1.6.27.2</version>
    </dependency>
    ...
</dependencies>

There’s More!

Built-in ability to pre-compute recommendations

Other built-in base classes

But we need your help!

https://github.com/graphaware/neo4j-reco

Future

Built-in algorithms

Time-based ParticipationPolicy

Integration with compute engines

Machine learning

GraphAware Framework

GraphAware Framework makes it easy to build, test, and deploy generic as well as domain-specific functionality for Neo4j.

GraphAware Framework

[Diagram: the GraphAware Framework with its tools and modules: GraphUnit & RestTest, RelCount, WarmUp, Schema (wip), Recommendation Engine, ChangeFeed, UUID, TimeTree, Algorithms, NodeRank.]

GraphAware Framework

Open Source (GPL)

Active

Production Ready

GitHub: github.com/graphaware

Our Web: graphaware.com

Maven Central

GraphAware Framework

Try it

Give us feedback

Contribute

Build your own modules

Get in touch for support / consultancy

GraphAware Events

31 Jan: Recommendation Engines in Brussels (FOSDEM)

31 Jan: GraphGen in Brussels (FOSDEM)

5 Feb: Recommendation Engines Webinar

5 Feb: Meetup at GraphAware (build your own Recommendation Engine)

10 Feb: Neo4j Fundamentals in Manchester

10 Feb: Neo4j Meetup in Manchester

17 Feb: Neo4j Fundamentals in Edinburgh

17 Feb: Neo4j Meetup in Edinburgh

GraphConnect Europe 2015

When: Thursday, 7th May 2015 - main Conference Day; Wednesday, 6th May 2015 - Training Day

Where: Etc venues, 155 Bishopsgate, London (next to Liverpool Street Station)

Tickets: now available on www.graphconnect.com; $199 early bird plus $100 for training; $499 full price plus $100 for training

Call for Papers: open now till 29th January; all Neo4j community members, customers or general graph enthusiasts are invited to submit their talk

Sponsors: open now till 29th January, email: [email protected]

Thank You!

graphaware.com

@graph_aware