Recommendations with Neo4j (FOSDEM 2015)
GraphAware™
Michal Bachman
graphaware.com
@graph_aware
Recommendations with Neo4j: Building a high-performance recommendation engine
Quick Intro
Why Graphs?
Business and Technical Challenges
GraphAware Recommendation Engine
About This Talk
News you should read
Books you should buy
People you may know
People you should date
People you should market a product to
…
Recommendation Engines
Content-based (features)
Collaborative filtering (user <-> item relationships)
Recommendation Engines
Example
[Figure: example graph. Nodes: User (name: “Bob”), User (name: “Alice”), Movie (title: “Love Actually”), Movie (title: “American Pie”), Genre (name: “Comedy”), Genre (name: “Romance”). Relationships: IS_OF_GENRE (three), RATED with rating: 5 (three), INTERESTED_IN (one).]
Easy to understand
Natural to model
Flexible (schema-free)
Fast to query
Graphs (Neo4j)
Great for a quick PoC
Great for smaller data sets
Great for relatively simple logic
Cypher
Cypher
MATCH (u:User)-[:LIKED]->(m:Movie), (m)<-[:LIKED]-(another:User), (another)-[:LIKED]->(reco:Movie)
WHERE NOT (u)-[:LIKED|DISLIKED]->(reco)
RETURN reco;
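As a rough in-memory analogue of the Cypher query above, the same "people who liked what you liked" logic can be sketched in plain Java. The class name, method name, and map-based data model here are illustrative assumptions and not part of Neo4j or of the talk.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only: collaborative filtering over plain maps instead of a graph. */
final class LikedByPeers {

    /**
     * Returns movies liked by users who share at least one liked movie with {@code user},
     * leaving out everything {@code user} has already liked or disliked.
     */
    static Set<String> recommend(String user,
                                 Map<String, Set<String>> liked,
                                 Map<String, Set<String>> disliked) {
        Set<String> mine = liked.getOrDefault(user, Set.of());

        // Everything the user already has an opinion about (the WHERE NOT clause).
        Set<String> seen = new LinkedHashSet<>(mine);
        seen.addAll(disliked.getOrDefault(user, Set.of()));

        Set<String> result = new LinkedHashSet<>();
        for (Map.Entry<String, Set<String>> other : liked.entrySet()) {
            if (other.getKey().equals(user)) {
                continue; // the user is not their own peer
            }
            // Only users who liked at least one of the same movies count as peers.
            boolean overlap = other.getValue().stream().anyMatch(mine::contains);
            if (!overlap) {
                continue;
            }
            for (String movie : other.getValue()) {
                if (!seen.contains(movie)) {
                    result.add(movie);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> liked = Map.of(
                "Alice", Set.of("Love Actually"),
                "Bob", Set.of("Love Actually", "American Pie"));
        System.out.println(recommend("Alice", liked, Map.of())); // [American Pie]
    }
}
```

Every peer's movie counts equally here and nothing is ranked, which is the kind of simplicity the following slides argue real-world engines outgrow.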
Requirements of real-world recommendation engines are often much more complex.
The Reality
After a brainstorming session, your team came up with the following ways of finding people one may know:
Example
Common contacts
Facebook friends in common
Email / mobile contacts in common
Each other's email / mobile contact
Worked for the same company
Studied at the same school
Share the same interest
Live in the same city
People You May Know
More contacts in common = better chance?
Same city / school / company = does size matter?
What about emails that don’t represent a person?
What about people already connected?
And pending…
And rejected…
And repeatedly ignored…
People You May Know
Finding things to recommend
Serving the most relevant recommendations
Measuring the quality of recommendations
Time to market / cost of development
Business Challenges
So we came up with an open-source recommendation engine skeleton that will help you solve all the challenges.
We’ve done it before
plugin to Neo4j (uses GraphAware Framework)
you have to use a JVM language
opinionated architecture
very fast
very flexible
handles all the plumbing
Recommendation Engine
Engine per recommendation “reason” (core logic)
Engine executes a graph traversal to find items
Engines are assembled into higher-level engines
Design Decisions
Example
[Figure: the example movie graph from earlier in the talk, shown again]
Engine per recommendation “reason” (core logic)
Engine executes a graph traversal to find items
Engines are assembled into higher-level engines
Items discovered multiple times are more relevant
Relevance depends on how the item was discovered
Design Decisions
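The two relevance rules on this slide (items found more than once rank higher, and the score depends on how an item was found) can be sketched as a score made of named parts that are summed. This is a minimal illustration, not the GraphAware implementation: the class, its methods, and the example reason names are all assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only: each item keeps one partial score per "reason" it was discovered. */
final class ScoredRecommendations {

    private final Map<String, Map<String, Double>> scores = new LinkedHashMap<>();

    /** Records that {@code item} was discovered for {@code reason}; repeated discoveries add up. */
    void discovered(String item, String reason, double value) {
        scores.computeIfAbsent(item, k -> new LinkedHashMap<>())
              .merge(reason, value, Double::sum);
    }

    /** Total relevance is the sum of all partial scores of the item. */
    double total(String item) {
        return scores.getOrDefault(item, Map.of()).values().stream()
                .mapToDouble(Double::doubleValue)
                .sum();
    }

    /** Items ordered by total score, highest first; ties keep discovery order. */
    List<String> ranked() {
        List<String> items = new ArrayList<>(scores.keySet());
        items.sort((a, b) -> Double.compare(total(b), total(a)));
        return items;
    }

    public static void main(String[] args) {
        ScoredRecommendations r = new ScoredRecommendations();
        r.discovered("Luanne", "friendsInCommon", 15);
        r.discovered("Vince", "friendsInCommon", 15);
        r.discovered("Vince", "sameCity", 10); // a second discovery lifts Vince above Luanne
        System.out.println(r.ranked()); // [Vince, Luanne]
    }
}
```

Keeping the partial scores separate, rather than storing only the total, is what makes it possible to explain afterwards why an item was recommended.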
Example
[Figure: the example movie graph from earlier in the talk, shown again]
Engine per recommendation “reason” (core logic)
Engine executes a graph traversal to find items
Engines are assembled into higher-level engines
Items discovered multiple times are more relevant
Relevance depends on how the item was discovered
Items not to be recommended: “cross-cutting” concern
Design Decisions
Example
[Figure: the example movie graph from earlier in the talk, shown again]
Input -> Engine -> Recommendations
Scores and Score Transformers
Blacklists
Filters
Post-processors
Context (how many, how fast,…?)
Loggers
Architecture
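To make the flow on this slide concrete, here is a minimal sketch of how an engine, a blacklist, a filter, and a post-processor might fit together before the top results are served. It uses only standard Java functional interfaces. None of the names below come from the GraphAware API, and the order of the steps is an assumption made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BiConsumer;
import java.util.function.BiPredicate;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Illustrative sketch only: input -> engine -> blacklist/filter -> post-process -> top N. */
final class PipelineSketch {

    static List<String> recommend(String input,
                                  Function<String, Map<String, Double>> engine,
                                  Function<String, Set<String>> blacklistBuilder,
                                  BiPredicate<String, String> filter,
                                  BiConsumer<Map<String, Double>, String> postProcessor,
                                  int limit) {
        // 1. The engine discovers candidates and gives each an initial score.
        Map<String, Double> scored = new LinkedHashMap<>(engine.apply(input));

        // 2. Blacklisted candidates and candidates rejected by the filter are dropped.
        Set<String> blacklist = blacklistBuilder.apply(input);
        scored.keySet().removeIf(c -> blacklist.contains(c) || !filter.test(c, input));

        // 3. The post-processor adjusts the scores of the remaining candidates.
        postProcessor.accept(scored, input);

        // 4. The context decides how many results are wanted; here it is just a limit.
        return scored.entrySet().stream()
                .sorted(Map.Entry.<String, Double>comparingByValue().reversed())
                .limit(limit)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> top = recommend("Adam",
                in -> Map.of("Adam", 5.0, "Bob", 3.0, "Carol", 4.0, "Dan", 1.0),
                in -> Set.of("Bob"),                                  // e.g. already a friend
                (candidate, in) -> !candidate.equals(in),             // never recommend yourself
                (scores, in) -> scores.computeIfPresent("Dan", (k, v) -> v + 10), // reward
                2);
        System.out.println(top); // [Dan, Carol]
    }
}
```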
In 5 minutes, we’ll build a simple engine that recommends who you should be friends with.
Let’s Build Something
public class FriendsInCommon extends SomethingInCommon {

    @Override
    public String name() {
        return "friendsInCommon";
    }

    @Override
    protected RelationshipType getType() {
        return FRIEND_OF;
    }

    @Override
    protected Direction getDirection() {
        return BOTH;
    }
}
public class FriendsInCommon extends SomethingInCommon {

    @Override
    protected ScoreTransformer scoreTransformer() {
        return new ParetoScoreTransformer(100, 10);
    }

    @Override
    public String name() {
        return "friendsInCommon";
    }

    @Override
    protected RelationshipType getType() {
        return FRIEND_OF;
    }

    @Override
    protected Direction getDirection() {
        return BOTH;
    }
}
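The slides do not show what ParetoScoreTransformer(100, 10) computes. One plausible reading, offered here purely as an assumption, is a saturating curve with a maximum score of 100 that reaches 80% of that maximum at 10 friends in common. The class and method below are hypothetical and only illustrate that reading.

```java
/** Illustrative sketch only: a saturating score curve that reaches 80% of the maximum at a chosen level. */
final class ParetoSketch {

    /**
     * @param raw                the raw value, e.g. the number of friends in common
     * @param maxScore           the score the curve approaches but never exceeds
     * @param eightyPercentLevel the raw value at which 80% of maxScore is reached
     */
    static double transform(double raw, double maxScore, double eightyPercentLevel) {
        // With alpha = ln(5) / level, exp(-alpha * level) = 1/5, so the result at "level" is 0.8 * maxScore.
        double alpha = Math.log(5) / eightyPercentLevel;
        return maxScore * (1 - Math.exp(-alpha * raw));
    }

    public static void main(String[] args) {
        for (int friends : new int[] {0, 1, 5, 10, 50}) {
            System.out.printf("%d friends in common -> %.3f%n", friends, transform(friends, 100, 10));
        }
    }
}
```

Under this assumption a single friend in common scores about 14.866, which agrees with the friendsInCommon value in the test output shown later in the talk. The first few common friends matter a lot, and each additional one matters less.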
public class RewardSameLocation extends RewardSomethingShared {

    @Override
    protected RelationshipType type() {
        return LIVES_IN;
    }

    @Override
    protected Direction direction() {
        return OUTGOING;
    }

    @Override
    protected float scoreValue(Node reco, Node in, Node shared) {
        return 10;
    }

    @Override
    protected String scoreName() {
        return "sameLocation";
    }
}
public class RewardSameLabels implements PostProcessor<Node, Node> {

    @Override
    public void postProcess(Recommendations<Node> out, Node in) {
        Label[] inLabels = toArray(in.getLabels());

        for (Recommendation<Node> reco : out.get()) {
            if (Arrays.equals(inLabels, toArray(reco.getItem().getLabels()))) {
                reco.add("sameGender", 10);
            }
        }
    }
}
public final class FriendsContextFactory extends Neo4jContextFactory {

    @Override
    protected List<BlacklistBuilder<Node, Node>> blacklistBuilders() {
        return asList(
                new ExcludeSelf(),
                new ExistingRelationshipBlacklistBuilder(FRIEND_OF, BOTH)
        );
    }

    @Override
    protected List<Filter<Node, Node>> filters() {
        return asList(
                new ExcludeSelf()
        );
    }
}
public final class FriendsComputingEngine extends Neo4jTopLevelDelegatingEngine {

    public FriendsComputingEngine() {
        super(new FriendsContextFactory());
    }

    @Override
    protected List<RecommendationEngine<Node, Node>> engines() {
        return asList(
                new FriendsInCommon(),
                new RandomPeople()
        );
    }

    @Override
    protected List<PostProcessor<Node, Node>> postProcessors() {
        return asList(
                new RewardSameLabels(),
                new RewardSameLocation(),
                new PenalizeAgeDifference()
        );
    }
}
public final class FriendsComputingEngine extends Neo4jTopLevelDelegatingEngine {

    public FriendsComputingEngine() {
        super(new FriendsContextFactory());
    }

    @Override
    protected List<RecommendationEngine<Node, Node>> engines() {
        return asList(
                new FriendsInCommon(),
                new RandomPeople()
        );
    }

    @Override
    protected List<PostProcessor<Node, Node>> postProcessors() {
        return asList(
                new RewardSameLabels(),
                new RewardSameLocation(),
                new PenalizeAgeDifference()
        );
    }

    @Override
    public ParticipationPolicy<Node, Node> participationPolicy(Context<Node, Node> context) {
        return ParticipationPolicy.IF_MORE_RESULTS_NEEDED;
    }
}
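The name IF_MORE_RESULTS_NEEDED suggests that an engine with this policy is consulted only when the engines before it have not yet produced enough recommendations. The sketch below shows that idea in isolation. The enum, the Step class, and the delegation loop are assumptions made for illustration and are not taken from the GraphAware source.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

/** Illustrative sketch only: a delegating engine that skips expensive engines when it already has enough. */
final class ParticipationSketch {

    enum Participation { ALWAYS, IF_MORE_RESULTS_NEEDED }

    /** An engine (here just a supplier of item names) together with its participation policy. */
    static final class Step {
        final Participation policy;
        final Supplier<List<String>> engine;

        Step(Participation policy, Supplier<List<String>> engine) {
            this.policy = policy;
            this.engine = engine;
        }
    }

    static List<String> recommend(List<Step> steps, int limit) {
        Set<String> results = new LinkedHashSet<>();
        for (Step step : steps) {
            boolean enough = results.size() >= limit;
            if (step.policy == Participation.IF_MORE_RESULTS_NEEDED && enough) {
                continue; // earlier engines already produced enough results
            }
            results.addAll(step.engine.get());
        }
        List<String> top = new ArrayList<>(results);
        return top.subList(0, Math.min(limit, top.size()));
    }

    public static void main(String[] args) {
        List<Step> steps = List.of(
                new Step(Participation.ALWAYS, () -> List.of("Vince", "Luanne")),
                new Step(Participation.IF_MORE_RESULTS_NEEDED, () -> List.of("Bob")));
        System.out.println(recommend(steps, 2)); // [Vince, Luanne]
        System.out.println(recommend(steps, 3)); // [Vince, Luanne, Bob]
    }
}
```

In the talk's setup this would let cheap pre-computed recommendations be served first, with the real-time computation running only to fill the gap.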
public final class FriendsRecommendationEngine extends Neo4jTopLevelDelegatingEngine {

    public FriendsRecommendationEngine() {
        super(new FriendsContextFactory());
    }

    @Override
    protected List<RecommendationEngine<Node, Node>> engines() {
        return asList(
                new Neo4jPrecomputedEngine(),
                new FriendsComputingEngine()
        );
    }
}
public final class FriendsRecommendationEngine extends Neo4jTopLevelDelegatingEngine {

    public FriendsRecommendationEngine() {
        super(new FriendsContextFactory());
    }

    @Override
    protected List<RecommendationEngine<Node, Node>> engines() {
        return asList(
                new Neo4jPrecomputedEngine(),
                new FriendsComputingEngine()
        );
    }

    @Override
    protected List<Logger<Node, Node>> loggers() {
        return asList(
                new Slf4jRecommendationLogger<Node, Node>(),
                new Slf4jStatisticsLogger<Node, Node>()
        );
    }
}
List<Recommendation<Node>> reco = recommendationEngine.recommend(getPersonByName("Adam"), Mode.REAL_TIME, 2);

String expected = "(Vince {total:19.338144," +
        "ageDifference:-5.527864," +
        "friendsInCommon:14.866008," +
        "sameGender:10.0})," +
        "(Luanne {total:11.553411," +
        "ageDifference:-3.312597," +
        "friendsInCommon:14.866008})";

assertEquals(expected, toString(reco));
List<Recommendation<Node>> reco = recommendationEngine.recommend(getPersonByName("Luanne"), REAL_TIME, 4);

assertEquals("Daniela", reco.get(0).getItem().getProperty("name"));
assertEquals(22, reco.get(0).getScore().getTotalScore(), 0.5);

assertEquals("Adam", reco.get(1).getItem().getProperty("name"));
assertEquals(12, reco.get(1).getScore().getTotalScore(), 0.5);

assertEquals("Vince", reco.get(2).getItem().getProperty("name"));
assertEquals(8, reco.get(2).getScore().getTotalScore(), 0.5);

assertEquals("Bob", reco.get(3).getItem().getProperty("name"));
assertEquals(-9, reco.get(3).getScore().getTotalScore(), 0.5);
Finding things to recommend
Serving the most relevant recommendations
Measuring the quality of recommendations
Time to market / cost of development
Business Challenges
Getting Started
<dependencies>
    ...
    <dependency>
        <groupId>com.graphaware.neo4j</groupId>
        <artifactId>recommendation-engine</artifactId>
        <version>2.1.6.27.2</version>
    </dependency>
    ...
</dependencies>
Built-in ability to pre-compute recommendations
Other built-in base-classes
But we need your help!
https://github.com/graphaware/neo4j-reco
There’s More!
Built-in algorithms
Time-based ParticipationPolicy
Integration with compute engines
Machine learning
Future
GraphAware Framework makes it easy to build, test, and deploy generic as well as domain-specific functionality for Neo4j.
GraphAware Framework
[Diagram: GraphAware Framework, with GraphUnit & RestTest and the modules RelCount, WarmUp, Schema (wip), Recommendation Engine, ChangeFeed, UUID, TimeTree, Algorithms, NodeRank]
GraphAware Framework
Open Source (GPL)
Active
Production Ready
Github: github.com/graphaware
Our Web: graphaware.com
Maven Central
GraphAware Framework
Try it
Give us feedback
Contribute
Build your own modules
Get in touch for support / consultancy
GraphAware Framework
GraphAware Events
31 Jan: Recommendation Engines in Brussels (FOSDEM)
31 Jan: GraphGen in Brussels (FOSDEM)
5 Feb: Recommendation Engines Webinar
5 Feb: Meetup at GraphAware (build your own Recommendation Engine)
10 Feb: Neo4j Fundamentals in Manchester
10 Feb: Neo4j Meetup in Manchester
17 Feb: Neo4j Fundamentals in Edinburgh
17 Feb: Neo4j Meetup in Edinburgh
GraphConnect Europe 2015
When: Thursday, 7th May 2015 - main Conference Day; Wednesday, 6th May 2015 - Training Day
Where: Etc venues, 155 Bishopsgate, London (next to Liverpool Street Station)
Tickets: now available on www.graphconnect.com; $199 early bird plus $100 for training; $499 full price plus $100 for training
Call for Papers: open now till 29th January; all Neo4j community members, customers or general graph enthusiasts are invited to submit their talk
Sponsors: open now till 29th January, email: