neo4j - graph database for recommendations

Neo4j - Graph database for recommendations Jakub Kříž, Ondrej Proksa 30.5.2013

Upload: proksik

Post on 27-Jan-2015

DESCRIPTION

Representing the relationships between entities as a graph is an increasingly common approach. Neo4j is a NoSQL graph database that allows fast and efficient queries on connected data. Custom algorithms can be implemented as plugins, extending the functionality of the built-in API. We use the graph database to model and recommend movies and other media content.

TRANSCRIPT

Page 1: Neo4j - Graph database for recommendations

Jakub Kříž, Ondrej Proksa
30.5.2013

Page 2: Summary

- Graph databases
- Working with Neo4j and Ruby (on Rails)
- Plugins and algorithms (live demos)
  - Document similarity
  - Movie recommendation
  - Recommendation from subgraph
- TeleVido.tv

Page 3: Why Graphs?

- Graphs are everywhere!
- A natural way to model almost everything
- "Whiteboard friendly"
- Even the internet is a graph

Page 4: Why Graph Databases?

- Relational databases are not well suited to storing graph structures
  - Unnatural m:n relations
  - Expensive joins
  - Expensive lookups during graph traversals
- Graph databases fix this
  - Efficient storage
  - Direct pointers = no joins
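The "direct pointers = no joins" point can be illustrated with a minimal in-memory Ruby sketch. It is not Neo4j's actual storage format; the names `FRIENDSHIPS`, `ADJACENCY`, `friends_via_join` and `friends_via_pointers` are hypothetical and exist only for this illustration.

```ruby
# Relational style: an m:n relation stored as rows of a join table.
FRIENDSHIPS = [[1, 2], [1, 3], [2, 4], [3, 4]].freeze

# Every hop has to search the table (or an index on it) for matching rows.
def friends_via_join(person_id)
  FRIENDSHIPS.select { |from, _to| from == person_id }.map { |_from, to| to }
end

# Graph style: each node keeps direct references to its neighbours.
ADJACENCY = { 1 => [2, 3], 2 => [4], 3 => [4], 4 => [] }.freeze

# A hop is a single lookup on the node itself; no table scan is involved.
def friends_via_pointers(person_id)
  ADJACENCY.fetch(person_id, [])
end

p friends_via_join(1)
p friends_via_pointers(1)
```

Both calls return the same neighbours; the difference is how much data has to be searched per hop, which adds up quickly during multi-hop traversals.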

Page 5: Neo4j

The World's Leading Graph Database
www.neo4j.org

- NoSQL database
- Open source: github.com/neo4j
- ACID
- Brief history
  - Official v1.0: 2010
  - Current version: 1.9
  - 2.0 coming soon

Page 6: Querying Neo4j

- Query languages
  - Structurally similar to SQL
  - Based on graph traversal
- Most often used
  - Gremlin: a generic graph query language
  - Cypher: a graph query language for Neo4j
  - SPARQL: a generic query language for data in RDF format

Page 7: Cypher Example

CREATE (n {name: {value}})
CREATE (n)-[r:KNOWS]->(m)

START [MATCH] [WHERE] RETURN [ORDER BY] [SKIP] [LIMIT]

Page 8: Cypher Example (2)

Friend of a friend

START n=node(0)
MATCH (n)--()--(f)
RETURN f
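As a rough illustration of what the friend-of-a-friend pattern computes, here is a minimal Ruby sketch over an in-memory adjacency hash. The sample data and the names `KNOWS` and `friends_of_friends` are hypothetical. Unlike Cypher, which returns one row per matching path, this sketch collapses duplicates.

```ruby
require "set"

# Undirected "knows" graph as an adjacency hash (hypothetical sample data).
KNOWS = {
  "ann" => ["bob", "cid"],
  "bob" => ["ann", "dan"],
  "cid" => ["ann", "dan", "eve"],
  "dan" => ["bob", "cid"],
  "eve" => ["cid"]
}.freeze

# Nodes reachable in exactly two hops from start, mirroring (n)--()--(f).
# Stepping straight back over the same edge is skipped, so start is excluded.
def friends_of_friends(graph, start)
  result = Set.new
  graph.fetch(start, []).each do |friend|
    graph.fetch(friend, []).each do |fof|
      result << fof unless fof == start
    end
  end
  result.to_a.sort
end

p friends_of_friends(KNOWS, "ann")
```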

Page 9: Working with Neo4j

- REST API => wrappers
  - Neography for Ruby
  - py2neo for Python
  - …
  - Your own wrapper
- Java API
  - Direct access in JVM-based applications
  - neo4j.rb

Page 10: Neography - API wrapper example

require 'neography'

# create nodes and properties
n1 = Neography::Node.create("age" => 31, "name" => "Max")
n2 = Neography::Node.create("age" => 33, "name" => "Roel")
n1.weight = 190

# create relationships (two alternative forms)
new_rel = Neography::Relationship.create(:coding_buddies, n1, n2)
n1.outgoing(:coding_buddies) << n2

# get nodes related by outgoing friends relationship
n1.outgoing(:friends)

# get n1 and nodes related by friends and friends of friends
n1.outgoing(:friends).depth(2).include_start_node

Page 11: Neo4j.rb - JRuby gem example

class Person < Neo4j::Rails::Model
  property :name
  property :age, :index => :exact # or :fulltext
  has_n(:friends).to(Person).relationship(Friend)
end

class Friend < Neo4j::Rails::Relationship
  property :as
end

mike = Person.new(:name => 'Mike', :age => 24)
john = Person.new(:name => 'John', :age => 27)
mike.friends << john
mike.save

Page 12: Our Approach

- Relational databases are not so bad
  - Good for basic data storage
  - Widely used for web applications
  - Well supported in Rails via ActiveRecord
  - Performance issues with Neo4j
- However, we need a graph database
  - We model the domain as a graph
  - Our recommendation is based on graph traversal

Page 13: Our Approach (2)

- Hybrid model using both MySQL and Neo4j
- MySQL contains basic information about entities
- Neo4j contains only relationships
- Paired via identifiers (neo4j_id)
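A minimal sketch of how the two stores might be paired, using plain Ruby hashes as stand-ins for MySQL rows and Neo4j relationships. Only the `neo4j_id` pairing comes from the slides; the sample data and the names `MOVIE_ROWS`, `GRAPH_EDGES` and `related_titles` are assumptions made for illustration.

```ruby
# Stand-in for MySQL rows: basic attributes plus the paired graph identifier.
MOVIE_ROWS = [
  { id: 1, title: "Movie A", neo4j_id: 101 },
  { id: 2, title: "Movie B", neo4j_id: 102 },
  { id: 3, title: "Movie C", neo4j_id: 103 }
].freeze

# Stand-in for Neo4j: only relationships between node identifiers.
GRAPH_EDGES = { 101 => [102, 103], 102 => [101], 103 => [101] }.freeze

# Ask the graph for related node ids, then load attributes from the rows.
def related_titles(movie_id)
  row = MOVIE_ROWS.find { |r| r[:id] == movie_id }
  return [] unless row

  neighbour_ids = GRAPH_EDGES.fetch(row[:neo4j_id], [])
  MOVIE_ROWS.select { |r| neighbour_ids.include?(r[:neo4j_id]) }
            .map { |r| r[:title] }
end

p related_titles(1)
```

The graph side never needs to know titles or other attributes; it only answers "which ids are related", and the relational side resolves those ids into records.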

Page 14: Our Approach (3)

- Recommendation algorithms
  - Made as plugins to Neo4j
  - Written in Java
  - Embedded into the Neo4j API
- Rails application uses a custom-made wrapper
  - Creates and modifies nodes and relationships via API calls
  - Handles recommendation requests

Page 15: Graph Algorithms

- Built-in algorithms
  - Shortest path
  - All shortest paths
  - Dijkstra's algorithm
- Custom algorithms
  - Depth first search
  - Breadth first search
  - Spreading activation
  - Flows, pairing, etc.
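As an example of one of the custom algorithms listed above, here is a generic breadth first search in Ruby that computes hop distances from a start node. It is a textbook sketch over an in-memory hash, not the Java plugin code; `bfs_distances` and `SAMPLE_GRAPH` are hypothetical names.

```ruby
# Breadth first search: hop distance from start to every reachable node.
def bfs_distances(graph, start)
  distances = { start => 0 }
  queue = [start]
  until queue.empty?
    node = queue.shift
    graph.fetch(node, []).each do |neighbour|
      next if distances.key?(neighbour) # already reached by a shorter path

      distances[neighbour] = distances[node] + 1
      queue << neighbour
    end
  end
  distances
end

SAMPLE_GRAPH = { a: [:b, :c], b: [:d], c: [:d], d: [:e], e: [] }.freeze

p bfs_distances(SAMPLE_GRAPH, :a)
```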

Page 16: Document Similarity

- Task: find similarities between documents
- Document data model:
  - Each document is made of sentences
  - Each sentence can be divided into n-grams
  - N-grams are connected with relationships
- Example: "Neo4j is a graph database in Java"
  - (Neo4j, graph) - (graph, database) - (database, Java)
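The n-gram example above can be reproduced with a short Ruby sketch. The slide's bigrams skip the words "is" and "in", so the sketch assumes a small stop-word list; that list and the names `STOP_WORDS`, `ngrams` and `ngram_edges` are assumptions, not part of the original model.

```ruby
# Assumed stop words; the slides do not say how such words were filtered.
STOP_WORDS = %w[a an in is the].freeze

# Split a sentence into words, drop stop words, return consecutive n-grams.
def ngrams(sentence, n = 2)
  words = sentence.split.reject { |word| STOP_WORDS.include?(word.downcase) }
  words.each_cons(n).to_a
end

# Consecutive n-grams are linked, giving the relationships of the model.
def ngram_edges(sentence, n = 2)
  ngrams(sentence, n).each_cons(2).to_a
end

p ngrams("Neo4j is a graph database in Java")
# => [["Neo4j", "graph"], ["graph", "database"], ["database", "Java"]]
```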

Page 17: Document Similarity (2)

Page 18: Document Similarity (3)

- Detecting similar documents in our graph model
  - Shortest path between documents
  - Number of paths shorter than some distance
  - Weighting relationships
- How about a custom plugin?
  - Spreading activation
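One of the measures above, the number of short paths between two documents, can be sketched in a simplified form: if each document is linked to its n-grams, every shared n-gram is one path of length two (document, n-gram, document). This simplification, the sample data and the names `DOC_NGRAMS` and `short_path_count` are assumptions for illustration only.

```ruby
require "set"

# Hypothetical documents, each reduced to its set of bigrams.
DOC_NGRAMS = {
  "doc1" => Set[%w[graph database], %w[database java], %w[neo4j graph]],
  "doc2" => Set[%w[graph database], %w[database java], %w[java plugin]],
  "doc3" => Set[%w[movie recommendation]]
}.freeze

# Number of length-two paths between two documents = shared n-grams.
def short_path_count(doc_a, doc_b)
  (DOC_NGRAMS.fetch(doc_a) & DOC_NGRAMS.fetch(doc_b)).size
end

p short_path_count("doc1", "doc2")
p short_path_count("doc1", "doc3")
```

A higher count suggests more similar documents; longer paths and relationship weights would refine this further.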

Page 19: Document Similarity (4)

Live Demo…

Page 20: Movie Recommendation

- Task: recommend movies based on what we like
- We like some entities; let's call them initial
  - Movies
  - People (actors, directors, etc.)
  - Genres
- We want recommended nodes from the input
- Find nodes which are
  - The closest to the initial nodes
  - The most relevant to the initial nodes

Page 21: Movie Recommendation (2)

- 165k nodes
  - Movies
  - People
  - Genres
- 870k relationships
  - Movies - People
  - Movies - Genres
- Easy to add more entities
  - Tags, mood, period, etc.
- Will it be fast?
  - We need 1-2 seconds

Page 22: Movie Recommendation (3)

Page 23: Recommendation Algorithms

- Breadth first search
  - Union Colors
  - Mixing Colors
- Modified Dijkstra
  - Weighted relationships between entities
- Spreading activation (energy)
  - Each initial node gets the same starting energy
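The slides do not say how Dijkstra's algorithm was modified, so the following is only the standard algorithm over weighted relationships, written as a small Ruby sketch. The weights, the sample graph and the names `dijkstra` and `WEIGHTED_GRAPH` are hypothetical; lower total weight is taken to mean "closer to the initial node".

```ruby
# Standard Dijkstra over a weighted adjacency hash:
# node => { neighbour => weight }. Every node must appear as a key.
def dijkstra(graph, source)
  dist = Hash.new(Float::INFINITY)
  dist[source] = 0.0
  unvisited = graph.keys
  until unvisited.empty?
    current = unvisited.min_by { |node| dist[node] }
    break if dist[current] == Float::INFINITY # the rest is unreachable

    unvisited.delete(current)
    graph[current].each do |neighbour, weight|
      candidate = dist[current] + weight
      dist[neighbour] = candidate if candidate < dist[neighbour]
    end
  end
  dist
end

# Hypothetical weights: an actor link counts as closer than a genre link.
WEIGHTED_GRAPH = {
  "Movie A" => { "Actor X" => 1.0, "Genre G" => 3.0 },
  "Actor X" => { "Movie A" => 1.0, "Movie B" => 1.0 },
  "Genre G" => { "Movie A" => 3.0, "Movie C" => 3.0 },
  "Movie B" => { "Actor X" => 1.0 },
  "Movie C" => { "Genre G" => 3.0 }
}.freeze

p dijkstra(WEIGHTED_GRAPH, "Movie A")
```

With these weights, "Movie B" (shared actor) ends up closer to "Movie A" than "Movie C" (shared genre).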

Page 24: Union Colors

Page 25: Mixing Colors

Page 26: Spreading Activation (Energy)

[Figure: graph with node energies 100.0, 100.0, 100.0, 100.0]

Page 27: Spreading Activation (Energy)

[Figure: graph with node energies 100.0, 100.0, 100.0, 100.0, 12.0, 12.0, 12.0]

Page 28: Spreading Activation (Energy)

[Figure: graph with node energies 0.0, 100.0, 100.0, 100.0, 12.0, 10.0, 10.0]

Page 29: Spreading Activation (Energy)

[Figure: graph with node energies 0.0, 0.0, 100.0, 100.0, 22.0, 10.0, 8.0, 8.0, 8.0, 8.0]

Page 30: Spreading Activation (Energy)

[Figure: graph with node energies 0.0, 0.0, 0.0, 100.0, 22.0, 18.0]
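The general idea of spreading activation can be sketched in Ruby as follows. The exact rule used in the figures is not recoverable from the transcript, so this sketch assumes that a node passes on a decayed share of its energy, split evenly among its neighbours, until the share falls below a threshold. The `decay` and `threshold` values, the sample graph and the names `spread`, `recommend` and `MOVIE_GRAPH` are all assumptions.

```ruby
# Pass a decayed, evenly split share of energy from node to its neighbours,
# accumulating in totals what each node receives along the way.
def spread(graph, node, energy, totals, decay: 0.5, threshold: 1.0)
  neighbours = graph.fetch(node, [])
  return if neighbours.empty?

  share = energy * decay / neighbours.size
  return if share < threshold # too little energy left to matter

  neighbours.each do |neighbour|
    totals[neighbour] += share
    spread(graph, neighbour, share, totals, decay: decay, threshold: threshold)
  end
end

# Every initial node gets the same starting energy; other nodes are ranked
# by the total energy they received (ties broken by name).
def recommend(graph, initial_nodes, start_energy: 100.0)
  totals = Hash.new(0.0)
  initial_nodes.each { |node| spread(graph, node, start_energy, totals) }
  totals.reject { |node, _energy| initial_nodes.include?(node) }
        .sort_by { |node, energy| [-energy, node] }
        .map(&:first)
end

MOVIE_GRAPH = {
  "Movie A" => ["Actor X", "Genre G"],
  "Movie B" => ["Actor X"],
  "Movie C" => ["Genre G"],
  "Movie D" => ["Actor X", "Genre G"],
  "Actor X" => ["Movie A", "Movie B", "Movie D"],
  "Genre G" => ["Movie A", "Movie C", "Movie D"]
}.freeze

p recommend(MOVIE_GRAPH, ["Movie A"]).grep(/Movie/)
```

In this sample, "Movie D" shares both the actor and the genre with "Movie A", so it collects energy over two routes and ranks above the movies that share only one.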

Page 31: Recommendation - Evaluation

- Experimental evaluation
  - Which algorithm is the best (rating on a scale of 1-5)
  - 30 users / 168 scenarios

[Bar chart: rating per algorithm for Union Colors, Mixing Colors, Spreading Energy and Dijkstra; axis from 0 to 3.5]

Page 32: Movie Recommendation (4)

Live Demo…

Page 33: Movie Recommendation - User Model

- Spreading energy
  - Each initial node gets a different starting energy
  - Based on the user's interests and feedback
- Improves the recommendation!
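A small Ruby sketch of the idea: instead of a uniform starting energy, each initial node starts with energy scaled by a per-user preference weight. How the weights were actually derived from interests and feedback is not stated in the slides; the 0.0..1.0 weights, the single-step spread and the names `BASE_ENERGY`, `starting_energies`, `one_step_scores`, `GENRE_GRAPH` and `USER_PREFERENCES` are assumptions.

```ruby
BASE_ENERGY = 100.0

# Scale the base energy by a preference weight clamped to 0.0..1.0.
def starting_energies(preferences)
  preferences.transform_values { |weight| BASE_ENERGY * weight.clamp(0.0, 1.0) }
end

# Single spreading step: each initial node splits its energy evenly
# among its neighbours.
def one_step_scores(graph, energies)
  scores = Hash.new(0.0)
  energies.each do |node, energy|
    neighbours = graph.fetch(node, [])
    next if neighbours.empty?

    neighbours.each { |neighbour| scores[neighbour] += energy / neighbours.size }
  end
  scores
end

GENRE_GRAPH = { "Comedy" => ["Movie A"], "Drama" => ["Movie B"] }.freeze

# Hypothetical user model: strong interest in comedy, weaker in drama.
USER_PREFERENCES = { "Comedy" => 1.0, "Drama" => 0.4 }.freeze

p one_step_scores(GENRE_GRAPH, starting_energies(USER_PREFERENCES))
```

With uniform energies both movies would score the same; with the weighted start, the comedy ranks higher for this user.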

Page 34: Recommendation from subgraph

- Recommend movies which are currently in cinemas
- Recommend movies which are currently on TV
- How?
  - The algorithm traverses the graph normally
  - It creates a subgraph from which it returns nodes
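The approach above can be sketched in Ruby: the traversal runs over the full graph, but only nodes that belong to the allowed subset (for example, movies currently in cinemas) are returned. The depth-limited breadth first traversal, the sample data and the names `recommend_from_subgraph`, `CINEMA_GRAPH` and `IN_CINEMAS` are assumptions for illustration.

```ruby
require "set"

# Traverse the whole graph up to max_depth hops, but report only nodes
# that are members of the allowed subset.
def recommend_from_subgraph(graph, start, allowed, max_depth: 2)
  visited = Set[start]
  frontier = [start]
  hits = []
  max_depth.times do
    frontier = frontier.flat_map { |node| graph.fetch(node, []) }
                       .uniq
                       .reject { |node| visited.include?(node) }
    frontier.each do |node|
      visited << node
      hits << node if allowed.include?(node)
    end
  end
  hits
end

CINEMA_GRAPH = {
  "Movie A" => ["Actor X"],
  "Actor X" => ["Movie A", "Movie B", "Movie C"],
  "Movie B" => ["Actor X"],
  "Movie C" => ["Actor X"]
}.freeze

# Hypothetical subset: only this movie is currently in cinemas.
IN_CINEMAS = Set["Movie C"].freeze

p recommend_from_subgraph(CINEMA_GRAPH, "Movie A", IN_CINEMAS)
```

Note that the path to "Movie C" runs through "Actor X", which is not in the subset; restricting the traversal itself to the subgraph would have missed it.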

Page 35: Recommendation from subgraph (2)

Live Demo…

Page 36: TeleVido.tv

- Media content recommendation using Neo4j
  - Movie recommendation
  - Recommendation of movies in cinemas
  - Recommendation of TV programs and schedules

Page 37: Summary

- Graph databases
- Working with Neo4j and Ruby (on Rails)
- Plugins and algorithms
  - Document similarity
  - Movie recommendation
  - Recommendation from subgraph
- TeleVido.tv