NoSQL, Neo4j for Java Developers, OracleWeek-2012

Seminar: BigData, NoSQL graph database for Java developers*
Presenter: Evgeny Hanikblum

Upload: eugene-hanikblum

Post on 27-Jan-2015


DESCRIPTION

Neo4j for Java developers. Presented at OracleWeek 2012. Created by Eugene Hanikblum @ AlphaCSP

TRANSCRIPT

Page 1: NoSQL, Neo4J for Java Developers , OracleWeek-2012

Seminar: BigData, NoSQL graph database for Java developers*

Presenter: Evgeny Hanikblum

Page 2: NoSQL, Neo4J for Java Developers , OracleWeek-2012

Data is getting bigger:

“Every 2 days we create as much information as we did up to 2003”

– Eric Schmidt, Google

Page 3: NoSQL, Neo4J for Java Developers , OracleWeek-2012
Page 4: NoSQL, Neo4J for Java Developers , OracleWeek-2012

Big Data Technologies

Page 5: NoSQL, Neo4J for Java Developers , OracleWeek-2012

NoSQL Overview

Page 6: NoSQL, Neo4J for Java Developers , OracleWeek-2012

NoSQL->Not Only SQL

Page 7: NoSQL, Neo4J for Java Developers , OracleWeek-2012

Key Value Stores

• Most Based on Dynamo: Amazon Highly Available Key-Value Store

• Data Model:
– Global key-value mapping
– Big scalable HashMap
– Highly fault tolerant (typically)

• Projects:

Page 8: NoSQL, Neo4J for Java Developers , OracleWeek-2012

Key Value Stores

• Pros:
– Simple data model
– Scalable

• Cons:
– Create your own “foreign keys”
– Poor for complex data
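The “create your own foreign keys” point can be sketched in a few lines, assuming a plain in-memory HashMap stands in for the store. The class name, the key layout (`user:1:name`, `order:100:owner`) and the sample values are hypothetical and not taken from any particular product.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch: a HashMap standing in for a key-value store.
// The key layout ("user:1:name", "order:100:owner") is an illustrative assumption.
class KeyValueSketch {
    static final Map<String, String> store = new HashMap<>();

    // The store knows nothing about links between entries, so the application
    // follows the hand-made "foreign key" itself with two separate lookups.
    static String ownerNameOfOrder(String orderId) {
        String userId = store.get("order:" + orderId + ":owner"); // lookup 1: the reference
        if (userId == null) {
            return null; // unknown order, or order without an owner
        }
        return store.get("user:" + userId + ":name");             // lookup 2: the target
    }

    public static void main(String[] args) {
        store.put("user:1:name", "Alice");
        store.put("order:100:owner", "1"); // hand-made foreign key to user 1
        System.out.println(ownerNameOfOrder("100"));
    }
}
```

Every additional kind of link needs another key convention and another hand-written lookup, which is why the model gets awkward for complex data.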

Page 9: NoSQL, Neo4J for Java Developers , OracleWeek-2012

Column Databases

• Most Based on BigTable: Google’s Distributed Storage System for Structured Data
• Data Model:
– A big table, with column families
– Map Reduce for querying/processing

• Projects:

Page 10: NoSQL, Neo4J for Java Developers , OracleWeek-2012

Column Databases

• Pros:
– Supports Semi-Structured Data
– Naturally Indexed (columns)
– Scalable

• Cons:
– Poor for interconnected data

Page 11: NoSQL, Neo4J for Java Developers , OracleWeek-2012

Document Databases

• Data Model:
– A collection of documents
– A document is a key value collection
– Index-centric, lots of map-reduce

• Projects :

Page 12: NoSQL, Neo4J for Java Developers , OracleWeek-2012

Document Databases

• Pros:
– Simple, powerful data model
– Scalable

• Cons:
– Poor for interconnected data
– Query model limited to keys and indexes
– Map reduce for larger queries

Page 13: NoSQL, Neo4J for Java Developers , OracleWeek-2012

Graph Databases

• Data Model:
– Nodes and Relationships

• Projects:

Page 14: NoSQL, Neo4J for Java Developers , OracleWeek-2012

Graph Databases

• Pros:
– Powerful data model, as general as RDBMS
– Connected data locally indexed
– Easy to query

• Cons:
– Sharding (lots of people working on this)
  • Scales UP reasonably well
– Requires rewiring your brain

Page 15: NoSQL, Neo4J for Java Developers , OracleWeek-2012

Why do you need a GraphDB?

Page 16: NoSQL, Neo4J for Java Developers , OracleWeek-2012

GraphDB Overview

Because data has expanded into relationships

Page 17: NoSQL, Neo4J for Java Developers , OracleWeek-2012

GraphDB Overview

Because data has become interconnected

Page 18: NoSQL, Neo4J for Java Developers , OracleWeek-2012

When should I use it?

Page 19: NoSQL, Neo4J for Java Developers , OracleWeek-2012

Use a graph DB if you have to deal with something like this:

Page 20: NoSQL, Neo4J for Java Developers , OracleWeek-2012

or this …

Page 21: NoSQL, Neo4J for Java Developers , OracleWeek-2012

or this …

Page 22: NoSQL, Neo4J for Java Developers , OracleWeek-2012

Data is more connected:
• Text (content)
• HyperText (added pointers)
• RSS (joined those pointers)
• Blogs (added pingbacks)
• Tagging (grouped related data)
• RDF (described connected data)
• GGG (content + pointers + relationships + descriptions)

GraphDB Overview

Page 23: NoSQL, Neo4J for Java Developers , OracleWeek-2012

Data is less structured:

• If you tried to collect all the data of every movie ever made, how would you model it?

• Actors, Characters, Locations, Dates, Costs, Ratings, Showings, Ticket Sales, etc.

GraphDB Overview

Page 24: NoSQL, Neo4J for Java Developers , OracleWeek-2012

What is a Graph?

Page 25: NoSQL, Neo4J for Java Developers , OracleWeek-2012

What is a Graph?

• An abstract representation of a set of objects where some pairs are connected by links.

Object (Vertex, Node)

Link (Edge, Arc, Relationship)
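The definition above can be made concrete with a minimal sketch, assuming an undirected graph stored as adjacency sets. The class and method names are illustrative only.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Minimal sketch of an undirected graph: a set of vertices plus links between pairs.
class SimpleGraphSketch {
    // Each vertex maps to the set of vertices it is linked to.
    private final Map<String, Set<String>> adjacency = new LinkedHashMap<>();

    void addVertex(String vertex) {
        adjacency.computeIfAbsent(vertex, key -> new LinkedHashSet<>());
    }

    // An undirected edge is recorded on both of its endpoints.
    void addEdge(String a, String b) {
        addVertex(a);
        addVertex(b);
        adjacency.get(a).add(b);
        adjacency.get(b).add(a);
    }

    boolean connected(String a, String b) {
        Set<String> neighbours = adjacency.get(a);
        return neighbours != null && neighbours.contains(b);
    }

    int vertexCount() {
        return adjacency.size();
    }

    public static void main(String[] args) {
        SimpleGraphSketch graph = new SimpleGraphSketch();
        graph.addEdge("A", "B"); // only some pairs are connected
        graph.addVertex("C");    // a vertex may have no links at all
        System.out.println(graph.connected("B", "A"));
        System.out.println(graph.connected("A", "C"));
    }
}
```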

Page 26: NoSQL, Neo4J for Java Developers , OracleWeek-2012

Different Kinds of Graphs

• Undirected Graph
• Directed Graph
• Pseudo Graph
• Multi Graph
• Hyper Graph

Page 27: NoSQL, Neo4J for Java Developers , OracleWeek-2012

More Kinds of Graphs

• Weighted Graph

• Labeled Graph

• Property Graph
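As a rough illustration of the last kind, a property graph can be sketched as nodes and typed, directed relationships that each carry their own key-value properties. This is a minimal in-memory sketch under that assumption; the class names and the sample data are hypothetical and this is not Neo4j's API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal in-memory sketch of a property graph (not the Neo4j API).
class PropertyGraphSketch {
    static class Node {
        final Map<String, Object> properties = new HashMap<>();
        final List<Relationship> outgoing = new ArrayList<>();

        // Creates a typed, directed relationship from this node to 'end'.
        Relationship relateTo(Node end, String type) {
            Relationship relationship = new Relationship(this, end, type);
            outgoing.add(relationship);
            return relationship;
        }
    }

    static class Relationship {
        final Node start;
        final Node end;
        final String type;
        // Relationships carry properties too, just like nodes.
        final Map<String, Object> properties = new HashMap<>();

        Relationship(Node start, Node end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    public static void main(String[] args) {
        Node english = new Node();
        english.properties.put("name", "English");
        Node canada = new Node();
        canada.properties.put("name", "Canada");

        Relationship spokenIn = english.relateTo(canada, "IS_SPOKEN_IN");
        spokenIn.properties.put("as_primary", true);
        System.out.println(spokenIn.type + " " + spokenIn.properties);
    }
}
```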

Page 28: NoSQL, Neo4J for Java Developers , OracleWeek-2012

What is a Graph DB?

Page 29: NoSQL, Neo4J for Java Developers , OracleWeek-2012

What is a Graph DB?

• A database with an explicit graph structure
• Each node knows its adjacent nodes
• As the number of nodes increases, the cost of a local step (or hop) remains the same
• Plus an Index for lookups
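A minimal sketch of these points, assuming nodes hold direct references to their neighbours and a separate name index is used only to find a starting node. The names are illustrative; this is not how any specific product is implemented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: each node knows its adjacent nodes; an index is only the entry point.
class LocalHopSketch {
    static class Node {
        final String name;
        final List<Node> neighbours = new ArrayList<>(); // direct references
        Node(String name) { this.name = name; }
    }

    // The index is consulted once, to locate where a traversal starts.
    private final Map<String, Node> nameIndex = new HashMap<>();

    Node createNode(String name) {
        Node node = new Node(name);
        nameIndex.put(name, node);
        return node;
    }

    void connect(Node a, Node b) {
        a.neighbours.add(b);
        b.neighbours.add(a);
    }

    Node lookup(String name) {
        return nameIndex.get(name);
    }

    // One hop: the work depends on this node's own neighbour list,
    // not on how many nodes exist in the whole graph.
    static List<String> hop(Node from) {
        List<String> names = new ArrayList<>();
        for (Node neighbour : from.neighbours) {
            names.add(neighbour.name);
        }
        return names;
    }

    public static void main(String[] args) {
        LocalHopSketch graph = new LocalHopSketch();
        Node a = graph.createNode("a");
        graph.connect(a, graph.createNode("b"));
        graph.connect(a, graph.createNode("c"));
        graph.createNode("unrelated"); // extra nodes do not slow down the hop from "a"
        System.out.println(hop(graph.lookup("a")));
    }
}
```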

Page 30: NoSQL, Neo4J for Java Developers , OracleWeek-2012

Compared to Relational Databases

• Relational databases: optimized for aggregation
• Graph databases: optimized for connections

Page 31: NoSQL, Neo4J for Java Developers , OracleWeek-2012

What is Neo4j?

Page 32: NoSQL, Neo4J for Java Developers , OracleWeek-2012

What is Neo4j?

• A Java-based graph database
• Property Graph
• Full ACID (atomicity, consistency, isolation, durability)
• High Availability (with Enterprise Edition)
• 32 Billion Nodes, 32 Billion Relationships, 64 Billion Properties
• Embedded Server
• REST API

Page 33: NoSQL, Neo4J for Java Developers , OracleWeek-2012

• Both nodes and relationships can have metadata.
• Integrated pattern-matching-based query language (“Cypher”).
• Also the “Gremlin” graph traversal language can be used.
• Indexing of nodes and relationships. (Lucene)
• Nice self-contained web admin.
• Advanced path-finding with multiple algorithms.
• Optimized for reads.
• Has transactions (in the Java API)
• Scriptable in Groovy
• Online backup, advanced monitoring and High Availability are AGPL/commercial licensed

What is Neo4j?

Page 34: NoSQL, Neo4J for Java Developers , OracleWeek-2012

Neo4j is good for:
• Highly connected data (social networks)
• Recommendations (e-commerce)
• Path Finding (how do I know you?)
• A* (Least Cost path)
• Data First Schema (bottom-up, but you still need to design)
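The “how do I know you?” question is a shortest-path search over a social graph. The sketch below uses a plain breadth-first search over unweighted friendships; it is neither A* nor Neo4j's own path-finding, and the class name, method names and people are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

// Sketch: "how do I know you?" as a breadth-first shortest-path search.
class PathFindingSketch {
    private final Map<String, List<String>> knows = new HashMap<>();

    void addFriendship(String a, String b) {
        knows.computeIfAbsent(a, key -> new ArrayList<>()).add(b);
        knows.computeIfAbsent(b, key -> new ArrayList<>()).add(a);
    }

    // Returns the shortest chain of people from start to goal,
    // or an empty list when the two are not connected.
    List<String> shortestPath(String start, String goal) {
        Map<String, String> cameFrom = new HashMap<>(); // person -> who led us there
        Deque<String> queue = new ArrayDeque<>();
        cameFrom.put(start, null);
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            if (current.equals(goal)) {
                // Walk the predecessor links back to the start.
                LinkedList<String> path = new LinkedList<>();
                for (String step = goal; step != null; step = cameFrom.get(step)) {
                    path.addFirst(step);
                }
                return path;
            }
            for (String next : knows.getOrDefault(current, Collections.<String>emptyList())) {
                if (!cameFrom.containsKey(next)) { // not visited yet
                    cameFrom.put(next, current);
                    queue.add(next);
                }
            }
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        PathFindingSketch graph = new PathFindingSketch();
        graph.addFriendship("me", "ann");
        graph.addFriendship("ann", "you");
        graph.addFriendship("me", "bob");
        graph.addFriendship("bob", "carl");
        graph.addFriendship("carl", "you");
        System.out.println(graph.shortestPath("me", "you"));
    }
}
```

A least-cost search such as A* would additionally need edge weights and a heuristic; the breadth-first version only minimizes the number of hops.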

Page 35: NoSQL, Neo4J for Java Developers , OracleWeek-2012

how do I know you?

Page 36: NoSQL, Neo4J for Java Developers , OracleWeek-2012

how can I get there ?

Page 37: NoSQL, Neo4J for Java Developers , OracleWeek-2012

If you’ve ever
• Joined more than 7 tables together
• Modeled a graph in a table
• Written a recursive CTE
• Tried to write some crazy stored procedure with multiple recursive self and inner joins

You should use Neo4j

Page 38: NoSQL, Neo4J for Java Developers , OracleWeek-2012

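rewiring your brain

Graph model:
• Language node: name, code, word_count
• Country node: name, code, flag_uri
• IS_SPOKEN_IN relationship (Language to Country): as_primary

Relational model:
• Language table: language_code, language_name, word_count
• Country table: country_code, country_name, flag_uri
• LanguageCountry table: language_code, country_code, primary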
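The difference between the two models can be sketched as follows, assuming in-memory stand-ins for both. The relational side answers “which languages are spoken in this country?” by scanning LanguageCountry rows, while the graph side follows the relationships stored on the country node. Class names, method names and the sample codes are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: the same question asked against a join table and against a graph.
class JoinVersusTraversalSketch {
    // Relational style: one row of the LanguageCountry join table.
    static class LanguageCountryRow {
        final String languageCode;
        final String countryCode;
        final boolean primary;

        LanguageCountryRow(String languageCode, String countryCode, boolean primary) {
            this.languageCode = languageCode;
            this.countryCode = countryCode;
            this.primary = primary;
        }
    }

    // Relational style: every row of the join table is checked.
    static List<String> languagesViaJoinTable(List<LanguageCountryRow> rows, String countryCode) {
        List<String> result = new ArrayList<>();
        for (LanguageCountryRow row : rows) {
            if (row.countryCode.equals(countryCode)) {
                result.add(row.languageCode);
            }
        }
        return result;
    }

    // Graph style: the country node holds its IS_SPOKEN_IN relationships directly.
    static class CountryNode {
        final String code;
        final List<String> spokenHere = new ArrayList<>(); // codes of linked Language nodes

        CountryNode(String code) { this.code = code; }
    }

    // Graph style: only the relationships of this one node are visited.
    static List<String> languagesViaRelationships(CountryNode country) {
        return new ArrayList<>(country.spokenHere);
    }

    public static void main(String[] args) {
        List<LanguageCountryRow> rows = new ArrayList<>();
        rows.add(new LanguageCountryRow("en", "CA", true));
        rows.add(new LanguageCountryRow("fr", "CA", true));
        rows.add(new LanguageCountryRow("en", "US", true));
        System.out.println(languagesViaJoinTable(rows, "CA"));

        CountryNode canada = new CountryNode("CA");
        canada.spokenHere.add("en");
        canada.spokenHere.add("fr");
        System.out.println(languagesViaRelationships(canada));
    }
}
```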

Page 39: NoSQL, Neo4J for Java Developers , OracleWeek-2012

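Property-list version:
• name: “Canada”, languages_spoken: “[ ‘English’, ‘French’ ]”

Graph version:
• Country nodes: name: “Canada”, name: “USA”, name: “France”
• Language nodes: language: “English”, language: “French”
• Four spoken_in relationships link the language nodes to the country nodes

rewiring your brain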

Page 40: NoSQL, Neo4J for Java Developers , OracleWeek-2012

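Single-record version:
• Country: name, flag_uri, language_name, number_of_words, yes_in_language, no_in_language, currency_code

Graph version:
• Country node: name, flag_uri
• Language node: name, number_of_words, yes, no
• Currency node: code, name
• SPEAKS relationship: Country to Language
• USES_CURRENCY relationship: Country to Currency

rewiring your brain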

Page 41: NoSQL, Neo4J for Java Developers , OracleWeek-2012

show me the code!

GraphDatabaseService graphDb = new EmbeddedGraphDatabase("var/neo4j");

// Note: in the Neo4j 1.x embedded API, the writes below must run inside a transaction.
Node david = graphDb.createNode();
Node andreas = graphDb.createNode();

david.setProperty("name", "David Montag");
andreas.setProperty("name", "Andreas Kollegger");

Relationship presentedWith = david.createRelationshipTo(andreas, PresentationTypes.PRESENTED_WITH);
presentedWith.setProperty("date", System.currentTimeMillis());

Page 42: NoSQL, Neo4J for Java Developers , OracleWeek-2012

Neo4j data browser

Page 43: NoSQL, Neo4J for Java Developers , OracleWeek-2012

Neo4j data browser

Page 44: NoSQL, Neo4J for Java Developers , OracleWeek-2012

Neoclipse

Page 45: NoSQL, Neo4J for Java Developers , OracleWeek-2012

console.neo4j.org

Try it right now:

start n=node(*) match n-[r:LOVES]->m return n, type(r), m

Notice the two nodes in red, they are your result set.
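To make the pattern in that query easier to read, here is a rough Java sketch of what it selects: every relationship of type LOVES, reported as start node, relationship type and end node. It is a plain in-memory filter under that assumption, not how Cypher is executed, and the class name and sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: what the pattern n-[r:LOVES]->m selects, as a plain in-memory filter.
class PatternMatchSketch {
    static class Rel {
        final String start;
        final String type;
        final String end;

        Rel(String start, String type, String end) {
            this.start = start;
            this.type = type;
            this.end = end;
        }
    }

    // Returns one "n, type(r), m" row per relationship of the wanted type.
    static List<String> match(List<Rel> relationships, String wantedType) {
        List<String> rows = new ArrayList<>();
        for (Rel r : relationships) {
            if (r.type.equals(wantedType)) {
                rows.add(r.start + ", " + r.type + ", " + r.end);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Rel> relationships = new ArrayList<>();
        relationships.add(new Rel("Ann", "LOVES", "Bob"));
        relationships.add(new Rel("Ann", "KNOWS", "Carl"));
        for (String row : match(relationships, "LOVES")) {
            System.out.println(row);
        }
    }
}
```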

Page 46: NoSQL, Neo4J for Java Developers , OracleWeek-2012

Spring-Data-Neo4J

Page 47: NoSQL, Neo4J for Java Developers , OracleWeek-2012

• Focus on Spring Data Neo4j
• VMWare is collaborating with Neo Technology, the company behind the Neo4j graph database.
• Improved programming model: Annotation-based programming model for applications with rich domain models
• Cross-store persistence: Extend existing JPA application with NoSQL persistence
• Tagging (grouped related data)
• RDF (described connected data)

Spring-Data-Neo4J

Page 48: NoSQL, Neo4J for Java Developers , OracleWeek-2012

@NodeEntity

Spring-Data-Neo4J

@NodeEntity
public class Actor {
    private String name;
    private int age;
    private HairColor hairColor;
    private transient String nickname;
}
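One way to picture annotation-based mapping is a reflection pass that turns an annotated object's fields into node properties and skips `transient` ones. The sketch below is only an illustration under that assumption: `NodeEntitySketch` is a hypothetical stand-in annotation, `toProperties` is an invented helper, and none of this is claimed to be how Spring Data Neo4j works internally.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;

// Sketch: mapping an annotated object's fields to node properties via reflection.
class EntityMappingSketch {
    // Hypothetical stand-in for a node-entity marker annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @interface NodeEntitySketch {}

    @NodeEntitySketch
    static class Actor {
        private String name;
        private int age;
        private transient String nickname; // transient: left out of the mapping

        Actor(String name, int age, String nickname) {
            this.name = name;
            this.age = age;
            this.nickname = nickname;
        }
    }

    // Collects every non-transient, non-static field as a property.
    static Map<String, Object> toProperties(Object entity) {
        Class<?> type = entity.getClass();
        if (!type.isAnnotationPresent(NodeEntitySketch.class)) {
            throw new IllegalArgumentException("not a node entity: " + type.getName());
        }
        Map<String, Object> properties = new TreeMap<>(); // sorted for stable output
        for (Field field : type.getDeclaredFields()) {
            int modifiers = field.getModifiers();
            if (Modifier.isTransient(modifiers) || Modifier.isStatic(modifiers) || field.isSynthetic()) {
                continue;
            }
            field.setAccessible(true);
            try {
                properties.put(field.getName(), field.get(entity));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return properties;
    }

    public static void main(String[] args) {
        System.out.println(toProperties(new Actor("Ann", 30, "A")));
    }
}
```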

Page 49: NoSQL, Neo4J for Java Developers , OracleWeek-2012

Spring-Data-Neo4J

@NodeEntity
public class Movie {

    @GraphId Long id;

    @Indexed(type = FULLTEXT, indexName = "search") String title;

    Person director;

    @RelatedTo(type = "ACTS_IN", direction = INCOMING) Set<Person> actors;

    @RelatedToVia(type = "RATED") Iterable<Rating> ratings;

    @Query("start movie=node({self}) match movie-->genre<--similar return similar") Iterable<Movie> similarMovies;
}

Page 50: NoSQL, Neo4J for Java Developers , OracleWeek-2012

@RelationshipEntity

Spring-Data-Neo4J

@RelationshipEntity
public class Role {
    @StartNode private Actor actor;
    @EndNode private Movie movie;
    private String roleName;
}

Page 51: NoSQL, Neo4J for Java Developers , OracleWeek-2012

Spring-Data-Neo4J

@RelationshipEntity
public class Role {
    @StartNode private Actor actor;
    @EndNode private Movie movie;
    private String roleName;
}

@NodeEntity
public class Actor {
    @RelatedToVia(type = "ACTS_IN")
    private Iterable<Role> roles;
}

Page 52: NoSQL, Neo4J for Java Developers , OracleWeek-2012

How did they do that?

Page 53: NoSQL, Neo4J for Java Developers , OracleWeek-2012

NoSql->Graph DB->Neo4J

Lecturer: Evgeny Hanikblum @ AlphaCSP : OracleWeek2012 : Israel
Email: [email protected]