JOURNEY TO NOSQL Will Iverson CTO, Dynacron Group

Upload: trinhkhue

Post on 02-May-2018

Background
• Java since 1995
• Consulting since 1999
• Four Java Books
  • Hibernate: A J2EE Developer’s Guide, 2004
• CTO, Dynacron Group, 2010-Present
• Several NoSQL PoCs
• Brian and Dustin on details

Trends & Context
• Hibernate
• MyBatis
• NoSQL
• Big Data & NoSQL = related, but not same thing
  • Both are now famous!

Hibernate ORM & MyBatis
• Hibernate
  • Map classes to tables
  • Use HQL/Criteria queries
  • Pitch: Don’t have to learn SQL!
  • Reality: Learn Java, SQL, and HQL!
  • Debugging very difficult
• MyBatis
  • Write interfaces
  • Add SQL via annotations
• For more on Hibernate vs. MyBatis, see video of last talk on SeaJUG.org

Most interesting: decoupling application from RDBMS

MyBatis Decoupling?
• Contract with data/persistence is just an interface
  • Easy to mock
  • Easy to understand
  • Easy to… replace…
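
The decoupling idea on this slide can be sketched in plain Java. This is a minimal illustration only: `Order`, `OrderStore`, `InMemoryOrderStore`, and `OrderService` are hypothetical names that do not come from the talk, and no MyBatis classes are used. With MyBatis the interface would be backed by SQL, but the calling code would look the same.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical domain type, invented for this sketch.
final class Order {
    final String id;
    final String customer;
    final double total;

    Order(String id, String customer, double total) {
        this.id = id;
        this.customer = customer;
        this.total = total;
    }
}

// The persistence contract: callers depend only on this interface.
interface OrderStore {
    void save(Order order);

    Optional<Order> findById(String id);
}

// In-memory stand-in: works as a mock in tests, and shows that the backend
// behind the interface can be replaced without touching callers.
final class InMemoryOrderStore implements OrderStore {
    private final Map<String, Order> orders = new HashMap<>();

    @Override
    public void save(Order order) {
        orders.put(order.id, order);
    }

    @Override
    public Optional<Order> findById(String id) {
        return Optional.ofNullable(orders.get(id));
    }
}

// Application code sees OrderStore, never the storage technology.
final class OrderService {
    private final OrderStore store;

    OrderService(OrderStore store) {
        this.store = store;
    }

    String describe(String id) {
        return store.findById(id)
                .map(o -> o.customer + " owes " + o.total)
                .orElse("no such order");
    }
}
```

Swapping an RDBMS-backed implementation for a document store would then mean writing one new class that implements `OrderStore`; `OrderService` stays unchanged.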

NoSQL

http://cdi-mdm.blogspot.com/2011/07/nosql-newsql-and-mdm.html

NoSQL Popularity

Today’s Two Examples
• MongoDB
  • Document Store
  • Order Example
• Neo4j
  • Graph walking
• Different Tools
  • …very complementary

From http://www.neo4j.org/develop/example_data … and the BBC

Final Thoughts

Full Stack JavaScript
Angular.js or Ember.js
Node.js
MongoDB

Final Thoughts

Favorite (Personal) Stack Today?
Persistence: MongoDB
Plain Old Java + Maven

Interfaces in JSON, easy, fast
Jackson FTW

Add Ratpack (HTTP)
http://www.ratpack-framework.org/

Angular.js vs Ember.js?
FIGHT!

http://pastordonblog.blogspot.com/2010/09/stephen-hawking-is-probably-best-known.html

MONGODB
High Level Overview
Prepared by Brian Kereszturi, Dynacron Group

MongoDB: What is it?
• Document-oriented
• “a scalable, high-performance, open source NoSQL database”
• Auto-Sharding
  • Distributed writes
• Replica Sets
  • High Availability
  • Distributed reads
• Low cost of ownership
  • Commodity hardware
  • Minimal administration

MongoDB: Data Structure
• BSON (Binary JSON) Documents
• Schema-less
• Secondary indexes
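
To make the three bullets concrete, here is a rough sketch that models documents as nested Java maps and builds a toy secondary index over one field. It does not use the MongoDB driver or real BSON; `DocumentSketch` and its methods are hypothetical names, and the order/line-item shape is an assumed stand-in for the talk's order example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DocumentSketch {
    // An order is one nested document: its line items live inside it,
    // so fetching the order fetches the sub-objects too.
    static Map<String, Object> order(String id, String customer,
                                     List<Map<String, Object>> items) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("_id", id);
        doc.put("customer", customer);
        doc.put("items", items);
        return doc;
    }

    static Map<String, Object> item(String sku, int qty) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("sku", sku);
        doc.put("qty", qty);
        return doc;
    }

    // Toy secondary index: field value -> documents carrying that value.
    // Schema-less means a document may simply lack the field; skip those.
    static Map<Object, List<Map<String, Object>>> indexBy(
            String field, List<Map<String, Object>> docs) {
        Map<Object, List<Map<String, Object>>> index = new HashMap<>();
        for (Map<String, Object> doc : docs) {
            if (doc.containsKey(field)) {
                index.computeIfAbsent(doc.get(field), k -> new ArrayList<>()).add(doc);
            }
        }
        return index;
    }
}
```

A lookup such as `indexBy("customer", docs).get("alice")` then avoids scanning every document, which is the job a real secondary index does on the server.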

MongoDB: Use Cases/Strengths
• Well-suited for large data sets
  • Craigslist (10TB, 5B records)
  • Wordnik (3.5TB, 20B records)
  • Disney 1400 instances
• Always fetch an object with sub-objects
• High volume reads/writes
  • Latency as low as 0.1ms
• Map Reduce
• Aggregation

MongoDB: Challenges
• Data modeling
  • What questions do I need to answer up front?
• Selecting a Shard Key
• Multi-datacenter replication
  • Time-delayed replicas
  • Hidden replicas
• Elections
  • Arbiter
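
Why the shard key is a challenge can be shown with a toy router. The sketch below is an assumption-laden illustration, not MongoDB's actual chunk placement: it hashes a chosen key and takes it modulo the shard count, and `ShardKeySketch.distribute` is a hypothetical helper. The point is only that a key with few distinct values piles records onto one shard.

```java
import java.util.List;
import java.util.function.Function;

final class ShardKeySketch {
    // Count how many records land on each shard when routed by the
    // hash of the chosen shard key (toy scheme: hash modulo shard count).
    static <T> int[] distribute(List<T> records, Function<T, Object> shardKey, int shards) {
        int[] counts = new int[shards];
        for (T record : records) {
            int bucket = Math.floorMod(shardKey.apply(record).hashCode(), shards);
            counts[bucket]++;
        }
        return counts;
    }
}
```

Routing 100 records by a unique id spreads them across all shards, while routing the same records by a constant field such as a status of `"open"` sends every one of them to a single shard.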

MongoDB: Topology

NEO4J
Developer Overview

What is Neo4j?
It’s a Graph Database!

Say What?
• GRAPH database
  • Nodes, relationships, properties
  • Shines with complex, highly connected data
    • Social networks
    • Recommendations
    • Path finding
• Graph DATABASE
  • Reliable: ACID Compliance, High availability
  • Scalable: 32B nodes and edges, 64B properties
  • Accessible: REST API, Embeddable on JVM
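
As a rough picture of the "path finding" use case, the sketch below keeps nodes and typed relationships in memory and finds a shortest path with breadth-first search. It is plain Java, not the Neo4j API; `GraphSketch` is a hypothetical class, and properties are left out for brevity.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Queue;

final class GraphSketch {
    // Outgoing relationships per node name, each stored as {type, target}.
    private final Map<String, List<String[]>> outgoing = new HashMap<>();

    void relate(String from, String type, String to) {
        outgoing.computeIfAbsent(from, k -> new ArrayList<>()).add(new String[] {type, to});
    }

    // Breadth-first search: shortest path by hop count, following
    // relationships in their direction; empty list if goal is unreachable.
    List<String> shortestPath(String start, String goal) {
        Map<String, String> cameFrom = new HashMap<>();
        Queue<String> frontier = new ArrayDeque<>();
        frontier.add(start);
        cameFrom.put(start, null);
        while (!frontier.isEmpty()) {
            String current = frontier.remove();
            if (current.equals(goal)) {
                LinkedList<String> path = new LinkedList<>();
                for (String n = goal; n != null; n = cameFrom.get(n)) {
                    path.addFirst(n);
                }
                return path;
            }
            for (String[] rel : outgoing.getOrDefault(current, Collections.emptyList())) {
                String next = rel[1];
                if (!cameFrom.containsKey(next)) {
                    cameFrom.put(next, current);
                    frontier.add(next);
                }
            }
        }
        return Collections.emptyList();
    }
}
```

In a relational schema the same question needs one self-join per hop; here each hop is a direct lookup of a node's relationships, which is why connected data suits a graph store.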

Querying
• Cypher Query Language
  • Best for ad-hoc querying
  • SQL-like language
  • REST interface
  • Easy to copy-paste in email
  • “Prepared” statements
• Traversal API
  • Best for high-performance querying
  • Custom JAX-RS plugin
  • Java code
  • More powerful
  • Lower latency
  • Clean REST interface

Cypher
• Start-Match-Where-Return
• START root=node(0)
  RETURN root
• START root=node(100)
  MATCH (root)-[:has]->(child)
  RETURN child
• START me=node:lookup(name = "dustin")
  MATCH me-[f:friend]->(friend)
  WHERE friend.gender = 'M' AND f.date < '2012-01-01'
  RETURN friend.name, f.date
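
For readers more at home in Java than Cypher, the third query's Start-Match-Where-Return shape can be mirrored over an in-memory list. This is only an analogy under assumptions: `FriendEdge` and `CypherShapeSketch` are hypothetical names, and dates are assumed to be ISO `yyyy-MM-dd` strings, so plain string comparison orders them correctly.

```java
import java.util.ArrayList;
import java.util.List;

// One "friend" relationship plus the properties the query reads.
final class FriendEdge {
    final String from;
    final String to;
    final String toGender;
    final String date; // ISO yyyy-MM-dd, so string order equals date order

    FriendEdge(String from, String to, String toGender, String date) {
        this.from = from;
        this.to = to;
        this.toGender = toGender;
        this.date = date;
    }
}

final class CypherShapeSketch {
    static List<String> maleFriendsBefore(List<FriendEdge> edges, String me, String cutoff) {
        List<String> rows = new ArrayList<>();
        for (FriendEdge f : edges) {
            // START + MATCH: only relationships leaving the start node.
            if (!f.from.equals(me)) {
                continue;
            }
            // WHERE: filter on a node property and a relationship property.
            if (f.toGender.equals("M") && f.date.compareTo(cutoff) < 0) {
                // RETURN: project the two requested values into one row.
                rows.add(f.to + ", " + f.date);
            }
        }
        return rows;
    }
}
```

The loop scans every relationship, whereas the database starts from an indexed node and walks only its own relationships; the sketch mirrors the query's logic, not its cost.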

Traversal API

Domain Layers
• Spring Data
  • Collaboration with SpringSource
  • Annotation/AspectJ-driven
• Qi4j, jo4neo, …

Performance
• > 1 billion nodes, > 1 billion relationships, > 3 billion properties
• < 10ms query time on average
• < 100ms query time, 99th percentile
• 4000 req/sec on 3 beefy servers
  • 16-core, 256GB RAM, 1.1TB SSD in RAID 0+1
• Demands
  • Practically begs for SSD
  • Not horizontally scalable
    • Add more machines for read scaling
• Tuning is VERY important.
  • Order of magnitude speed increase from letting memory-mapped IO consume almost all system resources

Exploring Data
• BEER GOOD
• Wikipedia collection of Belgian Beers

Q&A