polyglottany is not a sin
Post on 04-Jul-2015
Polyglottany Is Not a Sin
Eric Lubow
@elubow
elubow@simplereach.com
#MongoBoston
Polyglottany Is Not A Sin Eric Lubow @elubow
Overview
• SimpleReach
• Definitions and Data Stores
• Evolution to Polyglottany
• Tie It Together
• Final Thoughts
• Questions
Socially Intelligent
Size
• 150m events recorded per day and growing
• 600m pageviews per month and growing
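To put these volumes in per-second terms, a quick back-of-the-envelope calculation helps. This is only a sketch: it assumes traffic is spread evenly over the day and that a month has 30 days, neither of which the slides state, so real peaks will be higher than these averages.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def per_second(count: float, days: float) -> float:
    """Average rate per second for `count` items spread evenly over `days`."""
    return count / (days * SECONDS_PER_DAY)


# Figures from the slide; the 30-day month is an assumption.
events_per_second = per_second(150_000_000, days=1)
pageviews_per_second = per_second(600_000_000, days=30)

print(f"events/s:    {events_per_second:,.0f}")
print(f"pageviews/s: {pageviews_per_second:,.0f}")
```

On these assumptions the averages come out to roughly 1,700 events and 230 pageviews per second.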
Polyglot Persistence
Polyglot Persistence, like polyglot programming, is all about choosing the right persistence option for the task at hand.
http://www.sleberknight.com/blog/sleberkn/entry/polyglot_persistence
Right Tool For The Job
Decisions. Decisions.
[Diagram: four quadrants labeled Data, Tech, Financial, Other]
• Do I have legal requirements (HIPAA/FIPS/Sarbanes-Oxley/PII)?
• What kind of enterprise support is available?
• What is the community like?
• Does the product roadmap pertain to my roadmap?
• Are my display requirements for realtime data?
• Do I need to aggregate data on the fly?
• Is my data structured or unstructured?
• Does my data lend itself to a specific design pattern?
• What are my query patterns?
• Is my data ingestion high volume/high velocity?
• Am I batch loading data?
• Am I write heavy or read heavy?
• Are data relationships important?
• Does my data need to be immediately available everywhere?
• Am I cloud based?
• Am I hardware based?
• Am I a cloud/iron hybrid?
• How much am I willing to spend?
• How much am I willing to spend if something goes wrong?
• How fault tolerant is the system?
• What supporting tools do I need?
• Is there support for my language?
• Is the encryption/authentication/authorization support sufficient for my needs?
• Are there monitoring architectures already built?
• Are there best practices guides already
• Will the data need to be distributed?
No One Size Fits All
Tools
C*
Free vs. Cost
Languages
Pre-Scale
SimpleReach Pre-Scale
Scale
SimpleReach
C*
Mongo Conference
Cassandra (C*)
• Large data volume ingestion at high velocity
• Really fast writes to many locations (eventual consistency)
• Query by column groups within rows (slicing)
• OpsCenter
• Data toolkit: more than a data storage layer
• TTLs for small group aggregation
• Wrote Helenus, Node.js driver for Cassandra
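The slicing and TTL points can be illustrated without a cluster. The sketch below is a plain in-memory stand-in, not the Cassandra API or the Helenus driver: the `WideRow` class, its method names, and the time-bucket column names are all hypothetical. It shows the idea of one row whose columns are kept sorted by name, so a contiguous range can be read in one pass, with per-column expiry.

```python
import bisect
from typing import Any, Dict, List, Optional, Tuple


class WideRow:
    """In-memory stand-in for one wide row: columns kept sorted by name."""

    def __init__(self) -> None:
        self._names: List[str] = []
        # column name -> (value, expiry time or None for "never expires")
        self._cells: Dict[str, Tuple[Any, Optional[float]]] = {}

    def insert(self, name: str, value: Any, now: float,
               ttl: Optional[float] = None) -> None:
        """Write a column, optionally expiring `ttl` seconds after `now`."""
        if name not in self._cells:
            bisect.insort(self._names, name)
        expires_at = now + ttl if ttl is not None else None
        self._cells[name] = (value, expires_at)

    def slice(self, start: str, end: str, now: float) -> List[Tuple[str, Any]]:
        """Return live columns with start <= name <= end, in sorted order."""
        lo = bisect.bisect_left(self._names, start)
        hi = bisect.bisect_right(self._names, end)
        live = []
        for name in self._names[lo:hi]:
            value, expires_at = self._cells[name]
            if expires_at is None or expires_at > now:
                live.append((name, value))
        return live


row = WideRow()
row.insert("10:00", 12, now=0, ttl=3600)  # short-lived aggregation bucket
row.insert("10:05", 7, now=0, ttl=3600)
row.insert("11:00", 3, now=0)             # no TTL
print(row.slice("10:00", "10:59", now=10))
```

Passing `now` explicitly keeps the example deterministic; a real store would use its own clock and would also reclaim expired columns rather than just hiding them.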
MongoDB
• Fast atomic increments (Node.js is native JSON)
• Sharding
• Solid ORM for Rails (Mongoid)
• Fast access for pub/sub of durable/persisted documents
• B-tree indexes
• Document based via JSON
• TTLs for ephemeral data
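For readers unfamiliar with atomic increments, MongoDB expresses them as an update document using the `$inc` operator. The sketch below only mimics that shape against a Python dict; `apply_inc`, the `counters` collection, and the field names are hypothetical, and unlike the server this stand-in makes no atomicity guarantee under concurrency.

```python
from typing import Dict

Document = Dict[str, int]


def apply_inc(collection: Dict[str, Document], doc_id: str,
              update: Dict[str, Dict[str, int]]) -> Document:
    """Apply the `$inc` part of an update document.

    Creates the document and any missing counter fields, which is how
    `$inc` combined with an upsert behaves for flat fields.
    """
    doc = collection.setdefault(doc_id, {})
    for field, amount in update.get("$inc", {}).items():
        doc[field] = doc.get(field, 0) + amount
    return doc


counters: Dict[str, Document] = {}
apply_inc(counters, "article-1", {"$inc": {"pageviews": 1}})
apply_inc(counters, "article-1", {"$inc": {"pageviews": 1, "shares": 2}})
print(counters)
```

The appeal for event counting is that the client never reads the current value: it sends only the delta, so concurrent writers do not overwrite each other.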
Redis
• Supports hundreds of thousands of transactions per second
• Great caching engine
• Supports useful variable types like sets, sorted sets, lists
• Everything is guaranteed to be memory mapped (mmap)
• Transactional and supports bulk operations
• Centralized queueing and locking system
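The locking point is commonly implemented with Redis's `SET key value NX EX seconds`, which sets a key only if it is absent and expires it automatically. The slides do not say how SimpleReach did it, so the following is just an in-memory sketch of those semantics; the `ExpiringLocks` class and its method names are hypothetical.

```python
from typing import Dict, Tuple


class ExpiringLocks:
    """Stand-in for set-if-absent-with-expiry locking on a central store."""

    def __init__(self) -> None:
        self._held: Dict[str, Tuple[str, float]] = {}  # name -> (owner, expiry)

    def acquire(self, name: str, owner: str, ttl: float, now: float) -> bool:
        """Take the lock unless another holder's lease is still live."""
        holder = self._held.get(name)
        if holder is not None and holder[1] > now:
            return False
        self._held[name] = (owner, now + ttl)
        return True

    def release(self, name: str, owner: str, now: float) -> bool:
        """Release only if `owner` still holds a live lease."""
        holder = self._held.get(name)
        if holder is None or holder[0] != owner or holder[1] <= now:
            return False
        del self._held[name]
        return True


locks = ExpiringLocks()
print(locks.acquire("rollup", "worker-a", ttl=30, now=0))   # True
print(locks.acquire("rollup", "worker-b", ttl=30, now=10))  # False, still held
print(locks.acquire("rollup", "worker-b", ttl=30, now=31))  # True, lease expired
```

The expiry is what makes this usable across many workers: a crashed holder cannot block the others forever.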
Infobright
• Works with standard MySQL driver
• Column Stores for ad-hoc analytics queries in SQL
• Databases built for business intelligence
• Heavy compression of data
• Pre-aggregated data (Knowledge Grid)
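Why pre-aggregated metadata helps analytic queries can be shown in miniature. The sketch below is loosely modeled on the idea of keeping summary statistics per block of a column; it is not Infobright's implementation, and `Pack`, `build_packs`, `sum_at_least`, and the tiny pack size are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Pack:
    """A block of column values plus pre-aggregated metadata."""
    values: List[int]
    minimum: int
    maximum: int
    total: int


def build_packs(column: List[int], pack_size: int) -> List[Pack]:
    """Split a column into fixed-size packs and summarize each one."""
    packs = []
    for start in range(0, len(column), pack_size):
        chunk = column[start:start + pack_size]
        packs.append(Pack(chunk, min(chunk), max(chunk), sum(chunk)))
    return packs


def sum_at_least(packs: List[Pack], threshold: int) -> Tuple[int, int]:
    """SUM(value) WHERE value >= threshold; also count packs actually scanned."""
    total = 0
    scanned = 0
    for pack in packs:
        if pack.maximum < threshold:
            continue                 # no row can match: skip the pack
        if pack.minimum >= threshold:
            total += pack.total      # every row matches: metadata suffices
            continue
        scanned += 1                 # mixed pack: must read the values
        total += sum(v for v in pack.values if v >= threshold)
    return total, scanned


packs = build_packs([1, 2, 3, 10, 11, 12, 4, 20, 5], pack_size=3)
print(sum_at_least(packs, threshold=10))  # prints (53, 1)
```

Only one of the three packs is read in full; the other two are answered from metadata alone, which is the effect the slide attributes to pre-aggregation.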
Ruby, Node.js, Python
• Polyglottany doesn’t only apply to data stores
• Each language has its own benefit to each data storage layer
• Each language has its own individual benefits
• JSON, APIs, Performance
Choice
Cons
• Redis - Can only utilize a single core. SerDe price.
• MySQL Column Store - DELETE/UPDATEs are VERY expensive
• Cassandra - No B-tree indexes
• Mongo - Indexes must fit in memory. Forced Replica ping times
• Python - Whitespace. Community
• Ruby - Not high performance enough for our standards
• JavaScript (Node.js) - Bad for CPU or IO intensive workloads
Tying It Together
Even with the right tools, 80% of the work of building a big data system is acquiring and refining the raw data into usable data.
Tying It Together
• Service Oriented Architecture (Internal API)
• Data accuracy checks: visual and programmatic
• Built framework for testing out storage engines
• Access to many toolsets (for all languages and DBs)
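One way to picture an internal API plus a framework for trying out storage engines is a shared interface with interchangeable backends and a check that they agree. The slides do not describe SimpleReach's actual framework, so everything named here (`CounterStore`, `DictStore`, `LogStore`, `replay`, `mismatched_keys`, the event shape) is hypothetical, and both backends are in-memory stand-ins.

```python
from typing import Dict, Iterable, List, Protocol, Tuple

Event = Tuple[str, int]  # (counter key, amount); a hypothetical event shape


class CounterStore(Protocol):
    """Hypothetical internal-API contract that every backend must satisfy."""

    def increment(self, key: str, amount: int) -> None: ...
    def get(self, key: str) -> int: ...


class DictStore:
    """Backend that keeps a running total per key."""

    def __init__(self) -> None:
        self._totals: Dict[str, int] = {}

    def increment(self, key: str, amount: int) -> None:
        self._totals[key] = self._totals.get(key, 0) + amount

    def get(self, key: str) -> int:
        return self._totals.get(key, 0)


class LogStore:
    """Backend that appends every event and sums on read."""

    def __init__(self) -> None:
        self._log: List[Event] = []

    def increment(self, key: str, amount: int) -> None:
        self._log.append((key, amount))

    def get(self, key: str) -> int:
        return sum(amount for k, amount in self._log if k == key)


def replay(stores: Dict[str, CounterStore], events: Iterable[Event]) -> None:
    """Send the same events to every backend under test."""
    for key, amount in events:
        for store in stores.values():
            store.increment(key, amount)


def mismatched_keys(stores: Dict[str, CounterStore],
                    keys: Iterable[str]) -> List[str]:
    """Programmatic accuracy check: keys on which the backends disagree."""
    return [k for k in keys if len({s.get(k) for s in stores.values()}) > 1]


stores: Dict[str, CounterStore] = {"dict": DictStore(), "log": LogStore()}
replay(stores, [("article-1", 1), ("article-2", 5), ("article-1", 2)])
print(mismatched_keys(stores, ["article-1", "article-2"]))  # prints []
```

Because callers depend only on the interface, a new engine can be evaluated by adding one class and replaying the same events through it.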
Service Architecture
[Diagram labels: Internal API, Analytics, Real-time, C*]
Distributed Architecture
US-EAST-1a
MONGO-SHARD-0001-B
MONGO-SHARD-0000-A
CASSANDRA-0001
CASSANDRA-0010
REDIS-0001A
MYSQL-0001
iAPI-0001
US-EAST-1b
MONGO-SHARD-0002-B
MONGO-SHARD-0001-A
CASSANDRA-0002
CASSANDRA-0011
REDIS-0001B
iAPI-0002
US-EAST-1e
MONGO-SHARD-0002-A
MONGO-SHARD-0000-B
CASSANDRA-0003
CASSANDRA-0012
MYSQL-0002
iAPI-0003
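The layout above can be checked mechanically for zone spread. The sketch below copies the hostnames from the slide and makes two assumptions the slide does not state: that the prefix before the first hyphen names the service, and that the trailing `-A`/`-B` on Mongo hosts marks two members of the same shard. The helper names are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

# Hostnames as listed on the slide, grouped by availability zone.
LAYOUT: Dict[str, List[str]] = {
    "US-EAST-1a": ["MONGO-SHARD-0001-B", "MONGO-SHARD-0000-A", "CASSANDRA-0001",
                   "CASSANDRA-0010", "REDIS-0001A", "MYSQL-0001", "iAPI-0001"],
    "US-EAST-1b": ["MONGO-SHARD-0002-B", "MONGO-SHARD-0001-A", "CASSANDRA-0002",
                   "CASSANDRA-0011", "REDIS-0001B", "iAPI-0002"],
    "US-EAST-1e": ["MONGO-SHARD-0002-A", "MONGO-SHARD-0000-B", "CASSANDRA-0003",
                   "CASSANDRA-0012", "MYSQL-0002", "iAPI-0003"],
}


def zones_by_service(layout: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Map each service prefix (text before the first '-') to its zones."""
    result: Dict[str, Set[str]] = defaultdict(set)
    for zone, hosts in layout.items():
        for host in hosts:
            result[host.split("-")[0]].add(zone)
    return dict(result)


def zones_by_shard(layout: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Map each Mongo shard number to the zones holding its members."""
    result: Dict[str, Set[str]] = defaultdict(set)
    for zone, hosts in layout.items():
        for host in hosts:
            match = re.fullmatch(r"MONGO-SHARD-(\d+)-[A-Z]", host)
            if match:
                result[match.group(1)].add(zone)
    return dict(result)


for service, zones in sorted(zones_by_service(LAYOUT).items()):
    print(service, sorted(zones))
for shard, zones in sorted(zones_by_shard(LAYOUT).items()):
    print("shard", shard, sorted(zones))
```

Under those assumptions every service runs in at least two zones and no shard has both members in the same zone, so losing one zone leaves a copy of each.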
Points To Consider
• Data consistency - Same in all data stores
• How important is data durability?
• Managing many servers (Chef, AWS, CSSH)
• Managing and learning many different applications and tuning for them
• Expertise
Expertise
• What happens when you need help?
• How do you become experts?
• What happens when you need more experts?
Summary
• Polyglottany is not a sin
• Know your data read/write patterns
• Know the tools available to you
• Know your compromises
• Expertise
We’re Hiring
Questions are guaranteed in life. Answers aren’t.
Eric Lubow
@elubow
elubow@simplereach.com
#MongoBoston
Thank you.