C* Summit 2013: Buy It Now! Cassandra at eBay by Jay Patel
DESCRIPTION
This session will cover various use cases for Cassandra at eBay. It’ll start with an overview of eBay’s heterogeneous data platform, composed of SQL & NoSQL databases, and where Cassandra fits into it. For each use case, Jay will go into detail on system design, data model & multi-datacenter deployment. To conclude, Jay will summarize the best practices that guide Cassandra utilization at eBay.
TRANSCRIPT
Cassandra @ eBay Jay Patel Architect, Platform Systems @pateljay3001
eBay Marketplaces
Thousands of servers Petabytes of data
Billions of SQLs/day 24x7x365 99.98+% Availability
turning over a TB every second Multiple Datacenters
Near-Real-time Always online
400+ million items for sale
$75 billion+ per year in goods are sold on eBay
Big Data
112 million active users
Billions of page views/day
eBay Site Data Infrastructure
Don’t force! One size does not fit all.
It’s a mixture of multiple SQL & NoSQL databases. We use the right database for the right problem.
eBay Site Data Infrastructure A heterogeneous mixture
Thousands of nodes > 2K sharded logical host > 16K tables > 27K indexes > 140 billion SQLs/day > 5 PB provisioned
Hundreds of nodes Persistent & in-memory > 40 billion SQLs/day
10+ clusters, 100+ nodes > 250 TB provisioned (local HDD + shared SSD) > 9 billion writes/day > 5 billion reads/day
Hundreds of nodes > 50 TB > 2 billion ops/day
Thousands of nodes The world’s largest cluster, with 2K+ nodes
Dozens of nodes
How do we scale RDBMS?
Shard
– Patterns: Modulus, lookup-based, range, etc.
– Application sees only logical shard/database
Replicate
– Disaster recovery, read availability & read scalability
Big NOs
– No transactions
– No joins
– No referential integrity constraints
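The three shard-routing patterns named above (modulus, lookup-based, range) can be sketched as follows. This is a minimal illustration, not eBay's routing layer: the shard count, the lookup table and the range boundaries are all hypothetical.

```python
import bisect

NUM_SHARDS = 4  # hypothetical shard count


def modulus_shard(user_id: int) -> int:
    """Modulus pattern: the shard is the key modulo the shard count."""
    return user_id % NUM_SHARDS


# Lookup pattern: an explicit key-to-shard directory (hypothetical entries).
LOOKUP_TABLE = {"alice": 0, "bob": 2}


def lookup_shard(user_name: str) -> int:
    """Lookup pattern: consult a directory; unknown keys raise KeyError."""
    return LOOKUP_TABLE[user_name]


# Range pattern: exclusive upper bounds of the key range held by shards 0..2.
# Shard 3 takes every key at or above the last bound.
RANGE_UPPER_BOUNDS = [1000, 2000, 3000]


def range_shard(item_id: int) -> int:
    """Range pattern: pick the first range whose upper bound exceeds the key."""
    return bisect.bisect_right(RANGE_UPPER_BOUNDS, item_id)
```

In each case the application only asks "which logical shard holds this key?"; the mapping from logical shard to physical host can change underneath without touching callers.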
Why Cassandra?
Multi-datacenter (active-active)
Always Available - No SPOF
Easy to scale up & down
Write performance
Distributed counters
Hadoop support
Not replacing RDBMS, but complementing!
Some use cases don’t fit well in RDBMS - sparse data, big data, flexible schema, real-time analytics, …
Many use cases don’t need top-tier set-ups.
Cassandra Growth
[Chart: growth from Aug 2011 to Aug 2012 to May 2013. One axis: billions per day (ticks 1 to 7) for writes, async. reads and sync. site reads. Other axis: terabytes of storage capacity (ticks 50 to 350).]
Doesn’t predict business
eBay Use Cases on Cassandra
Time-series data, real-time insights & immediate actions
• Fraud detection & prevention
• Quality Click Pricing for affiliates
• Order & shipment tracking and insights
• Mobile notification logging & tracking
• Cloud CMS change history storage
• RedLaser server logs and analytics
Server metrics collection for monitoring & alerting
Taste graph based next-gen recommendation system
Personalization Data Service
Social Signals on eBay Product & Item pages
Milo’s store-item availability inventory (evaluation phase)
Real-time insights & actions for
Fraud Prevention Reporting
Quality Click Pricing More…
System Overview
Business Event Stream
Checkout Shipping Refund & Recoup …
Order placed (bin/bid)
Paid Shipped Refunded
Raw data
Simple in-memory aggregations +/ Complex Event Processing +/
Cassandra’s distributed counters
Labels printed per day per user
User segmentation for affiliate pricing
Orders per hour, …
Multiple Cassandra clusters
Payment
Act in real-time
Fraud Prevention
Affiliate Pricing Engine (eBay Partner Network)
Order tracking
Real-time reporting
… (Kept from several months to years)
A glimpse of the Data Model
Historic & real-time insights per user per carrier. Sudden & drastic change might be suspicious.
User bucketing based on historic & real-time buying activity.
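The per-user, per-carrier counting described above can be sketched with time-bucketed counters. The slide images showing the actual column families are not part of this transcript, so the key layout, the function names and the spike threshold below are assumptions; an in-memory dictionary stands in for Cassandra's distributed counters.

```python
from collections import defaultdict
from datetime import date

# (user_id, carrier) -> {day: count}; stands in for a counter column family.
label_counts = defaultdict(lambda: defaultdict(int))


def record_label_printed(user_id: str, carrier: str, day: date) -> None:
    """Increment the per-day counter for one user and carrier."""
    label_counts[(user_id, carrier)][day] += 1


def is_suspicious(user_id: str, carrier: str, day: date, factor: float = 5.0) -> bool:
    """Flag a day whose count exceeds `factor` times the historic daily mean."""
    buckets = label_counts[(user_id, carrier)]
    history = [count for d, count in buckets.items() if d < day]
    if not history:
        return False  # no baseline to compare against
    baseline = sum(history) / len(history)
    return buckets.get(day, 0) > factor * baseline
```

Keeping one counter per day under a (user, carrier) key makes both the historic view and the real-time view a read of the same row, which is what lets a sudden change stand out.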
A glimpse of the Data Model
Fraud Detection & Prevention
Shop with Confidence
System Overview
Cassandra
Fraud Detection & Prevention System
Sign-in info
Business events (checkout, sell,…)
StaaS Oracle
Checkout Shipping … Payment Selling
Real-time Beacons data
Real-time Insights
Other data
Machine Learned Models
A glimpse of the Data Model
Collected at sign-in & stored as key-value.
Pulled periodically to StaaS for training machine learned models.
Metrics collection for monitoring & alerting
System Overview
Transport (HTTP, …)
Scalable NIO servers based on Netty
Thousands of production machines
Cassandra
Stats for CPU, Memory, Disk, ..
…
agent agent agent agent …
Server Server Server Server Server
In-memory grid (Hazelcast) for rollups
A glimpse of the Data Model
Granular data points
Rolled up metrics for various time intervals
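Rolling granular data points up into coarser intervals can be sketched as below. This is a hedged illustration: the interval sizes, the choice of the mean as the rollup function and the function name are assumptions, since the transcript does not show the actual column families.

```python
from collections import defaultdict


def roll_up(points, interval_seconds):
    """Average (timestamp, value) points into fixed-width time buckets.

    Returns {bucket_start_timestamp: mean_value}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ts, value in points:
        bucket = ts - (ts % interval_seconds)  # align to the interval start
        sums[bucket] += value
        counts[bucket] += 1
    return {bucket: sums[bucket] / counts[bucket] for bucket in sums}


# Hypothetical CPU samples every 30 seconds: (epoch seconds, percent busy).
samples = [(0, 10.0), (30, 20.0), (60, 30.0), (90, 50.0)]
per_minute = roll_up(samples, 60)   # {0: 15.0, 60: 40.0}
per_hour = roll_up(samples, 3600)   # {0: 27.5}
```

Storing the rolled-up series alongside the granular one lets a dashboard read a long time range at low resolution without scanning every raw point.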
Taste graph based recommendation system
Data Model
Taste Graph
Taste Vector
50 billion+ edges, 600 million+ writes, 3 billion+ reads, 30TB+ of data on SSD
System Overview
Business Event Stream
Recommendation system
Taste Graph Taste Vector
1. Item purchased.
2a. Write purchase edge. 2b. Read other edges for this user & item.
4. Req. recommendations.
5. Finds other items close to user’s coordinates.
6. Reco. shown to user
More, http://www.slideshare.net/planetcassandra/e-bay-nyc
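Step 5 above, finding items close to the user's coordinates, amounts to a nearest-neighbour search in the taste-vector space. A minimal sketch, assuming Euclidean distance and a brute-force scan; the transcript does not say which distance measure or index the real system uses, and all vectors below are made up.

```python
import math


def nearest_items(user_vector, item_vectors, k=2):
    """Return the ids of the k items closest to the user's coordinates."""
    ranked = sorted(
        item_vectors,
        key=lambda item_id: math.dist(user_vector, item_vectors[item_id]),
    )
    return ranked[:k]


# Hypothetical 2-dimensional taste vectors.
items = {"camera": (1.0, 1.0), "lens": (1.5, 1.0), "sofa": (9.0, 8.0)}
recommended = nearest_items((1.2, 1.0), items)  # ["camera", "lens"]
```

A full scan like this is only workable for a toy data set; at the scale quoted on the Data Model slide some form of precomputation or indexing would be needed, which the transcript does not describe.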
Real-time Personalization Data Service
User performs search using keyword
User gets personalized pages based on implicit/explicit profile
System Overview
Personalization Data Service
CacheMesh (write-back cache)
Heavy writes
eBay site pages (personalized)
Every few mins
in-memory MySQL & XMP DB
Cassandra Oracle (scaled out)
Heavy reads
Cache miss
user profiles
Application SOA services (multiple)
Data Warehouse
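One reading of this diagram is a write-back pattern: heavy writes land in a cache that is flushed to the backing store every few minutes, and reads fall through to the store on a cache miss. The sketch below illustrates that pattern only. It is not CacheMesh's API: the class, its methods and the dictionary used as the backing store are hypothetical.

```python
class WriteBackCache:
    """Buffers writes in memory and persists them in batches on flush()."""

    def __init__(self, store):
        self.store = store   # backing store; a dict stands in for the database
        self.cache = {}      # cached entries
        self.dirty = set()   # keys written since the last flush

    def put(self, key, value):
        """Write to the cache only; the store is updated on the next flush."""
        self.cache[key] = value
        self.dirty.add(key)

    def get(self, key):
        """Serve from the cache; on a miss, load from the store and cache it."""
        if key not in self.cache:
            self.cache[key] = self.store[key]
        return self.cache[key]

    def flush(self):
        """Persist all dirty entries and return how many were written.

        A scheduler would call this periodically (every few minutes).
        """
        for key in self.dirty:
            self.store[key] = self.cache[key]
        flushed = len(self.dirty)
        self.dirty.clear()
        return flushed
```

The design trades durability for write throughput: anything written since the last flush exists only in memory until `flush()` runs.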
Data Model
• Keep column names short.
• Don’t overload one CF with all the data:
– Split hot & cold data into separate CFs.
– Splitting & sharding can help compaction.
Static column families
Served by Cassandra
Social Signals
Manage signals via “Your Favorites”
Whole page is served by Cassandra
More, http://www.slideshare.net/jaykumarpatel/cassandra-at-ebay-13920376
Multi-Datacenter Deployment
Topology - NTS
RF - 1:1 or 2:2 or 3:3
Read CL - ONE/QUORUM
Write CL - ONE
Data is backed up periodically to protect against human or software error
User request has no datacenter affinity
Non-sticky load balancing
Multi-Datacenter Deployment
Topology - NTS
RF - 1:1:1 or 2:2:2
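The trade-off between the replication factors and consistency levels listed on these slides can be checked with a little arithmetic. A read is guaranteed to overlap the latest write only when the replicas read plus the replicas written exceed the total replica count. The sketch below applies that rule to the slides' settings; the function names are hypothetical, and QUORUM is taken as a majority of the replicas across all datacenters.

```python
def quorum(total_replicas: int) -> int:
    """Majority of all replicas across datacenters."""
    return total_replicas // 2 + 1


def replicas_for(level: str, total_replicas: int) -> int:
    """Number of replicas that must respond for a consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return quorum(total_replicas)
    raise ValueError(f"unsupported consistency level: {level}")


def overlaps(read_cl: str, write_cl: str, rf_per_dc) -> bool:
    """True when every read is guaranteed to see the latest write (R + W > N)."""
    n = sum(rf_per_dc)
    return replicas_for(read_cl, n) + replicas_for(write_cl, n) > n


overlaps("QUORUM", "ONE", (3, 3))     # False: 4 + 1 is not greater than 6
overlaps("QUORUM", "QUORUM", (3, 3))  # True: 4 + 4 is greater than 6
overlaps("ONE", "ONE", (2, 2, 2))     # False: 1 + 1 is not greater than 6
```

With write CL ONE at RF 3:3, neither read level guarantees overlap, so those settings favour latency and availability over immediate consistency.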
Lessons & Best Practices
• One size does not fit all
– Use Cassandra for the right use cases.
• Choose proper Replication Factor and Consistency Level
– They alter latency, availability, durability, consistency and cost.
– Cassandra supports tunable consistency, but remember strong consistency is not free.
• Many ways to model data in Cassandra
– The best way depends on your use case and query patterns.
• De-normalize and duplicate for read performance
– But don’t de-normalize if you don’t need to.
http://www.slideshare.net/jaykumarpatel/cassandra-data-modeling-best-practices
Are you excited? Come Join Us!
Thank You @pateljay3001
#cassandra13