brian bulkowski : what startups can learn from real-time bidding
DESCRIPTION
Presentation about the technical issues of scaling out that apply to startup CTOs. https://ti.to/startup-cto-summit/sf
TRANSCRIPT
© 2014 Aerospike. All rights reserved. Confidential 1
What Startups Can Learn from Real-time Bidding
Or
“10 times faster, really?”
Brian Bulkowski, CTO and co-founder
Aerospike
Who am I ?
■ TRS-80, PC, Apple II, Vax 11/70, Wang
  ■ First product: lightpen university teaching kiosk
  ■ Networks: computers without people are boring
■ Liberate / NetComputer through the boom
  ■ 10B market cap in 1999, employee 32
■ 2003-2007 “time off” ( startups )
■ Citrusleaf / Aerospike history
  ■ 42-year-old first-time CEO (me)
  ■ 2008 Prototype
  ■ 2010 First sale, get the band back together
  ■ 2011+ 3 rounds of funding (Draper, ALP, NEA, CNTP)
  ■ 70 employees, 2 offices
[email protected]@aerospike.com
@bbulkow
Advertising Technology Stack
MILLIONS OF CONSUMERS
BILLIONS OF DEVICES
APP SERVERS
DATA WAREHOUSE
INSIGHTS
WRITE CONTEXT
In-memory NoSQL
WRITE REAL-TIME CONTEXT
READ RECENT CONTENT
PROFILE STORE: Cookies, email, deviceID, IP address, location, segments, clicks, likes, tweets, search terms...
REAL-TIME ANALYTICS: Best sellers, top scores, trending tweets
BATCH ANALYTICS: Discover patterns, segment data: location patterns, audience affinity
Introduction to Advertising: Real-time Bidding
North American RTB speeds & feeds
■ 1 to 6 billion cookies tracked
  ■ Some companies track 200M, some track 20B
■ Each bidder has their own data pool
  ■ Data is your weapon
  ■ Recent searches, behavior, IP addresses
  ■ Audience clusters (K-cluster, K-means) from offline Hadoop
■ “Remnant” from Google, Yahoo is about 0.6 million / sec
■ Facebook exchange: about 0.6 million / sec
■ “other” is 0.5 million / sec
Currently about 3.0M / sec in North America
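To put the headline rate in daily terms, here is a small arithmetic sketch. The 3.0M / sec figure is the one quoted on the slide; the assumption that the rate holds steady around the clock is mine, and real traffic will peak and dip.

```python
# Rate quoted on the slide; the flat 24-hour profile is an assumption.
BID_REQUESTS_PER_SEC = 3.0e6
SECONDS_PER_DAY = 24 * 60 * 60

requests_per_day = BID_REQUESTS_PER_SEC * SECONDS_PER_DAY
print(f"{requests_per_day:.3e} bid requests per day")
```

Under that assumption the daily volume lands in the hundreds of billions of requests.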
Financial Services – Intraday Positions
LEGACY DATABASE (MAINFRAME)
Read/Write
Start of Day Data Loading
End of Day Reconciliation
Query
REAL-TIME DATA FEED
ACCOUNT POSITIONS
XDR
10M+ user records
Primary key access
1M+ TPS planned
Finance App
Records App
RT Reporting App
Social Media
MYSQL or POSTGRES (ROTATIONAL DISK)
Recent user generated content
Java application tier
Data abstraction and sharding
MODIFIED REDIS (SSD ENABLED)
Content and Historical data
Travel Portal
PRICING DATABASE (RATE LIMITED)
Poll for Pricing Changes
PRICING DATA
Store Latest Price
SESSION MANAGEMENT
Session Data
Read Price
XDR
Airlines forced interstate banking
Legacy mainframe technology
Multi-company reservation and pricing
Requirement: 1M TPS allowing overhead
Travel App
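The travel portal diagram describes a poll-and-cache pattern: a rate-limited pricing source is polled for changes, the latest price is stored, and the travel app reads from that store. A minimal sketch of that shape follows. The class name, the `fetch_price` callback, and the per-route interval are all hypothetical; the slide does not show an implementation, and a production store would be a shared database rather than a local dictionary.

```python
import time
from typing import Callable, Dict, Optional


class LatestPriceCache:
    """Keeps the most recent price per route so reads never hit the source."""

    def __init__(self, fetch_price: Callable[[str], float],
                 min_interval_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch_price = fetch_price        # hypothetical call into the pricing database
        self._min_interval_s = min_interval_s  # assumed rate limit, per route
        self._clock = clock
        self._prices: Dict[str, float] = {}
        self._last_poll: Dict[str, float] = {}

    def poll(self, route: str) -> bool:
        """Refresh one route if the rate limit allows; return True if polled."""
        now = self._clock()
        last = self._last_poll.get(route)
        if last is not None and now - last < self._min_interval_s:
            return False                       # too soon: respect the rate limit
        self._prices[route] = self._fetch_price(route)
        self._last_poll[route] = now
        return True

    def read_price(self, route: str) -> Optional[float]:
        """Serve the travel app from memory; None if the route was never polled."""
        return self._prices.get(route)
```

The point of the split is that read traffic scales with the cache, while the rate-limited source only ever sees the polling traffic.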
QoS & Real-Time Billing for Telcos
■ In-switch per-HTTP-request billing
  ■ US Telcos: 200M subscribers, 50 metros
■ In-memory use case
SOURCE DEVICE / USER
Hot Standby
Execute Request
Real-time Checks
DESTINATION
Update Device User Settings
Request
XDR
Real-time Auth. QoS Billing
Config Module App
Old Architecture ( scale out in 2000 )
Request routing and sharding
APP SERVERS
CACHE
DATABASE
STORAGE
CONTENT DELIVERY NETWORK
LOAD BALANCER
Modern Scale Out Architecture
Load balancer
Simple stateless
APP SERVERS
IN-MEMORY NoSQL
RESEARCH WAREHOUSE
CONTENT DELIVERY NETWORK
LOAD BALANCER
Long term cold storage
Fast stateless
HDFS BASED
How Fast You Can Go
( a few graphs )
YCSB Performance Comparison 2014
Hot Analytics
■ High throughput queries
  ■ 2 node cluster, 10 indexes
  ■ Query returns 100 of 50M records
■ Predictable low latency
UN-PREDICTABLE LATENCY
128 – 300 ms
70 – 760 ms
7 – 10 ms
QPS
Amazon EC2 results
Mo’ speed, mo’ problems
I don’t need that much speed ( you will! )
“Ferrari speed” is bad ( but with Camry reliability? )
I don’t believe you ( simple benchmark tooling )
Amazon will save me ( multicloud )
( sell to API, platform companies )
Lessons Learned
Coding standards
( hiring is the obvious problem )
Memory matters – the new coding style
CPU is free
Memory is expensive
Malloc is the ultimate enemy
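One common way to act on "malloc is the enemy" is to allocate buffers once at startup and recycle them. The sketch below shows only the shape of that idea. `BufferPool` and its methods are hypothetical names, not anything from the talk, and Python hides the real allocator, so in C the same pattern would wrap `malloc`/`free` directly.

```python
from typing import List


class BufferPool:
    """Preallocates fixed-size buffers once and recycles them."""

    def __init__(self, count: int, size: int) -> None:
        self._size = size
        self._free: List[bytearray] = [bytearray(size) for _ in range(count)]
        self.allocations = count          # buffers ever allocated by this pool

    def acquire(self) -> bytearray:
        """Hand out a recycled buffer; allocate only if the pool is empty."""
        if self._free:
            return self._free.pop()
        self.allocations += 1             # pool exhausted: fall back to allocating
        return bytearray(self._size)

    def release(self, buf: bytearray) -> None:
        """Return a buffer for reuse; the next user is expected to overwrite it."""
        self._free.append(buf)
```

Watching `allocations` stay flat under load is a simple check that the hot path has stopped allocating.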
Multithreading and reference counting
“we multithread so you don’t have to”
Hire old embedded guys
Build reference counted libraries
Memory access is the enemy
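As an illustration of what a reference counted library provides, here is a minimal thread-safe sketch. The `RefCounted` class, its `reserve`/`release` names, and the `on_destroy` callback are assumptions made for the example; the slide does not describe a specific API, and a C implementation would typically use atomic increments rather than a lock.

```python
import threading
from typing import Callable


class RefCounted:
    """Shared object that is destroyed when the last holder releases it."""

    def __init__(self, on_destroy: Callable[[], None]) -> None:
        self._count = 1                   # the creator holds the first reference
        self._lock = threading.Lock()
        self._on_destroy = on_destroy

    def reserve(self) -> None:
        """Take an additional reference before handing the object to another thread."""
        with self._lock:
            if self._count <= 0:
                raise RuntimeError("object already destroyed")
            self._count += 1

    def release(self) -> int:
        """Drop one reference; run the destructor when the count reaches zero."""
        with self._lock:
            if self._count <= 0:
                raise RuntimeError("release without matching reserve")
            self._count -= 1
            remaining = self._count
        if remaining == 0:
            self._on_destroy()            # called outside the lock, exactly once
        return remaining
```

Each thread pairs every `reserve` with a `release`, so no thread has to know who frees the object.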
Clients are hard
Creative corner cutting (opinionated)
Server restart time doesn’t matter if the code is reliable
Hash collisions don’t matter if the hash function hasn’t had a collision (RIPEMD-160)
Rotational disk is dead ( correct for analytics )
Data commit doesn’t matter if the app server crashed
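The hash collision point can be made concrete with the standard birthday approximation. This sketch assumes the 160-bit digests behave like uniformly random values, and it borrows the 20 billion key count from the largest cookie pool mentioned earlier in the deck; both are assumptions for illustration, not measurements.

```python
def collision_probability(num_keys: int, hash_bits: int) -> float:
    """Birthday approximation: p is roughly n(n-1)/2 divided by 2**bits."""
    pairs = num_keys * (num_keys - 1) // 2
    return pairs / 2 ** hash_bits


# 20 billion keys is an assumed workload size; 160 is the digest width.
p = collision_probability(20_000_000_000, 160)
print(f"approximate collision probability: {p:.2e}")
```

At that scale the estimate is on the order of 1e-28, which is why the slide treats a collision as something not worth designing around.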
Aerospike’s Flash Experience
■ Know your Flash
  ■ ACT benchmark http://github.com/aerospike/act
  ■ Read-write benchmark results back to 2011
■ All clouds support flash now
  ■ New EC2 instances
  ■ Google Compute
  ■ Internap, Softlayer, GoGrid…
■ Write durability usually not a problem with modern flash
  ■ Durability is high (5 “drive writes per day” for 5 years, etc.)
  ■ Read performance suffers under write load anyway
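A quick worked example shows what a "5 drive writes per day for 5 years" rating means in bytes. The 800 GB capacity is an assumption taken from the density figure elsewhere in this deck, the helper name is hypothetical, and the calculation ignores write amplification.

```python
def rated_writes_tb(capacity_gb: float, drive_writes_per_day: float, years: float) -> float:
    """Total host writes covered by an endurance rating, in terabytes."""
    return capacity_gb * drive_writes_per_day * 365 * years / 1000.0


# Assumed 800 GB device at the rating quoted on the slide.
total_tb = rated_writes_tb(800, 5, 5)

# Sustained write rate that would exhaust the rating exactly on schedule.
sustained_mb_per_s = 800 * 5 * 1000 / (24 * 60 * 60)

print(f"{total_tb:.0f} TB rated writes, {sustained_mb_per_s:.1f} MB/s sustained")
```

Under these assumptions the rating covers several petabytes of writes, or tens of megabytes per second sustained for the full five years.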
Aerospike’s Flash Experience
■ Densities increasing
  ■ 100G 2 years ago, 800G today
  ■ SATA vs PCI-E
  ■ Appliances: 50T per 1U this year
■ Prices still dropping: perhaps $1/G next year
■ Intel P3700 results
  ■ 250K per device @ $2.5 / G
  ■ Old standard: Micron P320h 500K @ $8 / G
■ “Wide SATA”
  ■ 20 SATA drives
  ■ LSI “pass through mode”
  ■ 250K+ per server
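The device figures above can be compared with two lines of arithmetic. The slide does not state the unit behind "250K" and "500K"; this sketch assumes they are operations per second, and it treats the "250K+" server figure as a lower bound spread evenly across the 20 drives.

```python
# Price per gigabyte quoted on the slide.
P3700_PRICE_PER_GB = 2.5
P320H_PRICE_PER_GB = 8.0
price_ratio = P320H_PRICE_PER_GB / P3700_PRICE_PER_GB

# "Wide SATA": assumed even split of the server total across drives.
WIDE_SATA_SERVER_OPS = 250_000   # lower bound ("250K+"), unit assumed
WIDE_SATA_DRIVES = 20
ops_per_drive = WIDE_SATA_SERVER_OPS / WIDE_SATA_DRIVES

print(f"old standard costs {price_ratio:.1f}x more per GB")
print(f"each SATA drive needs at least {ops_per_drive:.0f} ops/sec")
```

The older device costs roughly three times as much per gigabyte, and the wide SATA design only asks a modest rate of each individual drive.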
Use Open Source