evolution of a memcached deployment webinar 2010 01 13
DESCRIPTION
Memcached has become a critical tool in the web technology stack. High-traffic web sites with dynamic content, like Facebook, Twitter, and Wikipedia, rely on Memcached to scale and ensure “snappy” site performance. This presentation will cover a brief overview of Memcached, then dive into the evolution of Memcached’s use in dynamic web sites and how you can scale your site and get better performance with Memcached. We’ll also review emerging architectures and tools of high-performance, large-scale dynamic websites. In this webinar you will learn best practices used by some of the hottest sites and get tips on how to avoid potential pitfalls when scaling. Whether you're just building the infrastructure for a brand new site or have a large dynamic site with millions of users, this webinar is for you.
TRANSCRIPT
The Evolution of a Memcached Deployment
Presented by:
Bill Takacs – Director, Product Management
January 2010
Agenda
• Rise of the dynamic web
• The web architecture
• Overview of memcached
• The evolution of a dynamic site and Memcached
• Gear6 Solution
2 : Copyright 2010 Gear6 Inc.
The Web: What’s Changed?
• Population
• Traffic
• Content & Applications
Web Growth: Population
• Forrester: 2.2 billion people online globally by 2013
Web Growth: Traffic
Cisco: “Annual global IP traffic will exceed two-thirds of a zettabyte (667 exabytes) in four years (2013)”
- Cisco Visual Networking Index, 9 June 2009
Web Growth: Application & Content
• Social Networking
• Entertainment
• Media
• Communication
• Community generated content
[Figure labels: Dynamic, Static]
Use of online video sharing has doubled since 2006
Pew Internet & American Life Project, Online participation in the social media era by Aaron Smith Dec 10th, 2009 http://www.pewinternet.org/Presentations/2009/RTIP-Social-Media.aspx
Use of online social networks has grown 6X since 2005
Pew Internet & American Life Project, Online participation in the social media era by Aaron Smith Dec 10th, 2009 http://www.pewinternet.org/Presentations/2009/RTIP-Social-Media.aspx
Web Architecture
➜ Most sites (over 65%) based on LAMP or Java
➜ Industry standard servers replaced proprietary SMP
➜ Shift to Dynamic Content puts strain on origin sites
[Diagram: Web Stack. Components shown: Clients, Internet, CDN, Proxy, Load Balancer, Web Servers (Apache, Nginx, Lighttpd), App Servers (PHP, Java, Rails, C, Perl, Python), Database (MySQL, PostgreSQL), Storage Interface (file, block, FC, SCSI), Net Interface.]
Why Does it Matter?
= $
What to do?
Cache
CACHE
CACHE!
New Caching Architecture for Scaling Out
[Diagram: Web Stack with a cache tier added. Components shown: Clients, Internet, CDN, Proxy, Load Balancer, Web Servers (Apache, Nginx, Lighttpd), App Servers (PHP, Java, Rails, C, Perl, Python), Cache Servers (memcached), Database (MySQL, PostgreSQL), Storage Interface (file, block, FC, SCSI), Net Interface.]
Memcached: Pillar of Web 2.0 Architecture
“Everything runs from Memory in Web 2.0”
» Evan Weaver, Twitter, March 2009
The Fix: Memcached
“A high performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load”
• Big hash table
• Created by Danga Interactive for Live Journal
• Significantly reduced database load
• Perfect for web sites with high database load
• In use by Facebook, Twitter, MyYearBook, others
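The "big hash table" that alleviates database load is usually applied as a look-aside cache: check the cache first, and only on a miss run the expensive query and store its result. A minimal sketch in Python, assuming a dict-backed stand-in (`FakeCache`) rather than a real memcached client and a hypothetical `load_profile_from_db` function; neither name comes from the presentation.

```python
import time


class FakeCache:
    """In-process stand-in for a memcached client (assumption, for illustration)."""

    def __init__(self):
        self._items = {}  # key -> (value, expiry timestamp or None)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._items[key]  # expired entries behave like misses
            return None
        return value

    def set(self, key, value, ttl=None):
        expires_at = time.monotonic() + ttl if ttl else None
        self._items[key] = (value, expires_at)


stats = {"db_calls": 0}


def load_profile_from_db(user_id):
    """Hypothetical expensive database query."""
    stats["db_calls"] += 1
    return {"id": user_id, "name": f"user-{user_id}"}


def get_profile(cache, user_id, ttl=60):
    key = f"profile:{user_id}"
    profile = cache.get(key)
    if profile is None:                # miss: fall through to the database
        profile = load_profile_from_db(user_id)
        cache.set(key, profile, ttl)   # populate so later reads skip the DB
    return profile


cache = FakeCache()
get_profile(cache, 42)
get_profile(cache, 42)                 # served from the cache
print(stats["db_calls"])
```

Only the first call reaches the database function; every later read within the TTL is answered from memory, which is where the reduction in database load comes from.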
More on Memcached
• Takes advantage of available DRAM
• Open source
• Distributed under BSD license
• Server - Current version is 1.4.3
  » http://www.danga.com/memcached/download.bml
• Many clients
  » http://code.google.com/p/memcached/wiki/Clients
Memcached: Best Practices
• Use with MySQL :-)
• Use on 64 bit servers
• Cache “expensive operations”
• Cache bi-directionally (R/W)
• Use consistent hashing
• Instrumentation
• Careful with numbers of connections
• Design to withstand failures gracefully
• Evictions
• Optimize sizing: Instances and pools
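To make the consistent hashing recommendation concrete: servers and keys are hashed onto the same ring, and each key belongs to the first server point at or after its hash. Adding a server then relocates only the keys that fall to the new server, instead of reshuffling almost everything as `hash(key) % server_count` would. The sketch below is an illustration under assumptions: the class name, the MD5-based hash, the 100 virtual points per server, and the server names are all made up here and are not taken from any particular memcached client.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring with virtual nodes (illustrative sketch)."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas   # virtual points per server
        self._points = []          # sorted hash positions on the ring
        self._owner = {}           # hash position -> server name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(text):
        return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)

    def add(self, node):
        for i in range(self.replicas):
            point = self._hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node):
        for i in range(self.replicas):
            point = self._hash(f"{node}#{i}")
            del self._owner[point]
            self._points.remove(point)

    def node_for(self, key):
        # First point at or after the key's hash, wrapping around the ring.
        idx = bisect.bisect_left(self._points, self._hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


keys = [f"user:{n}" for n in range(1000)]
ring = HashRing(["cache-a:11211", "cache-b:11211", "cache-c:11211"])
before = {k: ring.node_for(k) for k in keys}
ring.add("cache-d:11211")
moved = [k for k in keys if ring.node_for(k) != before[k]]
print(f"{len(moved)} of {len(keys)} keys moved")
```

With four servers, roughly a quarter of the keys should move, and every moved key lands on the new server; the rest keep their cached values.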
What Memcached is NOT:
• A persistent data store
• A database
• Application specific
• A large object cache
• No HA Features
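Because memcached is neither persistent nor highly available on its own, application code has to treat the cache as optional: a miss, an eviction, a restart, or an unreachable node should all end in a normal read from the database. A minimal sketch of that idea, assuming a hypothetical `CacheDown` exception and a stand-in `FlakyCache` client (real client libraries signal failures in their own ways).

```python
class CacheDown(Exception):
    """Hypothetical error raised when a cache node is unreachable."""


class FlakyCache:
    """Stand-in client whose node can be switched off (assumption)."""

    def __init__(self):
        self.up = True
        self._items = {}

    def get(self, key):
        if not self.up:
            raise CacheDown(key)
        return self._items.get(key)

    def set(self, key, value):
        if not self.up:
            raise CacheDown(key)
        self._items[key] = value


def fetch(cache, key, load_from_db):
    """Read through the cache, but never depend on it being there."""
    try:
        value = cache.get(key)
    except CacheDown:
        return load_from_db(key)   # cache outage: serve from the source of truth
    if value is None:              # ordinary miss (never cached, evicted, restarted)
        value = load_from_db(key)
        try:
            cache.set(key, value)
        except CacheDown:
            pass                   # failing to cache must not fail the request
    return value


source_of_truth = {"session:1": "alice"}
cache = FlakyCache()
print(fetch(cache, "session:1", source_of_truth.get))
cache.up = False
print(fetch(cache, "session:1", source_of_truth.get))
```

Both calls return the same value; the second simply costs a database read because the cache node is down.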
Memcache Use Cases
Site Type                  Repeatable Use
Social Networking          Profile caching
Content Aggregation        HTML/page caching
Ad Targeting               Cookie/profile tracking
Gaming/Entertainment       Session caching
Location-based Services    DB query scaling
Relationship               Session caching
Ecommerce                  Session & HTML caching
Evolution of a Dynamic Site #1
A day in the life of a growing web service
[Diagram: App Server nodes and MySQL servers, connected by read and write paths.]
Evolution of a Dynamic Site #2
A day in the life of a growing web service
[Diagram: App Server nodes, memcached instances, and MySQL servers, connected by read and write paths.]
Evolution of a Dynamic Site #3
A day in the life of a growing web service
[Diagram: App Server nodes, a larger number of memcached instances, and MySQL servers, connected by read and write paths.]
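The diagrams for these stages label both read and write paths, and the best-practices slide recommends caching bi-directionally (R/W). One common way to do that is write-through: every write goes to the database and also refreshes the cached copy, so the next read does not have to hit the database. The sketch below is one possible reading, not a reconstruction of the slides; both stores are plain dicts and all names are assumptions.

```python
class Store:
    """Pairs a database with a cache; both are plain dicts here (assumption)."""

    def __init__(self):
        self.db = {}        # stand-in for MySQL
        self.cache = {}     # stand-in for a memcached pool
        self.db_reads = 0

    def write(self, key, value):
        self.db[key] = value       # the database stays the source of truth
        self.cache[key] = value    # write-through: refresh the cached copy too

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        self.db_reads += 1         # only misses reach the database
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value
        return value


store = Store()
store.write("status:7", "at the jazz show")
store.write("status:7", "at the wine tasting")
print(store.read("status:7"), store.db_reads)
```

The read returns the latest value without touching the database, at the cost of one extra cache operation per write.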
Tinker: Follow Topic Streams Instead of People
[Figure: following People versus following Events. Example events shown: Inauguration, Mamma Mia!, LOST, Political protest, Jazz show, Art show, Wine tasting, Fashion Week, Nascar, New Opera, Demo conf.]
Product Challenges
• Large Data Pipe from multiple sources
• Real Time Analysis and Processing
• Exponential growth
• Intra DB traffic growing exponentially
• Significant replication lag on slaves
Tinker Infrastructure – Prototype
Launched March 2009
Glam is traditionally an Oracle house, but:
• Leveraged MySQL for Tinker
• Cost
• Features – Replication / Clustering
Application Servers X2
Configuration
• PHP Front End
• MySQL Database
MySQL:
• 1 Master
• 2 Slaves
• MyISAM
• Replicated
Performance
• Up to 10K users
• Up to 100 queries / second
Tinker – A Modest Start
Master-Master Replication
• 2 Master DBs replicated
  o 1 for Aggregation
  o 1 for Trends
• 3 Slaves for application
• InnoDB to prevent locking
Performance
• Doubled users 20K+
• 1K events
• No increase in qps
CLUSTERING - Reducing Replication
Created Two DB Clusters
Cluster 1 – Main DB
• 1 Master DB for aggregation
• 3 Slaves for application
• 1 Slave for backup
Cluster 2 – Trends DB
• 1 Master
• 1 Slave
• Replicates 3 Tables from master
Performance
• Reduced traffic to slaves by half
• 50K + users
• 3K + events
CACHING - Reducing the Number of Selects
Added Memcached to DB driver layer – caching selects
High Availability Memcached
• Dedicated hardware HA Memcached
• Centralized memcached
• 10GB Cache
• 2 Servers – clustered
• Failover, replication, high availability
• Easy to manage and maintain
Performance
• Reduced the load on slaves by 80%
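Caching selects in the DB driver layer can be pictured as a thin wrapper around the connection: the SQL text plus its parameters form the cache key, and a hit skips the database entirely. The sketch below is an assumption-laden illustration, not Tinker's code: it uses Python's built-in `sqlite3` in place of MySQL, a dict in place of memcached, and a hypothetical `CachingConnection` class. Hashing the query is one way to keep keys short and free of whitespace, which memcached keys require.

```python
import hashlib
import json
import sqlite3


class CachingConnection:
    """Wraps a DB connection and caches SELECT results (illustrative sketch)."""

    def __init__(self, conn):
        self.conn = conn
        self.cache = {}        # stand-in for memcached
        self.db_selects = 0    # how many SELECTs actually reached the database

    @staticmethod
    def _key(sql, params):
        raw = json.dumps([sql, list(params)])
        return "sql:" + hashlib.sha1(raw.encode("utf-8")).hexdigest()

    def select(self, sql, params=()):
        key = self._key(sql, params)
        if key in self.cache:
            return self.cache[key]
        self.db_selects += 1
        rows = self.conn.execute(sql, params).fetchall()
        self.cache[key] = rows
        return rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO events (name) VALUES (?)", ("Fashion Week",))

db = CachingConnection(conn)
query = "SELECT id, name FROM events WHERE name = ?"
first = db.select(query, ("Fashion Week",))
second = db.select(query, ("Fashion Week",))   # answered from the cache
print(first, db.db_selects)
```

Placing the cache at this layer means application code does not change, but it also means cached rows go stale until they expire or are invalidated.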
CACHING – Optimization and Tuning
Smart Caching Strategy
• Based on timeliness of the data
  o 1 min to 1 hour
• Invalidate cache based on user activity
  o Creating events
  o Following events
  o Aggregating post
Improved Queries
• Created appropriate indexes
• Reduced number of queries needed to load a page
• All DB queries on a page must load in <.5 seconds combined
• Eliminate count queries where appropriate
• The largest table Feed_Item was partitioned by date range
  o Reduce table / row locks
Performance
• Load average on Slave DBs dropped from >20 to <1
• Page loads that were 20 – 30 sec now loaded in <1
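The smart caching strategy combines two mechanisms: a time-to-live chosen by how fresh each kind of data must be, and explicit invalidation when user activity makes a cached copy wrong before its TTL runs out. A minimal sketch, assuming made-up data kinds and TTL values within the 1 minute to 1 hour range mentioned on the slide; `TtlCache` and `on_follow_event` are hypothetical names.

```python
import time

# Hypothetical freshness tiers, in seconds, within the 1 min to 1 hour range.
TTL_SECONDS = {"stream": 60, "trend": 600, "profile": 3600}


class TtlCache:
    """Dict-backed cache with per-kind expiry (stand-in for memcached)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}   # (kind, key) -> (value, expiry time)

    def set(self, kind, key, value):
        self._items[(kind, key)] = (value, self._clock() + TTL_SECONDS[kind])

    def get(self, kind, key):
        entry = self._items.get((kind, key))
        if entry is None or self._clock() >= entry[1]:
            self._items.pop((kind, key), None)   # expired: treat as a miss
            return None
        return entry[0]

    def invalidate(self, kind, key):
        self._items.pop((kind, key), None)


def on_follow_event(cache, user_id, event_id):
    """User activity changes the user's stream, so drop the stale copy now."""
    cache.invalidate("stream", user_id)


now = [0.0]                                    # controllable clock for the demo
cache = TtlCache(clock=lambda: now[0])
cache.set("stream", 1, ["LOST"])
cache.set("profile", 1, {"name": "alice"})
now[0] = 61.0                                  # past the 60 s stream TTL
print(cache.get("stream", 1), cache.get("profile", 1))
cache.set("stream", 1, ["LOST", "Nascar"])
on_follow_event(cache, 1, "jazz-show")
print(cache.get("stream", 1))
```

The stream entry disappears either by aging out or by invalidation, while the slower-changing profile entry survives for its full hour.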
Generation - Next
Scaling to 10X and 100X of current DB transactions
Larger Memcached Deployment
• Caching strategy
• Large scale Widget deployment
• MySQL based Memcached invalidation strategy
Failover – clusters in multiple colos with replication
DB Sharding
• Balancing load based on event activity
• Event migration from one cluster to another
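Sharding by event with load balancing and migration implies some directory that records which cluster holds each event. The sketch below shows one way such a directory could work; it is an assumption for illustration, not the planned Tinker design. Cluster names, the activity counter, and the `ShardDirectory` class are all hypothetical, and the actual data copy during migration is left out.

```python
class ShardDirectory:
    """Maps each event to a DB cluster; names and layout are hypothetical."""

    def __init__(self, clusters):
        self.load = {name: 0 for name in clusters}   # total activity per cluster
        self.home = {}                               # event_id -> cluster
        self.activity = {}                           # event_id -> activity count

    def cluster_for(self, event_id):
        if event_id not in self.home:
            # Place new events on the currently least-loaded cluster.
            self.home[event_id] = min(self.load, key=self.load.get)
            self.activity[event_id] = 0
        return self.home[event_id]

    def record_activity(self, event_id, amount=1):
        cluster = self.cluster_for(event_id)
        self.activity[event_id] += amount
        self.load[cluster] += amount

    def migrate(self, event_id, target):
        source = self.cluster_for(event_id)
        self.load[source] -= self.activity[event_id]
        self.load[target] += self.activity[event_id]
        self.home[event_id] = target   # moving the rows themselves is out of scope


directory = ShardDirectory(["cluster-1", "cluster-2"])
directory.record_activity("inauguration", 100)
directory.record_activity("jazz-show", 10)
directory.record_activity("lost", 500)
print(directory.home, directory.load)
directory.migrate("jazz-show", "cluster-1")
print(directory.load)
```

A lookup table like this costs one extra hop per query, but unlike a fixed hash it lets a hot event be moved without touching any other event.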
Useful Memcached Tools
advanced reporter: Track hot keys and clients in Memcached
wireshark: Dissect and analyze Memcached network traffic
brutis: Size and test changes to memcache clusters
statsproxy: View buffered Memcached stats in your browser
cacti: Graph and analyze Memcached statistics
Statsproxy + Cacti Templates
To use the cacti templates for memcached with statsproxy, you either need to modify the templates to use port 8080 or change the statsproxy config to use port 11211
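Port 11211 is memcached's default port; whichever port the graphing templates query, what they consume is the output of memcached's `stats` command, a list of `STAT name value` lines terminated by `END`. As a rough sketch of what such tooling does with that text, the Python below parses a response and derives a hit ratio. The sample values are made up, and `parse_stats` and `hit_ratio` are hypothetical helpers, not part of statsproxy or cacti.

```python
def parse_stats(raw):
    """Parse the text returned by memcached's `stats` command into a dict."""
    stats = {}
    for line in raw.splitlines():
        parts = line.split(" ", 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def hit_ratio(stats):
    """Fraction of get requests answered from the cache."""
    hits = int(stats.get("get_hits", 0))
    misses = int(stats.get("get_misses", 0))
    total = hits + misses
    return hits / total if total else 0.0


# Sample response in the memcached text-protocol format (values are made up).
sample = (
    "STAT curr_connections 12\r\n"
    "STAT get_hits 900\r\n"
    "STAT get_misses 100\r\n"
    "STAT evictions 3\r\n"
    "END\r\n"
)
stats = parse_stats(sample)
print(hit_ratio(stats), stats["evictions"])
```

Hit ratio, evictions, and connection counts are the numbers most directly tied to the sizing and connection advice earlier in the deck.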
About Gear6
• First and leading provider of Memcached solutions
• Memcached solution including
  • High density
  • High Availability
  • Advanced memory management
  • Enhanced reporting capabilities
  • Support for multi-tenancy
  • Disruption free software upgrades
  • 100% client compatible
Gear6 Products
Web Cache – Universal Distribution:
•Software
•Hardware
Web Cache Server – Cloud: Server
Questions?
Thank you for attending “The Evolution of a Memcached Deployment”
by Gear6
Bill Takacs
Gear6
[email protected]
+1 650 587 7118
References
• Danga.com
• Highscalability.com
• Dev.gear6.com
• Groups.google.com/group/memcached
• Code.google.com/p/memcached
• Twitter.com/gearsix
• Cacti.net
• Wireshark.org
• http://dev.mysql.com/doc/mysql-ha-scalability/en/ha-memcached-using-deployment.html
• http://dev.mysql.com/doc/mysql-ha-scalability/en/ha-memcached-using-hashtypes.html
• http://jayant7k.blogspot.com/2009/04/memcached-replication.html
• http://www.lexemetech.com/2007/11/consistent-hashing.html
• http://www8.org/w8-papers/2a-webserver/caching/paper2.html
• http://www.last.fm/user/RJ/journal/2007/04/10/rz_libketama_-_a_consistent_hashing_algo_for_memcache_clients
• http://bazaar.launchpad.net/~libmemcached-developers/libmemcached/trunk/revision/539