scaling at showyou - operations - aphyr · 2011-09-27 ·...
Introduction Storage Processing Monitoring Review
Scaling at Showyou
Operations
September 26, 2011
I’m Kyle Kingsbury
Handle: aphyr
Code: http://github.com/aphyr
Email: [email protected]
Backend, API, ops
What the hell is Showyou?
Nontrivial complexity
Challenges
• Scanning social networks
• Feeds
• Search
• Trends
• Responsive client experience
• Everything fails all the time
Storage
We left MySQL
• Changing the schema requires downtime
• Crashes
• Master-slave lag
• Slow restarts
• Node replacements difficult
• Fully normalized queries complex, slow
MySQL does scale
But there are tradeoffs
Riak
• Key/value store
• Homogeneous
• Scales linearly with nodes
• Excellent durability/recoverability
• Eventually consistent
We use Riak as our durable datastore
• Users, feeds, videos, etc
• Highly denormalized
• Limited MR queries (feeds, etc)
• Latency-bounded MR jobs are Erlang
• Hot-deployable
• Extensive use of conflict resolution
• Made possible by Risky
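To illustrate what conflict resolution means in an eventually consistent store, here is a minimal Python sketch. It is not Risky's API (Risky is a Ruby library); the record shape and the `merge_siblings` helper are hypothetical, and it assumes a feed-like value whose items can safely be merged by set union.

```python
def merge_siblings(siblings):
    """Merge conflicting versions (siblings) of one record into a single value.

    Hypothetical record shape: {"items": [...], "updated_at": int}.
    """
    merged_items = set()
    latest = 0
    for sibling in siblings:
        merged_items |= set(sibling["items"])        # union keeps every write
        latest = max(latest, sibling["updated_at"])  # newest timestamp wins
    return {"items": sorted(merged_items), "updated_at": latest}


# Two clients wrote concurrently, so a read returns both versions.
a = {"items": ["v1", "v2"], "updated_at": 100}
b = {"items": ["v2", "v3"], "updated_at": 105}
resolved = merge_siblings([a, b])
```

Union merging only suits data where losing a delete is acceptable; other record types would need their own merge rule.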
Riak at Showyou
• 51 million keys (153 M replicated)
• 100 GB of data (300 GB replicated)
• 260 gets/sec (baseline)
• 75 puts/sec (baseline)
• Capable of over 3000 ops/sec
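The replicated figures are consistent with three copies of every key. The slide does not state the replication factor, so `n_val = 3` below is inferred from the numbers (153 / 51 and 300 / 100). A quick check of the arithmetic:

```python
keys = 51_000_000
data_gb = 100
n_val = 3  # assumption: inferred from 153 M / 51 M, not stated on the slide

replicated_keys = keys * n_val   # keys stored across the cluster
replicated_gb = data_gb * n_val  # data stored across the cluster

baseline_ops = 260 + 75          # gets/sec + puts/sec at baseline
headroom = 3000 / baseline_ops   # how far peak capacity exceeds baseline
```

Under these figures the cluster's stated capacity is roughly nine times its baseline load.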
SSDs are amazing
WD 7200RPM
• 100 ops/sec
• 95%: 100-300ms
Micron RealSSD P300
• 1000+ ops/sec
• 95%: 3-5ms
When Riak fails,
• Another node takes up the slack
• Clients connected to that node reconnect to others
• Typically no service interruption
• However, latencies may rise
• Especially for MR jobs
Riak has downsides
• Difficult to debug
• Membership changes are dangerous
• Significantly slower than MySQL
• (Bitcask) All keys must fit in memory
• MapReduce is only appropriate for known keys
• List-keys can take down your cluster
Long story short: it’s only a KV store
+Redis
We use Redis for fast, temporary state
• List of users
• List of videos
• Counters
• Queues
Incredibly fast, excellent primitives
When Redis fails,
• Daemons using those indexes pause
• Frontend service continues
• Bitcask scanners and incremental updaters repair any lost data
Eventually consistent.
We also use SOLR extensively
• Supplements Riak
• Complex indices
• Full-text search
• Analytics
More on that later...
Processing
Do one thing well
Lots of small processes handling well-defined tasks
• Easier to debug
• Easier to test
• Simplifies parallelism
• Simplifies error handling
• Less likely to cause total system failure
Minimize Shared State
• Vector clocks for concurrent modification
• Queues for message passing
• Riak for durable storage
• Redis for fast synchronous state
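For readers unfamiliar with vector clocks, this is a minimal Python sketch of how two clocks are compared to detect concurrent modification. It shows the general technique, not Riak's internal representation; the dict-based clock format and the function names are assumptions made for the example.

```python
def descends(a, b):
    """True if clock `a` has seen every event recorded in clock `b`."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a, b):
    """Classify two vector clocks (dicts mapping node id -> counter)."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a is newer"
    if descends(b, a):
        return "b is newer"
    return "concurrent"  # neither saw the other's writes: a real conflict


newer = compare({"x": 2, "y": 1}, {"x": 1, "y": 1})
conflict = compare({"x": 2}, {"y": 1})
```

Only the "concurrent" case needs application-level conflict resolution; in the other cases one version simply supersedes the other.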
Crash by Default
• Someone else will take your work
• Repair constantly
• Assume everybody is out to kill you
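One way to picture "crash by default" is a worker that makes no attempt to recover from errors, paired with a supervisor that simply starts a fresh worker. The Python sketch below is illustrative only: the job names, `process`, and `supervise` are hypothetical, and a real deployment would run separate processes, not a loop.

```python
from collections import deque


def process(job):
    """Hypothetical task; fails loudly on input it cannot handle."""
    if job == "bad":
        raise ValueError("cannot handle job")
    return job.upper()


def run_worker(queue, done):
    """Process jobs until the queue is empty; any error crashes the worker."""
    while queue:
        job = queue.popleft()
        done.append(process(job))


def supervise(queue, done, max_restarts=5):
    """Restart the worker whenever it crashes, up to max_restarts times."""
    restarts = 0
    while queue and restarts <= max_restarts:
        try:
            run_worker(queue, done)
        except Exception:
            # The failed job is dropped here; a separate repair pass
            # would be responsible for picking it up later.
            restarts += 1
    return restarts


queue = deque(["a", "bad", "b"])
done = []
restarts = supervise(queue, done)
```

The worker stays simple because it never tries to continue from a bad state; recovery is the job of whatever restarts it and of the repair processes.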
Distribute
• Multiple threads, processes, hosts
• Failover IPs with Heartbeat
• Rolling restarts mean frequent deploys and nobody notices
• Losing a node is no big deal
• Scaling out is easy
Monitoring
UState: A state aggregator
Receive states over protobufs
Host: backend1.showyou.com
Service: feed merger rate
Time: unix epoch seconds
State: ok
Metric: 12.5
Description: 12.5 feed items/sec
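The fields above can be pictured as a simple record. This Python sketch models a state with those six fields; the field names and sample values come from the slide, while the class itself and the chosen types are assumptions. It does not reproduce UState's actual protobuf schema.

```python
import time
from dataclasses import dataclass


@dataclass
class State:
    host: str         # machine reporting the state
    service: str      # what is being measured
    time: int         # unix epoch seconds
    state: str        # e.g. "ok", "warning", "critical"
    metric: float     # numeric value of the measurement
    description: str  # human-readable summary


example = State(
    host="backend1.showyou.com",
    service="feed merger rate",
    time=int(time.time()),
    state="ok",
    metric=12.5,
    description="12.5 feed items/sec",
)
```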
Query states
• state = "warning" or state = "critical"
• service =~ "api %" and host != null
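As a sketch of what these two queries select, here is an equivalent filter written in plain Python over a list of dicts. This is not UState's query engine. The sample states are made up (apart from the `backend1.showyou.com` host name), and treating `%` as "any run of characters" is an assumption about the pattern syntax.

```python
import re


def like(pattern, value):
    """Match value against a pattern where % means any run of characters (assumed)."""
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return value is not None and re.fullmatch(regex, value) is not None


# Hypothetical sample data.
states = [
    {"host": "backend1.showyou.com", "service": "api latency", "state": "ok"},
    {"host": "backend2.showyou.com", "service": "feed merger rate", "state": "warning"},
    {"host": None, "service": "api errors", "state": "critical"},
]

# state = "warning" or state = "critical"
problems = [s for s in states if s["state"] in ("warning", "critical")]

# service =~ "api %" and host != null
api_states = [
    s for s in states
    if like("api %", s["service"]) and s["host"] is not None
]
```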
• Combine states together (sum, average, ...)
• Send email on changes
• Forward to another UState server
• Forward to Graphite
• Dashboard
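Combining states can be pictured as folding the metrics of several per-host states into one aggregate value. The Python sketch below is a hypothetical illustration, not UState's implementation; the `combine` helper, the second host name, and the sample metrics are invented for the example.

```python
from statistics import mean


def combine(states, service, fn):
    """Apply fn (e.g. sum or mean) to the metrics of all states for one service."""
    metrics = [s["metric"] for s in states if s["service"] == service]
    return fn(metrics)


# Hypothetical per-host states for the same service.
states = [
    {"host": "backend1.showyou.com", "service": "feed merger rate", "metric": 12.5},
    {"host": "backend2.showyou.com", "service": "feed merger rate", "metric": 7.5},
]

total_rate = combine(states, "feed merger rate", sum)     # cluster-wide rate
average_rate = combine(states, "feed merger rate", mean)  # per-host average
```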
Understand application behavior
When can we...?
It’s 23:15 PST.
Do you know where YOUR database is?
http://github.com/aphyr/ustate
Recap
• Robust, discrete components
• Highly distributed
• Message passing
• Eventual consistency
• Comprehensive monitoring
Thanks!
• Basho (esp. Pharkmillups!)
• Formspring
• Bump