Distributed Systems
scalability and high availability
Renato Lucindo - lucindo.github.com - @rlucindo
Distributed System Design
Renato Lucindo
- Call me Lucindo (or Linus)
- 2002 - Bachelor in Computer Science
- 2007 - M.Sc. in Computer Science (Combinatorial Optimization)
- 7+ years developing distributed systems
My default answer: "I don't know."
Agenda
Scalability
High Availability
Problems
Tips and Tricks
Learning More
Distributed Systems
- Multiple computers that interact with each other over a network to achieve a common goal
- Purpose
  - Scalability
  - High availability
source: http://www.cnds.jhu.edu/
Scalability
- A system's ability to gracefully handle a growing amount of work
- Scale up (vertical)
  - Add resources to a single node
  - Improve existing code to handle more work
- Scale out (horizontal)
  - Add more nodes to a system
  - Linear (or better) scalability
Scalability - Vertical
- Add: CPU, memory, disks (bigger box)
- Handle more simultaneous:
  - Connections
  - Operations
  - Users
- Choose a good I/O and concurrency model
  - Non-blocking I/O
  - Asynchronous I/O
  - Threads (single, pool, per-connection)
  - Event handling patterns (Reactor, Proactor, ...)
- Memory model?
  - STM
Scalability - Vertical
- Careful with numbers
  - Requests per second
  - # of connections
  - Simultaneous operations
- Event handling
  - Think front-end
  - Slow connections/clients
  - It's slower than other options
  - In doubt, go async
- Back-end
  - Thread pool (thread per-connection)
  - No events
  - Process per-core
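A minimal sketch of the event-handling idea for a front-end with many slow clients, assuming Python's `asyncio` as the event loop (the slides do not name a language or library). The names `handle_client`, `serve_all` and `run` are hypothetical, and `asyncio.sleep` stands in for waiting on a slow connection.

```python
import asyncio


async def handle_client(client_id, delay_s):
    # Waiting on a slow client is simulated with a sleep. While this
    # coroutine waits, the event loop is free to run the other clients.
    await asyncio.sleep(delay_s)
    return f"client-{client_id}: done"


async def serve_all(n_clients, delay_s):
    # One thread, one event loop, n_clients concurrent "connections".
    tasks = [handle_client(i, delay_s) for i in range(n_clients)]
    return await asyncio.gather(*tasks)


def run(n_clients=1000, delay_s=0.05):
    # Hypothetical entry point: serve every simulated client and
    # return their replies in client order.
    return asyncio.run(serve_all(n_clients, delay_s))
```

All simulated clients wait concurrently on a single thread, so the batch takes roughly one delay rather than the sum of the delays. A thread-per-connection back-end would instead dedicate a thread to each request.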
Scalability - Horizontal
- Add nodes to handle more work
- Front-end
  - Straightforward
  - Stateless
- Back-end
  - Master/Slave(s)
  - Partitioning
    - DHT
    - Volatile Index
Scalability - Horizontal
- Master/Slave
  - Write on a single master
  - Read on slaves (one or more)
  - Scales reads
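The read/write split can be sketched as a toy in-memory model. This is an illustration under assumptions, not a real replication protocol: `ReplicatedStore` and its methods are hypothetical names, plain dicts stand in for database nodes, and `replicate()` stands in for asynchronous replication.

```python
import itertools


class ReplicatedStore:
    """Toy model: one master dict, one or more slave dicts."""

    def __init__(self, n_slaves):
        if n_slaves < 1:
            raise ValueError("need at least one slave")
        self.master = {}
        self.slaves = [{} for _ in range(n_slaves)]
        self._next_slave = itertools.cycle(range(n_slaves))

    def write(self, key, value):
        # Every write goes to the single master.
        self.master[key] = value

    def replicate(self):
        # Stand-in for asynchronous replication: copy the master's
        # entries to every slave (deletes are not modelled here).
        for slave in self.slaves:
            slave.update(self.master)

    def read(self, key):
        # Reads are spread round-robin over the slaves, so adding
        # slaves adds read capacity.
        return self.slaves[next(self._next_slave)].get(key)
```

A read issued before `replicate()` runs returns `None` for a freshly written key, which is the replication lag that any real master/slave setup has to account for.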
Scalability - Horizontal
- Partitioning (sharding)
  - Distribute data across nodes
  - Generally involves data de-normalization
  - Where is some specific data?
    - Master index
    - Hash (DHT, consistent hashing)
    - Volatile index
  - Joins done at the application level
  - NoSQL friendly
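One way to answer "where is some specific data?" with a hash is consistent hashing. The sketch below is a minimal ring with virtual nodes. The class name `HashRing`, the choice of SHA-256, and the `vnodes` default are assumptions made for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._positions = []  # sorted hash positions on the ring
        self._owner = {}      # position -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value):
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def add(self, node):
        # Each node owns several points on the ring to even out load.
        for i in range(self.vnodes):
            pos = self._hash(f"{node}#{i}")
            self._owner[pos] = node
            bisect.insort(self._positions, pos)

    def remove(self, node):
        for i in range(self.vnodes):
            pos = self._hash(f"{node}#{i}")
            del self._owner[pos]
            self._positions.remove(pos)

    def lookup(self, key):
        if not self._positions:
            raise LookupError("ring is empty")
        # First position clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._positions, self._hash(key))
        return self._owner[self._positions[idx % len(self._positions)]]
```

When a node leaves, only the keys it owned move to other nodes. Every other key keeps its placement, which is what makes this scheme attractive for partitioned back-ends.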
Scalability - Horizontal
Volatile Index: build and maintain data index as cached information (all clients)
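One possible reading of the volatile index, sketched under assumptions: each client keeps a cached map from key to node, treats it as disposable, and rebuilds entries by asking the nodes when the cache misses or turns out to be stale. `VolatileIndex` is a hypothetical name and plain dicts stand in for storage nodes.

```python
class VolatileIndex:
    """Client-side cached index of key -> node; safe to lose at any time."""

    def __init__(self, nodes):
        self.nodes = nodes   # node name -> dict acting as a storage node
        self.location = {}   # cached key -> node name (may be stale)

    def _discover(self, key):
        # Slow path: ask every node whether it holds the key.
        for name, data in self.nodes.items():
            if key in data:
                return name
        return None

    def get(self, key):
        name = self.location.get(key)
        if name is not None and key in self.nodes.get(name, {}):
            return self.nodes[name][key]   # fast path: cached location is valid
        name = self._discover(key)         # cache miss or stale entry
        if name is None:
            self.location.pop(key, None)
            raise KeyError(key)
        self.location[key] = name          # refresh the cached entry
        return self.nodes[name][key]
```

Because the index is only cached information, a client that starts empty or holds outdated entries still finds the data. It just pays for the slow path once per key.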
High Availability
"Processes, as well as people, die"
- Handle hardware and software failures
- Eliminate single points of failure
  - Redundancy
  - Failover
  - Replicas
High Availability - Failover/Redundancy
High Availability - Replicas
- Two or more copies of the same data
- Replica granularity
  - From node replica to "row" replica
- Load balancing
- Write concurrency
- Replica updates
- Key for high availability and root of several problems
Problems
Problems - CAP Theorem
Problems - CAP Theorem
Consistency: all operations (reads/writes) yield a globally consistent state
Availability: every request received by a non-failed server must get a response
Partition Tolerance: the system keeps operating even when nodes cannot communicate with each other
Pick Two
Problems - CAP Theorem
C + A: network problems might stop the system
Examples:
- Oracle RAC, IBM DB2 Parallel
- RDBMS (Master/Slave)
- Google File System
- HDFS (Hadoop)
Problems - CAP Theorem
C + P: clients can't always perform operations
Examples:
- Distributed lock systems: Chubby, ZooKeeper
- Paxos protocol (consensus)
- BigTable, HBase
- Hypertable
- MongoDB
Problems - CAP Theorem
A + P: clients may read inconsistent (old or undone) data
Examples:
- Amazon Dynamo
- Cassandra
- Voldemort
- CouchDB
- Riak
- Caches
Problem with CAP Theorem
- In practice, C + A and C + P systems are the same
  - C + A: not tolerant of network partitions
  - C + P: not available when a network partition occurs
- Big problem: network partition
  - Not so big (how often does it happen?)
- Pick two
  - Availability
  - Consistency
- The forgotten one: latency
  - Or, how long does the system wait before it considers the network partitioned?
Problems - Real World
Every component may fail:
- Network failure
- Hardware failure
- Electricity
- Natural disasters
- Code failure
Tips & Tricks
Tips & Tricks - Pyramid
Capacity (connections, operations, ...) Pyramid
Tips & Tricks - Reply Fast
- FAIL fast
- Break complex requests into smaller ones
- Use timeouts
- No transactions
- Be aware that a single slow operation or component can generate contention
- Self-denial attack
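A minimal sketch of "use timeouts" combined with failing fast, assuming the slow operation is an ordinary blocking function. `call_with_timeout` and the pool size are hypothetical choices for illustration. The caller gets an answer or an error within the deadline, even though the slow call itself keeps running in the background.

```python
import concurrent.futures

# Small shared pool for outgoing calls (size chosen arbitrarily here).
_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def call_with_timeout(fn, timeout_s, *args):
    """Run fn(*args), but give up waiting after timeout_s seconds."""
    future = _pool.submit(fn, *args)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # Best effort only: a call that is already running is not
        # interrupted, it just no longer blocks the caller.
        future.cancel()
        raise TimeoutError(f"no reply within {timeout_s}s") from None
```

Since a timed-out call still occupies a worker, a bounded pool like this one also caps how much contention one slow component can create.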
Tips & Tricks - Cache
- Cache: component location, data, DNS lookups, previous requests, etc.
- Use a negative cache for failed requests (low expiration)
- Don't rely on the cache
  - Your system must work with no cache
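The negative-cache tip can be sketched as a small TTL cache that remembers failures for a short time and successes for longer. `TTLCache`, its parameters and the default expirations are assumptions for illustration. `loader` is whatever function fetches the real value.

```python
import time


class TTLCache:
    """Cache with a long TTL for successes and a short TTL for failures."""

    def __init__(self, loader, ttl_s=60.0, negative_ttl_s=2.0,
                 clock=time.monotonic):
        self.loader = loader
        self.ttl_s = ttl_s
        self.negative_ttl_s = negative_ttl_s
        self.clock = clock
        self._entries = {}  # key -> (expires_at, ok, value)

    def get(self, key):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            _, ok, value = entry
            if ok:
                return value
            # Negative hit: fail immediately instead of hitting the
            # failing backend again.
            raise LookupError(f"{key!r}: recent failure cached")
        try:
            value = self.loader(key)
        except Exception:
            # Remember the failure, but only briefly.
            self._entries[key] = (now + self.negative_ttl_s, False, None)
            raise
        self._entries[key] = (now + self.ttl_s, True, value)
        return value
```

With the cache removed, callers can still call `loader` directly, which keeps the system working with no cache at all.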
Tips & Tricks - Queues
An easy way to add asynchronous processing and decouple your system.
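As an in-process illustration of the idea, the sketch below uses Python's standard `queue.Queue` and one worker thread. A real deployment would more likely use a message broker between components. `start_worker` and the `None` stop sentinel are hypothetical conventions.

```python
import logging
import queue
import threading


def start_worker(jobs, handle):
    """Consume jobs from the queue in a background thread."""

    def loop():
        while True:
            job = jobs.get()
            try:
                if job is None:      # sentinel: stop the worker
                    return
                handle(job)
            except Exception:
                # A failing job must not kill the worker.
                logging.exception("job failed: %r", job)
            finally:
                jobs.task_done()

    worker = threading.Thread(target=loop, daemon=True)
    worker.start()
    return worker
```

The producer only calls `jobs.put(job)` and returns at once. A bounded queue (`queue.Queue(maxsize=...)`) additionally pushes back on producers when the consumer falls behind.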
Tips & Tricks - DNS
Tips & Tricks - Logs
- Log everything
- Use several log levels
- On every log message
  - User
  - Request host
  - Component involved
  - Version
  - Filename and line
- If a log level is not enabled, do not process the log message
- Avoid lookup calls (gettimeofday)
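A sketch of these logging tips using Python's standard `logging` module. `make_logger`, `debug_state`, `expensive_dump` and the field names are hypothetical. The point is that context fields are attached once and costly message building is skipped when the level is disabled.

```python
import logging


def make_logger(name, user, host, component, version, stream=None):
    # Every message carries user, request host, component, version,
    # plus the filename and line of the call site.
    fmt = ("%(asctime)s %(levelname)s user=%(user)s host=%(host)s "
           "component=%(component)s version=%(version)s "
           "%(filename)s:%(lineno)d %(message)s")
    handler = logging.StreamHandler(stream)  # None means sys.stderr
    handler.setFormatter(logging.Formatter(fmt))
    logger = logging.getLogger(name)
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    context = {"user": user, "host": host,
               "component": component, "version": version}
    return logging.LoggerAdapter(logger, context)


def expensive_dump(state):
    # Stand-in for a costly formatting step.
    return ", ".join(f"{k}={v}" for k, v in sorted(state.items()))


def debug_state(log, state, dump=expensive_dump):
    # Do not build the message at all when DEBUG is disabled.
    if log.isEnabledFor(logging.DEBUG):
        log.debug("state: %s", dump(state))
```

Passing arguments as `log.info("took %d ms", elapsed)` rather than pre-formatting the string has a similar effect for cheap messages, because `logging` only formats records that are actually emitted.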
Tips & Tricks - Domino Effect
- Make sure your load balancer won't overload components
- Use smart algorithms
  - Load balancing
  - Resource allocation
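One way to avoid the domino effect, sketched under assumptions: the balancer tracks in-flight work per node, never exceeds a node's capacity, and sheds load instead of piling a failed node's traffic onto the survivors. `CappedBalancer`, `Overloaded` and the capacity numbers are hypothetical.

```python
class Overloaded(Exception):
    """Raised when every healthy node is already at capacity."""


class CappedBalancer:
    def __init__(self, capacity):
        self.capacity = dict(capacity)            # node -> max in-flight (> 0)
        self.in_flight = {node: 0 for node in self.capacity}
        self.healthy = set(self.capacity)

    def acquire(self):
        candidates = [n for n in self.healthy
                      if self.in_flight[n] < self.capacity[n]]
        if not candidates:
            # Shed load rather than overload the remaining nodes.
            raise Overloaded("all healthy nodes are at capacity")
        # Pick the least-utilized node; the name breaks ties.
        node = min(candidates,
                   key=lambda n: (self.in_flight[n] / self.capacity[n], n))
        self.in_flight[node] += 1
        return node

    def release(self, node):
        self.in_flight[node] -= 1

    def mark_down(self, node):
        self.healthy.discard(node)
```

When one of two nodes is marked down, the survivor still only receives as much work as its own capacity allows. The rest is rejected quickly, which keeps a single failure from cascading.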
Tips & Tricks - (Zero) Configuration
- No configuration files
- Use good defaults
- Auto-discovery (multicast, gossip, ...)
- Make everything configurable
  - Administrative command
  - No need to stop for changes
- Self-adjust automatically when possible
Tips & Tricks - STOP Test
With your system under load: kill -STOP <component>
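The same test can be scripted so the component is always resumed afterwards. This is a POSIX-only sketch that sends the same signal as the command above. `frozen` is a hypothetical helper and the process id is whatever component you choose to suspend.

```python
import contextlib
import os
import signal


@contextlib.contextmanager
def frozen(pid):
    """Suspend a process (like kill -STOP) for the duration of the block."""
    os.kill(pid, signal.SIGSTOP)
    try:
        yield
    finally:
        # Always resume, even if the checks inside the block fail.
        os.kill(pid, signal.SIGCONT)
```

Inside the `with frozen(pid):` block the component is alive but unresponsive, so its connections stay open and nothing is refused. That is a good moment to check that callers time out and recover rather than hang.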
Tips & Tricks - Know your tools
- load average (uptime)
- stats tools
  - vmstat
  - iostat
  - mpstat
  - tcpstat, tcprstat, etc.
- tcpdump, nc, netstat
- tuning
  - /proc/net/*
  - ulimit
  - sysctl
- oprofile
- debugging tools (gdb, valgrind)
- ...
Tips & Tricks - Count
- Count everything
  - Connections
  - Operations
  - Failures
  - Successes
  - Request times (granularity)
    - Total, average, standard deviation
- Monitor counters
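Total, average and standard deviation can be maintained incrementally, without storing every sample. The sketch below uses Welford's online algorithm. `RunningStats` is a hypothetical name, and the standard deviation is the population form (dividing by the count).

```python
import math


class RunningStats:
    """Running count, total, mean and standard deviation of samples."""

    def __init__(self):
        self.count = 0
        self.total = 0.0
        self._mean = 0.0
        self._m2 = 0.0   # sum of squared differences from the mean

    def add(self, x):
        self.count += 1
        self.total += x
        delta = x - self._mean
        self._mean += delta / self.count
        self._m2 += delta * (x - self._mean)

    @property
    def mean(self):
        return self._mean if self.count else 0.0

    @property
    def stddev(self):
        return math.sqrt(self._m2 / self.count) if self.count else 0.0
```

One instance per counter (for example, request time per operation type) keeps memory constant, so it is cheap to keep enabled in production and expose to monitoring.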
Tips & Tricks - Stability Patterns
- Use Timeouts
- Circuit Breaker
- Bulkheads
- Steady State
- Fail Fast
- Handshaking
- Test Harness
- Decoupling Middleware
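As an illustration of one of these patterns, here is a minimal circuit breaker sketch. The class names, thresholds and the single-trial half-open behaviour are assumptions for illustration, not a reference implementation of the pattern as described in any particular book.

```python
import time


class CircuitOpen(Exception):
    """Raised when the breaker is open and calls fail fast."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after_s=30.0,
                 clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after_s = reset_after_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None   # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_s:
                raise CircuitOpen("failing fast; backend presumed down")
            # Otherwise half-open: let this one trial call through.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()   # open, or re-open after a failed trial
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

While the circuit is open, callers get an immediate `CircuitOpen` instead of waiting on a component that is known to be failing. That combines naturally with timeouts and fail fast.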
Tips & Tricks - Don't Panic!
Learning More - Books
TCP/IP Illustrated, Vol. 1: The Protocols
Learning More - Books
Unix Network Programming, Vol. 1: The Sockets Networking API
Learning More - Books
Pattern-Oriented Software Architecture, Vol. 2
Learning More - Books
Release It!
Learning More - Papers
- The Google File System
- Bigtable: A Distributed Storage System for Structured Data
- Dynamo: Amazon's Highly Available Key-Value Store
- PNUTS: Yahoo!'s Hosted Data Serving Platform
- MapReduce: Simplified Data Processing on Large Clusters
- Towards robust distributed systems
- Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services
- BASE: An Acid Alternative
- Looking up data in P2P systems
Thanks!!! Questions?
lucindo.github.com - @rlucindo