Download - Scalable Web Architectures
![Page 1: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/1.jpg)
Scalable Web Architectures
Common Patterns & Approaches
Cal Henderson
![Page 2: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/2.jpg)
Web 2.0 Expo NYC, 16th September 2008 2
Hello
![Page 3: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/3.jpg)
Web 2.0 Expo NYC, 16th September 2008 3
Flickr
• Large scale kitten-sharing
• Almost 5 years old– 4.5 since public launch
• Open source stack– An open attitude towards architecture
![Page 4: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/4.jpg)
Web 2.0 Expo NYC, 16th September 2008 4
Flickr
• Large scale
• In a single second:– Serve 40,000 photos– Handle 100,000 cache operations– Process 130,000 database queries
![Page 5: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/5.jpg)
Web 2.0 Expo NYC, 16th September 2008 5
Scalable Web Architectures?
What does scalable mean?
What’s an architecture?
An answer in 12 parts
![Page 6: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/6.jpg)
Web 2.0 Expo NYC, 16th September 2008 6
1.
Scaling
![Page 7: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/7.jpg)
Web 2.0 Expo NYC, 16th September 2008 7
Scalability – myths and lies
• What is scalability?
![Page 8: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/8.jpg)
Web 2.0 Expo NYC, 16th September 2008 8
Scalability – myths and lies
• What is scalability not ?
![Page 9: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/9.jpg)
Web 2.0 Expo NYC, 16th September 2008 9
Scalability – myths and lies
• What is scalability not ?– Raw Speed / Performance– HA / BCP– Technology X– Protocol Y
![Page 10: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/10.jpg)
Web 2.0 Expo NYC, 16th September 2008 10
Scalability – myths and lies
• So what is scalability?
![Page 11: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/11.jpg)
Web 2.0 Expo NYC, 16th September 2008 11
Scalability – myths and lies
• So what is scalability?– Traffic growth– Dataset growth– Maintainability
![Page 12: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/12.jpg)
Web 2.0 Expo NYC, 16th September 2008 12
Today
• Two goals of application architecture:
Scale
HA
![Page 13: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/13.jpg)
Web 2.0 Expo NYC, 16th September 2008 13
Today
• Three goals of application architecture:
Scale
HA
Performance
![Page 14: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/14.jpg)
Web 2.0 Expo NYC, 16th September 2008 14
Scalability
• Two kinds:– Vertical (get bigger)– Horizontal (get more)
![Page 15: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/15.jpg)
Web 2.0 Expo NYC, 16th September 2008 15
Big Irons
Sunfire E20k
$450,000 - $2,500,00036x 1.8GHz processors
PowerEdge SC1435Dualcore 1.8 GHz processor
Around $1,500
![Page 16: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/16.jpg)
Web 2.0 Expo NYC, 16th September 2008 16
Cost vs Cost
![Page 17: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/17.jpg)
Web 2.0 Expo NYC, 16th September 2008 17
That’s OK
• Sometimes vertical scaling is right
• Buying a bigger box is quick (ish)
• Redesigning software is not
• Running out of MySQL performance?– Spend months on data federation– Or, Just buy a ton more RAM
![Page 18: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/18.jpg)
Web 2.0 Expo NYC, 16th September 2008 18
The H & the V
• But we’ll mostly talk horizontal– Else this is going to be boring
![Page 19: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/19.jpg)
Web 2.0 Expo NYC, 16th September 2008 19
2.
Architecture
![Page 20: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/20.jpg)
Web 2.0 Expo NYC, 16th September 2008 20
Architectures then?
• The way the bits fit together
• What grows where
• The trade-offs between good/fast/cheap
![Page 21: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/21.jpg)
Web 2.0 Expo NYC, 16th September 2008 21
LAMP
• We’re mostly talking about LAMP– Linux– Apache (or LightHTTPd)– MySQL (or Postgres)– PHP (or Perl, Python, Ruby)
• All open source• All well supported• All used in large operations• (Same rules apply elsewhere)
![Page 22: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/22.jpg)
Web 2.0 Expo NYC, 16th September 2008 22
Simple web apps
• A Web Application– Or “Web Site” in Web 1.0 terminology
Interwebnet App server Database
![Page 23: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/23.jpg)
Web 2.0 Expo NYC, 16th September 2008 23
Simple web apps
• A Web Application– Or “Web Site” in Web 1.0 terminology
Interwobnet App server Database
Cache
Storage array
AJAX!!!1
![Page 24: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/24.jpg)
Web 2.0 Expo NYC, 16th September 2008 24
App servers
• App servers scale in two ways:
![Page 25: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/25.jpg)
Web 2.0 Expo NYC, 16th September 2008 25
App servers
• App servers scale in two ways:
– Really well
![Page 26: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/26.jpg)
Web 2.0 Expo NYC, 16th September 2008 26
App servers
• App servers scale in two ways:
– Really well
– Quite badly
![Page 27: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/27.jpg)
Web 2.0 Expo NYC, 16th September 2008 27
App servers
• Sessions!– (State)
– Local sessions == bad• When they move == quite bad
– Centralized sessions == good
– No sessions at all == awesome!
![Page 28: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/28.jpg)
Web 2.0 Expo NYC, 16th September 2008 28
Local sessions
• Stored on disk– PHP sessions
• Stored in memory– Shared memory block (APC)
• Bad!– Can’t move users– Can’t avoid hotspots– Not fault tolerant
![Page 29: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/29.jpg)
Web 2.0 Expo NYC, 16th September 2008 29
Mobile local sessions
• Custom built– Store last session location in cookie– If we hit a different server, pull our session
information across
• If your load balancer has sticky sessions, you can still get hotspots– Depends on volume – fewer heavier users
hurt more
![Page 30: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/30.jpg)
Web 2.0 Expo NYC, 16th September 2008 30
Remote centralized sessions
• Store in a central database– Or an in-memory cache
• No porting around of session data• No need for sticky sessions• No hot spots
• Need to be able to scale the data store– But we’ve pushed the issue down the stack
![Page 31: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/31.jpg)
Web 2.0 Expo NYC, 16th September 2008 31
No sessions
• Stash it all in a cookie!
• Sign it for safety– $data = $user_id . ‘-’ . $user_name;– $time = time();– $sig = sha1($secret . $time . $data);– $cookie = base64(“$sig-$time-$data”);
– Timestamp means it’s simple to expire it
![Page 32: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/32.jpg)
Web 2.0 Expo NYC, 16th September 2008 32
Super slim sessions
• If you need more than the cookie (login status, user id, username), then pull their account row from the DB– Or from the account cache
• None of the drawbacks of sessions• Avoids the overhead of a query per page
– Great for high-volume pages which need little personalization
– Turns out you can stick quite a lot in a cookie too– Pack with base64 and it’s easy to delimit fields
![Page 33: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/33.jpg)
Web 2.0 Expo NYC, 16th September 2008 33
App servers
• The Rasmus way– App server has ‘shared nothing’– Responsibility pushed down the stack
• Ooh, the stack
![Page 34: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/34.jpg)
Web 2.0 Expo NYC, 16th September 2008 34
Trifle
![Page 35: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/35.jpg)
Web 2.0 Expo NYC, 16th September 2008 35
Trifle
Sponge / Database
Jelly / Business Logic
Custard / Page Logic
Cream / Markup
Fruit / Presentation
![Page 36: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/36.jpg)
Web 2.0 Expo NYC, 16th September 2008 36
Trifle
Sponge / Database
Jelly / Business Logic
Custard / Page Logic
Cream / Markup
Fruit / Presentation
![Page 37: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/37.jpg)
Web 2.0 Expo NYC, 16th September 2008 37
App servers
![Page 38: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/38.jpg)
Web 2.0 Expo NYC, 16th September 2008 38
App servers
![Page 39: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/39.jpg)
Web 2.0 Expo NYC, 16th September 2008 39
App servers
![Page 40: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/40.jpg)
Web 2.0 Expo NYC, 16th September 2008 40
Well, that was easy
• Scaling the web app server part is easy
• The rest is the trickier part– Database– Serving static content– Storing static content
![Page 41: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/41.jpg)
Web 2.0 Expo NYC, 16th September 2008 41
The others
• Other services scale similarly to web apps– That is, horizontally
• The canonical examples:– Image conversion– Audio transcoding– Video transcoding– Web crawling– Compute!
![Page 42: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/42.jpg)
Web 2.0 Expo NYC, 16th September 2008 42
Amazon
• Let’s talk about Amazon– S3 - Storage– EC2 – Compute! (XEN based)– SQS – Queueing
• All horizontal
• Cheap when small– Not cheap at scale
![Page 43: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/43.jpg)
Web 2.0 Expo NYC, 16th September 2008 43
3.
Load Balancing
![Page 44: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/44.jpg)
Web 2.0 Expo NYC, 16th September 2008 44
Load balancing
• If we have multiple nodes in a class, we need to balance between them
• Hardware or software
• Layer 4 or 7
![Page 45: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/45.jpg)
Web 2.0 Expo NYC, 16th September 2008 45
Hardware LB
• A hardware appliance– Often a pair with heartbeats for HA
• Expensive!– But offers high performance– Easy to do > 1Gbps
• Many brands– Alteon, Cisco, Netscalar, Foundry, etc– L7 - web switches, content switches, etc
![Page 46: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/46.jpg)
Web 2.0 Expo NYC, 16th September 2008 46
Software LB
• Just some software– Still needs hardware to run on– But can run on existing servers
• Harder to have HA– Often people stick hardware LB’s in front– But Wackamole helps here
![Page 47: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/47.jpg)
Web 2.0 Expo NYC, 16th September 2008 47
Software LB
• Lots of options– Pound– Perlbal– Apache with mod_proxy
• Wackamole with mod_backhand– http://backhand.org/wackamole/– http://backhand.org/mod_backhand/
![Page 48: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/48.jpg)
Web 2.0 Expo NYC, 16th September 2008 48
Wackamole
![Page 49: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/49.jpg)
Web 2.0 Expo NYC, 16th September 2008 49
Wackamole
![Page 50: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/50.jpg)
Web 2.0 Expo NYC, 16th September 2008 50
The layers
• Layer 4– A ‘dumb’ balance
• Layer 7– A ‘smart’ balance
• OSI stack, routers, etc
![Page 51: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/51.jpg)
Web 2.0 Expo NYC, 16th September 2008 51
4.
Queuing
![Page 52: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/52.jpg)
Web 2.0 Expo NYC, 16th September 2008 52
Parallelizable == easy!
• If we can transcode/crawl in parallel, it’s easy– But think about queuing– And asynchronous systems– The web ain’t built for slow things– But still, a simple problem
![Page 53: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/53.jpg)
Web 2.0 Expo NYC, 16th September 2008 53
Synchronous systems
![Page 54: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/54.jpg)
Web 2.0 Expo NYC, 16th September 2008 54
Asynchronous systems
![Page 55: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/55.jpg)
Web 2.0 Expo NYC, 16th September 2008 55
Helps with peak periods
![Page 56: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/56.jpg)
Web 2.0 Expo NYC, 16th September 2008 56
Synchronous systems
![Page 57: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/57.jpg)
Web 2.0 Expo NYC, 16th September 2008 57
Asynchronous systems
![Page 58: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/58.jpg)
Web 2.0 Expo NYC, 16th September 2008 58
Asynchronous systems
![Page 59: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/59.jpg)
Web 2.0 Expo NYC, 16th September 2008 59
5.
Relational Data
![Page 60: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/60.jpg)
Web 2.0 Expo NYC, 16th September 2008 60
Databases
• Unless we’re doing a lot of file serving, the database is the toughest part to scale
• If we can, best to avoid the issue altogether and just buy bigger hardware
• Dual-Quad Opteron/Intel64 systems with 16+GB of RAM can get you a long way
![Page 61: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/61.jpg)
Web 2.0 Expo NYC, 16th September 2008 61
More read power
• Web apps typically have a read/write ratio of somewhere between 80/20 and 90/10
• If we can scale read capacity, we can solve a lot of situations
• MySQL replication!
![Page 62: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/62.jpg)
Web 2.0 Expo NYC, 16th September 2008 62
Master-Slave Replication
![Page 63: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/63.jpg)
Web 2.0 Expo NYC, 16th September 2008 63
Master-Slave Replication
Reads and Writes
Reads
![Page 64: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/64.jpg)
Web 2.0 Expo NYC, 16th September 2008 64
Master-Slave Replication
![Page 65: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/65.jpg)
Web 2.0 Expo NYC, 16th September 2008 65
Master-Slave Replication
![Page 66: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/66.jpg)
Web 2.0 Expo NYC, 16th September 2008 66
Master-Slave Replication
![Page 67: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/67.jpg)
Web 2.0 Expo NYC, 16th September 2008 67
Master-Slave Replication
![Page 68: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/68.jpg)
Web 2.0 Expo NYC, 16th September 2008 68
Master-Slave Replication
![Page 69: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/69.jpg)
Web 2.0 Expo NYC, 16th September 2008 69
Master-Slave Replication
![Page 70: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/70.jpg)
Web 2.0 Expo NYC, 16th September 2008 70
Master-Slave Replication
![Page 71: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/71.jpg)
Web 2.0 Expo NYC, 16th September 2008 71
Master-Slave Replication
![Page 72: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/72.jpg)
Web 2.0 Expo NYC, 16th September 2008 72
6.
Caching
![Page 73: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/73.jpg)
Web 2.0 Expo NYC, 16th September 2008 73
Caching
• Caching avoids needing to scale!– Or makes it cheaper
• Simple stuff– mod_perl / shared memory
• Invalidation is hard
– MySQL query cache• Bad performance (in most cases)
![Page 74: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/74.jpg)
Web 2.0 Expo NYC, 16th September 2008 74
Caching
• Getting more complicated…– Write-through cache– Write-back cache– Sideline cache
![Page 75: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/75.jpg)
Web 2.0 Expo NYC, 16th September 2008 75
Write-through cache
![Page 76: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/76.jpg)
Web 2.0 Expo NYC, 16th September 2008 76
Write-back cache
![Page 77: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/77.jpg)
Web 2.0 Expo NYC, 16th September 2008 77
Sideline cache
![Page 78: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/78.jpg)
Web 2.0 Expo NYC, 16th September 2008 78
Sideline cache
• Easy to implement– Just add app logic
• Need to manually invalidate cache– Well designed code makes it easy
• Memcached– From Danga (LiveJournal)– http://www.danga.com/memcached/
![Page 79: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/79.jpg)
Web 2.0 Expo NYC, 16th September 2008 79
Memcache schemes
• Layer 4– Good: Cache can be local on a machine– Bad: Invalidation gets more expensive with
node count– Bad: Cache space wasted by duplicate
objects
![Page 80: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/80.jpg)
Web 2.0 Expo NYC, 16th September 2008 80
Memcache schemes
• Layer 7– Good: No wasted space– Good: linearly scaling invalidation– Bad: Multiple, remote connections
• Can be avoided with a proxy layer– Gets more complicated
» Last indentation level!
![Page 81: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/81.jpg)
Web 2.0 Expo NYC, 16th September 2008 81
7.
HA Data
![Page 82: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/82.jpg)
Web 2.0 Expo NYC, 16th September 2008 82
But what about HA?
![Page 83: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/83.jpg)
Web 2.0 Expo NYC, 16th September 2008 83
But what about HA?
![Page 84: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/84.jpg)
Web 2.0 Expo NYC, 16th September 2008 84
SPOF!
• The key to HA is avoiding SPOFs– Identify– Eliminate
• Some stuff is hard to solve– Fix it further up the tree
• Dual DCs solves Router/Switch SPOF
![Page 85: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/85.jpg)
Web 2.0 Expo NYC, 16th September 2008 85
Master-Master
![Page 86: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/86.jpg)
Web 2.0 Expo NYC, 16th September 2008 86
Master-Master
• Either hot/warm or hot/hot
• Writes can go to either– But avoid collisions– No auto-inc columns for hot/hot
• Bad for hot/warm too• Unless you have MySQL 5
– But you can’t rely on the ordering!
– Design schema/access to avoid collisions• Hashing users to servers
![Page 87: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/87.jpg)
Web 2.0 Expo NYC, 16th September 2008 87
Rings
• Master-master is just a small ring– With 2 nodes
• Bigger rings are possible– But not a mesh!– Each slave may only have a single master– Unless you build some kind of manual
replication
![Page 88: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/88.jpg)
Web 2.0 Expo NYC, 16th September 2008 88
Rings
![Page 89: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/89.jpg)
Web 2.0 Expo NYC, 16th September 2008 89
Rings
![Page 90: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/90.jpg)
Web 2.0 Expo NYC, 16th September 2008 90
Dual trees
• Master-master is good for HA– But we can’t scale out the reads (or writes!)
• We often need to combine the read scaling with HA
• We can simply combine the two models
![Page 91: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/91.jpg)
Web 2.0 Expo NYC, 16th September 2008 91
Dual trees
![Page 92: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/92.jpg)
Web 2.0 Expo NYC, 16th September 2008 92
Cost models
• There’s a problem here– We need to always have 200% capacity to
avoid a SPOF• 400% for dual sites!
– This costs too much
• Solution is straight forward– Make sure clusters are bigger than 2
![Page 93: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/93.jpg)
Web 2.0 Expo NYC, 16th September 2008 93
N+M
• N+M– N = nodes needed to run the system– M = nodes we can afford to lose
• Having M as big as N starts to suck– If we could make each node smaller, we can
increase N while M stays constant– (We assume smaller nodes are cheaper)
![Page 94: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/94.jpg)
Web 2.0 Expo NYC, 16th September 2008 94
1+1 = 200% hardware
![Page 95: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/95.jpg)
Web 2.0 Expo NYC, 16th September 2008 95
3+1 = 133% hardware
![Page 96: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/96.jpg)
Web 2.0 Expo NYC, 16th September 2008 96
Meshed masters
• Not possible with regular MySQL out-of-the-box today
• But there is hope!– NBD (MySQL Cluster) allows a mesh– Support for replication out to slaves in a
coming version• RSN!
![Page 97: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/97.jpg)
Web 2.0 Expo NYC, 16th September 2008 97
8.
Federation
![Page 98: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/98.jpg)
Web 2.0 Expo NYC, 16th September 2008 98
Data federation
• At some point, you need more writes– This is tough– Each cluster of servers has limited write
capacity
• Just add more clusters!
![Page 99: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/99.jpg)
Web 2.0 Expo NYC, 16th September 2008 99
Simple things first
• Vertical partitioning– Divide tables into sets that never get joined– Split these sets onto different server clusters– Voila!
• Logical limits– When you run out of non-joining groups– When a single table grows too large
![Page 100: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/100.jpg)
Web 2.0 Expo NYC, 16th September 2008 100
Data federation
• Split up large tables, organized by some primary object– Usually users
• Put all of a user’s data on one ‘cluster’– Or shard, or cell
• Have one central cluster for lookups
![Page 101: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/101.jpg)
Web 2.0 Expo NYC, 16th September 2008 101
Data federation
![Page 102: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/102.jpg)
Web 2.0 Expo NYC, 16th September 2008 102
Data federation
• Need more capacity?– Just add shards!– Don’t assign to shards based on user_id!
• For resource leveling as time goes on, we want to be able to move objects between shards– Maybe – not everyone does this– ‘Lockable’ objects
![Page 103: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/103.jpg)
Web 2.0 Expo NYC, 16th September 2008 103
The wordpress.com approach
• Hash users into one of n buckets– Where n is a power of 2
• Put all the buckets on one server
• When you run out of capacity, split the buckets across two servers
• Then you run out of capacity, split the buckets across four servers
• Etc
![Page 104: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/104.jpg)
Web 2.0 Expo NYC, 16th September 2008 104
Data federation
• Heterogeneous hardware is fine– Just give a larger/smaller proportion of objects
depending on hardware
• Bigger/faster hardware for paying users– A common approach– Can also allocate faster app servers via magic
cookies at the LB
![Page 105: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/105.jpg)
Web 2.0 Expo NYC, 16th September 2008 105
Downsides
• Need to keep stuff in the right place• App logic gets more complicated• More clusters to manage
– Backups, etc
• More database connections needed per page– Proxy can solve this, but complicated
• The dual table issue– Avoid walking the shards!
![Page 106: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/106.jpg)
Web 2.0 Expo NYC, 16th September 2008 106
Bottom line
Data federation is how large applications are
scaled
![Page 107: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/107.jpg)
Web 2.0 Expo NYC, 16th September 2008 107
Bottom line
• It’s hard, but not impossible
• Good software design makes it easier– Abstraction!
• Master-master pairs for shards give us HA
• Master-master trees work for central cluster (many reads, few writes)
![Page 108: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/108.jpg)
Web 2.0 Expo NYC, 16th September 2008 108
9.
Multi-site HA
![Page 109: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/109.jpg)
Web 2.0 Expo NYC, 16th September 2008 109
Multiple Datacenters
• Having multiple datacenters is hard– Not just with MySQL
• Hot/warm with MySQL slaved setup– But manual (reconfig on failure)
• Hot/hot with master-master– But dangerous (each site has a SPOF)
• Hot/hot with sync/async manual replication– But tough (big engineering task)
![Page 110: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/110.jpg)
Web 2.0 Expo NYC, 16th September 2008 110
Multiple Datacenters
![Page 111: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/111.jpg)
Web 2.0 Expo NYC, 16th September 2008 111
GSLB
• Multiple sites need to be balanced– Global Server Load Balancing
• Easiest are AkaDNS-like services– Performance rotations– Balance rotations
![Page 112: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/112.jpg)
Web 2.0 Expo NYC, 16th September 2008 112
10.
Serving Files
![Page 113: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/113.jpg)
Web 2.0 Expo NYC, 16th September 2008 113
Serving lots of files
• Serving lots of files is not too tough– Just buy lots of machines and load balance!
• We’re IO bound – need more spindles!– But keeping many copies of data in sync is
hard– And sometimes we have other per-request
overhead (like auth)
![Page 114: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/114.jpg)
Web 2.0 Expo NYC, 16th September 2008 114
Reverse proxy
![Page 115: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/115.jpg)
Web 2.0 Expo NYC, 16th September 2008 115
Reverse proxy
• Serving out of memory is fast!– And our caching proxies can have disks too– Fast or otherwise
• More spindles is better
• We can parallelize it! – 50 cache servers gives us 50 times the
serving rate of the origin server– Assuming the working set is small enough to
fit in memory in the cache cluster
![Page 116: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/116.jpg)
Web 2.0 Expo NYC, 16th September 2008 116
Invalidation
• Dealing with invalidation is tricky
• We can prod the cache servers directly to clear stuff out– Scales badly – need to clear asset from every
server – doesn’t work well for 100 caches
![Page 117: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/117.jpg)
Web 2.0 Expo NYC, 16th September 2008 117
Invalidation
• We can change the URLs of modified resources– And let the old ones drop out cache naturally– Or poke them out, for sensitive data
• Good approach!– Avoids browser cache staleness– Hello Akamai (and other CDNs)– Read more:
• http://www.thinkvitamin.com/features/webapps/serving-javascript-fast
![Page 118: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/118.jpg)
Web 2.0 Expo NYC, 16th September 2008 118
CDN – Content Delivery Network
• Akamai, Savvis, Mirror Image Internet, etc
• Caches operated by other people– Already in-place– In lots of places
• GSLB/DNS balancing
![Page 119: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/119.jpg)
Web 2.0 Expo NYC, 16th September 2008 119
Edge networks
Origin
![Page 120: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/120.jpg)
Web 2.0 Expo NYC, 16th September 2008 120
Edge networks
Origin
Cache
Cache
Cache
CacheCache
Cache
CacheCache
![Page 121: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/121.jpg)
Web 2.0 Expo NYC, 16th September 2008 121
CDN Models
• Simple model– You push content to them, they serve it
• Reverse proxy model– You publish content on an origin, they proxy
and cache it
![Page 122: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/122.jpg)
Web 2.0 Expo NYC, 16th September 2008 122
CDN Invalidation
• You don’t control the caches– Just like those awful ISP ones
• Once something is cached by a CDN, assume it can never change– Nothing can be deleted– Nothing can be modified
![Page 123: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/123.jpg)
Web 2.0 Expo NYC, 16th September 2008 123
Versioning
• When you start to cache things, you need to care about versioning
– Invalidation & Expiry– Naming & Sync
![Page 124: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/124.jpg)
Web 2.0 Expo NYC, 16th September 2008 124
Cache Invalidation
• If you control the caches, invalidation is possible
• But remember ISP and client caches
• Remove deleted content explicitly– Avoid users finding old content– Save cache space
![Page 125: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/125.jpg)
Web 2.0 Expo NYC, 16th September 2008 125
Cache versioning
• Simple rule of thumb:– If an item is modified, change its name (URL)
• This can be independent of the file system!
![Page 126: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/126.jpg)
Web 2.0 Expo NYC, 16th September 2008 126
Virtual versioning
• Database indicates version 3 of file
• Web app writes version number into URL
• Request comes through cache and is cached with the versioned URL
• mod_rewrite converts versioned URL to path
Version 3
example.com/foo_3.jpg
Cached: foo_3.jpg
foo_3.jpg -> foo.jpg
![Page 127: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/127.jpg)
Web 2.0 Expo NYC, 16th September 2008 127
Reverse proxy
• Choices– L7 load balancer & Squid
• http://www.squid-cache.org/
– mod_proxy & mod_cache• http://www.apache.org/
– Perlbal and Memcache?• http://www.danga.com/
![Page 128: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/128.jpg)
Web 2.0 Expo NYC, 16th September 2008 128
More reverse proxy
• HA proxy
• Squid & CARP
• Varnish
![Page 129: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/129.jpg)
Web 2.0 Expo NYC, 16th September 2008 129
High overhead serving
• What if you need to authenticate your asset serving?– Private photos– Private data– Subscriber-only files
• Two main approaches– Proxies w/ tokens– Path translation
![Page 130: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/130.jpg)
Web 2.0 Expo NYC, 16th September 2008 130
Perlbal Re-proxying
• Perlbal can do redirection magic– Client sends request to Perlbal– Perlbal plugin verifies user credentials
• token, cookies, whatever• tokens avoid data-store access
– Perlbal goes to pick up the file from elsewhere– Transparent to user
![Page 131: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/131.jpg)
Web 2.0 Expo NYC, 16th September 2008 131
Perlbal Re-proxying
![Page 132: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/132.jpg)
Web 2.0 Expo NYC, 16th September 2008 132
Perlbal Re-proxying
• Doesn’t keep database around while serving
• Doesn’t keep app server around while serving
• User doesn’t find out how to access asset directly
![Page 133: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/133.jpg)
Web 2.0 Expo NYC, 16th September 2008 133
Permission URLs
• But why bother!?
• If we bake the auth into the URL then it saves the auth step
• We can do the auth on the web app servers when creating HTML
• Just need some magic to translate to paths
• We don’t want paths to be guessable
![Page 134: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/134.jpg)
Web 2.0 Expo NYC, 16th September 2008 134
Permission URLs
![Page 135: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/135.jpg)
Web 2.0 Expo NYC, 16th September 2008 135
Permission URLs
(or mod_perl)
![Page 136: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/136.jpg)
Web 2.0 Expo NYC, 16th September 2008 136
Permission URLs
• Downsides– URL gives permission for life– Unless you bake in tokens
• Tokens tend to be non-expirable– We don’t want to track every token
» Too much overhead
• But can still expire– But you choose at time of issue, not later
• Upsides– It works– Scales very nicely
![Page 137: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/137.jpg)
Web 2.0 Expo NYC, 16th September 2008 137
11.
Storing Files
![Page 138: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/138.jpg)
Web 2.0 Expo NYC, 16th September 2008 138
Storing lots of files
• Storing files is easy!– Get a big disk– Get a bigger disk– Uh oh!
• Horizontal scaling is the key– Again
![Page 139: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/139.jpg)
Web 2.0 Expo NYC, 16th September 2008 139
Connecting to storage
• NFS– Stateful == Sucks– Hard mounts vs Soft mounts, INTR
• SMB / CIFS / Samba– Turn off MSRPC & WINS (NetBOIS NS)– Stateful but degrades gracefully
• HTTP– Stateless == Yay!– Just use Apache
![Page 140: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/140.jpg)
Web 2.0 Expo NYC, 16th September 2008 140
Multiple volumes
• Volumes are limited in total size– Except (in theory) under ZFS & others
• Sometimes we need multiple volumes for performance reasons– When using RAID with single/dual parity
• At some point, we need multiple volumes
![Page 141: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/141.jpg)
Web 2.0 Expo NYC, 16th September 2008 141
Multiple volumes
![Page 142: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/142.jpg)
Web 2.0 Expo NYC, 16th September 2008 142
Multiple hosts
• Further down the road, a single host will be too small
• Total throughput of machine becomes an issue
• Even physical space can start to matter
• So we need to be able to use multiple hosts
![Page 143: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/143.jpg)
Web 2.0 Expo NYC, 16th September 2008 143
Multiple hosts
![Page 144: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/144.jpg)
Web 2.0 Expo NYC, 16th September 2008 144
HA Storage
• HA is important for file assets too– We can back stuff up– But we tend to want hot redundancy
• RAID is good– RAID 5 is cheap, RAID 10 is fast
![Page 145: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/145.jpg)
Web 2.0 Expo NYC, 16th September 2008 145
HA Storage
• But whole machines can fail
• So we stick assets on multiple machines
• In this case, we can ignore RAID– In failure case, we serve from alternative
source– But need to weigh up the rebuild time and
effort against the risk– Store more than 2 copies?
![Page 146: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/146.jpg)
Web 2.0 Expo NYC, 16th September 2008 146
HA Storage
![Page 147: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/147.jpg)
Web 2.0 Expo NYC, 16th September 2008 147
HA Storage
• But whole colos can fail
• So we stick assets in multiple colos
• Again, we can ignore RAID– Or even multi-machine per colo– But do we want to lose a whole colo because
of single disk failure?– Redundancy is always a balance
![Page 148: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/148.jpg)
Web 2.0 Expo NYC, 16th September 2008 148
Self repairing systems
• When something fails, repairing can be a pain– RAID rebuilds by itself, but machine
replication doesn’t
• The big appliances self heal– NetApp, StorEdge, etc
• So does MogileFS (reaper)
![Page 149: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/149.jpg)
Web 2.0 Expo NYC, 16th September 2008 149
Real Life
Case Studies
![Page 150: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/150.jpg)
Web 2.0 Expo NYC, 16th September 2008 150
GFS – Google File System
• Developed by … Google
• Proprietary
• Everything we know about it is based on talks they’ve given
• Designed to store huge files for fast access
![Page 151: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/151.jpg)
Web 2.0 Expo NYC, 16th September 2008 151
GFS – Google File System
• Single ‘Master’ node holds metadata– SPF – Shadow master allows warm swap
• Grid of ‘chunkservers’– 64bit filenames– 64 MB file chunks
![Page 152: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/152.jpg)
Web 2.0 Expo NYC, 16th September 2008 152
GFS – Google File System
1(a) 2(a)
1(b)
Master
![Page 153: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/153.jpg)
Web 2.0 Expo NYC, 16th September 2008 153
GFS – Google File System
• Client reads metadata from master then file parts from multiple chunkservers
• Designed for big files (>100MB)
• Master server allocates access leases
• Replication is automatic and self repairing– Synchronously for atomicity
![Page 154: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/154.jpg)
Web 2.0 Expo NYC, 16th September 2008 154
GFS – Google File System
• Reading is fast (parallelizable)– But requires a lease
• Master server is required for all reads and writes
![Page 155: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/155.jpg)
Web 2.0 Expo NYC, 16th September 2008 155
MogileFS – OMG Files
• Developed by Danga / SixApart
• Open source
• Designed for scalable web app storage
![Page 156: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/156.jpg)
Web 2.0 Expo NYC, 16th September 2008 156
MogileFS – OMG Files
• Single metadata store (MySQL)– MySQL Cluster avoids SPF
• Multiple ‘tracker’ nodes locate files
• Multiple ‘storage’ nodes store files
![Page 157: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/157.jpg)
Web 2.0 Expo NYC, 16th September 2008 157
MogileFS – OMG Files
Tracker
Tracker
MySQL
![Page 158: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/158.jpg)
Web 2.0 Expo NYC, 16th September 2008 158
MogileFS – OMG Files
• Replication of file ‘classes’ happens transparently
• Storage nodes are not mirrored – replication is piecemeal
• Reading and writing go through trackers, but are performed directly upon storage nodes
![Page 159: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/159.jpg)
Web 2.0 Expo NYC, 16th September 2008 159
Flickr File System
• Developed by Flickr
• Proprietary
• Designed for very large scalable web app storage
![Page 160: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/160.jpg)
Web 2.0 Expo NYC, 16th September 2008 160
Flickr File System
• No metadata store– Deal with it yourself
• Multiple ‘StorageMaster’ nodes
• Multiple storage nodes with virtual volumes
![Page 161: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/161.jpg)
Web 2.0 Expo NYC, 16th September 2008 161
Flickr File System
SM
SM
SM
![Page 162: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/162.jpg)
Web 2.0 Expo NYC, 16th September 2008 162
Flickr File System
• Metadata stored by app– Just a virtual volume number– App chooses a path
• Virtual nodes are mirrored– Locally and remotely
• Reading is done directly from nodes
![Page 163: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/163.jpg)
Web 2.0 Expo NYC, 16th September 2008 163
Flickr File System
• StorageMaster nodes only used for write operations
• Reading and writing can scale separately
![Page 164: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/164.jpg)
Web 2.0 Expo NYC, 16th September 2008 164
Amazon S3
• A big disk in the sky
• Multiple ‘buckets’
• Files have user-defined keys
• Data + metadata
![Page 165: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/165.jpg)
Web 2.0 Expo NYC, 16th September 2008 165
Amazon S3
Servers Amazon
![Page 166: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/166.jpg)
Web 2.0 Expo NYC, 16th September 2008 166
Amazon S3
Servers Amazon
Users
![Page 167: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/167.jpg)
Web 2.0 Expo NYC, 16th September 2008 167
The cost
• Fixed price, by the GB
• Store: $0.15 per GB per month
• Serve: $0.20 per GB
![Page 168: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/168.jpg)
Web 2.0 Expo NYC, 16th September 2008 168
The cost
S3
![Page 169: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/169.jpg)
Web 2.0 Expo NYC, 16th September 2008 169
The cost
S3
Regular Bandwidth
![Page 170: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/170.jpg)
Web 2.0 Expo NYC, 16th September 2008 170
End costs
• ~$2k to store 1TB for a year
• ~$63 a month for 1Mb
• ~$65k a month for 1Gb
![Page 171: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/171.jpg)
Web 2.0 Expo NYC, 16th September 2008 171
12.
Field Work
![Page 172: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/172.jpg)
Web 2.0 Expo NYC, 16th September 2008 172
Real world examples
• Flickr– Because I know it
• LiveJournal– Because everyone copies it
![Page 173: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/173.jpg)
Web 2.0 Expo NYC, 16th September 2008 173
FlickrArchitecture
![Page 174: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/174.jpg)
Web 2.0 Expo NYC, 16th September 2008 174
FlickrArchitecture
![Page 175: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/175.jpg)
Web 2.0 Expo NYC, 16th September 2008 175
LiveJournalArchitecture
![Page 176: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/176.jpg)
Web 2.0 Expo NYC, 16th September 2008 176
LiveJournalArchitecture
![Page 177: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/177.jpg)
Web 2.0 Expo NYC, 16th September 2008 177
Buy my book!
![Page 178: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/178.jpg)
Web 2.0 Expo NYC, 16th September 2008 178
Or buy Theo’s
![Page 179: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/179.jpg)
Web 2.0 Expo NYC, 16th September 2008 179
The end!
![Page 180: Scalable Web Architectures](https://reader035.vdocument.in/reader035/viewer/2022081503/56814295550346895daec3b4/html5/thumbnails/180.jpg)
Web 2.0 Expo NYC, 16th September 2008 180
Awesome!
These slides are available online:
iamcal.com/talks/