software architecture for high traffic website
![Page 1: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/1.jpg)
Software architecture for high traffic website
Case study - Stack Overflow
Presenter: Ngô Xuân Hòa (Novaon Adnetwork - Novanet)
Hanoi .Net Meetup
![Page 2: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/2.jpg)
Contents
About Stack Overflow
● Founders
● Principles
SO architecture
● Beginning
● Restructure #1
● Restructure #2
Open-source Libs
● StackExchange.Redis
● Dapper
● Jil
![Page 3: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/3.jpg)
About Stack Overflow
![Page 4: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/4.jpg)
Founders
Jeff Atwood
Joel Spolsky
![Page 5: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/5.jpg)
2008
Stack Overflow
2009 2010 2011
Server Fault
Stack Exchange 1.0
Stack Exchange 2.0
Stack Overflow Careers
Rome wasn’t built in a day!
![Page 6: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/6.jpg)
● 100+ Q&A Sites
● 600+ million pageviews a month
● 3000+ requests per second
● 16+ million users
● 8+ million questions
● 40+ million answers
![Page 7: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/7.jpg)
Principles
Performance Is a Feature
Cache All the Things!
Reinvention Is OK
![Page 8: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/8.jpg)
Stack Overflow Architecture
![Page 9: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/9.jpg)
Two rounds of restructuring
Stack Exchange 1.0
● ASP.NET MVC
● SQL Server
● LINQ to SQL
● Wikipedia DB Design
Stack Exchange Network
● LINQ to SQL
● HAProxy
● Redis
● Lucene.NET
Scale Up
● Cache everything
● Elastic Search
● Reinvention
![Page 10: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/10.jpg)
Stack Exchange 1.0 Structure
Load balancing: Windows NLB
Web servers: IIS Server, IIS Server
Database: SQL Server
![Page 11: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/11.jpg)
Windows NLB
● Cons:
○ Limited to 8 nodes
○ Cannot detect a failed service
Web-tier
ASP.NET MVC
LINQ to SQL
SQL Server
● All-in-memory
● Full text search
![Page 12: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/12.jpg)
● 16 million pageviews a month
● 3 million unique visitors a month
● 6 million visits a month
![Page 13: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/13.jpg)
Follow none but learn from everyone!
![Page 14: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/14.jpg)
Pros
● Simple
Cons
● Bottleneck: the SQL Server database
● High cost to scale up
![Page 15: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/15.jpg)
Restructure #1 - Stack Exchange Network
HAProxy
Redis Cache
Lucene.NET
Tag Engine
![Page 16: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/16.jpg)
Stack Exchange Network Structure
[Diagram: HAProxy, IIS Servers, Redis, and Database; connections labeled http, http, protobuf, and sql]
![Page 17: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/17.jpg)
Load Balancing
● HAProxy:
○ Runs on Linux
○ Free
Web-tier
ASP.NET MVC 3
LINQ to SQL
jQuery 1.4.5
Lucene.Net
Redis
● In-memory cache
● Master-slave
● Messaging notification
![Page 18: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/18.jpg)
3 Types of Cache
Local Cache
● Use HttpRuntime.Cache
● Cache:
○ User Session
○ View Count
○ ...
Site Cache
● Use Redis
● Cache the site's data:
○ Q&As
○ Acceptance rates
○ ...
Global Cache
● Use Redis
● Cache system data:
○ User info
○ Inbox
○ ...
![Page 19: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/19.jpg)
Update cache flow - Local cache
[Diagram: Local Cache, Redis, DB, and Other sites, connected by arrows numbered 1, 2.1, 2.2, 3, and 4]
1 - On startup: subscribe to invalidation messages from Redis
2.1 - Data changed (by other sites, apps…)
2.2 - Send message to Redis
3 - Redis sends a notification to subscribers
4 - Get data from DB and update the local cache
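The numbered flow above can be sketched in a few lines. This is only an illustration, not Stack Overflow's implementation: `MessageBus` is a hypothetical in-process stand-in for Redis publish/subscribe, a plain dict plays the role of the database, and the channel name `invalidate` is an assumption.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class MessageBus:
    """Hypothetical in-process stand-in for Redis publish/subscribe."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Callable[[str], None]) -> None:
        self._subscribers[channel].append(handler)

    def publish(self, channel: str, message: str) -> None:
        # Step 3: notify every subscriber of the channel.
        for handler in self._subscribers[channel]:
            handler(message)


class WebServer:
    """One web server with its own local cache."""

    def __init__(self, bus: MessageBus, db: Dict[str, str]) -> None:
        self.db = db
        self.local_cache: Dict[str, str] = {}
        # Step 1: on startup, subscribe to invalidation messages.
        bus.subscribe("invalidate", self.on_invalidate)

    def get(self, key: str) -> str:
        if key not in self.local_cache:
            self.local_cache[key] = self.db[key]
        return self.local_cache[key]

    def on_invalidate(self, key: str) -> None:
        # Step 4: reload the changed item from the DB into the local cache.
        self.local_cache[key] = self.db[key]


def change_data(db: Dict[str, str], bus: MessageBus, key: str, value: str) -> None:
    db[key] = value                 # Step 2.1: data changed
    bus.publish("invalidate", key)  # Step 2.2: send message to the bus
```

With two `WebServer` instances sharing one bus, a call to `change_data` refreshes both local caches without either server polling the database.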
![Page 20: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/20.jpg)
Deployment flow with HAProxy
● Tell HAProxy to take the server out of rotation via a POST
● Delay to let IIS finish current requests (~5 sec)
● Stop the website
● Copy files
● Start the website
● Local testing, update local cache, etc…
● Re-enable HAProxy via another POST
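The steps above can be read as a small script. The sketch below only fixes the ordering; every callable passed in (`set_rotation`, `stop_site`, `copy_files`, `start_site`, `smoke_test`) is a hypothetical placeholder, since the slides do not say which HAProxy endpoint or IIS commands are used.

```python
import time
from typing import Callable


def deploy_to_server(
    server: str,
    set_rotation: Callable[[str, bool], None],  # hypothetical: the POST to HAProxy
    stop_site: Callable[[str], None],           # hypothetical: stop the website
    copy_files: Callable[[str], None],          # hypothetical: copy the new build
    start_site: Callable[[str], None],          # hypothetical: start the website
    smoke_test: Callable[[str], bool],          # hypothetical: local testing, cache warm-up
    drain_seconds: float = 5.0,
) -> None:
    """Run the deployment steps for one server, in the order listed on the slide."""
    set_rotation(server, False)   # take the server out of rotation
    time.sleep(drain_seconds)     # let in-flight requests finish
    stop_site(server)
    copy_files(server)
    start_site(server)
    if not smoke_test(server):
        # Leave the server out of rotation if local testing fails.
        raise RuntimeError(f"smoke test failed on {server}")
    set_rotation(server, True)    # put the server back into rotation
```

Raising before the final step is a design choice of this sketch: a server that fails local testing never receives traffic again until someone investigates.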
![Page 21: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/21.jpg)
Pros
● High performance
● Low-cost load balancing (using HAProxy)
● Uses Redis messaging for cache invalidation
Cons
● Too many SQL queries
![Page 22: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/22.jpg)
● 95 million pageviews a month
● 800 requests per second
● 16 million users
![Page 23: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/23.jpg)
Restructure #2 - Scale Up
Cache All the Things
Elastic Search
Reinvention
![Page 24: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/24.jpg)
Stack Exchange Network Structure
[Diagram: HAProxy, Redis, Databases, Elastic Search, and Tag Engine]
![Page 25: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/25.jpg)
5-Level Cache
[Diagram: Network Level, Local Cache, Redis Cache, SQL Server Cache, SSD]
● Network Level: Browser cache…
● Local Cache: HttpRuntime.Cache - Cache all data in memory
● Redis Cache: Cache all data
● SQL Server Cache: Cache all data in memory (the database servers have 384 GB of RAM)
![Page 26: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/26.jpg)
Cache Flow
● Check the local cache
● Otherwise, check the Redis cache and update the local cache
● If the Redis cache doesn’t have the data, fetch it from the database, then update the Redis cache and the local cache
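The lookup order above is a read-through cache with two tiers in front of the database. A minimal sketch follows, assuming a plain dict as a stand-in for Redis and a hypothetical `load_from_db` callable; expiry and invalidation are left out.

```python
from typing import Callable, Dict


class TieredCache:
    """Local cache first, then a shared cache, then the database."""

    def __init__(self, shared: Dict[str, str], load_from_db: Callable[[str], str]) -> None:
        self.local: Dict[str, str] = {}   # per-server memory (HttpRuntime.Cache in the slides)
        self.shared = shared              # assumption: a dict standing in for Redis
        self.load_from_db = load_from_db  # hypothetical database accessor

    def get(self, key: str) -> str:
        # 1. Check the local cache.
        if key in self.local:
            return self.local[key]
        # 2. Otherwise check the shared cache and update the local cache.
        if key in self.shared:
            value = self.shared[key]
            self.local[key] = value
            return value
        # 3. Otherwise fetch from the database, then update both caches.
        value = self.load_from_db(key)
        self.shared[key] = value
        self.local[key] = value
        return value
```

If two `TieredCache` instances share the same `shared` dict, a value loaded by one server is served to the other from the shared tier, so the database is queried once per key.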
![Page 27: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/27.jpg)
Cache All the Things!
![Page 28: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/28.jpg)
Pros
● Very, very fast (<400ms)
● Low server load:
○ IIS: 10-15% CPU usage
○ DB: 10% CPU usage
● 99% of requests served by cache
Cons
● Data has latency
![Page 29: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/29.jpg)
● 95 million pageviews a month
● 800 requests per second
● 16 million users
![Page 30: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/30.jpg)
Open-source Libs
• StackExchange.Redis - high-performance Redis client
• Dapper - a micro ORM - very fast
• Jil - fast JSON serializer
Reinvention is OK!
![Page 31: Software architecture for high traffic website](https://reader034.vdocument.in/reader034/viewer/2022052405/587a16bb1a28abb4238b54cb/html5/thumbnails/31.jpg)
Reference sources
● http://stackoverflow.com
● http://highscalability.com
● http://codinghorror.com
● http://www.joelonsoftware.com
● http://nickcraver.com
● http://josephwoodward.co.uk/2014/02/the-architecture-of-stackoverflow/