Getting real with Erlang: from the idea to a live system
DESCRIPTION
From 0 to 1,000,000 daily users with Erlang. This talk sums up our experience building and running our first social game server, written entirely in Erlang. The game is now live, so it is time to look at the results. We describe the details of our architecture, the principles that drove its design, and our considerations about hosting the infrastructure. We go in detail through what worked well and what needs to be improved, all backed by data from our live system. Come and follow us if you want a taste of real-world Erlang.
TRANSCRIPT
Getting real with erlang
From the idea to a live system
Knut Nesheim @knutin
Paolo Negri @hungryblank
Thursday, November 3, 2011
Social Games
Flash client (game), HTTP API
http://www.flickr.com/photos/theplanetdotcom/4879421344
Social Games
HTTP API
• @ 1 000 000 daily users
• 5000 HTTP reqs/sec
• more than 90% writes
November 2010
• At the erlang user group!
• Learn where/how erlang is used
• 0 lines of erlang code across all @wooga code base
Why look into erlang?
HTTP API
• @ 1 000 000 daily users
• 5000 HTTP reqs/sec
• around 60000 queries/sec
60000 qps
• Most maintenance effort in databases
• mix of SQL / NoSQL
• Already using in-RAM data stores (Redis)
• RAM DBs are fast but expensive / high maintenance
Social games data
• User data self contained
• Strong hot/cold pattern - gaming session
• Heavy write load (caching is ineffective)
User session
1. start session
2. game actions
3. end session
User data
1. load all
2. read/update many times
3. data becomes cold
User session
1. start session
2. game actions
3. end session
Erlang process
1. start (load state)
2. responds to messages (use state)
3. stop (save state)
User session DB usage

| User session | Stateless server | Stateful server (Erlang) |
| --- | --- | --- |
| start session | load user state | load user state |
| game actions | many queries | |
| end session | | save user state |
Erlang process
• Follows session lifecycle
• Contains and defends state (data)
• Acts as a lock/serializer (one message at a time)
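The three properties above can be sketched with a plain Erlang process. This is a minimal illustration, not the code from the talk: the module name `session_sketch`, the `add_coins` action and the `load_state/1` and `save_state/2` stubs are all assumptions made for the example.

```erlang
-module(session_sketch).
-export([start/1, action/2, stop/1]).

%% Hypothetical stand-ins for the datastore; a real system would do I/O here.
load_state(_Uid) -> #{coins => 0}.
save_state(_Uid, State) -> State.

%% Start: the process loads the user's state once and keeps it in memory.
start(Uid) ->
    spawn(fun() -> loop(Uid, load_state(Uid)) end).

%% Send one game action and wait for the reply.
action(Pid, Action) ->
    Ref = make_ref(),
    Pid ! {action, self(), Ref, Action},
    receive {Ref, Reply} -> Reply after 1000 -> timeout end.

%% Stop: the process saves its state and exits.
stop(Pid) ->
    Ref = make_ref(),
    Pid ! {stop, self(), Ref},
    receive {Ref, Final} -> Final after 1000 -> timeout end.

%% The mailbox is drained one message at a time, so updates to State
%% are serialized without any explicit lock inside the process.
loop(Uid, State) ->
    receive
        {action, From, Ref, {add_coins, N}} ->
            #{coins := Coins} = State,
            NewState = State#{coins := Coins + N},
            From ! {Ref, {ok, NewState}},
            loop(Uid, NewState);
        {stop, From, Ref} ->
            From ! {Ref, save_state(Uid, State)}
    end.
```

Only the process itself can touch `State`, which is what "contains and defends state" amounts to in this sketch.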
December 2010
• Prototype
[Diagram: a grid of boxes labelled user1, user2, ... userN]
erlang process = user session
January 2011
• erlang team goes from 1 to 2 developers
Open topics
• Distribution/clustering
• Error handling
• Deployments
• Operations
Architecture
Architecture goals
• Move data into the application server
• Be as simple as possible
• Graceful degradation when DBs go down
• Easy to inspect and repair state of cluster
• Easy to add more machines for scaling out
[Architecture diagram, built up over slides 16 to 21: Sessions grouped inside Workers, plus Coordinators, a Lock and the DBs]
New user comes online
• Flash calls "setup"
• session:start(Uid) on suitable worker
• lock:acquire(Uid) (test-and-set)
• s3:get(Uid)
• Game actions from Flash
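The test-and-set step in `lock:acquire(Uid)` can be illustrated with `ets:insert_new/2`, which only inserts when the key is absent. This is a sketch under assumptions: the talk does not say how its lock is implemented, and the module name `lock_sketch`, the local ETS table and the extra `Owner` argument are invented for the example.

```erlang
-module(lock_sketch).
-export([init/0, acquire/2, release/2]).

%% Create the table holding one {Uid, Owner} entry per locked user.
%% A local ETS table stands in for the separate lock service on the slide.
init() ->
    ets:new(session_locks, [named_table, public, set]),
    ok.

%% Test-and-set: insert_new/2 succeeds only if no entry exists for Uid,
%% so at most one caller can win the lock for a given user.
acquire(Uid, Owner) ->
    case ets:insert_new(session_locks, {Uid, Owner}) of
        true -> ok;
        false -> {error, already_locked}
    end.

%% Only the current owner may release. The lookup and delete are two
%% separate steps here, which is acceptable for a sketch.
release(Uid, Owner) ->
    case ets:lookup(session_locks, Uid) of
        [{Uid, Owner}] ->
            ets:delete(session_locks, Uid),
            ok;
        _ ->
            {error, not_owner}
    end.
```

With a lock like this, a second worker that tries to start a session for the same `Uid` gets `{error, already_locked}` instead of loading a second copy of the user's state.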
User goes offline (session times out)
• lock:release/1
• s3:put/1
• gen_server timeout
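The session timeout can be sketched with the standard `gen_server` idle timeout: when a callback returns a timeout value and no message arrives within it, the server receives `timeout` in `handle_info/2`. The module name `session_server` and the `persist/2` and `release_lock/1` stubs are assumptions; they stand in for `s3:put/1` and `lock:release/1`, whose real signatures the slides do not show in full.

```erlang
-module(session_server).
-behaviour(gen_server).
-export([start/2, action/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, terminate/2]).

%% Start a session that stops itself after IdleMs without messages.
start(Uid, IdleMs) ->
    gen_server:start(?MODULE, {Uid, IdleMs}, []).

action(Pid, Action) ->
    gen_server:call(Pid, Action).

init({Uid, IdleMs}) ->
    {ok, #{uid => Uid, idle_ms => IdleMs, actions => 0}, IdleMs}.

%% Every reply re-arms the idle timeout.
handle_call(_Action, _From, State = #{actions := N, idle_ms := IdleMs}) ->
    {reply, ok, State#{actions := N + 1}, IdleMs}.

handle_cast(_Msg, State = #{idle_ms := IdleMs}) ->
    {noreply, State, IdleMs}.

%% No message within IdleMs: the user is considered offline.
handle_info(timeout, State) ->
    {stop, normal, State};
handle_info(_Other, State = #{idle_ms := IdleMs}) ->
    {noreply, State, IdleMs}.

%% Called on stop: save the state, then give up the lock.
terminate(_Reason, State = #{uid := Uid}) ->
    persist(Uid, State),
    release_lock(Uid),
    ok.

%% Hypothetical stubs standing in for s3:put/1 and lock:release/1.
persist(_Uid, _State) -> ok.
release_lock(_Uid) -> ok.
```

Saving before releasing the lock is a choice made in this sketch, so that no other worker can load the user's data before the latest state has been written.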
Implementation of game logic
Dream game logic: Fast + Safe
Dream game logic
• We want high throughput (for scaling)
– Try to spend as little CPU time as possible
– Avoid heavy computation
– Try to avoid creating garbage
• ...and simple and testable logic (correctness)
– Functional game logic makes thinking about code easy
– Single entry point, gets "request" and game state
– Code for happy case, roll back on game-related exception
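The "single entry point" and "roll back on exception" ideas can be sketched as a pure function from request and state to reply and state. The module name `game_logic_sketch`, the `buy` request and the `game_error` exception shape are assumptions made for illustration, not the talk's actual game logic.

```erlang
-module(game_logic_sketch).
-export([handle/2]).

%% Single entry point: takes a request and the current game state and
%% returns {Reply, StateToKeep}.
handle(Request, State) ->
    try apply_request(Request, State) of
        NewState -> {ok, NewState}
    catch
        throw:{game_error, Reason} ->
            %% State is immutable, so rolling back simply means
            %% keeping the value we were given.
            {{error, Reason}, State}
    end.

%% Happy-case code: rule violations are thrown, not threaded through.
apply_request({buy, Item, Price}, State = #{coins := Coins, items := Items}) ->
    Coins >= Price orelse throw({game_error, not_enough_coins}),
    State#{coins := Coins - Price, items := [Item | Items]};
apply_request(Unknown, _State) ->
    throw({game_error, {unknown_request, Unknown}}).
```

Because nothing is mutated in place, a failed request leaves no partial update behind, which also makes the function easy to test without a running session.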
How we avoid using CPU
• Remove need for DB serialization by storing data in process
• Game is designed to avoid heavy lifting in the backend, very simple game logic
• Optimize hot parts on the critical path, like collision detection, regrowing of forest
• Generate erlang modules for read-heavy configuration (~1 billion reads/1 write per week)
• Use NIFs for parsing JSON (jiffy)
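One possible way to generate a module for read-heavy configuration is to build the source text at runtime, compile it with `compile:forms/1` and load the result. The slides do not show how the authors did it, so this is a sketch; the module name `config_gen`, the generated function name `lookup/1` and the example keys are assumptions.

```erlang
-module(config_gen).
-export([compile/2]).

%% Turn [{Key, Value}] into a module with one lookup/1 clause per key,
%% then compile and load it. Reads become plain function calls.
compile(Module, Pairs) ->
    Clauses = [io_lib:format("lookup(~p) -> ~p;~n", [K, V]) || {K, V} <- Pairs],
    Source = lists:flatten([
        io_lib:format("-module(~p).~n-export([lookup/1]).~n", [Module]),
        Clauses,
        "lookup(_) -> undefined.\n"
    ]),
    {ok, Module, Binary} = compile:forms(parse_forms(Source)),
    %% Drop old code from a previous generation before loading the new one.
    code:purge(Module),
    {module, Module} = code:load_binary(Module, atom_to_list(Module) ++ ".erl", Binary),
    ok.

%% Scan the source and parse it one form (dot-terminated chunk) at a time.
parse_forms(Source) ->
    {ok, Tokens, _End} = erl_scan:string(Source),
    [begin {ok, Form} = erl_parse:parse_form(Ts), Form end
     || Ts <- split_forms(Tokens, [], [])].

split_forms([], _Cur, Acc) ->
    lists:reverse(Acc);
split_forms([{dot, _} = Dot | Rest], Cur, Acc) ->
    split_forms(Rest, [], [lists:reverse([Dot | Cur]) | Acc]);
split_forms([Token | Rest], Cur, Acc) ->
    split_forms(Rest, [Token | Cur], Acc).
```

For example, `config_gen:compile(game_config, [{max_level, 50}])` makes `game_config:lookup(max_level)` available. The write path (recompiling) is slow, which fits a workload with about one write per week.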
How to find where CPU is used
• Profile (eprof, fprof, kprof[1])
• Measure garbage collection (process_info/1, gcprof[2])
• Conduct experiment: change code, measure, repeat
• Is the increased performance worth the increase in complexity?
• Sometimes a radically different approach is needed...
[1]: github.com/knutin/kprof
[2]: github.com/knutin/gcprof
Operations
August 2011 - ...
Operations
• At wooga, developers also operate the game
• Most developers are ex-sysadmins
• Simple tools:
– remsh for deployments, maintenance, debugging
– automation with chef
– syslog, tail, cut, awk, grep
– verbose crash logs (SASL)
– alarms only when something really bad happens
Deployments
• Goal: upgrade without annoying users
• Soft purge
• Set system quiet (webserver & coordinator)
• Reload
• Open the flood gates
• Migrate process memory state on-demand
• Total time not answering game requests: < 1s
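Two of the steps above map onto standard calls in the `code` module: `code:soft_purge/1` and `code:load_file/1`. The sketch below shows those calls plus one way to migrate state lazily. The module name `deploy_sketch` and the versioned `{state, Version, ...}` tuples are assumptions; the slides do not show the actual deployment scripts or state format.

```erlang
-module(deploy_sketch).
-export([reload/1, migrate/1]).

%% Soft purge, then reload. soft_purge/1 refuses to remove old code that
%% a process is still running, so no session is killed by the upgrade.
reload(Module) ->
    case code:soft_purge(Module) of
        true -> code:load_file(Module);
        false -> {error, old_code_in_use}
    end.

%% On-demand migration: a session upgrades its in-memory state the first
%% time new code touches it. The state shapes here are hypothetical.
migrate({state, 1, Coins}) ->
    migrate({state, 2, Coins, []});
migrate({state, 2, _Coins, _Items} = Current) ->
    Current.
```

Migrating lazily avoids a stop-the-world pass over every live session at deploy time, which is consistent with the sub-second pause quoted on the slide.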
How we know what’s going on
• Event logging to syslog
– Session start, session end (process memory, gc, game stats)
– Game-related exceptions
• Latency measurement within the app
• Use munin to pull overall server stats
– CPU and memory usage by beam.smp
– Memory used by processes, ets tables, etc.
– Throughput of app, dbs
– Throughput of coordinator, workers, lock
How we know what's going on
[Graph] Game error: the game action is not allowed with the current server state and configuration
[Graph] Half-word emulator
[Four more graph slides; no captions survive]
Conclusions
Conclusions - database
[Bar chart of queries/sec, axis 0 to 30000, comparing Ruby Stateless with Erlang Stateful; the only value label that survives is 700]
Conclusions
DB maintenance
AWS S3 as a main datastore
one document/user
0 maintenance/setup cost
Redis for the derived data (leaderboard etc.)
load very far from Redis max capacity
Conclusions
data locality
• average game call < 1ms
• no db roundtrip at every request
• no need for low latency network
• efficient setup for cloud environment
Conclusions
data locality
• finally CPU bound
• no CPU time for serializing/deserializing data from db
• CPU only busy transforming data (minimum possible activity)
Conclusions
CPU usage
• 300K daily users
• 1000 HTTP req/sec (game actions)
• 4 m1.large AWS instances (dual core 8GB RAM)
• 2 instances (coordinators) 5% CPU load
• 2 instances (workers) 20% CPU load
Conclusions
extra benefits
entire user state in one process + immutability = transactional behavior
Conclusions
extra benefits
One user session -> one erlang process
The erlang VM is aware of processes
=> the erlang VM is aware of user sessions
Conclusions
Thanks to VM process introspection
process reductions -> cost of a game action
process used memory -> memory used by session
We gained a lot of knowledge about a fundamental "business" entity
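Both measurements are available through `erlang:process_info/2`. A minimal sketch, assuming a hypothetical module `introspect_sketch`; how the authors actually collected these numbers is not shown in the slides.

```erlang
-module(introspect_sketch).
-export([cost/1, session_memory/1]).

%% Reductions consumed by the calling process while running Fun:
%% a rough, VM-level measure of what a game action costs.
cost(Fun) ->
    {reductions, Before} = erlang:process_info(self(), reductions),
    Result = Fun(),
    {reductions, After} = erlang:process_info(self(), reductions),
    {Result, After - Before}.

%% Memory in bytes currently used by a session process.
session_memory(Pid) ->
    case erlang:process_info(Pid, memory) of
        {memory, Bytes} -> {ok, Bytes};
        undefined -> {error, not_alive}
    end.
```

Since one process is one user session, the per-process numbers are per-session numbers without any extra bookkeeping in the application.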
Conclusions
• a radical change was made possible by a radically different tool (erlang)
• erlang can be good for data intensive/high throughput applications
• stateful is not necessarily hard/dangerous/unmaintainable
Conclusions
• November 2010: 0 lines of erlang @wooga
• November 2011: 1 erlang game server live
...with more erlang coming, join us
Q&A
Knut Nesheim @knutin
Paolo Negri @hungryblank
http://wooga.com/jobs