Scaling Erlang cluster to 10,000 nodes · 2019-08-14

Maxim Fedorov
Software Engineer @ WhatsApp

WhatsApp User Base Growth
From 200m to 1.5B and growing fast
[Chart: user base by year, 2013 to 2018; vertical axis from 0 to 1600]

WhatsApp Features Development
Not just a simple messenger
• New platforms support
• Voice and video calls
• End-to-end encryption
• WhatsApp Business
• Live location
• … and a few more

Paradigm Shift
Few powerful servers to many tightly packed blades
• Dual Socket Xeon E5-2690, 2.6–3.5 GHz, 128–512 GB RAM
• Xeon-D 1540, 2.0–2.6 GHz, 32 GB RAM
• Dual Skylake-X, 256 GB RAM

Foundation Replacement
• Before: WhatsApp on Erlang R16, FreeBSD, IBM (SoftLayer)
• After: WhatsApp on Erlang R21, Linux

WhatsApp Cluster Size
From just a few to over 10,000
[Chart: cluster size by year, 2013 to 2018]

Erlang Cluster

Erlang Cluster: Fully Connected Mesh
[Diagram: nodes 1 to 5, every node connected to every other]
• 1,500 nodes in a single distribution cluster
• TCP connection maintenance overhead is hardly noticeable after a few tweaks
• Fast startup
Discovery is a Problem

Distributed Process Registry
A good problem to solve
Coordinated
• global
• pg2
• gproc
• s_groups
Eventual
• Riak PG
• cpg
• Syn
• Swarm
• Lasp PG

Coordinated Approach (global)
[Diagram: nodes 1 to 5]
Registration
• Lock boss node
• Lock the rest
• Register process
• Unlock all except boss
• Unlock boss node
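
The registration steps above can be sketched as follows. This is a minimal single-process simulation in Python, not the code of Erlang's global module: the Node class, the register function, the single "registry" lock id, and the rule for picking the boss (highest node name) are all assumptions made for illustration.

```python
class Node:
    """Stand-in for one cluster member: a lock table and a name registry."""

    def __init__(self, name):
        self.name = name
        self.locks = {}      # lock id -> requester holding it
        self.registry = {}   # registered name -> process id

    def try_lock(self, lock_id, requester):
        # Succeeds if the lock is free or already held by the same requester.
        return self.locks.setdefault(lock_id, requester) == requester

    def unlock(self, lock_id, requester):
        if self.locks.get(lock_id) == requester:
            del self.locks[lock_id]


def register(nodes, name, pid, requester):
    """Register `name` on every node, following the five steps on the slide."""
    boss = max(nodes, key=lambda n: n.name)  # assumption: any agreed-upon rule works
    rest = [n for n in nodes if n is not boss]
    lock_id = "registry"                     # simplification: one cluster-wide lock
    if not boss.try_lock(lock_id, requester):     # 1. lock boss node
        return False                              # contended: caller retries later
    locked = []
    try:
        for n in rest:                            # 2. lock the rest
            if not n.try_lock(lock_id, requester):
                return False
            locked.append(n)
        if any(name in n.registry for n in nodes):
            return False                          # name is already taken
        for n in nodes:                           # 3. register process
            n.registry[name] = pid
        return True
    finally:
        for n in locked:                          # 4. unlock all except boss
            n.unlock(lock_id, requester)
        boss.unlock(lock_id, requester)           # 5. unlock boss node


cluster = [Node(f"node{i}") for i in range(1, 6)]
ok = register(cluster, "session_manager", "pid-1", requester="node2")
dup = register(cluster, "session_manager", "pid-2", requester="node4")
```

In this sketch competing registrations meet at the boss node first, so neither can end up holding only part of the cluster. Every registration still touches every node, which hints at why a coordinated registry gets expensive as the cluster grows.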

Distributed Hash Tables
• Consistent Hashing
• CRDT
• Kademlia
[Diagrams: one panel per approach, node labels only]
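
Of the approaches named on this slide, consistent hashing is the easiest to show in a few lines. The sketch below is a generic textbook ring in Python, not anything taken from the talk: the HashRing class, the use of MD5, the 64 virtual points per node, and the node and key names are all assumptions.

```python
import bisect
import hashlib


def _point(value):
    """Map a string to a ring position (first 8 bytes of its MD5 digest)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring; each node owns `vnodes` points on the ring."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes
        self._points = []   # sorted ring positions
        self._owner = {}    # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.vnodes):
            p = _point(f"{node}#{i}")
            self._owner[p] = node
            bisect.insort(self._points, p)

    def lookup(self, key):
        # First ring point clockwise from the key's position, wrapping around.
        i = bisect.bisect_right(self._points, _point(key))
        return self._owner[self._points[i % len(self._points)]]


ring = HashRing(["node1", "node2", "node3", "node4"])
keys = [f"process-{n}" for n in range(1000)]
before = {k: ring.lookup(k) for k in keys}
ring.add("node5")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

The property that makes the scheme attractive for a registry is visible in `moved`: when a node joins, the only keys that change owner are the ones the new node takes over.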

Keep It Simple
More than one process registry
• Centralised store for high-rate registrations: session manager
• Globally replicated state for rare changes: pg2

Session Manager
Central storage of phone-to-node mapping
[Diagram: chat cluster (chat 1 to chat 10) and session cluster (session 1 to session 4)]

pg2 for Service Discovery
[Diagram: chat cluster (chat 1 <...> chat 8) and session cluster]
• session 1: {session, 1}, {session, 2}
• session 2: {session, 3}, {session, 4}
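
The way the two registries fit together can be sketched in a few lines of Python. The group membership below mirrors the slide; everything else is an assumption made for illustration: the CRC32-modulo choice of partition, the helper names, and the phone number.

```python
import zlib

# Group name -> host, as drawn on the slide (the role pg2 plays:
# rarely changing, replicated everywhere).
GROUPS = {
    ("session", 1): "session 1",
    ("session", 2): "session 1",
    ("session", 3): "session 2",
    ("session", 4): "session 2",
}

# One phone-to-chat-node table per session partition (the session manager:
# high-rate registrations, stored centrally).
tables = {group: {} for group in GROUPS}


def partition_for(phone):
    """Choose a partition for a phone number. CRC32 modulo is an assumption."""
    return ("session", zlib.crc32(phone.encode()) % len(GROUPS) + 1)


def login(phone, chat_node):
    """Record which chat node currently holds the phone's connection."""
    tables[partition_for(phone)][phone] = chat_node


def locate(phone):
    """Return (session host to ask, chat node or None) for a phone number."""
    group = partition_for(phone)
    return GROUPS[group], tables[group].get(phone)


login("15550100", "chat 3")
host, chat_node = locate("15550100")
```

A chat node only needs the small group table to find the right session host; the large and fast-changing phone-to-node mapping lives on the session hosts alone.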

WhatsApp Meta-cluster

Meta-clustering
[Diagram: clusters labelled chat, offline, session, contacts, notifications, groups]

Limits Are Still There
pg2 scaling is limited, but limits can be pushed further away
• Denormalise pg2_table for fast access to local and remote group members
• Apply ‘boss node’ algorithm to pg2
• Add monitoring for local processes
• … Introduce hidden (non-transitive) pg2 membership
• Pushed from 32 partitions to hundreds

wandist: Extending Erlang Distribution
Connecting disjoint Erlang clusters
• SSL support
• SOCKS proxy support
• Delivery confirmation
• Standby connections
• Maintain non-transitive pg2 lists
• Compatibility (R16 <-> R21)
[Diagram: group lists held by each cluster]
• chat: {session, 1}, {account, 1}, {group, 1}
• session: {session, 1}
• account: {account, 1}
• group: {group, 1}

Challenges
• I/O scaling – going from kqueue to epoll
  Upgrade to Erlang R21
• Routing performance – pg2 concurrent updates
  Reduce contention
• Long-range communications – increased latency
  Test with injected latency
  Absorb latency with increased concurrency
• SSL performance – handshake bottleneck
  Reduce contention
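
The "absorb latency with increased concurrency" point can be made concrete with Little's law: the average number of requests in flight equals the request rate multiplied by the time each request takes. The short Python sketch below applies it; the request rate and the two round-trip times are hypothetical numbers, not figures from the talk.

```python
import math


def required_in_flight(requests_per_second, round_trip_ms):
    """Little's law: requests in flight = arrival rate x time per request."""
    return math.ceil(requests_per_second * round_trip_ms / 1000)


# Hypothetical link carrying 10,000 requests per second.
local = required_in_flight(10_000, 1)     # 1 ms round trip inside a datacentre
remote = required_in_flight(10_000, 100)  # 100 ms round trip between regions
```

With these assumed numbers the same throughput needs 10 outstanding requests locally and 1,000 across the long link, so worker and connection pools have to grow in proportion to the added latency.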

Bits & Bolts

Diagnostic Tools
• Built-in inspection: process_info, statistics, system_info
• MSACC – microstate accounting (with extra accounting on)
• Lock-counting BEAM
• gdb (with etp-commands)
• BPF/BCC
• fprof, valgrind
• Erlang OTP source code!

Microstate Accounting
When things go wrong

Lock Counting

Lock Counting & Source Code
Meaningful lock name

Lock Counting
Name is already there

BCC (BPF Compiler Collection)
Trace large reads

gdb + etp-commands
Not necessarily post-mortem

Erlang OTP is getting better
Some original WhatsApp patches are no longer in use
• GC Throttling -> Off-Heap message queue
• prim_file patches -> built-in NIF-based file I/O
• TLS 1.2 support (cipher suite selection)
• HW-accelerated crypto
• public_key: PKCS8 support, certificate verification, SNI
• Hashing clashes (ETS to mnesia)
• Bugfixes!

But Not There Yet
Recently added and reworked patches
• SSL/TLS handshake acceleration, PEM cache validation
• inet_db: race condition during .hosts file reload
• prim_inet: race/suboptimal accept() behaviour
• prepend send (stuck worker detection, TTL)
• flush process message queue
• process message/signal queue stats

But Not There Yet
Recently added and reworked patches
• system monitoring (signal queues, rpc tracing)
• httpc_client TLS upgrade timeout
• wider lock tables (check IO, ETS meta)
• worker pools, dispatcher pools
• convenience patches (noisy logging suppression, default eunit timeouts, listen backlog queue sizes, pretty printing, shell history, supervisor ‘ETS-TRANSFER’)

Embrace Open Source community
• Upstream our patches
• Be Open!

Questions?
Maxim Fedorov
dane@whatsapp.com
GitHub: max-au

Paradigm Shift
• Per-node monitoring -> cluster health
• Years of uptime -> simple & reliable restart
• Trigger-based alerts -> level-based
• Local configuration -> global
• Aggregates and centralised logging