Building Distributed Applications with Apache Zookeeper Alex Ehrnschwender | Game Server Engineer at DeNA

Post on 16-Jul-2015

TRANSCRIPT

Page 1: Distributed Applications with Apache Zookeeper

Building Distributed Applications with Apache Zookeeper

Alex Ehrnschwender | Game Server Engineer at DeNA

Page 2: Distributed Applications with Apache Zookeeper

What is Zookeeper?

“ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.”

Zookeeper Wiki

Page 3: Distributed Applications with Apache Zookeeper

ZooKeeper: A Coordination Service for Distributed Applications

Coordination & synchronization for distributed processes

Logical namespacing implemented by a hierarchy (tree) of znodes

Replicated in-memory over multiple hosts for reliability, availability, and performance

Simple API of CRUD & basic tree operations for client integration

Page 4: Distributed Applications with Apache Zookeeper

Zookeeper: Reliability & Consistency

Distributed ensemble with automatic leader election through quorum

Replicated in-memory on every instance with snapshot writes to disk

Client TCP connection maintained to any node with failover support

Guaranteed atomicity & sequential consistency
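The quorum requirement can be made concrete with a little arithmetic. The sketch below assumes ZooKeeper's default majority quorum; the class and method names are hypothetical helpers, not part of the ZooKeeper API.

```java
// Toy arithmetic only; QuorumMath is a hypothetical helper, not a ZooKeeper class.
final class QuorumMath {

    /** Smallest number of servers that forms a strict majority of the ensemble. */
    static int quorumSize(int ensembleSize) {
        if (ensembleSize < 1) {
            throw new IllegalArgumentException("ensemble must have at least one server");
        }
        return ensembleSize / 2 + 1;
    }

    /** Number of servers that can fail while a majority is still reachable. */
    static int toleratedFailures(int ensembleSize) {
        return ensembleSize - quorumSize(ensembleSize);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 7; n++) {
            System.out.printf("ensemble=%d quorum=%d tolerated failures=%d%n",
                    n, quorumSize(n), toleratedFailures(n));
        }
    }
}
```

Under this assumption a 4-node ensemble tolerates no more failures than a 3-node one, which is why ensembles are usually sized with an odd number of servers.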

Page 5: Distributed Applications with Apache Zookeeper

Zookeeper: Watches & Ephemeral nodes

Each znode carries a stat structure consisting of version numbers (version, cversion, aversion) & timestamps (ctime, mtime)

Watches

● Client-initiated subscriptions to znodes

● Changes to a watched znode trigger notification to subscribed clients

Ephemeral Nodes

● Backed by a client session and deleted when client session ends

● Cannot have children
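To illustrate the two behaviors above, here is a toy in-memory model, not the ZooKeeper client API. It assumes watches are one-shot (a watch fires once and must be re-registered, as in ZooKeeper's standard watch mechanism) and that ending a session removes that session's ephemeral nodes. Every class and method name here is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Toy model for illustration; names are hypothetical, not the ZooKeeper API.
final class ToyZnodeStore {
    private final Map<String, String> data = new HashMap<>();          // path -> data
    private final Map<String, Long> ephemeralOwner = new HashMap<>();  // path -> session id
    private final Map<String, List<Consumer<String>>> watches = new HashMap<>();

    void createPersistent(String path, String value) {
        data.put(path, value);
    }

    void createEphemeral(String path, String value, long sessionId) {
        data.put(path, value);
        ephemeralOwner.put(path, sessionId);
    }

    /** Registers a one-shot watch: it fires on the next change to path, then is gone. */
    void watch(String path, Consumer<String> callback) {
        watches.computeIfAbsent(path, p -> new ArrayList<>()).add(callback);
    }

    void setData(String path, String value) {
        if (!data.containsKey(path)) {
            throw new IllegalStateException("no node: " + path);
        }
        data.put(path, value);
        fire(path, "changed");
    }

    /** Ending a session deletes every ephemeral node that session created. */
    void closeSession(long sessionId) {
        List<String> owned = new ArrayList<>();
        ephemeralOwner.forEach((path, owner) -> {
            if (owner == sessionId) {
                owned.add(path);
            }
        });
        for (String path : owned) {
            ephemeralOwner.remove(path);
            data.remove(path);
            fire(path, "deleted");
        }
    }

    boolean exists(String path) {
        return data.containsKey(path);
    }

    private void fire(String path, String event) {
        // Remove the callbacks before invoking them: this is what makes a watch one-shot.
        List<Consumer<String>> callbacks = watches.remove(path);
        if (callbacks != null) {
            callbacks.forEach(cb -> cb.accept(event + " " + path));
        }
    }

    public static void main(String[] args) {
        ToyZnodeStore store = new ToyZnodeStore();
        store.createEphemeral("/workers/w1", "host-a", 42L);
        store.watch("/workers/w1", System.out::println);
        store.closeSession(42L); // the watcher is told the worker's node is gone
    }
}
```

The ephemeral-node-plus-watch combination is what lets other clients detect a crashed worker without polling.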

Page 6: Distributed Applications with Apache Zookeeper

Zookeeper: But… why?

“Because of the difficulty of implementing these kinds of services, applications initially usually skimp on them, which make them brittle in the presence of change and difficult to manage. Even when done correctly, different implementations of these services lead to management complexity when the applications are deployed.”

Zookeeper Wiki

Page 7: Distributed Applications with Apache Zookeeper

Zookeeper: Advantages for Backing a Server Cluster

Server workers can become cluster-aware

So much out-of-the-box that would be duplicated with a custom solution

Extremely fast reads (performs best in read-dominant workloads, at around a 10:1 read-to-write ratio)

Small footprint - An ensemble of only 5-7 zk instances can serve the coordination needs of several large production applications

Centralized event broadcasting & failure detection (heartbeat)

Page 8: Distributed Applications with Apache Zookeeper

Zookeeper: Common Use Cases

● Configuration Management

● Service Discovery

● Distributed Cloud-Based File Systems

● Internal DNS Management

● Master (Leader) Election and Voting

● Messaging Queue

● Event Broadcasting & Notification
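As one example from the list above, leader election is commonly built on ephemeral sequential znodes: each candidate creates one, the candidate with the lowest sequence number leads, and every other candidate watches only the node just ahead of it. The sketch below covers just the selection logic over a list of child names, assuming ZooKeeper's 10-digit zero-padded sequence suffix. The "member-" prefix and all class and method names are hypothetical, and counter overflow is ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Selection logic only; in a real client the child names would come from getChildren().
final class ElectionSketch {

    /** Parses the trailing 10-digit sequence number of a sequential znode name. */
    static int sequenceOf(String childName) {
        return Integer.parseInt(childName.substring(childName.length() - 10));
    }

    static List<String> sortedBySequence(List<String> children) {
        List<String> sorted = new ArrayList<>(children);
        sorted.sort(Comparator.comparingInt(ElectionSketch::sequenceOf));
        return sorted;
    }

    /** The candidate holding the lowest sequence number is the leader. */
    static String leader(List<String> children) {
        return sortedBySequence(children).get(0);
    }

    /** The node a non-leader should watch: its immediate predecessor. Empty for the leader. */
    static Optional<String> nodeToWatch(List<String> children, String me) {
        List<String> sorted = sortedBySequence(children);
        int index = sorted.indexOf(me);
        if (index < 0) {
            throw new IllegalArgumentException("not a candidate: " + me);
        }
        return index == 0 ? Optional.empty() : Optional.of(sorted.get(index - 1));
    }

    public static void main(String[] args) {
        List<String> children = Arrays.asList(
                "member-0000000007", "member-0000000003", "member-0000000005");
        System.out.println("leader: " + leader(children));
        System.out.println("member-0000000007 watches: "
                + nodeToWatch(children, "member-0000000007").orElse("nothing"));
    }
}
```

Watching only the predecessor, rather than the leader's node, avoids waking every candidate at once when the leader's session ends.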

Page 9: Distributed Applications with Apache Zookeeper

Use Case Example #1 - Managing Redis Shards

Page 10: Distributed Applications with Apache Zookeeper

ZK Use Case Example #1 - Pinterest

Pinterest stores their entire follower model inside sharded Redis instances (~9000 Redis shards, multiple instances per core)

Shard configuration is stored and managed by Zookeeper

Client lookups and watches for shard location & subsequent data retrieval

Master-slave failover triggers updates to znode representation (slave address replaces master)

Vertical splitting of data broadcast to watching clients

Page 11: Distributed Applications with Apache Zookeeper

Use Case Example #2 - HBase Cluster Configuration

Page 12: Distributed Applications with Apache Zookeeper

Code Examples

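    // Assumes a field `zk` holding a connected org.apache.zookeeper.ZooKeeper client.

    // Joins a group by creating an ephemeral znode /<groupName>/<memberName>.
    // The znode disappears automatically when this client's session ends.
    public void join(String groupName, String memberName)
            throws KeeperException, InterruptedException {
        String path = "/" + groupName + "/" + memberName;
        String createdPath = zk.create(path,
                null /* data */,
                ZooDefs.Ids.OPEN_ACL_UNSAFE,
                CreateMode.EPHEMERAL);
        System.out.println("Created " + createdPath);
    }

    // Creates the group itself as a persistent znode /<groupName>.
    public void create(String groupName)
            throws KeeperException, InterruptedException {
        String path = "/" + groupName;
        String createdPath = zk.create(path,
                null /* data */,
                ZooDefs.Ids.OPEN_ACL_UNSAFE,
                CreateMode.PERSISTENT);
        System.out.println("Created " + createdPath);
    }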

Page 13: Distributed Applications with Apache Zookeeper

Code Examples (cont.)

    // Deletes a group: children first, then the parent znode.
    // A version of -1 matches any version of the znode.
    public void delete(String groupName)
            throws KeeperException, InterruptedException {
        String path = "/" + groupName;
        try {
            List<String> children = zk.getChildren(path, false);
            for (String child : children) {
                zk.delete(path + "/" + child, -1); /* child */
            }
            zk.delete(path, -1); /* parent */
        } catch (KeeperException.NoNodeException e) {
            System.out.printf("Group %s does not exist\n", groupName);
        }
    }

    // Prints the members of a group without setting a watch (second argument false).
    public void list(String groupName)
            throws KeeperException, InterruptedException {
        String path = "/" + groupName;
        try {
            List<String> children = zk.getChildren(path, false);
            for (String child : children) {
                System.out.println(child);
            }
        } catch (KeeperException.NoNodeException e) {
            System.out.printf("Group %s does not exist\n", groupName);
        }
    }

Page 14: Distributed Applications with Apache Zookeeper

Performance

[Charts not reproduced: Standalone (ops/sec) and 3-Node Ensemble (ops/sec)]

Reference: https://wiki.apache.org/hadoop/ZooKeeper/ServiceLatencyOverview

Page 15: Distributed Applications with Apache Zookeeper

Sample Configuration (zoo.cfg)

tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888
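The limits in this file are expressed in ticks, so the effective timeouts depend on tickTime. The sketch below parses the sample with java.util.Properties and derives the millisecond values. It assumes ZooKeeper's documented defaults for session timeout bounds (2 and 20 ticks) when minSessionTimeout and maxSessionTimeout are not set. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper that derives timeouts from the sample zoo.cfg above.
final class ZooCfgTimings {
    static final String SAMPLE = String.join("\n",
            "tickTime=2000",
            "dataDir=/var/lib/zookeeper",
            "clientPort=2181",
            "initLimit=5",
            "syncLimit=2",
            "server.1=zoo1:2888:3888",
            "server.2=zoo2:2888:3888",
            "server.3=zoo3:2888:3888");

    static Properties parse(String cfg) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(cfg));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    private static int intOf(Properties props, String key) {
        return Integer.parseInt(props.getProperty(key).trim());
    }

    /** Time a follower has to connect to and sync with the leader at startup. */
    static int initLimitMillis(Properties props) {
        return intOf(props, "initLimit") * intOf(props, "tickTime");
    }

    /** How far a follower may lag behind the leader before it is dropped. */
    static int syncLimitMillis(Properties props) {
        return intOf(props, "syncLimit") * intOf(props, "tickTime");
    }

    /** Assumed default lower bound on session timeouts: 2 ticks. */
    static int defaultMinSessionTimeoutMillis(Properties props) {
        return 2 * intOf(props, "tickTime");
    }

    /** Assumed default upper bound on session timeouts: 20 ticks. */
    static int defaultMaxSessionTimeoutMillis(Properties props) {
        return 20 * intOf(props, "tickTime");
    }

    public static void main(String[] args) {
        Properties props = parse(SAMPLE);
        System.out.println("initLimit ms: " + initLimitMillis(props));
        System.out.println("syncLimit ms: " + syncLimitMillis(props));
        System.out.println("session timeout bounds ms: "
                + defaultMinSessionTimeoutMillis(props) + " to "
                + defaultMaxSessionTimeoutMillis(props));
    }
}
```

In each server.N line the first port (2888 here) carries follower-to-leader traffic and the second (3888) is used for leader election. Each host also needs a myid file in dataDir containing its own N.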

Page 16: Distributed Applications with Apache Zookeeper

Exhibitor: A ZK Monitoring & Administration Tool from Netflix

Centralization & externalization of zk ensemble configuration* (S3/remote FS)

Web UI & REST API for ease of management

Instance monitoring with automatic configuration updates

Rolling ensemble changes while maintaining quorum

Miscellaneous administration tasks (backup/restore, log & snapshot cleanup)

* Configuration management for a configuration manager... so meta!

Page 17: Distributed Applications with Apache Zookeeper

Questions?

Page 18: Distributed Applications with Apache Zookeeper

Appendix

Page 19: Distributed Applications with Apache Zookeeper

Zookeeper Atomic Broadcast (ZAB) Algorithm

● Protocol for managing atomic updates to replicas

● Responsible for:

o Agreeing on an ensemble leader

o Synchronizing replicas

o Managing transactions and broadcasts

o Recovery of state

● ZXIDs & transactional ordering

● Guarantees:

o Local & global primary order

o Primary integrity
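The ZXID ordering mentioned above follows from how the identifier is laid out: a 64-bit number whose high 32 bits hold the leader epoch and whose low 32 bits hold a per-epoch counter. The helper below is a minimal sketch of that layout. The class and method names are hypothetical, not ZooKeeper's own.

```java
// Hypothetical helper illustrating the 64-bit zxid layout (epoch << 32 | counter).
final class Zxid {

    static long make(long epoch, long counter) {
        return (epoch << 32) | (counter & 0xFFFFFFFFL);
    }

    /** High 32 bits: the epoch, which changes when a new leader takes over. */
    static long epochOf(long zxid) {
        return zxid >>> 32;
    }

    /** Low 32 bits: the transaction counter within that epoch. */
    static long counterOf(long zxid) {
        return zxid & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        long zxid = make(2, 5);
        System.out.printf("zxid=0x%x epoch=%d counter=%d%n",
                zxid, epochOf(zxid), counterOf(zxid));
    }
}
```

Because the epoch occupies the high bits, any transaction from a newer epoch compares greater than every transaction from an older one, so comparing ZXIDs as plain numbers gives the transaction order.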

Page 23: Distributed Applications with Apache Zookeeper

References

● http://engineering.pinterest.com/post/55272557617/building-a-follower-model-from-scratch

● http://zookeeper.apache.org/doc/trunk/zookeeperOver.html

● http://zookeeper.apache.org/doc/trunk/zookeeperAdmin.html

● https://github.com/Netflix/exhibitor/wiki

● http://www.tcs.hut.fi/Studies/T-79.5001/reports/2012-deSouzaMedeiros.pdf

● http://web.stanford.edu/class/cs347/reading/zab.pdf

● http://highscalability.com/blog/2008/7/15/zookeeper-a-reliable-scalable-distributed-coordination-syste.html

● https://wiki.apache.org/solr/SolrCloud

● http://www.slideshare.net/scottleber/apache-zookeeper