Distributed Applications with Apache Zookeeper

Post on 16-Jul-2015


Building Distributed Applications with Apache Zookeeper

Alex Ehrnschwender | Game Server Engineer at DeNA

What is Zookeeper?

“ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.”

Zookeeper Wiki

ZooKeeper: A Coordination Service for Distributed Applications

Coordination & synchronization for distributed processes

Logical namespacing implemented by a hierarchy (tree) of znodes

Replicated in-memory over multiple hosts for reliability, availability, and performance

Simple API of CRUD & basic tree operations for client integration
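
To make the tree-of-znodes idea concrete, here is a minimal in-memory sketch of the namespace rules. It is a toy model, not the ZooKeeper client API: the class name ZnodeTreeSketch and its methods are made up for illustration. It only mirrors two rules of the real service: a znode can be created only under an existing parent, and it cannot be deleted while it has children.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy in-memory model of a znode namespace; NOT the ZooKeeper client API. */
final class ZnodeTreeSketch {
    // Absolute path -> data; a sorted map keeps listings deterministic.
    private final Map<String, String> nodes = new TreeMap<>();

    ZnodeTreeSketch() {
        nodes.put("/", ""); // the root always exists
    }

    private static String parentOf(String path) {
        int idx = path.lastIndexOf('/');
        return idx == 0 ? "/" : path.substring(0, idx);
    }

    /** Create a node; the path must be absolute, new, and have an existing parent. */
    boolean create(String path, String data) {
        if (!path.startsWith("/") || nodes.containsKey(path)
                || !nodes.containsKey(parentOf(path))) {
            return false;
        }
        nodes.put(path, data);
        return true;
    }

    /** Names (not full paths) of the direct children of a path. */
    List<String> getChildren(String path) {
        List<String> children = new ArrayList<>();
        for (String candidate : nodes.keySet()) {
            if (!candidate.equals("/") && parentOf(candidate).equals(path)) {
                children.add(candidate.substring(candidate.lastIndexOf('/') + 1));
            }
        }
        return children;
    }

    /** Delete a node; refused for the root, missing nodes, and nodes with children. */
    boolean delete(String path) {
        if (path.equals("/") || !nodes.containsKey(path) || !getChildren(path).isEmpty()) {
            return false;
        }
        nodes.remove(path);
        return true;
    }

    public static void main(String[] args) {
        ZnodeTreeSketch tree = new ZnodeTreeSketch();
        tree.create("/workers", "");
        tree.create("/workers/w1", "host-a");
        System.out.println(tree.getChildren("/workers"));
    }
}
```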

Zookeeper: Reliability & Consistency

Distributed ensemble with automatic leader election through quorum

Replicated in-memory on every instance, with transaction logs and snapshots written to disk

Each client maintains a TCP connection to one server in the ensemble and fails over to another server if that connection is lost

Guaranteed atomicity & sequential consistency
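
The quorum requirement boils down to simple arithmetic: the ensemble stays available as long as a strict majority of servers is up. The helper below is a small sketch of that arithmetic (the class and method names are illustrative, not part of ZooKeeper). It also shows why ensembles are usually sized with odd numbers: going from 3 to 4 servers does not tolerate any additional failure.

```java
/** Majority-quorum arithmetic for an ensemble of N servers (illustrative helper). */
final class QuorumMath {
    /** Smallest number of servers that forms a strict majority. */
    static int majority(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    /** How many servers can fail while a majority is still reachable. */
    static int tolerableFailures(int ensembleSize) {
        return ensembleSize - majority(ensembleSize);
    }

    public static void main(String[] args) {
        for (int n = 3; n <= 7; n++) {
            System.out.println(n + " servers: quorum=" + majority(n)
                    + ", tolerated failures=" + tolerableFailures(n));
        }
    }
}
```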

Zookeeper: Watches & Ephemeral nodes

Each znode carries a stat structure with version numbers (version, cversion, aversion) & timestamps (ctime, mtime)

Watches

● Client-initiated subscriptions to znodes

● A change to a watched znode triggers a one-time notification to the subscribed client, which must re-register to keep watching

Ephemeral Nodes

● Backed by a client session and deleted when client session ends

● Cannot have children
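
The two behaviours on this slide can be sketched with a small in-memory model. This is an illustration under simplifying assumptions, not the real client API: WatchSketch and its methods are hypothetical, and events are plain strings named after ZooKeeper's event types. It shows that a standard watch fires once and is then gone, and that closing a session removes the ephemeral znodes it owns.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Toy model of one-shot watches and session-bound ephemeral nodes. */
final class WatchSketch {
    private final Map<String, String> data = new HashMap<>();
    private final Map<String, Long> ephemeralOwner = new HashMap<>(); // path -> session id
    private final Map<String, List<Consumer<String>>> watches = new HashMap<>();

    void createPersistent(String path, String value) {
        data.put(path, value);
    }

    void createEphemeral(String path, String value, long sessionId) {
        data.put(path, value);
        ephemeralOwner.put(path, sessionId);
    }

    /** Register a watch; it is consumed by the first event on the path. */
    void watch(String path, Consumer<String> watcher) {
        watches.computeIfAbsent(path, p -> new ArrayList<>()).add(watcher);
    }

    void setData(String path, String value) {
        data.put(path, value);
        fire(path, "NodeDataChanged");
    }

    /** Ending a session deletes every ephemeral node that session created. */
    void closeSession(long sessionId) {
        List<String> owned = new ArrayList<>();
        for (Map.Entry<String, Long> e : ephemeralOwner.entrySet()) {
            if (e.getValue() == sessionId) {
                owned.add(e.getKey());
            }
        }
        for (String path : owned) {
            data.remove(path);
            ephemeralOwner.remove(path);
            fire(path, "NodeDeleted");
        }
    }

    boolean exists(String path) {
        return data.containsKey(path);
    }

    private void fire(String path, String eventType) {
        // Removing the registrations before notifying is what makes a watch one-shot.
        List<Consumer<String>> registered = watches.remove(path);
        if (registered != null) {
            for (Consumer<String> watcher : registered) {
                watcher.accept(eventType + ":" + path);
            }
        }
    }
}
```

A client that wants continuous updates would call watch again from inside its callback, which is the usual pattern with the real API as well.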

Zookeeper: But… why?

“Because of the difficulty of implementing these kinds of services, applications initially usually skimp on them, which make them brittle in the presence of change and difficult to manage. Even when done correctly, different implementations of these services lead to management complexity when the applications are deployed.”

Zookeeper Wiki

Zookeeper: Advantages for Backing a Server Cluster

Server workers can become cluster-aware

Provides out of the box much of what a custom solution would have to reimplement

Extremely fast reads; performs best in read-dominant workloads (around a 10:1 read-to-write ratio)

Small footprint - An ensemble of only 5-7 zk instances can serve the coordination needs of several large production applications

Centralized event broadcasting & failure detection (heartbeat)

Zookeeper: Common Use Cases

● Configuration Management

● Service Discovery

● Distributed Cloud-Based File Systems

● Internal DNS Management

● Master (Leader) Election and Voting

● Messaging Queue

● Event Broadcasting & Notification
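
As an illustration of the leader election use case, the usual recipe has each candidate create an ephemeral sequential znode; the candidate with the lowest sequence number leads, and every other candidate watches only the znode just ahead of it. The sketch below covers just the selection logic over a list of child names. It assumes names end with the 10-digit zero-padded sequence suffix that ZooKeeper appends; the "n_" prefix and the class name are illustrative.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Selection logic for the sequential-znode leader election recipe (sketch). */
final class ElectionSketch {
    /** Parse the 10-digit sequence suffix of a sequential znode name. */
    static int sequenceOf(String childName) {
        return Integer.parseInt(childName.substring(childName.length() - 10));
    }

    private static List<String> bySequence(List<String> children) {
        List<String> sorted = new ArrayList<>(children);
        sorted.sort(Comparator.comparingInt(ElectionSketch::sequenceOf));
        return sorted;
    }

    /** The candidate with the lowest sequence number is the leader. */
    static String leader(List<String> children) {
        return bySequence(children).get(0);
    }

    /** The znode this candidate should watch, or null if it is the leader itself. */
    static String predecessor(List<String> children, String self) {
        List<String> sorted = bySequence(children);
        int idx = sorted.indexOf(self);
        if (idx < 0) {
            throw new IllegalArgumentException(self + " is not a candidate");
        }
        return idx == 0 ? null : sorted.get(idx - 1);
    }

    public static void main(String[] args) {
        List<String> children = List.of("n_0000000012", "n_0000000003", "n_0000000007");
        System.out.println("leader: " + leader(children));
        System.out.println("n_0000000012 watches: " + predecessor(children, "n_0000000012"));
    }
}
```

Watching only the predecessor, rather than the leader, avoids waking every candidate at once when the leader's znode disappears.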

Use Case Example #1 - Managing Redis Shards

ZK Use Case Example #1 - Pinterest

Pinterest stores their entire follower model inside sharded Redis instances (~9000 Redis shards, multiple instances per core)

Shard configuration is stored and managed by Zookeeper

Clients look up and watch shard locations in Zookeeper, then retrieve data from the matching shard

Master-slave failover triggers updates to znode representation (slave address replaces master)

Vertical splitting of data is broadcast to watching clients
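
The client side of this pattern might look roughly like the sketch below. Everything in it is hypothetical: the class name, the modulo mapping from user id to shard, and the addresses are illustrative assumptions, not Pinterest's actual scheme. The point is only that the client keeps a local copy of the shard map and swaps it when a watch callback delivers a new configuration, for example after a failover.

```java
import java.util.List;

/** Hypothetical client-side shard directory refreshed from a watched config znode. */
final class ShardDirectorySketch {
    // Latest shard index -> address list; volatile so readers see the newest snapshot.
    private volatile List<String> shardAddresses;

    ShardDirectorySketch(List<String> initial) {
        shardAddresses = List.copyOf(initial);
    }

    /** Meant to be called from a watch callback after re-reading the config znode. */
    void onConfigChanged(List<String> updated) {
        shardAddresses = List.copyOf(updated);
    }

    /** Assumed mapping: user id modulo shard count (an illustration only). */
    String locate(long userId) {
        List<String> snapshot = shardAddresses;
        int shard = (int) Math.floorMod(userId, (long) snapshot.size());
        return snapshot.get(shard);
    }

    public static void main(String[] args) {
        ShardDirectorySketch dir =
                new ShardDirectorySketch(List.of("redis-a:6379", "redis-b:6379"));
        System.out.println(dir.locate(7L));
        // Failover: the replica's address replaces the failed master for shard 1.
        dir.onConfigChanged(List.of("redis-a:6379", "redis-b-replica:6379"));
        System.out.println(dir.locate(7L));
    }
}
```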

Use Case Example #2 - HBase Cluster Configuration

Code Examples

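// Assumes zk is an already-connected org.apache.zookeeper.ZooKeeper handle.
public void join(String groupName, String memberName)
        throws KeeperException, InterruptedException {
    String path = "/" + groupName + "/" + memberName;
    // EPHEMERAL: the member znode is removed automatically when this client's session ends.
    String createdPath = zk.create(path,
            null /* data */,
            ZooDefs.Ids.OPEN_ACL_UNSAFE,
            CreateMode.EPHEMERAL);
    System.out.println("Created " + createdPath);
}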

public void create(String groupName)
        throws KeeperException, InterruptedException {
    String path = "/" + groupName;
    // PERSISTENT: the group znode outlives the session that created it.
    String createdPath = zk.create(path,
            null /* data */,
            ZooDefs.Ids.OPEN_ACL_UNSAFE,
            CreateMode.PERSISTENT);
    System.out.println("Created " + createdPath);
}

Code Examples (cont.)

public void delete(String groupName)
        throws KeeperException, InterruptedException {
    String path = "/" + groupName;
    try {
        // A znode with children cannot be deleted, so remove the children first.
        List<String> children = zk.getChildren(path, false);
        for (String child : children) {
            // Version -1 matches any version, so the delete is unconditional.
            zk.delete(path + "/" + child, -1); /* child */
        }
        zk.delete(path, -1); /* parent */
    } catch (KeeperException.NoNodeException e) {
        System.out.printf("Group %s does not exist\n", groupName);
    }
}

public void list(String groupName)
        throws KeeperException, InterruptedException {
    String path = "/" + groupName;
    try {
        // false: do not leave a watch on the group znode.
        List<String> children = zk.getChildren(path, false);
        for (String child : children) {
            System.out.println(child);
        }
    } catch (KeeperException.NoNodeException e) {
        System.out.printf("Group %s does not exist\n", groupName);
    }
}
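
The -1 passed to zk.delete above means "match any version". Passing the version a client last read instead turns the call into an optimistic, compare-and-set style operation: it succeeds only if nobody has modified the znode in between. The toy model below illustrates that rule for data updates. The class is hypothetical and returns false where the real client would throw KeeperException.BadVersionException.

```java
/** Toy model of version-checked updates on a single znode. */
final class VersionedNodeSketch {
    private String data;
    private int version = 0; // data version starts at 0 and grows with each update

    VersionedNodeSketch(String data) {
        this.data = data;
    }

    /** Apply the update only if expectedVersion is -1 or equals the current version. */
    synchronized boolean setData(String newData, int expectedVersion) {
        if (expectedVersion != -1 && expectedVersion != version) {
            return false; // the real client raises BadVersionException here
        }
        data = newData;
        version++;
        return true;
    }

    synchronized int version() {
        return version;
    }

    synchronized String data() {
        return data;
    }

    public static void main(String[] args) {
        VersionedNodeSketch node = new VersionedNodeSketch("a");
        System.out.println(node.setData("b", 0)); // first writer read version 0
        System.out.println(node.setData("c", 0)); // second writer holds a stale version
    }
}
```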

Performance

[Charts: throughput (ops/sec) for a standalone server vs. a 3-node ensemble]

Reference:

https://wiki.apache.org/hadoop/ZooKeeper/ServiceLatencyOverview

Sample Configuration (zoo.cfg)

tickTime=2000

dataDir=/var/lib/zookeeper

clientPort=2181

initLimit=5

syncLimit=2

server.1=zoo1:2888:3888

server.2=zoo2:2888:3888

server.3=zoo3:2888:3888
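
In this file, initLimit and syncLimit are expressed in ticks, so their real durations depend on tickTime (milliseconds). In each server.N line, the first port (2888) is used by followers to connect to the leader and the second (3888) for leader election. The sketch below parses the sample and derives the effective timeouts; the class and method names are illustrative, and it relies on java.util.Properties handling this simple key=value layout.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Derives effective timeouts from a zoo.cfg-style configuration (sketch). */
final class ZooCfgTimings {
    static Properties parse(String cfg) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(cfg));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Convert a setting given in ticks (initLimit, syncLimit) to milliseconds. */
    static int millis(Properties props, String ticksKey) {
        int tickTime = Integer.parseInt(props.getProperty("tickTime"));
        return tickTime * Integer.parseInt(props.getProperty(ticksKey));
    }

    /** Number of server.N entries, i.e. the ensemble size. */
    static long serverCount(Properties props) {
        return props.stringPropertyNames().stream()
                .filter(key -> key.startsWith("server."))
                .count();
    }

    public static void main(String[] args) {
        Properties props = parse(String.join("\n",
                "tickTime=2000", "initLimit=5", "syncLimit=2",
                "server.1=zoo1:2888:3888",
                "server.2=zoo2:2888:3888",
                "server.3=zoo3:2888:3888"));
        System.out.println("initLimit ms: " + millis(props, "initLimit"));
        System.out.println("syncLimit ms: " + millis(props, "syncLimit"));
        System.out.println("servers: " + serverCount(props));
    }
}
```

With the sample values, followers get 10 seconds to connect and sync with the leader at startup, and may lag at most 4 seconds behind it afterwards.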

Exhibitor: A ZK Monitoring & Administration Tool from Netflix

Centralization & externalization of zk ensemble configuration* (S3/remote FS)

Web UI & REST API for ease of management

Instance monitoring with automatic configuration updates

Rolling ensemble changes while maintaining quorum

Miscellaneous administration tasks (backup/restore, log & snapshot cleanup)

* Configuration management for a configuration manager.... so meta!

Questions?

Appendix

Zookeeper Atomic Broadcast (ZAB) Algorithm

● Protocol for managing atomic updates to replicas

● Responsible for:

o Agreeing on an ensemble leader

o Synchronizing replicas

o Managing transactions and broadcasts

o Recovery of state

● ZXIDs & transactional ordering

● Guarantees:

o Local & global primary order

o Primary integrity
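
The transactional ordering mentioned above rests on the zxid, a 64-bit number whose high 32 bits hold the leader epoch and whose low 32 bits hold a counter within that epoch. A minimal sketch of that layout (the helper names are illustrative) shows why plain numeric comparison orders transactions first by epoch, then by counter:

```java
/** Packing and unpacking of a 64-bit zxid: epoch in the high half, counter in the low half. */
final class ZxidSketch {
    static long make(int epoch, int counter) {
        // Mask the counter so a negative int cannot bleed into the epoch bits.
        return ((long) epoch << 32) | (counter & 0xFFFFFFFFL);
    }

    static int epoch(long zxid) {
        return (int) (zxid >>> 32);
    }

    static int counter(long zxid) {
        return (int) zxid;
    }

    public static void main(String[] args) {
        long lastOfOldEpoch = make(2, 1000);
        long firstOfNewEpoch = make(3, 0);
        // Any transaction from a newer epoch sorts after all transactions of older epochs.
        System.out.println(firstOfNewEpoch > lastOfOldEpoch);
    }
}
```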


References

● http://engineering.pinterest.com/post/55272557617/building-a-follower-model-from-scratch

● http://zookeeper.apache.org/doc/trunk/zookeeperOver.html

● http://zookeeper.apache.org/doc/trunk/zookeeperAdmin.html

● https://github.com/Netflix/exhibitor/wiki

● http://www.tcs.hut.fi/Studies/T-79.5001/reports/2012-deSouzaMedeiros.pdf

● http://web.stanford.edu/class/cs347/reading/zab.pdf

● http://highscalability.com/blog/2008/7/15/zookeeper-a-reliable-scalable-distributed-coordination-syste.html

● https://wiki.apache.org/solr/SolrCloud

● http://www.slideshare.net/scottleber/apache-zookeeper