Date post: | 13-Apr-2017 |
Category: | Technology |
Upload: | joydeep-banik-roy |
Winter is coming? Not if ZooKeeper is there!
Presented by: Joydeep Banik Roy, Sr. Software Engineer, Cerner Corporation
Winter is coming?
Distributed System
“A distributed system is capable of exploiting the capacity of multiple processors by running
components, perhaps replicated, in parallel. A system might be distributed geographically for
strategic reasons, such as the presence of servers in multiple locations participating in a single
application.” - ZooKeeper: Distributed Process Coordination, O’Reilly
Fallacies of the Distributed System
o The network is reliable.
o Latency is zero.
o Bandwidth is infinite.
o The network is secure.
o Topology doesn't change.
o There is one administrator.
o Transport cost is zero.
o The network is homogeneous.
Coordination
A coordination task is a task involving multiple processes for the purposes of cooperation or to regulate contention.
Examples:
o Master election
o Crash detection
o Group membership management
o Metadata management
What is ZooKeeper?
“Distributed, open-source coordination service for distributed applications that exposes a simple API, like a file system API, that applications can build upon to implement higher level services for
synchronization, configuration maintenance, and groups and naming.”
/master “richman.com”
/worker
• /worker/worker-1 “poorman.com”
/tasks
• /tasks/task-1 “poor-to-rich.sh”
How it does it: Shared Storage
[Figure: an ensemble of five servers (Server 1, Server 2, Server 3, Server 4 (Leader), Server 5). The APPLICATION reaches the ensemble through four client libraries, each with its own session: 0xAB, 0x11, 0x2A and 0x10. The shared storage holds /master “richman.com”, /worker/worker-1 “poorman.com” and /tasks/task-1 “run-cmd”.]
The ZooKeeper Data Model
ZooKeeper has a hierarchical namespace.
Each node in the namespace is called a ZNode.
Every ZNode has data (given as byte[]) and can have children.
parent : “/zookeeper”
|-- child1 : “/master”
|-- child2 : “/workers”
|-- child3 : “/tasks”
    `-- task-1 : “run cmd;”
ZNode properties:
o Maintains a stat structure with version numbers for data changes, ACL changes and timestamps
o Version number increases with changes
o Data is read and written in its entirety
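The data model above can be sketched as a small in-memory tree. This is a toy model in Python, not a ZooKeeper client: the names ZNode, ToyTree and their methods are assumptions made up for illustration. It only shows the three properties listed on the slide: nodes form a hierarchy, each node carries a byte payload with a version, and data is replaced as a whole.

```python
from dataclasses import dataclass, field


@dataclass
class ZNode:
    """Toy stand-in for a znode: a byte payload, a data version, and children."""
    data: bytes = b""
    version: int = 0  # bumped on every data change
    children: dict = field(default_factory=dict)


class ToyTree:
    """In-memory model of the hierarchical namespace (hypothetical, not a client)."""

    def __init__(self):
        self.root = ZNode()

    def _walk(self, path):
        node = self.root
        for part in path.strip("/").split("/"):
            if part:
                node = node.children[part]  # KeyError if the path is missing
        return node

    def create(self, path, data=b""):
        parent_path, _, name = path.rpartition("/")
        parent = self._walk(parent_path)
        if name in parent.children:
            raise KeyError(f"{path} already exists")
        parent.children[name] = ZNode(data)

    def get_data(self, path):
        node = self._walk(path)
        return node.data, node.version

    def set_data(self, path, data, expected_version):
        node = self._walk(path)
        if node.version != expected_version:
            raise ValueError("version mismatch")
        node.data = data  # the whole payload is replaced, never patched
        node.version += 1
        return node.version

    def get_children(self, path):
        return sorted(self._walk(path).children)


tree = ToyTree()
tree.create("/tasks")
tree.create("/tasks/task-1", b"run cmd;")
data, version = tree.get_data("/tasks/task-1")
new_version = tree.set_data("/tasks/task-1", b"run other cmd;", version)
```

A writer that still holds the old version number would be rejected by set_data in this sketch, which is one way version numbers can be used to detect concurrent changes.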
Znode Example: Simple Lock
/Resource
Process1 Process2 Process3
/Lock “PROCESS1”
Znode Example: Simple Lock
/Resource
Process2 Process3
/Lock “PROCESS2”
Znode Example: Simple Lock
/Resource
Process3
/Lock “PROCESS3”
ZNODE Types
Persistent: exists till deleted explicitly.
Ephemeral: deleted once the client session ends.
Sequential: appends a monotonically increasing counter to the end of the path.
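The three znode types can be illustrated with a flat in-memory store. This is a sketch under stated assumptions: ToyStore, its flags and the session ids are hypothetical names, and the store does not talk to ZooKeeper. The zero-padded ten-digit suffix follows the format ZooKeeper uses for sequential names.

```python
class ToyStore:
    """Flat in-memory model of znode creation modes (hypothetical, not a client)."""

    def __init__(self):
        self.nodes = {}     # path -> owning session id, or None if persistent
        self.counters = {}  # parent path -> next sequence number

    def create(self, path, session=None, ephemeral=False, sequential=False):
        if sequential:
            parent = path.rpartition("/")[0]
            seq = self.counters.get(parent, 0)
            self.counters[parent] = seq + 1
            path = f"{path}{seq:010d}"  # the counter is appended to the path
        if path in self.nodes:
            raise KeyError(f"{path} already exists")
        self.nodes[path] = session if ephemeral else None
        return path

    def delete(self, path):
        # Persistent znodes only go away through an explicit delete.
        del self.nodes[path]

    def close_session(self, session):
        # Ephemeral znodes disappear when their owning session ends.
        owned = [p for p, owner in self.nodes.items()
                 if owner is not None and owner == session]
        for path in owned:
            del self.nodes[path]


store = ToyStore()
store.create("/config")                                # persistent
store.create("/master", session="s1", ephemeral=True)  # tied to session s1
first = store.create("/tasks/task-", sequential=True)
second = store.create("/tasks/task-", sequential=True)
store.close_session("s1")
remaining = sorted(store.nodes)
```

After the session closes, only the persistent and sequential-persistent entries remain in this toy store.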
Watches and Notifications
Event – Execution of update to a znode
Watch – one time trigger associated with a znode
Notification – When a watch is triggered by an event it generates a notification
“ZooKeeper always pays its debts”
“One important guarantee of notifications is that they are delivered to a client before any
other change is made to the same znode”
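The one-time nature of a watch can be sketched with callbacks. This is a toy model in Python, not the ZooKeeper client API: WatchedStore and its method names are assumptions for illustration. A watch is registered while reading, fires once on the next update, and has to be set again to receive further notifications.

```python
class WatchedStore:
    """Toy key-value store with one-time watches (hypothetical, not a client)."""

    def __init__(self):
        self.data = {}
        self.watches = {}  # path -> callbacks waiting for the next change

    def get_data(self, path, watch=None):
        if watch is not None:
            self.watches.setdefault(path, []).append(watch)
        return self.data.get(path)

    def set_data(self, path, value):
        self.data[path] = value
        # One-time trigger: the watches are removed as they are notified,
        # and the notification goes out before any later update is applied.
        for callback in self.watches.pop(path, []):
            callback(path)


store = WatchedStore()
events = []
store.get_data("/config", watch=events.append)
store.set_data("/config", b"v1")  # triggers the watch
store.set_data("/config", b"v2")  # no watch is registered any more
```

To keep following changes, a client in this sketch would call get_data with a watch again after each notification.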
ZooKeeper Guarantees
Sequential Consistency - Updates from a client will be applied in the order that they were sent.
Atomicity - Updates either succeed or fail. No partial results.
Single System Image - A client will see the same view of the service regardless of the server that it connects to.
Reliability - Once an update has been applied, it will persist from that time forward until a client overwrites the update.
Timeliness - The client's view of the system is guaranteed to be up-to-date within a certain time bound. Rather than let a client see stale data, a server will shut down and force the client to connect to another server with a more recent image.
ZooKeeper is Simple
ZNODE OPERATIONS (API)
READ          WRITE
getACL        setACL
exists        create
getChildren   delete
getData       setData
SYNC() call
Example : Master-Worker
[Figure: znode tree for the master-worker example, with nodes /master, /assign, /task, /worker, /worker-1, /task-1 and /status “DONE”.]
ZooKeeper Recipes
● Configuration management – machines bootstrap config from a centralized source, facilitates simpler deployment/provisioning
● Naming service - like DNS, mappings of names to addresses
● Distributed synchronization - locks, barriers, queues
● Leader election - a common problem in distributed coordination
● Centralized and highly reliable (simple) data registry
Recipe #1 : Barriers
● Used for configuration management: the clients want to read a configuration, but the configuration is not yet ready.
● A barrier blocks the processing of a set of nodes till a condition is met.
● Therefore a /barrier znode is created.
● Each client calls the ZooKeeper API's exists() function on the barrier node, with watch set to true.
● If exists() returns false, the barrier is gone and the client proceeds.
● Else, if exists() returns true, the client waits for a watch event from ZooKeeper for the barrier node.
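The barrier steps above can be sketched as follows. This is a toy model in Python with callbacks instead of a live ZooKeeper connection: BarrierStore, wait_at_barrier and the proceed callback are hypothetical names introduced for illustration. Only the exists-with-watch logic mirrors the recipe.

```python
class BarrierStore:
    """Toy store supporting exists() with a one-time watch (not a real client)."""

    def __init__(self):
        self.nodes = set()
        self.watches = {}  # path -> callbacks to notify on deletion

    def create(self, path):
        self.nodes.add(path)

    def exists(self, path, watch=None):
        if watch is not None:
            self.watches.setdefault(path, []).append(watch)
        return path in self.nodes

    def delete(self, path):
        self.nodes.discard(path)
        for callback in self.watches.pop(path, []):
            callback(path)


def wait_at_barrier(store, path, proceed):
    """Call proceed() once, as soon as the barrier znode is gone."""
    state = {"done": False}

    def check(_event_path=None):
        if state["done"]:
            return  # ignore leftover watches after we have proceeded
        if store.exists(path, watch=check):
            return  # barrier still in place: wait for the watch event
        state["done"] = True
        proceed()

    check()


store = BarrierStore()
store.create("/barrier")
passed = []
wait_at_barrier(store, "/barrier", lambda: passed.append("client-1"))
wait_at_barrier(store, "/barrier", lambda: passed.append("client-2"))
blocked_before_delete = list(passed)
store.delete("/barrier")  # e.g. once the configuration is ready
```

Both clients stay blocked while /barrier exists and proceed only after it is deleted.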
Recipe #2 : Distributed Exclusive Lock
● Assume there are N clients trying to acquire a lock.
● Each client creates an ephemeral, sequential znode under the path /Cluster/_locknode_
● Each client requests a list of children for the lock znode (i.e. _locknode_)
● The client with the lowest ID according to natural ordering holds the lock.
● Every other client sets a watch on the znode with the ID immediately preceding its own. This is done to avoid “The Herd Effect”.
● When notified, a client checks again whether it now holds the lock.
● A client wishing to release the lock deletes its node, which triggers the next client in line to acquire the lock.
ZK
|---Cluster
    +---hadoopConfig
    +---memberships
    +---_locknode_
        +---host1-HiveClient
        +---host2-Impala
        +---host3-YARN
        +--- …
        \---hostN-Crunch
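The decision each client makes after listing the children of the lock znode can be sketched as a pure function. This is an illustration in Python, not ZooKeeper or Curator code: the function names and the lock-NNNNNNNNNN child names are assumptions, standing in for whatever sequential names the clients actually created.

```python
def sequence_of(name):
    """Return the numeric suffix of a sequential child name such as lock-0000000003."""
    return int(name.rsplit("-", 1)[1])


def lock_decision(children, me):
    """Decide what the client owning child `me` should do.

    Returns ("acquired", None) if `me` has the lowest sequence number,
    otherwise ("watch", predecessor), where predecessor is the child with
    the next lower sequence number. Watching only the predecessor means a
    release wakes one client instead of the whole herd.
    """
    ordered = sorted(children, key=sequence_of)
    index = ordered.index(me)
    if index == 0:
        return ("acquired", None)
    return ("watch", ordered[index - 1])


# Child lists should be treated as unordered, so the input is shuffled here.
children = ["lock-0000000002", "lock-0000000000", "lock-0000000001"]
holder = lock_decision(children, "lock-0000000000")
waiter = lock_decision(children, "lock-0000000002")

# The holder releases the lock by deleting its node; the next client re-checks.
remaining = [c for c in children if c != "lock-0000000000"]
next_holder = lock_decision(remaining, "lock-0000000001")
```

A real client would call this kind of check again each time its watch fires, since the node it was watching may have vanished for reasons other than a release.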
Recipe #3 : Leader Election
● Pick a znode for the election, say “/leader/election-path”.
● All participants of the election process create an ephemeral-sequential node on the same election path.
● The node with the smallest sequence number is the leader.
● Each “follower” node listens to the node with the next lower sequence number.
● Upon leader removal, a follower goes to the election path and finds the new leader, or becomes the leader if it has the lowest sequence number.
● Upon session expiration, check the election state and go to election if needed.
● Applications may consider creating a separate znode to acknowledge that the leader has executed the leader procedure.
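The election steps above can be sketched with a toy election path that hands out sequence numbers. This is an in-memory model in Python, not a ZooKeeper client: the Election class, its methods and the host names are hypothetical. Leaving the election stands in for an ephemeral node vanishing when a session ends.

```python
class Election:
    """Toy election path keyed by sequence number (hypothetical, not a client)."""

    def __init__(self):
        self.next_seq = 0
        self.members = {}  # sequence number -> participant name

    def join(self, name):
        seq = self.next_seq
        self.next_seq += 1
        self.members[seq] = name
        return seq

    def leave(self, seq):
        # Models the ephemeral node disappearing with its session.
        self.members.pop(seq, None)

    def leader(self):
        return self.members[min(self.members)] if self.members else None

    def watched_by(self, seq):
        """Sequence number this participant should watch, or None if it leads."""
        lower = [s for s in self.members if s < seq]
        return max(lower) if lower else None


election = Election()
a = election.join("host1")
b = election.join("host2")
c = election.join("host3")
first_leader = election.leader()

election.leave(b)                       # a follower in the middle fails
watch_after_gap = election.watched_by(c)

election.leave(a)                       # the leader fails
second_leader = election.leader()
```

When the middle participant fails, the last one is notified but does not become leader; it re-reads the election path and watches the next lower node instead.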
Recipe #4 : Distributed Queue
● A znode /queue is created.
● Distributed clients create EPHEMERAL-SEQUENTIAL znodes by passing a path name ending in /queue- to create()
● Pathnames have the form /queue/queue-X where X is a monotonically increasing number.
● If a single consumer takes items out of the queue, they will be ordered FIFO.
● The client calls getChildren() and processes all queue nodes until exhausted. It is guaranteed not to miss anything, as the nodes are ordered FIFO.
● Priority queues come with a small change.
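The queue recipe above can be sketched with an in-memory stand-in for the /queue znode. This is a toy model in Python, not ZooKeeper code: ToyQueue and its methods are assumed names. The point it shows is that the consumer recovers FIFO order by sorting the child names on their sequence suffix, since a child listing should be treated as unordered.

```python
class ToyQueue:
    """Toy model of a /queue znode with sequential children (not a real client)."""

    def __init__(self):
        self.next_seq = 0
        self.children = {}  # child name -> payload

    def put(self, payload):
        name = f"queue-{self.next_seq:010d}"  # queue-X with increasing X
        self.next_seq += 1
        self.children[name] = payload
        return name

    def get_children(self):
        # Returned in reverse on purpose: callers must not rely on list order.
        return list(reversed(list(self.children)))

    def take_all(self):
        """Consume every queued item in FIFO order."""
        taken = []
        # Zero-padded suffixes make lexical order match numeric order.
        for name in sorted(self.get_children()):
            taken.append(self.children.pop(name))
        return taken


queue = ToyQueue()
queue.put(b"first")
queue.put(b"second")
queue.put(b"third")
items = queue.take_all()
```

With a single consumer, items come out in the order they were enqueued, and the queue is empty afterwards.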
Apache Curator
Many more recipes are available in Apache Curator, open sourced by Netflix. Visit http://curator.apache.org/ for more recipes and their implementation.
Language Bindings
- ZooKeeper ships client libraries in: Java, C, Perl, Python
- Community-contributed client bindings are available for Scala, C#, Node.js, Ruby, Erlang, Go, Haskell: https://cwiki.apache.org/ZOOKEEPER/zkclientbindings.html
Who uses ZooKeeper?
References
- ZooKeeper: Distributed Process Coordination, by Flavio Junqueira and Benjamin Reed
- https://zookeeper.apache.org/ (It has some fabulous documentation!)
- http://curator.apache.org/ (Check out the recipes!)
- Some really generous slides on SlideShare, like this one: http://www.slideshare.net/sauravhaloi/introduction-to-apache-zookeeper
- And others…
Questions?
DON’T FORGET TO RATE THIS TALK
THANK YOU