
Apache ZooKeeper TechTuesday

Date post: 06-May-2015
Upload: andrei-savu
Page 1: Apache ZooKeeper TechTuesday

Apache ZooKeeper

Andrei Savu @TechTuesday

Why use it? What to expect in the future?

Page 2: Apache ZooKeeper TechTuesday

Outline

Why use it?
  Crash Course
  Practical Example

What to expect in the future (3.4.0 release)?
  GSoC 2010
  New Contrib
  Work in Progress

Page 3: Apache ZooKeeper TechTuesday

Crash Course

Page 4: Apache ZooKeeper TechTuesday

What is ZooKeeper?

A highly available, scalable, distributed configuration, consensus, group membership, leader election, naming and coordination service.

Page 5: Apache ZooKeeper TechTuesday

What is ZooKeeper? (2)

replicated in-memory tree data structure
somewhat similar to a file system
no partial reads / writes
no renames
ordered updates
strong persistence guarantees
conditional updates (version)
watches for data changes
ephemeral nodes
generated file names (sequential nodes)

Page 6: Apache ZooKeeper TechTuesday

ZooKeeper Data Model

hierarchical namespace
each znode has data and children
data is read and written in its entirety

Page 7: Apache ZooKeeper TechTuesday

Basic ZooKeeper API

string create(path, data, acl, flags)

delete(path, expected_version)

stat set_data(path, data, expected_version)

(data, stat) get_data(path, watch)

stat exists(path, watch)

string[] get_children(path, watch)
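The semantics of these calls can be illustrated with a toy, single-process model of the znode tree. This is only a sketch, not a ZooKeeper client: `ToyZNodeTree` and `BadVersionError` are hypothetical names, ACLs, flags other than sequence, and watches are left out, and the sequence counter is global here for simplicity.

```python
class BadVersionError(Exception):
    """Raised when expected_version does not match the znode's version."""


class ToyZNodeTree:
    """Single-process stand-in for the znode tree; not a real ZooKeeper client."""

    def __init__(self):
        # path -> [data, version]; the root always exists
        self.nodes = {"/": [b"", 0]}
        self.counter = 0  # feeds generated (sequential) names

    def create(self, path, data, sequence=False):
        if sequence:
            # generated names: append a zero-padded, increasing counter
            path = "%s%010d" % (path, self.counter)
            self.counter += 1
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.nodes:
            raise KeyError("no parent: " + parent)
        if path in self.nodes:
            raise KeyError("node exists: " + path)
        self.nodes[path] = [data, 0]
        return path

    def exists(self, path):
        return path in self.nodes

    def get_data(self, path):
        data, version = self.nodes[path]
        return data, version

    def set_data(self, path, data, expected_version):
        node = self.nodes[path]
        # conditional update: -1 means "any version"
        if expected_version not in (-1, node[1]):
            raise BadVersionError(path)
        node[0] = data  # data is always replaced in its entirety
        node[1] += 1
        return node[1]

    def delete(self, path, expected_version):
        if self.get_children(path):
            raise KeyError("node has children: " + path)
        if expected_version not in (-1, self.nodes[path][1]):
            raise BadVersionError(path)
        del self.nodes[path]

    def get_children(self, path):
        prefix = path.rstrip("/") + "/"
        return sorted(
            p[len(prefix):] for p in self.nodes
            if p != "/" and p.startswith(prefix) and "/" not in p[len(prefix):]
        )


tree = ToyZNodeTree()
tree.create("/cfg", b"v1")
data, version = tree.get_data("/cfg")
tree.set_data("/cfg", b"v2", version)      # succeeds, version matches
try:
    tree.set_data("/cfg", b"v3", version)  # stale version, rejected
except BadVersionError:
    print("conditional update rejected")
```

The rejected second write is the point of `expected_version`: a client that read stale data cannot silently overwrite a newer value.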

Page 8: Apache ZooKeeper TechTuesday

ZooKeeper Service

Facts:
1) all servers store a copy of the data in memory
2) the leader is elected at startup
3) followers respond to clients
4) all updates go through the leader
5) responses are sent when a majority of servers have persisted the change
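Fact 5 implies some simple arithmetic about ensemble sizes. The sketch below (the function names are made up for illustration) computes the majority needed to acknowledge an update and how many server failures an ensemble can survive.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """Servers that may fail while a majority can still be formed."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
# prints:
# 3 2 1
# 4 3 1
# 5 3 2
```

Going from 3 to 4 servers raises the quorum without tolerating more failures, which is why odd ensemble sizes are the usual choice.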

Page 9: Apache ZooKeeper TechTuesday

Practical Example

Page 10: Apache ZooKeeper TechTuesday

Distributed Queue (Python)

http://www.cloudera.com/blog/2009/05/building-a-distributed-concurrent-queue-with-apache-zookeeper/
http://github.com/henryr/pyzk-recipes

Retry operation on ConnectionLoss:

http://github.com/andreisavu/pyzk-recipes
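One way to retry an operation on connection loss is a small decorator, sketched below. This is not the code from the linked repositories: `ConnectionLossError` is a stand-in for whatever exception the client binding raises, and `retry_on_connection_loss` is a hypothetical helper.

```python
import functools
import time


class ConnectionLossError(Exception):
    """Stand-in for the client library's connection-loss exception."""


def retry_on_connection_loss(max_attempts=3, delay=0.0):
    """Re-run the wrapped call when the connection drops mid-operation."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except ConnectionLossError:
                    if attempt == max_attempts:
                        raise  # give up and surface the error
                    time.sleep(delay)
        return wrapper
    return decorator


attempts = []


@retry_on_connection_loss(max_attempts=3)
def flaky_get():
    # Fails twice, then succeeds, to simulate a brief disconnect.
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionLossError()
    return "data"


print(flaky_get(), len(attempts))
```

Blind retries are only safe for idempotent operations. A create with a generated name may have succeeded on the server before the connection dropped, so retrying it can leave a duplicate queue item behind.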

Page 11: Apache ZooKeeper TechTuesday

GSoC 2010

3 projects / 5 months

Page 12: Apache ZooKeeper TechTuesday

1. Monitoring & Web-based interface

Status: Committed to the trunk

1. JIRA: ZOOKEEPER-701
2. Progress Tracking Wiki
3. Monitoring for Ganglia, Nagios and Cacti
   1. contrib / monitoring
   2. 'mntr' 4-letter word
4. Web interface available as a Hue application
   1. contrib / huebrowser
   2. complete install instructions
   3. requirements: REST gateway, Hue 1.0
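A monitoring script can read the 'mntr' output with a plain socket. The sketch below assumes the reply is a series of tab-separated key/value lines and that the server closes the connection after answering; the helper names and the sample values are made up for illustration.

```python
import socket


def send_four_letter_word(host, port, word, timeout=5.0):
    """Send a four-letter command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(word.encode("ascii"))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # server closed the connection: reply is complete
                break
            chunks.append(chunk)
    return b"".join(chunks).decode("utf-8", "replace")


def parse_mntr(text):
    """Turn 'key<TAB>value' lines into a dict of strings."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if sep:  # skip lines without a tab separator
            stats[key.strip()] = value.strip()
    return stats


# Illustrative reply; real values come from send_four_letter_word(host, 2181, "mntr")
sample = "zk_avg_latency\t0\nzk_server_state\tleader\n"
print(parse_mntr(sample))
```

Values are kept as strings so that a Ganglia, Nagios or Cacti plugin can decide which ones to convert to numbers.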

Page 13: Apache ZooKeeper TechTuesday

2. Read-Only Mode

Status: Under Review (Ready to be committed)

1. JIRA: ZOOKEEPER-704
2. Progress Tracking Wiki
3. Description: "When a ZooKeeper server loses contact with over half of the other servers in an ensemble ('loses a quorum'), it stops responding to client requests. For some applications, it would be beneficial if a server still responded to read requests when the quorum is lost, but caused an error condition when a write request was attempted."
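The behavior described in the JIRA can be modeled with a toy server object. This is a sketch of the idea only, not the ZOOKEEPER-704 patch: `ToyServer`, `WriteRejectedError` and the `read_only_allowed` switch are hypothetical names.

```python
class WriteRejectedError(Exception):
    """Raised when a write arrives while the quorum is lost."""


class ToyServer:
    def __init__(self, ensemble_size, read_only_allowed):
        self.ensemble_size = ensemble_size
        self.read_only_allowed = read_only_allowed
        self.reachable_peers = ensemble_size - 1  # everyone reachable at start
        self.data = {}

    def has_quorum(self):
        # this server plus its reachable peers must form a strict majority
        return self.reachable_peers + 1 > self.ensemble_size // 2

    def read(self, path):
        if not self.has_quorum() and not self.read_only_allowed:
            raise ConnectionError("quorum lost: not serving requests")
        return self.data.get(path)  # may be stale while partitioned

    def write(self, path, value):
        if not self.has_quorum():
            raise WriteRejectedError("quorum lost: writes are refused")
        self.data[path] = value


server = ToyServer(ensemble_size=3, read_only_allowed=True)
server.write("/config", b"v1")
server.reachable_peers = 0          # partitioned away from both peers
print(server.read("/config"))       # still answered from local state
try:
    server.write("/config", b"v2")
except WriteRejectedError as err:
    print(err)
```

A read served in this state reflects only what the server had seen before it was cut off, so clients that opt in must tolerate stale data.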

Page 14: Apache ZooKeeper TechTuesday

3. Failure Detector Model

Status: Under Review

1. JIRA: ZOOKEEPER-702
2. Progress Tracking Wiki
3. Detectors: Phi Accrual, Chen, Bertier, Fixed Heartbeat
4. Why? Check the concluding remarks on the wiki.
5. Conclusion snippet: "in scenarios where we have a changing network behavior, such in a WAN, the adaptive methods can be a good pick"
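The difference between a fixed timeout and an adaptive detector can be sketched in a few lines. The accrual-style detector below is a simplification that assumes exponentially distributed heartbeat intervals; it is not the implementation from the ZOOKEEPER-702 patch, and both class names are hypothetical.

```python
import math


class FixedHeartbeatDetector:
    """Suspects a peer once no heartbeat arrived within a fixed timeout."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last = None

    def heartbeat(self, now):
        self.last = now

    def suspects(self, now):
        return self.last is not None and now - self.last > self.timeout


class SimplePhiDetector:
    """Accrual-style detector: suspicion grows with silence, scaled by
    the heartbeat intervals observed so far."""

    def __init__(self, threshold=8.0):
        self.threshold = threshold
        self.intervals = []
        self.last = None

    def heartbeat(self, now):
        if self.last is not None:
            self.intervals.append(now - self.last)
        self.last = now

    def phi(self, now):
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        if mean <= 0:
            return 0.0
        # Exponential model: P(still silent after t) = exp(-t / mean),
        # and phi = -log10 of that probability.
        return (now - self.last) / (mean * math.log(10))

    def suspects(self, now):
        return self.phi(now) > self.threshold


fixed = FixedHeartbeatDetector(timeout=3)
adaptive = SimplePhiDetector(threshold=2.0)
for t in (0, 5, 10, 15):            # a slow link: one heartbeat every 5 units
    fixed.heartbeat(t)
    adaptive.heartbeat(t)
print(fixed.suspects(22), adaptive.suspects(22))
```

With heartbeats arriving every 5 time units, the fixed 3-unit timeout raises a false suspicion after 7 units of silence, while the adaptive detector has learned the slower rhythm and does not.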

Page 15: Apache ZooKeeper TechTuesday

Contrib

Page 16: Apache ZooKeeper TechTuesday

Large Scale Pub/Sub (hedwig)

1. JIRA: ZOOKEEPER-775
2. Uses ZooKeeper and BookKeeper
3. Committed to the trunk
4. Developed at Yahoo! Research
5. Used for PNUTS cross data center replication
6. http://vimeo.com/13282102

Page 17: Apache ZooKeeper TechTuesday

Work in Progress

only some interesting JIRAs

Page 18: Apache ZooKeeper TechTuesday

#834 Children for ephemerals

JIRA: ZOOKEEPER-834
Allow ephemeral nodes to have children owned by the same session.
Useful when publishing status information.
No need to do serialization for basic data structures (hash tables).
Similar to /proc in *nix systems.
Examples: /agent-01/ip, /agent-01/memory, /agent-01/load

Page 19: Apache ZooKeeper TechTuesday

#829 /zookeeper/sessions/* 

JIRA: ZOOKEEPER-829
Requested by HBase developers: "we'd like the ability to forcible expire someone else's ZK session"

Page 20: Apache ZooKeeper TechTuesday

Plenty of bug fixes

Join the community!

Page 21: Apache ZooKeeper TechTuesday

Resources

http://hadoop.apache.org/zookeeper/
http://wiki.apache.org/hadoop/ZooKeeper/ProjectDescription
http://wiki.apache.org/hadoop/ZooKeeper/Tao

Page 22: Apache ZooKeeper TechTuesday

Thanks! Questions?

