Date post: | 12-Jan-2016 |
Category: |
Documents |
Upload: | catherine-adams |
View: | 214 times |
Download: | 3 times |
Gil Einziger Roy FriedmanComputer Science Department
Technion
Postman: an Elastic Highly Resilient Pub/Sub Framework for
Self Sustained Service Independent P2P
Networks
Background: Publish/Subscribe
• Publisher: any entity that wishes to publish some event
Look at my new hairstyle
Background: Publish/Subscribe
• Subscriber: any entity that wishes to be notified about events that match its interests (also called subscription)
I want to know
everything about
hairstylesI want to
know everything
about Ariana Grande
I only care about science
fiction
Background: Publish/Subscribe
• The system’s goal is to deliver notifications about events to interested subscribers and only to them– Decoupling of information producers from consumers
• Applications:– Social networking (Twitter and the likes)– Stock quotes– Control systems– Data-center management– Etc.
Background: P2P
• Decentralized systems in which (most of) the communication is performed directly between the end nodes of the system – the peers
• Often, peers are donated users’ machines– But can also be set-top boxes, routers, a large datacenter’s
servers, etc.
• Famous example applications:– Skype, Bittorrent (and other file sharing), IPTV, Bitcoin
It’s a Brave New World Out There
• Most users access online content through their mobiles– Intermittent connectivity– Limited bandwidth– Limited battery life– Limited resources
• We need to decouple the devices used to access the data and the ones serving the P2P network
A Vision for Future P2P Solutions
• Whether ran as a true P2P network between donated machines or inside a data-center:
– Decoupling between devices that consume services and the ones providing the service
• Incentives might be in the form of revenue share with advertisers or paid subscribers
– P2P machines are used for providing multiple services• Not feasible to optimize the P2P overlay for a specific service
Problem Statement
• A scalable and efficient pub/sub system for self-sustained P2P networks– The challenge
• Subscribers might not be present much of the time• Short client sessions• Use the existing overlay
– Quality Goals• High delivery rate• Efficient publication delivery• Reasonable latency• High churn resilience
Our Solution: Overview
Our Solution: Subscribing to Mailbox
• Clients that are aware of mailboxes serving their topics, simply notify these mailboxes about it
• A client that is unaware of such mailboxes, initiates multiple biased random walks– Each mailbox distributes a hint (Bloom filter) with the topics it
subscribes to its overlay neighbors up to some distance– The random walks favor visiting nodes whose hints include a match– The random walks continue for a given TTL trying to find as many
matching mailbox as possible
• If none is found, then the home node becomes a mailbox for these topics
Our Solution: Mailboxes
• An overloaded mailbox can refuse to accept new clients and topics
• Mailboxes disappear naturally due to churn or when they are underutilized
• The important objective is load sharing rather than load balancing
Our Solution: Dissemination
• Spanning tree among mailboxes that know each other
• Random walks to discover new mailboxes and disseminate to them
Our Solution: Dissemination
• Spanning tree among mailboxes that know each other
• Random walks to discover new mailboxes and disseminate to them
Our Solution: Dissemination
• Spanning tree + random walks between mailboxes
• Normally, a mailbox pushes events to corresponding registered clients
• Additionally, out-of-band gossip between mailboxes and clients– Clients poll their set of known mailboxes– Exchange list of known events with each polled mailbox– Occurs periodically plus after re-connection
Implementation
– Written in Java– Open source project
• All code including testing available online
– Can be run on top of real IP networks as well as the PeerSim simulator
• In the real networks case, executed on top of the OpenKAD implementation of the Kademlia DHT
– Measurements confirm similarity between results with similar size networks
• Simulations can be used to explore scalability• Real networks can be used to validate simulation results
Evaluation: Methodology
• Traces:– Synthetic traces
• Subscriptions are spread to clients/home nodes uniformly• Topic publication distribution is Zipf-like with α=0.9
– Twitter traces
• Metrics:– Delivery rate– Communication load– Mailbox subscription pattern vs. users’– Effects of churn
Results: Synthetic Workload
Time (minutes)
Del
iver
y R
ate
Delivery rate over time vs. network size
1) Delivery rate approachs 100% after a few minutes
2) For 1500 nodes, simulation ~ real runs
Results: Twitter Traces
Time (minutes)
Del
iver
y R
ate
Delivery rate over time
Results: Load Distribution
# node
Tota
l Han
dled
Mes
sage
s
Load Distribution (Twitter)
1) Almost all load goes to mailboxes only
2) Even most loaded need to handle fewer than 10 messages per second
Results: Subscription Pattern
# node
Sub
scrip
tion
Num
ber
Twitter (Feb. 7 2010 19:00-20:20)
Subscription pattern of mailboxes much more uniform than of clients=> balanced dissemination trees
Results: Subscription Pattern
Client Subscriptions
Mai
lbox
Sub
scrip
tions
#Registered Clients/#Registered Mailboxes
Only a small number of mailboxes register even to the most popular topics => the dissemination trees are relatively small
Results: Single 10% Churn Event
Time (minutes)
Mis
s R
ate
Churn Recovery Time
Results: Repetitive 10% Churn
Time (minutes)
Mis
s R
ate
Churn Recovery Over Time
Results: Repetitive 100% Churn
Time (minutes)
Mis
s R
ate
Churn Recovery Time (Loosing All Mailboxes)
Summary
• The concept of elastic mailboxes– Self electing, self evaporating, self organizing
• Complementing delivery mechanisms– Spanning tree– Random walks– Out-of-band gossip through clients interaction
• End result:– Mailboxes dramatically reduce the scalability problem– Highly efficient, highly effective, highly robust to failures
and churn
Open Issues
• Exploit subscription similarity• Privacy
Q&A
• Thanks for listening…