Corona: A High Performance Publish-Subscribe System for the World Wide Web
Cornell University, Ithaca, NY
Networked Systems Design and Implementation (NSDI), May 2006
Motivation
Web content changes rapidly
Growing popularity of frequently updated content
  Weblogs
  Wikis
  News sites
Existing Web protocols do not provide a mechanism for automatically notifying users of updates
State of the Art
Uncoordinated polling tools
  E.g. micronews syndication tools (such as RSS readers)
Based on naïve repeated polling
Suffer from poor performance and scalability
  Subscribers are tempted to poll at faster rates to detect updates quickly
  Different subscribers repeatedly poll for the same content independently
  Content providers have to handle the high bandwidth load
Background and Related Work: Publish-Subscribe Systems
Can be classified as:
  Topic based
  Content based
Prior research focused on content filtering and event delivery mechanisms
Main drawback: incompatibility with the current Web architecture
Background and Related Work: Publish-Subscribe Systems
Topic-based systems are based on several decentralized mechanisms:
  Group communication: Isis
  Shared object spaces: Linda, TSpaces, JavaSpaces
  Rendezvous points: TIBCO, Herald
Content-based publish-subscribe systems use in-network content filtering and aggregation:
  SIENA
  Gryphon
  Elvin
  Astrolabe
Background and Related Work: Micronews Systems
Short descriptions of frequently updated information in XML-based formats such as RSS and Atom
Accessed via HTTP through URLs and supported by feed readers
Commercial services have started disseminating micronews updates to users:
  Bloglines, NewsGator, Queoo
  Use fragile servers and relentless polling
FeedTree: a recent system for micronews dissemination
  Uses a structured overlay and cooperative polling, and shares updates between peers
CAM and WIC use techniques for resource allocation similar to Corona, but are limited to a single node
Corona: Cornell Online News Aggregator
Topic-based publish-subscribe system for the Web
Interoperates with the current pull-based architecture of the Web
URLs of Web content serve as topics or channels
  Any Web object identifiable by a URL can be monitored with Corona
Corona Architecture
Corona Architecture
Uses a structured overlay network (Pastry)
  Provides decentralization, good failure resilience, and high scalability
The key feature that enables Corona to achieve fast update detection is cooperative polling:
  Multiple nodes are assigned to periodically poll the same channel, and updates detected by any polling node are shared
  The number of nodes that poll each channel is determined by analyzing the tradeoff between update performance and network load
  Corona poses this tradeoff as an optimization problem
Pastry
The network is organized into a ring
Each node is assigned an id from a circular numeric space
The ids are treated as a sequence of k digits of base b (b is a power of 2)
Routing:
  Pointers to neighbors in each direction
  “Long distance” contacts: the entry in the i’th row and j’th column of a node's routing table points to a node whose id shares i prefix digits with the present node and whose (i + 1)th digit is j
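The routing-table rule above can be sketched in a few lines of Python. This is a minimal illustration, not Pastry's implementation: the function names `shared_prefix_len` and `routing_slot` are hypothetical, ids are assumed to be equal-length hexadecimal strings (b = 16), and rows are counted from 0 as in the slide.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits that two equal-length ids have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def routing_slot(node_id: str, dest_id: str, base: int = 16):
    """Return (row, column) of the routing-table entry a node would consult
    for dest_id, or None when dest_id is the node's own id.

    row    = number of prefix digits shared with the present node (i)
    column = value of the destination's (i + 1)th digit (j)
    """
    i = shared_prefix_len(node_id, dest_id)
    if i == len(node_id):
        return None
    return i, int(dest_id[i], base)


# Ids taken from the routing example in the slides, plus one made-up id.
print(routing_slot("65a1fc", "d46a1c"))
print(routing_slot("65a1fc", "65a2b0"))
```

Each hop resolves at least one more digit of the destination id, which is why the number of populated rows bounds the route length.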
Pastry
Routing table of a Pastry node with NodeId = 65a1x, b = 16. x is chosen so that the pointed-to node is the closest to the present node (according to a scalar proximity metric, such as the round trip time)
Pastry
Routing a message from node 65a1fc to node d46a1c
Analytical Modeling: Pastry
The NodeId space can be viewed as “level sectors”: the nodes in a sector of level l share the first l digits of the NodeId
[Figure: example ring with b=2, k=4, showing node ids 0000, 0010, 0100, 0110, 0111, 1000, 1010, 1011, 1101, 1111 and sectors of level l=0 through l=4]
Pastry
This way, the nodes in the i’th row of a routing table are in the same sector of level i as the present node
If the nodes are uniformly distributed, only log_b(N) levels in the routing table are populated (the smallest sector is of level log_b(N))
Any node can be reached in log_b(N) hops
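To get a feel for the log_b(N) bound, the sketch below computes it for a few network sizes. The helper name `max_hops` is an assumption for illustration; it rounds log_b(N) up to a whole number of hops and uses integer arithmetic to avoid floating-point rounding at exact powers of b.

```python
def max_hops(num_nodes: int, base: int) -> int:
    """Smallest h with base**h >= num_nodes, i.e. ceil(log_base(num_nodes))."""
    hops, reach = 0, 1
    while reach < num_nodes:
        reach *= base  # one more resolved digit multiplies the reachable set by b
        hops += 1
    return hops


for n in (64, 1024, 1_000_000):
    print(n, "nodes, base 16 ->", max_hops(n, 16), "hops")
```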
The Polling Scheme
Corona assigns nodes in well-defined wedges of the Pastry ring for polling each channel
Each channel is assigned an id from the same circular numeric space
A channel with polling level l is polled by all nodes with at least l matching prefix digits in their ids
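The wedge rule can be written directly as a prefix test. This is a small sketch under assumed names (`matching_prefix_digits`, `pollers`) with ids as equal-length hexadecimal strings; the node and channel ids in the example are made up.

```python
import os


def matching_prefix_digits(node_id: str, channel_id: str) -> int:
    """Count the leading digits shared by a node id and a channel id."""
    return len(os.path.commonprefix([node_id, channel_id]))


def pollers(node_ids, channel_id, level):
    """Nodes in the channel's wedge: at least `level` matching prefix digits."""
    return [n for n in node_ids
            if matching_prefix_digits(n, channel_id) >= level]


nodes = ["65a1fc", "65b000", "6f0000", "d46a1c"]
for level in range(4):
    print(level, pollers(nodes, "65a200", level))
```

Level 0 makes every node a poller; each higher level shrinks the wedge by roughly a factor of b.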
The Polling Scheme
Any problems with the scheme?
  Polling levels selection
  What about topology?
The Polling Scheme
Cooperative polling in Corona
Analytical Modeling
The polling level of a channel quantifies its performance-overhead tradeoff:
  A channel at level $l$ has, on average, $\frac{N}{b^{l}}$ nodes polling it
  These nodes can cooperatively detect updates in time $\frac{\tau}{2} \cdot \frac{b^{l}}{N}$ on average, where $\tau$ is the polling interval
  The collective load placed on the content server of the channel is proportional to $\frac{N}{b^{l}}$
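A short numeric sketch makes the tradeoff concrete: with N nodes, base b and polling interval τ, a channel at level l has about N / b^l pollers and an average detection time of (τ / 2) · (b^l / N). The function name `level_tradeoff` is hypothetical, and the example values (1024 nodes, b = 16, a 30-minute interval) are borrowed from other slides in this deck purely for illustration.

```python
def level_tradeoff(num_nodes, base, level, polling_interval):
    """Return (expected pollers, expected update detection time) for a level.

    The expected number of pollers is also the factor by which server load
    grows relative to a single poller.
    """
    pollers = num_nodes / base ** level
    detection_time = (polling_interval / 2) * (base ** level / num_nodes)
    return pollers, detection_time


for level in range(3):
    pollers, minutes = level_tradeoff(1024, 16, level, 30.0)
    print(f"level {level}: {pollers:g} pollers, {minutes:.3f} min to detect")
```

Lowering the level by one multiplies the load by b and divides the detection time by b, which is the tradeoff the optimization problems on the following slides balance.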
Performance-overhead tradeoff approaches
Corona-Lite
Performance goal: minimizing the average update detection time while bounding the network load on content servers
The overall update performance = average of the update detection time of each channel, weighted by the number of clients subscribed to the channel
Target network load: the total number of subscriptions in the system

$$\min \sum_{i=1}^{M} q_i \frac{b^{l_i}}{N} \quad \text{s.t.} \quad \sum_{i=1}^{M} s_i \frac{N}{b^{l_i}} \le \sum_{i=1}^{M} q_i$$

M - number of channels
N - number of nodes
b - base of structured overlay
T - performance target
l_i - polling level of channel i
q_i - number of clients for channel i
s_i - content size for channel i
u_i - update interval for channel i
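For a handful of channels the Corona-Lite problem (minimize the sum of q_i · b^{l_i} / N subject to the sum of s_i · N / b^{l_i} not exceeding the sum of q_i) can be solved by exhaustive search, which helps in seeing what the optimum looks like. This is only a toy sketch: the function name, the `max_level` cap and the example numbers are assumptions, and exhaustive search is not how Corona solves the problem at scale.

```python
from itertools import product


def corona_lite_bruteforce(q, s, num_nodes, base, max_level):
    """Try every assignment of levels 0..max_level and keep the cheapest
    one whose total server load stays within the budget sum(q)."""
    budget = sum(q)
    best = None
    for levels in product(range(max_level + 1), repeat=len(q)):
        load = sum(si * num_nodes / base ** l for si, l in zip(s, levels))
        if load > budget:
            continue  # violates the load constraint
        cost = sum(qi * base ** l / num_nodes for qi, l in zip(q, levels))
        if best is None or cost < best[0]:
            best = (cost, levels)
    return best


# Three channels with 20, 3 and 1 subscribers, unit content size,
# on a 16-node overlay with base 2.
cost, levels = corona_lite_bruteforce(
    q=[20, 3, 1], s=[1, 1, 1], num_nodes=16, base=2, max_level=4)
print("levels:", levels, "weighted detection cost:", cost)
```

In this instance the popular channel ends up at the lowest level (polled by the most nodes), which matches the observation that clients of popular channels benefit most.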
Performance-overhead tradeoffs
Is the average a meaningful metric?
  Other suggestions?
  User-effected weights
Update detection time
  What about the propagation to the user?
Comments on the selection of the polling interval?
Other parameters that could be accounted for?
  User inputs to tackle stickiness, 24 hour polling
Performance-overhead tradeoff approaches
Corona-Lite
Clients of popular channels gain greater benefits than clients of less popular channels.
Yet, Corona-Lite does not suffer from “diminishing returns”
Corona-Lite performance can vary depending on the current workload
Performance-overhead tradeoff approaches
Corona-Fast
Corona-Fast provides stable update performance, maintained steadily at a desired level through changes in the workload
Minimizes the total network load on the content servers while meeting a target average update detection time

$$\min \sum_{i=1}^{M} s_i \frac{N}{b^{l_i}} \quad \text{s.t.} \quad \sum_{i=1}^{M} q_i \frac{b^{l_i}}{N} \le T \sum_{i=1}^{M} q_i$$
Performance-overhead tradeoff approaches
Corona-Fair
Corona-Fast and Corona-Lite do not consider the actual rate of change of content in a channel
Corona-Fair incorporates the update rate of channels into the performance tradeoff to achieve a fairer distribution of update performance between channels
It defines a modified update performance metric as the ratio of the update detection time to the update interval of the channel
Performance-overhead tradeoff approaches
Corona-Fair
Minimizes the average of the ratio metric while bounding the load on content servers

$$\min \sum_{i=1}^{M} \frac{q_i}{u_i} \frac{b^{l_i}}{N} \quad \text{s.t.} \quad \sum_{i=1}^{M} s_i \frac{N}{b^{l_i}} \le \sum_{i=1}^{M} q_i$$
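Variants of the Corona-Fair objective that replace the update interval by its square root or its logarithm:

$$\min \sum_{i=1}^{M} \frac{q_i}{\sqrt{u_i}} \frac{b^{l_i}}{N} \quad \text{s.t.} \quad \sum_{i=1}^{M} s_i \frac{N}{b^{l_i}} \le \sum_{i=1}^{M} q_i$$

$$\min \sum_{i=1}^{M} \frac{q_i}{\log(u_i)} \frac{b^{l_i}}{N} \quad \text{s.t.} \quad \sum_{i=1}^{M} s_i \frac{N}{b^{l_i}} \le \sum_{i=1}^{M} q_i$$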
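The three Corona-Fair objectives differ only in the per-channel weight that multiplies b^{l_i} / N: q_i / u_i, q_i / sqrt(u_i), or q_i / log(u_i). The sketch below compares those weights for two channels with the same number of subscribers but different update intervals. The function name and the example values are assumptions; u must be greater than 1 for the logarithmic form to be positive, and the units of u are left unspecified here, as in the slides.

```python
import math


def fair_weights(q, u):
    """Per-channel objective weight under the three Corona-Fair forms."""
    return {
        "ratio": q / u,
        "sqrt": q / math.sqrt(u),
        "log": q / math.log(u),
    }


fast = fair_weights(q=10, u=4)     # channel that changes often
slow = fair_weights(q=10, u=400)   # channel that changes rarely
for name in ("ratio", "sqrt", "log"):
    print(f"{name:5s}: fast/slow weight ratio = {fast[name] / slow[name]:.1f}")
```

The spread between the two channels shrinks from the plain ratio to the square-root and logarithmic forms, so the latter two penalize slowly changing channels less severely.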
Performance-overhead tradeoff approaches
Comments about the approaches?
Decentralized Optimization
Honeycomb
Corona determines the optimal polling levels using the Honeycomb optimization toolkit
Provides numerical algorithms and decentralized mechanisms for solving optimization problems of the kind:

$$\min \sum_{i=1}^{M} f_i(l_i) \quad \text{s.t.} \quad \sum_{i=1}^{M} g_i(l_i) \le T$$

Honeycomb finds an approximate solution in O(M log M log N) time (using a Lagrange multiplier)
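To illustrate how a Lagrange multiplier turns this constrained problem into independent per-channel choices, here is a simplified, centralized sketch. It is not Honeycomb's algorithm: it bisects on a multiplier λ, lets each channel pick the level minimizing f_i(l) + λ · g_i(l), and returns the assignment for the smallest λ found that satisfies the constraint. The names `pick_levels` and `solve`, the table representation of f and g, and the example instance are all assumptions; the result is feasible but not guaranteed optimal.

```python
def pick_levels(f, g, lam):
    """Each channel independently minimizes f_i(l) + lam * g_i(l).
    f[i][l] and g[i][l] are tables indexed by channel and level."""
    return [min(range(len(fi)), key=lambda l: fi[l] + lam * gi[l])
            for fi, gi in zip(f, g)]


def total_g(g, levels):
    return sum(gi[l] for gi, l in zip(g, levels))


def solve(f, g, target, iters=60):
    """Bisect on the multiplier; `hi` always holds a feasible value."""
    levels = pick_levels(f, g, 0.0)
    if total_g(g, levels) <= target:
        return levels  # constraint is not binding
    lo, hi = 0.0, 1.0
    while total_g(g, pick_levels(f, g, hi)) > target:
        hi *= 2
        if hi > 1e12:
            raise ValueError("no feasible assignment found")
    for _ in range(iters):
        mid = (lo + hi) / 2
        if total_g(g, pick_levels(f, g, mid)) <= target:
            hi = mid
        else:
            lo = mid
    return pick_levels(f, g, hi)


# Toy Corona-Lite instance: f = q * b^l / N, g = s * N / b^l, T = sum(q).
q, s, n, b, max_level = [20, 3, 1], [1, 1, 1], 16, 2, 4
f = [[qi * b ** l / n for l in range(max_level + 1)] for qi in q]
g = [[si * n / b ** l for l in range(max_level + 1)] for si in s]
levels = solve(f, g, target=sum(q))
print("levels:", levels, "load:", total_g(g, levels))
```

The per-channel minimization is what makes the approach amenable to decentralization: once λ is agreed upon, each node can choose levels for its own channels without seeing the others.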
Decentralized Optimization
Tradeoff clusters
Solving the optimization problem using only the limited data available locally can produce highly inaccurate solutions
Collecting the tradeoff factors for all the channels at each node is expensive and impractical
Honeycomb combines channels with similar tradeoff factors into a tradeoff cluster
The nodes periodically exchange the clusters with contacts in the routing table and aggregate the clusters received from the contacts
The overhead of cluster aggregation is kept low by limiting the number of clusters to a constant
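One way to picture cluster aggregation with a fixed cap is sketched below. The representation is entirely hypothetical and is not taken from Honeycomb: each cluster is summarized as a pair (mean tradeoff value, number of channels), and when the merged list exceeds the cap, the two adjacent clusters with the closest means are combined.

```python
def merge_clusters(local, received, max_clusters):
    """Combine two cluster lists and cap the result at max_clusters.

    Each cluster is (mean_value, channel_count). Merging two clusters
    keeps the count-weighted mean, so the sorted order is preserved.
    """
    clusters = sorted(local + received)
    while len(clusters) > max_clusters:
        # index of the adjacent pair whose means are closest
        i = min(range(len(clusters) - 1),
                key=lambda k: clusters[k + 1][0] - clusters[k][0])
        (m1, c1), (m2, c2) = clusters[i], clusters[i + 1]
        merged = ((m1 * c1 + m2 * c2) / (c1 + c2), c1 + c2)
        clusters[i:i + 2] = [merged]
    return clusters


local = [(1.0, 2), (10.0, 1)]
received = [(1.5, 2), (50.0, 1)]
print(merge_clusters(local, received, max_clusters=3))
```

Because the summary has constant size, the cost of each exchange with a routing-table contact does not grow with the number of channels in the system.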
System Management
Each channel in Corona has a unique id and one or more owner nodes managing it
The primary owner of a channel is the Corona node with the numerically closest id to the channel's
Corona adds the F closest neighbors of the primary owner as additional owners to tolerate failures
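Owner selection can be sketched as a nearest-id lookup on the ring. The sketch assumes ids are integers in a circular space of a given size and that "closest" means smallest circular distance, with ties broken by the smaller id; the function names and the example ids are made up, and a real overlay would find these nodes by routing rather than by scanning a global list.

```python
def ring_distance(a, b, space):
    """Distance between two ids on a circular id space of the given size."""
    d = abs(a - b)
    return min(d, space - d)


def channel_owners(channel_id, node_ids, f, space):
    """Return (primary owner, list of the F nodes closest to the primary)."""
    primary = min(node_ids,
                  key=lambda n: (ring_distance(n, channel_id, space), n))
    others = sorted((n for n in node_ids if n != primary),
                    key=lambda n: (ring_distance(n, primary, space), n))
    return primary, others[:f]


nodes = [0x10, 0x42, 0x47, 0x90, 0xF0]
primary, backups = channel_owners(0x45, nodes, f=2, space=16 ** 2)
print(hex(primary), [hex(n) for n in backups])
```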
System Management
Owners take responsibility for managing subscriptions, polling, and updates for a channel
  Problems?
Owners also keep track of channel-specific factors that affect the performance tradeoffs: the number of subscribers, the content size, and the update rate
All nodes run a periodic protocol:
  Optimization phase
  Maintenance phase
  Aggregation phase
System Management: Changing Polling Levels (Maintenance phase)
Corona nodes operate independently and make decisions to increase or decrease polling levels locally
Initially, only the owner nodes poll for the channels
When a level i node lowers the level to i-1, or raises the level from i-1 back to i, it instructs its contacts in row i-1 of its routing table to start or stop polling for that channel, respectively
When a node is instructed to begin polling, it waits for a random interval of time between 0 and the polling interval before the first poll
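The random initial wait spreads the polls of different nodes across the polling interval instead of letting them fire together. A minimal sketch, assuming a hypothetical helper `poll_times` and a 30-minute interval (the value used in the evaluation slides):

```python
import random


def poll_times(polling_interval, count, rng):
    """First `count` poll times for a node that has just been told to start:
    a uniformly random initial delay, then one poll per interval."""
    first = rng.uniform(0, polling_interval)
    return [first + k * polling_interval for k in range(count)]


rng = random.Random(2006)  # seeded only to make the demo repeatable
for node in ("node-a", "node-b", "node-c"):
    print(node, [round(t, 1) for t in poll_times(30.0, 3, rng)])
```

With offsets staggered this way, some node in the wedge polls the channel much more often than once per interval, even though each individual node still polls only once per interval.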
System Management: Updating Tradeoff Factors (Aggregation phase)
Owners monitor the number of subscribers and send out fresh estimates along with the maintenance message
Descendant nodes propagate these estimates to all the nodes in the wedge
The update interval and feed size change only during updates and are therefore sent along with update messages
Tradeoff clusters are also sent by contacts in the routing table in response to maintenance messages
System Management: Failure resilience
Inherited from the underlying overlay
When new nodes join the system or nodes fail, Corona ensures the transfer of subscription state to new owners
Simultaneous failure of more than F adjacent nodes might cause a system failure
  But clients can easily renew subscriptions
System Management
Comments on system management?
Update Dissemination
Version numbers are used to identify new content
  Either a timestamp provided by the content server or numbers assigned by the primary owner
Corona nodes share updates as deltas between old and new content
  Bandwidth can be saved through data encoding
When a delta is generated by a node, it shares the update with all other nodes in the channel's polling wedge
  Why?
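The idea of shipping a versioned delta rather than the full content can be sketched with Python's standard `difflib`. This is a stand-in for illustration only: the slides do not say which difference engine or encoding Corona uses, and the function name `make_delta`, the dictionary layout and the integer version counter are assumptions.

```python
import difflib


def make_delta(old: str, new: str, old_version: int):
    """Return None when nothing changed, otherwise a versioned line diff."""
    if old == new:
        return None
    diff = list(difflib.unified_diff(
        old.splitlines(), new.splitlines(), lineterm=""))
    return {"version": old_version + 1, "delta": diff}


old_feed = "item: overlay paper\nitem: polling paper"
new_feed = "item: overlay paper\nitem: polling paper\nitem: new post"
update = make_delta(old_feed, new_feed, old_version=7)
print(update["version"])
print("\n".join(update["delta"]))
```

A node receiving the update can compare the version number with the one it already holds and discard deltas it has seen, which matters when several pollers detect the same change.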
User Interface
Corona employs instant messaging (IM) as its user interface
  Subscribe/Unsubscribe messages
A subscribe or unsubscribe message is routed to all the owner nodes of the channel, which update their subscription state
When an update is detected by Corona, the current primary owner sends a message with the delta to all the subscribers through the IM system
Comments on the user interface?
Just to compare: FeedTree
Also uses Pastry
Group communication (Scribe): the subscribers join the overlay
Full deployment:
  Push-based: the publishers join the overlay
  Updates are sent using the overlay multicast
Partial deployment:
  The publishers are collectively polled by groups of the overlay nodes
  They produce the updates and distribute them to the subscribers
Advantages, disadvantages?
Evaluation
Large scale simulations
Wide-area experiments on PlanetLab
Performance is compared to that of legacy RSS
Comments?
Simulation
Real-life RSS traces are used
The tradeoff parameters are extrapolated to a larger scale:
  1024 nodes
  100,000 channels
  5,000,000 subscribers
Polling interval: 30 minutes
  How was that selected?
All three schemes are checked and compared to RSS
Simulation Results
Network Load on Content Servers
Average Update Detection Time
Simulation Results
Number of Pollers per Channel
Update Detection Time per Channel
Simulation Results
There are roughly two levels of polling according to this simulation
Most of the channels are polled by ~100 nodes
Is Honeycomb really necessary for that?
Simulation Results: Corona-Fair
Update Detection Time
Simulation Summary
What didn’t they check?
Deployment
A set of 60 PlanetLab nodes
Corona-Lite scheme is used
7500 RSS feeds from www.syndic8.com
150,000 subscriptions
Polling interval: 30 minutes
Deployment Results
Average Update Detection Time
Total Polling Load on Servers