NaradaBrokering for CTS05 GlobalMMCS Tutorial
CTS05 St. Louis May 17 2005
Geoffrey Fox
CTO, Anabas Corporation, and Computer Science, Informatics, Physics
Pervasive Technology Laboratories, Indiana University, Bloomington IN 47401
[email protected] http://www.infomall.org
Grid (Web Service) Messaging
Build distributed systems from “interoperable” services linked by messages (SOAP) – architect capabilities as services
Grids are “just” large scale sets of such services
Need to support real time streams and NOT just files (collections of messages) consistent with WS standards (P2P and “central”)
Open Source http://www.naradabrokering.org (4 downloads/day) is a scalable distributed pub-sub system supporting multiple standards (JMS, WS) and subscription methods
• Implements “Service Internet” and Notification areas of WS-* Infrastructure
Manage messaging for
• Optimize communication for bad links, firewalls etc.
• Collaboration (multi-cast streams)
• Fault tolerance with re-transmitted messages and Replicated Services
• Replay – access any message at any time
• Virtualize addressing with pub-sub metaphor
• Performance from protocol (UDP v Parallel TCP ..) and representation
• Heterogeneous Clients – filter to and from PDAs
Candidate for Axis2-MOM (Message Oriented Middleware) infrastructure
[Figure: NaradaBrokering broker network with Stream and Queues labels. NB supports messages and streams; NB role for Grid is similar to MPI role for MPP.]
Traditional NaradaBrokering Features
Multiple protocol transport support (in publish-subscribe paradigm with different protocols on each link): Transport protocols supported include TCP, Parallel TCP streams, UDP, Multicast, SSL, HTTP and HTTPS. Communications through authenticating proxies/firewalls & NATs. Network QoS based routing. Allows highest performance transport.
Subscription Formats: Subscription can be Strings, Integers, XPath queries, Regular Expressions, SQL and tag=value pairs.
Reliable delivery: Robust and exactly-once delivery in presence of failures.
Ordered delivery: Producer Order and Total Order over a message type. Time Ordered delivery using Grid-wide NTP based absolute time.
Recovery and Replay: Recovery from failures and disconnects. Replay of events/messages at any time. Buffering services.
Security: Message-level WS-Security compatible security.
Message Payload options: Compression and Decompression of payloads. Fragmentation and Coalescing of payloads.
Messaging Related Compliance: Java Message Service (JMS) 1.0.2b compliant. Support for routing P2P JXTA interactions.
Grid Feature Support: NaradaBrokering enhanced Grid-FTP. Bridge to Globus GT3.
Web Services supported: Implementations of WS-ReliableMessaging, WS-Reliability and WS-Eventing.
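As an illustration of the subscription formats listed above, the following minimal Python sketch matches one event against string, regular expression and tag=value subscriptions. It is not the NaradaBrokering API: the `matches` function, the tuple encoding of subscriptions, the subscriber names and the topic names are all hypothetical, and the XPath, integer and SQL formats are left out.

```python
import re


def matches(subscription, topic, properties):
    """Return True if an event (topic plus tag=value properties) satisfies a subscription."""
    kind, value = subscription
    if kind == "string":
        # Exact topic match.
        return topic == value
    if kind == "regex":
        # The whole topic must match the regular expression.
        return re.fullmatch(value, topic) is not None
    if kind == "tags":
        # Every requested tag must be present with the requested value.
        return all(properties.get(k) == v for k, v in value.items())
    raise ValueError(f"unsupported subscription kind: {kind}")


# Hypothetical event and subscribers, for illustration only.
event_topic = "conference/session1/video"
event_props = {"codec": "h261", "site": "Indiana"}

subscriptions = {
    "alice": ("string", "conference/session1/video"),
    "bob": ("regex", r"conference/.*/audio"),
    "carol": ("tags", {"site": "Indiana"}),
}

receivers = sorted(
    name for name, sub in subscriptions.items()
    if matches(sub, event_topic, event_props)
)
print(receivers)
```

The publisher never names a receiver; the set of receivers follows from the subscriptions alone, which is the sense in which pub-sub virtualizes addressing.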
Features for March-June 2005 Releases
Production implementations of WS-Eventing, WS-Notification, WS-RM and WS-Reliability.
SOAP message support, with NaradaBrokers viewed as SOAP Intermediaries
Active replay support: Pause and Replay live streams.
Stream Linkage: can permanently link multiple streams – used in annotating real-time video streams
Replicated storage support for fault tolerance and resiliency to storage failures.
Management: HPSearch Scripting Interface to streams and services
Broker Discovery: Locate appropriate brokers
Summary
NaradaBrokering provides a fully distributed queue manager where queues buffer streams with overheads of a few milliseconds per broker
• << 30 ms frame interval
• << 100s of ms network delay
• Much faster than using databases or writing files
Collaboration is implemented by sharing and synchronizing streams
Compatible with Grids, Web Services, Java Message Service
Streams are “first class entities” with a rich set of features
• Don’t open a socket; hand data to NaradaBrokering
Software Overlay Network or Message Oriented Middleware
NaradaBrokering Services
Reliable Delivery Service
Guaranteed delivery in multiple producer/consumer settings.
Guarantees hold true in the presence of
• Node/Link failures
• Links that can lose messages and garble message order.
• Storage failures: Stores need to recover after failure.
• Prolonged entity disconnects
Exactly-Once and Ordered delivery of events
Uses both positive & negative acknowledgements
Supports Replay and Fast Recovery from failures
Independent of underlying archival system.
Was used to enhance fault tolerance in Grid-FTP.
Uses “Reliable Storage” to keep messages temporarily
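How positive and negative acknowledgements can give exactly-once, ordered delivery over a link that drops, duplicates or reorders messages can be sketched as follows. This is a generic sequence-number scheme under stated assumptions, not the NaradaBrokering protocol: the `Receiver` class and its method names are hypothetical, and the reliable storage and retransmission side is omitted.

```python
class Receiver:
    """Delivers each message once, in sequence order, and reports gaps."""

    def __init__(self):
        self.next_seq = 1      # next sequence number to hand to the application
        self.pending = {}      # out-of-order messages waiting for a gap to fill
        self.delivered = []    # what the application has seen, in order

    def on_message(self, seq, payload):
        """Accept one message; return (ack, naks).

        ack  is the highest sequence number delivered so far (positive ack).
        naks lists sequence numbers known to be missing (negative acks).
        """
        # Ignore duplicates: already delivered or already buffered.
        if seq >= self.next_seq and seq not in self.pending:
            self.pending[seq] = payload
        # Deliver every message that is now contiguous.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        highest = max(self.pending, default=self.next_seq - 1)
        naks = [s for s in range(self.next_seq, highest) if s not in self.pending]
        return self.next_seq - 1, naks


rx = Receiver()
# Message 2 is delayed, message 3 is duplicated, message 2 is retransmitted twice.
arrivals = [(1, "a"), (3, "c"), (3, "c"), (2, "b"), (2, "b")]
replies = [rx.on_message(seq, body) for seq, body in arrivals]
print(replies)
print(rx.delivered)
```

A sender that keeps unacknowledged messages in storage can retransmit on a NAK and discard on an ACK, which is one reason the storage only needs to hold messages temporarily.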
[Figure: Transit delays/standard deviations in a 3 broker network, NB-BestEffort (BE) (TCP) vs NB-ReliableDelivery (RD) (UDP). X axis: Content Payload Size in Bytes (0 to 10000). Y axis: Time (Milliseconds), 0 to 16. Series: Mean delay (NBRD-UDP), Mean delay (NBBE-TCP), Std Dev (NBRD-UDP), Std Dev (NBBE-TCP).]
[Figure: Transit delays/standard deviations in a single broker network, NB-Best Effort (TCP) versus NB-Reliable Delivery (UDP). X axis: Content Payload Size in Bytes (0 to 10000). Y axis: Time (Milliseconds), 0 to 12. Series: Mean delay (NBRD-UDP), Mean delay (NBBE-TCP), Std Dev (NBRD-UDP), Std Dev (NBBE-TCP).]
Dealing with large payload sizes
To cope with large payloads, the substrate incorporates 2 sets of services.
Compression/Decompression service: The substrate incorporates support for zlib based compression and decompression of payloads.
Fragmentation/Coalescing Service: The fragmentation service can break up a large payload into smaller fragments. The coalescing service can take these smaller fragments and coalesce them into the original large payload.
• This was used to deal with transfer of large payloads (up to 1 GB) in the NB enhanced Grid-FTP application.
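The two services can be combined as in the minimal Python sketch below, which compresses a payload with zlib, fragments it, and then coalesces and decompresses it on the receiving side. The `fragment` and `coalesce` helpers, the (index, total, data) fragment layout and the 64-byte fragment size are assumptions made for illustration; the actual NaradaBrokering fragment format is not described in these slides.

```python
import zlib


def fragment(payload: bytes, size: int):
    """Split payload into (index, total, chunk) fragments of at most `size` bytes."""
    total = (len(payload) + size - 1) // size
    return [(i, total, payload[i * size:(i + 1) * size]) for i in range(total)]


def coalesce(fragments):
    """Rebuild the payload from fragments that may arrive in any order."""
    ordered = sorted(fragments, key=lambda f: f[0])
    total = ordered[0][1]
    if [f[0] for f in ordered] != list(range(total)):
        raise ValueError("missing or duplicate fragments")
    return b"".join(f[2] for f in ordered)


original = b"NaradaBrokering " * 4096           # 64 KB of repetitive data
compressed = zlib.compress(original)             # sender: compress first
fragments = fragment(compressed, 64)             # then fragment
arrived = list(reversed(fragments))              # simulate out-of-order arrival
restored = zlib.decompress(coalesce(arrived))    # receiver: coalesce, then decompress

print(len(original), len(compressed), len(fragments))
```

Compressing before fragmenting keeps the number of fragments small; the same fragmentation idea is what allows UDP transport of payloads larger than a single datagram.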
Replay Services
Replay requestors can specify replays based on several parameters
• A range of sequence numbers can be specified.
• Additionally, constraints on an event’s content synopsis can be specified.
• Based on a specified time range.
Replay services have been tested with applications such as Audio/Video conferencing, Whiteboards etc.
Essential for recording and replay of collaborative sessions
Important special case supports rewind and similar operations on a real-time stream
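The replay parameters above can be read as filters over an archive of stored events. The Python sketch below is one hedged interpretation: the `Event` fields, the `replay` function and the sample archive are hypothetical, and the synopsis constraint is reduced to a simple equality test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    seq: int          # sequence number assigned when the event was archived
    timestamp: float  # time of the event, in seconds
    synopsis: str     # content synopsis, e.g. the topic
    payload: str


def replay(archive, seq_range=None, time_range=None, synopsis=None):
    """Select archived events by sequence range, time range and synopsis."""
    selected = []
    for e in archive:
        if seq_range is not None and not (seq_range[0] <= e.seq <= seq_range[1]):
            continue
        if time_range is not None and not (time_range[0] <= e.timestamp <= time_range[1]):
            continue
        if synopsis is not None and e.synopsis != synopsis:
            continue
        selected.append(e)
    # Replay in the original sequence order.
    return sorted(selected, key=lambda e: e.seq)


archive = [
    Event(1, 0.00, "audio", "a1"),
    Event(2, 0.03, "video", "v1"),
    Event(3, 0.06, "audio", "a2"),
    Event(4, 0.09, "video", "v2"),
]

by_seq = [e.payload for e in replay(archive, seq_range=(2, 3))]
by_time_and_topic = [e.payload for e in replay(archive, time_range=(0.0, 0.07), synopsis="audio")]
print(by_seq, by_time_and_topic)
```

Rewind on a live stream is then a replay whose time range ends at the current time.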
Buffering Service
This service is incorporated into the system to facilitate the buffering of events prior to releasing them.
Buffering service time orders events and releases events based on three metrics
• Number of events in the buffer
• Size of the buffer
• Time spent by event in a buffer.
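A buffer that time-orders events and releases them on the three metrics above might look like the following Python sketch. The `EventBuffer` class, its thresholds and the explicit `now` arguments are assumptions for illustration; the slides do not say how the three metrics are combined, so this sketch releases while any one threshold is exceeded.

```python
import heapq


class EventBuffer:
    """Holds events ordered by timestamp; releases them when a threshold is hit."""

    def __init__(self, max_events, max_bytes, max_age):
        self.max_events = max_events  # release when more events than this are held
        self.max_bytes = max_bytes    # release when more payload bytes than this are held
        self.max_age = max_age        # release when the head event has waited this long
        self._heap = []
        self._bytes = 0
        self._counter = 0             # tie-breaker for equal timestamps

    def add(self, timestamp, payload, now):
        heapq.heappush(self._heap, (timestamp, self._counter, payload, now))
        self._counter += 1
        self._bytes += len(payload)

    def release(self, now):
        """Pop events in timestamp order while any of the three limits is exceeded."""
        out = []
        while self._heap and (
            len(self._heap) > self.max_events
            or self._bytes > self.max_bytes
            or now - self._heap[0][3] >= self.max_age
        ):
            timestamp, _, payload, _ = heapq.heappop(self._heap)
            self._bytes -= len(payload)
            out.append((timestamp, payload))
        return out


buf = EventBuffer(max_events=2, max_bytes=100, max_age=5.0)
buf.add(3.0, b"c", now=0.0)           # events arrive out of time order
buf.add(1.0, b"a", now=0.1)
first = buf.release(now=0.2)          # nothing exceeds a limit yet
buf.add(2.0, b"b", now=0.3)
second = buf.release(now=0.4)         # count limit exceeded: earliest event leaves
third = buf.release(now=6.0)          # age limit reached: the rest leave, in time order
print(first, second, third)
```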
Time Differential Service
This service is essential to reduce jitter in large distributed environments.
• Networks introduce unpredictable delays that increase jitter.
Service takes events released by buffering service, and ensures that it preserves time spacing between events.
The Time Differential Service (TDS) can provide time spacing resolution of up to 1 millisecond between events.
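Preserving time spacing can be expressed as a simple rule: release event i at the first release time plus the gap between its source timestamp and the first source timestamp. The Python sketch below applies that rule and measures jitter as the deviation of observed spacing from source spacing. The function names and the millisecond values are invented for illustration; they are not measurements from the talk.

```python
def schedule_releases(source_times, first_release):
    """Release times that reproduce the spacing of the source timestamps."""
    base = source_times[0]
    return [first_release + (t - base) for t in source_times]


def spacing_jitter(source_times, observed_times):
    """Deviation of each observed inter-event gap from the source gap."""
    return [
        (observed_times[i] - observed_times[i - 1])
        - (source_times[i] - source_times[i - 1])
        for i in range(1, len(source_times))
    ]


# All times in integer milliseconds; one frame every 30 ms at the source.
source = [0, 30, 60, 90]
arrivals = [100, 141, 158, 195]           # network delay varies per event
releases = schedule_releases(source, first_release=200)

jitter_in = spacing_jitter(source, arrivals)
jitter_out = spacing_jitter(source, releases)
print(jitter_in, jitter_out)
```

The first release has to be late enough that every event has arrived before its scheduled release, which is where the buffering service comes in.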
[Figure: Jitter values comparing the input to the Buffering Service and the output of the TDS, trans-Atlantic settings. X axis: Sample Number (0 to 1000). Y axis: Jitter (Milliseconds), -2 to 12. Series: Buffering Input, TDS Output.]
[Figure: Jitter values from the output of the TDS, trans-Atlantic settings. X axis: Sample Number (0 to 1000). Y axis: Jitter (Milliseconds), -0.02 to 0.14. Series: TDS Output.]
NaradaBrokering NTP Service
NaradaBrokering includes an implementation of the Network Time Protocol (NTP)
All entities within the system use NTP to communicate with atomic time servers maintained by organizations like NIST and USNO to compute offsets
• Offset is the computed difference between global time and the local time.
• The offset is computed based on the time returned from multiple atomic time servers.
• The NTP algorithm weighs results from individual time clocks based on the distance of the atomic server from the entity.
This ensures that all entities are within 1 ms of each other.
The timestamps account for clock drifts on machines
• Time returned corrects software clocks which can slow down with increased computing load on the machine.
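The offset computation can be illustrated with the standard NTP on-wire formulas: with client send time t0, server receive time t1, server send time t2 and client receive time t3, the offset is ((t1 - t0) + (t2 - t3)) / 2 and the round-trip delay is (t3 - t0) - (t2 - t1). The Python sketch below applies them to two servers. The inverse-delay weighting is only a stand-in for "weighs results based on distance"; real NTP uses more elaborate selection and combining algorithms, and the timestamps here are made up.

```python
def offset_and_delay(t0, t1, t2, t3):
    """Standard NTP clock offset and round-trip delay from four timestamps."""
    offset = ((t1 - t0) + (t2 - t3)) / 2
    delay = (t3 - t0) - (t2 - t1)
    return offset, delay


def combined_offset(samples):
    """Weighted mean of offsets; nearer servers (smaller delay) count more.

    Inverse-delay weighting is an assumption made for this sketch.
    """
    weights = [1.0 / delay for _, delay in samples]
    weighted = sum(offset * w for (offset, _), w in zip(samples, weights))
    return weighted / sum(weights)


# Illustrative timestamps in milliseconds: (t0, t1, t2, t3) per server.
exchanges = [
    (1000, 1012, 1013, 1005),   # nearby server
    (2000, 2016, 2017, 2009),   # more distant server
]
samples = [offset_and_delay(*e) for e in exchanges]
estimate = combined_offset(samples)
print(samples, round(estimate, 3))
```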
[Figure: NTP offset variations over a period of 4 hours on an Indiana Linux machine with a native NTP daemon process. X axis: Elapsed time in 100s of seconds (0 to 160). Y axis: Offset change (Milliseconds), -1 to 1. Series: Offset Variation.]
Broker Discovery Service
Locates the nearest available broker that a client can connect to.
• Incorporates specialized nodes (broker discovery nodes) to maintain broker info.
• Depending on load or security issues, brokers may decide to respond to or ignore discovery requests.
• If available, the scheme can exploit IP multicast for discovery.
• Nearest broker determined by ping times, loss rates and available bandwidth.
[Figure: Broker Discovery: Brokers at Indianapolis, NCSA, UMN, FSU, IU & San Diego Supercomputing Center. Broker at Indy selects IU, NCSA and UMN for pings.]
[Figure: Broker Discovery: Brokers at Indianapolis, NCSA, University of Minnesota, FSU & San Diego Supercomputing Center. Cardiff selects Indy, NCSA and UMN for pings.]
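Choosing the nearest broker from ping times, loss rates and available bandwidth amounts to ranking candidates by some combined cost. The slides do not give the formula, so the scoring function in this Python sketch, its weight, the broker names and the measurements are all hypothetical.

```python
def score(ping_ms, loss_rate, bandwidth_mbps):
    """Lower is better. The combination and the factor 10 are arbitrary assumptions."""
    return ping_ms * (1.0 + 10.0 * loss_rate) / bandwidth_mbps


# Hypothetical measurements: (ping in ms, loss rate, available bandwidth in Mbps).
candidates = {
    "broker-a": (12.0, 0.00, 100.0),
    "broker-b": (5.0, 0.20, 10.0),    # closest by ping, but lossy and slow
    "broker-c": (40.0, 0.01, 100.0),
}

ranked = sorted(candidates, key=lambda name: score(*candidates[name]))
best = ranked[0]
print(ranked)
```

The point of the example is only that the lowest ping time does not necessarily win once loss and bandwidth are taken into account.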
Topic Discovery Service
Allows publishers and subscribers to advertise topics.
Creator of topic possesses credentials to indicate ownership of the topic.
Discovery of topics takes into account credentials of client trying to discover topic.
• Topic owner may restrict discovery to a limited authorized set of clients.
Discovery requests can be made using simple strings or regular expression queries.
Security Service
Based on Message Level Security
Messages organized into topics
Each topic has a separate key; Topics can be organized into sessions
NaradaBrokering Support for SOAP and Web Services
SOAP Support I
The broker can receive SOAP messages (over HTTP) from any entity.
• This removes any client dependence in client-broker interaction
The broker can function as an intermediary performing multiple roles, which could be just routing but could also involve mapping using filters
There can be multiple filter-pipelines, each comprising multiple filters, available at the broker node.
• Some of these would be system filter-pipelines configured statically.
• Filter-Pipelines can also be configured by users, dynamically, at run-time.
SOAP Support II
Multiple roles could be associated with
• Different servlets hosted by a broker.
• A given servlet hosted by a broker.
Scheme will allow filters to be registered for individual roles.
• A filter could be part of multiple roles.
There is a dedicated filter pipeline per role.
This implies that a NaradaBroker can be used as a Web Service container although full container support is not yet available
Filters are used internally by NB to implement performance monitoring
The FilterPipeline-Filter model I
The filter and filter-chain facilitate many of the interactions that are missing in JAX-RPC handlers.
• Filters are the NaradaBrokering approach to the handlers used in Web Service containers
Filters can inject messages at any time
• These messages can be sent either to the application or over the network.
• No limit on the number of messages that can be triggered because of a single message from application.
Messages can be injected into a Filter Pipeline from either direction.
Filters can generate responses automatically. No need to route to application.
The FilterPipeline-Filter model II
Applications have access to individual filters and filter-pipeline at all times. Explicitly direct which filters need to be skipped or added.
Filters have access to position within Filter Pipeline, and can specify message injection at a specific location.
Dynamic reconfiguration possible for Filter Chain.
Allow different networking substrates to be registered. This can be dynamically changed.
• Network substrate is last filter and is responsible ONLY for routing SOAP message.
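The filter behaviours described above (automatic responses, message injection at a chosen position, and skipping filters on request) can be sketched in a few lines of Python. This is a toy model, not the NaradaBrokering filter API: `run_pipeline`, the (message, injected) return convention and the three example filters are hypothetical, and messages are plain strings rather than SOAP envelopes.

```python
from collections import deque

log = []        # messages seen by the monitoring filter
responses = []  # responses generated inside the pipeline


def logger(msg, pos):
    log.append(msg)
    return msg, []


def ping_responder(msg, pos):
    # Answer "ping" inside the pipeline; nothing is routed to the application.
    if msg == "ping":
        responses.append("pong")
        return None, []
    return msg, []


def splitter(msg, pos):
    # One incoming message triggers several; extras are injected after this filter.
    if msg.startswith("batch:"):
        parts = msg[len("batch:"):].split(",")
        return parts[0], [(pos + 1, p) for p in parts[1:]]
    return msg, []


def run_pipeline(filters, message, skip=()):
    """Return the messages that reach the end of the pipeline (the application)."""
    delivered = []
    queue = deque([(0, message)])
    while queue:
        pos, msg = queue.popleft()
        while msg is not None and pos < len(filters):
            name, fn = filters[pos]
            if name not in skip:
                msg, injected = fn(msg, pos)
                queue.extend(injected)   # (position, message) pairs to inject
            pos += 1
        if msg is not None:
            delivered.append(msg)
    return delivered


filters = [("log", logger), ("ping", ping_responder), ("split", splitter)]
batch_result = run_pipeline(filters, "batch:a,b,c")
ping_result = run_pipeline(filters, "ping")
skipped_result = run_pipeline(filters, "ping", skip={"ping"})
print(batch_result, ping_result, skipped_result, responses)
```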
Web Services Support I
Currently we have incorporated support for the following Web Service specifications
• WS-Eventing (WSE): This is a publish/subscribe based notification framework from Microsoft and IBM.
• WS-ReliableMessaging (WSRM): This is a protocol for ensuring the guaranteed delivery of SOAP messages between 2 Web Service endpoints. This specification is from IBM and Microsoft.
• WS-Reliability (WSR): This is a competing specification from Oracle and Sun in the area of reliable messaging between Web Services.
These handlers are available for use in Axis1.2 or exploiting NB SOAP Intermediary support without a container
• Axis1.2 version can be used inside container or as a Proxy
Web Services Support II
We are also working on implementing support for the WS-Notification (WSN) suite of specifications that is part of the Web Services Resource Framework (WSRF).
WS-Notification explicitly adds brokers to Eventing
Note that almost all these specifications leverage the WS-Addressing (WSA) specification.
• We have incorporated support for all the rules associated with WSA.
NaradaBrokering in Web Services
a) WSRM, WSR, WSN, WSE support for Axis1.2, which is available as standalone handlers without need for any NaradaBrokers
b) The support described in a) implemented as a separate proxy and inside containers
c) NaradaBrokers used as SOAP Intermediaries
d) NaradaBrokers can support filters in SOAP intermediaries forming limited light-weight containers
e) NaradaBrokers can be Brokers defined in WSN Specification
f) One can use NaradaBrokers in non-brokered publish-subscribe such as WS-Eventing to make it scalable
Implementation of WS-Reliable Messaging (WSRM) I
Operation | Mean | StdDev | StdError | Outlier | Min | Max | Mem (Bytes)
Create an XMLBeans based Envelope Document | 121.29 | 25.77 | 2.65 | 6 | 110 | 333 | 2192
Create an Axis based SOAPMessage | 85.76 | 79.36 | 8.22 | 7 | 34 | 540 | 1824
Convert an EnvelopeDocument to a SOAPMessage | 3503.8 | 758.48 | 80.85 | 12 | 2632 | 5406 | 57152
Convert SOAPMessage to EnvelopeDocument | 730.08 | 392.35 | 41.58 | 11 | 327 | 1911 | 34424
Create a WS-Addressing EPR (Contains just a URL address) | 84.61 | 25.61 | 2.67 | 8 | 72 | 301 | 2072
Create a WS-Addressing EPR (Contains WSA ReferenceProperties) | 133.13 | 35.64 | 3.71 | 8 | 114 | 354 | 2648
Create an Envelope targeted to a specific WSA EPR | 157.98 | 12.19 | 1.27 | 8 | 140 | 219 | 7184
Create an Envelope targeted to a specific WSA EPR with most WSA message information headers | 263.20 | 35.73 | 3.74 | 9 | 240 | 471 | 13880
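A quick arithmetic check on the table above: the StdError column appears consistent with StdDev / sqrt(n), where n is 100 minus the Outlier count. The figure of 100 runs is an inference from the numbers, not something the slides state, so the Python sketch below should be read as a consistency check rather than a description of how the measurements were taken.

```python
import math

ASSUMED_RUNS = 100  # inferred from the table, not stated in the source

# (operation, StdDev, Outlier, StdError as listed in the table)
rows = [
    ("Create an XMLBeans based Envelope Document", 25.77, 6, 2.65),
    ("Create an Axis based SOAPMessage", 79.36, 7, 8.22),
    ("Convert an EnvelopeDocument to a SOAPMessage", 758.48, 12, 80.85),
    ("Convert SOAPMessage to EnvelopeDocument", 392.35, 11, 41.58),
]


def std_error(std_dev, outliers, runs=ASSUMED_RUNS):
    """Standard error of the mean over the runs that were not discarded."""
    return std_dev / math.sqrt(runs - outliers)


deviations = []
for name, std_dev, outliers, listed in rows:
    computed = std_error(std_dev, outliers)
    deviations.append(abs(computed - listed))
    print(f"{name}: computed {computed:.3f}, listed {listed}")
```

For the first row, sqrt(100 - 6) is about 9.70 and 25.77 / 9.70 is about 2.66, against a listed value of 2.65.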
Implementation of WS-Reliable Messaging (WSRM) II
Operation | Mean | StdDev | StdError | Outlier | Min | Max | Mem (Bytes)
Parse an EnvelopeDocument to retrieve WSA Headers | 711.74 | 231.61 | 23.76 | 5 | 555 | 1317 | 61024
Create a Wsrm Fault | 413.80 | 239.17 | 25.07 | 9 | 271 | 1212 | 18096
Create a Wsrm SequenceRequest | 268.95 | 37.93 | 3.97 | 9 | 212 | 374 | 16392
Create a Wsrm SequenceResponse | 234.97 | 17.40 | 1.81 | 8 | 212 | 324 | 18160
Create a Wsrm SequenceDocument | 43.812 | 2.99 | 0.30 | 4 | 42 | 53 | 2424
Add a WsrmSequenceDocument to an existing envelope (Contains sequence identifier and message number) | 13.01 | 0.57 | 0.05 | 4 | 11 | 15 | 464
Create a WSRM SequenceAcknowledgement based on a set of message numbers | 461.17 | 172.40 | 18.27 | 11 | 301 | 1043 | 20624
Create a WSRM TerminateSequence | 20.95 | 1.30 | 0.13 | 4 | 20 | 25 | 2072
Transport Layer in NaradaBrokering
Transport Layer
Support for multiple network protocols such as TCP, UDP, Multicast, SSL, RTP, HTTP and Parallel TCP.
• Support for both blocking and non-blocking IO in the TCP support.
• The UDP support manages payloads greater than the 64K datagram limit. Also incorporates pinging mechanism to detect connection losses in connectionless setting.
Tunnel through firewalls/proxies
• Microsoft’s ISA, Checkpoint, Apache
[Figure: Mean transit delay for message samples in NaradaBrokering: different communication hops. X axis: Message Payload Size (Bytes), 100 to 1000. Y axis: Transit Delay (Milliseconds), 0 to 9. Series: hop-2, hop-3, hop-5, hop-7. Test environment: Pentium-3, 1GHz, 256 MB RAM, 100 Mbps LAN, JRE 1.3, Linux.]
[Figure: Standard deviation for message samples in NaradaBrokering: different communication hops, internal machines. X axis: Message Payload Size (Bytes), 1000 to 5000. Y axis: Standard Deviation (Milliseconds), 0 to 0.8. Series: hop-2, hop-3, hop-5, hop-7.]
Performance of NaradaBrokering in collaborative settings
[Figure: Average latencies and jitters for audio conferencing clients. Single broker, single meeting. X axis: Number of Users (0 to 1600). Y axis: Time (Milliseconds), ticks at 0.1, 1, 10, 100. Series: Average Latency, Average Jitter.]
[Figure: Average latencies and jitters for video conferencing clients. Single broker, single meeting. X axis: Number of Users (0 to 900). Y axis: Time (Milliseconds), ticks at 0.1, 1, 10, 100, 1000. Series: Average Latency, Average Jitter.]
[Figure: Average latencies for video conferencing clients at different brokers. 4 brokers, single meeting. X axis: Number of Users per broker (200 to 900). Y axis: Latency (Milliseconds), ticks at 10, 100, 1000. Series: Latency at B1, B2, B3, B4.]
[Figure: Average latencies for video conferencing clients at different brokers. 4 brokers, multiple meetings (20 users per meeting). X axis: Number of Meetings (20 to 100). Y axis: Latency (Milliseconds), ticks at 1, 10, 100. Series: Latency at B1, B2, B3, B4.]
[Figure: Average latencies for video conferencing clients at different locations. Sites in Indiana, Florida, New York and Cardiff. X axis: Number of Users per Site (0 to 160). Y axis: Latency (Milliseconds), ticks at 1, 10, 100. Series: Indiana, New York, Florida, Cardiff UK.]
“GridMPI” v. NaradaBrokering
In parallel computing, MPI and PVM provided “all the features one needed” for inter-node messaging
NB aims to play the same role for the Grid but the requirements and constraints are very different
• NB is not MPI ported to a Grid/Globus environment
Typically MPI aims at microsecond latency, but for the Grid, time scales are different
• 100 millisecond quite normal network latency
• 30 millisecond typical packet time sensitivity (this is one audio or video frame) but even here can buffer 10-100 frames on client (conferencing to streaming)
• 1 millisecond is time for a Java server to “think”
Jitter in latency (transit time through broker) due to routing, processing (in NB) or packet loss recovery is an important property
Grids need and can use software supported message functions, and the trade-offs between hardware and software routing are different from parallel computing
HPSearch Management Engine
HPSearch is an engine for orchestrating distributed Web Service interactions
• It uses an event system and supports both file transfers and data streams.
HPSearch flows can be scripted with JavaScript
• HPSearch engine binds the flow to a particular set of services and executes the script.
HPSearch can access and set NaradaBrokering features (create topics, display performance data)
ProxyWebService: a wrapper class that adds notification and streaming support to a remote Web Service.
HPSearch is a streaming sensitive workflow engine
[Figure: Workflow (BPEL) fragment for a WMS GIS service and a data layer. Components: GPS Database (Gridfarm001), Data Filter (Danube), Pattern Informatics (Danube: Accumulate Data, Run PI Code, Create Graph, Convert RAW -> GML), GML (Danube), WS Context (Tambora), WMS, HPSearch (TRex), HPSearch (Danube). Arrows distinguish Actual Data flow from Virtual Data flow.]
• HPSearch hosts an AXIS service for remote deployment of scripts
• NaradaBroker network: Used by HPSearch engines as well as for data transfer
• HPSearch controls the Web services
• HPSearch Engines communicate using NB Messaging infrastructure
• Data can be stored and retrieved from the 3rd party repository (Context Service)
• WMS submits script execution request (URI of script, parameters)
• Final Output pulled by the WMS
SensorML and NaradaBrokering
OGC defined a set of SensorML specifications indicating how to integrate Sensors with its GIS Services
We are using Southern California SCIGN GPS data to prototype this
[Figure: Sensor Source, Filters and NaradaBrokering Topics carrying RYO Binary, Text and GML versions of the data. You can access whichever version you want!]
Issues in High Performance Web Services I
http://grids.ucs.indiana.edu/ptliupages/publications/OptSOAP_CTS05.pdf
http://grids.ucs.indiana.edu/ptliupages/publications/HighPerfDataStreaming.pdf
Web Services rely upon SOAP for message exchanges.
Creating, parsing, canonicalizing, transporting, and processing text-based XML SOAP messages is a performance bottleneck.
• Large data files needed by science applications are time-consuming to create and send in interconnected Grid applications.
• “Large” is relative as Grids must support hand-held devices, sensors, etc.; devices have limited memory and network bandwidth, so require efficient representations.
We are developing this approach for PDA clients to allow optimized Grid (Web) Service <-> PDA link
As part of Axis2-MOM activity, we intend to develop it for general Web services
Issues in High Performance Web Services II
SOAP 1.2 is defined using the XML Infoset.
• This defines rules for maintaining the essence of SOAP messages without relying upon specific representations
• Separates SOAP message content from XML angle-bracket syntax.
• So we can freely move between binary and classic angle-bracket representations of SOAP messages with no loss of content.
More importantly, if successful, we won’t need to “pollute” web service architectures with ad-hoc solutions that complicate interoperability
• Removes the need for non-Web Service transport mechanisms (e.g. negotiating data channels that require non-Web Service protocols)
• Removes the need for non-Web Service data representations (e.g. HDF, NetCDF, etc.)
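Moving between angle-bracket and binary representations without loss of content can be demonstrated on a small scale. The Python sketch below reduces a parsed message to a simplified infoset (tag, attributes, text, children), serializes that as zlib-compressed JSON, and rebuilds an identical angle-bracket form. Both the simplified infoset (no namespaces handling beyond the tag name, no comments, no tail text) and the compressed-JSON encoding are stand-ins chosen for illustration; they are not one of the binary XML schemes named in this talk, and the element names are invented.

```python
import json
import zlib
import xml.etree.ElementTree as ET


def to_infoset(elem):
    """Reduce an element to [tag, attributes, text, children]."""
    return [elem.tag, dict(elem.attrib), (elem.text or "").strip(),
            [to_infoset(child) for child in elem]]


def from_infoset(node):
    """Rebuild an element tree from the simplified infoset."""
    tag, attrib, text, children = node
    elem = ET.Element(tag, attrib)
    elem.text = text or None
    for child in children:
        elem.append(from_infoset(child))
    return elem


def encode_binary(elem):
    """Stand-in binary representation: compressed JSON of the infoset."""
    return zlib.compress(json.dumps(to_infoset(elem)).encode("utf-8"))


def decode_binary(blob):
    return from_infoset(json.loads(zlib.decompress(blob).decode("utf-8")))


xml_text = (
    "<Envelope><Header><To>urn:service</To></Header>"
    '<Body><value unit="ms">30</value></Body></Envelope>'
)
root = ET.fromstring(xml_text)
blob = encode_binary(root)
restored = decode_binary(blob)
print(ET.tostring(restored).decode("utf-8"))
```

Because the service only ever sees the infoset, either end can fall back to the angle-bracket form when the other side does not understand the binary scheme.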
SOAP Message Structure I
SOAP Message consists of headers and a body
• Headers could be for Addressing, WSRM, Security, Eventing etc.
Headers are processed by handlers or filters controlled by container as message enters or leaves a service
Body processed by Service itself
The header processing essentially defines the “Web Service Distributed System”
Containers queue messages; control processing of headers and offer convenient (for particular languages) service interfaces
Handlers are really the core Operating system services as they receive and give back messages like services; they just process and perhaps modify different elements of SOAP Message
[Figure: A SOAP message (headers H1 to H4 plus Body) passes through handlers F1 to F4 to the Service. Labels: Container Handlers, Container Workflow.]
SOAP Message Structure II
Content of individual headers and the body is defined by XML Schema associated with WS-* headers and the service WSDL
SOAP Infoset captures header and body structure
XML Infosets for individual headers and the body capture the details of each message part
Web Service Architecture requires that we capture Infoset structure but does not require that we represent XML in angle bracket <content>value</content> notation
[Figure: Message with headers H1 to H4 and Body, broken into header parts hp1 to hp5 and body parts bp1 to bp3. Infoset represents semantic structure of message and its parts.]
High Performance XML I
There are many approaches to efficient “binary” representations of XML Infosets
• MTOM, XOP, Attachments, Fast Web Services
• DFDL is one approach to specifying a binary format
Assume URI-S labels Scheme and URI-R labels realization of Scheme for a particular message, i.e. URI-R defines specific layout of information in each message
Assume we are interested in conversations where a stream of messages is exchanged between two services or between a client and a service, i.e. two end-points
Assume that we need to communicate fast between end-points that understand scheme URI-S but must support conventional representation if one end-point does not understand URI-S
High Performance XML II
First Handler Ft=F1 handles Transport protocol; it negotiates with other end-point to establish a transport conversation which uses either HTTP (default) or a different transport such as UDP with WSRM implementing reliability
• URI-T specifies transport choice
Second Handler Fr=F2 handles representation and it negotiates a representation conversation with scheme URI-S and realization URI-R
• Negotiation identifies parts of SOAP header that are present in all messages in a stream and are ONLY transmitted ONCE
Fr needs to negotiate with Service and other handlers to decide what representation they will process
[Figure: Container Handlers F1, F2, F3, F4.]
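The idea that header parts common to every message in a stream are transmitted only once can be sketched as follows. The negotiation itself is not modelled: the Python code assumes both end-points already hold the same conversation context, and the `encode` and `decode` functions, the dictionary message layout and the header names are hypothetical. It also assumes every message in the stream carries the common headers, since `decode` always restores them.

```python
def encode(message, context):
    """Drop headers whose values are already in the shared conversation context."""
    headers = {k: v for k, v in message["headers"].items() if context.get(k) != v}
    return {"headers": headers, "body": message["body"]}


def decode(wire, context):
    """Restore the full header set from the context plus the transmitted headers."""
    headers = dict(context)
    headers.update(wire["headers"])
    return {"headers": headers, "body": wire["body"]}


# Headers agreed once during negotiation (illustrative names and values).
context = {"To": "urn:example:service", "Action": "urn:example:stream"}

message = {
    "headers": {"To": "urn:example:service", "Action": "urn:example:stream",
                "MessageNumber": 7},
    "body": "frame-7",
}

wire = encode(message, context)       # only the changing header travels
received = decode(wire, context)      # receiver sees the full message again
print(wire)
```

The service and downstream handlers see the same full message in either case; only the transported form shrinks.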
High Performance XML III
Filters controlled by Conversation Context convert messages between representations using permanent context (metadata) catalog to hold conversation context
Different views for each end point or even for individual handlers and service within one end point
NaradaBrokering can implement Fr and Ft as it supports multiple transports, fast filters and message queuing; an appropriate context system is being developed – very dynamic
[Figure: A message (headers H1 to H4 plus Body) passes through container handlers (filters) Ft, Fr, F3, F4 to the Service. The Conversation Context holds URI-S, URI-R, URI-T and the Replicated Message Header. Labels: Transported Message, Message View1, Message View2.]
NaradaBrokering Futures
Support for replicated storage within the system.
• In a system with N replicas the scheme can sustain the loss of N-1 replicas.
Clarification and expansion of NB Broker to act as a WS container
Integration with Axis 2.0 as Message Oriented Middleware infrastructure
Support for High Performance transport and representation for Web Services
• Needs Context catalog under development
Performance based routing
• The broker network will dynamically respond to changes in the network based on metrics gathered at individual broker nodes.
Replicated publishers for fault tolerance
Pure client P2P implementation (originally we linked to JXTA)
Security Enhancements for fine-grain topic authorization, multi-cast keys, Broker attacks