Reliable Messaging for Grids and Web Services
Geoffrey Fox, Shrideep Pallickara, Damodar Yemme, Hasan Bulut and Sima Patel
(gcf, spallick, dyemme, hbulut and skpatel)@indiana.edu
Community Grids Lab
Indiana University
Message-Based Reliability
• Web Services exchange messages and interact with resources that produce and absorb messages
• The action and state (if any) of a service are defined by messages
• Our approach to reliability is based on building a messaging infrastructure that is intrinsically reliable and high performance
  • WS-RM and WS-Reliability for Web services
  • NaradaBrokering message-oriented middleware
[Figure: Grid of services exchanging SOAP messages. Labels: Database, Portal, SS (Sensor Service), FS (Filter Service), OS (Other Service), MD (MetaData), Another Service, Another Grid. Data progression: Raw Data, Data, Information, Knowledge, Wisdom, Decisions.]
Applications of our Technology
1) Point-to-point generic linkage of services using WSRM, with messages saved in databases as required by the specification
2) Scalable Management Architecture to support dynamic, robust collections of entities
  • Applied first to the brokers used in the distributed messaging of NaradaBrokering
3) Management of the streams of data from sensors and web-cams
  • Allows real-time replay and annotation based on real-time saving of the messages forming streams
WSRM and WS-Reliability
• WSRM describes a protocol that facilitates the reliable delivery of messages between two Web service endpoints in the presence of component, system or network failures.
• WSRM facilitates the reliable delivery of messages from the source (or originator) of messages to the sink (or destination) of messages.
• The delivery (and ordering) guarantees are valid over a group of messages, which is referred to as a sequence.
Publishing Messages in WSRM
• Every message from the source contains two pieces of information:
  • The Sequence that this message is a part of
  • A monotonically increasing Message Number within this Sequence
• These Message Numbers enable the tracking of problems, if any, in the intended message delivery at a sink.
• Message Numbers enable the detection of out-of-order receipt of messages as well as message losses.
• The protocol defines acknowledgements and negative acknowledgements.
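The role of the Sequence and Message Number at the sink can be illustrated with a minimal Python sketch. The class and method names (SinkSequenceState, on_message, missing) are hypothetical and are not part of any WSRM implementation; the sketch assumes Message Numbers start at 1 and increase by 1.

```python
from dataclasses import dataclass, field


@dataclass
class SinkSequenceState:
    """Tracks receipt of one sequence at the sink (hypothetical helper)."""
    sequence_id: str
    received: set = field(default_factory=set)
    highest: int = 0  # highest Message Number seen so far

    def on_message(self, message_number: int) -> str:
        """Record an arriving Message Number and classify it."""
        if message_number in self.received:
            return "duplicate"
        self.received.add(message_number)
        if message_number < self.highest:
            # A later-numbered message already arrived before this one.
            return "out-of-order"
        status = "in-order" if message_number == self.highest + 1 else "gap"
        self.highest = message_number
        return status

    def missing(self) -> list:
        """Message Numbers below the highest seen that have not arrived."""
        return [n for n in range(1, self.highest + 1) if n not in self.received]


state = SinkSequenceState("seq-1")
for number in (1, 2, 5, 3, 5):
    print(number, state.on_message(number))
print("missing:", state.missing())
```

A "gap" result is what would prompt the sink to report the hole to the source, for example through a negative acknowledgement.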
Typical Processing: Acknowledgements
• Upon receipt of acknowledgements, a source can determine which messages might have been lost in transit and proceed to retransmit the missed messages.
• Thus, if a sink has acknowledged the receipt of messages 1 to 10 and 13 to 18, the source can conclude that messages with Message Numbers 11 and 12 were lost en route to the sink and proceed to retransmit these messages.
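The source-side reasoning in this example can be sketched in a few lines of Python. The function name unacknowledged and the representation of acknowledgements as (lower, upper) pairs are assumptions made for illustration, not the API of the WSRM implementation described here.

```python
def unacknowledged(ack_ranges, last_sent):
    """Return Message Numbers in 1..last_sent not covered by any acknowledged range.

    ack_ranges is a list of inclusive (lower, upper) pairs, as a source might
    extract them from an acknowledgement (hypothetical representation).
    """
    acked = set()
    for lower, upper in ack_ranges:
        acked.update(range(lower, upper + 1))
    return [n for n in range(1, last_sent + 1) if n not in acked]


# The sink has acknowledged 1 to 10 and 13 to 18; the source has sent up to 18.
to_retransmit = unacknowledged([(1, 10), (13, 18)], last_sent=18)
print(to_retransmit)  # [11, 12]
```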
Notification of Errors
• WSRM provides for notification of errors in processing between the endpoints involved in reliable delivery. These are routed back as SOAP Faults.
• The range of errors can vary from an inability to decipher a message’s content to complex errors pertaining to violations in implied agreements between the interacting source and sink.
• All errors are reported as faults with the appropriate wsa:Action attribute, and encapsulated in WSRM fault elements.
Comments on WSRM Implementation
• We are delivering this to the UK Open Middleware Infrastructure Institute (OMII)
• We built WS-Eventing in the FINS Project; it is available in OMII 2.3.3 (http://www.omii.ac.uk/news/newsdetail.jsp?id=25)
• WS-RM is currently being tested in the OMII container (FIRMS Project); testing is expected to finish in a month, with release by OMII in approximately June 2006
• WS-RM and WS-Eventing use SOAP handlers that are not well supported in the current Axis used by OMII; we hope Axis 2 will soon be mature enough to use
[Figure: Total WSRM processing times at source and sink. x-axis: Test Run # (0 to 100); y-axis: Elapsed Time in microseconds (0 to 20000); series: WSRM Source Node, WSRM Sink Node.]
Operation | Mean | Std Dev | Std Error | Min | Max | Memory Use (Bytes)
Create an XMLBeans based Envelope Document | 127 | 49 | 5.0 | 108 | 424 | 2192
Create an Axis based SOAPMessage | 117 | 188 | 19 | 34 | 1183 | 1824
Convert an EnvelopeDocument to a SOAPMessage | 2630 | 910 | 94 | 1722 | 5350 | 60816
Convert SOAPMessage to EnvelopeDocument | 828 | 590 | 60 | 325 | 2802 | 34424
Create a WS-Addressing EPR (contains just a URL address) | 87.6 | 58 | 6.0 | 71 | 465 | 2072
Create a WS-Addressing EPR (contains WSA ReferenceProperties) | 151 | 97 | 9.9 | 112 | 705 | 2648
Create an Envelope targeted to a specific WSA EPR | 397 | 200 | 21 | 267 | 1276 | 7184
Create an Envelope targeted to a specific WSA EPR with most WSA message information headers | 538 | 350 | 35 | 344 | 2123 | 13880
Parse an EnvelopeDocument to retrieve WSA Message Info Headers | 1220 | 730 | 74 | 645 | 4573 | 61024
Times are in microseconds
Operation | Mean | Std Dev | Std Error | Min | Max | Memory Use (Bytes)
CreateWsrmSequenceRequest | 352 | 261 | 26 | 229 | 1568 | 16392
CreateWsrmSequenceResponse | 335 | 226 | 23 | 224 | 1174 | 18160
CreateWsrmSequenceDocument | 45 | 4.7 | 0.48 | 42 | 75 | 2424
Add a WsrmSequenceDocument to an existing envelope (contains sequence identifier and message number) | 12.7 | 0.49 | 0.05 | 12 | 14 | 464
Create a WSRM SequenceAcknowledgement based on a set of message numbers | 516 | 250 | 25 | 335 | 1514 | 20624
CreateTerminateSequence | 24.7 | 36.203 | 3.6 | 19 | 380 | 2072
CreateWsrmFault | 520 | 294.699 | 30 | 347 | 1619 | 18096
Times are in microseconds
Management of services
• We prefer to build Grids (collections of Web services) that use distributed publish-subscribe message-oriented middleware to transport all messages.
• Our publish-subscribe software is called NaradaBrokering (NB); one can bind SOAP to the NB transport by building a handler for this (very different from WS-Notification/Eventing)
• NB will guarantee message delivery, and its distributed nature has implicit reliability
• However, we need to maximize the reliability of this infrastructure, including attention to network QoS, firewalls etc.
[Figure: NaradaBrokering with streams and queues. NB supports messages and streams. The NB role for Grids is similar to the MPI role for MPP.]
NaradaBrokering 2003-2006
• Messaging infrastructure for collaboration, peer-to-peer and Grids
• Implements JMS and native high-performance protocols (message transit time of 1 to 2 ms per hop)
• Order-preserving message transport with QoS and security profiles
• Support for different underlying transports such as TCP, UDP, Multicast, RTP
• SOAP message support and WS-Eventing, WS-RM and WS-Reliability
  • WS-Notification when specification agreed
• Active replay support: pause and replay live streams
• Stream linkage: can permanently link multiple streams; used in annotation of real-time video streams
• Replicated storage support for fault tolerance and resiliency to storage failures
• Management: HPSearch scripting interface to streams and brokers (uses WS-Management)
• Broker Topics and Message Discovery: Locate appropriate
• Integration with Axis2 Web Service Container (?)
• High Performance Transport supporting SOAP Infoset
Management Architecture
[Figure: Management architecture. Labels: Network; Registry (three instances) with Discover and Register / Renew links; statically configured bootstrap nodes; Manager Service (multiple distributed Manager instances); WS Adapter in front of each entity being managed (multiple distributed Managee instances with Web service proxy); WS-Management.]
Features of the Managee Service
• The distributed managers use NaradaBrokering itself for robust messaging with the “Managees” (Web service adaptors or proxies to each broker in the NaradaBrokering network)
Features of the Manager Service
• WS-Management is used for communication between Managers and Managees
• Managers implement policy and user instructions, but this is very primitive
[Figure: e-Annotation client screenshots. Panels: e-Annotation Player, archived stream player, annotation / whiteboard (WB) player, archived stream list, real-time stream list, e-Annotation Whiteboard, real-time stream player, annotated stream player.]
Generic Recording and Replay Framework
• A generic framework for recording and replay of any type of streaming event or data.
• Active replay of streams: real-time (live) streams can be replayed, paused and rewound while they are being recorded. Fast forward is available for the duration of the recorded stream.
• Note that streams are collections of events, and events are essentially messages
  • Rewind is the same as undo (as in Office): go back N messages in the stream
  • Replay is the same as redo, i.e. re-apply a sequence of messages to a Web service port
  • Good replay implies robust message recording
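The undo/redo analogy can be sketched as a cursor over a recorded message log. This is an illustrative Python sketch only; RecordedStream and its methods are hypothetical names and do not reflect the framework's actual interfaces.

```python
class RecordedStream:
    """A recorded stream as an append-only message log plus a replay cursor."""

    def __init__(self):
        self._messages = []
        self._cursor = 0  # index of the next message to deliver

    def record(self, message):
        """Append a live message; recording can continue during replay."""
        self._messages.append(message)

    def rewind(self, n):
        """Go back n messages in the stream (the 'undo' of the analogy)."""
        self._cursor = max(0, self._cursor - n)

    def fast_forward(self, n):
        """Skip ahead, but never past what has been recorded so far."""
        self._cursor = min(len(self._messages), self._cursor + n)

    def replay(self, apply):
        """Re-apply messages from the cursor onward (the 'redo'); return the count."""
        count = 0
        while self._cursor < len(self._messages):
            apply(self._messages[self._cursor])
            self._cursor += 1
            count += 1
        return count


stream = RecordedStream()
for msg in ("a", "b", "c", "d"):
    stream.record(msg)
delivered = []
stream.replay(delivered.append)   # delivers a, b, c, d
stream.rewind(2)                  # go back two messages
stream.record("e")                # the live stream keeps being recorded
stream.replay(delivered.append)   # delivers c, d, e
print(delivered)
```

The sketch also shows why good replay depends on robust recording: every operation is just a cursor move over the stored messages.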
Generic Recording and Replay Framework
• Stream linkage: multiple streams are linked together to construct a session.
  • A collaboration session can be recorded and replayed within this framework. Examples:
    • Anabas: uses JMS events to transport data such as whiteboard, shared display, audio, etc.
    • GlobalMMCS: uses NaradaBrokering RTP events to transport audio and video data.
  • Streams can be added to or removed from a session dynamically while the session is being recorded.
• Maintains metadata for recorded sessions and their streams.
  • Dynamic metadata is stored in a high-performance, lightweight WS-Context service
Uniform Event Type For Generic Framework
• Received events are wrapped inside NaradaBrokering native events (NBEvent) with additional event-specific information.
• The received event is placed in the payload of the NBEvent to preserve the original data and related information.
• The NBEvent also contains a timestamp, used to timespace (space in time) the original events during replay, and an event type, used to start the appropriate player for that event type.
• Events and related metadata are stored in database tables.
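The wrapping idea can be sketched as follows. This is not the NaradaBrokering NBEvent class: WrappedEvent, wrap and dispatch are hypothetical names, and the fields shown are only those mentioned above (payload, timestamp, event type).

```python
import time
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class WrappedEvent:
    """Illustrative stand-in for an NBEvent-style wrapper (hypothetical)."""
    event_type: str    # e.g. "JMS" or "RTP"; selects the player at replay time
    timestamp_ms: int  # receipt time, used to space events during replay
    payload: Any       # the original event, kept unmodified


def wrap(original, event_type, now_ms=None):
    """Wrap a received event, stamping it with the receipt time in milliseconds."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return WrappedEvent(event_type=event_type, timestamp_ms=now_ms, payload=original)


def dispatch(event, players):
    """Hand the untouched payload to the player registered for its event type."""
    return players[event.event_type](event.payload)


event = wrap(b"rtp-frame-bytes", "RTP")
players = {"RTP": lambda payload: ("rtp-player", payload)}
print(dispatch(event, players))
```

Keeping the original event opaque inside the payload means a single storage and replay path can serve every event type.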
Session Recorders
• A session recorder includes topic recorders that subscribe to each topic defined in that session.
• Topic recorders are like subscribing clients receiving the streaming events.
• Topic recorders are specialized by event type, e.g. JMS events need a JMS topic recorder to receive that type of event.
• Event types for the streams are already known from the initiating record request.
[Figure: Session recorder. Labels: Session Recorder; Client 1, Client 2, ... Client N; Broker Nodes; Reliable Delivery Service; NBEvent; Companion Event; JMS Event / RTPEvent.]
Time Differential Service (TDS)
• Replay of events relies on one critical service: the Time Differential Service.
• Each replay session has one dedicated TDS, which applies replay, pause, rewind and fast forward to all the streams in the session in one operation.
• Achieves synchronization of multiple streams in the same session by maintaining a shared buffer for those streams.
• Maintains the timespacing between replayed events equal to the timespacing between the originally received events.
• The resolution of this timespacing is one millisecond; events can be timespaced with one-millisecond accuracy.
• The TDS can be hosted on a robust node (a NaradaBrokering node that provides stable storage) or on the client side.
• Replication of robust nodes is supported for better fault tolerance
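The timespacing and multi-stream synchronization described above can be sketched as a merge of per-stream event lists into one release schedule. This is a simplified illustration, not the TDS implementation; release_schedule and the (timestamp_ms, event) tuple layout are assumptions.

```python
import heapq


def release_schedule(streams):
    """Merge recorded streams into one schedule of (offset_ms, topic, event).

    streams maps a topic name to a list of (timestamp_ms, event) pairs sorted
    by timestamp. Offsets are relative to the first event in the session, so
    the gaps between released events equal the gaps between the originals.
    """
    tagged = [
        [(ts, topic, event) for ts, event in events]
        for topic, events in streams.items()
    ]
    # One shared, time-ordered buffer keeps all streams in the session in step.
    merged = list(heapq.merge(*tagged, key=lambda item: item[0]))
    if not merged:
        return []
    start = merged[0][0]
    return [(ts - start, topic, event) for ts, topic, event in merged]


session = {
    "audio": [(1000, "a1"), (1040, "a2")],
    "video": [(1010, "v1"), (1050, "v2")],
}
for offset_ms, topic, event in release_schedule(session):
    print(offset_ms, topic, event)
```

A player would then wait until each offset has elapsed since the start of replay before releasing the event; pausing amounts to stopping that clock for every stream at once.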
Session Players
• The primary purpose of a session player is to simulate the clients in the original session. To achieve this:
  • Each recorded topic (or stream) is mapped to a new topic, and events of the same original topic are released to the mapped topic.
  • While releasing the events, the timespacing between events is preserved.
  • Utilizes the Time Differential Service to timespace events
• Recordings of live streams are available for replay as soon as they are stored to reliable storage.
• Session players support replay, pause, rewind and fast forward operations. When one of those operations is requested, it is applied to all of the topics (streams) in that session.
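The topic mapping and the session-wide operations can be sketched as below. SessionPlayer here is a hypothetical illustration; the "replay/" prefix used to derive the mapped topic name and the publish callback are assumptions, not the framework's naming scheme or API.

```python
class SessionPlayer:
    """Maps each recorded topic to a new topic and applies operations session-wide."""

    OPERATIONS = {"replay", "pause", "rewind", "fast_forward"}

    def __init__(self, recorded_topics, prefix="replay/"):
        # Assumed naming scheme: the mapped topic is the original with a prefix.
        self.topic_map = {topic: prefix + topic for topic in recorded_topics}
        self.last_operation = {topic: None for topic in recorded_topics}

    def request(self, operation):
        """Apply one operation to every topic (stream) in the session."""
        if operation not in self.OPERATIONS:
            raise ValueError(f"unsupported operation: {operation}")
        for topic in self.last_operation:
            self.last_operation[topic] = operation

    def release(self, original_topic, event, publish):
        """Release an event of an original topic to its mapped topic."""
        publish(self.topic_map[original_topic], event)


player = SessionPlayer(["audio", "video"])
player.request("pause")
player.release("audio", b"frame", lambda topic, event: print(topic, event))
```

Publishing to a mapped topic rather than the original lets replay clients subscribe without interfering with a session that is still live.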