MESSAGE PASSING
04/08/23 2/112
Message passing
Key concepts
Introduction
IPC
Remote Procedure Calls
Group communication
Introduction
In a distributed system, processes executing on different computers often need to communicate with each other to achieve some common goal.
Interprocess communication (IPC) requires information sharing among two or more processes.
The two basic methods for information sharing are:
1. Original sharing, or the shared-memory approach
2. Copy sharing, or the message-passing approach
Introduction (contd…)
Shared-memory approach
[Figure: processes P1 and P2 share a common memory area; one writes A to it and the other reads A from it]
Message-passing approach
[Figure: P1 sends A and P2 receives A]
A message-passing system is a subsystem of a distributed OS that provides a set of message-based IPC protocols.
It serves as a suitable infrastructure for building other higher-level IPC systems, such as remote procedure call (RPC) and distributed shared memory (DSM).
Desirable features of a good message-passing system
Simplicity
Uniform semantics
  Local communication
  Remote communication
Efficiency
  If the message-passing system is not efficient, IPC becomes more expensive and users will avoid the mechanism.
Features of a good message-passing system (contd…)
Some optimizations normally adopted for efficiency:
  Avoiding the cost of establishing and terminating a connection between the processes for every message exchange
  Minimizing the cost of maintaining connections
  Piggybacking acknowledgements on other messages
Reliability
  Distributed systems are prone to node crashes and link failures.
  Lost messages are retransmitted, typically on the basis of timeouts.
  Timeout-based retransmission can produce duplicate messages.
  A good message-passing system should have IPC protocols that handle these issues.
Features of a good message-passing system (contd…)
Correctness (related to group communication)
  Atomicity: a message is delivered either to all receivers or to none
  Ordered delivery: messages arrive in an order acceptable to the application
  Survivability: message delivery is guaranteed despite failures
Flexibility
  IPC primitives must be flexible enough to permit any kind of control flow between the cooperating processes, including synchronous and asynchronous send/receive.
Features of a good message-passing system (contd…)
Security
  Authentication of the receiver and the sender
  Encryption of messages
Portability: two aspects
  The message-passing system itself should be portable.
  Applications written using the IPC primitives of the message-passing system should be portable. Heterogeneity must therefore be considered while designing a message-passing system.
Issues in IPC by message passing
A message is a block of information formatted by a sending process in such a manner that it is meaningful to the receiving process.
It consists of a fixed-length header and a variable-size collection of typed data objects.
The header consists of:
  Address: identifies the sending and receiving processes
  Sequence number: a message identifier used to detect lost and duplicate messages
  Structural information
    1. Type: data or pointer to data
    2. Length of the variable-size message data
Issues in IPC (contd…)
A typical message structure
[Figure: a fixed-length header followed by a variable-size collection of typed data. Header fields: addresses (sending process address, receiving process address), sequence number or message ID, and structural information (type, number of bytes/elements). Data part: the actual data or a pointer to the data.]
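The message layout above can be sketched as a C structure. This is only an illustration: the field widths, the names msg_header, message and make_message, and the choice to reference the payload rather than copy it are all assumptions, since the slides do not fix any of them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoding of the "type" field: data or pointer to data. */
enum msg_type { MSG_DATA = 0, MSG_POINTER = 1 };

/* Fixed-length header (field widths are assumptions). */
struct msg_header {
    uint32_t sender_addr;    /* sending process address */
    uint32_t receiver_addr;  /* receiving process address */
    uint32_t seq_no;         /* sequence number or message ID */
    enum msg_type type;      /* structural information: type */
    uint32_t length;         /* structural information: number of bytes */
};

/* Header plus the variable-size data part. */
struct message {
    struct msg_header hdr;
    const void *payload;     /* actual data (referenced, not copied) */
};

/* Build a data message; the caller keeps ownership of the payload. */
static struct message make_message(uint32_t from, uint32_t to, uint32_t seq,
                                   const void *data, uint32_t len)
{
    assert(data != NULL || len == 0);
    struct message m;
    m.hdr.sender_addr = from;
    m.hdr.receiver_addr = to;
    m.hdr.seq_no = seq;
    m.hdr.type = MSG_DATA;
    m.hdr.length = len;
    m.payload = data;
    return m;
}
```

For example, make_message(1, 2, 7, "abc", 3) yields a header with sequence number 7 and length 3.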
Issues in IPC (contd…)
In the design of an IPC protocol, the following important issues need to be considered:
  Who is the sender?
  Who is the receiver?
  Is there one receiver or many receivers?
  Is the message guaranteed to have been accepted by its receiver(s)?
  Does the sender need to wait for the reply?
  What should be done if a node crash or link failure occurs?
  What should be done if the receiver is not ready to accept the message?
  If there are several outstanding messages for a receiver, can it choose the order in which to service them?
Issues in IPC (contd…)
Issues in IPC are addressed by:
Synchronization
Buffering
Multidatagram messages
Encoding and decoding
Process addressing
Failure handling
Group communication
Synchronization
Semantics used for synchronization may be broadly classified as:
  Blocking: invoking the primitive blocks the execution of its invoker
  Nonblocking: invoking the primitive does not block the execution of its invoker
How does a nonblocking receiving process learn that a message has arrived?
  Polling: the receiver periodically polls the kernel to check the buffer status.
  Interrupts: when a message has been placed in the buffer, a software interrupt notifies the receiving process.
[Figure: synchronous mode of communication with both send and receive primitives having blocking-type semantics. The receiver executes Receive(message) and is suspended until the message arrives. The sender executes Send(message) and is suspended until the acknowledgement arrives. The receiver resumes execution on receiving the message and executes Send(acknowledgement); the sender resumes execution on receiving the acknowledgement. The timelines distinguish the blocked state from the executing state.]
Buffering
Messages can be transmitted from one process to another by copying the body of the message from the address space of the sending process to the address space of the receiving process.
The message-buffering strategy in IPC is strongly related to the synchronization strategy.
The four types of buffering strategy are:
  Null buffer (or no buffering)
  Single-message buffer
  Buffer with unbounded capacity
  Finite-bound (or multiple-message) buffer
Buffering (contd…)
Null buffer (or no buffering)
There is no place to temporarily store the message.
One of two strategies is used:
  The message remains in the sending process's address space, and the execution of the send is delayed until the receiver executes the corresponding receive.
  The message is simply discarded, and a timeout mechanism is used to resend it after a timeout period.
Buffering (contd…)
The logical path of message transfer is directly from the sender's address space to the receiver's address space, involving a single copy operation.
[Figure: message transfer in synchronous send with the no-buffering strategy; MSG is copied directly from the sending process to the receiving process]
Buffering (contd…)
Single-message buffer
The null buffer strategy is not suitable for synchronous communication in a distributed system: if the receiver is not ready, a message may have to be transferred two or more times, and the receiver has to wait for the entire time taken to transfer the message across the network.
Synchronous communication mechanisms in distributed systems therefore use a single-message buffer strategy.
A buffer with the capacity to store a single message is used on the receiver's node.
Buffering (contd…)
The idea is to keep the message ready for use at the location of the receiver.
The request message is buffered on the receiver's node if the receiver is not ready to receive it.
The message buffer may be either in the kernel's address space or in the receiving process's address space.
[Figure: message transfer in synchronous send with the single-message buffering strategy. The message crosses the node boundary from the sending process into a single-message buffer and is then copied to the receiving process, so two copy operations are needed.]
Buffering (contd…)
Unbounded-capacity buffer
In the asynchronous mode of communication, the sender does not wait for the receiver to be ready, so there may be several pending messages that have not yet been accepted by the receiver.
An unbounded-capacity buffer that can store all unreceived messages is needed to support asynchronous communication with the assurance that all messages sent to the receiver will be delivered.
Buffering (contd…)
Finite-bound (or multiple-message) buffer
A buffer of unbounded capacity is practically impossible.
With a finite-bound buffer, the problem is buffer overflow.
Buffer overflow can be dealt with in one of two ways:
  Unsuccessful communication
    Message transfers simply fail whenever there is no more buffer space.
  Flow-controlled communication
    The sender is blocked until the receiver accepts some messages, thus creating space in the buffer for new messages.
Buffering (contd…)
The message is first copied from the sending process's memory into the receiving process's mailbox.
The message is then copied from the mailbox to the receiver's memory when the receiver calls for it.
[Figure: message transfer in asynchronous send with the multiple-message buffering strategy. Send places MSG in a multiple-message buffer (mailbox/port); Receive takes it out.]
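A finite-bound mailbox with the "unsuccessful communication" overflow policy can be sketched as a small ring buffer. The capacity of 4, the integer message type, and the names mailbox, mailbox_send and mailbox_receive are assumptions made for illustration; a flow-controlled variant would block the sender instead of returning false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAILBOX_CAPACITY 4   /* hypothetical bound on buffered messages */

struct mailbox {
    int slots[MAILBOX_CAPACITY];
    int head;    /* index of the oldest buffered message */
    int count;   /* number of messages currently buffered */
};

static void mailbox_init(struct mailbox *mb)
{
    assert(mb != NULL);
    mb->head = 0;
    mb->count = 0;
}

/* Copy a message into the mailbox; fail (return false) if it is full. */
static bool mailbox_send(struct mailbox *mb, int msg)
{
    assert(mb != NULL);
    if (mb->count == MAILBOX_CAPACITY)
        return false;                      /* unsuccessful communication */
    mb->slots[(mb->head + mb->count) % MAILBOX_CAPACITY] = msg;
    mb->count++;
    return true;
}

/* Copy the oldest message out of the mailbox; false if it is empty. */
static bool mailbox_receive(struct mailbox *mb, int *out)
{
    assert(mb != NULL && out != NULL);
    if (mb->count == 0)
        return false;
    *out = mb->slots[mb->head];
    mb->head = (mb->head + 1) % MAILBOX_CAPACITY;
    mb->count--;
    return true;
}
```

Once the mailbox holds four messages, a fifth send fails until a receive frees a slot.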
Multidatagram messages
Maximum transfer unit (MTU)
  The upper bound on the size of data that can be transmitted at a time.
  A message larger than the MTU has to be fragmented into pieces no larger than the MTU, which are sent separately.
  Each fragment is sent in a packet (known as a datagram).
Messages smaller than the MTU can be sent in a single packet (single-datagram messages).
Messages larger than the MTU have to be split and sent in multiple packets (multidatagram messages).
Disassembling messages on the sender side and reassembling them on the receiver side is the responsibility of the message-passing system.
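The fragmentation arithmetic can be sketched as follows. The function names packets_needed and fragment_len are hypothetical, per-packet header overhead is ignored, and treating an empty message as one packet is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Number of packets needed for a message of msg_len bytes. */
static size_t packets_needed(size_t msg_len, size_t mtu)
{
    assert(mtu > 0);
    if (msg_len == 0)
        return 1;                     /* assumption: empty message = 1 packet */
    return (msg_len + mtu - 1) / mtu; /* ceiling of msg_len / mtu */
}

/* Payload length of fragment number `index` (0-based). */
static size_t fragment_len(size_t msg_len, size_t mtu, size_t index)
{
    assert(mtu > 0);
    size_t offset = index * mtu;
    assert(offset < msg_len);
    size_t rest = msg_len - offset;
    return rest < mtu ? rest : mtu;
}
```

With an MTU of 1500 bytes, a 3500-byte message needs three packets carrying 1500, 1500 and 500 bytes.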
Encoding and decoding of message data
The structure of the message data should be preserved between the sending and receiving processes.
This goal is very difficult to achieve in both heterogeneous and homogeneous systems, for two reasons:
  An absolute pointer value loses its meaning when transferred from one process address space to another.
  Different program objects occupy varying amounts of storage space.
    A message must normally contain several types of program objects, such as long integers, short integers, variable-length character strings and so on.
Encoding and decoding of message data (contd…)
Two representations are used for encoding and decoding message data:
Tagged representation
  The type of each program object is encoded in the message along with its value.
  Because the coded data format is self-describing, the receiving process does not need prior knowledge of the message layout.
Untagged representation
  The message data contains only program objects.
  No information is included in the message data to specify the type of each program object.
  The receiving process must have prior knowledge of how to decode the received data.
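A tagged representation can be sketched by prefixing each value with a one-byte type tag. The tag value, the big-endian byte order, and the names put_int32 and get_int32 are assumptions chosen for illustration; real systems define their own wire formats.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { TAG_INT32 = 1 };   /* hypothetical tag value for a 32-bit integer */

/* Encode v as 1 tag byte + 4 value bytes (big-endian). Returns bytes written. */
static size_t put_int32(uint8_t *buf, size_t cap, int32_t v)
{
    assert(buf != NULL && cap >= 5);
    uint32_t u = (uint32_t)v;
    buf[0] = TAG_INT32;
    buf[1] = (uint8_t)(u >> 24);
    buf[2] = (uint8_t)(u >> 16);
    buf[3] = (uint8_t)(u >> 8);
    buf[4] = (uint8_t)u;
    return 5;
}

/* Decode a tagged 32-bit integer. Returns 1 on success, 0 if the tag
 * or length does not match, so the receiver needs no prior knowledge
 * beyond the tag table. */
static int get_int32(const uint8_t *buf, size_t len, int32_t *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 5 || buf[0] != TAG_INT32)
        return 0;
    uint32_t u = ((uint32_t)buf[1] << 24) | ((uint32_t)buf[2] << 16) |
                 ((uint32_t)buf[3] << 8)  |  (uint32_t)buf[4];
    *out = (int32_t)u;
    return 1;
}
```

An untagged encoding would send only the four value bytes, and the receiver would have to know in advance that an integer comes next.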
Process addressing
A message-passing system usually supports two types of process addressing.
Explicit addressing
  The process with which communication is desired is explicitly named as a parameter in the communication primitive used.
  Send (process_id, message)
    Send a message to the process identified by process_id.
  Receive (process_id, message)
    Receive a message from the process identified by process_id.
Process addressing (contd…)
Implicit addressing
  A process willing to communicate does not explicitly name a process for communication.
  Send_any (service_id, message)
    Send a message to any process that provides the service of type service_id.
  Receive_any (process_id, message)
    Receive a message from any process, and return the process_id of the process from which the message was received.
Process addressing (contd…)
Processes can be identified by a combination of three fields:
  machine_id, local_id, machine_id
The first field identifies the node on which the process was created.
The second field is a local identifier generated by the node on which the process was created.
The third field identifies the last known location (node) of the process.
The values of the first two fields of an identifier never change; the third field, however, may.
This method of addressing is known as link-based addressing.
Process addressing (contd…)
Link-based addressing:
When a process migrates from its current node to a new node, link information {process_id, machine_id of the new node} is left on its previous node.
On the new node, a new local_id is assigned to the process, and its process identifier and new local_id are entered in a mapping table that the kernel of the new node maintains for all processes created on another node but running on this node.
If the value of the third field is equal to the first field, the message is sent to the node on which the process was created.
Process addressing (contd…)
Drawbacks: even though link-based addressing supports process migration, it suffers from two main drawbacks:
  The overhead of locating a process may be large if the process has migrated several times during its lifetime.
  It may not be possible to locate a process if an intermediate node on which the process once resided is down.
Both process addressing methods are nontransparent because the machine identifier has to be specified.
What are the alternatives?
Process addressing (contd…)
1. Centralized process identifier allocator
  It maintains a counter. When it receives a request for an identifier, it returns the current value of the counter and increments the counter.
  It suffers from poor reliability and scalability.
2. Two-level naming scheme for processes
  1. Machine-independent high-level name
  2. Machine-dependent low-level name
  A centralized (or replicated/distributed) name server maintains the mapping table that maps each high-level name to its low-level name.
Failure handling
Possible problems in IPC due to different types of system failures:
Loss of request message
  The communication link between sender and receiver fails, or the receiver's node is down when the request reaches it.
Loss of response message
  The communication link fails, or the sender's node is down when the response message reaches it.
Unsuccessful execution of the request
  The receiver's node crashes while the request is being processed.
Failure handling (contd…)
[Figure: three sender/receiver timelines.
a) Request message is lost: the sender sends a request, and the request message is lost before reaching the receiver.
b) Response message is lost: the request is executed successfully and the receiver sends the response, but the response message is lost.
c) Receiver's computer crashed: the request message arrives, the receiver crashes during execution, and it is later restarted.]
Failure handling (contd…)
Four-message reliable IPC protocol for client-server communication between two processes
[Figure: client and server timelines showing blocked and executing states. Messages in order: Request (client to server), Acknowledgement (server to client), Reply (server to client), Acknowledgement (client to server).]
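Failure handling (contd…)
Three-message reliable IPC protocol for client-server communication between two processes
[Figure: client and server timelines showing blocked and executing states. Messages in order: Request (client to server), Reply (server to client), Acknowledgement (client to server). The reply also serves as the acknowledgement of the request.]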
Failure handling (contd…)
Two-message reliable IPC protocol for client-server communication between two processes
[Figure: client and server timelines showing blocked and executing states. Messages in order: Request (client to server), Reply (server to client).]
Failure handling (contd…)
Fault-tolerant communication between a client and a server
[Figure: the client sends a REQUEST message and retransmits it after each of three timeouts. Across the four transmissions, one request message is lost, one execution is unsuccessful because the server crashes and has to be restarted, one execution succeeds but its response is lost, and one execution succeeds and its response reaches the client.]
The two successful executions of the request may produce different results.
Failure handling (contd…)
Idempotency and handling of duplicate request messages
Idempotency means repeatability.
An idempotent operation produces the same result, without any side effects, no matter how many times it is performed with the same arguments.
  Example: the procedure simpleIntrest(25000, 2, 8) produces the same result when executed repeatedly.
A nonidempotent operation produces different results for the same set of arguments when executed repeatedly.
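The idempotent example can be sketched in C. The argument order (principal, years, rate in percent) and the name simple_interest are assumptions, since the slide gives only the call; for the values on the slide the result is the same under any ordering of the last two arguments.

```c
#include <assert.h>

/* Idempotent: the result depends only on the arguments, and the call
 * changes no state, so repeating it is harmless. */
static double simple_interest(double principal, double years,
                              double rate_percent)
{
    assert(principal >= 0 && years >= 0 && rate_percent >= 0);
    return principal * years * rate_percent / 100.0;
}
```

Calling simple_interest(25000, 2, 8) any number of times returns 4000 each time, so a duplicate request caused by a retransmission does no harm.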
Failure handling (contd…)
Example: a nonidempotent operation
    int Total_Marks = 43;   /* server-side running total; 43 is the starting value used in this deck's example */

    int Cal_Final_Marks (int End_Sem_Marks, int attndnce)
    {
        /* Each call adds to Total_Marks, so repeating a call changes the result. */
        Total_Marks += End_Sem_Marks;
        if (attndnce > 95)
            Total_Marks += 5;
        else if (attndnce > 90)
            Total_Marks += 3;
        else if (attndnce > 85)
            Total_Marks += 2;
        else if (attndnce > 80)
            Total_Marks += 1;
        return Total_Marks;
    }
Failure handling (contd…)
A nonidempotent procedure
[Figure: client/server exchange, with Total_Marks = 43 on the server at the start.
1. The client sends the request Cal_Final_Marks(34, 87).
2. The server executes Cal_Final_Marks: Total_Marks = 43 + 34 + 2 = 79. Return(79) is lost.
3. The client times out and retransmits the request Cal_Final_Marks(34, 87).
4. The server executes Cal_Final_Marks again: Total_Marks = 79 + 34 + 2 = 115. Return(115) is sent.
5. The client receives Total_Marks = 115.]
Failure handling (contd…)
When the client receives no response, it cannot determine whether the failure was due to a server crash or to the loss of the request or response message.
Using timeouts, the client resends the request.
Repeated execution of nonidempotent requests results in "orphan" executions.
How can a nonidempotent request be executed only once?
  By using exactly-once semantics.
  Exactly-once semantics is implemented with a unique identifier for each request on the client side and a reply cache on the server side.
Failure handling (contd…)
Exactly-once semantics using request identifiers and a reply cache
[Figure: client/server exchange, with Total_Marks = 43 on the server at the start.
1. The client sends Request01: Cal_Final_Marks(34, 87).
2. The server checks the reply cache for Request01: NOT FOUND. It executes Cal_Final_Marks (Total_Marks = 43 + 34 + 2 = 79) and saves the reply. Return(79) is lost.
3. The client times out and retransmits Request01: Cal_Final_Marks(34, 87).
4. The server checks the reply cache for Request01: FOUND. It extracts the reply and sends Return(79) without executing the procedure again.
5. The client receives Total_Marks = 79.
Reply cache contents:
  Request identifier    Reply to be sent
  Request01             79
  Request02             45
  ..                    ..]
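The reply-cache idea can be sketched as follows. The fixed-size linear cache, the integer request identifiers, and the names reply_cache and handle_request are assumptions for illustration; the sketch has no eviction policy and keeps its state in globals only to stay short.

```c
#include <assert.h>
#include <stdbool.h>

#define CACHE_SLOTS 16   /* hypothetical cache size; no eviction in this sketch */

struct reply_entry { bool used; int request_id; int reply; };

static struct reply_entry reply_cache[CACHE_SLOTS];
static int total_marks = 43;   /* server state, starting value as in the figure */

/* The nonidempotent procedure from the slides. */
static int cal_final_marks(int end_sem_marks, int attendance)
{
    total_marks += end_sem_marks;
    if (attendance > 95)      total_marks += 5;
    else if (attendance > 90) total_marks += 3;
    else if (attendance > 85) total_marks += 2;
    else if (attendance > 80) total_marks += 1;
    return total_marks;
}

/* Execute a request at most once per request_id: a duplicate request
 * is answered from the cache instead of being executed again. */
static int handle_request(int request_id, int end_sem_marks, int attendance)
{
    int free_slot = -1;
    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (reply_cache[i].used && reply_cache[i].request_id == request_id)
            return reply_cache[i].reply;        /* FOUND: extract reply */
        if (!reply_cache[i].used && free_slot < 0)
            free_slot = i;
    }
    assert(free_slot >= 0);                     /* cache full: not handled */
    int reply = cal_final_marks(end_sem_marks, attendance);  /* NOT FOUND */
    reply_cache[free_slot].used = true;         /* save reply */
    reply_cache[free_slot].request_id = request_id;
    reply_cache[free_slot].reply = reply;
    return reply;
}
```

Handling request 1 with arguments (34, 87) twice returns 79 both times, and the server's total stays at 79 rather than rising to 115.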
Failure handling (contd…)
Keeping track of lost and out-of-sequence packets in multidatagram messages
How can reliable delivery of all the packets of a multidatagram message be ensured?
The simple approach is the stop-and-wait protocol.
  Each packet is acknowledged separately.
  Disadvantage: communication overhead.
A better approach is the blast protocol.
  A single acknowledgement packet is used for all the packets of a multidatagram message.
Failure handling (contd…)
When the blast protocol is used, a node or communication link failure can lead to:
  Loss of packets
  Out-of-sequence delivery of packets
To solve this:
  Use a bitmap to identify the packets of a message, carried in two extra header fields:
    the total number of packets, and a bitmap specifying the position of this packet in the message.
  Use the selective repeat method to retransmit the lost packets after a timeout period.
Failure handling (contd…)
[Figure: selective repeat for a message of four packets, each packet carrying (total packets, position bitmap).
1. The sender transmits the first of 4 packets (4, 1000). The receiver creates a buffer for 4 packets and places this packet in position 1.
2. The second of 4 packets (4, 0100) is lost.
3. The third of 4 packets (4, 0010) arrives and is placed in the buffer.
4. The fourth of 4 packets (4, 0001) is lost.
5. After a timeout, the receiver requests retransmission of the missing packets by sending (4, 0101).
6. The sender resends the missing packets: second of 4 (4, 0100) and fourth of 4 (4, 0001).
7. The receiver sends a single acknowledgement.]
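The bitmap bookkeeping in the figure can be sketched with a few bit operations. The names packet_bit and missing_packets are hypothetical, and the 32-packet limit is an artifact of using a 32-bit word in this sketch; the leftmost bit stands for packet 1, as in the figure.

```c
#include <assert.h>
#include <stdint.h>

/* Bitmap with only the bit for packet `pos` (1-based) of `total` set.
 * The leftmost of the `total` bits stands for packet 1. */
static uint32_t packet_bit(unsigned total, unsigned pos)
{
    assert(total >= 1 && total <= 32 && pos >= 1 && pos <= total);
    return (uint32_t)1 << (total - pos);
}

/* Given the OR of the bitmaps of all packets received so far, return
 * the bitmap of packets still missing (to be sent back on timeout). */
static uint32_t missing_packets(unsigned total, uint32_t received)
{
    assert(total >= 1 && total <= 32);
    uint32_t all = (total == 32) ? UINT32_MAX
                                 : (((uint32_t)1 << total) - 1);
    return all & ~received;
}
```

With packets 1 and 3 of 4 received (1000 and 0010), missing_packets(4, ...) yields 0101, which is the retransmission request shown in the figure.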
Group communication
Three types of group communication:
  One to many (single sender and multiple receivers)
  Many to one (multiple senders and single receiver)
  Many to many (multiple senders and multiple receivers)
Group communication (contd…)
One-to-many communication
  Also known as multicast communication.
  A special case of multicast communication is broadcast communication, in which the message is sent to all processors connected to a network.
Group management
  Closed group: only the members of the group can send a message to the group.
  Open group: any process in the system can send a message to the group.
  Centralized group server (with replication): used for dynamic management of group membership.
Group communication (contd…)
Group addressing
A two-level naming scheme is normally used for group addressing.
  The high-level group name is an ASCII string that is independent of the location of the processes in the group.
  The low-level group name depends on the underlying hardware.
A special address to which multiple machines can listen is called a multicast address.
Networks that do not support multicast addresses offer a broadcasting facility with a broadcast address.
Group communication (contd…)
Message delivery to receiver processes
  User applications use high-level group names in programs.
  The centralized group server maintains a mapping of high-level group names to their low-level names.
  The group server also maintains, for each group, a list of the process identifiers of all its member processes.
Group communication (contd…)
Buffered and unbuffered multicast
Multicast is an asynchronous communication mechanism.
A multicast send cannot be synchronous because:
  It is unrealistic to expect a sending process to wait until all the receiving processes in the multicast group are ready to receive the multicast message.
  The sending process may not be aware of all the receiving processes that belong to the multicast group.
For unbuffered multicast, the message is not buffered for the receiving process.
  It is lost if the receiving process is not in a state to receive it.
For buffered multicast, the message is buffered for the receiving processes.
  Each process of the group eventually receives the message.
Group communication (contd…)
Two types of semantics for one-to-many communication:
  Send-to-all semantics
  Bulletin-board semantics
Bulletin-board semantics is more flexible than send-to-all semantics because it takes into account two factors that send-to-all semantics ignores:
  The relevance of a message to a particular receiver may depend on the receiver's state.
  Messages not accepted within a certain time after transmission may no longer be useful.
Group communication (contd…)
Flexible reliability in multicast communication
In one-to-many communication, the degree of reliability is normally expressed as one of:
  0-reliable
    Example: time signal generation
  1-reliable
    Example: request for service
  m-out-of-n-reliable (1 < m < n)
    Example: consistency control algorithm
  All-reliable
    Example: updating of replicas
Group communication (contd…)
Atomic multicast
  Has an all-or-nothing property.
  When a message is sent to a group, it is either received by all the processes that are members of the group or received by none of them.
Many-to-one communication
  Multiple senders send messages to a single receiver.
  The single receiver may be selective or nonselective.
  A selective receiver specifies a unique sender.
    Message exchange takes place only if that sender sends a message.
  A nonselective receiver specifies a set of senders.
    Message exchange takes place if any sender in the set sends a message to this receiver.
Group communication (contd…)
Many-to-many communication
  Multiple senders send messages to multiple receivers.
  The important issue is ordered message delivery.
  Ordered message delivery ensures that all messages are delivered to all receivers in an order acceptable to the application.
  Ordered message delivery requires message sequencing.
Commonly used semantics for ordered delivery of multicast messages are:
  Absolute ordering
  Consistent ordering
  Causal ordering
Group communication (contd…)
Absolute ordering
  All messages are delivered to all receiver processes in the exact order in which they were sent.
  Each machine is assumed to have a clock, and the clocks are synchronized with each other.
  Global timestamps are used as message identifiers.
  The kernel of the receiver places each message in a queue.
  A sliding-window mechanism is used to deliver messages periodically.
  Messages whose timestamps fall within the current window are delivered to the receiver.
Group communication (contd…)
Absolute ordering
[Figure: absolute ordering of messages. Along the time axis, sender S1 sends m1 at time t1 and sender S2 sends m2 at time t2, with t1 < t2. Receivers R1 and R2 are both delivered m1 first and then m2.]
Group communication (contd…)
Consistent ordering
  All messages are delivered to all receiver processes in the same order.
  However, this order may be different from the order in which the messages were sent.
[Figure: consistent ordering of messages. Along the time axis, sender S1 sends m1 at time t1 and sender S2 sends m2 at time t2, with t1 < t2. Receivers R1 and R2 are delivered m1 and m2 in the same order as each other, which need not be the sending order.]
Group communication (contd…)
Implementation of consistent ordering
Approach I: centralized sequencer method
  The many-to-many scheme is made to appear as a combination of many-to-one and one-to-many schemes.
  The kernels of the sending machines send messages to a single receiver (known as the sequencer).
  The sequencer assigns a sequence number to each message and then multicasts it.
  The kernel of each receiving machine saves all incoming messages meant for a receiver in a separate queue.
  Messages in the queue are delivered immediately to the receiver unless there is a gap in the sequence numbers.
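The receiver-side rule (deliver in sequence, hold back when there is a gap) can be sketched as follows. The window of 8 and the names holdback, holdback_init and holdback_arrive are assumptions; only sequence numbers are tracked here, and message bodies and retransmission requests for the gap are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define WINDOW 8   /* hypothetical limit on messages held out of order */

struct holdback {
    unsigned next_seq;      /* next sequence number to deliver */
    bool present[WINDOW];   /* present[s % WINDOW]: message s is queued */
};

static void holdback_init(struct holdback *h, unsigned first_seq)
{
    assert(h != NULL);
    h->next_seq = first_seq;
    for (int i = 0; i < WINDOW; i++)
        h->present[i] = false;
}

/* Record the arrival of message `seq`; return how many queued messages
 * become deliverable in sequence as a result (0 while a gap remains). */
static unsigned holdback_arrive(struct holdback *h, unsigned seq)
{
    assert(h != NULL);
    if (seq < h->next_seq || seq >= h->next_seq + WINDOW)
        return 0;   /* duplicate or too far ahead: ignored in this sketch */
    h->present[seq % WINDOW] = true;
    unsigned delivered = 0;
    while (h->present[h->next_seq % WINDOW]) {
        h->present[h->next_seq % WINDOW] = false;
        h->next_seq++;
        delivered++;
    }
    return delivered;
}
```

If message 2 arrives before message 1, nothing is delivered until message 1 arrives, at which point both are delivered in order.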
Group communication (contd…)
Implementation of consistent ordering (contd…)
  The sequencer-based method has a single point of failure and hence poor reliability.
Approach II: ABCAST protocol (distributed)
  The sequence number of a message is assigned by distributed agreement among the group members and the sender.
  1. The sender assigns a temporary sequence number to the message and sends it to all members of the multicast group.
Group communication (contd…)
ABCAST protocol:
  The temporary sequence number must be greater than any previous sequence number used by the sender. A counter is used for this.
  2. On receiving the message, each member of the group returns a proposed sequence number to the sender.
     Member i calculates its proposed sequence number as
       max(Fmax, Pmax) + 1 + i/N
     where
       Fmax is the largest final sequence number agreed upon so far for a message received by the group,
       Pmax is the largest sequence number proposed so far by this member,
       N is the total number of members in the multicast group, and
       i is the member number.
Group communication (contd…)
ABCAST protocol:
  3. When the sender has received the proposed sequence numbers from all the members, it selects the largest one as the final sequence number for the message and sends it to all members in a COMMIT message.
  On receiving the COMMIT message, each member attaches the final sequence number to the message.
  Committed messages with final sequence numbers are delivered to the application programs in order of their final sequence numbers.
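The two calculations in ABCAST, the member's proposal and the sender's final choice, can be sketched as follows. The names propose_seq and final_seq are hypothetical, and member numbers are assumed to run from 0 to N - 1 so that the fractional term i/N stays below 1 and keeps proposals from different members distinct.

```c
#include <assert.h>
#include <stddef.h>

/* Proposed sequence number of member i: max(Fmax, Pmax) + 1 + i/N. */
static double propose_seq(double fmax, double pmax, unsigned i, unsigned n)
{
    assert(n > 0 && i < n);   /* assumption: members numbered 0..N-1 */
    double base = fmax > pmax ? fmax : pmax;
    return base + 1.0 + (double)i / (double)n;
}

/* The sender picks the largest proposal as the final sequence number. */
static double final_seq(const double *proposals, size_t count)
{
    assert(proposals != NULL && count > 0);
    double best = proposals[0];
    for (size_t k = 1; k < count; k++)
        if (proposals[k] > best)
            best = proposals[k];
    return best;
}
```

For instance, in a group of 4, member 1 with Fmax = 3 and Pmax = 5 proposes 6.25; if the other proposals are 8.0 and 6.5, the sender commits 8.0.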
Group communication (contd…)
Causal ordering
  Ensures that if the event of sending one message is causally related to the event of sending another message, the two messages are delivered to all receivers in the correct order.
  Two message-sending events are said to be causally related if they are related by the happened-before relation.
Group communication (contd…)
Causal ordering
[Figure: causal ordering of messages. Timelines for senders S1 and S2 and receivers R1, R2 and R3 along the time axis, with messages m1, m2 and m3 exchanged among them.]
Group communication (contd…)
Implementation of causal ordering
CBCAST protocol
  1. Each member process of a group maintains a vector of n components, where n is the total number of members in the group.
  2. Each member is assigned a sequence number from 0 to n - 1.
  3. The ith component of the vector corresponds to the member with sequence number i, and its value is the number of the last message received in sequence by this member from member i.
Group communication (contd…)
  4. To send a message, a process increments the value of its own component in its own vector and sends the vector as part of the message.
  5. When the message arrives at a receiver process's site, it is buffered by the runtime system, which tests two conditions to decide whether the message can be delivered or must be delayed to ensure causal-ordering semantics:
       S[i] = R[i] + 1 and
       S[j] <= R[j] for all j != i
     where S is the vector of the sender process (carried in the message), R is the vector of the receiver process, and i is the sequence number of the sender.
Group communication (contd…)
     S[i] = R[i] + 1 ensures that the receiver has not missed any message from the sender.
     S[j] <= R[j] for all j != i ensures that the sender has not received any message that the receiver has not yet received.
  6. If the message passes these two tests, the runtime system delivers it to the user process.
  7. Otherwise the message is left in the buffer, and the test is carried out again for it when a new message arrives.
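The delivery test of step 5 can be sketched as a single function. The name can_deliver and the 0-based indexing of members are assumptions of this sketch; s is the vector carried in the message and r is the receiver's own vector.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* CBCAST delivery test for a message from member `sender`:
 *   s[sender] == r[sender] + 1, and s[j] <= r[j] for every other j. */
static bool can_deliver(const unsigned *s, const unsigned *r,
                        size_t n, size_t sender)
{
    assert(s != NULL && r != NULL && sender < n);
    for (size_t j = 0; j < n; j++) {
        if (j == sender) {
            if (s[j] != r[j] + 1)
                return false;   /* receiver missed an earlier message */
        } else if (s[j] > r[j]) {
            return false;       /* sender has seen a message we have not */
        }
    }
    return true;
}
```

With a message vector of (4, 2, 5, 1) from the first member, a receiver holding (3, 2, 5, 1) can deliver, while receivers holding (2, 2, 5, 1) or (3, 2, 4, 1) must delay.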
Group communication (contd…)
CBCAST protocol for implementing causal ordering
[Figure: status of vectors at some instant of time:
  Vector of process A: 3 2 5 1
  Vector of process B: 3 2 5 1
  Vector of process C: 2 2 5 1
  Vector of process D: 3 2 4 1
Process A sends a new message to the other processes, carrying the vector 4 2 5 1 along with the message data.
  At B: deliver.
  At C: delay, because the condition A[1] = C[1] + 1 is false.
  At D: delay, because the condition A[3] <= D[3] is not true.]
Remote Procedure Calls
RPC is a special case of the general message-passing model of IPC.
RPC has become a widely accepted IPC mechanism in distributed systems because of the following features:
  Simple call syntax
  Familiar semantics (similar to local procedure calls)
  Well-defined interface
  Ease of use
  Generality
  Efficiency
  It can be used as an IPC mechanism between processes on different machines as well as between different processes on the same machine.
RPC model
The RPC model is similar to the procedure call model used for the transfer of control and data within a program, which works in the following manner:
  To make a procedure call, the caller places the arguments to the procedure in some well-specified location.
  Control is then transferred to the sequence of instructions that constitutes the body of the procedure.
  The procedure body is executed in a newly created execution environment.
  After the procedure's execution, control returns to the calling point, possibly returning a result.
Typical model of an RPC
[Figure: the caller (client process) calls the procedure and waits for the reply, sending a request message with the remote procedure's parameters. The callee (server process) receives the request and starts procedure execution; the procedure executes; the callee sends a reply message with the result of the procedure execution and waits for the next request. The caller then resumes execution.]
An RPC can also be asynchronous, so that the client can do other work while waiting for the reply.
Transparency of RPC
A transparent RPC mechanism is one in which local procedures and remote procedures are indistinguishable to programmers.
Transparent RPC requires:
1. Syntactic transparency
   An RPC should have exactly the same syntax as a local procedure call.
2. Semantic transparency
   The semantics of an RPC should be identical to those of a local procedure call.
Transparency of RPC (contd…)
Differences between RPCs and local procedure calls (LPCs):
  With an RPC, the called procedure is executed in an address space that is disjoint from the calling program's address space, so the remote procedure cannot access any variables or data values in the calling program's environment.
  RPCs are more vulnerable to failure than LPCs,
    since they involve two different processes and possibly a network and two different computers.
  RPCs consume much more time (100 to 1000 times more) than LPCs,
    due to the involvement of a communication network.
Implementation of RPC mechanism
Implementation of an RPC mechanism involves five
program elements:
1. The client
2. The client stub
3. The RPCRuntime
4. The server stub
5. The server
The client, the client stub, and one instance of
RPCRuntime execute on the client machine
The Server, the Server stub, and one instance of
RPCRuntime execute on the server machine
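[Figure: Implementation of the RPC mechanism, with steps numbered 1 to 10 in the original figure. On the client machine, the client process calls the client stub, which packs the call; the RPC Runtime sends the call packet and waits. On the server machine, the RPC Runtime receives the call packet, the server stub unpacks it and calls the server, which executes the procedure and returns. The result follows the reverse path as a result packet: the server stub packs it, the server's RPC Runtime sends it, the client's RPC Runtime receives it, and the client stub unpacks it and returns to the client.]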
Client
User process that initiates an RPC
Makes a perfectly normal local procedure call that in turn
invokes the corresponding procedure in the client stub
Client stub
Two tasks:
On receipt of a call request from the client, it packs a
specification of the target procedure and the
arguments into a message and then asks the local
RPCRuntime to send it to the server stub
On receipt of the result of procedure execution, it
unpacks the result and passes it to the client
RPCRuntime
Handles transmission of messages across the network between client and server machines
It is responsible for retransmission, acknowledgements, packet routing and encryption
The RPCRuntime on the client machine receives the call request message from the client stub and sends it to the server machine. It also receives the result message from the server machine and passes it to the client stub
The RPCRuntime on the server machine receives the result
message from the server stub and sends it to the client
machine. It also receives the call request message
from the client machine and passes it to the server stub
Server stub
Two tasks:
On receipt of a call request from the local RPCRuntime, it
unpacks it and makes a perfectly normal call to invoke
the appropriate procedure in the server
On receipt of the result from the server, it packs the result into a message
and then asks the local RPCRuntime to send it to the
client stub
Server
On receiving call request from server stub, the server
executes the appropriate procedure and returns the
result of procedure execution to the server stub
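The five elements above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `ClientStub`, `LoopbackRuntime`, `server_stub`, and `PROCEDURES` are hypothetical, JSON stands in for the packing format, and a direct function call stands in for the network between the two RPCRuntime instances.

```python
import json

# The "server": an ordinary procedure that knows nothing about RPC.
def add(a, b):
    return a + b

PROCEDURES = {"add": add}  # hypothetical table of callable procedures


def server_stub(call_packet: bytes) -> bytes:
    """Unpack the call packet, make a normal local call, pack the result."""
    call = json.loads(call_packet)
    result = PROCEDURES[call["proc"]](*call["args"])
    return json.dumps({"result": result}).encode()


class LoopbackRuntime:
    """Stand-in for both RPCRuntime instances; no real network is used."""

    def __init__(self, deliver):
        self.deliver = deliver  # function that hands the packet to the server stub

    def send_and_wait(self, packet: bytes) -> bytes:
        return self.deliver(packet)


class ClientStub:
    """Looks like a local procedure to the client, but packs and sends."""

    def __init__(self, runtime):
        self.runtime = runtime

    def add(self, a, b):
        packet = json.dumps({"proc": "add", "args": [a, b]}).encode()
        reply = self.runtime.send_and_wait(packet)
        return json.loads(reply)["result"]


# The "client": makes what looks like a perfectly normal local call.
stub = ClientStub(LoopbackRuntime(server_stub))
print(stub.add(2, 3))
```

A real RPCRuntime would also handle retransmission, acknowledgements and routing, which this loopback leaves out.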
Stub generation:
Stubs can be generated in 2 ways
Manually : RPC implementor provides a set of
translation functions from which a user can construct
stubs
Automatically : Uses Interface Definition Language
(IDL) to define the interface between a client and a
server.
RPC messages:
2 types of messages involved in the implementation of
an RPC system are:
Call messages
Reply messages
Call messages:
2 basic components necessary in a call message are:
The identification information of the remote procedure
to be executed
The arguments necessary for the execution of the
procedure
In addition to these fields, a call message normally has
A message identification field
A message type field
A client identification field
A typical RPC call message format (fields in order):
Message identifier (Seq. No.) | Message type (Call / Reply) | Client identifier | Remote procedure identifier (Program number, Version number, Procedure number) | Arguments
RPC reply message formats (fields in order):
a) A successful reply message: Message identifier | Message type | Reply status (successful) | Result
b) An unsuccessful reply message: Message identifier | Message type | Reply status (unsuccessful) | Reason for failure
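The call and reply message fields can be written down as plain record types. The sketch below is illustrative only: the class names, field names, and the `make_reply` helper are assumptions, not part of any particular RPC system.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Tuple


@dataclass
class CallMessage:
    message_id: int              # sequence number, used to match replies to calls
    client_id: str
    program: int                 # remote procedure identifier, part 1
    version: int                 # remote procedure identifier, part 2
    procedure: int               # remote procedure identifier, part 3
    arguments: Tuple[Any, ...]
    message_type: str = "call"


@dataclass
class ReplyMessage:
    message_id: int              # copied from the call being answered
    successful: bool             # reply status
    result: Any = None           # present in a successful reply
    reason: Optional[str] = None # present in an unsuccessful reply
    message_type: str = "reply"


def make_reply(call: CallMessage, handler: Callable[..., Any]) -> ReplyMessage:
    """Run the handler and build a successful or unsuccessful reply."""
    try:
        return ReplyMessage(call.message_id, True, result=handler(*call.arguments))
    except Exception as exc:
        return ReplyMessage(call.message_id, False, reason=str(exc))


divide = lambda a, b: a // b
ok = make_reply(CallMessage(1, "client-a", 100, 1, 2, (6, 3)), divide)
bad = make_reply(CallMessage(2, "client-a", 100, 1, 2, (6, 0)), divide)
```

Here `ok` carries a result, while `bad` carries a reason for failure and no result.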
Server Management
1) Server Implementation
2) Server Creation
Based on the style of server implementation, servers can
be classified as
1. Stateful server – Maintains client state information from one call to the next,
so the client need not resend that information with every request.
2. Stateless server – Does not maintain client state
information.
Stateful server
Stateful file server: exchange between client process and server process
Client: Open ( Filename, Mode )       Server: Return ( Fid )
Client: Read ( Fid, 200, buffer )     Server: Return ( bytes 0 to 199 )
Client: Read ( Fid, 5, buffer )       Server: Return ( bytes 200 to 204 )
Client: Close ( Fid )                 Server: Return ( Successful )
The server process keeps a file table with the columns File Id, Mode, R/W Pointer
Stateless server
Stateless file server: exchange between client process and server process
Client: Read ( Filename, 0, 200, buffer )     Server: Return ( bytes 0 to 199 )
Client: Read ( Filename, 400, 20, buffer )    Server: Return ( bytes 400 to 419 )
The file table (File Id, Mode, R/W Pointer) is kept by the client process, not by the server process
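The difference between the two file servers can be sketched as follows. The class names and the in-memory `files` dictionary are hypothetical; the point is only where the read/write pointer lives.

```python
class StatefulFileServer:
    """Keeps a file table, so a read only needs the file id and a byte count."""

    def __init__(self, files):
        self.files = files      # filename -> file contents (bytes)
        self.table = {}         # fid -> [filename, mode, rw_pointer]
        self.next_fid = 1

    def open(self, filename, mode):
        fid = self.next_fid
        self.next_fid += 1
        self.table[fid] = [filename, mode, 0]
        return fid

    def read(self, fid, n):
        entry = self.table[fid]
        data = self.files[entry[0]][entry[2]:entry[2] + n]
        entry[2] += len(data)   # the server advances the R/W pointer
        return data

    def close(self, fid):
        del self.table[fid]
        return True


class StatelessFileServer:
    """Keeps nothing between calls; every request carries name and position."""

    def __init__(self, files):
        self.files = files

    def read(self, filename, position, n):
        return self.files[filename][position:position + n]


files = {"demo.txt": bytes(range(256)) * 2}   # 512 bytes of sample data

stateful = StatefulFileServer(files)
fid = stateful.open("demo.txt", "r")
first = stateful.read(fid, 200)    # bytes 0 to 199
second = stateful.read(fid, 5)     # bytes 200 to 204
stateful.close(fid)

stateless = StatelessFileServer(files)
same = stateless.read("demo.txt", 200, 5)     # the client supplies the position
```

If the stateful server crashes, its file table is lost along with every open file's position; the stateless server has nothing to lose, which is why its crash recovery is simpler.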
Stateless vs. Stateful servers
Stateful servers provide an easier programming
paradigm, since clients need not keep track of state
information
Stateful servers are more efficient than stateless
servers
Stateless servers make crash recovery easy in the
event of a server crash
The choice between a stateless and a stateful server is purely
application dependent
Server Creation Semantics
Server processes may either be created and installed
before their client processes or be created on demand.
Based on the time duration for which RPC servers survive, they are classified as
1. Instance-per-call server
2. Instance-per-session server
3. Persistent server
1. Instance-per-call server: Servers exist only for the duration of a single call.
The server is created by the RPCRuntime on the server machine only when the call message arrives.
The server is deleted after the call execution.
Instance-per-call servers are not commonly used because:
The approach is stateless, so state information must be kept either at the client process (time consuming, and loses data abstraction) or by the server O.S. (expensive)
Multiple invocations of the same server become more expensive.
2. Instance-per-session server: The server exists for the entire session for which the client and server interact. The server can maintain internal state information. Overhead involved in creation and destruction is minimized.
3. Persistent server: The server remains in existence indefinitely. A persistent server can be shared, unlike the other two.
Communication protocols for RPCs
1. The Request (R) protocol
[Figure: R protocol between client and server. For the first RPC and again for the next RPC, the client sends a request message and the server executes the procedure; no reply message is sent.]
The Request protocol
Used for RPCs in which the called procedure has nothing
to return and the client requires no confirmation that
the procedure has been executed
Only one message per call is transmitted
An RPC that uses the R protocol is called an asynchronous
RPC
In asynchronous RPC, the RPCRuntime does not take
responsibility for retrying a request in case of
communication failure
Asynchronous RPCs over an unreliable transport protocol
are generally useful for implementing periodic update
services
2. The Request/Reply (RR) protocol
[Figure: RR protocol between client and server. For each RPC the client sends a request message, the server executes the procedure and sends a reply message. The reply message also serves as an acknowledgement for the request message; the request message of the next RPC also serves as an acknowledgement for the reply of the previous RPC.]
The Request/Reply (RR) protocol
Suitable for simple RPCs in which all the arguments and
results fit in a single packet buffer, and both the duration of a call and the interval between calls are short (less than transmission time)
It is based on the idea of using implicit acknowledgement to eliminate explicit acknowledgement messages
In this protocol
A server’s reply message is regarded as an
acknowledgement of the client’s request message
A subsequent call packet from a client is regarded as an acknowledgement of the server’s reply message of the previous call made by that client
3. The Request/Reply/Acknowledge-reply (RRA) protocol
[Figure: RRA protocol between client and server. For each RPC the client sends a request message, the server executes the procedure and sends a reply message, and the client then sends a reply acknowledgement message.]
The RRA protocol
Message identifiers associated with request
messages are ordered
The client acknowledges a reply message only if it has
received the replies for all the previous requests
The server deletes information from its cache only after
receiving an acknowledgement for it from the client
Loss of an acknowledgement is harmless, since a later
acknowledgement message guarantees the receipt of the
replies for all earlier messages
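The server-side bookkeeping implied by these rules can be sketched as a small reply cache. The `ReplyCache` class and its method names are hypothetical; the sketch assumes message identifiers are integers that increase per client.

```python
class ReplyCache:
    """Server-side cache of replies, kept until the client acknowledges them."""

    def __init__(self):
        self.replies = {}   # client_id -> {message_id: reply}

    def store(self, client_id, message_id, reply):
        self.replies.setdefault(client_id, {})[message_id] = reply

    def lookup(self, client_id, message_id):
        # Lets the server answer a retransmitted request without re-executing it.
        return self.replies.get(client_id, {}).get(message_id)

    def acknowledge(self, client_id, message_id):
        # An acknowledgement for message_id also covers every earlier id,
        # so a lost acknowledgement is repaired by the next one.
        cached = self.replies.get(client_id, {})
        for mid in [m for m in cached if m <= message_id]:
            del cached[mid]


cache = ReplyCache()
for mid in (1, 2, 3):
    cache.store("client-a", mid, f"result-{mid}")

# Suppose the acknowledgement for message 1 was lost; the one for 2 still arrives.
cache.acknowledge("client-a", 2)
```

After the acknowledgement for message 2, the replies for messages 1 and 2 are both gone from the cache, while the reply for message 3 is still held.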
Client Server Binding
Binding: Process by which a client becomes associated
with a server so that calls can take place.
Server locating:
1. Broadcasting:
Message is broadcast to all nodes.
Node housing the desired server responds.
Easy to implement & suitable for small networks.
Expensive for large networks.
2. Binding Agent:
A name server used to bind a client to a server.
Name server maintains the Binding Table.
[Figure: Binding through a binding agent (name server). 1. The server process registers itself with the binding agent. 2. The client process requests the server’s location from the binding agent. 3. The binding agent returns the server’s location. 4. The client calls the server.]
Advantages of using a binding agent:
Can support multiple servers having the same interface type, so that any of the available servers may be used to service the client’s request.
The binding agent can balance the load evenly among the servers providing the same service.
A user authorization facility can be provided for binding
Disadvantages:
Overhead becomes large when many client processes are short lived.
The binding agent may become a performance bottleneck
Binding time:
1. Compile time binding → Hard coding of the server’s network address. Extremely inflexible (if the configuration changes)
2. Link time binding → The client contacts the binding agent (B.A.) before making a call
The server process exports its service by registering it with the binding agent
The client makes an import request to the binding agent for
the service before making a call
The binding agent returns the server details to the client
The client caches them to avoid contacting the binding agent
for subsequent calls
3. Call time binding → The client is bound to a server at the time when it calls
the server for the first time during its execution.
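The export, import and caching steps of link time binding can be sketched as below. All names (`BindingAgent`, `export_service`, `import_service`, `Client.locate`) and the addresses are made up for illustration, and the round-robin choice among servers is just one possible way to spread load.

```python
class BindingAgent:
    """Name server that maps an interface name to the servers exporting it."""

    def __init__(self):
        self.table = {}        # binding table: interface -> list of addresses
        self.next_index = {}   # interface -> index of the next server to hand out

    def export_service(self, interface, address):
        self.table.setdefault(interface, []).append(address)

    def import_service(self, interface):
        servers = self.table.get(interface)
        if not servers:
            raise LookupError(f"no server exports {interface!r}")
        i = self.next_index.get(interface, 0) % len(servers)
        self.next_index[interface] = i + 1   # simple round-robin load balancing
        return servers[i]


class Client:
    """Caches the imported binding so later calls skip the binding agent."""

    def __init__(self, agent):
        self.agent = agent
        self.cache = {}
        self.lookups = 0

    def locate(self, interface):
        if interface not in self.cache:
            self.lookups += 1
            self.cache[interface] = self.agent.import_service(interface)
        return self.cache[interface]


agent = BindingAgent()
agent.export_service("file_service", ("10.0.0.1", 5000))   # hypothetical addresses
agent.export_service("file_service", ("10.0.0.2", 5000))

first_client = Client(agent)
a = first_client.locate("file_service")
b = first_client.locate("file_service")    # served from the client's cache
c = Client(agent).locate("file_service")   # a second client gets the other server
```

The cache is what keeps the binding agent from being contacted on every call, at the cost of stale bindings if a server moves.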
Client Server Binding - Call time binding
[Figure: Call time binding through the binding agent. 1. The client passes the server’s interface name and the arguments of the RPC to the binding agent. 2. The binding agent sends an RPC call message to the server. 3. The server returns the result of request processing. 4. The binding agent returns the results along with the server’s handle to the client. 5. Subsequent calls are sent directly from the client process to the server process.]
Complicated RPCs
2 types of complicated RPCs are:
1. RPCs involving long-duration calls or large gaps between calls
2 methods are used to handle them:
Periodic probing of the server by the client
Periodic generation of an acknowledgement by the
server
2. RPCs involving arguments and/or results that are too large to fit in a single datagram packet
A long RPC argument or result is fragmented and
transmitted in multiple packets
Special types of RPCs
1. Call Back RPC
2. Broadcast RPC
3. Batch-mode RPC
1. Call Back RPC
In a normal RPC, the caller and callee processes have a client-server relationship, whereas callback RPC uses a peer-to-peer paradigm in which a node acts as both client and server.
Callback RPC is meant for interactive applications that require intermediate user input during a call
During procedure execution, the server process makes a callback RPC to the client process
[Figure: Callback RPC between client and server.
1. Client: Call (parameter list). Server: starts procedure execution
2. Server: stops procedure execution temporarily and sends Callback (parameter list)
3. Client: processes the callback request and sends Reply (result of callback)
4. Server: resumes procedure execution; when procedure execution ends, sends Reply (result of call)]
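The control flow of a callback RPC can be sketched with two in-process objects standing in for the two processes. The classes, the `reserve_seat` procedure and the question string are invented for illustration; in a real system each method call below would be a message exchange.

```python
class Client:
    """Acts as a client for the main call and as a server for the callback."""

    def __init__(self, answers):
        self.answers = answers     # the user's intermediate inputs
        self.callbacks_seen = []

    def callback(self, question):
        # Invoked by the server in the middle of procedure execution.
        self.callbacks_seen.append(question)
        return self.answers[question]


class Server:
    def reserve_seat(self, client, passenger):
        # Procedure execution starts, then stops temporarily for user input.
        seat = client.callback("seat preference")
        # Execution resumes once the callback's reply has arrived.
        return f"{passenger}: {seat} seat reserved"


client = Client({"seat preference": "window"})
reply = Server().reserve_seat(client, "Asha")
print(reply)
```

The server needs a way to reach the client for the callback, which is why the client is passed in here; a real implementation would pass a handle or address instead.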
2. Broadcast RPC
A client request is broadcast on the network and processed by all the servers providing that service.
Two ways to do this:
Using a binding agent, which forwards the request to
all servers registered with it.
Using broadcast ports of the servers.
The client process may wait for zero, one, m-out-of-n, or
all replies, depending on the reliability desired.
3. Batch-mode RPC
Queue separate RPC requests at the client side in a
transmission buffer and send them over the network in a
batch.
Reduces the overhead of sending each RPC separately.
Applications requiring higher RPC call rates (50-100
RPC/sec) can be implemented easily.
The transmission buffer is flushed when
A predetermined interval elapses.
A predetermined number of requests has been queued.
The amount of batch data exceeds the buffer size.
A call is made to one of the server’s procedures for
which a result is expected (nonqueuing RPC).
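The four flush conditions can be sketched in a small client-side buffer. `BatchingClient`, its parameters and the `send_batch` callback are hypothetical. As a simplification, the time interval is only checked when a new request is queued (a real client would also use a timer), and the size check fires when the limit is reached.

```python
import time


class BatchingClient:
    def __init__(self, send_batch, max_requests=4, max_bytes=1024, max_age=0.5):
        self.send_batch = send_batch      # sends a list of requests, returns results
        self.max_requests = max_requests
        self.max_bytes = max_bytes
        self.max_age = max_age            # seconds the oldest request may wait
        self.buffer = []
        self.buffered_bytes = 0
        self.oldest = None

    def call_queued(self, request: bytes):
        """Queue a request for which no result is expected."""
        if self.oldest is None:
            self.oldest = time.monotonic()
        self.buffer.append(request)
        self.buffered_bytes += len(request)
        if (len(self.buffer) >= self.max_requests
                or self.buffered_bytes >= self.max_bytes
                or time.monotonic() - self.oldest >= self.max_age):
            self.flush()

    def call_with_result(self, request: bytes):
        """Nonqueuing call: flushes everything queued so far, then returns its result."""
        self.buffer.append(request)
        return self.flush()[-1]

    def flush(self):
        batch, self.buffer = self.buffer, []
        self.buffered_bytes = 0
        self.oldest = None
        return self.send_batch(batch) if batch else []


sent = []   # record of batches that reached the "network"

def send_batch(batch):
    sent.append(list(batch))
    return [request.upper() for request in batch]   # stand-in for remote execution

client = BatchingClient(send_batch, max_requests=3, max_age=60)
client.call_queued(b"a")
client.call_queued(b"b")
batches_before_limit = len(sent)      # nothing has been sent yet
client.call_queued(b"c")              # third request reaches the count limit
client.call_queued(b"d")
result = client.call_with_result(b"e")
```

Five calls are made here but only two batches go out, which is the saving the batch mode is after.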
Optimizations in RPC for better performance
1. Concurrent Access to Multiple Servers
a) Use of threads: - Each thread can independently
make calls to different servers.
b) Early Reply Approach: -
RPC is split into 2 RPC calls
1. One RPC for Passing Parameters
2. One RPC for requesting result
c) Call Buffering Approach
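Early Reply Approach: to provide concurrent access to multiple servers.
[Figure: Early reply approach between client and server.
1. Client: Call procedure (parameter). Server: Return (tag), sent to the client as Reply (tag)
2. Server: executes the procedure and stores the result; meanwhile the client carries out other activities
3. Client: Request result (tag). Server: Return (result), sent to the client as Reply (result)]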
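The early reply exchange can be sketched with a thread standing in for the server's procedure execution. `EarlyReplyServer` and its methods are hypothetical names; the first method corresponds to the RPC that passes parameters and gets a tag back, the second to the RPC that requests the result.

```python
import itertools
import threading


class EarlyReplyServer:
    def __init__(self, procedure):
        self.procedure = procedure
        self.tags = itertools.count(1)
        self.results = {}    # tag -> stored result
        self.workers = {}    # tag -> thread executing the procedure

    def call(self, *params):
        """First RPC: accept the parameters and reply at once with a tag."""
        tag = next(self.tags)
        worker = threading.Thread(target=self._run, args=(tag, params))
        self.workers[tag] = worker
        worker.start()
        return tag

    def _run(self, tag, params):
        self.results[tag] = self.procedure(*params)   # Store (result)

    def request_result(self, tag):
        """Second RPC: return the stored result, waiting if it is not ready."""
        self.workers.pop(tag).join()
        return self.results.pop(tag)


square_server = EarlyReplyServer(lambda x: x * x)
negate_server = EarlyReplyServer(lambda x: -x)

# The client starts calls on two servers before collecting either result.
tag_a = square_server.call(7)
tag_b = negate_server.call(7)
# ... the client could carry out other activities here ...
results = (square_server.request_result(tag_a), negate_server.request_result(tag_b))
```

Because the first call returns as soon as the tag is issued, both servers work at the same time.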
Call buffering approach: to provide concurrent access to
multiple servers.
Clients and servers do not interact directly with each
other
They interact indirectly via a call buffer server
To make an RPC call
A client sends its call request to the call buffer server
The client then performs other activities until it needs the
result
When it needs the result, the client periodically polls the call buffer server
If the result is available, it retrieves the result
Call buffering approach: on the server side
When a server is free, it periodically polls the call buffer server to check whether there is any call for it
If there is, it retrieves the call request, executes it, and makes a call back to the call buffer server
to return the result of execution
[Figure: Call buffering approach with client, call buffer server, and server.
Client side: Call procedure (parameter) gets Reply (tag); the client carries out other activities; while polling for the result, Check for result (tag) gets Reply (not done) until a later check gets Reply (result).
Server side: while polling for a waiting request, Check for waiting request gets Reply (no request) until a later check gets Reply (tag, parameter); the server executes the procedure, sends Reply (tag, result), and receives an Acknowledgement.]
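The call buffer server's role as a go-between can be sketched as follows. `CallBufferServer`, its method names, the status strings and the server name `"adder"` are all hypothetical; the polling loops are reduced to single calls so the sequence stays visible.

```python
import itertools


class CallBufferServer:
    """Holds call requests and results so clients and servers never talk directly."""

    def __init__(self):
        self.tags = itertools.count(1)
        self.waiting = {}   # server name -> list of (tag, params) not yet fetched
        self.results = {}   # tag -> result not yet fetched

    # Operations used by clients
    def call(self, server_name, params):
        tag = next(self.tags)
        self.waiting.setdefault(server_name, []).append((tag, params))
        return tag                                    # Reply (tag)

    def check_result(self, tag):
        if tag in self.results:
            return ("done", self.results.pop(tag))    # Reply (result)
        return ("not done", None)                     # Reply (not done)

    # Operations used by servers
    def check_request(self, server_name):
        queue = self.waiting.get(server_name)
        return queue.pop(0) if queue else None        # (tag, params) or no request

    def post_result(self, tag, result):
        self.results[tag] = result
        return "ack"                                  # Acknowledgement


buffer_server = CallBufferServer()

tag = buffer_server.call("adder", (2, 3))             # client: Call procedure
early_poll = buffer_server.check_result(tag)          # client polls too soon

request = buffer_server.check_request("adder")        # server: Check for waiting request
ack = buffer_server.post_result(request[0], sum(request[1]))   # server executes and replies

late_poll = buffer_server.check_result(tag)           # client polls again
idle_poll = buffer_server.check_request("adder")      # server finds no further request
```

The price of the decoupling is the polling traffic on both sides, which grows as the polling interval shrinks.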
2. Serving multiple requests simultaneously
Delays commonly encountered in RPC systems:
A delay caused while a server waits for a resource that
is temporarily unavailable
A delay that occurs when a server calls a remote
function that involves a considerable amount of
computation or a considerable
transmission delay
A multi-threaded server with a dynamic thread
creation facility can accept and
process other requests instead of sitting idle while
waiting, which gives better performance
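A minimal sketch of the idea, using Python's standard thread pool as an approximation: `ThreadPoolExecutor` starts worker threads on demand up to a fixed limit rather than creating one per request without bound. The `handle` and `serve` functions and the 0.1 second sleep are made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle(request):
    # Stands in for waiting on a busy resource or on a nested remote call.
    time.sleep(0.1)
    return request * 2


def serve(requests, workers):
    """Process requests with up to `workers` threads; replies keep request order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handle, requests))


replies = serve([1, 2, 3, 4], workers=4)
print(replies)
```

With a single worker the four waits would happen one after another; with four workers they overlap, so the server is not idle while any one request is blocked.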
3. Reducing per-call workload of servers
One way to achieve this improvement is to use
stateless servers
4. Reply caching of idempotent remote procedures
Proper selection of timeout values
Too small timeout value will cause timers to expire too
often, resulting in unnecessary retransmissions
Too large timeout value will cause a needlessly long
delay in the event that a message is actually lost
Servers are likely to take varying amounts of time to
service individual requests, depending on various
factors like server load, network routing and network
congestion
If the clients continue to retry sending requests, the
server loading and network congestion problem will
become worse
One method for proper selection of timeout values is
to use a back-off strategy with exponentially
increasing timeout values
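A back-off strategy of this kind can be sketched as below. The function names, the doubling factor, the 30 second ceiling and the `send(request, timeout)` signature are assumptions chosen for the example, not values from any particular RPC system.

```python
def timeout_schedule(initial, factor=2.0, ceiling=30.0, retries=5):
    """Return the timeout in seconds to use for each successive attempt."""
    timeouts = []
    timeout = initial
    for _ in range(retries):
        timeouts.append(min(timeout, ceiling))   # never wait longer than the ceiling
        timeout *= factor
    return timeouts


def call_with_backoff(send, request, initial=0.5, retries=5):
    """Retry `send` with growing timeouts; `send` raises TimeoutError on no reply."""
    for timeout in timeout_schedule(initial, retries=retries):
        try:
            return send(request, timeout)
        except TimeoutError:
            continue   # wait longer on the next attempt instead of retrying at once
    raise TimeoutError(f"no reply after {retries} attempts")


attempts = []   # timeouts actually used by the demo transport below

def flaky_send(request, timeout):
    attempts.append(timeout)
    if len(attempts) < 3:          # the first two attempts get no reply
        raise TimeoutError
    return "ok"

outcome = call_with_backoff(flaky_send, "ping")
```

Each failed attempt doubles the wait, so a loaded server or congested network sees fewer retransmissions from this client, not more.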
5. Proper design of RPC protocol specification
End of chapter 2