Transparent Inter Process Communication (TIPC)
draft-maloy-tipc-00.txt
Jon Maloy, Ericsson
Steven Blake, Ericsson
Maarten Koning, Wind River
NOKIA RESEARCH CENTER / BOSTON
TIPC
A general transport protocol for cluster environments
Not dedicated for distributed routers or ForCES protocol only
A framework for detecting, supervising and maintaining cluster topology
Available as portable open source code package under dual BSD/GPL licence
13000 lines of C code, 112 kbyte Linux kernel module
Runs on 4 operating systems so far, with more to come
Proven concept, used and deployed in several Ericsson products
TIPC
In the ForCES context, TIPC is intended to be used as a transport protocol for carrying ForCES protocol messages, not as a ForCES protocol as such
Why TIPC in ForCES?
Connectionless, connection-oriented and multicast communication modes
  Reliable transport in all modes, but can be made unreliable per socket or per connection
  Single, consistent addressing model for all three modes
Scalable
  Can handle clusters of hundreds of nodes
Performance
  Runs directly on the media (Ethernet, Infiniband...) when possible
  24-byte header for most messages
Location transparency in all three communication modes
  The cluster can be seen as one single computer
  Selective transparency
Why TIPC in ForCES?
Congestion control at three levels
  Connection level, signalling link level and media level
  Based on 4 importance priorities
Simple to configure
  Each node needs to know its own identity, that is all
  Automatic neighbour detection using multicast/broadcast
Lightweight, reactive connections
  Immediate connection abortion at node/process failure or overload
Topology subscription service
  Functional and physical topology
Functional View
Diagram (node internal), listed from the bearer layer to the user API layer:
  Ethernet, SCTP, UDP, Infiniband, Mirrored Memory
  Bearer Adapter API
  Sequence/Retransmission Control
  Packet Bundling, Congestion Control
  Fragmentation/De-fragmentation
  Reliable Multicast, Neighbour Detection
  Link Establish/Supervision/Failover
  Address Table Distribution
  Connection Supervision, Route/Link Selection
  Address Subscription, Address Resolution
  User Adapter API
  Socket API Adapter, Port API Adapter, Other API Adapters
Network Topology
Diagram:
  Zone <1>
    Cluster <1.1>
    Cluster <1.2>
      Node <1.2.3>
  Zone <2>
    Cluster <2.1>
      Slave Node <2.1.3333>
  Internet/Intranet
Functional Addressing: Unicast
Function Address
  Persistent, reusable 64-bit port identifier assigned by user
  Consists of type number and instance number
Function Address Sequence
  Sequence of function addresses with same type
Diagram:
  Server Process, Partition A: bind(type = foo, lower=0, upper=99)
  Server Process, Partition B: bind(type = foo, lower=100, upper=199)
  Client Process: sendto(type = foo, instance = 33)
  Message: foo,33
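The unicast lookup on this slide can be illustrated with a small toy model. This is only a sketch of the idea, not TIPC code: the `NameTable` class, its methods and the `partition-A`/`partition-B` labels are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binding:
    """One entry in a toy name table: a range of instances owned by a port."""
    type: str
    lower: int
    upper: int
    port: str  # hypothetical label for the bound socket


class NameTable:
    """Hypothetical stand-in for the address table, for illustration only."""

    def __init__(self):
        self.bindings = []

    def bind(self, type, lower, upper, port):
        self.bindings.append(Binding(type, lower, upper, port))

    def lookup(self, type, instance):
        """Return the ports whose bound range contains the instance."""
        return [b.port for b in self.bindings
                if b.type == type and b.lower <= instance <= b.upper]


table = NameTable()
table.bind("foo", 0, 99, "partition-A")
table.bind("foo", 100, 199, "partition-B")

unicast_targets = table.lookup("foo", 33)
print(unicast_targets)  # ['partition-A']
```

Instance 33 falls inside the range 0..99, so only the first binding matches; an instance outside every bound range matches nothing.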
Functional Addressing: Multicast
Based on Function Address Sequences
Any partition overlapping with the range used in the destination address will receive a copy of the message
Client defines “multicast group” per call
Diagram:
  Server Process, Partition A: bind(type = foo, lower=0, upper=99)
  Server Process, Partition B: bind(type = foo, lower=100, upper=199)
  Client Process: sendto(type = foo, lower = 33, upper = 133)
  Message: foo,33,133 (one copy to each partition)
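The overlap rule for multicast can be sketched as a range-intersection test. The function names and port labels here are assumptions made for the example; only the bind and sendto ranges come from the slide.

```python
def overlaps(lower_a, upper_a, lower_b, upper_b):
    """True if the closed ranges [lower_a, upper_a] and [lower_b, upper_b] intersect."""
    return lower_a <= upper_b and lower_b <= upper_a


def multicast_targets(bindings, type, lower, upper):
    """Return every bound port whose range overlaps the destination range."""
    return [port for (b_type, b_lower, b_upper, port) in bindings
            if b_type == type and overlaps(b_lower, b_upper, lower, upper)]


# Bindings from the slide: (type, lower, upper, hypothetical port label)
bindings = [
    ("foo", 0, 99, "partition-A"),
    ("foo", 100, 199, "partition-B"),
]

receivers = multicast_targets(bindings, "foo", 33, 133)
print(receivers)  # ['partition-A', 'partition-B']
```

The destination range 33..133 intersects both 0..99 and 100..199, so both partitions receive a copy. A narrower range such as 0..50 would reach Partition A only.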
Location Transparency
Location of server not known by client
Lookup of physical destination performed on-the-fly
Efficient, no secondary messaging involved
Diagram (three steps, same calls in each):
  Client Process: sendto(type = foo, lower = 33, upper = 133)
  Server Process, Partition A: bind(type = foo, lower=0, upper=99)
  Server Process, Partition B: bind(type = foo, lower=100, upper=199)
  Message: foo,33,133
  Step 1, nodes shown: Node <1.1.1>
  Step 2, nodes shown: Node <1.1.1>, Node <1.1.2>
  Step 3, nodes shown: Node <1.1.1>, Node <1.1.2>, Node <1.1.3>
Address Binding
Many sockets may bind to same partition
  Closest-First or Round-Robin algorithm chosen by client
Same socket may bind to many partitions
Same socket may bind to different functions
Diagram, step 1 (many sockets, same partition):
  Server Process, Partition A: bind(type = foo, lower=0, upper=99)
  Server Process, Partition A’: bind(type = foo, lower=0, upper=99)
Diagram, step 2 (same socket, many partitions):
  Server Process, Partition A+B’: bind(type = foo, lower=0, upper=99), bind(type = foo, lower=100, upper=199)
  Server Process, Partition B: bind(type = foo, lower=100, upper=199)
Diagram, step 3 (same socket, different functions):
  Server Process, Partition A: bind(type = foo, lower=0, upper=99), bind(type = bar, lower=0, upper=999)
  Server Process, Partition B: bind(type = foo, lower=100, upper=199)
In every step:
  Client Process: sendto(type = foo, lower = 33, upper = 133)
  Message: foo,33,133
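When several sockets are bound to the same partition, the client picks a selection policy. The sketch below illustrates the two policies named on the slide under simplifying assumptions: "closest" is reduced to "same node as the client", and the node labels and function names are hypothetical.

```python
import itertools


def closest_first(candidates, client_node):
    """Prefer a server on the client's own node, else fall back to the first candidate."""
    for node, port in candidates:
        if node == client_node:
            return port
    return candidates[0][1]


def round_robin(candidates):
    """Return an iterator that hands out ports in turn, spreading messages evenly."""
    return itertools.cycle(port for _node, port in candidates)


# Two sockets bound to the same partition (foo, 0..99); node labels are hypothetical.
candidates = [("1.1.1", "partition-A"), ("1.1.2", "partition-A'")]

local_choice = closest_first(candidates, client_node="1.1.2")
rr = round_robin(candidates)
rr_choices = [next(rr) for _ in range(4)]

print(local_choice)
print(rr_choices)
```

With the client on node 1.1.2, closest-first keeps traffic local, while round-robin alternates between the two bound sockets regardless of location.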
Functional Topology Subscription
Function Address/Address Partition bind/unbind events
Diagram:
  Server Process, Partition A: bind(type = foo, lower=0, upper=99)
  Server Process, Partition B: bind(type = foo, lower=100, upper=199)
  Client Process: subscribe(type = foo, lower = 0, upper = 500)
  Events: foo,0,99 and foo,100,199
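The subscription mechanism can be pictured as a callback fired whenever a bind or unbind overlaps the subscribed range. This is a toy model with hypothetical class and method names, not the TIPC topology service API; it only reports events that happen after the subscription is made.

```python
class TopologyService:
    """Toy topology service: reports bind/unbind events to interested subscribers."""

    def __init__(self):
        self.subscriptions = []  # entries of (type, lower, upper, callback)

    def subscribe(self, type, lower, upper, callback):
        self.subscriptions.append((type, lower, upper, callback))

    def _notify(self, event, type, lower, upper):
        for s_type, s_lower, s_upper, callback in self.subscriptions:
            # Deliver the event if the bound range overlaps the subscribed range.
            if s_type == type and s_lower <= upper and lower <= s_upper:
                callback(event, type, lower, upper)

    def bind(self, type, lower, upper):
        self._notify("bind", type, lower, upper)

    def unbind(self, type, lower, upper):
        self._notify("unbind", type, lower, upper)


events = []
service = TopologyService()
service.subscribe("foo", 0, 500, lambda *event: events.append(event))

service.bind("foo", 0, 99)
service.bind("foo", 100, 199)
service.bind("bar", 0, 999)       # different type, so not reported
service.unbind("foo", 100, 199)

print(events)
```

The subscriber sees both `foo` binds and the later unbind, but nothing for type `bar`.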
Network Topology Subscription
Node/Cluster/Zone availability events
Same mechanism as for function events
Diagram:
  Node <1.1.1>, Client Process: subscribe(type = node, lower = 0x1001000, upper = 0x1001009)
  Node <1.1.2>, TIPC: bind(type = node, lower=0x1001002, upper=0x1001002)
  Node <1.1.3>, TIPC: bind(type = node, lower=0x1001003, upper=0x1001003)
  Events: node,0x1001002 and node,0x1001003
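The hexadecimal values in the diagram are node addresses of the form <zone.cluster.node> packed into 32 bits. The sketch below assumes the 8/12/12-bit layout used by the address helpers in the Linux `tipc.h` header; the slides themselves only show the resulting numbers, and the Python function names are chosen for this example.

```python
ZONE_SHIFT = 24     # zone number in the top 8 bits (assumed layout)
CLUSTER_SHIFT = 12  # cluster number in the next 12 bits, node in the low 12 bits


def tipc_addr(zone, cluster, node):
    """Pack <zone.cluster.node> into a 32-bit network address."""
    return (zone << ZONE_SHIFT) | (cluster << CLUSTER_SHIFT) | node


def split_addr(addr):
    """Unpack a 32-bit network address into (zone, cluster, node)."""
    return (addr >> ZONE_SHIFT) & 0xFF, (addr >> CLUSTER_SHIFT) & 0xFFF, addr & 0xFFF


node_3 = tipc_addr(1, 1, 3)
print(hex(node_3))             # 0x1001003
print(split_addr(0x1001002))   # (1, 1, 2)

# The subscription range 0x1001000..0x1001009 covers node numbers 0..9 of cluster <1.1>.
watched = [split_addr(a) for a in range(0x1001000, 0x1001009 + 1)]
```

Under this layout, a subscription over a contiguous address range selects a contiguous block of node numbers within one cluster, which is why the same range-based mechanism serves both function and node events.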
ForCES Applied on TIPC
Diagram labels:
  Network Equipment
  Control Element
  OSPF, RIP
  COPS, CLI, SNMP
  Other Applications
  ForCES Protocol/TIPC
  Forwarding Element
  LFB <IPv4F,5>, LFB <CNT,17>, LFB <IPv4F,1>, LFB <CNT,32>
ForCES Applied on TIPC
Diagram labels:
  Network Equipment
  Control Element (several)
  OSPF, RIP
  COPS, CLI, SNMP
  Other Applications
  Internet
  ForCES Protocol/TIPC
  Forwarding Element (several)
  LFB <IPv4F,5>, LFB <CNT,17>, LFB <IPv4F,1>, LFB <CNT,32>
CONNECTIONS
Establishment based on functional addressing
  Selectable lookup algorithm, partitioning, redundancy etc
No protocol messages exchanged during setup/shutdown
  Only payload carrying messages
  Traditional TCP-style connection setup/shutdown as alternative
End-to-end flow control
SOCK_SEQPACKET
SOCK_STREAM
SOCK_RDM for connectionless and multicast
SOCK_DGRAM can easily be added if needed
  Same with “Unreliable SOCK_SEQPACKET”
CONNECTIONS
No protocol messages exchanged during setup/shutdown
  Only payload carrying messages
Diagram, step 1:
  Client Process: sendto(type = foo, instance = 117)
  Message: foo,117
  Server Process, Partition B
Diagram, step 2:
  Server Process, Partition B: connect(client), send()
Diagram, step 3:
  Client Process: connect(server)
CONNECTIONS
Immediate “abortion” event in case of peer process crash
Immediate “abortion” event in case of peer node crash
Immediate “abortion” event in case of communication failure
Immediate “abortion” event in case of node overload
Diagram labels: Server Process, Partition B; Client Process; Node <1.1.5>; Node <1.1.3>; abort
Network Redundancy
Retransmission protocol and congestion control at signalling link level
Normally two links per node pair, for full load sharing and redundancy
Smooth failover in case of single link failure, with no consequences for user level connections
Diagram labels: Server Process, Partition B; Client Process; Node <1.1.5>; Node <1.1.3>
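Load sharing over two links with failover to the survivor can be sketched as follows. This is a deliberately simplified toy: the class, the link names and the rotation policy are assumptions for illustration, and it does not model retransmission or the handling of packets already in flight when a link fails.

```python
class NodePairLinks:
    """Toy model of the links between one pair of nodes."""

    def __init__(self, names):
        self.up = {name: True for name in names}
        self._next = 0

    def fail(self, name):
        self.up[name] = False

    def pick(self):
        """Choose the next working link, rotating to share load across links."""
        names = list(self.up)
        for offset in range(len(names)):
            name = names[(self._next + offset) % len(names)]
            if self.up[name]:
                self._next = (self._next + offset + 1) % len(names)
                return name
        raise ConnectionError("no working link to peer node")


links = NodePairLinks(["link-A", "link-B"])   # hypothetical link names
before = [links.pick() for _ in range(4)]
links.fail("link-A")
after = [links.pick() for _ in range(3)]

print(before)
print(after)
```

While both links are up, traffic alternates between them; after `link-A` fails, every message goes over `link-B` and the caller never sees an error, which mirrors the "no consequences for user level connections" point. Only when both links are down does `pick()` raise.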
Remaining Work
Implementation
  Reliable Multicast not fully implemented yet (expected end of Q1)
  Re-stabilization after most recent changes
  Re-implementation of multi-cluster neighbour detection and link setup
Protocol
  Fully manual inter cluster link setup
  Guaranteeing Name Table consistency between clusters
  Slave node Name Table reduction
  ?????
http://tipc.sourceforge.net
QUESTIONS ??