2006-09-29 · Emin Gabrielyan, Three Topics in Parallel Communications

Three Topics in Parallel Communications
Thesis presentation by Emin Gabrielyan
Parallel communications: bandwidth enhancement or fault-tolerance?

We do not know whether parallel communications were first used for fault-tolerance or for bandwidth enhancement
In 1964 Paul Baran proposed parallel communications for fault-tolerance (inspiring the design of ARPANET and the Internet)
In 1981 IBM introduced the 8-bit parallel port for faster communication
Bandwidth enhancement by parallelizing the sources and sinks

Bandwidth enhancement can be achieved by adding parallel paths
But a greater capacity enhancement is achieved if we can replace the senders and destinations with parallel sources and sinks
This is possible in parallel I/O (first topic of the thesis)
Parallel transmissions in coarse-grained networks cause congestion

In coarse-grained circuit-switched HPC networks, uncoordinated parallel transmissions cause congestion
The overall throughput degrades due to access conflicts on shared resources
Coordination of parallel transmissions is covered by the second topic of my thesis (liquid scheduling)
Classical backup parallel circuits for fault-tolerance

Typically the redundant resource remains idle
As soon as there is a failure of the primary resource
The backup resource replaces the primary one
Parallelism in living organisms

Parallelism is observed in almost every living organism
Duplication of organs primarily serves for fault-tolerance
And, as a secondary purpose, for capacity enhancement
Simultaneous parallelism for fault-tolerance in fine-grained networks

A challenging bio-inspired solution is to use simultaneously all available paths for achieving fault-tolerance
This topic is addressed in the last part of my presentation (capillary routing)
Fine Granularity Parallel I/O for Cluster Computers
SFIO, a Striped File parallel I/O
Why is parallel I/O required?

A single I/O gateway for a cluster computer saturates
It does not scale with the size of the cluster
What is Parallel I/O for Cluster Computers?
Some or all of the cluster computers can be used for parallel I/O
Objectives of parallel I/O

Resistance to concurrent access
Scalability as the number of I/O nodes increases
High level of parallelism and load balance for all application patterns and all types of I/O requests
Parallel I/O Subsystem

Concurrent access by multiple compute nodes
No concurrent-access overheads
No performance degradation when the number of compute nodes increases
Scalable throughput of the parallel I/O subsystem

The overall parallel I/O throughput should increase linearly as the number of I/O nodes increases
[Chart: throughput of the parallel I/O subsystem vs. number of I/O nodes]
Concurrency and Scalability = Scalable All-to-All Communication

Concurrency and scalability (as the number of I/O nodes increases) can be represented by a scalable overall throughput when the number of compute and I/O nodes increases
[Chart: all-to-all throughput vs. number of I/O and compute nodes]
High level of parallelism and load balance

Balanced distribution across the parallel disks must be ensured:
For all types of application patterns
Using small or large I/O requests
Continuous or fragmented I/O request patterns
How is parallelism achieved?
Split the logical file into stripes
Distribute the stripes cyclically across the subfiles
[Figure: a logical file split into stripes and distributed cyclically across subfiles file1…file6]
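The cyclic distribution above can be sketched in a few lines; a minimal illustration (not the SFIO implementation), assuming a fixed stripe unit size:

```c
/* Illustrative cyclic striping (not the SFIO implementation): for a
   byte offset in the logical file, compute the target subfile and the
   offset inside that subfile, given the stripe unit size `stripe`
   (bytes) and the number of subfiles `nsub`. */
int subfile_of(long offset, long stripe, int nsub)
{
    return (int)((offset / stripe) % nsub);
}

long subfile_offset(long offset, long stripe, int nsub)
{
    long stripe_no = offset / stripe;     /* global stripe index         */
    long local_no  = stripe_no / nsub;    /* stripe index in the subfile */
    return local_no * stripe + offset % stripe;
}
```

For example, with a 5-byte stripe unit and 2 subfiles, byte 13 of the logical file falls into global stripe 2, which is the second stripe of subfile 0, so it lands at local offset 8.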
The POSIX-like Interface of Striped File I/O

Using SFIO from MPI
Simple POSIX-like interface
#include <mpi.h>
#include "/usr/local/sfio/mio.h"

int main(int argc, char *argv[])
{
   MFILE *f;
   int rank;
   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &rank);

   // Collective open operation (stripe unit size: 5 bytes)
   f = mopen("p1/tmp/a.dat;p2/tmp/a.dat;", 5);

   // Each process writes 8 to 14 characters at its own position
   if (rank == 0) mwritec(f, 0,  "Good*morning!",  13);
   if (rank == 1) mwritec(f, 13, "Bonjour!",        8);
   if (rank == 2) mwritec(f, 21, "Buona*mattina!", 14);

   // Collective close operation
   mclose(f);
   MPI_Finalize();
   return 0;
}
Distribution of the global file data across the subfiles

Example with three compute nodes and two I/O nodes
Global file: "Good*morning!Bonjour!Buona*mattina!" (written at offsets 0, 13 and 21)
With a 5-byte stripe unit the stripes are distributed cyclically:
First subfile:  "Good*" "ng!Bo" "!Buon" "tina!"
Second subfile: "morni" "njour" "a*mat"
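The distribution can be checked mechanically with a short sketch (the 5-byte stripe unit is taken from the mopen call above; the helper name and buffer sizes are illustrative, not SFIO's):

```c
#include <string.h>

/* Stripe `len` bytes of `buf` across `nsub` subfile buffers using a
   cyclic distribution with `stripe`-byte units. Illustrative helper,
   not taken from SFIO; out buffers must be zeroed by the caller. */
void stripe_out(const char *buf, long len, long stripe,
                int nsub, char out[][64])
{
    long pos[16] = {0};                       /* write position per subfile */
    for (long off = 0; off < len; off += stripe) {
        long n = (len - off < stripe) ? len - off : stripe;
        int  s = (int)((off / stripe) % nsub);
        memcpy(out[s] + pos[s], buf + off, n);
        pos[s] += n;
        out[s][pos[s]] = '\0';
    }
}
```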
Impact of the stripe unit size on the load balance

When the stripe unit size is large there is no guarantee that an I/O request will be well parallelized
[Figure: a coarsely striped logical file; an I/O request touching only some of the subfiles]
Fine granularity striping with good load balance

Low granularity ensures good load balance and a high level of parallelism
But it results in high network communication and disk access costs
[Figure: a finely striped logical file; an I/O request spread over all subfiles]
Fine granularity striping is to be maintained

Most HPC parallel I/O solutions are optimized only for large I/O blocks (on the order of megabytes)
But we focus on maintaining fine granularity
The problems of network communication and disk access are addressed by dedicated optimizations
Overview of the implemented optimizations

Disk access request aggregation (sorting, cleaning overlaps and merging)
Network communication aggregation
Zero-copy streaming between the network and fragmented memory patterns (MPI derived datatypes)
Support of the multi-block interface efficiently optimizes application-related file and memory fragmentation (MPI-I/O)
Overlapping of network communication with disk access in time (at the moment, write operations only)
Multi-block I/O request

Disk access optimizations: sorting, cleaning the overlaps, merging
Input: striped user I/O requests
Output: optimized set of I/O requests
No data copy
[Figure: blocks 1–3 of a multi-block request mapped onto a local subfile; 6 I/O access requests are merged into 2]
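The aggregation step (sorting by offset, then merging overlapping or adjacent requests) can be sketched as follows; the types and names are illustrative, not SFIO's:

```c
#include <stdlib.h>

typedef struct { long off, len; } req_t;      /* one disk access request */

int cmp_off(const void *a, const void *b)
{
    long d = ((const req_t *)a)->off - ((const req_t *)b)->off;
    return (d > 0) - (d < 0);
}

/* Sort the requests by offset, then merge overlapping or adjacent
   ones in place. Returns the reduced number of requests. */
int merge_requests(req_t *r, int n)
{
    if (n == 0) return 0;
    qsort(r, n, sizeof *r, cmp_off);
    int m = 0;                                /* index of last merged request */
    for (int i = 1; i < n; i++) {
        if (r[i].off <= r[m].off + r[m].len) {     /* overlap or adjacency */
            long end = r[i].off + r[i].len;
            if (end > r[m].off + r[m].len)
                r[m].len = end - r[m].off;
        } else {
            r[++m] = r[i];                    /* start a new merged request */
        }
    }
    return m + 1;
}
```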
Network Communication Aggregation without Copying

Striping across 2 subfiles
Derived datatypes created on the fly
Contiguous streaming
[Figure: from the application memory (logical file) to remote I/O nodes 1 and 2]
Functional Architecture

SFIO library on the compute node
Blue: interface functions
Green: striping functionality
Red: I/O request optimizations
Orange: network communication and the relevant optimizations
bkmerge: overlapping and aggregation
mkbset: creates MPI derived datatypes on the fly
[Diagram: mread/mwrite; mreadc, mreadb, mwritec, mwriteb; mrw (cyclic distribution); sfp_rflush/sfp_wflush; sfp_readc/sfp_writec; sfp_rdwrc (request caching); flushcache, sortcache; sfp_read/sfp_write; sfp_readb/sfp_writeb; bkmerge; mkbset; sfp_waitall; commands SFP_CMD_READ, SFP_CMD_WRITE, SFP_CMD_BREAD, SFP_CMD_BWRITE sent over MPI to the I/O listener on the I/O node]
Optimized throughput as a function of the stripe unit size

3 I/O nodes
1 compute node
Global file size: 660 Mbytes
TNET: about 10 MB/s per disk
[Chart: write throughput (MB/s) vs. stripe unit size (bytes, 50 to 50000); non-optimized vs. optimized]
All-to-all stress test on the Swiss-Tx cluster supercomputer

The stress test is carried out on the Swiss-Tx machine
8 full-crossbar 12-port TNet switches
64 processors
Link throughput is about 86 MB/s
SFIO on the Swiss-Tx cluster supercomputer

MPI-FCI
Global file size: up to 32 GB
Mean of 53 measurements for each number of nodes
Nearly linear scaling with a 200-byte stripe unit!
Network is a bottleneck above 12 nodes
[Chart: overall all-to-all throughput (MB/s, 0 to 400) vs. number of compute and I/O nodes (1 to 31); write maximum, write average, read maximum, read average]
Liquid scheduling for low-latency circuit-switched networks

Reaching liquid throughput in HPC wormhole switching and in optical lightpath routing networks
Upper limit of the network capacity

Given a set of parallel transmissions and a routing scheme
The upper limit of the network's aggregate capacity is its liquid throughput
Distinction: Packet Switching versus Circuit Switching

Packet switching has been replacing circuit switching since 1970 (more flexible, manageable, scalable)
New circuit-switching networks are emerging (HPC clusters, optical switching)
In HPC, wormhole routing targets extremely low latency requirements
In optical networks, packet switching is not possible due to the lack of technology
Coarse-Grained Networks

In circuit switching, large messages are transmitted entirely (coarse-grained switching)
Low latency: the sink starts receiving the message as soon as the sender starts transmission
[Diagram: message source and sink; fine-grained packet switching vs. coarse-grained circuit switching]
Parallel transmissions in coarse-grained networks

When the nodes transmit in parallel across a coarse-grained network in an uncoordinated fashion, congestion may occur
The resulting throughput can be far below the expected liquid throughput
Congestions and blocked paths in wormhole routing

When a message encounters a busy outgoing port, it waits
The previous portion of the path remains occupied
[Diagram: Source1–Source3 and Sink1–Sink3; a waiting message holds the links it has already acquired]
Hardware solution in Virtual Cut-Through routing

In VCT, when the port is busy the switch buffers the entire message
Much more expensive hardware than in wormhole switching
[Diagram: Source1–Source3 and Sink1–Sink3; the switch buffers the blocked message]
Other hardware solutions

In optical networks OEO conversion can be used
Significant impact on the cost (vs. memory-less wormhole switches and MEMS optical switches)
Affects the properties of the network (e.g. latency)
Application level coordinated liquid scheduling

Liquid scheduling is a software solution
Implemented at the application level
No investments in network hardware
Coordination between the edge nodes is required
Knowledge of the network topology is assumed
Example of a simple traffic pattern

5 sending nodes (above)
5 receiving nodes (below)
2 switches
12 links of equal capacity
The traffic consists of 25 transfers
Round-robin schedule of the all-to-all traffic pattern

First, all nodes simultaneously send the message to the node in front
Then, simultaneously, to the next node
And so on
Throughput of the round-robin schedule

The 3rd and 4th phases each require two timeframes
7 timeframes are needed in total
Link throughput = 1 Gbps
Overall throughput = 25/7 × 1 Gbps = 3.57 Gbps
A liquid schedule and its throughput

6 timeframes of non-congesting transfers
Overall throughput = 25/6 × 1 Gbps = 4.16 Gbps
Problem of liquid scheduling

Building a liquid schedule for an arbitrary traffic of transfers
The problem of partitioning the traffic into a minimal number of subsets consisting of non-congesting transfers
Timeframe = a subset of non-congesting transfers
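The timeframe condition can be checked mechanically; a sketch in which transfers are represented as link bitmasks (an encoding chosen for this illustration, not the thesis data structures):

```c
/* A timeframe is valid when no link is used by two of its transfers.
   Each transfer is a bitmask of the links it occupies; frame[] lists
   the indices of the transfers placed in the timeframe. */
typedef unsigned long mask_t;

int frame_is_valid(const mask_t *transfers, const int *frame, int nframe)
{
    mask_t used = 0;                      /* links taken so far */
    for (int i = 0; i < nframe; i++) {
        if (transfers[frame[i]] & used)   /* congestion: a shared link */
            return 0;
        used |= transfers[frame[i]];
    }
    return 1;
}
```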
Definitions of our mathematical model

A transfer is the set of links lying on the path of the transmission
The load of a link is the number of transfers in the traffic using that link
The most loaded links are called bottlenecks
The duration of the traffic is the load of its bottlenecks
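These definitions translate directly into code; a small sketch with an illustrative encoding (integer link ids, concatenated link lists), not the thesis software:

```c
/* A transfer is a set of links; the load of a link is the number of
   transfers using it; the duration of the traffic is the load of its
   most loaded links (the bottlenecks). */
#define MAX_LINKS 64

int link_loads(const int *links, const int *tr_len,
               int ntransfers, int loads[MAX_LINKS])
{
    /* `links` holds the concatenated link lists of all transfers,
       `tr_len[i]` the number of links of transfer i.
       Fills per-link loads and returns the duration. */
    int duration = 0, p = 0;
    for (int l = 0; l < MAX_LINKS; l++) loads[l] = 0;
    for (int i = 0; i < ntransfers; i++)
        for (int j = 0; j < tr_len[i]; j++) {
            int l = links[p++];
            if (++loads[l] > duration) duration = loads[l];
        }
    return duration;
}
```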
Teams = non-congesting transfers using all bottleneck links

The shortest possible time to carry out the traffic is the active time of the bottleneck links
Hence the schedule must keep the bottleneck links busy all the time
Therefore the timeframes of a liquid schedule must consist of transfers using all bottlenecks
[Diagram: a team vs. not a team, with the bottleneck links marked]
Retrieval of teams without repetitions by subdivisions

Teams can be retrieved without repetitions by recursive partitioning
By the choice of a transfer, all teams are divided into teams using that transfer and teams not using it
Each half can be similarly subdivided until individual teams are retrieved
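The subdivision idea can be sketched as a recursive enumeration; here a "team" is a set of mutually non-congesting transfers covering all bottleneck links, and transfers are link bitmasks (an encoding assumed for this illustration):

```c
/* Recursive subdivision sketch: branch, for each transfer in turn, on
   whether it belongs to the team. Every team is reached exactly once,
   so teams are enumerated without repetition. */
typedef unsigned long mask_t;

int count_teams(const mask_t *tr, int n, int i,
                mask_t used, mask_t bottlenecks)
{
    if (i == n)                        /* every transfer decided */
        return (used & bottlenecks) == bottlenecks;
    int teams = 0;
    if ((tr[i] & used) == 0)           /* half 1: teams using transfer i */
        teams += count_teams(tr, n, i + 1, used | tr[i], bottlenecks);
    /* half 2: teams not using transfer i */
    teams += count_teams(tr, n, i + 1, used, bottlenecks);
    return teams;
}
```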
Teams use all bottlenecks: retrieving teams of the traffic skeleton

Since teams must use transfers that use the bottleneck links
We can first create teams using only such transfers (the traffic skeleton)
[Chart: fraction of transfers using bottlenecks (0%–100%) for 362 different traffic patterns across the Swiss-Tx cluster, by number of transfers (and number of contributing nodes), from 0 (00) to 900 (30)]
Optimization by first retrieving the teams of the skeleton

Speedup by the skeleton optimization
Reducing the search space 9.5 times
[Chart: search-space reduction (%) for 23 different traffic patterns across the Swiss-Tx cluster, by number of possible full teams (9.2K to 15.2M) and number of transfers (64 to 121); per-pattern speedup factors range from 4.7 to 20.0]
Liquid schedule assembly from retrieved teams

Relying on the efficient retrieval of full teams (subsets of non-congesting transfers using all bottlenecks)
We assemble a liquid schedule by trying different combinations of teams together
Until all transfers of the traffic are used
Liquid schedule assembly optimizations (reduced traffic)

Proved: if we remove a team from a traffic, new bottlenecks can emerge
New bottlenecks add additional constraints on the teams of the reduced traffic
Proved: a liquid schedule can be assembled using teams of the reduced traffic (instead of constructing teams of the initial traffic from the remaining transfers)
Proved: a liquid schedule can be assembled by considering only saturated full teams
Liquid schedule construction speed with our algorithm

360 traffic patterns across the Swiss-Tx network
Up to 32 nodes
Up to 1024 transfers
Comparison of our optimized construction algorithm with the MILP method (optimized for discrete optimization problems)
[Chart: CPU time in seconds (0.001 to 100000, log scale) over 362 sample topologies; MILP Cplex method vs. the liquid schedule construction algorithm]
Carrying real traffic patterns according to liquid schedules

The Swiss-Tx supercomputer cluster network is used for testing aggregate throughputs
Traffic patterns are carried out according to liquid schedules
Compared with topology-unaware round-robin or random schedules
Theoretical liquid and round-robin throughputs of 362 traffic samples

362 traffic samples across the Swiss-Tx network
Up to 32 nodes
Traffic carried out according to the round-robin schedule reaches only 1/2 of the potential network capacity
[Chart: overall throughput (MB/s, 0 to 1800) by number of transfers (and nodes), from 0 (00) to 900 (30); liquid throughput vs. round-robin schedule]
Throughput of traffic carried out according to liquid schedules

Traffic carried out according to a liquid schedule practically reaches the theoretical throughput
[Chart: overall throughput (MB/s, 200 to 1800) by number of transfers (and nodes), from 1 (01) to 961 (31); theoretical liquid throughput, measured throughput of a topology-unaware schedule, measured throughput of a liquid schedule]
Liquid scheduling conclusions: application, optimization, speedup

In HPC networks, large messages are "copied" across the network, causing congestion
Arbitrarily transmitted transfers yield a throughput below the theoretical capacity
Liquid scheduling relies on the network topology and reaches the theoretical liquid throughput of the network
Liquid schedules can be constructed in less than 0.1 sec for traffic patterns with 1000 transmissions (about 100 nodes)
Future work: dynamic traffic patterns and application in OBS
Fault-tolerant streaming with capillary routing

Path diversity and Forward Error Correction codes at the packet level
Structure of my talk

The advantages of packet-level FEC in off-line streaming
Solving the difficulties of real-time streaming by multi-path routing
Generating multi-path routing patterns of various path diversity
Level of path diversity and the efficiency of the routing pattern for real-time streaming
Decoding a file with Digital Fountain Codes

A file is divided into packets
A digital fountain code generates numerous checksum packets
A sufficient quantity of any checksum packets recovers the file
Like filling your cup: only collecting a sufficient amount of drops matters
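The "any sufficient subset" property can be seen already in the smallest case; a toy sketch (my own illustration, not the codes used in the thesis) with two one-byte source packets:

```c
#include <stdint.h>

/* Toy illustration of the fountain principle for two source packets a
   and b (one byte each): the sender emits checksum packets a, b and
   a^b; ANY two of the three suffice to recover both sources. Real
   digital fountain codes (e.g. LT codes) scale this idea to thousands
   of packets; this is only the 2-packet intuition. */
void recover(uint8_t c1, int id1, uint8_t c2, int id2,
             uint8_t *a, uint8_t *b)
{
    if (id1 > id2) {                      /* order the packets by id */
        uint8_t tc = c1; c1 = c2; c2 = tc;
        int     ti = id1; id1 = id2; id2 = ti;
    }
    if (id1 == 0 && id2 == 1)      { *a = c1; *b = c2;      }   /* got a, b   */
    else if (id1 == 0 && id2 == 2) { *a = c1; *b = c1 ^ c2; }   /* got a, a^b */
    else                           { *b = c1; *a = c1 ^ c2; }   /* got b, a^b */
}
```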
Transmitting large files without feedback across lossy networks using digital fountain codes

The sender transmits the checksum packets instead of the source packets
Interruptions cause no problems
The file is recovered once a sufficient number of packets is delivered
FEC in off-line streaming relies on time stretching
In real-time streaming the receiver's playback buffering time is limited

While in off-line streaming the data can be held in the receiver buffer…
In real-time streaming the receiver is not permitted to keep data too long in the playback buffer
Long failures on a single-path route

If the failures are short, by transmitting a large number of FEC packets the receiver may constantly have in time a sufficient number of checksum packets
If a failure lasts longer than the playback buffering limit, no FEC can protect the real-time communication
Applicability of FEC in real-time streaming by using path diversity

Losses can be recovered by extra packets:
received later (in off-line streaming)
received via another path (in real-time streaming)
Path diversity replaces time stretching
[Diagram: reliable off-line streaming via time stretching vs. reliable real-time streaming via path diversity, within the playback buffer limit]
Creating an axis of multi-path patterns

Intuitively we imagine the path diversity axis as shown
High diversity decreases the impact of individual link failures, but uses many more links, increasing the overall failure probability
We must study many multi-path routing patterns of different diversity in order to answer this question
[Diagram: the path diversity axis, from single-path routing to increasingly multi-path routing]
Capillary routing creates solutions with different levels of path diversity

As the method for obtaining multi-path routing patterns of various path diversity, we rely on the capillary routing algorithm
For any given network and pair of nodes, capillary routing produces, layer by layer, routing patterns of increasing path diversity
Path diversity = layer of capillary routing
Capillary routing: introduction

Capillary routing first offers a simple multi-path routing pattern
At each successive layer it recursively spreads out individual sub-flows of the previous layers
The path diversity develops as the layer number increases
The construction relies on LP (linear programming)
Capillary routing: first layer

First take the shortest-path flow and minimize the maximal load of all links
This splits the flow over a few parallel routes
Capillary routing: second layer

Then identify the bottleneck links of the first layer
And minimize the load of the remaining links
Continue similarly, until the full routing pattern is discovered layer by layer
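The per-layer minimization can be written as a small linear program; here the first layer, with notation assumed for this sketch ($x_l$ is the flow on link $l$, $M$ the maximal link load, unit flow from source to sink):

```latex
\begin{aligned}
\min\;& M\\
\text{s.t.}\;& \sum_{l\in\mathrm{out}(v)} x_l \;-\; \sum_{l\in\mathrm{in}(v)} x_l \;=\;
  \begin{cases}
    1 & v=\text{source}\\
    -1 & v=\text{sink}\\
    0 & \text{otherwise}
  \end{cases}\\
& 0 \le x_l \le M \quad \text{for every link } l
\end{aligned}
```

Subsequent layers fix the bottleneck flows found so far and repeat the same minimization over the remaining links.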
Capillary Routing Layers

[Figure: a single network and 4 routing patterns of increasing path diversity]
Application model: evaluating the efficiency of path diversity

To evaluate the efficiencies of patterns with different path diversities we rely on an application model where:
The sender uses a constant amount of FEC checksum packets to combat weak losses, and
The sender dynamically increases the number of FEC packets in case of serious failures
[Diagram: an FEC block consisting of source packets and redundant packets]
Strong FEC codes are used in case of serious failures

When the packet loss rate observed at the receiver is below the tolerable limit, the sender transmits at its usual rate
But when the packet loss rate exceeds the tolerable limit, the sender adaptively increases the FEC block size by adding more redundant packets
[Diagram: two cases, packet loss rate = 3% and packet loss rate = 30%]
Redundancy Overall Requirement (ROR)

The overall amount of dynamically transmitted redundant packets during the whole communication time is proportional:
to the duration of the communication and the usual transmission rate
to the single-link failure frequency and its average duration
and to a coefficient characterizing the given multi-path routing pattern
Equation for ROR: it depends only on the routing pattern r(l)

$$\mathrm{ROR} \;=\; \sum_{\,l\in L \;|\; r(l)>t\,} \left( \frac{FEC_{r(l)}}{FEC_t} - 1 \right)$$

Where: $FEC_{r(l)}$ is the FEC transmission block size in case of the complete failure of link $l$
$r(l)$ is the load of link $l$ for the given routing pattern
$FEC_t$ is the FEC block size at default streaming (tolerating loss rate $t$)
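As a numeric illustration of the ROR definition, a sketch assuming an idealized FEC model; the thesis leaves the block-size function code-dependent, so the model $FEC_p = k/(1-p)$ per source packet used below is purely an assumption of this example:

```c
/* Illustrative ROR computation. r[l] is the load of link l (the
   fraction of the flow lost if link l fails completely); t is the
   loss rate tolerated by the default stream. We ASSUME an idealized
   code where the block size needed to tolerate a loss rate p is
   FEC_p = k / (1 - p) per source packet; the real FEC_p function is
   code-dependent and not specified here. */
double ror(const double *r, int nlinks, double t)
{
    double fec_t = 1.0 / (1.0 - t);    /* default block size per source packet */
    double sum = 0.0;
    for (int l = 0; l < nlinks; l++)
        if (r[l] > t)                  /* default FEC already covers r[l] <= t */
            sum += (1.0 / (1.0 - r[l])) / fec_t - 1.0;
    return sum;
}
```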
ROR coefficient

The smaller the ROR coefficient of a multi-path routing pattern, the better the choice of that pattern for real-time streaming
By measuring the ROR coefficient of multi-path routing patterns of different path diversity, we can evaluate the advantages (or disadvantages) of diversification
Multi-path routing patterns of different diversity are created by the capillary routing algorithm
ROR as a function of diversity

Here is ROR as a function of the capillarization level
It is an average over 25 different network samples (obtained from MANET)
Curves are shown for static streaming tolerances from 3.3% to 7.5% (for example 4.5% and 5.1%)
[Chart: average ROR rating (0 to 60) vs. capillary routing layer (layer 1 to layer 10); one curve per tolerance: 3.3%, 3.9%, 4.5%, 5.1%, 6.3%, 7.5%]
ROR rating over 200 network samples

ROR coefficients for 200 network samples
Each section is the average over 25 network samples
Network samples are obtained from a random-walk MANET
Path diversity obtained by capillary routing reduces the overall amount of FEC packets
[Chart: average ROR rating (0 to 60) for eight sets (Set1–Set8) of 25 network samples, capillary routing layers 1–10, tolerances 3.3% to 7.5%]
Conclusions

Although strong path diversity increases the overall failure rate, it is beneficial for real-time streaming (except in a few pathological cases)
Capillary routing patterns reduce the overall number of redundant packets required from the sender
In single-path real-time streaming, application of FEC at the packet level is almost useless
With multi-path routing patterns, real-time applications can benefit greatly from the application of FEC
Future work: using an overlay network to achieve a multi-path communication flow
Considering coding also inside the network, not only at the edges; aiming also at energy saving in MANET
Thank you!

Presented topics:
Fine-grained parallel I/O for cluster computers
Liquid scheduling of parallel transmissions in coarse-grained networks
Capillary routing: fault-tolerance in fine-grained networks