Towards an Application-Aware Multicast Communication Framework for Computational Grids


M. MAIMOUR, C. PHAM

RESO/LIP, UCB Lyon

ASIAN'02, Hanoi

Dec 5th, 2002

Computational grids

[Figure labels: application, user]

from Dorian Arnold: Netsolve Happenings

The current usage of grids

Mostly

– Database accesses, sharing, replications (DataGrid, Encyclopedia of Life Project…)

– Distributed Data Mining (seti@home…)

– Data and code transfer, massively parallel job submissions (task-farm computing)

Few

– Distributed applications (MPI…)

– Interactive applications (DIS, HLA…), remote visualization

WHY?

– End-to-end performance is not there yet

– Not scalable!

– Unable to adapt to new technologies and uses

WHY??

– People forgot the networking side of grids

– Gbits/s links do not mean E2E performance!

– Computing resources and network resources are logically separated

Visions for a grid

FROM DUMB LINKS CONNECTING COMPUTING RESOURCES

TO COLLABORATIVE RESOURCES

The network can work together with the applications to provide in-network processing functions

Application-Aware Infrastructure on Grids

[Figure: a Gbits/s core network of active routers, each with an application-aware component, connecting the source (100 Base TX), an Internet Data Center, computing centers, a campus/corporate network and a lab cluster]

Application-Aware Components (AAC)

– Based on programmable active nodes/routers

– Customized computations on packets

– Standardized execution environment and programming interface

[Figure labels: Data, active code A1, active code A2]

Interoperability with legacy routers

[Figure: protocol stacks along a path. Labels: APPLI, AL, TCP/UDP, IP; legacy routers perform traditional IP routing, similar to tunnelling]

Deploying new services

– Collective/gather operations

– Interest management, filtering (DIS, HLA)

– On-the-fly flow adaptation (compression, layering…) for remote displays

– Intelligent directory services

– Distributed, hierarchical security system

– Distributed Logistical Storage

– Custom QoS policy

Ex: Collective operations

max computation

[Figure: a tree of AACs, each running a MAX service that applies "if x<a then x=a" to the values passing through it]
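The max computation above can be sketched as follows. This is a minimal illustration, not the authors' active code: the class `MaxAggregator`, the helper `aggregate` and the two-level tree of AACs are hypothetical names and assumptions made for the example. Only the update rule "if x<a then x=a" comes from the slide.

```python
from typing import Iterable, Optional


class MaxAggregator:
    """Hypothetical MAX service hosted by an AAC: keeps the largest value seen."""

    def __init__(self) -> None:
        self.x: Optional[float] = None  # no value seen yet

    def on_value(self, a: float) -> None:
        # Rule from the slide: if x < a then x = a
        if self.x is None or self.x < a:
            self.x = a


def aggregate(values: Iterable[float]) -> Optional[float]:
    """Run one aggregator over the values arriving at a single AAC."""
    agg = MaxAggregator()
    for a in values:
        agg.on_value(a)
    return agg.x


# Assumed topology: two leaf AACs reduce their local values,
# then an upstream AAC reduces the two partial maxima.
left = aggregate([3, 9, 4])
right = aggregate([7, 2])
root = aggregate([left, right])
print(root)
```

Each AAC forwards a single value upstream, so the destination receives one partial result per subtree instead of one value per host.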

Ex: Wide-area interactive simulations

[Figure: a flight simulator (human in the loop), a remote display, a flight traffic generator and an airport simulator connected through the INTERNET/GRID; AACs apply flow adaptation and specific filters, e.g. an "only very close events" filter]

Deploying reliable multipoint data distribution services

For

– Database accesses, sharing, replications

– Data and code transfer, massively parallel job submissions (task-farm computing)

– Distributed applications (MPI…)

– Interactive applications (DIS, HLA…)

Desired features

– scalable

– low latencies


[Figure: without multicast, the sender emits a separate copy of the data for each receiver]

[Figure: IP multicast, the sender emits a single copy of the data that reaches every receiver]
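The difference between the two delivery modes (without multicast, IP multicast) can be illustrated by counting link transmissions. The sketch below is an assumption-laden toy model: the topology (sender S, one router R, receivers A, B, C) and the function names are hypothetical and do not come from the slides.

```python
from typing import List, Tuple

Link = Tuple[str, str]


def unicast_transmissions(paths: List[List[Link]]) -> int:
    """Without multicast: each receiver gets its own copy on every link of its path."""
    return sum(len(path) for path in paths)


def multicast_transmissions(paths: List[List[Link]]) -> int:
    """With multicast: each distinct link of the distribution tree carries one copy."""
    return len({link for path in paths for link in path})


# Assumed topology: S -> R, then R -> A, R -> B, R -> C.
paths = [
    [("S", "R"), ("R", "A")],
    [("S", "R"), ("R", "B")],
    [("S", "R"), ("R", "C")],
]
print(unicast_transmissions(paths), multicast_transmissions(paths))
```

In this toy tree the shared link S -> R carries three copies without multicast and one copy with it.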

DyRAM

Protocol with modular services for achieving reliability, scalability and low latencies

– Global NACK suppression

– Early Packet Loss Detection

– Local Recoveries

– Dynamic Replier Election

– Accurate Congestion Control

– Subcast of repair packets

Ex: Global NACKs suppression

[Figure: several receivers send NACK4 for the lost packet data4; the router merges them]

only one NACK is forwarded to the source
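A minimal sketch of the suppression idea, assuming the router simply remembers which sequence numbers already have a NACK in flight. The class `NackSuppressor` and its methods are hypothetical names, and the state-clearing on repair is an assumption, not a detail given in the slides.

```python
class NackSuppressor:
    """Hypothetical router-side service: forward one NACK per lost packet."""

    def __init__(self) -> None:
        self.pending = set()  # sequence numbers already NACKed upstream

    def on_nack(self, seq: int) -> bool:
        """Return True if this NACK must be forwarded towards the source."""
        if seq in self.pending:
            return False  # duplicate NACK: suppressed
        self.pending.add(seq)
        return True

    def on_repair(self, seq: int) -> None:
        """Assumption: state is cleared when the repair packet passes through."""
        self.pending.discard(seq)


router = NackSuppressor()
decisions = [router.on_nack(4) for _ in range(5)]  # five receivers send NACK4
print(decisions)
```

With five NACK4 messages arriving, only the first one is forwarded; the other four stop at the router.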

Ex: Early lost packet detection

The repair latency can be reduced if the lost packet could be requested as soon as possible

[Figure labels: data3, data4, data5, NACK4]

A NACK is sent by the router

These NACKs are ignored!
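The early detection idea can be sketched as a gap check on sequence numbers at the router. This is an illustrative sketch under assumptions: the class `EarlyLossDetector`, its method names and the rule for ignoring later receiver NACKs are hypothetical, and timers are left out entirely.

```python
from typing import List, Optional


class EarlyLossDetector:
    """Hypothetical router-side service: NACK a lost packet as soon as a gap is seen."""

    def __init__(self) -> None:
        self.expected: Optional[int] = None  # next sequence number expected
        self.requested = set()               # losses already NACKed by the router

    def on_data(self, seq: int) -> List[int]:
        """Return the sequence numbers the router NACKs itself for this arrival."""
        missing: List[int] = []
        if self.expected is not None and seq > self.expected:
            missing = list(range(self.expected, seq))  # the gap
            self.requested.update(missing)
        if self.expected is None or seq >= self.expected:
            self.expected = seq + 1
        return missing

    def on_receiver_nack(self, seq: int) -> bool:
        """Return True to forward the NACK, False to ignore it (already requested)."""
        return seq not in self.requested


router = EarlyLossDetector()
router.on_data(3)
print(router.on_data(5))            # gap detected: data4 is requested by the router
print(router.on_receiver_nack(4))   # receivers' NACK4 arrive later and are ignored
```

The router sees data5 right after data3, so it requests data4 before any receiver has had time to notice the loss.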

[Figure: the application-aware infrastructure (Gbits/s core network, active routers with application-aware components, source, Internet Data Center, computing centers, campus/corporate network), annotated with the notes below]

The AAC associated with the source can perform early processing on packets. For instance, the DyRAM protocol uses subcast and loss detection services in order to reduce the end-to-end latency.

In DyRAM, any receiver can be designated as a replier for a lost packet. The election service is performed by the upstream AAC on a per-packet basis. Having dynamic repliers allows for more scalability, as caching within routers is not required.

An AAC associated with a tail link performs NACK aggregation, subcasting and the per-packet election of the replier.
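A per-packet replier election could look like the sketch below. The slides do not state the selection rule, so everything here is an assumption made for illustration: the function `elect_replier`, the idea of choosing among downstream links that did not NACK the packet, and the least-loaded tie-break are all hypothetical.

```python
from typing import Dict, Iterable, Optional, Set


def elect_replier(links: Iterable[str], nacked: Set[str],
                  load: Dict[str, int]) -> Optional[str]:
    """Pick a downstream link to serve the repair for one lost packet.

    Assumption: a link that did not NACK the packet leads to a receiver
    that holds it. Among those, prefer the link used least often so far.
    """
    candidates = [link for link in links if link not in nacked]
    if not candidates:
        return None  # nobody downstream has the packet: ask upstream instead
    return min(candidates, key=lambda link: (load.get(link, 0), link))


# Example: link r2 reported the loss; r3 has served fewer repairs than r1.
replier = elect_replier(["r1", "r2", "r3"], {"r2"}, {"r1": 2, "r3": 0})
print(replier)
```

Because the choice is made again for every lost packet, no receiver is a permanent repair server and the router keeps no packet cache.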


Local recovery & replier election

Local recoveries reduce the end-to-end delay (especially for high loss rates and a large number of receivers).

[Plot parameters: #grp: 6…24, 4 receivers/group, p=0.25]

Local recovery & replier election

As the group size increases, doing the recoveries from the receivers greatly reduces the bandwidth consumption

[Plot parameters: 48 receivers distributed in g groups, #grp: 2…24]

Early Packet Loss Service

EPLD is very beneficial to DyRAM

[Plot parameters: #grp: 6…24, 4 receivers/group, p=0.25]

DyRAM implementation

– TAMANOIR active execution environment

– Java 1.3.1 and a Linux kernel 2.4

– A set of PC receivers and 2 PC-based routers (Pentium II 400 MHz, 512 KB cache, 128 MB RAM)

– Data packets are 4 KBytes

[Figure: testbed configuration]

The data path

[Figure: an ANEP packet is carried as IP | UDP | S,@IP | data. Labels: Tamanoir port, FTP port, S, S1, TAMANOIR, FTP]

Cost of Data Packet Services

[Figure labels: ike, resama, resamo, resamdstan]

– NACK: 135μs

– DP: 20μs if no seq gap, 12ms-17ms otherwise (only 256μs without timer setting)

– Repair: 123μs

Cost of Replier Election

[Figure labels: ike, resamo, NACK]

The election is performed on-the-fly.

It depends on the number of downstream links.

Costs range from 0.1 to 1ms for 5 to 25 links per router.

Conclusions (1)

– Grids can be more than end-host computing resources interconnected with network links

– High-bandwidth links are not enough to provide E2E performance for distributed, interactive applications

– Application-aware components can be deployed to host high-value services

– In-network processing functions can make grids more responsive to applications' needs

Conclusions (2)

– The paper shows how an efficient multipoint service can be deployed on an application-aware infrastructure

– Simulations and experiments show that low latencies can be obtained with the combination and collaboration of light and simple services
