P2P-VoD on Internet: Fault Tolerance and Control Architecture
Rodrigo Godoi
CAOS - Computer Architecture & Operating Systems Department, Universitat Autònoma de Barcelona
Barcelona, July 2009.
Advisor: Dr. Porfidio Hernández Budé
Contents
Introduction
Goal of the Thesis
Control Assessment
The Fault Tolerance Scheme
Simulation
Experimental Results
Conclusions
Future Work
Video on Demand - VoD
•Multimedia service
•Asynchronous requests
•Every client enjoys entire content
•Long sessions (> 60 min.)
VoD - requirements and constraints
Large-scale Video on Demand (LVoD). Clients: thousands, widely dispersed. Multimedia contents: huge catalogue.
Time limit on handling data
Quality of Service (QoS) Fault Tolerance
Multicast
Peer-to-Peer
Internet
Control Architecture
Scalability
Soft real-time
Multicast - implementations
IP Multicast:
•Source tree (e.g. PIM-DM)
•Shared tree (e.g. PIM-SM)
Application Layer Multicast - ALM (e.g. NICE, ALMI)
Overlay Multicast (e.g. OMNI)
Multicast - Patching
Patching: multicast technique for multimedia data delivery. A late-arriving client joins the ongoing multicast (base stream) and receives the portion it missed through a unicast patch stream.
[Figure: clients arriving at t = 0 and t = 6; shared multicast base stream plus unicast patch stream for the late arrival]
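As a concrete illustration (a minimal sketch, not code from the thesis; the function and the threshold parameter are assumptions mirroring the session threshold T used later by P2Cast):

```cpp
#include <iostream>

// Hypothetical sketch of patching admission (illustrative, not VoDSim code).
// A session's base stream started at 'sessionStart'; a client arrives at 'arrival'.
// 'threshold' is the maximum offset for which patching is still worthwhile.
struct Admission {
    bool joinExisting;   // join the running multicast base stream?
    double patchLength;  // duration of the unicast patch stream (minutes)
};

Admission admit(double sessionStart, double arrival, double threshold) {
    double offset = arrival - sessionStart;      // content the client has missed
    if (offset <= threshold)
        return {true, offset};                   // base stream + patch of 'offset' minutes
    return {false, 0.0};                         // too late: open a new multicast session
}

int main() {
    // Client arriving 6 minutes into a session, threshold 9 minutes (10% of a 90 min video)
    Admission a = admit(0.0, 6.0, 9.0);
    std::cout << (a.joinExisting ? "patch of " : "new session, ")
              << a.patchLength << " min\n";
}
```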
Peer-to-Peer
Free cooperation of equals in view of the performance of a common task.
•Takes advantage of resources (storage, cycles, content, human presence) available at the edges of the Internet.
•Usage: file sharing, distributed multimedia systems, high performance computing.
Synchronised usage of peers' resources: Collaboration Groups
Peer-to-Peer - classification
P2P taxonomy - location mechanism:
•Unstructured
  - Purely decentralised (Gnutella): peers only
  - Partially decentralised (FastTrack): supernodes
  - Hybrid (BitTorrent): peers + tracker
•Structured
  - Purely decentralised (Chord)
Overlay topology: Chain, Tree, Mesh
Internet environment
•Worldwide scale
•Heterogeneous environment
•Best-effort service
•Exponential growth rate
Organisation:
•Autonomous Systems (AS): collection of connected IP routing prefixes under the control of one or more network operators (ISPs, universities, companies)
•Network arranged by dimension and purpose (LAN, WAN, MAN)
•Modeled by complex network theory
Clustering coefficient: $C_i = \dfrac{2 E_i}{k_i (k_i - 1)}$
Average path length: $l \approx \dfrac{\ln(N)}{\ln(\langle k \rangle)}$
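As a quick worked example (the average degree is an illustrative assumption; $N$ matches the client population used later in the experiments):

$$l \approx \frac{\ln(5400)}{\ln(4)} \approx \frac{8.59}{1.39} \approx 6.2 \text{ hops} \qquad (N = 5400 \text{ peers}, \ \langle k \rangle = 4),$$

so messages are expected to traverse only a handful of hops even in large systems - the small-world effect invoked later in the Time cost analysis.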
Problem
•Frequent arrivals/departures
•Failures: network, server, peers
•Large-scale system
•Input rate fluctuation
•Source crash
•Start-up delay

VoD service must…
•respect deadlines
•provide low start-up delay
•make clever use of the buffer
•enforce low control overhead

Failures/errors treatment (cushion buffer, QoS) → Control Architecture + Fault Tolerance
Control relevance
•Performance improvement vs. control complexity
•Resources sharing: delivery architecture (P2P and Multicast)
•Heterogeneity: Internet, peers' capabilities, lifetimes
→ Control Architecture + Fault Tolerance
Fault Tolerance
Failure: system defect. Error: consequence of a failure.
•Failure handling: network redundancy, source redundancy
•Error handling: Forward Error Correction (FEC), Automatic Repeat Request (ARQ) - these do not solve the fault
State of the art

System       | Service       | P2P               | Multicast | Internet | Fault Tolerance | Control Assessment
DirectStream | VoD           | Unstructured tree | ALM       |          | x               |
CoopNet      | Streaming/VoD | Unstructured tree | ALM       | x        | x               |
P2VoD        | VoD           | Unstructured tree | ALM       | x        | x               |
DynaPeer     | VoD           | Unstructured tree | IP/ALM    |          | x               |
PPLive       | Streaming     | Unstructured mesh | ALM       | x        | x               |
P2Cast       | VoD           | Unstructured tree | ALM       |          | x               |
Pn2Pn        | VoD           | Unstructured mesh | IP        | x        | x               |
BitToS       | VoD           | Unstructured mesh | ALM       |          | x               |
Promise      | Streaming     | Structured mesh   | ALM       |          | x               |
GloVe        | VoD           | Unstructured tree | IP/ALM    | x        | x               | x
Contents
Introduction
Goal of the Thesis
Control Assessment
The Fault Tolerance Scheme
Simulation
Experimental Results
Conclusions
Future Work
Goal of the Thesis
To assess Control impact and propose a Fault Tolerance Scheme for P2P-VoD service on the Internet.
LVoD system: Control Architecture, Fault Tolerance, Multicast
Desired properties: scalability, flexibility, reliability, efficiency, low overhead, QoS
Contents
Introduction
Goal of the Thesis
Control Assessment
The Fault Tolerance Scheme
Simulation
Experimental Results
Conclusions
Future Work
System architecture

Service: VoD | P2P: unstructured tree/mesh | Multicast: IP/ALM | Internet: yes | Fault Tolerance: yes | Control Assessment: yes

[Figure: Internet Autonomous Systems with IP multicast zones; distributed video servers and distributed proxy servers; servers and clients overlay topologies; P2P collaborations among clients]
The Failure Management Process
Basis of Fault Tolerance Mechanisms
•Detection: income stream monitoring; heartbeat messages
•Recovery: centralised; subsequent queries
•Maintenance: network infrastructure; peer status
Load and Time metrics
Load cost: volume of control messages that flows through the system during failure management processes. Measures control overhead - congestion, bandwidth consumption.
Time cost: time consumed solving peer failures. Measures control efficiency - start-up delay, buffer usage.
Background: VoD service schemes
Gather different aspects of P2P-VoD services:
PCM/MCDB: IP Multicast (local level); Patching; mesh-based P2P; heartbeats / buffer monitoring; centralised recovery
P2Cast: ALM (AS level); Patching; tree-based P2P; heartbeats / buffer monitoring; recursive recovery
PCM/MCDB
PCM: Patch Collaboration Manager
MCDB: Multicast Channel Distributed Branching
[Figure: interaction of PCM, MCDB and the bypass mechanism]
Fault Tolerance - PCM/MCDB
•Centralised recovery
•IP Multicast tree rearrangement
Message types: detection, maintenance and recovery messages
[Figure: multicast channels Ch. M0, Ch. M1, Ch. M2 with MCDB and the three message flows]
P2Cast
•Clients are divided into sessions according to their arrival time in the system (session threshold parameter T)
•Best-fit algorithm: the peer with the greatest amount of available bandwidth is selected as parent (see the sketch below)
[Figure: VoD server with sessions 3 and 4; peers labelled by arrival time; multicast base stream and unicast patch streams; threshold T]
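A minimal sketch of the best-fit parent selection described above (illustrative; the peer record and field names are assumptions, not VoDSim code):

```cpp
#include <vector>
#include <cstdio>

// Hypothetical peer record (field names are illustrative assumptions).
struct Peer {
    int id;
    double availableBw;  // kb/s left after serving current children
};

// Best-fit selection: among session members able to serve the stream,
// pick the peer with the greatest available bandwidth as parent.
const Peer* selectParent(const std::vector<Peer>& session, double playRate) {
    const Peer* best = nullptr;
    for (const Peer& p : session) {
        if (p.availableBw >= playRate &&          // can sustain the stream
            (!best || p.availableBw > best->availableBw))
            best = &p;
    }
    return best;  // nullptr -> fall back to the VoD server
}

int main() {
    std::vector<Peer> session{{1, 1200.0}, {2, 3000.0}, {3, 1800.0}};
    if (const Peer* p = selectParent(session, 1500.0))
        std::printf("parent: peer %d\n", p->id);   // -> peer 2
}
```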
Fault Tolerance - P2Cast
•Failures of source peers (parents) provoke stream disruption for the peers served by them
•Recovery through subsequent queries
Message types: detection and recovery messages
[Figure: sessions 3 and 4 after a parent failure; orphaned peers re-query the overlay for a new source]
Load cost

PCM/MCDB:
•Detection (heartbeats; peers status): $C_d = f_{HB} \cdot \sum_{i=1}^{C_G} N_{cc_{g(i)}}$
•Recovery (recovery request; IP Multicast tree rearrangement): $C_r = f_e \cdot O_f \cdot [N_{MG} \cdot N_{CH}]$
•Maintenance (routers status): $C_m = f_{ICC} \cdot N_I + f_{TI} \cdot \sum_{i=1}^{M_G} N_{g(i)}$

P2Cast:
•Detection (heartbeats): $C_d = f_{HB} \cdot N_{cc}$
•Recovery (subsequent queries): $C_r = \dfrac{f_e \cdot O_f}{p}$
•Maintenance (peers status): $C_m = f_{TI} \cdot H$

where $f_{HB}$ is the heartbeat frequency, $f_e$ the failure frequency, $O_f$ the number of control messages per recovery, $p$ the success probability of a collaboration search, and the $N$ and $H$ terms count the peers/routers reached by each message type.
Time cost

PCM/MCDB: $C_t = W + l \cdot \dfrac{\ln(N)}{\ln(\langle k \rangle)}$
P2Cast: $C_t = W + \dfrac{1}{p} \cdot l \cdot \dfrac{\ln(N)}{\ln(\langle k \rangle)}$

•Detection: waiting time $W$ until the failure is noticed
•Recovery messages: network latency $l$ per hop
•Path (network theory): the small-world effect bounds the path length by $\ln(N)/\ln(\langle k \rangle)$
•P2Cast pays the additional $1/p$ factor for its subsequent recovery queries
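Two short derivations behind these terms (standard results, included here for completeness): a failure instant is uniformly distributed within a heartbeat period, and each recovery query succeeds independently with probability $p$:

$$\mathbb{E}[W] = \int_0^{1/f_{HB}} t \, f_{HB} \, dt = \frac{1}{2 f_{HB}} = \tau, \qquad \mathbb{E}[\text{queries}] = \sum_{n=1}^{\infty} n \, p \, (1-p)^{n-1} = \frac{1}{p}.$$

The first expression is the detection wait $\tau$ used later in the Time cost analysis; the second shows why a low success probability on the collaboration search inflates P2Cast's time cost.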
Contents
Introduction
Goal of the Thesis
Control Assessment
The Fault Tolerance Scheme
Simulation
Experimental Results
Conclusions
Future Work
The Fault Tolerance Scheme (FTS)
The FTS stands on peers' capabilities:
•Input / Output bandwidth ($bw_i$ / $bw_o$)
•Buffer size, organised into regions: Cushion, Delivery, Collaboration, Altruist
[Figure: peer buffer with Buffer In / Buffer Out and numbered video blocks]
→ Fault Tolerance Groups
The Fault Tolerance Scheme (FTS)
MN
Video 8 9 10
L
6 7 13 14 1511 123 4 51 2
1 2 3 1 2 3
MN
t = 0
Video
C1
8 9 106 7 13 14 1511 123 4 51 2
7 L4 5 61 2 3 6 7
L 4 51 2 37 8 94 5 61 2 3 10
C1
MN
t = 0
t = 3
7
Video
MN
C1
C2
8 9 10
7 L
6 7 13 14 1511 123 4 51 2
4 5 61 2 3 14 1511 12 13
10 11 L127 8 94 5 6 9 106 7 813 14
13 14 L1510 11 128 9 4 51 2 3
C1 C2
MN
t = 0
t = 3 t = 10
[t = 17]
Cushion Delivery Gen. purpose Altruist
MN
Fault Tolerance Groups
Manager Node
FTS Collaborators
Load and Time costs with the FTS

•Detection: heartbeats are exchanged with the Manager Nodes only, $C_d = f_{HB} \cdot N_{MN}$; combined with the service scheme, $C_d = f_{HB} \cdot \left( \sum_{i=1}^{C_G} N_{cc_{g(i)}} + 2 N_{MN} \right)$
•Maintenance: $C_m = f_{TI} \cdot \sum_{i=1}^{M_G} N_{g(i)}$
•Recovery: $C_r = f_e \cdot O_f$
•Time: $C_t = W + l \cdot \dfrac{\ln(N)}{\ln(\langle k \rangle)}$

The proposed Fault Tolerance Scheme…
•distributes the control through Manager Nodes
•eliminates messages for peers' status maintenance
•removes subsequent queries during recovery (the $1/p$ factor disappears)
•can detect failures through heartbeats (FTS I) and income stream monitoring (FTS II); a sketch of the detector follows
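A minimal sketch of heartbeat-based detection at a Manager Node (FTS I), under the assumption that a member is declared failed after missing two heartbeat periods; all names are illustrative, not taken from the thesis implementation:

```cpp
#include <unordered_map>
#include <vector>

// Illustrative Manager Node failure detector (FTS I).
// A member is suspected once no heartbeat arrived for 2/fHB seconds,
// i.e. two full heartbeat periods (an assumption made for this sketch).
class ManagerNode {
public:
    explicit ManagerNode(double fHB) : period_(1.0 / fHB) {}

    void onHeartbeat(int peerId, double now) { lastSeen_[peerId] = now; }

    // Called periodically; returns members considered failed at time 'now'.
    std::vector<int> checkFailures(double now) const {
        std::vector<int> failed;
        for (const auto& [id, t] : lastSeen_)
            if (now - t > 2.0 * period_)
                failed.push_back(id);   // trigger FTG recovery for this member
        return failed;
    }

private:
    double period_;                              // heartbeat period 1/fHB
    std::unordered_map<int, double> lastSeen_;   // peerId -> last heartbeat time
};
```

Since heartbeats flow only between members and their MN, the detection traffic scales with $N_{MN}$ rather than with the whole group, which is the source of the load cost reduction claimed above.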
Contents
Introduction
Goal of the Thesis
Control Assessment
The Fault Tolerance Scheme
Simulation
Experimental Results
Conclusions
Future Work
Simulation tool: VoDSim
Computational simulations provide a more dynamic and scalable analysis.
•Discrete event-driven model
•More than 50 classes in C++
•Over 46,000 lines

VoDSim extensions:
•Implementation of the ALM service scheme: P2Cast
•Peer arrival rate: Poisson
•Content popularity: Zipf
•Peers' disruptions: Weibull - fault probability $F(x) = 1 - e^{-(x/\beta)^{\alpha}}$, lifetime $R(x) = e^{-(x/\beta)^{\alpha}}$
•FMP instrumentation: Load and Time costs measurement
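For illustration, the three workload distributions can be sampled with the C++ standard library; this is a hedged sketch, not VoDSim's actual code, and the parameter values are placeholders:

```cpp
#include <random>
#include <vector>
#include <cmath>
#include <cstdio>

int main() {
    std::mt19937 rng(42);

    // Peer arrivals: a Poisson process <=> exponential inter-arrival gaps (rate 10/min).
    std::exponential_distribution<double> interArrival(10.0);

    // Peer lifetime before disruption: Weibull(shape alpha, scale beta), in minutes.
    std::weibull_distribution<double> lifetime(/*alpha=*/1.5, /*beta=*/60.0);

    // Content popularity: Zipf over a catalogue of V videos, weight 1/rank^s.
    const int V = 10; const double s = 1.0;
    std::vector<double> w(V);
    for (int i = 0; i < V; ++i) w[i] = 1.0 / std::pow(i + 1, s);
    std::discrete_distribution<int> popularity(w.begin(), w.end());

    double t = interArrival(rng);                 // next peer arrives after t minutes
    std::printf("arrival=%.2f min, video=%d, lifetime=%.1f min\n",
                t, popularity(rng), lifetime(rng));
}
```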
Contents
Introduction
Goal of the Thesis
Control Assessment
The Fault Tolerance Scheme
Simulation
Experimental Results
Conclusions
Future Work
Experimental Results
1 - Control relevance
•PCM/MCDB and P2Cast service schemes
•Analytical and simulated results
•Load and Time costs behaviour
•Control vs. multimedia traffic

Failure Management Process validation
Parameter | Value
Service scheme | P2Cast
Request rate | 10 - 60 requests/minute
Client's output bandwidth | 3000 kb/s
Client's buffer | 113 MB (10 min. of video)
Video catalogue | 1 video
Video length | 90 minutes
Video play rate | 1500 kb/s
Threshold | 1%, 5% and 10% of video length
[Figure: Load cost (messages) vs. number of clients (1 to 10,000), analytical model and simulation.]

Confidence of the simulated results: $s = \sqrt{\dfrac{\sum (Z - M)^2}{N}}$
Z - simulated value; M - average simulated cost; N - number of simulation samples; s - standard deviation.
[Figure: Time cost (sec., log scale) vs. success probability on search collaboration (0.001 to 1.0), analytical model and simulation.]
Control vs. Multimedia traffic
$\Delta_w = \dfrac{Tr_{control}}{Tr_{server\_video}} \cdot 100$
Simulated results (P2Cast): $\Delta_w$ = 10%-28%; 13%-39%; 13%-37%
Performance improvement vs. control complexity - analytical results (PCM/MCDB and P2Cast)

Load cost analysis (number of clients sweep)
Parameter | Value
Number of clients | [27 ; 5400]
Number of multicast groups | 40
Heartbeat frequency | 60 msg./min.
Failure frequency | 0.2 faults/min.
Peers' available bandwidth | 1.5 - 3.0 Mb/s
Playback rate | 1500 kb/s
Video length | 90 minutes

Load cost analysis (heartbeat frequency sweep)
Parameter | Value
Number of clients | 5400
Number of multicast groups | 11
Heartbeat frequency | [0.2 ; 20] msg./min.
Failure frequency | 0.2 faults/min.
Peers' available bandwidth | 1.5 - 3.0 Mb/s
Playback rate | 1500 kb/s
Video length | 90 minutes
Time cost analysis
[Figure: Time cost (sec.) vs. network latency (0.0001 to 0.1 seconds), PCM-MCDB and P2Cast. The time cost increases at high latency, driven by the recovery control messages.]
Impact on QoS, at a download rate of 1500 kb/s (750+750):
•Cushion buffer: 56 MB vs. 11 MB
•Start-up delay: 5 min. vs. 1 min.
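The buffer figures follow directly from the playback rate; as a sanity check (plain arithmetic on the numbers above):

$$300\text{ s} \times \frac{1500\text{ kb/s}}{8} = 56{,}250\text{ kB} \approx 56\text{ MB}, \qquad 60\text{ s} \times \frac{1500\text{ kb/s}}{8} = 11{,}250\text{ kB} \approx 11\text{ MB},$$

and likewise the 113 MB client buffer corresponds to 10 minutes of video at 1500 kb/s.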
Experimental Results
2 - The Fault Tolerance Scheme
•PCM/MCDB and P2Cast service schemes
•Load and Time costs without the FTS (analytical)
•Load and Time costs with the FTS (analytical)
•FTS service performance - simultaneous failures
Load cost analysis
[Figure: Load cost (msg/min) vs. multicast group size (1 to 100 clients/session), for PCM-MCDB and P2Cast with and without FTS I / FTS II.]

$\Delta = \dfrac{C_{SP\_FTS} - C_{SP}}{C_{SP}} \cdot 100$
$C_{SP\_FTS} > C_{SP}$: cost increment; $C_{SP\_FTS} < C_{SP}$: cost reduction.

Parameter | Value
Number of clients | [27 ; 5400]
Number of multicast groups | 11 / 40
Heartbeat frequency | [0.2 ; 60] msg./min.
Failure frequency | [0.03 ; 80] faults/min.
Peers' available bandwidth | 1.5 - 3.0 Mb/s
Playback rate | 1500 kb/s
Buffer capacity | 113 MB (10 min.)
Video length | 90 minutes

Average ∆:
FTS I vs. PCM/MCDB | -60.3%
FTS II vs. PCM/MCDB | -85.5%
FTS I vs. P2Cast | +8.5%
FTS II vs. P2Cast | -87.5%

FTS I - heartbeat detection; FTS II - buffer monitoring detection.
Load cost analysis
[Figure: Load cost (msg/min) vs. heartbeat frequency (0.1 to 100 msg/min), for PCM-MCDB and P2Cast with and without FTS I / FTS II.]
[Figure: Load cost (msg/min) vs. failure frequency (0.01 to 100 faults/min), same six configurations.]

Average ∆ (first experiment):
FTS I vs. PCM/MCDB | -90.6%
FTS II vs. PCM/MCDB | -96.1%
FTS I vs. P2Cast | +2.1%
FTS II vs. P2Cast | -80.4%

Average ∆ (second experiment):
FTS I vs. PCM/MCDB | -49.9%
FTS II vs. PCM/MCDB | -75.0%
FTS I vs. P2Cast | -0.9%
FTS II vs. P2Cast | -91.6%

→ Overhead reduction, scalability. Cushion buffer: 56 MB → 11 MB.
Time cost analysis
[Figure: Time cost (sec.) vs. network latency (0.0001 to 0.1 seconds), PCM-MCDB and P2Cast; the time cost grows at high latency.]
•The FTS reduces the volume of recovery communication
•Mean detection wait with heartbeats: $\tau = 1/(2 \cdot f_{HB})$
•Start-up delay: 5 min. → 1 min. (efficiency)
FTS I - heartbeat detection; FTS II - buffer monitoring detection.
Fault Tolerance service performance
Parameter | Value
Number of clients | 10800
Video channels with P2P collaboration | 1000
Altruist buffer | 338 MB (30 min.) and 102 MB (9 min.)
Video length | 90 minutes
Video play rate | 1500 kb/s

Altruist buffer 338 MB ($s_f$: simultaneous failures supported):
N_CFTG \ bw (kb/s) | s_f
5 \ 300 | 200
4 \ 750 | 400
3 \ 1500 | 600
3 \ 3000 | 1200
3 \ 6000 | 2400

Altruist buffer 102 MB:
N_CFTG \ bw (kb/s) | s_f
10 \ 300 | 400
10 \ 750 | 1000
10 \ 1500 | 2000
10 \ 3000 | 4000
10 \ 6000 | 8000

→ Reliability, flexibility.
Contents
Introduction
Goal of the Thesis
Control Assessment
The Fault Tolerance Scheme
Simulation
Experimental Results
Conclusions
Future Work
Conclusions
•Load cost → control overhead: network congestion, bandwidth resources
•Time cost → efficiency: buffer usage, start-up delay
•Load and Time costs trade-off; reducing both improves Quality of Service
•Internet, P2P and Multicast increase control complexity
The control mechanism plays a crucial role in the design of P2P-VoD systems.
Conclusions
The Fault Tolerance Scheme…
•is flexible for Internet use
•presents a hierarchical control structure
•has a scalable backup mechanism
•does not demand extra data communication or dedicated resources
•is able to guarantee system reliability
•reduces Load and Time costs
Contents
Introduction
Goal of the Thesis
Control Assessment
The Fault Tolerance Scheme
Simulation
Experimental Results
Conclusions
Future Work
Future Work
• Application and assessment of the FTS in a wide range of VoD architectures and service policies
• Implementation of the FTS in a simulation environment
• FTS improvement: storing parts of non-visualized contents; using non-volatile storage devices (e.g. Solid State Disk drives)
• Addition of VCR / DVD-like operations
• Usage of clients' behaviour information to improve system performance
P2P-VoD on Internet: Fault Tolerance and Control Architecture
Rodrigo Godoi
CAOS
Barcelona, July 2009.
Thank you
Gracias
Obrigado
The Fault Tolerance Scheme (FTS)
Architecture elements:
•Server: content seed
•Peer: multimedia client / source
•FTG member: collaborator in the FTS
•Manager Node: organises and monitors the FTG
Properties:
•Distributed backup: flexibility and reliability
•Built on the fly: the backup does not need retransmission
•P2P based: the mechanism uses the system's own available resources
•Hierarchical control: scalability and deployment
[Figure: server, Manager Node, Fault Tolerance Group members and clients, linked by control communication]
The FTS formation law - peers' bandwidth greater than playback rate ($bw \ge V_{pr}$)

Input parameters: backup shares $d \in [d_{min}; d_{max}]$, output bandwidth $bw_o \in [bw_{min}; bw_{max}]$

Distributed backup: $L = \sum_{i=1}^{N_{C_{FTG}}} d_i$

FTG size: $\dfrac{L}{d_{max}} \le N_{C_{FTG}} \le \dfrac{L}{d_{min}}$, hence $N_{C_{FTG_{min}}} \le N_{C_{FTG}} \le N_{C_{FTG_{max}}}$

Collaboration capacity: $CC = MIN\left( \dfrac{bw_o}{V_{pr}},\, s \right)$, where $s = f(bw)$ is the number of FTS streams the peer may serve, bounded by the share of P2P streams reserved for the FTS ($\%FTS_{P2P\_streams}$)

Service conditions - formation loop: while backup remains ($d \le L$), if the candidate has collaboration capacity then add it as collaborator to the FTG; otherwise (or when the group is complete) create a new FTG.
The FTS formation law - peers' bandwidth lower than playback rate ($bw < V_{pr}$)

Buffer and bandwidth constraints: $L = \sum_{i=1}^{N_{C_{FTG}}} d_i$ and $V_{pr} \le \sum_{i=1}^{N_{C_{FTG}}} bw_{o_i}$

FTG size: $N_1 = \dfrac{V_{pr}}{bw}$ (peers needed to source the full playback rate) and $N_2 = \dfrac{L}{d_{max}}$ (peers needed to hold the backup); the minimum group size combines $N_1$ and $N_2$, with a correction for the remainder $r$ of their division, and $N_{C_{FTG_{min}}} \le N_{C_{FTG}} \le N_{C_{FTG_{max}}}$

Collaboration capacity: $CC = \dfrac{bw_o}{V_{pr}}$, subject to the service conditions: $CC = s \cdot f(bw)$

Formation loop: while backup remains ($d \le L$), keep adding collaborators until the group satisfies both constraints; then close the FTG and start a new one - see the sketch after this slide.
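A compact sketch of this formation loop under simplifying assumptions (homogeneous candidates, each offering the same backup share d and output bandwidth bw_o; all names are illustrative, not taken from the thesis implementation):

```cpp
#include <vector>
#include <cstdio>

// Illustrative FTG formation for peers with bw_o < Vpr (simplified sketch).
// Assumption: homogeneous candidates, each contributing a backup share 'd' (MB)
// and an output bandwidth 'bwo' (kb/s).
struct FTG { int members = 0; double storedL = 0, aggregateBw = 0; };

std::vector<FTG> formGroups(int candidates, double L, double d,
                            double bwo, double Vpr) {
    std::vector<FTG> groups;
    FTG g;
    for (int i = 0; i < candidates; ++i) {
        g.members++; g.storedL += d; g.aggregateBw += bwo;
        // A group is complete once it can both hold the backup L and
        // source the full playback rate Vpr.
        if (g.storedL >= L && g.aggregateBw >= Vpr) {
            groups.push_back(g);
            g = FTG{};                       // start a new FTG
        }
    }
    return groups;                           // leftover partial group discarded
}

int main() {
    // 1500 kb/s playback, 300 kb/s peers, 113 MB backup in 25 MB shares
    auto gs = formGroups(100, 113.0, 25.0, 300.0, 1500.0);
    std::printf("formed %zu groups of %d members\n", gs.size(),
                gs.empty() ? 0 : gs[0].members);
}
```

With 300 kb/s peers and a 1500 kb/s playback rate this yields groups of five collaborators, consistent with the N_CFTG = 5 row of the 338 MB table shown earlier (the 25 MB share is a placeholder value).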
[Figure: collaboration example with $V_{pr}$ = 1500 kb/s. The Manager Node (500 kb/s) and collaborators C1 (200 kb/s), C2 (300 kb/s) and C3 (500 kb/s) jointly source the video: their aggregate output bandwidth, 500 + 200 + 300 + 500 = 1500 kb/s, matches the playback rate, with each peer sending an interleaved share of the blocks (A*, B*, …).]
The Fault Tolerance Scheme (FTS)
Creation of Fault Tolerance Groups:
I. The client contacts the Local Server.
II. The Local Server checks collaborator availability.
III. FTS acknowledgement: the client joins an existing FTG, or is put in standby status.
IV. A standby client starts a new FTG and becomes its Manager Node.
The Fault Tolerance Scheme (FTS)
FTG: complexity and maintenance
•MN failure: (I) the Local Server designates a new MN among the standby peers; (II) the FTG is restored.
•Member failure: the MN restores the FTG.
•Restoration complexity: O(N_CFTG).
Evaluation environment
Underlying network: GT-ITM topology generator, transit-stub model
•1 transit domain (3 routers)
•6 stub domains (54 routers)
Service schemes:
•ALM / tree-based P2P (P2Cast)
•IP Multicast / mesh-based P2P (PCM/MCDB)
Network protocols:
•Unicast: OSPF
•IP Multicast: IGMP and PIM-SM
Evaluation environment - network protocols

Protocol | Message | Description | Stage
IP Multicast (PIM-SM) | Hello | Sent periodically on each PIM-enabled interface, addressed to the all-PIM-routers multicast group. | Maintenance
IP Multicast (PIM-SM) | Join/Prune | A list of groups with Joined and Pruned sources per group, used to build the distribution tree. | Recovery / Maintenance
IP Multicast (IGMP) | Create Group Request | Requests the creation of a new transient host group. | Recovery
IP Multicast (IGMP) | General Query | Periodically solicits the group membership information. | Recovery
IP Multicast (IGMP) | Host Membership Report | Sent to the group address for each group to which a host desires to belong. | Recovery
Unicast (OSPF) | Hello | Performs neighbour discovery; sent continually to notice when connectivity has failed. | Maintenance
Unicast (OSPF) | Advertisement | Communicates the router's local routing topology to all other routers in the same OSPF area. | Maintenance
Conclusions - publications
•CEDI/JP07: The FMP; Load cost; PCM/MCDB; tree topology network
•CACIC07: Load cost; PCM/MCDB; centralised vs. distributed FMP; multicast vs. unicast
•JCS&T08: Load cost; PCM/MCDB; transit-stub network topology; centralised vs. distributed FMP
•EuroPar08: PCM/MCDB; Load cost; transit-stub network topology; Manager Nodes; control vs. multimedia traffic
•PDPTA09: Load cost; Time cost; P2Cast; FMP simulated results
•ICIP09: Load cost; Time cost; P2Cast; control vs. multimedia traffic (simulated results)