Date post: 24-Dec-2015
Upload: alicia-strickland
Page 1: CSE 461 University of Washington1 Topic Peer-to-peer content delivery – Runs without dedicated infrastructure – BitTorrent as an example Peer.

CSE 461 University of Washington 1

Topic
• Peer-to-peer content delivery
  – Runs without dedicated infrastructure
  – BitTorrent as an example

[Figure: five peers]


Context
• Delivery with client/server CDNs:
  – Efficient, scales up for popular content
  – Reliable, managed for good service
• … but some disadvantages too:
  – Need for dedicated infrastructure
  – Centralized control/oversight


P2P (Peer-to-Peer)
• Goal is delivery without dedicated infrastructure or centralized control
  – Still efficient at scale, and reliable
• Key idea is to have participants (or peers) help themselves
  – Initially Napster ’99 for music (gone)
  – Now BitTorrent ’01 onwards (popular!)


P2P Challenges
• No servers on which to rely
  – Communication must be peer-to-peer and self-organizing, not client-server
  – Leads to several issues at scale …

[Figure: five peers]


P2P Challenges (2)
1. Limited capabilities
  – How can one peer deliver content to all other peers?
2. Participation incentives
  – Why will peers help each other?
3. Decentralization
  – How will peers find content?


Overcoming Limited Capabilities
• Peer can send content to all other peers using a distribution tree
  – Typically done with replicas over time
  – Self-scaling capacity

[Figure: distribution tree rooted at the source]
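A rough way to see why capacity is self-scaling: each peer that already holds a replica can serve a new peer, so the number of holders can double every round. The sketch below assumes (for illustration only, not from the slides) that every holder uploads one full copy to one new peer per round; the function name is hypothetical.

```python
def rounds_to_reach(num_peers: int) -> int:
    """Rounds until all peers hold the content.

    Assumption (illustrative): in each round every peer that already
    has the content uploads one full copy to one peer that lacks it.
    """
    have = 1      # the source starts with the only copy
    rounds = 0
    while have < num_peers:
        have = min(num_peers, have * 2)  # each holder serves one new peer
        rounds += 1
    return rounds


if __name__ == "__main__":
    for n in (2, 8, 1000):
        print(n, "peers ->", rounds_to_reach(n), "rounds")
```

Under this assumption the number of rounds grows logarithmically with the number of peers, whereas a single source serving every peer directly would need a number of uploads linear in the number of peers.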



Providing Participation Incentives
• Peers play two roles:
  – Download to help themselves, and upload to help others

[Figure: distribution tree rooted at the source, with download and upload arrows]


Providing Participation Incentives (2)
• Couple the two roles:
  – I’ll upload for you if you upload for me
  – Encourages cooperation

[Figure: distribution tree rooted at the source]


Enabling Decentralization
• Peers must learn where to get content
  – Use DHTs (Distributed Hash Tables)
• DHTs are fully decentralized, efficient algorithms for a distributed index
  – Index is spread across all peers
  – Index lists peers to contact for content
  – Any peer can look up the index
  – Started as academic work in 2001


BitTorrent
• Main P2P system in use today
  – Developed by Cohen in ’01
  – Very rapid growth, large transfers
  – Much of the Internet traffic today!
  – Used for legal and illegal content
• Delivers data using “torrents”:
  – Transfers files in pieces for parallelism
  – Notable for treatment of incentives
  – Tracker or decentralized index (DHT)

[Photo: Bram Cohen (born 1975). By Jacob Appelbaum, CC-BY-SA-2.0, from Wikimedia Commons]


BitTorrent Protocol
• Steps to download a torrent:
1. Start with torrent description
2. Contact tracker to join and get list of peers (with at least one seed peer)
2. Or, use DHT index for peers
3. Trade pieces with different peers
4. Favor peers that upload to you rapidly; “choke” peers that don’t by slowing your upload to them
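Step 4 can be sketched as a ranking problem: measure how fast each peer has been uploading to you, keep serving the fastest few, and choke the rest. This is a simplified illustration, not the actual client logic; the function name, the peer labels, and the fixed number of upload slots are assumptions, and real clients also unchoke an occasional random peer to discover better partners.

```python
def choose_unchoked(upload_rates: dict[str, float], slots: int) -> set[str]:
    """Return the peers we keep uploading to.

    upload_rates maps a peer label to the rate (e.g. KB/s) at which that
    peer has recently uploaded to us. Peers outside the top `slots`
    are choked. All names here are hypothetical.
    """
    ranked = sorted(upload_rates, key=upload_rates.get, reverse=True)
    return set(ranked[:slots])


if __name__ == "__main__":
    rates = {"peer_a": 50.0, "peer_b": 0.0, "peer_c": 120.0, "peer_d": 10.0}
    unchoked = choose_unchoked(rates, slots=2)
    choked = set(rates) - unchoked
    print("unchoked:", sorted(unchoked))
    print("choked:  ", sorted(choked))
```

Because a peer's download service depends on what it uploads, a peer that contributes nothing tends to end up in the choked set.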


BitTorrent Protocol (2)

• All peers (except seed) retrieve torrent at the same time


BitTorrent Protocol (3)

• Dividing file into pieces gives parallelism for speed
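A minimal sketch of the piece idea, assuming fixed-size pieces identified by index (the piece size and helper names are illustrative, not taken from the slides). Because each piece is independent, pieces can be fetched from different peers in parallel and arrive in any order.

```python
def split_into_pieces(data: bytes, piece_size: int) -> list[bytes]:
    """Cut the file into fixed-size pieces; the last one may be shorter."""
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]


def assign_round_robin(num_pieces: int, peers: list[str]) -> dict[str, list[int]]:
    """Spread piece requests over the available peers (toy scheduling policy)."""
    plan: dict[str, list[int]] = {p: [] for p in peers}
    for index in range(num_pieces):
        plan[peers[index % len(peers)]].append(index)
    return plan


def reassemble(received: dict[int, bytes]) -> bytes:
    """Rebuild the file from pieces keyed by index, whatever order they came in."""
    return b"".join(received[i] for i in sorted(received))


if __name__ == "__main__":
    data = bytes(range(10))
    pieces = split_into_pieces(data, piece_size=4)
    print(assign_round_robin(len(pieces), ["peer_a", "peer_b"]))
    # Simulate out-of-order arrival from different peers.
    arrived = {2: pieces[2], 0: pieces[0], 1: pieces[1]}
    print(reassemble(arrived) == data)
```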


BitTorrent Protocol (4)

• Choking unhelpful peers encourages participation

[Figure: uploads to unhelpful peers are blocked, shown with stop signs and crosses]


BitTorrent Protocol (5)

• DHT index (spread over peers) is fully decentralized

[Figure: every peer holds a part of the DHT index]


P2P Outlook
• Alternative to CDN-style client-server content distribution
  – With potential advantages
• P2P and DHT technologies finding more widespread use over time
  – E.g., part of Skype, Amazon
  – Expect hybrid systems in the future


Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications

Robert Morris, Ion Stoica, David Karger, M. Frans Kaashoek, Hari Balakrishnan

MIT and Berkeley


A peer-to-peer storage problem
• 1000 scattered music enthusiasts
• Willing to store and serve replicas
• How do you find the data?


The lookup problem

[Figure: nodes N1–N6 connected through the Internet. A publisher holds Key=“title”, Value=MP3 data…; a client issues Lookup(“title”) and does not know which node to ask.]


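Centralized lookup (Napster)

[Figure: the publisher at N4 holds Key=“title”, Value=MP3 data… and registers SetLoc(“title”, N4) with a central DB; the client sends Lookup(“title”) to the DB. Other nodes shown: N1, N2, N3, N6, N7, N8, N9.]

Simple, but O(N) state and a single point of failure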


Flooded queries (Gnutella)

[Figure: the client’s Lookup(“title”) is flooded from node to node (N1, N2, N3, N6, N7, N8, N9) until it reaches the publisher at N4, which holds Key=“title”, Value=MP3 data…]

Robust, but worst case O(N) messages per lookup


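Routed queries (Freenet, Chord, etc.)

[Figure: the client’s Lookup(“title”) is routed hop by hop through a few of the nodes (N1, N2, N3, N6, N7, N8, N9) to the publisher at N4, which holds Key=“title”, Value=MP3 data…]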


Routing challenges
• Define a useful key nearness metric
• Keep the hop count small
• Keep the tables small
• Stay robust despite rapid change

• Freenet: emphasizes anonymity
• Chord: emphasizes efficiency and simplicity


Chord properties
• Efficient: O(log(N)) messages per lookup
  – N is the total number of servers
• Scalable: O(log(N)) state per node
• Robust: survives massive failures
• Proofs are in paper / tech report
  – Assuming no malicious participants


Chord overview
• Provides peer-to-peer hash lookup:
  – Lookup(key) → IP address
  – Chord does not store the data
• How does Chord route lookups?
• How does Chord maintain routing tables?


Chord IDs
• Key identifier = SHA-1(key)
• Node identifier = SHA-1(IP address)
• Both are uniformly distributed
• Both exist in the same ID space
• How to map key IDs to node IDs?
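As a small illustration of putting keys and nodes into one ID space, the sketch below hashes both with SHA-1 using Python's standard hashlib. The function name, the example IP address, and the optional truncation to m bits (handy for drawing a small ring, such as a 7-bit one) are assumptions made for this example.

```python
import hashlib


def chord_id(value: str, m: int = 160) -> int:
    """Map a key or an IP address string to an identifier in [0, 2**m).

    With m = 160 this is the full SHA-1 digest read as an integer;
    a smaller m keeps only the low-order bits (illustrative shortcut).
    """
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


if __name__ == "__main__":
    print("key  ID:", chord_id("title", m=7))
    print("node ID:", chord_id("192.0.2.10", m=7))  # example address, hypothetical
```

Because both kinds of identifier come from the same hash function, they land in the same circular space and can be compared directly.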


Consistent hashing [Karger 97]

[Figure: circular 7-bit ID space with nodes N32, N90, N105 and keys K5, K20, K80; “Key 5” and “Node 105” are labeled as examples.]

A key is stored at its successor: node with next higher ID
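The successor rule can be sketched with a sorted list and a binary search. The node and key IDs below are the ones shown on this slide; the function name is hypothetical, and a key whose ID equals a node ID is treated as stored at that node (an assumption consistent with the usual Chord definition).

```python
import bisect


def successor(node_ids: list[int], key_id: int) -> int:
    """Return the first node at or after key_id, wrapping around the ring."""
    ring = sorted(node_ids)
    i = bisect.bisect_left(ring, key_id)
    return ring[i % len(ring)]  # past the largest ID, wrap to the smallest


if __name__ == "__main__":
    nodes = [32, 90, 105]
    for key in (5, 20, 80, 110):
        print(f"K{key} is stored at N{successor(nodes, key)}")
```

With these IDs, K5 and K20 map to N32, K80 maps to N90, and a key above 105 (such as 110) wraps around to N32.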


Consistent hashing [Karger 97]

Theorem: For any set of N nodes and K keys, with “high probability”
1) Each node is responsible for at most (1 + ε) K/N keys
2) When the (N+1)th node joins or leaves the network, responsibility for O(K/N) keys changes hands


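Basic lookup

[Figure: ring with nodes N10, N32, N60, N90, N105, N120 and key K80. The query “Where is key 80?” is passed from node to node around the ring until the answer “N90 has K80” comes back.]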


Simple lookup algorithm

Lookup(my-id, key-id)
    n = my successor
    if my-id < n < key-id                 // compared clockwise around the ring
        call Lookup(key-id) on node n     // next hop
    else
        return my successor               // done

• Correctness depends only on successors
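A runnable sketch of the successor-only lookup, simulated in a single process. The ring (N10, N32, N60, N90, N105, N120 in a 7-bit space) is borrowed from the basic-lookup figure; the helper names and the dictionary standing in for remote calls are assumptions made for illustration.

```python
SIZE = 2 ** 7  # 7-bit circular ID space


def between(a: int, x: int, b: int) -> bool:
    """True if x lies strictly between a and b, walking clockwise from a.

    When a == b the interval is treated as the whole ring.
    """
    span = (b - a) % SIZE or SIZE
    return 0 < (x - a) % SIZE < span


def simple_lookup(successor_of: dict[int, int], start: int, key_id: int):
    """Follow successor pointers until the key's successor is found.

    Returns (responsible node, number of hops taken).
    """
    node, hops = start, 0
    while True:
        nxt = successor_of[node]
        if between(node, nxt, key_id):
            node, hops = nxt, hops + 1   # next hop
        else:
            return nxt, hops             # done


if __name__ == "__main__":
    ring = [10, 32, 60, 90, 105, 120]
    successor_of = {n: ring[(i + 1) % len(ring)] for i, n in enumerate(ring)}
    print(simple_lookup(successor_of, start=32, key_id=80))
    print(simple_lookup(successor_of, start=105, key_id=80))
```

The hop count grows with the distance around the ring, which is why this version needs O(N) messages in the worst case.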


“Finger table” allows log(N)-time lookups

[Figure: node N80 with fingers reaching ½, ¼, 1/8, 1/16, 1/32, 1/64 and 1/128 of the way around the ring]


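Finger i points to successor of n + 2^i

[Figure: node N80 with fingers reaching ½, ¼, 1/8, 1/16, 1/32, 1/64 and 1/128 of the way around the ring. The finger for 80 + 32 = 112 points to N120.]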
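The finger rule can be sketched by reusing the successor search. The ring below is hypothetical: it was chosen to contain N80 and N120 so that the finger for 80 + 32 = 112 lands on N120, as in the figure; the other node IDs and the function names are assumptions.

```python
import bisect

M = 7            # bits in the ID space
SIZE = 2 ** M


def successor(ring: list[int], ident: int) -> int:
    """First node at or after ident on a sorted ring, with wraparound."""
    i = bisect.bisect_left(ring, ident % SIZE)
    return ring[i % len(ring)]


def finger_table(ring: list[int], n: int) -> list[int]:
    """Finger i is the successor of (n + 2**i) mod 2**M, for i = 0 .. M-1."""
    return [successor(ring, n + 2 ** i) for i in range(M)]


if __name__ == "__main__":
    ring = sorted([10, 32, 60, 80, 90, 105, 120])  # hypothetical node IDs
    for i, f in enumerate(finger_table(ring, 80)):
        print(f"finger {i}: start {(80 + 2 ** i) % SIZE:3d} -> N{f}")
```

Each node stores only M entries, which is the O(log(N)) state per node claimed earlier, and several fingers may point to the same node when nearby IDs are unoccupied.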


Lookup with fingers

Lookup(my-id, key-id)
    look in local finger table for
        highest node n s.t. my-id < n < key-id   // compared clockwise around the ring
    if n exists
        call Lookup(key-id) on node n            // next hop
    else
        return my successor                      // done
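A single-process sketch of the finger-based lookup. The node IDs (N5, N10, N20, N32, N60, N80, N99, N110) and the Lookup(K19) query are borrowed from the lookup figure in this deck; starting the query at N80, recomputing finger tables on the fly, and the function names are assumptions made to keep the example short.

```python
import bisect

M = 7
SIZE = 2 ** M


def successor(ring: list[int], ident: int) -> int:
    """First node at or after ident on a sorted ring, with wraparound."""
    i = bisect.bisect_left(ring, ident % SIZE)
    return ring[i % len(ring)]


def fingers(ring: list[int], n: int) -> list[int]:
    """Finger i is the successor of n + 2**i; finger 0 is n's successor."""
    return [successor(ring, n + 2 ** i) for i in range(M)]


def lookup(ring: list[int], start: int, key_id: int):
    """Return (node responsible for key_id, list of nodes visited)."""
    node, path = start, [start]
    while True:
        span = (key_id - node) % SIZE or SIZE   # clockwise distance to the key
        best = None
        for f in fingers(ring, node):
            d = (f - node) % SIZE
            # Keep the finger that gets closest to the key without reaching it.
            if 0 < d < span and (best is None or d > (best - node) % SIZE):
                best = f
        if best is None:
            return fingers(ring, node)[0], path  # my successor holds the key
        node = best                              # next hop
        path.append(node)


if __name__ == "__main__":
    ring = sorted([5, 10, 20, 32, 60, 80, 99, 110])
    owner, path = lookup(ring, start=80, key_id=19)
    print("path:", path, "-> key 19 is at N%d" % owner)
```

Each hop moves to the finger that most closely precedes the key, so the remaining distance shrinks quickly; that is the intuition behind the O(log(N)) hop bound.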


Lookups take O(log(N)) hops

[Figure: ring with nodes N5, N10, N20, N32, N60, N80, N99, N110 and key K19; Lookup(K19) reaches the key in a few finger hops]


Failures might cause incorrect lookup

[Figure: ring with nodes N10, N80, N85, N102, N113, N120; Lookup(90) arrives at N80 after failures among its successors]

N80 doesn’t know correct successor, so incorrect lookup


Solution: successor lists
• Each node knows r immediate successors
• After failure, will know first live successor
• Correct successors guarantee correct lookups
• Guarantee is with some probability


Choosing the successor list length
• Assume 1/2 of nodes fail
• P(successor list all dead) = (1/2)^r
  – I.e. P(this node breaks the Chord ring)
  – Depends on independent failure
• P(no broken nodes) = (1 – (1/2)^r)^N
  – r = 2 log(N) (log base 2) makes prob. ≈ 1 – 1/N
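A quick numeric check of this rule of thumb. With r = 2 log2(N), (1/2)^r = 1/N^2, so (1 – 1/N^2)^N ≈ 1 – 1/N. The function names are hypothetical, and reading the logarithm as base 2 is an assumption that follows from the failure probability of 1/2.

```python
import math


def p_ring_intact(n: int, r: int, p_fail: float = 0.5) -> float:
    """P(no node has all r successors dead), assuming independent failures."""
    return (1 - p_fail ** r) ** n


def suggested_r(n: int) -> int:
    """Successor list length r = 2 * log2(N), rounded up."""
    return math.ceil(2 * math.log2(n))


if __name__ == "__main__":
    for n in (16, 1024, 65536):
        r = suggested_r(n)
        print(f"N={n:6d} r={r:2d} "
              f"P(intact)={p_ring_intact(n, r):.6f} 1-1/N={1 - 1 / n:.6f}")
```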

