Should we build Gnutella on a structured overlay? ·

Date post: 20-Jan-2021
Should we build Gnutella on a structured overlay? Ant Rowstron joint work with Miguel Castro, Manuel Costa Microsoft Research Cambridge
Page 1: Should we build Gnutella on a structured overlay? ·

Should we build Gnutella on a structured overlay?

Ant Rowstron

joint work with

Miguel Castro, Manuel Costa

Microsoft Research Cambridge

Page 2

Structured P2P overlay networks

• Structured overlay network maps keys to nodes
• Routes messages to keys (can implement a hash table)

[Figure: overlay network with N nodes holding a (k, v) pair; route(“insert v”, k) and route(“lookup”, k) returning v]

[CAN, Chord, Kademlia, Pastry, SkipNet, Tapestry, Viceroy]

Page 3

Mapping keys to nodes

• Large id space (128-bit integers)

• NodeIds picked randomly from space

• Key is managed by its root node:

• Live node with id closest to the key

[Figure: id space showing a nodeId, a key, and the root node for the key]
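The key-to-node mapping on this slide can be sketched in a few lines. This is a minimal illustration, not Pastry's implementation: the names `ring_distance` and `root_node` are hypothetical, and the sketch assumes the id space wraps around and that ties go to the lower id.

```python
import random

ID_BITS = 128
ID_SPACE = 1 << ID_BITS

def ring_distance(a: int, b: int) -> int:
    """Distance between two ids, assuming the id space wraps around."""
    d = abs(a - b)
    return min(d, ID_SPACE - d)

def root_node(key: int, live_node_ids: list) -> int:
    """Return the live nodeId closest to the key (assumed tie-break: lower id)."""
    return min(live_node_ids, key=lambda n: (ring_distance(n, key), n))

rng = random.Random(1)
nodes = [rng.getrandbits(ID_BITS) for _ in range(8)]  # nodeIds picked randomly
key = rng.getrandbits(ID_BITS)
root = root_node(key, nodes)

# If the root leaves, the next-closest live node becomes the root for the key.
survivors = [n for n in nodes if n != root]
new_root = root_node(key, survivors)
```

Because the root is defined over live nodes only, responsibility for a key moves automatically when the current root fails.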

Page 4

Pastry

• routing table
• nodeIds and keys in some base 2^b (e.g., 4)
• prefix constraints on nodeIds for each slot

[Figure: routing table and leaf set for nodeId 203231; routing table slots by prefix:]
0* 1* 2* 3*
20* 21* 22* 23*
200* 201* 202* 203*
2030* 2031* 2032* 2033*
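The prefix constraint per slot can be made concrete with a short sketch. The function names `slot_prefixes` and `slot_for` are hypothetical, ids are written as digit strings, and the sketch only computes which prefix each slot requires; it does not model how Pastry fills or maintains the table.

```python
from typing import List, Optional, Tuple

def slot_prefixes(node_id: str, base: int, rows: int) -> List[List[str]]:
    """Prefix required for each slot: row r shares r digits with node_id,
    column d fixes the next digit to d."""
    table = []
    for r in range(rows):
        shared = node_id[:r]
        table.append([shared + format(d, "x") + "*" for d in range(base)])
    return table

def slot_for(local_id: str, other_id: str) -> Optional[Tuple[int, int]]:
    """(row, column) of the slot another nodeId could occupy, or None if equal."""
    for r, (a, b) in enumerate(zip(local_id, other_id)):
        if a != b:
            return r, int(b, 16)
    return None

table = slot_prefixes("203231", base=4, rows=4)
for row in table:
    print(" ".join(row))
```

For nodeId 203231 in base 4, the first four rows reproduce the prefixes shown in the figure, and a node such as 203012 would be a candidate for row 3, column 0 (prefix 2030*).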

Page 5

Structured overlays

• Overlay topology
– nodes self-organize into structured graph
– node identity constrains set of neighbors
• Data placement
– data identified by a key
– data stored at node responsible for key
• Queries
– efficient key lookups (O(log N))

examples: CAN, Chord, Pastry, Tapestry

Page 6

Gnutella

• Nodes form random graph (unstructured overlay)
• Node stores its own published content
• Lookups flooded through network (inefficient)

[Figure: overlay network with N nodes; route(“insert v”) keeps v at the publishing node; route(“lookup”, reg. exp) is flooded through the network]
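To see why flooding on a random graph is inefficient, the sketch below counts duplicate deliveries. It is a toy model under stated assumptions, not Gnutella's wire protocol: `random_graph` and `flood` are hypothetical names, the graph generator is a simple stand-in, and a node is assumed to suppress re-forwarding of a query it has already seen.

```python
import random
from collections import deque

def random_graph(n, degree, rng):
    """Undirected graph where each node links to `degree` random peers."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in rng.sample([x for x in range(n) if x != i], degree):
            adj[i].add(j)
            adj[j].add(i)
    return adj

def flood(adj, source, ttl):
    """Flood from source; return (nodes reached, messages sent, duplicates)."""
    seen = {source}
    queue = deque([(source, None, ttl)])
    messages = duplicates = 0
    while queue:
        node, parent, t = queue.popleft()
        if t == 0:
            continue
        for nb in adj[node]:
            if nb == parent:
                continue  # do not echo the query back to the sender
            messages += 1
            if nb in seen:
                duplicates += 1  # neighbor had already received this query
            else:
                seen.add(nb)
                queue.append((nb, node, t - 1))
    return seen, messages, duplicates

rng = random.Random(0)
graph = random_graph(200, 3, rng)
reached, messages, duplicates = flood(graph, source=0, ttl=7)
```

Every message either reaches a new node or is a duplicate, so `messages - duplicates` equals the number of nodes reached beyond the source; the rest is wasted traffic.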

Page 7

Gnutella

• Nodes form random graph (unstructured overlay)
• Node stores its own published content
• Lookup using random walks (needles and haystacks!)

[Figure: overlay network with N nodes; route(“insert v”) keeps v at the publishing node; route(“lookup”, reg. exp) follows a random walk through the network]
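A random-walk lookup with local query evaluation might look like the following. This is an illustrative sketch only: `random_walk_lookup` is a hypothetical name, the query is assumed to be a regular expression matched against file names, and every node is assumed to have at least one neighbor.

```python
import random
import re

def random_walk_lookup(adj, content, source, pattern, ttl, rng):
    """Walk up to ttl hops; each visited node evaluates the pattern locally.

    Returns (node, hops, matching names) on the first hit, or None.
    """
    regex = re.compile(pattern)
    node = source
    for hops in range(ttl + 1):
        hits = [name for name in content.get(node, []) if regex.search(name)]
        if hits:
            return node, hops, hits
        node = rng.choice(sorted(adj[node]))  # forward to one random neighbor
    return None

# Tiny example: a three-node cycle where only node 2 publishes a match.
adj = {0: {1}, 1: {2}, 2: {0}}
content = {2: ["linux.iso"]}
result = random_walk_lookup(adj, content, 0, r"\.iso$", 5, random.Random(0))
```

Only one message is in flight per hop, which avoids flooding's duplicates but means a rare item (the needle) may take many hops to find, or be missed within the TTL.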

Page 8

Unstructured overlay

• Overlay topology
– nodes self-organize into random graph
• Data placement
– node stores data it publishes
• Queries
– overlay supports arbitrarily complex queries
– floods or random walks disseminate query
– each node evaluates query locally

example: Gnutella

Page 9

Can we build Gnutella on a structured overlay?

• Complex queries are important
– unstructured overlays support them

– structured overlays do not support them

• Peers are extremely transient
– unstructured overlays more robust to churn

– structured overlays have higher overhead

[Chawathe et al. SIGCOMM’03]

Page 10

Complex queries

• Arbitrarily complex queries
– Unstructured overlay
  • Flood
    – High overhead due to duplicates
  • Random walks
    – High lookup latency
  • Support arbitrarily complex queries
– Structured overlays
  • ?

Page 11

Complex queries (structured)

• Structured overlay topology
– nodes self-organize into structured graph
• Same data placement as unstructured
– node stores data it publishes
• Same queries as unstructured
– overlay supports arbitrarily complex queries
– floods or random walks disseminate queries
– each node evaluates query locally

Page 12

Flood queries

[Figure: id space partitioned by first digit into regions 0x, 1x, 2x, 3x]

• Exploit structure to avoid duplicates

Page 13

Flood queries

[Figure: regions 0x, 1x, 2x, 3x, with region 0x further partitioned into 00x, 01x, 02x, 03x]

Page 14

Random walk queries 1

Page 15

Random walk queries 2

Page 16

Random walk queries 3

[Figure: regions 0x, 1x, 2x, 3x, with region 0x further partitioned into 00x, 01x, 02x, 03x]

• Exploiting routing tables
• Breadth-first search

Page 17

Story so far…

• Gnutella is built using an unstructured overlay
• Described hybrid approach
– Structured overlay graph
– Unstructured overlay data placement
• Described how to exploit structure in lookup
– Same techniques as in an unstructured overlay
– Implemented more efficiently

Next part: Churn and overhead

Page 18

Overhead

• Both structured and unstructured
– detect failures
– repair overlay graph when nodes join or leave

Page 19

Detecting failures

• Probe neighbors in overlay
• Exploit symmetric state
– Heartbeats versus probes
  • Number of heartbeats is number of neighbors
  • Suppress heartbeats with application traffic

Page 20

Exploiting structure for maintenance

• Heartbeat sent to neighbor on the left
• Probe node if no heartbeat
• Tell others about failure if no probe reply
• Leads to lower overhead
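The three steps above can be sketched as one heartbeat period on a ring of nodes. This is a simplified model, not MS Pastry's code: `detection_round` is a hypothetical name, nodes are assumed to be ordered by id so that each node's right neighbor is the one whose heartbeat it expects, and the messages that tell others about a failure are not counted.

```python
def detection_round(ring, alive):
    """One heartbeat period on a ring of nodeIds sorted by id.

    Returns (messages sent, list of (reporter, failed node) pairs).
    """
    n = len(ring)
    messages = 0
    announced = []
    for i, node in enumerate(ring):
        if node not in alive:
            continue
        messages += 1  # one heartbeat to the neighbor on the left
        right = ring[(i + 1) % n]
        if right not in alive:
            messages += 1  # no heartbeat arrived from the right, so probe it
            announced.append((node, right))  # probe unanswered: report failure
    return messages, announced

ring = [10, 20, 30, 40]
healthy = detection_round(ring, alive={10, 20, 30, 40})
one_failed = detection_round(ring, alive={10, 20, 40})
```

In the healthy case each node sends a single heartbeat per period regardless of how many ring neighbors it tracks, which is where the lower overhead comes from; a probe is only sent when a heartbeat goes missing.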

Page 21

Comparing overhead

• Unstructured overlay (Gnutella 0.4)
– Max and min bounds placed on # neighbors
– Node discovery on join using random walks
– Failure detection heartbeat every 30 seconds
• Structured overlay (MS Pastry)
– Leafsets
  • Failure detection using heartbeats every 30 seconds
– Routing table
  • Failure detection using probes (tuned to churn)

Page 22

Experimental comparison

• Discrete event simulator
– Transit-stub network topology
• UW trace of node arrivals and departures
– [Saroiu et al. MMCN’02]
– 60-hour trace
– average session = 2.3 hours, median ~ 1 hour
– Number of active nodes varies between 1,300 and 2,700

Page 23

Gnutella trace: Failure rate

[Figure: node failures per second per node (y-axis, 0 to 3.0E-04) versus time in hours (x-axis, 0 to 60)]

Page 24

Overhead: Configuration

• Gnutella 0.4 (4)
– Min neighbors 4, max neighbors 8 (avg. 5.8)
• Gnutella 0.4 (8)
– Min neighbors 8, max neighbors 32 (avg. 11)
• Pastry
– b=1, no proximity neighbor selection, l = 32
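As a rough back-of-envelope check, the heartbeat component of the maintenance traffic can be estimated from these configurations. This assumes one heartbeat per neighbor every 30 seconds for Gnutella 0.4 and a single leaf-set heartbeat to the left neighbor for Pastry, as described on the earlier slides; it ignores probes, joins and repairs, so it is not a substitute for the simulation results. The helper name `heartbeat_rate` is hypothetical.

```python
HEARTBEAT_PERIOD_S = 30.0

def heartbeat_rate(neighbors: float, period_s: float = HEARTBEAT_PERIOD_S) -> float:
    """Heartbeats sent per second per node, one per neighbor per period."""
    return neighbors / period_s

rates = {
    "Gnutella 0.4 (4)": heartbeat_rate(5.8),   # avg. 5.8 neighbors
    "Gnutella 0.4 (8)": heartbeat_rate(11),    # avg. 11 neighbors
    "Pastry leaf set (left neighbor only)": heartbeat_rate(1),
}

for name, rate in rates.items():
    print(f"{name}: {rate:.3f} heartbeats / second / node")
```

Under these assumptions the heartbeat cost grows with the number of neighbors in the unstructured case, while the leaf-set heartbeat stays constant.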

Page 25

Overhead: Maintenance

[Figure: messages / second / node (y-axis, 0 to 0.9) versus time in hours (x-axis, 0 to 60); series: Gnutella 0.4 (8), Gnutella 0.4 (4), Pastry]

Page 26

Gnutella 0.6 (SuperPeers)

• Super peers form random graph
– Uses Gnutella 0.4 algorithm
• Normal nodes use super peers as proxies
– Failure detection using heartbeats (30 secs)
– Connect to multiple super peers

Page 27

SuperPastry

• Super peers form Pastry overlay
• Normal nodes use super peers as proxies
– Failure detection using heartbeats (30 secs)

Page 28

Overhead: Configuration

• 0.2 probability of node being a super peer
• Gnutella 0.6 configured:
– Min neighbors = 10
– Max neighbors = 32
• SuperPastry configured:
– Max in-degree from routing table = 32
• Super peers proxy for 30 normal nodes
• Normal nodes pick 3 super peers

Page 29

Overhead: Maintenance

[Figure: messages / second / node (y-axis, 0 to 0.45) versus time in hours (x-axis, 0 to 60); series: Gnutella 0.6, SuperPastry]

Page 30

Gia [Chawathe et al. SIGCOMM’03]

• Adapts overlay to exploit heterogeneity
– Uses a per-node metric of satisfaction
– Seeks new neighbors if unsatisfied
– Use parameters in SIGCOMM paper
– Neighbors [min = 3, max = max(3, min(128, C/4))]
  • Average 15.8

Capacity   Probability   Neighbors
10000      0.001         128
1000       0.049         125
100        0.30          25
10         0.45          3
1          0.20          3
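The neighbor bound can be worked through numerically. The sketch below applies the formula max(3, min(128, C/4)) exactly as written on the slide; the pairing of capacities with probabilities (capacity 1 with 0.20 up to capacity 10000 with 0.001) is an assumption about how the slide's table reads, the function name `gia_max_neighbors` is hypothetical, and no rounding is applied because the slide does not specify any.

```python
def gia_max_neighbors(capacity: float) -> float:
    """Upper bound on neighbors: max(3, min(128, C/4)), as written on the slide."""
    return max(3, min(128, capacity / 4))

# Assumed pairing of capacity level with probability (see the note above).
capacity_distribution = {1: 0.20, 10: 0.45, 100: 0.30, 1000: 0.049, 10000: 0.001}

bounds = {c: gia_max_neighbors(c) for c in capacity_distribution}
expected_bound = sum(p * bounds[c] for c, p in capacity_distribution.items())

for c in sorted(bounds):
    print(f"capacity {c}: at most {bounds[c]:g} neighbors")
print(f"expected bound: {expected_bound:.2f}")
```

With this pairing the expected bound is about 15.85, in line with the "Average 15.8" quoted on the slide. Note that the formula as written gives 128 for capacity 1000, whereas the slide's table lists 125 for one capacity level.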

Page 31

HeteroPastry

• Routing table neighbor selection using capacity metric
• Uses routing table in-degree bound
– Calculated as for Gia

Page 32

Overhead: Maintenance

[Figure: messages / second / node (y-axis, 0 to 0.8) versus time in hours (x-axis, 0 to 60); series: Gia, HeteroPastry]

Page 33

The story so far…

• Both structured and unstructured
– detect failures
– repair overlay graph when nodes join or leave
• Structured exploits structure
– Lower overheads
• Unstructured overlays sensitive to neighbor choice
– Random walks between node discovery

Finally: Putting it all together…

Page 34

Search: Configuration

• eDonkey file trace [Fessant et al. IPTPS’04]
– 37,000 peers (25,172 contribute no files)
– 923,000 unique files (heavy-tailed, Zipf-like)
• Each node performs 0.01 lookups per second (using a Poisson process)
– Random walks TTL 128
• One hop replication [Chawathe et al. SIGCOMM’03]
– Uses routing table in structured overlays (***)
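The per-node lookup workload can be illustrated with a small generator. This is a sketch of a Poisson arrival process at the stated rate of 0.01 lookups per second, drawing exponential gaps between lookups; `lookup_times` is a hypothetical name and the code is not taken from the simulator used for these experiments.

```python
import random

LOOKUP_RATE_PER_S = 0.01  # from the slide: 0.01 lookups per second per node

def lookup_times(rate_per_s, horizon_s, rng):
    """Arrival times of a Poisson process: exponential gaps with mean 1/rate."""
    t = 0.0
    times = []
    while True:
        t += rng.expovariate(rate_per_s)
        if t >= horizon_s:
            return times
        times.append(t)

rng = random.Random(42)
one_hour = lookup_times(LOOKUP_RATE_PER_S, 3600.0, rng)
```

At this rate a node issues a lookup every 100 seconds on average, so roughly 36 lookups per node per simulated hour.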

Page 35

Search: Messages

[Figure: messages / second / node (y-axis, 0 to 2) versus time in hours (x-axis, 0 to 60); series: Gia, Gnutella 0.6, SuperPastry, HeteroPastry]

Page 36

Search: Success rate

[Figure: success rate (y-axis, 0 to 1.2) versus time in hours (x-axis, 0 to 60); series: HeteroPastry, Gia, Gnutella 0.6, SuperPastry]

Page 37

Search: Delay

[Figure: delay in ms (y-axis, 0 to 30000) versus time in hours (x-axis, 0 to 60); series: Gnutella 0.6, SuperPastry, Gia, HeteroPastry]

Page 38

Conclusions

• Structure can improve Gnutella
– Handles transient peers well
– Exploits structure to reduce maintenance overhead
– Supports complex queries
– Can also support DHT functionality
– Can exploit heterogeneity

Page 39

And finally a question…

Does structure make security easier?

For slides:

http://www.research.microsoft.com/~antr/camb-ast.ppt

For more information:

http://www.research.microsoft.com/~antr/Pastry

Page 40

Flooding queries

• exploit structure to avoid duplicates
• flooding a query q
– if node is source of q do
    for each routing table row r
      send <flood, q, r> to nodes in row r
– if node receives <flood, q, s> do
    for each routing table row r such that r > s
      send <flood, q, r> to nodes in row r

• recursively partitions nodes into disjoint sets
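The flooding rule above can be checked on an idealised overlay. The sketch assumes a fully populated id space in which row r of a node's routing table holds exactly one node for each other value of digit r, chosen here by changing only that digit; real routing tables are sparse and the slides do not specify this construction. The names `routing_table` and `structured_flood` are hypothetical.

```python
def routing_table(node, base):
    """Idealised table: row r holds one node per other digit d, sharing node[:r]."""
    digits = "0123456789abcdef"[:base]
    return [[node[:r] + d + node[r + 1:] for d in digits if d != node[r]]
            for r in range(len(node))]

def structured_flood(source, base):
    """Flood from source; return (deliveries per node, total messages)."""
    deliveries = {source: 0}
    messages = 0
    pending = [(source, -1)]  # (node, row s the query arrived on; -1 = source)
    while pending:
        node, s = pending.pop()
        table = routing_table(node, base)
        for r in range(s + 1, len(table)):  # forward only on rows r > s
            for nb in table[r]:
                messages += 1
                deliveries[nb] = deliveries.get(nb, 0) + 1
                pending.append((nb, r))
    return deliveries, messages

deliveries, messages = structured_flood("203", base=4)
```

With three base-4 digits there are 64 nodes; the flood reaches all of them with 63 messages and no node receives the query twice, because each forwarding step is confined to a disjoint prefix region.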

