Overlay Networks
EECS 122: Lecture 18
Department of Electrical Engineering and Computer Sciences
University of California, Berkeley
April 3, 2003. A. Parekh, EE122 S2003. Revised and enhanced F'02 Lectures
What is an overlay network?
A network defined over another set of networks
The overlay addresses its own nodes
Links on one layer are network segments of lower layers
    Requires lower layer routing to be utilized
    Overlaying mechanism is called tunneling
[Figure: nodes A and A’]
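Tunneling can be pictured as wrapping an overlay packet inside a packet of the lower layer. The following is a minimal Python sketch under stated assumptions: the names `Packet`, `encapsulate`, and `decapsulate` and the addresses are hypothetical, and the code is not tied to any real tunneling protocol.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Packet:
    src: str                         # sender address at this layer
    dst: str                         # receiver address at this layer
    payload: Union[bytes, "Packet"]  # raw data, or a whole packet of the layer above


def encapsulate(inner: Packet, tunnel_src: str, tunnel_dst: str) -> Packet:
    """Wrap an overlay packet in a lower-layer packet (tunnel entry)."""
    return Packet(src=tunnel_src, dst=tunnel_dst, payload=inner)


def decapsulate(outer: Packet) -> Packet:
    """Recover the overlay packet at the far end of the tunnel."""
    if not isinstance(outer.payload, Packet):
        raise ValueError("outer packet does not carry an overlay packet")
    return outer.payload


# Overlay nodes A and B use overlay addresses; the lower layer only sees the
# (hypothetical) addresses of the two tunnel endpoints.
overlay_pkt = Packet(src="A", dst="B", payload=b"hello")
outer_pkt = encapsulate(overlay_pkt, tunnel_src="10.0.0.1", tunnel_dst="10.0.0.2")
```

The lower layer routes `outer_pkt` by its own addresses and never inspects the overlay addresses carried in the payload.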
Overlay Concept: Going Up
[Figure: underlying network of numbered nodes; A, B, and C are the overlay network nodes]
Overlay Concept: Going Up
[Figure: overlay nodes A, B, and C over the underlying network of numbered nodes]
Overlay Networks are extremely popular
    MBONE, Akamai, Virtual Private Networks, Napster, Gnutella
Overlay Networks may even peer!
Overlay Concept: Going Down
[Figure: network of numbered nodes]
IP Network is the Overlay…
[Figure: numbered IP nodes over lower-layer nodes a, b, c, d]
ATM links can be the “physical layer” for IP
Virtual Circuit under Datagram!
Example: Napster
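[Figure: machines m1 to m6 and files A to F. The central index lists m1: A, m2: B, m3: C, m4: D, m5: E, m6: F. A query "E?" to the index returns m5; the request "E?" is then sent to m5, which returns E.]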
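The Napster example amounts to a lookup in one central index followed by a direct fetch from the machine that holds the file. A minimal sketch, assuming the index entries read off the slide's figure (m1 holds A, and so on up to m6 holding F); the function names and the stored contents are hypothetical.

```python
# Central index: file name -> machine that holds it (entries as in the figure).
INDEX = {"A": "m1", "B": "m2", "C": "m3", "D": "m4", "E": "m5", "F": "m6"}

# What each machine stores locally (placeholder contents for illustration).
STORAGE = {machine: {name: f"<contents of {name}>"} for name, machine in INDEX.items()}


def lookup(name):
    """Step 1: ask the central index which machine has the file."""
    return INDEX.get(name)


def fetch(name):
    """Step 2: request the file directly from that machine."""
    machine = lookup(name)
    if machine is None:
        return None
    return machine, STORAGE[machine][name]
```

Calling `fetch("E")` first resolves E to m5 through the index and then reads the file from m5, mirroring the two "E?" messages in the figure.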
Routing on the Overlay
The underlying network induces a complete graph of connectivity
    No routing required!
But
    One virtual hop may be many underlying hops away
    Latency and cost vary significantly over the virtual links
    State information may grow with the number of virtual edges E, which is O(n^2)
[Figure: complete graph on the overlay nodes; the ten virtual links have costs 10, 100, 200, 100, 90, 100, 10, 10, 20, 90]
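The n^2 growth follows from counting the virtual links of a complete graph: n overlay nodes give n(n-1)/2 links. A short sketch with hypothetical helper names; the ten link costs in the slide's figure are consistent with five overlay nodes.

```python
from itertools import combinations


def virtual_links(nodes):
    """Every unordered pair of overlay nodes is one virtual link."""
    return list(combinations(nodes, 2))


def link_count(n):
    """Closed form for the number of virtual links among n nodes."""
    return n * (n - 1) // 2
```

With 5 nodes there are 10 virtual links; with 1000 nodes there are 499500, which is why per-link state quickly becomes expensive.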
Relating the virtual topology to the underlying network
[Figure: virtual topology on overlay nodes 1 to 5, shown next to the underlying network that connects the same nodes]
Message from 4 -> 1
    Virtual path: 4 -> 3 -> 2 -> 1
    Underlying path: 4 -> 1 -> 3 -> 1 -> 5 -> 2 -> 5 -> 1
Extreme inefficiencies possible
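The inefficiency can be quantified by expanding each virtual hop into the underlying hops it uses. The sketch below reads the slide's numbers as a virtual path 4, 3, 2, 1 carried over the underlying hops 4, 1, 3, 1, 5, 2, 5, 1; that reading is an interpretation, since the arrows were lost. The per-link underlay routes, including the direct route from 4 to 1, are assumptions chosen to reproduce that sequence.

```python
# Underlay route used by each virtual link (assumed from the slide's example).
UNDERLAY_ROUTE = {
    (4, 3): [4, 1, 3],
    (3, 2): [3, 1, 5, 2],
    (2, 1): [2, 5, 1],
    (4, 1): [4, 1],  # assumed direct underlay route, used as the baseline
}


def expand(virtual_path):
    """Concatenate the underlay routes of consecutive virtual hops."""
    full = [virtual_path[0]]
    for hop in zip(virtual_path, virtual_path[1:]):
        full.extend(UNDERLAY_ROUTE[hop][1:])  # skip the repeated start node
    return full


def stretch(virtual_path):
    """Underlay hops actually used, divided by hops on the direct underlay route."""
    direct = UNDERLAY_ROUTE[(virtual_path[0], virtual_path[-1])]
    return (len(expand(virtual_path)) - 1) / (len(direct) - 1)
```

Under these assumptions the overlay path crosses 7 underlying hops where 1 would do, and it visits nodes 1 and 5 twice each.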
Routing Issues
[Figure: overlay nodes 1 to 5]
The underlying network induces a complete graph of connectivity
    No routing required!
But
    One virtual hop may be many underlying hops away
    Latency and cost vary significantly over the virtual links
    State information may grow with the number of virtual edges E, which is O(n^2)
At any given time, the overlay network picks a connected sub-graph based on nearest neighbors
    How often the sub-graph is re-selected can vary
    Also, structured (Chord) vs. unstructured (Gnutella)
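One way to picture the sub-graph selection: each node keeps virtual links only to its k lowest-latency peers, and the overlay checks that the result is still connected. This is a sketch under assumptions, not the method of any particular overlay; the latency values and function names are hypothetical.

```python
def nearest_neighbor_edges(latency, k):
    """Keep, for every node, the virtual links to its k lowest-latency peers."""
    edges = set()
    for node, row in latency.items():
        peers = sorted((p for p in row if p != node), key=lambda p: row[p])
        for peer in peers[:k]:
            edges.add(frozenset((node, peer)))
    return edges


def is_connected(nodes, edges):
    """Depth-first search over the selected virtual links."""
    nodes = list(nodes)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        current = stack.pop()
        for edge in edges:
            if current in edge:
                (other,) = edge - {current}
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
    return len(seen) == len(nodes)


# Hypothetical symmetric latencies (ms) between four overlay nodes.
LATENCY = {
    1: {2: 10, 3: 100, 4: 90},
    2: {1: 10, 3: 20, 4: 100},
    3: {1: 100, 2: 20, 4: 10},
    4: {1: 90, 2: 100, 3: 10},
}
```

With k = 1 the nodes pair off as {1, 2} and {3, 4} and the overlay is partitioned; with k = 2 the four selected links form a connected ring. Nearest-neighbor selection alone therefore does not guarantee connectivity.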
Kinds of Overlay Networks
Three kinds of Overlays
    1. Only Hosts: Peer to Peer Networks (P2P)
        Example: Gnutella, Napster
    2. Only Gateway nodes: Infrastructure Overlays
        Content Distribution Networks (CDNs)
        Example: Akamai
    3. Host and Gateway Nodes: Virtual Private Networks
Overlay node structure
    Regular: Chord, Pastry
    Ad hoc: Gnutella
Functions
    Route Enhancement: Better QoS, Application Level Multicast
    Resource Discovery: P2P
Outline
Infrastructure Overlays
    Adding performance and route functionality
    Resource Discovery
P2P Overlays
    Resource Discovery in Gnutella
Example of an Infrastructure Overlay
    Application Level Multicast
Example of a P2P Overlay
    Content Addressable Networks
Conclusions
Infrastructure Overlays
Overlay Routing: Edge Mapping
[Figure: overlay nodes 1 to 5 and a user client, with labels "?" and "IP(5)"]
Overlay network users are not directly connected to the overlay nodes
    E.g. Akamai
    User must be redirected to a “close by” overlay node
    The edge-mapping, or redirection, function is hard since
        The number of potential users is enormous
        User clients are not under direct control
When overlay clients are directly connected the edge mapping function is obviated
    E.g. P2P: users/nodes colocated
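At its simplest, edge mapping picks the overlay node with the lowest measured round-trip time to the client. A minimal sketch, assuming the RTT measurements are already available; how a real system obtains them and performs the redirection is out of scope. The function name and the RTT values are hypothetical.

```python
def map_to_edge(client_rtt_ms, overloaded=frozenset()):
    """Return the overlay node with the lowest RTT to the client.

    client_rtt_ms maps overlay node id -> measured RTT in milliseconds.
    Nodes listed in `overloaded` are skipped.
    """
    candidates = {n: rtt for n, rtt in client_rtt_ms.items() if n not in overloaded}
    if not candidates:
        raise LookupError("no overlay node available")
    return min(candidates, key=candidates.get)


# Hypothetical RTTs from one client to overlay nodes 1 to 5.
RTT_MS = {1: 80, 2: 45, 3: 120, 4: 60, 5: 12}
```

Here the client is mapped to node 5; if node 5 is marked overloaded, the next closest node, 2, is chosen instead.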
Overlay Routing: Adding Function to the route
[Figure: overlay nodes 1 to 5]
Overlay nodes interconnect clients
Enhance nature of connection
    Multicast
    Secure
    Low Loss
Much easier to add functionality at overlay nodes than to integrate it into a router
Overlay nodes can become bottlenecks
Overlay Routing: Resource Location
[Figure: overlay nodes 1 to 5 with resource labels ABC, AD, BF, BDE and AC, DB, FC, DE; a client query “B?” is routed through the overlay and returns 4]
Overlay network may contain resources, e.g.
    Servers
    Files
Client makes request for resource
Overlay must “search” for “closest” node that has the resource
    E.g. find the least loaded server that has a piece of content and that has low network latency to the client
A single “index” is not scalable
Overlay launches a query to locate resource
Query is “routed” through the overlay until object is located
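The selection criterion in the example (least loaded server that has the content and has low latency to the client) can be sketched as a filter followed by a minimum. This shows only the criterion, evaluated with full knowledge of all servers; it does not model the distributed query that the overlay routes. The assignment of resource sets to nodes, the loads, the latencies, and the latency threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    node: int
    resources: frozenset
    load: float        # fraction of capacity in use, 0.0 to 1.0
    latency_ms: float  # measured latency to the requesting client


def locate(servers, resource, max_latency_ms):
    """Least loaded server that holds `resource` within the latency budget."""
    eligible = [
        s for s in servers
        if resource in s.resources and s.latency_ms <= max_latency_ms
    ]
    if not eligible:
        return None
    # Break load ties in favor of the lower-latency server.
    return min(eligible, key=lambda s: (s.load, s.latency_ms))


# Hypothetical placement of the resource sets that appear in the figure.
SERVERS = [
    Server(1, frozenset("ABC"), load=0.9, latency_ms=20),
    Server(2, frozenset("AD"), load=0.2, latency_ms=30),
    Server(3, frozenset("BF"), load=0.5, latency_ms=250),
    Server(4, frozenset("BDE"), load=0.3, latency_ms=40),
]
```

For resource B with a 100 ms budget, node 3 is excluded for latency and node 4 wins over node 1 on load.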
P2P Overlays
Overlay network users are directly connected to the overlay nodes
    E.g. Napster, Gnutella
    No edge mapping problem
    No gateways to maintain
But
    Nodes have limited resources
        storage, connectivity
        computational power
Gnutella
Distribute file location
Idea: multicast the request
How to find a file:
    Send request to all neighbors
    Neighbors recursively multicast the request
    Eventually a machine that has the file receives the request, and it sends back the answer
Advantages:
    Totally decentralized, highly robust
Disadvantages:
    Not scalable; the entire network can be swamped with requests (to alleviate this problem, each request has a TTL)
Gnutella: Example
Assume: m1’s neighbors are m2 and m3; m3’s neighbors are m4 and m5;…
[Figure: machines m1 to m6 and files A to F. The query "E?" goes from m1 to m2 and m3, and from m3 to m4 and m5; m5 returns E.]
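The flooding search with a TTL can be sketched as a breadth-first traversal that stops forwarding when the TTL runs out. This is a simplified model, not the Gnutella wire protocol: only the neighbor relations stated on the slide are included (the elided part of the topology is left out), the file placement follows the earlier Napster figure, and duplicate suppression is modeled with a single shared `seen` set.

```python
from collections import deque

# Neighbor relations stated on the slide, made symmetric; the rest is elided.
NEIGHBORS = {
    "m1": ["m2", "m3"],
    "m2": ["m1"],
    "m3": ["m1", "m4", "m5"],
    "m4": ["m3"],
    "m5": ["m3"],
}
# File placement assumed from the earlier index figure.
FILES = {"m1": {"A"}, "m2": {"B"}, "m3": {"C"}, "m4": {"D"}, "m5": {"E"}}


def flood_query(start, wanted, ttl):
    """Flood a query; return (machines holding the file, messages sent)."""
    seen = {start}
    frontier = deque([(start, ttl)])
    holders, messages = [], 0
    while frontier:
        node, remaining = frontier.popleft()
        if remaining == 0:
            continue  # TTL expired: do not forward any further
        for peer in NEIGHBORS.get(node, []):
            if peer in seen:
                continue  # this peer already received the query
            seen.add(peer)
            messages += 1
            if wanted in FILES.get(peer, set()):
                holders.append(peer)  # the holder answers instead of forwarding
            else:
                frontier.append((peer, remaining - 1))
    return holders, messages
```

With a TTL of 2, the query from m1 for E costs four messages and reaches m5; with a TTL of 1 it stops at m2 and m3 and finds nothing, which shows the trade-off the TTL introduces.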
Summary
Two kinds of overlay functions
    Overlay provides access to distributed resources
    Overlay facilitates communication among other client applications
Two kinds of virtual topologies
    Structured: mesh, ring etc.
    Unstructured
Two kinds of client connectivity
    Direct: P2P
    Not direct: Akamai
Overlay Network Functions
    Select Virtual Edges (fast or slow timescales)
    Overlay Routing Protocol
    Edge Mapping
    Resource Location
Example: Application Level Multicast
[Figure: content producers feed a media distribution network that delivers to media clients]
The Broadcast Internet
[Figure: two content producers]
Broadcast Overlay Architecture
Management Platform
    redirection management
        load balancing
        system availability
    network management
        monitoring & provisioning
    server management
    viewer management
        subscriptions, PPV, monitoring, Nielsen ratings, targeted advertising
    content management
        injection & real-time control
Media Delivery System
Redirection
Broadcast Management
Application-level information for management and tracking
Works across multiple networks
Content Producer event programming with ad-hoc query audience statistics
Broadcast Manager
Node Information
Stream Switchover
Policy Management
Example: Content Addressable P2P Networks (CAN)
CAN is one of several recent P2P architectures. It
    imposes a structure on the virtual topology
    uses a distributed hash-table data structure abstraction
        Note: item can be anything: a data object, document, file, pointer to a file…
    routes queries through the structured overlay
    attempts to distribute (object, location) pairs uniformly throughout the network
    supports lookup, insertion, and deletion of objects efficiently
Others: Chord, Pastry, Tapestry
Content Addressable Network (CAN)
Associate to each node and item a unique id in a d-dimensional space
Properties
    Routing table size O(d)
    Guarantee that a file is found in at most d*n^(1/d) steps, where n is the total number of nodes
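To get a feel for the d*n^(1/d) bound, it helps to evaluate it for a few dimensions. A small sketch; the function name and the sample values of n and d are illustrative only.

```python
def can_max_steps(n, d):
    """Upper bound d * n^(1/d) on lookup steps for n nodes in d dimensions."""
    return d * n ** (1.0 / d)
```

For n = 10000 nodes the bound is about 200 steps with d = 2 and about 40 steps with d = 4, while the routing table grows only linearly in d.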
CAN Example: Two Dimensional Space
Space divided between nodes
Together, the nodes cover the entire space
Each node covers either a square or a rectangular area of ratios 1:2 or 2:1
Example:
    Assume space size (8 x 8)
    Node n1:(1, 2), the first node that joins, covers the entire space
[Figure: 8 x 8 grid with axes 0 to 7; n1 covers the whole space]
CAN Example: Two Dimensional Space
Node n2:(4, 2) joins; the space is divided between n1 and n2
[Figure: 8 x 8 grid with axes 0 to 7, split between n1 and n2]
CAN Example: Two Dimensional Space
Node n3:(3, 5) joins; n1's zone is divided between n1 and n3
[Figure: 8 x 8 grid with axes 0 to 7, divided among n1, n2, and n3]
CAN Example: Two Dimensional Space
Nodes n4:(5, 5) and n5:(6, 6) join
[Figure: 8 x 8 grid with axes 0 to 7, divided among n1, n2, n3, n4, and n5]
CAN Example: Two Dimensional Space
Nodes: n1:(1, 2); n2:(4, 2); n3:(3, 5); n4:(5, 5); n5:(6, 6)
Items: f1:(2, 3); f2:(5, 1); f3:(2, 1); f4:(7, 5)
[Figure: 8 x 8 grid with axes 0 to 7 showing nodes n1 to n5 and items f1 to f4]
CAN Example: Two Dimensional Space
Each item is stored by the node that owns its mapping in the space
[Figure: 8 x 8 grid with axes 0 to 7 showing nodes n1 to n5 and items f1 to f4]
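Item placement reduces to finding the zone that contains the item's point. The zone boundaries below are an assumption: they are what repeated halving of the 8 x 8 space in the join order n1 to n5 would give, since the exact boundaries cannot be recovered from the figure. Zones are written as half-open ranges (x_lo, x_hi, y_lo, y_hi).

```python
# Assumed zones after n1..n5 have joined: (x_lo, x_hi, y_lo, y_hi), half-open.
ZONES = {
    "n1": (0, 4, 0, 4),
    "n2": (4, 8, 0, 4),
    "n3": (0, 4, 4, 8),
    "n4": (4, 6, 4, 8),
    "n5": (6, 8, 4, 8),
}
ITEMS = {"f1": (2, 3), "f2": (5, 1), "f3": (2, 1), "f4": (7, 5)}


def owner_of(point):
    """Return the node whose zone contains the point."""
    x, y = point
    for node, (x_lo, x_hi, y_lo, y_hi) in ZONES.items():
        if x_lo <= x < x_hi and y_lo <= y < y_hi:
            return node
    raise ValueError(f"{point} lies outside the coordinate space")


placement = {item: owner_of(point) for item, point in ITEMS.items()}
```

Under these assumed zones, f1 and f3 are stored at n1, f2 at n2, and f4 at n5.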
CAN: Query Example
Each node knows its neighbors in the d-space
Forward query to the neighbor that is closest to the query id
Example: assume n1 queries f4
[Figure: 8 x 8 grid with axes 0 to 7 showing nodes n1 to n5 and items f1 to f4; the query for f4 is forwarded from n1 toward the zone that holds f4]
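The forwarding rule can be sketched as greedy routing over zones. Several things here are assumptions: the zone boundaries (repeated halving of the 8 x 8 space in join order), the definition of neighbors as zones sharing a border segment, and the use of point-to-zone Euclidean distance as the "closest" metric.

```python
# Assumed zones after n1..n5 have joined: (x_lo, x_hi, y_lo, y_hi), half-open.
ZONES = {
    "n1": (0, 4, 0, 4),
    "n2": (4, 8, 0, 4),
    "n3": (0, 4, 4, 8),
    "n4": (4, 6, 4, 8),
    "n5": (6, 8, 4, 8),
}


def contains(zone, point):
    x_lo, x_hi, y_lo, y_hi = zone
    return x_lo <= point[0] < x_hi and y_lo <= point[1] < y_hi


def are_neighbors(a, b):
    """Zones are neighbors if they share a border segment, not just a corner."""
    overlap_x = min(a[1], b[1]) - max(a[0], b[0])
    overlap_y = min(a[3], b[3]) - max(a[2], b[2])
    return (overlap_x == 0 and overlap_y > 0) or (overlap_y == 0 and overlap_x > 0)


def distance_to_zone(zone, point):
    """Euclidean distance from the point to the nearest point of the zone."""
    x_lo, x_hi, y_lo, y_hi = zone
    dx = max(x_lo - point[0], 0, point[0] - x_hi)
    dy = max(y_lo - point[1], 0, point[1] - y_hi)
    return (dx * dx + dy * dy) ** 0.5


def route(start, target, max_hops=16):
    """Forward greedily to the neighbor closest to the target point."""
    path = [start]
    while not contains(ZONES[path[-1]], target):
        if len(path) > max_hops:
            raise RuntimeError("query did not converge")
        here = ZONES[path[-1]]
        neighbors = [n for n, z in ZONES.items() if are_neighbors(here, z)]
        path.append(min(neighbors, key=lambda n: distance_to_zone(ZONES[n], target)))
    return path
```

With these assumptions, `route("n1", (7, 5))` forwards the query for f4 from n1 to n2 and then to n5, whose zone contains the point (7, 5).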
Adding/Deleting nodes
New node picks a point P at random
Assuming it can find any overlay node, it sends a join message to the node which owns that point
When the message has reached the owner of P, that node divides its zone in half along one of the dimensions (first x, then y, etc.)
(Object, location) pairs are transferred and neighbor sets updated
Similar reasoning handles departures and failures
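The join step can be sketched for the two-dimensional example. Assumptions are labeled in the code: the split dimension is chosen as "the longer side, x on ties", which alternates x and y as the slide describes when starting from a square; the joining node takes the half that contains its point; and the transfer of stored pairs and the neighbor updates are omitted. Zones are half-open ranges (x_lo, x_hi, y_lo, y_hi).

```python
SIZE = 8  # side of the coordinate space used in the example


def owner_of(zones, point):
    """Return the node whose zone contains the point."""
    for node, (x_lo, x_hi, y_lo, y_hi) in zones.items():
        if x_lo <= point[0] < x_hi and y_lo <= point[1] < y_hi:
            return node
    raise ValueError(f"{point} lies outside the coordinate space")


def join(zones, new_node, point):
    """Split the zone owning `point`; the new node gets the half holding it."""
    if not zones:
        zones[new_node] = (0, SIZE, 0, SIZE)  # first node covers everything
        return
    owner = owner_of(zones, point)
    x_lo, x_hi, y_lo, y_hi = zones[owner]
    if (x_hi - x_lo) >= (y_hi - y_lo):
        # Square or wide zone: split along x (assumed rule).
        mid = (x_lo + x_hi) / 2
        low, high = (x_lo, mid, y_lo, y_hi), (mid, x_hi, y_lo, y_hi)
        new_gets_high = point[0] >= mid
    else:
        # Tall zone: split along y.
        mid = (y_lo + y_hi) / 2
        low, high = (x_lo, x_hi, y_lo, mid), (x_lo, x_hi, mid, y_hi)
        new_gets_high = point[1] >= mid
    zones[new_node], zones[owner] = (high, low) if new_gets_high else (low, high)


zones = {}
for name, point in [("n1", (1, 2)), ("n2", (4, 2)), ("n3", (3, 5)),
                    ("n4", (5, 5)), ("n5", (6, 6))]:
    join(zones, name, point)
```

Replaying the five joins of the example yields zones that are all squares or 1:2 rectangles, consistent with the shape property stated earlier.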
Relating Virtual Topology to the Underlying Network
Neighbors should be close to each other in terms of latency on the underlying network
Pick a set of well known landmark hosts
Each node distributively computes its “bin”
    Nodes in the same bin are “close” to each other
    Each node orders the landmark set in increasing order of RTT from it
    Latency is partitioned into levels
    Thus, associated with each landmark, at each node is a rank and a level
    These values identify the bin
Example: Three landmarks
    0-30ms: level 0
    31-100ms: level 1
    101-300ms: level 2
    Node j measures latencies of 10ms, 110ms, 40ms to the three landmarks. The bin of node j is (l1,l3,l2 : 021)
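The binning example can be reproduced in a few lines. One reading is assumed here: the level digits are listed in landmark order l1, l2, l3 (not in RTT order), because that reading reproduces the 021 shown on the slide. The function names are hypothetical.

```python
# Upper bound of each level in ms: level 0 is 0-30, level 1 is 31-100,
# level 2 is 101-300.
LEVEL_UPPER_MS = [30, 100, 300]


def level(rtt_ms):
    """Map a measured RTT to its latency level."""
    for lvl, upper in enumerate(LEVEL_UPPER_MS):
        if rtt_ms <= upper:
            return lvl
    raise ValueError("RTT above the highest level boundary")


def compute_bin(rtts_ms):
    """rtts_ms maps landmark name -> RTT; returns (RTT ordering, level digits)."""
    ordering = tuple(sorted(rtts_ms, key=rtts_ms.get))  # rank by increasing RTT
    levels = "".join(str(level(rtts_ms[lm])) for lm in sorted(rtts_ms))
    return ordering, levels


node_j = compute_bin({"l1": 10, "l2": 110, "l3": 40})
```

For node j this gives the ordering (l1, l3, l2) and the level digits 021. Two nodes land in the same bin only if both the ordering and the levels match.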
Your standard Networking Functions…
Addressing: Uniquely identify the nodes
    host IP address, group address, attributes
    set is dynamic!
Topology Update: Characterize and maintain connectivity
    Discover topology
    Measure “distance” metric(s)
    Dynamically provision (on slower timescale)
Destination Discovery: Find node identifiers of the destination set
Route Computation: Pick the tree (path)
    Kind of path: Multicast, Unicast
    Global or Distributed Algorithm
    Policy
    Hierarchy
Switching: Forward the packets at each node
And Their Overlay Analogs
Structured Topology
Add/Insert Nodes, Binning
Resource Location, Edge Mapping
Application Level Routing, e.g. streaming broadcast
Structured Topology
Conclusions
Overlays are an irreversible trend in networking
Overlays add new functions to the network infrastructure much faster than
    trying to integrate them in the router
    relying on an infrastructure service provider to deploy the function
Disadvantages
    Overlay nodes can create performance bottlenecks
    New end-to-end protocols may not work since the overlay nodes don’t understand them
Generally better to improve performance by building an “underlay” and add functionality by building an overlay