Routing and Network Design: Algorithmic Issues
Kamesh Munagala
Duke University
Graph Model for the Links
Model sensor nodes as vertices in a graph
Gateway
d(7,8) = “Length” of link
“Length” of link models communication cost per bit
“Length” should be a function of #bits being sent (Why?)
Specialized Nature
“Geometric Random” graph:
• Nodes on a 2D plane
• Each node has a fixed communication radius
Correlation Structures:
• Spatial Gaussian models
• Simple AR(1) temporal models
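A minimal sketch of the “geometric random” graph model described above. The unit square, the uniform placement, the use of Euclidean distance as the link “length”, and the name `geometric_random_graph` are all assumptions made for illustration.

```python
import math
import random

def geometric_random_graph(n, radius, seed=0):
    """Place n nodes uniformly at random in the unit square and
    link every pair of nodes that lie within `radius` of each other."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n)]
    edges = {}
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            if d <= radius:
                edges[(i, j)] = d  # assumption: distance stands in for link "length"
    return points, edges

points, edges = geometric_random_graph(n=50, radius=0.25)
print(len(points), "nodes,", len(edges), "links")
```

A larger radius gives a denser graph; a real deployment would replace the distance with a measured per-bit communication cost.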
Assumptions do not always hold!
Unique Features
Distributed algorithms:
• Reconfigure routes around failures
• Learning network topology
• Learning correlation structures
• Query processing
Light-weight implementations:
• Low compute power and memory
• Limited communication and battery life
• Noisy sensing and transmission
Goals in this Lecture
General algorithmic ideas capturing:
• Simplicity and efficiency
• Some performance guarantees
• Distributed implementations
• Low reliance on specific assumptions
Caveats:
• Ideas need to be tailored to context
• Specialized algorithms might work better
Topics
What constitutes good routing?
• Measures of quality
Algorithm design framework
• Basic problem statements
• Spanning, shortest path, and Steiner trees
• Aggregation networks
• Location and subset selection problems
• Solution techniques
• Types of guarantees on solution quality
Models of information in a sensor network
• Tailoring generic algorithms to specific models
Problem 1: Information Aggregation
Routing Tree
Problem Statement:
Route information from nodes to gateway
Choose subset of edges to route data
Edges “connect” all nodes to gateway
• Tree Property
Minimize: Long-term average “cost” of routing
Answer will depend on:
• What constitutes “cost”
• Correlations in data being collected
Toy Example
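[Figure: four sensor nodes in a chain joined by links of cost 1; each node also has a direct link of cost 6 to the gateway.]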
Each node has 100 bits of information to send to gateway
Value on link (edge) is the cost of transmitting one bit
How should we route the bits?
“Star” network
Depends on Correlations
[Figure: the same toy network.]
Suppose information is perfectly correlated
Information from all sources together is also 100 bits!
Spanning tree is optimal
Cost = 100 * (6+1+1+1) = 900 units
Ignore cost of compression
Other Extreme: No Correlation
[Figure: the same toy network.]
Suppose information is not correlated at all
Information from all sources together is now 400 bits
Shortest path tree is optimal
Cost = 100 * (6+6+6+6) = 2400 units
Had we used a Spanning Tree
[Figure: the same toy network.]
Suppose information is not correlated at all
Information from all sources together is now 400 bits
Shortest path tree is optimal
Cost = 100 * (6+7+8+9) = 3000 units > 2400 units!
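The arithmetic on these slides can be checked with a short sketch. It assumes the toy network is a chain of four nodes with cost-1 links, each node also having a cost-6 link to the gateway; the dictionaries `star` and `chain` and both function names are illustrative, and the cost of compression is ignored as on the slide.

```python
BITS = 100          # bits held by each node
GATEWAY = "G"

# tree[v] = (next hop toward the gateway, per-bit cost of that link)
star = {1: (GATEWAY, 6), 2: (GATEWAY, 6), 3: (GATEWAY, 6), 4: (GATEWAY, 6)}
chain = {1: (GATEWAY, 6), 2: (1, 1), 3: (2, 1), 4: (3, 1)}

def cost_uncorrelated(tree):
    """No correlation: every node's 100 bits travel its whole path to the gateway."""
    total = 0
    for v in tree:
        hop = v
        while hop != GATEWAY:
            hop, per_bit = tree[hop]
            total += BITS * per_bit
    return total

def cost_perfectly_correlated(tree):
    """Perfect correlation: aggregated data stays at 100 bits on every tree link."""
    return BITS * sum(per_bit for _, per_bit in tree.values())

print(cost_perfectly_correlated(chain))  # 900
print(cost_perfectly_correlated(star))   # 2400
print(cost_uncorrelated(star))           # 2400
print(cost_uncorrelated(chain))          # 3000
```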
In summary…
Moral of the story:
• Choosing good routes is important
• Choice depends on correlation structure
Issues to address:
• How do we specify correlations
• Simple yet faithful specifications desirable
• Algorithms for finding (near-)optimal routes
• Efficient and simple to implement
• Reliability and “backup” routes
There could be as many as $n^{n-2}$ spanning trees in general
Exhaustive enumeration is out of the question
Minimum Spanning Tree
Cost of MST = 23
[Figure: example weighted graph, shown twice, with the minimum spanning tree highlighted.]
Spanning Tree Algorithm
“Greedy” schemes add edges one at a time in clever fashion
No backtracking
Kruskal's algorithm: Consider edges in ascending order of cost. Insert an edge unless doing so would create a cycle.
Prim's algorithm: Start with gateway and greedily grow a tree from the gateway outward. At each step, add the cheapest edge that has exactly one endpoint in current tree.
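Prim's rule as stated above can be sketched with a priority queue. The adjacency-dictionary format, the function name `prim_mst`, and the small example graph are assumptions for illustration; the graph is not the one from the slide figure.

```python
import heapq

def prim_mst(adj, gateway):
    """adj: {node: {neighbor: cost}} for an undirected graph.
    Grows a tree outward from the gateway; returns (tree_edges, total_cost)."""
    in_tree = {gateway}
    frontier = [(cost, gateway, v) for v, cost in adj[gateway].items()]
    heapq.heapify(frontier)
    tree_edges, total = [], 0
    while frontier and len(in_tree) < len(adj):
        cost, u, v = heapq.heappop(frontier)   # cheapest candidate edge
        if v in in_tree:
            continue                           # both endpoints already in the tree
        in_tree.add(v)
        tree_edges.append((u, v, cost))
        total += cost
        for w, c in adj[v].items():
            if w not in in_tree:
                heapq.heappush(frontier, (c, v, w))
    return tree_edges, total

# Hypothetical example graph.
adj = {
    "G": {"a": 10, "b": 5},
    "a": {"G": 10, "b": 7, "c": 1},
    "b": {"G": 5, "a": 7, "c": 20},
    "c": {"a": 1, "b": 20},
}
edges, total = prim_mst(adj, "G")
print(edges, total)  # total is 13 for this graph
```

Kruskal's algorithm would reach the same total cost by sorting all edges first; Prim's version is shown because it grows from the gateway, which matches the routing-tree setting.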
Prim’s Algorithm: Execution
[Figure: four snapshots of Prim's algorithm growing the tree on the example graph.]
“Distributed” Algorithm?
Nodes connect in arbitrary order
Each node simply connects to “closest” existing neighbor
[Figure: four snapshots of nodes joining the tree on the example graph, one at a time.]
Cost = 25
Guarantee on “Online” Scheme
n nodes in graph
Cost of “online” tree is within log n factor of cost of MST
Irrespective of order in which nodes join the system!
Intuition: In “star” network, “online” scheme produces MST!
Natural implementation: Greedy starting from gateway
Such a guarantee is called an “approximation guarantee”
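The “online” scheme can be sketched as follows, assuming any arriving node can link directly to any node already in the tree. The function name `online_tree` and the four-point line example are illustrative only.

```python
def online_tree(dist, gateway, arrival_order):
    """dist(u, v) returns the link length between two nodes.
    Each arriving node connects to the closest node already in the tree."""
    connected = [gateway]
    edges, total = [], 0
    for v in arrival_order:
        u = min(connected, key=lambda w: dist(w, v))
        edges.append((u, v))
        total += dist(u, v)
        connected.append(v)
    return edges, total

# Hypothetical example: nodes on a line, link length = distance along the line.
position = {"G": 0, "a": 1, "b": 2, "c": 3}
dist = lambda u, v: abs(position[u] - position[v])

print(online_tree(dist, "G", ["a", "b", "c"])[1])  # 3, same as the MST here
print(online_tree(dist, "G", ["c", "a", "b"])[1])  # 5, a worse arrival order
```

The two calls illustrate that the cost depends on arrival order, which is why the guarantee is stated as a factor relative to the MST rather than as equality.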
Shortest Paths: OSPF
Key algorithmic idea: Greedy local updates
Each node v maintains “tentative” distance d(v) to gateway
Initially, all these distances are infinity (the gateway's own distance is 0)
Each node v does a greedy check:
If for some neighbor u, d(v) > d(u) + Length(u,v), then:
Route v through u
Set d(v) = d(u) + Length(u,v)
Run this till it stabilizes
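The greedy check above is the relaxation step of the Bellman-Ford method. A minimal simulation, assuming synchronous rounds in which every node reads its neighbors' values from the previous round; the function name `relax_until_stable` and the example graph are illustrative.

```python
import math

def relax_until_stable(adj, gateway):
    """adj: {node: {neighbor: length}}. Returns (d, next_hop, rounds)."""
    d = {v: math.inf for v in adj}
    d[gateway] = 0
    next_hop = {}
    rounds = 0
    changed = True
    while changed:
        changed = False
        snapshot = dict(d)  # assumption: nodes see last round's distances
        for v in adj:
            for u, length in adj[v].items():
                if d[v] > snapshot[u] + length:
                    d[v] = snapshot[u] + length  # route v through u
                    next_hop[v] = u
                    changed = True
        if changed:
            rounds += 1
    return d, next_hop, rounds

# Hypothetical example graph.
adj = {
    "G": {"a": 10, "b": 5},
    "a": {"G": 10, "b": 7, "c": 1},
    "b": {"G": 5, "a": 7, "c": 20},
    "c": {"a": 1, "b": 20},
}
d, next_hop, rounds = relax_until_stable(adj, "G")
print(d, next_hop, rounds)
```

In a real network the updates would be asynchronous and triggered by messages from neighbors; the synchronous loop is only a convenient way to count rounds.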
OSPF Execution
[Figure: successive rounds on an example graph; tentative distances start at ∞ (0 at the gateway) and shrink until they stabilize.]
Rate of Convergence
n nodes in graph
1. The protocol converges to the shortest path tree
2. The number of rounds till convergence is at most roughly n, since a shortest path uses at most n - 1 links
Intermediate Correlations
One tree for all correlation values?
Both spanning and shortest path trees at once?
Do-able if we settle for “nearly” optimal trees
In other words, there exists a tree with:
Cost at most thrice cost of MST
Distances to gateway at most twice S.P. distances
Example: MST
[Figure: $n^2$ nodes in a chain joined by links of cost 1; each node also has a direct link of cost $n$ to the gateway.]
Cost of MST = $n^2 + n$
Path length = $n^2 + n$
Example: Shortest Path Tree
[Figure: the same network of $n^2$ nodes, with every node using its direct link of cost $n$ to the gateway.]
Cost = $n^3$
Path Length = $n$
Example: Balanced Tree
[Figure: the same network, keeping one direct link to the gateway for every $n$ nodes of the chain.]
Cost = $2n^2$
Path Length = $2n$
Walk on a Tree
[Figure: a walk around the spanning tree that starts and ends at the gateway.]
Balancing Algorithm
Walk along Spanning Tree
Add shortcuts to gateway
At node v:
Suppose previous shortcut at u
If SP(u) + Walk(u,v) > 2 SP(v)
Add “shortcut” from v
[Figure labels: Gateway; “Walk too long!”; “Shortcut”]
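A minimal sketch of the shortcut rule, under stated assumptions: the spanning tree and the shortest-path distances SP are given as inputs, the walk is a depth-first traversal from the gateway, and the check is applied the first time the walk reaches each node. The name `balance` and the example (the chain network with n = 3) are illustrative.

```python
def balance(tree_adj, sp, gateway):
    """tree_adj: {node: {neighbor: cost}} for the spanning tree.
    sp: {node: shortest-path distance to the gateway}.
    Returns the nodes at which a shortcut (shortest path to gateway) is added."""
    shortcuts = []
    state = {"sp_u": 0, "walk": 0}  # SP of last shortcut node u, walk length since u
    visited = {gateway}

    def visit(v):
        if state["sp_u"] + state["walk"] > 2 * sp[v]:  # walk too long
            shortcuts.append(v)
            state["sp_u"], state["walk"] = sp[v], 0
        for w, cost in tree_adj[v].items():
            if w not in visited:
                visited.add(w)
                state["walk"] += cost  # walk down the tree edge
                visit(w)
                state["walk"] += cost  # walk back up the same edge

    visit(gateway)
    return shortcuts

# Chain of n*n nodes with unit links; node 1 hangs off the gateway by a cost-n
# link, and every node's shortest path to the gateway is its direct cost-n link.
n = 3
nodes = list(range(1, n * n + 1))
tree_adj = {"G": {1: n}}
for v in nodes:
    tree_adj[v] = {}
tree_adj[1]["G"] = n
for a, b in zip(nodes, nodes[1:]):
    tree_adj[a][b] = 1
    tree_adj[b][a] = 1
sp = {"G": 0, **{v: n for v in nodes}}

print(balance(tree_adj, sp, "G"))  # [5, 9]
```

The final tree would be a shortest-path tree of the graph formed by the spanning tree plus the added shortcut paths; that last step is omitted here to keep the sketch short.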
Example Revisited
[Figure: the same network; the walk covers about $n$ nodes between consecutive shortcuts.]
Walk length = $2n$
Proof Idea
Final Path Lengths ≤ 2 S.P. Lengths
• Follows from description
Final Cost < 3 MST Cost
• Final Cost = MST + Shortest Paths Added
• Suppose paths are added at …,u,v… on walk
• SP(u) + Walk(u,v) > 2 SP(v)
• Add these up:
• Total Walk Length > Total Length of added Paths
• But, Total Walk Length = 2 MST Cost
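The “add these up” step can be written out. The indexing is an assumption made for the derivation: shortcuts are added at $v_1, \dots, v_k$ in walk order, and $v_0$ is the gateway, so $SP(v_0) = 0$.

```latex
\begin{align*}
\text{Rule at each shortcut: }\quad
  & \mathrm{Walk}(v_{i-1}, v_i) > 2\,SP(v_i) - SP(v_{i-1}) \\
\text{Summing over } i = 1,\dots,k: \quad
  & \sum_{i=1}^{k} \mathrm{Walk}(v_{i-1}, v_i)
    > \sum_{i=1}^{k} SP(v_i) + SP(v_k) - SP(v_0)
    \ge \sum_{i=1}^{k} SP(v_i) \\
\text{Walk segments are disjoint: }\quad
  & \sum_{i=1}^{k} \mathrm{Walk}(v_{i-1}, v_i)
    \le \text{Total Walk Length} = 2 \cdot \mathrm{MST} \\
\text{Hence: }\quad
  & \text{Final Cost} \le \mathrm{MST} + \sum_{i=1}^{k} SP(v_i)
    < \mathrm{MST} + 2 \cdot \mathrm{MST} = 3 \cdot \mathrm{MST}
\end{align*}
```

The middle sum telescopes: each $SP(v_i)$ appears twice with a plus sign and once with a minus sign, except at the two ends.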
Problem 2: Sensor Location
“Most Informative” Placement
Close by locations are not very “informative”
Abstraction
Parameters:
• Each node v has communication cost to gateway = $c_v$
• Depends on location
• Subset S of nodes has “information” f(S)
• Information is a property of a set of nodes
• Depends on whether “close by” nodes also in set
Problem Statement:
• Choose set S so that:
• Sum of costs of nodes in S is at most C
• Maximize Information = f(S)
Algorithmic Issue
Number of subsets of n locations = $2^n$
• Inefficient to enumerate over them
Given subset S, how do we compute f(S)?
• Needs a correlation model among locations
Communication costs are not additive
• Also depend on location of nodes!
Information Functions
f(S) = Entropy of S
Correlations are multidimensional Gaussian:
$\Sigma$ = Covariance matrix between locations
Entropy = $\tfrac{1}{2} \log \det(\Sigma)$ + constant
Covariance(j,k) $\propto \exp(-\mathrm{dist}(j,k)^2 / h^2)$
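A minimal sketch of this information function in plain Python. It assumes unit variances (so the proportionality becomes an equality), uses the standard Gaussian entropy formula ½ log det(2πe Σ), and computes log det by Cholesky factorization; the function names are illustrative.

```python
import math

def covariance(points, h):
    """Cov(j,k) = exp(-dist(j,k)^2 / h^2), assuming unit variance at each location."""
    return [[math.exp(-math.dist(p, q) ** 2 / h ** 2) for q in points] for p in points]

def log_det(matrix):
    """log det of a symmetric positive-definite matrix via Cholesky factorization."""
    n = len(matrix)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return 2.0 * sum(math.log(L[i][i]) for i in range(n))

def gaussian_entropy(points, h):
    """Differential entropy (in nats) of the Gaussian over the given locations."""
    k = len(points)
    return 0.5 * (k * math.log(2 * math.pi * math.e) + log_det(covariance(points, h)))

near = [(0.0, 0.0), (0.1, 0.0)]   # two close-by locations
far = [(0.0, 0.0), (5.0, 0.0)]    # two well-separated locations
print(gaussian_entropy(near, h=1.0) < gaussian_entropy(far, h=1.0))  # True
```

The comparison shows the effect the slides describe: close-by locations are strongly correlated and therefore jointly less informative.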
Properties of f(S)
[Figure: a set A contained in a larger set B, and a new location v]
Location v is more informative w.r.t. A than w.r.t. B
Property 1: f(A+v) ≥ f(A)
Property 2: f(A+v) - f(A) ≥ f(B+v) - f(B), whenever A is a subset of B
Greedy Algorithm
Start with S = ∅
Repeat while some node can still be added without the cost of S exceeding C:
• Choose v such that:
• ( f(S+v) - f(S) ) / $c_v$ is maximized
• “Information gain per unit cost”
• Add v to S
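The greedy rule can be sketched for any set function f passed in as a callable. In this sketch the loop only considers nodes that still fit within the budget C, and the example f is a hypothetical coverage function (number of distinct regions observed) standing in for entropy; it satisfies Properties 1 and 2. All names are illustrative.

```python
def greedy_select(candidates, cost, f, budget):
    """Repeatedly add the affordable node with the best information gain per unit cost."""
    chosen, spent = set(), 0
    while True:
        best, best_ratio = None, 0.0
        for v in candidates:
            if v in chosen or spent + cost[v] > budget:
                continue  # already picked, or would exceed the budget C
            ratio = (f(chosen | {v}) - f(chosen)) / cost[v]
            if ratio > best_ratio:
                best, best_ratio = v, ratio
        if best is None:
            return chosen  # nothing affordable adds information
        chosen.add(best)
        spent += cost[best]

# Hypothetical example: each node observes a set of regions.
covers = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1}}
cost = {"a": 1, "b": 1, "c": 2, "d": 1}

def f(S):
    """Information of S = number of distinct regions observed by nodes in S."""
    return len(set().union(*(covers[v] for v in S)))

print(sorted(greedy_select(["a", "b", "c", "d"], cost, f, budget=3)))  # ['a', 'c']
```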
Analysis
Suppose:
All costs $c_v$ = 1
O = Best Information set of size at most C
At any stage, suppose adding v is the best greedy decision
Then adding the entire set O cannot give more information per unit cost!
f(S + v) - f(S) ≥ ( f(S + O) - f(S) )/C ≥ ( f(O) - f(S) )/C
Let d(S) = f(O) - f(S) = Deficit w.r.t. Optimal Solution
Implies: d(S) - d(S+v) ≥ d(S) / C
Analysis
d(S+v) ≤ d(S) (1 - 1/C)
d(Initial) = f(O)
d(Final) = f(O) - f(Final)
f(O) - f(Final) = d(Final) ≤ d(Initial) $(1 - 1/C)^C$ ≤ f(O) / 2
Implies: f(Final) ≥ f(O) / 2
Greedy set has information at least 1/2 information in optimal set
Two-Level Routing
Aggregation Hub
Clustering
Optimal placement of cluster-heads
Minimize routing cost
K-Means Algorithm
Start with k arbitrary leaders
Repeat Steps 1 and 2 till convergence:
Step 1:
• Assign each node to closest leader
• Yields k “clusters” of nodes
Step 2:
• For each cluster, choose “best” leader
• Minimizes total routing cost within cluster
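The two steps can be sketched as follows, assuming leaders must be chosen among the nodes themselves, that node positions are distinct, and that routing cost within a cluster is the sum of Euclidean distances to the leader. The function names and the six-point example are illustrative.

```python
import math

def assign(points, leaders):
    """Step 1: assign each node (by index) to its closest leader."""
    clusters = {l: [] for l in leaders}
    for i, p in enumerate(points):
        nearest = min(leaders, key=lambda l: math.dist(p, points[l]))
        clusters[nearest].append(i)
    return clusters

def best_leader(points, members):
    """Step 2: the member minimizing total distance to the rest of its cluster."""
    return min(members, key=lambda c: sum(math.dist(points[c], points[m]) for m in members))

def cluster(points, leaders, max_iters=100):
    """Alternate Steps 1 and 2 until the leaders stop changing."""
    leaders = list(leaders)
    for _ in range(max_iters):
        clusters = assign(points, leaders)
        new_leaders = [best_leader(points, members) for members in clusters.values()]
        if new_leaders == leaders:
            break
        leaders = new_leaders
    return leaders, assign(points, leaders)

# Hypothetical example: two groups of three nodes on a line, poor initial leaders.
points = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
leaders, clusters = cluster(points, leaders=[0, 1])
print(leaders, clusters)  # [1, 4] {1: [0, 1, 2], 4: [3, 4, 5]}
```

Starting from two leaders in the same group, the iteration still separates the two groups here, but other starting points can end in a worse local optimum.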
Analysis
Convergence is guaranteed:
Each step reduces total distance
• Step 1: Each node travels smaller distance
• Step 2: Each cluster’s routing cost reduces
Rate of convergence:
• Fast in practice
Quality of solution:
• “Local” optimum depending on initial k nodes
• Need not be best possible solution
• Works very well in practice