Date post: 20-Dec-2015
CS728 Lecture 17 Web Indexes III
Transcript
Page 1: CS728 Lecture 17 Web Indexes III. Last Time Showed how build indexes for graph connectivity Based on 2-hop covers Today Look at more general problem of.

CS728

Lecture 17

Web Indexes III


• Last Time
  – Showed how to build indexes for graph connectivity
  – Based on 2-hop covers

• Today
  – Look at the more general problem of compact encodings for graphs and network problems

• Applications
  – Fast queries for path information
  – Routing and routing table construction
  – Topology control
  – Spanning trees
  – Dominating sets and clustering
  – Hierarchical clustering


Main Problem Considered

• Arbitrary topology
• Goal: small routing tables to find a path to the destination
• Related problem: finding the closest item of a certain type

Routing: how do I get there from here?

[Figure: a network with a source node and a destination node]


Definitions:

Spanner: a subgraph in which the distance between any two nodes is close to their distance in the original graph

We will see that radio networks need energy-spanners, i.e., subgraphs that contain energy-efficient paths


Spanning Trees:

• Minimum connected subgraph
• Useful for routing
• Single point of failure
• Non-minimal routes
• Many variants

K-Dominating Sets:

• Set of nodes that are within K hops of every node
• Used to define a partition of the network into zones

[Figure: a 1-dominating set]


Graph Clustering:

• K-center problem: find k nodes that minimize the maximum distance from any node to its nearest chosen node (flat clustering)

Hierarchical Clustering

• Tree clustering with internal and border nodes and edges
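The slides state the k-center problem without giving an algorithm. As one illustration, here is a minimal sketch of the farthest-first greedy heuristic, a standard 2-approximation for metric k-center. It is not taken from the lecture. The names `bfs_dist` and `greedy_k_center` are hypothetical, distances are hop counts, and the graph is assumed to be connected.

```python
from collections import deque

def bfs_dist(adj, src):
    """Hop distance from src to every reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def greedy_k_center(adj, k, first):
    """Farthest-first heuristic: repeatedly add the node farthest from the current centers.

    Assumes a connected graph. Returns (centers, radius), where radius is the
    maximum distance from any node to its nearest center.
    """
    centers = [first]
    nearest = bfs_dist(adj, first)  # distance to the nearest chosen center
    while len(centers) < k:
        far = max(nearest, key=nearest.get)
        centers.append(far)
        for v, d in bfs_dist(adj, far).items():
            nearest[v] = min(nearest[v], d)
    return centers, max(nearest.values())

# A path 0-1-2-3-4-5-6 as a small example.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 6] for i in range(7)}
centers, radius = greedy_k_center(path, 3, first=0)
```

On this path the sketch picks the two ends and then the middle node, so every node is within one hop of a center.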


Hierarchical Clustering

• The hierarchy imposes a natural addressing scheme

• Each node is labeled with its path in the hierarchy tree

• Problem: give a compact labeling for a tree
  – Clearly need log n bits to identify some nodes
  – Need to add information about tree structure
  – Complete binary tree
  – Other n-node trees


• Interval labeling scheme
  – Label the leaves of the tree uniquely: log n bits
  – Label each internal node with the range of its descendants: 2 log n bits
  – Given two nodes x, y and their labels:
    • Can you test if x is an ancestor of y?
    • Can you describe the path from x to y?
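One way to see why interval labels answer the ancestor question is the sketch below. It numbers the leaves left to right and gives each internal node the range of leaf numbers beneath it, so x is an ancestor of y when x's range contains y's. The function names and the `children` dictionary format are assumptions made for illustration. The test is "ancestor or self", and it is only unambiguous if every internal node has at least two children.

```python
def interval_labels(children, root):
    """Return {node: (lo, hi)}; a leaf gets (i, i), an internal node the span of its leaves."""
    labels = {}
    counter = 0

    def visit(v):
        nonlocal counter
        kids = children.get(v, [])
        if not kids:
            labels[v] = (counter, counter)
            counter += 1
        else:
            spans = [visit(c) for c in kids]
            labels[v] = (spans[0][0], spans[-1][1])
        return labels[v]

    visit(root)
    return labels

def is_ancestor(labels, x, y):
    """True if x's interval contains y's interval (x is an ancestor of y, or x == y)."""
    (x_lo, x_hi), (y_lo, y_hi) = labels[x], labels[y]
    return x_lo <= y_lo and y_hi <= x_hi

# Hypothetical example tree: r has children a and b; a has leaves a1 and a2.
tree = {"r": ["a", "b"], "a": ["a1", "a2"]}
labels = interval_labels(tree, "r")
```

With these labels the test needs only two comparisons on numbers of log n bits each, which matches the 2 log n label size on the slide.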


• Greedy Dewey Labeling scheme

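• Label each edge with a small unique string

• A node's label is the concatenation of the edge labels on its path from the root

[Figure: node v0 with children v1, v2, v3, v4; edge labels shown include 00, 01, 10. Out-degree 4 requires edge labels of maximum length 2.]

[Figure: node v0 with children v1 … v600; edge labels shown include 0 and 1101101110. Out-degree 600 requires edge labels of maximum length 10.]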
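The following is a simplified sketch of Dewey-style labeling, not the greedy scheme itself. It assumes every edge out of a node gets a fixed-width binary label of ceil(log2(out-degree)) bits, whereas the figures suggest the greedy scheme uses variable-length labels. The function names and the `children` format are hypothetical. With fixed widths per parent, x is an ancestor of y exactly when x's label is a proper prefix of y's.

```python
def dewey_labels(children, root):
    """Label each node with the concatenated edge labels on its root path.

    Simplification: all edges leaving a node share one width,
    ceil(log2(out-degree)) bits (at least 1).
    """
    labels = {root: ""}
    stack = [root]
    while stack:
        v = stack.pop()
        kids = children.get(v, [])
        width = max(1, (len(kids) - 1).bit_length())
        for i, c in enumerate(kids):
            labels[c] = labels[v] + format(i, "0{}b".format(width))
            stack.append(c)
    return labels

def is_proper_ancestor(labels, x, y):
    """Prefix test on the concatenated labels."""
    return x != y and labels[y].startswith(labels[x])

small = dewey_labels({"v0": ["v1", "v2", "v3", "v4"], "v2": ["a", "b"]}, "v0")
wide = dewey_labels({"v0": ["v%d" % i for i in range(1, 601)]}, "v0")
```

In this sketch out-degree 4 gives 2-bit edge labels and out-degree 600 gives 10-bit edge labels, which agrees with the maximum lengths quoted on the slide.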


Theorem: Upper bound on GDL label length with unary delimiters is [bound garbled in extraction; surviving fragments: 2, log n, and a term subscripted by v] bits, where
– the term subscripted by v is the depth of v in T
– n is the number of nodes in T

• Alternative: use binary (fixed-length) codes for delimiting each edge
  – Seems to do worse in practice

• Can remove the dependence on depth by converting encodings of long interior paths using count labels


Spanners and Stretch

• Stretch of a subgraph H is the maximum, over all pairs of nodes, of the ratio of their distance in H to their distance in G
  – Extensively studied in the graph algorithms and graph theory literature [Eppstein 96]
• Distance stretch and topological stretch
• A spanner is a subgraph that has constant stretch
  – The Delaunay triangulation yields a planar Euclidean distance-spanner
  – The Yao graph [Yao 82] is also a simple distance-spanner
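To make the definition concrete, here is a minimal sketch that computes the hop-count (topological) stretch of a subgraph by comparing BFS distances in H and G for every pair of nodes. The function names are hypothetical, graphs are adjacency dictionaries over the same node set, and G is assumed to be connected.

```python
from collections import deque

def bfs_dist(adj, src):
    """Hop distance from src to every reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def hop_stretch(adj_g, adj_h):
    """Maximum over node pairs of d_H(s, t) / d_G(s, t); infinite if H disconnects a pair."""
    worst = 1.0
    for s in adj_g:
        dist_g = bfs_dist(adj_g, s)
        dist_h = bfs_dist(adj_h, s)
        for t, d in dist_g.items():
            if t == s:
                continue
            if t not in dist_h:
                return float("inf")
            worst = max(worst, dist_h[t] / d)
    return worst

# G is a 4-cycle; H drops the edge between 3 and 0, leaving a path.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
```

Here nodes 0 and 3 are adjacent in G but three hops apart in H, so the stretch of H is 3.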


Energy Stretch and Energy Spanners

• Commonly adopted power attenuation model:

  $$\text{Power received} \propto \frac{\text{Transmit power}}{\text{distance}^{\alpha}}$$

  – $\alpha$ is between 2 and 4

• Assuming a uniform threshold for reception power and interference/noise levels, the energy consumed for transmitting from $u$ to $v$ needs to be proportional to $d(u,v)^{\alpha}$

• Power control: Radios have the capability to adjust their power levels so as to reach the destination with desired fidelity

• Energy consumed along a path is simply the sum of the transmission energies along the path links

• Define energy-stretch analogous to distance-stretch
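A small worked example of this model: the sketch below sums the per-link energies along a path of points in the plane, with each link costing its Euclidean length raised to the attenuation exponent. The function name `path_energy`, the unit proportionality constant, and the choice of exponent 2 are assumptions made for illustration.

```python
import math

def path_energy(points, alpha=2.0):
    """Sum of (link length ** alpha) over consecutive points on the path."""
    return sum(math.dist(p, q) ** alpha for p, q in zip(points, points[1:]))

direct = path_energy([(0, 0), (4, 0)])                              # one hop of length 4
relayed = path_energy([(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)])     # four hops of length 1
```

With exponent 2, the direct hop costs 16 units while the relayed path costs 4, which is why a subgraph of short links can still be a good energy spanner.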


Energy-Aware Routing

• A path with many short hops consumes less energy than a path with a few large hops
  – Which edges to use? (Considered in topology control)
  – Can maintain “energy cost” information to find minimum-energy paths [Rodoplu-Meng 98]

• Routing to maximize network lifetime [Chang-Tassiulas 99]
  – Formulate the selection of paths and power levels as an optimization problem
  – Suggests the use of multiple routes between a given source-destination pair to balance energy consumption

• Energy consumption also depends on transmission rate
  – Schedule transmissions lazily [Prabhakar et al 2001]
  – Can split traffic among multiple routes at reduced rate [Shah-Rabaey 02]


Topology Control

• Given:
  – A collection of nodes in the plane
  – Transmission range of the nodes (assumed equal)

• Goal: To determine a subgraph of the transmission graph G that is
  – Connected
  – Low-degree
  – Small stretch, hop-stretch, and power-stretch


The Yao Graph

• Divide the space around each node into sectors (cones) of angle $\theta$

• Each node has an edge to the nearest node in each sector

• Number of edges is $O(n)$

• For any edge (u,v) in the transmission graph
  – There exists an edge (u,w) in the same sector such that w is closer to v than u is

• Theorem: The Yao graph has stretch $1/(1 - 2\sin(\theta/2))$

[Figure: node u with nodes w and v in the same sector]
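The construction can be sketched as follows: for each node, bucket the other nodes by the cone their direction falls into and keep only the nearest node per cone. The function name, the `num_cones` and `max_range` parameters, and the brute-force quadratic scan are assumptions made for illustration. Edges are returned as directed pairs. The stretch bound from the theorem is only meaningful when 1 − 2 sin(θ/2) is positive, that is, for cone angles below π/3 (more than six cones).

```python
import math

def yao_graph(points, num_cones, max_range=math.inf):
    """Directed edges (i, j): j is the nearest point to i inside one of i's cones.

    Each cone has angle 2*pi / num_cones. Points farther than max_range
    (the assumed common transmission range) are ignored.
    """
    theta = 2 * math.pi / num_cones
    edges = set()
    for i, (xi, yi) in enumerate(points):
        best = {}  # cone index -> (distance, neighbor index)
        for j, (xj, yj) in enumerate(points):
            if i == j:
                continue
            d = math.hypot(xj - xi, yj - yi)
            if d > max_range:
                continue
            angle = math.atan2(yj - yi, xj - xi) % (2 * math.pi)
            cone = min(int(angle / theta), num_cones - 1)
            if cone not in best or d < best[cone][0]:
                best[cone] = (d, j)
        edges.update((i, j) for _, j in best.values())
    return edges

points = [(0, 0), (1, 0), (2, 0), (0, 1)]
edges = yao_graph(points, num_cones=8)
```

Each node keeps at most one outgoing edge per cone, so the edge count is at most `num_cones` times n, which is the O(n) bound on the slide for a fixed cone angle.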


Dominating Set

• Applications: facility location
  – A set of $k$-dominating centers can be selected to locate servers or copies of a distributed directory
  – Dominating sets can serve as a location database for storing routing information in ad hoc networks [Liang Haas 00]

• NP-hard for general graphs
• Reduces to the minimum set cover problem
• Recall last time: greedy gives a log n approximation
• Admits a PTAS for planar graphs [Baker 94]


Greedy Algorithm

• An example
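The figure for this example did not survive extraction. As a stand-in, here is a minimal sketch of the greedy set-cover rule applied to 1-dominating sets: repeatedly pick the node that dominates the most not-yet-dominated nodes. The function name and the example graph are hypothetical, and ties are broken by dictionary order.

```python
def greedy_dominating_set(adj):
    """Greedy 1-dominating set: each pick covers itself and its neighbors."""
    uncovered = set(adj)
    chosen = []
    while uncovered:
        # Node whose closed neighborhood covers the most uncovered nodes.
        best = max(adj, key=lambda v: len(({v} | set(adj[v])) & uncovered))
        chosen.append(best)
        uncovered -= {best} | set(adj[best])
    return chosen

# A star centered at 0 with leaves 1..4, plus a tail 4-5-6.
graph = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0, 5], 5: [4, 6], 6: [5]}
dominators = greedy_dominating_set(graph)
```

On this graph the first pick is the star center, which covers five nodes, and the second pick is node 5, which covers the remaining two.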


Hierarchical Network Decomposition

• Sparse neighborhood covers [Awerbuch-Peleg 89, Linial-Saks 92]
  – Applications in location management, replicated data management, routing
  – Provable guarantees, though difficult to adapt to a dynamic environment

• Routing scheme using hierarchical partitioning [Dolev et al 95]
  – Adaptive to topology changes
  – Weak guarantees in terms of stretch and memory per node


Sparse Neighborhood Covers

• An r-neighborhood cover is a set of overlapping clusters such that the r-zone of any node is in one of the clusters

• Aim: Have covers that are low diameter and have small overlap

• Overlap is measured by the max number of clusters a node is in

• Tradeoff between diameter and overlap
  – Set of all r-zones: diameter 2r but overlap can be n
  – The entire network as a single cluster: overlap 1 but diameter could be n

• Sparse r-neighborhood covers with O(r log n)-diameter clusters and O(log n) overlap [Peleg 89, Awerbuch-Peleg 90]
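The two extreme covers can be checked on a small graph. The sketch below computes r-zones by bounded BFS, tests whether a set of clusters is an r-neighborhood cover, and measures overlap. All function names are hypothetical and distances are hop counts.

```python
from collections import deque

def r_zone(adj, v, r):
    """Set of nodes within r hops of v, including v."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        if dist[u] == r:
            continue
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return set(dist)

def is_r_cover(adj, clusters, r):
    """True if every node's r-zone lies entirely inside some cluster."""
    return all(any(r_zone(adj, v, r) <= c for c in clusters) for v in adj)

def overlap(adj, clusters):
    """Maximum number of clusters that any single node belongs to."""
    return max(sum(v in c for c in clusters) for v in adj)

path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
all_zones = [r_zone(path, v, 1) for v in path]   # one cluster per node
one_cluster = [set(path)]                        # the whole network
```

On this five-node path with r = 1, both choices are valid covers: the set of all 1-zones has overlap 3 with small clusters, while the single cluster has overlap 1 but spans the whole path.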


Sparse Neighborhood Covers

• Set of sparse neighborhood covers
  – { $2^i$-neighborhood cover : $0 \le i \le \log n$ }

• For each node:
  – For any $r$, the $r$-zone is contained within a cluster of diameter $O(r \log n)$
  – The node is in $O(\log^2 n)$ clusters

• Applications:
  – Tracking mobile users
  – Distributed directories for replicated objects


Online Tracking of Mobile Users

• Given a fixed network with mobile users
• Need to support location query operations
• Home location register (HLR) approach:
  – Whenever a user moves, the corresponding HLR is updated
  – Inefficient if the user is near the seeker, yet the HLR is far

• Performance issues:
  – Cost of query: ratio with “distance” between source and destination
  – Cost of updating the data structure when a user moves


Mobile User Tracking: Initial Setup

• The sparse $2^i$-neighborhood cover forms a regional directory at level $i$

• At level $i$, each node u selects a home cluster that contains the $2^i$-zone of u

• Each cluster has a leader node.

• Initially, each user registers its location with the home cluster leader at each of the $O(\log n)$ levels


The Location Update Operation

• When a user X moves, X leaves a forwarding pointer at the previous host.

• User X updates its location at only a subset of home cluster leaders
  – For every sequence of moves that add up to a distance of at least $2^i$, X updates its location with the leader at level $i$

• Amortized cost of an update is $O(d \log n)$ for a sequence of moves totaling distance $d$
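The update rule can be illustrated with per-level distance counters. The sketch below models only which levels get refreshed after each move. It does not model leaders, forwarding pointers, or message costs, and the function name and reset-to-zero policy are assumptions made for illustration.

```python
def levels_updated(move_lengths, num_levels):
    """For each move, list the levels whose leader would be refreshed.

    Level i keeps a counter of distance moved since its last refresh and
    is refreshed (counter reset) once that distance reaches 2**i.
    """
    moved = [0] * num_levels
    log = []
    for d in move_lengths:
        refreshed = []
        for i in range(num_levels):
            moved[i] += d
            if moved[i] >= 2 ** i:
                refreshed.append(i)
                moved[i] = 0
        log.append(refreshed)
    return log

log = levels_updated([1, 1, 1, 1], num_levels=3)
```

For four unit moves and three levels, level 0 is refreshed on every move, level 1 on every second move, and level 2 only on the fourth, so expensive high-level updates are rare.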


The Location Query Operation

• To locate user X, go through the levels starting from 0 until the user is located

• At level $i$, the querying node u queries each of the $O(\log n)$ clusters it belongs to in the $2^i$-neighborhood cover

• Follow the forwarding pointers, if necessary
• Cost of query: $O(d \log n)$, if $d$ is the distance between the querying node and the current location of the user


Comments on the Tracking Scheme

• Distributed construction of sparse covers in $O(m \log n + n \log^2 n)$ time [Awerbuch et al 93]

• The storage load for leader nodes may be excessive; use hashing to distribute the leadership role (per user) over the cluster nodes

• Distributed directories for accessing replicated objects [Awerbuch-Bartal-Fiat 96]
  – Allows reads and writes on replicated objects
  – An $O(\log n)$-competitive algorithm assuming each node has $O(\log n)$ times more memory than the optimal

• Unclear how to maintain sparse neighborhood covers in a dynamic network


Bubbles Routing and Partitioning Scheme

• Adaptive scheme by [Dolev et al 95]

• Hierarchical Partitioning of a spanning tree structure

• Provable bounds on efficiency for updates

[Figure: 2-level partitioning of a spanning tree, with the root marked]


Bubbles (cont.)

• Size of clusters at each level is bounded

• Cluster size grows exponentially

• # of levels equal to # of routing hops

• Tradeoff between number of routing hops and update costs

• Each cluster has a leader who has routing information

• General idea:

- route up the tree until in the same cluster as destination,

- then route down

- maintain by rebuilding/fixing things locally inside subtrees
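The up-then-down idea can be sketched on a plain rooted tree. This simplification ignores clusters and leaders entirely: it climbs from the source until it reaches an ancestor of the destination, then descends. The function name and the `parent` map format (root maps to `None`) are assumptions made for illustration.

```python
def tree_route(parent, src, dst):
    """Path from src to dst in a rooted tree: up to the first common ancestor, then down."""
    # Chain of ancestors of dst, starting at dst and ending at the root.
    dst_chain = [dst]
    while parent[dst_chain[-1]] is not None:
        dst_chain.append(parent[dst_chain[-1]])
    position = {v: i for i, v in enumerate(dst_chain)}

    # Climb from src until we hit a node on dst's ancestor chain.
    up = [src]
    while up[-1] not in position:
        up.append(parent[up[-1]])

    # Descend along dst's chain, excluding the meeting node itself.
    down = dst_chain[:position[up[-1]]][::-1]
    return up + down

parent = {"root": None, "a": "root", "b": "root", "a1": "a", "a2": "a", "b1": "b"}
```

For example, `tree_route(parent, "a1", "a2")` only climbs one level, while `tree_route(parent, "a1", "b1")` has to go through the root.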


Bubbles Algorithm

• A partition is an [x,y]-partition if all its clusters are of size between x and y

• A partition P is a refinement of another partition P’ if each cluster in P is contained in some cluster of P’.

• An (x_1, x_2, …, x_k)-hierarchical partitioning is a sequence of partitions P_1, P_2, .., P_k such that

- P_i is an [x_i, d x_i] partitioning (d is the degree)

- P_i is a refinement of P_(i-1)

• Choose x_(k+1) = 1 and x_i = x_(i+1) · n^(1/k)
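Unrolling this recurrence gives x_i = n^((k+1-i)/k), so x_1 = n and the top-level cluster is the whole tree. A small worked example, with a hypothetical function name and floating-point arithmetic (so values are approximate):

```python
def cluster_size_bounds(n, k):
    """Return {i: x_i} for i = 1..k+1 using x_(k+1) = 1 and x_i = x_(i+1) * n**(1/k)."""
    growth = n ** (1.0 / k)
    x = {k + 1: 1.0}
    for i in range(k, 0, -1):
        x[i] = x[i + 1] * growth
    return x

bounds = cluster_size_bounds(4096, 3)
```

For n = 4096 and k = 3 the growth factor is 16, giving lower bounds of roughly 4096, 256, 16, and 1 from the top level down.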


Clustering Construction

• Build a spanning tree, say, using BFS

• Let P_1 be the cluster consisting of the entire tree

• Partition P_1 into clusters, resulting in P_2

• Recursively partition each cluster

• Maintenance rules:

- when a new node is added, try to include in existing cluster, else split cluster

- when a node is removed, if necessary combine clusters


Performance Bounds

• Memory requirement

• Adaptability

• k hops during routing

• Matching lower bound for bounded-degree graphs

• Note: Bubbles does not provide a non-trivial upper bound on stretch in the non-hop model

[Two bound expressions on this slide are garbled in extraction: "kk nd /123" and "nkdn k log/11"]

