Page 1:

Spatial Data Query Support in Peer-to-Peer Systems

Roger Zimmermann, Wei-Shinn Ku, and Haojun Wang
Computer Science Department
University of Southern California
Los Angeles, CA 90089

COMPSAC 2004, September 30

Page 2:

Outline

Motivation
Introduction to DHTs (CAN)
Technical Approach
Results
Conclusions and Future Research

Page 3:

Motivation

Spatial data sets are used for many applications, e.g., GIS, CAD, …
P2P systems provide a distributed platform that is very scalable.
Pros:
– Scalability, no central point of failure
Cons:
– Very dynamic (unreliable), topology maintenance required

Page 4:

Motivation (cont.)

Question: how can P2P systems be used for spatial data sharing?
Query Challenges:
– Unstructured P2P systems: querying by flooding is not efficient
– Structured P2P systems based on DHTs (Chord, CAN): only exact match queries are supported efficiently
  E.g., search files based on their names/titles
  put(key, value); get(key) returns value

Page 5:

Motivation (cont.)

Spatial queries are usually range queries
– Intersect, overlap
– Nearest neighbor(s) (kNN)
DHTs are not suitable without modification

Page 6:

Distributed Hash Tables (DHT)

DHT systems: Content Addressable Network (CAN), Chord, Pastry, etc.
Using DHT to allocate large data sets to many nodes with no central control
Data objects are near uniformly distributed through a hash function, resulting in superb scalability and load balance
Each node only maintains a small routing table to know its neighbors
Locating a particular data object requires O(log N) search steps on average
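A minimal sketch of the exact-match put/get interface, assuming a toy in-process DHT in which each key is hashed with SHA-1 and assigned to one of N nodes by taking the digest modulo N. The class name `ToyDHT` and the modulo placement are illustrative assumptions, not part of the talk; real systems such as Chord or CAN use their own placement schemes and multi-hop routing rather than a global node list.

```python
import hashlib


class ToyDHT:
    """Toy exact-match store: hash(key) picks the responsible node."""

    def __init__(self, num_nodes):
        # Each "node" is just a local dict in this sketch.
        self.nodes = [{} for _ in range(num_nodes)]

    def _node_index(self, key):
        # Hash the key and map the digest onto one of the nodes.
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        return int.from_bytes(digest, "big") % len(self.nodes)

    def put(self, key, value):
        self.nodes[self._node_index(key)][key] = value

    def get(self, key):
        return self.nodes[self._node_index(key)].get(key)


dht = ToyDHT(num_nodes=8)
for i in range(1000):
    dht.put(f"file-{i}.dat", i)

print(dht.get("file-42.dat"))        # exact-match lookup by full key
print([len(n) for n in dht.nodes])   # number of objects held by each node
```

Because placement depends only on the hash of the full key, a lookup needs the exact key; there is no way to ask this store for "all keys near X".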

Page 7:

Content Addressable Network (CAN)

A scalable indexing mechanism in a P2P network
Creates a logical d-dimensional Cartesian coordinate space
Divides the space into zones, where each zone is controlled by a node in the system
Zones are dynamically partitioned or merged as nodes join and leave
Each zone is addressed with a Virtual Identifier (VID), which is deterministically calculated from the location of the zone

Page 8:

Content Addressable Network (CAN)

Example: A 2-D space partitioned into 7 CAN zones

Page 9:

Content Addressable Network (cont.)

Node Operations (e.g., Insertion)

1. Find a bootstrap node first

Page 10:

Content Addressable Network (cont.)

2. Randomly choose a point in the CAN plane and route the new node from the bootstrap node to the chosen location

Page 11:

Content Addressable Network (cont.)

3. The new node arrives at the destination zone covering that point. The destination zone is split into two zones, each controlled by one node (old and new)

Page 12:

Content Addressable Network (cont.)

4. Update the neighborhood zone routing information
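The join steps can be sketched as follows, assuming a 2-D unit square, zones that are halved alternately along x and y, and VIDs formed by appending one bit per split. The `Zone` class and `join` function are hypothetical names. A linear scan stands in for routing from the bootstrap node, and the neighbor-table update of step 4 is omitted.

```python
import random


class Zone:
    """Axis-aligned zone [x0, x1) x [y0, y1) owned by one node."""

    def __init__(self, vid, x0, y0, x1, y1, owner):
        self.vid = vid
        self.x0, self.y0, self.x1, self.y1 = x0, y0, x1, y1
        self.owner = owner

    def contains(self, x, y):
        return self.x0 <= x < self.x1 and self.y0 <= y < self.y1

    def area(self):
        return (self.x1 - self.x0) * (self.y1 - self.y0)


def join(zones, new_node, rng):
    # Step 2: pick a random point in the CAN plane.
    x, y = rng.random(), rng.random()
    # Step 3: find the zone covering that point and split it in two.
    old = next(z for z in zones if z.contains(x, y))
    zones.remove(old)
    if len(old.vid) % 2 == 0:
        # Even depth: split along x (assumed convention).
        mid = (old.x0 + old.x1) / 2
        first = Zone(old.vid + "0", old.x0, old.y0, mid, old.y1, old.owner)
        second = Zone(old.vid + "1", mid, old.y0, old.x1, old.y1, new_node)
    else:
        # Odd depth: split along y.
        mid = (old.y0 + old.y1) / 2
        first = Zone(old.vid + "0", old.x0, old.y0, old.x1, mid, old.owner)
        second = Zone(old.vid + "1", old.x0, mid, old.x1, old.y1, new_node)
    zones.extend([first, second])
    return first, second


# Start with one node owning the whole plane, then let six more nodes join.
zones = [Zone("", 0.0, 0.0, 1.0, 1.0, "node-0")]
rng = random.Random(2004)
for i in range(1, 7):
    join(zones, f"node-{i}", rng)

print(sorted(z.vid for z in zones))
```

Six joins starting from a single zone always yield seven zones, whatever points are drawn, since each join replaces one zone with two.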

Page 13:

Content Addressable Network (CAN)

Data Object Operation (e.g., Insertion)

1. Generate a key based on the object identification and insert the data object as a <key, value> pair
2. Map the key into a point P in the CAN plane by using a uniform hash function
3. Store the <key, value> pair at the node which owns the zone within which the point P is located
4. To retrieve the value, the same hash function is applied to the key to regenerate the point P and find the zone that owns that point; the zone then returns the value to the client
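A minimal sketch of these four steps, assuming a fixed layout of four zones in the unit square and SHA-256 as the uniform hash function. The zone layout, node names, and function names are all hypothetical, and the lookup is a local scan rather than routed messages.

```python
import hashlib

# Four fixed zones (x0, y0, x1, y1) -> owning node; a hypothetical layout.
ZONES = {
    (0.0, 0.0, 0.5, 0.5): "node-A",
    (0.5, 0.0, 1.0, 0.5): "node-B",
    (0.0, 0.5, 0.5, 1.0): "node-C",
    (0.5, 0.5, 1.0, 1.0): "node-D",
}
STORAGE = {node: {} for node in ZONES.values()}


def key_to_point(key):
    """Step 2: hash the key to a point P in [0, 1) x [0, 1)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Keep 53 bits per coordinate so the float result stays below 1.0.
    x = (int.from_bytes(digest[:8], "big") >> 11) / 2**53
    y = (int.from_bytes(digest[8:16], "big") >> 11) / 2**53
    return x, y


def owner_of(point):
    """Find the node whose zone contains the point."""
    x, y = point
    for (x0, y0, x1, y1), node in ZONES.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return node
    raise ValueError("point outside the CAN plane")


def put(key, value):
    """Step 3: store the pair at the node owning the zone around P."""
    STORAGE[owner_of(key_to_point(key))][key] = value


def get(key):
    """Step 4: rehash the key to find the same zone again."""
    return STORAGE[owner_of(key_to_point(key))].get(key)


put("map-tile-17", "tile payload")
print(owner_of(key_to_point("map-tile-17")), get("map-tile-17"))
```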

Page 14:

Storing Spatial Data w/ DHTs

Hash function distributes data objects evenly within the space to achieve a balanced load
Spatial locality information needs to be preserved for range queries. Applying a hash function to spatial data will destroy locality
Related work explored storing R-tree or Quad-tree based index on DHT
– Harwood et al. Hashing Spatial Content over Peer-to-Peer Networks
– Mondal et al. P2PR-tree: An R-tree-based Spatial Index for Peer-to-Peer Environments
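To illustrate the locality problem, the sketch below places two nearly identical coordinates once with a uniform hash and once with a plain grid cell. Both functions are illustrative assumptions: `hashed_point` uses SHA-256 over a formatted coordinate string, and `grid_cell` is a simple 8 x 8 grid, not the scheme proposed in the talk.

```python
import hashlib


def hashed_point(x, y):
    """Uniform hash of the coordinates: spreads load, ignores proximity."""
    digest = hashlib.sha256(f"{x:.6f},{y:.6f}".encode("utf-8")).digest()
    hx = (int.from_bytes(digest[:8], "big") >> 11) / 2**53
    hy = (int.from_bytes(digest[8:16], "big") >> 11) / 2**53
    return hx, hy


def grid_cell(x, y, cells_per_side=8):
    """Locality-preserving placement: the grid cell containing (x, y)."""
    return int(x * cells_per_side), int(y * cells_per_side)


a = (0.5100, 0.5100)
b = (0.5101, 0.5101)   # a very close neighbour of a

print(hashed_point(*a), hashed_point(*b))  # hashed positions of a and b
print(grid_cell(*a), grid_cell(*b))        # grid cells of a and b
```

The two objects share a grid cell, so a range query around them touches one cell, while their hashed positions bear no relation to each other and could land in any zone.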

Page 15:

Spatial Range Query Design for P2P Systems

Mapping a physical space to a CAN space
– Propose a new hash function to map spatial data objects onto nodes over a modified CAN system
– Purpose: allow efficient spatial data query execution while at the same time considering load balance
– Calculating the location of zones in the logical space: Virtual Identifier (VID) tree for mapping purposes

Page 16:

Spatial Range Query Design for P2P Systems

Approach:
– Object key is generated with three different components
  (a) Scatter region address: based on the spatial locality of the object; preserves spatial locality
  (b) Zone address: randomized; achieves load balance
  (c) Object identifier (hashed)
– The scatter region size is fixed and predetermined

Page 17:

Spatial Range Query Design for P2P Systems (cont.)

– The value of the zone bit string is decided randomly, and the object identifier is the hash of the data content
– The VID tree is created with its height determined by the scatter region size
– The maximum number of zones is 2^(a+b)
– The relationship between data locality and load balance can be determined along a spectrum
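The three-part key can be sketched as below. The slides do not say how the scatter region address is derived from an object's location, so this sketch assumes the unit square is bisected alternately along x and y, one bit per level, in the spirit of the VID tree. The function names, the SHA-1 object identifier, and the choice a = 5, b = 4 (taken from the example figures) are illustrative.

```python
import hashlib
import random

A_BITS = 5   # scatter region address length (a)
B_BITS = 4   # zone address length (b)


def scatter_region_address(x, y, a=A_BITS):
    """Locality-preserving part: bisect x and y alternately, one bit each."""
    x0, x1, y0, y1 = 0.0, 1.0, 0.0, 1.0
    bits = ""
    for level in range(a):
        if level % 2 == 0:          # even levels split along x
            mid = (x0 + x1) / 2
            if x >= mid:
                bits, x0 = bits + "1", mid
            else:
                bits, x1 = bits + "0", mid
        else:                       # odd levels split along y
            mid = (y0 + y1) / 2
            if y >= mid:
                bits, y0 = bits + "1", mid
            else:
                bits, y1 = bits + "0", mid
    return bits


def make_key(x, y, content, rng, a=A_BITS, b=B_BITS):
    region = scatter_region_address(x, y, a)        # (a) from location
    zone = format(rng.getrandbits(b), f"0{b}b")     # (b) random bits
    object_id = hashlib.sha1(content).hexdigest()   # (c) content hash
    return region, zone, object_id


rng = random.Random(0)
k1 = make_key(0.60, 0.20, b"parcel-1", rng)
k2 = make_key(0.61, 0.21, b"parcel-2", rng)
print(k1[0], k1[1])
print(k2[0], k2[1])
print("maximum number of zones:", 2 ** (A_BITS + B_BITS))
```

Nearby objects share the same scatter region prefix, while the random zone bits spread them over the zones inside that region. Growing a at the expense of b shifts the balance toward locality; the reverse shifts it toward load balance.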

Page 18:

Spatial Range Query Design for P2P Systems (cont.)

Scatter region (11000), e.g.: a = 5 bits

[Figure: VID tree with node labels 00, 01, 10, 11 and 000 through 111]

Page 19:

Spatial Range Query Design for P2P Systems (cont.)

Zones, e.g.: b = 4 bits

[Figure: VID tree with node labels 00, 01, 10, 11 and 000 through 111]

Page 20:

Spatial Range Query Design for P2P Systems (cont.)

System Operation and Spatial Range Query
– Node Operation
  Bootstrap mechanism
  Node join mechanism
  Zone split and the search threshold
  – Balance the number of data objects in each zone
  – The zone being selected must be larger than the minimum zone size (1/2^(a+b))
  – The threshold is the upper bound on the number of search hops to find a zone to split
– Data Object Insertion
– Data Object Deletion
– Spatial Range Query

Page 21:

Spatial Range Query Design for P2P Systems (cont.)

Spatial Range Query

Step 1: The querying node launches a spatial range query.

Page 22:

Spatial Range Query Design for P2P Systems (cont.)

Spatial Range Query

Step 2: The node determines the overlapping scatter regions.

Page 23:

Spatial Range Query Design for P2P Systems (cont.)

Spatial Range Query

Step 3: The node multicasts the query to the overlapping scatter regions.

Page 24:

Spatial Range Query Design for P2P Systems (cont.)

Step 4:
– The range query is multicast within all overlapping scatter regions (M-CAN).
– Recall: data is randomized within each scatter region, so an exhaustive search is necessary
– Choice of scatter region size
  Large: good load balance; uniform within a scatter region
  Small: exhaustive search covers less area
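Steps 2 to 4 can be sketched as follows, under the same assumption as before that scatter regions come from alternately bisecting the unit square along x and y (one address bit per level). The function names are hypothetical, and the "multicast" is reduced to listing the zone addresses that would have to be visited; no messages are sent.

```python
def region_rect(bits):
    """Rectangle (x0, y0, x1, y1) covered by a scatter region address."""
    x0, y0, x1, y1 = 0.0, 0.0, 1.0, 1.0
    for level, bit in enumerate(bits):
        if level % 2 == 0:          # even levels split along x
            mid = (x0 + x1) / 2
            if bit == "1":
                x0 = mid
            else:
                x1 = mid
        else:                       # odd levels split along y
            mid = (y0 + y1) / 2
            if bit == "1":
                y0 = mid
            else:
                y1 = mid
    return x0, y0, x1, y1


def overlaps(r, q):
    """True if rectangles r and q share interior area."""
    return r[0] < q[2] and q[0] < r[2] and r[1] < q[3] and q[1] < r[3]


def overlapping_regions(query, a):
    """Step 2: all a-bit scatter regions that intersect the query window."""
    all_regions = (format(i, f"0{a}b") for i in range(2 ** a))
    return [bits for bits in all_regions if overlaps(region_rect(bits), query)]


def zones_to_visit(regions, b):
    """Steps 3-4: every b-bit zone address inside each overlapping region."""
    return [r + format(j, f"0{b}b") for r in regions for j in range(2 ** b)]


query = (0.4, 0.4, 0.6, 0.6)            # (x0, y0, x1, y1) range query
regions = overlapping_regions(query, a=5)
print(regions)
print(len(zones_to_visit(regions, b=4)), "zone addresses to search")
```

Since objects are placed at random zone addresses inside a region, every zone address under an overlapping region must be searched. Smaller regions therefore reduce the area searched, at the cost of less room for balancing load.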

Page 25:

Conclusions and Future Research Directions

– We proposed a hash function to preserve both spatial locality information and constrained load balance
– The proposed mechanism works well with the CAN P2P architecture
– We are currently running simulations to test our approach

Page 26:

Thank you!

Questions?

