FDSI-tree: A Fully Distributed Spatial Index Tree for Efficient & Power-Aware Range Queries in Sensor Networks
Sanghun Eo, Suraj Pandey, Myungkeun Kim, YoungHwan Oh and Haeyoung Bae
Contents
• Introduction
• Related Work
• FDSI-Tree
• Assumptions and System Model
• The Tree Structure
• Energy Efficient & Power-aware Query Processing
• Experiment
• Conclusion and Future Work
Introduction
• A sensor network consists of many spatially distributed sensors, which are used to monitor or detect phenomena at different locations, such as temperature changes or pollutant levels
• Sensor nodes, such as the Berkeley MICA Mote, which already supports temperature sensors, a magnetometer, an accelerometer, a microphone, and several actuators, are getting smaller, cheaper, and able to perform more complex operations, including running small embedded operating systems
• While these advances are improving the capabilities of sensor nodes, there are still many crucial problems with deploying sensor networks
• Limited storage, limited network bandwidth, poor inter-node communication, limited computational ability, and limited power still persist
Related Work
• We have outlined the need for spatial indexing schemes in sensor networks
• Traditionally, the database community has focused mostly on centralized indices; our approach is essentially to embed them into sensor nodes
• However, the index structure is determined not just by the data, but also by the performance metrics and power measurements of the collective sensors
Related Work
• The Cougar project at Cornell discusses queries over sensor networks, with a central administration that is aware of the location of all the sensors
• Madden et al. introduced the Fjord architecture for managing multiple queries, focusing on query processing in the sensor environment
• The TinyOS group at UC Berkeley has published a number of papers describing the design of motes, the design of TinyOS, and the implementation of the networking protocols used to form ad-hoc sensor networks
Related Work
• TAG was proposed as an aggregation service that is part of TinyDB, a query processing system for a network of Berkeley motes
• They also described a distributed index called Semantic Routing Trees (SRT)
• SRTs are based on single attributes, historical sensor readings, and fixed query-originating nodes, in contrast to our design
• The work on directed diffusion, a data-centric framework, uses flooding to find paths from the query originator node to the data source nodes
• The idea is to use grouping to compute aggregates over partitions of sensor readings
Related Work
• Pre-computed indices are used to facilitate range queries in traditional database systems, and have been adopted by the work mentioned above
• Indices trade some initial pre-computation cost for significantly more efficient querying
• For sensor networks, we emphasize that a centralized index for range queries is not feasible in terms of energy efficiency, as the energy cost of transmitting 1Kb a distance of 100m is approximately the same as that of executing 3 million instructions on a 100 MIPS/W processor
FDSI-Tree
• All the schemes reviewed earlier group the sensor nodes by event or attribute; they are data centric and demand redundant communication
• The FDSI-Tree overcomes these inherent deficiencies
• Assumptions and System Model
• Wireless Sensor networks have the following physical resource constraints
and unique characteristics
• Communication: The wireless network connecting the sensor nodes usually provides only a very limited quality of service, with high variance in latency and high packet loss rates
FDSI-Tree
• Assumptions and System Model
• Power consumption: Sensor nodes have a limited supply of energy, most commonly from a battery source
• Computation: Sensor nodes have limited computing power and
memory sizes that restrict the types of data processing algorithms that
can be used and intermediate results that can be stored on the sensor
nodes
• Streaming data: Sensor nodes produce data continuously without
being explicitly asked for that data
• Real-time processing: Sensor data usually represent real-time events
• Moreover, it is often expensive to save raw sensor streams to disk at the
sink
• Hence, queries over streams need to be processed in real time
FDSI-Tree
• Assumptions and System Model
• Uncertainty: The information gathered by the sensors contains noise from the environment
• Moreover, factors such as sensor malfunction and sensor placement might bias individual readings
• We consider a static sensor network distributed over a large area
• All sensors are aware of their geographical position
• Each sensor could be equipped with a GPS device or use location estimation techniques
FDSI-Tree
• The Tree Structure
• An FDSI-tree is an index designed to allow each node to efficiently determine whether any of the nodes below it will need to participate in a given query over some queried range
• To accommodate spatial queries in the network, individual nodes need to store additional parameters
• Each node must store the calculated MBR (minimum bounding rectangle) of its children along with the aggregate values
• The parent node of each region in the tree has a structure of the form <child-pointers, child-MBRs, overall-MBR, location-info>
• The child-pointers help traverse the node structure
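The per-node record described above could be sketched as follows. This is only an illustration: the names `Rect` and `FDSINode`, the field layout, and the choice of axis-aligned rectangles for MBRs are assumptions, not the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle used here as a hypothetical MBR representation."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def overlaps(self, other: "Rect") -> bool:
        # Two rectangles overlap unless one lies entirely to one side of the other.
        return not (self.x_max < other.x_min or other.x_max < self.x_min
                    or self.y_max < other.y_min or other.y_max < self.y_min)


@dataclass
class FDSINode:
    """Hypothetical record mirroring
    <child-pointers, child-MBRs, overall-MBR, location-info>."""
    node_id: int
    position: Tuple[float, float]
    overall_mbr: Rect
    # Maps child id -> that child's MBR; the keys double as child pointers.
    child_mbrs: Dict[int, Rect] = field(default_factory=dict)
    # Optional extra filter data (assumed to be a free-form dictionary).
    location_info: Optional[dict] = None
```

Keeping the child MBRs keyed by child id lets a parent decide, with local information only, which children a query window can possibly concern.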
FDSI-Tree
• The Tree Structure
• We have added the MBR to each node; it confines the children to a box over which a query can be made
• The confinement algorithm is responsible for analyzing the sensor nodes and distributing them into the appropriate MBR
• This classification is largely based on their proximity to their respective parent and their contribution to the dead space of the resulting MBR
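The slides do not spell out the confinement algorithm, so the following is only a sketch of one plausible reading. It assumes that a node picks, among candidate parents, the one whose MBR grows least in area when the node is added (used here as a stand-in for dead space), and breaks ties by distance to the parent. The function names and the cost ordering are hypothetical.

```python
import math
from typing import Dict, Hashable, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def area(r: Rect) -> float:
    return (r[2] - r[0]) * (r[3] - r[1])


def enlarge(r: Rect, p: Point) -> Rect:
    """Smallest rectangle covering both r and the point p."""
    return (min(r[0], p[0]), min(r[1], p[1]), max(r[2], p[0]), max(r[3], p[1]))


def choose_parent(pos: Point,
                  candidates: Dict[Hashable, Tuple[Point, Rect]]) -> Hashable:
    """Pick a parent id from candidates: id -> (parent position, current MBR).

    Assumed cost: area growth of the parent's MBR first, distance second.
    """
    def cost(item):
        _, (parent_pos, mbr) = item
        growth = area(enlarge(mbr, pos)) - area(mbr)
        return (growth, math.dist(pos, parent_pos))

    return min(candidates.items(), key=cost)[0]
```

With this ordering a node may join a slightly more distant parent when doing so avoids stretching another parent's MBR over empty space.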
FDSI-Tree
• The network structure, which is common to both Cougar and TinyDB, consists of nodes connected as a tree (tree-based routing)
• Since nodes within the same level do not communicate with each other, communication is confined to children and their respective parent
• This communication relationship is liable to change due to moving nodes, power shortage in nodes, or the appearance of new nodes
FDSI-Tree
• TinyDB has a list of parent candidates
• The parent changes if link quality degrades sufficiently
• Cougar has a similar mechanism: a parent sensor node keeps a list of all its children, called the waiting list, and will not report its reading until it hears from all the sensor nodes on its waiting list
• We use Cougar’s approach in our system under similar semantics
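The waiting-list behavior can be illustrated with a small sketch. The class name `ParentNode`, the method `on_child_report`, and the use of MAX as the aggregate are assumptions made for the example; Cougar's actual interfaces are not shown in these slides.

```python
from typing import Hashable, Iterable, List, Optional, Set


class ParentNode:
    """Hypothetical parent that reports only after all children have reported."""

    def __init__(self, own_reading: float, child_ids: Iterable[Hashable]):
        self.own_reading = own_reading
        self.waiting: Set[Hashable] = set(child_ids)  # the waiting list
        self.received: List[float] = []

    def on_child_report(self, child_id: Hashable, value: float) -> Optional[float]:
        """Record a child's report.

        Returns None while children are still outstanding, otherwise the
        aggregate (MAX here, chosen only for illustration).
        """
        if child_id in self.waiting:
            self.waiting.discard(child_id)
            self.received.append(value)
        if self.waiting:
            return None
        return max([self.own_reading] + self.received)
```

For example, a parent with children 1 and 2 stays silent after the first report and emits the aggregate only once the second arrives.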
FDSI-Tree
• The Tree Structure
Fig. 1. Node positions in one section of our sensor test bed. (a) Simulated physical environment showing the region of interest. (b) The MBR under each parent node of a sub-tree.
FDSI-Tree
• The Tree Structure
• To construct the FDSI-tree, in the descending stage each parent in a region stores a bounding box that covers its children and the parent itself
• Each descent correspondingly stores the MBR of the region where a link exists, until the leaf node is reached
• At the end of the descent, when all the nodes have been traversed, the parent node of each region is notified of its child nodes' MBRs
• Hence, in the ascending stage the parent of each region is updated with the new MBRs of its children, which now include the sub-tree under each child, and a distributed R-tree-like structure is formed among the sensor nodes
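The descend-then-ascend construction can be simulated centrally as a recursive traversal. This is a sketch under stated assumptions: the routing tree is given as a `children` map, positions are known, and the function name `build_mbrs` is hypothetical. In a real deployment each node would exchange these MBRs with its parent by message rather than through recursion.

```python
from typing import Dict, Hashable, List, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def union(a: Rect, b: Rect) -> Rect:
    """Smallest rectangle covering both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def build_mbrs(root: Hashable,
               children: Dict[Hashable, List[Hashable]],
               positions: Dict[Hashable, Point]):
    """Return (overall, child_mbrs): each node's overall-MBR and, per parent,
    the MBR of every child's sub-tree."""
    overall: Dict[Hashable, Rect] = {}
    child_mbrs: Dict[Hashable, Dict[Hashable, Rect]] = {}

    def visit(node: Hashable) -> Rect:
        x, y = positions[node]
        mbr = (x, y, x, y)                    # a node starts with its own position
        child_mbrs[node] = {}
        for child in children.get(node, []):
            sub = visit(child)                # descending stage
            child_mbrs[node][child] = sub     # ascending stage: child reports its MBR
            mbr = union(mbr, sub)             # parent's box now covers the sub-tree
        overall[node] = mbr
        return mbr

    visit(root)
    return overall, child_mbrs
```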
FDSI-Tree
• Energy Efficient & Power-aware Query Processing
• One critical operation of the FDSI-tree, called energy efficient forwarding, is to isolate the regions containing the sensor nodes that can contribute to the range query
• Our prime objective is to minimize the number of nodes taking part in the query
• A range query returns all the relevant data collected or relayed that is associated with regions within a given query window W (e.g., a rectangle in a two-dimensional space)
• To process a range query with the FDSI-tree, the root node first receives the query, which may originate at any node
• The request is then disseminated to the child nodes whose overall-MBR overlaps W
FDSI-Tree
• Energy Efficient & Power-aware Query Processing
• Each parent in an overlapping region receives the query and, based on the overlapping regions of its children, the corresponding network (sub-tree) is flooded
• It is here that the child-MBRs are used to decide which regions need precise selection, in order to limit unnecessary node traversal
• These child-MBRs are comparatively small regions that cover only the perimeter of the children, including their parent
• So the selection operation needs minimal traversal to include the nodes in the list needed for the range query
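The forwarding rule described above (forward a query only into sub-trees whose MBR overlaps the window W) can be sketched as a recursive traversal. The function and parameter names are hypothetical, and the recursion stands in for radio messages between parent and child.

```python
from typing import Dict, Hashable, List, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def overlaps(a: Rect, b: Rect) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def contains(window: Rect, p: Point) -> bool:
    return window[0] <= p[0] <= window[2] and window[1] <= p[1] <= window[3]


def range_query(node: Hashable,
                window: Rect,
                positions: Dict[Hashable, Point],
                child_mbrs: Dict[Hashable, Dict[Hashable, Rect]]
                ) -> Tuple[List[Hashable], List[Hashable]]:
    """Return (visited, matches): nodes that received the query and nodes
    whose position lies inside the window."""
    visited = [node]
    matches = [node] if contains(window, positions[node]) else []
    for child, mbr in child_mbrs.get(node, {}).items():
        if overlaps(mbr, window):             # prune sub-trees outside W
            v, m = range_query(child, window, positions, child_mbrs)
            visited += v
            matches += m
    return visited, matches
```

The length of `visited` gives a rough proxy for the number of nodes that spend energy on the query, which is the quantity the slides aim to minimize.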
FDSI-Tree
• Energy Efficient & Power-aware Query Processing
• The optional parameter location-info should help to get accurate results for overlapping, independent regions
• Its inclusion depends on the type of sensor network and its scalability factor
• In addition to the geographic information, it may include additional values (e.g., time t, location attributes) that act as a filter, which again largely depends on the computational power of each sensor node
Experiment (Environment Setup)
• Regular tessellation, like a grid
• Each node could transmit data to sensors that were at most one hop away
• Experiments based upon TinyDB setup and attributes
• Best-case and closest-parent approaches of TinyDB used as the base for comparison
• The overall cost highly depends on the size of the window query and the scale of the sensor network
Fig. 2. Sensor node linkage showing the grid tessellation
Experiment (Performance Evaluation)
• Parent selection is an important issue
• Closest-parent as in TinyDB
• Benefits of FDSI-tree depend on the quality of the MBRs of children beneath the parents
• FDSI-tree reduces the overall network traffic by 20%
• Maintenance and construction costs are nevertheless foreseeable
• Emulation based on AVRORA
Fig. 3. Number of nodes participating in window queries of different sizes (20 × 20 grid, 400 nodes)
[Plot: Query Range Versus Nodes in Query. x-axis: Query Size as Percent of Value Range; y-axis: Fraction of Nodes Involved in Query; series: No FDSI-tree, TinyDB (Closest Parent), FDSI-tree, Ideal]
Conclusion and Future Work
• We contribute a new technique to group the sensors in a region for spatial range queries
• FDSI-tree can reduce the number of nodes that disseminate queries by nearly an order of magnitude
• By isolating the regions of sensor nodes that overlap the range query, non-relevant nodes can be left out of the communication
• Only the sensor nodes on the path to the requested region are contacted, and hence a substantial reduction in power is achieved due to the reduced number of sub-trees involved
Conclusion and Future Work
• In addition, the aggregate values for the region of interest are collected following the in-network aggregation paradigm, which has an advantage over a centralized index structure in that it does not require complete topology and sensor value information to be collected at the root of the network
• Since data transmission is the biggest energy-consuming activity in
sensor nodes, using FDSI-tree results in significant energy savings.
Conclusion and Future Work
• FDSI-tree provides a scalable solution for range queries that adopts protocols and query processing similar to those used so far, making it highly portable
• Currently, we are extending our scheme to moving objects, aiming to achieve roughly the same throughput as in static networks
• Adopting a distributed redundant architecture for efficient processing of concurrent queries and supporting join operations are challenges under scrutiny as the capabilities of sensor nodes reach higher levels