
pDCS: Security and Privacy Support for Data-Centric Sensor Networks

Min Shao, Student Member, IEEE, Sencun Zhu, Wensheng Zhang, Member, IEEE,

Guohong Cao, Senior Member, IEEE, and Yi Yang, Student Member, IEEE

Abstract—The demand for efficient data dissemination/access techniques to find relevant data from within a sensor network has led to the development of Data-Centric Sensor (DCS) networks, where the sensor data instead of sensor nodes are named based on attributes such as event type or geographic location. However, saving data inside a network also creates security problems due to the lack of tamper resistance of the sensor nodes and the unattended nature of the sensor network. For example, an attacker may simply locate and compromise the node storing the event of his interest. To address these security problems, we present pDCS, a privacy-enhanced DCS network which offers different levels of data privacy based on different cryptographic keys. pDCS also includes an efficient key management scheme to facilitate the management of multiple types of keys used in the system. In addition, we propose several query optimization techniques based on Euclidean Steiner Tree and keyed Bloom Filter (KBF) to minimize the query overhead while preserving query privacy. Finally, detailed analysis and simulations show that the KBF scheme can significantly reduce the message overhead with the same level of query delay and maintain a very high level of query privacy.

Index Terms—Security, privacy, data centric, keyed Bloom Filter, wireless sensor networks.


1 INTRODUCTION

Sensor networks are envisioned to be extremely useful for a broad spectrum of emerging civil and military applications [1], such as remote surveillance, habitat monitoring, and collaborative target tracking. As sensor networks scale in size over time, so does the amount of sensing data generated. The large volume of data, coupled with the fact that the data are spread across the entire network, creates a demand for efficient data dissemination/access techniques to find the relevant data from within the network. This demand has led to the development of Data-Centric Sensor (DCS) networks [2], [3], [4].

DCS exploits the notion that the nature of the data is more important than the identities of the nodes that collect the data. Thus, sensor data as contrasted to sensor nodes are “named,” based on attributes such as event type (e.g., elephant-sightings) or geographic location. According to their names, the sensing data are passed to and stored at corresponding sensor nodes determined by a mapping function such as Geographic Hash Table (GHT) [2]. As the sensing data with the same name are stored in the same location, queries for data of a particular name can be sent directly to the storing nodes using geographic routing protocols such as GPSR [5], rather than flooding the query throughout the network.
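The name-to-location idea above can be illustrated with a minimal sketch. It is not the GHT algorithm from [2]; it only assumes that a public hash (SHA-256 here, chosen for illustration) turns an event name into coordinates inside a rectangular field. The function name and the field dimensions are hypothetical.

```python
import hashlib
from typing import Tuple


def public_storage_location(event_name: str, width: float, height: float) -> Tuple[float, float]:
    """Hash an event name to a point in a width x height field (illustrative only)."""
    digest = hashlib.sha256(event_name.encode("utf-8")).digest()
    # Two disjoint 8-byte slices of the digest give the x and y fractions.
    x = int.from_bytes(digest[:8], "big") / 2**64 * width
    y = int.from_bytes(digest[8:16], "big") / 2**64 * height
    return (x, y)


# Every party that knows the event name computes the same location.
location = public_storage_location("elephant-sightings", 1000.0, 1000.0)
```

Because the function uses no secret, anyone who knows the event name can compute the same point, which is what makes queries cheap in a plain DCS network.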

Fig. 1 shows an example of using a DCS-based sensor network to monitor the activities or presence of animals in a wild animal habitat. The sensed data can be used by zoologists to study the animals or by an authorized hunter to locate certain types of animals (e.g., boars and deer) for hunting. With DCS, all the sensing data regarding one type of animal are forwarded to and stored in one location. As a result, a zoologist only needs to send one query to the right location to find the information about that type of animal. Similarly, a soldier can easily obtain enemy tank information from storage sensors through a DCS-based sensor network in the battlefield.

In many cases, DCS-based data dissemination offers a significant advantage over previous external storage-based data dissemination approaches, where an external base station (BS) is used for collecting and storing the sensing data. If many queries are issued from nodes within the network [6], [4], an external storage-based scheme is very inefficient since data must be sent back and forth between the sensors and the BS, thus causing the nodes close to the BS to die rapidly due to energy depletion. Further, for sensor networks deployed in hostile environments such as a battlefield, an external BS may not be available because the BS is a very attractive target for physical destruction and compromise, thus becoming a single point of failure from both security and operation perspectives. In contrast, the operation of a DCS system does not assume the availability of a persistent BS; instead, mobile sinks (MSs) such as mobile sensors, users, or soldiers may be dispatched on demand to collect the stored data (or to perform other tasks) on appropriate occasions.

IEEE TRANSACTIONS ON MOBILE COMPUTING, VOL. 8, NO. 8, AUGUST 2009 1023

. M. Shao is with Microsoft Corp., 1 Microsoft Way, Redmond, WA 98052. E-mail: [email protected].

. S. Zhu is with the Department of Computer Science and Engineering and College of Information Sciences and Technology, Pennsylvania State University, 338F Information Sciences and Technology Building, University Park, PA 16802. E-mail: [email protected].

. W. Zhang is with the Department of Computer Science, Iowa State University, 109 Atanasoff Hall, Ames, IA 50011-1040. E-mail: [email protected].

. G. Cao is with the Department of Computer Science and Engineering, Pennsylvania State University, 354G Information Sciences and Technology Building, University Park, PA 16802-6106. E-mail: [email protected].

. Y. Yang is with the Department of Computer Science and Engineering, Pennsylvania State University, 344 Information Sciences and Technology Building, University Park, PA 16802. E-mail: [email protected].

Manuscript received 15 Feb. 2007; revised 10 Jan. 2008; accepted 21 Oct. 2008; published online 21 Nov. 2008. For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference IEEECS Log Number TMC-0054-0207. Digital Object Identifier no. 10.1109/TMC.2008.168.

1536-1233/09/$25.00 © 2009 IEEE. Published by the IEEE CS, CASS, ComSoc, IES, & SPS

Authorized licensed use limited to: Penn State University. Downloaded on January 14, 2010 at 14:53 from IEEE Xplore. Restrictions apply.


The previous DCS systems, however, were not designed with security in mind. All data of the same event type are stored at the same node [7], [2] or several nodes [3], [4] based on a publicly known mapping function. As long as the mapping function and the types of events monitored in the system are known, one can easily determine the locations of the sensors storing different types of data. In our previous example, a zoologist can use the DCS system to locate any animals of interest, whereas a hunter is only permitted to hunt certain kinds of animals (e.g., boars and deer) but not the protected ones (e.g., elephants). Nevertheless, a nonconforming hunter may acquire the locations of the protected animals for hunting purposes. As such, security and privacy should be provided for DCS systems.

Securing DCS systems is complicated by the network scale, the highly constrained system resources, the difficulty of dealing with node compromises, and the fact that sensor networks are often deployed in unattended and hostile environments. The low cost of sensor nodes (e.g., less than $1 as envisioned for smart dust [8]) precludes the built-in tamper-resistance capability of sensor nodes. Thus, the lack of tamper resistance coupled with the unattended nature gives an adversary the opportunity to break into the captured sensor nodes to read out sensor data and cryptographic keys.

We present pDCS, a privacy-enhanced DCS system for unattended sensor networks. To the best of our knowledge, pDCS is the first one to provide security and privacy to DCS networks. Specifically, pDCS provides the following features. First, even if an attacker can compromise a sensor node and obtain all its keys, he cannot decrypt the data stored in the compromised node. Second, after an attacker has compromised a sensor node, he cannot know where this compromised node stored its event data generated in the previous time intervals. Third, pDCS includes very efficient key management schemes for revoking a compromised node once its compromise has been detected, thus preventing an attacker from knowing the future storage location for particular events. Finally, pDCS provides a novel query optimization scheme to significantly reduce the message overhead without losing any query privacy.

The salient features of pDCS are due to the following techniques. Instead of using a publicly known mapping function, pDCS provides private data-location mapping based on cryptographic keys. The keys are assigned and updated to thwart outsider attackers or insider attackers from deriving the locations of the storage cells for previous sensor data. The key management scheme for updating compromised keys makes a seamless mapping between location keys and logical keys. On the other hand, as private mapping may reduce the efficiency of sending MS queries, we also propose several query optimization techniques based on Euclidean Steiner Tree (EST) [9] and keyed Bloom Filter (KBF) to minimize the query overhead while providing certain query privacy.

The rest of this paper is organized as follows: We first describe the related work in Section 2 and then discuss the assumptions and design goal in Section 3. Section 4 presents several secure mapping functions, followed by a key management scheme and optimization technique for sending queries. In Section 5, we compare the performance of several query methods. Finally, we conclude this paper in Section 6.

2 RELATED WORK

We introduce the related work in three categories: privacy and anonymity, key management, and location-based forwarding.

2.1 Location Privacy and Communication Anonymity

There are mainly two approaches for restricting MS access to sensor data: policy enforcement and data perturbation. In the spirit of the first approach, Myles et al. [10], Hengartner and Steenkiste [11], and Snekkenes [12] studied the issue of specifying location privacy policies on which access control decisions are based. Alternatively, anonymity mechanisms could also be employed to provide the required level of privacy by properly perturbing the sensor data before its release. Gruteser et al. [13] proposed techniques such as data cloaking and hierarchical data aggregation to prevent an attacker from tracking the precise location of an individual monitored by sensors. The main difference between our work and the previous work is that we achieve sensor data privacy in an unattended environment by encryption as well as random location mapping, instead of policy enforcement or data perturbation. These techniques are complementary to each other and could be applied jointly if needed.

Deng et al. [14] studied how to conceal BSs from outsider attackers. In their schemes, all sensor nodes transmit at a constant rate and the mix technique [15] is used to hide sender-receiver correlations. Ozturk et al. [16] studied an outsider attack in which a single attacker tries to trace back to the data source by analyzing the observed traffic in sensor networks where sensor nodes report sensing data to a fixed external sink. To defend against the attack, a phantom flooding scheme is proposed to disturb the traffic pattern and mislead the attacker. Currently, pDCS does not include its own anonymous communication techniques yet. Instead, it relies on one of the existing schemes to provide the service when required. In [17], we proposed a preliminary version of pDCS, but important issues such as key management and load balancing evaluation have not been addressed yet.

Fig. 1. A DCS-based sensor network which can be used by zoologists (who are authorized to know the locations of all animals) and hunters (who should only know the locations of boars and deer, but not elephants).

2.2 Key Management for Sensor Networks

Key management for sensor networks has been extensively studied recently. There are pairwise key establishment schemes using a trusted third party (BS) [18], exploiting the initial trustworthiness of newly deployed sensors [19], and based on the framework of probabilistic key predeployment [20], [21], [22], [23], [24], [25], [26], [27]. pDCS may adopt one of these pairwise key establishment schemes according to security requirements and resource constraints.

Many logical-key-tree (LKH)-based group key management schemes have been proposed for secure multicast in wired networks, including LKH [28], ELK [29], subset-difference [30], to name a few. Since these schemes were not designed for sensor networks, they are less optimized and less efficient when employed in sensor networks directly. A few schemes also discussed the management of group keys in sensor networks. In [19], an updated group key is distributed in a network through hop-by-hop encryption by trading computation for communication. In [31], geographical information is exploited to map an LKH [28] to the physical tree structure so as to optimize the energy expenditure of a group rekeying operation. There are mainly two differences between our key management scheme and the above. First, in addition to group key updating, in pDCS row keys and cell keys also need to be updated upon a node revocation. Second, in pDCS, the key encryption keys (KEKs) in an LKH are location-dependent keys and our cell-based network partition allows our scheme to further reduce rekeying overhead.

2.3 Location-Based Forwarding

Location-based forwarding has been studied for both mobile ad hoc networks and sensor networks. The location-aided routing [32] was proposed to reduce the cost of discovery by restricted area flooding when the uncertainty about a destination is limited. Greedy routing schemes, e.g., GPSR [5], choose the next hop that provides most progress toward the destination. In these schemes, the delivery of packets is guaranteed by planarizing the network graph and applying detour algorithms which avoid obstacles using the “right hand rule” strategy. Niculescu and Nath [33] proposed trajectory-based routing, in which the source encodes trajectory to traverse and embeds it into each packet. Upon the arrival of each packet, intermediate nodes employ greedy forwarding techniques such that the packet follows its trajectory as much as possible. With this scheme, routing becomes source-based while there is no need for maintaining routing tables at intermediate nodes. We note that the scheme in [33] is suitable for a regular shape trajectory, not for totally random shape trajectory, which is the case in pDCS.

pDCS employs two approaches for forwarding query packets to randomly distributed locations. One is trajectory-based routing, in which the trajectory is explicitly encoded in each packet using EST. In another approach, a novel KBF technique is applied to encode the trajectory implicitly, which can achieve destination anonymity while guaranteeing that each query packet reaches its destination.
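The greedy rule mentioned above for schemes such as GPSR (pick the neighbor that makes the most progress toward the destination) can be sketched as follows. This is a minimal illustration, not the GPSR protocol: it omits planarization and the right-hand-rule recovery, and the function name, node identifiers, and coordinates are hypothetical.

```python
import math
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]


def greedy_next_hop(current: Point, neighbors: Dict[str, Point], dest: Point) -> Optional[str]:
    """Return the neighbor closest to dest, or None when no neighbor is closer than current."""
    best_id: Optional[str] = None
    best_dist = math.hypot(current[0] - dest[0], current[1] - dest[1])
    for node_id, pos in neighbors.items():
        d = math.hypot(pos[0] - dest[0], pos[1] - dest[1])
        if d < best_dist:  # strictly closer, so each hop makes progress
            best_id, best_dist = node_id, d
    return best_id
```

A return value of `None` marks a local minimum, the situation in which a real protocol would switch to its detour mode.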

3 MODELS AND DESIGN GOAL

3.1 Network Model

As in other DCS systems [7], [2], [3], our pDCS system also assumes that a sensor network is divided into cells (or grids) where each pair of nodes in neighboring cells can communicate directly with each other. A cell is the minimum unit for detecting events (referred to as detection cell) and for storing sensor data (referred to as storage cell); for example, a cell head coordinates all the actions inside a cell. Each cell has a unique ID and every sensor node knows in which cell it is located through a GPS when affordable. In case either GPS services are not available or GPS devices are too expensive, attack-resilient GPS-free localization techniques [34], [35], [36], [37] may be employed instead because pDCS does not rely on absolute coordinates. For example, in Verifiable Multilateration (VM) [34], distances are measured based on radio signal propagation time and it provides secure and reasonably accurate sensor positioning.

We assume the events of interest to the MSs are classified into multiple types. For example, when a sensor network is deployed for monitoring the activities and locations of the animals in a wild animal habitat, all the activities of a certain kind of animal may be considered as belonging to one event type.

We do not assume a fixed BS in the network. Instead, a trusted MS may enter the network at an appropriate time and work as the network controller for collecting data or performing key management. We also assume the clocks of sensor nodes in a network are loosely synchronized based on an attack-resilient time synchronization protocol [38], [39].

3.2 Attack Model

Given the unattended nature of a sensor network, an attacker may launch various security attacks in the network at all layers of the protocol stack [40], [41], [42]. Due to the lack of a one-for-all solution, in the literature, these attacks are studied separately and the proposed defense techniques are also attack specific. As such, instead of addressing all attacks, we will focus on the specific security problems in our pDCS network. We assume that in a pDCS network the (ultimate) goal of an attacker is to obtain the event data of his interest. To achieve this goal, an attacker may launch the following attacks:

. Passive attack. An attacker may passively eavesdrop on the message transmissions in the network.

. Query attack. An attacker may simply send a query into the network to obtain the sensor data of interest to him.

. Readout attack. An attacker may capture some sensor nodes and read out the stored sensor data directly. It is not hard to download data from both the RAM and ROM spaces of sensor nodes (e.g., Mica motes [43]).

. Mapping attack. In this attack, the goal of an attacker is to identify the mapping relation between two cells. Specifically, he may either identify the storage cell for a specific detection cell or reversely figure out the detection cell for a storage cell of his interest. Mapping attack is normally followed by a readout attack.


The passive attack can be relatively easily addressed by message encryption with keys of sufficient length, and the query attack can be addressed by source authentication [18] so that a node only answers queries from an authorized entity. Given that compromising nodes is much easier than breaking the underlying encryption/authentication algorithm, we assume that the readout attack and the mapping attack are more preferable to the attacker. Note that letting detection cells encrypt sensor data and store the encrypted data locally cannot address the readout attack because an attacker can read out the encryption keys from the captured sensor nodes as well.

3.3 Security Assumption

We assume that an authorized MS has a mechanism to authenticate broadcast messages (e.g., based on μTESLA [18]), and every node can verify the broadcast messages. We also assume that when an attacker compromises a node he can obtain all the sensitive keying material possessed by the compromised node. Note that although technically an attacker can compromise an arbitrary number of current generation of sensor nodes without much effort, we assume that only nodes in a small number (s) of cells have been compromised. For instance, it may not be very easy for sensor nodes to be captured because of their geographic locations or their tiny sizes. Also, the attacker needs to spend longer time on compromising more sensor nodes, which may increase the chance of being identified. For simplicity, we say a cell is compromised when at least one node in the cell is compromised. To deal with the worst scenario, we allow an attacker to selectively compromise s cells.

We assume the existence of anti-traffic analysis techniques if so required. If an attacker is capable of monitoring and collecting all the traffic in the network, he may be able to correlate the detection cells and the storage cells without knowing the mapping functions. Therefore, we assume one of the existing schemes [44], [14], [16], [45] may be applied to counter traffic analysis if the attacker is assumed to be capable of analyzing traffic.

3.4 Design Goal

Our main objective is to prevent an attacker from obtaining the data of his interest in a DCS network through various attacks. In more detail, our goal is to address the types of attacks that are specific to pDCS, i.e., passive attack, query attack, readout attack, and mapping attack. As passive attack and query attack are easy to address, below we mainly discuss the requirements to be met for addressing the readout attack and the mapping attack:

. Event data confidentiality. Even if an attacker can compromise a sensor node and obtain all its keys, he should be prevented from knowing the event data stored in the compromised node.

. Backward event privacy. An attacker should be prevented from obtaining the previous sensor data for an event of his interest even if he has compromised some nodes.

. Forward event privacy. We should also thwart (if not completely prevent) an attacker from obtaining the sensor data regarding an event in the future even if he has compromised some nodes.

. Query privacy. An MS query should reveal as little location information of the sensor data as possible. For example, if multiple events are mapped and stored in the same storage cell, a query for one of the events will also reveal the storage cell of the other events. As such, an attacker may eavesdrop on MS queries to minimize his efforts in launching a mapping attack.

In addition, as sensor networks are scarce in resources, especially the nonregenerative power, our security mechanisms should be resource efficient. For example, we should avoid network-wide flooding and public-key operations if at all possible. Especially, as communication normally consumes much more energy than computation [23], we will prefer computation to communication when they achieve the same goal.

4 pDCS: PRIVACY-ENHANCED DATA-CENTRIC SENSOR NETWORKS

In this section, we first give an operational overview of pDCS. Then, we present several schemes to randomize the mapping function and propose efficient protocols to manage various keys involved in the system. Finally, we describe optimization techniques for issuing queries.

4.1 The Overview of pDCS

First of all, we assume that each sensor possesses five types of keys, including master key (shared only with the MS), pairwise key (shared with every neighbor), cell key (shared by all sensors in the same cell), row key (shared by all sensors in the same row), and group key (shared by all sensors in the network). Different keys are useful in different schemes or under different circumstances. The details of key management will be discussed in Section 4.3.

Our solution involves six basic steps in handling sensed data: determine the storage cell, encrypt, forward, store, query, and decrypt. We demonstrate the whole process through an example in which a cell u has detected an event E:

1. Cell u first determines the location of the storage cell v through a keyed hash function.

2. u encrypts the recorded information ($M_e$) with its cell key. To enable MS queries, either the event type E or the detection time interval T is in its plain text format, subject to the requirement of the application.

3. u then forwards the message toward the destination storage cell. Here, techniques [14] should be applied to prevent traffic analysis and to prevent an attacker from injecting false packets.

4. On receiving the message, v stores it locally.

5. If an authorized MS is interested in the event E that occurred in cell u, it determines the storage cell v and issues a query (optimized query schemes are discussed in Section 4.4).

6. After it retrieves the data of interest, the MS decrypts it with the proper cell key (more details are discussed in Section 4.5).

The first step is for defending against the mapping attack.Without the mapping key, an attacker cannot determine themapping from the detection cell to the storage cell. The

1026 IEEE TRANSACTIONS ON MOBILE COMPUTING, VOL. 8, NO. 8, AUGUST 2009

Authorized licensed use limited to: Penn State University. Downloaded on January 14, 2010 at 14:53 from IEEE Xplore. Restrictions apply.

Page 5: IEEE TRANSACTIONS ON MOBILE COMPUTING, VOL. 8, NO. 8, …mcn.cse.psu.edu/paper/mshao/tmc09.pdf · instead, mobile sinks (MSs) such as mobile sensors, users, or soldiers may be dispatched

second step is for preventing the readout attack. Since thestorage cell v does not possess the decryption key for Me, anattacker is prevented from deciphering Me after he hascompromised a node in v. Step 3 and Step 4 deal withforwarding and storing the sensed data, Step 5 shows thebasic operation for issuing an MS query, and Step 6describes the local processing of retrieved data.

The following sections focus on the performance and security issues related to Steps 1, 2, 5, and 6. For Steps 3 and 4, we currently assume existing schemes [14], [4]; we believe research in these areas bears its own importance and deserves independent study.

4.2 Privacy-Enhanced Data-Location Mapping

From the system overview, we can see that an attacker can launch various attacks if he can find the correct mapping relation between a detection cell and a storage cell. This motivated our design of secure mapping functions to randomize the mapping relationship among cells. Below, we present three representative secure mapping schemes in order of increasing privacy. The following notation is used in the discussion. Let N be the number of cells in the field, and let $N_r$ and $N_c$ be the number of rows and the number of columns, respectively. Every cell is uniquely identified by $L(i, j)$, with $0 \le i \le N_r - 1$ and $0 \le j \le N_c - 1$.

To quantify and compare the privacy levels of different schemes, we assume that an attacker is capable of compromising a total of s cells of his choice. To simplify the analysis, we assume that there are m detection cells for the event of interest to the attacker, and the locations of these m cells are independent and identically distributed (iid) over the N cells. (In real applications, the locations of these m detection cells may be correlated.) We further introduce the concept of event privacy level (EPL).

Definition 1. EPL is the probability that an attacker cannot obtain both the sensor data and the encryption keys for an event of his interest.

According to this definition, the larger the EPL, the higher the privacy. This definition can be easily extended to the concepts of backward event privacy level (BEPL) and forward event privacy level (FEPL).

4.2.1 Scheme I: Group-Key-Based Mapping

In this scheme, all nodes store the same type of event E in the same location $(L_r, L_c)$ based on a group-wide shared key K. Here,

$$L_r = H(0|K|E) \bmod N_r, \qquad L_c = H(1|K|E) \bmod N_c. \tag{1}$$

To prevent the stand-alone readout attack, a cell should not store its data in its own cell. Hence, if a cell $L(x, y)$ finds out that its storage cell is the same, that is, $L_r = x$ and $L_c = y$, it applies H on $L_r$ and $L_c$ until either $L_r \ne x$ or $L_c \ne y$. To simplify the presentation, however, we will not mention this case again in the following discussions.
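The mapping of Eq. (1), including the re-hash applied when a cell would otherwise store its own data, can be sketched as follows. This is only an illustration under stated assumptions: SHA-256 stands in for H, the byte encoding of `0|K|E` is arbitrary, and the attempt counter mixed into the re-hash is an addition of this sketch (it avoids a fixed point of the hash) rather than something the scheme prescribes.

```python
import hashlib


def H(data: bytes) -> int:
    """Stand-in for the hash H: SHA-256 read as an integer (an assumption)."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def group_key_mapping(key: bytes, event: bytes, n_rows: int, n_cols: int,
                      own_cell: tuple[int, int]) -> tuple[int, int]:
    """Map event type E to a storage cell (L_r, L_c) in the spirit of Eq. (1)."""
    l_r = H(b"0|" + key + b"|" + event) % n_rows
    l_c = H(b"1|" + key + b"|" + event) % n_cols
    attempt = 0
    # A cell must not store its data in itself: re-hash until the cell differs.
    while (l_r, l_c) == own_cell:
        attempt += 1
        salt = attempt.to_bytes(4, "big")  # hypothetical salt, not in the paper
        l_r = H(salt + l_r.to_bytes(4, "big")) % n_rows
        l_c = H(salt + l_c.to_bytes(4, "big")) % n_cols
    return l_r, l_c
```

Every node holding K computes the same cell for the same event type, which is what lets an MS locate the data with a single message.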

Type 1 query. An MS can answer the following query with one message: what is the information about an event E? This is because all the information about event E is stored in one location. An MS first determines the location based on the key K and E, then sends a query to it directly to fetch the data by, e.g., the GPSR protocol [5] (shortly we will discuss several query methods with optimized performance and higher query privacy).

Security and performance analysis. In this scheme, all m detection cells are mapped to one storage cell. An attacker first randomly compromises a node to read out the group key, based on which he locates the storage cell for the event. Because the data stored in the compromised node were encrypted by individual cell keys and the IDs of the detection cells were also encrypted, the attacker has to randomly guess the IDs of these m detection cells. Assume that an attacker can compromise up to s cells. If the first compromised cell is the storage cell$^1$ (with probability $1/N$), the attacker will randomly compromise $(s - 1)$ cells from the remaining $(N - 1)$ cells. There are $\binom{N-1}{s-1}$ combinations in total, among which $\binom{N-1-m}{s-1-i}\binom{m}{i}$ combinations correspond to the case where i out of the m detection cells are compromised. On the other hand, in the case when the first compromised node is not the storage cell (with probability $(N - 1)/N$), the attacker first compromises the storage cell, then randomly compromises $(s - 2)$ cells from the remaining $(N - 2)$ cells. There are $\binom{N-2}{s-2}$ combinations in total, among which $\binom{N-2-m}{s-2-i}\binom{m}{i}$ combinations correspond to the case where i out of the m detection cells are compromised. Also note that an attacker can only obtain $i/m$ of the event data when i out of m detection cells are compromised. Let $B_1 = \min(s - 1, m)$ and $B_2 = \min(s - 2, m)$; then the BEPL of this scheme is

$$p_b^1(m, s) = 1 - \frac{1}{N} \sum_{i=1}^{B_1} \frac{i}{m} \binom{N-1-m}{s-1-i} \binom{m}{i} \bigg/ \binom{N-1}{s-1} - \frac{N-1}{N} \sum_{i=1}^{B_2} \frac{i}{m} \binom{N-2-m}{s-2-i} \binom{m}{i} \bigg/ \binom{N-2}{s-2}.$$

Fig. 2 shows the analytical result of BEPL as a function of m and s for a network size of $N = 20 \times 20 = 400$ cells, from which we can make two observations. First, without surprise, BEPL decreases with s. Second, BEPL does not change with m. This is due to the tradeoff between the number of detection cells and storage cells that are probably compromised and the fraction of event data possessed by the compromised storage cells.

1. For simplicity, we ignore the case when the first compromised cell is a detection cell. Our study shows that the error introduced by this simplification is negligible.

Fig. 2. The BEPL as a function of m and s, where m is the number of detection cells and s is the number of compromised cells.
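The Scheme I BEPL expression can be evaluated directly, which also makes it easy to check the two observations about s and m. The sketch below is a plain transcription of the two sums using exact rational arithmetic; the function and parameter names are illustrative, and it assumes $2 \le s$ and $m \le N - 2$ so that every binomial coefficient is well defined.

```python
from fractions import Fraction
from math import comb


def bepl_scheme1(m: int, s: int, n_cells: int) -> float:
    """BEPL of Scheme I; assumes 2 <= s and m <= n_cells - 2."""

    def expected_fraction(pool: int, picks: int) -> Fraction:
        # Expected fraction i/m of detection cells hit when `picks` cells are
        # drawn at random from `pool` cells, m of which are detection cells.
        total = comb(pool, picks)
        return sum(
            (Fraction(i, m) * comb(pool - m, picks - i) * comb(m, i) / total
             for i in range(1, min(picks, m) + 1)),
            Fraction(0),
        )

    n = n_cells
    # Case 1: the first compromised cell happens to be the storage cell.
    first_is_storage = Fraction(1, n) * expected_fraction(n - 1, s - 1)
    # Case 2: one compromise is spent on the storage cell afterwards.
    first_not_storage = Fraction(n - 1, n) * expected_fraction(n - 2, s - 2)
    return float(1 - first_is_storage - first_not_storage)
```

For s = 40 and N = 400 this evaluates to about 0.905 for any admissible m, since each sum is the mean of a hypergeometric distribution divided by m.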

Suppose the attacker compromises s cells including the storage cell at time $t_0$. He can come back at a time $t_1$ in the future to obtain the event data from the storage cell, and then simply decrypt all the data that were detected by these s cells between $t_0$ and $t_1$. Assume that m cells will detect the event between $t_0$ and $t_1$ and that the locations of these m cells are independent and identically distributed over the N cells. On average, $ms/N$ of the s compromised cells are detection cells, and they will provide the encryption keys. Hence, the FEPL of this scheme is simply

$$p_f^1(m, s) = 1 - (ms/N)/m = 1 - s/N.$$

Note that this formula holds after the attacker has compromised s cells and cannot compromise any more cells. We do not consider the FEPL during the process of compromising s cells.

Because all information about one event is stored in one location, Scheme I is subject to a single point of failure. Furthermore, neither the traffic load nor the resources for storing the information are uniformly distributed among the nodes.

4.2.2 Scheme II: Time-Based Mapping

In this scheme, all nodes store the event E occurring in the same time interval T (including a start time and an end time; the duration is denoted as $|T|$) into the same location $(L_r, L_c)$ based on a group-wide shared key $K_T$:

$$L_r = H(0|K_T|E|T) \bmod N_r. \tag{2}$$

Similarly, $L_c = H(1|K_T|E|T) \bmod N_c$. In addition, every sensor node maintains a timer which fires periodically with time period $|T|$. When its timer fires, a node derives the next group key $K_T = H(K_T)$. Finally, it erases the previous key $K_T$.
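A minimal sketch of the time-based mapping and the periodic key update is given below. SHA-256 as H, the byte encoding of the hash input, the integer interval index, and the class and method names are all assumptions made for illustration. Note also that Python cannot guarantee that an overwritten key is physically erased from memory, which a real sensor node would have to ensure.

```python
import hashlib


def H(data: bytes) -> bytes:
    """Stand-in for the one-way hash H (SHA-256 is an assumption)."""
    return hashlib.sha256(data).digest()


class TimeBasedMapper:
    """Per-node state for Scheme II: the current group key K_T and the grid size."""

    def __init__(self, initial_key: bytes, n_rows: int, n_cols: int):
        self.key = initial_key
        self.n_rows = n_rows
        self.n_cols = n_cols

    def on_timer(self) -> None:
        """Called once per period |T|: K_T = H(K_T); the old key is dropped."""
        self.key = H(self.key)

    def storage_cell(self, event: bytes, interval: int) -> tuple[int, int]:
        """Storage cell for event E in interval T, following Eq. (2)."""
        t = interval.to_bytes(8, "big")
        body = self.key + b"|" + event + b"|" + t
        l_r = int.from_bytes(H(b"0|" + body), "big") % self.n_rows
        l_c = int.from_bytes(H(b"1|" + body), "big") % self.n_cols
        return l_r, l_c
```

Because the update is a one-way hash, a node captured in the current period holds no information from which earlier keys, and hence earlier storage cells, could be recomputed.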

Type 2 query. An MS can answer the following query with one message: what is the information about the event E during the time interval T? This is because the information about E in T is stored in one location. An MS first determines the location based on $K_T$, E, and T, and then sends a query to fetch the data.

Security and performance analysis. Due to the use of the one-way hash function, an attacker cannot derive the old group keys from the current group key of a captured node. Hence, the locations for storing the events that occurred during the previous time periods are not derivable. An attacker has to randomly guess the previous storage cells and detection cells for the event of his interest. The BEPL $p_b^2(m, s)$ of the previous data is very complicated to derive because it depends on the spatial and temporal distribution of the m detection cells and on the number of previous storage cells for the event, which in turn depends on the number of previous key updating periods and the probability of hash collisions. For ease of analysis, we ignore the case where a cell serves as both a detection cell and a storage cell. Under this assumption, on average, an attacker can correctly guess an $s/N$ fraction of the detection cells and an $s/N$ fraction of the storage cells. Only when these detection cells are mapped to these storage cells can the attacker decrypt the encrypted data. As such,

$$p_b^2(m, s) = 1 - (s/N)(s/N) = 1 - \left(\frac{s}{N}\right)^2.$$

Consider the case $s = 40$ and $N = 400$: the BEPL of Scheme II is 99 percent. From Fig. 2, we can see that the BEPL of Scheme I under the same condition is slightly over 90 percent. Thus, Scheme II provides higher BEPL (i.e., higher backward privacy) than Scheme I.

There are two cases for the FEPL. If the attacker changes the code of the compromised nodes such that in the future these nodes keep their detected event data locally, the FEPL $p_f^2(m, s)$ of this scheme is simply $1 - s/N$. However, if the compromised nodes follow our protocol and hence do not keep a local copy of their data, the FEPL will increase. This is because in the future the event data might be forwarded to new storage cells that are not controlled by the attacker (who is assumed not to be able to compromise more than s cells). Consider that every storage cell used in the future might have been compromised with probability $s/N$; in this case, the FEPL $p_f^2(m, s)$ is the same as the BEPL, i.e., $p_f^2(m, s) = p_b^2(m, s) = 1 - (s/N)^2$.

Compared to Scheme I, both the traffic load and the resources for storing the information in Scheme II are more uniformly distributed among all the cells.

4.2.3 Scheme III: Cell-Based Mapping

In this scheme, all the nodes in the same cell $L(i, j)$ of the gridded sensor field store in the same location $(L_r, L_c)$ the same type of event E occurring during a time interval T, based on a cell key $K_{ij}$ shared among all the nodes in the cell $L(i, j)$. Here,

$$L_r = H(0|i|j|E|K_{ij}|T) \bmod N_r \tag{3}$$

and $L_c$ is computed similarly. This scheme differs from the previous schemes in two aspects. First, in this scheme, every node in cell $L(i, j)$ updates the cell key $K_{ij}$ periodically based on H, such as $K_{ij} = H(K_{ij})$, and then erases the old cell key to achieve backward event privacy. Second, since cell keys are also used for encryption, the updating of cell keys leads to a change of encryption key for the same event detected by the same cell but in different time periods.

Type 3 query. An MS can answer the following query with one message: has event E happened in cell $L(i, j)$ during the time interval T? An MS first determines the location based on the key $K_{ij}$, T, E, and the detection cell $L(i, j)$ of interest, then sends a query to the cell to fetch the data.

Security and performance analysis. The updating of cell keys prevents an attacker from deriving old cell keys based on the current cell key of a compromised cell. Hence, the event data recorded in the previous periods are indecipherable irrespective of the number of compromised cells (the network controller, however, still keeps the older keys to decrypt previous event data). In other words, the BEPL of this scheme is

$$p_b^3(m, s) = 1.$$

Clearly, Scheme III provides the highest BEPL.


The FEPL $p_f^3(m, s)$ of this scheme is the same as that in Scheme II. It can also be seen that this scheme is the least subject to the single point of failure problem compared to the previous schemes. Moreover, both the traffic load and the resources for storing the information are the most uniformly distributed among all the nodes.

4.2.4 Comparison of Different Mapping Schemes

Above, we have presented three data-to-location mapping schemes with increasing privacy and complexity. These three mapping schemes certainly do not exhaust the design space, because we have three dimensions (time, space, and key) to manipulate. In the Appendix, we further introduce a row-based mapping scheme. In general, the higher the event privacy, the larger the message overhead for a query. On the other hand, these schemes may be used simultaneously based on the levels of privacy required by different types of data.

Next, we use simulations to compare the message overhead of the three mapping schemes: group-key-based mapping, time-based mapping, and cell-based mapping. Message overhead is defined as the total number of transmission hops of all the messages sent out by the detection cells toward their storage cells. The simulations were run for 20,000 time units in a DCS network with $20 \times 20$ cells. In each time unit, 10 events are generated from randomly selected cells and a random event type ID (ranging from 1 to 3) is assigned to each event. After an event is sensed in a cell, the cell calculates the storage cell coordinates based on the mapping scheme and forwards a message toward it.

Fig. 3 shows that the amortized message overhead (message overhead per time unit per cell) increases linearly with the number of events. We observe that cell-based mapping incurs a slightly higher message overhead than the other two schemes. Also, even when there are as many as 50 events happening in one time unit, the amortized message overhead is low, e.g., 1.2 in group-key-based mapping and 1.39 in cell-based mapping.

In Fig. 4, we use 3D plots to show the message overhead distribution over a plane of cells. We observe that the message overhead is the most balanced with the cell-based mapping scheme and the least balanced with the group-key-based mapping scheme. In general, when message overhead is more balanced among all the cells, the network can have a longer lifetime. Note that we also varied the time period $|T|$, the number of event types, and the event rate in each time unit. The message overhead distributions of these mapping schemes are similar.

Finally, we briefly mention the memory usage of sensor nodes. Since sensed data have to be stored somewhere in the network, the overall memory requirement is the same in all these mapping schemes. But because the cell-based scheme involves the most storage cells, intuitively it will best balance the memory requirement among sensor nodes. So we expect a memory usage distribution similar to the results in Fig. 4.

4.3 Key Management

So far we have seen several types of symmetric keys involved in pDCS. Now, we are ready to show the complete list of keys that are used in pDCS and discuss their purposes as well as efficient ways of managing these keys:

• Master key. Every node u has a master key $K_u$ shared only with the MS. Although the master key is not explicitly used in the data-location mapping schemes, it is necessary to secure the communications between the MS and individual sensors. In our application, for example, when a node wants to report the misbehavior of another node in the same cell to the MS, it may use the master key to calculate a message authentication code over the report; or, when the MS distributes a new cell key to a cell with a node to be revoked, the master keys of the remaining nodes in the cell can be used to encrypt the new cell key for secure key distribution. (The new cell key can be encrypted with the master key.)

• Pairwise key. Every pair of neighboring nodes shares a pairwise key. This key is used for 1) secure distribution of keying material such as a new cell key within a cell or 2) hop-by-hop authentication of data messages between neighboring cells for preventing packet injection attacks.

• Cell key. A cell key can be used 1) for encrypting sensed data to be stored in a storage cell, 2) for private cell-to-cell mapping, or 3) as a KEK for secure delivery of a row key.

• Row key. A row key can be used 1) for private row-to-cell mapping or 2) as a KEK for secure delivery of a group key.

• Group key. A group key is used 1) for secure group-to-cell mapping or 2) when the MS broadcasts a secure query or command to all the nodes.

Fig. 3. Overhead comparisons among different mapping schemes.

Fig. 4. Message overhead distribution of different mapping schemes. (a) Group-key-based mapping. (b) Time-based mapping. (c) Cell-based mapping.

Of these five key types, four (all except pairwise keys) can be organized into an LKH [28], [46], [47] data structure maintained by the MS, as shown in Fig. 5. The first-level key (i.e., the root key) is the group key, the second-level keys are row keys, the third-level keys are cell keys, and the fourth-level keys are master keys. The out-degrees of the key nodes at the first three levels are $N_r$, $N_c$, and $N_{ij}$, respectively, where $N_{ij}$ is the number of nodes in cell $L(i, j)$. As in LKH, every node only knows the keys on the path from its leaf key to the root key. Unlike in LKH, where group members do not share pairwise keys, in our scheme a node shares a pairwise key with every neighbor node. We will show shortly that pairwise keys help reduce the bandwidth overhead of a group rekeying operation for revoking a node.

Initial key setup. Next, we show how nodes establish all these types of keys initially. Pairwise keys can be established by an existing scheme introduced in Section 2.2. The group key and master keys are easy to establish by loading every node with them before network deployment. However, it might not be feasible to set up row keys and cell keys by preloading every node with the corresponding keys for large-scale sensor networks. For massive deployment of sensor nodes (e.g., through aerial scattering), it is hard to guarantee the precise locations of sensor nodes. If a node does not have the cell key for the actual cell it falls in, it will not be able to communicate with the other nodes in the same cell. To address this key setup issue, we need to establish row/cell keys after deployment.

Based on real experiments, Deng et al. [48] showed that it is possible for an experienced attacker to obtain copies of all the memory and data of a Mica2 mote within minutes after a node is captured. Zhu et al. [49] showed through experiments that it takes several seconds for a node with a reasonable node density (about 20 neighbors) to communicate with each neighbor and establish a secret key with each of them. As the number of messages exchanged in a localization protocol [34] is no more than that in [49], in pDCS we assume that during the initial network deployment phase, a node will not be compromised before it discovers its location based on a secure location scheme [34], [50]. This assumption also holds if the initial deployment is monitored.

With this assumption, our scheme works by preloading every node with the same initial network key $K_I$. A node located in cell $(i, j)$ can derive its cell key as follows:

$$K_{ij} = H(K_I, i|j). \tag{4}$$

After this, it erases $K_I$ from its memory completely. A row key can be established similarly as $K_i = H(K_I, i)$.
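The post-deployment derivation of Eq. (4) might look like the following sketch. HMAC-SHA256 is used here as the keyed function $H(K_I, \cdot)$, and the `cell|` and `row|` labels are added for domain separation; both are assumptions of this sketch, not choices stated in the paper. Zeroing the buffer is best effort only, since Python gives no guarantee that no other copy of the key survives in memory.

```python
import hashlib
import hmac


def derive_key(initial_key: bytes, label: bytes) -> bytes:
    """Keyed derivation H(K_I, label); HMAC-SHA256 is an assumption."""
    return hmac.new(initial_key, label, hashlib.sha256).digest()


def bootstrap_keys(initial_key: bytearray, row: int, col: int) -> dict[str, bytes]:
    """Derive the cell key K_ij and row key K_i, then wipe the K_I buffer."""
    k_i = bytes(initial_key)
    keys = {
        "cell": derive_key(k_i, b"cell|%d|%d" % (row, col)),  # K_ij = H(K_I, i|j)
        "row": derive_key(k_i, b"row|%d" % row),              # K_i  = H(K_I, i)
    }
    # Best-effort erasure of the caller's copy of K_I.
    for idx in range(len(initial_key)):
        initial_key[idx] = 0
    return keys
```

A node would call `bootstrap_keys` once, right after it has learned its cell coordinates from the localization protocol.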

Key updating upon node revocations. pDCS does not include a mechanism for detecting compromised nodes, although its key updating operation introduced below is triggered by the detection of node compromises. Instead, pDCS assumes the employment of such schemes [41], [40], [51], [52], [53].

Suppose node u in cell $L(2, 2)$ is compromised and its cell reports its compromise to the MS. For example, a majority of the other nodes in the cell each compute a MAC over the report using their master keys. Since node u knows keys $K_{22}$, $K_2$, and $K_g$, these keys will need to be updated to their new versions, say $K'_{22}$, $K'_2$, and $K'_g$. Based on LKH, the MS will need to encrypt each updated key with its child keys (new version if updated) and then broadcast all the encryptions. For example, the new group key $K'_g$ is encrypted by $K_0$, $K_1$, $K'_2$, and $K_3$, respectively; $K'_2$ is encrypted by $K_{20}$, $K_{21}$, $K'_{22}$, and $K_{23}$, respectively; and $K'_{22}$ is encrypted by $K_{v_0}$, $K_{v_1}$, $K_{v_2}$, and $K_{v_3}$, respectively. In general, $N_r + N_c + N_{ij} - 1$ encrypted keys will be broadcast and flooded in the network.
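The set of encryptions in this LKH-style rekeying can be enumerated mechanically. The sketch below lists (new key, key-encrypting key) pairs by name only, with no actual cryptography; the function name, the string labels for keys, and the node identifiers are hypothetical.

```python
def lkh_rekey_messages(n_rows: int, n_cols: int, cell_nodes: list[str],
                       revoked_cell: tuple[int, int],
                       revoked_node: str) -> list[tuple[str, str]]:
    """List (updated key, encrypting key) pairs for revoking one node."""
    i, j = revoked_cell
    msgs = []
    # New group key under every row key (the new version for the affected row).
    for r in range(n_rows):
        msgs.append(("K'_g", f"K'_{r}" if r == i else f"K_{r}"))
    # New row key under every cell key of row i (new version for cell (i, j)).
    for c in range(n_cols):
        msgs.append((f"K'_{i}", f"K'_{i},{c}" if c == j else f"K_{i},{c}"))
    # New cell key under the master key of every remaining node in the cell.
    for node in cell_nodes:
        if node != revoked_node:
            msgs.append((f"K'_{i},{j}", f"K_{node}"))
    return msgs
```

For a 4 x 4 grid with five nodes in the affected cell, the list has 4 + 4 + 5 - 1 = 12 entries, matching the count $N_r + N_c + N_{ij} - 1$.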


Fig. 5. The mapping of the physical network into an LKH and the rekeying packet flows for revoking node u. (a) A sensor network divided into cells. (b) An LKH (each dot denotes a key). (c) Demonstration of rekeying packet flows.


Next, we present a variant of the above scheme, which incorporates two techniques to further improve the rekeying efficiency. The first technique is based on network topology. Instead of flooding all the keys in the network, the MS sends them separately to different sets of nodes. This is based on the observation that nodes in different locations should receive different sets of encrypted keys. Suppose the node to be revoked is in cell $L(i, j)$. Nodes in row m ($m \ne i$) only need to receive the new group key $K'_g$ encrypted by their row key $K_m$. Hence, the MS only needs to send one encrypted key to the cell $(m, 0)$, and the key is then propagated to the other cells in row m. For nodes in row i, there are two scenarios. If the nodes are in column n ($n \ne j$), they only need to receive $K'_g$ encrypted with $K'_i$ and $K'_i$ encrypted with the cell key $K_{in}$. Otherwise, if they are located in the same cell as node u, each of them needs to receive $K'_{ij}$ encrypted with its own master key. In these scenarios, the MS sends $N_c + N_{ij} - 1$ keys to the cell $(i, 0)$, and the keys are then propagated in row i. Note that a cell can remove from the keying message the encrypted keys that are only of interest to itself before forwarding the message to the next cell. As such, the size of a keying message decreases as it is forwarded.

Our second technique trades computation for communication because communication is more energy consuming than computation in sensor networks. It has been shown in [23] and [50] that the energy consumption for encrypting or computing a MAC over an 8-byte packet based on RC5 is equivalent to that for transmitting one byte. As such, instead of sending the $N_{ij} - 1$ encryptions of $K'_{ij}$ to the cell $(i, j)$ across multiple hops, the MS may send only one of the encryptions to a specific node (e.g., $v_0$ in Fig. 5) and then request that node to securely propagate $K'_{ij}$ to all the nodes in the cell except u, using their pairwise keys for encryption.

Key management performance analysis. Now, we analyze the performance of our rekeying scheme upon a node revocation. For simplicity, we define the performance overhead C as the average number of keys that traverse each cell during a rekeying event. That is,

$$C = \sum_{i=0}^{N_r-1} \sum_{j=0}^{N_c-1} s_{ij} / (N_r N_c), \tag{5}$$

where $s_{ij}$ is the number of keys that have traversed cell $L(i, j)$. Here, we do not count the $N_{ij} - 1$ unicast transmission cost inside the cell $L(i, j)$ because this cost is relatively small when amortized over N cells. Without loss of generality, we assume the MS is in cell $L(0, 0)$ when distributing rekeying messages. From Fig. 5c, we can derive C as follows:

$$C = 1.5 + \left(N_c^2 + N_r^2 + 2N_c + 2\right) / (2 N_r N_c). \tag{6}$$

For a sensor network deployed in a square field, i.e., $N_c = N_r$, $C \approx 2.5$ keys when $N_r > 2$. Compared to the intuitive scheme that broadcasts all the LKH keys and thus has a per-cell overhead of $N_r + N_c + N_{ij} - 1$ keys, our rekeying scheme is far more efficient.
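As a quick numeric check of the comparison, the two per-cell overheads can be evaluated side by side. The sketch takes Eq. (6) to read $C = 1.5 + (N_c^2 + N_r^2 + 2N_c + 2)/(2N_rN_c)$; the function names and the example of 10 nodes per cell are illustrative assumptions.

```python
def rekey_overhead(n_rows: int, n_cols: int) -> float:
    """Average number of keys traversing a cell, as given by Eq. (6)."""
    return 1.5 + (n_cols**2 + n_rows**2 + 2 * n_cols + 2) / (2 * n_rows * n_cols)


def flooding_overhead(n_rows: int, n_cols: int, nodes_in_cell: int) -> int:
    """Per-cell overhead when all LKH encryptions are flooded network-wide."""
    return n_rows + n_cols + nodes_in_cell - 1


if __name__ == "__main__":
    for n in (4, 10, 20):
        print(n, round(rekey_overhead(n, n), 4), flooding_overhead(n, n, 10))
```

For a square grid with $N_r = N_c = n$ the expression reduces to $2.5 + 1/n + 1/n^2$, so it approaches 2.5 as the grid grows, while the flooding overhead grows linearly in n.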

4.4 Improving the Query Efficiency

We have shown that the proposed mapping schemes are capable of answering queries of different granularity and can achieve different levels of privacy. Better privacy is normally achieved at the cost of larger query message overhead. For example, to answer a query like “Where were the elephants in the last three days,” one query message is enough in the group-key-based mapping; however, this may take multiple query messages in the cell-based mapping as the data are stored at multiple places. Next, we propose several techniques to decrease the query message overhead.

4.4.1 The Basic Scheme

Suppose an MS needs to send multiple query messages to multiple storage cells to serve a query. Due to the randomness of the mapping function, these storage cells may be separated by other cells. In the basic scheme, as shown in Fig. 6a, the MS sends one query message to each cell using a routing protocol such as GPSR [5]. Since each query message contains the query information and the ID of the destination storage cell, these query messages are different and have to be sent out separately. It is easy to see that this scheme has very high message overhead.

Another weakness of the basic scheme is its lack of query privacy. Query privacy is measured by the probability that an attacker cannot find the IDs of the storage cells from eavesdropped MS query messages. In the basic scheme, since the MS has to specify the IDs of the destination storage cells, the query privacy of this scheme, denoted by $P_1$, is $P_1 = 0$.


Fig. 6. Three schemes for delivering a query to the storage cells. (a) Basic scheme. (b) EST scheme. (c) BF scheme.


4.4.2 The Euclidean Steiner Tree (EST) Scheme

A natural solution to reduce the message overhead of the basic scheme is to organize the storage cells as a minimum spanning tree. In this way, the MS can first generate the minimum spanning tree which includes all the storage cells, and then send the query message to these cells following this minimum spanning tree. Although this solution increases the message size, it greatly reduces the number of query messages. Because a message includes much redundant header information, combining multiple messages can significantly reduce the overall message overhead. Similar to the basic scheme, the MS has to include the IDs of the destination storage cells in its query messages. Thus, the query privacy of this solution is still 0.

To further reduce the message overhead, we can use EST [9], [54], which has been shown to have better performance than the minimum spanning tree and is widely used in network multicasting. Fig. 6b shows an EST, which includes some cells other than the storage cells, called Steiner cells. Note that these Steiner cells can also help improve the query privacy because they add noise into the set of storage cells.

With EST, the cell in which the MS resides is the root cell. The MS constructs a query message, which contains the IDs of the cells in the EST, and sends it to its child cells using routing protocols such as GPSR. When a cell head receives a query message, it reconstructs an EST subtree by removing such information as its own ID and the IDs of its sibling nodes, keeping only the information about the subtree rooted at itself. Then, it forwards the query message with the EST subtree to each of its child cells. This recursive process continues until each storage cell in the EST receives the query message.
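The recursive pruning of the tree during forwarding can be sketched as follows. The tree is represented as nested dictionaries, routing between cells (e.g., GPSR) is abstracted away, and the function names and the example tree are hypothetical. The log records, for each transmission, how many cell IDs the message still carries.

```python
def count_cells(children: dict) -> int:
    """Number of cells in a forest given as {cell: {child: {...}}}."""
    return sum(1 + count_cells(grand) for grand in children.values())


def forward_query(cell, children: dict, log=None) -> list:
    """Forward a query from `cell` down its subtree.

    Each child receives only the subtree rooted at itself, so the message
    shrinks on the way down. Returns (sender, receiver, ids_carried) tuples.
    """
    if log is None:
        log = []
    for child, subtree in children.items():
        log.append((cell, child, 1 + count_cells(subtree)))
        forward_query(child, subtree, log)
    return log


if __name__ == "__main__":
    # Hypothetical tree: the MS sits in cell (0, 0).
    est = {(1, 1): {(2, 3): {}, (3, 1): {(5, 0): {}, (4, 4): {}}}}
    for sender, receiver, ids in forward_query((0, 0), est):
        print(sender, "->", receiver, "carrying", ids, "cell IDs")
```

Each cell in the tree receives exactly one message, and sibling subtrees never see each other's cell IDs.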

To construct an EST, we use a technique proposed by Winter and Zachariasen [9]. Since their solution may return a noninteger Steiner cell, we replace each noninteger Steiner cell with the nearest integer Steiner cell. Let n denote the number of storage cells. With this solution, an EST spanning k ($2 \le k \le n$) cells has at most $k - 2$ integer Steiner cells, which means that at most $2k - 2$ cells are included in the Steiner tree. The use of Steiner cells can improve the query privacy up to $1 - \frac{n}{2n-2} = \frac{n-2}{2n-2}$. That is,

$$P_1 = 0 \le P_2 \le \frac{n-2}{2n-2}. \tag{7}$$

4.4.3 The Keyed Bloom Filter Scheme

Bloom Filter. A Bloom Filter [55] is a popular data structure used for membership queries. It represents a set $S = \{s_1, s_2, \ldots, s_n\}$ using k independent hash functions $h_1, h_2, \ldots, h_k$ and a string of m bits, each of which is initially set to 0. For each $s \in S$, we hash it with all the k hash functions and obtain their values $h_i(s)$ ($1 \le i \le k$). The bits corresponding to these values are then set to 1 in the string. Note that multiple values may map to the same bit (see Fig. 7 for an example). To determine whether an item $s'$ is in S, the bits $h_i(s')$ are checked. If all these bits are 1s, $s'$ is considered to be in S.

Since multiple hash values may map to the same bit, a Bloom Filter may yield false positives. That is, an element $s'$ is not in S but its bits $h_i(s')$ are collectively marked by elements in S. If the hash is uniformly random over m values, the probability that a bit is 0 after all the n elements are hashed and their bits marked is $\left(1 - \frac{1}{m}\right)^{kn} \approx e^{-kn/m}$. Therefore, the probability of a false positive is $\left(1 - \left(1 - \frac{1}{m}\right)^{kn}\right)^k \approx \left(1 - e^{-kn/m}\right)^k$. The right-hand side is minimized when

$$k = \ln 2 \times m/n, \tag{8}$$

in which case it becomes $\left(\frac{1}{2}\right)^k = (0.6185)^{m/n}$.
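A plain (unkeyed) Bloom Filter and the parameter choice of Eq. (8) can be sketched as below. Deriving the k hash functions from SHA-256 with an index prefix is an assumption of this sketch, as are the class and function names; any family of roughly independent hash functions would serve.

```python
import hashlib
import math


class BloomFilter:
    """m-bit Bloom Filter with k hash functions; the bit string is an int."""

    def __init__(self, m_bits: int, k_hashes: int):
        self.m = m_bits
        self.k = k_hashes
        self.bits = 0

    def _positions(self, item: bytes):
        # h_i(item): SHA-256 over an index prefix plus the item, reduced mod m.
        for i in range(self.k):
            digest = hashlib.sha256(i.to_bytes(4, "big") + item).digest()
            yield int.from_bytes(digest, "big") % self.m

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item: bytes) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def optimal_k(m_bits: int, n_items: int) -> int:
    """Number of hash functions from Eq. (8), rounded to an integer >= 1."""
    return max(1, round(math.log(2) * m_bits / n_items))


def false_positive_rate(m_bits: int, n_items: int, k_hashes: int) -> float:
    """Approximate false positive probability (1 - e^{-kn/m})^k."""
    return (1 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes
```

With m = 128 bits and n = 10 cell IDs, Eq. (8) gives k of about 8.87, which rounds to 9 hash functions.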

A Bloom Filter can be used to construct query messages. A basic approach is as follows: After an MS determines the location information of all the storage cells, it builds an EST and gathers the IDs of all the cells covered by the tree. The MS then inserts the IDs into a Bloom Filter, which is sent with other query information to the root cell of the EST using the GPSR algorithm (as shown in Fig. 6c). When a query message arrives at a cell, the cell checks the embedded Bloom Filter to determine which of its neighbors belong to the Bloom Filter, and then forwards the message to them. Recursively, every storage cell receives one query message.

Using a Bloom Filter for directed forwarding provides higher query privacy than EST. This is because the Bloom Filter introduces some additional noise cells, including the nonstorage cells connecting the Steiner cells in the EST and a small number of noise cells caused by false positives.

KBF. In the Bloom Filter-based scheme, an attacker can freely check whether a cell is one of the storage cells, although there could be a high false positive rate. To further improve the query privacy, we should disable the attacker’s capability of performing membership verification over a Bloom Filter. This motivated our design of a KBF scheme, which uses cell keys to “encrypt” the cell IDs before they are inserted. In this way, an attacker can derive none or only a small number of cell IDs from a query message. As such, the attacker has only a negligible probability of identifying the storage cells beyond random guessing.

In the KBF scheme, each cell ID is concatenated with the cell key of its parent node in the EST before it is inserted into the Bloom Filter. Specifically, to insert cell ID x, the bits corresponding to $H_i(x|k_p)$ ($i = 1, \ldots, k$) are set to 1, where $k_p$ is the cell key of the parent of cell x. When a query message arrives at a cell, the cell concatenates its own cell key with the ID of each neighboring cell that is not a neighbor of its own parent node (to avoid redundant computation and forwarding), and determines whether the neighbor is in the Bloom Filter. If it is, the message is forwarded to the neighbor. Algorithms 1 and 2 formally describe the ways to create a Bloom Filter and to forward a query message, respectively.

Fig. 7. A Bloom Filter with k hash functions.

Algorithm 1. Create a Bloom Filter
Input: an array of storage-cell Cartesian coordinates c[];
Output: Bloom Filter BF;
Procedure:
    initialize a Bloom Filter BF;
    build Steiner tree based on c[];
    for each cell u in the Steiner tree do
        p = parent of u; k_p = cell key of p;
        map (u|k_p) into BF;
    end for
    return BF;

Algorithm 2. Forward a query message
Input: a query message received by cell u, which includes a Bloom Filter BF.
Procedure:
    k_u = cell key of u;
    for each neighboring cell u' of u do
        if u' ≠ parent of u ∧ u' ≠ neighbor of the parent of u ∧ BF contains (u'|k_u) then
            forward the query message to u'
        end if
    end for
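As an illustration only, Algorithms 1 and 2 can be sketched in Python as follows. Several details are assumptions made for this sketch rather than taken from the paper: SHA-256 with an index byte stands in for the hash functions $H_i$, the filter is a 200-bit integer with 5 hash functions, the root of the tree (which has no parent) is not inserted, and the tree, neighbor lists, and keys in the example are hypothetical.

```python
import hashlib

M_BITS = 200    # assumed filter size: 25 bytes
K_HASHES = 5    # assumed number of hash functions


def positions(cell_id, key, m=M_BITS, k=K_HASHES):
    """Bit positions H_i(cell_id | key), i = 1..k (SHA-256 is a stand-in for H_i)."""
    out = []
    for i in range(1, k + 1):
        digest = hashlib.sha256(bytes([i]) + cell_id.encode() + b"|" + key).digest()
        out.append(int.from_bytes(digest[:4], "big") % m)
    return out


def create_filter(tree_parent, cell_keys):
    """Algorithm 1: insert every tree cell, keyed with the cell key of its parent."""
    bf = 0
    for cell, parent in tree_parent.items():
        if parent is None:      # assumption: the root has no parent and is skipped
            continue
        for pos in positions(cell, cell_keys[parent]):
            bf |= 1 << pos
    return bf


def contains(bf, cell_id, key):
    """Membership test; only a holder of `key` can run it meaningfully."""
    return all((bf >> pos) & 1 for pos in positions(cell_id, key))


def next_hops(bf, u, parent, neighbors, cell_keys):
    """Algorithm 2: neighbors of u to which the query message is forwarded."""
    skip = {parent} | set(neighbors.get(parent, ()))
    return [v for v in neighbors[u]
            if v not in skip and contains(bf, v, cell_keys[u])]


# Hypothetical four-cell example: tree A -> B -> C, with D as a non-tree neighbor.
keys = {c: c.encode() * 4 for c in "ABCD"}
tree = {"A": None, "B": "A", "C": "B"}
nbrs = {"A": ["B", "D"], "B": ["A", "C", "D"], "C": ["B"], "D": ["A", "B"]}
bf = create_filter(tree, keys)
hops_from_b = next_hops(bf, "B", "A", nbrs, keys)
```

In this example cell B forwards to C, because C was inserted under B's key, and never tests A or D, since both are excluded by the parent and parent-neighbor checks. A cell that does not hold B's key cannot run the same membership test for C.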

Query privacy. In this scheme, cell IDs are “encrypted” with cell keys before being inserted into the Bloom Filter. If an attacker has not compromised any cells in the EST, he will not know any cell keys. In this case, he cannot obtain any information about storage cells from an eavesdropped query message. Next, we consider the case in which the attacker has compromised some cells in the EST. A compromised cell that is contained in the EST can find out, from the received query message, which of its neighboring cells also belong to the EST. However, it cannot verify the membership of the other cells. In fact, this is one prominent advantage of the KBF scheme over the EST scheme. To make the EST scheme more secure, a straightforward extension would be to encrypt the EST. To enable every cell in the tree to access the information for correct forwarding of a query message, a group key would need to be used to encrypt the EST. Thus, an attacker could decrypt the entire EST as long as he can compromise one cell. Clearly, the KBF scheme offers much better query privacy than the EST scheme. The query privacy of the KBF scheme and that of the other schemes are compared in Section 5, and the results show that the KBF scheme has the highest privacy.

4.4.4 Plane Partition

The EST scheme reduces the number of query messages at the price of larger messages. The limited packet size, e.g., 29 bytes in TinyOS [56], may prevent the MS from piggybacking all the storage cell IDs together with the query information in a single packet. A Bloom Filter may be designed to fit in a packet, but to maintain a low false positive rate, only a limited number of cell IDs should be included in a packet. To address this problem, we use multiple Steiner trees, each of which is encoded into a single packet. Because partitioning a Steiner tree into multiple Steiner trees, known as the minimum forest partition problem, is NP-hard [57], we propose heuristics to perform the partition.

In Fig. 8a, the solid lines represent the EST, and the shaded areas along these solid lines are used by Bloom Filters to encode the EST. An intuitive partition method is to first cluster the storage cells in a top-down and left-right fashion, and then build a sub-EST within each partition. We can let the EST scheme and the KBF scheme have the same partitions and build the same sub-ESTs. After the partition, the MS sends a query to each partition at the same time. In this way, the message size can be reduced. Further, since multiple queries are sent out at the same time, the average query delay is also reduced.

Fanlike partition method. With the intuitive partition, the query message from the MS has to go through some redundant cells. For example, in Fig. 8a, the query message of the MS has to go through many cells before reaching the top partition. To address this problem, we change the Cartesian coordinates into polar coordinates. In this new coordinate system, the angles of the storage cells are within $[-\pi, \pi]$. The partition algorithm scans the plane from $-\pi$ to $\pi$ and collects enough storage cells into each partition. Fig. 8b shows one example of dividing the plane into three partitions using the Fanlike partition method. The detailed description is shown in Algorithm 3.

Algorithm 3. Fanlike partition method
Input: an array of Cartesian coordinates c[], where s is the size of the array and c[0] is the cell in which the MS resides;
Output: Partition Sets;
Procedure:
    initialize an array degree[] to store the degree of each cell;
    for i = 1 to s do
        degree[i] = tan^{-1}((c[i].y − c[0].y) / (c[i].x − c[0].x));
        if c[i] is in the 2nd quadrant then
            degree[i] += π;
        end if
        if c[i] is in the 3rd quadrant then
            degree[i] −= π;
        end if
    end for
    sort all the cells according to their degrees, and then uniformly divide the cells into the specified number of partitions and put them into a set array A[];
    return A;
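A minimal Python sketch of the Fanlike partition is given below for illustration. It assumes cells are given as (x, y) tuples and uses `math.atan2`, which returns an angle in $[-\pi, \pi]$ directly and so replaces the explicit quadrant adjustments. The function name, the balanced-split rule for leftover cells, and the sample coordinates are choices made for this sketch, not details from the paper.

```python
import math


def fanlike_partition(cells, ms, num_partitions):
    """Sort storage cells by polar angle around the MS and split them evenly."""
    ordered = sorted(cells, key=lambda c: math.atan2(c[1] - ms[1], c[0] - ms[0]))
    base, extra = divmod(len(ordered), num_partitions)
    parts, start = [], 0
    for p in range(num_partitions):
        # The first `extra` partitions take one additional cell (sketch choice).
        end = start + base + (1 if p < extra else 0)
        parts.append(ordered[start:end])
        start = end
    return parts


# Hypothetical storage cells around an MS located at the origin.
sample_cells = [(1, 0), (0, 1), (-1, 0), (0, -1), (1, 1), (-1, -1)]
sample_parts = fanlike_partition(sample_cells, (0, 0), 3)
```

With these six cells, the sweep from $-\pi$ to $\pi$ groups angular neighbors together, so each partition forms a wedge as seen from the MS.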

Fig. 8. Seventeen storage cells are partitioned into three parts. (a) Intuitive partition. (b) Fanlike partition.

4.5 MS Data Processing

Through the above query process, an MS can retrieve the message of his interest, which is encrypted by the cell key of the detection cell. To process the event, the MS needs to decrypt the message first. However, to prevent selective compromise attacks, in our design the ID of a detection cell is also encrypted. As such, the MS will try all the cell keys until the decrypted message is meaningful (e.g., including a source cell ID and following a certain format). The average number of decryptions is $N/2$. Though this may not be a big issue for a laptop-class MS, which can perform about 4 million en/decryptions per second [58], we will continue to design more efficient ways in our future work.
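As a rough worked example of this cost, assume that $N$ is the number of cell keys, that the correct key is equally likely to be at any position in the trial order, and that $N = 400$ as in the $20 \times 20$ grid used in Section 5 (taking that grid size here is an assumption made for illustration):

```latex
\[
E[\text{decryptions}] = \frac{1}{N}\sum_{t=1}^{N} t = \frac{N+1}{2} \approx \frac{N}{2},
\qquad
N = 400 \;\Rightarrow\;
\frac{200.5\ \text{decryptions}}{4 \times 10^{6}\ \text{decryptions/s}} \approx 50\ \mu\text{s per message}.
\]
```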

Another concern in pDCS is the number of keys that have to be possessed by an MS when the MS needs to decrypt data from many cells. If we assume that the MS cannot be compromised, we can simply load it with a single key, which is the initial group key $K_I$. From this initial key, the MS can derive the cell key $K_{ij}$ of each cell $(i, j)$ as $K_{ij} = H(K_I, i|j)$. This is, however, dangerous if the MS could be compromised, because all the cell keys would be exposed. This problem can be mitigated in the following way. Instead of applying its cell key for encryption directly, every node may first derive some variants of its cell key for specific events or time intervals using a hash function. These variance keys are then used to encrypt event messages. The MS will be loaded with the variance keys for the event of his interest. If the MS is compromised, the other variance keys are still secure.
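The two derivation steps can be sketched as follows. This is only an illustration: HMAC-SHA256 is used as a stand-in for the keyed hash $H$, and the input encoding, the function names, and the example event label are assumptions of this sketch.

```python
import hashlib
import hmac


def cell_key(initial_key: bytes, i: int, j: int) -> bytes:
    """K_ij = H(K_I, i|j), with HMAC-SHA256 standing in for H (assumption)."""
    return hmac.new(initial_key, f"{i}|{j}".encode(), hashlib.sha256).digest()


def variance_key(k_ij: bytes, event: str, interval: int) -> bytes:
    """One-way derivation of a per-event, per-interval key from a cell key."""
    return hmac.new(k_ij, f"{event}|{interval}".encode(), hashlib.sha256).digest()


# Hypothetical values: an MS interested only in "tank" events in interval 5
# is loaded with this variance key rather than with K_I or K_ij.
k_37 = cell_key(b"initial-group-key", 3, 7)
vk = variance_key(k_37, "tank", 5)
```

Because the derivation is one-way, an attacker who extracts `vk` from a compromised MS learns neither the cell key nor the variance keys for other events or intervals.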

5 PERFORMANCE EVALUATIONS

In this section, we evaluate and compare the performance of three query schemes: the Basic scheme, the EST scheme, and the KBF scheme. In our simulation setup, each query message contains the query information and the encoded query path. The query information occupies 4 bytes, which are used to represent time and event (see footnote 2), and 25 bytes are used to represent the query path. For evaluation purposes, we do not consider the overhead of source authentication.

In the EST scheme, the query path is encoded as a Steiner tree. Each node ID is represented by 2 bytes, so only 12 cell IDs can be encoded in each packet. In the KBF scheme, 25 bytes are used to encode the query path with a Bloom Filter, which is expected to achieve an acceptable false positive rate, say 0.1. Considering these limitations, we choose $(n, k) = (20, 5)$.
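The packet budget can be checked with a few lines of Python. The sketch assumes that $n$ denotes the number of cell IDs inserted into the filter and $k$ the number of hash functions, and it uses the standard Bloom Filter approximation $(1 - e^{-kn/m})^k$ for a filter of $m$ bits. Neither that reading of $(n, k)$ nor the use of this approximation is confirmed by the text here.

```python
import math

PATH_BYTES = 25     # bytes available for the encoded query path
ID_BYTES = 2        # bytes per node ID in the EST scheme


def false_positive_rate(m_bits: int, n_items: int, k_hashes: int) -> float:
    """Standard approximation (1 - e^{-kn/m})^k for a Bloom filter."""
    return (1.0 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes


est_ids_per_packet = PATH_BYTES // ID_BYTES           # 12 cell IDs
rate = false_positive_rate(PATH_BYTES * 8, 20, 5)     # m = 200 bits
```

Under these assumptions the rate for a single membership test comes out to roughly 0.01, which is below the 0.1 target mentioned above.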

These schemes are evaluated under various storage cell densities, ranging from $\frac{1}{40}$ to $\frac{1}{2.5}$. The storage cell density is defined as the ratio of the number of storage cells to the number of total cells in the plane. For example, with our setting of $20 \times 20$ cells, a density of $\frac{1}{10}$ means that there are about $400 \times \frac{1}{10} = 40$ storage cells.

Four metrics are used to evaluate the performance of the proposed schemes: the number of query messages, the average query delay, the maximum query delay, and the message overhead. The number of query messages is the total number of messages sent out by the MS for a query. The average query delay is the average of the query delays for different storage cells. The maximum query delay is the maximum among all the query delays. The message overhead is defined as the total number of transmitted hops of all the messages sent out by the MS to serve a query. In the KBF scheme, the message overhead also includes the extra messages due to false positives. As query messages are forwarded in the network in a hop-by-hop fashion, the number of query messages and the message overhead also proportionally reflect the communication costs of the sensor nodes.

5.1 Choosing the Partition Method

In this section, we evaluate the performance of EST with intuitive partition and EST with Fanlike partition. As shown in Fig. 9, the Fanlike partition method outperforms the intuitive method in terms of average query delay, maximum query delay, and message overhead. We do not show the number of messages, since both schemes have the same number of messages, which is determined by the packet size.

As discussed earlier, in the intuitive partition method, each query message is sent from the MS to the partition, which may go through many redundant cells and hence increase the message overhead. However, in the Fanlike partition, fewer redundant cells are involved, and hence the message overhead is lower. This also explains why the Fanlike partition has lower average and maximum query delay when compared to the intuitive partition.

2. Some applications may require more bytes; nevertheless, since we are interested in the comparative results of multiple schemes, the payload size normally will not affect the results much. Further, the time should be at the hour/minute level instead of the microsecond level, and hence needs fewer bits.

Fig. 9. Performance comparisons between different partitioning schemes. (a) Average query delay. (b) Maximum query delay. (c) Message overhead.

In Fig. 9a, with Fanlike partition, the average query delay drops as the storage cell density increases. This can be explained as follows. When the storage cell density is high, each partition is small. Therefore, the Steiner tree is limited within a small range and the zigzag paths from MS to storage cells tend to be shorter. This results in smaller average query delays.

The aforementioned reason also explains why the maximum query delay decreases as the storage cell density increases for the Fanlike partition in Fig. 9b. However, when the density is very low ($\frac{1}{40}$), the intuitive partition has a slightly lower maximum query delay than the Fanlike partition. We checked the simulation trace and found the following reason. When the density is $\frac{1}{40}$, there are about 10 storage cells. Because of the Steiner cells and the limit of 12 cell IDs per packet, a very small number (one or two) of cells are left over for the second packet. These leftover cells tend to be far away in the intuitive partition method but not in the Fanlike partition. As a result, the intuitive partition can achieve a slightly shorter maximum delay than the Fanlike partition method when the storage cell density is very low.
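The leftover effect follows from simple packet arithmetic, sketched below. The counts of three and four Steiner cells are hypothetical values chosen to reproduce the "one or two" leftover cells mentioned above; the simulation trace itself is not reproduced here.

```python
import math

IDS_PER_PACKET = 12   # 25-byte path field, 2 bytes per cell ID


def packets_needed(storage_cells: int, steiner_cells: int) -> int:
    """Number of packets required to carry every cell ID of the tree."""
    return math.ceil((storage_cells + steiner_cells) / IDS_PER_PACKET)


def cells_in_last_packet(storage_cells: int, steiner_cells: int) -> int:
    """How many cell IDs end up in the final packet."""
    total = storage_cells + steiner_cells
    return total - (packets_needed(storage_cells, steiner_cells) - 1) * IDS_PER_PACKET


# Density 1/40 on a 20 x 20 grid gives 10 storage cells; the Steiner cell
# counts are hypothetical.
for steiner in (3, 4):
    print(steiner, packets_needed(10, steiner), cells_in_last_packet(10, steiner))
```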

We also evaluated the performance of the KBF scheme under both partition methods. The results are similar to EST, where the Fanlike partition performs better. Thus, we use the Fanlike partition method in the following comparisons.

5.2 Performance Comparisons of Different Schemes

This section compares the performance of three schemes: the Basic scheme, the EST scheme, and the KBF scheme.

Fig. 10 compares the number of messages and the message overhead of the three schemes. As can be seen, both optimization schemes (EST and KBF) outperform the Basic scheme since the optimization schemes combine several messages into one. We can also see that the message overhead of the KBF scheme is higher than that of the EST scheme, although both schemes have a similar number of messages. This is because the query messages in the KBF scheme may go through some redundant cells due to false positives.

Figs. 11a and 11b compare the average delay and the maximum delay of the three schemes. As can be seen, the Basic scheme outperforms the other two. This is because in the Basic scheme, the query messages are sent directly to the storage cells in parallel along shortest paths, resulting in a lower query delay. Although EST and KBF can reduce the message overhead, the query delay is increased since the message has to go through many intermediate cells sequentially.

As shown in Figs. 11a and 11b, when the storage cell density is low, KBF outperforms EST in terms of query delay. To explain this, we need to understand the effects of the number of partitions. When the number of partitions is small and hence each partition is large, the path to each storage cell is more zigzag-like, which may result in long delay. As shown in Fig. 10a, when the density is low, EST has fewer messages and hence fewer partitions, which means that EST will have large partitions and long delay. Similarly, when the density is high, EST has more partitions and shorter delay.

In addition, as shown in Fig. 11c, the KBF scheme has the highest query privacy. Even after $s = 20$ cells have been compromised, the query privacy level is still above 83 percent.

Fig. 10. The message overhead of different schemes. (a) Number of messages. (b) Message overhead.

Fig. 11. Comparisons among different schemes. (a) Average query delay. (b) Maximum query delay. (c) Privacy.

In summary, there is a tradeoff among query delay, message overhead, and query privacy. The Basic scheme has the lowest delay but the highest message overhead and the lowest query privacy. The EST scheme and the KBF scheme can significantly reduce the number of messages and the message overhead with the same level of query delay. In particular, the query privacy level of KBF is far higher than that of the other schemes.

6 CONCLUSIONS AND FUTURE WORK

In this paper, we have proposed solutions on privacy support for DCS networks (pDCS). The proposed schemes offer different levels of location privacy and allow a tradeoff between privacy and query efficiency. pDCS also includes an efficient key management scheme that makes a seamless mapping between location keys and logical keys, and several query optimization techniques based on EST and Bloom Filter to minimize the query message overhead and increase the query privacy. Simulation results verified that the KBF scheme can significantly reduce the message overhead with the same level of query delay. More importantly, the KBF scheme can achieve these benefits without losing any query privacy.

To the best of our knowledge, this is the first paper to address privacy issues in DCS networks. As the initial work, we do not expect to solve all the problems. In the future, we will address other issues such as source anonymity, and look into other query techniques to balance the tradeoff between query delay and message overhead. Techniques for initial key setup without relying on a short safe time period are also needed.

APPENDIX

ROW-BASED MAPPING

In this scheme, all the nodes in the same row $i$ (or column) of the gridded sensor field store the same type of event $E$ occurring during $T$ in the same location $(L_r, L_c)$ based on a key $K_i$ shared only among all the nodes in row $i$. Here

\[
L_r = H(0|i|E|K_i|T) \bmod N_r, \tag{9}
\]

and $L_c$ is computed in a similar way. Instead of updating a group key as in Scheme II, in this scheme, every node updates its row key periodically based on $H$ and then erases the old row key to achieve backward event privacy.
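For illustration, the mapping in (9) and the periodic row-key update can be sketched as below. The sketch makes several assumptions that the text does not fix: SHA-256 stands in for $H$, fields are joined with a `|` separator, $N_r$ and $N_c$ are taken to be the numbers of rows and columns of the grid, and $L_c$ is obtained by replacing the leading 0 with 1.

```python
import hashlib


def H(data: bytes) -> bytes:
    """Stand-in for the hash function H (SHA-256 is an assumption)."""
    return hashlib.sha256(data).digest()


def storage_location(row, event, row_key, period, n_rows, n_cols):
    """(L_r, L_c) for event E of row i during period T, following (9)."""
    def coord(tag, modulus):
        fields = [str(tag).encode(), str(row).encode(), event.encode(),
                  row_key, str(period).encode()]
        return int.from_bytes(H(b"|".join(fields)), "big") % modulus
    # Tag 0 yields L_r as in (9); using tag 1 for L_c is an assumption.
    return coord(0, n_rows), coord(1, n_cols)


def next_row_key(row_key: bytes) -> bytes:
    """Periodic update K_i <- H(K_i); the old key is erased afterwards."""
    return H(row_key)


# Hypothetical values on a 20 x 20 grid.
k_row = b"row-7-key"
loc_now = storage_location(7, "tank", k_row, 12, 20, 20)
k_row_next = next_row_key(k_row)
```

Since `next_row_key` is one-way, a node captured after the update holds only `k_row_next` and cannot recompute `loc_now`, which is the backward event privacy property described above.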

Type 4 query. An MS can answer the following query with one message: has event $E$ happened in row $i$ during the time interval $T$? This is because all the information about event $E$ that happened in row $i$ during $T$ is stored in one location. An authorized MS first determines the location based on $K_i$, $T$, $E$, and the row $i$ of interest, then sends a query to it to fetch the data.

Security and performance analysis. As in time-based mapping, an attacker cannot derive old row keys from the current row key of a captured node. Hence, the locations for storing the events that occurred during the previous time periods are not derivable. An attacker has to randomly guess the previous storage cells for the event of his interest. The BEPDL $p_b^4(m, s)$ of the previous data is also very complicated to derive; therefore, we again give a qualitative analysis. Let $S(m)$ be the number of storage cells corresponding to $m$ detection cells. If, in row-based and time-based mapping, $m$ detection cells were mapped into the same number of storage cells, their BEPDLs should be the same because the mapping uncertainty is the same for the attacker. In practice, however, on average $S(m)$ in row-based mapping should be larger than that in time-based mapping. This is because in row-based mapping, sensor data on the same event that occurred in the same time period but in different rows are highly likely to be mapped to different storage cells, whereas in time-based mapping, the data are mapped to the same storage cell. As such, the BEPDL of row-based mapping should be (slightly) lower than that of time-based mapping on average.

The FEPDL $p_f^4(m, s)$ of this scheme is the same as that of the previous schemes. That is, $p_f^4(m, s) = p_f^1(m, s)$. On the other hand, compared to the previous schemes, this scheme is less subject to the single point of failure; further, both the traffic load and resources for storing the information are more uniformly distributed among the cells.

ACKNOWLEDGMENTS

The authors would like to thank the anonymous referees whose insightful comments helped improve the presentation of this paper. A preliminary version [17] of this paper appeared in INFOCOM ’07. This work was supported in part by the US Army Research Office (W911NF-05-1-0270 and W911NF-07-1-0318) and the US National Science Foundation (CNS-0519460, CNS-0643906, and CNS-0627382).

REFERENCES

[1] I. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci, “Wireless Sensor Networks: A Survey,” Computer Networks, vol. 38, no. 4, Mar. 2002.

[2] S. Ratnasamy, B. Karp, L. Yin, F. Yu, D. Estrin, R. Govindan, and S. Shenker, “GHT: A Geographic Hash Table for Data-Centric Storage,” Proc. First ACM Workshop Wireless Sensor Networks and Applications (WSNA ’02), Sept. 2002.

[3] A. Ghose, J. Grossklags, and J. Chuang, “Resilient Data-Centric Storage in Wireless Ad-Hoc Sensor Networks,” Proc. Fourth Int’l Conf. Mobile Data Management (MDM ’03), pp. 45-62, 2003.

[4] W. Zhang, G. Cao, and T. La Porta, “Data Dissemination with Ring-Based Index for Wireless Sensor Networks,” Proc. 11th IEEE Int’l Conf. Network Protocols (ICNP ’03), pp. 305-314, Nov. 2003.

[5] B. Karp and H. Kung, “GPSR: Greedy Perimeter Stateless Routing for Wireless Networks,” Proc. ACM MobiCom, 2000.

[6] F. Ye, H. Luo, J. Cheng, S. Lu, and L. Zhang, “A Two-Tier Data Dissemination Model for Large-Scale Wireless Sensor Networks,” Proc. ACM MobiCom, pp. 148-159, Sept. 2002.

[7] S. Ratnasamy, D. Estrin, R. Govindan, B. Karp, L. Yin, S. Shenker, and F. Yu, “Data-Centric Storage in Sensornets,” Proc. ACM First Workshop Hot Topics in Networks, 2001.

[8] The Smartdust Project, http://robotics.eecs.berkeley.edu/pister/SmartDust/, 2008.

[9] P. Winter and M. Zachariasen, “Euclidean Steiner Minimum Trees: An Improved Exact Algorithm,” Networks, vol. 30, no. 3, pp. 149-166, 1997.

[10] G. Myles, A. Friday, and N. Davies, “Preserving Privacy in Environments with Location-Based Applications,” IEEE Pervasive Computing, 2003.

[11] U. Hengartner and P. Steenkiste, “Protecting Access to People Location Information,” Proc. First Int’l Conf. Security in Pervasive Computing (SPC ’03), 2003.

[12] E. Snekkenes, “Concepts for Personal Location Privacy Policies,” Proc. Third ACM Conf. Electronic Commerce (EC ’01), 2001.

[13] M. Gruteser, G. Schelle, A. Jain, R. Han, and D. Grunwald, “Privacy-Aware Location Sensor Networks,” Proc. Ninth USENIX Workshop Hot Topics in Operating Systems (HotOS ’03), 2003.

[14] J. Deng, R. Han, and S. Mishra, “Intrusion Tolerance and Anti-Traffic Analysis Strategies for Wireless Sensor Networks,” Proc. Int’l Conf. Dependable Systems and Networks (DSN ’04), June 2004.

[15] D. Chaum, “Untraceable Electronic Mail, Return Address, and Digital Pseudonyms,” Comm. ACM, vol. 24, no. 2, pp. 84-88, 1981.

[16] C. Ozturk, Y. Zhang, and W. Trappe, “Source-Location Privacy in Energy-Constrained Sensor Networks Routing,” Proc. ACM Workshop Security of Ad Hoc and Sensor Networks (SASN ’04), Oct. 2004.

[17] M. Shao, S. Zhu, W. Zhang, and G. Cao, “pDCS: Security and Privacy Support for Data-Centric Sensor Networks,” Proc. IEEE INFOCOM, 2007.

[18] A. Perrig, R. Szewczyk, V. Wen, D. Culler, and J. Tygar, “Spins: Security Protocols for Sensor Networks,” Proc. ACM MobiCom, 2001.

[19] S. Zhu, S. Setia, and S. Jajodia, “Leap: Efficient Security Mechanisms for Large-Scale Distributed Sensor Networks,” Proc. 10th ACM Conf. Computer and Comm. Security (CCS ’03), 2003.

[20] H. Chan, A. Perrig, and D. Song, “Random Key Predistribution Schemes for Sensor Networks,” Proc. IEEE Security and Privacy Symp., 2003.

[21] W. Du, J. Deng, Y. Han, and P. Varshney, “A Pairwise Key Pre-Distribution Scheme for Wireless Sensor Networks,” Proc. 10th ACM Conf. Computer and Comm. Security (CCS ’03), pp. 42-51, 2003.

[22] L. Eschenauer and V. Gligor, “A Key-Management Scheme for Distributed Sensor Networks,” Proc. Ninth ACM Conf. Computer and Comm. Security (CCS ’02), 2002.

[23] D. Liu and P. Ning, “Establishing Pairwise Keys in Distributed Sensor Networks,” Proc. 10th ACM Conf. Computer and Comm. Security (CCS ’03), 2003.

[24] S. Zhu, S. Xu, S. Setia, and S. Jajodia, “Establishing Pair-Wise Keys for Secure Communication in Ad Hoc Networks: A Probabilistic Approach,” Proc. 11th IEEE Int’l Conf. Network Protocols (ICNP ’03), 2003.

[25] Y. Zhang, W. Liu, W. Lou, and Y. Fang, “Location-Based Compromise-Tolerant Security Mechanisms for Wireless Sensor Networks,” IEEE J. Selected Areas in Comm., Feb. 2006.

[26] W. Zhang and G. Cao, “Group Rekeying for Filtering False Data in Sensor Networks: A Predistribution and Local Collaboration-Based Approach,” Proc. IEEE INFOCOM, Mar. 2005.

[27] W. Zhang, M. Tran, S. Zhu, and G. Cao, “A Random Perturbation-Based Scheme for Pairwise Key Establishment in Sensor Networks,” Proc. ACM MobiHoc, 2007.

[28] C.K. Wong, M. Gouda, and S. Lam, “Secure Group Communication Using Key Graphs,” Proc. ACM SIGCOMM, 1998.

[29] A. Perrig, D. Song, and D. Tygar, “Elk, a New Protocol for Efficient Large-Group Key Distribution,” Proc. IEEE Symp. Security and Privacy, 2001.

[30] D. Naor, M. Naor, and J. Lotspiech, “Revocation and Tracing Schemes for Stateless Receivers,” Proc. Advances in Cryptology (CRYPTO ’01), pp. 41-62, 2001.

[31] L. Lazos and R. Poovendran, “Energy-Aware Secure Multicast Communication in Ad-Hoc Networks Using Geographic Location Information,” Proc. IEEE Int’l Conf. Acoustics, Speech, and Signal Processing (ICASSP ’03), 2003.

[32] Y. Ko and N. Vaidya, “Location-Aided Routing in Mobile Ad Hoc Networks,” Proc. ACM MobiCom, pp. 66-75, 1998.

[33] D. Niculescu and B. Nath, “Trajectory Based Forwarding and Its Applications,” Proc. ACM MobiCom, 2003.

[34] S. Capkun and J. Hubaux, “Secure Positioning of Wireless Devices with Application to Sensor Networks,” Proc. IEEE INFOCOM, 2005.

[35] H. Akcan, V. Kriakov, H. Bronnimann, and A. Delis, “GPS-Free Node Localization in Mobile Wireless Sensor Networks,” Proc. Fifth Int’l ACM Workshop Data Eng. for Wireless and Mobile Access (MobiDE ’06), June 2006.

[36] R. Iyengar and B. Sikdar, “Scalable and Distributed GPS Free Positioning for Sensor Networks,” Proc. IEEE Int’l Conf. Comm. (ICC ’03), 2003.

[37] N. Bulusu, J. Heidemann, and D. Estrin, “GPS-Less Low Cost Outdoor Localization for Very Small Devices,” IEEE Personal Comm., 2000.

[38] K. Sun, P. Ning, and C. Wang, “Secure and Resilient Clock Synchronization in Wireless Sensor Networks,” IEEE J. Selected Areas in Comm., vol. 24, no. 2, pp. 395-408, 2006.

[39] H. Song, S. Zhu, and G. Cao, “Attack-Resilient Time Synchronization for Wireless Sensor Networks,” Ad Hoc Networks, vol. 5, no. 1, pp. 112-125, Jan. 2007.

[40] C. Karlof and D. Wagner, “Secure Routing in Sensor Networks: Attacks and Countermeasures,” Proc. First IEEE Workshop Sensor Network Protocols and Applications (SNPA ’03), 2003.

[41] A. Cardenas, S. Radosavac, and J. Baras, “Detection and Prevention of MAC Layer Misbehavior for Ad Hoc Networks,” Proc. ACM Workshop Security of Ad Hoc and Sensor Networks (SASN ’04), 2004.

[42] W. Xu, T. Wood, W. Trappe, and Y. Zhang, “Channel Surfing and Spatial Retreats: Defenses against Wireless Denial of Service,” Proc. ACM Workshop Wireless Security (WiSe ’04), 2004.

[43] Crossbow Technology Inc., http://www.xbow.com, 2004.

[44] M. Shao, Y. Yang, S. Zhu, and G. Cao, “Towards Statistically Strong Source Anonymity for Sensor Networks,” Proc. IEEE INFOCOM, 2008.

[45] Y. Yang, M. Shao, S. Zhu, B. Urgaonkar, and G. Cao, “Towards Event Source Unobservability with Minimum Network Traffic in Sensor Networks,” Proc. ACM Conf. Wireless Network Security (WiSec ’08), 2008.

[46] W.-T. Chen, H.-L. Hsu, and J.-L. Chiang, “Logical Key Tree Based Secure Multicast Protocol with Copyright Protection,” Proc. 19th IEEE Int’l Conf. Advanced Information Networking and Applications (AINA ’05), 2005.

[47] G. Hao, N.V. Vinodchandran, and B. Ramamurthy, “A Balanced Key Tree Approach for Dynamic Secure Group Communication,” Proc. IEEE 14th Int’l Conf. Computer Comm. and Networks (ICCCN ’05), 2005.

[48] J. Deng, R. Han, and S. Mishra, “A Practical Study of Transitory Master Key Establishment for Wireless Sensor Networks,” Proc. First IEEE/CreateNet Conf. Security and Privacy in Comm. Networks (SecureComm ’05), pp. 289-299, Sept. 2005.

[49] S. Zhu, S. Setia, and S. Jajodia, “Leap+: Efficient Security Mechanisms for Large-Scale Distributed Sensor Networks,” ACM Trans. Sensor Networks, vol. 2, 2007.

[50] F. Ye, H. Luo, S. Lu, and L. Zhang, “Statistical En-Route Detection and Filtering of Injected False Data in Sensor Networks,” Proc. IEEE INFOCOM, 2004.

[51] T. Park and K. Shin, “Soft Tamper-Proofing via Program Integrity Verification in Wireless Sensor Networks,” IEEE Trans. Mobile Computing, vol. 4, no. 3, 2005.

[52] A. Seshadri, A. Perrig, L. Doorn, and P. Khosla, “Swatt: Software-Based Attestation for Embedded Devices,” Proc. IEEE Symp. Security and Privacy, May 2004.

[53] Y. Yang, X. Wang, S. Zhu, and G. Cao, “Distributed Software-Based Attestation for Node Compromise Detection in Sensor Networks,” Proc. 26th IEEE Int’l Symp. Reliable Distributed Systems (SRDS ’07), 2007.

[54] M. Cagalj, J. Hubaux, and C. Enz, “Minimum-Energy Broadcast in All Wireless Networks: NP-Completeness and Distribution,” Proc. ACM MobiCom, 2002.

[55] B. Bloom, “Space/Time Trade-Offs in Hash Coding with Allowable Errors,” Comm. ACM, 1970.

[56] The Tinydb Project, http://telegraph.cs.berkeley.edu/tinydb/, 2008.

[57] R. Cordone and F. Maffioli, “On the Complexity of Graph Tree Partition Problems,” Discrete Applied Math., vol. 134, nos. 1-3, pp. 51-65, 2004.

[58] Weidai’s Crypto++, http://www.eskimo.com/weidai/benchmarks.html, July 2005.

Min Shao received the BS degree from Tsinghua University, Beijing, and the PhD degree in computer science from the Pennsylvania State University in 2008. Since then, she has been with Microsoft Corp., Redmond, Washington. Her research interests include distributed systems, security and privacy, and wireless sensor networks. She is a student member of the IEEE.

Sencun Zhu received the BS degree in precision instruments from Tsinghua University, Beijing, in 1996, the MS degree in signal processing from the University of Science and Technology of China, Beijing, in 1999, and the PhD degree in information technology from George Mason University in 2004. He is currently with the Department of Computer Science and Engineering and the College of Information Sciences and Technology, Pennsylvania State University, University Park. His research interests include network and systems security with a focus on ad hoc and sensor network security, P2P security, and malware defenses. He was a recipient of the US National Science Foundation CAREER Award in 2007. He is a cochair of the Fourth ACM Workshop on Security of Ad Hoc and Sensor Networks (SASN ’06). He served on the technical program committee of many international conferences, including the ACM Conference on Computer and Communications Security (CCS), IEEE INFOCOM, and so forth. His publications can be found at http://www.cse.psu.edu/szhu.

Wensheng Zhang received the BS degree from Tongji University, Shanghai, the MS degree from the Chinese Academy of Sciences, and the PhD degree in computer science from the Pennsylvania State University in 2005. Since then, he has been with the Department of Computer Science, Iowa State University, Ames, as an assistant professor. His research interests include wireless networks and network security. He is a member of the IEEE.

Guohong Cao received the BS degree from Xian Jiaotong University, Xian, China, and the MS and PhD degrees in computer science from Ohio State University in 1997 and 1999, respectively. Since then, he has been with the Department of Computer Science and Engineering, Pennsylvania State University, University Park, where he is currently a full professor. His research interests include wireless networks and mobile computing. He has published more than 100 papers in the areas of sensor networks, wireless network security, data dissemination, resource management, and distributed fault-tolerant computing. He has served on the editorial board of the IEEE Transactions on Mobile Computing and the IEEE Transactions on Wireless Communications, and has served on the program committee of many conferences. He was a recipient of the US National Science Foundation CAREER award in 2001. He is a senior member of the IEEE.

Yi Yang is a PhD candidate in the Department of Computer Science and Engineering, Pennsylvania State University, University Park, where she is also a member of the Networking and Security Research Center. Her research interests include security and privacy issues in wireless sensor networks and network management. She is a student member of the IEEE.
