Research Article

A Dynamic Privacy Protection Mechanism for Spatiotemporal Crowdsourcing

Tianen Liu,1 Yingjie Wang,1 Zhipeng Cai,2 Xiangrong Tong,1 Qingxian Pan,1 and Jindong Zhao1

1School of Computer and Control Engineering, Yantai University, Yantai 264005, China
2Department of Computer Science, Georgia State University, Atlanta 30303, GA, USA

Correspondence should be addressed to Yingjie Wang; towangyingjie@163.com

Received 25 May 2020; Revised 26 June 2020; Accepted 29 July 2020; Published 28 August 2020

Academic Editor: Xiaolong Xu

Copyright © 2020 Tianen Liu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

In spatiotemporal crowdsourcing applications, sensing data uploaded by participants usually contain spatiotemporal sensitive data. If application servers publish the unprocessed sensing data directly, the privacy of participants is easily exposed. In addition, application servers usually adopt a static publishing mechanism, which tends to cause problems such as poor timeliness and large information loss for spatiotemporal crowdsourcing applications. Therefore, this paper proposes a spatiotemporal privacy protection (STPP) method based on dynamic clustering methods to solve the privacy protection problem for crowd participants in spatiotemporal crowdsourcing systems. Firstly, the working principles of a dynamic privacy protection mechanism are introduced. Then, based on k-anonymity and l-diversity, the spatiotemporal sensitive data are anonymized. In addition, this paper designs a dynamic k-anonymity algorithm based on the previous anonymous results. Extensive performance evaluation on real-world data shows that, compared with existing methods, the proposed STPP algorithm can effectively solve the problem of poor timeliness and improve the privacy protection level while reducing the information loss of sensing data.

1. Introduction

With the widespread use of wireless communication technologies and smart mobile terminals, location-based services (LBS) are becoming more and more popular [1, 2]. In many spatiotemporal crowdsourcing applications, participants receive corresponding rewards by submitting their own sensing tasks to crowdsourcing application servers [3]. However, the submitted sensing data contain the participants' spatiotemporal data [4, 5]. If the crowdsourcing application server publishes these spatiotemporal data without processing, the participants' private information can be obtained by attackers [6, 7]. More importantly, attackers can infer a participant's recent medical service or entertainment venue by locating their spatiotemporal information, and thereby learn their health status, preferences, and the time and scope of their outings [8]. Therefore, in a spatiotemporal crowdsourcing application, it is especially important to protect the spatiotemporal information of participants. The privacy protection technology based on spatiotemporal crowdsourcing has also become a research hotspot in the field of spatiotemporal crowdsourcing systems [9].

In order to ensure that participants' spatiotemporal private information is not leaked when publishing data, a large amount of work on spatiotemporal privacy protection is devoted to perturbing and anonymizing the spatiotemporal data that may reveal personal whereabouts. To et al. [10] proposed a protection framework based on differential privacy. The workers in spatiotemporal crowdsourcing first submit their real location information to the trusted mobile service provider, and the mobile service provider uses a grid-based method to construct the private spatial decompositions (PSDs) for the original location information and adds Laplace noise to the workers' real location data for privacy protection purposes. Vu et al. [11] proposed a privacy protection mechanism based on locality-sensitive hashing to group participants' positions. Each group contains at least k participants to achieve spatial anonymity. The ideal partition of spatial data under low time complexity is realized, and participants' location information is protected in a spatiotemporal crowdsourcing application scenario. In [12], the problem of insufficient diversity of the k-anonymity algorithm with respect to participants' sensitive locations is solved: the probability of participants' access to the sensitive locations is limited, or a probability analysis based on adversary knowledge is used to ensure the location diversity.

Hindawi, Security and Communication Networks, Volume 2020, Article ID 8892954, 13 pages, https://doi.org/10.1155/2020/8892954

However, most researchers currently consider only data publishing in static scenarios. Attackers can use historical publishing results to reveal sensitive information during static publishing, for example, by comparing with the results of the previous publishing. In spatiotemporal crowdsourcing applications, many data analysis applications actually involve dynamic data publishing. For example, in order to plan travel routes for special vehicles (cash trucks, ambulances, fire engines, etc.), it is necessary to issue a sensing task to collect information on road traffic jams [13, 14]. For such a spatiotemporal sensitive task, the application server needs to dynamically publish sensing data submitted by participants to improve the timeliness of the task. Dynamic sensing data change constantly over time, so it is often necessary to anonymize and dynamically publish sensing data at different times. However, most anonymity algorithms are invalid when dealing with the dynamic publishing of spatiotemporal data [15]. The previous anonymity result cannot be effectively utilized. In big data scenarios, the time complexity of these algorithms is high and the timeliness is poor [16]. Moreover, most researchers proposed privacy preservation for participants' location information but failed to consider that attackers can also infer other private information based on participants' spatiotemporal information. In view of these problems, we study the privacy protection of spatiotemporal private information in spatiotemporal crowdsourcing systems, and the following issues should be improved further:

(1) In the process of dynamic data publishing, the results of the previous anonymization should be effectively utilized, instead of anonymizing the incremental data together with the previous data from scratch, to improve the timeliness of dynamic publishing of big data.

(2) In the process of anonymizing the location attribute of participants, the time attribute is added to effectively avoid the background knowledge attack and homogeneity attack against the location attribute.

In order to solve the above problems, we propose a spatiotemporal privacy protection method for spatiotemporal crowdsourcing systems. The contributions of this paper are as follows:

(1) A dynamic publishing algorithm based on spatiotemporal data privacy protection is designed by improving k-anonymity. When incremental data arrive, the anonymization result of the last round is utilized to solve the timeliness problem of dynamic publishing.

(2) Based on the traditional position coordinate, a time axis is added to form the spatiotemporal information of participants, and the anonymization of participants' spatiotemporal information is carried out by applying k-anonymity and l-diversity methods, so as to solve the background knowledge attack and homogeneity attack problems.

(3) In order to verify the effectiveness of the proposed privacy protection method, comparison experiments with k-anonymity and variable centroid location aggregation (VCLA) [17] algorithms are conducted on two real-world datasets.

The structure of the paper is as follows. Section 2 introduces the related works on spatiotemporal privacy protection. Section 3 introduces the proposed spatiotemporal privacy protection method for spatiotemporal crowdsourcing systems. In Section 4, real-world datasets and existing anonymity algorithms are used to evaluate the performance of the proposed method. Section 5 concludes the paper and presents the future work.

2. Related Works

In this section, we introduce the related works on privacy protection methods for spatiotemporal data and dynamic publishing of sensitive data under a participatory sensing environment. Participatory sensing (PS) refers to the formation of a mobile Internet through daily mobile devices, where data are sensed, collected, analyzed, or screened by the public and professional users and then uploaded to the participatory sensing network [18]. With the popularization of mobile terminals and the rapid development of wireless sensor technology, the application of PS is becoming more and more common in real life. For example, in [19], Chen et al. studied energy-efficient task offloading in mobile edge computing (MEC). However, in the process of task offloading, the privacy of participants may be exposed. To deal with this problem, Xu et al. [20] put forward a two-phase offloading optimization strategy for joint optimization of offloading utility and privacy in edge computing. Further, Xu et al. [21] discussed the problem that transmitted information is vulnerable to attack and may cause incomplete data during task offloading. A blockchain-enabled computation offloading method was proposed to ensure data integrity. In the implementation process of these participatory sensing applications, sensing tasks uploaded by participants are marked with personal spatiotemporal data, which brings great risks to the privacy security and personal safety of participants. Therefore, while people enjoy the convenience brought by LBS, their privacy is also at risk of being exposed [22].

In LBS, using anonymity technology to solve the location privacy problem of participants has been widely studied [23]. The k-anonymity technology was first proposed by Samarati and Sweeney [24]. The parameter k specifies the maximum risk of information disclosure that users can bear. It requires at least k records that are indistinguishable on the quasi-identifier in published data, so that attackers cannot identify the specific individual that the private information belongs to, thereby protecting personal privacy. In [25], a clustering-based k-anonymity strategy is adopted to protect wearable device owners against privacy disclosure when they upload sensing data. In [26], a k-anonymous location privacy protection method based on coordinate transformation was proposed for the problem that the trusted third-party server (TTP) is often untrustworthy in real life [27]. The anonymous server receives the coordinate-converted participant location and constructs an anonymous area without knowing the user's actual location, thereby protecting the participant's location privacy. In [28], the optimal k value of the current user is determined according to the user's environment and social attributes, and a location protection k-anonymity method based on the truthful chain was proposed to protect the location privacy of participants while ensuring the quality of service.

However, k-anonymity cannot cope with the background knowledge attack and homogeneity attack. Machanavajjhala et al. [29] first proposed l-diversity to improve k-anonymity. Each k-anonymity group in the published data table contains at least l different sensitive attribute values, so that the probability that an attacker infers the private information of a certain record is less than $1/l$. In [30], considering the identity attributes of participants, it is ensured that each anonymous set has at least $k - 1$ participants and each anonymous set has p different sensitive values. In [31], k-anonymity and l-diversity were adopted as privacy models, and an anonymization method based on genetic algorithm clustering was proposed. The basic operator of the genetic algorithm is improved to protect the personal sensitive information contained in the published report.
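The two privacy models discussed above can be checked mechanically on a published table. The following is a minimal sketch, assuming records are dictionaries and that the quasi-identifier and the sensitive attribute are given by key; the field names (`zone`, `hour`, `venue`) and the toy records are hypothetical and only illustrate why a k-anonymous table can still fail l-diversity.

```python
from collections import defaultdict

def group_by_quasi_identifier(records, qi_keys):
    """Group records that share the same values on the quasi-identifier."""
    groups = defaultdict(list)
    for rec in records:
        groups[tuple(rec[key] for key in qi_keys)].append(rec)
    return groups

def is_k_anonymous(records, qi_keys, k):
    """Every quasi-identifier group must contain at least k records."""
    groups = group_by_quasi_identifier(records, qi_keys)
    return all(len(group) >= k for group in groups.values())

def is_l_diverse(records, qi_keys, sensitive_key, l):
    """Every group must contain at least l distinct sensitive values."""
    groups = group_by_quasi_identifier(records, qi_keys)
    return all(
        len({rec[sensitive_key] for rec in group}) >= l
        for group in groups.values()
    )

# Hypothetical toy table: 'zone' and 'hour' form the quasi-identifier.
records = [
    {"zone": "A", "hour": 20, "venue": "clinic"},
    {"zone": "A", "hour": 20, "venue": "clinic"},
    {"zone": "A", "hour": 20, "venue": "clinic"},
    {"zone": "B", "hour": 19, "venue": "cinema"},
    {"zone": "B", "hour": 19, "venue": "gym"},
    {"zone": "B", "hour": 19, "venue": "clinic"},
]

k3 = is_k_anonymous(records, ["zone", "hour"], 3)       # both groups have 3 records
l2 = is_l_diverse(records, ["zone", "hour"], "venue", 2)  # group A has one venue only
```

In this toy table, group A satisfies 3-anonymity but all of its members share the same sensitive value, which is exactly the situation exploited by a homogeneity attack.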

However, when requesting LBS services, the location of most participants is always related to time [32]. The above works only protect the location attribute of participants but do not associate the location attribute of participants with the time attribute. A trajectory is the sequence of a user's location information over a continuous period of time, and trajectory anonymity anonymizes and protects the user's location attribute and time attribute together. In [33], a trajectory privacy protection method based on user demand was proposed. By dividing different time intervals and setting different privacy protection parameters for different trajectories, the anonymous trajectory equivalence class is constructed. In [34], the Hilbert curve was used to extract the distribution characteristics of trajectory data at each time, and a personalized differential privacy publishing mechanism was designed according to individual needs for different degrees of privacy. In [35], a collaborative trajectory privacy protection scheme for continuous queries was proposed to confuse attackers by issuing false queries, thus obscuring users' actual trajectories. In [36], a trajectory privacy protection algorithm based on trajectory shape diversity was proposed by combining k-anonymity and l-diversity to solve the trajectory privacy leakage problem that may be caused by the high similarity between trajectories in the anonymous set.

In the research of privacy-preserving data publishing (PPDP), the first proposed models were mainly used for static publishing, that is, they only considered the one-time publishing of data, and the above research was mainly conducted for static data publishing [37]. However, in many spatiotemporal crowdsourcing applications, a large amount of data stays in a changing state, and dynamic data publishing occurs from time to time [38]. In order to solve the problem that static publishing cannot resist link attacks and critical missing attacks, Wang and Fung [39] first studied the possible privacy leaks of data redistribution and proposed a method to prevent privacy leakage. The main idea of this method is to hide the true connection relationship between the two publishing versions, thereby weakening the global quasi-identifier. Xiao [40] first proposed the privacy protection model m-invariance for dynamic data publishing, whose key is to introduce pseudogeneralization technology to ensure that any QI group records in different data publishing versions have the same sensitive attribute value. In [41], because the privacy protection association rule mining algorithm is not applicable to dynamically changing databases, an incremental privacy protection data mining algorithm based on granularity calculation was proposed, and the incremental update algorithm was used to solve the problem of frequent item set calculation for incremental transaction databases. In [42], a differential privacy histogram publishing method based on fractal dimension mining technology was proposed. The method used fractal dimension to cluster datasets and counted the values of each class. Laplace noise was added to data before publishing to achieve differential privacy. However, the above methods cannot cope with privacy protection for spatiotemporal information, and it is difficult to adapt them to the issue of dynamic privacy protection in spatiotemporal crowdsourcing applications. Even if the above methods consider participants' location information, attackers could also infer participants' privacy through time information. More importantly, the above methods are invalid for real-time data tasks.

Based on the above discussions, a dynamic publishing method for spatiotemporal privacy protection under the participatory sensing environment is proposed. By combining k-anonymity and l-diversity, the proposed dynamic publishing method can protect the private information of participants and reduce the information loss.

3. Dynamic Privacy Protection Algorithm

In this section, a dynamic privacy protection mechanism for spatiotemporal sensitive information is studied, and the working principles of the three main parts of the dynamic privacy protection mechanism are introduced. The proposed algorithms and corresponding explanations are given through an example.

3.1. Dynamic Privacy Protection Mechanism. The mechanism is divided into three parts: participants, TTPs, and application server.

(i) Participants: in spatiotemporal crowdsourcing applications, participants are responsible for the collection and uploading of sensing data [43]. Sensing data uploaded by participant $p_i$, $1 \le i \le n$, include the following attributes: $\langle id_i, data_i, time_i, x_i, y_i \rangle$, where $id_i$ is the identity attribute of $p_i$, $data_i$ denotes the completed sensing tasks uploaded by $p_i$, and $\langle time_i, x_i, y_i \rangle$ indicates the real time attribute and location attribute contained in $data_i$, denoted by $d_i$. It is a sensitive attribute and needs to be anonymized. In the dynamic privacy protection publishing mechanism, participants submit sensing data in batches.

(ii) TTPs: in this mechanism, participants first upload sensing data to TTPs, and TTPs preprocess the sensing data to extract sensitive data (i.e., participants' real spatiotemporal data) [44]. Then, using k-anonymity, the real spatiotemporal data are anonymized. The sensing data that do not satisfy the anonymity condition are stored in the buffer pool and anonymized with the next incremental data. More importantly, when incremental data arrive, the previous anonymity results are utilized: data that satisfy the adaptive threshold are added to the corresponding equivalence classes. Finally, the cluster center value $u_i$, $1 \le i \le r$, is sent to the application server.

(iii) Application server: to avoid the background knowledge attack and the homogeneity attack against k-anonymity, cluster center values need to be clustered again based on the l-diversity idea. The application server anonymizes $u_i$ according to the time attribute. Each cluster contains at least l cluster center values, and then the newly generated cluster center value $t_i$, $1 \le i \le c$, is published. After anonymization processing on TTPs and the application server, the results are shown in Table 1, where $U = \bigcup_{i=1}^{r} M_i$, $M_i = \bigcup_{j=1}^{r_i} u_{ij}$, $L = \bigcup_{i=1}^{c} T_i$, and $T_i = \bigcup_{j=1}^{c_i} t_{ij}$; $r$ and $c$ represent the number of position clusters and time clusters, respectively; $u_i$ represents a cluster containing spatiotemporal sensitive data; $r_i$ and $c_i$ represent the number of spatiotemporal sensitive data in the ith cluster; and $u_{ij}$ and $t_{ij}$ represent the spatiotemporal sensitive data included in a cluster.

The dynamic publishing privacy protection mechanism proposed in this paper is different from the traditional spatiotemporal crowdsourcing process. In traditional spatiotemporal crowdsourcing, requesters first publish tasks and then recruit participants to complete them. When participants upload tasks, traditional spatiotemporal crowdsourcing does not consider the privacy of participants. More importantly, the static publishing of tasks degrades users' experience. The working process of the proposed dynamic publishing privacy protection mechanism is shown in Figure 1. In Step 1, participants send collected sensing data (including spatiotemporal sensitive information) to TTPs over secure wireless networks. In Step 2, TTPs preprocess the sensing data and extract spatiotemporal sensitive data, which are then anonymized by k-anonymity. If the clustering condition is not met, Step 3 is performed to temporarily store the corresponding spatiotemporal sensitive data in the buffer pool. If the clustering condition is met, Step 4 is performed, and the TTP sends the anonymity result to the application server. In Step 5, the application server clusters the time attribute of the anonymity results based on l-diversity. In Step 6, the application server publishes the sensing data containing the anonymized sensitive spatiotemporal data. In Step 7, when other participants submit sensing data, the incremental data are sent to TTPs together with the sensing data temporarily stored in the buffer pool. In Step 8, sensing data are dynamically anonymized by utilizing the previous anonymity results. The above process is repeated until participants no longer submit sensing data.
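The role of the buffer pool in Steps 3 and 7 can be illustrated with a small sketch. The class and method names below are hypothetical, and the grouping rule is deliberately simplified: records are cut into groups of exactly k in arrival order, which stands in for the clustering condition and is not the clustering used by the proposed method. The point of the sketch is only the data flow between batches.

```python
class TrustedThirdParty:
    """Hypothetical stand-in for a TTP that buffers records between batches."""

    def __init__(self, k):
        self.k = k
        self.buffer_pool = []

    def submit_batch(self, batch):
        # Step 7: incremental data are processed together with buffered data.
        pending = self.buffer_pool + list(batch)
        groups = []
        # Placeholder condition: a group is releasable once it holds k records.
        while len(pending) >= self.k:
            groups.append(pending[:self.k])
            pending = pending[self.k:]
        # Step 3: records that cannot form a group wait for the next batch.
        self.buffer_pool = pending
        return groups

ttp = TrustedThirdParty(k=3)
first = ttp.submit_batch(["a", "b", "c", "d"])
buffered_after_first = list(ttp.buffer_pool)
second = ttp.submit_batch(["e", "f"])
```

After the first batch, record `d` waits in the buffer pool; it is released only when the second batch supplies enough records to form a group of size k.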

3.2. Static Publishing Anonymous Protection. In order to protect the spatiotemporal privacy of participants, k-anonymization is used to anonymize participants' time and location attributes together. In the spatiotemporal crowdsourcing application, because the time and location attributes of participants have different dimensions, we standardize the spatiotemporal sensitive data $d_i$, $1 \le i \le n$, by using the standard deviation method expressed by equation (1). $d_{ik}$ represents the real spatiotemporal information of the kth dimension of the ith data, and $d'_{ik}$ represents the normalized spatiotemporal information of $d_{ik}$, as shown by the following equation:

$$d'_{ik} = \frac{d_{ik} - \min\{d_{jk}\}}{\max\{d_{jk}\} - \min\{d_{jk}\}}, \quad 1 \le j \le n,\ 1 \le k \le 3. \tag{1}$$
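Equation (1) rescales each of the three dimensions (time, x, y) to the interval [0, 1] using the column minimum and maximum. A minimal sketch follows; the function name and sample rows are hypothetical, and mapping a constant column to 0 is an assumption made here to avoid division by zero, since the text does not cover that case.

```python
def min_max_normalize(rows):
    """Rescale every column of `rows` to [0, 1] as in equation (1)."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        tuple(
            # Assumption: a constant column (hi == lo) is mapped to 0.0.
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, lows, highs)
        )
        for row in rows
    ]

# Hypothetical (time, x, y) rows with very different scales per dimension.
rows = [(0.0, 10.0, -20.0), (50.0, 15.0, -10.0), (100.0, 20.0, 0.0)]
normalized = min_max_normalize(rows)
```

After normalization, a difference in time and a difference in either coordinate contribute on the same scale to the distance defined next.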

The distance between participants $p_i$ and $p_j$ is calculated by equation (2). The distance includes the spatial distance and the temporal distance between participants $p_i$ and $p_j$:

$$dis(p_i, p_j) = \sqrt{\sum_{k=1}^{3} \left(d'_{ik} - d'_{jk}\right)^2}, \quad 1 \le i, j \le n. \tag{2}$$
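Equation (2) is the Euclidean distance over the three normalized dimensions. A minimal sketch, assuming each participant is already represented by a normalized (time, x, y) triple; the sample points are hypothetical.

```python
import math

def dis(p, q):
    """Euclidean distance between two normalized (time, x, y) triples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

origin = (0.0, 0.0, 0.0)
corner = (1.0, 1.0, 1.0)
same_time_different_place = (0.0, 0.6, 0.8)

d_corner = dis(origin, corner)
d_place = dis(origin, same_time_different_place)
```

Because time is one of the three coordinates, two participants at the same place but at different times still have a nonzero distance.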

In order to easily find the center points of the position clusters and time clusters, we calculate the global centroid $d$ of the actual spatiotemporal dataset for anonymization by the following equation:

$$d = \left\langle \frac{\sum_{i=1}^{n} time_i}{n}, \frac{\sum_{i=1}^{n} x_i}{n}, \frac{\sum_{i=1}^{n} y_i}{n} \right\rangle. \tag{3}$$

In order to reduce the information loss and increase the privacy protection, we set the adaptive threshold, expressed as follows:

$$ave_j = \sum_{k=1}^{3} \sum_{i=1}^{r} \left(d'_{ik} - d'_{jk}\right), \quad 1 \le j \le |G| \text{ or } |L|, \tag{4}$$

where $r$ indicates that there are $r$ spatiotemporal data in the cluster. The static publishing anonymity protection based on k-anonymity is shown in Algorithm 1.
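The centroid of equation (3) and an adaptive threshold can be sketched together. The threshold below is an assumption: it is taken to be the mean Euclidean distance from the cluster members to the cluster centroid, following the description of the threshold as an average distance in the algorithm walkthroughs, and it is not claimed to reproduce equation (4) term by term. All function names are hypothetical.

```python
import math

def dis(p, q):
    """Euclidean distance of equation (2)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def centroid(points):
    """Component-wise mean of (time, x, y) triples, as in equation (3)."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def average_distance(cluster):
    """Assumed adaptive threshold: mean member-to-centroid distance."""
    center = centroid(cluster)
    return sum(dis(p, center) for p in cluster) / len(cluster)

def within_threshold(candidate, cluster):
    """A candidate qualifies if it lies closer to the centroid than the threshold."""
    return dis(candidate, centroid(cluster)) < average_distance(cluster)

cluster = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
center = centroid(cluster)
threshold = average_distance(cluster)
near_ok = within_threshold((1.5, 0.0, 0.0), cluster)
far_ok = within_threshold((3.0, 0.0, 0.0), cluster)
```

Under this reading, a tight cluster has a small threshold and admits only very close points, which limits the information loss caused by enlarging a cluster.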

Algorithm 1 describes how the sensing data that participants send to TTPs are processed. The TTPs first process the sensing data and extract participants' real spatiotemporal information (represented by set $A$) as sensitive data for privacy protection. The input of Algorithm 1 is the k-anonymity parameter $k$ and the participants' real spatiotemporal dataset $A$. The output of Algorithm 1 is the anonymity result set $U$ and the buffer pool dataset $B$. Step 1 calculates the global centroid $d$. Step 2 initializes the parameters, where $count$ is the number of new split clusters. Steps 4–11 ensure that the number of points in the new split cluster is $k$. Step 5 selects the point $d_{sma}$ with the largest distance to the global center point $d$; $d_{sma}$ becomes the new cluster center point and is deleted from $A$ (Step 6). Steps 7–11 select the $k - 1$ points that have the smallest distance to $d_{sma}$ to form a new cluster $N_{count}$: the center point $d_{count}$ of $N_{count}$ is updated (Step 8), the point $d_{smi}$ with the smallest distance to $d_{count}$ is selected (Step 9), and $d_{smi}$ is added to cluster $N_{count}$ and removed from $A$ (Step 10). Steps 12–19 extend the cluster $N_{count}$; in order to reduce information loss, we set the adaptive threshold $ave$ (Step 13). If the point $d_{cmi}$ (Step 14) in $A$ satisfies the adaptive threshold, it is added to the cluster $N_{count}$ (Step 16), and the centroid $d_{count}$ of $N_{count}$ is updated (Step 17). In Step 20, $N_{count}$ is added to $U$, and the number of clusters is updated. If there are remaining data in $A$, they are stored in the buffer pool $B$ (Step 22). In Step 23, the output of Algorithm 1 is returned.
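The procedure described above can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: points are assumed to be already normalized (time, x, y) triples, the adaptive threshold is assumed to be the mean member-to-centroid distance, and the extension loop is assumed to stop as soon as the nearest remaining point fails the threshold or no points remain. The function names and sample points are hypothetical.

```python
import math

def dis(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def centroid(points):
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def static_k_anonymity(points, k):
    """Greedy clustering in the spirit of Algorithm 1.

    Returns (clusters, buffer_pool); every cluster holds k to 2k-1 points.
    """
    remaining = list(points)
    clusters = []
    if len(remaining) < k:
        return clusters, remaining
    global_center = centroid(remaining)                      # Step 1
    while len(remaining) >= k:                               # Step 3
        # Steps 5-6: seed with the point farthest from the global centroid.
        seed = max(remaining, key=lambda p: dis(p, global_center))
        remaining.remove(seed)
        cluster = [seed]
        # Steps 7-11: add the k-1 points nearest to the running centroid.
        for _ in range(k - 1):
            center = centroid(cluster)
            nearest = min(remaining, key=lambda p: dis(p, center))
            remaining.remove(nearest)
            cluster.append(nearest)
        # Steps 12-19: extend while the nearest point is within the threshold.
        while len(cluster) < 2 * k - 1 and remaining:
            center = centroid(cluster)
            ave = sum(dis(p, center) for p in cluster) / len(cluster)
            nearest = min(remaining, key=lambda p: dis(p, center))
            if dis(nearest, center) >= ave:
                break  # assumption: stop once no candidate qualifies
            remaining.remove(nearest)
            cluster.append(nearest)
        clusters.append(cluster)                             # Step 20
    return clusters, remaining                               # Steps 22-23

# Hypothetical normalized points: two tight groups of three plus one extra.
points = [
    (0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0),
    (1.0, 1.0, 1.0), (0.9, 1.0, 1.0), (1.0, 0.9, 1.0), (1.0, 1.0, 0.9),
]
clusters, buffer_pool = static_k_anonymity(points, k=3)
```

With these points and k = 3, the sketch forms two clusters of three points each and leaves one point in the buffer pool, where it would wait for the next batch.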

The real spatiotemporal information contained in the sensing data uploaded by participants is anonymized by TTPs, which return the anonymity result set $U$ and send the center points $u_i$ to the application server. We now illustrate the anonymity process of spatiotemporal data more concretely with some data examples from the experiments. As shown in Table 2, the first column is the class ID assigned by Algorithm 1, the second and third columns are the real spatiotemporal information of participants, and the fourth and fifth columns are the anonymized spatiotemporal information. We can see that each equivalence class contains at least 3 points.
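The anonymized values of class 1 in Table 2 can be reproduced as the centroid of the three records in that class. The sketch below reads the six-digit time entries as hours, minutes, and seconds and the coordinates as decimal degrees with four decimal places; both readings are assumptions of this sketch, supported by the fact that the resulting means agree with the anonymized columns of the table. The helper names are hypothetical.

```python
def to_seconds(hms):
    """Convert 'HH:MM:SS' to seconds since midnight."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def to_hms(seconds):
    """Convert seconds since midnight to 'HH:MM:SS', rounding half up."""
    total = int(seconds + 0.5)
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"

# Class 1 of Table 2: (time, latitude, longitude).
class_1 = [
    ("20:47:36", 21.3675, -157.9388),
    ("21:46:33", 21.2866, -157.8129),
    ("19:20:57", 21.2958, -157.8331),
]

n = len(class_1)
mean_time = to_hms(sum(to_seconds(t) for t, _, _ in class_1) / n)
mean_lat = round(sum(lat for _, lat, _ in class_1) / n, 4)
mean_lon = round(sum(lon for _, _, lon in class_1) / n, 4)
```

Every record of the class is then published with this shared centroid instead of its own time and location.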

3.3. Improved Static Publishing Anonymity Protection Based on l-Diversity. However, k-anonymity is vulnerable to the background knowledge attack and homogeneity attack. Therefore, when the application server publishes anonymity results, we adopt l-diversity to improve the algorithm. The application server receives the cluster center values $u_i$ sent by TTPs, anonymizes the time attribute based on l-diversity, and calculates the time center value by the following equation:

$$t = \frac{\sum_{i=1}^{m} time_i}{m}, \tag{5}$$

where $m$ refers to the number of spatiotemporal data anonymized by TTPs, i.e., $m = \sum_{i=1}^{r} |M_i|$.

Algorithm 2 describes the anonymous release based on l-diversity for the time attribute. The input is the l-diversity parameter $l$ and the output set $U$ of Algorithm 1, and the output is the anonymous set $L$. Steps 1 and 2 take the time set $T$ and the position set $O$ out of $U$, respectively. Step 3 calculates the global central value $t$ of the time set $T$ and sets the initial cluster number to $count = 1$. Steps 4–15 ensure that the number of points in each newly generated cluster is $l$. Step 5 initializes the time cluster $T_{count}$ and the location cluster $O_{count}$. Step 6 finds the $t_{tma}$ with the largest distance to the global central value $t$ by equation (2). Step 7 adds $t_{tma}$ to the new cluster $T_{count}$, adds the coordinate cluster $O_{tma}$ corresponding to the subscript $tma$ to the new cluster $O_{count}$, and deletes $t_{tma}$ from the time set $T$. The $l - 1$ points with the smallest distance are then selected to join the cluster (Steps 8–12). Step 13 updates the time center value $t_{count}$ of cluster $T_{count}$. The output of Algorithm 2 in Step 14 is $L_{count}$, the Cartesian product of the time center value $t_{count}$ and the position cluster $O_{count}$. Steps 16–23 describe that, if there is any remaining point in the time set $T$, the cluster with the smallest distance to it is found by equation (2) (Step 18), the point is added to that cluster (Step 19), and the time center value $t_{lmi}$ of the cluster $T_{lmi}$ is updated (Step 20). In Step 24, the output of Algorithm 2 is returned.

Figure 1: The framework of a dynamic publishing mechanism for privacy protection. The diagram connects the data collector, the trusted third-party server with its buffer pool, the application server, and the cloud through eight steps: (1) data transfer, (2) static anonymity, (3) data transfer, (4) data transfer, (5) temporal anonymity, (6) release, (7) data transfer of incremental data, and (8) dynamic anonymity.

Table 1: Anonymization results.

| Anonymization results | TTPs | Application server |
| --- | --- | --- |
| Clustering result set | $U$ | $L$ |
| Each cluster in result set | $M_i$, $1 \le i \le r$ | $T_i$, $1 \le i \le c$ |
| Spatiotemporal sensitive data included in a cluster | $u_{ij}$, $1 \le j \le r_i$ | $t_{ij}$, $1 \le j \le c_i$ |
| Cluster center value | $u_i$, $1 \le i \le r$ | $t_i$, $1 \le i \le c$ |

The following is a more concrete illustration of releasing spatiotemporal data based on l-diversity. As shown in Table 3, the first column is the group ID assigned by Algorithm 2, the second column is the class ID assigned by Algorithm 1, the third and fourth columns are the anonymous spatiotemporal information produced by Algorithm 1, and the fifth and sixth columns are the anonymous spatiotemporal information anonymized by the application server. We can see that each 2-equivalence group ($l = 2$) contains at least two 3-equivalence classes ($k = 3$), where the anonymous time attribute is the same and the anonymous location attribute is different.
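The grouping of class centers by time can be sketched as follows. This is a simplified reading of Algorithm 2, not the authors' code: only the time attribute is clustered, the distance is the absolute time difference in seconds, and the global time center is assumed to be the unweighted mean of the class centers. The input uses the five class center times of Table 2, read as hours, minutes, and seconds; the function names are hypothetical.

```python
def to_seconds(hms):
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def to_hms(seconds):
    total = int(seconds + 0.5)  # round half up to the nearest second
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"

def time_center(members):
    return sum(t for _, t in members) / len(members)

def l_diversity_time_groups(centers, l):
    """centers: list of (class_id, seconds). Returns [(class_ids, 'HH:MM:SS')]."""
    remaining = list(centers)
    if not remaining:
        return []
    global_t = time_center(remaining)  # assumption: unweighted mean of centers
    groups = []
    while len(remaining) >= l:
        # Seed each group with the center farthest from the global time center.
        seed = max(remaining, key=lambda c: abs(c[1] - global_t))
        remaining.remove(seed)
        members = [seed]
        for _ in range(l - 1):
            center = time_center(members)
            nearest = min(remaining, key=lambda c: abs(c[1] - center))
            remaining.remove(nearest)
            members.append(nearest)
        groups.append(members)
    # Leftover centers join the group whose time center is closest.
    for leftover in remaining:
        if groups:
            min(groups, key=lambda g: abs(time_center(g) - leftover[1])).append(leftover)
    return [(sorted(cid for cid, _ in g), to_hms(time_center(g))) for g in groups]

class_centers = [
    (1, to_seconds("20:38:22")), (2, to_seconds("23:00:57")),
    (3, to_seconds("19:15:09")), (4, to_seconds("19:02:11")),
    (5, to_seconds("17:25:48")),
]
groups = l_diversity_time_groups(class_centers, l=2)
```

With $l = 2$, the sketch pairs classes 1 and 2 under the shared time 21:49:40 and classes 3, 4, and 5 under 18:34:23, which agrees with the anonymized time column of Table 3; each group keeps the distinct locations of its classes.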

Input: k-anonymous parameter $k$; the actual spatiotemporal dataset $A$ from participants
Output: aggregation result $U$; buffer pool dataset $B$
(1) Calculate the global centroid $d$ of $A$ by equation (3)
(2) $count = 1$, $U = \varnothing$
(3) while $|A| \ge k$ do
(4)   $N_{count} = \varnothing$
(5)   $sma = \arg\max_{i \in A} dis(d_i, d)$
(6)   $N_{count} = N_{count} \cup \{d_{sma}\}$, $A = A \setminus \{d_{sma}\}$
(7)   for $j \leftarrow 1$ to $k - 1$ do
(8)     Update the centroid $d_{count}$ of $N_{count}$ by equation (3)
(9)     $smi = \arg\min_{i \in A} dis(d_i, d_{count})$
(10)    $N_{count} = N_{count} \cup \{d_{smi}\}$, $A = A \setminus \{d_{smi}\}$
(11)  end for
(12)  while $|N_{count}| < 2k - 1$ do
(13)    Calculate the average distance $ave_{count}$ of $N_{count}$ by equation (4)
(14)    $cmi = \arg\min_{i \in A} dis(d_i, d_{count})$
(15)    if $dis(d_{cmi}, d_{count}) < ave_{count}$ then
(16)      $N_{count} = N_{count} \cup \{d_{cmi}\}$, $A = A \setminus \{d_{cmi}\}$
(17)      Update the centroid $d_{count}$ of $N_{count}$ by equation (3)
(18)    end if
(19)  end while
(20)  $U = U \cup N_{count}$, $count = count + 1$
(21) end while
(22) $B = A$
(23) return $U$, $B$

Algorithm 1: Static publishing anonymity protection based on k-anonymity.

Table 2: Three anonymous examples.

| Class ID | Time | Location | Anonymized time | Anonymized location |
| --- | --- | --- | --- | --- |
| 1 | 20:47:36 | (21.3675, -157.9388) | 20:38:22 | (21.3166, -157.8616) |
| 1 | 21:46:33 | (21.2866, -157.8129) | 20:38:22 | (21.3166, -157.8616) |
| 1 | 19:20:57 | (21.2958, -157.8331) | 20:38:22 | (21.3166, -157.8616) |
| 2 | 23:31:00 | (45.5894, -122.7524) | 23:00:57 | (46.3272, -122.5448) |
| 2 | 22:00:00 | (45.7801, -122.5400) | 23:00:57 | (46.3272, -122.5448) |
| 2 | 23:31:50 | (47.6122, -122.3419) | 23:00:57 | (46.3272, -122.5448) |
| 3 | 18:54:05 | (30.4810, -97.8295) | 19:15:09 | (32.0273, -97.4996) |
| 3 | 19:33:26 | (32.7368, -97.3271) | 19:15:09 | (32.0273, -97.4996) |
| 3 | 19:17:48 | (32.8640, -97.3421) | 19:15:09 | (32.0273, -97.4996) |
| 4 | 18:46:48 | (30.2016, -97.6671) | 19:02:11 | (31.5155, -97.4498) |
| 4 | 19:09:06 | (32.6804, -97.3746) | 19:02:11 | (31.5155, -97.4498) |
| 4 | 19:03:47 | (32.8382, -97.0045) | 19:02:11 | (31.5155, -97.4498) |
| 4 | 19:09:03 | (30.3417, -97.7530) | 19:02:11 | (31.5155, -97.4498) |
| 5 | 19:50:21 | (59.3238, 18.0977) | 17:25:48 | (59.3232, 18.0543) |
| 5 | 15:49:08 | (59.3457, 18.0587) | 17:25:48 | (59.3232, 18.0543) |
| 5 | 16:38:04 | (59.3055, 17.9892) | 17:25:48 | (59.3232, 18.0543) |
| 5 | 17:28:22 | (59.3122, 18.0796) | 17:25:48 | (59.3232, 18.0543) |
| 5 | 17:23:05 | (59.3288, 18.0461) | 17:25:48 | (59.3232, 18.0543) |


3.4. Dynamic Publishing Anonymity Protection. For static one-time release mechanisms, k-anonymity and l-diversity are valid. However, in real life, application servers usually publish sensing data dynamically. Therefore, in this section, we improve k-anonymity and l-diversity to accommodate the dynamic publishing mechanism. Algorithm 3 describes the dynamic publishing anonymity protection.

Algorithm 3 describes how TTPs use the previousanonymity result to solve the problem of dynamic pub-lishing when participants submit sensing data in differenttime periods -e input of Algorithm 3 is k-anonymousparameter k the clustering result U of Algorithm 1 in-cremental dataset I (that is the sensing data submitted byparticipants) and buffer pool dataset B -e output of thealgorithm is the clustering result D and buffer pool dataset

Input l-diversity parameter l aggregation result U from Algorithm 1Output aggregation result L

(1) Take time set T out of U(2) Take location set O out of U(3) Calculate the global centroid t by equation (5) count 1(4) while |T|ge ldo(5) Tcount φ Ocount φ(6) tma argmaxiisinTdis(ti t)

(7) Tcount Tcount cup ttma Ocount Ocount cupOtma T Tttma(8) for j⟵ 1 to l minus 1 do(9) Update the centroid tcount of Tcount by equation (3)(10) tmi argminiisinTdis(ti tcount)

(11) Tcount Tcount cup ttmi Ocount Ocount cupOtmi T Tttmi(12) end for(13) Update the centroid tcount of Tcount by equation (3)(14) Lcount tcount times Ocount count count + 1(15) end while(16) while |T|gt 0do(17) for i isin |T|do(18) lmi argminjisinLdis(ti tj)

(19) Tlmi Tlmi cup ti Olmi Olmi cupOi

(20) Update the centroid tlmiof Tlmi(21) Llmi tlmi times Olmi(22) end for(23) end while(24) return L

ALGORITHM 2 Static publishing anonymity protection based on l-diversity

Table 3 3-Anonymity 2-diversity examples

Group ID Class ID Time Location Anonymized time Anonymized location1 3 191509 (320273 minus974996) 183423 (320273 minus974996)1 3 191509 (320273 minus974996) 183423 (320273 minus974996)1 3 191509 (320273 minus974996) 183423 (320273 minus974996)1 4 190211 (315155 minus974498) 183423 (315155 minus974498)1 4 190211 (315155 minus974498) 183423 (315155 minus974498)1 4 190211 (315155 minus974498) 183423 (315155 minus974498)1 4 190211 (315155 minus974498) 183423 (315155 minus974498)1 5 172548 (593232 180543) 183423 (593232 180543)1 5 172548 (593232 180543) 183423 (593232 180543)1 5 172548 (593232 180543) 183423 (593232 180543)1 5 172548 (593232 180543) 183423 (593232 180543)1 5 172548 (593232 180543) 183423 (593232 180543)2 1 203822 (213166 minus1578616) 214940 (213166 minus1578616)2 1 203822 (213166 minus1578616) 214940 (213166 minus1578616)2 1 203822 (213166 minus1578616) 214940 (213166 minus1578616)2 2 230057 (463272 minus1225448) 214940 (463272 minus1225448)2 2 230057 (463272 minus1225448) 214940 (463272 minus1225448)2 2 230057 (463272 minus1225448) 214940 (463272 minus1225448)


B′. The global dataset W is the union of the incremental data I and the buffer pool data B (Step 1). Steps 2–11 describe the process of adding the data in W that meet the adaptive threshold condition to the last clustering result, where r represents the number of clusters of U (Table 1). First, the cluster center set $U = \bigcup_{i=1}^{r} u_i$ of the clustering result U is taken out (Step 3), and Step 4 finds the subscript smi of the cluster center closest to point W_i. Then the adaptive threshold r_ave is set by equation (4) (Steps 5 and 6), where r_ave is the average distance between the points u_e and the center point u_smi in cluster u_smi. If a point in dataset W meets the adaptive threshold, it joins the corresponding cluster and is deleted from W (Step 7), the center u_smi of cluster u_smi is updated (Step 8), and the updated U is assigned to the output result D of Algorithm 3 (Step 9). Steps 12–19 state that if the number of points in cluster M_i is greater than or equal to 2k, Algorithm 4 is called to split M_i. The clustering result D is denoted by D = U ∪ G, where U is the set of clusters M_i with fewer than 2k points and G is the output of Algorithm 4; the data remaining in W are temporarily stored in the buffer pool B′ (Step 18). In Step 20, the output of Algorithm 3 is returned.
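The threshold test described above can be sketched in Python as follows. This is a hedged illustration that follows the prose (average member-to-center distance as the threshold, and a split trigger at 2k points or more), not the authors' code. The names `absorb_increment`, `centroid`, and `dist`, the representation of a point as a `(time, x, y)` tuple, and the decision to return the indices of oversized clusters instead of splitting them in place are assumptions made for this example.

```python
import math


def centroid(points):
    """Coordinate-wise mean of (time, x, y) points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(3))


def dist(p, q):
    """Euclidean distance over the three dimensions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def absorb_increment(clusters, incremental, buffer_pool, k):
    """Add new points to existing clusters when they pass the adaptive threshold.

    Returns (updated_clusters, new_buffer_pool, indices_of_clusters_to_split).
    """
    clusters = [list(c) for c in clusters]            # do not mutate the input
    pending = list(incremental) + list(buffer_pool)   # W = I + B
    new_buffer = []
    for p in pending:
        centers = [centroid(c) for c in clusters]
        # Index of the nearest cluster center.
        j = min(range(len(clusters)), key=lambda i: dist(p, centers[i]))
        # Adaptive threshold: mean distance of current members to their center.
        r_ave = sum(dist(q, centers[j]) for q in clusters[j]) / len(clusters[j])
        if dist(p, centers[j]) <= r_ave:
            clusters[j].append(p)      # the centroid is recomputed on the next pass
        else:
            new_buffer.append(p)       # wait for the next increment
    # Clusters that reached 2k points are candidates for splitting.
    oversized = [i for i, c in enumerate(clusters) if len(c) >= 2 * k]
    return clusters, new_buffer, oversized
```

Points that fail the threshold are not forced into a distant cluster; they stay in the buffer pool until a later increment brings closer neighbours, which is what keeps the information loss of each cluster bounded.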

Algorithm 4 splits a cluster whose number of points is greater than or equal to 2k. The input of Algorithm 4 is the k-anonymity parameter k and the cluster M. The output of Algorithm 4 is the clustering result G of the split. Step 1 calculates the global center point d of cluster M by equation (3). Step 2 initializes the parameters: count is the number of newly split clusters, and G is the output of Algorithm 4. Steps 3–13 build new clusters that contain k points each. Step 5 selects the point d_sma with the largest distance to the global center point d; d_sma is taken as the center point of the new cluster and deleted from M (Step 6). Steps 7–11 add k − 1 further points to form the new cluster N_count: the center point d_count of N_count is updated (Step 8), the point d_smi with the smallest distance to d_count is selected (Step 9), and Step 10 adds d_smi to cluster N_count and removes it from M. In Step 12, the newly generated cluster N_count is added to the output result G, and the number of clusters increases. If points remain in cluster M, they are added to the new cluster closest to them (Steps 14–20): Step 16 finds the cluster N_cmi with the smallest distance to the remaining point d_i, Step 17 adds d_i to N_cmi, and Step 18 updates the center point d_cmi of cluster N_cmi. In Step 21, the output G of Algorithm 4 is returned.
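The splitting procedure can be sketched in Python as follows. This is a minimal sketch under stated assumptions and not the authors' implementation: the function name `split_cluster`, the `(time, x, y)` tuple representation, and the Euclidean distance helper are assumptions made for this example.

```python
import math


def centroid(points):
    """Coordinate-wise mean of (time, x, y) points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(3))


def dist(p, q):
    """Euclidean distance over the three dimensions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def split_cluster(points, k):
    """Split one large cluster into sub-clusters of at least k points each."""
    remaining = list(points)
    if len(remaining) < k:
        raise ValueError("cluster has fewer than k points")
    global_center = centroid(remaining)   # computed once, before any removal
    clusters = []
    while len(remaining) >= k:
        # Seed: the point farthest from the global center.
        seed = max(remaining, key=lambda p: dist(p, global_center))
        remaining.remove(seed)
        cluster = [seed]
        for _ in range(k - 1):
            # Grow the cluster with the point nearest to its running centroid.
            center = centroid(cluster)
            nearest = min(remaining, key=lambda p: dist(p, center))
            remaining.remove(nearest)
            cluster.append(nearest)
        clusters.append(cluster)
    # Fewer than k points are left: attach each to the closest new cluster.
    for p in remaining:
        target = min(clusters, key=lambda c: dist(p, centroid(c)))
        target.append(p)
    return clusters


parts = split_cluster(
    [(0, 0, 0), (1, 0, 0), (0, 1, 0), (10, 10, 10), (11, 10, 10), (10, 11, 10), (5, 5, 5)],
    k=3,
)
```

Seeding each sub-cluster from the point farthest from the global center tends to peel off outlying groups first, so the leftover points end up near the middle of the original cluster and can be attached to a neighbour at small cost.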

4 Experiments and Result Analysis

In this section, we use real-world datasets, including Gowalla's Friendship Network dataset and Kaggle's New York Taxi Travel Time dataset. Table 4 shows the number of attributes, the number of data points, and the density of data points in each dataset. We compare the proposed STPP algorithm with the k-anonymity and VCLA algorithms in terms of running time, information loss, and privacy protection. The experiments were run on an AMD A8-5550M APU with Radeon(tm) HD Graphics at 2.10 GHz, equipped with 4 GB of RAM and running Windows 10.

The datasets are processed to better protect participants' spatiotemporal sensitive data. First, we randomly extract 1000 records from the Friendship Network dataset as a segment, for a total of five segments, as participants' sensing data for the comparison experiments. Then we randomly extract 3000 records from the New York City Taxi Trip dataset as a

Input: k-anonymity parameter k; aggregation result U from Algorithm 1; incremental dataset I; buffer pool dataset B
Output: aggregation result D; buffer pool dataset B′

(1) Calculate global dataset W = I + B
(2) for i ⟵ 1 to r do
(3)     Take the centroid set U out of U
(4)     smi = argmin_{j∈U} dis(u_j, t_Wi)
(5)     r_ave = argmin_{e∈|u_smi|} dis(u_e, u_smi)
(6)     if dis(u_smi, t_Wi) ≤ r_ave then
(7)         M_smi = M_smi ∪ W_i; W = W \ W_i
(8)         Update the centroid u_smi of u_smi
(9)         D = U
(10)    end if
(11) end for
(12) for i ⟵ 1 to r do
(13)    if |M_i| > 2k then
(14)        Call Algorithm 4 ⟶
(15)            input: k-anonymity parameter k, cluster M
(16)            output: aggregation result G
(17)    end if
(18)    D = U ∪ G; B′ = W
(19) end for
(20) return D, B′

Algorithm 3: Dynamic publishing anonymous protection.


segment, for a total of five segments, as participants' sensing data for the comparison experiments. Each segment of sensing data is uploaded to the TTPs dynamically, in batches. Then the spatiotemporal sensitive data of participants, including the time and location attributes, are extracted from the sensing data for anonymization.
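The segment extraction described above can be sketched in Python as follows. The function name `draw_segments`, the fixed seed, and the choice to sample without replacement across all five segments are assumptions made for this illustration; the text does not state whether segments may overlap.

```python
import random


def draw_segments(records, segment_size, n_segments=5, seed=0):
    """Randomly draw n_segments disjoint segments of segment_size records each."""
    rng = random.Random(seed)                      # seeded for repeatable experiments
    sample = rng.sample(records, segment_size * n_segments)
    return [
        sample[i * segment_size:(i + 1) * segment_size]
        for i in range(n_segments)
    ]


# Stand-in record ids; a real run would pass the dataset rows instead.
segments = draw_segments(list(range(10000)), segment_size=1000)
for batch in segments:
    pass  # each batch would be uploaded and anonymized in turn
```

Feeding the segments one at a time imitates incremental arrival, so the first segment exercises the static algorithms and the later ones exercise the dynamic update.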

Figure 2 compares the running time of the proposed STPP algorithm with that of the k-anonymity and VCLA algorithms. Figure 2(a) shows the result on the Friendship Network dataset, and Figure 2(b) shows the result on the New York City Taxi Trip dataset. The x-coordinate is the number of participants, and the y-coordinate is the running time. The STPP algorithm is superior to the other two algorithms both on the small dataset, where participants' spatiotemporal data are sparse, and on the large dataset, where they are dense. When few participants submit tasks, the running times of the three algorithms differ little: because all three algorithms are derived from the k-anonymity algorithm, the STPP algorithm has no obvious advantage in running time when there are few participants. However, as the number of participants gradually increases, the STPP algorithm better addresses the poor timeliness of data publishing caused by the large number of participants in spatiotemporal crowdsourcing applications.

Since anonymized data are used for dynamic publishing, the difference between the real spatiotemporal data and the anonymized data is regarded as the information loss. The information loss is expressed by the following equation:

$$IL = \sum_{i=1}^{n} \sum_{k=1}^{3} \left( d_{ik} - d'_{ik} \right) \quad (6)$$

where $d'_{ik}$ represents the anonymized value of $d_{ik}$, and k indexes the dimensions, which include the time dimension and the location dimensions.
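As a rough illustration of equation (6), the following Python sketch sums the per-dimension deviations between original and anonymized records. The function name `information_loss` and the `(time, x, y)` record layout are assumptions made for this example, and taking the magnitude of each difference is also an assumption, made so that deviations of opposite sign do not cancel.

```python
def information_loss(original, anonymized):
    """Sum of per-dimension deviations between original and anonymized records.

    Both arguments are equal-length sequences of (time, x, y) tuples
    (hypothetical layout). The absolute value is an assumption of this sketch.
    """
    if len(original) != len(anonymized):
        raise ValueError("record lists must have the same length")
    return sum(
        abs(o - a)
        for rec_o, rec_a in zip(original, anonymized)
        for o, a in zip(rec_o, rec_a)
    )


# Two records whose times are replaced by a shared group time; locations are unchanged.
real = [(69309, 32.0, -97.0), (68531, 31.0, -97.0)]
published = [(66863, 32.0, -97.0), (66863, 31.0, -97.0)]
loss = information_loss(real, published)
```

Because the dimensions are summed without rescaling in this sketch, a dimension with large numeric values (such as time in seconds) dominates the total; whether the original experiments normalize the dimensions first is not stated.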

Figure 3 compares the information loss of the STPP algorithm with that of the k-anonymity and VCLA algorithms. The x-coordinate is the number of participants on the Friendship Network dataset, and the y-coordinate is the information loss. The results show that the information loss grows as the number of participants increases. Moreover, the STPP algorithm is clearly better than the comparison algorithms on information loss.

Figure 4 shows the relationship between the k-anonymity parameter k and the information loss of the STPP algorithm, where different curves represent different amounts of data. The experiments are conducted on the Friendship Network dataset. The results show that the information loss increases gradually as k increases, because increasing k leads

Table 4: Attribute quantity and density of datasets.

Datasets                          Dimensions   Quantity   Sparseness
Friendship Network dataset        5            6442892    Sparse
New York City Taxi Trip dataset   11           1458644    Dense

Input: k-anonymity parameter k; cluster M
Output: aggregation result G

(1) Calculate the global centroid d of cluster M by equation (3)
(2) count = 1; G = ∅
(3) while |M| ≥ k do
(4)     N_count = ∅
(5)     sma = argmax_{i∈M} dis(d_i, d)
(6)     N_count = N_count ∪ d_sma; M = M \ d_sma
(7)     for j ⟵ 1 to k − 1 do
(8)         Update the centroid d_count of N_count
(9)         smi = argmin_{i∈M} dis(d_i, d_count)
(10)        N_count = N_count ∪ d_smi; M = M \ d_smi
(11)    end for
(12)    G = G ∪ N_count; count = count + 1
(13) end while
(14) while |M| > 0 do
(15)    for i ∈ |M| do
(16)        cmi = argmin_{j∈G} dis(d_i, d_j)
(17)        N_cmi = N_cmi ∪ d_i; M = M \ d_i
(18)        Update the centroid d_cmi of N_cmi
(19)    end for
(20) end while
(21) return G

Algorithm 4: Breaking up clusters.


to more spatiotemporal sensitive data in each cluster, and the IL of each cluster increases correspondingly.

To evaluate privacy protection, we quantify and compare the probability that an attack succeeds, that is, the probability that an attacker guesses a participant's specific spatiotemporal data from the published sensing data. Suppose that n sensing data are published and that the spatiotemporal sensitive data $d_i$, $1 \le i \le n$, are aggregated into r location clusters and c time clusters. In this paper, equation (7) is used to quantify privacy protection, where $\sum_{i=1}^{r}(1/|u_i|)/r$ and $\sum_{j=1}^{c}(1/|c_j|)/c$ represent the average probability that attackers can infer the real location attribute and the real time attribute of each sensing datum, respectively.

$$p = \frac{1}{n} \times \frac{\sum_{i=1}^{r} 1/|u_i|}{r} \times \frac{\sum_{j=1}^{c} 1/|c_j|}{c} \quad (7)$$
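Equation (7) can be evaluated directly from the cluster sizes, as the following Python sketch shows. The function name `attack_success_probability` and its argument layout (lists of cluster sizes) are assumptions made for this illustration.

```python
def attack_success_probability(n, location_cluster_sizes, time_cluster_sizes):
    """Evaluate equation (7) from the sizes |u_i| and |c_j| of the clusters.

    n: number of published sensing data
    location_cluster_sizes: sizes of the r location clusters
    time_cluster_sizes: sizes of the c time clusters
    """
    r = len(location_cluster_sizes)
    c = len(time_cluster_sizes)
    # Average chance of picking the right record inside a location cluster.
    avg_location = sum(1 / size for size in location_cluster_sizes) / r
    # Average chance of picking the right record inside a time cluster.
    avg_time = sum(1 / size for size in time_cluster_sizes) / c
    return (1 / n) * avg_location * avg_time


# Four published records, two location clusters of two records, one time cluster of four.
p = attack_success_probability(4, [2, 2], [4])
```

Larger clusters shrink both averages and more published data shrink the 1/n factor, which matches the trend the text reports for growing k and growing numbers of participants.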

Figure 5 compares the privacy protection of the STPP algorithm with that of the k-anonymity and VCLA algorithms. The x-coordinate is the number of participants on the New York City Taxi Trip dataset, and the y-coordinate is the probability that attackers infer the specific spatiotemporal data of participants from the published sensing data. The results show that privacy protection gradually improves as the number of participants increases. This is because, as the number of participants increases, the sensing data published by the application server increase correspondingly, which reduces the probability of a successful attack, since the probability of a successful attack without

Figure 4: Information loss of STPP algorithm (x-axis: the value of the k-anonymity parameter, 2 to 6, with l = 3; y-axis: information loss; curves: 1000, 2000, 3000, 4000, and 5000 data).

Figure 3: Comparison of experimental result on information loss (x-axis: number of participants, 1000 to 5000, with k = 5 and l = 2; y-axis: information loss; curves: K, VCLA, and STPP).

Figure 2: Comparison of experimental results of running time on (a) Friendship Network dataset and (b) New York City Taxi Trip dataset (y-axis: running time (s); curves: K, VCLA, and STPP).


background knowledge is very low (the y-coordinate unit is $10^{-5}$). The STPP algorithm is slightly better than the comparison algorithms on privacy protection.

Figure 6 shows the relationship between the k-anonymity parameter k and the privacy protection of the STPP algorithm. We conduct the experiments on the New York City Taxi Trip dataset. The results show that privacy protection improves gradually as k increases, because increasing k leads to more spatiotemporal sensitive data in each cluster, which reduces the average probability that attackers infer the real spatiotemporal data.

When participants upload sensing data, the TTPs temporarily store the sensing data that do not meet the anonymity condition in the buffer pool. The TTPs then wait for the arrival of the next incremental data, which delays the publishing of these sensing data. Figure 7 shows the ratio of buffer pool data to the number of sensing data in the current publish. The x-coordinate is the number of participants on the Friendship Network dataset, and the y-coordinate is the ratio of sensing data in the buffer pool. The proportion of data in the buffer pool is very low, which indicates that buffering has little impact on publishing delay.

Through experiments on real-world datasets, we can see that the proposed STPP algorithm is superior to the k-anonymity and VCLA algorithms in terms of running time, information loss, and privacy protection. The STPP algorithm can solve the privacy protection problem of dynamic publishing for spatiotemporal crowdsourcing.

5 Conclusions

In the existing work, few researchers focus on privacy protection for the dynamic publishing mechanism, and there are few privacy protection methods for spatiotemporal sensitive data under dynamic publishing. In this paper, a dynamic publishing mechanism for the privacy protection of spatiotemporal sensitive data is proposed. First, we design the dynamic k-anonymity algorithm, which adds the spatiotemporal data that meet the adaptive threshold condition to the corresponding equivalence classes, making full use of the previous anonymous result to solve the poor timeliness of static publishing. Second, because k-anonymity is vulnerable to background knowledge attacks and homogeneity attacks, we anonymize participants' time attribute based on l-diversity so as to improve privacy protection and reduce information loss. Finally, the performance of the proposed STPP algorithm is evaluated on two real-world datasets. Compared with the existing algorithms, the experimental results show that the STPP algorithm has a shorter running time, less information loss, and stronger privacy protection.

In the future, we will detect and process malicious participants (i.e., outliers) so as to further reduce information loss and protect participants' private data.

Figure 7: The proportion of sensing data in buffer pool (y-axis: the percentage of data in buffer pool).

Figure 6: Privacy protection of the STPP algorithm (x-axis: the value of the k-anonymity parameter, 2 to 6, with l = 3; y-axis: attack success probability ($10^{-5}$); curves: 3000, 6000, 9000, 12000, and 15000 data).

Figure 5: Experimental result on privacy protection (x-axis: number of participants, 3000 to 15000, with k = 5 and l = 3; y-axis: attack success probability ($10^{-5}$); curves: K, VCLA, and STPP).


Data Availability

The experiment data used to support the findings of this study have been deposited in the GitHub repository (https://github.com/ltn21999/K_L-dynamic-privacy-protection).

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grant nos. 61822602, 61772207, 61802331, 61572418, 61602399, 61702439, and 61773331; the China Postdoctoral Science Foundation under Grant nos. 2019T120732 and 2017M622691; the National Science Foundation (NSF) under Grant nos. 1704287, 1252292, and 1741277; and the Graduate Innovation Foundation of Yantai University (GIFYTU) under Grant nos. YDYB2024 and YDZD1908.

References

[1] Z. Cai, X. Zheng, and J. Yu, "A differential-private framework for urban traffic flows estimation via taxi companies," IEEE Transactions on Industrial Informatics, vol. 15, no. 12, pp. 6492–6499, 2019.

[2] J. Li, T. Cai, K. Deng, X. Wang, T. Sellis, and F. Xia, "Community-diversified influence maximization in social networks," Information Systems, vol. 92, pp. 1–12, 2020.

[3] Y. Wang, Z. Cai, Z.-H. Zhan, Y.-J. Gong, and X. Tong, "An optimization and auction-based incentive mechanism to maximize social welfare for mobile crowdsourcing," IEEE Transactions on Computational Social Systems, vol. 6, no. 3, pp. 414–429, 2019.

[4] N. A. H. Haldar, J. Li, M. Reynolds, T. Sellis, and J. X. Yu, "Location prediction in large-scale social networks: an in-depth benchmarking study," VLDB Journal, vol. 28, no. 5, pp. 623–648, 2019.

[5] J. Wang, Z. Cai, and J. Yu, "Achieving personalized k-anonymity-based content privacy for autonomous vehicles in CPS," IEEE Transactions on Industrial Informatics, vol. 16, no. 6, pp. 4242–4251, 2020.

[6] Z. Xiong, W. Li, Q. Han et al., "Privacy-preserving auto-driving: a GAN-based approach to protect vehicular camera data," in Proceedings of 2019 IEEE International Conference on Data Mining (ICDM), Beijing, China, November 2019.

[7] M. Bi, Y. Wang, Y. Li, and X. Tong, A Privacy-Preserving Mechanism Based on Local Differential Privacy in Edge Computing, China Communications, Hong Kong, China, 2020.

[8] Y. Liang, Z. Cai, J. Yu, Q. Han, and Y. Li, "Deep learning based inference of private information using embedded sensors in smart devices," IEEE Network, vol. 32, no. 4, pp. 8–14, 2018.

[9] X. Xu, X. Zhang, X. Liu, J. Jiang, L. Qi, and M. Z. A. Bhuiyan, "Adaptive computation offloading with edge for 5G-envisioned internet of connected vehicles," IEEE Transactions on Intelligent Transportation Systems, 2020.

[10] H. To, G. Ghinita, and C. Shahabi, "A framework for protecting worker location privacy in spatial crowdsourcing," Proceedings of the VLDB Endowment, vol. 7, no. 10, pp. 919–930, 2014.

[11] K. Vu, R. Zheng, and J. Gao, "Efficient algorithms for k-anonymous location privacy in participatory sensing," in Proceeding of the IEEE INFOCOM, pp. 2399–2407, Orlando, FL, USA, March 2012.

[12] S. B. Avaghade and S. S. Patil, "Privacy preserving for spatio-temporal data publishing ensuring location diversity using K-anonymity technique," in Proceedings of the 2015 International Conference on Computer, Communication and Control (IC4), September 2015.

[13] Z. Cai, Z. Duan, and W. Li, "Exploiting multi-dimensional task diversity in distributed auctions for mobile crowdsensing," IEEE Transactions on Mobile Computing, no. 99, p. 1, 2020.

[14] Y. Wang, Y. Gao, Y. Li, and X. Tong, "A worker-selection incentive mechanism for optimizing platform-centric mobile crowdsourcing systems," Computer Networks, vol. 171, pp. 1–14, 2020.

[15] T. Liu, Y. Wang, Y. Li, X. Tong, L. Qi, and N. Jiang, "Privacy protection based on stream cipher for spatio-temporal data in IoT," IEEE Internet of Things Journal, 2020.

[16] Z. Cai and Z. He, "Trading private range counting over big IoT data," in Proceedings of the 39th IEEE International Conference on Distributed Computing Systems (ICDCS), Dallas, TX, USA, July 2019.

[17] X. Wang, Z. Liu, X. Tian et al., "Incentivizing crowdsensing with location-privacy preserving," IEEE Transactions on Wireless Communications, vol. 16, no. 10, pp. 6940–6952, 2017.

[18] Z. Cai and X. Zheng, "A private and efficient mechanism for data uploading in smart cyber-physical systems," IEEE Transactions on Network Science and Engineering (TNSE), vol. 7, no. 2, pp. 766–775, 2020.

[19] Y. Chen, N. Zhang, Y. Zhang, X. Chen, W. Wu, and X. S. Shen, "Energy efficient dynamic offloading in mobile edge computing for internet of things," IEEE Transactions on Cloud Computing, 2019.

[20] X. Xu, C. He, Z. Xu, L. Qi, S. Wan, and M. Z. A. Bhuiyan, "Joint optimization of offloading utility and privacy for edge computing enabled IoT," IEEE Internet of Things Journal, vol. 7, no. 4, pp. 2622–2629, 2020.

[21] X. Xu, X. Zhang, H. Gao, Y. Xue, L. Qi, and W. Dou, "BeCome: blockchain-enabled computation offloading for IoT in mobile edge computing," IEEE Transactions on Industrial Informatics, vol. 16, no. 6, pp. 4187–4195, 2020.

[22] Z. Cai, Z. He, X. Guan, and Y. Li, "Collective data-sanitization for preventing sensitive information inference attacks in social networks," IEEE Transactions on Dependable and Secure Computing, vol. 15, no. 4, pp. 577–590, 2018.

[23] Y. Wang, Z. Cai, X. Tong, Y. Gao, and G. Yin, "Truthful incentive mechanism with location privacy-preserving for mobile crowdsourcing systems," Computer Networks, vol. 135, pp. 32–43, 2018.

[24] P. Samarati, "Protecting respondents identities in microdata release," IEEE Transactions on Knowledge and Data Engineering, vol. 13, no. 6, pp. 1010–1027, 2001.

[25] L. Fang and L. Tong, "A clustering K-anonymity privacy-preserving method for wearable IoT devices," Security and Communication Networks, vol. 2018, Article ID 4945152, 8 pages, 2018.

[26] S. C. Lin, A. Y. Ye, and L. Xu, "K-anonymity location privacy protection method with coordinate transformation," Journal of Chinese Computer Systems, vol. 37, pp. 119–123, 2016.


[27] X. Zheng, Z. Cai, and Y. Li, "Data linkage in smart internet of things systems: a consideration from a privacy perspective," IEEE Communications Magazine, vol. 56, no. 9, pp. 55–61, 2018.

[28] H. Wang, H. Huang, Y. Qin, Y. Wang, and M. Wu, "Efficient location privacy-preserving K-anonymity method based on the credible chain," ISPRS International Journal of Geo-Information, vol. 6, no. 6, p. 163, 2017.

[29] A. Machanavajjhala, J. Gehrke, D. Kifer et al., "L-diversity: privacy beyond k-anonymity," in Proceedings of the 22nd International Conference on Data Engineering, April 2006.

[30] T. Dargahi, M. Ambrosin, M. Conti, and N. Asokan, "ABAKA: a novel attribute-based k-anonymous collaborative solution for LBSs," Computer Communications, vol. 85, pp. 1–13, 2016.

[31] A. Abdrashitov and A. Spivak, "Sensor data anonymization based on genetic algorithm clustering with L-Diversity," in Proceedings of the 18th Conference of Open Innovations Association and Seminar on Information Security and Protection of Information Technology (FRUCT-ISPIT), April 2016.

[32] Y. Wang, Z. Cai, G. Yin, Y. Gao, X. Tong, and G. Wu, "An incentive mechanism with privacy protection in mobile crowdsourcing systems," Computer Networks, vol. 102, pp. 157–171, 2016.

[33] Z. Hu, J. Yang, and J. Zhang, "Personalized trajectory privacy protection method based on user-requirement," International Journal of Cooperative Information Systems, vol. 27, no. 3, 2018.

[34] F. Tian, S. Zhang, L. Lu et al., "A novel personalized differential privacy mechanism for trajectory data publication," in 2017 Proceedings of the International Conference on Networking & Network Applications (NaNA), October 2017.

[35] T. Peng, Q. Liu, D. Meng et al., "Collaborative trajectory privacy preserving scheme in location-based services," Information Sciences, vol. 387, pp. 165–179, 2017.

[36] D. Sun, Y. Luo, G. Fan et al., "Privacy protection algorithm based on trajectory shape diversity," Journal of Computer Applications, vol. 36, no. 6, pp. 1544–1551, 2016.

[37] X. Zheng, Z. Cai, J. Yu, C. Wang, and Y. Li, "Follow but no track: privacy preserved profile publishing in cyber-physical social systems," IEEE Internet of Things Journal, vol. 4, no. 6, pp. 1868–1878, 2017.

[38] X. Xu, B. Shen, X. Yin et al., "Edge server quantification and placement for offloading social media services in industrial cognitive IoV," IEEE Transactions on Industrial Informatics, no. 99, p. 1, 2020.

[39] K. Wang and B. C. M. Fung, "Anonymizing sequential releases," in Proceedings of the Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, Philadelphia, PA, USA, August 2006.

[40] X. Xiao, "M-invariance: towards privacy preserving re-publication of dynamic datasets," in Proceedings of the ACM SIGMOD International Conference on Management of Data, Beijing, China, June 2007.

[41] S. Cheng, C. Xu, and H. Dan, "Research on incremental privacy preserving data mining," Application Research of Computers, vol. 3, no. 8, 2018.

[42] F. Yan, X. Zhang, C. Li et al., "Differentially private histogram publishing through fractal dimension for dynamic datasets," in Proceedings of IEEE 2018 13th IEEE Conference on Industrial Electronics and Applications (ICIEA), pp. 1542–1546, Wuhan, China, June 2018.

[43] Y. Wang, Z. Cai, Z. Zhan, B. Zhao, X. Tong, and L. Qi, "Walrasian equilibrium-based multiobjective optimization for task allocation in mobile crowdsourcing," IEEE Transactions on Computational Social Systems, 2020.

[44] Z. He, Z. Cai, and J. Yu, "Latent-data privacy preserving with customized data utility for social network data," IEEE Transactions on Vehicular Technology, vol. 67, no. 1, pp. 665–673, 2018.


ideal partition of spatial data under low time complexity is realized, and participants' location information is protected in a spatiotemporal crowdsourcing application scenario. In [12], the insufficient diversity of the k-anonymity algorithm with respect to participants' sensitive locations is addressed: either the probability of participants' access to the sensitive locations is limited, or a probability analysis based on adversary knowledge is used to ensure location diversity.

However, most researchers currently consider only data publishing in a static scenario. Attackers can use historical publishing results to reveal sensitive information during static publishing, for example, by comparing with the results of the previous publishing. In spatiotemporal crowdsourcing applications, many data analysis applications actually involve dynamic data publishing. For example, in order to plan travel routes for special vehicles (cash trucks, ambulances, fire engines, etc.), it is necessary to issue a sensing task to collect information on road traffic jams [13, 14]. For such a spatiotemporal sensitive task, the application server needs to dynamically publish the sensing data submitted by participants to improve the timeliness of the task. Dynamic sensing data change constantly over time, so it is often necessary to anonymize and dynamically publish sensing data at different times. However, most anonymity algorithms are ineffective when dealing with the dynamic publishing of spatiotemporal data [15]: the previous anonymity result cannot be effectively utilized, and in big data scenarios the time complexity of the algorithms is high and the timeliness is poor [16]. Moreover, most researchers have proposed privacy preservation for participants' location information but have failed to consider that attackers can also infer other private information from participants' spatiotemporal information. To address these problems, we study the protection of spatiotemporal private information in spatiotemporal crowdsourcing systems, where the following issues should be improved further:

(1) In the process of dynamic data publishing, the results of the previous anonymization should be effectively utilized, instead of anonymizing the incremental data together with the previous data from scratch, to improve the timeliness of dynamic publishing of big data.

(2) In the process of anonymizing the location attribute of participants, the time attribute is added to effectively avoid background knowledge attacks and homogeneity attacks against the location attribute.

In order to solve the above problems, we propose a spatiotemporal privacy protection method for spatiotemporal crowdsourcing systems. The contributions of this paper are as follows:

(1) A dynamic publishing algorithm based on spatiotemporal data privacy protection is designed by improving k-anonymity. When incremental data arrive, the previous anonymization result is utilized to solve the timeliness problem of dynamic publishing.

(2) Based on the traditional position coordinates, a time axis is added to form the spatiotemporal information of participants, and participants' spatiotemporal information is anonymized by applying the k-anonymity and l-diversity methods so as to counter background knowledge attacks and homogeneity attacks.

(3) In order to verify the effectiveness of the proposed privacy protection method, comparison experiments with the k-anonymity and variable centroid location aggregation (VCLA) [17] algorithms are conducted on two real-world datasets.

The structure of the paper is as follows. Section 2 introduces the related works on spatiotemporal privacy protection. Section 3 introduces the proposed spatiotemporal privacy protection method for spatiotemporal crowdsourcing systems. In Section 4, real-world datasets and existing anonymity algorithms are used to evaluate the performance of the proposed method. Section 5 concludes the paper and presents the future work.

2 Related Works

In this section, we introduce the related works on privacy protection methods for spatiotemporal data and on dynamic publishing of sensitive data in a participatory sensing environment. Participatory sensing (PS) refers to the formation of a mobile Internet through everyday mobile devices, where data are sensed, collected, analyzed, or screened by the public and professional users and then uploaded to the participatory sensing network [18]. With the popularization of mobile terminals and the rapid development of wireless sensor technology, the application of PS is becoming more and more common in real life. For example, in [19], Chen et al. studied energy-efficient task offloading in mobile edge computing (MEC). However, in the process of task offloading, the privacy of participants may be exposed. To deal with this problem, Xu et al. [20] put forward a two-phase offloading optimization strategy for the joint optimization of offloading utility and privacy in edge computing. Further, Xu et al. [21] discussed the problem that transmitted information is vulnerable to attack, which may leave data incomplete during task offloading, and proposed a blockchain-enabled computation offloading method to ensure data integrity. In the implementation of these participatory sensing applications, the sensing tasks uploaded by participants are marked with personal spatiotemporal data, which brings great risks to the privacy and personal safety of participants. Therefore, while people enjoy the convenience brought by LBS, their privacy is also at risk of being exposed [22].

In LBS, the use of anonymization technology to solve the location privacy problem of participants has been widely studied [23]. The k-anonymity technology was first proposed by Samarati and Sweeney [24]. The parameter k specifies the maximum risk of information disclosure that users can bear. It requires at least k records that are indistinguishable on the quasi-identifier in the published data, so that attackers cannot identify the specific individual to whom the private information belongs, thereby protecting personal privacy. In


















i1Mi Mi Uri

j1uij L Uci1Ti and

Ti Uci









d ikprime




$$\mathrm{dis}(p_i, p_j) = \sqrt{\sum_{k=1}^{3} \left( d_{ik} - d_{jk} \right)^2}, \quad 1 \le i, j \le n \quad (2)$$


$$d = \left\langle \frac{\sum_{i=1}^{n} time_i}{n}, \frac{\sum_{i=1}^{n} x_i}{n}, \frac{\sum_{i=1}^{n} y_i}{n} \right\rangle \quad (3)$$


avej 11139443

k11113944

r


or |L|

(4)








$$t = \frac{\sum_{i=1}^{m} time_i}{m} \quad (5)$$


ri1 |Mi|


[Figure residue: workflow diagram with the labels "Data collector", "(1) Data transfer", "(4) Data transfer", "Application server", "(6) Release", and "The cloud".]



















































IL 1113944n

i11113944

3



















ri1(1|ui|)r and



p 1n

times1113936

ri1 1 ui

111386811138681113868111386811138681113868111386811138681113872 1113873

rtimes

1113936cj1 1 cj

11138681113868111386811138681113868

111386811138681113868111386811138681113874 1113875

c

(7)


550500450400350300250200150100

Info

rmat

ion

loss


100020003000

40005000


11001000

900

800700

600

500400

300200100

Info

rmat

ion

loss


KVCLA

STPP


8

7

6

5

4

3

2

1

0

Runn

ing

time (

s)


KVCLA

STPP

(a)

45

40

35

30

25

20

15

10

5

0

Runn

ing

time (

s)


KVCLA

STPP

(b)








5 Conclusions



0035

003

0025

002

0015

001

0005

0

The p

erce

ntag

e of d

ata i

n bu

ffer p

ool



4

35

3

25

2

15

1

05

Atta

ck su

cces

s pro

babi

lity

(10ndash5

)


300060009000

1200015000


Atta

ck su

cces

s pro

babi

lity

(10ndash5

)7

6

5

4

3

2

1

03000 6000 9000 12000 15000


KVCLA

STPP



Data Availability




Acknowledgments


References

































































i1Mi Mi Uri

j1uij L Uci1Ti and

Ti Uci









d ikprime




dis pi pj1113872 1113873

1113944

3


2

11139741113972

1le i jle n (2)


d lt1113936

ni1 timei

n1113936

ni1 xi

n1113936

ni1 yi

ngt (3)


avej 11139443

k11113944

r


or |L|

(4)








t 1113936

mi1 timei

m (5)


ri1 |Mi|


Data collector

(1) Data transfer





(4) Data transfer

Application server


(6) Release

The cloud



















































IL 1113944n

i11113944

3



















ri1(1|ui|)r and



p 1n

times1113936

ri1 1 ui

111386811138681113868111386811138681113868111386811138681113872 1113873

rtimes

1113936cj1 1 cj

11138681113868111386811138681113868

111386811138681113868111386811138681113874 1113875

c

(7)




























































































































