Research Article
Efficient Processing of Moving Top-k Spatial Keyword Queries in Directed and Dynamic Road Networks
Muhammad Attique,1 Hyung-Ju Cho,2 and Tae-Sun Chung3
1Department of Software, Sejong University, Republic of Korea
2Department of Software, Kyungpook National University, Republic of Korea
3Department of Software, Ajou University, Republic of Korea
Correspondence should be addressed to Tae-Sun Chung; tschung@ajou.ac.kr
Received 28 May 2018; Revised 15 August 2018; Accepted 18 September 2018; Published 1 November 2018
Academic Editor: Ke Guan
Copyright © 2018 Muhammad Attique et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
A top-k spatial keyword (TkSK) query ranks objects based on their distance to the query location and their textual relevance to the query keywords. Several solutions have been proposed for top-k spatial keyword queries. However, most studies focus on Euclidean space or investigate only snapshot queries, where both the query and the data objects are static. A few algorithms study TkSK queries in undirected road networks, where each edge is undirected and the distance between two points is the length of the shortest path connecting them. However, TkSK queries have not been thoroughly investigated in directed and dynamic spatial networks, where each edge has a particular orientation and its weight changes according to the traffic conditions. Therefore, in this study, we address this problem by presenting a new method called COSK for processing continuous top-k spatial keyword queries for moving queries in directed and dynamic road networks. We first propose an efficient framework to process snapshot TkSK queries. Furthermore, we propose a safe-exit-based approach to monitor the validity of the results for moving TkSK queries. Our experimental results demonstrate that COSK significantly outperforms existing techniques in terms of query processing time and communication cost.
1. Introduction
With the popularization of geo-tagged data (e.g., geo-tagged photos, videos, check-ins, and text messages), many online location-based services, such as Google Maps, Yahoo Maps, and Bing Maps, have started providing useful information via location-based queries [1–4]. Moreover, textual descriptions of points of interest, e.g., hotels, shopping malls, and tourist attractions, are easily accessible on the Web. These developments demand techniques that efficiently process top-k spatial keyword queries, which return a ranked list of the k best facilities based on their proximity to the query location and relevance to the query keywords. Several algorithms have been proposed for processing top-k spatial keyword queries in Euclidean space [5, 6]. Although a few algorithms study keyword queries in road networks, they all focus on undirected road networks. However, in real scenarios, urban road networks are directed and dynamic: each edge has a particular orientation, and its weight changes according to traffic conditions such as traffic congestion and reversible lanes. Therefore, in this study, we investigate moving top-k spatial keyword queries in directed and dynamic road networks.
Top-k keyword queries can be used for a wide range of applications in recommendation and decision support systems. For example, tourists may want to retrieve a sorted list of restaurants that serve Italian steak based on the shortest distance from their location and textual relevance to the query keywords. Tourists can issue a top-k spatial keyword query to the location-based services (LBS) to collect information about qualifying restaurants in their vicinity. Moreover, with moving top-k spatial keyword queries, if they do not like the results, they can simply keep moving, and updated results will be provided until a desired restaurant is found. Typically, the query issuer follows the underlying road network to reach the desired location. Therefore, TkSK algorithms based on Euclidean space do not work in road networks. A road network is generally modeled as a weighted directed graph, where each edge has a direction and its weight can vary according to the traffic conditions.

Hindawi, Wireless Communications and Mobile Computing, Volume 2018, Article ID 7373286, 19 pages, https://doi.org/10.1155/2018/7373286

Figure 1: Illustration of directed road network. [Figure omitted: a directed road network with nodes n1–n7, query point q, weighted edges, and data objects d1 (Grand Hotel), d2 (Cafe), d3 (Italian Restaurant), d4 (Chinese Restaurant), d5 (Pub and Bar), d6 (Italian Restaurant), and d7 (Cafe and Bakery).]
Given a set of data objects $D = \{d_1, d_2, \ldots, d_{|D|}\}$, a query location, and a set of keywords, the TkSK query returns the best k data objects from D according to their combined textual and spatial relevance to the query. We use the distance function $dist(q, d)$ to represent the shortest network distance from q to data object d. Figure 1 presents an example of a directed road network, where rectangles represent the data objects with a textual description and the triangle represents the query location. The number label on each edge indicates the weight of that edge, such as the amount of time required to travel along it, e.g., $dist(d_1, n_1) = 1$ and $dist(d_1, n_2) = 2$. Consider a scenario where a tourist is interested in finding an "Italian Restaurant". If an undirected road network is considered, the top-1 "Italian Restaurant" is $d_6$. However, in a directed road network, the shortest path from q to $d_6$ is $(q \rightarrow n_3 \rightarrow n_7 \rightarrow d_6)$. Therefore, for a directed road network, the top-1 result is $d_3$ because it is closer to the query location than $d_6$. Now consider that the tourist is looking for "Cafe Bakery". The data object $d_7$ could score higher than data object $d_2$ because $d_7$ ("Cafe and Bakery") is more textually relevant to the query keywords than $d_2$ ("Cafe") and $dist(q, d_7)$ is only marginally greater than $dist(q, d_2)$.
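The effect of edge orientation on network distance can be illustrated with a small sketch. The toy network below is hypothetical (it is not the network of Figure 1, whose full edge list is not reproduced here), and the helper names `dijkstra` and `build` are illustrative only. It shows that the same edge set yields different shortest distances when edges are treated as one-way, and that the directed distance is not symmetric.

```python
import heapq

def dijkstra(adj, source):
    """Shortest distances from source; adj[u] is a list of (v, weight) pairs."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def build(edges, directed):
    """Build an adjacency list; undirected mode adds the reverse of every edge."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        if not directed:
            adj.setdefault(v, []).append((u, w))
    return adj

# Hypothetical toy network: the edge a -> q is one-way, so q cannot use it to reach a.
edges = [("a", "q", 1), ("q", "b", 2), ("b", "c", 1), ("c", "a", 3)]
directed = dijkstra(build(edges, True), "q")
undirected = dijkstra(build(edges, False), "q")
print("dist(q, a) directed:", directed["a"])
print("dist(q, a) undirected:", undirected["a"])
print("dist(a, q) directed:", dijkstra(build(edges, True), "a")["q"])
```

In this sketch an object located at `a` would look close to `q` under the undirected model, but it is reachable only through a detour once orientations are respected, which is the situation described for $d_6$ and $d_3$ above.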
Moving top-k spatial keyword queries in directed and dynamic road networks are useful for many location-based applications. However, query processing is costly because the movement of query object q may invalidate the query results. Therefore, the main challenge in moving TkSK queries is to maintain the freshness of the query results while the query objects move freely. A straightforward approach is to increase the update frequency of the query. However, this approach still cannot guarantee up-to-date query results between updates, and it increases the computation and communication overhead: whenever the query object changes its location, it has to report its location to the server, which increases the communication cost, and the server has to recompute the results, which increases the computation cost.
To address the aforementioned challenges, we first present an efficient processing technique for snapshot TkSK queries in directed road networks. Then, we present a safe-exit-based approach for processing and monitoring moving TkSK queries, where query object q moves freely in a directed spatial network. The safe exit point of query object q represents a boundary point between the safe region and the non-safe region of q. The safe region of a query object is the region within which the query result is guaranteed to remain valid. Therefore, the query results are recomputed only when q leaves its safe region, which significantly reduces the computation and communication costs. To the best of our knowledge, this is the first attempt to study moving top-k spatial keyword queries in directed and dynamic road networks.
Below, we summarize our contributions:
(i) We study the problem of continuous monitoring of moving top-k spatial keyword queries in directed and dynamic road networks.
(ii) We present an algorithm to monitor moving TkSK queries, which efficiently computes the safe exit points for query object q in a directed road network. The algorithm significantly reduces the computation and communication costs for moving queries.
(iii) We also propose a method that monitors the validity of the query results and the safe region when the weights of road segments are updated due to traffic conditions.
(iv) Finally, we conduct extensive experiments on real road network datasets and demonstrate the superiority of the proposed algorithm over the existing approach.
The remainder of this paper is structured as follows. Section 2 reviews the existing work on the processing of TkSK queries in Euclidean space and road networks. Section 3 provides terminology definitions and describes the problem. Section 4 elaborates on the proposed query processing technique for TkSK queries in directed road networks. In Section 5, we present our safe-exit-based technique to process moving TkSK queries. Section 7 presents a performance analysis of the proposed technique. Section 8 concludes this paper.
2. Related Work
In this section, we discuss some of the promising related studies on top-k spatial keyword queries. Our related work is divided into two parts: Section 2.1 reviews snapshot TkSK queries, and Section 2.2 presents the studies proposed to address moving TkSK queries.
2.1. Snapshot Top-k Spatial Keyword Queries. In recent years, spatial keyword queries have drawn the attention of many researchers. Several approaches have been proposed for ranking spatial data objects. Initially, Zhou et al. [7] worked on combining inverted indexes [8] and R-trees [9]. They proposed three different hybrid indexing structures. Their study demonstrated that building an inverted index on top of an R-tree provides superior performance. Hariharan et al. [10] proposed the indexing structure KR*-tree by capturing the joint distribution of keywords in space. Ian de Felipe et al. [11] proposed a data structure that combines an R-tree with text signatures. Each node of the R-tree exploits a signature to indicate the presence of keywords in the subtree of the node. However, both these approaches address only Boolean keyword queries in Euclidean space.
Top-k spatial keyword queries, where data objects are ranked according to their combined textual and spatial relevance to keyword queries, were first studied by Cong et al. [5] and Li et al. [6]. Both studies [5, 6] integrate location indexing and text indexing to generate IR-trees. These studies process top-k spatial keyword queries only in Euclidean space and are not suitable for processing top-k spatial keyword queries in road networks, where the distance between objects is determined by the shortest path connecting them. Later, Rocha et al. [12] proposed the indexing technique S2I, which maps each term in the vocabulary into a separate block or aR-tree for efficient processing of top-k spatial keyword queries. Zhang et al. [13] proposed an m-closest keyword query that returns the closest object based on distance that matches the m query keywords.
Top-k spatial keyword queries in road networks were introduced by Rocha et al. [14]. In particular, they proposed three different indexing techniques (Basic Indexing, Enhanced Indexing, and Overlay Indexing) for processing spatial keyword queries in road networks.
2.2. Moving Top-k Spatial Keyword Queries. Recently, research focus has shifted to the continuous processing of spatial queries where query or data objects move arbitrarily in road networks, which is the most realistic scenario. Considerable research effort has been devoted to processing moving range, k-nearest neighbor (kNN), and reverse k-nearest neighbor (RkNN) queries [15–18]. However, there is a lack of efficient algorithms for moving top-k spatial keyword queries. Initially, Wu et al. [19] and Huang et al. [20] proposed different methods for monitoring top-k spatial keyword queries in Euclidean space. Guo et al. [21] studied moving top-k spatial keyword queries on road networks. They presented two methods for monitoring moving queries in a continuous manner that reduce the traversal of network edges. Later, Li et al. [22] proposed TPR-tree-based indexing to monitor moving top-k spatial keyword queries. In contrast to [21, 22], in this study, we consider moving top-k spatial keyword queries in directed and dynamic road networks, where each road segment has a particular orientation and its weight changes according to traffic conditions.

Table 1 compares our problem scenario with related work in terms of query type, space domain, and orientation of road networks.

Table 1: Comparisons with existing solutions.

| Algorithm | Type | Space domain | Orientation |
|---|---|---|---|
| Cong et al. [5] | Snapshot | Euclidean | No orientation |
| Rocha et al. [14] | Snapshot | Static road | Undirected |
| Wu et al. [19] | Moving | Euclidean | No orientation |
| Huang et al. [20] | Moving | Euclidean | No orientation |
| Guo et al. [21] | Moving | Static road | Undirected |
| Li et al. [22] | Moving | Static road | Undirected |
| COSK | Moving | Dynamic road | Directed |
3. Preliminaries
Section 3.1 defines the terms and notations used in this paper. Section 3.2 formulates the problem using an example that illustrates the general results of top-k spatial keyword queries.
3.1. Definition of Terms and Notations
3.1.1. Road Network. A road network is represented by a weighted directed graph $G = (N, E, W)$, where N, E, and W denote the node set, edge set, and edge distance matrix, respectively. The network distance of an edge changes depending on the traffic conditions. Each edge is also assigned an orientation that is either undirected or directed. An undirected edge is represented by $e = (n_s, n_e)$, where $n_s$ and $n_e$ are the boundary nodes $n_\beta$ of the edge, whereas a directed edge is represented by $e = \overrightarrow{(n_s, n_e)}$ or $e = \overleftarrow{(n_e, n_s)}$. Naturally, the arrow above the edge indicates the associated direction. We refer to $n_s$ as the starting node and $n_e$ as the ending node of an edge. For example, in Figure 1, $n_6$ is the starting node of edge $\overrightarrow{(n_6, n_2)}$, whereas it is the ending node of edge $\overleftarrow{(n_6, n_5)}$. The particular edge where a query object is located is called an active edge. It is important to note that the distance between two points $p_1$ and $p_2$ is not symmetrical in directed road networks (i.e., $dist(p_1, p_2) \neq dist(p_2, p_1)$ in general). For example, in Figure 1, $dist(d_3, d_4) = 3$, whereas $dist(d_4, d_3) = 11$ because the shortest path from $d_4$ to $d_3$ is $(d_4 \rightarrow n_6 \rightarrow n_2 \rightarrow n_3 \rightarrow d_3)$.

3.1.2. Segment. Segment $s = (p_1, p_2)$ is the part of an edge between two points $p_1$ and $p_2$ on the edge. An edge consists of one or more segments. An edge is also considered a segment where the nodes are the end points of the edge. The weight of a segment $(p_1, p_2)$ is denoted by $W(s)$.

3.2. Problem Formulation. Similar to previous studies [5, 14, 23], we assume each data object $d \in D$ has a point location $d.l$ in the road network and a text description $d.t$. Given a query location $q.l$, a set of keywords $q.t$, and the number k of data objects to return, the top-k spatial keyword query is defined as $Q_k = (q.l, q.t, k)$, which returns the best k data objects from D according to a score that considers spatial proximity and text relevance. The score $\psi(d)$ of a data object d is defined by the following equation:
$$\psi(d) = \frac{\mu(d.t, q.t)}{1 + \alpha \cdot \lambda(d.l, q.l)}, \tag{1}$$
where $\lambda(d.l, q.l)$ is the spatial relevance between $d.l$ and $q.l$, $\mu(d.t, q.t)$ is the textual relevance between $d.t$ and $q.t$, and $\alpha$ is a nonnegative real number that determines the importance of one measure over the other. For example, if only textual relevance is considered, then $\alpha = 0$. If more importance is given to spatial relevance, then $\alpha > 1$.
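As a minimal sketch of Equation (1), the function below combines a textual relevance value with a network distance. The function name `score` and the input values are illustrative assumptions; in the paper the two inputs come from the textual relevance model and from shortest-path computation on the directed network.

```python
def score(mu, network_dist, alpha=1.0):
    """Equation (1): textual relevance damped by the weighted network distance."""
    return mu / (1.0 + alpha * network_dist)

# Illustrative inputs: a partially matching object far away and a fully matching one nearby.
print(score(0.5, 7))               # mu = 0.5, distance 7, alpha = 1
print(score(1.0, 4))               # mu = 1.0, distance 4, alpha = 1
print(score(0.8, 100, alpha=0.0))  # alpha = 0 ignores distance entirely
```

With $\alpha = 0$ the score reduces to the textual relevance alone, and larger values of $\alpha$ penalize distant objects more strongly.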
Spatial relevance ($\lambda$) is defined as the shortest network distance between data object d and query q: $\lambda(d.l, q.l) = dist(d.l, q.l)$. Thus, $dist(d_i.l, q.l) < dist(d_j.l, q.l)$ indicates that data object $d_i$ is more spatially relevant to q than data object $d_j$. The textual relevance ($\mu$) can be computed using any popular information retrieval model, such as cosine similarity or the language model. In this study, we use the cosine similarity between $d.t$ and $q.t$. The textual relevance is defined as follows:
$$\mu(d.t, q.t) = \frac{\sum_{t \in q.t} w_t(d.t)\, w_t(q.t)}{\sqrt{\sum_{t \in d.t} [w_t(d.t)]^2 \, \sum_{t \in q.t} [w_t(q.t)]^2}}. \tag{2}$$
The weight $w_t(d.t) = 1 + \ln(f_t(d.t))$, where $f_t(d.t)$ represents the frequency of term t in $d.t$. The weight $w_t(q.t) = \ln(1 + |D|/df_t)$, where $|D|$ is the number of objects in D and $df_t$ is the document frequency of term t. A higher $\mu$ means a higher textual relevance to the query keywords. We use a variation of cosine similarity based on the significance factor $\theta_t(n)$ of term t in a document n, where n represents the description $d.t$ of a data object or the query keywords $q.t$. The significance $\theta_t(n) = w_t(n)/\sqrt{\sum_{t \in n} (w_t(n))^2}$ is the weight of the term in the document normalized by the length of the document [24, 25]. Hence, the textual relevance $\mu(d.t, q.t)$ can be rewritten as
$$\mu(d.t, q.t) = \sum_{t \in q.t} \theta_t(d.t)\, \theta_t(q.t). \tag{3}$$
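The following sketch computes the textual relevance of Equation (3) from the term weights defined above. It is a simplified illustration, not the authors' implementation: the function names are hypothetical, descriptions are assumed to be already tokenized and lowercased, query terms that occur in no description are skipped, and the small corpus reuses the object descriptions shown in Figure 1.

```python
import math
from collections import Counter

def doc_weights(terms):
    """w_t(d.t) = 1 + ln(f_t(d.t)) for every term t in a description."""
    return {t: 1.0 + math.log(f) for t, f in Counter(terms).items()}

def query_weights(q_terms, corpus):
    """w_t(q.t) = ln(1 + |D| / df_t); terms with df_t = 0 are skipped (assumption)."""
    weights = {}
    for t in set(q_terms):
        df = sum(1 for doc in corpus if t in doc)
        if df:
            weights[t] = math.log(1.0 + len(corpus) / df)
    return weights

def significance(weights):
    """theta_t(n): each weight divided by the Euclidean norm of all weights."""
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {t: w / norm for t, w in weights.items()} if norm else {}

def textual_relevance(d_terms, q_terms, corpus):
    """Equation (3): sum over query terms of theta_t(d.t) * theta_t(q.t)."""
    theta_d = significance(doc_weights(d_terms))
    theta_q = significance(query_weights(q_terms, corpus))
    return sum(theta_d.get(t, 0.0) * th for t, th in theta_q.items())

corpus = [["grand", "hotel"], ["cafe"], ["italian", "restaurant"],
          ["chinese", "restaurant"], ["pub", "and", "bar"],
          ["italian", "restaurant"], ["cafe", "and", "bakery"]]
query = ["cafe", "bakery"]
mu_d2 = textual_relevance(corpus[1], query, corpus)  # "Cafe"
mu_d7 = textual_relevance(corpus[6], query, corpus)  # "Cafe and Bakery"
print(mu_d2, mu_d7)
```

Under these assumptions the description "Cafe and Bakery" obtains a higher relevance than "Cafe" for the query "Cafe Bakery", because the rarer term "bakery" carries a larger query weight.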
4. Query Processing System
In this section, we present the proposed query processing system, which indexes the data objects and prunes the irrelevant edges for efficient query processing. In Section 4.1, we discuss the indexing framework, and in Section 4.2, we present an efficient keyword query processing algorithm for snapshot queries.
4.1. Indexing Framework. In this study, our main work focuses on moving queries in directed and dynamic road networks. We use a method similar to the enhanced technique presented in [12] as our basic framework for processing snapshot queries in directed and dynamic road networks. The indexing framework combines a road network framework [1] for storing spatial information and an inverted file for indexing data objects. For easy traversal of the network, we store the adjacent nodes of each given node by storing the node id ($nid$), the edge id ($eid$), the direction of the edge, and the weight of the edge. The indexing framework consists of two main components: a pruning component and an inverted file component. Figure 2 illustrates the main components of the indexing framework. The pruning component first prunes the edges that contain only data objects irrelevant to the query keywords. To achieve this, we introduce the highest significance $\theta_t^+$ of a given term t in the descriptions of the objects lying on an edge. The $\theta_t^+$ of an edge is retrieved by a key composed of a pair of edge id and term id, $(eid, tid)$. It represents an upper bound on the significance of term t for any object lying on the edge. The inverted list of a term t on an edge is accessed only if the upper-bound score composed of $\theta_t^+$ and the minimum network distance between the starting node of the edge and query q may return a candidate data object. Naturally, the edges with upper-bound scores smaller than the score of the k-th object found so far are pruned.
We implement an inverted file for indexing data objects. The inverted file contains a vocabulary and inverted lists. The vocabulary keeps general information about each term (such as the frequency of the term), which is helpful in computing the textual relevance of the data objects. An inverted list stores the data objects located on the edge $\overrightarrow{(n_s, n_e)}$ that have a term t in their description. An inverted list is identified by a key composed of $(eid, tid)$. Each inverted file is a set of inverted lists. A separate inverted list is used for each term in the object descriptions. An inverted list stores two attributes for each data object: first, the distance between the starting node and the data object, $dist(n_s, d_i)$; second, the significance factor $\theta(t_i, d_i)$ of the term $t_i$ in the description of the data object. Note that the network distance between two points in a directed road network is not symmetrical (i.e., $dist(n_s, d_i) \neq dist(d_i, n_s)$ in general). Recall that the starting node is chosen according to the orientation of the edge such that the direction of the edge is from the node toward the data object. In Figure 1, $n_3$ is the starting node for $d_7$. For bidirectional edges, either of the adjacent nodes can act as a starting node.
The proposed indexing scheme has three main advantages. First, the search for objects relevant to the query keywords is very efficient using the $(eid, tid)$ pair. Second, the inverted files also store the network distance between the starting node and the data object, which helps in accessing the data object in the directed road network. Finally, the pruning technique allows for faster query processing by exploring fewer edges.
Table 2 presents the notations used in this study.
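A possible in-memory layout of the two components can be sketched as follows. This is an assumption-laden illustration rather than the authors' storage format: the class name `EdgeKeywordIndex`, its methods, and the example identifiers are hypothetical, and the query is assumed to be given as a mapping from terms to their significance values.

```python
import math
from collections import defaultdict

class EdgeKeywordIndex:
    """Inverted lists and per-edge maximum significance, both keyed by (eid, tid)."""

    def __init__(self):
        self.lists = defaultdict(list)       # (eid, tid) -> [(obj_id, offset_from_start, theta)]
        self.max_theta = defaultdict(float)  # (eid, tid) -> highest significance on the edge

    def add(self, eid, tid, obj_id, offset, theta):
        key = (eid, tid)
        self.lists[key].append((obj_id, offset, theta))
        self.max_theta[key] = max(self.max_theta[key], theta)

    def upper_bound(self, eid, q_theta, dist_to_start, alpha=1.0):
        """Best score any object on the edge could reach for this query."""
        mu_plus = sum(self.max_theta.get((eid, t), 0.0) * th for t, th in q_theta.items())
        return mu_plus / (1.0 + alpha * dist_to_start)

    def candidates(self, eid, q_theta, dist_to_start, s_k, alpha=1.0):
        """Objects on the edge scoring above s_k; lists are read only if the bound allows."""
        if self.upper_bound(eid, q_theta, dist_to_start, alpha) <= s_k:
            return []  # edge pruned without touching its inverted lists
        partial = {}   # obj_id -> [accumulated textual relevance, offset]
        for t, th_q in q_theta.items():
            for obj_id, offset, th_d in self.lists.get((eid, t), []):
                entry = partial.setdefault(obj_id, [0.0, offset])
                entry[0] += th_d * th_q
        scored = [(mu / (1.0 + alpha * (dist_to_start + off)), obj_id)
                  for obj_id, (mu, off) in partial.items()]
        return sorted(((s, o) for s, o in scored if s > s_k), reverse=True)

# Hypothetical edge "e35" holding one object whose description has two equally weighted terms.
idx = EdgeKeywordIndex()
half = math.sqrt(0.5)
idx.add("e35", "italian", "d3", 1.0, half)
idx.add("e35", "restaurant", "d3", 1.0, half)
q = {"italian": half, "restaurant": half}
print(idx.candidates("e35", q, 3.0, s_k=0.0))
print(idx.candidates("e35", q, 3.0, s_k=0.3))
```

The bound is safe in this sketch because every object on the edge has significance at most the stored maximum for each term and lies at least as far from the query as the starting node of the edge.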
Figure 2: Indexing framework. [Figure omitted: the pruning component (1) computes an upper-bound score using $dist(n, q)$ and $\theta_t^+$, retrieved by the key $\langle eid, tid \rangle$, and (2) accesses the inverted list of a term only if the upper-bound score is greater than the score of the k-th object; the inverted file consists of a vocabulary (term id and document frequency) and inverted lists keyed by $\langle eid, tid \rangle$ that store $d_i$, $dist(n_s, d_i)$, and the significance of the term in $d_i$.]

Table 2: Summary of notations used in this paper.

| Notation | Definition |
|---|---|
| $G = (N, E, W)$ | Graph model of road network |
| $dist(p_s, p_e)$ | Length of shortest path from $p_s$ to $p_e$, where $p_s$ and $p_e$ represent start and end points, respectively |
| $len(p_1, p_2)$ | Length of segment connecting two points $p_1$ and $p_2$ |
| $n_i$ | Node in road network |
| $e = (n_s, n_e)$ | Edge in edge set E, where $n_s$ and $n_e$ are start and end points of the edge |
| $n_\beta$ | Boundary node corresponding to start ($n_s$) or end ($n_e$) point of an edge |
| $W(e)$ | Weight of edge $(n_s, n_e)$ |
| $q$ | Query point in road network |
| $k$ | Number of data objects to return |
| $D$ | Set of data objects, $D = \{d_1, d_2, \ldots, d_{\lvert D \rvert}\}$ |
| $D(n_s, n_e)$ | Set of data objects in an edge |
| $p_a$ | Anchor point that corresponds to start point of expansion |
| $P_{SE}$ | Safe exit point where safe and non-safe regions of q intersect |
| $\alpha$ | Query parameter |
| $\psi(d)$ | Score of data object d |
| $\mu(d.t, q.t)$ | Textual relevance of data object d with query keywords |
| $\lambda(d.l, q.l)$ | Spatial relevance of data object d with query location |
| $D^+$ | Set of answer objects |
| $D^-$ | Set of non-answer objects |
| $d^+_l$ | Lowest answer object |
| $d^-_h$ | Highest non-answer object |

4.2. Query Processing Algorithm. Our algorithm traverses the road network incrementally in a similar fashion to Dijkstra's algorithm [26]. Algorithm 1 returns the top-k data objects with the highest scores according to their joint textual and spatial relevance to the query. The algorithm begins by exploring the active edge, where query object q is located, and expands the network in increasing order of distance from q. Each entry in the min-heap has the form $(p_a, edge)$, where $p_a$ indicates the anchor point in the edge. For an active edge, q becomes the anchor point. Otherwise, for directed edges, the ending node $n_e$ becomes the anchor point. For bidirectional edges, either of the adjacent boundary nodes, i.e., $n_s$ or $n_e$, becomes the anchor point. Let $D_k$ be the current set of top-k data objects and $s_k$ be the score of the k-th data object in $D_k$. The $candsearch((eid, tid), s_k)$ function retrieves the candidate data objects $D_c$ located on an edge with a better score $\psi(d)$ than $s_k$. Next, the set $D_k$ is updated with the data objects in $D_c$, and so is $s_k$. The algorithm continues its expansion and inserts the adjacent edges of the boundary node until the heap is exhausted or the remaining data objects cannot have a better score than $s_k$. The upper-bound score $\psi(n)$ of node n is computed using $dist(n, q)$ and the maximum textual relevance ($\mu = 1$). Therefore, if $\psi(n) \le s_k$, then even if there is an unexplored data object d matching all query keywords, its score cannot be better than that of the k-th object in $D_k$ because $dist(d.l, q.l) \ge dist(n, q.l)$. This is guaranteed because the algorithm always expands the node with the minimum distance to the query location.
Algorithm 1: EvaluateSnapshotQuery(Node $n_i$, Edge $e_i$)
(1) Input: Top-k spatial keyword query $Q_N = (q.l, q.t, k)$
(2) Output: Top-k data objects with highest score
(3) $D_c \leftarrow \emptyset$ /* set of candidate data objects */
(4) max-heap $D_k \leftarrow \emptyset$ /* current top-k set */
(5) $s_k \leftarrow 0$ /* k-th score in $D_k$ */
(6) min-heap $\leftarrow \emptyset$
(7) $explored \leftarrow \emptyset$
(8) min-heap.insert($q.l$, $edge_{active}$)
(9) $D_c \leftarrow candsearch((eid, tid), s_k)$
(10) update $D_k$ and $s_k$ with $d \in D_c$
(11) while min-heap $\neq \emptyset$ and ($1/(1 + \alpha \lambda(d.l, q.l)) > s_k$) do
(12)     for each unexplored adjacent edge of $(p_a, edge)$ do
(13)         $explored \leftarrow explored \cup (p_a, edge)$
(14)         $D_c \leftarrow candsearch((eid, tid), s_k)$
(15)         update $D_k$ and $s_k$ with $d \in D_c$
(16)     end
(17)     min-heap.insert(adjacent node, edge)
(18) end
(19) return $D_k$

Algorithm 2 presents the $candsearch((eid, tid), s_k)$ procedure, which finds the candidate data objects. This procedure has two main steps. In the first step, the upper-bound score of the edge is computed using the significance factor ($\theta_t$) of a term $t \in q.t$ and the shortest distance $sdist(e_i, q.l)$ between the edge and the query location. In the next step, the inverted lists of term t are fetched if the upper-bound score is greater than $s_k$. From the inverted lists, the objects with score $\psi(d)$ greater than $s_k$ are returned.

Algorithm 2: CandidateSearch($(eid, tid)$, $s_k$)
(1) Input: Edge ID $eid$, Term ID $tid$, score of k-th object $s_k$
(2) Output: candidate list $D_c$
(3) compute $\theta_t(e_i)$
(4) if $\theta_t(e_i) > 0$ then
(5)     $maxscore(e_i) \leftarrow computemaxscore(\theta_t, dist(e_i, q.l))$
(6) end
(7) if $maxscore(e_i) > s_k$ then
(8)     for each data object d in $e_i$ do
(9)         compute $d_{score}$
(10)         if $d_{score} > s_k$ then
(11)             $D_c \leftarrow D_c \cup \{d\}$
(12)         end
(13)     end
(14) end
(15) return $D_c$
To understand the proposed algorithm, consider the road network presented in Figure 1. Assume that query q issues a top-1 keyword query with $q.t$ = "Italian Restaurant". For ease of presentation, we assume $\alpha = 1$, and the textual relevance $\mu$ is the number of occurrences of query keywords in $d.t$ divided by the number of keywords in the document (the description of the data object). For example, $\psi(d_4) = \mu(d_4.t, q.t)/(1 + \lambda(d_4.l, q.l)) = 0.5/8 = 0.06$. The algorithm starts the network expansion from the active edge $\overrightarrow{(n_2, n_3)}$, where q is the anchor point. Note that the direction of the edge $\overrightarrow{(n_2, n_3)}$ is from $n_2$ to $n_3$. Therefore, the algorithm explores only $\overrightarrow{(q, n_3)}$. No data object is found in $\overrightarrow{(q, n_3)}$. Then, $n_3$ becomes the anchor point, and edges $(n_3, n_4)$, $(n_3, n_5)$, and $(n_3, n_7)$ are inserted in the min-heap. Next, the $candsearch$ function retrieves the candidate data objects on edges $(n_3, n_4)$, $(n_3, n_5)$, and $(n_3, n_7)$ whose score is better than $s_k$. On edge $(n_3, n_5)$, data object $d_3$ is retrieved with $\psi(d_3) = 0.2$. Data object $d_3$ is inserted in the $D_k$ set, and the value of $s_k$ is set to 0.2. For edges $(n_3, n_4)$ and $(n_3, n_7)$, no candidate object is found because $d_2.t$ ("Cafe") and $d_7.t$ ("Cafe and Bakery") do not match $q.t$. The algorithm continues expanding the edges whose upper-bound score is greater than $s_k$. The edge $\overrightarrow{(n_7, n_2)}$ is explored next. The upper-bound score of $\overrightarrow{(n_7, n_2)}$ is 1/7, which is less than $s_k$. Similarly, for edge $\overleftarrow{(n_6, n_5)}$, the upper-bound score is $0.5/8 < s_k$. Therefore, the algorithm terminates and reports $d_3$ as the top-1 result.
Figure 3: Illustration of directed road network. [Figure omitted: q issues a TkSK query at $p_1$, and the server returns a set of objects for $p_1$.]

Figure 4: Illustration of directed road network. [Figure omitted: q issues a TkSK query at $p_2$, and the server returns a set of objects for $p_2$.]
5. Moving Top-k Spatial Keyword Queries
In this section, we present our method to monitor moving top-k spatial keyword queries, where query objects move in a directed road network. Figure 3 provides an example of a TkSK query in a road network, where query point q issues a TkSK query at point $p_1$. Note that the numbers on the arrows in the figure indicate the order of the steps. To obtain the top-k results at $p_1$, the server executes Algorithm 1, as described in Section 4.2. Now consider that the query object moves to $p_2$, as shown in Figure 4, and needs to retrieve the top-k results at point $p_2$. The simple method is to repeat the procedure executed at $p_1$. However, recomputing the results whenever query q changes its location significantly increases the computation cost. Furthermore, it also increases the communication overhead because the query object must report its location whenever it moves and the server must send the result set. To address these issues, we introduce the safe exit approach.
In the proposed framework, the server computes safe exit points for a query object. The server maintains a set of moving queries, and each query result remains valid as long as the query object remains inside its safe exit points. Whenever a query object leaves its safe exit points, the server recomputes the TkSK results and the safe exit points for that query object.
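The interaction described above can be sketched as a simple monitoring loop. This is a minimal illustration, not the implementation used in the paper: it assumes that the safe region is represented as one offset interval per edge, and the names `in_safe_region`, `monitor`, and `recompute` are hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Position = Tuple[str, float]                  # (edge id, offset along the edge)
SafeRegion = Dict[str, Tuple[float, float]]   # edge id -> (lo, hi) safe offsets


def in_safe_region(pos: Position, region: SafeRegion) -> bool:
    """True if the position lies inside the safe interval of its edge."""
    edge, offset = pos
    if edge not in region:
        return False
    lo, hi = region[edge]
    return lo <= offset <= hi


def monitor(trajectory: Iterable[Position],
            recompute: Callable[[Position], Tuple[List[str], SafeRegion]]) -> int:
    """Follow a query object and count how often the server must recompute.

    `recompute` stands in for the server-side work (snapshot TkSK query plus
    safe exit computation) and returns the top-k result and the safe region.
    """
    result = None
    region: SafeRegion = {}
    recomputations = 0
    for pos in trajectory:
        # Contact the server only for the first position or after leaving
        # the safe region; otherwise the cached result is still valid.
        if result is None or not in_safe_region(pos, region):
            result, region = recompute(pos)
            recomputations += 1
    return recomputations
```

Under this sketch, a query object that stays within its safe region triggers a single recomputation, whereas the naive approach would trigger one per location update.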
Next, we present our method to compute the safe exit points for a query object. A safe exit point represents a point in a segment where a safe region and a nonsafe region meet. We compute the safe exit point using the divide-and-conquer technique. Before presenting the detailed methodology, we define the terminology used in this section.
Definition 1 (safe region). A portion of a road segment that guarantees that, as long as the query point lies in it, its top-$k$ results remain valid.
Definition 2 (answer objects $D^+$). A data object $d$ is called an answer object of query $q$ if $\psi(d) > \psi(d_a)$, where $d_a$ represents any other data object in the directed road network. Similarly, we can generalize this definition for TkSK: a data object $d$ is called an answer object of query $q$ if $\psi(d) > \psi(d_{k+1})$, where $d_{k+1}$ represents the $(k+1)$th data object in the directed road network. In other words, all answer objects are top-$k$ results of query $q$.
Definition 3 (nonanswer objects $D^-$). A data object $d$ is called a nonanswer object of query $q$ if $\psi(d) < \psi(d_a)$, where $d_a$ represents any other data object in the directed road network. Similarly, we can generalize this definition for TkSK: a data object $d$ is called a nonanswer object of query $q$ if $\psi(d) < \psi(d_k)$, where $d_k$ represents the $k$th data object in the directed road network. That is, all answer objects are top-$k$ results of query $q$, and none of the nonanswer objects are in the top-$k$ results of query $q$.
Definition 4 (lowest answer object $D^+_l$). An answer object $d^+_l \in D^+$ is called the lowest answer object at a point $p \in G$ if $\psi(d^+_l)_p = \min(\psi(d^+_1)_p, \psi(d^+_2)_p, \ldots, \psi(d^+_{|D^+|})_p)$, where $\psi(d^+_l)_p$ represents the score of the lowest answer object at point $p$. In other words, $\psi(d^+_l)_p < \psi(d^+_a)_p$ at point $p$, where $d^+_a$ is any other answer object in the $D^+$ set.
Definition 5 (highest nonanswer object $D^-_h$). A nonanswer object $d^-_h \in D^-$ is called the highest nonanswer object at a point $p \in G$ if $\psi(d^-_h)_p = \max(\psi(d^-_1)_p, \psi(d^-_2)_p, \ldots, \psi(d^-_{|D^-|})_p)$, where $\psi(d^-_h)_p$ represents the score of the highest nonanswer object at point $p$. In other words, $\psi(d^-_h)_p > \psi(d^-_a)_p$ at point $p$, where $d^-_a$ is any other nonanswer object in the $D^-$ set.
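To make Definitions 2 to 5 concrete, the following Python sketch splits a set of scored objects into answer and nonanswer objects and then picks the lowest answer object and the highest nonanswer object at a fixed point $p$. The helper names and the tie-breaking by object identifier are assumptions for illustration; the definitions above do not specify how ties are handled.

```python
from typing import Dict, List, Tuple


def split_answers(scores: Dict[str, float], k: int) -> Tuple[List[str], List[str]]:
    """Return (answer objects D+, nonanswer objects D-) for a top-k query.

    `scores` maps an object id to its score psi(d) at the current point p.
    Ties are broken by object id (an assumption of this sketch).
    """
    ranked = sorted(scores, key=lambda d: (-scores[d], d))
    return ranked[:k], ranked[k:]


def lowest_answer(scores: Dict[str, float], answers: List[str]) -> str:
    """d_l^+: the answer object with the minimum score at p."""
    return min(answers, key=lambda d: scores[d])


def highest_nonanswer(scores: Dict[str, float], nonanswers: List[str]) -> str:
    """d_h^-: the nonanswer object with the maximum score at p."""
    return max(nonanswers, key=lambda d: scores[d])
```

The result set can change only when the score of the highest nonanswer object overtakes that of the lowest answer object, which is why these two objects are the only ones that need to be tracked along a segment.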
As discussed earlier, the main challenge in the continuous processing of moving TkSK queries is to maintain the validity of the result set, because the movement of query objects can invalidate it. To monitor the validity of the result set, we propose a safe-region-based approach.
5.1. Computation of Safe Exit Points. In this section, we present our technique to compute the safe exit points. The main goal is to find a point in the road network where the query result set will change. The result set changes when the score of the highest nonanswer object $D^-_h$ surpasses the score of the lowest answer object $D^+_l$. Generally, the textual relevance score does not change. Therefore, the score of a data object changes only because of the spatial relevance score, which in turn changes only through the movement of the query object. The computation of the safe exit point is based on two key observations.
Observation 1. If $D^+_{n_\beta} = D^+_{p_a}$, there is no safe exit point in the segment.
Explanation. $D^+_{p_a}$ represents the set of answer objects at anchor point $p_a$, whereas $D^+_{n_\beta}$ represents the set of answer objects at boundary node $n_\beta$. As discussed earlier, the safe exit point is the particular point where the query results change. If the query results at the starting node of a segment (edge) are the same as those at its ending node, there is no point where the query result changes. Hence, we do not search for a safe exit point in that segment.
Observation 2. If $D^+_{p_a} \neq D^+_{n_\beta}$, there is a safe exit point in the segment.
Explanation. In contrast to Observation 1, if the query results are different at the starting and ending points, then there exists a point where the query results change. Hence, there is a safe exit point in the segment.
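The two observations amount to comparing the answer sets at the two ends of a segment before doing any further work. A minimal Python sketch of that check follows; `evaluate_snapshot` and `compute_safe_exit` are hypothetical callbacks standing in for the snapshot query (Algorithm 1) and the safe exit computation, and are not defined by the paper in this form.

```python
from typing import Callable, Hashable, Iterable, Set


def safe_exits_in_segment(
    p_a: Hashable,
    n_beta: Hashable,
    evaluate_snapshot: Callable[[Hashable], Iterable[str]],
    compute_safe_exit: Callable[[Hashable, Hashable], Hashable],
) -> Set[Hashable]:
    """Return the safe exit points found in the segment (p_a, n_beta)."""
    answers_at_pa = frozenset(evaluate_snapshot(p_a))
    answers_at_nb = frozenset(evaluate_snapshot(n_beta))
    if answers_at_pa == answers_at_nb:
        # Observation 1: identical results at both ends, nothing to search.
        return set()
    # Observation 2: the results differ, so a safe exit point exists.
    return {compute_safe_exit(p_a, n_beta)}
```

Only segments whose endpoints disagree pay the cost of the safe exit search, which keeps the number of expensive computations small.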
To find the safe region, we consider the following cases.
Case 1 (when $\alpha = 1$ and the textual relevance of the highest nonanswer object and lowest answer object is the same). In this case, both the textual and spatial relevance have the same importance (i.e., $\alpha = 1$). In addition, the top-$k$ result depends only on the spatial relevance because the textual relevance of both objects is the same. The data object that is closer to query point $q$ becomes the answer object. For an undirected edge, the safe exit point $p_{se}$ is the center point between the lowest answer object and the highest nonanswer object, i.e., $\max(dist(p_{se}, d^+_1), dist(p_{se}, d^+_2), \ldots, dist(p_{se}, d^+_{|D^+|})) = \min(dist(p_{se}, d^-_1), dist(p_{se}, d^-_2), \ldots, dist(p_{se}, d^-_{|D^-|}))$. However, in the case of a directed edge, where $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$, the safe exit point is either $d^+_l$ or $p_a$. If $d^+_l \in (p_a, n_\beta)$, then the safe exit point is $d^+_l$; otherwise, the safe exit point is $p_a$.
Case 2 (when $\alpha = 1$ and the textual relevance of the highest nonanswer object and lowest answer object is different). In this case, the top-$k$ result depends on all factors, that is, $\alpha$, the spatial relevance, and the textual relevance. Clearly, for undirected edges, the midpoint between the lowest answer object and the highest nonanswer object does not provide a valid safe exit point. Therefore, we introduce the divide-and-conquer technique, which keeps dividing the search space until we reach the point where the score of the nonanswer object becomes greater than that of the answer object. Typically, the safe exit point should be closer to the data object whose score is lower. Based on this observation, we first compute the midpoint in a similar fashion to Case 1, and then we continue dividing the search space until we find the point. For directed edges, the safe exit point can be computed in a similar fashion to Case 1.
Case 2 also applies to other cases in which the safe exit point is not the midpoint between the lowest answer object and the highest nonanswer object. In these cases, the safe exit point depends on two or more functions. Therefore, it can be computed using the aforementioned divide-and-conquer technique. The following are the scenarios where the safe exit point can be computed using Case 2:
(a) When $\alpha \neq 1$ and the textual relevance of the nearest nonanswer object and farthest answer object is different.
(b) When $\alpha \neq 1$ and the textual relevance of the nearest nonanswer object and farthest answer object is the same.
Case 3 (when $\alpha = 0$). This means that the spatial relevance has no effect on the score of data objects. Hence, no monitoring is required for this scenario.
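The divide-and-conquer search used for Case 2 can be sketched as a bisection along an undirected segment. The sketch below is an illustration under stated assumptions rather than the paper's exact procedure: positions are offsets from $p_a$, the two score functions are supplied by the caller, and the highest nonanswer object overtakes the lowest answer object at most once along the segment. All names are hypothetical.

```python
from typing import Callable, Optional


def safe_exit_by_bisection(
    score_answer: Callable[[float], float],
    score_nonanswer: Callable[[float], float],
    length: float,
    tol: float = 1e-9,
) -> Optional[float]:
    """Offset from p_a of the closest point where psi(d_h^-) > psi(d_l^+).

    Returns None when the nonanswer object never overtakes the answer object
    on the segment, i.e. when the segment contains no safe exit point.
    """
    lo, hi = 0.0, length
    if score_nonanswer(lo) > score_answer(lo):
        return lo                       # already overtaken at p_a
    if score_nonanswer(hi) <= score_answer(hi):
        return None                     # never overtaken on this segment
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if score_nonanswer(mid) > score_answer(mid):
            hi = mid                    # crossing lies in the left half
        else:
            lo = mid                    # crossing lies in the right half
    return hi


# Hypothetical scores with alpha = 1: the answer object has mu = 1 and lies
# 3 units behind p_a; the nonanswer object has mu = 0.5 and lies at offset 5.
x = safe_exit_by_bisection(lambda s: 1.0 / (4.0 + s),
                           lambda s: 0.5 / (6.0 - s),
                           length=5.0)
```

In this hypothetical setting the crossing is at offset 8/3, whereas the geometric midpoint between the two objects is at offset 1, which illustrates why the midpoint rule of Case 1 does not carry over when the textual relevance differs. With equal textual relevance the same routine recovers the midpoint.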
Algorithm 3 retrieves the safe exit points using the observations discussed earlier. The core function in this algorithm is ComputeSafeExit($p_a$, $n_\beta$), which finds the safe exit point in the segment between $p_a$ and $n_\beta$ and is described in detail in Algorithm 4. First, Algorithm 4 determines $d^+_l$ and $d^-_h$ at each point $p \in [p_a, n_\beta]$. Recall that $d^+_l$ is the lowest answer object at $p$, whereas $d^-_h$ is the highest nonanswer object at $p$. Algorithm 4 then computes the safe exit point based on the cases discussed earlier. There are two further scenarios for each of Cases 1 and 2. For Case 1, if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$, then the safe exit point is the midpoint between $d^+_l$ and $d^-_h$. If $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$, then the edge is directed, and therefore the safe exit point is either $p_a$ or $d^+_l$: if $d^+_l$ lies on the edge $[p_a, n_\beta]$, then $d^+_l$ is the safe exit point; otherwise, $p_a$ is the safe exit point.
Similarly, for Case 2, if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$, then the safe exit point is computed by repeatedly halving the search space until we find the closest point such that $\psi(d^-_h) > \psi(d^+_l)$. If $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$, the safe exit point is computed in the same way as for the directed edge in Case 1.
5.2. Computation of Safe Exit Points for Example. Consider the same example in Figure 1, where the query point $q$ issues a top-1 keyword query with $q.t$ = "Italian restaurant". For this example, let us consider $\alpha = 1$. The monitoring algorithm starts exploring from the active edge containing the query object $q$. Therefore, $\overrightarrow{(q, n_3)}$ is explored first. As shown in Table 3, for $\overrightarrow{(q, n_3)}$, $D^+_q = \{d_3\}$ and $D^+_{n_3} = \{d_3\}$. According to Observation 1, no safe exit point exists in this segment. Therefore, the edges adjacent to $n_3$ are explored, and $n_3$ becomes the new $p_a$. The edge $(n_3, n_4)$ is explored next. Similarly, the answer object at $n_3$ and $n_4$ is the same: $D^+_{n_3} = D^+_{n_4} = \{d_3\}$. Therefore, a safe exit point does not exist in $(n_3, n_4)$. The edge $(n_3, n_7)$ is explored next. As shown in Table 3, $D^+_{n_3} = \{d_3\}$ and $D^+_{n_7} = \{d_6\}$. By Observation 2, there is a safe exit point in $(n_3, n_7)$. As shown in Figure 1, $d_3.t = d_6.t$ = "Italian Restaurant" and $dist(n_3, n_7) = dist(n_7, n_3)$.
(1) Input: same as Algorithm 1
(2) Output: $P_{SE}$, a set of safe exit points
(3) $P_{SE} \leftarrow \emptyset$ /* set of safe exit points */
(4) $D^+_{p_a} \leftarrow EvaluateSnapshotQuery(p_a, (p_a, n_\beta))$
(5) /* results calculated using Algorithm 1 */
(6) $D^+_{n_\beta} \leftarrow EvaluateSnapshotQuery(n_\beta, (p_a, n_\beta))$
(7) /* results calculated using Algorithm 1 */
(8) if $D^+_{p_a} = D^+_{n_\beta}$ then
(9)     no safe exit point /* refer to Observation 1 */
(10) end
(11) if $D^+_{p_a} \neq D^+_{n_\beta}$ then
(12)     $P_{SE} \leftarrow P_{SE} \cup ComputeSafeExit(p_a, n_\beta)$ /* safe exit point exists; refer to Observation 2 */
(13) end
(14) return $P_{SE}$
Algorithm 3: COSK monitoring algorithm.
(1) Input: same as Algorithm 1
(2) Output: $p_{se}$, the safe exit point in $(p_a, n_\beta)$
(3) $D^+_l \leftarrow \{\langle p, d^+_l \rangle \mid$ for each point $p \in [p_a, n_\beta]$, $d^+_l$ such that $\psi(d^+_l)_p = \min(\psi(d^+_1)_p, \psi(d^+_2)_p, \ldots, \psi(d^+_{|D^+|})_p)\}$
(4) $D^-_h \leftarrow \{\langle p, d^-_h \rangle \mid$ for each point $p \in [p_a, n_\beta]$, $d^-_h$ such that $\psi(d^-_h)_p = \max(\psi(d^-_1)_p, \psi(d^-_2)_p, \ldots, \psi(d^-_{|D^-|})_p)\}$
(5) if Case 1 then
(6)     if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$ then
(7)         $p_{se}$ = the point $se$ such that $\max(dist(se, d^+_1), dist(se, d^+_2), \ldots, dist(se, d^+_{|D^+|})) = \min(dist(se, d^-_1), dist(se, d^-_2), \ldots, dist(se, d^-_{|D^-|}))$
(8)     end
(9)     if $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$ then
(10)         $p_{se} = p_a$ or $p_{se} = d^+_l$, where $d^+_l \in (p_a, n_\beta)$
(11)     end
(12) end
(13) if Case 2 then
(14)     if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$ then
(15)         $p_{se}$ = closest point to $p_a$ such that $\psi(d^-_h) > \psi(d^+_l)$
(16)     end
(17)     if $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$ then
(18)         same as Line (10)
(19)     end
(20) end
(21) return $p_{se}$
Algorithm 4: ComputeSafeExit($p_a$, $n_\beta$).
Therefore, according to Case 1, the safe exit point $p_{se1}$ is the midpoint between $d_3$ and $d_6$. That is, $dist(p_{se1}, d_3) = dist(p_{se1}, d_6)$, where $dist(p_{se1}, d_3) = x + 3$ and $dist(p_{se1}, d_6) = -x + 5$ for $0 < x < 3$, with $x$ denoting the distance from $n_3$ to $p_{se1}$. Consequently, $x = 1$, which means that the distance from $n_3$ to $p_{se1}$ is 1.
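As a check, the same point follows from equating the two scores directly. The derivation below assumes the ranking function $\psi(d) = \mu / (1 + \alpha \cdot \lambda)$ with $\alpha = 1$ and writes the common textual relevance of $d_3$ and $d_6$ as 1 for simplicity, with $x$ the distance from $n_3$:

```latex
\psi(d_3) = \frac{1}{1 + (x + 3)} = \frac{1}{x + 4}, \qquad
\psi(d_6) = \frac{1}{1 + (5 - x)} = \frac{1}{6 - x}, \qquad
\psi(d_3) = \psi(d_6) \iff x + 4 = 6 - x \iff x = 1 .
```

For $x < 1$ the score of $d_3$ is higher, and for $x > 1$ the score of $d_6$ is higher, so the top-1 result changes exactly at this point.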
Next, we determine a safe exit point in $(n_3, n_5)$. As shown in Table 3, the answer object at $n_5$ is the same as that at $n_3$. Hence, no safe exit point exists in this edge. Next, $\overleftarrow{(n_6, n_5)}$ is explored with $p_a = n_5$. According to Table 3, $D^+_{n_6} = \{d_4\}$ and $D^+_{n_5} = \{d_3\}$. Therefore, a safe exit point exists in this edge. This edge is directed, and for each point $p \in \overleftarrow{(n_6, n_5)}$, the shortest path from $p$ to $d_3$ is $p \rightarrow n_6 \rightarrow n_2 \rightarrow n_3 \rightarrow d_3$. Therefore, $n_5$ is the safe exit point.
The bold lines in Figure 5 indicate the safe region of $q$. The top-1 result remains $d_3$ as long as the query $q$ lies in the safe region.
Next, we analyze the time complexity of determining a set of safe exit points using a set of qualifying objects $d \in D^+_{p_a} \cup D^+_{n_\beta} \cup D(p_a, n_\beta)$. Note that $D^+_{p_a}$ ($D^+_{n_\beta}$) indicates
Table 3: Computation of safe exit points for example scenario.
| Edge/Segment | $p_a$ | $D^+_{p_a}$ | $D^+_{n_\beta}$ | $p_{se}$ |
|---|---|---|---|---|
| $\overrightarrow{(q, n_3)}$ | $q$ | $D^+_q = \{d_3\}$ | $D^+_{n_3} = \{d_3\}$ | none |
| $(n_3, n_4)$ | $n_3$ | $D^+_{n_3} = \{d_3\}$ | $D^+_{n_4} = \{d_3\}$ | none |
| $(n_3, n_7)$ | $n_3$ | $D^+_{n_3} = \{d_3\}$ | $D^+_{n_7} = \{d_6\}$ | $p_{se1}$ |
| $\overrightarrow{(n_3, n_5)}$ | $n_3$ | $D^+_{n_3} = \{d_3\}$ | $D^+_{n_5} = \{d_3\}$ | none |
| $\overleftarrow{(n_6, n_5)}$ | $n_5$ | $D^+_{n_5} = \{d_3\}$ | $D^+_{n_6} = \{d_4\}$ | $p_{se2}$ |
Figure 5: Illustration of safe region of $q$. [Road network with nodes $n_1$ to $n_7$; data objects $d_1$ (Grand Hotel), $d_2$ (Cafe), $d_3$ (Italian Restaurant), $d_4$ (Chinese Restaurant), $d_5$ (Pub and Bar), $d_6$ (Italian Restaurant), and $d_7$ (Cafe and Bakery); and safe exit points $p_{se1}$ and $p_{se2}$.]
the set of $k$ data objects that satisfy the query condition at $p_a$ ($n_\beta$). According to Dijkstra's algorithm [26], the time complexity $O(D^+_q)$ of computing a set of answer objects at a query point $q$ is $O(D^+_q) = O(|E| + |N| \log |N|)$. This means that $O(D^+_{p_a}) = O(D^+_{n_\beta}) = O(|E| + |N| \log |N|)$ holds for endpoints $p_a$ and $n_\beta$. Thus, the time complexity $O(\Omega_{kth})$ of determining the skyline $\Omega_{kth}$ with the $k$-th highest score is $O(\Omega_{kth}) = C_{kth} \cdot O(|D^+_{p_a} \cup D^+_{n_\beta} \cup D(p_a, n_\beta)|)$, where $C_{kth}$ is the number of qualifying objects that participate in the constitution of the skyline with the $k$-th highest score. Therefore, the time complexity of determining a safe exit point coincides with the time complexity of determining the two skylines, i.e., the skyline $D^+_l$ with the $k$-th highest (or lowest) score for answer objects and the skyline $D^-_h$ with the highest score for nonanswer objects. This is because the safe exit point is found at the cross point between these skylines.
Figure 6 represents the skyline graph for $k = 1$ on the edge $(n_7, n_3)$. Let us draw the score functions of $d_3$ and $d_6$ for the road segment $(n_7, n_3)$, where a safe exit point exists because $D^+_{n_3} = \{d_3\}$ and $D^+_{n_7} = \{d_6\}$ for $k = 1$. For each point $p \in (n_7, n_3)$, the distance between $d_3$ and point $p$ can be represented as $dist(d_3, p) = dist(d_3, n_3) + len(n_3, p) = 6 - len(n_7, p)$. Similarly, for each point $p \in (n_7, n_3)$, the distance between $d_6$ and point $p$ can be represented as $dist(d_6, p) = dist(d_6, n_7) + len(n_7, p) = 2 + len(n_7, p)$. Let $len(n_7, p)$ be
Figure 6: Skyline graph for $k = 1$ on the road segment $(n_7, n_3)$. [Plot of score versus distance along the segment from $n_7$ to $n_3$, showing the curves $\psi(d_6) = 1/(x + 3)$ and $\psi(d_3) = 1/(-x + 7)$, with $p_{se1}$ marked on the distance axis.]
a variable $x$ ($0 \le x \le 3$). We can write $\lambda(d_3, p) = dist(d_3, p) = 6 - x$ and $\lambda(d_6, p) = dist(d_6, p) = 2 + x$. Then we can represent the score functions $\psi(d_3)$ and $\psi(d_6)$ as follows:
$\psi(d_3) = \dfrac{\mu(d_3.t, q.t)}{1 + \alpha \cdot \lambda(d_3, p)} = \dfrac{1}{7 - x}$ for $0 \le x \le 3$,
$\psi(d_6) = \dfrac{\mu(d_6.t, q.t)}{1 + \alpha \cdot \lambda(d_6, p)} = \dfrac{1}{3 + x}$ for $0 \le x \le 3$.
Finally, we present a lemma to prove that the safe exit points computed by COSK are correct.
Lemma 8. The COSK algorithm correctly computes a set of safe exit points.
Proof. We prove the correctness of the COSK algorithm by contradiction. Assume that $D^+_{p_a} \neq D^+_{n_\beta}$ but there is no safe exit point in the road segment $(p_a, n_\beta)$. This means that, for each point $p$ in the road segment $(p_a, n_\beta)$, the query result at $p$ equals $D^+_{p_a}$, i.e., $D^+_p = D^+_{p_a}$ $\forall p \in (p_a, n_\beta)$. However, this leads to the contradiction that $D^+_{n_\beta} = D^+_{p_a}$ when $p = n_\beta$. Therefore, if $D^+_{p_a} \neq D^+_{n_\beta}$, a safe exit point exists in $(p_a, n_\beta)$. In addition, a safe exit point is determined using the skyline $D^+_l$ for answer objects and the skyline $D^-_h$ with the highest score for nonanswer objects when $D^+_{p_a} \neq D^+_{n_\beta}$. The first skyline is a composite polyline drawn from the answer objects in $D^+_{p_a}$. The second skyline is a composite polyline drawn from the nonanswer objects in $D^+_{n_\beta} \cup D(p_a, n_\beta) - D^+_{p_a}$.
6. Monitoring Query Results and Safe Regions in Dynamic Directed Road Networks
In this section, we discuss the monitoring of spatial keyword queries in dynamic road networks, where the network distance changes depending on the traffic conditions. Updates to the weights of some edges may invalidate the query results or the safe region of $q$ even though the query object $q$ remains within its safe region. Figure 7 illustrates an example of changing the weights of edges $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$. For convenience, we consider $\alpha = 1$ and $q.t$ = "Italian restaurant". In Figure 7(a), the top-1 result is $d_1$, and the bold lines show the safe region of query $q$. Now suppose that, at time $t_j$, the weights of the two edges $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$ change owing to heavy traffic, as shown in Figure 7(b). The update in the weights of edges may invalidate the query result or the safe region of $q$. Therefore, it is necessary to monitor the validity of the results and the safe region when such changes occur.
Next, we introduce a monitoring region to monitor the validity of the safe region effectively when the weight of an edge changes. The monitoring region $MR$ contains all the points between the query point $q$ and the lowest answer object and between $q$ and the highest nonanswer object. Formally, it is defined as $MR = dist(q, D^+_l) \cup dist(q, D^-_h)$, where $dist(q, D^+_l)$ is the distance between $q$ and the lowest answer object and $dist(q, D^-_h)$ is the distance between $q$ and the highest nonanswer object. In the given example, $D^+_l = d_1$ and $D^-_h = \{d_2, d_3\}$. The dotted lines in Figure 8(a) show the monitoring region of query object $q$.
Now, at time $t_j$, the updates to edges $\overleftarrow{(n_1, n_6)}$ and $\overleftarrow{(n_1, d_1)}$, which are not part of the monitoring region, can safely be ignored. However, the update on segment $\overrightarrow{(n_2, d_1)}$, which is associated with the monitoring region, may invalidate the results. As shown in Figure 8(b), after the update, the top-1 result becomes $d_2$, and the bold lines represent the new safe region of $q$.
Algorithm 5 monitors the validity of the result set and the safe region of query object $q$ when the weight of any edge changes. Suppose that the weight of edge $(n_i, n_j)$ changes at time $t_j$. First, the algorithm checks whether edge $(n_i, n_j)$ is associated with the monitoring region. If it is not part of the monitoring region, then the algorithm simply ignores the update to edge $(n_i, n_j)$, and the query results and safe region remain valid. In contrast, if the edge is associated with the monitoring region (i.e., $MR \cap (n_i, n_j) \neq \emptyset$), then the algorithm re-evaluates the query results. Consequently, the top-$k$ results and the safe region of query $q$ are updated. Finally, the algorithm updates the monitoring region of $q$.
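The filtering step of this procedure can be sketched in a few lines of Python. The sketch simplifies the monitoring region to a set of edges lying on the paths from $q$ to the lowest answer object and to the highest nonanswer objects; this representation, the function names, and the `reevaluate` callback are assumptions made for illustration.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

Edge = Tuple[str, str]


def monitoring_region(path_to_lowest_answer: Iterable[Edge],
                      paths_to_highest_nonanswers: Iterable[Iterable[Edge]]) -> FrozenSet[Edge]:
    """Union of the edges on the paths from q to d_l^+ and to each d_h^-."""
    region = set(path_to_lowest_answer)
    for path in paths_to_highest_nonanswers:
        region.update(path)
    return frozenset(region)


def handle_weight_update(edge: Edge,
                         region: FrozenSet[Edge],
                         reevaluate: Callable[[], None]) -> bool:
    """Process a weight change on `edge`; return True if work was needed.

    `reevaluate` stands in for recomputing the top-k results, the safe exit
    points, and the monitoring region.
    """
    if edge not in region:
        return False        # update cannot affect the results: ignore it
    reevaluate()
    return True
```

Only updates that touch the monitoring region trigger recomputation, so the cost of handling traffic changes grows with the size of that region rather than with the size of the network.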
7. Performance Evaluation
In this section, we evaluate the performance of COSK through simulation experiments. We describe our experimental settings in Section 7.1, and we present our experimental results for static and dynamic road networks in Sections 7.2 and 7.3, respectively.
7.1. Experimental Settings. All of our experiments were performed using real road networks, namely, Oldenburg, San Francisco, and San Joaquin. All three road networks were obtained from [27]. The original road network of San Francisco had 21047 nodes and 21692 edges. We reformatted the network, pruned approximately 30% of the nodes, and adjusted the edges and their weights accordingly. This resulted in a network with 14732 nodes and 14316 edges. Both the directions of the edges and the data objects on the edges were generated randomly. The description of each data object was extracted from Twitter messages [28], and we assigned one tweet per data object. Table 4 presents the characteristics of the data sets used in the experimental evaluation. We simulated moving query objects by using a spatiotemporal data generator [29]. The input to the generator was the road network of the data set used, and the output was the set of query objects moving on the road network. Each experiment had 100 moving queries, which were continuously monitored for 100 timestamps (1 timestamp = 1 second), and the average result was reported in the experiments.
As a benchmark for COSK in static road networks, we implemented the CMTkSK+ algorithm [22], which also continuously monitors moving top-$k$ spatial keyword queries in road networks. However, this algorithm was originally designed for undirected road networks. To make a fair comparison, we modified CMTkSK+ to process top-$k$ spatial keyword queries in directed road networks and called it CMTkSK+. Specifically, we modified the distance computation method between two points such that, in directed road networks, $dist(p_1, p_2) \neq dist(p_2, p_1)$. Since CMTkSK+ does not handle top-$k$ spatial queries in dynamic road networks, we compared the performance of COSK in that setting with a basic algorithm that recomputes the results whenever the query object changes its location. All algorithms were implemented in Java and were executed on a desktop PC (2.80-GHz Intel Core i5) with
Figure 7: Updating the weight of edges in a dynamic road network, where $t_i < t_j$. (a) Safe region at time $t_i$. (b) Updating the weights of $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$ at time $t_j$. [Road network with nodes $n_1$ to $n_6$; data objects $d_1$ (Italian Restaurant), $d_2$ (Italian Restaurant), and $d_3$ (Chinese Restaurant); and, in panel (a), safe exit points $p_{se1}$, $p_{se2}$, and $p_{se3}$.]
Figure 8: Monitoring region and updated safe region at time $t_j$. (a) Monitoring region at time $t_i$. (b) New safe region at time $t_j$. [Same road network as in Figure 7; panel (b) marks the new safe exit points $p_{se1}$, $p_{se2}$, and $p_{se3}$.]
(1) Input: monitoring region $MR$, updated edge $(n_i, n_j)$
(2) Output: none
(3) if $MR \cap (n_i, n_j) = \emptyset$ then
(4)     /* edge $(n_i, n_j)$ is not part of the monitoring region */
(5)     ignore the change in the weight of edge $(n_i, n_j)$
(6) else
(7)     $P_{SE} \leftarrow \emptyset$ /* set of safe exit points */
(8)     $D_k^{upd} \leftarrow EvaluateSnapshotQuery(n_i, e_i)$ /* update set of top-$k$ results */
(9)     $P_{SE}^{upd} \leftarrow ComputeSafeExit(p_a, n_\beta)$ /* update safe exit points */
(10)     $MR^{upd} \leftarrow ComputeMonitoringRegion(D^+_l, D^-_h)$ /* update monitoring region */
(11) end
Algorithm 5: MonitoringSafeRegion($MR$, $(n_i, n_j)$).
Table 4: Summary of datasets.
| Attribute | Oldenburg | San Francisco | San Joaquin |
|---|---|---|---|
| Total no. of nodes | 6104 | 14732 | 18262 |
| Total no. of edges | 7034 | 14316 | 23876 |
| Percentage of directed edges | 30% | 30% | 30% |
| Total no. of objects | 5627 | 11453 | 19098 |
| Average no. of objects per edge | 0.8 | 0.8 | 0.8 |
| Total no. of words | 49517 | 103649 | 166153 |
Table 5: Experimental parameter settings.
| Parameter | Range |
|---|---|
| Number of results ($k$) | 5, 10, 15, 20, 25 |
| Number of keywords ($n$) | 1, 2, 3, 4, 5 |
| Query parameter ($\alpha$) | 0.01, 0.1, 1, 10, 100 |
| Dataset | Oldenburg, San Francisco, San Joaquin |
| Number of data objects ($N_D$) | 10, 20, 30, 40, 50 (x1000) |
| Speed of query objects ($V_{qry}$) | 25, 50, 75, 100, 125 (km/h) |
| Mobility ($M_{qry}$) | 20, 40, 60, 80, 100 |
| Ratio of directed edges ($E_{dir}$) | 10, 20, 30, 40, 50 (%) |
| Ratio of updated edges ($E_{upd}$) | 15, 30, 60, 80, 100 (%) |
8 GB of memory. In the experiments, we compared (1) query processing times, (2) edges processed, i.e., the number of edges processed for retrieving query results, and (3) index sizes. Table 5 summarizes the parameters used in the experiments. In each experiment, we varied a single parameter within the range shown in Table 5 while maintaining the other parameters at the bolded default values.
We evaluated the performance of the algorithms by using the following measures: (1) the total amount of server CPU time, which indicates the query processing time, and (2) the total communication cost, measured as the total number of points (i.e., the location updates sent by query objects and the query results and safe exit points returned by the server) transferred between clients and the server. The battery power and wireless bandwidth consumption typically increase with the amount of data transferred between objects (clients) and servers. Thus, we used the amount of transferred data as a metric to evaluate the communication cost.
7.2. Experimental Results of Top-$k$ Spatial Keyword Queries in Static Road Networks
7.2.1. Effect of $k$. Figure 9 shows the effect of the number of results on the query processing time and communication cost for both algorithms. Figure 9(a) indicates that the query processing time increases for both algorithms as the value of $k$ increases. This is expected because, with an increase in $k$, more data objects must be explored and verified. Nevertheless, COSK significantly outperforms CMTkSK+ for two main reasons. First, a relevant object search is very efficient when using the highest significant factor, and second, COSK does not need to verify the set of answer objects as long as the query object lies in a safe region. On the other hand, the CMTkSK+ query processing time increases significantly because it has to monitor and verify the set of candidate objects periodically. In Figure 9(b), the communication costs for both algorithms increase as the number of objects increases. However, the proposed algorithm demonstrates superior performance compared to CMTkSK+ because client-server communication is not required while the query object lies within the safe exit points, whereas in CMTkSK+, the query object is required to report its location to the server whenever it moves.
7.2.2. Effect of $N_D$. This experiment was conducted on the San Joaquin dataset. This dataset included 19098 data objects; therefore, we randomly generated approximately 30000 additional data objects on different edges. In Figure 10, we evaluate the performance of COSK and CMTkSK+ by varying the cardinality of the data objects. Note that $N_D = 10K$ corresponds to a low density of data points, while $N_D = 50K$ corresponds to a high density. In Figure 10(a), it is interesting to notice that the query processing times of both algorithms decrease as the cardinality of the data objects increases. For CMTkSK+, this is because, with high density, the monitoring range of a query decreases. For COSK, however, it is mainly because, when the data density is high, fewer edges need to be expanded, which decreases the query processing time. In Figure 10(b), we study the influence of the cardinality of the data objects on the communication costs. The experimental results indicate that CMTkSK+ incurs almost constant communication costs regardless of data object cardinality. However, the communication costs of COSK increase in proportion to the $N_D$ value. This is expected because the safe region becomes smaller as the density of the data objects increases, which increases the communication costs.
7.2.3. Effect of Query Keywords (n). Figure 11 shows the query processing time and communication cost of COSK and CMTkSK+ as a function of the number of query keywords. Figures 11(a) and 11(b) show that the performance of both algorithms degrades as the number of keywords increases. This is mainly because increasing the number of query keywords may also increase the number of relevant objects, resulting in a higher query processing time and communication cost. However, the safe-region-based algorithm COSK scales better than CMTkSK+ because of its less expensive monitoring technique.
7.2.4. Effect of $\alpha$. Figure 12 demonstrates the impact of the query parameter $\alpha$ on the query processing time and communication cost. A small value of $\alpha$ gives greater importance to textual relevance, whereas a large value of $\alpha$ gives more preference to spatial relevance. It is interesting to note that the query processing time is lower for higher
14 Wireless Communications and Mobile Computing
Figure 9: Effect of k on query processing time and communication cost. (a) Query processing time (s) versus k. (b) Communication cost (number of messages transferred) versus k. Both panels compare COSK and CMTkSK+.
Figure 10: Effect of $N_D$ on query processing time and communication cost. (a) Query processing time (s) versus $N_D$. (b) Communication cost (number of messages transferred) versus $N_D$. Both panels compare COSK and CMTkSK+.
values of $\alpha$, which give more importance to spatial relevance. This is mainly because, when spatial relevance carries more weight, fewer edges and objects need to be explored and processed to determine the top-k data objects. Observe in Figure 12(b) that the number of messages sent by COSK decreases sharply as $\alpha$ increases.

7.2.5. Effect of Speed. Figure 13(a) demonstrates the influence of the speed of the query objects on the query processing time of the COSK and CMTkSK+ algorithms. The experimental results indicate that the performance of CMTkSK+ is not significantly influenced by the speed of the query objects because the candidate objects must be monitored at regular time intervals regardless of the speed. For COSK, in contrast, the performance gradually degrades as the speed of the query objects increases because the objects leave their respective safe regions more frequently. Figure 13(b) shows the communication costs of COSK and CMTkSK+ with respect to the speed of the query objects. CMTkSK+ incurs almost constant communication costs because a server-initiated request to verify the candidate objects does not depend on the speed. For COSK, the query objects cross safe regions more frequently when the speed is high, which increases the communication costs.
Figure 11: Effect of number of keywords on query processing time and communication cost. (a) Query processing time (s) versus number of keywords. (b) Communication cost (number of messages transferred) versus number of keywords. Both panels compare COSK and CMTkSK+.
Figure 12: Effect of $\alpha$ on query processing time and communication cost. (a) Query processing time (s) versus $\alpha$. (b) Communication cost (number of messages transferred) versus $\alpha$. Both panels compare COSK and CMTkSK+.
7.2.6. Effect of Mobility. Figure 14 shows the effect of mobility $M_{qry}$ (the percentage of query objects that are moving at any timestamp) on the performance of the COSK and CMTkSK+ algorithms. As expected, the query processing time and communication costs of both algorithms increase with $M_{qry}$. Nevertheless, COSK performs better than CMTkSK+ in terms of both query processing time and communication costs.
7.2.7. Effect of Directed Edges. Figure 15 shows the impact of the percentage of directed edges $E_{dir}$ on the performance of the COSK and CMTkSK+ algorithms. The query processing time increases with $E_{dir}$ because the algorithms need to explore more edges to retrieve the top-k results. However, the communication cost is not significantly affected by the value of $E_{dir}$ for either algorithm.
7.2.8. Effect of Datasets. Figure 16 shows the index sizes of the COSK and CMTkSK+ approaches for different datasets. As shown in Figure 16, both algorithms have similar index sizes. However, COSK has a minor space overhead because it stores additional information, namely the highest significance factor $\theta_t^{+}$ of the edges. More importantly, this space overhead is minimal compared to the gain achieved by COSK in query processing time and communication costs.
Figure 13: Effect of speed on query processing time and communication cost. (a) Query processing time (s) versus $V_{qry}$. (b) Communication cost (number of messages transferred) versus $V_{qry}$. Both panels compare COSK and CMTkSK+.
Figure 14: Effect of mobility on query processing time and communication cost. (a) Query processing time (s) versus $M_{qry}$. (b) Communication cost (number of messages transferred) versus $M_{qry}$. Both panels compare COSK and CMTkSK+.
7.3. Experimental Results of Top-k Spatial Keyword Queries in Dynamic Road Networks. In this section, we evaluate the performance of COSK and the basic algorithm for dynamic road networks. $E_{upd}$ indicates the percentage of all edges that change their weight at each timestamp. The length of an updated edge is randomly selected between 0.1 and 10 times the original length. Figure 17(a) depicts the query processing time of COSK and the basic algorithm. It is evident from the figure that the query processing time of the basic algorithm is not significantly affected by $E_{upd}$. This is mainly because the query objects issue top-k spatial queries at each timestamp. However, the query processing time of COSK increases with $E_{upd}$ because the probability that an updated edge is associated with the monitoring region of query q increases with $E_{upd}$. Therefore, when $E_{upd}$ becomes large, the results need to be updated frequently, which increases the query processing time. Figure 17(b) shows the communication costs of COSK and the basic algorithm with respect to $E_{upd}$. The basic algorithm incurs almost constant communication costs regardless of the value of $E_{upd}$. In contrast, the communication cost of COSK increases with $E_{upd}$ because the query result and safe regions need to be updated frequently.
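The reasoning above, that only updates touching a query's monitoring region force a refresh, can be sketched as a reverse lookup from edges to queries. This is a minimal illustration under stated assumptions, not the authors' implementation: the class `UpdateMonitor`, its method names, and the representation of a monitoring region as a plain set of edge ids are all hypothetical.

```python
from collections import defaultdict


class UpdateMonitor:
    """Hypothetical bookkeeping: which queries watch which edges."""

    def __init__(self):
        # edge id -> ids of queries whose monitoring region covers that edge
        self.queries_by_edge = defaultdict(set)
        # query id -> edge ids currently in its monitoring region
        self.edges_by_query = {}

    def register(self, query_id, monitored_edges):
        """Record (or replace) the monitoring region of a query."""
        self.unregister(query_id)
        edges = set(monitored_edges)
        self.edges_by_query[query_id] = edges
        for edge_id in edges:
            self.queries_by_edge[edge_id].add(query_id)

    def unregister(self, query_id):
        """Forget a query and remove it from every edge it was watching."""
        for edge_id in self.edges_by_query.pop(query_id, set()):
            self.queries_by_edge[edge_id].discard(query_id)

    def affected_queries(self, updated_edges):
        """Queries whose result or safe region may need to be refreshed."""
        affected = set()
        for edge_id in updated_edges:
            affected |= self.queries_by_edge.get(edge_id, set())
        return affected
```

Under this sketch, a larger share of updated edges makes it more likely that `affected_queries` returns a non-empty set, which matches the trend reported for COSK; a baseline that re-issues every query at each timestamp does the same work regardless of the updates.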
Figure 15: Effect of $E_{dir}$ on query processing time and communication cost. (a) Query processing time (s) versus $E_{dir}$. (b) Communication cost (number of messages transferred) versus $E_{dir}$. Both panels compare COSK and CMTkSK+.
Figure 16: Effect of dataset on index size. Index size (MB) of COSK and CMTkSK+ for the Oldenburg, San Francisco, and San Joaquin datasets.
8. Conclusion
In this paper, we investigated moving top-k spatial keyword queries in directed and dynamic road networks. We presented an efficient indexing framework using inverted files that indexes the data objects on edges, allowing for the effective searching of data objects relevant to queries in terms of both textual and spatial relevance. We also presented a safe-exit-based algorithm called COSK to monitor moving top-k spatial keyword queries. We demonstrated that the query results remain valid as long as the query object resides within a safe region. Furthermore, COSK can effectively monitor the validity of query results and safe regions in dynamic road networks. Finally, an experimental evaluation conducted on real road networks demonstrated that COSK significantly reduces the query processing time and communication costs compared to the CMTkSK+ algorithm.
Data Availability
The real road network data used in this study have also been used in many previous studies. The road network data are cited in the manuscript and are available at https://www.cs.utah.edu/~lifeifei/SpatialDataset.htm. To simulate the moving queries, the authors used a spatiotemporal data generator, which has also been used in previous studies. The research article describing the generator is cited in the manuscript. The documentation and source
Figure 17: Effect of $E_{upd}$ on query processing time and communication cost. (a) Query processing time (s) versus $E_{upd}$. (b) Communication cost (number of messages transferred) versus $E_{upd}$. Both panels compare COSK and the basic algorithm.
files of the generator are available at https://iapg.jade-hs.de/personen/brinkhoff/generator. The authors used Twitter tweets to generate the descriptions of the data objects and the query keywords. The tweets used can be accessed at http://followthehashtag.com/datasets/free-twitter-dataset-usa-200000-free-usa-tweets.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
Hyung-Ju Cho was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean Government (MSIP) (NRF-2016R1A2B4009793), and this research was partially supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2016R1D1A1B03934129).
References
[1] D. Papadias, N. Mamoulis, J. Zhang, and Y. Tao, "Query processing in spatial network databases," in Proceedings of the 29th International Conference on Very Large Data Bases (VLDB '03), pp. 802–813, September 2003.
[2] H.-J. Cho, K. Ryu, and T.-S. Chung, "An efficient algorithm for computing safe exit points of moving range queries in directed road networks," Information Systems, vol. 41, pp. 1–19, 2014.
[3] G. Tsatsanifos and A. Vlachou, "On processing Top-k spatio-textual preference queries," in Proceedings of the 18th International Conference on Extending Database Technology (EDBT '15), pp. 433–444, March 2015.
[4] R. Li, A. X. Liu, A. L. Wang, and B. Bruhadeshwar, "Fast range query processing with strong privacy protection for cloud computing," Proceedings of the VLDB Endowment, vol. 7, no. 14, pp. 1953–1964, 2014.
[5] G. Cong, C. S. Jensen, and D. Wu, "Efficient retrieval of the Top-k most relevant spatial web objects," Proceedings of the VLDB Endowment, vol. 2, no. 1, pp. 337–348, 2009.
[6] Z. Li, K. C. K. Lee, B. Zheng, W.-C. Lee, D. Lee, and X. Wang, "IR-tree: An efficient index for geographic document search," IEEE Transactions on Knowledge and Data Engineering, vol. 23, no. 4, pp. 585–599, 2011.
[7] Y. Zhou, X. Xie, C. Wang, Y. Gong, and W. Ma, "Hybrid index structures for location-based web search," in Proceedings of the 14th ACM International Conference on Information and Knowledge Management, pp. 155–162, Bremen, Germany, October 2005.
[8] J. Zobel and A. Moffat, "Inverted files for text search engines," ACM Computing Surveys, vol. 38, no. 2, 2006.
[9] N. Beckmann, H. Kriegel, R. Schneider, and B. Seeger, "The R*-tree: an efficient and robust access method for points and rectangles," in Proceedings of the ACM SIGMOD International Conference on Management of Data, vol. 19, pp. 322–331, May 1990.
[10] R. Hariharan, B. Hore, C. Li, and S. Mehrotra, "Processing spatial-keyword (sk) queries in geographic information retrieval (gir) systems," in Proceedings of the 19th International Conference on Scientific and Statistical Database Management (SSDBM '07), July 2007.
[11] I. De Felipe, V. Hristidis, and N. Rishe, "Keyword search on spatial databases," in Proceedings of the 24th International Conference on Data Engineering (ICDE '08), pp. 656–665, April 2008.
[12] J. B. Rocha-Junior, O. Gkorgkas, S. Jonassen, and K. Nørvåg, "Efficient processing of top-k spatial keyword queries," in Proceedings of the International Symposium on Spatial and Temporal Databases, pp. 205–222, Springer, 2011.
[13] D. Zhang, K.-L. Tan, and A. K. Tung, "Scalable top-k spatial keyword search," in Proceedings of the 16th International Conference on Extending Database Technology, pp. 359–370, 2013.
[14] J. B. Rocha-Junior and K. Nørvåg, "Top-k spatial keyword queries on road networks," in Proceedings of the 15th International Conference on Extending Database Technology, pp. 168–179, Berlin, Germany, March 2012.
[15] H.-J. Cho, S. J. Kwon, and T.-S. Chung, "A safe exit algorithm for continuous nearest neighbor monitoring in road networks," Mobile Information Systems, vol. 9, no. 1, pp. 37–53, 2013.
[16] D. Yung, M. L. Yiu, and E. Lo, "A safe-exit approach for efficient network-based moving range queries," Data & Knowledge Engineering, vol. 72, pp. 126–147, 2012.
[17] M. Attique, H. Cho, R. Jin, and T. Chung, "Efficient Processing of Continuous Reverse k Nearest Neighbor on Moving Objects in Road Networks," ISPRS International Journal of Geo-Information, vol. 5, no. 12, p. 247, 2016.
[18] H. G. Elmongui, M. F. Mokbel, and W. G. Aref, "Continuous aggregate nearest neighbor queries," GeoInformatica, vol. 17, no. 1, pp. 63–95, 2013.
[19] D. Wu, M. L. Yiu, C. S. Jensen, and G. Cong, "Efficient continuously moving top-k spatial keyword query processing," in Proceedings of the IEEE International Conference on Data Engineering (ICDE '11), pp. 541–552, Hannover, Germany, April 2011.
[20] W. Huang, G. Li, K.-L. Tan, and J. Feng, "Efficient safe-region construction for moving top-k spatial keyword queries," in Proceedings of the 21st ACM International Conference on Information and Knowledge Management, pp. 932–941, 2012.
[21] L. Guo, J. Shao, H. H. Aung, and K.-L. Tan, "Efficient continuous top-k spatial keyword queries on road networks," GeoInformatica, vol. 19, no. 1, pp. 29–60, 2014.
[22] Y. Li, G. Li, L. Shu, Q. Huang, and H. Jiang, "Continuous monitoring of top-k spatial keyword queries in road networks," Journal of Information Science and Engineering, vol. 31, no. 6, pp. 1831–1848, 2015.
[23] M. Attique, A. Khan, and T.-S. Chung, "ESPAK: Top-k spatial keyword query processing in directed road networks," in Proceedings of the Workshops of the International Conference on Extending Database Technology and the International Conference on Database Theory (EDBT/ICDT '17), March 2017.
[24] G. Salton and C. Buckley, "Term-weighting approaches in automatic text retrieval," Information Processing & Management, vol. 24, no. 5, pp. 513–523, 1988.
[25] V. N. Anh, O. de Kretser, and A. Moffat, "Vector-space ranking with effective early termination," in Proceedings of the 24th Annual International ACM SIGIR Conference, pp. 35–42, New Orleans, LA, USA, 2001.
[26] E. W. Dijkstra, "A note on two problems in connexion with graphs," Numerische Mathematik, vol. 1, pp. 269–271, 1959.
[27] "Real datasets for spatial databases," https://www.cs.utah.edu/~lifeifei/SpatialDataset.htm.
[28] "Twitter," https://twitter.com.
[29] T. Brinkhoff, "A framework for generating network-based moving objects," GeoInformatica, vol. 6, no. 2, pp. 153–180, 2002.
Figure 1: Illustration of directed road network. The network has nodes $n_1$ to $n_7$ and query location q; the number on each edge is its weight. Data objects: $d_1$ (Grand Hotel), $d_2$ (Cafe), $d_3$ (Italian Restaurant), $d_4$ (Chinese Restaurant), $d_5$ (Pub and Bar), $d_6$ (Italian Restaurant), $d_7$ (Cafe and Bakery).
modeled as a weighted directed graph, where each edge has a direction and its weight can vary according to the traffic conditions.
Given a set of data objects $D = \{d_1, d_2, \ldots, d_{|D|}\}$, a query location, and a set of keywords, the TkSK query returns the best k data objects from D according to their combined textual and spatial relevance to the query. We use the distance function $dist(q, d)$ to represent the shortest network distance from q to data object d. Figure 1 presents an example of a directed road network, where rectangles represent the data objects with a textual description and the triangle represents the query location. The number on each edge indicates the weight of that edge, such as the amount of time required to travel along it, e.g., $dist(d_1, n_1) = 1$ and $dist(d_1, n_2) = 2$. Consider a scenario where a tourist is interested in finding an "Italian Restaurant". If the road network is treated as undirected, the top-1 "Italian Restaurant" is $d_6$. However, in a directed road network, the shortest path from q to $d_6$ is $(q \rightarrow n_3 \rightarrow n_7 \rightarrow d_6)$. Therefore, for a directed road network, the top-1 result is $d_3$ because it is closer to the query location than $d_6$. Now consider that the tourist is looking for "Cafe Bakery". The data object $d_7$ could score higher than data object $d_2$ because $d_7$ ("Cafe and Bakery") is more textually relevant to the query keywords than $d_2$ ("Cafe") and $dist(q, d_7)$ is only marginally greater than $dist(q, d_2)$.
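The effect of edge orientation on the network distance can be illustrated with a short sketch. It assumes a plain adjacency-list graph and uses Dijkstra's algorithm [26]; the function name `network_distance` and the toy network are hypothetical and do not reproduce the network of Figure 1.

```python
import heapq


def network_distance(graph, source, target):
    """Shortest directed network distance from source to target.

    graph maps a node to a list of (neighbor, weight) pairs for its
    outgoing edges only; a two-way road is stored in both directions.
    Returns float('inf') when target cannot be reached.
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > best.get(u, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < best.get(v, float("inf")):
                best[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")


# Hypothetical toy network (NOT the network of Figure 1).
toy = {
    "a": [("b", 1.0)],              # one-way road a -> b
    "b": [("c", 2.0)],              # two-way road b <-> c (reverse entry below)
    "c": [("b", 2.0), ("a", 5.0)],  # one-way road c -> a
}

print(network_distance(toy, "a", "b"))  # direct edge a -> b
print(network_distance(toy, "b", "a"))  # must detour through c
```

In this toy network the distance from a to b is 1, but the distance from b to a is 7 because the only route back goes through c. This is the same asymmetry that changes the top-1 result in the example above.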
Moving top-k spatial keyword queries in directed and dynamic road networks are useful for many location-based applications. However, query processing is costly because the movement of query object q may invalidate the query results. Therefore, the main challenge in processing moving TkSK queries is to keep the query results fresh while the query objects move freely. A straightforward approach is to increase the update frequency of the query. However, this approach still cannot guarantee up-to-date query results, and it increases the computation and communication overhead: whenever the query object changes its location, it has to report the new location to the server, which increases the communication cost, and the server has to recompute the results, which increases the computation cost.
To address these challenges, we first present an efficient technique for processing snapshot TkSK queries in directed road networks. We then present a safe-exit-based approach for processing and monitoring moving TkSK queries, where query object q moves freely in a directed spatial network. A safe exit point of query object q is a boundary point between the safe region and the non-safe region of q. The query result remains valid as long as the query object lies within its safe region. Therefore, the query results are recomputed only when q leaves its safe region, which significantly reduces the computation and communication costs. To the best of our knowledge, this is the first attempt to study moving top-k spatial keyword queries in directed and dynamic road networks.
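The saving in communication can be sketched from the client's point of view. The following is a minimal illustration under stated assumptions, not the COSK algorithm itself: a safe region is modeled as a list of edge segments, and `compute_safe_region` is a hypothetical stand-in for the server-side computation of the result and its safe exit points.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Part of an edge, given as offsets from the edge's starting node."""
    edge_id: str
    start: float
    end: float


def inside_safe_region(safe_region, edge_id, offset):
    """True if the position (edge_id, offset) lies on some safe segment."""
    return any(
        s.edge_id == edge_id and s.start <= offset <= s.end
        for s in safe_region
    )


def count_server_requests(trajectory, compute_safe_region):
    """Count how often a moving client has to contact the server.

    trajectory is a sequence of (edge_id, offset) positions.
    compute_safe_region(edge_id, offset) is a hypothetical callback that
    returns the list of Segment objects valid for that position.
    """
    requests = 0
    safe_region = []  # empty at the start, so the first position triggers a request
    for edge_id, offset in trajectory:
        if not inside_safe_region(safe_region, edge_id, offset):
            safe_region = compute_safe_region(edge_id, offset)
            requests += 1
    return requests
```

A client that reported every movement would issue one request per position, whereas this loop contacts the server only when a position falls outside the current safe region.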
Our contributions are summarized below:
(i) We study the problem of continuously monitoring moving top-k spatial keyword queries in directed and dynamic road networks.
(ii) We present an algorithm to monitor moving TkSK queries, which efficiently computes the safe exit points for query object q in a directed road network. The algorithm significantly reduces the computation and communication costs for moving queries.
(iii) We also propose a method that monitors the validity of the query results and the safe region when the weights of road segments are updated due to traffic conditions.
(iv) Finally, we conduct extensive experiments on real road network datasets and demonstrate the superiority of the proposed algorithm over the existing approach.
The remainder of this paper is structured as follows. Section 2 reviews the existing work on the processing of TkSK queries in Euclidean space and road networks. Section 3 provides terminology definitions and describes the problem. Section 4 elaborates on the proposed query processing technique for
TkSK queries in directed road networks. In Section 5, we present our safe-exit-based technique to process moving TkSK queries. Section 7 presents a performance analysis of the proposed technique. Section 8 concludes this paper.
2. Related Work
In this section, we discuss related studies of top-k spatial keyword queries. The discussion is divided into two parts: Section 2.1 reviews snapshot TkSK queries, and Section 2.2 presents the studies proposed to address moving TkSK queries.
2.1. Snapshot Top-k Spatial Keyword Queries. In recent years, spatial keyword queries have drawn the attention of many researchers, and several approaches have been proposed for ranking spatial data objects. Initially, Zhou et al. [7] worked on combining inverted indexes [8] and R-trees [9]. They proposed three different hybrid indexing structures, and their study demonstrated that building an inverted index on top of an R-tree provides superior performance. Hariharan et al. [10] proposed the KR*-tree indexing structure, which captures the joint distribution of keywords in space. De Felipe et al. [11] proposed a data structure that combines an R-tree with text signatures. Each node of the R-tree uses a signature to indicate the presence of keywords in the subtree of the node. However, both of these approaches address only Boolean keyword queries in Euclidean space.
Top-k spatial keyword queries, where data objects are ranked according to their combined textual and spatial relevance to the query keywords, were first studied by Cong et al. [5] and Li et al. [6]. Both studies [5, 6] integrate location indexing and text indexing to generate IR-trees. These studies process top-k spatial keyword queries only in Euclidean space and are not suitable for processing such queries in road networks, where the distance between objects is determined by the shortest path connecting them. Later, Rocha et al. [12] proposed the indexing technique S2I, which maps each term in the vocabulary into a separate block or aR-tree for efficient processing of top-k spatial keyword queries. Zhang et al. [13] proposed an m-closest keyword query that returns the closest objects based on distance and that match m query keywords.
Top-k spatial keyword queries in road networks were introduced by Rocha et al. [14]. In particular, they proposed three different indexing techniques (Basic Indexing, Enhanced Indexing, and Overlay Indexing) for processing spatial keyword queries in road networks.
2.2. Moving Top-k Spatial Keyword Queries. Recently, the research focus has shifted to the continuous processing of spatial queries where query or data objects move arbitrarily in road networks, which is the most realistic scenario. Considerable research effort has been devoted to processing moving range, k-nearest neighbor (kNN), and reverse k-nearest neighbor (RkNN) queries [15–18]. However, there is a lack of efficient algorithms for moving top-k spatial keyword queries. Initially, Wu et al. [19] and Huang et al. [20] proposed different methods for monitoring top-k spatial keyword queries in Euclidean space. Guo et al. [21] studied moving top-k spatial keyword queries on road networks. They presented two methods for monitoring moving queries in a continuous manner that reduce the traversal of network edges. Later, Li et al. [22] proposed TPR-tree-based indexing to monitor moving top-k spatial keyword queries. In contrast to [21, 22], in this study we consider moving top-k spatial keyword queries in directed and dynamic road networks, where each road segment has a particular orientation and its weight changes according to traffic conditions.

Table 1: Comparisons with existing solutions.

| Algorithm | Type | Space domain | Orientation |
|---|---|---|---|
| Cong et al. [5] | Snapshot | Euclidean | No orientation |
| Rocha et al. [14] | Snapshot | Static road | Undirected |
| Wu et al. [19] | Moving | Euclidean | No orientation |
| Huang et al. [20] | Moving | Euclidean | No orientation |
| Guo et al. [21] | Moving | Static road | Undirected |
| Li et al. [22] | Moving | Static road | Undirected |
| COSK | Moving | Dynamic road | Directed |
Table 1 compares our problem scenario with related work in terms of query type, space domain, and orientation of road networks.
3. Preliminaries
Section 3.1 defines the terms and notations used in this paper. Section 3.2 formulates the problem using an example that illustrates the general results of top-k spatial keyword queries.
3.1. Definition of Terms and Notations
3.1.1. Road Network. A road network is represented by a weighted directed graph $G = (N, E, W)$, where N, E, and W denote the node set, edge set, and edge distance matrix, respectively. The network distance of an edge changes depending on the traffic conditions. Each edge is also assigned an orientation that is either undirected or directed. An undirected edge is represented by $e = (n_s, n_e)$, where $n_s$ and $n_e$ are the boundary nodes $n_\beta$ of the edge, whereas a directed edge is represented by $e = \overrightarrow{(n_s, n_e)}$ or $e = \overleftarrow{(n_e, n_s)}$. Naturally, the arrow above the edge indicates the associated direction. We refer to $n_s$ as the starting node and $n_e$ as the ending node of an edge. For example, in Figure 1, $n_6$ is the starting node of edge $\overrightarrow{(n_6, n_2)}$, whereas it is the ending node of edge $\overleftarrow{(n_6, n_5)}$. The particular edge where a query object is located is called an active edge. It is important to note that the distance between two points $p_1$ and $p_2$ is not symmetrical in directed road networks (i.e., $dist(p_1, p_2) \neq dist(p_2, p_1)$ in general). For example, in Figure 1, $dist(d_3, d_4) = 3$, whereas $dist(d_4, d_3) = 11$ because the shortest path from $d_4$ to $d_3$ is $(d_4 \rightarrow n_6 \rightarrow n_2 \rightarrow n_3 \rightarrow d_3)$.

3.1.2. Segment. Segment $s = (p_1, p_2)$ is the part of an edge between two points $p_1$ and $p_2$ on the edge. An edge consists
of one or more segments. An edge is also considered a segment whose end points are the nodes of the edge. The weight of a segment $s = (p_1, p_2)$ is denoted by $W(s)$.

3.2. Problem Formulation. Similar to previous studies [5, 14, 23], we assume that each data object $d \in D$ has a point location $d.l$ in the road network and a text description $d.t$. Given a query location $q.l$, a set of keywords $q.t$, and the number k of data objects to return, the top-k spatial keyword query is defined as $Q_k = (q.l, q.t, k)$, which returns the best k data objects from D according to a score that considers spatial proximity and text relevance. The score $\psi(d)$ of a data object d is defined by the following equation:
$$\psi(d) = \frac{\mu(d.t, q.t)}{1 + \alpha \cdot \lambda(d.l, q.l)} \tag{1}$$
where $\lambda(d.l, q.l)$ is the spatial relevance between $d.l$ and $q.l$, $\mu(d.t, q.t)$ is the textual relevance between $d.t$ and $q.t$, and $\alpha$ is a non-negative real number that determines the importance of one measure over the other. For example, if only textual relevance is considered, then $\alpha = 0$. If more importance is given to spatial relevance, then $\alpha > 1$.
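As a small worked illustration of Equation (1), using hypothetical values that are not taken from the paper's datasets, consider two objects $d_a$ and $d_b$ with textual relevance $\mu_a = 0.9$ and $\mu_b = 0.6$ and spatial relevance (network distance) $\lambda_a = 4$ and $\lambda_b = 1$:

$$\alpha = 0:\quad \psi(d_a) = 0.9,\quad \psi(d_b) = 0.6; \qquad \alpha = 1:\quad \psi(d_a) = \frac{0.9}{1 + 4} = 0.18,\quad \psi(d_b) = \frac{0.6}{1 + 1} = 0.30$$

With $\alpha = 0$, the textually stronger object $d_a$ ranks first; with $\alpha = 1$, the nearer object $d_b$ overtakes it.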
Spatial relevance ($\lambda$) is defined as the shortest network distance between data object d and query q: $\lambda(d.l, q.l) = dist(d.l, q.l)$. Thus, $dist(d_i.l, q.l) < dist(d_j.l, q.l)$ indicates that data object $d_i$ is more spatially relevant to q than data object $d_j$. The textual relevance ($\mu$) can be computed using any popular information retrieval model, such as cosine similarity or the language model. In this study, we use the cosine similarity between $d.t$ and $q.t$. The textual relevance is defined as follows:
$$\mu(d.t, q.t) = \frac{\sum_{t \in q.t} w_t(d.t)\, w_t(q.t)}{\sqrt{\sum_{t \in d.t} [w_t(d.t)]^2 \sum_{t \in q.t} [w_t(q.t)]^2}} \tag{2}$$
The weight $w_t(d.t) = 1 + \ln(f_t(d.t))$, where $f_t(d.t)$ represents the frequency of term t in $d.t$. The weight $w_t(q.t) = \ln(1 + |D| / df_t)$, where $|D|$ is the number of objects in D and $df_t$ is the document frequency of term t. A higher $\mu$ means a higher textual relevance to the query keywords. We use the variation of cosine similarity based on the significance factor $\theta_t(n)$ of term t in a document n, where n represents the description of a data object $d.t$ or the query keywords $q.t$. The significance $\theta_t(n) = w_t(n) / \sqrt{\sum_{t \in n} (w_t(n))^2}$ is the weight of the term in the document normalized by the length of the document [24, 25]. Hence, the textual relevance $\mu(d.t, q.t)$ can be rewritten as
$$\mu(d.t, q.t) = \sum_{t \in q.t} \theta_t(d.t)\, \theta_t(q.t) \tag{3}$$
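Equations (1) to (3) can be traced with a short sketch. It is a minimal illustration under stated assumptions rather than the authors' code: descriptions and queries are given as lists of terms, `doc_freq` is a hypothetical mapping from a term to its document frequency, and all function names are introduced here for illustration only.

```python
import math
from collections import Counter


def term_weights_doc(terms):
    """w_t(d.t) = 1 + ln(f_t(d.t)) for each term t of a description."""
    return {t: 1.0 + math.log(f) for t, f in Counter(terms).items()}


def term_weights_query(terms, num_objects, doc_freq):
    """w_t(q.t) = ln(1 + |D| / df_t); terms with df_t = 0 are skipped."""
    return {
        t: math.log(1.0 + num_objects / doc_freq[t])
        for t in set(terms)
        if doc_freq.get(t, 0) > 0
    }


def significance(weights):
    """theta_t = w_t / sqrt(sum of squared weights of the document)."""
    norm = math.sqrt(sum(w * w for w in weights.values()))
    if norm == 0.0:
        return {}
    return {t: w / norm for t, w in weights.items()}


def textual_relevance(theta_d, theta_q):
    """Equation (3): sum over the query terms of theta_t(d.t) * theta_t(q.t)."""
    return sum(theta_d.get(t, 0.0) * th for t, th in theta_q.items())


def score(mu, dist, alpha):
    """Equation (1): textual relevance damped by the network distance."""
    return mu / (1.0 + alpha * dist)


# Hypothetical collection statistics: 7 objects, "cafe" in 2, "bakery" in 1.
theta_q = significance(
    term_weights_query(["cafe", "bakery"], 7, {"cafe": 2, "bakery": 1})
)
mu_both = textual_relevance(
    significance(term_weights_doc(["cafe", "bakery"])), theta_q
)
mu_cafe = textual_relevance(significance(term_weights_doc(["cafe"])), theta_q)
print(mu_both > mu_cafe)  # the description matching both keywords is more relevant
```

Because both significance vectors are normalized, the sum in `textual_relevance` equals the cosine similarity of Equation (2), so its value lies between 0 and 1.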
4. Query Processing System
In this section, we present the proposed query processing system, which indexes the data objects and prunes the irrelevant edges for efficient query processing. In Section 4.1, we discuss the indexing framework, and in Section 4.2, we present an efficient keyword query processing algorithm for snapshot queries.
4.1. Indexing Framework. In this study, our main work focuses on moving queries in directed and dynamic road networks. We use a method similar to the enhanced technique presented in [12] as our basic framework for processing snapshot queries in directed and dynamic road networks. The indexing framework combines a road network framework [1] for storing spatial information and an inverted file for indexing data objects. For easy traversal of the network, we store the adjacent nodes of each node by storing the node id ($n_{id}$), the edge id ($e_{id}$), the direction of the edge, and the weight of the edge. The indexing framework consists of two main components: a pruning component and an inverted file component. Figure 2 illustrates the main components of the indexing framework. The pruning component first prunes the edges that contain only data objects irrelevant to the query keywords. To achieve this, we introduce the highest significance $\theta_t^{+}$ of a given term t in the descriptions of the objects lying on the edge. The $\theta_t^{+}$ of an edge is retrieved by a key composed of a pair of edge id and term id ($e_{id}$, $t_{id}$). The $\theta_t^{+}$ represents an upper bound on the significance of any object lying on the edge with term t in its description. The inverted list of a term t on an edge is accessed only if the upper-bound score, composed from $\theta_t^{+}$ and the minimum network distance between the starting node of the edge and query q, indicates that the edge may contain a candidate data object. Naturally, the edges with upper-bound scores smaller than the score of the k-th object found so far are pruned.
We implement an inverted file for indexing data objects. The inverted file contains a vocabulary and inverted lists. The vocabulary keeps general information about each term (such as the frequency of the term), which is helpful in computing the textual relevance of the data objects. An inverted list stores the data objects located on the edge $\overrightarrow{(n_s, n_e)}$ that have a term $t$ in their description. An inverted list is identified by a key composed of $(eid, tid)$. Each inverted file is a set of inverted lists, and a separate inverted list is used for each term in the object description. An inverted list stores two attributes for each data object: first, the distance between the starting node and the data object, $dist(n_s, d_i)$; second, the significance factor $\theta(t_i, d_i)$ of the term $t_i$ in the description of the data object. Note that the network distance between two points in a directed road network is not symmetrical (i.e., $dist(n_s, d_i) \neq dist(d_i, n_s)$). Recall that the starting node is chosen according to the orientation of the edge such that the direction of the edge is from the node toward the data object. In Figure 1, $n_3$ is the starting node for $d_7$. For bidirectional edges, either of the adjacent nodes can act as a starting node.
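The asymmetry of network distance in a directed graph can be illustrated with a small sketch. The three-node network below is hypothetical (it is not the network of Figure 1), and `dijkstra` is an illustrative helper name, not part of the proposed framework.

```python
import heapq

def dijkstra(adj, source):
    """Shortest network distance from source to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Hypothetical toy network: a -> b is one-way; b -> c and c -> a close the loop.
adj = {"a": [("b", 1.0)], "b": [("c", 2.0)], "c": [("a", 4.0)]}
d_ab = dijkstra(adj, "a")["b"]  # follows the edge a -> b directly
d_ba = dijkstra(adj, "b")["a"]  # has to go around: b -> c -> a
```

Because the two directions can differ this much, storing the distance from the starting node (rather than an undirected offset) keeps the inverted-list entries meaningful under edge orientation.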
The proposed indexing scheme has three main advantages. First, the search for objects relevant to the query keywords is very efficient using the $(eid, tid)$ pair. Second, the inverted files also store the network distance between the starting node and the data object, which helps in accessing the data object in the directed road network. Finally, the pruning technique allows for faster query processing by exploring fewer edges.
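A minimal in-memory sketch of such an index is given below, assuming string identifiers for edges, terms, and objects. The class name `EdgeInvertedIndex` and its methods are hypothetical; the text does not prescribe a storage layout beyond the $(eid, tid)$ key, the stored distance and significance, and the per-key upper bound $\theta_t^+$.

```python
from collections import defaultdict

class EdgeInvertedIndex:
    """Inverted lists keyed by (eid, tid), plus a per-key upper bound theta_plus."""

    def __init__(self):
        # (eid, tid) -> list of (dist_from_start_node, obj_id, theta)
        self.lists = defaultdict(list)
        # (eid, tid) -> highest theta of any object on that edge for that term
        self.theta_plus = defaultdict(float)

    def add(self, eid, tid, obj_id, dist_from_ns, theta):
        key = (eid, tid)
        self.lists[key].append((dist_from_ns, obj_id, theta))
        self.theta_plus[key] = max(self.theta_plus[key], theta)

    def upper_bound(self, eid, tid):
        # 0.0 means no object on the edge mentions the term, so the edge can be skipped.
        return self.theta_plus.get((eid, tid), 0.0)

    def postings(self, eid, tid):
        # Sorted by distance from the starting node of the edge.
        return sorted(self.lists.get((eid, tid), []))

# Illustrative data only (identifiers and values are made up).
index = EdgeInvertedIndex()
index.add("e1", "italian", "d3", 3.0, 0.5)
index.add("e1", "italian", "d6", 1.0, 0.7)
index.add("e2", "cafe", "d2", 2.0, 1.0)
```

Looking up `upper_bound` first and touching `postings` only when the bound is promising mirrors the two-component design (pruning, then inverted file) described above.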
Table 2 presents the notations used in this study.
4.2. Query Processing Algorithm. Our algorithm traverses the road network incrementally in a similar fashion to Dijkstra's
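Figure 2: Indexing framework. The pruning component, keyed by $\langle eid, tid \rangle$ and storing $\theta_t^+$, (1) computes the upper-bound score using $dist(n, q)$ and $\theta_t^+$, and (2) accesses the inverted list of a term only if the upper-bound score is greater than the score of the $k$-th object. The inverted file consists of a vocabulary (entries $tid$, $df_{tid}$) and inverted lists keyed by $\langle eid, tid \rangle$ (entries $d_i$, $dist(n_s, d_i)$, $\theta(d, t)$).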
Table 2: Summary of notations used in this paper.

| Notation | Definition |
|---|---|
| $G = (N, E, W)$ | Graph model of road network |
| $dist(p_s, p_e)$ | Length of shortest path from $p_s$ to $p_e$, where $p_s$ and $p_e$ represent start and end points, respectively |
| $len(p_1, p_2)$ | Length of segment connecting two points $p_1$ and $p_2$ |
| $n_i$ | Node in road network |
| $e = (n_s, n_e)$ | Edge in edge set $E$, where $n_s$ and $n_e$ are start and end points of the edge |
| $n_\beta$ | Boundary node corresponding to start ($n_s$) or end ($n_e$) point of an edge |
| $W(e)$ | Weight of edge $(n_s, n_e)$ |
| $q$ | Query point in road network |
| $k$ | Number of highest-scoring data objects requested by query $q$ |
| $D$ | Set of data objects, $D = \{d_1, d_2, \ldots, d_{|D|}\}$ |
| $D(n_s, n_e)$ | Set of data objects in an edge |
| $p_a$ | Anchor point that corresponds to start point of expansion |
| $P_{SE}$ | Safe exit point where safe and nonsafe regions of $q$ intersect |
| $\alpha$ | Query parameter |
| $\psi(d)$ | Score of data object $d$ |
| $\mu(d.t, q.t)$ | Textual relevance of data object $d$ with query keywords |
| $\lambda(d.l, q.l)$ | Spatial relevance of data object $d$ with query location |
| $D^+$ | Set of answer objects |
| $D^-$ | Set of nonanswer objects |
| $d_l^+$ | Lowest answer object |
| $d_h^-$ | Highest nonanswer object |
algorithm [26]. Algorithm 1 returns the top-$k$ data objects with the highest scores according to their joint textual and spatial relevance to the query. The algorithm begins by exploring the active edge where query object $q$ is located and expands the network in increasing order of distance from $q$. Each entry in the min-heap has the form $(p_a, edge)$, where $p_a$ indicates the anchor point in the edge. For an active edge, $q$ becomes the anchor point. Otherwise, for directed edges, the ending node $n_e$ becomes the anchor point. For bidirectional edges, either of the adjacent boundary nodes, i.e., $n_s$ or $n_e$, becomes the anchor point. Let $D_k$ be the current set of top-$k$ data objects and $s_k$ be the score of the $k$-th data object in $D_k$. The $candsearch((eid, tid), s_k)$ function retrieves the candidate data objects $D_c$ located in an edge whose score $\psi(d)$ is better than $s_k$. Next, the set $D_k$ is updated with the data objects in $D_c$, and $s_k$ is updated accordingly. The algorithm continues its expansion and inserts the adjacent edges of the boundary node until the heap is exhausted or the remaining data objects cannot have a better score than $s_k$. The upper-bound score $\psi(n)$ of node $n$ is computed using $dist(n, q)$ and the maximum textual relevance ($\mu = 1$). Therefore, if $\psi(n) \le s_k$, even an unexplored data object $d$ matching all query keywords cannot score better than the $k$-th object in $D_k$, because $dist(d, q.l) \ge dist(n, q.l)$. This is guaranteed because the algorithm always expands the node with the minimum distance to the query location.
Algorithm 2 presents the $candsearch((eid, tid), s_k)$ procedure, which finds the candidate data objects. This procedure has two main steps. In the first step, the upper-bound score of the edges is computed using a significance factor ($\theta_t$) of a term
Input: Top-$k$ spatial keyword query $Q_N = (q.l, q.t, k)$
Output: Top-$k$ data objects with highest score
  $D_c \leftarrow \emptyset$  /* set of candidate data objects */
  max-heap $D_k \leftarrow \emptyset$  /* current top-$k$ set */
  $s_k \leftarrow 0$  /* $k$-th score in $D_k$ */
  min-heap $\leftarrow \emptyset$
  $explored \leftarrow \emptyset$
  min-heap.insert($q.l$, $edge_{active}$)
  $D_c \leftarrow candsearch((eid, tid), s_k)$
  update $D_k$ and $s_k$ with $d \in D_c$
  while min-heap $\neq \emptyset$ and $1/(1 + \alpha\,\lambda(d.l, q.l)) > s_k$ do
    for each unexplored adjacent edge of $(p_a, edge)$ do
      $explored \leftarrow explored \cup (p_a, edge)$
      $D_c \leftarrow candsearch((eid, tid), s_k)$
      update $D_k$ and $s_k$ with $d \in D_c$
    end
    min-heap.insert(adjacent node, edge)
  end
  return $D_k$

Algorithm 1: EvaluateSnapshotQuery(Node $n_i$, Edge $e_i$).
Input: Edge ID $eid$, term ID $tid$, score of $k$-th object $s_k$
Output: Candidate list $D_c$
  compute $\theta_t(e_i)$
  if $\theta_t(e_i) > 0$ then
    $maxscore(e_i) \leftarrow computemaxscore(\theta_t, dist(e_i, q.l))$
  end
  if $maxscore(e_i) > s_k$ then
    for each data object $d$ in $e_i$ do
      compute $d_{score}$
      if $d_{score} > s_k$ then
        $D_c \leftarrow D_c \cup \{d\}$
      end
    end
  end
  return $D_c$

Algorithm 2: CandidateSearch($(eid, tid)$, $s_k$).
$t \in q.t$ and the shortest distance $sdist(e_i, q.l)$ between the edge and the query location. In the next step, the inverted lists of term $t$ are fetched if their upper-bound score is greater than $s_k$. From the inverted lists, the objects with score $\psi(d)$ greater than $s_k$ are returned.
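The two steps of the candidate search can be sketched as follows for a single query term. This is a simplified illustration, assuming the score form $\psi(d) = \mu / (1 + \alpha \cdot dist)$ used in the worked example of this section; the function names `score` and `cand_search` and the tuple layout of `postings` are assumptions made for the sketch.

```python
def score(mu, dist, alpha=1.0):
    """psi(d) = mu / (1 + alpha * dist), the score form used in the worked example."""
    return mu / (1.0 + alpha * dist)

def cand_search(postings, theta_plus, dist_q_to_ns, s_k, alpha=1.0):
    """Return (obj_id, score) pairs on one edge whose score beats s_k.

    postings: list of (dist_from_ns, obj_id, theta) for one (eid, tid) key.
    theta_plus: largest theta among those postings.
    dist_q_to_ns: network distance from the query to the edge's starting node.
    """
    # Step 1: edge-level upper bound. No object on the edge is closer to q than
    # the starting node, and none has a significance above theta_plus.
    if theta_plus <= 0 or score(theta_plus, dist_q_to_ns, alpha) <= s_k:
        return []  # prune the edge without reading its inverted list
    # Step 2: score each object and keep those that beat the current k-th score.
    out = []
    for dist_from_ns, obj_id, theta in postings:
        psi = score(theta, dist_q_to_ns + dist_from_ns, alpha)
        if psi > s_k:
            out.append((obj_id, psi))
    return out

# Illustrative edge with two objects; the starting node is 3 units from q.
postings = [(1.0, "d3", 1.0), (2.0, "d4", 0.5)]
kept = cand_search(postings, theta_plus=1.0, dist_q_to_ns=3.0, s_k=0.1)
pruned = cand_search(postings, theta_plus=1.0, dist_q_to_ns=3.0, s_k=0.3)
```

With `s_k = 0.3` the edge-level bound is 1/(1 + 3) = 0.25, so the whole edge is skipped in step 1.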
To understand the proposed algorithm, consider the road network presented in Figure 1. Assume that a query $q$ issues a top-1 keyword query with $q.t$ = "Italian Restaurant". For ease of presentation, we assume $\alpha = 1$, and the textual relevance $\mu$ is the number of occurrences of query keywords in $d.t$ divided by the number of keywords in the document (the description of the data object). For example, $\psi(d_4) = \mu(d_4.t, q.t)/(1 + \lambda(d_4.l, q.l)) = 0.5/8 \approx 0.06$. The algorithm starts the network expansion from the active edge $\overrightarrow{(n_2, n_3)}$, where $q$ is the anchor point. Note that the direction of the edge $\overrightarrow{(n_2, n_3)}$ is from $n_2$ to $n_3$. Therefore, the algorithm explores only $\overrightarrow{(q, n_3)}$. No data object is found in $\overrightarrow{(q, n_3)}$. Then, $n_3$ becomes the anchor point, and edges $(n_3, n_4)$, $(n_3, n_5)$, and $(n_3, n_7)$ are inserted in the min-heap. Next, the $candsearch$ function retrieves the candidate data objects on edges $(n_3, n_4)$, $(n_3, n_5)$, and $(n_3, n_7)$ whose score is better than $s_k$. On edge $(n_3, n_5)$, data object $d_3$ is retrieved with $\psi(d_3) = 0.2$. Data object $d_3$ is inserted in the $D_k$ set, and the value of $s_k$ is set to 0.2. For edges $(n_3, n_4)$ and $(n_3, n_7)$, no candidate object is found because $d_2.t$ ("Cafe") and $d_7.t$ ("Cafe and Bakery") do not match $q.t$. The algorithm continues expanding the edges whose upper-bound score is greater than $s_k$. The edge $\overrightarrow{(n_7, n_2)}$ is explored next. The upper-bound score of $\overrightarrow{(n_7, n_2)}$ is $1/7$, which is less than $s_k$. Similarly, for edge $\overleftarrow{(n_6, n_5)}$, the upper-bound score is $0.5/8 < s_k$. Therefore, the algorithm terminates and reports $d_3$ as the top-1 result.
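The incremental expansion with early termination can be sketched as below. This is a simplified illustration rather than the authors' implementation: it assumes the query sits at a node, that each object carries a precomputed textual relevance $\mu \le 1$, and that the score is $\psi = \mu/(1 + \alpha \cdot dist)$. The network and object placements are hypothetical and only loosely modeled on the example; they are not the data of Figure 1.

```python
import heapq

def snapshot_topk(adj, objects, q, k, alpha=1.0):
    """adj: node -> [(neighbor, weight, eid)]; objects: eid -> [(offset, obj_id, mu)]."""
    topk = []                      # min-heap of (score, obj_id); topk[0] is the k-th best
    dist = {q: 0.0}
    heap = [(0.0, q)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue               # stale heap entry
        s_k = topk[0][0] if len(topk) == k else 0.0
        if 1.0 / (1.0 + alpha * d) <= s_k:
            break                  # even mu = 1 cannot beat the k-th score from here on
        for v, w, eid in adj.get(u, []):
            # Objects are scored by their distance from q through the edge's start node.
            for offset, obj_id, mu in objects.get(eid, []):
                psi = mu / (1.0 + alpha * (d + offset))
                if mu > 0 and (len(topk) < k or psi > topk[0][0]):
                    heapq.heappush(topk, (psi, obj_id))
                    if len(topk) > k:
                        heapq.heappop(topk)
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return sorted(topk, reverse=True)

# Hypothetical directed network and objects (offset from the edge's start node, id, mu).
adj = {
    "q": [("n3", 2.0, "e0")],
    "n3": [("n4", 1.0, "e1"), ("n5", 3.0, "e2"), ("n7", 3.0, "e3")],
    "n5": [("n6", 2.0, "e4")],
}
objects = {
    "e1": [(0.5, "d2", 0.0)],
    "e2": [(2.0, "d3", 1.0)],
    "e3": [(2.5, "d6", 1.0)],
    "e4": [(1.0, "d4", 0.5)],
}
top1 = snapshot_topk(adj, objects, "q", 1)
top2 = snapshot_topk(adj, objects, "q", 2)
```

In the top-1 run the expansion stops when node `n5` (distance 5) is popped, because its bound 1/6 is already below the current best score 0.2; edge `e4` is never read.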
Figure 3: Illustration of directed road network. Query point $q$ issues a TkSK query at $p_1$, and the server returns a set of objects for $p_1$.

Figure 4: Illustration of directed road network. Query point $q$ issues a TkSK query at $p_2$, and the server returns a set of objects for $p_2$.
5 Moving Top-k Spatial Keyword Queries
In this section, we present our method to monitor moving top-$k$ spatial keyword queries, where query objects move in a directed road network. Figure 3 provides an example of TkSK in road networks, where query point $q$ issues a TkSK query at point $p_1$. Note that the numbers on the arrows in the figure indicate the order of the steps. To obtain the top-$k$ results at $p_1$, the server executes Algorithm 1, as described in Section 4.2. Now consider that the query object has moved to $p_2$, as shown in Figure 4. To retrieve the top-$k$ results at point $p_2$, the simple method is to repeat the procedure executed at $p_1$. However, recomputing the result whenever query $q$ changes its location significantly increases the computation cost. Furthermore, it also increases the communication overhead because the query object must report its location whenever it moves and the server must send back the result set. To address these issues, we introduce the safe exit approach.
In the proposed framework, the server computes safe exit points for a query object. The server maintains a set of moving queries, and a query result remains valid as long as the query object stays within the region bounded by its safe exit points. Whenever a query object passes one of its safe exit points, the server recomputes the TkSK result and the safe exit points for that query object.
Next, we present our method to compute the safe exit points for a query object. A safe exit point is a point in the segment where a safe region and a nonsafe region meet. We compute the safe exit point using a divide-and-conquer technique. Before presenting the detailed methodology, we define the terminology used in this section.
Definition 1 (safe region). A safe region is a portion of a road segment such that, as long as the query point lies in it, the top-$k$ results of the query remain valid.
Definition 2 (answer objects $D^+$). A data object $d$ is called an answer object of query $q$ if $\psi(d) > \psi(d_a)$, where $d_a$ represents any other data object in the directed road network. We can generalize this definition for TkSK: a data object $d$ is called an answer object of query $q$ if $\psi(d) > \psi(d_{k+1})$, where $d_{k+1}$ represents the $(k+1)$-th data object in the directed road network. In other words, the answer objects are exactly the top-$k$ results of query $q$.
Definition 3 (nonanswer objects $D^-$). A data object $d$ is called a nonanswer object of query $q$ if $\psi(d) < \psi(d_a)$, where $d_a$ represents an answer object in the directed road network. We can generalize this definition for TkSK: a data object $d$ is called a nonanswer object of query $q$ if $\psi(d) < \psi(d_k)$, where $d_k$ represents the $k$-th data object in the directed road network. That is, none of the nonanswer objects are in the top-$k$ results of query $q$.
Definition 4 (lowest answer object $D_l^+$). An answer object $d^+ \in D^+$ is called the lowest answer object at a point $p \in G$ if $\psi(d_l^+)_p = \min(\psi(d_1^+)_p, \psi(d_2^+)_p, \ldots, \psi(d_{|D^+|}^+)_p)$, where $\psi(d_l^+)_p$ represents the score of the lowest answer object at point $p$. In other words, $\psi(d_l^+)_p \le \psi(d_a^+)_p$ at point $p$, where $d_a^+$ is any other answer object in the $D^+$ set.

Definition 5 (highest nonanswer object $D_h^-$). A nonanswer object $d^- \in D^-$ is called the highest nonanswer object at a point $p \in G$ if $\psi(d_h^-)_p = \max(\psi(d_1^-)_p, \psi(d_2^-)_p, \ldots, \psi(d_{|D^-|}^-)_p)$, where $\psi(d_h^-)_p$ represents the score of the highest nonanswer object at point $p$. In other words, $\psi(d_h^-)_p \ge \psi(d_a^-)_p$ at point $p$, where $d_a^-$ is any other nonanswer object in the $D^-$ set.
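Definitions 2 to 5 can be made concrete with a short sketch that, for the scores observed at one fixed point $p$, splits the objects into answer and nonanswer sets and picks the two boundary objects. The function name `split_answers` and the score values are illustrative assumptions.

```python
def split_answers(scores, k):
    """scores: obj_id -> psi evaluated at one fixed point p.

    Returns (answers, nonanswers, lowest_answer, highest_nonanswer).
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    answers = ranked[:k]        # D+ : the top-k objects at p
    nonanswers = ranked[k:]     # D- : everything else
    lowest_answer = min(answers, key=lambda kv: kv[1])[0] if answers else None
    highest_nonanswer = max(nonanswers, key=lambda kv: kv[1])[0] if nonanswers else None
    return ([o for o, _ in answers], [o for o, _ in nonanswers],
            lowest_answer, highest_nonanswer)

# Made-up scores at some point p.
scores_at_p = {"d3": 0.20, "d6": 0.18, "d4": 0.07}
ans1, non1, low1, high1 = split_answers(scores_at_p, k=1)
ans2, non2, low2, high2 = split_answers(scores_at_p, k=2)
```

Only the pair (lowest answer, highest nonanswer) matters for result validity: the top-$k$ set can change only when these two objects swap order.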
As discussed earlier, the main challenge in the continuous processing of moving TkSK queries is to maintain the validity of the result set, because the movement of query objects can invalidate it. To monitor the validity of the result set, we propose a safe-region-based approach.
5.1. Computation of Safe Exit Points. In this section, we present our technique to compute the safe exit points. The main goal is to find a point in the road network where the
query result set will change. The result set changes when the score of the highest nonanswer object $D_h^-$ surpasses the score of the lowest answer object $D_l^+$. Generally, the textual relevance score does not change. Therefore, the score of a data object changes only through its spatial relevance, which in turn changes only with the movement of the query object. The computation of the safe exit point is based on two key observations.
Observation 1. If $D^+_{n_\beta} = D^+_{p_a}$, there is no safe exit point in the segment.
Explanation. $D^+_{p_a}$ represents the set of answer objects at anchor point $p_a$, whereas $D^+_{n_\beta}$ represents the set of answer objects at boundary node $n_\beta$. As discussed earlier, a safe exit point is a point at which the query result changes. If the query results at the starting node and the ending node of a segment (edge) are the same, there is no point in it at which the query result changes. Hence, we do not search for a safe exit point in that segment.
Observation 2. If $D^+_{p_a} \neq D^+_{n_\beta}$, there is a safe exit point in the segment.
Explanation. In contrast to Observation 1, if the query results are different at the starting and ending points, then there exists a point at which the query result changes. Hence, there is a safe exit point in the segment.
To find the safe region, we consider the following cases.
Case 1 (when $\alpha = 1$ and the textual relevance of the highest nonanswer object and the lowest answer object is the same). In this case, the textual and spatial relevance have the same importance (i.e., $\alpha = 1$). In addition, the top-$k$ result depends only on the spatial relevance because the textual relevance of both objects is the same. The data object that is closer to query point $q$ becomes the answer object. For an undirected edge, the safe exit point $p_{se}$ is the midpoint between the lowest answer object and the highest nonanswer object, i.e., the point at which $\max(dist(p_{se}, d_1^+), dist(p_{se}, d_2^+), \ldots, dist(p_{se}, d_{|D^+|}^+)) = \min(dist(p_{se}, d_1^-), dist(p_{se}, d_2^-), \ldots, dist(p_{se}, d_{|D^-|}^-))$. However, in the case of a directed edge, where $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$, the safe exit point is either $d_l^+$ or $p_a$: if $d_l^+ \in (p_a, n_\beta)$, then the safe exit point is $d_l^+$; otherwise, the safe exit point is $p_a$.

Case 2 (when $\alpha = 1$ and the textual relevance of the highest nonanswer object and the lowest answer object is different). In this case, the top-$k$ result depends on all three factors: $\alpha$, the spatial relevance, and the textual relevance. Clearly, for undirected edges, the midpoint between the lowest answer object and the highest nonanswer object does not provide a valid safe exit point. Therefore, we introduce a divide-and-conquer technique, which keeps dividing the search space until it reaches the point where the score of the nonanswer object becomes greater than that of the answer object. Typically, the safe exit point is closer to the data object whose score is lower. Based on this observation, we first compute the midpoint in a similar fashion to Case 1 and then continue dividing the search space until we find the point. For directed edges, the safe exit point can be computed in a similar fashion to Case 1.
The technique of Case 2 also applies to the other cases in which the safe exit point is not the midpoint between the lowest answer object and the highest nonanswer object. In these cases, the safe exit point depends on two or more factors and can therefore be computed using the aforementioned divide-and-conquer technique. The following are the scenarios in which the safe exit point can be computed as in Case 2:

(a) When $\alpha \neq 1$ and the textual relevance of the nearest nonanswer object and the farthest answer object is different

(b) When $\alpha \neq 1$ and the textual relevance of the nearest nonanswer object and the farthest answer object is the same
Case 3 (when $\alpha = 0$). This means that the spatial relevance has no effect on the score of data objects. Hence, no monitoring is required for this scenario.
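The divide-and-conquer search of Case 2 can be read as a bisection on the score gap along the segment. The sketch below assumes an undirected segment parameterized by the distance $x$ from $p_a$, two score functions supplied by the caller, and a single crossing of the two score curves inside the segment; the function name and the numeric distances are illustrative assumptions.

```python
def safe_exit_by_bisection(score_answer, score_nonanswer, length, tol=1e-6):
    """Find x in [0, length] where the nonanswer's score starts to exceed the answer's.

    score_answer(x), score_nonanswer(x): scores of the lowest answer object and the
    highest nonanswer object when the query is x units away from the anchor point.
    Returns None when the scores do not swap order inside the segment.
    """
    def gap(x):
        return score_nonanswer(x) - score_answer(x)

    if gap(0.0) > 0 or gap(length) <= 0:
        return None
    lo, hi = 0.0, length             # invariant: gap(lo) <= 0 < gap(hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if gap(mid) > 0:
            hi = mid                 # crossing lies in the left half
        else:
            lo = mid                 # crossing lies in the right half
    return lo

# alpha = 1; the answer object is 3 + x away, the nonanswer object 5 - x away.
x_same = safe_exit_by_bisection(lambda x: 1.0 / (1 + 3 + x),
                                lambda x: 1.0 / (1 + 5 - x), 3.0)
# Same geometry, but the nonanswer object has lower textual relevance (0.8).
x_diff = safe_exit_by_bisection(lambda x: 1.0 / (1 + 3 + x),
                                lambda x: 0.8 / (1 + 5 - x), 3.0)
# A nonanswer object with very low relevance never overtakes within the segment.
x_none = safe_exit_by_bisection(lambda x: 1.0 / (1 + 3 + x),
                                lambda x: 0.1 / (1 + 5 - x), 3.0)
```

With equal textual relevance the search returns the midpoint ($x \approx 1$), as in Case 1; with unequal relevance the crossing moves to $x = 14/9 \approx 1.56$, which is why the midpoint alone is not a valid safe exit point in Case 2.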
Algorithm 3 retrieves the safe exit points using the observations discussed earlier. The core function in this algorithm is ComputeSafeExit($p_a$, $n_\beta$), which finds the safe exit point in a segment between $p_a$ and $n_\beta$. ComputeSafeExit($p_a$, $n_\beta$) is described in detail in Algorithm 4. First, Algorithm 4 determines $d_l^+$ and $d_h^-$ at each point $p \in [p_a, n_\beta]$. Recall that $d_l^+$ is the lowest answer object at $p$, whereas $d_h^-$ is the highest nonanswer object at $p$. Algorithm 4 then computes the safe exit point based on the cases discussed earlier. There are two further scenarios for each of Cases 1 and 2. For Case 1, if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$, then the safe exit point is the midpoint between $d_l^+$ and $d_h^-$. If $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$, then the edge is directed, and therefore the safe exit point is either $p_a$ or $d_l^+$: if $d_l^+$ lies on the edge $[p_a, n_\beta]$, then $d_l^+$ is the safe exit point; otherwise, $p_a$ is the safe exit point.
Similarly, for Case 2, if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$, then the safe exit point is computed by repeatedly halving the search space until we find the closest point such that $\psi(d_h^-) > \psi(d_l^+)$. If $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$, the safe exit point is computed in the same way as in the directed scenario of Case 1.

5.2. Computation of Safe Exit Points for Example. Consider the same example in Figure 1, where the query point $q$ issues a top-1 keyword query with $q.t$ = "Italian restaurant". For this example, let us consider $\alpha = 1$. The monitoring algorithm starts exploring from the active edge containing the query object $q$. Therefore, $\overrightarrow{(q, n_3)}$ is explored first. As shown in Table 3, for $\overrightarrow{(q, n_3)}$, $D^+_q = \{d_3\}$ and $D^+_{n_3} = \{d_3\}$. According to Observation 1, no safe exit point exists in this segment. Therefore, the edges adjacent to $n_3$ are explored, and $n_3$ becomes the new $p_a$. The edge $(n_3, n_4)$ is explored next. Similarly, the answer object at $n_3$ and $n_4$ is the same: $D^+_{n_3} = D^+_{n_4} = \{d_3\}$. Therefore, a safe exit point does not exist in $(n_3, n_4)$. The edge $(n_3, n_7)$ is explored next. As shown in Table 3, $D^+_{n_3} = \{d_3\}$ and $D^+_{n_7} = \{d_6\}$. By Observation 2, there is a safe exit point in $(n_3, n_7)$. As shown in Figure 1, $d_3.t = d_6.t$ = "Italian Restaurant" and $dist(n_3, n_7) = dist(n_7, n_3)$.
Input: Same as Algorithm 1
Output: $P_{SE}$, a set of safe exit points
  $P_{SE} \leftarrow \emptyset$  /* set of safe exit points */
  $D^+_{p_a} \leftarrow EvaluateSnapshotQuery(p_a, (p_a, n_\beta))$  /* results calculated using Algorithm 1 */
  $D^+_{n_\beta} \leftarrow EvaluateSnapshotQuery(n_\beta, (p_a, n_\beta))$  /* results calculated using Algorithm 1 */
  if $D^+_{p_a} = D^+_{n_\beta}$ then
    no safe exit point  /* refer to Observation 1 */
  end
  if $D^+_{p_a} \neq D^+_{n_\beta}$ then
    $P_{SE} \leftarrow P_{SE} \cup ComputeSafeExit(p_a, n_\beta)$  /* safe exit point exists; refer to Observation 2 */
  end
  return $P_{SE}$

Algorithm 3: COSK monitoring algorithm.
Input: Same as Algorithm 1
Output: $p_{se}$, the safe exit point in $(p_a, n_\beta)$
  $D_l^+ \leftarrow \{\langle p, d_l^+ \rangle \mid p \in [p_a, n_\beta],\ \psi(d_l^+)_p = \min(\psi(d_1^+)_p, \psi(d_2^+)_p, \ldots, \psi(d_{|D^+|}^+)_p)\}$
  $D_h^- \leftarrow \{\langle p, d_h^- \rangle \mid p \in [p_a, n_\beta],\ \psi(d_h^-)_p = \max(\psi(d_1^-)_p, \psi(d_2^-)_p, \ldots, \psi(d_{|D^-|}^-)_p)\}$
  if Case 1 then
    if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$ then
      $p_{se} \leftarrow$ the point at which $\max(dist(p_{se}, d_1^+), \ldots, dist(p_{se}, d_{|D^+|}^+)) = \min(dist(p_{se}, d_1^-), \ldots, dist(p_{se}, d_{|D^-|}^-))$
    end
    if $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$ then
      $p_{se} \leftarrow d_l^+$ if $d_l^+ \in (p_a, n_\beta)$; otherwise $p_{se} \leftarrow p_a$
    end
  end
  if Case 2 then
    if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$ then
      $p_{se} \leftarrow$ the closest point to $p_a$ such that $\psi(d_h^-) > \psi(d_l^+)$
    end
    if $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$ then
      $p_{se} \leftarrow d_l^+$ if $d_l^+ \in (p_a, n_\beta)$; otherwise $p_{se} \leftarrow p_a$  /* same as the directed scenario of Case 1 */
    end
  end
  return $p_{se}$

Algorithm 4: ComputeSafeExit($p_a$, $n_\beta$).
Therefore, according to Case 1, the safe exit point $p_{se1}$ is the midpoint between $d_3$ and $d_6$. That is, $dist(p_{se1}, d_3) = dist(p_{se1}, d_6)$, where $dist(p_{se1}, d_3) = x + 3$ and $dist(p_{se1}, d_6) = -x + 5$ for $0 < x < 3$. Consequently, $x = 1$, which means that the distance from $n_3$ to $p_{se1}$ is 1.
Next, we determine a safe exit point in $(n_3, n_5)$. As shown in Table 3, the answer object at $n_5$ is the same as that at $n_3$. Hence, no safe exit point exists in this edge. Next, $\overleftarrow{(n_6, n_5)}$ is explored with $p_a = n_5$. According to Table 3, $D^+_{n_6} = \{d_4\}$ and $D^+_{n_5} = \{d_3\}$. Therefore, a safe exit point exists in this edge. This edge is directed, and for each point $p \in \overleftarrow{(n_6, n_5)}$, the shortest path from $p$ to $d_3$ is $p \rightarrow n_6 \rightarrow n_2 \rightarrow n_3 \rightarrow d_3$. Therefore, $n_5$ is the safe exit point.
The bold lines in Figure 5 indicate the safe region of $q$. The top-1 result remains $d_3$ as long as the query $q$ lies in the safe region.
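As a check on the safe exit point found in $(n_3, n_7)$, the crossing can also be derived by equating the two scores. The derivation assumes $\alpha = 1$, the same textual relevance $\mu$ for $d_3$ and $d_6$ (both descriptions are "Italian Restaurant"), and $x$ measured as the distance from $n_3$ along the segment, as in the example.

```latex
\[
\frac{\mu}{1 + (x + 3)} = \frac{\mu}{1 + (5 - x)}
\;\Longleftrightarrow\; x + 3 = 5 - x
\;\Longleftrightarrow\; x = 1 .
\]
```

For $x < 1$, $d_3$ is closer and has the higher score; for $x > 1$, $d_6$ does, so the top-1 result changes exactly at $p_{se1}$.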
Next, we analyze the time complexity of determining a set of safe exit points using a set of qualifying objects $d \in D^+_{p_a} \cup D^+_{n_\beta} \cup D(p_a, n_\beta)$. Note that $D^+_{p_a}$ ($D^+_{n_\beta}$) indicates
Table 3: Computation of safe exit points for example scenario.

| Edge/Segment | $p_a$ | $D^+_{p_a}$ | $D^+_{n_\beta}$ | $p_{se}$ |
|---|---|---|---|---|
| $\overrightarrow{(q, n_3)}$ | $q$ | $D^+_q = \{d_3\}$ | $D^+_{n_3} = \{d_3\}$ | none |
| $(n_3, n_4)$ | $n_3$ | $D^+_{n_3} = \{d_3\}$ | $D^+_{n_4} = \{d_3\}$ | none |
| $(n_3, n_7)$ | $n_3$ | $D^+_{n_3} = \{d_3\}$ | $D^+_{n_7} = \{d_6\}$ | $p_{se1}$ |
| $\overrightarrow{(n_3, n_5)}$ | $n_3$ | $D^+_{n_3} = \{d_3\}$ | $D^+_{n_5} = \{d_3\}$ | none |
| $\overleftarrow{(n_6, n_5)}$ | $n_5$ | $D^+_{n_5} = \{d_3\}$ | $D^+_{n_6} = \{d_4\}$ | $p_{se2}$ |
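Figure 5: Illustration of safe region of $q$. The figure shows the road network of the running example with nodes $n_1$ to $n_7$, the edge weights, the query point $q$, the safe exit points $p_{se1}$ and $p_{se2}$, and the data objects $d_1$ (Grand Hotel), $d_2$ (Cafe), $d_3$ (Italian Restaurant), $d_4$ (Chinese Restaurant), $d_5$ (Pub and Bar), $d_6$ (Italian Restaurant), and $d_7$ (Cafe and Bakery).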
the set of $k$ data objects that satisfies the query condition at $p_a$ ($n_\beta$). According to Dijkstra's algorithm [26], the time complexity $O(D^+_q)$ of computing a set of answer objects at a query point $q$ is $O(D^+_q) = O(|E| + |N| \log |N|)$. This means that $O(D^+_{p_a}) = O(D^+_{n_\beta}) = O(|E| + |N| \log |N|)$ holds for endpoints $p_a$ and $n_\beta$. Thus, the time complexity $O(\Omega_{kth})$ of determining the skyline $\Omega_{kth}$ with the $k$-th highest score is $O(\Omega_{kth}) = C_{kth} \cdot O(|D^+_{p_a} \cup D^+_{n_\beta} \cup D(p_a, n_\beta)|)$, where $C_{kth}$ is the number of qualifying objects that participate in the constitution of the skyline with the $k$-th highest score. Therefore, the time complexity of determining a safe exit point coincides with the time complexity of determining the two skylines, i.e., the skyline $D_l^+$ with the $k$-th highest (or lowest) score for answer objects and the skyline $D_h^-$ with the highest score for nonanswer objects. This is because the safe exit point is found at the cross point between these skylines.
Figure 6 represents the skyline graph for $k = 1$ in an edge $(n_7, n_3)$. Let us draw the score functions for $d_3$ and $d_6$ for the road segment $(n_7, n_3)$, where a safe exit point exists. This is because $D^+_{n_3} = \{d_3\}$ and $D^+_{n_7} = \{d_6\}$ for $k = 1$. For each point $p \in (n_7, n_3)$, the distance between $d_3$ and point $p$ can be represented as $dist(d_3, p) = dist(d_3, n_3) + len(n_3, p) = 6 - len(n_7, p)$. Similarly, for each point $p \in (n_7, n_3)$, the distance between $d_6$ and point $p$ can be represented as $dist(d_6, p) = dist(d_6, n_7) + len(n_7, p) = 2 + len(n_7, p)$. Let $len(n_7, p)$ be
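Figure 6: Skyline graph for $k = 1$ on the road segment $(n_7, n_3)$. Horizontal axis: distance $x$ along the segment (0.5 to 3.0), with $n_7$, $n_3$, $p_{se1}$, and $d_7$ marked; vertical axis: score (0.2 to 1.0). Curves: $\psi(d_6) = 1/(x + 3)$ and $\psi(d_3) = 1/(-x + 7)$.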
a variable $x$ ($0 \le x \le 3$). We can write $\lambda(d_3, p) = dist(d_3, p) = 6 - x$ and $\lambda(d_6, p) = dist(d_6, p) = 2 + x$. Then, we can represent the score functions $\psi(d_3)$ and $\psi(d_6)$ as follows:

$\psi(d_3) = \mu(d_3.t, q.t)/(1 + \alpha \cdot \lambda(d_3, p)) = 1/(7 - x)$ for $0 \le x \le 3$,
$\psi(d_6) = \mu(d_6.t, q.t)/(1 + \alpha \cdot \lambda(d_6, p)) = 1/(3 + x)$ for $0 \le x \le 3$.

Finally, we present a lemma to prove that the safe exit points computed by COSK are correct.
Lemma 8. The COSK algorithm correctly computes a set of safe exit points.
Proof. We prove the correctness of the COSK algorithm by contradiction. Assume that $D^+_{p_a} \neq D^+_{n_\beta}$ but there is no safe exit point in the road segment $(p_a, n_\beta)$. This means that for each point $p$ in the road segment $(p_a, n_\beta)$, the query result at $p$ equals $D^+_{p_a}$, i.e., $D^+_p = D^+_{p_a}\ \forall p \in (p_a, n_\beta)$. However, taking $p = n_\beta$ yields $D^+_{n_\beta} = D^+_{p_a}$, which contradicts the assumption. Therefore, if $D^+_{p_a} \neq D^+_{n_\beta}$, a safe exit point exists in $(p_a, n_\beta)$. In addition, when $D^+_{p_a} \neq D^+_{n_\beta}$, the safe exit point is determined using the skyline $D_l^+$ for answer objects and the skyline $D_h^-$ with the highest score for nonanswer objects. The first skyline is a composite polyline drawn from the answer objects in $D^+_{p_a}$. The second skyline is a composite polyline drawn from the nonanswer objects in $D^+_{n_\beta} \cup D(p_a, n_\beta) - D^+_{p_a}$.
6 Monitoring Query Results and Safe Regions in Dynamic Directed Road Networks
In this section, we discuss the monitoring of spatial keyword queries in dynamic road networks, where the network distance changes depending on the traffic conditions. Updates to the weights of some edges may invalidate the query results or the safe region of $q$, even though the query object $q$ remains within its safe region. Figure 7 illustrates an example of changing the weights of edges $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$. For convenience, we consider $\alpha = 1$ and $q.t$ = "Italian restaurant". In Figure 7(a), the top-1 result is $d_1$, and the bold lines show the safe region of query $q$. Now consider that, at time $t_j$, the weights of the two edges $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$ change due to heavy traffic, as shown in Figure 7(b). This update may invalidate the query result or the safe region of $q$. Therefore, it is necessary to check the validity of the results and the safe region whenever such changes occur.
Next, we introduce a monitoring region to check the validity of the safe region effectively when the weight of an edge changes. The monitoring region $MR$ contains all the points between the query point $q$ and the lowest answer object, and between $q$ and the highest nonanswer object. Formally, it is defined as $MR = dist(q, D^+_l) \cup dist(q, D^-_h)$, where $dist(q, D^+_l)$ is the distance between $q$ and the lowest answer object and $dist(q, D^-_h)$ is the distance between $q$ and the highest nonanswer object. In the given example, $D^+_l = d_1$ and $D^-_h = \{d_2, d_3\}$. Therefore, the dotted lines in Figure 8(a) show the monitoring region of query object $q$.
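One way to read this definition is as the union of the edges on the shortest paths from $q$ to the lowest answer object and to each highest nonanswer object. The sketch below follows that reading, which is an assumption rather than the authors' stated construction. The function names are hypothetical, the toy graph is invented for illustration and is not the network of Figures 7 and 8, and data objects are modeled as graph nodes for simplicity.

```python
import heapq


def shortest_path_edges(graph, source, target):
    """Directed edges on one shortest path from source to target (Dijkstra)."""
    dist = {source: 0.0}
    prev = {}
    done = set()
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in done:
        return set()  # target is unreachable from source
    edges = set()
    node = target
    while node != source:
        edges.add((prev[node], node))
        node = prev[node]
    return edges


def monitoring_region(graph, q, lowest_answer, highest_nonanswers):
    """Union of the path edges from q to D+_l and to every object in D-_h."""
    region = shortest_path_edges(graph, q, lowest_answer)
    for obj in highest_nonanswers:
        region |= shortest_path_edges(graph, q, obj)
    return region


# Hypothetical directed toy network: graph[u][v] is the weight of edge u -> v.
toy_graph = {
    "q": {"n2": 2.0, "n1": 5.0},
    "n1": {"n6": 3.0},
    "n2": {"d1": 1.0, "d2": 3.0},
    "n6": {"d3": 2.0},
}
mr = monitoring_region(toy_graph, "q", "d1", ["d2"])
```

In this toy network the region covers the edges leading to `d1` and `d2` only, so a weight change on the edge from `n1` to `n6` would fall outside it.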
Now, at time $t_j$, the updates to edges $\overleftarrow{(n_1, n_6)}$ and $\overleftarrow{(n_1, d_1)}$, which are not part of the monitoring region, can safely be ignored. However, the update on segment $\overrightarrow{(n_2, d_1)}$, which is associated with the monitoring region, may invalidate the results. As shown in Figure 8(b), after the update, the top-1 result becomes $d_2$, and the bold lines represent the new safe region of $q$.
Algorithm 5 monitors the validity of the result set and the safe region of query object $q$ when the weight of any edge changes. Suppose that the weight of edge $(n_i, n_j)$ changes at time $t_j$. First, the algorithm checks whether edge $(n_i, n_j)$ is associated with the monitoring region. If it is not part of the monitoring region, the algorithm simply ignores the update to edge $(n_i, n_j)$, and the query results and safe region remain valid. In contrast, if the edge is associated with the monitoring region (i.e., $MR \cap (n_i, n_j) \neq \emptyset$), the algorithm re-evaluates the query results. Consequently, the top-k results and the safe region of query $q$ need to be updated. Finally, the algorithm updates the monitoring region of $q$.
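The control flow described above can be sketched as a small update handler. This is an illustrative sketch only: the `state` dictionary, the function name `monitor_safe_region`, and the three callbacks are hypothetical stand-ins for the snapshot evaluation, safe exit computation, and monitoring region computation of COSK, and the monitoring region is modeled as a set of directed edges.

```python
def monitor_safe_region(state, updated_edge, evaluate_snapshot_query,
                        compute_safe_exits, compute_monitoring_region):
    """Handle one edge-weight update; return True if the query state was refreshed."""
    if updated_edge not in state["monitoring_region"]:
        # The edge lies outside the monitoring region: results stay valid.
        return False
    # The edge touches the monitoring region: recompute everything in order.
    state["results"] = evaluate_snapshot_query()
    state["safe_exits"] = compute_safe_exits(state["results"])
    state["monitoring_region"] = compute_monitoring_region(state["results"])
    return True


# Toy usage with stub callbacks that only record the order of the calls.
calls = []


def fake_evaluate():
    calls.append("evaluate")
    return ["d2"]


def fake_safe_exits(results):
    calls.append("safe_exits")
    return ["pse1", "pse2", "pse3"]


def fake_region(results):
    calls.append("region")
    return {("q", "n2"), ("n2", "d2")}


state = {
    "results": ["d1"],
    "safe_exits": ["pse1", "pse2", "pse3"],
    "monitoring_region": {("q", "n2"), ("n2", "d1")},
}
ignored = monitor_safe_region(state, ("n1", "n6"), fake_evaluate,
                              fake_safe_exits, fake_region)
refreshed = monitor_safe_region(state, ("n2", "d1"), fake_evaluate,
                                fake_safe_exits, fake_region)
```

Passing the three computations as callbacks keeps the sketch independent of how the snapshot query and the safe exit points are actually computed.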
7 Performance Evaluation
In this section, we evaluate the performance of COSK through simulation experiments. We describe our experimental settings in Section 7.1, and we present our experimental results for static and dynamic road networks in Sections 7.2 and 7.3, respectively.

7.1 Experimental Settings. All of our experiments were performed using real road networks, namely, Oldenburg, San Francisco, and San Joaquin. All three road networks were obtained from [27]. The original road network of San Francisco had 21047 nodes and 21692 edges. We reformatted the network, pruned approximately 30% of the nodes, and adjusted the edges and their weights accordingly. This resulted in a network with 14732 nodes and 14316 edges. Both the directions of the edges and the data objects on the edges were generated randomly. The description of each data object was extracted from Twitter messages [28], and we assigned one tweet per data object. Table 4 presents the characteristics of the data sets used in the experimental evaluation. We simulated moving query objects by using a spatiotemporal data generator [29]. The input to the generator was the road network of the data set used, and the output was the set of query objects moving on the road network. Each experiment had 100 moving queries, which were continuously monitored for 100 timestamps (1 timestamp = 1 second), and the average result was reported in the experiments.
As a benchmark for COSK in static road networks, we implemented the algorithm of Li et al. [22], which also continuously monitors moving top-k spatial keyword queries in road networks. However, this algorithm was originally designed for undirected road networks. To make a fair comparison, we modified it to process top-k spatial keyword queries in directed road networks and refer to the modified version as CMTkSK+. Specifically, we modified the distance computation method between two points such that, in directed road networks, $dist(p_1, p_2) \neq dist(p_2, p_1)$. Since CMTkSK+ does not handle top-k spatial keyword queries in dynamic road networks, we compared the performance of COSK with a basic algorithm that recomputes the results whenever the query object changes its location. All algorithms were implemented in Java and were executed on a desktop PC with a 2.80-GHz Intel Core i5 and
Figure 7: Updating the weight of edges in a dynamic road network, where $t_i < t_j$. (a) Safe region at time $t_i$. (b) Updating the weights of $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$ at time $t_j$. The network contains nodes $n_1$ to $n_6$, the query point $q$, the data objects $d_1$ (Italian Restaurant), $d_2$ (Italian Restaurant), and $d_3$ (Chinese Restaurant), and the safe exit points $p_{se1}$, $p_{se2}$, and $p_{se3}$.
Figure 8: Monitoring region and updated safe region at time $t_j$. (a) Monitoring region at time $t_i$. (b) New safe region at time $t_j$, with safe exit points $p_{se1}$, $p_{se2}$, and $p_{se3}$.
Algorithm 5: MonitoringSafeRegion($MR$, $(n_i, n_j)$)

Input: monitoring region $MR$, updated edge $(n_i, n_j)$
Output: none
if $MR \cap (n_i, n_j) = \emptyset$ then
    /* edge $(n_i, n_j)$ is not part of the monitoring region */
    ignore the change in the weight of edge $(n_i, n_j)$
else
    $P_{SE} \leftarrow \emptyset$ /* set of safe exit points */
    $D_k^{upd} \leftarrow$ EvaluateSnapshotQuery($n_i$, $e_i$) /* update the set of top-k results */
    $P_{SE}^{upd} \leftarrow$ ComputeSafeExit($p_a$, $n_\beta$) /* update the safe exit points */
    $MR^{upd} \leftarrow$ ComputeMonitoringRegion($D^+_l$, $D^-_h$) /* update the monitoring region */
end
Table 4: Summary of datasets.

| Attribute | Oldenburg | San Francisco | San Joaquin |
|---|---|---|---|
| Total no. of nodes | 6104 | 14732 | 18262 |
| Total no. of edges | 7034 | 14316 | 23876 |
| Percentage of directed edges | 30 | 30 | 30 |
| Total no. of objects | 5627 | 11453 | 19098 |
| Average no. of objects per edge | 0.8 | 0.8 | 0.8 |
| Total no. of words | 49517 | 103649 | 166153 |
Table 5: Experimental parameter settings.

| Parameter | Range |
|---|---|
| Number of results ($k$) | 5, 10, 15, 20, 25 |
| Number of keywords ($n$) | 1, 2, 3, 4, 5 |
| Query parameter ($\alpha$) | 0.01, 0.1, 1, 10, 100 |
| Dataset | Oldenburg, San Francisco, San Joaquin |
| Number of data objects ($N_D$) | 10, 20, 30, 40, 50 (×1000) |
| Speed of query objects ($V_{qry}$) | 25, 50, 75, 100, 125 (km/h) |
| Mobility ($M_{qry}$) | 20, 40, 60, 80, 100 (%) |
| Ratio of directed edges ($E_{dir}$) | 10, 20, 30, 40, 50 (%) |
| Ratio of updated edges ($E_{upd}$) | 15, 30, 60, 80, 100 (%) |
8 GB of memory. In the experiments, we compared (1) query processing times, (2) edges processed, i.e., the number of edges processed for retrieving query results, and (3) index sizes. Table 5 summarizes the parameters used in the experiments. In each experiment, we varied a single parameter within the range shown in Table 5 while maintaining the other parameters at the bolded default values.
We evaluated the performance of the algorithms by using the following measures: (1) the total amount of server CPU time, which indicates the query processing time, and (2) the total communication cost, measured as the total number of points (i.e., the location updates sent by query objects and the query results and safe exit points returned by the server) transferred between clients and the server. The battery power and wireless bandwidth consumption typically increase with the amount of data transferred between objects (clients) and servers. Thus, we used the amount of transferred data as a metric to evaluate the communication cost.
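The communication cost metric defined above amounts to a simple tally of points exchanged. The following toy sketch shows the arithmetic for two reporting policies. All numbers are hypothetical and chosen only for illustration: they are not measurements from the experiments, and the function name `communication_cost` and the constants are assumptions.

```python
def communication_cost(exchanges):
    """Total points transferred.

    Each exchange is a pair (points sent by the client, points returned
    by the server), following the metric defined in the text.
    """
    return sum(sent + returned for sent, returned in exchanges)


K = 5              # number of results returned per query (hypothetical)
TIMESTAMPS = 100   # monitoring duration in timestamps

# Periodic reporting: one location update per timestamp, K results returned.
periodic = [(1, K)] * TIMESTAMPS

# Safe-exit reporting: the client contacts the server only when it leaves its
# safe region. Here we assume this happens 4 times, and that each reply
# carries K results plus 3 safe exit points (both figures are hypothetical).
SAFE_EXIT_POINTS = 3
safe_exit = [(1, K + SAFE_EXIT_POINTS)] * 4

periodic_cost = communication_cost(periodic)
safe_exit_cost = communication_cost(safe_exit)
```

Each safe-exit reply is larger than a periodic reply, but far fewer exchanges occur as long as the query object rarely leaves its safe region.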
7.2 Experimental Results of Top-k Spatial Keyword Queries in Static Road Networks
7.2.1 Effect of $k$. Figure 9 indicates the effect of the number of results on the query processing time and communication cost for both algorithms. Figure 9(a) indicates that the query processing time increases for both algorithms as the value of $k$ increases. This is expected because, with an increase in $k$, more data objects need to be explored and verified. Nevertheless, COSK significantly outperforms CMTkSK+ for two main reasons. First, the search for relevant objects is very efficient when using the highest significant factor, and second, COSK does not need to verify the set of answer objects as long as the query object lies in a safe region. On the other hand, the CMTkSK+ query processing time increases significantly because it has to monitor and verify the set of candidate objects periodically. In Figure 9(b), the communication costs for both algorithms increase as the number of objects increases. However, the proposed algorithm demonstrates superior performance compared to CMTkSK+ because client-server communication is not required while the query object lies within the safe exit points, whereas in CMTkSK+ the query object is required to report its location to the server whenever it moves.
7.2.2 Effect of $N_D$. This experiment was conducted on the San Joaquin dataset. This dataset included 19098 data objects; therefore, we randomly generated approximately 30000 additional data objects on different edges. In Figure 10, we evaluate the performance of COSK and CMTkSK+ by varying the cardinality of the data objects. Note that $N_D = 10K$ corresponds to a low density of data points, while $N_D = 50K$ corresponds to a high density. In Figure 10(a), it is interesting to notice that the query processing times of both algorithms decrease as the cardinality of the data objects increases. For CMTkSK+, this is because, with high density, the monitoring range of a query decreases. For COSK, however, it is mainly because, when the data density is high, fewer edges need to be expanded, which decreases the query processing time. In Figure 10(b), we study the influence of the cardinality of the data objects on the communication costs. The experimental results indicate that CMTkSK+ incurs almost constant communication costs regardless of the data object cardinality. However, the communication costs of COSK increase in proportion to the $N_D$ value. This is expected because the safe region becomes smaller as the density of the data objects increases, which increases the communication costs.
7.2.3 Effect of Query Keywords ($n$). Figure 11 shows the query processing time and communication cost for COSK and CMTkSK+ as a function of the number of query keywords. Figures 11(a) and 11(b) show that the performance of both algorithms degrades when the number of keywords increases. This is mainly because, by increasing the number of query keywords, the number of relevant objects may also increase, resulting in a higher query processing time and communication cost. However, the safe-region-based algorithm COSK scales better than CMTkSK+ because of its less expensive monitoring technique.
7.2.4 Effect of $\alpha$. Figure 12 demonstrates the impact of the query parameter $\alpha$ on the query processing time and on the communication cost. A small value of $\alpha$ indicates a greater importance of textual relevance, whereas a high value of $\alpha$ gives more preference to spatial relevance. It is interesting to note that the query processing time is lower for higher
Figure 9: Effect of $k$ on query processing time and communication cost. (a) Query processing time (s) versus $k$. (b) Number of messages transferred versus $k$. Both panels compare COSK and CMTkSK+.
Figure 10: Effect of $N_D$ on query processing time and communication cost. (a) Query processing time (s) versus $N_D$. (b) Number of transferred messages versus $N_D$. Both panels compare COSK and CMTkSK+.
values of $\alpha$, which give more importance to spatial relevance. This is mainly because, when the spatial relevance is weighted more heavily, fewer edges and objects need to be explored and processed to determine the top-k data objects. Observe that, in Figure 12(b), the number of messages sent by COSK decreases sharply with an increase in $\alpha$.

7.2.5 Effect of Speed. Figure 13(a) demonstrates the influence of the speed of the query objects on the query processing time of the COSK and CMTkSK+ algorithms. The experimental results indicate that the performance of CMTkSK+ is not significantly influenced by the speed of the query objects because the candidate objects must be monitored at regular time intervals regardless of the speed. On the other hand, the performance of COSK gradually degrades as the speed of the query objects increases because the objects leave their respective safe regions more frequently. Figure 13(b) shows the communication costs of COSK and CMTkSK+ with respect to the speed of the query objects. CMTkSK+ incurs almost constant communication costs because a server-initiated request to verify the candidate objects does not depend on the speed. For COSK, the query objects cross safe regions more frequently when the speed is high, which increases the communication costs.
Figure 11: Effect of number of keywords on query processing time and communication cost. (a) Query processing time (s) versus number of keywords. (b) Number of messages transferred versus number of keywords. Both panels compare COSK and CMTkSK+.
Figure 12: Effect of $\alpha$ on query processing time and communication cost. (a) Query processing time (s) versus $\alpha$. (b) Number of messages transferred versus $\alpha$. Both panels compare COSK and CMTkSK+.
7.2.6 Effect of Mobility. Figure 14 shows the effect of mobility $M_{qry}$ (mobility refers to the percentage of query objects that are moving at any timestamp) on the performance of the COSK and CMTkSK+ algorithms. As expected, the query processing time and communication costs for both algorithms increase with $M_{qry}$. Nevertheless, COSK performs better than CMTkSK+ in terms of query processing time and communication costs.

7.2.7 Effect of Directed Edges. Figure 15 shows the impact of the percentage of directed edges $E_{dir}$ on the performance of the COSK and CMTkSK+ algorithms. The query processing time increases with $E_{dir}$ because the algorithms need to explore more edges to retrieve the top-k results. However, the communication cost is not significantly affected by the value of $E_{dir}$ for either algorithm.

7.2.8 Effect of Datasets. Figure 16 shows the index sizes of the COSK and CMTkSK+ approaches for different datasets. As shown in Figure 16, both algorithms have similar index sizes. However, COSK has a minor space overhead because it stores additional information on the highest significance factor $\theta_t$ of edges. More importantly, this space overhead is minimal compared to the gain achieved by COSK in query processing time and communication costs.
Figure 13: Effect of speed on query processing time and communication cost. (a) Query processing time (s) versus $V_{qry}$. (b) Number of messages transferred versus $V_{qry}$. Both panels compare COSK and CMTkSK+.
Figure 14: Effect of mobility on query processing time and communication cost. (a) Query processing time (s) versus $M_{qry}$. (b) Number of messages transferred versus $M_{qry}$. Both panels compare COSK and CMTkSK+.
7.3 Experimental Results of Top-k Spatial Keyword Queries in Dynamic Road Networks. In this section, we evaluate the performance of COSK and the basic algorithm for dynamic road networks. $E_{upd}$ indicates the percentage of all edges that change their weight at each timestamp. The length of an updated edge is randomly selected between 0.1 and 10 times the original length. Figure 17(a) depicts the query processing time of COSK and the basic algorithm. It is evident from the figure that the query processing time of the basic algorithm is not significantly affected by $E_{upd}$. This is mainly because the query objects issue top-k spatial keyword queries at each timestamp. However, the query processing time of COSK increases with the value of $E_{upd}$ because the probability that an updated edge is associated with the monitoring region of query $q$ increases with $E_{upd}$. Therefore, when $E_{upd}$ becomes large, the results need to be updated frequently, which increases the query processing time. Figure 17(b) shows the communication costs of COSK and the basic algorithm with respect to $E_{upd}$. The basic algorithm incurs almost constant communication costs regardless of the value of $E_{upd}$. In contrast, the communication cost of COSK increases with $E_{upd}$ because the query result and safe regions need to be updated frequently.
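The edge-update workload described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' generator: the function name `update_edge_weights` is hypothetical, the scaling factor is assumed to be drawn uniformly from [0.1, 10] (the text only says "randomly selected"), and the four-edge network is invented.

```python
import random


def update_edge_weights(original, current, ratio, rng):
    """Re-weight a fraction `ratio` of the edges for one timestamp.

    original: dict mapping edge -> original length
    current:  dict mapping edge -> current weight (modified in place)
    Returns the list of edges whose weight was updated.
    """
    edges = sorted(original)
    n_updates = round(len(edges) * ratio)
    chosen = rng.sample(edges, n_updates)
    for edge in chosen:
        # Assumption: uniform factor between 0.1 and 10 times the original.
        current[edge] = original[edge] * rng.uniform(0.1, 10.0)
    return chosen


# Hypothetical four-edge network; E_upd = 50% of edges per timestamp.
original_lengths = {
    ("n1", "n2"): 3.0,
    ("n1", "n6"): 5.0,
    ("n2", "n3"): 2.0,
    ("n6", "n5"): 4.0,
}
weights = dict(original_lengths)
rng = random.Random(42)  # fixed seed so that runs are repeatable
updated = update_edge_weights(original_lengths, weights, 0.5, rng)
```

Each edge returned in `updated` would then be passed to the monitoring step, which either ignores it or triggers a re-evaluation depending on whether it touches the monitoring region.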
Figure 15: Effect of $E_{dir}$ on query processing time and communication cost. (a) Query processing time (s) versus $E_{dir}$. (b) Number of messages transferred versus $E_{dir}$.
Figure 16: Effect of dataset on index size. The plot shows the index size (MB) of COSK and CMTkSK+ for the Oldenburg, San Francisco, and San Joaquin datasets.
8 Conclusion
In this paper, we investigated moving top-k spatial keyword queries in directed and dynamic road networks. We presented an efficient indexing framework using inverted files that indexes the data objects on edges, allowing for the effective searching of data objects relevant to queries in terms of both textual and spatial relevance. We also presented a safe-exit-based algorithm called COSK to monitor moving top-k spatial keyword queries. We demonstrated that the query results remain valid as long as the query object resides within a safe region. Furthermore, COSK can effectively monitor the validity of query results and safe regions in dynamic road networks. Finally, an experimental evaluation conducted on real road networks demonstrated that COSK significantly reduced the query processing time and communication costs compared to the CMTkSK+ algorithm.
Data Availability
The real road network data used in this study are also used in many previous studies. The road network data are cited in the manuscript and are available at https://www.cs.utah.edu/~lifeifei/SpatialDataset.htm. To simulate the moving queries, the authors used a spatiotemporal data generator that is also used in previous studies. The research article describing the generator is cited in the manuscript. The documentation and source
Figure 17: Effect of $E_{upd}$ on query processing time and communication cost. (a) Query processing time (s) versus $E_{upd}$. (b) Number of messages transferred versus $E_{upd}$. Both panels compare COSK and the basic algorithm.
files of the generator are available at https://iapg.jade-hs.de/personen/brinkhoff/generator. The authors used Twitter tweets to generate the descriptions of the data objects and the query keywords. The tweets used are accessible at http://followthehashtag.com/datasets/free-twitter-dataset-usa-200000-free-usa-tweets.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
Hyung-Ju Cho was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean Government (MSIP) (NRF-2016R1A2B4009793), and this research was partially supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2016R1D1A1B03934129).
References
[1] D. Papadias, N. Mamoulis, J. Zhang, and Y. Tao, "Query processing in spatial network databases," in Proceedings of the 29th International Conference on Very Large Data Bases (VLDB '03), pp. 802–813, September 2003.
[2] H.-J. Cho, K. Ryu, and T.-S. Chung, "An efficient algorithm for computing safe exit points of moving range queries in directed road networks," Information Systems, vol. 41, pp. 1–19, 2014.
[3] G. Tsatsanifos and A. Vlachou, "On processing Top-k spatio-textual preference queries," in Proceedings of the 18th International Conference on Extending Database Technology (EDBT '15), pp. 433–444, March 2015.
[4] R. Li, A. X. Liu, A. L. Wang, and B. Bruhadeshwar, "Fast range query processing with strong privacy protection for cloud computing," Proceedings of the VLDB Endowment, vol. 7, no. 14, pp. 1953–1964, 2014.
[5] G. Cong, C. S. Jensen, and D. Wu, "Efficient retrieval of the Top-k most relevant spatial web objects," Proceedings of the VLDB Endowment, vol. 2, no. 1, pp. 337–348, 2009.
[6] Z. Li, K. C. K. Lee, B. Zheng, W.-C. Lee, D. Lee, and X. Wang, "IR-tree: An efficient index for geographic document search," IEEE Transactions on Knowledge and Data Engineering, vol. 23, no. 4, pp. 585–599, 2011.
[7] Y. Zhou, X. Xie, C. Wang, Y. Gong, and W. Ma, "Hybrid index structures for location-based web search," in Proceedings of the 14th ACM International Conference on Information and Knowledge Management, pp. 155–162, Bremen, Germany, October 2005.
[8] J. Zobel and A. Moffat, "Inverted files for text search engines," ACM Computing Surveys, vol. 38, no. 2, 2006.
[9] N. Beckmann, H. Kriegel, R. Schneider, and B. Seeger, "The R*-tree: an efficient and robust access method for points and rectangles," in Proceedings of the ACM SIGMOD International Conference on Management of Data, vol. 19, pp. 322–331, May 1990.
[10] R. Hariharan, B. Hore, C. Li, and S. Mehrotra, "Processing spatial-keyword (sk) queries in geographic information retrieval (gir) systems," in Proceedings of the 19th International Conference on Scientific and Statistical Database Management (SSDBM '07), July 2007.
[11] I. De Felipe, V. Hristidis, and N. Rishe, "Keyword search on spatial databases," in Proceedings of the 24th International Conference on Data Engineering (ICDE '08), pp. 656–665, April 2008.
[12] J. B. Rocha-Junior, O. Gkorgkas, S. Jonassen, and K. Nørvåg, "Efficient processing of top-k spatial keyword queries," in Proceedings of the International Symposium on Spatial and Temporal Databases, pp. 205–222, Springer, 2011.
[13] D. Zhang, K.-L. Tan, and A. K. Tung, "Scalable top-k spatial keyword search," in Proceedings of the 16th International Conference on Extending Database Technology, pp. 359–370, 2013.
[14] J. B. Rocha-Junior and K. Nørvåg, "Top-k spatial keyword queries on road networks," in Proceedings of the 15th International Conference on Extending Database Technology, pp. 168–179, Berlin, Germany, March 2012.
[15] H.-J. Cho, S. J. Kwon, and T.-S. Chung, "A safe exit algorithm for continuous nearest neighbor monitoring in road networks," Mobile Information Systems, vol. 9, no. 1, pp. 37–53, 2013.
[16] D. Yung, M. L. Yiu, and E. Lo, "A safe-exit approach for efficient network-based moving range queries," Data & Knowledge Engineering, vol. 72, pp. 126–147, 2012.
[17] M. Attique, H. Cho, R. Jin, and T. Chung, "Efficient Processing of Continuous Reverse k Nearest Neighbor on Moving Objects in Road Networks," ISPRS International Journal of Geo-Information, vol. 5, no. 12, p. 247, 2016.
[18] H. G. Elmongui, M. F. Mokbel, and W. G. Aref, "Continuous aggregate nearest neighbor queries," GeoInformatica, vol. 17, no. 1, pp. 63–95, 2013.
[19] D. Wu, M. L. Yiu, C. S. Jensen, and G. Cong, "Efficient continuously moving top-k spatial keyword query processing," in Proceedings of the IEEE International Conference on Data Engineering (ICDE '11), pp. 541–552, Hannover, Germany, April 2011.
[20] W. Huang, G. Li, K.-L. Tan, and J. Feng, "Efficient safe-region construction for moving top-k spatial keyword queries," in Proceedings of the 21st ACM International Conference on Information and Knowledge Management, pp. 932–941, 2012.
[21] L. Guo, J. Shao, H. H. Aung, and K.-L. Tan, "Efficient continuous top-k spatial keyword queries on road networks," GeoInformatica, vol. 19, no. 1, pp. 29–60, 2014.
[22] Y. Li, G. Li, L. Shu, Q. Huang, and H. Jiang, "Continuous monitoring of top-k spatial keyword queries in road networks," Journal of Information Science and Engineering, vol. 31, no. 6, pp. 1831–1848, 2015.
[23] M. Attique, A. Khan, and T.-S. Chung, "ESPAK: Top-k spatial keyword query processing in directed road networks," in Proceedings of the Workshops of the International Conference on Extending Database Technology and the International Conference on Database Theory (EDBT/ICDT '17), March 2017.
[24] G. Salton and C. Buckley, "Term-weighting approaches in automatic text retrieval," Information Processing & Management, vol. 24, no. 5, pp. 513–523, 1988.
[25] V. N. Anh, O. de Kretser, and A. Moffat, "Vector-space ranking with effective early termination," in Proceedings of the 24th Annual International ACM SIGIR Conference, pp. 35–42, New Orleans, LA, USA, 2001.
[26] E. W. Dijkstra, "A note on two problems in connexion with graphs," Numerische Mathematik, vol. 1, pp. 269–271, 1959.
[27] "Real datasets for spatial databases," https://www.cs.utah.edu/~lifeifei/SpatialDataset.htm.
[28] "Twitter," https://twitter.com.
[29] T. Brinkhoff, "A framework for generating network-based moving objects," GeoInformatica, vol. 6, no. 2, pp. 153–180, 2002.
TkSK queries in directed road networks. In Section 5, we present our safe-exit-based technique to process moving TkSK queries. Section 7 presents a performance analysis of the proposed technique. Section 8 concludes this paper.
2 Related Work
In this section, we discuss some of the promising related studies on top-k spatial keyword queries. Our related work is divided into two sections: Section 2.1 reviews snapshot TkSK queries, and Section 2.2 presents the studies proposed to address moving TkSK queries.
2.1 Snapshot Top-k Spatial Keyword Queries. In recent years, spatial keyword queries have drawn the attention of many researchers. Several approaches have been proposed for ranking spatial data objects. Initially, Zhou et al. [7] worked on combining inverted indexes [8] and R-trees [9]. They proposed three different hybrid indexing structures. Their study demonstrated that building an inverted index on top of an R-tree provides superior performance. Hariharan et al. [10] proposed the indexing structure KR*-tree by capturing the joint distribution of keywords in space. De Felipe et al. [11] proposed a data structure that combines an R-tree with text signatures. Each node of the R-tree exploits a signature to indicate the presence of keywords in the subtree of the node. However, both these approaches address only Boolean keyword queries in Euclidean space.

Top-k spatial keyword queries, where data objects are ranked according to their combined textual and spatial relevance to keyword queries, were first studied by Cong et al. [5] and Li et al. [6]. Both studies [5, 6] integrate location indexing and text indexing to generate IR-trees. These studies process top-k spatial keyword queries only in Euclidean space and are not suitable for processing top-k spatial preference queries in road networks, where the distance between objects is determined by the shortest path connecting them. Later, Rocha et al. [12] proposed the indexing technique S2I, which maps each term in the vocabulary into a separate block or aR-tree for efficient processing of top-k spatial keyword queries. Zhang et al. [13] proposed an m-closest keyword query that returns the closest objects based on distance and that match $m$ query keywords.

Top-k spatial keyword queries in road networks were introduced by Rocha et al. [14]. In particular, they proposed three different indexing techniques (Basic Indexing, Enhanced Indexing, and Overlay Indexing) for processing spatial keyword queries in road networks.

2.2 Moving Top-k Spatial Keyword Queries. Recently, research focus has shifted to the continuous processing of spatial queries where query or data objects move arbitrarily in road networks, which is the most realistic scenario. Considerable research effort has been devoted to processing moving range, k nearest neighbor (kNN), and reverse k nearest neighbor (RkNN) queries [15–18]. However, there is a lack of efficient algorithms for moving top-k spatial keyword queries. Initially, Wu et al. [19] and Huang et al. [20]
Table 1: Comparisons with existing solutions.

| Algorithm | Type | Space domain | Orientation |
|---|---|---|---|
| Cong et al. [5] | Snapshot | Euclidean | No orientation |
| Rocha et al. [14] | Snapshot | Static road | Undirected |
| Wu et al. [19] | Moving | Euclidean | No orientation |
| Huang et al. [20] | Moving | Euclidean | No orientation |
| Guo et al. [21] | Moving | Static road | Undirected |
| Li et al. [22] | Moving | Static road | Undirected |
| COSK | Moving | Dynamic road | Directed |
proposed different methods for monitoring top-k spatial keyword queries in Euclidean space. Guo et al. [21] studied moving top-k spatial keyword queries on road networks. They presented two methods for monitoring moving queries in a continuous manner that reduce the traversal of network edges. Later, Li et al. [22] proposed TPR-tree-based indexing to monitor moving top-k spatial keyword queries. In contrast to [21, 22], in this study we consider moving top-k spatial keyword queries in directed and dynamic road networks, where each road segment has a particular orientation and its weight changes according to traffic conditions.

Table 1 compares our problem scenario with related work in terms of query type, space domain, and orientation of road networks.
3. Preliminaries
Section 3.1 defines the terms and notations used in this paper. Section 3.2 formulates the problem using an example that illustrates the general results of top-k spatial keyword queries.
3.1. Definition of Terms and Notations
3.1.1. Road Network. A road network is represented by a weighted directed graph $G = (N, E, W)$, where N, E, and W denote the node set, edge set, and edge distance matrix, respectively. The network distance of an edge changes depending on the traffic conditions. Each edge is also assigned an orientation that is either undirected or directed. An undirected edge is represented by $e = (n_s, n_e)$, where $n_s$ and $n_e$ are the boundary nodes $n_\beta$ of the edge, whereas a directed edge is represented by $e = \overrightarrow{(n_s, n_e)}$ or $e = \overleftarrow{(n_e, n_s)}$. Naturally, the arrow above the edge indicates the associated direction. We refer to $n_s$ as the starting node and $n_e$ as the ending node of an edge. For example, in Figure 1, $n_6$ is the starting node of edge $\overrightarrow{(n_6, n_2)}$, whereas it is the ending node of edge $\overleftarrow{(n_6, n_5)}$. The particular edge where a query object is located is called an active edge. It is important to note that the distance between two points $p_1$ and $p_2$ is not symmetrical in directed road networks (i.e., $dist(p_1, p_2) \neq dist(p_2, p_1)$). For example, in Figure 1, $dist(d_3, d_4) = 3$, whereas $dist(d_4, d_3) = 11$ because the shortest path from $d_4$ to $d_3$ is $(d_4 \rightarrow n_6 \rightarrow n_2 \rightarrow n_3 \rightarrow d_3)$.
3.1.2. Segment. Segment $s = (p_1, p_2)$ is the part of an edge between two points $p_1$ and $p_2$ on the edge. An edge consists
of one or more segments. An edge is also considered a segment where the nodes are the end points of the edge. The weight of a segment $(p_1, p_2)$ is denoted by $W(s)$.
3.2. Problem Formulation. Similar to previous studies [5, 14, 23], we assume that each data object $d \in D$ has a point location $d.l$ in the road network and a text description $d.t$. Given a query location $q.l$, a set of keywords $q.t$, and the number k of data objects to return, the top-k spatial keyword query is defined as $Q_k = (q.l, q.t, k)$; it returns the best k data objects from D according to a score that considers spatial proximity and text relevance. The score $\psi(d)$ of a data object d is defined by the following equation:
$$\psi(d) = \frac{\mu(d.t, q.t)}{1 + \alpha \cdot \lambda(d.l, q.l)} \tag{1}$$
where $\lambda(d.l, q.l)$ is the spatial relevance between $d.l$ and $q.l$, $\mu(d.t, q.t)$ is the textual relevance between $d.t$ and $q.t$, and $\alpha$ is a nonnegative real number that determines the importance of one measure over the other. For example, if only textual relevance is considered, then $\alpha = 0$. If more importance is given to spatial relevance, then $\alpha > 1$.
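As a minimal sketch of Equation (1), the following Python snippet computes the score and ranks three objects for two values of $\alpha$. The function names `score` and `rank` and the sample objects are illustrative assumptions and are not part of COSK.

```python
def score(text_rel: float, net_dist: float, alpha: float = 1.0) -> float:
    """Equation (1): psi(d) = mu / (1 + alpha * lambda)."""
    if alpha < 0 or net_dist < 0:
        raise ValueError("alpha and the network distance must be nonnegative")
    return text_rel / (1.0 + alpha * net_dist)


def rank(objs, alpha):
    """Return object names ordered by decreasing score."""
    ordered = sorted(objs, key=lambda o: score(o[1], o[2], alpha), reverse=True)
    return [name for name, _, _ in ordered]


# Hypothetical objects: (name, textual relevance mu, network distance lambda).
objects = [("a", 0.9, 8.0), ("b", 0.5, 1.0), ("c", 0.7, 3.0)]

print(rank(objects, alpha=0.0))  # only textual relevance matters
print(rank(objects, alpha=1.0))  # distance now penalizes far objects
```

With $\alpha = 0$ the ranking follows textual relevance alone, whereas with $\alpha = 1$ the nearby object `b` overtakes the textually stronger but distant object `a`.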
Spatial relevance ($\lambda$) is defined as the shortest network distance between data object d and query q: $\lambda(d.l, q.l) = dist(d.l, q.l)$. Thus, $dist(d_i.l, q.l) < dist(d_j.l, q.l)$ indicates that data object $d_i$ is more spatially relevant to q than data object $d_j$. The textual relevance ($\mu$) can be computed using any popular information retrieval model, such as cosine similarity or the language model. In this study, we use the cosine similarity between $d.t$ and $q.t$. The textual relevance is defined as follows:
$$\mu(d.t, q.t) = \frac{\sum_{t \in q.t} w_t(d.t)\, w_t(q.t)}{\sqrt{\sum_{t \in d.t} [w_t(d.t)]^2 \sum_{t \in q.t} [w_t(q.t)]^2}} \tag{2}$$
The weight $w_t(d.t) = 1 + \ln(f_t(d.t))$, where $f_t(d.t)$ represents the frequency of term t in $d.t$. The weight $w_t(q.t) = \ln(1 + |D|/df_t)$, where $|D|$ is the number of objects in D and $df_t$ is the document frequency of term t. A higher $\mu$ means a higher textual relevance to the query keywords. We use the variation of cosine similarity based on the significance factor $\theta_t(n)$ of term t in a document n, where n represents the description $d.t$ of a data object or the query keywords $q.t$. The significance $\theta_t(n) = w_t(n) / \sqrt{\sum_{t \in n} (w_t(n))^2}$ is the weight of the term normalized by the length of the document [24, 25]. Hence, the textual relevance $\mu(d.t, q.t)$ can be rewritten as
$$\mu(d.t, q.t) = \sum_{t \in q.t} \theta_t(d.t)\, \theta_t(q.t) \tag{3}$$
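The weights and Equation (3) can be sketched in Python as follows. This is only an illustration under stated assumptions: descriptions and queries are plain lists of terms, the document frequencies are supplied by the caller in a hypothetical `doc_freq` dictionary, and query terms that never occur in the corpus are skipped.

```python
import math
from collections import Counter


def doc_weights(terms):
    """w_t(d.t) = 1 + ln(f_t(d.t)) for every term of a description."""
    return {t: 1.0 + math.log(f) for t, f in Counter(terms).items()}


def query_weights(query_terms, num_objects, doc_freq):
    """w_t(q.t) = ln(1 + |D| / df_t); unknown terms are skipped (assumption)."""
    return {t: math.log(1.0 + num_objects / doc_freq[t])
            for t in set(query_terms) if doc_freq.get(t, 0) > 0}


def significance(weights):
    """theta_t(n) = w_t(n) / sqrt(sum of squared weights in n)."""
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {t: w / norm for t, w in weights.items()} if norm > 0 else {}


def textual_relevance(doc_terms, query_terms, num_objects, doc_freq):
    """Equation (3): sum over query terms of theta_t(d.t) * theta_t(q.t)."""
    theta_d = significance(doc_weights(doc_terms))
    theta_q = significance(query_weights(query_terms, num_objects, doc_freq))
    return sum(theta_d.get(t, 0.0) * th for t, th in theta_q.items())


# Hypothetical corpus statistics: |D| = 4 objects.
doc_freq = {"italian": 2, "restaurant": 2, "cafe": 1, "pizza": 1}
query = ["italian", "restaurant"]
mu_full = textual_relevance(["italian", "restaurant"], query, 4, doc_freq)
mu_half = textual_relevance(["italian", "pizza"], query, 4, doc_freq)
mu_none = textual_relevance(["cafe"], query, 4, doc_freq)
```

Because both significance vectors have unit length, the sum in Equation (3) is the cosine similarity of Equation (2), so $\mu$ never exceeds 1. This is the property that the upper bound $\mu = 1$ in Section 4.2 relies on.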
4. Query Processing System
In this section, we present the proposed query processing system, which indexes the data objects and prunes the irrelevant edges for efficient query processing. In Section 4.1, we discuss the indexing framework, and in Section 4.2, we present an efficient keyword query processing algorithm for snapshot queries.
4.1. Indexing Framework. In this study, our main work focuses on moving queries in directed and dynamic road networks. We use a method similar to the enhanced technique presented in [12] as our basic framework for processing snapshot queries in directed and dynamic road networks. The indexing framework combines a road network framework [1] for storing spatial information and an inverted file for indexing data objects. For easy traversal of the network, we store the adjacent nodes of each given node by storing the node id ($nid$), the edge id ($eid$), the direction of the edge, and the weight of the edge. The indexing framework consists of two main components: a pruning component and an inverted file component. Figure 2 illustrates the main components of the indexing framework. The pruning component first prunes the edges that contain data objects irrelevant to the query keywords. To achieve this, we introduce the highest significance $\theta_t^+$ of a given term t in the descriptions of the objects lying on the edge. The $\theta_t^+$ on an edge is retrieved by a key composed of a pair of edge id and term id $(eid, tid)$. The $\theta_t^+$ represents an upper-bound significance of any object lying on an edge with term t in its description. The inverted list of a term t on an edge is accessed only if the upper-bound score composed of $\theta_t^+$ and the minimum network distance between the starting node of the edge and query q may return a candidate data object. Naturally, the edges with upper-bound scores smaller than the score of the k-th object found so far are pruned.
We implement an inverted file for indexing data objects. The inverted file contains a vocabulary and inverted lists. The vocabulary keeps general information about each term (such as the frequency of the term), which is helpful in computing the textual relevance of the data objects. An inverted list stores the data objects located on the edge $\overrightarrow{(n_s, n_e)}$ that have a term t in their description. An inverted list is identified by a key composed of $(eid, tid)$. Each inverted file is a set of inverted lists. A separate inverted list is used for each term in the object description. An inverted list stores two attributes for each data object: first, the distance $dist(n_s, d_i)$ between the starting node and the data object; second, the significance factor $\theta(t_i, d_i)$ of the term $t_i$ in the description of the data object. Note that the network distance between two points in a directed road network is not symmetrical (i.e., $dist(n_s, d_i) \neq dist(d_i, n_s)$). Recall that the starting node is chosen according to the orientation of the edge such that the direction of the edge is from the node toward the data object. In Figure 1, $n_3$ is the starting node for $d_7$. For bidirectional edges, any of the adjacent nodes can act as a starting node.
The proposed indexing scheme has three main advantages. First, the search for objects relevant to the query keywords is very efficient using the $(eid, tid)$ pair. Second, inverted files also store the network distance between the starting node and the data object, which helps in accessing the data object in the directed road network. Finally, the pruning technique allows for faster query processing by exploring fewer edges.
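The indexing scheme described above can be sketched with two dictionaries keyed by $(eid, tid)$. The snippet is a simplified in-memory illustration rather than the authors' implementation: the names `build_index`, `edge_upper_bound`, and `should_open` are hypothetical, and combining $\theta_t^+$ over several query terms by summation is an assumption based on Equation (3).

```python
from collections import defaultdict


def build_index(postings):
    """postings: iterable of (eid, tid, obj_id, dist_from_start, theta)."""
    inverted = defaultdict(list)  # (eid, tid) -> [(obj_id, dist_from_start, theta)]
    theta_plus = {}               # (eid, tid) -> highest significance on the edge
    for eid, tid, obj_id, dist, theta in postings:
        key = (eid, tid)
        inverted[key].append((obj_id, dist, theta))
        theta_plus[key] = max(theta_plus.get(key, 0.0), theta)
    return inverted, theta_plus


def edge_upper_bound(eid, query_theta, theta_plus, dist_to_edge, alpha=1.0):
    """Best score any object on edge `eid` could reach for this query."""
    text_bound = sum(theta_plus.get((eid, tid), 0.0) * th_q
                     for tid, th_q in query_theta.items())
    return text_bound / (1.0 + alpha * dist_to_edge)


def should_open(eid, query_theta, theta_plus, dist_to_edge, s_k, alpha=1.0):
    """Fetch the inverted lists only if the edge can beat the k-th score."""
    return edge_upper_bound(eid, query_theta, theta_plus, dist_to_edge, alpha) > s_k


# Hypothetical postings on two edges.
postings = [
    ("e1", "italian", "d1", 2.0, 0.6),
    ("e1", "italian", "d2", 4.0, 0.8),
    ("e2", "cafe", "d3", 1.0, 1.0),
]
inverted, theta_plus = build_index(postings)
query_theta = {"italian": 1.0}  # significance of each query term
```

In this sketch, edge `e2` is never opened for the query because none of its objects contains a query term, and edge `e1` is opened only while its bound exceeds the current k-th score.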
Table 2 presents the notations used in this study.
4.2. Query Processing Algorithm. Our algorithm traverses the road network incrementally in a similar fashion to Dijkstra's
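Figure 2: Indexing framework. The pruning component (1) computes an upper-bound score using $dist(n, q)$ and $\theta_t^+$, stored under the key $\langle eid, tid \rangle$, and (2) accesses the inverted list of a term only if the upper-bound score is greater than the score of the k-th object. The inverted file consists of a vocabulary ($tid$, $df_{tid}$) and inverted lists keyed by $\langle eid, tid \rangle$ with entries ($d_i$, $dist(n_s, d_i)$, $\theta(d, t)$).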
Table 2: Summary of notations used in this paper.

| Notation | Definition |
|---|---|
| $G = (N, E, W)$ | Graph model of road network |
| $dist(p_s, p_e)$ | Length of shortest path from $p_s$ to $p_e$, where $p_s$ and $p_e$ represent start and end points, respectively |
| $len(p_1, p_2)$ | Length of segment connecting two points $p_1$ and $p_2$ |
| $n_i$ | Node in road network |
| $e = (n_s, n_e)$ | Edge in edge set E, where $n_s$ and $n_e$ are start and end points of the edge |
| $n_\beta$ | Boundary node corresponding to start ($n_s$) or end ($n_e$) point of an edge |
| $W(e)$ | Weight of edge $(n_s, n_e)$ |
| $q$ | Query point in road network |
| $k$ | Number of data objects to return |
| $D$ | Set of data objects, $D = \{d_1, d_2, \ldots, d_{|D|}\}$ |
| $D(n_s, n_e)$ | Set of data objects in an edge |
| $p_a$ | Anchor point that corresponds to start point of expansion |
| $P_{SE}$ | Safe exit point where safe and nonsafe regions of q intersect |
| $\alpha$ | Query parameter |
| $\psi(d)$ | Score of data object d |
| $\mu(d.t, q.t)$ | Textual relevance of data object d with query keywords |
| $\lambda(d.l, q.l)$ | Spatial relevance of data object d with query location |
| $D^+$ | Set of answer objects |
| $D^-$ | Set of nonanswer objects |
| $d_l^+$ | Lowest answer object |
| $d_h^-$ | Highest nonanswer object |
algorithm [26]. Algorithm 1 returns the top-k data objects with the highest scores according to their joint textual and spatial relevance to the query. The algorithm begins by exploring the active edge, where query object q is located, and expands the network in increasing order of distance from q. Each entry in the min-heap has the form $(p_a, edge)$, where $p_a$ indicates the anchor point in the edge. For an active edge, q becomes the anchor point. Otherwise, for directed edges, the ending node $n_e$ becomes the anchor point. For bidirectional edges, either of the adjacent boundary nodes, i.e., $n_s$ or $n_e$, becomes the anchor point. Let $D_k$ be the current set of top-k data objects and $s_k$ be the score of the k-th data object in $D_k$. The $candsearch((eid, tid), s_k)$ function retrieves the candidate data objects $D_c$ located in an edge with a better score $\psi(d)$ than $s_k$. Next, the $D_k$ set is updated with the data objects in $D_c$, and so is $s_k$. The algorithm continues its expansion and inserts the adjacent edges of the boundary node until the heap is exhausted or the remaining data objects cannot have a better score than $s_k$. The upper-bound score $\psi(n)$ of node n is computed using $dist(n, q)$ and the maximum textual relevance ($\mu = 1$). Therefore, if $\psi(n) \le s_k$, then even if there is an unexplored data object d matching all query keywords, its score cannot be better than that of the k-th object in $D_k$ because $dist(d, q.l) \ge dist(n, q.l)$. This is certain because the algorithm strictly expands the node with the minimum distance to the query location.
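The expansion and termination test described above can be sketched as a Dijkstra-style search over a directed graph. This is a simplified illustration and not Algorithm 1 itself: it assumes that the query sits on a node, that every edge is directed, that distances are measured from the query toward the objects, and that the textual relevance $\mu \le 1$ of each object is precomputed. All names and the sample network are hypothetical.

```python
import heapq


def top_k(graph, edge_objects, source, k, alpha=1.0):
    """graph: node -> [(neighbor, weight, eid)] for directed edges.
    edge_objects: eid -> [(obj_id, offset_from_start_node, mu)]."""
    if k <= 0:
        return []
    best = []                   # min-heap of (score, obj_id), at most k entries
    settled = set()
    frontier = [(0.0, source)]  # min-heap of (distance from query, node)
    while frontier:
        dist_n, node = heapq.heappop(frontier)
        if node in settled:
            continue
        settled.add(node)
        # Upper bound for every unexplored object: mu = 1 at distance dist_n.
        if len(best) == k and 1.0 / (1.0 + alpha * dist_n) <= best[0][0]:
            break
        for neighbor, weight, eid in graph.get(node, []):
            for obj_id, offset, mu in edge_objects.get(eid, []):
                sc = mu / (1.0 + alpha * (dist_n + offset))
                if len(best) < k:
                    heapq.heappush(best, (sc, obj_id))
                elif sc > best[0][0]:
                    heapq.heapreplace(best, (sc, obj_id))
            if neighbor not in settled:
                heapq.heappush(frontier, (dist_n + weight, neighbor))
    return sorted(best, reverse=True)


# Hypothetical directed network: q -> a costs 2, but a -> q costs 10.
graph = {"q": [("a", 2.0, "e1")],
         "a": [("b", 3.0, "e2"), ("q", 10.0, "e3")],
         "b": []}
edge_objects = {"e1": [("d1", 1.0, 0.5)],
                "e2": [("d2", 1.0, 0.9)],
                "e3": [("d3", 0.5, 1.0)]}
result_1 = top_k(graph, edge_objects, "q", k=1)
result_2 = top_k(graph, edge_objects, "q", k=2)
```

Because each directed edge is scanned only when its starting node is settled, the distance to an object is the settled distance of that node plus the offset stored in the inverted list, which mirrors the role of $dist(n_s, d_i)$ in Section 4.1.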
Algorithm 2 presents the $candsearch((eid, tid), s_k)$ procedure, which finds the candidate data objects. This procedure has two main steps. In the first step, the upper-bound score of the edges is computed using a significance factor ($\theta_t$) of a term
Input: Top-k spatial keyword query $Q_k = (q.l, q.t, k)$
Output: Top-k data objects with highest score
$D_c \leftarrow \emptyset$ /* set of candidate data objects */
max-heap $D_k \leftarrow \emptyset$ /* current top-k set */
$s_k \leftarrow 0$ /* k-th score in $D_k$ */
min-heap $\leftarrow \emptyset$
$explored \leftarrow \emptyset$
min-heap.insert($q.l$, $edge_{active}$)
$D_c \leftarrow candsearch((eid, tid), s_k)$
update $D_k$ and $s_k$ with $d \in D_c$
while min-heap $\neq \emptyset$ and $(1/(1 + \alpha \lambda(d.l, q.l)) > s_k)$ do
    for each unexplored adjacent edge of $(p_a, edge)$ do
        $explored \leftarrow explored \cup (p_a, edge)$
        $D_c \leftarrow candsearch((eid, tid), s_k)$
        update $D_k$ and $s_k$ with $d \in D_c$
    end
    min-heap.insert(adjacent node, edge)
end
return $D_k$
Algorithm 1: EvaluateSnapshotQuery(Node $n_i$, Edge $e_i$).
Input: Edge ID $eid$, term ID $tid$, score of k-th object $s_k$
Output: Candidate list $D_c$
compute $\theta_t(e_i)$
if $\theta_t(e_i) > 0$ then
    $maxscore(e_i) \leftarrow computemaxscore(\theta_t, dist(e_i, q.l))$
end
if $maxscore(e_i) > s_k$ then
    for each data object d in $e_i$ do
        compute $d_{score}$
        if $d_{score} > s_k$ then
            $D_c \leftarrow D_c \cup d$
        end
    end
end
return $D_c$
Algorithm 2: CandidateSearch($(eid, tid)$, $s_k$).
$t \in q.t$ and the shortest distance $sdist(e_i, q.l)$ between the edge and the query location. In the next step, the inverted lists of term t are fetched if their upper-bound score is greater than $s_k$. From the inverted lists, the objects with score $\psi(d)$ greater than $s_k$ are returned.
To understand the proposed algorithm, consider the road network presented in Figure 1. Assume that a query q issues a top-1 keyword query with $q.t$ = “Italian Restaurant”. For ease of presentation, we assume $\alpha = 1$, and the textual relevance $\mu$ is the number of occurrences of query keywords in $d.t$ divided by the number of keywords in the document (the description of the data object). For example, $\psi(d_4) = \mu(d_4.t, q.t)/(1 + \lambda(d_4.l, q.l)) = 0.5/8 \approx 0.06$. The algorithm starts the network expansion from the active edge $\overrightarrow{(n_2, n_3)}$, where q is the anchor point. Note that the direction of the edge $\overrightarrow{(n_2, n_3)}$ is from $n_2$ to $n_3$. Therefore, the algorithm explores only $\overrightarrow{(q, n_3)}$. No data object is found in $\overrightarrow{(q, n_3)}$. Then, $n_3$ becomes the anchor point, and edges $(n_3, n_4)$, $(n_3, n_5)$, and $(n_3, n_7)$ are inserted in the min-heap. Next, the candsearch function retrieves the candidate data objects on edges $(n_3, n_4)$, $(n_3, n_5)$, and $(n_3, n_7)$ whose score is better than $s_k$. On edge $(n_3, n_5)$, data object $d_3$ is retrieved with $\psi(d_3) = 0.2$. Data object $d_3$ is inserted in the $D_k$ set, and the value of $s_k$ is set to 0.2. For edges $(n_3, n_4)$ and $(n_3, n_7)$, no candidate object is found because $d_2.t$ (“Cafe”) and $d_7.t$ (“Cafe and Bakery”) do not match $q.t$. The algorithm continues expanding the edges whose upper-bound score is greater than $s_k$. The edge $\overrightarrow{(n_7, n_2)}$ is explored next. The upper-bound score of $\overrightarrow{(n_7, n_2)}$ is 1/7, which is less than $s_k$. Similarly, for edge $\overleftarrow{(n_6, n_5)}$, the upper-bound score is $0.5/8 < s_k$. Therefore, the algorithm terminates and reports $d_3$ as the top-1 result.
Figure 3: Illustration of directed road network (q issues a TkSK query at $p_1$; the server returns a set of objects for $p_1$).
Figure 4: Illustration of directed road network (q issues a TkSK query at $p_2$; the server returns a set of objects for $p_2$).
5. Moving Top-k Spatial Keyword Queries
In this section, we present our method for monitoring moving top-k spatial keyword queries, where query objects are moving in a directed road network. Figure 3 provides an example of a TkSK query in a road network, where query point q issues a TkSK query at point $p_1$. Note that the numbers on the arrows in the figure indicate the order of the steps. To obtain the top-k results at $p_1$, the server executes Algorithm 1, as described in Section 4.2. Now consider that the query object moves to $p_2$, as shown in Figure 4, and must retrieve the top-k results at point $p_2$. The simple method is to repeat the procedure executed at $p_1$. However, recomputing the result whenever query q changes its location significantly increases the computation cost. Furthermore, it also increases the communication overhead because the query object must report its location whenever it moves and the server must send the result set. To address these issues, we introduce the safe exit approach.
In the proposed framework, the server computes safe exit points for a query object. The server maintains a set of moving queries, and the query result remains valid as long as the query object remains inside the region bounded by its safe exit points. Whenever a query object passes one of its safe exit points, the server recomputes the TkSK result and the safe exit points for the query object.
Next, we present our method to compute the safe exit points for a query object. A safe exit point represents a point in a segment where a safe region and a nonsafe region meet. We compute the safe exit point using the divide-and-conquer technique. Before presenting the detailed methodology, we define the terminology used in this section.
Definition 1 (safe region). A portion of a road segment that guarantees that, as long as the query point lies in it, its top-k results remain valid.
Definition 2 (answer objects $D^+$). For a top-1 query, a data object d is called an answer object of query q if $\psi(d) > \psi(d_a)$, where $d_a$ represents any other data object in the directed road network. Similarly, we can generalize this definition for TkSK: a data object d is called an answer object of query q if $\psi(d) > \psi(d_{k+1})$, where $d_{k+1}$ represents the $(k+1)$th data object, in descending order of score, in the directed road network. In other words, all answer objects are top-k results of query q.
Definition 3 (nonanswer objects $D^-$). For a top-1 query, a data object d is called a nonanswer object of query q if $\psi(d) < \psi(d_a)$ for some other data object $d_a$ in the directed road network. Similarly, we can generalize this definition for TkSK: a data object d is called a nonanswer object of query q if $\psi(d) < \psi(d_k)$, where $d_k$ represents the kth data object, in descending order of score, in the directed road network. Because all answer objects are top-k results of query q, none of the nonanswer objects are in the top-k results of query q.
Definition 4 (lowest answer object $d_l^+$). An answer object $d_l^+ \in D^+$ is called the lowest answer object at a point $p \in G$ if $\psi(d_l^+)_p = \min(\psi(d_1^+)_p, \psi(d_2^+)_p, \ldots, \psi(d_{|D^+|}^+)_p)$, where $\psi(d_l^+)_p$ represents the score of the lowest answer object at point p. In other words, $\psi(d_l^+)_p < \psi(d_a^+)_p$ at point p, where $d_a^+$ is any other answer object in the $D^+$ set.
Definition 5 (highest nonanswer object $d_h^-$). A nonanswer object $d_h^- \in D^-$ is called the highest nonanswer object at a point $p \in G$ if $\psi(d_h^-)_p = \max(\psi(d_1^-)_p, \psi(d_2^-)_p, \ldots, \psi(d_{|D^-|}^-)_p)$, where $\psi(d_h^-)_p$ represents the score of the highest nonanswer object at point p. In other words, $\psi(d_h^-)_p > \psi(d_a^-)_p$ at point p, where $d_a^-$ is any other nonanswer object in the $D^-$ set.
As discussed earlier, the main challenge in the continuous processing of moving TkSK queries is to maintain the validity of the result set, because the movement of query objects can invalidate the result set. To monitor the validity of the result set, we propose a safe-region-based approach.
5.1. Computation of Safe Exit Points. In this section, we present our technique to compute the safe exit points. The main goal is to find a point in the road network where the
query result set will change. The result set changes when the score of the highest nonanswer object $d_h^-$ surpasses the score of the lowest answer object $d_l^+$. Generally, the textual relevance score does not change. Therefore, the score of a data object changes only because of its spatial relevance score, which changes only with the movement of the query object. The computation of the safe exit point is based on two key observations.
Observation 1. If $D^+_{n_\beta} = D^+_{p_a}$, there is no safe exit point in the segment.
Explanation. $D^+_{p_a}$ represents the set of answer objects at anchor point $p_a$, whereas $D^+_{n_\beta}$ represents the set of answer objects at boundary node $n_\beta$. As discussed earlier, the safe exit point is the particular point where the query results change. If the query results at the starting node of a segment or edge are the same as those at its ending node, there is no point where the query result changes. Hence, we do not search for a safe exit point in that segment.
Observation 2. If $D^+_{p_a} \neq D^+_{n_\beta}$, there is a safe exit point in the segment.
Explanation. In contrast to Observation 1, if the query results are different at the starting and ending points, then there exists a point where the query results change. Hence, there is a safe exit point in the segment.
To find the safe region, we consider the following cases.
Case 1 (when $\alpha = 1$ and the textual relevance of the highest nonanswer object and the lowest answer object is the same). In this case, both the textual and spatial relevance have the same importance (i.e., $\alpha = 1$). In addition, the top-k result depends only on the spatial relevance because the textual relevance of both objects is the same. The data object that is closer to query point q becomes the answer object. For an undirected edge, the safe exit point $p_{se}$ is the center point between the lowest answer object and the highest nonanswer object, i.e., $\max(dist(p_{se}, d_1^+), dist(p_{se}, d_2^+), \ldots, dist(p_{se}, d_{|D^+|}^+)) = \min(dist(p_{se}, d_1^-), dist(p_{se}, d_2^-), \ldots, dist(p_{se}, d_{|D^-|}^-))$. However, in the case of a directed edge, where $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$, the safe exit point is either $d_l^+$ or $p_a$. If $d_l^+ \in (p_a, n_\beta)$, then the safe exit point is $d_l^+$; otherwise, the safe exit point is $p_a$.
Case 2 (when $\alpha = 1$ and the textual relevance of the highest nonanswer object and the lowest answer object is different). In this case, the top-k result depends on all factors, namely $\alpha$, the spatial relevance, and the textual relevance. Clearly, for undirected edges, the midpoint between the lowest answer object and the highest nonanswer object does not provide a valid safe exit point. Therefore, we introduce the divide-and-conquer technique, which keeps dividing the search space until we reach the point where the score of the nonanswer object becomes greater than that of the answer object. Typically, the safe exit point should be closer to the data object whose score is lower. Based on this observation, we first compute the midpoint in a similar fashion to Case 1, and then we continue dividing the search space until we find the point. For directed edges, the safe exit point can be computed in a similar fashion to Case 1.
Case 2 also applies to other cases in which the safe exit point is not the midpoint between the lowest answer object and the highest nonanswer object. In these cases, the safe exit point depends on two or more factors. Therefore, the safe exit point can be computed using the aforementioned divide-and-conquer technique. The following are the scenarios where the safe exit point can be computed as in Case 2:
(a) When $\alpha \neq 1$ and the textual relevance of the nearest nonanswer object and the farthest answer object is different
(b) When $\alpha \neq 1$ and the textual relevance of the nearest nonanswer object and the farthest answer object is the same
Case 3 (when $\alpha = 0$). This means that the spatial relevance has no effect on the score of data objects. Hence, no monitoring is required for this scenario.
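The divide-and-conquer search of Case 2 can be sketched as a bisection on the offset x of the query from the anchor point $p_a$. The sketch assumes that the scores of the lowest answer object and the highest nonanswer object are available as functions of x and that their order changes at most once within the segment. The names `safe_exit_offset` and `make_score` are hypothetical, and the distances x + 3 and 5 − x are borrowed from the worked example in Section 5.2.

```python
def make_score(mu, dist_fn, alpha=1.0):
    """Score of one object as a function of the query offset x (Equation (1))."""
    return lambda x: mu / (1.0 + alpha * dist_fn(x))


def safe_exit_offset(score_answer, score_nonanswer, length, tol=1e-6):
    """Offset in [0, length] where the nonanswer object overtakes the answer
    object, or None if the result is the same at both ends of the segment."""
    def gap(x):
        return score_answer(x) - score_nonanswer(x)

    if gap(0.0) <= 0.0:
        return 0.0        # result already changes at the anchor point
    if gap(length) > 0.0:
        return None       # same answer at both ends (Observation 1)
    lo, hi = 0.0, length  # invariant: gap(lo) > 0 >= gap(hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


answer = make_score(1.0, lambda x: x + 3.0)         # distance grows with x
same_text = make_score(1.0, lambda x: 5.0 - x)      # Case 1: equal relevance
weaker_text = make_score(0.5, lambda x: 5.0 - x)    # Case 2: lower relevance
irrelevant = make_score(0.1, lambda x: 5.0 - x)     # never overtakes

x_case1 = safe_exit_offset(answer, same_text, 3.0)
x_case2 = safe_exit_offset(answer, weaker_text, 3.0)
x_none = safe_exit_offset(answer, irrelevant, 3.0)
```

With equal textual relevance, the search converges to the midpoint x = 1, as in Case 1. When the nonanswer object has half the textual relevance, the crossing moves to x = 8/3, closer to the lower-scoring object, which is why the midpoint alone is not a valid safe exit point in Case 2.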
Algorithm 3 retrieves the safe exit points using the observations discussed earlier. The core function in this algorithm is ComputeSafeExit($p_a$, $n_\beta$), which finds the safe exit point in a segment between $p_a$ and $n_\beta$. ComputeSafeExit($p_a$, $n_\beta$) is described in detail in Algorithm 4. First, Algorithm 4 determines $d_l^+$ and $d_h^-$ at each point $p \in [p_a, n_\beta]$. Recall that $d_l^+$ is the lowest answer object at p, whereas $d_h^-$ is the highest nonanswer object at p. Algorithm 4 computes the safe exit point based on the cases discussed earlier. There are two further scenarios for each of Cases 1 and 2. For Case 1, if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$, then the safe exit point is the midpoint between $d_l^+$ and $d_h^-$. If $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$, then the edge is directed, and therefore the safe exit point is either $p_a$ or $d_l^+$. If $d_l^+$ lies on the edge $[p_a, n_\beta]$, then $d_l^+$ is the safe exit point. Otherwise, $p_a$ is the safe exit point.
Similarly, for Case 2, if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$, then the safe exit point is computed by repeatedly halving the search space until we find the closest point such that $\psi(d_h^-) > \psi(d_l^+)$. If $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$, the safe exit point is computed in the same way as for the directed edge in Case 1.
5.2. Computation of Safe Exit Points for Example. Consider the same example in Figure 1, where the query point q issues a top-1 keyword query with $q.t$ = “Italian Restaurant”. For this example, let us consider $\alpha = 1$. The monitoring algorithm starts exploring from the active edge containing the query object q. Therefore, $\overrightarrow{(q, n_3)}$ is explored first. As shown in Table 3, for $\overrightarrow{(q, n_3)}$, $D^+_q = \{d_3\}$ and $D^+_{n_3} = \{d_3\}$. According to Observation 1, no safe exit point exists in this segment. Therefore, edges adjacent to $n_3$ are explored, and $n_3$ becomes the new $p_a$. The edge $(n_3, n_4)$ is explored next. Similarly, the answer object at $n_3$ and $n_4$ is the same: $D^+_{n_3} = D^+_{n_4} = \{d_3\}$. Therefore, a safe exit point does not exist in $(n_3, n_4)$. The edge $(n_3, n_7)$ is explored next. As shown in Table 3, $D^+_{n_3} = \{d_3\}$ and $D^+_{n_7} = \{d_6\}$. By Observation 2, there is a safe exit point in $(n_3, n_7)$. As shown in Figure 1, $d_3.t = d_6.t$ = “Italian Restaurant” and $dist(n_3, n_7) = dist(n_7, n_3)$.
Input: Same as Algorithm 1
Output: $P_{SE}$, a set of safe exit points
$P_{SE} \leftarrow \emptyset$ /* set of safe exit points */
$D^+_{p_a} \leftarrow$ EvaluateSnapshotQuery($p_a$, $(p_a, n_\beta)$) /* results calculated using Algorithm 1 */
$D^+_{n_\beta} \leftarrow$ EvaluateSnapshotQuery($n_\beta$, $(p_a, n_\beta)$) /* results calculated using Algorithm 1 */
if $D^+_{p_a} = D^+_{n_\beta}$ then
    no safe exit point /* refer to Observation 1 */
end
if $D^+_{p_a} \neq D^+_{n_\beta}$ then
    $P_{SE} \leftarrow P_{SE} \cup$ ComputeSafeExit($p_a$, $n_\beta$) /* a safe exit point exists; refer to Observation 2 */
end
return $P_{SE}$
Algorithm 3: COSK monitoring algorithm.
(1) Input: Same as Algorithm 1
(2) Output: $p_{se}$, safe exit point in $(p_a, n_\beta)$
(3) $D_l^+ \leftarrow \langle p, d_l^+ \rangle$ for each point $p \in [p_a, n_\beta]$, with $d_l^+$ such that $\psi(d_l^+)_p = \min(\psi(d_1^+)_p, \psi(d_2^+)_p, \ldots, \psi(d_{|D^+|}^+)_p)$
(4) $D_h^- \leftarrow \langle p, d_h^- \rangle$ for each point $p \in [p_a, n_\beta]$, with $d_h^-$ such that $\psi(d_h^-)_p = \max(\psi(d_1^-)_p, \psi(d_2^-)_p, \ldots, \psi(d_{|D^-|}^-)_p)$
(5) if Case 1 then
(6)     if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$ then
(7)         $p_{se} \leftarrow$ point $se$ such that $\max(dist(se, d_1^+), dist(se, d_2^+), \ldots, dist(se, d_{|D^+|}^+)) = \min(dist(se, d_1^-), dist(se, d_2^-), \ldots, dist(se, d_{|D^-|}^-))$
(8)     end
(9)     if $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$ then
(10)         $p_{se} = p_a$ or $p_{se} = d_l^+$, where $d_l^+ \in (p_a, n_\beta)$
(11)     end
(12) end
(13) if Case 2 then
(14)     if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$ then
(15)         $p_{se} \leftarrow$ closest point to $p_a$ such that $\psi(d_h^-) > \psi(d_l^+)$
(16)     end
(17)     if $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$ then
(18)         Same as Line (10)
(19)     end
(20) end
(21) return $p_{se}$
Algorithm 4: ComputeSafeExit($p_a$, $n_\beta$).
Therefore, according to Case 1, the safe exit point $p_{se1}$ is the midpoint between $d_3$ and $d_6$. That is, $dist(p_{se1}, d_3) = dist(p_{se1}, d_6)$, where $dist(p_{se1}, d_3) = x + 3$ and $dist(p_{se1}, d_6) = -x + 5$ for $0 < x < 3$. Consequently, $x = 1$, which means that the distance from $n_3$ to $p_{se1}$ is 1.
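The Case 1 computation above amounts to equating two linear distance functions along the segment. The following minimal Python sketch reproduces it for the example. It assumes $k = 1$, a bidirectional segment, and equal textual relevance for both objects, so that only distance matters. The function name and its parameters are illustrative and are not part of COSK.

```python
from fractions import Fraction

def midpoint_safe_exit(dist_answer_at_pa, dist_rival_at_nb, seg_len):
    """Offset x from p_a at which the answer and the rival are equidistant.

    Assumes a bidirectional segment (p_a, n_b) with integer lengths: the
    answer object is reached through p_a (its distance grows with x) and
    the rival object through n_b (its distance shrinks with x).
    Returns None when the crossing lies outside the open segment.
    """
    # Solve dist_answer_at_pa + x == dist_rival_at_nb + (seg_len - x).
    x = Fraction(dist_rival_at_nb + seg_len - dist_answer_at_pa, 2)
    return x if 0 < x < seg_len else None

# Values from the example: dist(n3, d3) = 3, dist(n7, d6) = 2, len(n3, n7) = 3.
x = midpoint_safe_exit(3, 2, 3)
print(x)  # offset of p_se1 from n3
```

For the example values the sketch returns an offset of 1 from $n_3$, which agrees with the value of $x$ derived in the text.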
Next, we determine a safe exit point in $(n_3, n_5)$. As shown in Table 3, the answer object at $n_5$ is the same as that at $n_3$. Hence, no safe exit point exists in this edge. Next, $\overleftarrow{(n_6, n_5)}$ is explored with $p_a = n_5$. According to Table 3, $D^+_{n_6} = d_4$ and $D^+_{n_5} = d_3$. Therefore, a safe exit point exists in this edge. This edge is directed, and for each point $p \in \overleftarrow{(n_6, n_5)}$ the shortest path from $p$ to $d_3$ is $p \rightarrow n_6 \rightarrow n_2 \rightarrow n_3 \rightarrow d_3$. Therefore, $n_5$ is the safe exit point.
The bold lines in Figure 5 indicate the safe region of q. The top-1 result remains $d_3$ as long as the query q lies in the safe region.
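The walk-through above applies the same test to every explored segment: a safe exit point is computed only when the answer sets at the two endpoints differ (Observations 1 and 2). As a rough illustration, the Python sketch below replays that test on the $k = 1$ answer sets listed in Table 3. The helper name `has_safe_exit` and the string labels are assumptions made for this sketch; the actual position of each exit point would still be obtained with Algorithm 4.

```python
def has_safe_exit(answers_at_pa, answers_at_nb):
    """Observation 1/2 test: an exit point exists only if the answer sets differ."""
    return set(answers_at_pa) != set(answers_at_nb)

# (segment, answer set at p_a, answer set at n_beta) for k = 1, taken from Table 3.
segments = [
    ("(q, n3)",  {"d3"}, {"d3"}),
    ("(n3, n4)", {"d3"}, {"d3"}),
    ("(n3, n7)", {"d3"}, {"d6"}),
    ("(n3, n5)", {"d3"}, {"d3"}),
    ("(n6, n5)", {"d3"}, {"d4"}),
]

# Segments on which ComputeSafeExit would have to be invoked.
needs_exit = [name for name, at_pa, at_nb in segments if has_safe_exit(at_pa, at_nb)]
print(needs_exit)  # ['(n3, n7)', '(n6, n5)']
```

The two segments reported are exactly those for which Table 3 lists $p_{se1}$ and $p_{se2}$.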
Next, we analyze the time complexity of determining a set of safe exit points using the set of qualifying objects $d \in D^+_{p_a} \cup D^+_{n_\beta} \cup D(p_a, n_\beta)$. Note that $D^+_{p_a}$ ($D^+_{n_\beta}$) indicates
Table 3: Computation of safe exit points for the example scenario.

| Edge/Segment | $p_a$ | $D^+_{p_a}$ | $D^+_{n_\beta}$ | $p_{se}$ |
|---|---|---|---|---|
| $\overrightarrow{(q, n_3)}$ | $q$ | $D^+_q = d_3$ | $D^+_{n_3} = d_3$ | none |
| $(n_3, n_4)$ | $n_3$ | $D^+_{n_3} = d_3$ | $D^+_{n_4} = d_3$ | none |
| $(n_3, n_7)$ | $n_3$ | $D^+_{n_3} = d_3$ | $D^+_{n_7} = d_6$ | $p_{se1}$ |
| $\overrightarrow{(n_3, n_5)}$ | $n_3$ | $D^+_{n_3} = d_3$ | $D^+_{n_5} = d_3$ | none |
| $\overleftarrow{(n_6, n_5)}$ | $n_5$ | $D^+_{n_5} = d_3$ | $D^+_{n_6} = d_4$ | $p_{se2}$ |
Figure 5: Illustration of safe region of q.
the set of k data objects that satisfies the query condition at $p_a$ ($n_\beta$). According to Dijkstra's algorithm [26], the time complexity $O(D^+_q)$ of computing the set of answer objects at a query point q is $O(D^+_q) = O(|E| + |N| \log |N|)$. This means that $O(D^+_{p_a}) = O(D^+_{n_\beta}) = O(|E| + |N| \log |N|)$ holds for endpoints $p_a$ and $n_\beta$. Thus, the time complexity $O(\Omega_{kth})$ of determining the skyline $\Omega_{kth}$ with the k-th highest score is $O(\Omega_{kth}) = C_{kth} \cdot O(|D^+_{p_a} \cup D^+_{n_\beta} \cup D(p_a, n_\beta)|)$, where $C_{kth}$ is the number of qualifying objects that participate in the constitution of the skyline with the k-th highest score. Therefore, the time complexity of determining a safe exit point coincides with the time complexity of determining the two skylines, i.e., the skyline $D^+_l$ with the k-th highest (or lowest) score for answer objects and the skyline $D^-_h$ with the highest score for nonanswer objects. This is because the safe exit point is found at the cross point between these skylines.
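The bound above rests on a Dijkstra-style expansion from the query point. The sketch below is a generic Dijkstra search on a directed, weighted graph and is not the snapshot algorithm of this paper. It uses Python's binary heap, which gives $O((|E| + |N|) \log |N|)$; the $O(|E| + |N| \log |N|)$ bound corresponds to a Fibonacci-heap implementation. The toy network is hypothetical and only illustrates that network distance need not be symmetric when edges are directed.

```python
import heapq

def dijkstra(adj, source):
    """Shortest network distances from source in a directed weighted graph.

    adj maps a node to a list of (neighbor, weight) pairs; an undirected
    edge is modeled as two opposite arcs.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Hypothetical toy network (not the network of Figure 1).
adj = {
    "a": [("b", 2)],            # one-way arc a -> b
    "b": [("c", 1), ("a", 5)],  # the direct arc b -> a is more expensive
    "c": [("a", 2)],
}
print(dijkstra(adj, "a")["b"], dijkstra(adj, "b")["a"])  # 2 3
```

In this toy network the distance from a to b is 2, whereas the distance from b to a is 3, which is the asymmetry that a directed road network introduces.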
Figure 6 represents the skyline graph for $k = 1$ in an edge $(n_7, n_3)$. Let us draw the score functions for $d_3$ and $d_6$ for the road segment $(n_7, n_3)$, where a safe exit point exists. This is because $D^+_{n_3} = d_3$ and $D^+_{n_7} = d_6$ for $k = 1$. For each point $p \in (n_7, n_3)$, the distance between $d_3$ and point p can be represented as $dist(d_3, p) = dist(d_3, n_3) + len(n_3, p) = 6 - len(n_7, p)$. Similarly, for each point $p \in (n_7, n_3)$, the distance between $d_6$ and point p can be represented as $dist(d_6, p) = dist(d_6, n_7) + len(n_7, p) = 2 + len(n_7, p)$. Let $len(n_7, p)$ be
Figure 6: Skyline graph for $k = 1$ on the road segment $(n_7, n_3)$ (horizontal axis: distance; vertical axis: score; curves $\psi(d_6) = 1/(x + 3)$ and $\psi(d_3) = 1/(-x + 7)$).
a variable x ($0 \le x \le 3$). We can write $\lambda(d_3, p) = dist(d_3, p) = 6 - x$ and $\lambda(d_6, p) = dist(d_6, p) = 2 + x$. Then, we can represent the score functions $\psi(d_3)$ and $\psi(d_6)$ as follows:

$$\psi(d_3) = \frac{\mu(d_3.t, q.t)}{1 + \alpha \cdot \lambda(d_3, p)} = \frac{1}{7 - x} \quad \text{for } 0 \le x \le 3$$
$$\psi(d_6) = \frac{\mu(d_6.t, q.t)}{1 + \alpha \cdot \lambda(d_6, p)} = \frac{1}{3 + x} \quad \text{for } 0 \le x \le 3$$

Finally, we present the lemma to prove that the safe exit points computed by COSK are correct.
Lemma 8. The COSK algorithm correctly computes a set of safe exit points.
Proof. We prove the correctness of the COSK algorithm by contradiction. Assume that, although $D^+_{p_a} \neq D^+_{n_\beta}$, there is no safe exit point in the road segment $(p_a, n_\beta)$. This means that for each point p in the road segment $(p_a, n_\beta)$, the query result at p equals $D^+_{p_a}$, i.e., $D^+_p = D^+_{p_a}\ \forall p \in (p_a, n_\beta)$. However, this leads to the contradiction that $D^+_{n_\beta} = D^+_{p_a}$ when $p = n_\beta$. Therefore, if $D^+_{p_a} \neq D^+_{n_\beta}$, a safe exit point exists in $(p_a, n_\beta)$. In addition, when $D^+_{p_a} \neq D^+_{n_\beta}$, a safe exit point is determined using the skyline $D^+_l$ for answer objects and the skyline $D^-_h$ with the highest score for nonanswer objects. The first skyline is a composite polyline drawn from the answer objects in $D^+_{p_a}$. The second skyline is a composite polyline drawn from the nonanswer objects in $D^+_{n_\beta} \cup D(p_a, n_\beta) - D^+_{p_a}$.
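The proof relies on the safe exit point lying where the answer skyline and the nonanswer skyline cross. For the $k = 1$ example on the segment $(n_7, n_3)$, that crossing can be located numerically, as in the hedged Python sketch below. It assumes the score form $\psi = \mu / (1 + \alpha \cdot \lambda)$ with $\mu = 1$ and $\alpha = 1$, which is what the closed forms $1/(7 - x)$ and $1/(3 + x)$ imply. The helper names and the bisection routine are illustrative choices, not the procedure used by COSK.

```python
def score(mu, alpha, dist):
    """Ranking score psi = mu / (1 + alpha * dist), as in the example."""
    return mu / (1.0 + alpha * dist)

def crossing(f, g, lo, hi, tol=1e-9):
    """Bisection for the point in [lo, hi] where f(x) == g(x).

    Assumes f - g changes sign at most once on the interval.
    """
    if (f(lo) - g(lo)) * (f(hi) - g(hi)) > 0:
        return None  # no sign change: the same object wins on the whole segment
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(lo) - g(lo)) * (f(mid) - g(mid)) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

# x = len(n7, p); dist(d3, p) = 6 - x and dist(d6, p) = 2 + x on (n7, n3).
psi_d3 = lambda x: score(1.0, 1.0, 6 - x)
psi_d6 = lambda x: score(1.0, 1.0, 2 + x)
x_se = crossing(psi_d3, psi_d6, 0.0, 3.0)
print(round(x_se, 6))  # distance of the crossing from n7
```

The crossing is found at a distance of about 2 from $n_7$, i.e., about 1 from $n_3$ on a segment of length 3, matching the position of $p_{se1}$ obtained earlier from the distance equations.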
6. Monitoring Query Results and Safe Regions in Dynamic Directed Road Networks
In this section, we discuss the monitoring of spatial keyword queries in dynamic road networks, where the network distance changes depending on the traffic conditions. Updates to the weights of some edges may invalidate the query results or the safe region of q even though the query object q remains within its safe region. Figure 7 illustrates an example of changing the weights of edges $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$. For convenience, we consider $\alpha = 1$ and $q.t$ = "Italian restaurant". In Figure 7(a), the top-1 result is $d_1$ and the bold lines show the safe region of query q. Now consider that at time $t_j$ the weights of the two edges $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$ change due to heavy traffic conditions, as shown in Figure 7(b). The update in the weights of edges may invalidate the query result or the safe region of q. Therefore, it is necessary to monitor the validity of the results and the safe region when such changes occur.
Next, we introduce a monitoring region to monitor the validity of the safe region effectively when the weight of an edge is changed. The monitoring region MR contains all the points between the query point q and the lowest-scoring answer object and the highest-scoring nonanswer object. Formally, it is defined as $MR = dist(q, D^+_l) \cup dist(q, D^-_h)$, where $dist(q, D^+_l)$ is the distance between q and the lowest-scoring answer object and $dist(q, D^-_h)$ is the distance between q and the highest-scoring nonanswer object. In the given example, $D^+_l = d_1$ and $D^-_h = d_2, d_3$. Therefore, the dotted lines in Figure 8(a) show the monitoring region of query object q.
Now, at time $t_j$, the updates to edges $\overleftarrow{(n_1, n_6)}$ and $\overleftarrow{(n_1, d_1)}$, which are not part of the monitoring region, can safely be ignored. However, the update on segment $\overrightarrow{(n_2, d_1)}$, which is associated with the monitoring region, may invalidate the results. As shown in Figure 8(b), after the update the top-1 result becomes $d_2$, and the bold lines represent the new safe region of q.
Algorithm 5 monitors the validity of the result set and the safe region of query object q when the weight of any edge changes. Suppose that the weight of edge $(n_i, n_j)$ changes at time $t_j$. First, the algorithm checks whether edge $(n_i, n_j)$ is associated with the monitoring region. If it is not part of the monitoring region, the algorithm simply ignores the update to edge $(n_i, n_j)$, and the query results and safe region remain valid. In contrast, if the edge is associated with the monitoring region (i.e., $MR \cap (n_i, n_j) \neq \emptyset$), the algorithm reevaluates the query results. Consequently, the top-k results and the safe region of query q are updated. Finally, the algorithm updates the monitoring region of q.
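The filtering step described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the monitoring region is approximated as a set of whole edges (the paper's MR may cover partial edges), `recompute` is a hypothetical callback standing in for the snapshot evaluation, safe exit computation, and monitoring region update, and the edge labels below are made up for the demonstration.

```python
def on_edge_update(edge, monitoring_region, recompute):
    """Handle a weight change on `edge`.

    monitoring_region: set of edges (node pairs) assumed to make up MR.
    recompute: callback returning the refreshed
               (top-k results, safe exit points, monitoring region).
    Returns None when the update can be ignored.
    """
    if edge not in monitoring_region:
        return None  # update lies outside MR: results and safe region stay valid
    return recompute()

# Hypothetical monitoring region and recomputation stub.
mr = {("q", "n2"), ("n2", "d1"), ("n2", "n3")}
calls = []

def recompute():
    calls.append("recomputed")
    return (["d2"], [], mr)

ignored = on_edge_update(("n1", "n6"), mr, recompute)
refreshed = on_edge_update(("n2", "d1"), mr, recompute)
print(ignored, refreshed[0], len(calls))  # None ['d2'] 1
```

Only the second update triggers a recomputation, which is the source of the savings over an approach that reevaluates the query on every weight change.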
7. Performance Evaluation
In this section, we evaluate the performance of COSK through simulation experiments. We describe our experimental settings in Section 7.1, and we present our experimental results for static and dynamic road networks in Sections 7.2 and 7.3, respectively.
7.1. Experimental Settings. All of our experiments were performed using real road networks, namely, Oldenburg, San Francisco, and San Joaquin. All three road networks were obtained from [27]. The original road network of San Francisco had 21,047 nodes and 21,692 edges. We reformatted the network, pruned approximately 30% of the nodes, and adjusted the edges and their weights accordingly. This resulted in a network with 14,732 nodes and 14,316 edges. Both the direction of edges and the data objects on the edges were generated randomly. The description of each data object was extracted from Twitter messages [28], and we assigned one tweet per data object. Table 4 presents the characteristics of the data sets used in the experimental evaluation. We simulated moving query objects by using a spatiotemporal data generator [29]. The input to the generator was the road network of the data set used, and the output was the set of query objects moving on the road network. Each experiment had 100 moving queries, which were continuously monitored for 100 timestamps (1 timestamp = 1 second), and the average result was reported in the experiments.
As a benchmark for COSK in static road networks, we implemented the algorithm of [22], which also continuously monitors moving top-k spatial keyword queries in road networks. However, this algorithm was originally designed for undirected road networks. To make a fair comparison, we modified it to process top-k spatial keyword queries in directed road networks and refer to the modified version as CMTkSK+. Specifically, we modified the distance computation method between two points such that, in directed road networks, $dist(p_1, p_2) \neq dist(p_2, p_1)$. Since CMTkSK+ does not handle top-k spatial queries in dynamic road networks, we compared the performance of COSK with a basic algorithm that recomputes the results whenever the query object changes its location. All algorithms were implemented in Java and were executed on a desktop PC (2.80-GHz Intel Core i5) with
Figure 7: Updating the weight of edges in a dynamic road network, where $t_i < t_j$. (a) Safe region at time $t_i$. (b) Updating the weights of $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$ at time $t_j$.
Figure 8: Monitoring region and updated safe region at time $t_j$. (a) Monitoring region at time $t_i$. (b) New safe region at time $t_j$.
(1) Input: monitoring region MR, updated edge $(n_i, n_j)$
(2) Output: none
(3) if $MR \cap (n_i, n_j) = \emptyset$ then
(4)     /* edge $(n_i, n_j)$ is not part of monitoring region */
(5)     ignore the change in the weight of edge $(n_i, n_j)$
(6) else
(7)     $P_{SE} \leftarrow \emptyset$ /* set of safe exit points */
(8)     $D_k^{upd} \leftarrow EvaluateSnapshotQuery(n_i, e_i)$ /* update set of top-k results */
(9)     $P_{SE}^{upd} \leftarrow ComputeSafeExit(p_a, n_\beta)$ /* update safe exit points */
(10)    $MR^{upd} \leftarrow ComputeMonitoringRegion(D^+_l, D^-_h)$ /* update monitoring region */
(11) end
Algorithm 5: MonitoringSafeRegion(MR, $(n_i, n_j)$).
Table 4: Summary of datasets.

| Attribute | Oldenburg | San Francisco | San Joaquin |
|---|---|---|---|
| Total no. of nodes | 6,104 | 14,732 | 18,262 |
| Total no. of edges | 7,034 | 14,316 | 23,876 |
| Percentage of directed edges | 30% | 30% | 30% |
| Total no. of objects | 5,627 | 11,453 | 19,098 |
| Average no. of objects per edge | 0.8 | 0.8 | 0.8 |
| Total no. of words | 49,517 | 103,649 | 166,153 |
Table 5: Experimental parameter settings.

| Parameter | Range |
|---|---|
| Number of results (k) | 5, 10, 15, 20, 25 |
| Number of keywords (n) | 1, 2, 3, 4, 5 |
| Query parameter ($\alpha$) | 0.01, 0.1, 1, 10, 100 |
| Dataset | Oldenburg, San Francisco, San Joaquin |
| Number of data objects ($N_D$) | 10, 20, 30, 40, 50 (×1000) |
| Speed of query objects ($V_{qry}$) | 25, 50, 75, 100, 125 (km/h) |
| Mobility ($M_{qry}$) | 20, 40, 60, 80, 100 (%) |
| Ratio of directed edges ($E_{dir}$) | 10, 20, 30, 40, 50 (%) |
| Ratio of updated edges ($E_{upd}$) | 15, 30, 60, 80, 100 (%) |
8 GB of memory. In the experiments, we compared (1) query processing times, (2) edges processed, i.e., the number of edges processed for retrieving query results, and (3) index sizes. Table 5 summarizes the parameters used in the experiments. In each experiment, we varied a single parameter within the range shown in Table 5 while maintaining the other parameters at the bolded default values.
We evaluated the performance of the algorithms by using the following measures: (1) the total amount of server CPU time, which indicates the query processing time, and (2) the total communication cost, measured as the total number of points (i.e., the location updates sent by query objects and the query results and safe exit points returned by the server) transferred between clients and the server. The battery power and wireless bandwidth consumption typically increase with the amount of data transferred between objects (clients) and servers. Thus, we used the amount of transferred data as a metric to evaluate the communication cost.
7.2. Experimental Results of Top-k Spatial Keyword Queries in Static Road Networks
7.2.1. Effect of k. Figure 9 indicates the effect of the number of results on the query processing time and communication cost for both algorithms. Figure 9(a) indicates that the query processing time increases for both algorithms as the value of k increases. This is expected because, with an increase in k, more data objects must be explored and verified. Nevertheless, COSK significantly outperforms CMTkSK+ for two main reasons. First, the search for relevant objects is very efficient when using the highest significance factor; second, COSK does not need to verify the set of answer objects as long as the query object lies in a safe region. On the other hand, the CMTkSK+ query processing time increases significantly because it has to monitor and verify the set of candidate objects periodically. In Figure 9(b), the communication costs for both algorithms increase as the number of requested objects increases. However, the proposed algorithm demonstrates superior performance compared to CMTkSK+ because client-server communication is not required while the query object lies within the safe exit points, whereas in CMTkSK+ the query object is required to report its location to the server whenever it moves.
7.2.2. Effect of $N_D$. This experiment was conducted on the San Joaquin dataset. This dataset included 19,098 data objects; therefore, we randomly generated approximately 30,000 additional data objects on different edges. In Figure 10, we evaluate the performance of COSK and CMTkSK+ by varying the cardinality of the data objects. Note that $N_D = 10K$ corresponds to a low density of data points, while $N_D = 50K$ corresponds to a high density. In Figure 10(a), it is interesting to notice that the query processing times of both algorithms decrease as the cardinality of the data objects increases. For CMTkSK+, this is because with high density the monitoring range of a query decreases. For COSK, it is mainly because, when the data density is high, fewer edges need to be expanded, which decreases the query processing time. In Figure 10(b), we study the influence of the cardinality of the data objects on the communication costs. The experimental results indicate that CMTkSK+ incurs almost constant communication costs regardless of the data object cardinality. However, the communication costs of COSK increase in proportion to the $N_D$ value. This is expected because the safe region becomes smaller as the density of the data objects increases, which increases the communication costs.
7.2.3. Effect of Query Keywords (n). Figure 11 shows the query processing time and communication cost for COSK and CMTkSK+ as a function of the number of query keywords. Figures 11(a) and 11(b) show that the performance of both algorithms degrades when the number of keywords increases. This is mainly because, by increasing the number of query keywords, the number of relevant objects may also increase, resulting in a higher query processing time and communication cost. However, the safe-region-based algorithm COSK scales better than CMTkSK+ because of its less expensive monitoring technique.
7.2.4. Effect of $\alpha$. Figure 12 demonstrates the impact of the query parameter $\alpha$ on the query processing time and on the communication cost. A small value of $\alpha$ indicates a greater importance of textual relevance, whereas a high value of $\alpha$ gives more preference to the spatial relevance. It is interesting to note that the query processing time is lower for higher
Figure 9: Effect of k on query processing time and communication cost (COSK versus CMTkSK+). (a) Query processing time (s). (b) Communication cost (number of messages transferred).
Figure 10: Effect of $N_D$ on query processing time and communication cost (COSK versus CMTkSK+). (a) Query processing time (s). (b) Communication cost (number of messages transferred).
values of $\alpha$, which indicates more importance to the spatial relevance. This is mainly because, when the spatial relevance is higher, fewer edges and objects need to be explored and processed to determine the top-k data objects. Observe that in Figure 12(b) the number of messages sent by COSK decreases sharply with an increase in $\alpha$.

7.2.5. Effect of Speed. Figure 13(a) demonstrates the influence of the speed of the query objects on the query processing time of the COSK and CMTkSK+ algorithms. The experimental results indicate that the performance of CMTkSK+ is not significantly influenced by the speed of the query objects because the candidate objects must be continuously monitored at regular intervals regardless of the speed. On the other hand, for COSK the performance gradually decreases as the speed of the query objects increases because the objects leave their respective safe regions more frequently. Figure 13(b) shows the communication costs of COSK and CMTkSK+ with respect to the speed of the query objects. CMTkSK+ incurs almost constant communication costs because a server-initiated request to verify the candidate objects does not depend on the speed. For COSK, the query objects cross safe regions more frequently when the speed is high, which increases the communication costs.
Figure 11: Effect of number of keywords on query processing time and communication cost (COSK versus CMTkSK+). (a) Query processing time (s). (b) Communication cost (number of messages transferred).
Figure 12: Effect of $\alpha$ on query processing time and communication cost (COSK versus CMTkSK+). (a) Query processing time (s). (b) Communication cost (number of messages transferred).
7.2.6. Effect of Mobility. Figure 14 shows the effect of mobility $M_{qry}$ (mobility refers to the percentage of query objects that are moving at any timestamp) on the performance of the COSK and CMTkSK+ algorithms. As expected, the query processing time and communication costs for both algorithms increase with $M_{qry}$. Nevertheless, COSK performs better than CMTkSK+ in terms of query processing time and communication costs.
7.2.7. Effect of Directed Edges. Figure 15 shows the impact of the percentage of directed edges $E_{dir}$ on the performance of the COSK and CMTkSK+ algorithms. The query processing time increases with $E_{dir}$ because the algorithms need to explore more edges to retrieve the top-k results. However, the communication cost is not significantly affected by the value of $E_{dir}$ for either algorithm.
7.2.8. Effect of Datasets. Figure 16 demonstrates the index sizes of the COSK and CMTkSK+ approaches for different datasets. As shown in Figure 16, both algorithms have similar index sizes. However, COSK has a minor space overhead because it stores additional information on the highest significance factor $\theta_t$ of edges. More importantly, this space overhead is minimal compared to the gain achieved by COSK in query processing time and communication costs.
Figure 13: Effect of speed on query processing time and communication cost (COSK versus CMTkSK+). (a) Query processing time (s). (b) Communication cost (number of messages transferred).
Figure 14: Effect of mobility on query processing time and communication cost (COSK versus CMTkSK+). (a) Query processing time (s). (b) Communication cost (number of messages transferred).
7.3. Experimental Results of Top-k Spatial Keyword Queries in Dynamic Road Networks. In this section, we evaluate the performance of COSK and the basic algorithm for dynamic road networks. $E_{upd}$ indicates the percentage of all edges that change their weight at each timestamp. The length of an updated edge is randomly selected between 0.1 and 10 times the original length. Figure 17(a) depicts the query processing time of COSK and the basic algorithm. It is evident from the figure that the query processing time of the basic algorithm is not significantly affected by $E_{upd}$. This is mainly because the query objects issue top-k spatial queries at each timestamp. However, the query processing time of COSK increases with the value of $E_{upd}$ because the probability that an updated edge is associated with the monitoring region of query q increases with $E_{upd}$. Therefore, when $E_{upd}$ becomes large, the results need to be updated frequently, which increases the query processing time. Figure 17(b) shows the communication costs of COSK and the basic algorithm with respect to $E_{upd}$. The basic algorithm incurs almost constant communication costs regardless of the value of $E_{upd}$. In contrast, the communication cost of COSK increases with $E_{upd}$ because the query result and safe regions need to be updated frequently.
Figure 15: Effect of $E_{dir}$ on query processing time and communication cost (COSK versus CMTkSK+). (a) Query processing time (s). (b) Communication cost (number of messages transferred).
Figure 16: Effect of dataset on index size (index size in MB for Oldenburg, San Francisco, and San Joaquin; COSK versus CMTkSK+).
8. Conclusion
In this paper, we investigated moving top-k spatial keyword queries in directed and dynamic road networks. We presented an efficient indexing framework using inverted files that indexes the data objects on edges, allowing for the effective searching of data objects relevant to queries in terms of both textual and spatial relevance. We also presented a safe-exit-based algorithm called COSK to monitor moving top-k spatial keyword queries. We demonstrated that the query results remain valid as long as the query object resides within a safe region. Furthermore, COSK can effectively monitor the validity of query results and safe regions in dynamic road networks. Finally, an experimental evaluation conducted on real road networks demonstrated that COSK significantly reduced the query processing time and communication costs compared to the CMTkSK+ algorithm.
Data Availability
The real road network data used in this study are also used in many previous studies. The road network data is cited in the manuscript, and it is available at https://www.cs.utah.edu/~lifeifei/SpatialDataset.htm. To simulate the moving queries, the authors used a spatiotemporal data generator, which is also used in previous studies. The research article describing the generator is cited in the manuscript. The documentation and source
Figure 17: Effect of $E_{upd}$ on query processing time and communication cost (COSK versus Basic). (a) Query processing time (s). (b) Communication cost (number of messages transferred).
files of the generator are available at https://iapg.jade-hs.de/personen/brinkhoff/generator. The authors used Twitter tweets to generate the descriptions of data objects and the query keywords. The tweets used are accessible at http://followthehashtag.com/datasets/free-twitter-dataset-usa-200000-free-usa-tweets.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
Hyung-Ju Cho was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean Government (MSIP) (NRF-2016R1A2B4009793), and this research was partially supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2016R1D1A1B03934129).
References
[1] D. Papadias, N. Mamoulis, J. Zhang, and Y. Tao, "Query processing in spatial network databases," in Proceedings of the 29th International Conference on Very Large Data Bases (VLDB '03), pp. 802–813, September 2003.
[2] H.-J. Cho, K. Ryu, and T.-S. Chung, "An efficient algorithm for computing safe exit points of moving range queries in directed road networks," Information Systems, vol. 41, pp. 1–19, 2014.
[3] G. Tsatsanifos and A. Vlachou, "On processing Top-k spatio-textual preference queries," in Proceedings of the 18th International Conference on Extending Database Technology (EDBT '15), pp. 433–444, March 2015.
[4] R. Li, A. X. Liu, A. L. Wang, and B. Bruhadeshwar, "Fast range query processing with strong privacy protection for cloud computing," Proceedings of the VLDB Endowment, vol. 7, no. 14, pp. 1953–1964, 2014.
[5] G. Cong, C. S. Jensen, and D. Wu, "Efficient retrieval of the Top-k most relevant spatial web objects," Proceedings of the VLDB Endowment, vol. 2, no. 1, pp. 337–348, 2009.
[6] Z. Li, K. C. K. Lee, B. Zheng, W.-C. Lee, D. Lee, and X. Wang, "IR-tree: An efficient index for geographic document search," IEEE Transactions on Knowledge and Data Engineering, vol. 23, no. 4, pp. 585–599, 2011.
[7] Y. Zhou, X. Xie, C. Wang, Y. Gong, and W. Ma, "Hybrid index structures for location-based web search," in Proceedings of the 14th ACM International Conference on Information and Knowledge Management, pp. 155–162, Bremen, Germany, October 2005.
[8] J. Zobel and A. Moffat, "Inverted files for text search engines," ACM Computing Surveys, vol. 38, no. 2, 2006.
[9] N. Beckmann, H. Kriegel, R. Schneider, and B. Seeger, "R*-tree: an efficient and robust access method for points and rectangles," in Proceedings of the ACM SIGMOD International Conference on Management of Data, vol. 19, pp. 322–331, May 1990.
[10] R. Hariharan, B. Hore, C. Li, and S. Mehrotra, "Processing spatial-keyword (sk) queries in geographic information retrieval (gir) systems," in Proceedings of the 19th International Conference on Scientific and Statistical Database Management (SSDBM '07), July 2007.
[11] I. De Felipe, V. Hristidis, and N. Rishe, "Keyword search on spatial databases," in Proceedings of the 24th International Conference on Data Engineering (ICDE '08), pp. 656–665, April 2008.
[12] J. B. Rocha-Junior, O. Gkorgkas, S. Jonassen, and K. Nørvåg, "Efficient processing of top-k spatial keyword queries," in Proceedings of the International Symposium on Spatial and Temporal Databases, pp. 205–222, Springer, 2011.
[13] D. Zhang, K.-L. Tan, and A. K. Tung, "Scalable top-k spatial keyword search," in Proceedings of the 16th International Conference on Extending Database Technology, pp. 359–370, 2013.
Wireless Communications and Mobile Computing 19
[14] J. B. Rocha-Junior and K. Nørvåg, "Top-k spatial keyword queries on road networks," in Proceedings of the 15th International Conference on Extending Database Technology, pp. 168–179, Berlin, Germany, March 2012.
[15] H.-J. Cho, S. J. Kwon, and T.-S. Chung, "A safe exit algorithm for continuous nearest neighbor monitoring in road networks," Mobile Information Systems, vol. 9, no. 1, pp. 37–53, 2013.
[16] D. Yung, M. L. Yiu, and E. Lo, "A safe-exit approach for efficient network-based moving range queries," Data & Knowledge Engineering, vol. 72, pp. 126–147, 2012.
[17] M. Attique, H. Cho, R. Jin, and T. Chung, "Efficient processing of continuous reverse k nearest neighbor on moving objects in road networks," ISPRS International Journal of Geo-Information, vol. 5, no. 12, p. 247, 2016.
[18] H. G. Elmongui, M. F. Mokbel, and W. G. Aref, "Continuous aggregate nearest neighbor queries," GeoInformatica, vol. 17, no. 1, pp. 63–95, 2013.
[19] D. Wu, M. L. Yiu, C. S. Jensen, and G. Cong, "Efficient continuously moving top-k spatial keyword query processing," in Proceedings of the IEEE International Conference on Data Engineering (ICDE '11), pp. 541–552, Hannover, Germany, April 2011.
[20] W. Huang, G. Li, K.-L. Tan, and J. Feng, "Efficient safe-region construction for moving top-k spatial keyword queries," in Proceedings of the 21st ACM International Conference on Information and Knowledge Management, pp. 932–941, 2012.
[21] L. Guo, J. Shao, H. H. Aung, and K.-L. Tan, "Efficient continuous top-k spatial keyword queries on road networks," GeoInformatica, vol. 19, no. 1, pp. 29–60, 2014.
[22] Y. Li, G. Li, L. Shu, Q. Huang, and H. Jiang, "Continuous monitoring of top-k spatial keyword queries in road networks," Journal of Information Science and Engineering, vol. 31, no. 6, pp. 1831–1848, 2015.
[23] M. Attique, A. Khan, and T.-S. Chung, "ESPAK: Top-k spatial keyword query processing in directed road networks," in Proceedings of the Workshops of the International Conference on Extending Database Technology and the International Conference on Database Theory (EDBT/ICDT '17), March 2017.
[24] G. Salton and C. Buckley, "Term-weighting approaches in automatic text retrieval," Information Processing & Management, vol. 24, no. 5, pp. 513–523, 1988.
[25] V. N. Anh, O. de Kretser, and A. Moffat, "Vector-space ranking with effective early termination," in Proceedings of the 24th Annual International ACM SIGIR Conference, pp. 35–42, New Orleans, LA, USA, 2001.
[26] E. W. Dijkstra, "A note on two problems in connexion with graphs," Numerische Mathematik, vol. 1, pp. 269–271, 1959.
[27] "Real datasets for spatial databases," https://www.cs.utah.edu/~lifeifei/SpatialDataset.htm.
[28] "Twitter," https://twitter.com.
[29] T. Brinkhoff, "A framework for generating network-based moving objects," GeoInformatica, vol. 6, no. 2, pp. 153–180, 2002.
of one or more segments. An edge is also considered a segment, where the nodes are the end points of the edge. The weight of a segment $s = (p_1, p_2)$ is denoted by $W(s)$.

3.2. Problem Formulation. Similar to previous studies [5, 14, 23], we assume that each data object $d \in D$ has a point location $d.l$ in the road network and a text description $d.t$. Given a query location $q.l$, a set of keywords $q.t$, and the number $k$ of data objects to return, the top-k spatial keyword query is defined as $Q_k = (q.l, q.t, k)$. It returns the best $k$ data objects from $D$ according to a score that considers both spatial proximity and textual relevance. The score $\psi(d)$ of a data object $d$ is defined by the following equation:
$$\psi(d) = \frac{\mu(d.t, q.t)}{1 + \alpha \cdot \lambda(d.l, q.l)} \tag{1}$$
where $\lambda(d.l, q.l)$ is the spatial relevance between $d.l$ and $q.l$, $\mu(d.t, q.t)$ is the textual relevance between $d.t$ and $q.t$, and $\alpha$ is a nonnegative real number that determines the importance of one measure over the other. For example, if only textual relevance is considered, then $\alpha = 0$. If more importance is given to spatial relevance, then $\alpha > 1$.
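As a rough illustration of equation (1), the following Python sketch computes the score from a textual relevance and a network distance that are assumed to be already known. The function name `score` and its parameter names are hypothetical and do not come from the paper.

```python
def score(text_relevance: float, network_distance: float, alpha: float) -> float:
    """Ranking score of equation (1): mu / (1 + alpha * lambda).

    text_relevance   -- mu(d.t, q.t), assumed to lie in [0, 1]
    network_distance -- lambda(d.l, q.l), the shortest network distance
    alpha            -- weight of spatial relevance (alpha = 0 ignores distance)
    """
    if alpha < 0 or network_distance < 0:
        raise ValueError("alpha and network_distance must be nonnegative")
    return text_relevance / (1.0 + alpha * network_distance)
```

With `alpha = 0` the score reduces to the textual relevance alone, and larger values of `alpha` penalize distant objects more strongly.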
Spatial relevance ($\lambda$) is defined as the shortest network distance between data object $d$ and query $q$: $\lambda(d.l, q.l) = \mathit{dist}(d.l, q.l)$. Thus, $\mathit{dist}(d_i.l, q.l) < \mathit{dist}(d_j.l, q.l)$ indicates that data object $d_i$ is more spatially relevant to $q$ than data object $d_j$. The textual relevance ($\mu$) can be computed using any popular information retrieval model, such as cosine similarity or the language model. In this study, we use the cosine similarity between $d.t$ and $q.t$. The textual relevance is defined as follows:
$$\mu(d.t, q.t) = \frac{\sum_{t \in q.t} w_t(d.t)\, w_t(q.t)}{\sqrt{\sum_{t \in d.t} [w_t(d.t)]^2 \cdot \sum_{t \in q.t} [w_t(q.t)]^2}} \tag{2}$$
The weight $w_t(d.t) = 1 + \ln(f_t(d.t))$, where $f_t(d.t)$ represents the frequency of term $t$ in $d.t$. The weight $w_t(q.t) = \ln(1 + |D|/\mathit{df}_t)$, where $|D|$ is the number of objects in $D$ and $\mathit{df}_t$ is the document frequency of term $t$. A higher $\mu$ means a higher textual relevance to the query keywords. We use a variation of cosine similarity based on the significance factor $\theta_t(n)$ of term $t$ in a document $n$, where $n$ represents the description of a data object $d.t$ or the query keywords $q.t$. The significance $\theta_t(n) = w_t(n) / \sqrt{\sum_{t \in n} (w_t(n))^2}$ is the weight of the term normalized by the length of the document [24, 25]. Hence, the textual relevance $\mu(d.t, q.t)$ can be rewritten as
$$\mu(d.t, q.t) = \sum_{t \in q.t} \theta_t(d.t)\, \theta_t(q.t) \tag{3}$$
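The weighting scheme behind equations (2) and (3) can be sketched in Python as follows. This is only an illustration under stated assumptions: descriptions and queries are given as lists of already tokenized terms, the document frequencies are supplied in a dictionary, query terms that occur in no object are ignored, and all function names are hypothetical.

```python
import math
from collections import Counter


def doc_weights(terms):
    """w_t(d.t) = 1 + ln(f_t(d.t)) for every term of a description."""
    return {t: 1.0 + math.log(f) for t, f in Counter(terms).items()}


def query_weights(query_terms, doc_freq, num_objects):
    """w_t(q.t) = ln(1 + |D| / df_t); terms with df_t = 0 are skipped."""
    return {
        t: math.log(1.0 + num_objects / doc_freq[t])
        for t in set(query_terms)
        if doc_freq.get(t, 0) > 0
    }


def significance(weights):
    """theta_t(n): each weight divided by the Euclidean norm of all weights."""
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {t: w / norm for t, w in weights.items()} if norm > 0 else {}


def textual_relevance(doc_terms, query_terms, doc_freq, num_objects):
    """Equation (3): sum over query terms of theta_t(d.t) * theta_t(q.t)."""
    theta_d = significance(doc_weights(doc_terms))
    theta_q = significance(query_weights(query_terms, doc_freq, num_objects))
    return sum(theta_d.get(t, 0.0) * theta_q[t] for t in theta_q)
```

Because both significance vectors have unit length, the result lies in $[0, 1]$ and equals 1 only when the description and the query have the same normalized term weights.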
4. Query Processing System
In this section, we present the proposed query processing system, which indexes the data objects and prunes irrelevant edges for efficient query processing. In Section 4.1, we discuss the indexing framework, and in Section 4.2, we present an efficient keyword query processing algorithm for snapshot queries.
4.1. Indexing Framework. In this study, our main focus is on moving queries in directed and dynamic road networks. We use a method similar to the enhanced technique presented in [12] as our basic framework for processing snapshot queries in directed and dynamic road networks. The indexing framework combines a road network framework [1] for storing spatial information with an inverted file for indexing data objects. For easy traversal of the network, we store the adjacent nodes of each node together with the node id ($\mathit{nid}$), the edge id ($\mathit{eid}$), the direction of the edge, and the weight of the edge. The indexing framework consists of two main components: a pruning component and an inverted file component. Figure 2 illustrates the main components of the indexing framework. The pruning component first prunes the edges that contain only data objects irrelevant to the query keywords. To achieve this, we introduce the highest significance $\theta_t^+$ of a given term $t$ in the descriptions of the objects lying on an edge. The $\theta_t^+$ of an edge is retrieved by a key composed of a pair of edge id and term id, $(\mathit{eid}, \mathit{tid})$. The value $\theta_t^+$ is an upper bound on the significance of term $t$ for any object lying on the edge with $t$ in its description. The inverted list of a term $t$ on an edge is accessed only if the upper-bound score, composed of $\theta_t^+$ and the minimum network distance between the starting node of the edge and query $q$, indicates that the edge may contain a candidate data object. Accordingly, the edges with upper-bound scores smaller than the score of the $k$-th object found so far are pruned.
We implement an inverted file for indexing data objects. The inverted file contains a vocabulary and inverted lists. The vocabulary keeps general information about each term (such as the frequency of the term), which is helpful in computing the textual relevance of the data objects. An inverted list stores the data objects located on the edge $\overrightarrow{n_s n_e}$ that have a term $t$ in their description. An inverted list is identified by a key composed of $(\mathit{eid}, \mathit{tid})$. Each inverted file is a set of inverted lists. A separate inverted list is used for each term in the object description. An inverted list stores two attributes for each data object $d_i$: first, the distance $\mathit{dist}(n_s, d_i)$ between the starting node and the data object; second, the significance factor $\theta(t_i, d_i)$ of the term $t_i$ in the description of the data object. Note that the network distance between two points in a directed road network is not symmetrical (i.e., $\mathit{dist}(n_s, d_i) \neq \mathit{dist}(d_i, n_s)$). Recall that the starting node is chosen according to the orientation of the edge, such that the direction of the edge is from the node toward the data object. In Figure 1, $n_3$ is the starting node for $d_7$. For bidirectional edges, either of the adjacent nodes can act as the starting node.
The proposed indexing scheme has three main advantages. First, the search for objects relevant to the query keywords is very efficient using the $(\mathit{eid}, \mathit{tid})$ pair. Second, the inverted files also store the network distance between the starting node and the data object, which helps in accessing the data object in the directed road network. Finally, the pruning technique allows for faster query processing by exploring fewer edges.
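One possible in-memory layout for such an index is sketched below in Python. It is an assumption-laden illustration rather than the authors' storage format: the class name `EdgeTermIndex`, its methods, and the use of plain dictionaries keyed by $(\mathit{eid}, \mathit{tid})$ are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class EdgeTermIndex:
    """Inverted lists and upper-bound significances keyed by (eid, tid)."""

    # (eid, tid) -> list of (distance from starting node, object id, theta)
    lists: dict = field(default_factory=lambda: defaultdict(list))
    # (eid, tid) -> highest significance theta_t^+ of the term on that edge
    theta_max: dict = field(default_factory=dict)

    def add(self, eid, tid, obj_id, dist_from_start, theta):
        """Register an object lying on edge `eid` whose description has `tid`."""
        key = (eid, tid)
        self.lists[key].append((dist_from_start, obj_id, theta))
        self.theta_max[key] = max(theta, self.theta_max.get(key, 0.0))

    def upper_bound(self, eid, tid):
        """theta_t^+ for the edge, or 0.0 if no object on it contains the term."""
        return self.theta_max.get((eid, tid), 0.0)

    def postings(self, eid, tid):
        """Inverted list for the key, ordered by distance from the starting node."""
        return sorted(self.lists.get((eid, tid), []))
```

Keeping $\theta_t^+$ in a separate dictionary mirrors the idea that the upper bound can be checked before the inverted list itself is read.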
Table 2 presents the notations used in this study.
4.2. Query Processing Algorithm. Our algorithm traverses the road network incrementally, in a similar fashion to Dijkstra's
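[Figure 2: Indexing framework. Pruning component: (1) compute the upper-bound score using $\mathit{dist}(n, q)$ and $\theta_t^+$, looked up by $\langle \mathit{eid}, \mathit{tid} \rangle$; (2) access the inverted list of a term only if the upper-bound score is greater than the score of the $k$-th object. Inverted file component: a vocabulary ($\mathit{tid}$, $\mathit{df}_{\mathit{tid}}$) and inverted lists keyed by $\langle \mathit{eid}, \mathit{tid} \rangle$, with entries $d_i$, $\mathit{dist}(n_s, d_i)$, and $\theta(d, t)$.]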
Table 2: Summary of notations used in this paper.

| Notation | Definition |
| --- | --- |
| $G = (N, E, W)$ | Graph model of road network |
| $\mathit{dist}(p_s, p_e)$ | Length of shortest path from $p_s$ to $p_e$, where $p_s$ and $p_e$ represent start and end points, respectively |
| $\mathit{len}(p_1, p_2)$ | Length of segment connecting two points $p_1$ and $p_2$ |
| $n_i$ | Node in road network |
| $e = (n_s, n_e)$ | Edge in edge set $E$, where $n_s$ and $n_e$ are start and end points of the edge |
| $n_\beta$ | Boundary node corresponding to start ($n_s$) or end ($n_e$) point of an edge |
| $W(e)$ | Weight of edge $(n_s, n_e)$ |
| $q$ | Query point in road network |
| $k$ | Number of data objects to return |
| $D$ | Set of data objects, $D = \{d_1, d_2, \ldots, d_{\lvert D \rvert}\}$ |
| $D(n_s, n_e)$ | Set of data objects in an edge |
| $p_a$ | Anchor point that corresponds to start point of expansion |
| $P_{SE}$ | Safe exit point where safe and non-safe regions of $q$ intersect |
| $\alpha$ | Query parameter |
| $\psi(d)$ | Score of data object $d$ |
| $\mu(d.t, q.t)$ | Textual relevance of data object $d$ with query keywords |
| $\lambda(d.l, q.l)$ | Spatial relevance of data object $d$ with query location |
| $D^+$ | Set of answer objects |
| $D^-$ | Set of non-answer objects |
| $d_l^+$ | Lowest answer object |
| $d_h^-$ | Highest non-answer object |
algorithm [26]. Algorithm 1 returns the top-k data objects with the highest scores according to their joint textual and spatial relevance to the query. The algorithm begins by exploring the active edge, where the query object $q$ is located, and expands the network in increasing order of distance from $q$. Each entry in the min-heap has the form $(p_a, \mathit{edge})$, where $p_a$ indicates the anchor point in the edge. For the active edge, $q$ becomes the anchor point. Otherwise, for directed edges, the ending node $n_e$ becomes the anchor point. For bidirectional edges, either of the adjacent boundary nodes, i.e., $n_s$ or $n_e$, becomes the anchor point. Let $D_k$ be the current set of top-k data objects and $s_k$ be the score of the $k$-th data object in $D_k$. The $\mathit{candsearch}((\mathit{eid}, \mathit{tid}), s_k)$ function retrieves the candidate data objects $D_c$ located on an edge with a score $\psi(d)$ better than $s_k$. Next, the set $D_k$ is updated with the data objects in $D_c$, and $s_k$ is updated accordingly. The algorithm continues its expansion and inserts the adjacent edges of the boundary node until the heap is exhausted or the upper-bound score of the remaining data objects cannot be better than $s_k$. The upper-bound score $\psi(n)$ of node $n$ is computed using $\mathit{dist}(n, q)$ and the maximum textual relevance ($\mu = 1$). Therefore, if $\psi(n) \le s_k$, then even if there is an unexplored data object $d$ matching all query keywords, its score cannot be better than that of the $k$-th object in $D_k$, because $\mathit{dist}(d, q.l) \ge \mathit{dist}(n, q.l)$. This holds because the algorithm always expands the node with the minimum distance to the query location.
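The expansion order and the early-termination rule described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes that data objects sit on nodes rather than on edges, that the textual relevance of every object is precomputed, and that the query starts at a node. The names `top_k_snapshot`, `graph`, `objects`, and `relevance` are hypothetical.

```python
import heapq


def top_k_snapshot(graph, objects, relevance, source, k, alpha=1.0):
    """Dijkstra-style top-k search on a directed graph.

    graph     -- {node: [(neighbor, weight), ...]}, edges followed in their direction
    objects   -- {node: [object ids located at that node]}
    relevance -- {object id: textual relevance mu in [0, 1]}
    Returns a list of (score, object id), best first.
    """
    best = []                      # min-heap holding the current top-k (score, obj)
    dist = {source: 0.0}
    frontier = [(0.0, source)]     # min-heap ordered by network distance from q
    settled = set()
    while frontier:
        d, node = heapq.heappop(frontier)
        if node in settled:
            continue
        # Upper bound for anything not yet settled: mu = 1 at distance d.
        if len(best) == k and 1.0 / (1.0 + alpha * d) <= best[0][0]:
            break
        settled.add(node)
        for obj in objects.get(node, ()):
            mu = relevance.get(obj, 0.0)
            if mu <= 0.0:
                continue           # object matches no query keyword
            sc = mu / (1.0 + alpha * d)
            if len(best) < k:
                heapq.heappush(best, (sc, obj))
            elif sc > best[0][0]:
                heapq.heapreplace(best, (sc, obj))
        for nxt, w in graph.get(node, ()):
            nd = d + w
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(frontier, (nd, nxt))
    return sorted(best, reverse=True)
```

Because nodes are settled in nondecreasing distance, the bound $1/(1 + \alpha \cdot d)$ can only shrink, so stopping at the first node whose bound does not exceed $s_k$ is safe. Objects that can reach the query but cannot be reached from it are never visited, which reflects the asymmetry of distances in a directed network.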
Algorithm 2 presents the $\mathit{candsearch}((\mathit{eid}, \mathit{tid}), s_k)$ procedure, which finds the candidate data objects. This procedure has two main steps. In the first step, the upper-bound score of the edge is computed using the significance factor ($\theta_t$) of a term
Input: Top-k spatial keyword query $Q_k = (q.l, q.t, k)$
Output: Top-k data objects with highest score
$D_c \leftarrow \emptyset$ /* set of candidate data objects */
max-heap $D_k \leftarrow \emptyset$ /* current top-k set */
$s_k \leftarrow 0$ /* k-th score in $D_k$ */
min-heap $\leftarrow \emptyset$
$\mathit{explored} \leftarrow \emptyset$
min-heap.insert($q.l$, $\mathit{edge}_{\mathit{active}}$)
$D_c \leftarrow \mathit{candsearch}((\mathit{eid}, \mathit{tid}), s_k)$
update $D_k$ and $s_k$ with $d \in D_c$
while min-heap $\neq \emptyset$ and $1/(1 + \alpha\lambda(d.l, q.l)) > s_k$ do
    for each unexplored adjacent edge of $(p_a, \mathit{edge})$ do
        $\mathit{explored} \leftarrow \mathit{explored} \cup (p_a, \mathit{edge})$
        $D_c \leftarrow \mathit{candsearch}((\mathit{eid}, \mathit{tid}), s_k)$
        update $D_k$ and $s_k$ with $d \in D_c$
    end
    min-heap.insert(adjacent node, edge)
end
return $D_k$

Algorithm 1: EvaluateSnapshotQuery(Node $n_i$, Edge $e_i$).
Input: Edge ID $\mathit{eid}$, term ID $\mathit{tid}$, score of k-th object $s_k$
Output: candidate list $D_c$
compute $\theta_t(e_i)$
if $\theta_t(e_i) > 0$ then
    $\mathit{maxscore}(e_i) \leftarrow \mathit{computemaxscore}(\theta_t, \mathit{dist}(e_i, q.l))$
end
if $\mathit{maxscore}(e_i) > s_k$ then
    for each data object $d$ in $e_i$ do
        compute $d_{\mathit{score}}$
        if $d_{\mathit{score}} > s_k$ then
            $D_c \leftarrow D_c \cup d$
        end
    end
end
return $D_c$

Algorithm 2: CandidateSearch($(\mathit{eid}, \mathit{tid})$, $s_k$).
$t \in q.t$ and the shortest distance $\mathit{sdist}(e_i, q.l)$ between the edge and the query location. In the next step, the inverted lists of term $t$ are fetched if their upper-bound score is greater than $s_k$. From the inverted lists, the objects with a score $\psi(d)$ greater than $s_k$ are returned.
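The two steps of the candidate search can be sketched in Python as below. The sketch makes several simplifying assumptions that are not taken from the paper: the query has a single term, the inverted list is passed in as a plain list, and the distance to an object is the distance from the query to the starting node of the edge plus the stored offset. The name `candidate_search` and all parameter names are hypothetical.

```python
def candidate_search(postings, theta_plus, theta_q, dist_to_start, s_k, alpha=1.0):
    """Return [(score, object id)] of objects on one edge that beat s_k.

    postings      -- list of (offset from starting node, object id, theta_d)
    theta_plus    -- highest significance of the term on this edge
    theta_q       -- significance of the term in the query
    dist_to_start -- network distance from the query to the starting node
    """
    # Step 1: upper bound, pairing the best textual part with the nearest
    # position any object on the edge could have.
    max_score = theta_plus * theta_q / (1.0 + alpha * dist_to_start)
    if max_score <= s_k:
        return []  # the edge is pruned without reading its inverted list

    # Step 2: score the objects in the inverted list individually.
    candidates = []
    for offset, obj_id, theta_d in postings:
        sc = theta_d * theta_q / (1.0 + alpha * (dist_to_start + offset))
        if sc > s_k:
            candidates.append((sc, obj_id))
    return sorted(candidates, reverse=True)
```

The bound in step 1 is never smaller than the score of any object on the edge, since no object has a higher significance than `theta_plus` or lies closer than the starting node, so pruning on it cannot discard a valid candidate.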
To understand the proposed algorithm, consider the road network presented in Figure 1. Assume that a query $q$ issues a top-1 keyword query with $q.t$ = "Italian Restaurant". For ease of presentation, we assume $\alpha = 1$, and the textual relevance $\mu$ is the number of occurrences of query keywords in $d.t$ divided by the number of keywords in the document (the description of the data object). For example, $\psi(d_4) = \mu(d_4.t, q.t)/(1 + \lambda(d_4.l, q.l)) = 0.5/8 \approx 0.06$. The algorithm starts the network expansion from the active edge $\overrightarrow{n_2 n_3}$, where $q$ is the anchor point. Note that the direction of the edge $\overrightarrow{n_2 n_3}$ is from $n_2$ to $n_3$. Therefore, the algorithm explores only $\overrightarrow{q n_3}$. No data object is found in $\overrightarrow{q n_3}$. Then $n_3$ becomes the anchor point, and edges $(n_3, n_4)$, $(n_3, n_5)$, and $(n_3, n_7)$ are inserted into the min-heap. Next, the $\mathit{candsearch}$ function retrieves the candidate data objects on edges $(n_3, n_4)$, $(n_3, n_5)$, and $(n_3, n_7)$ whose score is better than $s_k$. On edge $(n_3, n_5)$, data object $d_3$ is retrieved with $\psi(d_3) = 0.2$. Data object $d_3$ is inserted into the set $D_k$, and the value of $s_k$ is set to 0.2. For edges $(n_3, n_4)$ and $(n_3, n_7)$, no candidate object is found because $d_2.t$ ("Cafe") and $d_7.t$ ("Cafe and Bakery") do not match $q.t$. The algorithm continues expanding the edges whose upper-bound score is greater than $s_k$. The edge $\overrightarrow{n_7 n_2}$ is considered next. The upper-bound score of $\overrightarrow{n_7 n_2}$ is $1/7$, which is less than $s_k$. Similarly, for edge $\overleftarrow{n_6 n_5}$, the upper-bound score is $0.5/8 < s_k$. Therefore, the algorithm terminates and reports $d_3$ as the top-1 result.
[Figure 3: Illustration of directed road network. Steps shown: $q$ issues a TkSK query at $p_1$; the server returns a set of objects for $p_1$.]

[Figure 4: Illustration of directed road network. Steps shown: $q$ issues a TkSK query at $p_2$; the server returns a set of objects for $p_2$.]
5. Moving Top-k Spatial Keyword Queries
In this section, we present our method to monitor moving top-k spatial keyword queries, where query objects move in a directed road network. Figure 3 provides an example of a TkSK query in a road network, where query point $q$ issues a TkSK query at point $p_1$. Note that the numbers on the arrows in the figure indicate the order of the steps. To obtain the top-k results at $p_1$, the server executes Algorithm 1, as described in Section 4.2. Now consider that the query object moves to $p_2$, as shown in Figure 4, and needs the top-k results at point $p_2$. The simple method is to repeat the procedure executed at $p_1$. However, recomputing the result whenever query $q$ changes its location significantly increases the computation cost. Furthermore, it also increases the communication overhead, because the query object must report its location whenever it moves and the server must send the result set each time. To address these issues, we introduce the safe exit approach.
In the proposed framework, the server computes safe exit points for a query object. The server maintains a set of moving queries, and the result of each query remains valid as long as the query object remains within the region bounded by its safe exit points. Whenever a query object passes one of its safe exit points, the server recomputes the TkSK result and the safe exit points for that query object.
Next, we present our method to compute the safe exit points for a query object. The safe exit point represents a point in the segment where a safe region and a nonsafe region meet. We compute the safe exit point using the divide-and-conquer technique. Before presenting the detailed methodology, we define the terminology used in this section.
Definition 1 (safe region). A portion of a road segment such that, as long as the query point lies in it, the top-k results of the query remain valid.
Definition 2 (answer objects $D^+$). For a top-1 query, a data object $d$ is called an answer object of query $q$ if $\psi(d) > \psi(d_a)$, where $d_a$ represents any other data object in the directed road network. This definition generalizes to TkSK queries: a data object $d$ is called an answer object of query $q$ if $\psi(d) > \psi(d_{k+1})$, where $d_{k+1}$ represents the $(k+1)$-th data object in the directed road network. In other words, all answer objects are top-k results of query $q$.
Definition 3 (nonanswer objects $D^-$). For a top-1 query, a data object $d$ is called a nonanswer object of query $q$ if $\psi(d) < \psi(d_a)$ for some other data object $d_a$ in the directed road network. This definition generalizes to TkSK queries: a data object $d$ is called a nonanswer object of query $q$ if $\psi(d) < \psi(d_k)$, where $d_k$ represents the $k$-th data object in the directed road network. Therefore, none of the nonanswer objects are in the top-k results of query $q$.
Definition 4 (lowest answer object $d_l^+$). An answer object $d_l^+ \in D^+$ is called the lowest answer object at a point $p \in G$ if $\psi(d_l^+)_p = \min(\psi(d_1^+)_p, \psi(d_2^+)_p, \ldots, \psi(d_{|D^+|}^+)_p)$, where $\psi(d_l^+)_p$ represents the score of the lowest answer object at point $p$. In other words, $\psi(d_l^+)_p < \psi(d_a^+)_p$ at point $p$, where $d_a^+$ is any other answer object in the set $D^+$.

Definition 5 (highest nonanswer object $d_h^-$). A nonanswer object $d_h^- \in D^-$ is called the highest nonanswer object at a point $p \in G$ if $\psi(d_h^-)_p = \max(\psi(d_1^-)_p, \psi(d_2^-)_p, \ldots, \psi(d_{|D^-|}^-)_p)$, where $\psi(d_h^-)_p$ represents the score of the highest nonanswer object at point $p$. In other words, $\psi(d_h^-)_p > \psi(d_a^-)_p$ at point $p$, where $d_a^-$ is any other nonanswer object in the set $D^-$.
As discussed earlier, the main challenge in the continuous processing of moving TkSK queries is to maintain the validity of the result set, because the movement of query objects can invalidate it. To monitor the validity of the result set, we propose a safe-region-based approach.
5.1. Computation of Safe Exit Points. In this section, we present our technique to compute the safe exit points. The main goal is to find a point in the road network where the
query result set will change. The result set changes when the score of the highest nonanswer object $d_h^-$ surpasses the score of the lowest answer object $d_l^+$. Generally, the textual relevance score does not change. Therefore, the score of a data object changes only because of the spatial relevance score, which in turn changes only with the movement of the query object. The computation of the safe exit point is based on two key observations.
Observation 1. If $D^+_{n_\beta} = D^+_{p_a}$, there is no safe exit point in the segment.

Explanation. $D^+_{p_a}$ represents the set of answer objects at anchor point $p_a$, whereas $D^+_{n_\beta}$ represents the set of answer objects at boundary node $n_\beta$. As discussed earlier, the safe exit point is the particular point where the query results change. If the query results at the starting node of a segment or edge are the same as those at its ending node, there is no point in it where the query result changes. Hence, we do not search for a safe exit point in that segment.
Observation 2. If $D^+_{p_a} \neq D^+_{n_\beta}$, there is a safe exit point in the segment.

Explanation. In contrast to Observation 1, if the query results are different at the starting and ending points, then there exists a point where the query results change. Hence, there is a safe exit point in the segment.
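The two observations amount to comparing the answer sets at the two ends of a segment before any search is started. A minimal Python sketch of that check is given below. The callables `top_k_at` and `compute_safe_exit` stand in for the snapshot query and the safe exit search, respectively; they and the function name `monitor_segment` are hypothetical placeholders.

```python
def monitor_segment(top_k_at, compute_safe_exit, anchor, boundary):
    """Return the safe exit points found in the segment (anchor, boundary).

    top_k_at          -- callable: point -> iterable of answer object ids
    compute_safe_exit -- callable: (anchor, boundary) -> safe exit point
    """
    answers_at_anchor = frozenset(top_k_at(anchor))
    answers_at_boundary = frozenset(top_k_at(boundary))
    if answers_at_anchor == answers_at_boundary:
        return []  # Observation 1: same answers at both ends, nothing to search
    # Observation 2: the answers differ, so the result changes inside the segment.
    return [compute_safe_exit(anchor, boundary)]
```

Comparing the answers as sets ignores the ranking order among the top-k objects, which matches the way the observations are phrased in terms of $D^+$.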
To find the safe region, we observe the following cases.
Case 1 (when $\alpha = 1$ and the textual relevance of the highest nonanswer object and the lowest answer object is the same). In this case, textual and spatial relevance have the same importance (i.e., $\alpha = 1$). In addition, the top-k result depends only on the spatial relevance, because the textual relevance of both objects is the same. The data object that is closer to query point $q$ becomes the answer object. For an undirected edge, the safe exit point $p_{se}$ is the center point between the lowest answer object and the highest nonanswer object, i.e., $\max(\mathit{dist}(p_{se}, d_1^+), \mathit{dist}(p_{se}, d_2^+), \ldots, \mathit{dist}(p_{se}, d_{|D^+|}^+)) = \min(\mathit{dist}(p_{se}, d_1^-), \mathit{dist}(p_{se}, d_2^-), \ldots, \mathit{dist}(p_{se}, d_{|D^-|}^-))$. However, in the case of a directed edge, where $\mathit{dist}(p_a, n_\beta) \neq \mathit{dist}(n_\beta, p_a)$, the safe exit point is either $d_l^+$ or $p_a$. If $d_l^+ \in (p_a, n_\beta)$, then the safe exit point is $d_l^+$; otherwise, the safe exit point is $p_a$.

Case 2 (when $\alpha \neq 1$ and the textual relevance of the highest nonanswer object and the lowest answer object is different). In this case, the top-k result depends on all three factors: $\alpha$, the spatial relevance, and the textual relevance. Clearly, for undirected edges, the midpoint between the lowest answer object and the highest nonanswer object does not provide a valid safe exit point. Therefore, we introduce the divide-and-conquer technique, which keeps dividing the search space until it reaches the point where the score of the nonanswer object becomes greater than that of the answer object. Typically, the safe exit point is closer to the data object whose score is lower. Based on this observation, we first compute the midpoint in a similar fashion to Case 1 and then continue dividing the search space until we find the point. For directed edges, the safe exit point can be computed in a similar fashion to Case 1.
Case 2 also covers the other cases in which the safe exit point is not the midpoint between the lowest answer object and the highest nonanswer object. In these cases, the safe exit point depends on two or more factors. Therefore, the safe exit point can be computed using the aforementioned divide-and-conquer technique. The following are the scenarios in which the safe exit point can be computed as in Case 2:

(a) When $\alpha = 1$ and the textual relevance of the nearest nonanswer object and the farthest answer object is different

(b) When $\alpha \neq 1$ and the textual relevance of the nearest nonanswer object and the farthest answer object is the same
Case 3 (when $\alpha = 0$). This means that the spatial relevance has no effect on the score of data objects. Hence, no monitoring is required for this scenario.
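The divide-and-conquer search used in Case 2 can be illustrated with a bisection over positions on a single segment. The Python sketch below rests on assumptions that go beyond the text: the segment is undirected and parameterized by the offset $x \in [0, \mathit{length}]$ from $p_a$, only one answer object and one nonanswer object are compared, their scores are supplied as functions of $x$, and the nonanswer object overtakes the answer object at most once along the segment. All names are hypothetical.

```python
def object_score(mu, position, alpha):
    """Score of an object at `position` on the segment, as a function of x."""
    return lambda x: mu / (1.0 + alpha * abs(x - position))


def safe_exit_by_bisection(score_answer, score_nonanswer, length, tol=1e-6):
    """Offset where the nonanswer score first exceeds the answer score.

    Returns None if the answer object is still at least as good at the far end.
    """
    lo, hi = 0.0, length
    if score_nonanswer(lo) > score_answer(lo):
        return lo                      # result is already invalid at the anchor
    if score_answer(hi) >= score_nonanswer(hi):
        return None                    # no change of result inside the segment
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if score_nonanswer(mid) > score_answer(mid):
            hi = mid                   # crossing lies in the left half
        else:
            lo = mid                   # crossing lies in the right half
    return hi


# Answer object (mu = 0.5) at offset 2, nonanswer object (mu = 1.0) at offset 9.
exit_offset = safe_exit_by_bisection(
    object_score(0.5, 2.0, 1.0), object_score(1.0, 9.0, 1.0), 10.0
)
```

In this example the scores are equal at offset 4, so the search converges there rather than at the midpoint 5.5 between the two objects, which is consistent with the remark that the safe exit point lies closer to the object with the lower score. When both objects have the same textual relevance the same routine converges to the midpoint, and with $\alpha = 0$ the scores are constant and the function returns `None`.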
Algorithm 3 retrieves the safe exit points using the observations discussed earlier. The core function in this algorithm is ComputeSafeExit($p_a$, $n_\beta$), which finds the safe exit point in a segment between $p_a$ and $n_\beta$. ComputeSafeExit($p_a$, $n_\beta$) is described in detail in Algorithm 4. First, Algorithm 4 determines $d_l^+$ and $d_h^-$ at each point $p \in [p_a, n_\beta]$. Recall that $d_l^+$ is the lowest answer object at $p$, whereas $d_h^-$ is the highest nonanswer object at $p$. Algorithm 4 computes the safe exit point based on the cases discussed earlier. There are two further scenarios for each of Cases 1 and 2. For Case 1, if $\mathit{dist}(p_a, n_\beta) = \mathit{dist}(n_\beta, p_a)$, then the safe exit point is the midpoint between $d_l^+$ and $d_h^-$. If $\mathit{dist}(p_a, n_\beta) \neq \mathit{dist}(n_\beta, p_a)$, then the edge is directed and, therefore, the safe exit point is either $p_a$ or $d_l^+$. If $d_l^+$ lies on the edge $[p_a, n_\beta]$, then $d_l^+$ is the safe exit point. Otherwise, $p_a$ is the safe exit point.
Similarly, for Case 2, if $\mathit{dist}(p_a, n_\beta) = \mathit{dist}(n_\beta, p_a)$, then the safe exit point is computed by repeatedly halving the search space until we find the closest point such that $\psi(d_h^-) > \psi(d_l^+)$. If $\mathit{dist}(p_a, n_\beta) \neq \mathit{dist}(n_\beta, p_a)$, the safe exit point is computed in the same way as in Case 1.

5.2. Computation of Safe Exit Points for Example. Consider the same example in Figure 1, where the query point $q$ issues a top-1 keyword query with $q.t$ = "Italian restaurant". For this example, let us consider $\alpha = 1$. The monitoring algorithm starts exploring from the active edge containing the query object $q$. Therefore, $\overrightarrow{q n_3}$ is explored first. As shown in Table 3, for $\overrightarrow{q n_3}$, $D_q^+ = \{d_3\}$ and $D_{n_3}^+ = \{d_3\}$. According to Observation 1, no safe exit point exists in this segment. Therefore, edges adjacent to $n_3$ are explored, and $n_3$ becomes the new $p_a$. The edge $(n_3, n_4)$ is explored next. Similarly, the answer object at $n_3$ and $n_4$ is the same: $D_{n_3}^+ = D_{n_4}^+ = \{d_3\}$. Therefore, a safe exit point does not exist in $(n_3, n_4)$. The edge $(n_3, n_7)$ is explored next. As shown in Table 3, $D_{n_3}^+ = \{d_3\}$ and $D_{n_7}^+ = \{d_6\}$. By Observation 2, there is a safe exit point in $(n_3, n_7)$. As shown in Figure 1, $d_3.t = d_6.t$ = "Italian Restaurant" and $\mathit{dist}(n_3, n_7) = \mathit{dist}(n_7, n_3)$
Input: same as Algorithm 1
Output: $P_{SE}$, a set of safe exit points
$P_{SE} \leftarrow \emptyset$  /* set of safe exit points */
$D_{p_a}^+ \leftarrow$ EvaluateSnapshotQuery($p_a$, $(p_a, n_\beta)$)  /* results calculated using Algorithm 1 */
$D_{n_\beta}^+ \leftarrow$ EvaluateSnapshotQuery($n_\beta$, $(p_a, n_\beta)$)  /* results calculated using Algorithm 1 */
if $D_{p_a}^+ = D_{n_\beta}^+$ then
    no safe exit point  /* refer to Observation 1 */
end
if $D_{p_a}^+ \neq D_{n_\beta}^+$ then
    $P_{SE} \leftarrow P_{SE} \cup$ ComputeSafeExit($p_a$, $n_\beta$)  /* a safe exit point exists; refer to Observation 2 */
end
return $P_{SE}$

Algorithm 3: COSK monitoring algorithm.
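The endpoint comparison at the heart of Algorithm 3 can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: `evaluate_snapshot_query` and `compute_safe_exit` are hypothetical callables standing in for Algorithm 1 and Algorithm 4, and answer sets are assumed to be comparable as Python sets.

```python
from typing import Callable, Hashable, List, Set

Point = Hashable  # hypothetical: any hashable identifier for a network location


def find_safe_exit_points(
    p_a: Point,
    n_beta: Point,
    evaluate_snapshot_query: Callable[[Point], Set[str]],
    compute_safe_exit: Callable[[Point, Point], Point],
) -> List[Point]:
    """Return the safe exit points found on the segment (p_a, n_beta)."""
    safe_exits: List[Point] = []
    answers_at_start = evaluate_snapshot_query(p_a)   # top-k set at p_a
    answers_at_end = evaluate_snapshot_query(n_beta)  # top-k set at n_beta
    if answers_at_start != answers_at_end:
        # Observation 2: the answer set changes somewhere on the segment.
        safe_exits.append(compute_safe_exit(p_a, n_beta))
    # Observation 1: identical answer sets mean no safe exit point here.
    return safe_exits


# Toy data mirroring Table 3 (top-1 answers at selected points).
top1 = {"q": {"d3"}, "n3": {"d3"}, "n4": {"d3"}, "n7": {"d6"}}
lookup = lambda p: top1[p]
stub_exit = lambda a, b: f"pse({a},{b})"  # placeholder for Algorithm 4

print(find_safe_exit_points("q", "n3", lookup, stub_exit))   # []
print(find_safe_exit_points("n3", "n7", lookup, stub_exit))  # ['pse(n3,n7)']
```

Passing the two algorithms in as callables keeps the sketch independent of any particular index structure or scoring function.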
Input: same as Algorithm 1
Output: $p_{se}$, the safe exit point in $(p_a, n_\beta)$
$D_l^+ \leftarrow \langle p, d_l^+ \rangle$ for each point $p \in [p_a, n_\beta]$, where $d_l^+$ satisfies $\psi(d_l^+)_p = \min(\psi(d_1^+)_p, \psi(d_2^+)_p, \ldots, \psi(d_{|d^+|}^+)_p)$
$D_h^- \leftarrow \langle p, d_h^- \rangle$ for each point $p \in [p_a, n_\beta]$, where $d_h^-$ satisfies $\psi(d_h^-)_p = \max(\psi(d_1^-)_p, \psi(d_2^-)_p, \ldots, \psi(d_{|d^-|}^-)_p)$
if Case 1 then
    if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$ then
        $p_{se} \leftarrow$ the point $se$ at which $\max(dist(se, d_1^+), dist(se, d_2^+), \ldots, dist(se, d_{|d^+|}^+)) = \min(dist(se, d_1^-), dist(se, d_2^-), \ldots, dist(se, d_{|d^-|}^-))$
    end
    if $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$ then
        $p_{se} = p_a$ or $p_{se} = d_l^+$, where $d_l^+ \in (p_a, n_\beta)$
    end
end
if Case 2 then
    if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$ then
        $p_{se} \leftarrow$ the closest point to $p_a$ such that $\psi(d_h^-) > \psi(d_l^+)$
    end
    if $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$ then
        $p_{se} = p_a$ or $p_{se} = d_l^+$, where $d_l^+ \in (p_a, n_\beta)$  /* same as the directed scenario of Case 1 */
    end
end
return $p_{se}$

Algorithm 4: ComputeSafeExit($p_a$, $n_\beta$).
Therefore, according to Case 1, the safe exit point $p_{se1}$ is the midpoint between $d_3$ and $d_6$. That is, $dist(p_{se1}, d_3) = dist(p_{se1}, d_6)$, where $dist(p_{se1}, d_3) = x + 3$ and $dist(p_{se1}, d_6) = -x + 5$ for $0 < x < 3$, with $x$ denoting the distance from $n_3$ to $p_{se1}$. Solving $x + 3 = -x + 5$ gives $x = 1$, which means that the distance from $n_3$ to $p_{se1}$ is 1.
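The midpoint computation above amounts to intersecting two linear distance functions. The following sketch assumes, as in this example, that the distance to the answer object grows with slope +1 and the distance to the nonanswer object shrinks with slope -1 along the segment. The function name and its parameters are illustrative and do not appear in the paper.

```python
from typing import Optional


def midpoint_offset(dist_answer: float, dist_nonanswer: float,
                    segment_length: float) -> Optional[float]:
    """Offset x from the segment start where both distances are equal.

    Assumes dist(p, answer) = dist_answer + x and
    dist(p, nonanswer) = dist_nonanswer - x for 0 <= x <= segment_length.
    """
    x = (dist_nonanswer - dist_answer) / 2.0
    # The crossing only counts if it falls strictly inside the segment.
    return x if 0.0 < x < segment_length else None


# Segment (n3, n7): dist(n3, d3) = 3, dist(n3, d6) = 5, segment length 3.
print(midpoint_offset(3.0, 5.0, 3.0))  # 1.0
```

A return value of `None` signals that the two distances do not meet strictly inside the segment, in which case this simple linear model yields no safe exit point there.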
Next, we determine a safe exit point in $(n_3, n_5)$. As shown in Table 3, the answer object at $n_5$ is the same as that at $n_3$. Hence, no safe exit point exists in this edge. Next, $\overleftarrow{(n_6, n_5)}$ is explored with $p_a = n_5$. According to Table 3, $D_{n_6}^+ = d_4$ and $D_{n_5}^+ = d_3$. Therefore, a safe exit point exists in this edge. This edge is directed, and for each point $p \in \overleftarrow{(n_6, n_5)}$, the shortest path from $p$ to $d_3$ is $p \rightarrow n_6 \rightarrow n_2 \rightarrow n_3 \rightarrow d_3$. Therefore, $n_5$ is the safe exit point.
The bold lines in Figure 5 indicate the safe region of $q$. The top-1 result remains $d_3$ as long as the query $q$ lies in the safe region.
Table 3: Computation of safe exit points for example scenario.

| Edge/Segment | $p_a$ | $D_{p_a}^+$ | $D_{n_\beta}^+$ | $p_{se}$ |
| --- | --- | --- | --- | --- |
| $\overrightarrow{(q, n_3)}$ | $q$ | $D_q^+ = d_3$ | $D_{n_3}^+ = d_3$ | none |
| $(n_3, n_4)$ | $n_3$ | $D_{n_3}^+ = d_3$ | $D_{n_4}^+ = d_3$ | none |
| $(n_3, n_7)$ | $n_3$ | $D_{n_3}^+ = d_3$ | $D_{n_7}^+ = d_6$ | $p_{se1}$ |
| $\overrightarrow{(n_3, n_5)}$ | $n_3$ | $D_{n_3}^+ = d_3$ | $D_{n_5}^+ = d_3$ | none |
| $\overleftarrow{(n_6, n_5)}$ | $n_5$ | $D_{n_5}^+ = d_3$ | $D_{n_6}^+ = d_4$ | $p_{se2}$ |

Figure 5: Illustration of safe region of $q$.

Next, we analyze the time complexity of determining a set of safe exit points using a set of qualifying objects $d \in D_{p_a}^+ \cup D_{n_\beta}^+ \cup D(p_a, n_\beta)$. Note that $D_{p_a}^+$ ($D_{n_\beta}^+$) indicates the set of $k$ data objects that satisfies the query condition at $p_a$ ($n_\beta$). According to Dijkstra's algorithm [26], the time complexity $O(D_q^+)$ of computing a set of answer objects at a query point $q$ is $O(D_q^+) = O(|E| + |N| \log |N|)$. This means that $O(D_{p_a}^+) = O(D_{n_\beta}^+) = O(|E| + |N| \log |N|)$ holds for the endpoints $p_a$ and $n_\beta$. Thus, the time complexity $O(\Omega_{kth})$ of determining the skyline $\Omega_{kth}$ with the $k$-th highest score is $O(\Omega_{kth}) = C_{kth} \cdot O(|D_{p_a}^+ \cup D_{n_\beta}^+ \cup D(p_a, n_\beta)|)$, where $C_{kth}$ is the number of qualifying objects that participate in the constitution of the skyline with the $k$-th highest score. Therefore, the time complexity of determining a safe exit point coincides with the time complexity of determining the two skylines, i.e., the skyline $D_l^+$ with the $k$-th highest (or lowest) score for answer objects and the skyline $D_h^-$ with the highest score for nonanswer objects. This is because the safe exit point is found at the cross point between these skylines.
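To illustrate the shortest-path computation that underlies the bound above, the following sketch runs Dijkstra's algorithm on a small directed graph. The graph, node names, and weights are made up for illustration and are not taken from the paper's figures. Note that this version uses Python's binary heap (`heapq`), which runs in $O((|E| + |N|) \log |N|)$; the $O(|E| + |N| \log |N|)$ bound quoted above is the one obtained with a Fibonacci heap.

```python
import heapq
from typing import Dict, List, Tuple

Graph = Dict[str, List[Tuple[str, float]]]  # node -> [(neighbor, weight)]


def dijkstra(graph: Graph, source: str) -> Dict[str, float]:
    """Shortest network distances from source, following edge directions."""
    dist: Dict[str, float] = {source: 0.0}
    heap: List[Tuple[float, str]] = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Hypothetical network: a-b is bidirectional, b->c and c->a are one-way.
graph: Graph = {
    "a": [("b", 2.0)],
    "b": [("a", 2.0), ("c", 1.0)],
    "c": [("a", 4.0)],
}
print(dijkstra(graph, "a")["c"])  # 3.0 (a -> b -> c)
print(dijkstra(graph, "c")["a"])  # 4.0 (c -> a), so dist(a, c) != dist(c, a)
```

The two printed values differ, which is the asymmetry that distinguishes directed road networks from undirected ones.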
Figure 6 represents the skyline graph for $k = 1$ on the edge $(n_7, n_3)$. Let us draw the score functions of $d_3$ and $d_6$ for the road segment $(n_7, n_3)$, where a safe exit point exists because $D_{n_3}^+ = d_3$ and $D_{n_7}^+ = d_6$ for $k = 1$. For each point $p \in (n_7, n_3)$, the distance between $d_3$ and $p$ can be represented as $dist(d_3, p) = dist(d_3, n_3) + len(n_3, p) = 6 - len(n_7, p)$. Similarly, for each point $p \in (n_7, n_3)$, the distance between $d_6$ and $p$ can be represented as $dist(d_6, p) = dist(d_6, n_7) + len(n_7, p) = 2 + len(n_7, p)$. Let $len(n_7, p)$ be a variable $x$ ($0 \le x \le 3$). We can write $\lambda(d_3, p) = dist(d_3, p) = 6 - x$ and $\lambda(d_6, p) = dist(d_6, p) = 2 + x$. Then, we can represent the score functions $\psi(d_3)$ and $\psi(d_6)$ as follows:

$\psi(d_3) = \frac{\mu(d_3.t, q.t)}{1 + \alpha \cdot \lambda(d_3, p)} = \frac{1}{7 - x}$ for $0 \le x \le 3$,

$\psi(d_6) = \frac{\mu(d_6.t, q.t)}{1 + \alpha \cdot \lambda(d_6, p)} = \frac{1}{3 + x}$ for $0 \le x \le 3$.

Setting $\psi(d_3) = \psi(d_6)$ gives $x = 2$, i.e., the two curves cross at distance 1 from $n_3$, which agrees with the position of $p_{se1}$ computed in Section 5.2.

Figure 6: Skyline graph for $k = 1$ on the road segment $(n_7, n_3)$ (horizontal axis: distance; vertical axis: score; plotted curves: $\psi(d_6) = 1/(x + 3)$ and $\psi(d_3) = 1/(-x + 7)$).

Finally, we present a lemma to prove that the safe exit points computed by COSK are correct.
Lemma 8. The COSK algorithm correctly computes a set of safe exit points.

Proof. We prove the correctness of the COSK algorithm by contradiction. Assume that $D_{p_a}^+ \neq D_{n_\beta}^+$ and yet there is no safe exit point in the road segment $(p_a, n_\beta)$. This means that, for each point $p$ in the road segment $(p_a, n_\beta)$, the query result at $p$ equals $D_{p_a}^+$, i.e., $D_p^+ = D_{p_a}^+$ for all $p \in (p_a, n_\beta)$. However, taking $p = n_\beta$ yields $D_{n_\beta}^+ = D_{p_a}^+$, which contradicts the assumption. Therefore, if $D_{p_a}^+ \neq D_{n_\beta}^+$, a safe exit point exists in $(p_a, n_\beta)$. In addition, when $D_{p_a}^+ \neq D_{n_\beta}^+$, a safe exit point is determined using the skyline $D_l^+$ for answer objects and the skyline $D_h^-$ with the highest score for nonanswer objects. The first skyline is a composite polyline drawn from the answer objects in $D_{p_a}^+$. The second skyline is a composite polyline drawn from the nonanswer objects in $D_{n_\beta}^+ \cup D(p_a, n_\beta) - D_{p_a}^+$.
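As a concrete check of the skyline argument, the sketch below locates the crossing point of the two score functions derived for the segment $(n_7, n_3)$, $\psi(d_3) = 1/(7 - x)$ and $\psi(d_6) = 1/(3 + x)$, by repeatedly halving the search interval, in the spirit of the Case 2 procedure. The function names, the tolerance, and the choice to measure $x$ from $n_7$ are illustrative assumptions rather than details taken from the paper.

```python
from typing import Callable


def score_d3(x: float) -> float:
    """Score of d3 at offset x from n7 (alpha = 1, textual relevance = 1)."""
    return 1.0 / (7.0 - x)


def score_d6(x: float) -> float:
    """Score of d6 at offset x from n7 (alpha = 1, textual relevance = 1)."""
    return 1.0 / (3.0 + x)


def crossing_point(rising: Callable[[float], float],
                   falling: Callable[[float], float],
                   lo: float, hi: float, tol: float = 1e-9) -> float:
    """Bisection for the offset where `rising` overtakes `falling`.

    Assumes falling(lo) > rising(lo) and rising(hi) > falling(hi), with a
    single crossing in [lo, hi].
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if rising(mid) > falling(mid):
            hi = mid  # crossing is at or before mid
        else:
            lo = mid  # crossing is after mid
    return (lo + hi) / 2.0


x_cross = crossing_point(score_d3, score_d6, 0.0, 3.0)
print(round(x_cross, 6))        # 2.0 (measured from n7)
print(round(3.0 - x_cross, 6))  # 1.0 (measured from n3)
```

Because both objects share the same textual relevance here, the score crossing coincides with the distance midpoint; bisection becomes necessary only when the relevance values differ.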
6. Monitoring Query Results and Safe Regions in Dynamic Directed Road Networks
In this section, we discuss the monitoring of spatial keyword queries in dynamic road networks, where the network distance changes depending on the traffic conditions. Updates to the weights of some edges may invalidate the query results or the safe region of $q$ even though the query object $q$ remains within its safe region. Figure 7 illustrates an example in which the weights of edges $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$ change. For convenience, we consider $\alpha = 1$ and $q.t$ = "Italian restaurant". In Figure 7(a), the top-1 result is $d_1$, and the bold lines show the safe region of query $q$. Now, consider that at time $t_j$ the weights of the two edges $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$ change owing to heavy traffic, as shown in Figure 7(b). Such an update may invalidate the query result or the safe region of $q$. Therefore, it is necessary to monitor the validity of the results and the safe region when such changes occur.
Next, we introduce a monitoring region to monitor the validity of the safe region effectively when the weight of an edge changes. The monitoring region $MR$ contains all the points between the query point $q$ and the lowest answer object and between $q$ and the highest nonanswer object. Formally, it is defined as $MR = dist(q, D_l^+) \cup dist(q, D_h^-)$, where $dist(q, D_l^+)$ covers the path between $q$ and the lowest answer object and $dist(q, D_h^-)$ covers the path between $q$ and the highest nonanswer object. In the given example, $D_l^+ = d_1$ and $D_h^- = \{d_2, d_3\}$. Therefore, the dotted lines in Figure 8(a) show the monitoring region of query object $q$.
Now, at time $t_j$, the updates to edges $\overleftarrow{(n_1, n_6)}$ and $\overleftarrow{(n_1, d_1)}$, which are not part of the monitoring region, can safely be ignored. However, the update on segment $\overrightarrow{(n_2, d_1)}$, which is associated with the monitoring region, may nullify the results. As shown in Figure 8(b), after the update, the top-1 result becomes $d_2$, and the bold lines represent the new safe region of $q$.
Algorithm 5 monitors the validity of the result set and the safe region of query object $q$ when the weight of any edge changes. Suppose that the weight of edge $(n_i, n_j)$ changes at time $t_j$. First, the algorithm checks whether edge $(n_i, n_j)$ is associated with the monitoring region. If it is not, the algorithm simply ignores the update to edge $(n_i, n_j)$, and the query results and safe region remain valid. In contrast, if the edge is associated with the monitoring region (i.e., $MR \cap (n_i, n_j) \neq \emptyset$), the algorithm reevaluates the query results; consequently, the top-k results and the safe region of query $q$ are updated. Finally, the algorithm updates the monitoring region of $q$.
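The update handling described above can be sketched as follows. This is a simplified illustration under stated assumptions: the monitoring region is modelled as a set of edge identifiers, and `recompute` is a hypothetical callback standing in for the snapshot query, safe exit, and monitoring region recomputation. None of these names come from the paper.

```python
from typing import Callable, Set, Tuple

Edge = Tuple[str, str]  # hypothetical edge identifier (n_i, n_j)


def on_edge_update(monitoring_region: Set[Edge],
                   updated_edge: Edge,
                   recompute: Callable[[], Set[Edge]]) -> Tuple[Set[Edge], bool]:
    """Return the (possibly new) monitoring region and whether work was done."""
    if updated_edge not in monitoring_region:
        # The edge lies outside MR: results and safe region stay valid.
        return monitoring_region, False
    # The edge intersects MR: re-evaluate the query, the safe exit points,
    # and the monitoring region (all delegated to the callback here).
    return recompute(), True


mr = {("n2", "d1"), ("q", "n2")}              # toy monitoring region
new_mr = lambda: {("q", "n2"), ("n2", "d2")}  # stand-in for recomputation

print(on_edge_update(mr, ("n1", "n6"), new_mr)[1])  # False: update ignored
print(on_edge_update(mr, ("n2", "d1"), new_mr)[1])  # True: results refreshed
```

The membership test is the only work done for edges outside the monitoring region, which is why such updates are cheap to discard.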
7. Performance Evaluation
In this section, we evaluate the performance of COSK through simulation experiments. We describe our experimental settings in Section 7.1 and present our experimental results for static and dynamic road networks in Sections 7.2 and 7.3, respectively.
7.1. Experimental Settings. All of our experiments were performed using real road networks, namely, Oldenburg, San Francisco, and San Joaquin. All three road networks were obtained from [27]. The original road network of San Francisco had 21047 nodes and 21692 edges. We reformatted the network, pruned approximately 30% of the nodes, and adjusted the edges and their weights accordingly. This resulted in a network with 14732 nodes and 14316 edges. Both the direction of edges and the data objects on the edges were generated randomly. The description of each data object was extracted from Twitter messages [28], and we assigned one tweet per data object. Table 4 presents the characteristics of the datasets used in the experimental evaluation. We simulated moving query objects using a spatiotemporal data generator [29]. The input to the generator was the road network of the dataset used, and the output was the set of query objects moving on the road network. Each experiment had 100 moving queries, which were continuously monitored for 100 timestamps (1 timestamp = 1 second), and the average result was reported.
As a benchmark for COSK in static road networks, we implemented the CMTkSK+ algorithm [22], which also continuously monitors moving top-k spatial keyword queries in road networks. However, this algorithm was originally designed for undirected road networks. To make a fair comparison, we modified CMTkSK+ to process top-k spatial keyword queries in directed road networks and called it CMTkSK+. Specifically, we modified the distance computation between two points such that, in directed road networks, $dist(p_1, p_2) \neq dist(p_2, p_1)$. Since CMTkSK+ does not handle top-k spatial keyword queries in dynamic road networks, we compared the performance of COSK with a basic algorithm that recomputes the results whenever the query object changes its location. All algorithms were implemented in Java and executed on a desktop PC with a 2.80-GHz Intel Core i5 processor and 8 GB of memory. In the experiments, we compared (1) query processing times, (2) edges processed, i.e., the number of edges processed for retrieving query results, and (3) index sizes. Table 5 summarizes the parameters used in the experiments. In each experiment, we varied a single parameter within the range shown in Table 5 while keeping the other parameters at their default values.

Figure 7: Updating the weight of edges in a dynamic road network, where $t_i < t_j$. (a) Safe region at time $t_i$. (b) Updating the weights of $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$ at time $t_j$.

Figure 8: Monitoring region and updated safe region at time $t_j$. (a) Monitoring region at time $t_i$. (b) New safe region at time $t_j$.

Input: monitoring region $MR$, updated edge $(n_i, n_j)$
Output: none
if $MR \cap (n_i, n_j) = \emptyset$ then
    /* edge $(n_i, n_j)$ is not part of the monitoring region */
    ignore the change in the weight of edge $(n_i, n_j)$
else
    $P_{SE} \leftarrow \emptyset$  /* set of safe exit points */
    $D_{upd}^k \leftarrow$ EvaluateSnapshotQuery($n_i$, $e_i$)  /* update set of top-k results */
    $P_{SE}^{upd} \leftarrow$ ComputeSafeExit($p_a$, $n_\beta$)  /* update safe exit points */
    $MR_{upd} \leftarrow$ ComputeMonitoringRegion($D_l^+$, $D_h^-$)  /* update monitoring region */
end

Algorithm 5: MonitoringSafeRegion($MR$, $(n_i, n_j)$).

Table 4: Summary of datasets.

| Attribute | Oldenburg | San Francisco | San Joaquin |
| --- | --- | --- | --- |
| Total no. of nodes | 6104 | 14732 | 18262 |
| Total no. of edges | 7034 | 14316 | 23876 |
| Percentage of directed edges | 30 | 30 | 30 |
| Total no. of objects | 5627 | 11453 | 19098 |
| Average no. of objects per edge | 0.8 | 0.8 | 0.8 |
| Total no. of words | 49517 | 103649 | 166153 |

Table 5: Experimental parameter settings.

| Parameter | Range |
| --- | --- |
| Number of results ($k$) | 5, 10, 15, 20, 25 |
| Number of keywords ($n$) | 1, 2, 3, 4, 5 |
| Query parameter ($\alpha$) | 0.01, 0.1, 1, 10, 100 |
| Dataset | Oldenburg, San Francisco, San Joaquin |
| Number of data objects ($N_D$) | 10, 20, 30, 40, 50 (×1000) |
| Speed of query objects ($V_{qry}$) | 25, 50, 75, 100, 125 (km/h) |
| Mobility ($M_{qry}$) | 20, 40, 60, 80, 100 (%) |
| Ratio of directed edges ($E_{dir}$) | 10, 20, 30, 40, 50 (%) |
| Ratio of updated edges ($E_{upd}$) | 15, 30, 60, 80, 100 (%) |
We evaluated the performance of the algorithms using the following measures: (1) the total server CPU time, which indicates the query processing time, and (2) the total communication cost, measured as the total number of points (i.e., the location updates sent by query objects and the query results and safe exit points returned by the server) transferred between clients and the server. Battery power and wireless bandwidth consumption typically increase with the amount of data transferred between objects (clients) and servers. Thus, we used the amount of transferred data as a metric to evaluate the communication cost.
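To make the communication cost metric concrete, the following toy model counts the points exchanged when a client contacts the server only after leaving its current safe region. Everything here is an illustrative assumption rather than the paper's setup: movement is along a one-dimensional line instead of a road network, safe regions are fixed-width intervals, and each contact is charged one location update, $k$ result objects, and two safe exit points.

```python
import math
from typing import Iterable, Optional, Tuple


def safe_region(position: float, width: float) -> Tuple[float, float]:
    """Toy server reply: the fixed-width interval containing `position`."""
    start = math.floor(position / width) * width
    return start, start + width


def points_transferred(trajectory: Iterable[float], width: float, k: int) -> int:
    """Count points exchanged under a safe-exit-style protocol (toy model).

    Each contact costs 1 location update + k result objects + 2 safe exit
    points; the client stays silent while inside its current safe region.
    """
    total = 0
    region: Optional[Tuple[float, float]] = None
    for position in trajectory:
        if region is None or not (region[0] <= position < region[1]):
            region = safe_region(position, width)
            total += 1 + k + 2
    return total


trajectory = [0.5 * t for t in range(10)]  # positions 0.0, 0.5, ..., 4.5
print(points_transferred(trajectory, width=2.0, k=1))   # 12
print(points_transferred(trajectory, width=0.25, k=1))  # 40
```

With wide safe regions the client contacts the server three times, whereas with regions narrower than one movement step it contacts the server at every timestamp, which mimics an approach that reports every location change.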
7.2. Experimental Results of Top-k Spatial Keyword Queries in Static Road Networks
7.2.1. Effect of k. Figure 9 shows the effect of the number of results on the query processing time and communication cost for both algorithms. Figure 9(a) indicates that the query processing time increases for both algorithms as the value of $k$ increases. This is expected because, as $k$ increases, more data objects must be explored and verified. Nevertheless, COSK significantly outperforms CMTkSK+ for two main reasons. First, the search for relevant objects is very efficient when using the highest significant factor; second, COSK does not need to verify the set of answer objects as long as the query object lies in a safe region. By contrast, the query processing time of CMTkSK+ increases significantly because it has to monitor and verify the set of candidate objects periodically. In Figure 9(b), the communication costs for both algorithms increase as the number of requested objects increases. However, the proposed algorithm demonstrates superior performance compared to CMTkSK+ because client-server communication is not required while the query object lies within the safe exit points, whereas in CMTkSK+ the query object is required to report its location to the server whenever it moves.
7.2.2. Effect of $N_D$. This experiment was conducted on the San Joaquin dataset. This dataset included 19098 data objects; therefore, we randomly generated approximately 30000 additional data objects on different edges. In Figure 10, we evaluate the performance of COSK and CMTkSK+ by varying the cardinality of the data objects. Note that $N_D = 10K$ corresponds to a low density of data points, while $N_D = 50K$ corresponds to a high density. In Figure 10(a), it is interesting to notice that the query processing times of both algorithms decrease as the cardinality of the data objects increases. For CMTkSK+, this is because the monitoring range of a query decreases at high density. For COSK, it is mainly because fewer edges need to be expanded when the data density is high, which decreases the query processing time. In Figure 10(b), we study the influence of the cardinality of the data objects on the communication costs. The experimental results indicate that CMTkSK+ incurs almost constant communication costs regardless of the data object cardinality. However, the communication costs of COSK increase in proportion to the $N_D$ value. This is expected because the safe region becomes smaller as the density of the data objects increases, which increases the communication costs.
7.2.3. Effect of Query Keywords (n). Figure 11 shows the query processing time and communication cost for COSK and CMTkSK+ as a function of the number of query keywords. Figures 11(a) and 11(b) show that the performance of both algorithms degrades as the number of keywords increases. This is mainly because increasing the number of query keywords may also increase the number of relevant objects, resulting in a higher query processing time and communication cost. However, the safe-region-based algorithm COSK scales better than CMTkSK+ because of its less expensive monitoring technique.
7.2.4. Effect of $\alpha$. Figure 12 demonstrates the impact of the query parameter $\alpha$ on the query processing time and on the communication cost. A small value of $\alpha$ gives greater importance to textual relevance, whereas a high value of $\alpha$ gives more preference to spatial relevance. It is interesting to note that the query processing time is lower for higher values of $\alpha$, which give more importance to spatial relevance. This is mainly because, when spatial relevance is weighted more heavily, fewer edges and objects need to be explored and processed to determine the top-k data objects. Observe that in Figure 12(b), the number of messages sent by COSK decreases sharply with an increase in $\alpha$.

7.2.5. Effect of Speed. Figure 13(a) demonstrates the influence of the speed of the query objects on the query processing time of the COSK and CMTkSK+ algorithms. The experimental results indicate that the performance of CMTkSK+ is not significantly influenced by the speed of the query objects because the candidate objects must be monitored at regular time intervals regardless of the speed. For COSK, on the other hand, the performance gradually degrades as the speed of the query objects increases because the objects leave their respective safe regions more frequently. Figure 13(b) shows the communication costs of COSK and CMTkSK+ with respect to the speed of the query objects. CMTkSK+ incurs almost constant communication costs because a server-initiated request to verify the candidate objects does not depend on the speed. For COSK, the query objects cross safe regions more frequently when the speed is high, which increases the communication costs.

Figure 9: Effect of k on query processing time and number of edges processed. (a) Query processing time (s) versus $k$. (b) Communication cost (number of messages transferred) versus $k$. Curves: COSK and CMTkSK+.

Figure 10: Effect of $N_D$ on query processing time and communication cost. (a) Query processing time (s) versus $N_D$. (b) Communication cost (number of messages transferred) versus $N_D$. Curves: COSK and CMTkSK+.
Figure 11: Effect of number of keywords on query processing time and communication cost. (a) Query processing time (s) versus number of keywords. (b) Communication cost (number of messages transferred) versus number of keywords. Curves: COSK and CMTkSK+.

Figure 12: Effect of $\alpha$ on query processing time and communication cost. (a) Query processing time (s) versus $\alpha$. (b) Communication cost (number of messages transferred) versus $\alpha$. Curves: COSK and CMTkSK+.
7.2.6. Effect of Mobility. Figure 14 shows the effect of mobility $M_{qry}$ (the percentage of query objects that are moving at any timestamp) on the performance of the COSK and CMTkSK+ algorithms. As expected, the query processing time and communication costs for both algorithms increase with $M_{qry}$. Nevertheless, COSK performs better than CMTkSK+ in terms of both query processing time and communication costs.
7.2.7. Effect of Directed Edges. Figure 15 shows the impact of the percentage of directed edges $E_{dir}$ on the performance of the COSK and CMTkSK+ algorithms. The query processing time increases with $E_{dir}$ because each algorithm needs to explore more edges to answer the top-k keyword queries. However, the communication cost is not significantly affected by the value of $E_{dir}$ for either algorithm.
7.2.8. Effect of Datasets. Figure 16 shows the index sizes of the COSK and CMTkSK+ approaches for different datasets. As shown in Figure 16, both algorithms have similar index sizes. However, COSK has a minor space overhead because it stores additional information on the highest significance factor $\theta_t$ of edges. More importantly, this space overhead is minimal compared with the gain achieved by COSK in query processing time and communication costs.
16 Wireless Communications and Mobile Computing
25 50 75 100 125
COSKCMTkSK+
0
10
20
30
Que
ry p
roce
ssin
g tim
e (s)
40
Vqry
(a) Query processing time
COSKCMTkSK+
100
1k
10k
100k
of
mes
sage
s tra
nsfe
rred
1M
25 50 75 100 125Vqry
(b) Communication cost
Figure 13 Effect of speed on query processing time and communication cost
20 40 60 80 100Mqry
COSKCMTkSK+
0
15
45
30
60
Que
ry p
roce
ssin
g tim
e (s)
(a) Query processing time
100
10k
100k
of
mes
sage
s tra
nsfe
rred
1M
20 40 60 80 100Mqry
1k
COSKCMTkSK+
(b) Communication cost
Figure 14 Effect of mobility on query processing time and communication cost
7.3. Experimental Results of Top-k Spatial Keyword Queries in Dynamic Road Networks. In this section, we evaluate the performance of COSK and the basic algorithm for dynamic road networks. $E_{upd}$ indicates the percentage of all edges that change their weight at each timestamp. The length of an updated edge is randomly selected between 0.1 and 10 times the original length. Figure 17(a) depicts the query processing time of COSK and the basic algorithm. It is evident from the figure that the query processing time of the basic algorithm is not significantly affected by $E_{upd}$. This is mainly because the query objects issue top-k spatial keyword queries at each timestamp. However, the query processing time of COSK increases with the value of $E_{upd}$ because the probability that an updated edge is associated with the monitoring region of query $q$ increases with $E_{upd}$. Therefore, when $E_{upd}$ becomes large, the results need to be updated frequently, which increases the query processing time. Figure 17(b) shows the communication costs of COSK and the basic algorithm with respect to $E_{upd}$. The basic algorithm incurs almost constant communication costs regardless of the value of $E_{upd}$. In contrast, the communication cost of COSK increases with $E_{upd}$ because the query result and safe regions need to be updated frequently.
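The edge update workload described above can be sketched as follows. This is an illustrative reconstruction, not the authors' generator: the uniform distribution over the interval from 0.1 to 10, the rounding of the number of updated edges, and all names are assumptions.

```python
import random
from typing import Dict, Tuple

Edge = Tuple[str, str]  # hypothetical edge identifier


def apply_random_updates(original: Dict[Edge, float], ratio: float,
                         rng: random.Random) -> Dict[Edge, float]:
    """Return new weights where a `ratio` share of edges is rescaled.

    Each selected edge gets a weight drawn uniformly between 0.1 and 10
    times its original length; the uniform draw is an assumption.
    """
    count = round(len(original) * ratio)
    chosen = rng.sample(sorted(original), count)
    updated = dict(original)
    for edge in chosen:
        updated[edge] = original[edge] * rng.uniform(0.1, 10.0)
    return updated


weights = {("n1", "n2"): 3.0, ("n1", "n6"): 5.0,
           ("n2", "n3"): 2.0, ("n3", "n4"): 4.0}
new_weights = apply_random_updates(weights, ratio=0.5, rng=random.Random(7))
for edge in sorted(weights):
    print(edge, weights[edge], "->", round(new_weights[edge], 3))
```

Passing an explicit `random.Random` instance keeps a run reproducible, which helps when the same update sequence must be replayed against several algorithms.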
Figure 15: Effect of $E_{dir}$ on query processing time and communication cost. (a) Query processing time (s) versus $E_{dir}$. (b) Communication cost (number of messages transferred) versus $E_{dir}$.

Figure 16: Effect of dataset on index size. Index size (MB) of COSK and CMTkSK+ for the Oldenburg, San Francisco, and San Joaquin datasets.
8. Conclusion
In this paper, we investigated moving top-k spatial keyword queries in directed and dynamic road networks. We presented an efficient indexing framework using inverted files that indexes the data objects on edges, allowing for the effective searching of data objects relevant to queries in terms of both textual and spatial relevance. We also presented a safe-exit-based algorithm called COSK to monitor moving top-k spatial keyword queries. We demonstrated that the query results remain valid as long as the query object resides within a safe region. Furthermore, COSK can effectively monitor the validity of query results and safe regions in dynamic road networks. Finally, an experimental evaluation conducted on real road networks demonstrated that COSK significantly reduces the query processing time and communication costs compared to the CMTkSK+ algorithm.
Figure 17: Effect of $E_{upd}$ on query processing time and communication cost. (a) Query processing time (s) versus $E_{upd}$. (b) Communication cost (number of messages transferred) versus $E_{upd}$. Curves: COSK and Basic.

Data Availability

The real road network data used in this study have also been used in many previous studies. The road network data are cited in the manuscript and are available at https://www.cs.utah.edu/~lifeifei/SpatialDataset.htm. To simulate the moving queries, the authors used a spatiotemporal data generator, which has also been used in previous studies. The research article describing the generator is cited in the manuscript. The documentation and source files of the generator are available at https://iapg.jade-hs.de/personen/brinkhoff/generator/. The authors used Twitter tweets to generate the descriptions of data objects and the query keywords. The tweets used are accessible at http://followthehashtag.com/datasets/free-twitter-dataset-usa-200000-free-usa-tweets/.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
Hyung-Ju Cho was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean Government (MSIP) (NRF-2016R1A2B4009793), and this research was partially supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2016R1D1A1B03934129).
References
[1] D. Papadias, N. Mamoulis, J. Zhang, and Y. Tao, “Query processing in spatial network databases,” in Proceedings of the 29th International Conference on Very Large Data Bases (VLDB ’03), pp. 802–813, September 2003.
[2] H.-J. Cho, K. Ryu, and T.-S. Chung, “An efficient algorithm for computing safe exit points of moving range queries in directed road networks,” Information Systems, vol. 41, pp. 1–19, 2014.
[3] G. Tsatsanifos and A. Vlachou, “On processing Top-k spatio-textual preference queries,” in Proceedings of the 18th International Conference on Extending Database Technology (EDBT ’15), pp. 433–444, March 2015.
[4] R. Li, A. X. Liu, A. L. Wang, and B. Bruhadeshwar, “Fast range query processing with strong privacy protection for cloud computing,” Proceedings of the VLDB Endowment, vol. 7, no. 14, pp. 1953–1964, 2014.
[5] G. Cong, C. S. Jensen, and D. Wu, “Efficient retrieval of the Top-k most relevant spatial web objects,” Proceedings of the VLDB Endowment, vol. 2, no. 1, pp. 337–348, 2009.
[6] Z. Li, K. C. K. Lee, B. Zheng, W.-C. Lee, D. Lee, and X. Wang, “IR-tree: An efficient index for geographic document search,” IEEE Transactions on Knowledge and Data Engineering, vol. 23, no. 4, pp. 585–599, 2011.
[7] Y. Zhou, X. Xie, C. Wang, Y. Gong, and W. Ma, “Hybrid index structures for location-based web search,” in Proceedings of the 14th ACM International Conference on Information and Knowledge Management, pp. 155–162, Bremen, Germany, October 2005.
[8] J. Zobel and A. Moffat, “Inverted files for text search engines,” ACM Computing Surveys, vol. 38, no. 2, 2006.
[9] N. Beckmann, H. Kriegel, R. Schneider, and B. Seeger, “R*-tree: an efficient and robust access method for points and rectangles,” in Proceedings of the ACM SIGMOD International Conference on Management of Data, vol. 19, pp. 322–331, May 1990.
[10] R. Hariharan, B. Hore, C. Li, and S. Mehrotra, “Processing spatial-keyword (sk) queries in geographic information retrieval (gir) systems,” in Proceedings of the 19th International Conference on Scientific and Statistical Database Management (SSDBM ’07), July 2007.
[11] I. De Felipe, V. Hristidis, and N. Rishe, “Keyword search on spatial databases,” in Proceedings of the 24th International Conference on Data Engineering (ICDE ’08), pp. 656–665, April 2008.
[12] J. B. Rocha-Junior, O. Gkorgkas, S. Jonassen, and K. Nørvåg, “Efficient processing of top-k spatial keyword queries,” in Proceedings of the International Symposium on Spatial and Temporal Databases, pp. 205–222, Springer, 2011.
[13] D. Zhang, K.-L. Tan, and A. K. Tung, “Scalable top-k spatial keyword search,” in Proceedings of the 16th International Conference on Extending Database Technology, pp. 359–370, 2013.
[14] J. B. Rocha-Junior and K. Nørvåg, “Top-k spatial keyword queries on road networks,” in Proceedings of the 15th International Conference on Extending Database Technology, pp. 168–179, Berlin, Germany, March 2012.
[15] H.-J. Cho, S. J. Kwon, and T.-S. Chung, “A safe exit algorithm for continuous nearest neighbor monitoring in road networks,” Mobile Information Systems, vol. 9, no. 1, pp. 37–53, 2013.
[16] D. Yung, M. L. Yiu, and E. Lo, “A safe-exit approach for efficient network-based moving range queries,” Data & Knowledge Engineering, vol. 72, pp. 126–147, 2012.
[17] M. Attique, H. Cho, R. Jin, and T. Chung, “Efficient Processing of Continuous Reverse k Nearest Neighbor on Moving Objects in Road Networks,” ISPRS International Journal of Geo-Information, vol. 5, no. 12, p. 247, 2016.
[18] H. G. Elmongui, M. F. Mokbel, and W. G. Aref, “Continuous aggregate nearest neighbor queries,” GeoInformatica, vol. 17, no. 1, pp. 63–95, 2013.
[19] D. Wu, M. L. Yiu, C. S. Jensen, and G. Cong, “Efficient continuously moving top-k spatial keyword query processing,” in Proceedings of the IEEE International Conference on Data Engineering (ICDE ’11), pp. 541–552, Hannover, Germany, April 2011.
[20] W. Huang, G. Li, K.-L. Tan, and J. Feng, “Efficient safe-region construction for moving top-k spatial keyword queries,” in Proceedings of the 21st ACM International Conference on Information and Knowledge Management, pp. 932–941, 2012.
[21] L. Guo, J. Shao, H. H. Aung, and K.-L. Tan, “Efficient continuous top-k spatial keyword queries on road networks,” GeoInformatica, vol. 19, no. 1, pp. 29–60, 2014.
[22] Y. Li, G. Li, L. Shu, Q. Huang, and H. Jiang, “Continuous monitoring of top-k spatial keyword queries in road networks,” Journal of Information Science and Engineering, vol. 31, no. 6, pp. 1831–1848, 2015.
[23] M. Attique, A. Khan, and T.-S. Chung, “ESPAK: Top-k spatial keyword query processing in directed road networks,” in Proceedings of the Workshops of the International Conference on Extending Database Technology and the International Conference on Database Theory (EDBT/ICDT ’17), March 2017.
[24] G. Salton and C. Buckley, “Term-weighting approaches in automatic text retrieval,” Information Processing & Management, vol. 24, no. 5, pp. 513–523, 1988.
[25] V. N. Anh, O. de Kretser, and A. Moffat, “Vector-space ranking with effective early termination,” in Proceedings of the 24th Annual International ACM SIGIR Conference, pp. 35–42, New Orleans, LA, USA, 2001.
[26] E. W. Dijkstra, “A note on two problems in connexion with graphs,” Numerische Mathematik, vol. 1, pp. 269–271, 1959.
[27] “Real datasets for spatial databases,” https://www.cs.utah.edu/~lifeifei/SpatialDataset.htm.
[28] “Twitter,” https://twitter.com.
[29] T. Brinkhoff, “A framework for generating network-based moving objects,” GeoInformatica, vol. 6, no. 2, pp. 153–180, 2002.
Figure 2: Indexing framework. The inverted file consists of a vocabulary, keyed by $\langle e_{id}, t_{id} \rangle$, and inverted lists whose entries record a data object $d_i$ and $dist(n_s, d_i)$. Pruning proceeds in two steps: (1) compute the upper-bound score using $dist(n, q)$ and $t^{+}$; (2) access the inverted list of a term only if the upper-bound score is greater than the score of the $k$th object.
Table 2: Summary of notations used in this paper.

| Notation | Definition |
| --- | --- |
| $G = (N, E, W)$ | Graph model of road network |
| $dist(p_s, p_e)$ | Length of shortest path from $p_s$ to $p_e$, where $p_s$ and $p_e$ represent start and end points, respectively |
| $len(p_1, p_2)$ | Length of segment connecting two points $p_1$ and $p_2$ |
| $n_i$ | Node in road network |
| $e = (n_s, n_e)$ | Edge in edge set $E$, where $n_s$ and $n_e$ are start and end points of the edge |
| $n_\beta$ | Boundary node corresponding to start ($n_s$) or end ($n_e$) point of an edge |
| $W(e)$ | Weight of edge $(n_s, n_e)$ |
| $q$ | Query point in road network |
| $k$ | Number of data objects requested by query $q$ |
| $D$ | Set of data objects, $D = \{d_1, d_2, \ldots, d_{\lvert D \rvert}\}$ |
| $D(n_s, n_e)$ | Set of data objects in an edge |
| $p_a$ | Anchor point that corresponds to start point of expansion |
| $P_{SE}$ | Safe exit point, where safe and nonsafe regions of $q$ intersect |
| $\alpha$ | Query parameter |
| $\psi(d)$ | Score of data object $d$ |
| $\mu(d.t, q.t)$ | Textual relevance of data object $d$ with query keywords |
| $\lambda(d.l, q.l)$ | Spatial relevance of data object $d$ with query location |
| $D^{+}$ | Set of answer objects |
| $D^{-}$ | Set of nonanswer objects |
| $d_l^{+}$ | Lowest answer object |
| $d_h^{-}$ | Highest nonanswer object |
algorithm [26]. Algorithm 1 returns the top-$k$ data objects with the highest scores according to their joint textual and spatial relevance to the query. The algorithm begins by exploring the active edge, where query object $q$ is located, and expands the network in increasing order of distance from $q$. Each entry in the min-heap has the form $(p_a, edge)$, where $p_a$ indicates the anchor point in the edge. For an active edge, $q$ becomes the anchor point. Otherwise, for directed edges, ending node $n_e$ becomes the anchor point. For bidirectional edges, either of the adjacent boundary nodes, i.e., $n_s$ or $n_e$, becomes the anchor point. Let $D_k$ be the current set of top-$k$ data objects and $s_k$ be the score of the $k$-th data object in $D_k$. The $candsearch((e_{id}, t_{id}), s_k)$ function retrieves the candidate data objects $D_c$ located in an edge with a better score $\psi(d)$ than $s_k$. Next, the set $D_k$ is updated with the data objects in $D_c$, and $s_k$ is updated accordingly. The algorithm continues its expansion and inserts the adjacent edges of the boundary node until the heap is exhausted or the upper-bound score of the remaining data objects cannot exceed $s_k$. The upper-bound score $\psi(n)$ of node $n$ is computed using $dist(n, q)$ and the maximum textual relevance ($\mu = 1$). Therefore, if $\psi(n) \le s_k$, then even if there is an unexplored data object $d$ matching all query keywords, its score cannot be better than that of the $k$-th object in $D_k$ because $dist(d, q.l) \ge dist(n, q.l)$. This is guaranteed because the algorithm always expands the node with the minimum distance to the query location.
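The expansion and its stopping rule can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that the query sits on a node, that every edge is directed, that each object is stored with its offset from the start node of its edge, and that the score is $\psi(d) = \mu / (1 + \alpha \cdot dist)$ as in the worked example. The names `snapshot_top_k`, `graph`, and `objects`, and the toy network, are hypothetical.

```python
import heapq

def text_relevance(doc_terms, query_terms):
    """Fraction of the document's keywords that appear in the query."""
    if not doc_terms:
        return 0.0
    return sum(1 for t in doc_terms if t in query_terms) / len(doc_terms)

def snapshot_top_k(graph, objects, source, query_terms, k, alpha=1.0):
    """graph: {u: [(v, weight), ...]} (directed edges).
    objects: {(u, v): [(obj_id, offset_from_u, terms), ...]}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]          # min-heap ordered by network distance
    settled = set()
    scores = {}                     # best score found so far per object
    while heap:
        d_u, u = heapq.heappop(heap)
        if u in settled:
            continue
        settled.add(u)
        top = heapq.nlargest(k, scores.values())
        s_k = top[-1] if len(top) == k else 0.0
        # Upper bound for anything still unexplored: mu = 1 at distance d_u.
        if 1.0 / (1.0 + alpha * d_u) <= s_k:
            break
        for v, w in graph.get(u, []):
            for obj_id, offset, terms in objects.get((u, v), []):
                mu = text_relevance(terms, query_terms)
                if mu > 0.0:
                    score = mu / (1.0 + alpha * (d_u + offset))
                    scores[obj_id] = max(score, scores.get(obj_id, 0.0))
            if d_u + w < dist.get(v, float("inf")):
                dist[v] = d_u + w
                heapq.heappush(heap, (d_u + w, v))
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])

# Hypothetical toy network, not the network of Figure 1.
graph = {"q": [("a", 1.0)], "a": [("b", 3.0), ("c", 2.0)]}
objects = {
    ("q", "a"): [("d2", 0.5, ["cafe"])],
    ("a", "b"): [("d3", 3.0, ["italian", "restaurant"])],
    ("a", "c"): [("d4", 1.0, ["chinese", "restaurant"])],
}
result = snapshot_top_k(graph, objects, "q", {"italian", "restaurant"}, k=1)
```

In this toy run, the expansion stops when node `b` is popped at distance 4, because the best possible score there, $1/(1+4)$, no longer exceeds the current $s_k$.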
(1) Input: Top-k spatial keyword query $Q_N = (q.l, q.t, k)$
(2) Output: Top-k data objects with highest score
(3) $D_c \leftarrow \emptyset$ /* set of candidate data objects */
(4) max-heap $D_k \leftarrow \emptyset$ /* current top-k set */
(5) $s_k \leftarrow 0$ /* k-th score in $D_k$ */
(6) min-heap $\leftarrow \emptyset$
(7) $explored \leftarrow \emptyset$
(8) min-heap.insert($q.l$, $edge_{active}$)
(9) $D_c \leftarrow candsearch((e_{id}, t_{id}), s_k)$
(10) update $D_k$ and $s_k$ with $d \in D_c$
(11) while min-heap $\ne \emptyset$ and ($1/(1 + \alpha\lambda(d.l, q.l)) > s_k$) do
(12)   for each unexplored adjacent edge of $(p_a, edge)$ do
(13)     $explored \leftarrow explored \cup (p_a, edge)$
(14)     $D_c \leftarrow candsearch((e_{id}, t_{id}), s_k)$
(15)     update $D_k$ and $s_k$ with $d \in D_c$
(16)   end
(17)   min-heap.insert(adjacent node, edge)
(18) end
(19) return $D_k$

Algorithm 1: EvaluateSnapshotQuery(Node $n_i$, Edge $e_i$).

Algorithm 2 presents the $candsearch((e_{id}, t_{id}), s_k)$ procedure, which finds the candidate data objects. This procedure has two main steps. In the first step, the upper-bound score of the edge is computed using the significance factor $\theta_t$ of a term $t \in q.t$ and the shortest distance $sdist(e_i, q.l)$ between the edge and the query location. In the next step, the inverted lists of term $t$ are fetched only if the upper-bound score is greater than $s_k$. From the inverted lists, the objects with score $\psi(d)$ greater than $s_k$ are returned.

(1) Input: Edge ID $e_{id}$, Term ID $t_{id}$, score of k-th object $s_k$
(2) Output: candidate list $D_c$
(3) compute $\theta_t(e_i)$
(4) if $\theta_t(e_i) > 0$ then
(5)   $maxscore(e_i) \leftarrow computemaxscore(\theta_t, dist(e_i, q.l))$
(6) end
(7) if $maxscore(e_i) > s_k$ then
(8)   for each data object $d$ in $e_i$ do
(9)     compute $d_{score}$
(10)    if $d_{score} > s_k$ then
(11)      $D_c \leftarrow D_c \cup \{d\}$
(12)    end
(13)  end
(14) end
(15) return $D_c$

Algorithm 2: CandidateSearch($(e_{id}, t_{id})$, $s_k$).
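The two-step pruning of the candidate search can be illustrated as follows. This is a sketch under stated assumptions rather than the paper's procedure: the posting layout `{(edge_id, term): [(object_id, weight, offset), ...]}` is hypothetical, the textual relevance of an object is taken to be the sum of its query-term weights capped at 1, and the edge-level bound simply combines the largest weight of each query term with the distance to the edge in place of the significance factor $\theta_t$.

```python
from collections import defaultdict

def cand_search(edge_id, query_terms, s_k, postings, edge_dist, alpha=1.0):
    """Return {object_id: score} for objects on one edge that beat s_k.

    postings:  {(edge_id, term): [(object_id, weight, offset), ...]}
    edge_dist: shortest distance from the query to the point where the
               expansion enters the edge; offset is measured from that point.
    """
    lists = [postings.get((edge_id, t), []) for t in query_terms]

    # Step 1: edge-level upper bound from the largest weight of each term.
    max_mu = min(1.0, sum(max((w for _, w, _ in lst), default=0.0)
                          for lst in lists))
    if max_mu == 0.0 or max_mu / (1.0 + alpha * edge_dist) <= s_k:
        return {}                     # inverted lists are never scanned

    # Step 2: scan the inverted lists and keep objects that beat s_k.
    mu = defaultdict(float)
    offset_of = {}
    for lst in lists:
        for obj, w, off in lst:
            mu[obj] += w
            offset_of[obj] = off
    result = {}
    for obj, m in mu.items():
        score = min(1.0, m) / (1.0 + alpha * (edge_dist + offset_of[obj]))
        if score > s_k:
            result[obj] = score
    return result

# Hypothetical postings for a single edge "e1".
postings = {
    ("e1", "italian"): [("d3", 0.5, 3.0)],
    ("e1", "restaurant"): [("d3", 0.5, 3.0), ("d9", 0.5, 1.0)],
}
hits = cand_search("e1", ["italian", "restaurant"], 0.18, postings, 1.0)
```

The point of the first step is that an edge whose bound cannot beat $s_k$ costs only a vocabulary lookup; its postings are not read at all.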
To understand the proposed algorithm, consider the road network presented in Figure 1. Assume that a query $q$ issues a top-1 keyword query with $q.t$ = “Italian Restaurant”. For ease of presentation, we assume $\alpha = 1$, and the textual relevance $\mu$ is the number of occurrences of query keywords in $d.t$ divided by the number of keywords in the document (the description of the data object). For example, $\psi(d_4) = \mu(d_4.t, q.t)/(1 + \lambda(d_4.l, q.l)) = 0.5/8 \approx 0.06$. The algorithm starts the network expansion from the active edge $\overrightarrow{(n_2, n_3)}$, where $q$ is the anchor point. Note that the direction of the edge $\overrightarrow{(n_2, n_3)}$ is from $n_2$ to $n_3$. Therefore, the algorithm explores only $\overrightarrow{(q, n_3)}$. No data object is found in $\overrightarrow{(q, n_3)}$. Then $n_3$ becomes the anchor point, and edges $(n_3, n_4)$, $(n_3, n_5)$, and $(n_3, n_7)$ are inserted in the min-heap. Next, the $candsearch$ function retrieves the candidate data objects on edges $(n_3, n_4)$, $(n_3, n_5)$, and $(n_3, n_7)$ whose score is better than $s_k$. On edge $(n_3, n_5)$, data object $d_3$ is retrieved with $\psi(d_3) = 0.2$. Data object $d_3$ is inserted in the $D_k$ set, and the value of $s_k$ is set to 0.2. For edges $(n_3, n_4)$ and $(n_3, n_7)$, no candidate object is found because $d_2.t$ (“Cafe”) and $d_7.t$ (“Cafe and Bakery”) do not match $q.t$. The algorithm continues expanding the edges whose upper-bound score is greater than $s_k$. The edge $\overrightarrow{(n_7, n_2)}$ is explored next. The upper-bound score of $\overrightarrow{(n_7, n_2)}$ is $1/7 \approx 0.14$, which is less than $s_k$. Similarly, for edge $\overleftarrow{(n_6, n_5)}$, the upper-bound score is $0.5/8 \approx 0.06 < s_k$. Therefore, the algorithm terminates and reports $d_3$ as the top-1 result.
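For reference, the termination check at the end of this example can be written out numerically. The block below only restates the values reported in the example, reading the two reported bounds as $1/7$ and $0.5/8$; the decimal approximations are added here for illustration.

```latex
\begin{align*}
s_k &= \psi(d_3) = 0.2, \\
\text{upper bound of } \overrightarrow{(n_7, n_2)} &= \tfrac{1}{7} \approx 0.143 < 0.2, \\
\text{upper bound of } \overleftarrow{(n_6, n_5)} &= \tfrac{0.5}{8} = 0.0625 < 0.2 .
\end{align*}
```

Because neither remaining edge can contain an object that scores above $s_k$, no further inverted list needs to be read.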
Figure 3: Illustration of directed road network. $q$ issues a TkSK query at $p_1$; the server returns a set of objects for $p_1$.

Figure 4: Illustration of directed road network. $q$ issues a TkSK query at $p_2$; the server returns a set of objects for $p_2$.
5. Moving Top-$k$ Spatial Keyword Queries
In this section, we present our method to monitor moving top-$k$ spatial keyword queries, where query objects move in a directed road network. Figure 3 provides an example of TkSK in road networks, where query point $q$ issues a TkSK query at point $p_1$. Note that the numbers on the arrows in the figure indicate the order of the steps. To obtain the top-$k$ results at $p_1$, the server executes Algorithm 1, as mentioned in Section 4.2. Now consider that the query object moves to $p_2$, as shown in Figure 4, and needs the top-$k$ results at point $p_2$. The simple method is to repeat the procedure executed at $p_1$. However, recomputing the result whenever query $q$ changes its location significantly increases the computation cost. Furthermore, it also increases the communication overhead because the query object must report its location whenever it moves and the server must send back the result set. To address these issues, we introduce the safe exit approach.
In the proposed framework, the server computes safe exit points for a query object. The server maintains a set of moving queries, and the result of each query remains valid as long as the query object stays within its safe exit points. Whenever a query object passes one of its safe exit points, the server recomputes the TkSK result and the safe exit points for that query object.
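The communication saving of this framework can be seen in a deliberately simplified simulation. The sketch below is an assumption-laden illustration, not the COSK protocol: the road network is reduced to a single undirected line, the query is top-1 by distance only, and `server_query` and `track` are hypothetical names. The client contacts the server only when its position leaves the interval bounded by the safe exit points it last received.

```python
def server_query(x, objects):
    """Top-1 object on a line plus the interval in which it stays top-1.

    objects: {name: position}. The interval ends are the safe exit points,
    here the midpoints to the neighbouring objects.
    """
    best = min(objects, key=lambda name: abs(objects[name] - x))
    pos = objects[best]
    left = [p for p in objects.values() if p < pos]
    right = [p for p in objects.values() if p > pos]
    lo = (max(left) + pos) / 2 if left else float("-inf")
    hi = (min(right) + pos) / 2 if right else float("inf")
    return best, (lo, hi)

def track(path, objects):
    """Follow a moving query; return its result per position and the
    number of client-server round trips."""
    messages = 0
    results = []
    result, (lo, hi) = None, (1.0, 0.0)   # empty interval forces first query
    for x in path:
        if not (lo <= x <= hi):           # a safe exit point was passed
            result, (lo, hi) = server_query(x, objects)
            messages += 1
        results.append(result)
    return results, messages

results, messages = track([1, 2, 3, 4, 6, 7], {"a": 0.0, "b": 10.0})
```

Six position updates trigger only two round trips in this run, whereas recomputing at every move would trigger six.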
Next, we present our method to compute the safe exit points for a query object. A safe exit point is a point in the segment where a safe region and a nonsafe region meet. We compute the safe exit point using the divide-and-conquer technique. Before presenting the detailed methodology, we define the terminology used in this section.
Definition 1 (safe region). A safe region is a portion of a road segment such that, as long as the query point lies in it, the top-$k$ results of the query remain valid.

Definition 2 (answer objects $D^{+}$). For a top-1 query, a data object $d$ is called an answer object of query $q$ if $\psi(d) > \psi(d_a)$, where $d_a$ represents any other data object in the directed road network. This definition generalizes to TkSK queries: a data object $d$ is called an answer object of query $q$ if $\psi(d) > \psi(d_{k+1})$, where $d_{k+1}$ represents the $(k+1)$th data object, ranked by score, in the directed road network. In other words, all answer objects are top-$k$ results of query $q$.

Definition 3 (nonanswer objects $D^{-}$). For a top-1 query, a data object $d$ is called a nonanswer object of query $q$ if $\psi(d) < \psi(d_a)$ for some other data object $d_a$ in the directed road network. This definition generalizes to TkSK queries: a data object $d$ is called a nonanswer object of query $q$ if $\psi(d) < \psi(d_k)$, where $d_k$ represents the $k$th data object, ranked by score, in the directed road network. Therefore, none of the nonanswer objects are in the top-$k$ results of query $q$.

Definition 4 (lowest answer object $d_l^{+}$). An answer object $d^{+} \in D^{+}$ is called the lowest answer object $d_l^{+}$ at a point $p \in G$ if $\psi(d_l^{+})_p = \min(\psi(d_1^{+})_p, \psi(d_2^{+})_p, \ldots, \psi(d_{|D^{+}|}^{+})_p)$, where $\psi(d_l^{+})_p$ represents the score of the lowest answer object at point $p$. In other words, $\psi(d_l^{+})_p < \psi(d_a^{+})_p$ at point $p$, where $d_a^{+}$ is any other answer object in the $D^{+}$ set.

Definition 5 (highest nonanswer object $d_h^{-}$). A nonanswer object $d^{-} \in D^{-}$ is called the highest nonanswer object $d_h^{-}$ at a point $p \in G$ if $\psi(d_h^{-})_p = \max(\psi(d_1^{-})_p, \psi(d_2^{-})_p, \ldots, \psi(d_{|D^{-}|}^{-})_p)$, where $\psi(d_h^{-})_p$ represents the score of the highest nonanswer object at point $p$. In other words, $\psi(d_h^{-})_p > \psi(d_a^{-})_p$ at point $p$, where $d_a^{-}$ is any other nonanswer object in the $D^{-}$ set.
As discussed earlier, the main challenge in the continuous processing of moving TkSK queries is to maintain the validity of the result set, because the movement of query objects can invalidate it. To monitor the validity of the result set, we propose a safe-region-based approach.
5.1. Computation of Safe Exit Points. In this section, we present our technique to compute the safe exit points. The main goal is to find a point in the road network where the query result set will change. The result set changes when the score of the highest nonanswer object $d_h^{-}$ surpasses the score of the lowest answer object $d_l^{+}$. Generally, the textual relevance score does not change. Therefore, the score of a data object changes only because of the spatial relevance score, which changes only with the movement of the query object. The computation of the safe exit point is based on two key observations.
Observation 1. If $D^{+}_{n_\beta} = D^{+}_{p_a}$, there is no safe exit point in the segment.

Explanation. $D^{+}_{p_a}$ represents the set of answer objects at anchor point $p_a$, whereas $D^{+}_{n_\beta}$ represents the set of answer objects at boundary node $n_\beta$. As discussed earlier, a safe exit point is a point where the query result changes. If the query results at the starting node of a segment (edge) are the same as those at its ending node, the segment contains no point where the query result changes. Hence, we do not search for a safe exit point in that segment.

Observation 2. If $D^{+}_{p_a} \ne D^{+}_{n_\beta}$, there is a safe exit point in the segment.

Explanation. In contrast to Observation 1, if the query results are different at the starting and ending points, then there exists a point where the query result changes. Hence, there is a safe exit point in the segment.
To find the safe region, we observe the following cases.

Case 1 (when $\alpha = 1$ and the textual relevance of the highest nonanswer object and lowest answer object is the same). In this case, both the textual and spatial relevance have the same importance (i.e., $\alpha = 1$). In addition, the top-$k$ result depends only on the spatial relevance because the textual relevance of both objects is the same. The data object that is closer to query point $q$ becomes the answer object. For an undirected edge, the safe exit point $p_{se}$ is the center point between the lowest answer object and the highest nonanswer object, i.e., the point where $\max(dist(p_{se}, d_1^{+}), dist(p_{se}, d_2^{+}), \ldots, dist(p_{se}, d_{|D^{+}|}^{+})) = \min(dist(p_{se}, d_1^{-}), dist(p_{se}, d_2^{-}), \ldots, dist(p_{se}, d_{|D^{-}|}^{-}))$. However, in the case of a directed edge, where $dist(p_a, n_\beta) \ne dist(n_\beta, p_a)$, the safe exit point is either $d_l^{+}$ or $p_a$. If $d_l^{+} \in (p_a, n_\beta)$, then the safe exit point is $d_l^{+}$; otherwise, the safe exit point is $p_a$.

Case 2 (when $\alpha = 1$ and the textual relevance of the highest nonanswer object and lowest answer object is different). In this case, the top-$k$ result depends on all three factors: $\alpha$, the spatial relevance, and the textual relevance. Clearly, for undirected edges, the midpoint between the lowest answer object and the highest nonanswer object does not provide a valid safe exit point. Therefore, we introduce the divide-and-conquer technique, which keeps dividing the search space until it reaches the point where the score of the nonanswer object becomes greater than that of the answer object. Typically, the safe exit point should be closer to the data object whose score is lower. Based on this observation, we first compute the midpoint in a similar fashion to Case 1, and then we continue dividing the search space until we find the point. For directed edges, the safe exit point can be computed in a similar fashion to Case 1.
Case 2 also covers the other cases in which the safe exit point is not the midpoint between the lowest answer object and the highest nonanswer object. In these cases, the safe exit point depends on two or more factors, and it can be computed using the aforementioned divide-and-conquer technique. The following are the scenarios in which the safe exit point can be computed using Case 2:

(a) When $\alpha \ne 1$ and the textual relevance of the nearest nonanswer object and the farthest answer object is different

(b) When $\alpha \ne 1$ and the textual relevance of the nearest nonanswer object and the farthest answer object is the same
Case 3 (when $\alpha = 0$). This means that the spatial relevance has no effect on the score of data objects. Hence, no monitoring is required for this scenario.
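The divide-and-conquer search of Case 2 on an undirected segment can be sketched as a bisection. This is an illustrative sketch under explicit assumptions, not the authors' code: the score is taken to be $\psi = \mu/(1 + \alpha \cdot dist)$, the lowest answer object is assumed to be reached through $p_a$ and the highest nonanswer object through $n_\beta$, and the function name and parameters are hypothetical. Under these assumptions the answer's score falls and the nonanswer's score rises as the query moves from $p_a$ toward $n_\beta$, so their difference changes sign at most once.

```python
def score(mu, dist, alpha=1.0):
    return mu / (1.0 + alpha * dist)

def safe_exit_offset(mu_ans, d_ans, mu_non, d_non, length,
                     alpha=1.0, tol=1e-9):
    """Offset x from p_a (0 <= x <= length) at which the nonanswer object's
    score catches up with the answer object's score, or None.

    d_ans:  distance from p_a to the lowest answer object
    d_non:  distance from n_beta to the highest nonanswer object
    length: length of the segment (p_a, n_beta)
    """
    def gap(x):
        # Positive once the nonanswer object scores above the answer object.
        return (score(mu_non, d_non + (length - x), alpha)
                - score(mu_ans, d_ans + x, alpha))

    if gap(0.0) >= 0.0 or gap(length) < 0.0:
        return None            # no crossing inside this segment
    lo, hi = 0.0, length
    while hi - lo > tol:       # halve the search space each round
        mid = (lo + hi) / 2.0
        if gap(mid) >= 0.0:
            hi = mid
        else:
            lo = mid
    return hi

# Equal textual relevance (Case 1 as a special case): the midpoint.
x_equal = safe_exit_offset(1.0, 3.0, 1.0, 2.0, 3.0)
# Lower textual relevance for the nonanswer object (Case 2).
x_lower = safe_exit_offset(1.0, 3.0, 0.5, 2.0, 3.0)
```

With equal textual relevance the search returns the midpoint of Case 1; with a weaker nonanswer object the exit point shifts toward that object, which matches the remark that the safe exit point lies closer to the object whose score is lower.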
Algorithm 3 retrieves the safe exit points using the observations discussed earlier. The core function in this algorithm is ComputeSafeExit($p_a$, $n_\beta$), which finds the safe exit point in a segment between $p_a$ and $n_\beta$. ComputeSafeExit($p_a$, $n_\beta$) is described in detail in Algorithm 4. First, Algorithm 4 determines $d_l^{+}$ and $d_h^{-}$ at each point $p \in [p_a, n_\beta]$. Recall that $d_l^{+}$ is the lowest answer object at $p$, whereas $d_h^{-}$ is the highest nonanswer object at $p$. Algorithm 4 computes the safe exit point based on the cases discussed earlier. There are two further scenarios for each of Cases 1 and 2. For Case 1, if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$, then the safe exit point is the midpoint between $d_l^{+}$ and $d_h^{-}$. If $dist(p_a, n_\beta) \ne dist(n_\beta, p_a)$, then the edge is directed, and therefore the safe exit point is either $p_a$ or $d_l^{+}$. If $d_l^{+}$ lies on the edge $[p_a, n_\beta]$, then $d_l^{+}$ is the safe exit point. Otherwise, $p_a$ is the safe exit point.

Similarly, for Case 2, if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$, then the safe exit point is computed by repeatedly halving the search space until we find the point closest to $p_a$ such that $\psi(d_h^{-}) > \psi(d_l^{+})$. If $dist(p_a, n_\beta) \ne dist(n_\beta, p_a)$, the safe exit point is computed in the same way as in Case 1.

5.2. Computation of Safe Exit Points for the Example. Consider the same example in Figure 1, where the query point $q$ issues a top-1 keyword query with $q.t$ = “Italian Restaurant”. For this example, let us consider $\alpha = 1$. The monitoring algorithm starts exploring from the active edge containing the query object $q$. Therefore, $\overrightarrow{(q, n_3)}$ is explored first. As shown in Table 3, for $\overrightarrow{(q, n_3)}$, $D^{+}_{q} = \{d_3\}$ and $D^{+}_{n_3} = \{d_3\}$. According to Observation 1, no safe exit point exists in this segment. Therefore, the edges adjacent to $n_3$ are explored, and $n_3$ becomes the new $p_a$. The edge $(n_3, n_4)$ is explored next. Similarly, the answer object at $n_3$ and $n_4$ is the same: $D^{+}_{n_3} = D^{+}_{n_4} = \{d_3\}$. Therefore, a safe exit point does not exist in $(n_3, n_4)$. The edge $(n_3, n_7)$ is explored next. As shown in Table 3, $D^{+}_{n_3} = \{d_3\}$ and $D^{+}_{n_7} = \{d_6\}$. By Observation 2, there is a safe exit point in $(n_3, n_7)$. As shown in Figure 1, $d_3.t = d_6.t$ = “Italian Restaurant” and $dist(n_3, n_7) = dist(n_7, n_3)$.
(1) Input: Same as Algorithm 1
(2) Output: $P_{SE}$, a set of safe exit points
(3) $P_{SE} \leftarrow \emptyset$ /* set of safe exit points */
(4) $D^{+}_{p_a} \leftarrow EvaluateSnapshotQuery(p_a, (p_a, n_\beta))$
(5) /* Results calculated using Algorithm 1 */
(6) $D^{+}_{n_\beta} \leftarrow EvaluateSnapshotQuery(n_\beta, (p_a, n_\beta))$
(7) /* Results calculated using Algorithm 1 */
(8) if $D^{+}_{p_a} = D^{+}_{n_\beta}$ then
(9)   no safe exit point /* refer to Observation 1 */
(10) end
(11) if $D^{+}_{p_a} \ne D^{+}_{n_\beta}$ then
(12)   $P_{SE} \leftarrow P_{SE} \cup ComputeSafeExit(p_a, n_\beta)$ /* safe exit point exists; refer to Observation 2 */
(13) end
(14) return $P_{SE}$

Algorithm 3: COSK monitoring algorithm.

(1) Input: same as Algorithm 1
(2) Output: $p_{se}$, safe exit point in $(p_a, n_\beta)$
(3) $D_l^{+} \leftarrow \{\langle p, d_l^{+} \rangle \mid$ for each point $p \in [p_a, n_\beta]$, $d_l^{+}$ such that $\psi(d_l^{+})_p = \min(\psi(d_1^{+})_p, \psi(d_2^{+})_p, \ldots, \psi(d_{|D^{+}|}^{+})_p)\}$
(4) $D_h^{-} \leftarrow \{\langle p, d_h^{-} \rangle \mid$ for each point $p \in [p_a, n_\beta]$, $d_h^{-}$ such that $\psi(d_h^{-})_p = \max(\psi(d_1^{-})_p, \psi(d_2^{-})_p, \ldots, \psi(d_{|D^{-}|}^{-})_p)\}$
(5) if Case 1 then
(6)   if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$ then
(7)     $p_{se}$ = point such that $\max(dist(p_{se}, d_1^{+}), dist(p_{se}, d_2^{+}), \ldots, dist(p_{se}, d_{|D^{+}|}^{+})) = \min(dist(p_{se}, d_1^{-}), dist(p_{se}, d_2^{-}), \ldots, dist(p_{se}, d_{|D^{-}|}^{-}))$
(8)   end
(9)   if $dist(p_a, n_\beta) \ne dist(n_\beta, p_a)$ then
(10)    $p_{se} = p_a$ or $p_{se} = d_l^{+}$, where $d_l^{+} \in (p_a, n_\beta)$
(11)  end
(12) end
(13) if Case 2 then
(14)   if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$ then
(15)     $p_{se}$ = closest point to $p_a$ such that $\psi(d_h^{-}) > \psi(d_l^{+})$
(16)   end
(17)   if $dist(p_a, n_\beta) \ne dist(n_\beta, p_a)$ then
(18)     Same as Line (10)
(19)   end
(20) end
(21) return $p_{se}$

Algorithm 4: ComputeSafeExit($p_a$, $n_\beta$).
Therefore, according to Case 1, the safe exit point $p_{se1}$ is the midpoint between $d_3$ and $d_6$. That is, $dist(p_{se1}, d_3) = dist(p_{se1}, d_6)$, where $dist(p_{se1}, d_3) = x + 3$ and $dist(p_{se1}, d_6) = -x + 5$ for $0 < x < 3$. Consequently, $x = 1$, which means that the distance from $n_3$ to $p_{se1}$ is 1.
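The same point can be checked in terms of scores instead of distances. The derivation below is a sketch that assumes the score form $\psi = \mu/(1 + \alpha\lambda)$ used in the earlier example, with $\alpha = 1$ and $\mu = 1$ for both objects (both descriptions match the query fully); $x$ denotes the distance from $n_3$.

```latex
\begin{align*}
\psi(d_3)_p &= \frac{1}{1 + (x + 3)} = \frac{1}{x + 4}, &
\psi(d_6)_p &= \frac{1}{1 + (5 - x)} = \frac{1}{6 - x}, \\
\psi(d_3)_p = \psi(d_6)_p
  &\iff x + 4 = 6 - x \iff x = 1, &
\psi(d_3)_{p_{se1}} &= \psi(d_6)_{p_{se1}} = \tfrac{1}{5} = 0.2 .
\end{align*}
```

For $x < 1$ the score of $d_3$ is higher and for $x > 1$ the score of $d_6$ is higher, so the top-1 result switches exactly at $p_{se1}$.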
Next, we determine a safe exit point in $(n_3, n_5)$. As shown in Table 3, the answer object at $n_5$ is the same as that at $n_3$. Hence, no safe exit point exists in this edge. Next, $\overleftarrow{(n_6, n_5)}$ is explored with $p_a = n_5$. According to Table 3, $D^{+}_{n_6} = \{d_4\}$ and $D^{+}_{n_5} = \{d_3\}$. Therefore, a safe exit point exists in this edge. This edge is directed, and for each point $p \in \overleftarrow{(n_6, n_5)}$, the shortest path from $p$ to $d_3$ is $p \rightarrow n_6 \rightarrow n_2 \rightarrow n_3 \rightarrow d_3$. Therefore, $n_5$ is the safe exit point.
The bold lines in Figure 5 indicate the safe region of $q$. The top-1 result remains $d_3$ as long as the query $q$ lies in the safe region.
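The end-point comparison that drives this walkthrough (Observations 1 and 2) is easy to replay. The sketch below hard-codes the top-1 answer sets reported for the example and only decides which segments contain a safe exit point; it does not locate the point itself, and the function name is hypothetical.

```python
def segments_with_safe_exit(segments, answers_at):
    """Keep the segments (p_a, n_beta) whose two end points have
    different answer sets; only these can contain a safe exit point."""
    return [(p_a, n_b) for p_a, n_b in segments
            if answers_at[p_a] != answers_at[n_b]]

# Top-1 answer sets at the end points, as reported for the example.
answers_at = {
    "q": {"d3"}, "n3": {"d3"}, "n4": {"d3"},
    "n7": {"d6"}, "n5": {"d3"}, "n6": {"d4"},
}
# Segments in the order in which the example explores them.
segments = [("q", "n3"), ("n3", "n4"), ("n3", "n7"),
            ("n3", "n5"), ("n5", "n6")]
with_exit = segments_with_safe_exit(segments, answers_at)
```

Only $(n_3, n_7)$ and the edge between $n_5$ and $n_6$ survive the check, which are the two segments where the example places $p_{se1}$ and $p_{se2}$.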
Next we analyze the time complexity for determininga set of safe exit points using a set of qualifying objects119889 isin 119863+119901119886 cup 119863+119899120573 cup 119863(119901119886 119899120573) Note that 119863+119901119886 (119863+119899120573) indicates
10 Wireless Communications and Mobile Computing
Table 3 Computation of safe exit points for example scenario
EdgeSegment 119901119886 119863+119901119886 119863+119899120573 119901119904119890997888997888997888997888rarr(119902 1198993) q 119863+119902 = 1198893 119863+1198993 = 1198893 none(1198993 1198994) q 119863+1198993 = 1198893 119863+1198994 = 1198893 none(1198993 1198997) 1198993 119863+1198993 = 1198893 119863+1198997 = 1198896 1199011199041198901997888997888997888997888997888rarr(1198993 1198995) 1198993 119863+1198993 = 1198893 119863+1198995 = 1198893 nonelarr997888997888997888997888997888(1198996 1198995) 1198995 119863+1198995 = 1198893 119863+1198996 = 1198894 1199011199041198902
Figure 5: Illustration of the safe region of q. The figure shows nodes n1 to n7, data objects d1 (Grand Hotel), d2 (Cafe), d3 (Italian Restaurant), d4 (Chinese Restaurant), d5 (Pub and Bar), d6 (Italian Restaurant), and d7 (Cafe and Bakery), and safe exit points pse1 and pse2.
the set of k data objects that satisfies the query condition at $p_a$ ($n_\beta$). According to Dijkstra's algorithm [26], the time complexity $O(D^+_q)$ for computing a set of answer objects at a query point q is $O(D^+_q) = O(|E| + |N| \log |N|)$. This means that $O(D^+_{p_a}) = O(D^+_{n_\beta}) = O(|E| + |N| \log |N|)$ holds for endpoints $p_a$ and $n_\beta$. Thus, the time complexity $O(\Omega_{kth})$ of determining the skyline $\Omega_{kth}$ with the k-th highest score is $O(\Omega_{kth}) = C_{kth} \cdot O(|D^+_{p_a} \cup D^+_{n_\beta} \cup D(p_a, n_\beta)|)$, where $C_{kth}$ is the number of qualifying objects that participate in the constitution of the skyline with the k-th highest score. Therefore, the time complexity of determining a safe exit point coincides with the time complexity of determining the two skylines, i.e., the skyline $D^+_l$ with the k-th highest (or lowest) score for answer objects and the skyline $D^-_h$ with the highest score for nonanswer objects. This is because the safe exit point is found at the cross point between these skylines.
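To make the cost of the underlying distance computation concrete, the following is a minimal sketch of Dijkstra's algorithm on a directed, weighted graph. It uses Python's binary heap, so its running time is $O((|E| + |N|) \log |N|)$; the bound quoted above presumes a more sophisticated priority queue. The toy network and node names are hypothetical and are not taken from Figure 5.

```python
import heapq

def dijkstra(adj, source):
    """Shortest network distances from source in a directed, weighted graph.

    adj maps each node to a list of (neighbor, weight) pairs; an edge is
    listed only in the direction in which it can be traversed.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Hypothetical toy network: a -> b is one-way, b <-> c is two-way, c -> a is one-way.
adj = {
    "a": [("b", 2)],
    "b": [("c", 1)],
    "c": [("b", 1), ("a", 4)],
}
print(dijkstra(adj, "a")["b"])  # distance from a to b
print(dijkstra(adj, "b")["a"])  # distance from b to a, which differs on a directed network
```

The two printed distances differ, which illustrates why answer sets at the two endpoints of a directed edge have to be computed separately.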
Figure 6 represents the skyline graph for $k = 1$ on the edge $(n_7, n_3)$. Let us draw the score functions for $d_3$ and $d_6$ on the road segment $(n_7, n_3)$, where a safe exit point exists because $D^+_{n_3} = d_3$ and $D^+_{n_7} = d_6$ for $k = 1$. For each point $p \in (n_7, n_3)$, the distance between $d_3$ and point p can be represented as $dist(d_3, p) = dist(d_3, n_3) + len(n_3, p) = 6 - len(n_7, p)$. Similarly, for each point $p \in (n_7, n_3)$, the distance between $d_6$ and point p can be represented as $dist(d_6, p) = dist(d_6, n_7) + len(n_7, p) = 2 + len(n_7, p)$. Let $len(n_7, p)$ be
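Figure 6: Skyline graph for $k = 1$ on the road segment $(n_7, n_3)$. The horizontal axis is the distance from $n_7$ (0 to 3.0) and the vertical axis is the score (0 to 1.0); the plotted curves are $\psi(d_6) = 1/(x + 3)$ and $\psi(d_3) = 1/(-x + 7)$, which cross at $p_{se1}$.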
a variable $x$ ($0 \le x \le 3$). We can write $\lambda(d_3, p) = dist(d_3, p) = 6 - x$ and $\lambda(d_6, p) = dist(d_6, p) = 2 + x$. Then, we can represent the score functions $\psi(d_3)$ and $\psi(d_6)$ as follows:

$\psi(d_3) = \frac{\mu(d_3.t, q.t)}{1 + \alpha \cdot \lambda(d_3, p)} = \frac{1}{7 - x}$ for $0 \le x \le 3$,
$\psi(d_6) = \frac{\mu(d_6.t, q.t)}{1 + \alpha \cdot \lambda(d_6, p)} = \frac{1}{3 + x}$ for $0 \le x \le 3$.

Finally, we present the lemma to prove that the safe exit points computed by COSK are correct.
Lemma 8. The COSK algorithm correctly computes a set of safe exit points.
Proof. We prove the correctness of the COSK algorithm by contradiction. Assume that $D^+_{p_a} \neq D^+_{n_\beta}$ but there is no safe exit point in a road segment $(p_a, n_\beta)$. This means that for each point p in the road segment $(p_a, n_\beta)$, the query result at p equals $D^+_{p_a}$, i.e., $D^+_p = D^+_{p_a}$, $\forall p \in (p_a, n_\beta)$. However, this leads to the contradiction that $D^+_{n_\beta} = D^+_{p_a}$ when $p = n_\beta$. Therefore, if $D^+_{p_a} \neq D^+_{n_\beta}$, a safe exit point exists in $(p_a, n_\beta)$. In addition, a safe exit point is determined using the skyline $D^+_l$ for answer objects and the skyline $D^-_h$ with the highest score for nonanswer objects when $D^+_{p_a} \neq D^+_{n_\beta}$. The first skyline is a composite polyline drawn from answer objects in $D^+_{p_a}$. The second skyline is a composite polyline drawn from nonanswer objects in $D^+_{n_\beta} \cup D(p_a, n_\beta) - D^+_{p_a}$.
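The crossing of the two skylines can be illustrated numerically. The sketch below is a simplified illustration rather than the COSK implementation: it assumes that each object's network distance is a single linear function of the offset $x$ along the edge (real distances are piecewise linear) and that the gap between the skylines changes sign at most once, so that bisection applies. The function names and the tuple layout `(mu, slope, intercept)` are hypothetical.

```python
def score(mu, alpha, dist):
    """Ranking score used in the example: relevance / (1 + alpha * distance)."""
    return mu / (1.0 + alpha * dist)

def safe_exit_offset(answers, non_answers, length, alpha=1.0, tol=1e-9):
    """Locate where the two skylines cross on an edge of the given length.

    Each object is (mu, slope, intercept): its network distance at offset x
    is slope * x + intercept. Returns None when the answer skyline stays
    on top along the whole edge.
    """
    def gap(x):
        lowest_answer = min(score(mu, alpha, a * x + b) for mu, a, b in answers)
        highest_other = max(score(mu, alpha, a * x + b) for mu, a, b in non_answers)
        return lowest_answer - highest_other

    if gap(0.0) < 0:
        raise ValueError("answer set is not valid at the start point")
    if gap(length) >= 0:
        return None  # result unchanged along the whole edge
    lo, hi = 0.0, length
    while hi - lo > tol:  # bisection on the sign change of gap
        mid = (lo + hi) / 2.0
        if gap(mid) >= 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Edge (n3, n7), offsets measured from n3: d3 is the answer, d6 the nonanswer.
x = safe_exit_offset([(1.0, 1.0, 3.0)], [(1.0, -1.0, 5.0)], 3.0)
print(round(x, 6))
```

For the running example, the located offset is 1 from $n_3$, where both score functions equal 1/5.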
6. Monitoring Query Results and Safe Regions in Dynamic Directed Road Networks
In this section, we discuss the monitoring of spatial keyword queries in dynamic road networks, where the network distance changes depending on the traffic conditions. Updates to the weights of some edges may invalidate the query results or the safe region of q even though the query object q remains within its safe region. Figure 7 illustrates an example of changing the weights of edges $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$. For convenience, we consider $\alpha = 1$ and $q.t$ = “Italian restaurant”. In Figure 7(a), the top-1 result is $d_1$, and bold lines show the safe region of query q. Now consider that, at time $t_j$, the weights of the two edges $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$ change due to heavy traffic, as shown in Figure 7(b). Such an update may invalidate the query result or the safe region of q. Therefore, it is necessary to monitor the validity of the results and the safe region when changes occur.
Next, we introduce a monitoring region to monitor the validity of the safe region effectively when the weight of an edge changes. The monitoring region MR contains all the points between the query point q and the lowest-scoring answer object and between q and the highest-scoring nonanswer object. Formally, it is defined as $MR = dist(q, D^+_l) \cup dist(q, D^-_h)$, where $dist(q, D^+_l)$ denotes the path between q and the lowest-scoring answer object and $dist(q, D^-_h)$ denotes the path between q and the highest-scoring nonanswer object. In the given example, $D^+_l = d_1$ and $D^-_h = \{d_2, d_3\}$. Therefore, the dotted lines in Figure 8(a) show the monitoring region of query object q.
Now, at time $t_j$, the updates to edges $\overleftarrow{(n_1, n_6)}$ and $\overleftarrow{(n_1, d_1)}$, which are not part of the monitoring region, can safely be ignored. However, the update on segment $\overrightarrow{(n_2, d_1)}$, which is associated with the monitoring region, may invalidate the results. As shown in Figure 8(b), after the update, the top-1 result becomes $d_2$, and bold lines represent the new safe region of q.
Algorithm 5 monitors the validity of the result set and safe region of query object q when the weight of any edge changes. Suppose that the weight of edge $(n_i, n_j)$ changes at time $t_j$. First, the algorithm checks whether edge $(n_i, n_j)$ is associated with the monitoring region. If it is not part of the monitoring region, the algorithm simply ignores the update, and the query results and safe region remain valid. In contrast, if the edge is associated with the monitoring region (i.e., $MR \cap (n_i, n_j) \neq \emptyset$), the algorithm reevaluates the query results. Consequently, the top-k results and safe region of query q are updated. Finally, the algorithm updates the monitoring region of q.
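The control flow described for Algorithm 5 can be sketched as follows. This is only an illustration under simplifying assumptions: the monitoring region is represented as a set of edges, and a single caller-supplied function `recompute` stands in for the snapshot query, safe exit, and monitoring region routines, none of which are shown. All names and the sample state are hypothetical.

```python
def on_edge_update(monitoring_region, updated_edge, recompute):
    """Handle a weight change on updated_edge for one query.

    monitoring_region: set of edges (node pairs) currently monitored.
    recompute: callable returning (top_k, safe_exit_points, monitoring_region);
               it stands in for the snapshot query, safe exit, and
               monitoring region routines, which are not shown here.
    Returns None when the update can be ignored, else the refreshed state.
    """
    if updated_edge not in monitoring_region:
        return None  # the edge lies outside MR: results and safe region stay valid
    return recompute()

# Hypothetical state loosely following Figures 7 and 8.
mr = {("n2", "d1"), ("q", "n2"), ("n2", "n3")}
calls = []

def fake_recompute():
    calls.append(1)  # record that a full reevaluation was triggered
    return (["d2"], ["pse1", "pse2", "pse3"], {("q", "n2"), ("n2", "n3")})

print(on_edge_update(mr, ("n1", "n6"), fake_recompute))     # ignored update
print(on_edge_update(mr, ("n2", "d1"), fake_recompute)[0])  # refreshed top-1 result
```

The point of the design is that the expensive reevaluation runs only for updates that touch the monitoring region; every other update costs a single membership test.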
7. Performance Evaluation
In this section, we evaluate the performance of COSK through simulation experiments. We describe our experimental settings in Section 7.1, and we present our experimental results for static and dynamic road networks in Sections 7.2 and 7.3, respectively.
7.1. Experimental Settings. All of our experiments were performed using real road networks, namely, Oldenburg, San Francisco, and San Joaquin. All three road networks were obtained from [27]. The original road network of San Francisco had 21047 nodes and 21692 edges. We reformatted the network, pruned approximately 30% of the nodes, and adjusted the edges and their weights accordingly. This resulted in a network with 14732 nodes and 14316 edges. Both the direction of edges and the data objects on the edges were generated randomly. The description of each data object was extracted from Twitter messages [28], and we assigned one tweet per data object. Table 4 presents the characteristics of the data sets used in the experimental evaluation. We simulated moving query objects by using a spatiotemporal data generator [29]. The input to the generator was the road network of the data set used, and the output was the set of query objects moving on the road network. Each experiment had 100 moving queries, which were continuously monitored for 100 timestamps (1 timestamp = 1 second), and the average result was reported.
As a benchmark for COSK in static road networks, we implemented the algorithm of [22], which also continuously monitors moving top-k spatial keyword queries in road networks. However, this algorithm was originally designed for undirected road networks. To make a fair comparison, we modified it to process top-k spatial keyword queries in directed road networks and called the modified version CMTkSK+. Specifically, we modified the distance computation method between two points such that, in directed road networks, $dist(p_1, p_2) \neq dist(p_2, p_1)$ in general. Since CMTkSK+ does not handle top-k spatial keyword queries in dynamic road networks, we compared the performance of COSK with a basic algorithm, which recomputes the results whenever the query object changes its location. All algorithms were implemented in Java and were executed on a desktop PC (2.80-GHz Intel Core i5) with
Figure 7: Updating the weight of edges in a dynamic road network, where $t_i < t_j$. (a) Safe region at time $t_i$. (b) Updating the weights of $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$ at time $t_j$. The figure shows nodes n1 to n6 and data objects d1 (Italian Restaurant), d2 (Italian Restaurant), and d3 (Chinese Restaurant).

Figure 8: Monitoring region and updated safe region at time $t_j$. (a) Monitoring region at time $t_i$. (b) New safe region at time $t_j$.
(1) Input: monitoring region $MR$, updated edge $(n_i, n_j)$
(2) Output: none
(3) if $MR \cap (n_i, n_j) = \emptyset$ then
(4)     /* edge $(n_i, n_j)$ is not part of the monitoring region */
(5)     ignore the change in the weight of edge $(n_i, n_j)$
(6) else
(7)     $P_{SE} \leftarrow \emptyset$ /* set of safe exit points */
(8)     $D_k^{upd} \leftarrow EvaluateSnapshotQuery(n_i, e_i)$ /* update set of top-k results */
(9)     $P_{SE}^{upd} \leftarrow ComputeSafeExit(p_a, n_\beta)$ /* update safe exit points */
(10)    $MR^{upd} \leftarrow ComputeMonitoringRegion(D^+_l, D^-_h)$ /* update monitoring region */
(11) end

Algorithm 5: MonitoringSafeRegion($MR$, $(n_i, n_j)$).
Table 4: Summary of datasets.

| Attribute | Oldenburg | San Francisco | San Joaquin |
|---|---|---|---|
| Total no. of nodes | 6104 | 14732 | 18262 |
| Total no. of edges | 7034 | 14316 | 23876 |
| Percentage of directed edges | 30 | 30 | 30 |
| Total no. of objects | 5627 | 11453 | 19098 |
| Average no. of objects per edge | 0.8 | 0.8 | 0.8 |
| Total no. of words | 49517 | 103649 | 166153 |
Table 5: Experimental parameter settings.

| Parameter | Range |
|---|---|
| Number of results (k) | 5, 10, 15, 20, 25 |
| Number of keywords (n) | 1, 2, 3, 4, 5 |
| Query parameter ($\alpha$) | 0.01, 0.1, 1, 10, 100 |
| Dataset | Oldenburg, San Francisco, San Joaquin |
| Number of data objects ($N_D$) | 10, 20, 30, 40, 50 (×1000) |
| Speed of query objects ($V_{qry}$) | 25, 50, 75, 100, 125 (km/h) |
| Mobility ($M_{qry}$, %) | 20, 40, 60, 80, 100 |
| Ratio of directed edges ($E_{dir}$, %) | 10, 20, 30, 40, 50 |
| Ratio of updated edges ($E_{upd}$, %) | 15, 30, 60, 80, 100 |
8 GB of memory. In the experiments, we compared (1) query processing times, (2) edges processed, i.e., the number of edges processed for retrieving query results, and (3) index sizes. Table 5 summarizes the parameters used in the experiments. In each experiment, we varied a single parameter within the range shown in Table 5 while maintaining the other parameters at their default values.
We evaluated the performance of the algorithms by using the following measures: (1) the total amount of server CPU time, which indicates the query processing time, and (2) the total communication cost, measured as the total number of points (i.e., the location updates sent by query objects and the query results and safe exit points returned by the server) transferred between clients and the server. The battery power and wireless bandwidth consumption typically increase with the amount of data transferred between objects (clients) and servers. Thus, we used the amount of transferred data as a metric to evaluate the communication cost.
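The communication cost metric can be illustrated with a toy count of transferred points. The sketch below is not the experimental setup of this study; it assumes a query moving along a line, safe regions of fixed width, two safe exit points per region, and a cost of one point per location update, per result object, and per safe exit point. The functions `periodic_cost`, `safe_exit_cost`, and `region_of` are hypothetical.

```python
def periodic_cost(positions, k):
    """Points transferred when the client reports at every timestamp."""
    # one location update sent and k result points returned per timestamp
    return len(positions) * (1 + k)

def safe_exit_cost(positions, region_of, k, exits_per_region=2):
    """Points transferred when the client reports only on leaving its safe region.

    region_of maps a position to a region identifier (hypothetical helper).
    """
    cost = 0
    current = None
    for p in positions:
        r = region_of(p)
        if r != current:  # first contact, or the safe region was left
            cost += 1 + k + exits_per_region
            current = r
    return cost

positions = [0.5 * t for t in range(20)]  # query moving along a line
region_of = lambda p: int(p // 4)         # safe regions of width 4 (assumption)
print(periodic_cost(positions, k=5))
print(safe_exit_cost(positions, region_of, k=5))
```

In this toy setting the periodic strategy transfers 120 points and the safe-exit strategy 24, because the client contacts the server only three times.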
7.2. Experimental Results of Top-k Spatial Keyword Queries in Static Road Networks
7.2.1. Effect of k. Figure 9 shows the effect of the number of results k on the query processing time and communication cost for both algorithms. Figure 9(a) indicates that the query processing time increases for both algorithms as the value of k increases. This is expected because, with an increase in k, more data objects need to be explored and verified. Nevertheless, COSK significantly outperforms CMTkSK+ for two main reasons. First, the search for relevant objects is very efficient when using the highest significance factor, and second, COSK does not need to verify the set of answer objects as long as the query object lies in a safe region. On the other hand, the CMTkSK+ query processing time increases significantly because it has to monitor and verify the set of candidate objects periodically. In Figure 9(b), the communication costs for both algorithms increase as the number of requested objects k increases. However, the proposed algorithm demonstrates superior performance compared to CMTkSK+ because client-server communication is not required while the query object lies within the safe exit points, whereas in CMTkSK+ the query object is required to report its location to the server whenever it moves.
7.2.2. Effect of $N_D$. This experiment was conducted on the San Joaquin dataset. This dataset included 19098 data objects; therefore, we randomly generated approximately 30000 additional data objects on different edges. In Figure 10, we evaluate the performance of COSK and CMTkSK+ by varying the cardinality of the data objects. Note that $N_D = 10K$ corresponds to a low density of data points, while $N_D = 50K$ corresponds to a high density. In Figure 10(a), it is interesting to notice that the query processing times of both algorithms decrease as the cardinality of the data objects increases. For CMTkSK+, this is because, with high density, the monitoring range of a query decreases. For COSK, it is mainly because, when the data density is high, fewer edges need to be expanded, which decreases the query processing time. In Figure 10(b), we study the influence of the cardinality of the data objects on the communication costs. The experimental results indicate that CMTkSK+ incurs almost constant communication costs regardless of data object cardinality. However, the communication costs of COSK increase in proportion to the $N_D$ value. This is expected because the safe region becomes smaller as the density of the data objects increases, which increases the communication costs.
7.2.3. Effect of Query Keywords (n). Figure 11 shows the query processing time and communication cost for COSK and CMTkSK+ as a function of the number of query keywords. Figures 11(a) and 11(b) show that the performance of both algorithms degrades when the number of keywords increases. This is mainly because, by increasing the number of query keywords, the number of relevant objects may also increase, resulting in a higher query processing time and communication cost. However, the safe-region-based algorithm COSK scales better than CMTkSK+ because of its less expensive monitoring technique.
7.2.4. Effect of $\alpha$. Figure 12 demonstrates the impact of query parameter $\alpha$ on the query processing time and on the communication cost. A small value of $\alpha$ indicates a greater importance of textual relevance, whereas a high value of $\alpha$ gives more preference to the spatial relevance. It is interesting to note that the query processing time is lower for higher
Figure 9: Effect of k on query processing time and communication cost. (a) Query processing time (s) versus k. (b) Number of messages transferred versus k. Algorithms compared: COSK and CMTkSK+.

Figure 10: Effect of $N_D$ on query processing time and communication cost. (a) Query processing time (s) versus $N_D$. (b) Number of messages transferred versus $N_D$. Algorithms compared: COSK and CMTkSK+.
values of $\alpha$, which give more importance to the spatial relevance. This is mainly because, when the spatial relevance is weighted more heavily, fewer edges and objects need to be explored and processed to determine the top-k data objects. Observe that in Figure 12(b) the number of messages sent by COSK decreases sharply with an increase in $\alpha$.

7.2.5. Effect of Speed. Figure 13(a) demonstrates the influence of the speed of the query objects on the query processing time of the COSK and CMTkSK+ algorithms. The experimental results indicate that the performance of CMTkSK+ is not significantly influenced by the speed of the query objects because the candidate objects must be monitored at regular intervals regardless of the speed. On the other hand, the performance of COSK gradually decreases as the speed of the query objects increases because the objects leave their respective safe regions more frequently. Figure 13(b) shows the communication costs of COSK and CMTkSK+ with respect to the speed of the query objects. CMTkSK+ incurs almost constant communication costs because a server-initiated request to verify the candidate objects does not depend on the speed. For COSK, the query objects cross safe regions more frequently when the speed is high, which increases the communication costs.
Figure 11: Effect of number of keywords on query processing time and communication cost. (a) Query processing time (s) versus number of keywords. (b) Number of messages transferred versus number of keywords. Algorithms compared: COSK and CMTkSK+.

Figure 12: Effect of $\alpha$ on query processing time and communication cost. (a) Query processing time (s) versus $\alpha$. (b) Number of messages transferred versus $\alpha$. Algorithms compared: COSK and CMTkSK+.
7.2.6. Effect of Mobility. Figure 14 shows the effect of mobility $M_{qry}$ (mobility refers to the percentage of query objects that are moving at any timestamp) on the performance of the COSK and CMTkSK+ algorithms. As expected, the query processing time and communication costs for both algorithms increase with $M_{qry}$. Nevertheless, COSK performs better than CMTkSK+ in terms of both query processing time and communication costs.
7.2.7. Effect of Directed Edges. Figure 15 shows the impact of the percentage of directed edges $E_{dir}$ on the performance of the COSK and CMTkSK+ algorithms. The query processing time increases with $E_{dir}$ because each algorithm needs to explore more edges to answer the top-k spatial keyword queries. However, the communication cost is not significantly affected by the value of $E_{dir}$ for either algorithm.
7.2.8. Effect of Datasets. Figure 16 shows the index sizes of the COSK and CMTkSK+ approaches for different datasets. As shown in Figure 16, both algorithms have similar index sizes. However, COSK has a minor space overhead because it stores additional information, namely, the highest significance factor $\theta_t$ of edges. More importantly, this space overhead is minimal compared to the gain achieved by COSK in query processing time and communication costs.
Figure 13: Effect of speed on query processing time and communication cost. (a) Query processing time (s) versus $V_{qry}$. (b) Number of messages transferred versus $V_{qry}$. Algorithms compared: COSK and CMTkSK+.

Figure 14: Effect of mobility on query processing time and communication cost. (a) Query processing time (s) versus $M_{qry}$. (b) Number of messages transferred versus $M_{qry}$. Algorithms compared: COSK and CMTkSK+.
7.3. Experimental Results of Top-k Spatial Keyword Queries in Dynamic Road Networks. In this section, we evaluate the performance of COSK and the basic algorithm for dynamic road networks. $E_{upd}$ indicates the percentage of all edges that change their weight at each timestamp. The length of an updated edge is randomly selected between 0.1 and 10 times the original length. Figure 17(a) depicts the query processing time of COSK and the basic algorithm. It is evident from the figure that the query processing time of the basic algorithm is not significantly affected by $E_{upd}$. This is mainly because the query objects issue top-k spatial keyword queries at each timestamp. However, the query processing time of COSK increases with the value of $E_{upd}$ because the probability that an updated edge is associated with the monitoring region of query q increases with $E_{upd}$. Therefore, when $E_{upd}$ becomes large, the results need to be updated frequently, which increases the query processing time. Figure 17(b) shows the communication costs of COSK and the basic algorithm with respect to $E_{upd}$. The basic algorithm incurs almost constant communication costs regardless of the value of $E_{upd}$. In contrast, the communication cost of COSK increases with $E_{upd}$ because the query result and safe regions need to be updated frequently.
Figure 15: Effect of $E_{dir}$ on query processing time and communication cost. (a) Query processing time (s) versus $E_{dir}$. (b) Number of messages transferred versus $E_{dir}$.

Figure 16: Effect of dataset on index size. The figure reports the index size (MB) of COSK and CMTkSK+ for the Oldenburg, San Francisco, and San Joaquin datasets.
8. Conclusion
In this paper, we investigated moving top-k spatial keyword queries in directed and dynamic road networks. We presented an efficient indexing framework using inverted files that indexes the data objects on edges, allowing for the effective searching of data objects relevant to queries in terms of both textual and spatial relevance. We also presented a safe-exit-based algorithm called COSK to monitor moving top-k spatial keyword queries. We demonstrated that the query results remain valid as long as the query object resides within a safe region. Furthermore, COSK can effectively monitor the validity of query results and safe regions in dynamic road networks. Finally, an experimental evaluation conducted on real road networks demonstrated that COSK significantly reduced the query processing time and communication costs compared to the CMTkSK+ algorithm.
Data Availability
The real road network data used in this study are also used in many previous studies. The road network data are cited in the manuscript and are available at https://www.cs.utah.edu/~lifeifei/SpatialDataset.htm. To simulate the moving queries, the authors used the spatiotemporal data generator, which is also used in previous studies. The research article on the generator is cited in the manuscript. The documentation and source
Figure 17: Effect of $E_{upd}$ on query processing time and communication cost. (a) Query processing time (s) versus $E_{upd}$. (b) Number of messages transferred versus $E_{upd}$. Algorithms compared: COSK and Basic.
files of the generator are available at https://iapg.jade-hs.de/personen/brinkhoff/generator/. The authors used Twitter tweets to generate the descriptions of data objects and the query keywords. The tweets used are accessible at http://followthehashtag.com/datasets/free-twitter-dataset-usa-200000-free-usa-tweets/.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
Hyung-Ju Cho was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean Government (MSIP) (NRF-2016R1A2B4009793), and this research was partially supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2016R1D1A1B03934129).
References
[1] D. Papadias, N. Mamoulis, J. Zhang, and Y. Tao, “Query processing in spatial network databases,” in Proceedings of the 29th International Conference on Very Large Data Bases (VLDB ’03), pp. 802–813, September 2003.
[2] H.-J. Cho, K. Ryu, and T.-S. Chung, “An efficient algorithm for computing safe exit points of moving range queries in directed road networks,” Information Systems, vol. 41, pp. 1–19, 2014.
[3] G. Tsatsanifos and A. Vlachou, “On processing Top-k spatio-textual preference queries,” in Proceedings of the 18th International Conference on Extending Database Technology (EDBT ’15), pp. 433–444, March 2015.
[4] R. Li, A. X. Liu, A. L. Wang, and B. Bruhadeshwar, “Fast range query processing with strong privacy protection for cloud computing,” Proceedings of the VLDB Endowment, vol. 7, no. 14, pp. 1953–1964, 2014.
[5] G. Cong, C. S. Jensen, and D. Wu, “Efficient retrieval of the Top-k most relevant spatial web objects,” Proceedings of the VLDB Endowment, vol. 2, no. 1, pp. 337–348, 2009.
[6] Z. Li, K. C. K. Lee, B. Zheng, W.-C. Lee, D. Lee, and X. Wang, “IR-tree: An efficient index for geographic document search,” IEEE Transactions on Knowledge and Data Engineering, vol. 23, no. 4, pp. 585–599, 2011.
[7] Y. Zhou, X. Xie, C. Wang, Y. Gong, and W. Ma, “Hybrid index structures for location-based web search,” in Proceedings of the 14th ACM International Conference on Information and Knowledge Management, pp. 155–162, Bremen, Germany, October 2005.
[8] J. Zobel and A. Moffat, “Inverted files for text search engines,” ACM Computing Surveys, vol. 38, no. 2, 2006.
[9] N. Beckmann, H. Kriegel, R. Schneider, and B. Seeger, “The R*-tree: an efficient and robust access method for points and rectangles,” in Proceedings of the ACM SIGMOD International Conference on Management of Data, vol. 19, pp. 322–331, May 1990.
[10] R. Hariharan, B. Hore, C. Li, and S. Mehrotra, “Processing spatial-keyword (sk) queries in geographic information retrieval (gir) systems,” in Proceedings of the 19th International Conference on Scientific and Statistical Database Management (SSDBM ’07), July 2007.
[11] I. De Felipe, V. Hristidis, and N. Rishe, “Keyword search on spatial databases,” in Proceedings of the 24th International Conference on Data Engineering (ICDE ’08), pp. 656–665, April 2008.
[12] J. B. Rocha-Junior, O. Gkorgkas, S. Jonassen, and K. Nørvåg, “Efficient processing of top-k spatial keyword queries,” in Proceedings of the International Symposium on Spatial and Temporal Databases, pp. 205–222, Springer, 2011.
[13] D. Zhang, K.-L. Tan, and A. K. Tung, “Scalable top-k spatial keyword search,” in Proceedings of the 16th International Conference on Extending Database Technology, pp. 359–370, 2013.
[14] J. B. Rocha-Junior and K. Nørvåg, “Top-k spatial keyword queries on road networks,” in Proceedings of the 15th International Conference on Extending Database Technology, pp. 168–179, Berlin, Germany, March 2012.
[15] H.-J. Cho, S. J. Kwon, and T.-S. Chung, “A safe exit algorithm for continuous nearest neighbor monitoring in road networks,” Mobile Information Systems, vol. 9, no. 1, pp. 37–53, 2013.
[16] D. Yung, M. L. Yiu, and E. Lo, “A safe-exit approach for efficient network-based moving range queries,” Data & Knowledge Engineering, vol. 72, pp. 126–147, 2012.
[17] M. Attique, H. Cho, R. Jin, and T. Chung, “Efficient Processing of Continuous Reverse k Nearest Neighbor on Moving Objects in Road Networks,” ISPRS International Journal of Geo-Information, vol. 5, no. 12, p. 247, 2016.
[18] H. G. Elmongui, M. F. Mokbel, and W. G. Aref, “Continuous aggregate nearest neighbor queries,” GeoInformatica, vol. 17, no. 1, pp. 63–95, 2013.
[19] D. Wu, M. L. Yiu, C. S. Jensen, and G. Cong, “Efficient continuously moving top-k spatial keyword query processing,” in Proceedings of the IEEE International Conference on Data Engineering (ICDE ’11), pp. 541–552, Hannover, Germany, April 2011.
[20] W. Huang, G. Li, K.-L. Tan, and J. Feng, “Efficient safe-region construction for moving top-k spatial keyword queries,” in Proceedings of the 21st ACM International Conference on Information and Knowledge Management, pp. 932–941, 2012.
[21] L. Guo, J. Shao, H. H. Aung, and K.-L. Tan, “Efficient continuous top-k spatial keyword queries on road networks,” GeoInformatica, vol. 19, no. 1, pp. 29–60, 2014.
[22] Y. Li, G. Li, L. Shu, Q. Huang, and H. Jiang, “Continuous monitoring of top-k spatial keyword queries in road networks,” Journal of Information Science and Engineering, vol. 31, no. 6, pp. 1831–1848, 2015.
[23] M. Attique, A. Khan, and T.-S. Chung, “ESPAK: Top-k spatial keyword query processing in directed road networks,” in Proceedings of the Workshops of the International Conference on Extending Database Technology and the International Conference on Database Theory (EDBT/ICDT ’17), March 2017.
[24] G. Salton and C. Buckley, “Term-weighting approaches in automatic text retrieval,” Information Processing & Management, vol. 24, no. 5, pp. 513–523, 1988.
[25] V. N. Anh, O. de Kretser, and A. Moffat, “Vector-space ranking with effective early termination,” in Proceedings of the 24th Annual International ACM SIGIR Conference, pp. 35–42, New Orleans, LA, USA, 2001.
[26] E. W. Dijkstra, “A note on two problems in connexion with graphs,” Numerische Mathematik, vol. 1, pp. 269–271, 1959.
[27] “Real datasets for spatial databases,” https://www.cs.utah.edu/~lifeifei/SpatialDataset.htm.
[28] “Twitter,” https://twitter.com/.
[29] T. Brinkhoff, “A framework for generating network-based moving objects,” GeoInformatica, vol. 6, no. 2, pp. 153–180, 2002.
(1) Input: Top-k spatial keyword query $Q_N = (q_l, q_t, k)$
(2) Output: Top-k data objects with highest score
(3) $D_c \leftarrow \emptyset$  /* set of candidate data objects */
(4) max-heap $D_k \leftarrow \emptyset$  /* current top-k set */
(5) $s_k \leftarrow 0$  /* k-th score in $D_k$ */
(6) min-heap $\leftarrow \emptyset$
(7) $explored \leftarrow \emptyset$
(8) min-heap.insert($q_l$, $edge_{active}$)
(9) $D_c \leftarrow candsearch((e_{id}, t_{id}), s_k)$
(10) update $D_k$ and $s_k$ with $d \in D_c$
(11) while min-heap $\neq \emptyset$ and ($1/(1 + \alpha\lambda(d_l, q_l)) < s_k$) do
(12)     for each unexplored adjacent edge of ($p_a$, $edge$) do
(13)         $explored \leftarrow explored \cup (p_a, edge)$
(14)         $D_c \leftarrow candsearch((e_{id}, t_{id}), s_k)$
(15)         update $D_k$ and $s_k$ with $d \in D_c$
(16)     end
(17)     min-heap.insert(adjacent node, edge)
(18) end
(19) return $D_k$
Algorithm 1: EvaluateSnapshotQuery(Node $n_i$, Edge $e_i$).
(1) Input: Edge ID $e_{id}$, Term ID $t_{id}$, score of k-th object $s_k$
(2) Output: candidate list $D_c$
(3) compute $\theta_t(e_i)$
(4) if $\theta_t(e_i) > 0$ then
(5)     $maxscore(e_i) \leftarrow computemaxscore(\theta_t, dist(e_i, q_l))$
(6) end
(7) if $maxscore(e_i) > s_k$ then
(8)     for each data object in $e_i$ do
(9)         compute $d_{score}$
(10)     end
(11)     if $d_{score} > s_k$ then
(12)         $D_c \leftarrow D_c \cup d$
(13)     end
(14) end
(15) return $D_c$
Algorithm 2: CandidateSearch($(e_{id}, t_{id})$, $s_k$).
$t \in q_t$ and the shortest distance $sdist(e_i, q_l)$ between the edge and the query location. In the next step, the inverted lists of term t are fetched if their upper-bound score is greater than $s_k$. In the inverted lists, the objects with score $\psi(d)$ greater than $s_k$ are returned.
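The expansion-and-pruning idea behind Algorithms 1 and 2 can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the authors' implementation: data objects are assumed to sit on nodes rather than on edges, the textual relevance $\mu \in [0, 1]$ of each object is assumed to be precomputed, and the score is taken to be $\mu/(1 + \alpha \cdot distance)$. The names `top_k`, `graph`, and `objects` are hypothetical.

```python
import heapq

def top_k(graph, objects, source, k, alpha=1.0):
    """Best-first expansion over a directed graph with early termination.

    graph:   node -> list of (neighbor, weight) pairs (directed edges)
    objects: node -> list of (object_name, mu) pairs, with mu in [0, 1]
    """
    dist = {source: 0.0}
    frontier = [(0.0, source)]   # min-heap ordered by network distance
    results = []                 # min-heap of (score, name); root is the k-th score
    settled = set()
    while frontier:
        d, node = heapq.heappop(frontier)
        if node in settled:
            continue
        settled.add(node)
        # No object at distance >= d can score above this bound, because mu <= 1.
        bound = 1.0 / (1.0 + alpha * d)
        if len(results) == k and results[0][0] >= bound:
            break
        for name, mu in objects.get(node, []):
            if mu <= 0:
                continue         # no query keyword matches this object
            s = mu * bound
            if len(results) < k:
                heapq.heappush(results, (s, name))
            elif s > results[0][0]:
                heapq.heapreplace(results, (s, name))
        for nxt, w in graph.get(node, []):
            nd = d + w
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(frontier, (nd, nxt))
    return sorted(((n, s) for s, n in results), key=lambda t: -t[1])

# Hypothetical toy network: q -> a -> b -> c, traversable in one direction only.
graph = {"q": [("a", 1.0)], "a": [("b", 3.0)], "b": [("c", 5.0)], "c": []}
objects = {"b": [("d3", 1.0)], "c": [("d4", 0.5)]}
print(top_k(graph, objects, "q", k=1))
```

In this toy run the search stops when node `c` is reached, because the best possible score at distance 9 is already below the score of the current top-1 object.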
To understand the proposed algorithm, consider the road network presented in Figure 1. Assume that a query q generates a top-1 keyword query with $q_t$ = "Italian Restaurant". For ease of presentation, we assume $\alpha = 1$, and the textual relevance $\mu$ is the number of occurrences of query keywords in $d_t$ divided by the number of keywords in the document (description of the data object). For example, $\psi(d_4) = \mu(d_{4,t}, q_t)/(1 + \lambda(d_{4,l}, q_l)) = 0.5/8 \approx 0.06$. The algorithm starts the network expansion from an active edge $\overrightarrow{(n_2, n_3)}$, where q is the anchor point. Note that the direction of the edge $\overrightarrow{(n_2, n_3)}$ is from $n_2$ to $n_3$. Therefore, the algorithm explores only $\overrightarrow{(q, n_3)}$. No data object is found in $\overrightarrow{(q, n_3)}$. Then $n_3$ becomes the anchor point, and edges $(n_3, n_4)$, $(n_3, n_5)$, and $(n_3, n_7)$ are inserted in the min-heap. Next, the $candsearch$ function retrieves the candidate data objects on edges $(n_3, n_4)$, $(n_3, n_5)$, and $(n_3, n_7)$ whose score is better than $s_k$. On edge $(n_3, n_5)$, data object $d_3$ is retrieved with $\psi(d_3) = 0.2$. Data object $d_3$ is inserted in the $D_k$ set, and the value of $s_k$ is set to 0.2. For edges $(n_3, n_4)$ and $(n_3, n_7)$, no candidate object is found because $d_{2,t}$ ("Cafe") and $d_{7,t}$ ("Cafe and Bakery") do not match $q_t$. The algorithm continues expanding the edges whose upper-bound score is greater than $s_k$. The edge $\overrightarrow{(n_7, n_2)}$ is explored next. The upper-bound score of $\overrightarrow{(n_7, n_2)}$ is $1/7$, which is less than $s_k$. Similarly, for edge $\overleftarrow{(n_6, n_5)}$, the upper-bound score is $0.5/8 < s_k$. Therefore, the algorithm terminates and reports $d_3$ as the top-1 result.
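The scores quoted in this example can be reproduced with a short Python sketch. The function names are hypothetical, and the distances 7 (for $d_4$) and 4 (for $d_3$) are inferred from the denominators in the text rather than read off Figure 1, so they should be treated as assumptions.

```python
def textual_relevance(description, query_keywords):
    """Occurrences of query keywords in the description, divided by
    the number of keywords in the description."""
    words = description.lower().split()
    query = {w.lower() for w in query_keywords}
    if not words:
        return 0.0
    return sum(1 for w in words if w in query) / len(words)

def score(mu, distance, alpha=1.0):
    """Ranking score psi = mu / (1 + alpha * network distance)."""
    return mu / (1.0 + alpha * distance)

query = ["Italian", "Restaurant"]
mu_d4 = textual_relevance("Chinese Restaurant", query)   # one of two words matches
mu_d3 = textual_relevance("Italian Restaurant", query)   # both words match
mu_d7 = textual_relevance("Cafe and Bakery", query)      # no word matches

print(score(mu_d4, 7))   # assumed distance 7, inferred from 0.5/8
print(score(mu_d3, 4))   # assumed distance 4, inferred from psi(d3) = 0.2
print(mu_d7)
```

An object whose textual relevance is zero, such as $d_7$ here, can never enter the result set regardless of how close it is to the query.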
Figure 3: Illustration of directed road network (q issues a TkSK query at $p_1$; the server returns a set of objects for $p_1$).
Figure 4: Illustration of directed road network (q issues a TkSK query at $p_2$; the server returns a set of objects for $p_2$).
5. Moving Top-k Spatial Keyword Queries
In this section, we present our method to monitor moving top-k spatial keyword queries, where query objects move in a directed road network. Figure 3 provides an example of TkSK in road networks, where query point q issues a TkSK query at point $p_1$. Note that the numbers on the arrows in the figure indicate the order of the steps. To obtain top-k results at $p_1$, the server executes Algorithm 1, as mentioned in Section 4.2. Now consider that the query object moves to $p_2$, as shown in Figure 4. To retrieve the top-k results at point $p_2$, the simple method is to repeat the procedure executed at $p_1$. However, recomputing the result whenever query q changes its location significantly increases the computation cost. Furthermore, it also increases the communication overhead because the query object must report its location whenever it moves and the server must send the result set. To address these issues, we introduce the safe exit approach.
In the proposed framework, the server computes safe exit points for a query object. The server maintains a set of moving queries, and the query result remains valid as long as the query objects remain inside their respective safe exit points. Whenever a query object leaves its safe exit points, the server recomputes the TkSK and safe exit points for the query object.
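The communication saving can be pictured with a small client-side check: the query object contacts the server only after it has crossed a safe exit point. The sketch below is an illustration under assumptions, not the paper's protocol. It models a safe region as a map from an edge to an interval of offsets along that edge; this representation and the names `inside_safe_region` and `maybe_requery` are hypothetical. The interval used in the example (safe up to distance 1 from $n_3$ on edge $(n_3, n_7)$) follows the worked example of Section 5.2.

```python
def inside_safe_region(position, safe_region):
    """position: (edge, offset along the edge).
    safe_region: edge -> (start_offset, end_offset), both inclusive."""
    edge, offset = position
    if edge not in safe_region:
        return False
    start, end = safe_region[edge]
    return start <= offset <= end

def maybe_requery(position, safe_region, requery):
    """Call requery(position) only when the position left the safe region.
    Returns True if the server had to be contacted."""
    if inside_safe_region(position, safe_region):
        return False
    requery(position)
    return True

safe_region = {("n3", "n7"): (0.0, 1.0)}   # safe up to the exit point at offset 1
sent = []                                   # stands in for messages to the server
print(maybe_requery((("n3", "n7"), 0.5), safe_region, sent.append))
print(maybe_requery((("n3", "n7"), 1.5), safe_region, sent.append))
print(sent)
```

Only the second movement triggers a message, which is the behavior the safe exit approach relies on to reduce both recomputation and communication.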
Next, we present our method to compute the safe exit points for a query object. The safe exit point represents a point in the segment where a safe region and a nonsafe region meet. We compute the safe exit point using the divide-and-conquer technique. Before presenting the detailed methodology, we define the terminologies used in this section.
Definition 1 (safe region). A portion of a road segment that guarantees that, as long as the query point lies in it, its top-k results remain valid.
Definition 2 (answer objects $D^+$). A data object d is called an answer object of query q if $\psi(d) > \psi(d_a)$, where $d_a$ represents any other data object in the directed road network. Similarly, we can generalize this definition for TkSK: a data object d is called an answer object of query q if $\psi(d) > \psi(d_{k+1})$, where $d_{k+1}$ represents the $(k+1)$th data object in the directed road network. In other words, all answer objects are top-k results of query q.
Definition 3 (nonanswer objects $D^-$). A data object d is called a nonanswer object of query q if $\psi(d) < \psi(d_a)$, where $d_a$ represents any other data object in the directed road network. Similarly, we can generalize this definition for TkSK: a data object d is called a nonanswer object of query q if $\psi(d) < \psi(d_k)$, where $d_k$ represents the kth data object in the directed road network. Since all answer objects are top-k results of query q, none of the nonanswer objects are in the top-k results of query q.
Definition 4 (lowest answer object $D^+_l$). An answer object $d^+ \in D^+$ is called a lowest answer object to a point $p \in G$ such that $\psi(d^+_l)_p = \min(\psi(d^+_1)_p, \psi(d^+_2)_p, \ldots, \psi(d^+_{|d^+|})_p)$, where $\psi(d^+_l)_p$ represents the score of the lowest answer object at point p. In other words, $\psi(d^+_l)_p < \psi(d^+_a)_p$ at point p, where $d^+_a$ is any other answer object in the $D^+$ set.
Definition 5 (highest nonanswer object $D^-_h$). A nonanswer object $d^- \in D^-$ is called a highest nonanswer object to a point $p \in G$ such that $\psi(d^-_h)_p = \max(\psi(d^-_1)_p, \psi(d^-_2)_p, \ldots, \psi(d^-_{|d^-|})_p)$, where $\psi(d^-_h)_p$ represents the score of the highest nonanswer object at point p. In other words, $\psi(d^-_h)_p > \psi(d^-_a)_p$ at point p, where $d^-_a$ is any other nonanswer object in the $D^-$ set.
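Definitions 2 to 5 amount to ranking the objects by score at a given point and splitting the ranking after position k. A minimal sketch follows, assuming the scores at the point are already known; the function name `split_answers` and the sample scores are illustrative assumptions, not values taken from the paper.

```python
def split_answers(scores, k):
    """scores: object name -> score at a fixed point p.
    Returns (answers, nonanswers, lowest_answer, highest_nonanswer)."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    answers = [name for name, _ in ranked[:k]]       # D+
    nonanswers = [name for name, _ in ranked[k:]]    # D-
    lowest_answer = answers[-1] if answers else None            # d+_l
    highest_nonanswer = nonanswers[0] if nonanswers else None   # d-_h
    return answers, nonanswers, lowest_answer, highest_nonanswer

# Illustrative scores at some point p (assumed values).
scores_at_p = {"d3": 0.2, "d6": 0.14, "d4": 0.06}
print(split_answers(scores_at_p, k=1))
```

The pair ($d^+_l$, $d^-_h$) is the only pair that matters for validity: the top-k result can change only when the highest nonanswer object overtakes the lowest answer object.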
As discussed earlier, the main challenge in the continuous processing of moving TkSK is to maintain the validity of the result set, because the movement of query objects can nullify the result set. To monitor the validity of the result set, we propose a safe-region-based approach.
5.1. Computation of Safe Exit Points. In this section, we present our technique to compute the safe exit points. The main goal is to find a point in the road network where the query result set will change. The result set will change when the score of the highest nonanswer object $D^-_h$ surpasses the score of $D^+_l$. Generally, the textual relevance score does not change. Therefore, the score of data objects only changes because of the spatial relevance score, which can only change by the movement of query objects. The computation of the safe exit point is based on two key observations.
Observation 1. If $D^+_{n_\beta} = D^+_{p_a}$, there is no safe exit point in the segment.
Explanation. $D^+_{p_a}$ represents the set of answer objects at anchor point $p_a$, whereas $D^+_{n_\beta}$ represents the set of answer objects at boundary node $n_\beta$. As discussed earlier, the safe exit point is the particular point where the query results change. If the query results at the starting node are the same as at the ending node of any segment/edge, there does not exist any point where the query result changes. Hence, we do not search for the safe exit point in that segment.
Observation 2. If $D^+_{p_a} \neq D^+_{n_\beta}$, there is a safe exit point in the segment.
Explanation. In contrast to Observation 1, if the query results are different at the starting and ending points, then there exists a point where the query results change. Hence, there is a safe exit point in the segment.
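The two observations reduce to a set comparison at the endpoints of each segment. The sketch below applies that test to the endpoint answer sets listed in Table 3 for the running example; the function name `segment_has_safe_exit` and the dictionary layout are hypothetical.

```python
def segment_has_safe_exit(answers_at_start, answers_at_end):
    """Observation 1: equal answer sets  -> no safe exit point.
    Observation 2: different answer sets -> a safe exit point exists."""
    return set(answers_at_start) != set(answers_at_end)

# Endpoint answer sets (top-1) as listed in Table 3.
segments = {
    ("q", "n3"): (["d3"], ["d3"]),
    ("n3", "n4"): (["d3"], ["d3"]),
    ("n3", "n7"): (["d3"], ["d6"]),
    ("n3", "n5"): (["d3"], ["d3"]),
    ("n5", "n6"): (["d3"], ["d4"]),
}
to_search = [seg for seg, (a, b) in segments.items() if segment_has_safe_exit(a, b)]
print(to_search)
```

Only the segments that pass this test need the more expensive search for the exact exit point.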
To find the safe region, we observe the following cases.
Case 1 (when $\alpha = 1$ and the textual relevance of the highest nonanswer object and lowest answer object is the same). In this case, both the textual and spatial relevance have the same importance (i.e., $\alpha = 1$). In addition, the top-k result depends only on the spatial relevance because the textual relevance of both objects is the same. The data object that is closer to query point q becomes the answer object. For an undirected edge, the safe exit point $p_{se}$ is the center point between the lowest answer object and the highest nonanswer object, i.e., $\max(dist(p_{se}, d^+_1), dist(p_{se}, d^+_2), \ldots, dist(p_{se}, d^+_{|d^+|})) = \min(dist(p_{se}, d^-_1), dist(p_{se}, d^-_2), \ldots, dist(p_{se}, d^-_{|d^-|}))$. However, in the case of a directed edge, where $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$, the safe exit point is either $d^+_l$ or $p_a$. If $d^+_l \in (p_a, n_\beta)$, then the safe exit point is $d^+_l$; otherwise, the safe exit point is $p_a$.
Case 2 (when $\alpha = 1$ and the textual relevance of the highest nonanswer object and lowest answer object is different). In this case, the top-k result depends on all functions, that is, $\alpha$, the spatial relevance, and the textual relevance. Clearly, for the undirected edges, the midpoint between the lowest answer object and the highest nonanswer object does not provide a valid safe exit point. Therefore, we introduce the divide-and-conquer technique. This keeps dividing the search space until we reach the point where the score of the nonanswer object is greater than that of the answer object. Typically, the safe exit point should be closer to the data object whose score is lower. Based on this observation, we first compute the midpoint in a similar fashion to Case 1, and then we continue dividing the search space until we find the point. For directed edges, the safe exit point can be computed in a similar fashion to Case 1.
Case 2 also works for other cases when the safe exit point is not the midpoint between the lowest answer object and the highest nonanswer object. In these cases, the safe exit point depends on two or more functions. Therefore, the safe exit point can be easily computed using the aforementioned divide-and-conquer technique. Following are the scenarios where the safe exit point can be computed using Case 2:
(a) When $\alpha \neq 1$ and the textual relevance of the nearest nonanswer object and farthest answer object is different
(b) When $\alpha \neq 1$ and the textual relevance of the nearest nonanswer object and farthest answer object is the same
Case 3 (when $\alpha = 0$). This means the spatial relevance has no effect on the score of data objects. Hence, no monitoring is required for this scenario.
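The divide-and-conquer search of Case 2 can be read as a bisection on the offset along an undirected segment. The following sketch is one possible rendering, not the authors' code: it assumes the two score functions are given as callables of the offset and that the nonanswer score overtakes the answer score at most once on the segment. The function name and the example score functions (textual relevance 1 versus 0.5, distances $3 + x$ and $5 - x$) are hypothetical.

```python
def safe_exit_by_bisection(answer_score, nonanswer_score, lo, hi, tol=1e-9):
    """Smallest offset in [lo, hi] (up to tol) at which the highest nonanswer
    object scores above the lowest answer object. Returns None when the
    answer object still wins at hi. Assumes a single crossing on [lo, hi]."""
    if nonanswer_score(lo) > answer_score(lo):
        return lo
    if nonanswer_score(hi) <= answer_score(hi):
        return None
    # Invariant: the answer object wins (or ties) at lo, the nonanswer wins at hi.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if nonanswer_score(mid) > answer_score(mid):
            hi = mid
        else:
            lo = mid
    return hi

# Hypothetical segment of length 3 with alpha = 1; x is the offset from its start.
answer = lambda x: 1.0 / (1.0 + (3.0 + x))      # mu = 1.0, distance grows with x
nonanswer = lambda x: 0.5 / (1.0 + (5.0 - x))   # mu = 0.5, distance shrinks with x
print(safe_exit_by_bisection(answer, nonanswer, 0.0, 3.0))
```

With unequal textual relevance the crossing lies at $x = 8/3$ rather than at the geometric midpoint, which illustrates why the midpoint rule of Case 1 is not sufficient here.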
Algorithm 3 retrieves the safe exit points using the observations we discussed earlier. The core function in this algorithm is ComputeSafeExit($p_a$, $n_\beta$), which finds the safe exit point in a segment between $p_a$ and $n_\beta$. The detailed ComputeSafeExit($p_a$, $n_\beta$) is described in Algorithm 4. First, Algorithm 4 determines $d^+_l$ and $d^-_h$ at point $p \in [p_a, n_\beta]$. Recall that $d^+_l$ is the lowest answer object to p, whereas $d^-_h$ is the highest nonanswer object to p. Algorithm 4 computes the safe exit point based on the cases we discussed earlier. There are two further scenarios for Cases 1 and 2. For Case 1, if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$, then the safe exit point is the midpoint between $d^+_l$ and $d^-_h$. If $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$, then the edge is directed, and therefore the safe exit point is either $p_a$ or $d^+_l$. If $d^+_l$ lies on the edge $[p_a, n_\beta]$, then $d^+_l$ is the safe exit point. Otherwise, $p_a$ is the safe exit point.
Similarly, for Case 2, if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$, then the safe exit point is computed by dividing the search space in half until we find the closest point such that $\psi(d^-_h) > \psi(d^+_l)$. If $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$, the safe exit point is computed in the same way as in Case 1.
5.2. Computation of Safe Exit Points for Example. Consider the same example in Figure 1, where the query point q issues a top-1 keyword query with $q_t$ = "Italian restaurant". For this example, let us consider $\alpha = 1$. The monitoring algorithm starts exploring from the active edge containing the query object q. Therefore, $\overrightarrow{(q, n_3)}$ is explored first. As shown in Table 3, for $\overrightarrow{(q, n_3)}$, $D^+_q = \{d_3\}$ and $D^+_{n_3} = \{d_3\}$. According to Observation 1, no safe exit point exists in this segment. Therefore, edges adjacent to $n_3$ are explored, and $n_3$ becomes the new $p_a$. The edge $(n_3, n_4)$ is explored next. Similarly, the answer object at $n_3$ and $n_4$ is the same: $D^+_{n_3} = D^+_{n_4} = \{d_3\}$. Therefore, a safe exit point does not exist in $(n_3, n_4)$. The edge $(n_3, n_7)$ is explored next. As shown in Table 3, $D^+_{n_3} = \{d_3\}$ and $D^+_{n_7} = \{d_6\}$. By Observation 2, there is a safe exit point in $(n_3, n_7)$. As shown in Figure 1, $d_{3,t} = d_{6,t}$ = "Italian Restaurant" and $dist(n_3, n_7) = dist(n_7, n_3)$.
(1) Input: Same as Algorithm 1
(2) Output: $P_{SE}$, a set of safe exit points
(3) $P_{SE} \leftarrow \emptyset$  /* set of safe exit points */
(4) $D^+_{p_a} \leftarrow EvaluateSnapshotQuery(p_a, (p_a, n_\beta))$
(5) /* Results calculated using Algorithm 1 */
(6) $D^+_{n_\beta} \leftarrow EvaluateSnapshotQuery(n_\beta, (p_a, n_\beta))$
(7) /* Results calculated using Algorithm 1 */
(8) if $D^+_{p_a} = D^+_{n_\beta}$ then
(9)     no safe exit point  /* refer to Observation 1 */
(10) end
(11) if $D^+_{p_a} \neq D^+_{n_\beta}$ then
(12)     $P_{SE} \leftarrow P_{SE} \cup ComputeSafeExit(p_a, n_\beta)$  /* safe exit point exists; refer to Observation 2 */
(13) end
(14) return $P_{SE}$
Algorithm 3: COSK monitoring algorithm.
(1) Input: same as Algorithm 1
(2) Output: $se$, safe exit point in $(p_a, n_\beta)$
(3) $D^+_l \leftarrow \langle p, D^+_l \rangle$ | for each point $p \in [p_a, n_\beta]$, $d^+_l$ such that $\psi(d^+_l)_p = \min(\psi(d^+_1)_p, \psi(d^+_2)_p, \ldots, \psi(d^+_{|d^+|})_p)$
(4) $D^-_h \leftarrow \langle p, D^-_h \rangle$ | for each point $p \in [p_a, n_\beta]$, $d^-_h$ such that $\psi(d^-_h)_p = \max(\psi(d^-_1)_p, \psi(d^-_2)_p, \ldots, \psi(d^-_{|d^-|})_p)$
(5) if Case 1 then
(6)     if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$ then
(7)         $p_{se}$ = point $se$ such that $\max(dist(se, d^+_1), dist(se, d^+_2), \ldots, dist(se, d^+_{|d^+|})) = \min(dist(se, d^-_1), dist(se, d^-_2), \ldots, dist(se, d^-_{|d^-|}))$
(8)     end
(9)     if $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$ then
(10)         $p_{se} = p_a$ or $p_{se} = d^+_l$ where $d^+_l \in (p_a, n_\beta)$
(11)     end
(12) end
(13) if Case 2 then
(14)     if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$ then
(15)         $p_{se}$ = closest point to $p_a$ such that $\psi(d^-_h) > \psi(d^+_l)$
(16)     end
(17)     if $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$ then
(18)         Same as Line (10)
(19)     end
(20) end
(21) return $p_{se}$
Algorithm 4: ComputeSafeExit($p_a$, $n_\beta$).
Therefore, according to Case 1, the safe exit point $p_{se1}$ is the midpoint between $d_3$ and $d_6$. That is, $dist(p_{se1}, d_3) = dist(p_{se1}, d_6)$, where $dist(p_{se1}, d_3) = x + 3$ and $dist(p_{se1}, d_6) = -x + 5$ for $0 < x < 3$. Consequently, $x = 1$, which means that the distance from $n_3$ to $p_{se1}$ is 1.
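The midpoint computation for Case 1 on an undirected segment has a closed form, which the following sketch spells out. It assumes that moving x units along the segment lengthens the distance to the answer object by x and shortens the distance to the nonanswer object by x, as in the equation above; the function name and the range check against the segment length are illustrative additions.

```python
def midpoint_safe_exit(answer_dist, nonanswer_dist, segment_length):
    """Solve answer_dist + x == nonanswer_dist - x for the offset x measured
    from the start of the segment. Returns None if x falls outside the segment."""
    x = (nonanswer_dist - answer_dist) / 2.0
    if 0.0 <= x <= segment_length:
        return x
    return None

# Values from the example: dist(n3, d3) = 3, dist(n3, d6) = 5, len(n3, n7) = 3.
print(midpoint_safe_exit(3.0, 5.0, 3.0))   # 1.0
```

When the computed offset falls outside the segment, the scores do not cross on it, which corresponds to Observation 1.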
Next, we determine a safe exit point in $(n_3, n_5)$. As shown in Table 3, the answer object at $n_5$ is also the same as at $n_3$. Hence, no safe exit point exists in this edge. Next, $\overleftarrow{(n_6, n_5)}$ is explored with $p_a = n_5$. According to Table 3, $D^+_{n_6} = \{d_4\}$ and $D^+_{n_5} = \{d_3\}$. Therefore, a safe exit point exists in this edge. This edge is directed, and for each point $p \in \overleftarrow{(n_6, n_5)}$, the shortest path from p to $d_3$ is $p \rightarrow n_6 \rightarrow n_2 \rightarrow n_3 \rightarrow d_3$. Therefore, $n_5$ is the safe exit point.
The bold lines in Figure 5 indicate the safe region of q. The top-1 result remains $d_3$ as long as the query q lies in the safe region.
Next, we analyze the time complexity for determining a set of safe exit points using a set of qualifying objects $d \in D^+_{p_a} \cup D^+_{n_\beta} \cup D(p_a, n_\beta)$. Note that $D^+_{p_a}$ ($D^+_{n_\beta}$) indicates
Table 3: Computation of safe exit points for example scenario.
| Edge/Segment | $p_a$ | $D^+_{p_a}$ | $D^+_{n_\beta}$ | $p_{se}$ |
| --- | --- | --- | --- | --- |
| $\overrightarrow{(q, n_3)}$ | $q$ | $D^+_q = \{d_3\}$ | $D^+_{n_3} = \{d_3\}$ | none |
| $(n_3, n_4)$ | $n_3$ | $D^+_{n_3} = \{d_3\}$ | $D^+_{n_4} = \{d_3\}$ | none |
| $(n_3, n_7)$ | $n_3$ | $D^+_{n_3} = \{d_3\}$ | $D^+_{n_7} = \{d_6\}$ | $p_{se1}$ |
| $\overrightarrow{(n_3, n_5)}$ | $n_3$ | $D^+_{n_3} = \{d_3\}$ | $D^+_{n_5} = \{d_3\}$ | none |
| $\overleftarrow{(n_6, n_5)}$ | $n_5$ | $D^+_{n_5} = \{d_3\}$ | $D^+_{n_6} = \{d_4\}$ | $p_{se2}$ |
Figure 5: Illustration of safe region of q.
the set of k data objects that satisfies the query condition at $p_a$ ($n_\beta$). According to Dijkstra's algorithm [26], the time complexity $O(D^+_q)$ for computing a set of answer objects at a query point q is $O(D^+_q) = O(|E| + |N| \log |N|)$. This means that $O(D^+_{p_a}) = O(D^+_{n_\beta}) = O(|E| + |N| \log |N|)$ holds for endpoints $p_a$ and $n_\beta$. Thus, the time complexity $O(\Omega_{kth})$ when determining the skyline $\Omega_{kth}$ with the k-th highest score is $O(\Omega_{kth}) = C_{kth} \cdot O(|D^+_{p_a} \cup D^+_{n_\beta} \cup D(p_a, n_\beta)|)$, where $C_{kth}$ is the number of qualifying objects that participate in the constitution of the skyline with the k-th highest score. Therefore, the time complexity of determining a safe exit point coincides with the time complexity of determining the two skylines, i.e., the skyline $D^+_l$ with the k-th highest (or lowest) score for answer objects and the skyline $D^-_h$ with the highest score for nonanswer objects. This is because the safe exit point is found at the cross point between these skylines.
Figure 6 represents the skyline graph for $k = 1$ in an edge $(n_7, n_3)$. Let us draw the score functions for $d_3$ and $d_6$ for the road segment $(n_7, n_3)$, where a safe exit point exists. This is because $D^+_{n_3} = \{d_3\}$ and $D^+_{n_7} = \{d_6\}$ for $k = 1$. For each point $p \in (n_7, n_3)$, the distance between $d_3$ and point p can be represented as $dist(d_3, p) = dist(d_3, n_3) + len(n_3, p) = 6 - len(n_7, p)$. Similarly, for each point $p \in (n_7, n_3)$, the distance between $d_6$ and point p can be represented as $dist(d_6, p) = dist(d_6, n_7) + len(n_7, p) = 2 + len(n_7, p)$. Let $len(n_7, p)$ be
Figure 6: Skyline graph for $k = 1$ on the road segment $(n_7, n_3)$. Horizontal axis: distance along the segment from $n_7$ to $n_3$ (0 to 3.0), with $p_{se1}$ marked; vertical axis: score (0 to 1.0); curves: $\psi(d_6) = 1/(x + 3)$ and $\psi(d_3) = 1/(-x + 7)$.
a variable x ($0 \le x \le 3$). We can write $\lambda(d_3, p) = dist(d_3, p) = 6 - x$ and $\lambda(d_6, p) = dist(d_6, p) = 2 + x$. Then we can represent the score functions $\psi(d_3)$ and $\psi(d_6)$ as follows:
$\psi(d_3) = \mu(d_{3,t}, q_t)/(1 + \alpha \cdot \lambda(d_3, p)) = 1/(7 - x)$ for $0 \le x \le 3$,
$\psi(d_6) = \mu(d_{6,t}, q_t)/(1 + \alpha \cdot \lambda(d_6, p)) = 1/(3 + x)$ for $0 \le x \le 3$.
Finally, we present the lemma to prove that safe exit points computed by COSK are correct.
Lemma 8. The COSK algorithm correctly computes a set of safe exit points.
Proof. We prove the correctness of the COSK algorithm by contradiction. Assume that $D^+_{p_a} \neq D^+_{n_\beta}$ and yet there is no safe exit point in a road segment $(p_a, n_\beta)$. This means that for each point p in the road segment $(p_a, n_\beta)$, the query result at p equals $D^+_{p_a}$, i.e., $D^+_p = D^+_{p_a}$ $\forall p \in (p_a, n_\beta)$. However, this leads to the contradiction that $D^+_{n_\beta} = D^+_{p_a}$ when $p = n_\beta$. Therefore, if $D^+_{p_a} \neq D^+_{n_\beta}$, a safe exit point exists in $(p_a, n_\beta)$. In addition, a safe exit point is determined using the skyline $D^+_l$ for answer objects and the skyline $D^-_h$ with the highest score for nonanswer objects when $D^+_{p_a} \neq D^+_{n_\beta}$. The first skyline is a composite polyline drawn from answer objects in $D^+_{p_a}$. The second skyline is a composite polyline drawn from nonanswer objects in $D^+_{n_\beta} \cup D(p_a, n_\beta) - D^+_{p_a}$.
6. Monitoring Query Results and Safe Regions in Dynamic Directed Road Networks
In this section, we discuss the monitoring of spatial keyword queries in dynamic road networks, where the network distance changes depending on the traffic conditions. Updates to the weights of some edges may invalidate the query results or the safe region of q even though the query object q remains within its safe region. Figure 7 illustrates an example of changing the weights of edges $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$. For convenience, we consider $\alpha = 1$ and $q_t$ = "Italian restaurant". In Figure 7(a), the top-1 result is $d_1$, and bold lines show the safe region of query q. Now consider that at time $t_j$ the weights of the two edges $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$ change due to heavy traffic, as shown in Figure 7(b). The update in the weights of edges may invalidate the query result or the safe region of q. Therefore, it is necessary to monitor the validity of the results and the safe region when such changes occur.
Next, we introduce a monitoring region to monitor the validity of the safe region effectively when the weight of an edge is changed. The monitoring region MR contains all the points between query point q and the lowest answer object and the highest nonanswer object. Formally, it is defined as $MR = dist(q, D^+_l) \cup dist(q, D^-_h)$, where $dist(q, D^+_l)$ is the distance between q and the lowest answer object and $dist(q, D^-_h)$ is the distance between q and the highest nonanswer object. In the given example, $D^+_l = \{d_1\}$ and $D^-_h = \{d_2, d_3\}$. Therefore, the dotted lines in Figure 8(a) show the monitoring region of query object q.
Now, at time $t_j$, the updates to edges $\overleftarrow{(n_1, n_6)}$ and $\overleftarrow{(n_1, d_1)}$, which are not part of the monitoring region, can safely be ignored. However, the update on segment $\overrightarrow{(n_2, d_1)}$, which is associated with the monitoring region, may nullify the results. As shown in Figure 8(b), after the update the top-1 result becomes $d_2$, and bold lines represent the new safe region of q.
Algorithm 5 monitors the validity of the result set and the safe region of query object q when the weight of any edge changes. Let us consider that the weight of edge $(n_i, n_j)$ changes at time $t_j$. First, the algorithm checks whether edge $(n_i, n_j)$ is associated with the monitoring region. If it is not part of the monitoring region, then the algorithm simply ignores the update in edge $(n_i, n_j)$, and the query results and safe region remain valid. In contrast, if the edge is associated with the monitoring region (i.e., $MR \cap (n_i, n_j) \neq \emptyset$), then the algorithm reevaluates the query results. Consequently, the top-k results and the safe region of query q need to be updated. Finally, the algorithm updates the monitoring region of q.
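The filtering step described above can be sketched as a membership test followed by a conditional reevaluation. This is a simplified illustration, not Algorithm 5 itself: it assumes the monitoring region is stored as a set of whole edges (the paper's region may cover partial segments), and the names `on_edge_update` and `reevaluate` are hypothetical.

```python
def on_edge_update(monitoring_region, updated_edge, reevaluate):
    """Ignore weight updates outside the monitoring region; otherwise
    trigger a reevaluation of the results, safe region, and monitoring region.
    Returns True if a reevaluation was triggered."""
    if updated_edge not in monitoring_region:
        return False     # results and safe region remain valid
    reevaluate()         # caller recomputes top-k, safe exits, and MR
    return True

# Illustrative monitoring region stored as whole edges (assumed layout).
monitoring_region = {("n2", "d1"), ("q", "n2")}
reevaluations = []
print(on_edge_update(monitoring_region, ("n1", "n6"),
                     lambda: reevaluations.append("recompute")))
print(on_edge_update(monitoring_region, ("n2", "d1"),
                     lambda: reevaluations.append("recompute")))
print(len(reevaluations))
```

The benefit of the monitoring region comes entirely from the first branch: most weight updates in a large network touch edges far from the query and are discarded after a single lookup.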
7 Performance Evaluation
In this section we evaluate the performance of COSKthrough simulation experiments We describe our experi-mental settings in Section 71 and we present our experimen-tal results for static and dynamic road networks in Sections72 and 73 respectively
71 Experimental Settings All of our experiments wereperformed using real road networks namely OldenburgSan Francisco and San Joaquin All three road networkswere obtained from [27] The original road network of SanFrancisco had 21047 nodes and 21692 edges We reformat-ted the network pruned approximately 30 of the nodesand adjusted the edges and their weights accordingly Thisresulted in a network with 14732 nodes and 14316 edgesBoth the direction of edges and data objects on the edgeswere generated randomly The description of each data objectwas extracted from Twitter messages [28] and we assignedone tweet per data object Table 4 presents the characteristicsof the data sets used in the experimental evaluation Wesimulated moving query objects by using a spatiotemporaldata generator [29] The input to generator was the road net-work of the data set used and the output was the set of queryobjects moving on the road network Each experiment had100 moving queries which were continuously monitored for100 timestamps (1 timestamp = 1 second) and the averageresult was reported in the experiments
As a benchmark for COSK in static road networks, we implemented the CMTkSK+ algorithm [22], which also continuously monitors moving top-k spatial keyword queries in road networks. However, this algorithm was originally designed for undirected road networks. To make a fair comparison, we modified it to process top-k spatial keyword queries in directed road networks; CMTkSK+ refers to this modified version throughout the evaluation. Specifically, we modified the distance computation method between two points such that, in directed road networks, $dist(p_1, p_2) \neq dist(p_2, p_1)$ in general. Since CMTkSK+ does not handle top-k spatial keyword queries in dynamic road networks, we compared the performance of COSK with a basic algorithm, which recomputes the results whenever the query object changes its location. All algorithms were implemented in Java and were executed on a desktop PC (2.80-GHz Intel Core i5) with
12 Wireless Communications and Mobile Computing
Figure 7: Updating the weight of edges in a dynamic road network, where $t_i < t_j$. (a) Safe region at time $t_i$. (b) Updating the weights of $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$ at time $t_j$. The panels show query q on a road network with nodes n1 to n6, data objects d1 (Italian Restaurant), d2 (Italian Restaurant), and d3 (Chinese Restaurant), and, in panel (a), safe exit points pse1, pse2, and pse3.
Figure 8: Monitoring region and updated safe region at time $t_j$. (a) Monitoring region at time $t_i$. (b) New safe region at time $t_j$. The panels show query q on the same road network (nodes n1 to n6, data objects d1 to d3), with the new safe exit points pse1, pse2, and pse3 in panel (b).
(1) Input: monitoring region MR, updated edge $(n_i, n_j)$
(2) Output: none
(3) if $MR \cap (n_i, n_j) = \emptyset$ then
(4)     /* edge $(n_i, n_j)$ is not part of the monitoring region */
(5)     ignore the change in the weight of edge $(n_i, n_j)$
(6) end
(7) $P_{SE} \leftarrow \emptyset$ /* set of safe exit points */
(8) else
(9)     $D_k^{upd} \leftarrow EvaluateSnapshotQuery(n_i, e_i)$ /* update set of top-k results */
(10)    $P_{SE}^{upd} \leftarrow ComputeSafeExit(p_a, n_\beta)$ /* update safe exit points */
(11)    $MR^{upd} \leftarrow ComputeMonitoringRegion(D_l^+, D_h^-)$ /* update monitoring region */
(12) end
Algorithm 5: MonitoringSafeRegion(MR, $(n_i, n_j)$).
Table 4: Summary of datasets.
| Attribute | Oldenburg | San Francisco | San Joaquin |
|---|---|---|---|
| Total no. of nodes | 6,104 | 14,732 | 18,262 |
| Total no. of edges | 7,034 | 14,316 | 23,876 |
| Percentage of directed edges | 30 | 30 | 30 |
| Total no. of objects | 5,627 | 11,453 | 19,098 |
| Average no. of objects per edge | 0.8 | 0.8 | 0.8 |
| Total no. of words | 49,517 | 103,649 | 166,153 |
Table 5: Experimental parameter settings.
| Parameter | Range |
|---|---|
| Number of results (k) | 5, 10, 15, 20, 25 |
| Number of keywords (n) | 1, 2, 3, 4, 5 |
| Query parameter ($\alpha$) | 0.01, 0.1, 1, 10, 100 |
| Dataset | Oldenburg, San Francisco, San Joaquin |
| Number of data objects ($N_D$) | 10, 20, 30, 40, 50 (x1000) |
| Speed of query objects ($V_{qry}$) | 25, 50, 75, 100, 125 (km/h) |
| Mobility ($M_{qry}$) | 20, 40, 60, 80, 100 (%) |
| Ratio of directed edges ($E_{dir}$) | 10, 20, 30, 40, 50 (%) |
| Ratio of updated edges ($E_{upd}$) | 15, 30, 60, 80, 100 (%) |
8 GB of memory. In the experiments, we compared (1) query processing times, (2) edges processed, i.e., the number of edges processed for retrieving query results, and (3) index sizes. Table 5 summarizes the parameters used in the experiments. In each experiment, we varied a single parameter within the range shown in Table 5 while maintaining the other parameters at their default values.
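The asymmetric distance used to adapt the baseline to directed road networks can be illustrated with a shortest-path computation over directed edges. The sketch below uses Dijkstra's algorithm [26] on a hypothetical three-node network; the graph, node names, and the `network_distance` helper are assumptions made for illustration and are not taken from the evaluated implementation.

```python
import heapq
from typing import Dict, List, Tuple

Graph = Dict[str, List[Tuple[str, float]]]  # node -> [(neighbor, edge weight)]


def network_distance(graph: Graph, source: str, target: str) -> float:
    """Dijkstra shortest-path length over directed edges; inf if unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > best.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < best.get(v, float("inf")):
                best[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")


# Toy network (hypothetical): a -> b is one-way, the other edges are two-way.
toy = {
    "a": [("b", 2.0), ("c", 3.0)],
    "b": [("c", 4.0)],
    "c": [("a", 3.0), ("b", 4.0)],
}
forward = network_distance(toy, "a", "b")   # follows the one-way edge a -> b
backward = network_distance(toy, "b", "a")  # must detour through c
```

Here the forward distance is 2.0 while the backward distance is 7.0, which is the kind of asymmetry a method designed for undirected networks does not account for.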
We evaluated the performance of the algorithms by using the following measures: (1) the total amount of server CPU time, which indicates the query processing time, and (2) the total communication cost, measured as the total number of points (i.e., the location updates sent by query objects and the query results and safe exit points returned by the server) transferred between clients and the server. The battery power and wireless bandwidth consumption typically increase with the amount of data transferred between objects (clients) and servers. Thus, we used the amount of transferred data as a metric to evaluate the communication cost.
7.2. Experimental Results of Top-k Spatial Keyword Queries in Static Road Networks
7.2.1. Effect of k. Figure 9 shows the effect of the number of results on the query processing time and communication cost for both algorithms. Figure 9(a) indicates that the query processing time increases for both algorithms as the value of k increases. This is expected because, with an increase in k, more data objects must be explored and verified. Nevertheless, COSK significantly outperforms CMTkSK+ for two main reasons. First, the search for relevant objects is very efficient when using the highest significance factor, and second, COSK does not need to verify the set of answer objects as long as the query object lies in a safe region. On the other hand, the CMTkSK+ query processing time increases significantly because it has to monitor and verify the set of candidate objects periodically. In Figure 9(b), the communication costs for both algorithms increase as the number of requested objects k increases. However, the proposed algorithm demonstrates superior performance compared to CMTkSK+ because client-server communication is not required while the query object lies within the safe exit points, whereas in CMTkSK+ the query object is required to report its location to the server whenever it moves.
7.2.2. Effect of $N_D$. This experiment was conducted on the San Joaquin dataset. This dataset included 19,098 data objects; therefore, we randomly generated approximately 30,000 additional data objects on different edges. In Figure 10, we evaluate the performance of COSK and CMTkSK+ by varying the cardinality of the data objects. Note that $N_D = 10K$ corresponds to a low density of data points, while $N_D = 50K$ corresponds to a high density. In Figure 10(a), it is interesting to notice that the query processing times of both algorithms decrease as the cardinality of the data objects increases. For CMTkSK+, this is because, with high density, the monitoring range of a query decreases. For COSK, however, it is mainly because, when the data density is high, fewer edges need to be expanded, which decreases the query processing time. In Figure 10(b), we study the influence of the cardinality of the data objects on the communication costs. The experimental results indicate that CMTkSK+ incurs almost constant communication costs regardless of the data object cardinality. However, the communication costs of COSK increase in proportion to the $N_D$ value. This is expected because the safe region becomes smaller as the density of the data objects increases, which increases the communication costs.
7.2.3. Effect of Query Keywords (n). Figure 11 shows the query processing time and communication cost for COSK and CMTkSK+ as a function of the number of query keywords. Figures 11(a) and 11(b) show that the performance of both algorithms degrades when the number of keywords increases. This is mainly because, by increasing the number of query keywords, the number of relevant objects may also increase, resulting in a higher query processing time and communication cost. However, the safe-region-based algorithm COSK scales better than CMTkSK+ because of its less expensive monitoring technique.
7.2.4. Effect of $\alpha$. Figure 12 demonstrates the impact of query parameter $\alpha$ on the query processing time and on the communication cost. A small value of $\alpha$ indicates a greater importance of textual relevance, whereas a high value of $\alpha$ gives more preference to the spatial relevance. It is interesting to note that the query processing time is lower for higher
Figure 9: Effect of k on query processing time and communication cost. (a) Query processing time (s) versus k. (b) Communication cost (number of messages transferred, logarithmic scale from 100 to 1M) versus k. Series: COSK and CMTkSK+.
Figure 10: Effect of $N_D$ on query processing time and communication cost. (a) Query processing time (s) versus $N_D$ (10k to 50k). (b) Communication cost (number of transferred messages, logarithmic scale from 100 to 1M) versus $N_D$. Series: COSK and CMTkSK+.
values of $\alpha$, which give more importance to the spatial relevance. This is mainly because, when the spatial relevance is weighted more heavily, fewer edges and objects need to be explored and processed to determine the top-k data objects. Observe that, in Figure 12(b), the number of messages sent by COSK decreases sharply with an increase in $\alpha$.
7.2.5. Effect of Speed. Figure 13(a) demonstrates the influence of the speed of the query objects on the query processing time of the COSK and CMTkSK+ algorithms. The experimental results indicate that the performance of CMTkSK+ is not significantly influenced by the speed of the query objects because the candidate objects must be monitored at regular time intervals regardless of the speed. On the other hand, the performance of COSK gradually degrades as the speed of the query objects increases because the objects leave their respective safe regions more frequently. Figure 13(b) shows the communication costs of COSK and CMTkSK+ with respect to the speed of the query objects. CMTkSK+ incurs almost constant communication costs because a server-initiated request to verify the candidate objects does not depend on the speed. For COSK, the query objects cross safe regions more frequently when the speed is high, which increases the communication costs.
Figure 11: Effect of number of keywords on query processing time and communication cost. (a) Query processing time (s) versus number of keywords (1 to 5). (b) Communication cost (number of messages transferred, logarithmic scale from 100 to 1M) versus number of keywords. Series: COSK and CMTkSK+.
Figure 12: Effect of $\alpha$ on query processing time and communication cost. (a) Query processing time (s) versus $\alpha$ (0.01 to 100). (b) Communication cost (number of messages transferred, logarithmic scale from 100 to 1M) versus $\alpha$. Series: COSK and CMTkSK+.
7.2.6. Effect of Mobility. Figure 14 shows the effect of mobility $M_{qry}$ (mobility refers to the percentage of query objects that are moving at any timestamp) on the performance of the COSK and CMTkSK+ algorithms. As expected, the query processing time and communication costs for both algorithms increase with $M_{qry}$. Nevertheless, COSK performs better than CMTkSK+ in terms of both query processing time and communication costs.
7.2.7. Effect of Directed Edges. Figure 15 shows the impact of the percentage of directed edges $E_{dir}$ on the performance of the COSK and CMTkSK+ algorithms. The query processing time increases with $E_{dir}$ because the algorithms need to explore more edges to retrieve the top-k results. However, the communication cost is not significantly affected by the value of $E_{dir}$ for either algorithm.
7.2.8. Effect of Datasets. Figure 16 shows the index sizes of the COSK and CMTkSK+ approaches for different datasets. As shown in Figure 16, both algorithms have similar index sizes. However, COSK has a minor space overhead because it stores additional information about the highest significance factor $\theta_t$ of edges. More importantly, this space overhead is minimal compared to the gain achieved by COSK in query processing time and communication costs.
Figure 13: Effect of speed on query processing time and communication cost. (a) Query processing time (s) versus $V_{qry}$ (25 to 125). (b) Communication cost (number of messages transferred, logarithmic scale from 100 to 1M) versus $V_{qry}$. Series: COSK and CMTkSK+.
Figure 14: Effect of mobility on query processing time and communication cost. (a) Query processing time (s) versus $M_{qry}$ (20 to 100). (b) Communication cost (number of messages transferred, logarithmic scale from 100 to 1M) versus $M_{qry}$. Series: COSK and CMTkSK+.
7.3. Experimental Results of Top-k Spatial Keyword Queries in Dynamic Road Networks. In this section, we evaluate the performance of COSK and the basic algorithm for dynamic road networks. $E_{upd}$ indicates the percentage of all edges that change their weight at each timestamp. The length of an updated edge is randomly selected between 0.1 and 10 times the original length. Figure 17(a) depicts the query processing time of COSK and the basic algorithm. It is evident from the figure that the query processing time of the basic algorithm is not significantly affected by $E_{upd}$. This is mainly because the query objects issue top-k spatial keyword queries at each timestamp. However, the query processing time of COSK increases with the value of $E_{upd}$ because the probability that an updated edge is associated with the monitoring region of query q increases with $E_{upd}$. Therefore, when $E_{upd}$ becomes large, the results need to be updated frequently, which increases the query processing time. Figure 17(b) shows the communication costs of COSK and the basic algorithm with respect to $E_{upd}$. The basic algorithm incurs almost constant communication costs regardless of the value of $E_{upd}$. In contrast, the communication cost of COSK increases with $E_{upd}$ because the query result and safe regions need to be updated frequently.
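The edge update workload described above can be sketched as follows. This is a hedged illustration only: the `update_edge_weights` helper is a hypothetical name, and drawing the scaling factor uniformly from [0.1, 10] is an assumption, since the text states only that the new length is randomly selected within that range.

```python
import random
from typing import Dict, Tuple

Edge = Tuple[str, str]


def update_edge_weights(original: Dict[Edge, float], ratio_pct: float,
                        rng: random.Random) -> Dict[Edge, float]:
    """Return new weights after rescaling `ratio_pct` percent of the edges.

    Each selected edge gets a length between 0.1 and 10 times its original
    length (uniform sampling of the factor is an assumption).
    """
    edges = sorted(original)                      # fixed order for reproducible sampling
    count = round(len(edges) * ratio_pct / 100)
    updated = dict(original)
    for edge in rng.sample(edges, count):
        updated[edge] = original[edge] * rng.uniform(0.1, 10.0)
    return updated


# Hypothetical network of 20 edges, each of length 5.0, with E_upd = 30%.
base = {(f"n{i}", f"n{i + 1}"): 5.0 for i in range(20)}
after = update_edge_weights(base, ratio_pct=30, rng=random.Random(7))
changed = [e for e in base if after[e] != base[e]]
```

Applying such a function once per timestamp reproduces the setting in which a larger $E_{upd}$ makes it more likely that an updated edge falls inside some query's monitoring region.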
Figure 15: Effect of $E_{dir}$ on query processing time and communication cost. (a) Query processing time (s) versus $E_{dir}$ (10 to 50). Series: COSK and CMTkSK+. (b) Communication cost (number of messages transferred, logarithmic scale from 100 to 1M) versus $E_{dir}$. Legend as printed: eSPAK and CMTkSK+.
Figure 16: Effect of dataset on index size. Index size (MB) for the Oldenburg, San Francisco, and San Joaquin datasets. Series: COSK and CMTkSK+.
8. Conclusion
In this paper, we investigated moving top-k spatial keyword queries in directed and dynamic road networks. We presented an efficient indexing framework using inverted files that indexes the data objects on edges, allowing for the effective searching of data objects relevant to queries in terms of both textual and spatial relevance. We also presented a safe-exit-based algorithm called COSK to monitor moving top-k spatial keyword queries. We demonstrated that the query results remain valid as long as the query object resides within a safe region. Furthermore, COSK can effectively monitor the validity of query results and safe regions in dynamic road networks. Finally, an experimental evaluation conducted on real road networks demonstrated that COSK significantly reduced the query processing time and communication costs compared to the CMTkSK+ algorithm.
Data Availability
The real road network data used in this study are also used in many previous studies. The road network data are cited in the manuscript and are available at https://www.cs.utah.edu/~lifeifei/SpatialDataset.htm. To simulate the moving queries, the authors used a spatiotemporal data generator, which is also used in previous studies. The research article describing the generator is cited in the manuscript. The documentation and source
Figure 17: Effect of $E_{upd}$ on query processing time and communication cost. (a) Query processing time (s) versus $E_{upd}$ (15 to 75). (b) Communication cost (number of messages transferred, logarithmic scale from 100 to 1M) versus $E_{upd}$. Series: COSK and Basic.
files of the generator are available at https://iapg.jade-hs.de/personen/brinkhoff/generator/. The authors used Twitter tweets to generate the descriptions of the data objects and the query keywords. The tweets used are accessible at http://followthehashtag.com/datasets/free-twitter-dataset-usa-200000-free-usa-tweets/.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
Hyung-Ju Cho was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean Government (MSIP) (NRF-2016R1A2B4009793), and this research was partially supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2016R1D1A1B03934129).
References
[1] D. Papadias, N. Mamoulis, J. Zhang, and Y. Tao, "Query processing in spatial network databases," in Proceedings of the 29th International Conference on Very Large Data Bases (VLDB '03), pp. 802–813, September 2003.
[2] H.-J. Cho, K. Ryu, and T.-S. Chung, "An efficient algorithm for computing safe exit points of moving range queries in directed road networks," Information Systems, vol. 41, pp. 1–19, 2014.
[3] G. Tsatsanifos and A. Vlachou, "On processing Top-k spatio-textual preference queries," in Proceedings of the 18th International Conference on Extending Database Technology (EDBT '15), pp. 433–444, March 2015.
[4] R. Li, A. X. Liu, A. L. Wang, and B. Bruhadeshwar, "Fast range query processing with strong privacy protection for cloud computing," Proceedings of the VLDB Endowment, vol. 7, no. 14, pp. 1953–1964, 2014.
[5] G. Cong, C. S. Jensen, and D. Wu, "Efficient retrieval of the Top-k most relevant spatial web objects," Proceedings of the VLDB Endowment, vol. 2, no. 1, pp. 337–348, 2009.
[6] Z. Li, K. C. K. Lee, B. Zheng, W.-C. Lee, D. Lee, and X. Wang, "IR-tree: An efficient index for geographic document search," IEEE Transactions on Knowledge and Data Engineering, vol. 23, no. 4, pp. 585–599, 2011.
[7] Y. Zhou, X. Xie, C. Wang, Y. Gong, and W. Ma, "Hybrid index structures for location-based web search," in Proceedings of the 14th ACM International Conference on Information and Knowledge Management, pp. 155–162, Bremen, Germany, October 2005.
[8] J. Zobel and A. Moffat, "Inverted files for text search engines," ACM Computing Surveys, vol. 38, no. 2, 2006.
[9] N. Beckmann, H. Kriegel, R. Schneider, and B. Seeger, "The R*-tree: an efficient and robust access method for points and rectangles," in Proceedings of the ACM SIGMOD International Conference on Management of Data, vol. 19, pp. 322–331, May 1990.
[10] R. Hariharan, B. Hore, C. Li, and S. Mehrotra, "Processing spatial-keyword (sk) queries in geographic information retrieval (gir) systems," in Proceedings of the 19th International Conference on Scientific and Statistical Database Management (SSDBM '07), July 2007.
[11] I. De Felipe, V. Hristidis, and N. Rishe, "Keyword search on spatial databases," in Proceedings of the 24th International Conference on Data Engineering (ICDE '08), pp. 656–665, April 2008.
[12] J. B. Rocha-Junior, O. Gkorgkas, S. Jonassen, and K. Nørvåg, "Efficient processing of top-k spatial keyword queries," in Proceedings of the International Symposium on Spatial and Temporal Databases, pp. 205–222, Springer, 2011.
[13] D. Zhang, K.-L. Tan, and A. K. Tung, "Scalable top-k spatial keyword search," in Proceedings of the 16th International Conference on Extending Database Technology, pp. 359–370, 2013.
[14] J. B. Rocha-Junior and K. Nørvåg, "Top-k spatial keyword queries on road networks," in Proceedings of the 15th International Conference on Extending Database Technology, pp. 168–179, Berlin, Germany, March 2012.
[15] H.-J. Cho, S. J. Kwon, and T.-S. Chung, "A safe exit algorithm for continuous nearest neighbor monitoring in road networks," Mobile Information Systems, vol. 9, no. 1, pp. 37–53, 2013.
[16] D. Yung, M. L. Yiu, and E. Lo, "A safe-exit approach for efficient network-based moving range queries," Data & Knowledge Engineering, vol. 72, pp. 126–147, 2012.
[17] M. Attique, H. Cho, R. Jin, and T. Chung, "Efficient Processing of Continuous Reverse k Nearest Neighbor on Moving Objects in Road Networks," ISPRS International Journal of Geo-Information, vol. 5, no. 12, p. 247, 2016.
[18] H. G. Elmongui, M. F. Mokbel, and W. G. Aref, "Continuous aggregate nearest neighbor queries," GeoInformatica, vol. 17, no. 1, pp. 63–95, 2013.
[19] D. Wu, M. L. Yiu, C. S. Jensen, and G. Cong, "Efficient continuously moving top-k spatial keyword query processing," in Proceedings of the IEEE International Conference on Data Engineering (ICDE '11), pp. 541–552, Hannover, Germany, April 2011.
[20] W. Huang, G. Li, K.-L. Tan, and J. Feng, "Efficient safe-region construction for moving top-k spatial keyword queries," in Proceedings of the 21st ACM International Conference on Information and Knowledge Management, pp. 932–941, 2012.
[21] L. Guo, J. Shao, H. H. Aung, and K.-L. Tan, "Efficient continuous top-k spatial keyword queries on road networks," GeoInformatica, vol. 19, no. 1, pp. 29–60, 2014.
[22] Y. Li, G. Li, L. Shu, Q. Huang, and H. Jiang, "Continuous monitoring of top-k spatial keyword queries in road networks," Journal of Information Science and Engineering, vol. 31, no. 6, pp. 1831–1848, 2015.
[23] M. Attique, A. Khan, and T.-S. Chung, "ESPAK: Top-k spatial keyword query processing in directed road networks," in Proceedings of the Workshops of the International Conference on Extending Database Technology and the International Conference on Database Theory (EDBT/ICDT '17), March 2017.
[24] G. Salton and C. Buckley, "Term-weighting approaches in automatic text retrieval," Information Processing & Management, vol. 24, no. 5, pp. 513–523, 1988.
[25] V. N. Anh, O. de Kretser, and A. Moffat, "Vector-space ranking with effective early termination," in Proceedings of the 24th Annual International ACM SIGIR Conference, pp. 35–42, New Orleans, LA, USA, 2001.
[26] E. W. Dijkstra, "A note on two problems in connexion with graphs," Numerische Mathematik, vol. 1, pp. 269–271, 1959.
[27] "Real datasets for spatial databases," https://www.cs.utah.edu/~lifeifei/SpatialDataset.htm.
[28] "Twitter," https://twitter.com/.
[29] T. Brinkhoff, "A framework for generating network-based moving objects," GeoInformatica, vol. 6, no. 2, pp. 153–180, 2002.
Figure 3: Illustration of directed road network. Query object q issues a TkSK query at $p_1$, and the server returns a set of objects for $p_1$.
Figure 4: Illustration of directed road network. Query object q issues a TkSK query at $p_2$, and the server returns a set of objects for $p_2$.
5. Moving Top-k Spatial Keyword Queries
In this section, we present our method to monitor moving top-k spatial keyword queries, where query objects are moving in a directed road network. Figure 3 provides an example of TkSK in road networks, where query point q issues a TkSK query at point $p_1$. Note that the numbers on the arrows in the figure indicate the order of the steps. To obtain the top-k results at $p_1$, the server executes Algorithm 1, as mentioned in Section 4.2. Now consider that the query object moves to $p_2$, as shown in Figure 4. To retrieve the top-k results at point $p_2$, the simple method is to repeat the procedure executed at $p_1$. However, recomputing the results whenever query q changes its location significantly increases the computation cost. Furthermore, it also increases the communication overhead because the query object must report its location whenever it moves and the server must send back the result set. To address these issues, we introduce the safe exit approach.
In the proposed framework, the server computes safe exit points for a query object. The server maintains a set of moving queries, and each query result remains valid as long as the query object stays within its respective safe exit points. Whenever a query object moves beyond its safe exit points, the server recomputes the TkSK results and safe exit points for the query object.
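The saving in client-server communication can be illustrated with a small simulation. The sketch below is a simplification under stated assumptions: the road is modeled as a one-dimensional line, a safe region is an interval bounded by two safe exit points, and `toy_safe_region` is a hypothetical stand-in for the server-side computation.

```python
from typing import Callable, List, Tuple

Interval = Tuple[float, float]  # safe region on a 1-D road, bounded by two safe exit points


def count_server_contacts(path: List[float],
                          compute_safe_region: Callable[[float], Interval]) -> int:
    """Count how often a moving query must contact the server."""
    contacts = 1                                   # initial query at path[0]
    lo, hi = compute_safe_region(path[0])
    for position in path[1:]:
        if lo <= position <= hi:
            continue                               # still inside: results stay valid
        contacts += 1                              # passed a safe exit point
        lo, hi = compute_safe_region(position)
    return contacts


def toy_safe_region(position: float) -> Interval:
    # Hypothetical server: safe regions are consecutive 10-unit stretches.
    start = (position // 10) * 10
    return (start, start + 10)


trajectory = [float(x) for x in range(0, 35, 5)]   # positions 0, 5, ..., 30
with_safe_exits = count_server_contacts(trajectory, toy_safe_region)
naive = len(trajectory)                            # reporting at every movement
```

In this toy trajectory the query contacts the server three times with safe exit points, compared with seven times when it reports every position.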
Next, we present our method to compute the safe exit points for a query object. A safe exit point represents a point in the segment where a safe region and a nonsafe region meet. We compute the safe exit point using the divide-and-conquer technique. Before presenting the detailed methodology, we define the terminology used in this section.
Definition 1 (safe region). A safe region is a portion of a road segment that guarantees that, as long as the query point lies in it, the top-k results of the query remain valid.
Definition 2 (answer objects $D^+$). A data object d is called an answer object of query q if $\psi(d) > \psi(d_a)$, where $d_a$ represents any other data object in the directed road network. Similarly, we can generalize this definition for TkSK: a data object d is called an answer object of query q if $\psi(d) > \psi(d_{k+1})$, where $d_{k+1}$ represents the $(k+1)$th data object in the directed road network. In other words, all answer objects are top-k results of query q.
Definition 3 (nonanswer objects $D^-$). A data object d is called a nonanswer object of query q if $\psi(d) < \psi(d_a)$, where $d_a$ represents any other data object in the directed road network. Similarly, we can generalize this definition for TkSK: a data object d is called a nonanswer object of query q if $\psi(d) < \psi(d_k)$, where $d_k$ represents the kth data object in the directed road network. Since all answer objects are top-k results of query q, none of the nonanswer objects are in the top-k results of query q.
Definition 4 (lowest answer object $D_l^+$). An answer object $d_l^+ \in D^+$ is called the lowest answer object at a point $p \in G$ if $\psi(d_l^+)_p = \min(\psi(d_1^+)_p, \psi(d_2^+)_p, \ldots, \psi(d_{|D^+|}^+)_p)$, where $\psi(d_l^+)_p$ represents the score of the lowest answer object at point p. In other words, $\psi(d_l^+)_p < \psi(d_a^+)_p$ at point p, where $d_a^+$ is any other answer object in the $D^+$ set.
Definition 5 (highest nonanswer object $D_h^-$). A nonanswer object $d_h^- \in D^-$ is called the highest nonanswer object at a point $p \in G$ if $\psi(d_h^-)_p = \max(\psi(d_1^-)_p, \psi(d_2^-)_p, \ldots, \psi(d_{|D^-|}^-)_p)$, where $\psi(d_h^-)_p$ represents the score of the highest nonanswer object at point p. In other words, $\psi(d_h^-)_p > \psi(d_a^-)_p$ at point p, where $d_a^-$ is any other nonanswer object in the $D^-$ set.
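Definitions 2 to 5 can be made concrete with a short sketch that, for the scores observed at a single point p, identifies the lowest answer object and the highest nonanswer object. The function name and the example scores are hypothetical; the scoring function itself is taken as given.

```python
from typing import Dict, Tuple


def split_boundary_objects(scores: Dict[str, float], k: int) -> Tuple[str, str]:
    """Return (lowest answer object, highest nonanswer object) at one point p.

    `scores` maps object ids to their score at p; higher scores rank first.
    Assumes more than k objects and no score ties at the top-k boundary.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    answers, non_answers = ranked[:k], ranked[k:]
    lowest_answer = min(answers, key=scores.get)
    highest_non_answer = max(non_answers, key=scores.get)
    return lowest_answer, highest_non_answer


# Hypothetical scores at some point p for a top-2 query.
psi_at_p = {"d1": 0.9, "d2": 0.7, "d3": 0.4, "d4": 0.1}
d_low, d_high = split_boundary_objects(psi_at_p, k=2)
```

With these scores, d2 is the lowest answer object and d3 is the highest nonanswer object, so the result set changes first when d3 overtakes d2.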
As discussed earlier, the main challenge in the continuous processing of moving TkSK queries is to maintain the validity of the result set, because the movement of query objects can invalidate it. To monitor the validity of the result set, we propose a safe-region-based approach.
5.1. Computation of Safe Exit Points. In this section, we present our technique to compute the safe exit points. The main goal is to find a point in the road network where the query result set will change. The result set changes when the score of the highest nonanswer object $D_h^-$ surpasses the score of the lowest answer object $D_l^+$. Generally, the textual relevance score does not change. Therefore, the score of a data object changes only because of its spatial relevance score, which can change only through the movement of the query object. The computation of the safe exit point is based on two key observations.
Observation 1. If $D_{n_\beta}^+ = D_{p_a}^+$, there is no safe exit point in the segment.
Explanation. $D_{p_a}^+$ represents the set of answer objects at anchor point $p_a$, whereas $D_{n_\beta}^+$ represents the set of answer objects at boundary node $n_\beta$. As discussed earlier, a safe exit point is the particular point where the query results change. If the query results at the starting node of a segment (edge) are the same as those at its ending node, there is no point in the segment where the query result changes. Hence, we do not search for a safe exit point in that segment.
Observation 2. If $D_{p_a}^+ \neq D_{n_\beta}^+$, there is a safe exit point in the segment.
Explanation. In contrast to Observation 1, if the query results are different at the starting and ending points, then there exists a point where the query results change. Hence, there is a safe exit point in the segment.
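The two observations amount to a simple pruning test on each segment, sketched below. The function name is hypothetical, and the answer sets are assumed to be given as collections of object identifiers.

```python
from typing import Iterable


def has_safe_exit_point(answers_at_anchor: Iterable[str],
                        answers_at_boundary: Iterable[str]) -> bool:
    """A segment needs to be searched only if the answer set differs at its two ends."""
    return set(answers_at_anchor) != set(answers_at_boundary)


same = has_safe_exit_point(["d1", "d2"], ["d2", "d1"])   # Observation 1: skip the segment
diff = has_safe_exit_point(["d1", "d2"], ["d1", "d3"])   # Observation 2: search the segment
```

Comparing sets rather than ranked lists reflects that the observations concern membership in the answer set, not the order of the answer objects.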
To find the safe region we observe the following cases
Case 1 (when $\alpha = 1$ and the textual relevance of the highest nonanswer object and the lowest answer object is the same). In this case, the textual and spatial relevance have the same importance (i.e., $\alpha = 1$). In addition, the top-k result depends only on the spatial relevance because the textual relevance of both objects is the same. The data object that is closer to the query point q becomes the answer object. For an undirected edge, the safe exit point $p_{se}$ is the center point between the lowest answer object and the highest nonanswer object, i.e., $\max(dist(p_{se}, d^+_1), dist(p_{se}, d^+_2), \ldots, dist(p_{se}, d^+_{|d^+|})) = \min(dist(p_{se}, d^-_1), dist(p_{se}, d^-_2), \ldots, dist(p_{se}, d^-_{|d^-|}))$. However, in the case of a directed edge, where $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$, the safe exit point is either $d^+_l$ or $p_a$. If $d^+_l \in (p_a, n_\beta)$, then the safe exit point is $d^+_l$; otherwise, the safe exit point is $p_a$.
Case 2 (when $\alpha = 1$ and the textual relevance of the highest nonanswer object and the lowest answer object is different). In this case, the top-k result depends on all functions, that is, $\alpha$, the spatial relevance, and the textual relevance. Clearly, for undirected edges, the midpoint between the lowest answer object and the highest nonanswer object does not provide a valid safe exit point. Therefore, we introduce a divide-and-conquer technique, which keeps dividing the search space until we reach the point where the score of the nonanswer object becomes greater than that of the answer object. Typically, the safe exit point should be closer to the data object whose score is lower. Based on this observation, we first compute the midpoint in a similar fashion to Case 1 and then continue dividing the search space until we find the point. For directed edges, the safe exit point can be computed in a similar fashion to Case 1.
Case 2 also applies to other cases in which the safe exit point is not the midpoint between the lowest answer object and the highest nonanswer object. In these cases, the safe exit point depends on two or more functions; therefore, it can be computed using the aforementioned divide-and-conquer technique. The following are the scenarios in which the safe exit point can be computed using Case 2:
(a) When $\alpha \neq 1$ and the textual relevance of the nearest nonanswer object and the farthest answer object is different
(b) When $\alpha \neq 1$ and the textual relevance of the nearest nonanswer object and the farthest answer object is the same
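The divide-and-conquer search of Case 2 can be sketched as a bisection on the offset along an undirected segment. This is a minimal illustration rather than the authors' implementation: the function name `bisect_safe_exit`, the tolerance `tol`, and the example relevance values and distances are assumptions, and the sketch presumes that the answer object wins at one end, the nonanswer object wins at the other end, and the score order flips exactly once in between. The scores use the form $\psi(d) = \mu / (1 + \alpha \cdot \lambda)$ given later in the paper.

```python
from typing import Callable


def bisect_safe_exit(score_answer: Callable[[float], float],
                     score_nonanswer: Callable[[float], float],
                     lo: float, hi: float, tol: float = 1e-6) -> float:
    """Return the offset in [lo, hi] where the nonanswer score overtakes the answer score.

    Assumption (not from the paper): the answer object wins at `lo`, the
    nonanswer object wins at `hi`, and the order flips exactly once.
    """
    if score_nonanswer(lo) > score_answer(lo) or score_nonanswer(hi) <= score_answer(hi):
        raise ValueError("expected the answer to win at lo and the nonanswer to win at hi")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if score_nonanswer(mid) > score_answer(mid):
            hi = mid  # the crossover lies at or before mid
        else:
            lo = mid  # the answer object still wins at mid
    return hi


# Hypothetical segment of length 4 with alpha = 0.5:
# answer object: textual relevance 0.9, network distance 2 + x
# nonanswer object: textual relevance 0.6, network distance 4 - x
ALPHA = 0.5


def answer_score(x: float) -> float:
    return 0.9 / (1.0 + ALPHA * (2.0 + x))


def nonanswer_score(x: float) -> float:
    return 0.6 / (1.0 + ALPHA * (4.0 - x))


safe_exit_offset = bisect_safe_exit(answer_score, nonanswer_score, 0.0, 4.0)
```

In this hypothetical setting the two objects are equidistant at offset 1, but the scores cross at approximately offset 2, which is nearer to the object with the lower textual relevance. This matches the observation that the safe exit point should be closer to the data object whose score is lower.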
Case 3 (when $\alpha = 0$). This means that the spatial relevance has no effect on the score of data objects. Hence, no monitoring is required for this scenario.
Algorithm 3 retrieves the safe exit points using the observations discussed earlier. The core function in this algorithm is ComputeSafeExit($p_a$, $n_\beta$), which finds the safe exit point in the segment between $p_a$ and $n_\beta$ and is detailed in Algorithm 4. First, Algorithm 4 determines $d^+_l$ and $d^-_h$ at each point $p \in [p_a, n_\beta]$. Recall that $d^+_l$ is the lowest answer object for p, whereas $d^-_h$ is the highest nonanswer object for p. Algorithm 4 then computes the safe exit point based on the cases discussed earlier. There are two further scenarios for Cases 1 and 2. For Case 1, if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$, then the safe exit point is the midpoint between $d^+_l$ and $d^-_h$. If $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$, then the edge is directed, and therefore the safe exit point is either $p_a$ or $d^+_l$. If $d^+_l$ lies on the edge $[p_a, n_\beta]$, then $d^+_l$ is the safe exit point. Otherwise, $p_a$ is the safe exit point.
Similarly, for Case 2, if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$, then the safe exit point is computed by dividing the search space in half until we find the closest point such that $\psi(d^-_h) > \psi(d^+_l)$. If $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$, the safe exit point is computed in the same way as in Case 1.
5.2. Computation of Safe Exit Points for Example. Consider the same example in Figure 1, where the query point q issues a top-1 keyword query with $q.t$ = "Italian restaurant". For this example, let us consider $\alpha = 1$. The monitoring algorithm starts exploring from the active edge containing the query object q. Therefore, $\overrightarrow{(q, n_3)}$ is explored first. As shown in Table 3, for $\overrightarrow{(q, n_3)}$, $D^+_q = d_3$ and $D^+_{n_3} = d_3$. According to Observation 1, no safe exit point exists in this segment. Therefore, the edges adjacent to $n_3$ are explored, and $n_3$ becomes the new $p_a$. The edge $(n_3, n_4)$ is explored next. Similarly, the answer object at $n_3$ and $n_4$ is the same: $D^+_{n_3} = D^+_{n_4} = d_3$. Therefore, a safe exit point does not exist in $(n_3, n_4)$. The edge $(n_3, n_7)$ is explored next. As shown in Table 3, $D^+_{n_3} = d_3$ and $D^+_{n_7} = d_6$. By Observation 2, there is a safe exit point in $(n_3, n_7)$. As shown in Figure 1, $d_3.t = d_6.t$ = "Italian Restaurant" and $dist(n_3, n_7) = dist(n_7, n_3)$.
(1) Input: same as Algorithm 1
(2) Output: $P_{SE}$, a set of safe exit points
(3) $P_{SE} \leftarrow \emptyset$ /* set of safe exit points */
(4) $D^+_{p_a} \leftarrow$ EvaluateSnapshotQuery($p_a$, $(p_a, n_\beta)$)
(5) /* results calculated using Algorithm 1 */
(6) $D^+_{n_\beta} \leftarrow$ EvaluateSnapshotQuery($n_\beta$, $(p_a, n_\beta)$)
(7) /* results calculated using Algorithm 1 */
(8) if $D^+_{p_a} = D^+_{n_\beta}$ then
(9)     no safe exit point /* refer to Observation 1 */
(10) end
(11) if $D^+_{p_a} \neq D^+_{n_\beta}$ then
(12)     $P_{SE} \leftarrow P_{SE} \cup$ ComputeSafeExit($p_a$, $n_\beta$) /* a safe exit point exists; refer to Observation 2 */
(13) end
(14) return $P_{SE}$
Algorithm 3: COSK monitoring algorithm.
(1) Input: same as Algorithm 1
(2) Output: $p_{se}$, the safe exit point in $(p_a, n_\beta)$
(3) $D^+_l \leftarrow \langle p, D^+_l \rangle$ | for each point $p \in [p_a, n_\beta]$, $d^+_l$ such that $\psi(d^+_l)_p = \min(\psi(d^+_1)_p, \psi(d^+_2)_p, \ldots, \psi(d^+_{|d^+|})_p)$
(4) $D^-_h \leftarrow \langle p, D^-_h \rangle$ | for each point $p \in [p_a, n_\beta]$, $d^-_h$ such that $\psi(d^-_h)_p = \max(\psi(d^-_1)_p, \psi(d^-_2)_p, \ldots, \psi(d^-_{|d^-|})_p)$
(5) if Case 1 then
(6)     if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$ then
(7)         $p_{se} \leftarrow$ the point $se$ such that $\max(dist(se, d^+_1), dist(se, d^+_2), \ldots, dist(se, d^+_{|d^+|})) = \min(dist(se, d^-_1), dist(se, d^-_2), \ldots, dist(se, d^-_{|d^-|}))$
(8)     end
(9)     if $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$ then
(10)         $p_{se} = p_a$ or $p_{se} = d^+_l$, where $d^+_l \in (p_a, n_\beta)$
(11)     end
(12) end
(13) if Case 2 then
(14)     if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$ then
(15)         $p_{se} \leftarrow$ closest point to $p_a$ such that $\psi(d^-_h) > \psi(d^+_l)$
(16)     end
(17)     if $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$ then
(18)         same as Line (10)
(19)     end
(20) end
(21) return $p_{se}$
Algorithm 4: ComputeSafeExit($p_a$, $n_\beta$).
Therefore, according to Case 1, the safe exit point $p_{se1}$ is the midpoint between $d_3$ and $d_6$. That is, $dist(p_{se1}, d_3) = dist(p_{se1}, d_6)$, where $dist(p_{se1}, d_3) = x + 3$ and $dist(p_{se1}, d_6) = -x + 5$ for $0 < x < 3$. Consequently, $x = 1$, which means that the distance from $n_3$ to $p_{se1}$ is 1.
Next, we determine a safe exit point in $(n_3, n_5)$. As shown in Table 3, the answer object at $n_5$ is the same as that at $n_3$. Hence, no safe exit point exists in this edge. Next, $\overleftarrow{(n_6, n_5)}$ is explored with $p_a = n_5$. According to Table 3, $D^+_{n_6} = d_4$ and $D^+_{n_5} = d_3$. Therefore, a safe exit point exists in this edge. This edge is directed, and for each point $p \in \overleftarrow{(n_6, n_5)}$ the shortest path from p to $d_3$ is $p \rightarrow n_6 \rightarrow n_2 \rightarrow n_3 \rightarrow d_3$. Therefore, $n_5$ is the safe exit point.
The bold lines in Figure 5 indicate the safe region of q. The top-1 result remains $d_3$ as long as the query q lies in the safe region.
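The per-segment check of Algorithm 3, combined with the Case 1 midpoint rule for an undirected segment, can be sketched as follows for a top-1 query. This is an illustrative sketch only: the function names and the parameterization (the distance from $p_a$ to the answer object, the distance from $n_\beta$ to the nonanswer object, and the segment length) are assumptions introduced here, and the sketch does not cover directed edges or Case 2.

```python
from typing import Optional, Sequence


def equidistant_offset(dist_answer_at_pa: float,
                       dist_nonanswer_at_nb: float,
                       seg_len: float) -> float:
    """Offset x from p_a at which both objects are equally far away.

    Solves dist_answer_at_pa + x == dist_nonanswer_at_nb + (seg_len - x).
    """
    return (dist_nonanswer_at_nb + seg_len - dist_answer_at_pa) / 2.0


def safe_exit_offset(result_at_pa: Sequence[str],
                     result_at_nb: Sequence[str],
                     dist_answer_at_pa: float,
                     dist_nonanswer_at_nb: float,
                     seg_len: float) -> Optional[float]:
    """Return the safe exit offset on an undirected segment, or None.

    Same results at both endpoints: no safe exit point (Observation 1).
    Different results: a safe exit point exists (Observation 2).
    """
    if set(result_at_pa) == set(result_at_nb):
        return None
    return equidistant_offset(dist_answer_at_pa, dist_nonanswer_at_nb, seg_len)


# Segment (n3, n4) of the running example: both endpoints return d3.
no_exit = safe_exit_offset(["d3"], ["d3"], 3.0, 0.0, 1.0)

# Segment (n3, n7): dist(n3, d3) = 3, dist(n7, d6) = 2, segment length 3.
pse1_offset = safe_exit_offset(["d3"], ["d6"], 3.0, 2.0, 3.0)
```

For the segment $(n_3, n_7)$, the sketch solves $x + 3 = -x + 5$ and yields an offset of 1 from $n_3$, which agrees with the worked example. The distance and length arguments passed for $(n_3, n_4)$ are placeholders, because they are not used when the endpoint results coincide.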
Next, we analyze the time complexity of determining a set of safe exit points using a set of qualifying objects $d \in D^+_{p_a} \cup D^+_{n_\beta} \cup D(p_a, n_\beta)$. Note that $D^+_{p_a}$ ($D^+_{n_\beta}$) indicates
Table 3: Computation of safe exit points for the example scenario.
Edge/Segment | $p_a$ | $D^+_{p_a}$ | $D^+_{n_\beta}$ | $p_{se}$
$\overrightarrow{(q, n_3)}$ | $q$ | $D^+_q = d_3$ | $D^+_{n_3} = d_3$ | none
$(n_3, n_4)$ | $n_3$ | $D^+_{n_3} = d_3$ | $D^+_{n_4} = d_3$ | none
$(n_3, n_7)$ | $n_3$ | $D^+_{n_3} = d_3$ | $D^+_{n_7} = d_6$ | $p_{se1}$
$\overrightarrow{(n_3, n_5)}$ | $n_3$ | $D^+_{n_3} = d_3$ | $D^+_{n_5} = d_3$ | none
$\overleftarrow{(n_6, n_5)}$ | $n_5$ | $D^+_{n_5} = d_3$ | $D^+_{n_6} = d_4$ | $p_{se2}$
Figure 5: Illustration of safe region of q.
the set of k data objects that satisfies the query condition at $p_a$ ($n_\beta$). According to Dijkstra's algorithm [26], the time complexity $O(D^+_q)$ for computing a set of answer objects at a query point q is $O(D^+_q) = O(|E| + |N| \log |N|)$. This means that $O(D^+_{p_a}) = O(D^+_{n_\beta}) = O(|E| + |N| \log |N|)$ holds for endpoints $p_a$ and $n_\beta$. Thus, the time complexity $O(\Omega_{kth})$ of determining the skyline $\Omega_{kth}$ with the k-th highest score is $O(\Omega_{kth}) = C_{kth} \cdot O(|D^+_{p_a} \cup D^+_{n_\beta} \cup D(p_a, n_\beta)|)$, where $C_{kth}$ is the number of qualifying objects that participate in the constitution of the skyline with the k-th highest score. Therefore, the time complexity of determining a safe exit point coincides with that of determining the two skylines, i.e., the skyline $D^+_l$ with the k-th highest (or lowest) score for answer objects and the skyline $D^-_h$ with the highest score for nonanswer objects. This is because the safe exit point is found at the cross point between these skylines.
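The distance computation that underlies this bound can be illustrated with a short Dijkstra sketch on a directed graph. It is not the paper's Java implementation: the adjacency-list representation, the function name `shortest_distances`, and the toy graph are assumptions. Note also that this version uses a binary heap, which gives $O((|E| + |N|) \log |N|)$; the $O(|E| + |N| \log |N|)$ bound quoted above corresponds to a Fibonacci-heap priority queue.

```python
import heapq
from typing import Dict, Hashable, List, Tuple

# Adjacency list of a directed graph: node -> [(neighbor, edge weight), ...]
Graph = Dict[Hashable, List[Tuple[Hashable, float]]]


def shortest_distances(graph: Graph, source: Hashable) -> Dict[Hashable, float]:
    """Dijkstra's algorithm with a binary heap on a directed, nonnegatively weighted graph."""
    dist: Dict[Hashable, float] = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry; a shorter path to u was already settled
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Hypothetical network: a -> b is one-way, b <-> c is two-way, c -> a is one-way.
toy_graph: Graph = {
    "a": [("b", 1.0)],
    "b": [("c", 2.0)],
    "c": [("b", 2.0), ("a", 4.0)],
}

dist_a_to_b = shortest_distances(toy_graph, "a")["b"]
dist_b_to_a = shortest_distances(toy_graph, "b")["a"]
```

In the toy graph the distance from a to b is 1, whereas the distance from b to a is 6 because the return trip must go through c. This asymmetry is what distinguishes directed road networks from undirected ones in the safe exit computation.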
Figure 6 represents the skyline graph for $k = 1$ on the edge $(n_7, n_3)$. Let us draw the score functions for $d_3$ and $d_6$ for the road segment $(n_7, n_3)$, where a safe exit point exists. This is because $D^+_{n_3} = d_3$ and $D^+_{n_7} = d_6$ for $k = 1$. For each point $p \in (n_7, n_3)$, the distance between $d_3$ and point p can be represented as $dist(d_3, p) = dist(d_3, n_3) + len(n_3, p) = 6 - len(n_7, p)$. Similarly, for each point $p \in (n_7, n_3)$, the distance between $d_6$ and point p can be represented as $dist(d_6, p) = dist(d_6, n_7) + len(n_7, p) = 2 + len(n_7, p)$. Let $len(n_7, p)$ be
Figure 6: Skyline graph for $k = 1$ on the road segment $(n_7, n_3)$. Horizontal axis: distance; vertical axis: score; plotted curves: $\psi(d_6) = 1/(x + 3)$ and $\psi(d_3) = 1/(-x + 7)$.
a variable x ($0 \le x \le 3$). We can write $\lambda(d_3, p) = dist(d_3, p) = 6 - x$ and $\lambda(d_6, p) = dist(d_6, p) = 2 + x$. Then, we can represent the score functions $\psi(d_3)$ and $\psi(d_6)$ as follows:
$\psi(d_3) = \mu(d_3.t, q.t) / (1 + \alpha \cdot \lambda(d_3, p)) = 1 / (7 - x)$ for $0 \le x \le 3$,
$\psi(d_6) = \mu(d_6.t, q.t) / (1 + \alpha \cdot \lambda(d_6, p)) = 1 / (3 + x)$ for $0 \le x \le 3$.
Finally, we present the lemma to prove that the safe exit points computed by COSK are correct.
Lemma 8. The COSK algorithm correctly computes a set of safe exit points.
Proof. We prove the correctness of the COSK algorithm by contradiction. Assume that $D^+_{p_a} \neq D^+_{n_\beta}$ but there is no safe exit point in the road segment $(p_a, n_\beta)$. This means that, for each point p in the road segment $(p_a, n_\beta)$, the query result at p equals $D^+_{p_a}$, i.e., $D^+_p = D^+_{p_a}$ $\forall p \in (p_a, n_\beta)$. However, this leads to the contradiction that $D^+_{n_\beta} = D^+_{p_a}$ when $p = n_\beta$. Therefore, if $D^+_{p_a} \neq D^+_{n_\beta}$, a safe exit point exists in $(p_a, n_\beta)$. In addition, when $D^+_{p_a} \neq D^+_{n_\beta}$, a safe exit point is determined using the skyline $D^+_l$ for answer objects and the skyline $D^-_h$ with the highest score for nonanswer objects. The first skyline is a composite polyline drawn from the answer objects in $D^+_{p_a}$. The second skyline is a composite polyline drawn from the nonanswer objects in $D^+_{n_\beta} \cup D(p_a, n_\beta) - D^+_{p_a}$.
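The score functions plotted in Figure 6 can be evaluated directly to see where the top-1 object changes along $(n_7, n_3)$. The following sketch is only an illustration of those two formulas with $\alpha = 1$ and textual relevance 1; the helper names `score`, `psi_d3`, `psi_d6`, and `top1` are assumptions introduced here, and ties are resolved in favor of $d_3$ by choice.

```python
def score(mu: float, alpha: float, dist: float) -> float:
    """Ranking score psi = mu / (1 + alpha * lambda), as used in the Figure 6 example."""
    return mu / (1.0 + alpha * dist)


def psi_d3(x: float, alpha: float = 1.0, mu: float = 1.0) -> float:
    """Score of d3 at offset x from n7, where dist(d3, p) = 6 - x."""
    return score(mu, alpha, 6.0 - x)


def psi_d6(x: float, alpha: float = 1.0, mu: float = 1.0) -> float:
    """Score of d6 at offset x from n7, where dist(d6, p) = 2 + x."""
    return score(mu, alpha, 2.0 + x)


def top1(x: float) -> str:
    """Top-1 object at offset x from n7 (ties go to d3 in this sketch)."""
    return "d6" if psi_d6(x) > psi_d3(x) else "d3"


# Sample the segment from n7 (x = 0) to n3 (x = 3).
samples = [(x, top1(x)) for x in (0.0, 1.0, 1.9, 2.1, 3.0)]
```

Setting $1/(7 - x) = 1/(3 + x)$ gives $x = 2$, so the two curves cross 2 units from $n_7$, i.e., 1 unit from $n_3$. This is consistent with the position of $p_{se1}$ derived in the worked example.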
6. Monitoring Query Results and Safe Regions in Dynamic Directed Road Networks
In this section, we discuss the monitoring of spatial keyword queries in dynamic road networks, where the network distance changes depending on the traffic conditions. Updates to the weights of some edges may invalidate the query results or the safe region of q even though the query object q remains within its safe region. Figure 7 illustrates an example of changing the weights of edges $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$. For convenience, we consider $\alpha = 1$ and $q.t$ = "Italian restaurant". In Figure 7(a), the top-1 result is $d_1$, and the bold lines show the safe region of query q. Now consider that, at time $t_j$, the weights of the two edges $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$ change owing to heavy traffic, as shown in Figure 7(b). The update in the weights of these edges may invalidate the query result or the safe region of q. Therefore, it is necessary to monitor the validity of the results and the safe region when such changes occur.
Next, we introduce a monitoring region to monitor the validity of the safe region effectively when the weight of an edge changes. The monitoring region MR contains all the points between the query point q and the lowest answer object and between q and the highest nonanswer object. Formally, it is defined as $MR = dist(q, D^+_l) \cup dist(q, D^-_h)$, where $dist(q, D^+_l)$ denotes the shortest path between q and the lowest answer object and $dist(q, D^-_h)$ denotes the shortest path between q and the highest nonanswer object. In the given example, $D^+_l = d_1$ and $D^-_h = \{d_2, d_3\}$. Therefore, the dotted lines in Figure 8(a) show the monitoring region of query object q.
Now, at time $t_j$, the updates to edges $\overleftarrow{(n_1, n_6)}$ and $\overleftarrow{(n_1, d_1)}$, which are not part of the monitoring region, can safely be ignored. However, the update on segment $\overrightarrow{(n_2, d_1)}$, which is associated with the monitoring region, may invalidate the results. As shown in Figure 8(b), after the update the top-1 result becomes $d_2$, and the bold lines represent the new safe region of q.
Algorithm 5 monitors the validity of the result set and the safe region of query object q when the weight of any edge changes. Let us consider that the weight of edge $(n_i, n_j)$ changes at time $t_j$. First, the algorithm checks whether edge $(n_i, n_j)$ is associated with the monitoring region. If it is not part of the monitoring region, then the algorithm simply ignores the update to edge $(n_i, n_j)$, and the query results and safe region remain valid. In contrast, if the edge is associated with the monitoring region (i.e., $MR \cap (n_i, n_j) \neq \emptyset$), then the algorithm re-evaluates the query. Consequently, the top-k results and the safe region of query q are updated. Finally, the algorithm updates the monitoring region of q.
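The control flow described for Algorithm 5 can be sketched as a membership test on the monitoring region followed by a re-evaluation. The sketch below is illustrative only: `QueryState`, `handle_weight_update`, and the `reevaluate` callback are hypothetical names, the monitoring region is simplified to a set of directed edges, and the state values (including the safe exit point labels) are placeholders loosely modeled on the Figure 8 example.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Tuple

Edge = Tuple[str, str]  # directed edge or segment (from, to)


@dataclass
class QueryState:
    top_k: List[str]                    # current answer objects
    safe_exits: List[str]               # labels of the current safe exit points
    monitoring_region: FrozenSet[Edge]  # edges covered by the monitoring region


def handle_weight_update(state: QueryState,
                         updated_edge: Edge,
                         reevaluate: Callable[[], QueryState]) -> QueryState:
    """Ignore updates outside the monitoring region; otherwise recompute everything."""
    if updated_edge not in state.monitoring_region:
        return state  # results and safe region remain valid
    # Re-run the snapshot query, the safe exit computation, and the
    # monitoring region computation (all delegated to the callback here).
    return reevaluate()


# Placeholder state: only the segment (n2, d1) is listed in the monitoring region.
before = QueryState(["d1"], ["pse1", "pse2", "pse3"], frozenset({("n2", "d1")}))


def recompute() -> QueryState:
    # Hypothetical outcome of re-evaluation after the weight change.
    return QueryState(["d2"], ["pse1", "pse2", "pse3"], frozenset({("n2", "d1")}))


after_ignored = handle_weight_update(before, ("n1", "n6"), recompute)
after_relevant = handle_weight_update(before, ("n2", "d1"), recompute)
```

The update to $(n_1, n_6)$ leaves the state untouched, whereas the update to $(n_2, d_1)$ triggers the callback and changes the top-1 result to $d_2$, mirroring the example in the text.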
7. Performance Evaluation
In this section, we evaluate the performance of COSK through simulation experiments. We describe our experimental settings in Section 7.1, and we present our experimental results for static and dynamic road networks in Sections 7.2 and 7.3, respectively.
7.1. Experimental Settings. All of our experiments were performed using real road networks, namely, Oldenburg, San Francisco, and San Joaquin. All three road networks were obtained from [27]. The original road network of San Francisco had 21047 nodes and 21692 edges. We reformatted the network, pruned approximately 30% of the nodes, and adjusted the edges and their weights accordingly. This resulted in a network with 14732 nodes and 14316 edges. Both the directions of the edges and the data objects on the edges were generated randomly. The description of each data object was extracted from Twitter messages [28], and we assigned one tweet per data object. Table 4 presents the characteristics of the datasets used in the experimental evaluation. We simulated moving query objects by using a spatiotemporal data generator [29]. The input to the generator was the road network of the dataset used, and the output was the set of query objects moving on the road network. Each experiment had 100 moving queries, which were continuously monitored for 100 timestamps (1 timestamp = 1 second), and the average result was reported in the experiments.
As a benchmark for COSK in static road networks, we implemented the CMTkSK+ algorithm [22], which also continuously monitors moving top-k spatial keyword queries in road networks. However, this algorithm was originally designed for undirected road networks. To make a fair comparison, we modified it to process top-k spatial keyword queries in directed road networks and refer to the modified version as CMTkSK+. Specifically, we modified the distance computation method between two points such that, in directed road networks, $dist(p_1, p_2) \neq dist(p_2, p_1)$. Since CMTkSK+ does not handle top-k spatial keyword queries in dynamic road networks, we compared the performance of COSK with a basic algorithm, which recomputes the results whenever the query object changes its location. All algorithms were implemented in Java and were executed on a desktop PC with a 2.80-GHz Intel Core i5 with
Figure 7: Updating the weight of edges in a dynamic road network, where $t_i < t_j$. (a) Safe region at time $t_i$. (b) Updating the weights of $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$ at time $t_j$.
Figure 8: Monitoring region and updated safe region at time $t_j$. (a) Monitoring region at time $t_i$. (b) New safe region at time $t_j$.
(1) Input: monitoring region MR, updated edge $(n_i, n_j)$
(2) Output: none
(3) if $MR \cap (n_i, n_j) = \emptyset$ then
(4)     /* edge $(n_i, n_j)$ is not part of the monitoring region */
(5)     ignore the change in the weight of edge $(n_i, n_j)$
(6) else
(7)     $P_{SE} \leftarrow \emptyset$ /* set of safe exit points */
(8)     $D^k_{upd} \leftarrow$ EvaluateSnapshotQuery($n_i$, $e_i$) /* update the set of top-k results */
(9)     $P_{SE_{upd}} \leftarrow$ ComputeSafeExit($p_a$, $n_\beta$) /* update the safe exit points */
(10)     $MR_{upd} \leftarrow$ ComputeMonitoringRegion($D^+_l$, $D^-_h$) /* update the monitoring region */
(11) end
Algorithm 5: MonitoringSafeRegion(MR, $(n_i, n_j)$).
Table 4: Summary of datasets.
Attribute | Oldenburg | San Francisco | San Joaquin
Total no. of nodes | 6104 | 14732 | 18262
Total no. of edges | 7034 | 14316 | 23876
Percentage of directed edges | 30 | 30 | 30
Total no. of objects | 5627 | 11453 | 19098
Average no. of objects per edge | 0.8 | 0.8 | 0.8
Total no. of words | 49517 | 103649 | 166153
Table 5: Experimental parameter settings.
Parameter | Range
Number of results (k) | 5, 10, 15, 20, 25
Number of keywords (n) | 1, 2, 3, 4, 5
Query parameter ($\alpha$) | 0.01, 0.1, 1, 10, 100
Dataset | Oldenburg, San Francisco, San Joaquin
Number of data objects ($N_D$) | 10, 20, 30, 40, 50 (x1000)
Speed of query objects ($V_{qry}$) | 25, 50, 75, 100, 125 (km/h)
Mobility ($M_{qry}$) | 20, 40, 60, 80, 100 (%)
Ratio of directed edges ($E_{dir}$) | 10, 20, 30, 40, 50 (%)
Ratio of updated edges ($E_{upd}$) | 15, 30, 60, 80, 100 (%)
8 GB of memory. In the experiments, we compared (1) query processing times, (2) edges processed, i.e., the number of edges processed for retrieving query results, and (3) index sizes. Table 5 summarizes the parameters used in the experiments. In each experiment, we varied a single parameter within the range shown in Table 5 while maintaining the other parameters at the bolded default values.
We evaluated the performance of the algorithms by using the following measures: (1) the total amount of server CPU time, which indicates the query processing time, and (2) the total communication cost, measured as the total number of points (i.e., the location updates sent by query objects and the query results and safe exit points returned by the server) transferred between the clients and the server. The battery power and wireless bandwidth consumption typically increase with the amount of data transferred between objects (clients) and servers. Thus, we used the amount of transferred data as a metric to evaluate the communication cost.
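To make the communication metric more tangible, the following toy sketch contrasts a client that contacts the server at every timestamp with one that contacts it only after leaving its current safe region. It is a deliberately simplified model and not the experimental setup of the paper: positions are one-dimensional offsets, a safe region is a closed interval returned by a hypothetical `safe_interval` callback, and the sketch counts client-server round trips rather than individual transferred points.

```python
from typing import Callable, Sequence, Tuple


def round_trips_periodic(positions: Sequence[float]) -> int:
    """One round trip per timestamp: the client reports every position."""
    return len(positions)


def round_trips_safe_region(positions: Sequence[float],
                            safe_interval: Callable[[float], Tuple[float, float]]) -> int:
    """One round trip at the start, then one each time the client exits its safe interval."""
    if not positions:
        return 0
    lo, hi = safe_interval(positions[0])
    trips = 1
    for pos in positions[1:]:
        if pos < lo or pos > hi:
            lo, hi = safe_interval(pos)  # server returns a new safe region
            trips += 1
    return trips


def blocks_of_five(pos: float) -> Tuple[float, float]:
    """Hypothetical server reply: the safe region is the block of 5 units containing pos."""
    start = 5 * (pos // 5)
    return (start, start + 4)


trajectory = list(range(10))  # the query object moves one unit per timestamp
periodic_trips = round_trips_periodic(trajectory)
safe_region_trips = round_trips_safe_region(trajectory, blocks_of_five)
```

With this trajectory the periodic client makes 10 round trips, whereas the safe-region client makes 2. Larger safe regions or slower movement widen this gap, while smaller safe regions narrow it.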
7.2. Experimental Results of Top-k Spatial Keyword Queries in Static Road Networks
7.2.1. Effect of k. Figure 9 shows the effect of the number of results on the query processing time and communication cost of both algorithms. Figure 9(a) indicates that the query processing time increases for both algorithms as the value of k increases. This is expected because, with an increase in k, more data objects must be explored and verified. Nevertheless, COSK significantly outperforms CMTkSK+ for two main reasons. First, the search for relevant objects is very efficient when the highest significance factor is used. Second, COSK does not need to verify the set of answer objects as long as the query object lies in a safe region. In contrast, the query processing time of CMTkSK+ increases significantly because it has to monitor and verify the set of candidate objects periodically. In Figure 9(b), the communication costs of both algorithms increase as the number of requested objects k increases. However, the proposed algorithm demonstrates superior performance compared to CMTkSK+ because client-server communication is not required while the query object lies within the safe region, whereas in CMTkSK+ the query object is required to report its location to the server whenever it moves.
7.2.2. Effect of $N_D$. This experiment was conducted on the San Joaquin dataset. This dataset included 19098 data objects; therefore, we randomly generated approximately 30000 additional data objects on different edges. In Figure 10, we evaluate the performance of COSK and CMTkSK+ by varying the cardinality of the data objects. Note that $N_D = 10K$ corresponds to a low density of data points, while $N_D = 50K$ corresponds to a high density. In Figure 10(a), it is interesting to notice that the query processing times of both algorithms decrease as the cardinality of the data objects increases. For CMTkSK+, this is because the monitoring range of a query decreases with high density. For COSK, it is mainly because fewer edges need to be expanded when the data density is high, which decreases the query processing time. In Figure 10(b), we study the influence of the cardinality of the data objects on the communication costs. The experimental results indicate that CMTkSK+ incurs almost constant communication costs regardless of the data object cardinality. However, the communication costs of COSK increase in proportion to the $N_D$ value. This is expected because the safe region becomes smaller as the density of the data objects increases, which increases the communication costs.
7.2.3. Effect of Query Keywords (n). Figure 11 shows the query processing time and communication cost of COSK and CMTkSK+ as a function of the number of query keywords. Figures 11(a) and 11(b) show that the performance of both algorithms degrades when the number of keywords increases. This is mainly because increasing the number of query keywords may also increase the number of relevant objects, resulting in a higher query processing time and communication cost. However, the safe-region-based algorithm COSK scales better than CMTkSK+ because of its less expensive monitoring technique.
7.2.4. Effect of $\alpha$. Figure 12 demonstrates the impact of the query parameter $\alpha$ on the query processing time and on the communication cost. A small value of $\alpha$ indicates a greater importance of textual relevance, whereas a high value of $\alpha$ gives more preference to the spatial relevance. It is interesting to note that the query processing time is lower for higher
Figure 9: Effect of k on query processing time and communication cost for COSK and CMTkSK+. (a) Query processing time (s). (b) Communication cost (number of messages transferred).
Figure 10: Effect of $N_D$ on query processing time and communication cost for COSK and CMTkSK+. (a) Query processing time (s). (b) Communication cost (number of messages transferred).
values of $\alpha$, which give more importance to the spatial relevance. This is mainly because, when the spatial relevance is weighted more heavily, fewer edges and objects need to be explored and processed to determine the top-k data objects. Observe that, in Figure 12(b), the number of messages sent by COSK decreases sharply with an increase in $\alpha$.
7.2.5. Effect of Speed. Figure 13(a) demonstrates the influence of the speed of the query objects on the query processing time of the COSK and CMTkSK+ algorithms. The experimental results indicate that the performance of CMTkSK+ is not significantly influenced by the speed of the query objects because the candidate objects must be monitored at regular time intervals regardless of the speed. On the other hand, the performance of COSK gradually degrades as the speed of the query objects increases because the objects leave their respective safe regions more frequently. Figure 13(b) shows the communication costs of COSK and CMTkSK+ with respect to the speed of the query objects. CMTkSK+ incurs almost constant communication costs because a server-initiated request to verify the candidate objects does not depend on the speed. For COSK, the query objects cross safe regions more frequently when the speed is high, which increases the communication costs.
Figure 11: Effect of number of keywords on query processing time and communication cost for COSK and CMTkSK+. (a) Query processing time (s). (b) Communication cost (number of messages transferred).
Figure 12: Effect of $\alpha$ on query processing time and communication cost for COSK and CMTkSK+. (a) Query processing time (s). (b) Communication cost (number of messages transferred).
7.2.6. Effect of Mobility. Figure 14 shows the effect of mobility $M_{qry}$ (mobility refers to the percentage of query objects that are moving at any timestamp) on the performance of the COSK and CMTkSK+ algorithms. As expected, the query processing time and communication costs of both algorithms increase with $M_{qry}$. Nevertheless, COSK performs better than CMTkSK+ in terms of query processing time and communication costs.
7.2.7. Effect of Directed Edges. Figure 15 shows the impact of the percentage of directed edges $E_{dir}$ on the performance of the COSK and CMTkSK+ algorithms. The query processing time increases with $E_{dir}$ because the algorithms need to explore more edges to answer the top-k spatial keyword queries. However, the communication cost is not significantly affected by the value of $E_{dir}$ for either algorithm.
7.2.8. Effect of Datasets. Figure 16 demonstrates the index sizes of the COSK and CMTkSK+ approaches for different datasets. As shown in Figure 16, both algorithms have similar index sizes. However, COSK has a minor space overhead because it stores additional information on the highest significance factor $\theta_t$ of edges. More importantly, this space overhead is minimal compared to the gain achieved by COSK in query processing time and communication costs.
Figure 13: Effect of speed on query processing time and communication cost for COSK and CMTkSK+. (a) Query processing time (s). (b) Communication cost (number of messages transferred).
Figure 14: Effect of mobility on query processing time and communication cost for COSK and CMTkSK+. (a) Query processing time (s). (b) Communication cost (number of messages transferred).
7.3. Experimental Results of Top-k Spatial Keyword Queries in Dynamic Road Networks. In this section, we evaluate the performance of COSK and the basic algorithm for dynamic road networks. $E_{upd}$ indicates the percentage of all edges that change their weight at each timestamp. The length of an updated edge is randomly selected between 0.1 and 10 times the original length. Figure 17(a) depicts the query processing time of COSK and the basic algorithm. It is evident from the figure that the query processing time of the basic algorithm is not significantly affected by $E_{upd}$. This is mainly because the query objects issue top-k spatial keyword queries at each timestamp. However, the query processing time of COSK increases with the value of $E_{upd}$ because the probability that an updated edge is associated with the monitoring region of query q increases with $E_{upd}$. Therefore, when $E_{upd}$ becomes large, the results need to be updated frequently, which increases the query processing time. Figure 17(b) shows the communication costs of COSK and the basic algorithm with respect to $E_{upd}$. The basic algorithm incurs almost constant communication costs regardless of the value of $E_{upd}$. In contrast, the communication cost of COSK increases with $E_{upd}$ because the query results and safe regions need to be updated frequently.
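The weight-update workload described above can be sketched as follows. This is an illustration under stated assumptions rather than the generator used in the experiments: the function name `update_edge_weights`, the dictionary representation of edge weights, the uniform distribution of the scaling factor, and the seeded `random.Random` instance are all choices made here. Only the range of 0.1 to 10 times the original length and the meaning of $E_{upd}$ come from the text.

```python
import random
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]


def update_edge_weights(original: Dict[Edge, float],
                        update_ratio: float,
                        rng: random.Random) -> Tuple[Dict[Edge, float], Set[Edge]]:
    """Rescale a random fraction of edges to between 0.1 and 10 times their original weight.

    `update_ratio` plays the role of E_upd expressed as a fraction (0.15 for 15%).
    Returns the new weight map and the set of edges that were changed.
    """
    edges = sorted(original)  # fixed order so that a seeded rng is reproducible
    n_upd = round(len(edges) * update_ratio)
    chosen = rng.sample(edges, n_upd)
    updated = dict(original)
    for edge in chosen:
        # Assumption: the scaling factor is drawn uniformly from [0.1, 10].
        updated[edge] = original[edge] * rng.uniform(0.1, 10.0)
    return updated, set(chosen)


# Hypothetical network with 20 edges of weight 2.0 and E_upd = 15%.
base_weights = {(f"n{i}", f"n{i + 1}"): 2.0 for i in range(20)}
new_weights, changed_edges = update_edge_weights(base_weights, 0.15, random.Random(7))
```

Each timestamp of a simulation would call this function on the original weights, and only the edges in `changed_edges` would then need to be checked against the monitoring region of each query.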
Figure 15: Effect of $E_{dir}$ on query processing time and communication cost. (a) Query processing time (s) versus $E_{dir}$ (legend: COSK, CMTkSK+). (b) Communication cost (number of messages transferred, logarithmic scale) versus $E_{dir}$ (legend as printed: eSPAK, CMTkSK+).
Figure 16: Effect of dataset on index size. Index size (MB) of COSK and CMTkSK+ for the Oldenburg, San Francisco, and San Joaquin datasets.
8. Conclusion
In this paper, we investigated moving top-k spatial keyword queries in directed and dynamic road networks. We presented an efficient indexing framework using inverted files that indexes the data objects on edges, allowing for the effective searching of data objects relevant to queries in terms of both textual and spatial relevance. We also presented a safe-exit-based algorithm called COSK to monitor moving top-k spatial keyword queries. We demonstrated that the query results remain valid as long as the query object resides within a safe region. Furthermore, COSK can effectively monitor the validity of query results and safe regions in dynamic road networks. Finally, an experimental evaluation conducted on real road networks demonstrated that COSK significantly reduced the query processing time and communication costs compared to the CMTkSK+ algorithm.
Data Availability
The real road network data used in this study have also been used in many previous studies. The road network data are cited in the manuscript and are available at https://www.cs.utah.edu/~lifeifei/SpatialDataset.htm. To simulate the moving queries, the authors used a spatiotemporal data generator that has also been used in previous studies. The research article describing the generator is cited in the manuscript. The documentation and source files of the generator are available at https://iapg.jade-hs.de/personen/brinkhoff/generator. The authors used Twitter tweets to generate the descriptions of data objects and the query keywords. The tweets used are accessible at http://followthehashtag.com/datasets/free-twitter-dataset-usa-200000-free-usa-tweets.

Figure 17: Effect of $E_{upd}$ on query processing time and communication cost, comparing COSK and the basic algorithm. (a) Query processing time (s) versus $E_{upd}$. (b) Communication cost (number of messages transferred, logarithmic scale) versus $E_{upd}$.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
Hyung-Ju Cho was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean Government (MSIP) (NRF-2016R1A2B4009793), and this research was partially supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2016R1D1A1B03934129).
References
[1] D. Papadias, N. Mamoulis, J. Zhang, and Y. Tao, "Query processing in spatial network databases," in Proceedings of the 29th International Conference on Very Large Data Bases (VLDB '03), pp. 802–813, September 2003.
[2] H.-J. Cho, K. Ryu, and T.-S. Chung, "An efficient algorithm for computing safe exit points of moving range queries in directed road networks," Information Systems, vol. 41, pp. 1–19, 2014.
[3] G. Tsatsanifos and A. Vlachou, "On processing Top-k spatio-textual preference queries," in Proceedings of the 18th International Conference on Extending Database Technology (EDBT '15), pp. 433–444, March 2015.
[4] R. Li, A. X. Liu, A. L. Wang, and B. Bruhadeshwar, "Fast range query processing with strong privacy protection for cloud computing," Proceedings of the VLDB Endowment, vol. 7, no. 14, pp. 1953–1964, 2014.
[5] G. Cong, C. S. Jensen, and D. Wu, "Efficient retrieval of the Top-k most relevant spatial web objects," Proceedings of the VLDB Endowment, vol. 2, no. 1, pp. 337–348, 2009.
[6] Z. Li, K. C. K. Lee, B. Zheng, W.-C. Lee, D. Lee, and X. Wang, "IR-tree: An efficient index for geographic document search," IEEE Transactions on Knowledge and Data Engineering, vol. 23, no. 4, pp. 585–599, 2011.
[7] Y. Zhou, X. Xie, C. Wang, Y. Gong, and W. Ma, "Hybrid index structures for location-based web search," in Proceedings of the 14th ACM International Conference on Information and Knowledge Management, pp. 155–162, Bremen, Germany, October 2005.
[8] J. Zobel and A. Moffat, "Inverted files for text search engines," ACM Computing Surveys, vol. 38, no. 2, 2006.
[9] N. Beckmann, H. Kriegel, R. Schneider, and B. Seeger, "R*-tree: an efficient and robust access method for points and rectangles," in Proceedings of the ACM SIGMOD International Conference on Management of Data, vol. 19, pp. 322–331, May 1990.
[10] R. Hariharan, B. Hore, C. Li, and S. Mehrotra, "Processing spatial-keyword (sk) queries in geographic information retrieval (gir) systems," in Proceedings of the 19th International Conference on Scientific and Statistical Database Management (SSDBM '07), July 2007.
[11] I. De Felipe, V. Hristidis, and N. Rishe, "Keyword search on spatial databases," in Proceedings of the 24th International Conference on Data Engineering (ICDE '08), pp. 656–665, April 2008.
[12] J. B. Rocha-Junior, O. Gkorgkas, S. Jonassen, and K. Nørvåg, "Efficient processing of top-k spatial keyword queries," in Proceedings of the International Symposium on Spatial and Temporal Databases, pp. 205–222, Springer, 2011.
[13] D. Zhang, K.-L. Tan, and A. K. Tung, "Scalable top-k spatial keyword search," in Proceedings of the 16th International Conference on Extending Database Technology, pp. 359–370, 2013.
[14] J. B. Rocha-Junior and K. Nørvåg, "Top-k spatial keyword queries on road networks," in Proceedings of the 15th International Conference on Extending Database Technology, pp. 168–179, Berlin, Germany, March 2012.
[15] H.-J. Cho, S. J. Kwon, and T.-S. Chung, "A safe exit algorithm for continuous nearest neighbor monitoring in road networks," Mobile Information Systems, vol. 9, no. 1, pp. 37–53, 2013.
[16] D. Yung, M. L. Yiu, and E. Lo, "A safe-exit approach for efficient network-based moving range queries," Data & Knowledge Engineering, vol. 72, pp. 126–147, 2012.
[17] M. Attique, H. Cho, R. Jin, and T. Chung, "Efficient Processing of Continuous Reverse k Nearest Neighbor on Moving Objects in Road Networks," ISPRS International Journal of Geo-Information, vol. 5, no. 12, p. 247, 2016.
[18] H. G. Elmongui, M. F. Mokbel, and W. G. Aref, "Continuous aggregate nearest neighbor queries," GeoInformatica, vol. 17, no. 1, pp. 63–95, 2013.
[19] D. Wu, M. L. Yiu, C. S. Jensen, and G. Cong, "Efficient continuously moving top-k spatial keyword query processing," in Proceedings of the IEEE International Conference on Data Engineering (ICDE '11), pp. 541–552, Hannover, Germany, April 2011.
[20] W. Huang, G. Li, K.-L. Tan, and J. Feng, "Efficient safe-region construction for moving top-k spatial keyword queries," in Proceedings of the 21st ACM International Conference on Information and Knowledge Management, pp. 932–941, 2012.
[21] L. Guo, J. Shao, H. H. Aung, and K.-L. Tan, "Efficient continuous top-k spatial keyword queries on road networks," GeoInformatica, vol. 19, no. 1, pp. 29–60, 2014.
[22] Y. Li, G. Li, L. Shu, Q. Huang, and H. Jiang, "Continuous monitoring of top-k spatial keyword queries in road networks," Journal of Information Science and Engineering, vol. 31, no. 6, pp. 1831–1848, 2015.
[23] M. Attique, A. Khan, and T.-S. Chung, "ESPAK: Top-k spatial keyword query processing in directed road networks," in Proceedings of the Workshops of the International Conference on Extending Database Technology and the International Conference on Database Theory (EDBT/ICDT '17), March 2017.
[24] G. Salton and C. Buckley, "Term-weighting approaches in automatic text retrieval," Information Processing & Management, vol. 24, no. 5, pp. 513–523, 1988.
[25] V. N. Anh, O. de Kretser, and A. Moffat, "Vector-space ranking with effective early termination," in Proceedings of the 24th Annual International ACM SIGIR Conference, pp. 35–42, New Orleans, LA, USA, 2001.
[26] E. W. Dijkstra, "A note on two problems in connexion with graphs," Numerische Mathematik, vol. 1, pp. 269–271, 1959.
[27] "Real datasets for spatial databases," https://www.cs.utah.edu/~lifeifei/SpatialDataset.htm.
[28] "Twitter," https://twitter.com.
[29] T. Brinkhoff, "A framework for generating network-based moving objects," GeoInformatica, vol. 6, no. 2, pp. 153–180, 2002.
query result set will change. The result set changes when the score of the highest nonanswer object $D^-_h$ surpasses the score of the lowest answer object $D^+_l$. Generally, the textual relevance score does not change. Therefore, the score of a data object changes only through its spatial relevance score, which in turn changes only with the movement of the query object. The computation of the safe exit point is based on two key observations.
Observation 1. If $D^+_{n_\beta} = D^+_{p_a}$, there is no safe exit point in the segment.
Explanation. $D^+_{p_a}$ represents the set of answer objects at anchor point $p_a$, whereas $D^+_{n_\beta}$ represents the set of answer objects at boundary node $n_\beta$. As discussed earlier, the safe exit point is the point at which the query result changes. If the query result at the starting node of a segment/edge is the same as that at its ending node, no point exists at which the query result changes. Hence, we do not search for a safe exit point in that segment.
Observation 2. If $D^+_{p_a} \neq D^+_{n_\beta}$, there is a safe exit point in the segment.
Explanation. In contrast to Observation 1, if the query results differ at the starting and ending points, then there exists a point at which the query result changes. Hence, there is a safe exit point in the segment.
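The two observations amount to a simple endpoint test, which can be sketched as follows. This is a minimal illustration rather than the paper's implementation: the function name is hypothetical, answer sets are compared as unordered sets (an assumption), and the sample data are the top-1 answers listed in Table 3 for the running example.

```python
def segments_with_safe_exit(segments):
    """Return the names of segments whose endpoint answer sets differ.

    segments: iterable of (name, answers at p_a, answers at n_beta).
    By Observation 1, equal answer sets mean no safe exit point in the
    segment; by Observation 2, different answer sets mean one exists.
    """
    return [name for name, at_pa, at_nb in segments if set(at_pa) != set(at_nb)]


# Top-1 answers at the two endpoints of each explored segment (Table 3).
table3 = [
    ("(q, n3)", {"d3"}, {"d3"}),
    ("(n3, n4)", {"d3"}, {"d3"}),
    ("(n3, n7)", {"d3"}, {"d6"}),
    ("(n3, n5)", {"d3"}, {"d3"}),
    ("(n6, n5)", {"d3"}, {"d4"}),
]
print(segments_with_safe_exit(table3))  # ['(n3, n7)', '(n6, n5)']
```

Only the segments that pass this test need the more expensive safe exit computation.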
To find the safe region, we observe the following cases.

Case 1 (when $\alpha = 1$ and the textual relevance of the highest nonanswer object and the lowest answer object is the same). In this case, both the textual and spatial relevance have the same importance (i.e., $\alpha = 1$). In addition, the top-k result depends only on the spatial relevance because the textual relevance of both objects is the same. The data object that is closer to query point q becomes the answer object. For an undirected edge, the safe exit point $p_{se}$ is the center point between the lowest answer object and the highest nonanswer object, i.e., $\max(dist(p_{se}, d^+_1), dist(p_{se}, d^+_2), \ldots, dist(p_{se}, d^+_{|d^+|})) = \min(dist(p_{se}, d^-_1), dist(p_{se}, d^-_2), \ldots, dist(p_{se}, d^-_{|d^-|}))$. However, in the case of a directed edge, where $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$, the safe exit point is either $d^+_l$ or $p_a$. If $d^+_l \in (p_a, n_\beta)$, then the safe exit point is $d^+_l$; otherwise, the safe exit point is $p_a$.

Case 2 (when $\alpha = 1$ and the textual relevance of the highest nonanswer object and the lowest answer object is different). In this case, the top-k result depends on all components of the score, namely, $\alpha$, the spatial relevance, and the textual relevance. Clearly, for undirected edges, the midpoint between the lowest answer object and the highest nonanswer object does not provide a valid safe exit point. Therefore, we introduce a divide-and-conquer technique, which keeps dividing the search space until we reach the point where the score of the nonanswer object becomes greater than that of the answer object. Typically, the safe exit point should be closer to the data object whose score is lower. Based on this observation, we first compute the midpoint in a similar fashion to Case 1, and then we continue dividing the search space until we find the point. For directed edges, the safe exit point can be computed in a similar fashion to Case 1.
The technique of Case 2 also applies to other cases in which the safe exit point is not the midpoint between the lowest answer object and the highest nonanswer object. In these cases, the safe exit point depends on two or more functions and can therefore be computed using the aforementioned divide-and-conquer technique. The following are the scenarios in which the safe exit point can be computed as in Case 2:
(a) When $\alpha \neq 1$ and the textual relevance of the nearest nonanswer object and the farthest answer object is different
(b) When $\alpha \neq 1$ and the textual relevance of the nearest nonanswer object and the farthest answer object is the same
Case 3 (when $\alpha = 0$). This means that the spatial relevance has no effect on the score of data objects. Hence, no monitoring is required in this scenario.
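The divide-and-conquer search of Case 2 can be illustrated with a small bisection routine. This is a hedged sketch, not the authors' ComputeSafeExit: it assumes the score form $\psi(d) = \mu / (1 + \alpha \cdot \lambda)$ that the paper uses in its Figure 6 discussion, an undirected segment parameterized by the offset x from $p_a$, and exactly one sign change of the score difference on the segment. The function names, the relevance value 0.5, and the distance functions are hypothetical.

```python
def score(text_rel, dist, alpha):
    """psi(d) = mu / (1 + alpha * lambda), the score form assumed here."""
    return text_rel / (1.0 + alpha * dist)


def safe_exit_offset(mu_ans, dist_ans, mu_non, dist_non, length,
                     alpha=1.0, tol=1e-9):
    """Bisect [0, length] for the offset where the nonanswer overtakes the answer.

    dist_ans, dist_non: callables mapping an offset x from p_a to the network
    distance of the lowest answer object and the highest nonanswer object.
    Assumes the answer object wins at x = 0, the nonanswer object wins at
    x = length, and the score difference changes sign exactly once.
    """
    def gap(x):
        return score(mu_non, dist_non(x), alpha) - score(mu_ans, dist_ans(x), alpha)

    lo, hi = 0.0, length
    if gap(lo) > 0 or gap(hi) < 0:
        raise ValueError("no sign change on this segment")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if gap(mid) > 0:
            hi = mid  # nonanswer already ahead: the crossing lies to the left
        else:
            lo = mid  # answer still ahead (or tied): the crossing lies to the right
    return (lo + hi) / 2.0


# Equal textual relevance: bisection recovers the midpoint of Case 1.
print(round(safe_exit_offset(1.0, lambda x: 3 + x, 1.0, lambda x: 5 - x, 3.0), 6))  # 1.0
# Lower relevance for the nonanswer object pushes the safe exit point toward it.
print(round(safe_exit_offset(1.0, lambda x: 3 + x, 0.5, lambda x: 5 - x, 3.0), 6))  # 2.666667
```

The second call shows the behaviour described in Case 2: when the textual relevance differs, the crossing moves away from the midpoint and toward the object with the lower score.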
Algorithm 3 retrieves the safe exit points using the observations discussed earlier. The core function in this algorithm is ComputeSafeExit($p_a$, $n_\beta$), which finds the safe exit point in the segment between $p_a$ and $n_\beta$ and is detailed in Algorithm 4. First, Algorithm 4 determines $d^+_l$ and $d^-_h$ at each point $p \in [p_a, n_\beta]$. Recall that $d^+_l$ is the lowest answer object to p, whereas $d^-_h$ is the highest nonanswer object to p. Algorithm 4 then computes the safe exit point based on the cases discussed earlier. There are two further scenarios for each of Cases 1 and 2. For Case 1, if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$, then the safe exit point is the midpoint between $d^+_l$ and $d^-_h$. If $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$, then the edge is directed, and therefore the safe exit point is either $p_a$ or $d^+_l$. If $d^+_l$ lies on the edge $[p_a, n_\beta]$, then $d^+_l$ is the safe exit point. Otherwise, $p_a$ is the safe exit point.
Similarly, for Case 2, if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$, then the safe exit point is computed by repeatedly halving the search space until we find the closest point such that $\psi(d^-_h) > \psi(d^+_l)$. If $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$, the safe exit point is computed in the same way as in the directed scenario of Case 1.

5.2. Computation of Safe Exit Points for Example. Consider the same example in Figure 1, where the query point q issues a top-1 keyword query with q.t = "Italian restaurant". For this example, let us consider $\alpha = 1$. The monitoring algorithm starts exploring from the active edge containing the query object q. Therefore, $\overrightarrow{(q, n_3)}$ is explored first. As shown in Table 3, for $\overrightarrow{(q, n_3)}$, $D^+_q = d_3$ and $D^+_{n_3} = d_3$. According to Observation 1, no safe exit point exists in this segment. Therefore, the edges adjacent to $n_3$ are explored, and $n_3$ becomes the new $p_a$. The edge $(n_3, n_4)$ is explored next. Similarly, the answer object at $n_3$ and $n_4$ is the same: $D^+_{n_3} = D^+_{n_4} = d_3$. Therefore, a safe exit point does not exist in $(n_3, n_4)$. The edge $(n_3, n_7)$ is explored next. As shown in Table 3, $D^+_{n_3} = d_3$ and $D^+_{n_7} = d_6$. By Observation 2, there is a safe exit point in $(n_3, n_7)$. As shown in Figure 1, $d_3.t = d_6.t$ = "Italian Restaurant" and $dist(n_3, n_7) = dist(n_7, n_3)$.
Input: same as Algorithm 1
Output: $P_{SE}$, a set of safe exit points
$P_{SE} \leftarrow \emptyset$ /* set of safe exit points */
$D^+_{p_a} \leftarrow$ EvaluateSnapshotQuery($p_a$, ($p_a$, $n_\beta$)) /* results calculated using Algorithm 1 */
$D^+_{n_\beta} \leftarrow$ EvaluateSnapshotQuery($n_\beta$, ($p_a$, $n_\beta$)) /* results calculated using Algorithm 1 */
if $D^+_{p_a} = D^+_{n_\beta}$ then
    no safe exit point /* refer to Observation 1 */
end
if $D^+_{p_a} \neq D^+_{n_\beta}$ then
    $P_{SE} \leftarrow P_{SE} \cup$ ComputeSafeExit($p_a$, $n_\beta$) /* safe exit point exists; refer to Observation 2 */
end
return $P_{SE}$
Algorithm 3: COSK monitoring algorithm.
(1) Input: same as Algorithm 1
(2) Output: $p_{se}$, the safe exit point in $(p_a, n_\beta)$
(3) $D^+_l \leftarrow \langle p, d^+_l \rangle$ for each point $p \in [p_a, n_\beta]$, with $d^+_l$ such that $\psi(d^+_l)_p = \min(\psi(d^+_1)_p, \psi(d^+_2)_p, \ldots, \psi(d^+_{|d^+|})_p)$
(4) $D^-_h \leftarrow \langle p, d^-_h \rangle$ for each point $p \in [p_a, n_\beta]$, with $d^-_h$ such that $\psi(d^-_h)_p = \max(\psi(d^-_1)_p, \psi(d^-_2)_p, \ldots, \psi(d^-_{|d^-|})_p)$
(5) if Case 1 then
(6)     if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$ then
(7)         $p_{se}$ = the point $se$ satisfying $\max(dist(se, d^+_1), dist(se, d^+_2), \ldots, dist(se, d^+_{|d^+|})) = \min(dist(se, d^-_1), dist(se, d^-_2), \ldots, dist(se, d^-_{|d^-|}))$
(8)     end
(9)     if $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$ then
(10)        $p_{se} = p_a$ or $p_{se} = d^+_l$, where $d^+_l \in (p_a, n_\beta)$
(11)    end
(12) end
(13) if Case 2 then
(14)    if $dist(p_a, n_\beta) = dist(n_\beta, p_a)$ then
(15)        $p_{se}$ = closest point to $p_a$ such that $\psi(d^-_h) > \psi(d^+_l)$
(16)    end
(17)    if $dist(p_a, n_\beta) \neq dist(n_\beta, p_a)$ then
(18)        same as Line (10)
(19)    end
(20) end
(21) return $p_{se}$
Algorithm 4: ComputeSafeExit($p_a$, $n_\beta$).
Therefore, according to Case 1, the safe exit point $p_{se1}$ is the midpoint between $d_3$ and $d_6$. That is, $dist(p_{se1}, d_3) = dist(p_{se1}, d_6)$, where $dist(p_{se1}, d_3) = x + 3$ and $dist(p_{se1}, d_6) = -x + 5$ for $0 < x < 3$, with x denoting the distance from $n_3$. Consequently, $x = 1$, which means that the distance from $n_3$ to $p_{se1}$ is 1.
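As a quick check of this arithmetic, and assuming the score form $\psi(d) = \mu / (1 + \alpha \cdot \lambda)$ with $\mu = 1$ for both objects and $\alpha = 1$ (the values used in the paper's Figure 6 discussion), the midpoint condition and the common score at $p_{se1}$ can be written out as follows:

```latex
\begin{align*}
dist(p_{se1}, d_3) &= dist(p_{se1}, d_6) \\
x + 3 &= -x + 5 \\
x &= 1, \\
\psi(d_3) = \psi(d_6) &= \frac{1}{1 + 1 \cdot 4} = \frac{1}{5}.
\end{align*}
```

Both objects are at network distance 4 from $p_{se1}$, so their scores tie there, which is what marks the point as a safe exit.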
Next, we determine a safe exit point in $(n_3, n_5)$. As shown in Table 3, the answer object at $n_5$ is the same as that at $n_3$. Hence, no safe exit point exists in this edge. Next, $\overleftarrow{(n_6, n_5)}$ is explored with $p_a = n_5$. According to Table 3, $D^+_{n_6} = d_4$ and $D^+_{n_5} = d_3$. Therefore, a safe exit point exists in this edge. This edge is directed, and for each point $p \in \overleftarrow{(n_6, n_5)}$, the shortest path from p to $d_3$ is $p \rightarrow n_6 \rightarrow n_2 \rightarrow n_3 \rightarrow d_3$. Therefore, $n_5$ is the safe exit point (denoted $p_{se2}$ in Table 3).
The bold lines in Figure 5 indicate the safe region of q. The top-1 result remains $d_3$ as long as the query q lies in the safe region.
Table 3: Computation of safe exit points for example scenario.

| Edge/Segment | $p_a$ | $D^+_{p_a}$ | $D^+_{n_\beta}$ | $p_{se}$ |
|---|---|---|---|---|
| $\overrightarrow{(q, n_3)}$ | q | $D^+_q = d_3$ | $D^+_{n_3} = d_3$ | none |
| $(n_3, n_4)$ | $n_3$ | $D^+_{n_3} = d_3$ | $D^+_{n_4} = d_3$ | none |
| $(n_3, n_7)$ | $n_3$ | $D^+_{n_3} = d_3$ | $D^+_{n_7} = d_6$ | $p_{se1}$ |
| $\overrightarrow{(n_3, n_5)}$ | $n_3$ | $D^+_{n_3} = d_3$ | $D^+_{n_5} = d_3$ | none |
| $\overleftarrow{(n_6, n_5)}$ | $n_5$ | $D^+_{n_5} = d_3$ | $D^+_{n_6} = d_4$ | $p_{se2}$ |

Figure 5: Illustration of safe region of q. The example network contains nodes $n_1$ to $n_7$, query point q, safe exit points $p_{se1}$ and $p_{se2}$, and data objects $d_1$ (Grand Hotel), $d_2$ (Cafe), $d_3$ (Italian Restaurant), $d_4$ (Chinese Restaurant), $d_5$ (Pub and Bar), $d_6$ (Italian Restaurant), and $d_7$ (Cafe and Bakery).

Next, we analyze the time complexity of determining a set of safe exit points using a set of qualifying objects $d \in D^+_{p_a} \cup D^+_{n_\beta} \cup D(p_a, n_\beta)$. Note that $D^+_{p_a}$ ($D^+_{n_\beta}$) indicates the set of k data objects that satisfy the query condition at $p_a$ ($n_\beta$). According to Dijkstra's algorithm [26], the time complexity $O(D^+_q)$ of computing a set of answer objects at a query point q is $O(D^+_q) = O(|E| + |N| \log |N|)$. This means that $O(D^+_{p_a}) = O(D^+_{n_\beta}) = O(|E| + |N| \log |N|)$ holds for the endpoints $p_a$ and $n_\beta$. Thus, the time complexity $O(\Omega_{kth})$ of determining the skyline $\Omega_{kth}$ with the k-th highest score is $O(\Omega_{kth}) = C_{kth} \cdot O(|D^+_{p_a} \cup D^+_{n_\beta} \cup D(p_a, n_\beta)|)$, where $C_{kth}$ is the number of qualifying objects that participate in the constitution of the skyline with the k-th highest score. Therefore, the time complexity of determining a safe exit point coincides with the time complexity of determining the two skylines, i.e., the skyline $D^+_l$ with the k-th highest (or lowest) score for answer objects and the skyline $D^-_h$ with the highest score for nonanswer objects. This is because the safe exit point is found at the cross point between these skylines.
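For readers who want to see the shortest-path step concretely, the following is a minimal sketch of Dijkstra's algorithm on a directed, weighted graph. It is not the paper's implementation: it uses Python's binary heap, which gives $O((|E| + |N|) \log |N|)$ rather than the $O(|E| + |N| \log |N|)$ bound quoted above (that bound corresponds to a Fibonacci-heap priority queue), and the three-node graph is a hypothetical example chosen to show that distances in a directed network are not symmetric.

```python
import heapq


def dijkstra(adj, source):
    """Shortest network distances from source in a directed, weighted graph.

    adj: dict mapping node -> list of (neighbor, weight) for OUTGOING edges only.
    Returns a dict of distances for every node reachable from source.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry; a shorter path was already found
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Hypothetical one-way cycle a -> b -> c -> a.
adj = {"a": [("b", 1.0)], "b": [("c", 1.0)], "c": [("a", 5.0)]}
print(dijkstra(adj, "a")["b"])  # 1.0
print(dijkstra(adj, "b")["a"])  # 6.0
```

The two printed values differ because the only way back from b to a follows the cycle, which is the asymmetry that makes safe exit computation on directed edges differ from the undirected case.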
Figure 6 represents the skyline graph for $k = 1$ on the edge $(n_7, n_3)$. Let us draw the score functions of $d_3$ and $d_6$ for the road segment $(n_7, n_3)$, where a safe exit point exists because $D^+_{n_3} = d_3$ and $D^+_{n_7} = d_6$ for $k = 1$. For each point $p \in (n_7, n_3)$, the distance between $d_3$ and point p can be represented as $dist(d_3, p) = dist(d_3, n_3) + len(n_3, p) = 6 - len(n_7, p)$. Similarly, for each point $p \in (n_7, n_3)$, the distance between $d_6$ and point p can be represented as $dist(d_6, p) = dist(d_6, n_7) + len(n_7, p) = 2 + len(n_7, p)$. Let $len(n_7, p)$ be a variable x ($0 \le x \le 3$). We can write $\lambda(d_3, p) = dist(d_3, p) = 6 - x$ and $\lambda(d_6, p) = dist(d_6, p) = 2 + x$. Then, we can represent the score functions $\psi(d_3)$ and $\psi(d_6)$ as follows:

$\psi(d_3) = \frac{\mu(d_3.t, q.t)}{1 + \alpha \cdot \lambda(d_3, p)} = \frac{1}{7 - x}$ for $0 \le x \le 3$,

$\psi(d_6) = \frac{\mu(d_6.t, q.t)}{1 + \alpha \cdot \lambda(d_6, p)} = \frac{1}{3 + x}$ for $0 \le x \le 3$.

Figure 6: Skyline graph for $k = 1$ on the road segment $(n_7, n_3)$. The horizontal axis is the distance x along the segment from $n_7$ to $n_3$, with $p_{se1}$ marked on it, and the vertical axis is the score; the plotted curves are $\psi(d_6) = 1/(x + 3)$ and $\psi(d_3) = 1/(-x + 7)$.

Finally, we present a lemma to prove that the safe exit points computed by COSK are correct.
Lemma 8. The COSK algorithm correctly computes a set of safe exit points.
Proof. We prove the correctness of the COSK algorithm by contradiction. Assume that $D^+_{p_a} \neq D^+_{n_\beta}$ and yet there is no safe exit point in the road segment $(p_a, n_\beta)$. This means that, for each point p in the road segment $(p_a, n_\beta)$, the query result at p equals $D^+_{p_a}$, i.e., $D^+_p = D^+_{p_a}$ $\forall p \in (p_a, n_\beta)$. However, this leads to the contradiction $D^+_{n_\beta} = D^+_{p_a}$ when $p = n_\beta$. Therefore, if $D^+_{p_a} \neq D^+_{n_\beta}$, a safe exit point exists in $(p_a, n_\beta)$. In addition, when $D^+_{p_a} \neq D^+_{n_\beta}$, a safe exit point is determined using the skyline $D^+_l$ for answer objects and the skyline $D^-_h$ with the highest score for nonanswer objects. The first skyline is a composite polyline drawn from the answer objects in $D^+_{p_a}$. The second skyline is a composite polyline drawn from the nonanswer objects in $D^+_{n_\beta} \cup D(p_a, n_\beta) - D^+_{p_a}$.
6. Monitoring Query Results and Safe Regions in Dynamic Directed Road Networks
In this section, we discuss the monitoring of spatial keyword queries in dynamic road networks, where the network distance changes depending on the traffic conditions. Updates to the weights of some edges may invalidate the query results or the safe region of q even though the query object q remains within its safe region. Figure 7 illustrates an example of changing the weights of edges $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$. For convenience, we consider $\alpha = 1$ and q.t = "Italian restaurant". In Figure 7(a), the top-1 result is $d_1$, and the bold lines show the safe region of query q. Now consider that, at time $t_j$, the weights of the two edges $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$ change due to heavy traffic, as shown in Figure 7(b). The update to the edge weights may invalidate the query result or the safe region of q. Therefore, it is necessary to monitor the validity of the results and the safe region when such changes occur.
Next, we introduce a monitoring region to monitor the validity of the safe region effectively when the weight of an edge changes. The monitoring region MR contains all the points between the query point q and the lowest answer object and between q and the highest nonanswer object. Formally, it is defined as $MR = dist(q, D^+_l) \cup dist(q, D^-_h)$, where $dist(q, D^+_l)$ is the distance between q and the lowest answer object and $dist(q, D^-_h)$ is the distance between q and the highest nonanswer object. In the given example, $D^+_l = d_1$ and $D^-_h = d_2, d_3$. Therefore, the dotted lines in Figure 8(a) show the monitoring region of query object q.
Now, at time $t_j$, the updates to edges $\overleftarrow{(n_1, n_6)}$ and $\overleftarrow{(n_1, d_1)}$, which are not part of the monitoring region, can safely be ignored. However, the update on segment $\overrightarrow{(n_2, d_1)}$, which is associated with the monitoring region, may nullify the results. As shown in Figure 8(b), after the update, the top-1 result becomes $d_2$, and the bold lines represent the new safe region of q.
Algorithm 5 monitors the validity of the result set and safe region of query object q when the weight of any edge changes. Let us consider that the weight of edge $(n_i, n_j)$ changes at time $t_j$. First, the algorithm checks whether edge $(n_i, n_j)$ is associated with the monitoring region. If it is not part of the monitoring region, the algorithm simply ignores the update to edge $(n_i, n_j)$, and the query results and safe region remain valid. In contrast, if the edge is associated with the monitoring region (i.e., $MR \cap (n_i, n_j) \neq \emptyset$), the algorithm re-evaluates the query results. Consequently, the top-k results and the safe region of query q are updated. Finally, the algorithm updates the monitoring region of q.
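The control flow described for Algorithm 5 can be sketched as below. This is an illustration under simplifying assumptions, not the authors' code: the monitoring region is approximated as a set of whole edges (the paper defines it over points and partial segments), and `recompute` is a hypothetical callback standing in for the re-evaluation of the top-k result, the safe exit points, and the monitoring region.

```python
def on_edge_update(edge, monitoring_region, recompute):
    """Handle a weight change on `edge` for one monitored query.

    monitoring_region: set of edges currently associated with the query's MR
        (assumption: MR is tracked at edge granularity).
    recompute: zero-argument callable that re-evaluates the query and returns
        the new set of monitoring-region edges (hypothetical helper).
    Returns (monitoring region after the update, whether recomputation ran).
    """
    if edge not in monitoring_region:
        # The update cannot affect this query: results and safe region stay valid.
        return monitoring_region, False
    # The edge touches the monitoring region: refresh results, safe exits, and MR.
    return recompute(), True


# Usage with a stub that records how often recomputation is triggered.
calls = []

def fake_recompute():
    calls.append(1)
    return {("n2", "n3")}

mr = {("q", "n2"), ("n2", "d1")}
print(on_edge_update(("n1", "n6"), mr, fake_recompute)[1])  # False
print(on_edge_update(("n2", "d1"), mr, fake_recompute)[1])  # True
```

The point of the check is that updates outside the monitoring region cost only a set lookup, which is why the overhead of this approach grows with the fraction of updated edges rather than staying constant.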
7. Performance Evaluation
In this section, we evaluate the performance of COSK through simulation experiments. We describe our experimental settings in Section 7.1, and we present our experimental results for static and dynamic road networks in Sections 7.2 and 7.3, respectively.
7.1. Experimental Settings. All of our experiments were performed using real road networks, namely, Oldenburg, San Francisco, and San Joaquin. All three road networks were obtained from [27]. The original road network of San Francisco had 21,047 nodes and 21,692 edges. We reformatted the network, pruned approximately 30% of the nodes, and adjusted the edges and their weights accordingly. This resulted in a network with 14,732 nodes and 14,316 edges. Both the directions of edges and the data objects on the edges were generated randomly. The description of each data object was extracted from Twitter messages [28], and we assigned one tweet per data object. Table 4 presents the characteristics of the data sets used in the experimental evaluation. We simulated moving query objects by using a spatiotemporal data generator [29]. The input to the generator was the road network of the data set used, and the output was the set of query objects moving on the road network. Each experiment had 100 moving queries, which were continuously monitored for 100 timestamps (1 timestamp = 1 second), and the average result was reported in the experiments.
As a benchmark for COSK in static road networks, we implemented the algorithm of [22], which also continuously monitors moving top-k spatial keyword queries in road networks. However, this algorithm was originally designed for undirected road networks. To make a fair comparison, we modified it to process top-k spatial keyword queries in directed road networks and called the result CMTkSK+. Specifically, we modified the distance computation method between two points such that, in directed road networks, $dist(p_1, p_2) \neq dist(p_2, p_1)$. Since CMTkSK+ does not handle top-k spatial keyword queries in dynamic road networks, we compared the performance of COSK with a basic algorithm, which recomputes the results whenever the query object changes its location. All algorithms were implemented in Java and were executed on a 2.80-GHz Intel Core i5 desktop PC with
12 Wireless Communications and Mobile Computing
Figure 7: Updating the weight of edges in a dynamic road network, where $t_i < t_j$. (a) Safe region at time $t_i$. (b) Updating the weights of $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$ at time $t_j$. The network contains nodes $n_1$ to $n_6$, the query $q$, and data objects $d_1$ (Italian Restaurant), $d_2$ (Italian Restaurant), and $d_3$ (Chinese Restaurant).
Figure 8: Monitoring region and updated safe region at time $t_j$. (a) Monitoring region at time $t_i$. (b) New safe region at time $t_j$.
(1) Input: monitoring region $MR$, updated edge $(n_i, n_j)$
(2) Output: none
(3) if $MR \cap (n_i, n_j) = \emptyset$ then
(4)     /* edge $(n_i, n_j)$ is not part of the monitoring region */
(5)     ignore the change in the weight of edge $(n_i, n_j)$
(6) else
(7)     $P_{SE} \leftarrow \emptyset$ /* set of safe exit points */
(8)     $D_k^{upd} \leftarrow$ EvaluateSnapshotQuery($n_i$, $e_i$) /* update set of top-k results */
(9)     $P_{SE}^{upd} \leftarrow$ ComputeSafeExit($p_a$, $n_\beta$) /* update safe exit points */
(10)    $MR^{upd} \leftarrow$ ComputeMonitoringRegion($D_l^+$, $D_h^-$) /* update monitoring region */
(11) end

Algorithm 5: MonitoringSafeRegion($MR$, $(n_i, n_j)$).
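The control flow of Algorithm 5 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' Java implementation: the monitoring region is assumed to be stored as a set of directed edges, and `recompute` is a hypothetical callback standing in for the three recomputation steps (top-k results, safe exit points, monitoring region).

```python
from typing import Callable, Set, Tuple

# A directed edge (tail, head); this representation is an assumption.
Edge = Tuple[str, str]


def monitoring_safe_region(
    monitoring_region: Set[Edge],
    updated_edge: Edge,
    recompute: Callable[[], Set[Edge]],
) -> Tuple[Set[Edge], bool]:
    """Return the monitoring region to use next and whether it was recomputed."""
    if updated_edge not in monitoring_region:
        # The updated edge lies outside the monitoring region: ignore it.
        return monitoring_region, False
    # The update touches the monitoring region: refresh results, safe exit
    # points, and the region itself (all delegated to the hypothetical callback).
    return recompute(), True
```

The point of the check is that edge updates outside the monitoring region cost only a set lookup, so the expensive recomputation is triggered only by updates that can actually affect the result.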
Table 4: Summary of datasets.

| Attribute | Oldenburg | San Francisco | San Joaquin |
|---|---|---|---|
| Total no. of nodes | 6,104 | 14,732 | 18,262 |
| Total no. of edges | 7,034 | 14,316 | 23,876 |
| Percentage of directed edges | 30% | 30% | 30% |
| Total no. of objects | 5,627 | 11,453 | 19,098 |
| Average no. of objects per edge | 0.8 | 0.8 | 0.8 |
| Total no. of words | 49,517 | 103,649 | 166,153 |
Table 5: Experimental parameter settings.

| Parameter | Range |
|---|---|
| Number of results ($k$) | 5, 10, 15, 20, 25 |
| Number of keywords ($n$) | 1, 2, 3, 4, 5 |
| Query parameter ($\alpha$) | 0.01, 0.1, 1, 10, 100 |
| Dataset | Oldenburg, San Francisco, San Joaquin |
| Number of data objects ($N_D$) | 10, 20, 30, 40, 50 (×1000) |
| Speed of query objects ($V_{qry}$) | 25, 50, 75, 100, 125 (km/h) |
| Mobility ($M_{qry}$) | 20, 40, 60, 80, 100 (%) |
| Ratio of directed edges ($E_{dir}$) | 10, 20, 30, 40, 50 (%) |
| Ratio of updated edges ($E_{upd}$) | 15, 30, 60, 80, 100 (%) |
8 GB of memory. In the experiments, we compared (1) query processing times, (2) edges processed, i.e., the number of edges processed for retrieving query results, and (3) index sizes. Table 5 summarizes the parameters used in the experiments. In each experiment, we varied a single parameter within the range shown in Table 5 while maintaining the other parameters at the bolded default values.
We evaluated the performance of the algorithms by using the following measures: (1) the total amount of server CPU time, which indicates the query processing time, and (2) the total communication cost, measured as the total number of points (i.e., the location updates sent by query objects and the query results and safe exit points returned by the server) transferred between clients and the server. The battery power and wireless bandwidth consumption typically increase with the amount of data transferred between objects (clients) and servers. Thus, we used the amount of transferred data as a metric to evaluate the communication cost.
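As a small illustration of the second measure, the communication cost can be tallied by counting the points carried in each client-server exchange. The sketch below is an assumption-laden toy: the `Exchange` record and its field names are hypothetical and only mirror the three kinds of points named above.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Exchange:
    """One client-server round trip (hypothetical record for illustration)."""
    location_updates: int   # points sent by the query object
    result_objects: int     # top-k result points returned by the server
    safe_exit_points: int   # safe exit points returned by the server


def communication_cost(exchanges: Iterable[Exchange]) -> int:
    """Total number of points transferred between clients and the server."""
    return sum(
        e.location_updates + e.result_objects + e.safe_exit_points
        for e in exchanges
    )
```

Under this accounting, a query that stays inside its safe region contributes no exchanges at all, which is why a safe-exit-based method can reduce the total.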
7.2. Experimental Results of Top-k Spatial Keyword Queries in Static Road Networks
7.2.1. Effect of k. Figure 9 shows the effect of the number of results on the query processing time and communication cost for both algorithms. Figure 9(a) indicates that the query processing time increases for both algorithms as the value of k increases. This is expected because, with an increase in k, more data objects must be explored and verified. Nevertheless, COSK significantly outperforms CMTkSK+ for two main reasons. First, the search for relevant objects is very efficient when using the highest significance factor, and second, COSK does not need to verify the set of answer objects as long as the query object lies in a safe region. On the other hand, the query processing time of CMTkSK+ increases significantly because it has to monitor and verify the set of candidate objects periodically. In Figure 9(b), the communication costs for both algorithms increase as the number of result objects increases. However, the proposed algorithm demonstrates superior performance compared to CMTkSK+ because client-server communication is not required while the query object stays within its safe exit points, whereas in CMTkSK+ the query object is required to report its location to the server whenever it moves.
7.2.2. Effect of $N_D$. This experiment was conducted on the San Joaquin dataset. This dataset included 19,098 data objects; therefore, we randomly generated approximately 30,000 additional data objects on different edges. In Figure 10, we evaluate the performance of COSK and CMTkSK+ by varying the cardinality of the data objects. Note that $N_D = 10\text{K}$ corresponds to a low density of data points, while $N_D = 50\text{K}$ corresponds to a high density. In Figure 10(a), it is interesting to notice that the query processing times of both algorithms decrease as the cardinality of the data objects increases. For CMTkSK+, this is because the monitoring range of a query decreases when the density is high. For COSK, it is mainly because fewer edges need to be expanded when the data density is high, which decreases the query processing time. In Figure 10(b), we study the influence of the cardinality of the data objects on the communication costs. The experimental results indicate that CMTkSK+ incurs almost constant communication costs regardless of the data object cardinality. However, the communication costs of COSK increase in proportion to the $N_D$ value. This is expected because the safe region becomes smaller as the density of the data objects increases, which increases the communication costs.
7.2.3. Effect of Query Keywords (n). Figure 11 shows the query processing time and communication cost for COSK and CMTkSK+ as a function of the number of query keywords. Figures 11(a) and 11(b) show that the performance of both algorithms degrades when the number of keywords increases. This is mainly because increasing the number of query keywords may also increase the number of relevant objects, resulting in a higher query processing time and communication cost. However, the safe-region-based algorithm COSK scales better than CMTkSK+ because of its less expensive monitoring technique.
7.2.4. Effect of $\alpha$. Figure 12 demonstrates the impact of the query parameter $\alpha$ on the query processing time and on the communication cost. A small value of $\alpha$ indicates a greater importance of textual relevance, whereas a high value of $\alpha$ gives more preference to the spatial relevance. It is interesting to note that the query processing time is lower for higher
Figure 9: Effect of k on query processing time and communication cost. (a) Query processing time (s) versus k. (b) Communication cost (number of messages transferred) versus k. Both panels compare COSK and CMTkSK+.
Figure 10: Effect of $N_D$ on query processing time and communication cost. (a) Query processing time (s) versus $N_D$. (b) Communication cost (number of messages transferred) versus $N_D$. Both panels compare COSK and CMTkSK+.
values of $\alpha$, which give more importance to the spatial relevance. This is mainly because, when the spatial relevance is weighted more heavily, fewer edges and objects need to be explored and processed to determine the top-k data objects. Observe that, in Figure 12(b), the number of messages sent by COSK decreases sharply with an increase in $\alpha$.

7.2.5. Effect of Speed. Figure 13(a) demonstrates the influence of the speed of the query objects on the query processing time of the COSK and CMTkSK+ algorithms. The experimental results indicate that the performance of CMTkSK+ is not significantly influenced by the speed of the query objects because the candidate objects must be monitored at regular time intervals regardless of the speed. On the other hand, the performance of COSK gradually degrades as the speed of the query objects increases because the objects leave their respective safe regions more frequently. Figure 13(b) shows the communication costs of COSK and CMTkSK+ with respect to the speed of the query objects. CMTkSK+ incurs almost constant communication costs because a server-initiated request to verify the candidate objects does not depend on the speed. For COSK, the query objects cross safe regions more frequently when the speed is high, which increases the communication costs.
Figure 11: Effect of number of keywords on query processing time and communication cost. (a) Query processing time (s) versus number of keywords. (b) Communication cost (number of messages transferred) versus number of keywords. Both panels compare COSK and CMTkSK+.
Figure 12: Effect of $\alpha$ on query processing time and communication cost. (a) Query processing time (s) versus $\alpha$. (b) Communication cost (number of messages transferred) versus $\alpha$. Both panels compare COSK and CMTkSK+.
7.2.6. Effect of Mobility. Figure 14 shows the effect of mobility $M_{qry}$ (mobility refers to the percentage of query objects that are moving at any timestamp) on the performance of the COSK and CMTkSK+ algorithms. As expected, the query processing time and communication costs for both algorithms increase with $M_{qry}$. Nevertheless, COSK performs better than CMTkSK+ in terms of both query processing time and communication costs.

7.2.7. Effect of Directed Edges. Figure 15 shows the impact of the percentage of directed edges $E_{dir}$ on the performance of the COSK and CMTkSK+ algorithms. The query processing time increases with $E_{dir}$ because the algorithms need to explore more edges to answer the top-k spatial keyword queries. However, the communication cost is not significantly affected by the value of $E_{dir}$ for either algorithm.

7.2.8. Effect of Datasets. Figure 16 shows the index sizes of the COSK and CMTkSK+ approaches for different datasets. As shown in Figure 16, both algorithms have similar index sizes. However, COSK has a minor space overhead because it stores additional information, namely, the highest significance factor $\theta_t$ of edges. More importantly, this space overhead is minimal compared to the gain achieved by COSK in query processing time and communication costs.
Figure 13: Effect of speed on query processing time and communication cost. (a) Query processing time (s) versus $V_{qry}$. (b) Communication cost (number of messages transferred) versus $V_{qry}$. Both panels compare COSK and CMTkSK+.
Figure 14: Effect of mobility on query processing time and communication cost. (a) Query processing time (s) versus $M_{qry}$. (b) Communication cost (number of messages transferred) versus $M_{qry}$. Both panels compare COSK and CMTkSK+.
7.3. Experimental Results of Top-k Spatial Keyword Queries in Dynamic Road Networks. In this section, we evaluate the performance of COSK and the basic algorithm for dynamic road networks. $E_{upd}$ indicates the percentage of all edges that change their weight at each timestamp. The length of an updated edge is randomly selected between 0.1 and 10 times the original length. Figure 17(a) depicts the query processing time of COSK and the basic algorithm. It is evident from the figure that the query processing time of the basic algorithm is not significantly affected by $E_{upd}$. This is mainly because the query objects issue top-k spatial queries at each timestamp. However, the query processing time of COSK increases with the value of $E_{upd}$ because the probability that an updated edge is associated with the monitoring region of query q increases with $E_{upd}$. Therefore, when $E_{upd}$ becomes large, the results need to be updated frequently, which increases the query processing time. Figure 17(b) shows the communication costs of COSK and the basic algorithm with respect to $E_{upd}$. The basic algorithm incurs almost constant communication costs regardless of the value of $E_{upd}$. In contrast, the communication cost of COSK increases with $E_{upd}$ because the query result and safe regions need to be updated frequently.
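The edge-update workload described above can be sketched as follows. This is a hedged illustration rather than the authors' generator: the source only states that the new length is randomly selected between 0.1 and 10 times the original, so the uniform distribution, the dictionary layout, and the function name are all assumptions.

```python
import random
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # directed edge (tail, head); layout is an assumption


def update_edge_weights(
    original: Dict[Edge, float],
    current: Dict[Edge, float],
    ratio: float,
    rng: random.Random,
) -> List[Edge]:
    """Rescale a fraction `ratio` of edges (E_upd) relative to their original length."""
    n_updates = round(ratio * len(original))
    # Sort the keys so the sample is reproducible for a fixed seed.
    chosen = rng.sample(sorted(original), n_updates)
    for edge in chosen:
        # Assumed uniform draw in [0.1, 10] times the original length.
        current[edge] = original[edge] * rng.uniform(0.1, 10.0)
    return chosen
```

Calling this once per timestamp with, for example, `ratio=0.15` would correspond to $E_{upd} = 15\%$; the returned list is what a monitoring routine would then test against each query's monitoring region.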
Figure 15: Effect of $E_{dir}$ on query processing time and communication cost. (a) Query processing time (s) versus $E_{dir}$. (b) Communication cost (number of messages transferred) versus $E_{dir}$. Both panels compare COSK and CMTkSK+.
Figure 16: Effect of dataset on index size. The plot shows the index size (MB) of COSK and CMTkSK+ for the Oldenburg, San Francisco, and San Joaquin datasets.
8. Conclusion
In this paper, we investigated moving top-k spatial keyword queries in directed and dynamic road networks. We presented an efficient indexing framework using inverted files that indexes the data objects on edges, allowing for the effective searching of data objects relevant to queries in terms of both textual and spatial relevance. We also presented a safe-exit-based algorithm called COSK to monitor moving top-k spatial keyword queries. We demonstrated that the query results remain valid as long as the query object resides within a safe region. Furthermore, COSK can effectively monitor the validity of query results and safe regions in dynamic road networks. Finally, an experimental evaluation conducted on real road networks demonstrated that COSK significantly reduced the query processing time and communication costs compared to the CMTkSK+ algorithm.
Data Availability
The real road network data used in this study are also used in many previous studies. The road network data are cited in the manuscript and are available at https://www.cs.utah.edu/~lifeifei/SpatialDataset.htm. To simulate the moving queries, the authors used the spatiotemporal data generator, which is also used in previous studies. The research article describing the generator is cited in the manuscript. The documentation and source
Figure 17: Effect of $E_{upd}$ on query processing time and communication cost. (a) Query processing time (s) versus $E_{upd}$. (b) Communication cost (number of messages transferred) versus $E_{upd}$. Both panels compare COSK and the basic algorithm.
files of the generator are available at https://iapg.jade-hs.de/personen/brinkhoff/generator. The authors used Twitter tweets to generate the descriptions of data objects and the query keywords. The tweets used are accessible at http://followthehashtag.com/datasets/free-twitter-dataset-usa-200000-free-usa-tweets.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
Hyung-Ju Cho was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean Government (MSIP) (NRF-2016R1A2B4009793), and this research was partially supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2016R1D1A1B03934129).
References
[1] D. Papadias, N. Mamoulis, J. Zhang, and Y. Tao, “Query processing in spatial network databases,” in Proceedings of the 29th International Conference on Very Large Data Bases (VLDB ’03), pp. 802–813, September 2003.
[2] H.-J. Cho, K. Ryu, and T.-S. Chung, “An efficient algorithm for computing safe exit points of moving range queries in directed road networks,” Information Systems, vol. 41, pp. 1–19, 2014.
[3] G. Tsatsanifos and A. Vlachou, “On processing Top-k spatio-textual preference queries,” in Proceedings of the 18th International Conference on Extending Database Technology (EDBT ’15), pp. 433–444, March 2015.
[4] R. Li, A. X. Liu, A. L. Wang, and B. Bruhadeshwar, “Fast range query processing with strong privacy protection for cloud computing,” Proceedings of the VLDB Endowment, vol. 7, no. 14, pp. 1953–1964, 2014.
[5] G. Cong, C. S. Jensen, and D. Wu, “Efficient retrieval of the Top-k most relevant spatial web objects,” Proceedings of the VLDB Endowment, vol. 2, no. 1, pp. 337–348, 2009.
[6] Z. Li, K. C. K. Lee, B. Zheng, W.-C. Lee, D. Lee, and X. Wang, “IR-tree: An efficient index for geographic document search,” IEEE Transactions on Knowledge and Data Engineering, vol. 23, no. 4, pp. 585–599, 2011.
[7] Y. Zhou, X. Xie, C. Wang, Y. Gong, and W. Ma, “Hybrid index structures for location-based web search,” in Proceedings of the 14th ACM International Conference on Information and Knowledge Management, pp. 155–162, Bremen, Germany, October 2005.
[8] J. Zobel and A. Moffat, “Inverted files for text search engines,” ACM Computing Surveys, vol. 38, no. 2, 2006.
[9] N. Beckmann, H. Kriegel, R. Schneider, and B. Seeger, “The R*-tree: an efficient and robust access method for points and rectangles,” in Proceedings of the ACM SIGMOD International Conference on Management of Data, vol. 19, pp. 322–331, May 1990.
[10] R. Hariharan, B. Hore, C. Li, and S. Mehrotra, “Processing spatial-keyword (sk) queries in geographic information retrieval (gir) systems,” in Proceedings of the 19th International Conference on Scientific and Statistical Database Management (SSDBM ’07), July 2007.
[11] I. De Felipe, V. Hristidis, and N. Rishe, “Keyword search on spatial databases,” in Proceedings of the 24th International Conference on Data Engineering (ICDE ’08), pp. 656–665, April 2008.
[12] J. B. Rocha-Junior, O. Gkorgkas, S. Jonassen, and K. Nørvåg, “Efficient processing of top-k spatial keyword queries,” in Proceedings of the International Symposium on Spatial and Temporal Databases, pp. 205–222, Springer, 2011.
[13] D. Zhang, K.-L. Tan, and A. K. Tung, “Scalable top-k spatial keyword search,” in Proceedings of the 16th International Conference on Extending Database Technology, pp. 359–370, 2013.
[14] J. B. Rocha-Junior and K. Nørvåg, “Top-k spatial keyword queries on road networks,” in Proceedings of the 15th International Conference on Extending Database Technology, pp. 168–179, Berlin, Germany, March 2012.
[15] H.-J. Cho, S. J. Kwon, and T.-S. Chung, “A safe exit algorithm for continuous nearest neighbor monitoring in road networks,” Mobile Information Systems, vol. 9, no. 1, pp. 37–53, 2013.
[16] D. Yung, M. L. Yiu, and E. Lo, “A safe-exit approach for efficient network-based moving range queries,” Data & Knowledge Engineering, vol. 72, pp. 126–147, 2012.
[17] M. Attique, H. Cho, R. Jin, and T. Chung, “Efficient Processing of Continuous Reverse k Nearest Neighbor on Moving Objects in Road Networks,” ISPRS International Journal of Geo-Information, vol. 5, no. 12, p. 247, 2016.
[18] H. G. Elmongui, M. F. Mokbel, and W. G. Aref, “Continuous aggregate nearest neighbor queries,” GeoInformatica, vol. 17, no. 1, pp. 63–95, 2013.
[19] D. Wu, M. L. Yiu, C. S. Jensen, and G. Cong, “Efficient continuously moving top-k spatial keyword query processing,” in Proceedings of the IEEE International Conference on Data Engineering (ICDE ’11), pp. 541–552, Hannover, Germany, April 2011.
[20] W. Huang, G. Li, K.-L. Tan, and J. Feng, “Efficient safe-region construction for moving top-k spatial keyword queries,” in Proceedings of the 21st ACM International Conference on Information and Knowledge Management, pp. 932–941, 2012.
[21] L. Guo, J. Shao, H. H. Aung, and K.-L. Tan, “Efficient continuous top-k spatial keyword queries on road networks,” GeoInformatica, vol. 19, no. 1, pp. 29–60, 2014.
[22] Y. Li, G. Li, L. Shu, Q. Huang, and H. Jiang, “Continuous monitoring of top-k spatial keyword queries in road networks,” Journal of Information Science and Engineering, vol. 31, no. 6, pp. 1831–1848, 2015.
[23] M. Attique, A. Khan, and T.-S. Chung, “ESPAK: Top-k spatial keyword query processing in directed road networks,” in Proceedings of the Workshops of the International Conference on Extending Database Technology and the International Conference on Database Theory (EDBT/ICDT ’17), March 2017.
[24] G. Salton and C. Buckley, “Term-weighting approaches in automatic text retrieval,” Information Processing & Management, vol. 24, no. 5, pp. 513–523, 1988.
[25] V. N. Anh, O. de Kretser, and A. Moffat, “Vector-space ranking with effective early termination,” in Proceedings of the 24th Annual International ACM SIGIR Conference, pp. 35–42, New Orleans, LA, USA, 2001.
[26] E. W. Dijkstra, “A note on two problems in connexion with graphs,” Numerische Mathematik, vol. 1, pp. 269–271, 1959.
[27] “Real datasets for spatial databases,” https://www.cs.utah.edu/~lifeifei/SpatialDataset.htm.
[28] “Twitter,” https://twitter.com.
[29] T. Brinkhoff, “A framework for generating network-based moving objects,” GeoInformatica, vol. 6, no. 2, pp. 153–180, 2002.
(1) Input: same as Algorithm 1
(2) Output: $P_{SE}$, a set of safe exit points
(3) $P_{SE} \leftarrow \emptyset$ /* set of safe exit points */
(4) $D^+_{p_a} \leftarrow$ EvaluateSnapshotQuery($p_a$, $(p_a, n_\beta)$)
(5) /* results calculated using Algorithm 1 */
(6) $D^+_{n_\beta} \leftarrow$ EvaluateSnapshotQuery($n_\beta$, $(p_a, n_\beta)$)
(7) /* results calculated using Algorithm 1 */
(8) if $D^+_{p_a} = D^+_{n_\beta}$ then
(9)     no safe exit point /* refer to Observation 1 */
(10) end
(11) if $D^+_{p_a} \neq D^+_{n_\beta}$ then
(12)    $P_{SE} \leftarrow P_{SE} \cup$ ComputeSafeExit($p_a$, $n_\beta$) /* safe exit point exists; refer to Observation 2 */
(13) end
(14) return $P_{SE}$

Algorithm 3: COSK monitoring algorithm.
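The decision made by Algorithm 3 for a single segment can be sketched as follows. This is a minimal sketch, not the authors' code: `top_k_at` and `compute_safe_exit` are hypothetical callbacks standing in for EvaluateSnapshotQuery and ComputeSafeExit, and points are represented by plain strings for illustration.

```python
from typing import Callable, Iterable, List


def monitor_segment(
    p_a: str,
    n_beta: str,
    top_k_at: Callable[[str], Iterable[str]],
    compute_safe_exit: Callable[[str, str], str],
) -> List[str]:
    """Return the safe exit points found in the segment (p_a, n_beta)."""
    answers_at_start = frozenset(top_k_at(p_a))
    answers_at_end = frozenset(top_k_at(n_beta))
    if answers_at_start == answers_at_end:
        # Same answer set at both endpoints: no safe exit point (Observation 1).
        return []
    # Answer sets differ: a safe exit point exists in the segment (Observation 2).
    return [compute_safe_exit(p_a, n_beta)]
```

Comparing the answer sets as unordered sets is itself an assumption; if the rank order inside the top-k list matters for an application, the comparison would need to use ordered sequences instead.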
(1) Input: same as Algorithm 1
(2) Output: $se$, safe exit point in $(p_a, n_\beta)$
(3) $D^+_l \leftarrow \langle p, D^+_l \rangle$ | for each point $p \in [p_a, n_\beta]$, $d^+_l$ such that $\psi(d^+_l)_p = \min(\psi(d^+_1)_p, \psi(d^+_2)_p, \ldots, \psi(d^+_{|d^+|})_p)$
(4) $D^-_h \leftarrow \langle p, D^-_h \rangle$ | for each point $p \in [p_a, n_\beta]$, $d^-_h$ such that $\psi(d^-_h)_p = \max(\psi(d^-_1)_p, \psi(d^-_2)_p, \ldots, \psi(d^-_{|d^-|})_p)$
(5) if Case 1 then
(6)     if $\mathit{dist}(p_a, n_\beta) = \mathit{dist}(n_\beta, p_a)$ then
(7)         $p_{se} \leftarrow$ the point $se$ at which $\max(\mathit{dist}(se, d^+_1), \mathit{dist}(se, d^+_2), \ldots, \mathit{dist}(se, d^+_{|d^+|})) = \min(\mathit{dist}(se, d^-_1), \mathit{dist}(se, d^-_2), \ldots, \mathit{dist}(se, d^-_{|d^-|}))$
(8)     end
(9)     if $\mathit{dist}(p_a, n_\beta) \neq \mathit{dist}(n_\beta, p_a)$ then
(10)        $p_{se} = p_a$ or $p_{se} = d^+_l$, where $d^+_l \in (p_a, n_\beta)$
(11)    end
(12) end
(13) if Case 2 then
(14)    if $\mathit{dist}(p_a, n_\beta) = \mathit{dist}(n_\beta, p_a)$ then
(15)        $p_{se} \leftarrow$ closest point to $p_a$ such that $\psi(d^-_h) > \psi(d^+_l)$
(16)    end
(17)    if $\mathit{dist}(p_a, n_\beta) \neq \mathit{dist}(n_\beta, p_a)$ then
(18)        same as Line (10)
(19)    end
(20) end
(21) return $p_{se}$

Algorithm 4: ComputeSafeExit($p_a$, $n_\beta$).
Therefore, according to Case 1, the safe exit point $p_{se1}$ is the midpoint between $d_3$ and $d_6$. That is, $\mathit{dist}(p_{se1}, d_3) = \mathit{dist}(p_{se1}, d_6)$, where $\mathit{dist}(p_{se1}, d_3) = x + 3$ and $\mathit{dist}(p_{se1}, d_6) = -x + 5$ for $0 < x < 3$. Consequently, $x = 1$, which means that the distance from $n_3$ to $p_{se1}$ is 1.
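The arithmetic of this Case 1 example can be checked with a short sketch. It assumes, as in the example, an undirected segment on which one distance grows as $a + x$ and the other shrinks as $b - x$; the function name and parameters are illustrative only.

```python
from typing import Optional


def equal_distance_offset(a: float, b: float, length: float) -> Optional[float]:
    """Solve a + x == b - x for x; accept it only strictly inside (0, length)."""
    x = (b - a) / 2.0
    if 0.0 < x < length:
        return x
    # The two distances never become equal inside the segment.
    return None


# Example from the text: dist to d3 is x + 3, dist to d6 is -x + 5, segment length 3.
offset_from_n3 = equal_distance_offset(3.0, 5.0, 3.0)
```

With these inputs the offset evaluates to 1.0, matching the distance from $n_3$ to $p_{se1}$ derived above.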
Next, we determine a safe exit point in $(n_3, n_5)$. As shown in Table 3, the answer object at $n_5$ is the same as that at $n_3$. Hence, no safe exit point exists in this edge. Next, $\overleftarrow{(n_6, n_5)}$ is explored with $p_a = n_5$. According to Table 3, $D^+_{n_6} = d_4$ and $D^+_{n_5} = d_3$. Therefore, a safe exit point exists in this edge. This edge is directed, and for each point $p \in \overleftarrow{(n_6, n_5)}$, the shortest path from $p$ to $d_3$ is $p \rightarrow n_6 \rightarrow n_2 \rightarrow n_3 \rightarrow d_3$. Therefore, $n_5$ is the safe exit point.

The bold lines in Figure 5 indicate the safe region of q. The top-1 result remains $d_3$ as long as the query q lies in the safe region.
Next, we analyze the time complexity for determining a set of safe exit points using a set of qualifying objects $d \in D^+_{p_a} \cup D^+_{n_\beta} \cup D(p_a, n_\beta)$. Note that $D^+_{p_a}$ ($D^+_{n_\beta}$) indicates
Table 3: Computation of safe exit points for example scenario.

| Edge/Segment | $p_a$ | $D^+_{p_a}$ | $D^+_{n_\beta}$ | $p_{se}$ |
|---|---|---|---|---|
| $\overrightarrow{(q, n_3)}$ | $q$ | $D^+_q = d_3$ | $D^+_{n_3} = d_3$ | none |
| $(n_3, n_4)$ | $q$ | $D^+_{n_3} = d_3$ | $D^+_{n_4} = d_3$ | none |
| $(n_3, n_7)$ | $n_3$ | $D^+_{n_3} = d_3$ | $D^+_{n_7} = d_6$ | $p_{se1}$ |
| $\overrightarrow{(n_3, n_5)}$ | $n_3$ | $D^+_{n_3} = d_3$ | $D^+_{n_5} = d_3$ | none |
| $\overleftarrow{(n_6, n_5)}$ | $n_5$ | $D^+_{n_5} = d_3$ | $D^+_{n_6} = d_4$ | $p_{se2}$ |
Figure 5: Illustration of safe region of q. The network contains nodes $n_1$ to $n_7$, the query $q$, the safe exit points $p_{se1}$ and $p_{se2}$, and data objects $d_1$ (Grand Hotel), $d_2$ (Cafe), $d_3$ (Italian Restaurant), $d_4$ (Chinese Restaurant), $d_5$ (Pub and Bar), $d_6$ (Italian Restaurant), and $d_7$ (Cafe and Bakery).
the set of k data objects that satisfies the query condition at $p_a$ ($n_\beta$). According to Dijkstra's algorithm [26], the time complexity $O(D^+_q)$ for computing a set of answer objects at a query point q is $O(D^+_q) = O(|E| + |N| \log |N|)$. This means that $O(D^+_{p_a}) = O(D^+_{n_\beta}) = O(|E| + |N| \log |N|)$ holds for endpoints $p_a$ and $n_\beta$. Thus, the time complexity $O(\Omega_{kth})$ of determining the skyline $\Omega_{kth}$ with the k-th highest score is $O(\Omega_{kth}) = C_{kth} \cdot O(|D^+_{p_a} \cup D^+_{n_\beta} \cup D(p_a, n_\beta)|)$, where $C_{kth}$ is the number of qualifying objects that participate in the constitution of the skyline with the k-th highest score. Therefore, the time complexity of determining a safe exit point coincides with the time complexity of determining the two skylines, i.e., the skyline $D^+_l$ with the k-th highest (or lowest) score for answer objects and the skyline $D^-_h$ with the highest score for nonanswer objects. This is because the safe exit point is found at the cross point between these skylines.
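The shortest-path step behind this bound can be sketched as follows. This is an illustrative Python version under stated assumptions, not the authors' Java implementation: the adjacency-list layout and the node names are hypothetical. It uses the binary heap from `heapq`, which runs in $O((|E| + |N|) \log |N|)$; the $O(|E| + |N| \log |N|)$ bound quoted above corresponds to a Fibonacci-heap priority queue.

```python
import heapq
from typing import Dict, List, Tuple

# Directed graph: node -> list of (successor, edge weight). Layout is an assumption.
Graph = Dict[str, List[Tuple[str, float]]]


def dijkstra(graph: Graph, source: str) -> Dict[str, float]:
    """Shortest network distance from `source` to every reachable node."""
    dist: Dict[str, float] = {source: 0.0}
    heap: List[Tuple[float, str]] = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry; a shorter path to u was already found
        for v, w in graph.get(u, []):
            candidate = d + w
            if candidate < dist.get(v, float("inf")):
                dist[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return dist


# A small one-way cycle: a -> b -> c -> a.
example: Graph = {"a": [("b", 1.0)], "b": [("c", 2.0)], "c": [("a", 5.0)]}
```

Because each direction of an edge is stored separately, `dijkstra(example, "a")["c"]` and `dijkstra(example, "c")["a"]` differ, which is the asymmetry that distinguishes directed from undirected road networks.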
Figure 6 represents the skyline graph for $k = 1$ in an edge $(n_7, n_3)$. Let us draw the score functions for $d_3$ and $d_6$ for the road segment $(n_7, n_3)$, where a safe exit point exists. This is because $D^+_{n_3} = d_3$ and $D^+_{n_7} = d_6$ for $k = 1$. For each point $p \in (n_7, n_3)$, the distance between $d_3$ and point p can be represented as $\mathit{dist}(d_3, p) = \mathit{dist}(d_3, n_3) + \mathit{len}(n_3, p) = 6 - \mathit{len}(n_7, p)$. Similarly, for each point $p \in (n_7, n_3)$, the distance between $d_6$ and point p can be represented as $\mathit{dist}(d_6, p) = \mathit{dist}(d_6, n_7) + \mathit{len}(n_7, p) = 2 + \mathit{len}(n_7, p)$. Let $\mathit{len}(n_7, p)$ be
Figure 6: Skyline graph for $k = 1$ on the road segment $(n_7, n_3)$. The plot shows the scores $\psi(d_6) = 1/(x + 3)$ and $\psi(d_3) = 1/(-x + 7)$ against distance, with $p_{se1}$ marked between $n_7$ and $n_3$.
a variable $x$ ($0 \le x \le 3$). We can write $\lambda(d_3, p) = \mathit{dist}(d_3, p) = 6 - x$ and $\lambda(d_6, p) = \mathit{dist}(d_6, p) = 2 + x$. Then, we can represent the score functions $\psi(d_3)$ and $\psi(d_6)$ as follows:

$$\psi(d_3) = \frac{\mu(d_3.t, q.t)}{1 + \alpha \cdot \lambda(d_3, p)} = \frac{1}{7 - x} \quad \text{for } 0 \le x \le 3,$$
$$\psi(d_6) = \frac{\mu(d_6.t, q.t)}{1 + \alpha \cdot \lambda(d_6, p)} = \frac{1}{3 + x} \quad \text{for } 0 \le x \le 3.$$

Finally, we present a lemma proving that the safe exit points computed by COSK are correct.
Lemma 8. The COSK algorithm correctly computes a set of safe exit points.
Proof. We prove the correctness of the COSK algorithm by contradiction. Assume that $D^+_{p_a} \neq D^+_{n_\beta}$ and that there is no safe exit point in the road segment $(p_a, n_\beta)$. This means that, for each point p in the road segment $(p_a, n_\beta)$, the query result at p equals $D^+_{p_a}$, i.e., $D^+_p = D^+_{p_a}$ for all $p \in (p_a, n_\beta)$. However, this leads to the contradiction that $D^+_{n_\beta} = D^+_{p_a}$ when $p = n_\beta$. Therefore, if $D^+_{p_a} \neq D^+_{n_\beta}$, a safe exit point exists in $(p_a, n_\beta)$. In addition, when $D^+_{p_a} \neq D^+_{n_\beta}$, a safe exit point is determined using the skyline $D^+_l$ for answer objects and the skyline $D^-_h$ with the highest score for nonanswer objects. The first skyline is a composite polyline drawn from answer objects in $D^+_{p_a}$. The second skyline is a composite polyline drawn from nonanswer objects in $D^+_{n_\beta} \cup D(p_a, n_\beta) - D^+_{p_a}$.
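The role of the two skylines can be illustrated numerically. The sketch below is an approximation under stated assumptions, not the exact procedure of Algorithm 4: it samples the segment on a uniform grid and reports the first sampled offset at which the highest nonanswer score exceeds the lowest answer score. The function name, the grid resolution, and the use of Python callables for score functions are all illustrative choices.

```python
from typing import Callable, Optional, Sequence

ScoreFn = Callable[[float], float]  # score of one object as a function of offset x


def first_crossing(
    answer_scores: Sequence[ScoreFn],
    other_scores: Sequence[ScoreFn],
    length: float,
    steps: int = 300,
) -> Optional[float]:
    """Approximate the offset where the nonanswer skyline overtakes the answer skyline."""
    for i in range(steps + 1):
        x = length * i / steps
        lowest_answer = min(f(x) for f in answer_scores)   # skyline of answer objects
        highest_other = max(g(x) for g in other_scores)    # skyline of nonanswer objects
        if highest_other > lowest_answer:
            return x
    return None  # the answer set stays valid on the whole segment
```

For the segment $(n_7, n_3)$ with $x = \mathit{len}(n_7, p)$, passing $\psi(d_6) = 1/(3 + x)$ as the answer score and $\psi(d_3) = 1/(7 - x)$ as the nonanswer score yields a crossing just past $x = 2$, i.e., at distance 1 from $n_3$, in line with $p_{se1}$ in the running example. A production implementation would intersect the piecewise score curves analytically instead of sampling.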
6. Monitoring Query Results and Safe Regions in Dynamic Directed Road Networks
In this section, we discuss the monitoring of spatial keyword queries in dynamic road networks, where the network distance changes depending on the traffic conditions. Updates to the weights of some edges may invalidate the query results or the safe region of $q$, even though the query object $q$ remains within its safe region. Figure 7 illustrates an example of changing the weights of edges $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$. For convenience, we consider $\alpha = 1$ and $q.t$ = “Italian restaurant”. In Figure 7(a), the top-1 result is $d_1$, and the bold lines show the safe region of query $q$. Now consider that, at time $t_j$, the weights of the two edges $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$ change due to heavy traffic, as shown in Figure 7(b). Such an update may invalidate the query result or the safe region of $q$. Therefore, it is necessary to monitor the validity of the results and the safe region whenever such changes occur.
Next, we introduce a monitoring region to monitor the validity of the safe region effectively when the weight of an edge changes. The monitoring region $MR$ contains all the points between the query point $q$ and the lowest answer object, and between $q$ and the highest nonanswer object. Formally, it is defined as $MR = dist(q, D^+_l) \cup dist(q, D^-_h)$, where $dist(q, D^+_l)$ refers to the portion of the network between $q$ and the lowest answer object and $dist(q, D^-_h)$ refers to the portion between $q$ and the highest nonanswer object. In the given example, $D^+_l = \{d_1\}$ and $D^-_h = \{d_2, d_3\}$. Therefore, the dotted lines in Figure 8(a) show the monitoring region of the query object $q$.
Now, at time $t_j$, the updates to edges $\overleftarrow{(n_1, n_6)}$ and $\overleftarrow{(n_1, d_1)}$, which are not part of the monitoring region, can safely be ignored. However, the update on segment $\overrightarrow{(n_2, d_1)}$, which is associated with the monitoring region, may invalidate the results. As shown in Figure 8(b), after the update, the top-1 result becomes $d_2$, and the bold lines represent the new safe region of $q$.
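One way to picture the monitoring region is as the set of directed edges lying on the shortest paths from $q$ to the lowest answer object and to the highest nonanswer objects. The sketch below takes that reading as an assumption, since the definition above does not prescribe a data structure. It also treats data objects as graph nodes for simplicity and uses a small hypothetical network rather than the one in Figure 8. The function names are illustrative.

```python
import heapq


def dijkstra(graph, source):
    """Shortest distances and predecessors on a directed graph.

    graph: dict mapping node -> list of (neighbor, weight) outgoing edges.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev


def monitoring_region(graph, q, targets):
    """Edges on the shortest paths from q to each reachable target."""
    dist, prev = dijkstra(graph, q)
    region = set()
    for target in targets:
        if target not in dist:
            continue                      # unreachable in the directed graph
        node = target
        while node != q:
            region.add((prev[node], node))
            node = prev[node]
    return region


# Hypothetical directed network: d1 is the answer object, d2 a nonanswer object.
graph = {
    "q": [("n1", 2.0), ("n2", 3.0)],
    "n1": [("d1", 1.0), ("n2", 5.0)],
    "n2": [("d2", 2.0), ("n6", 4.0)],
}
mr = monitoring_region(graph, "q", ["d1", "d2"])
```

Under this reading, an update to edge `("n2", "n6")` falls outside `mr` and could be ignored, whereas an update to `("n1", "d1")` would require re-evaluation.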
Algorithm 5 monitors the validity of the result set and the safe region of the query object $q$ when the weight of any edge changes. Suppose that the weight of edge $(n_i, n_j)$ changes at time $t_j$. First, the algorithm checks whether edge $(n_i, n_j)$ is associated with the monitoring region. If it is not, the algorithm simply ignores the update to edge $(n_i, n_j)$, and the query results and safe region remain valid. In contrast, if the edge is associated with the monitoring region (i.e., $MR \cap (n_i, n_j) \neq \emptyset$), the algorithm re-evaluates the query results. Consequently, the top-k results and the safe region of query $q$ are updated. Finally, the algorithm updates the monitoring region of $q$.
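The control flow just described can be sketched as a small update handler. This is a minimal illustration, not the authors' implementation: the monitoring region is assumed to be stored as a set of edges, and `recompute` is a hypothetical callback standing in for the snapshot evaluation, safe exit computation, and monitoring region rebuild.

```python
def handle_edge_update(state, edge, new_weight, weights, recompute):
    """Apply a weight change and refresh the query state only if needed.

    state: dict with keys "results", "safe_exits", "monitoring_region".
    recompute: hypothetical callback returning (results, safe_exits, region).
    Returns True when the query state was recomputed.
    """
    weights[edge] = new_weight
    if edge not in state["monitoring_region"]:
        return False                      # update cannot affect this query
    results, safe_exits, region = recompute(weights)
    state["results"] = results
    state["safe_exits"] = safe_exits
    state["monitoring_region"] = region
    return True


def fake_recompute(weights):
    # Stand-in for the real recomputation; returns fixed illustrative values.
    return ["d2"], ["pse1", "pse2", "pse3"], {("q", "n2"), ("n2", "d2")}


weights = {("q", "n1"): 2.0, ("n1", "d1"): 1.0, ("n1", "n6"): 3.0}
state = {
    "results": ["d1"],
    "safe_exits": ["pse1"],
    "monitoring_region": {("q", "n1"), ("n1", "d1")},
}
ignored = handle_edge_update(state, ("n1", "n6"), 9.0, weights, fake_recompute)
refreshed = handle_edge_update(state, ("n1", "d1"), 6.0, weights, fake_recompute)
```

The first call changes a weight outside the monitoring region and leaves the query state untouched. The second call touches the monitoring region and triggers the recomputation.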
7. Performance Evaluation
In this section, we evaluate the performance of COSK through simulation experiments. We describe our experimental settings in Section 7.1, and we present our experimental results for static and dynamic road networks in Sections 7.2 and 7.3, respectively.
7.1. Experimental Settings. All of our experiments were performed using real road networks, namely, Oldenburg, San Francisco, and San Joaquin. All three road networks were obtained from [27]. The original road network of San Francisco had 21047 nodes and 21692 edges. We reformatted the network, pruned approximately 30% of the nodes, and adjusted the edges and their weights accordingly. This resulted in a network with 14732 nodes and 14316 edges. Both the directions of the edges and the data objects on the edges were generated randomly. The description of each data object was extracted from Twitter messages [28], and we assigned one tweet per data object. Table 4 presents the characteristics of the datasets used in the experimental evaluation. We simulated moving query objects by using a spatiotemporal data generator [29]. The input to the generator was the road network of the dataset used, and the output was the set of query objects moving on the road network. Each experiment had 100 moving queries, which were continuously monitored for 100 timestamps (1 timestamp = 1 second), and the average result was reported in the experiments.
As a benchmark for COSK in static road networks, we implemented the CMTkSK+ algorithm [22], which also continuously monitors moving top-k spatial keyword queries in road networks. However, this algorithm was originally designed for undirected road networks. To make a fair comparison, we modified CMTkSK+ to process top-k spatial keyword queries in directed road networks and called it CMTkSK+. Specifically, we modified the distance computation method between two points such that, in directed road networks, $dist(p_1, p_2) \neq dist(p_2, p_1)$ in general. Since CMTkSK+ does not handle top-k spatial keyword queries in dynamic road networks, we compared the performance of COSK in dynamic road networks with a basic algorithm, which recomputes the results whenever the query object changes its location. All algorithms were implemented in Java and were executed on a desktop PC with a 2.80-GHz Intel Core i5 and 8 GB of memory. In the experiments, we compared (1) query processing times, (2) edges processed, i.e., the number of edges processed for retrieving query results, and (3) index sizes. Table 5 summarizes the parameters used in the experiments. In each experiment, we varied a single parameter within the range shown in Table 5 while maintaining the other parameters at the bolded default values.

Figure 7: Updating the weight of edges in a dynamic road network, where $t_i < t_j$. (a) Safe region at time $t_i$. (b) Updating the weights of $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$ at time $t_j$.

Figure 8: Monitoring region and updated safe region at time $t_j$. (a) Monitoring region at time $t_i$. (b) New safe region at time $t_j$.

Algorithm 5: MonitoringSafeRegion($MR$, $(n_i, n_j)$).
Input: monitoring region $MR$, updated edge $(n_i, n_j)$
Output: none
if $MR \cap (n_i, n_j) = \emptyset$ then
    /* edge $(n_i, n_j)$ is not part of the monitoring region */
    ignore the change in the weight of edge $(n_i, n_j)$
else
    $P_{SE} \leftarrow \emptyset$ /* set of safe exit points */
    $D_k^{upd} \leftarrow EvaluateSnapshotQuery(n_i, e_i)$ /* update the set of top-k results */
    $P_{SE}^{upd} \leftarrow ComputeSafeExit(P_a, n_\beta)$ /* update the safe exit points */
    $MR^{upd} \leftarrow ComputeMonitoringRegion(D^+_l, D^-_h)$ /* update the monitoring region */
end

Table 4: Summary of datasets.

| Attribute | Oldenburg | San Francisco | San Joaquin |
|---|---|---|---|
| Total no. of nodes | 6104 | 14732 | 18262 |
| Total no. of edges | 7034 | 14316 | 23876 |
| Percentage of directed edges | 30 | 30 | 30 |
| Total no. of objects | 5627 | 11453 | 19098 |
| Average no. of objects per edge | 0.8 | 0.8 | 0.8 |
| Total no. of words | 49517 | 103649 | 166153 |

Table 5: Experimental parameter settings.

| Parameter | Range |
|---|---|
| Number of results ($k$) | 5, 10, 15, 20, 25 |
| Number of keywords ($n$) | 1, 2, 3, 4, 5 |
| Query parameter ($\alpha$) | 0.01, 0.1, 1, 10, 100 |
| Dataset | Oldenburg, San Francisco, San Joaquin |
| Number of data objects ($N_D$) | 10, 20, 30, 40, 50 (×1000) |
| Speed of query objects ($V_{qry}$) | 25, 50, 75, 100, 125 (km/h) |
| Mobility ($M_{qry}$) | 20, 40, 60, 80, 100 |
| Ratio of directed edges ($E_{dir}$) | 10, 20, 30, 40, 50 |
| Ratio of updated edges ($E_{upd}$) | 15, 30, 60, 80, 100 |
We evaluated the performance of the algorithms by using the following measures: (1) the total amount of server CPU time, which indicates the query processing time, and (2) the total communication cost, measured as the total number of points (i.e., the location updates sent by query objects and the query results and safe exit points returned by the server) transferred between clients and the server. The battery power and wireless bandwidth consumption typically increase with the amount of data transferred between objects (clients) and servers. Thus, we used the amount of transferred data as a metric to evaluate the communication cost.
7.2. Experimental Results of Top-k Spatial Keyword Queries in Static Road Networks
7.2.1. Effect of k. Figure 9 shows the effect of the number of results $k$ on the query processing time and communication cost for both algorithms. Figure 9(a) indicates that the query processing time increases for both algorithms as the value of $k$ increases. This is expected because, with an increase in $k$, more data objects need to be explored and verified. Nevertheless, COSK significantly outperforms CMTkSK+ for two main reasons. First, the search for relevant objects is very efficient when using the highest significance factor, and second, COSK does not need to verify the set of answer objects as long as the query object lies in a safe region. On the other hand, the CMTkSK+ query processing time increases significantly because it has to monitor and verify the set of candidate objects periodically. In Figure 9(b), the communication costs for both algorithms increase as the number of result objects increases. However, the proposed algorithm demonstrates superior performance compared to CMTkSK+ because client-server communication is not required while the query object remains within the safe region delimited by the safe exit points, whereas in CMTkSK+ the query object is required to report its location to the server whenever it moves.
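The communication saving described above can be illustrated with a toy count of transferred points. Everything in this sketch is an assumption made for illustration: the query moves along a one-dimensional road, the safe region is a hypothetical 10-unit interval with two exit points, and the baseline is assumed to exchange one location update and $k$ result points at every timestamp. The resulting counts are not the measurements reported in Figure 9.

```python
def periodic_cost(positions, k):
    """Baseline assumption: one location update and k result points per timestamp."""
    return len(positions) * (1 + k)


def safe_exit_cost(positions, safe_interval, k):
    """The client contacts the server only after leaving its current safe interval."""
    total = 0
    current = None
    for x in positions:
        if current is None or not (current[0] <= x < current[1]):
            current = safe_interval(x)   # server recomputes results and exits
            total += 1 + k + 2           # location update, k results, 2 exit points
    return total


def ten_unit_interval(x):
    """Hypothetical safe region: the 10-unit stretch containing x."""
    start = 10 * (x // 10)
    return (start, start + 10)


positions = list(range(100))             # one position per timestamp
baseline = periodic_cost(positions, k=5)
with_safe_exits = safe_exit_cost(positions, ten_unit_interval, k=5)
```

In this toy setting the safe-exit client contacts the server once per interval rather than once per timestamp, so its total grows with the number of safe regions crossed instead of with the number of timestamps.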
7.2.2. Effect of $N_D$. This experiment was conducted on the San Joaquin dataset. This dataset included 19098 data objects; therefore, we randomly generated approximately 30000 additional data objects on different edges. In Figure 10, we evaluate the performance of COSK and CMTkSK+ by varying the cardinality of the data objects. Note that $N_D = 10K$ corresponds to a low density of data points, while $N_D = 50K$ corresponds to a high density. In Figure 10(a), it is interesting to notice that the query processing times of both algorithms decrease as the cardinality of the data objects increases. For CMTkSK+, this is because, with a high density, the monitoring range of a query decreases. For COSK, however, it is mainly because, when the data density is high, fewer edges need to be expanded, which decreases the query processing time. In Figure 10(b), we study the influence of the cardinality of the data objects on the communication costs. The experimental results indicate that CMTkSK+ incurs almost constant communication costs regardless of the data object cardinality. However, the communication costs of COSK increase in proportion to the $N_D$ value. This is expected because the safe region becomes smaller as the density of the data objects increases, which increases the communication costs.
7.2.3. Effect of Query Keywords (n). Figure 11 shows the query processing time and communication cost for COSK and CMTkSK+ as a function of the number of query keywords. Figures 11(a) and 11(b) show that the performance of both algorithms degrades when the number of keywords increases. This is mainly because, by increasing the number of query keywords, the number of relevant objects may also increase, resulting in a higher query processing time and communication cost. However, the safe-region-based algorithm COSK scales better than CMTkSK+ because of its less expensive monitoring technique.
Figure 9: Effect of $k$ on query processing time and number of edges processed. (a) Query processing time (s) versus $k$ for COSK and CMTkSK+. (b) Communication cost (number of messages transferred) versus $k$ for COSK and CMTkSK+.
Figure 10: Effect of $N_D$ on query processing time and communication cost. (a) Query processing time (s) versus $N_D$ for COSK and CMTkSK+. (b) Communication cost (number of messages transferred) versus $N_D$ for COSK and CMTkSK+.
7.2.4. Effect of $\alpha$. Figure 12 demonstrates the impact of the query parameter $\alpha$ on the query processing time and on the communication cost. A small value of $\alpha$ indicates a greater importance of textual relevance, whereas a high value of $\alpha$ gives more preference to spatial relevance. It is interesting to note that the query processing time is lower for higher values of $\alpha$, which give more importance to spatial relevance. This is mainly because, when spatial relevance is weighted more heavily, fewer edges and objects need to be explored and processed to determine the top-k data objects. Observe that, in Figure 12(b), the number of messages sent by COSK decreases sharply with an increase in $\alpha$.
7.2.5. Effect of Speed. Figure 13(a) demonstrates the influence of the speed of the query objects on the query processing time of the COSK and CMTkSK+ algorithms. The experimental results indicate that the performance of CMTkSK+ is not significantly influenced by the speed of the query objects because the candidate objects must be monitored at regular time intervals regardless of the speed. On the other hand, the performance of COSK gradually degrades as the speed of the query objects increases because the query objects leave their respective safe regions more frequently. Figure 13(b) shows the communication costs of COSK and CMTkSK+ with respect to the speed of the query objects. CMTkSK+ incurs almost constant communication costs because a server-initiated request to verify the candidate objects does not depend on the speed. For COSK, the query objects cross safe regions more frequently when the speed is high, which increases the communication costs.
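The role of $\alpha$ discussed in Section 7.2.4 can be seen directly in the score function $\psi(d) = \mu / (1 + \alpha \cdot \lambda)$. The sketch below ranks two hypothetical objects, one textually strong but far away and one textually weak but nearby, under the $\alpha$ values of Table 5. The object names, relevance values, and distances are invented for illustration.

```python
def score(relevance, alpha, distance):
    """Ranking score: textual relevance damped by network distance."""
    return relevance / (1.0 + alpha * distance)


def top1(objects, alpha):
    """objects: list of (name, textual relevance, network distance)."""
    return max(objects, key=lambda o: score(o[1], alpha, o[2]))[0]


objects = [
    ("far_but_relevant", 0.9, 10.0),
    ("near_but_weak", 0.5, 1.0),
]
winners = {alpha: top1(objects, alpha) for alpha in (0.01, 0.1, 1, 10, 100)}
```

With the smallest $\alpha$ the distance term barely matters and the textually stronger object wins. With the largest $\alpha$ the nearby object wins, which is consistent with the observation that a spatially dominated ranking can be resolved by exploring only the edges close to the query.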
Figure 11: Effect of number of keywords on query processing time and communication cost. (a) Query processing time (s) versus number of keywords for COSK and CMTkSK+. (b) Communication cost (number of messages transferred) versus number of keywords for COSK and CMTkSK+.
Figure 12: Effect of $\alpha$ on query processing time and communication cost. (a) Query processing time (s) versus $\alpha$ for COSK and CMTkSK+. (b) Communication cost (number of messages transferred) versus $\alpha$ for COSK and CMTkSK+.
7.2.6. Effect of Mobility. Figure 14 shows the effect of mobility $M_{qry}$ (mobility refers to the percentage of query objects that are moving at any timestamp) on the performance of the COSK and CMTkSK+ algorithms. As expected, the query processing time and communication costs for both algorithms increase with $M_{qry}$. Nevertheless, COSK performs better than CMTkSK+ in terms of both query processing time and communication costs.
7.2.7. Effect of Directed Edges. Figure 15 shows the impact of the percentage of directed edges $E_{dir}$ on the performance of the COSK and CMTkSK+ algorithms. The query processing time increases with $E_{dir}$ because the algorithms need to explore more edges to retrieve the top-k results. However, the communication cost is not significantly affected by the value of $E_{dir}$ for either algorithm.
7.2.8. Effect of Datasets. Figure 16 shows the index sizes of the COSK and CMTkSK+ approaches for different datasets. As shown in Figure 16, both algorithms have similar index sizes. However, COSK has a minor space overhead because it stores additional information, namely, the highest significance factor $\theta_t$ of the edges. More importantly, this space overhead is minimal compared to the gain achieved by COSK in query processing time and communication costs.
Figure 13: Effect of speed on query processing time and communication cost. (a) Query processing time (s) versus $V_{qry}$ for COSK and CMTkSK+. (b) Communication cost (number of messages transferred) versus $V_{qry}$ for COSK and CMTkSK+.
Figure 14: Effect of mobility on query processing time and communication cost. (a) Query processing time (s) versus $M_{qry}$ for COSK and CMTkSK+. (b) Communication cost (number of messages transferred) versus $M_{qry}$ for COSK and CMTkSK+.
7.3. Experimental Results of Top-k Spatial Keyword Queries in Dynamic Road Networks. In this section, we evaluate the performance of COSK and the basic algorithm for dynamic road networks. $E_{upd}$ indicates the percentage of all edges that change their weight at each timestamp. The length of an updated edge is randomly selected between 0.1 and 10 times the original length. Figure 17(a) depicts the query processing time of COSK and the basic algorithm. It is evident from the figure that the query processing time of the basic algorithm is not significantly affected by $E_{upd}$. This is mainly because the query objects issue top-k spatial keyword queries at each timestamp. However, the query processing time of COSK increases with the value of $E_{upd}$ because the probability that an updated edge is associated with the monitoring region of query $q$ increases with $E_{upd}$. Therefore, when $E_{upd}$ becomes large, the results need to be updated frequently, which increases the query processing time. Figure 17(b) shows the communication costs of COSK and the basic algorithm with respect to $E_{upd}$. The basic algorithm incurs almost constant communication costs regardless of the value of $E_{upd}$. In contrast, the communication cost of COSK increases with $E_{upd}$ because the query results and safe regions need to be updated frequently.
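The update workload described in this section can be sketched as follows. The text states only that the new length is randomly selected between 0.1 and 10 times the original length, so the uniform distribution used here is an assumption, as are the function name `apply_random_updates` and the small chain-shaped network.

```python
import random


def apply_random_updates(original, current, ratio, rng):
    """Rescale a random fraction of edges relative to their original weights.

    original/current: dicts mapping edge -> weight; current is modified in place.
    ratio: fraction of edges to update in this timestamp (e.g. 0.30 for 30%).
    Returns the list of edges that were updated.
    """
    count = round(len(original) * ratio)
    chosen = rng.sample(sorted(original), count)
    for edge in chosen:
        # Assumption: the scaling factor is drawn uniformly from [0.1, 10].
        current[edge] = original[edge] * rng.uniform(0.1, 10.0)
    return chosen


# Hypothetical network of 10 edges with original lengths 1.0 to 10.0.
original = {("n%d" % i, "n%d" % (i + 1)): float(i + 1) for i in range(10)}
current = dict(original)
updated = apply_random_updates(original, current, 0.30, random.Random(7))
```

Calling the function once per timestamp with `ratio` set to $E_{upd}$ reproduces the kind of workload described above. Each updated edge can then be checked against the monitoring region of every registered query.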
Figure 15: Effect of $E_{dir}$ on query processing time and communication cost. (a) Query processing time (s) versus $E_{dir}$. (b) Communication cost (number of messages transferred) versus $E_{dir}$.
Figure 16: Effect of dataset on index size (index size in MB for COSK and CMTkSK+ on the Oldenburg, San Francisco, and San Joaquin datasets).
8. Conclusion
In this paper, we investigated moving top-k spatial keyword queries in directed and dynamic road networks. We presented an efficient indexing framework using inverted files that indexes the data objects on edges, allowing for the effective searching of data objects relevant to queries in terms of both textual and spatial relevance. We also presented a safe-exit-based algorithm called COSK to monitor moving top-k spatial keyword queries. We demonstrated that the query results remain valid as long as the query object resides within a safe region. Furthermore, COSK can effectively monitor the validity of query results and safe regions in dynamic road networks. Finally, an experimental evaluation conducted on real road networks demonstrated that COSK significantly reduced the query processing time and communication costs compared to the CMTkSK+ algorithm.
Data Availability
The real road network data used in this study are also used in many previous studies. The road network data are cited in the manuscript and are available at https://www.cs.utah.edu/~lifeifei/SpatialDataset.htm. To simulate the moving queries, the authors used a spatiotemporal data generator, which is also used in previous studies. The research article describing the generator is cited in the manuscript. The documentation and source files of the generator are available at https://iapg.jade-hs.de/personen/brinkhoff/generator. The authors used Twitter tweets to generate the descriptions of the data objects and the query keywords. The tweets used are accessible at http://followthehashtag.com/datasets/free-twitter-dataset-usa-200000-free-usa-tweets.
Figure 17: Effect of $E_{upd}$ on query processing time and communication cost. (a) Query processing time (s) versus $E_{upd}$ for COSK and the basic algorithm. (b) Communication cost (number of messages transferred) versus $E_{upd}$ for COSK and the basic algorithm.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
Hyung-Ju Cho was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean Government (MSIP) (NRF-2016R1A2B4009793), and this research was partially supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2016R1D1A1B03934129).
References
[1] D. Papadias, N. Mamoulis, J. Zhang, and Y. Tao, “Query processing in spatial network databases,” in Proceedings of the 29th International Conference on Very Large Data Bases (VLDB ’03), pp. 802–813, September 2003.
[2] H.-J. Cho, K. Ryu, and T.-S. Chung, “An efficient algorithm for computing safe exit points of moving range queries in directed road networks,” Information Systems, vol. 41, pp. 1–19, 2014.
[3] G. Tsatsanifos and A. Vlachou, “On processing Top-k spatio-textual preference queries,” in Proceedings of the 18th International Conference on Extending Database Technology (EDBT ’15), pp. 433–444, March 2015.
[4] R. Li, A. X. Liu, A. L. Wang, and B. Bruhadeshwar, “Fast range query processing with strong privacy protection for cloud computing,” Proceedings of the VLDB Endowment, vol. 7, no. 14, pp. 1953–1964, 2014.
[5] G. Cong, C. S. Jensen, and D. Wu, “Efficient retrieval of the Top-k most relevant spatial web objects,” Proceedings of the VLDB Endowment, vol. 2, no. 1, pp. 337–348, 2009.
[6] Z. Li, K. C. K. Lee, B. Zheng, W.-C. Lee, D. Lee, and X. Wang, “IR-tree: An efficient index for geographic document search,” IEEE Transactions on Knowledge and Data Engineering, vol. 23, no. 4, pp. 585–599, 2011.
[7] Y. Zhou, X. Xie, C. Wang, Y. Gong, and W. Ma, “Hybrid index structures for location-based web search,” in Proceedings of the 14th ACM International Conference on Information and Knowledge Management, pp. 155–162, Bremen, Germany, October 2005.
[8] J. Zobel and A. Moffat, “Inverted files for text search engines,” ACM Computing Surveys, vol. 38, no. 2, 2006.
[9] N. Beckmann, H. Kriegel, R. Schneider, and B. Seeger, “R*-tree: an efficient and robust access method for points and rectangles,” in Proceedings of the ACM SIGMOD International Conference on Management of Data, vol. 19, pp. 322–331, May 1990.
[10] R. Hariharan, B. Hore, C. Li, and S. Mehrotra, “Processing spatial-keyword (sk) queries in geographic information retrieval (gir) systems,” in Proceedings of the 19th International Conference on Scientific and Statistical Database Management (SSDBM ’07), July 2007.
[11] I. De Felipe, V. Hristidis, and N. Rishe, “Keyword search on spatial databases,” in Proceedings of the 24th International Conference on Data Engineering (ICDE ’08), pp. 656–665, April 2008.
[12] J. B. Rocha-Junior, O. Gkorgkas, S. Jonassen, and K. Nørvåg, “Efficient processing of top-k spatial keyword queries,” in Proceedings of the International Symposium on Spatial and Temporal Databases, pp. 205–222, Springer, 2011.
[13] D. Zhang, K.-L. Tan, and A. K. Tung, “Scalable top-k spatial keyword search,” in Proceedings of the 16th International Conference on Extending Database Technology, pp. 359–370, 2013.
[14] J. B. Rocha-Junior and K. Nørvåg, “Top-k spatial keyword queries on road networks,” in Proceedings of the 15th International Conference on Extending Database Technology, pp. 168–179, Berlin, Germany, March 2012.
[15] H.-J. Cho, S. J. Kwon, and T.-S. Chung, “A safe exit algorithm for continuous nearest neighbor monitoring in road networks,” Mobile Information Systems, vol. 9, no. 1, pp. 37–53, 2013.
[16] D. Yung, M. L. Yiu, and E. Lo, “A safe-exit approach for efficient network-based moving range queries,” Data & Knowledge Engineering, vol. 72, pp. 126–147, 2012.
[17] M. Attique, H. Cho, R. Jin, and T. Chung, “Efficient Processing of Continuous Reverse k Nearest Neighbor on Moving Objects in Road Networks,” ISPRS International Journal of Geo-Information, vol. 5, no. 12, p. 247, 2016.
[18] H. G. Elmongui, M. F. Mokbel, and W. G. Aref, “Continuous aggregate nearest neighbor queries,” GeoInformatica, vol. 17, no. 1, pp. 63–95, 2013.
[19] D. Wu, M. L. Yiu, C. S. Jensen, and G. Cong, “Efficient continuously moving top-k spatial keyword query processing,” in Proceedings of the IEEE International Conference on Data Engineering (ICDE ’11), pp. 541–552, Hannover, Germany, April 2011.
[20] W. Huang, G. Li, K.-L. Tan, and J. Feng, “Efficient safe-region construction for moving top-k spatial keyword queries,” in Proceedings of the 21st ACM International Conference on Information and Knowledge Management, pp. 932–941, 2012.
[21] L. Guo, J. Shao, H. H. Aung, and K.-L. Tan, “Efficient continuous top-k spatial keyword queries on road networks,” GeoInformatica, vol. 19, no. 1, pp. 29–60, 2014.
[22] Y. Li, G. Li, L. Shu, Q. Huang, and H. Jiang, “Continuous monitoring of top-k spatial keyword queries in road networks,” Journal of Information Science and Engineering, vol. 31, no. 6, pp. 1831–1848, 2015.
[23] M. Attique, A. Khan, and T.-S. Chung, “ESPAK: Top-k spatial keyword query processing in directed road networks,” in Proceedings of the Workshops of the International Conference on Extending Database Technology and the International Conference on Database Theory (EDBT/ICDT ’17), March 2017.
[24] G. Salton and C. Buckley, “Term-weighting approaches in automatic text retrieval,” Information Processing & Management, vol. 24, no. 5, pp. 513–523, 1988.
[25] V. N. Anh, O. de Kretser, and A. Moffat, “Vector-space ranking with effective early termination,” in Proceedings of the 24th Annual International ACM SIGIR Conference, pp. 35–42, New Orleans, LA, USA, 2001.
[26] E. W. Dijkstra, “A note on two problems in connexion with graphs,” Numerische Mathematik, vol. 1, pp. 269–271, 1959.
[27] “Real datasets for spatial databases,” https://www.cs.utah.edu/~lifeifei/SpatialDataset.htm.
[28] “Twitter,” https://twitter.com/.
[29] T. Brinkhoff, “A framework for generating network-based moving objects,” GeoInformatica, vol. 6, no. 2, pp. 153–180, 2002.
Scor
e
05 10 15 20 25 30
(d6) = 1(x + 3)
(d3) = 1(minusx + 7)
Figure 6 Skyline graph for 119896 = 1 on the road segment (1198997 1198993)
a variable x (0 le 119909 le 3) We can write 120582(1198893 119901) =119889119894119904119905(1198893 119901) = 6 minus 119909 and 120582(1198896 119901) = 119889119894119904119905(1198896 119901) = 2 + 119909 Thenwe can represent score function 120595(1198893) and 120595(1198896) as follows
120595(1198893) = 120583(1198893119905 119902119905)(1 + 120572 sdot 120582(1198893 119901)) = 1(7 minus 119909) for(0 le 119909 le 3)
Wireless Communications and Mobile Computing 11
$\psi(d_6) = \dfrac{\mu(d_6.t, q.t)}{1 + \alpha \cdot \lambda(d_6, p)} = \dfrac{1}{3 + x}$ for $0 \le x \le 3$.

Finally, we present a lemma to prove that the safe exit points computed by COSK are correct.

Lemma 8. The COSK algorithm correctly computes a set of safe exit points.
Proof. We prove the correctness of the COSK algorithm by contradiction. Assume that $D^+_{p_a} \ne D^+_{n_\beta}$ but there is no safe exit point in the road segment $(p_a, n_\beta)$. This means that, for each point $p$ in the road segment $(p_a, n_\beta)$, the query result at $p$ equals $D^+_{p_a}$, i.e., $D^+_p = D^+_{p_a}$ for all $p \in (p_a, n_\beta)$. However, this leads to the contradiction that $D^+_{n_\beta} = D^+_{p_a}$ when $p = n_\beta$. Therefore, if $D^+_{p_a} \ne D^+_{n_\beta}$, a safe exit point exists in $(p_a, n_\beta)$. In addition, when $D^+_{p_a} \ne D^+_{n_\beta}$, a safe exit point is determined using the skyline $D^+_l$ for answer objects and the skyline $D^-_h$ with the highest score for nonanswer objects. The first skyline is a composite polyline drawn from the answer objects in $D^+_{p_a}$. The second skyline is a composite polyline drawn from the nonanswer objects in $D^+_{n_\beta} \cup D(p_a, n_\beta) - D^+_{p_a}$.
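As a sanity check on the Figure 6 example, the following minimal sketch computes where the two score curves cross, which is where the top-1 answer switches from $d_6$ to $d_3$. It assumes the score form $\psi = \mu / (1 + \alpha \cdot \lambda)$ with $\mu = 1$ and $\alpha = 1$, as in the example; the function and parameter names are illustrative and are not part of COSK.

```python
def score(mu, alpha, dist):
    """Ranking score psi = mu / (1 + alpha * dist), as in the worked example."""
    return mu / (1.0 + alpha * dist)


def crossing_offset(mu_receding, dist_receding0,
                    mu_approaching, dist_approaching0, alpha, length):
    """Offset x from the start node at which two score curves intersect.

    The "receding" object has distance dist_receding0 + x at offset x;
    the "approaching" object has distance dist_approaching0 - x.
    Assumes alpha > 0 and positive relevances (hypothetical helper).
    Returns None when the curves do not cross inside [0, length].
    """
    # Equal scores:
    # mu_r * (1 + alpha*(dist_a0 - x)) = mu_a * (1 + alpha*(dist_r0 + x))
    numerator = (mu_receding * (1 + alpha * dist_approaching0)
                 - mu_approaching * (1 + alpha * dist_receding0))
    denominator = alpha * (mu_receding + mu_approaching)
    x = numerator / denominator
    return x if 0 <= x <= length else None


# Figure 6 values: segment (n7, n3) has length 3, alpha = 1, relevances are 1.
# d6 is reached through n7 (distance 2 + x); d3 through n3 (distance 6 - x).
x_exit = crossing_offset(mu_receding=1, dist_receding0=2,
                         mu_approaching=1, dist_approaching0=6,
                         alpha=1, length=3)
print(x_exit)  # 2.0
print(score(1, 1, 2 + x_exit), score(1, 1, 6 - x_exit))  # 0.2 0.2
```

Under these assumptions the curves $1/(3 + x)$ and $1/(7 - x)$ meet at $x = 2$, so $d_6$ is the top-1 answer for $x < 2$ and $d_3$ for $x > 2$.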
6. Monitoring Query Results and Safe Regions in Dynamic Directed Road Networks
In this section, we discuss the monitoring of spatial keyword queries in dynamic road networks, where the network distance changes depending on the traffic conditions. Updates to the weights of some edges may invalidate the query results or the safe region of $q$ even though the query object $q$ remains within its safe region. Figure 7 illustrates an example of changing the weights of edges $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$. For convenience, we consider $\alpha = 1$ and $q.t$ = “Italian restaurant”. In Figure 7(a), the top-1 result is $d_1$, and the bold lines show the safe region of query $q$. Now consider that, at time $t_j$, the weights of the two edges $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$ change due to heavy traffic, as shown in Figure 7(b). Such an update may invalidate the query result or the safe region of $q$. Therefore, it is necessary to monitor the validity of the results and the safe region when changes occur.
Next, we introduce a monitoring region to monitor the validity of the safe region effectively when the weight of an edge changes. The monitoring region $MR$ contains all the points between the query point $q$ and the lowest answer object and between $q$ and the highest nonanswer object. Formally, it is defined as $MR = dist(q, D^+_l) \cup dist(q, D^-_h)$, where $dist(q, D^+_l)$ is the distance between $q$ and the lowest answer object and $dist(q, D^-_h)$ is the distance between $q$ and the highest nonanswer object. In the given example, $D^+_l = \{d_1\}$ and $D^-_h = \{d_2, d_3\}$. Therefore, the dotted lines in Figure 8(a) show the monitoring region of query object $q$.
Now, at time $t_j$, the updates to edges $\overleftarrow{(n_1, n_6)}$ and $\overleftarrow{(n_1, d_1)}$, which are not part of the monitoring region, can safely be ignored. However, the update on segment $\overrightarrow{(n_2, d_1)}$, which is associated with the monitoring region, may nullify the results. As shown in Figure 8(b), after the update, the top-1 result becomes $d_2$, and the bold lines represent the new safe region of $q$.
Algorithm 5 monitors the validity of the result set and safe region of query object $q$ when the weight of any edge changes. Suppose that the weight of edge $(n_i, n_j)$ changes at time $t_j$. First, the algorithm checks whether edge $(n_i, n_j)$ is associated with the monitoring region. If it is not part of the monitoring region, the algorithm simply ignores the update to edge $(n_i, n_j)$, and the query results and safe region remain valid. In contrast, if the edge is associated with the monitoring region (i.e., $MR \cap (n_i, n_j) \ne \emptyset$), the algorithm re-evaluates the query results, and the top-k results and safe region of query $q$ are updated accordingly. Finally, the algorithm updates the monitoring region of $q$.
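The control flow described above can be sketched as follows. This is only an illustration under simplifying assumptions: the monitoring region is modelled as a set of whole directed edges (in the paper it may cover parts of edges), and `QueryState`, `on_weight_update`, and the `recompute` callback are hypothetical names standing in for the snapshot query, safe exit, and monitoring region computations.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Tuple

Edge = Tuple[str, str]  # directed edge written as (tail, head)


@dataclass(frozen=True)
class QueryState:
    results: Tuple[str, ...]            # current top-k answer objects
    safe_exits: Tuple[str, ...]         # labels of the current safe exit points
    monitoring_region: FrozenSet[Edge]  # edges covered by MR (simplification)


def on_weight_update(state: QueryState, edge: Edge,
                     recompute: Callable[[], QueryState]
                     ) -> Tuple[QueryState, bool]:
    """Return (state, recomputed?) after the weight of `edge` changes."""
    if edge not in state.monitoring_region:
        # MR and the edge do not intersect: results and safe region stay valid.
        return state, False
    # The edge touches MR: re-evaluate results, safe exits, and MR itself.
    return recompute(), True


# Hypothetical monitoring region loosely modelled on the Figure 8 example.
before = QueryState(results=("d1",),
                    safe_exits=("pse1", "pse2", "pse3"),
                    monitoring_region=frozenset({("q", "n2"), ("n2", "d1")}))
after = QueryState(results=("d2",),
                   safe_exits=("pse1", "pse2", "pse3"),
                   monitoring_region=frozenset({("q", "n3")}))

print(on_weight_update(before, ("n1", "n6"), lambda: after)[1])  # False
print(on_weight_update(before, ("n2", "d1"), lambda: after)[1])  # True
```

Passing the recomputation as a callback keeps the cheap membership test separate from the expensive re-evaluation, which is the point of maintaining a monitoring region in the first place.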
7. Performance Evaluation
In this section, we evaluate the performance of COSK through simulation experiments. We describe our experimental settings in Section 7.1, and we present our experimental results for static and dynamic road networks in Sections 7.2 and 7.3, respectively.
7.1. Experimental Settings. All of our experiments were performed using real road networks, namely, Oldenburg, San Francisco, and San Joaquin. All three road networks were obtained from [27]. The original road network of San Francisco had 21,047 nodes and 21,692 edges. We reformatted the network, pruned approximately 30% of the nodes, and adjusted the edges and their weights accordingly. This resulted in a network with 14,732 nodes and 14,316 edges. Both the direction of edges and the data objects on the edges were generated randomly. The description of each data object was extracted from Twitter messages [28], and we assigned one tweet per data object. Table 4 presents the characteristics of the datasets used in the experimental evaluation. We simulated moving query objects by using a spatiotemporal data generator [29]. The input to the generator was the road network of the dataset used, and the output was the set of query objects moving on the road network. Each experiment had 100 moving queries, which were continuously monitored for 100 timestamps (1 timestamp = 1 second), and the average result was reported in the experiments.
Figure 7: Updating the weight of edges in a dynamic road network, where $t_i < t_j$. (a) Safe region at time $t_i$. (b) Updating the weight of $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$ at time $t_j$.

Figure 8: Monitoring region and updated safe region at time $t_j$. (a) Monitoring region at time $t_i$. (b) New safe region at time $t_j$.

Algorithm 5: MonitoringSafeRegion($MR$, $(n_i, n_j)$)
Input: monitoring region $MR$, updated edge $(n_i, n_j)$
Output: none
if $MR \cap (n_i, n_j) = \emptyset$ then
    /* edge $(n_i, n_j)$ is not part of the monitoring region */
    ignore the change in the weight of edge $(n_i, n_j)$
else
    $P_{SE} \leftarrow \emptyset$ /* set of safe exit points */
    $D^{upd}_k \leftarrow$ EvaluateSnapshotQuery($n_i$, $e_i$) /* update set of top-k results */
    $P^{upd}_{SE} \leftarrow$ ComputeSafeExit($p_a$, $n_\beta$) /* update safe exit points */
    $MR^{upd} \leftarrow$ ComputeMonitoringRegion($D^+_l$, $D^-_h$) /* update monitoring region */
end

Table 4: Summary of datasets.

| Attribute | Oldenburg | San Francisco | San Joaquin |
| --- | --- | --- | --- |
| Total no. of nodes | 6,104 | 14,732 | 18,262 |
| Total no. of edges | 7,034 | 14,316 | 23,876 |
| Percentage of directed edges | 30 | 30 | 30 |
| Total no. of objects | 5,627 | 11,453 | 19,098 |
| Average no. of objects per edge | 0.8 | 0.8 | 0.8 |
| Total no. of words | 49,517 | 103,649 | 166,153 |

Table 5: Experimental parameter settings.

| Parameter | Range |
| --- | --- |
| Number of results ($k$) | 5, 10, 15, 20, 25 |
| Number of keywords ($n$) | 1, 2, 3, 4, 5 |
| Query parameter ($\alpha$) | 0.01, 0.1, 1, 10, 100 |
| Dataset | Oldenburg, San Francisco, San Joaquin |
| Number of data objects ($N_D$) | 10, 20, 30, 40, 50 (×1000) |
| Speed of query objects ($V_{qry}$) | 25, 50, 75, 100, 125 (km/h) |
| Mobility ($M_{qry}$) | 20, 40, 60, 80, 100 (%) |
| Ratio of directed edges ($E_{dir}$) | 10, 20, 30, 40, 50 (%) |
| Ratio of updated edges ($E_{upd}$) | 15, 30, 60, 80, 100 (%) |

As a benchmark for COSK in static road networks, we implemented the algorithm of [22], which also continuously monitors moving top-k spatial keyword queries in road networks. However, this algorithm was originally designed for undirected road networks. To make a fair comparison, we modified it to process top-k spatial keyword queries in directed road networks and refer to the modified version as CMTkSK+. Specifically, we modified the distance computation between two points such that, in directed road networks, $dist(p_1, p_2) \ne dist(p_2, p_1)$ in general. Since CMTkSK+ does not handle top-k spatial queries in dynamic road networks, we compared the performance of COSK in that setting with a basic algorithm that recomputes the results whenever a query object changes its location. All algorithms were implemented in Java and executed on a desktop PC with a 2.80-GHz Intel Core i5 and 8 GB of memory. In the experiments, we compared (1) query processing times, (2) edges processed, i.e., the number of edges processed for retrieving query results, and (3) index sizes. Table 5 summarizes the parameters used in the experiments. In each experiment, we varied a single parameter within the range shown in Table 5 while keeping the other parameters at their default values.
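The distance asymmetry introduced for the directed benchmark in this subsection can be illustrated with a small shortest-path sketch. The three-node network and its weights are invented for illustration, and the function is a textbook Dijkstra with a binary heap rather than the authors' Java implementation. A binary heap gives $O((|E| + |N|) \log |N|)$ time; the $O(|E| + |N| \log |N|)$ bound is the one obtained with a Fibonacci-heap priority queue.

```python
import heapq


def dijkstra(adj, source):
    """Shortest network distances from `source` in a directed graph.

    adj maps a node to a list of (neighbor, weight) pairs for outgoing edges.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Toy one-way ring (made-up weights): a -> b -> c -> a.
adj = {"a": [("b", 2.0)], "b": [("c", 2.0)], "c": [("a", 3.0)]}

print(dijkstra(adj, "a")["b"])  # 2.0
print(dijkstra(adj, "b")["a"])  # 5.0
```

Here the distance from a to b is 2 while the distance from b to a is 5, because the return trip has to follow the one-way edges through c.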
We evaluated the performance of the algorithms by using the following measures: (1) the total amount of server CPU time, which indicates the query processing time, and (2) the total communication cost, measured as the total number of points (i.e., the location updates sent by query objects and the query results and safe exit points returned by the server) transferred between clients and the server. Battery power and wireless bandwidth consumption typically increase with the amount of data transferred between objects (clients) and servers. Thus, we used the amount of transferred data as a metric to evaluate the communication cost.
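The communication cost metric defined above is a plain count of transferred points, which the following sketch makes explicit. The `Exchange` record and all counts in the usage example are made up for illustration; they are not measurements from the experiments.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Exchange:
    location_updates: int  # points sent by the query object to the server
    result_points: int     # answer objects returned by the server
    safe_exit_points: int  # safe exit points returned by the server


def communication_cost(exchanges: Iterable[Exchange]) -> int:
    """Total number of points transferred between clients and the server."""
    return sum(e.location_updates + e.result_points + e.safe_exit_points
               for e in exchanges)


# Toy comparison with invented counts, k = 5 over 100 timestamps:
# a client that reports its location at every timestamp ...
periodic = [Exchange(1, 5, 0)] * 100
# ... versus a client that contacts the server only on the 4 occasions
# it leaves its safe region, receiving 3 safe exit points each time.
safe_exit_based = [Exchange(1, 5, 3)] * 4

print(communication_cost(periodic))         # 600
print(communication_cost(safe_exit_based))  # 36
```

Each safe-exit exchange is larger than a periodic one, but far fewer of them are needed as long as the query object stays inside its safe region.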
7.2. Experimental Results of Top-k Spatial Keyword Queries in Static Road Networks
7.2.1. Effect of k. Figure 9 indicates the effect of the number of results on the query processing time and communication cost for both algorithms. Figure 9(a) indicates that the query processing time increases for both algorithms as the value of k increases. This is expected because, with an increase in k, more data objects need to be explored and verified. Nevertheless, COSK significantly outperforms CMTkSK+ for two main reasons. First, the search for relevant objects is very efficient when using the highest significance factor, and second, COSK does not need to verify the set of answer objects as long as the query object lies in a safe region. On the other hand, the CMTkSK+ query processing time increases significantly because it has to monitor and verify the set of candidate objects periodically. In Figure 9(b), the communication costs for both algorithms increase as the number of objects increases. However, the proposed algorithm demonstrates superior performance compared to CMTkSK+ because client-server communication is not required while the query object lies within the safe exit points, whereas in CMTkSK+ the query object is required to report its location to the server whenever it moves.
7.2.2. Effect of $N_D$. This experiment was conducted on the San Joaquin dataset. This dataset included 19,098 data objects; therefore, we randomly generated approximately 30,000 additional data objects on different edges. In Figure 10, we evaluate the performance of COSK and CMTkSK+ by varying the cardinality of the data objects. Note that $N_D = 10K$ corresponds to a low density of data points, while $N_D = 50K$ corresponds to a high density. In Figure 10(a), it is interesting to notice that the query processing times of both algorithms decrease as the cardinality of the data objects increases. For CMTkSK+, this is because the monitoring range of a query decreases at high density. For COSK, it is mainly because, when the data density is high, fewer edges need to be expanded, which decreases the query processing time. In Figure 10(b), we study the influence of the cardinality of the data objects on the communication costs. The experimental results indicate that CMTkSK+ incurs almost constant communication costs regardless of the data object cardinality. However, the communication costs of COSK increase in proportion to the $N_D$ value. This is expected because the safe region becomes smaller as the density of the data objects increases, which increases the communication costs.
7.2.3. Effect of Query Keywords (n). Figure 11 shows the query processing time and communication cost of COSK and CMTkSK+ as a function of the number of query keywords. Figures 11(a) and 11(b) show that the performance of both algorithms degrades when the number of keywords increases. This is mainly because, with more query keywords, the number of relevant objects may also increase, resulting in a higher query processing time and communication cost. However, the safe-region-based algorithm COSK scales better than CMTkSK+ because of its less expensive monitoring technique.
Figure 9: Effect of k on query processing time and number of edges processed. (a) Query processing time (s) versus k. (b) Communication cost (number of messages transferred) versus k. Curves: COSK and CMTkSK+.

Figure 10: Effect of $N_D$ on query processing time and communication cost. (a) Query processing time (s) versus $N_D$. (b) Communication cost (number of transferred messages) versus $N_D$. Curves: COSK and CMTkSK+.

7.2.4. Effect of $\alpha$. Figure 12 demonstrates the impact of the query parameter $\alpha$ on the query processing time and on the communication cost. A small value of $\alpha$ gives greater importance to textual relevance, whereas a high value of $\alpha$ gives more preference to spatial relevance. It is interesting to note that the query processing time is lower for higher values of $\alpha$, which give more importance to spatial relevance. This is mainly because, when the spatial relevance is weighted more heavily, fewer edges and objects need to be explored and processed to determine the top-k data objects. Observe that, in Figure 12(b), the number of messages sent by COSK decreases sharply with an increase in $\alpha$.

7.2.5. Effect of Speed. Figure 13(a) demonstrates the influence of the speed of the query objects on the query processing time of the COSK and CMTkSK+ algorithms. The experimental results indicate that the performance of CMTkSK+ is not significantly influenced by the speed of the query objects because the candidate objects must be monitored at regular intervals regardless of the speed. On the other hand, the performance of COSK gradually degrades as the speed of the query objects increases because the objects leave their respective safe regions more frequently. Figure 13(b) shows the communication costs of COSK and CMTkSK+ with respect to the speed of the query objects. CMTkSK+ incurs almost constant communication costs because a server-initiated request to verify the candidate objects does not depend on the speed. For COSK, the query objects cross safe regions more frequently when the speed is high, which increases the communication costs.
Figure 11: Effect of number of keywords on query processing time and communication cost. (a) Query processing time (s) versus number of keywords. (b) Communication cost (number of messages transferred) versus number of keywords. Curves: COSK and CMTkSK+.
Figure 12: Effect of $\alpha$ on query processing time and communication cost. (a) Query processing time (s) versus $\alpha$. (b) Communication cost (number of messages transferred) versus $\alpha$. Curves: COSK and CMTkSK+.
7.2.6. Effect of Mobility. Figure 14 shows the effect of mobility $M_{qry}$ (mobility refers to the percentage of query objects that are moving at any timestamp) on the performance of the COSK and CMTkSK+ algorithms. As expected, the query processing time and communication costs of both algorithms increase with $M_{qry}$. Nevertheless, COSK performs better than CMTkSK+ in terms of both query processing time and communication costs.
7.2.7. Effect of Directed Edges. Figure 15 shows the impact of the percentage of directed edges $E_{dir}$ on the performance of the COSK and CMTkSK+ algorithms. The query processing time increases with $E_{dir}$ because the algorithms need to explore more edges to retrieve the top-k results. However, the communication cost is not significantly affected by the value of $E_{dir}$ for either algorithm.
7.2.8. Effect of Datasets. Figure 16 demonstrates the index sizes of the COSK and CMTkSK+ approaches for different datasets. As shown in Figure 16, both algorithms have similar index sizes. However, COSK has a minor space overhead because it stores additional information, namely the highest significance factor $\theta_t$ of edges. More importantly, this space overhead is minimal compared to the gain achieved by COSK in query processing time and communication costs.
16 Wireless Communications and Mobile Computing
Figure 13: Effect of speed on query processing time and communication cost. (a) Query processing time (s) versus $V_{qry}$. (b) Communication cost (number of messages transferred) versus $V_{qry}$. Curves: COSK and CMTkSK+.
Figure 14: Effect of mobility on query processing time and communication cost. (a) Query processing time (s) versus $M_{qry}$. (b) Communication cost (number of messages transferred) versus $M_{qry}$. Curves: COSK and CMTkSK+.
7.3. Experimental Results of Top-k Spatial Keyword Queries in Dynamic Road Networks
In this section, we evaluate the performance of COSK and the basic algorithm for dynamic road networks. $E_{upd}$ indicates the percentage of all edges that change their weight at each timestamp. The length of an updated edge is randomly selected between 0.1 and 10 times the original length. Figure 17(a) depicts the query processing time of COSK and the basic algorithm. It is evident from the figure that the query processing time of the basic algorithm is not significantly affected by $E_{upd}$. This is mainly because the query objects issue top-k spatial queries at each timestamp. However, the query processing time of COSK increases with the value of $E_{upd}$ because the probability that an updated edge is associated with the monitoring region of query $q$ increases with $E_{upd}$. Therefore, when $E_{upd}$ becomes large, the results need to be updated frequently, which increases the query processing time. Figure 17(b) shows the communication costs of COSK and the basic algorithm with respect to $E_{upd}$. The basic algorithm incurs almost constant communication costs regardless of the value of $E_{upd}$. In contrast, the communication cost of COSK increases with $E_{upd}$ because the query result and safe regions need to be updated frequently.
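The update workload described in this subsection can be mimicked with a few lines of code. This is a sketch under stated assumptions: the text says only that the new length is "randomly selected" between 0.1 and 10 times the original, so the uniform scaling factor, the rounding of the number of updated edges, and the names `apply_random_updates` and `lengths` are all choices made here for illustration.

```python
import random


def apply_random_updates(lengths, e_upd_percent, rng):
    """Pick e_upd_percent % of the edges and rescale their weights.

    lengths: dict mapping a directed edge (tail, head) to its original length.
    Returns a dict holding the new weight of each updated edge only.
    Assumption: the scaling factor is drawn uniformly from [0.1, 10].
    """
    n_upd = round(len(lengths) * e_upd_percent / 100)
    chosen = rng.sample(sorted(lengths), n_upd)
    return {edge: lengths[edge] * rng.uniform(0.1, 10.0) for edge in chosen}


# Toy network of 20 edges with made-up lengths; E_upd = 15% updates 3 of them.
lengths = {("n%d" % i, "n%d" % (i + 1)): float(i + 1) for i in range(20)}
updates = apply_random_updates(lengths, 15, random.Random(7))
print(len(updates))  # 3
```

Only the edges returned by one timestamp's update would then need to be checked against each query's monitoring region.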
Figure 15: Effect of $E_{dir}$ on query processing time and communication cost. (a) Query processing time (s) versus $E_{dir}$. (b) Communication cost (number of messages transferred) versus $E_{dir}$. Curves: COSK and CMTkSK+.
Figure 16: Effect of dataset on index size. Index size (MB) of COSK and CMTkSK+ for the Oldenburg, San Francisco, and San Joaquin datasets.
8. Conclusion
In this paper, we investigated moving top-k spatial keyword queries in directed and dynamic road networks. We presented an efficient indexing framework using inverted files that indexes the data objects on edges, allowing for the effective searching of data objects relevant to queries in terms of both textual and spatial relevance. We also presented a safe-exit-based algorithm called COSK to monitor moving top-k spatial keyword queries. We demonstrated that the query results remain valid as long as the query object resides within a safe region. Furthermore, COSK can effectively monitor the validity of query results and safe regions in dynamic road networks. Finally, an experimental evaluation conducted on real road networks demonstrated that COSK significantly reduced the query processing time and communication costs compared to the CMTkSK+ algorithm.
Figure 17: Effect of $E_{upd}$ on query processing time and communication cost. (a) Query processing time (s) versus $E_{upd}$. (b) Communication cost (number of messages transferred) versus $E_{upd}$. Curves: COSK and Basic.

Data Availability
The real road network data used in this study are also used in many previous studies. The road network data are cited in the manuscript and are available at https://www.cs.utah.edu/~lifeifei/SpatialDataset.htm. To simulate the moving queries, the authors used the spatiotemporal data generator, which is also used in previous studies. The research article describing the generator is cited in the manuscript. The documentation and source files of the generator are available at https://iapg.jade-hs.de/personen/brinkhoff/generator/. The authors used Twitter tweets for generating the descriptions of data objects and the query keywords. The tweets used are accessible at http://followthehashtag.com/datasets/free-twitter-dataset-usa-200000-free-usa-tweets/.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
Hyung-Ju Cho was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean Government (MSIP) (NRF-2016R1A2B4009793), and this research was partially supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2016R1D1A1B03934129).
References
[1] D. Papadias, N. Mamoulis, J. Zhang, and Y. Tao, “Query processing in spatial network databases,” in Proceedings of the 29th International Conference on Very Large Data Bases (VLDB '03), pp. 802–813, September 2003.
[2] H.-J. Cho, K. Ryu, and T.-S. Chung, “An efficient algorithm for computing safe exit points of moving range queries in directed road networks,” Information Systems, vol. 41, pp. 1–19, 2014.
[3] G. Tsatsanifos and A. Vlachou, “On processing Top-k spatio-textual preference queries,” in Proceedings of the 18th International Conference on Extending Database Technology (EDBT '15), pp. 433–444, March 2015.
[4] R. Li, A. X. Liu, A. L. Wang, and B. Bruhadeshwar, “Fast range query processing with strong privacy protection for cloud computing,” Proceedings of the VLDB Endowment, vol. 7, no. 14, pp. 1953–1964, 2014.
[5] G. Cong, C. S. Jensen, and D. Wu, “Efficient retrieval of the Top-k most relevant spatial web objects,” Proceedings of the VLDB Endowment, vol. 2, no. 1, pp. 337–348, 2009.
[6] Z. Li, K. C. K. Lee, B. Zheng, W.-C. Lee, D. Lee, and X. Wang, “IR-tree: An efficient index for geographic document search,” IEEE Transactions on Knowledge and Data Engineering, vol. 23, no. 4, pp. 585–599, 2011.
[7] Y. Zhou, X. Xie, C. Wang, Y. Gong, and W. Ma, “Hybrid index structures for location-based web search,” in Proceedings of the 14th ACM International Conference on Information and Knowledge Management, pp. 155–162, Bremen, Germany, October 2005.
[8] J. Zobel and A. Moffat, “Inverted files for text search engines,” ACM Computing Surveys, vol. 38, no. 2, 2006.
[9] N. Beckmann, H. Kriegel, R. Schneider, and B. Seeger, “The R*-tree: an efficient and robust access method for points and rectangles,” in Proceedings of the ACM SIGMOD International Conference on Management of Data, vol. 19, pp. 322–331, May 1990.
[10] R. Hariharan, B. Hore, C. Li, and S. Mehrotra, “Processing spatial-keyword (sk) queries in geographic information retrieval (gir) systems,” in Proceedings of the 19th International Conference on Scientific and Statistical Database Management (SSDBM '07), July 2007.
[11] I. De Felipe, V. Hristidis, and N. Rishe, “Keyword search on spatial databases,” in Proceedings of the 24th International Conference on Data Engineering (ICDE '08), pp. 656–665, April 2008.
[12] J. B. Rocha-Junior, O. Gkorgkas, S. Jonassen, and K. Nørvåg, “Efficient processing of top-k spatial keyword queries,” in Proceedings of the International Symposium on Spatial and Temporal Databases, pp. 205–222, Springer, 2011.
[13] D. Zhang, K.-L. Tan, and A. K. Tung, “Scalable top-k spatial keyword search,” in Proceedings of the 16th International Conference on Extending Database Technology, pp. 359–370, 2013.
[14] J. B. Rocha-Junior and K. Nørvåg, “Top-k spatial keyword queries on road networks,” in Proceedings of the 15th International Conference on Extending Database Technology, pp. 168–179, Berlin, Germany, March 2012.
[15] H.-J. Cho, S. J. Kwon, and T.-S. Chung, “A safe exit algorithm for continuous nearest neighbor monitoring in road networks,” Mobile Information Systems, vol. 9, no. 1, pp. 37–53, 2013.
[16] D. Yung, M. L. Yiu, and E. Lo, “A safe-exit approach for efficient network-based moving range queries,” Data & Knowledge Engineering, vol. 72, pp. 126–147, 2012.
[17] M. Attique, H. Cho, R. Jin, and T. Chung, “Efficient Processing of Continuous Reverse k Nearest Neighbor on Moving Objects in Road Networks,” ISPRS International Journal of Geo-Information, vol. 5, no. 12, p. 247, 2016.
[18] H. G. Elmongui, M. F. Mokbel, and W. G. Aref, “Continuous aggregate nearest neighbor queries,” GeoInformatica, vol. 17, no. 1, pp. 63–95, 2013.
[19] D. Wu, M. L. Yiu, C. S. Jensen, and G. Cong, “Efficient continuously moving top-k spatial keyword query processing,” in Proceedings of the IEEE International Conference on Data Engineering (ICDE '11), pp. 541–552, Hannover, Germany, April 2011.
[20] W. Huang, G. Li, K.-L. Tan, and J. Feng, “Efficient safe-region construction for moving top-k spatial keyword queries,” in Proceedings of the 21st ACM International Conference on Information and Knowledge Management, pp. 932–941, 2012.
[21] L. Guo, J. Shao, H. H. Aung, and K.-L. Tan, “Efficient continuous top-k spatial keyword queries on road networks,” GeoInformatica, vol. 19, no. 1, pp. 29–60, 2014.
[22] Y. Li, G. Li, L. Shu, Q. Huang, and H. Jiang, “Continuous monitoring of top-k spatial keyword queries in road networks,” Journal of Information Science and Engineering, vol. 31, no. 6, pp. 1831–1848, 2015.
[23] M. Attique, A. Khan, and T.-S. Chung, “ESPAK: Top-k spatial keyword query processing in directed road networks,” in Proceedings of the Workshops of the International Conference on Extending Database Technology and the International Conference on Database Theory (EDBT/ICDT '17), March 2017.
[24] G. Salton and C. Buckley, “Term-weighting approaches in automatic text retrieval,” Information Processing & Management, vol. 24, no. 5, pp. 513–523, 1988.
[25] V. N. Anh, O. de Kretser, and A. Moffat, “Vector-space ranking with effective early termination,” in Proceedings of the 24th Annual International ACM SIGIR Conference, pp. 35–42, New Orleans, LA, USA, 2001.
[26] E. W. Dijkstra, “A note on two problems in connexion with graphs,” Numerische Mathematik, vol. 1, pp. 269–271, 1959.
[27] “Real datasets for spatial databases,” https://www.cs.utah.edu/~lifeifei/SpatialDataset.htm.
[28] “Twitter,” https://twitter.com.
[29] T. Brinkhoff, “A framework for generating network-based moving objects,” GeoInformatica, vol. 6, no. 2, pp. 153–180, 2002.
International Journal of
AerospaceEngineeringHindawiwwwhindawicom Volume 2018
RoboticsJournal of
Hindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom Volume 2018
Active and Passive Electronic Components
VLSI Design
Hindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom Volume 2018
Shock and Vibration
Hindawiwwwhindawicom Volume 2018
Civil EngineeringAdvances in
Acoustics and VibrationAdvances in
Hindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom Volume 2018
Electrical and Computer Engineering
Journal of
Advances inOptoElectronics
Hindawiwwwhindawicom
Volume 2018
Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom
The Scientific World Journal
Volume 2018
Control Scienceand Engineering
Journal of
Hindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom
Journal ofEngineeringVolume 2018
SensorsJournal of
Hindawiwwwhindawicom Volume 2018
International Journal of
RotatingMachinery
Hindawiwwwhindawicom Volume 2018
Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom Volume 2018
Chemical EngineeringInternational Journal of Antennas and
Propagation
International Journal of
Hindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom Volume 2018
Navigation and Observation
International Journal of
Hindawi
wwwhindawicom Volume 2018
Advances in
Multimedia
Submit your manuscripts atwwwhindawicom
Wireless Communications and Mobile Computing 11
120595(1198896) = 120583(1198896119905 119902119905)(1 + 120572 sdot 120582(1198896 119901)) = 1(3 + 119909) for(0 le 119909 le 3)Finally we present the lemma to prove that safe exit points
computed by COSK are correct
Lemma 8 The COSK algorithm correctly computes a set ofsafe exit points
Proof We will prove the correctness of the COSK algorithmby contradiction We assume that if 119863+119901119886 = 119863+119899120573 there is nosafe exit point in a road segment (119901119886119899120573) This means that foreach point p in the road segment (119901119886119899120573) the query result atp equals 119863+119901119886 ie 119863+119901 = 119863+119901119886forall119901 isin (119901119886119899120573) However it leadsto a contradiction that 119863+119899120573 = 119863+119901119886 when 119901 = 119899120573 There-fore if 119863+119901119886 = 119863+119899120573 a safe exit point exists in (119901119886119899120573) In addi-tion a safe exit point is determined using the skyline 119863+119897 foranswer objects and the skyline 119863minusℎ with the highest score fornonanswer objects when 119863+119901119886 = 119863+119899120573 The first skyline is acomposite polyline drawn from answer objects in 119863+119901119886 Thesecond skyline is a composite polyline drawn from nonan-swer objects in 119863+119899120573 cup 119863(119901119886 119899120573) minus 119863+119901119886
6 Monitoring Query Results and Safe Regionsin Dynamic Directed Road Networks
In this section, we discuss the monitoring of spatial keyword queries in dynamic road networks, where the network distance changes depending on the traffic conditions. Updates to the weights of some edges may invalidate the query results or the safe region of q even though the query object q remains within its safe region. Figure 7 illustrates an example of changing the weights of edges $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$. For convenience, we consider $\alpha = 1$ and q.t = "Italian restaurant". In Figure 7(a), the top-1 result is $d_1$, and the bold lines show the safe region of query q. Now suppose that at time $t_j$ the weights of the two edges $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$ change due to heavy traffic, as shown in Figure 7(b). Such updates may invalidate the query result or the safe region of q. Therefore, it is necessary to monitor the validity of the results and the safe region when such changes occur.
Next, we introduce a monitoring region to monitor the validity of the safe region effectively when the weight of an edge changes. The monitoring region MR contains all the points between the query point q and both the lowest answer object and the highest nonanswer object. Formally, it is defined as $MR = dist(q, D^+_l) \cup dist(q, D^-_h)$, where $dist(q, D^+_l)$ is the distance between q and the lowest answer object and $dist(q, D^-_h)$ is the distance between q and the highest nonanswer object. In the given example, $D^+_l = d_1$ and $D^-_h = d_2, d_3$. Therefore, the dotted lines in Figure 8(a) show the monitoring region of query object q.
Now, at time $t_j$, the updates to edges $\overleftarrow{(n_1, n_6)}$ and $\overleftarrow{(n_1, d_1)}$, which are not part of the monitoring region, can safely be ignored. However, the update on segment $\overrightarrow{(n_2, d_1)}$, which is associated with the monitoring region, may invalidate the results. As shown in Figure 8(b), after the update, the top-1 result becomes $d_2$, and the bold lines represent the new safe region of q.
Algorithm 5 monitors the validity of the result set and the safe region of query object q when the weight of any edge changes. Suppose that the weight of edge $(n_i, n_j)$ changes at time $t_j$. First, the algorithm checks whether edge $(n_i, n_j)$ is associated with the monitoring region. If it is not, the algorithm simply ignores the update to edge $(n_i, n_j)$, and the query results and safe region remain valid. In contrast, if the edge is associated with the monitoring region (i.e., $MR \cap (n_i, n_j) \ne \emptyset$), the algorithm re-evaluates the query results and then updates the top-k results and the safe region of query q. Finally, the algorithm updates the monitoring region of q.
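The control flow of this monitoring step can be sketched as follows. This is only an illustration under simplifying assumptions: the monitoring region is modeled as a plain set of directed edges, and the recomputation of results, safe exit points, and monitoring region is delegated to a hypothetical callback (here named recompute) rather than to the actual COSK routines. The edge labels in the usage lines are likewise illustrative.

```python
def monitor_edge_update(monitoring_region, updated_edge, recompute):
    """Return refreshed state if the updated edge touches the monitoring
    region, otherwise None to signal that the update can be ignored.

    monitoring_region: set of directed edges (u, v); a simplification of MR
    updated_edge:      the directed edge (u, v) whose weight changed
    recompute:         hypothetical callback returning
                       (top_k, safe_exit_points, new_monitoring_region)
    """
    if updated_edge not in monitoring_region:
        return None       # results and safe region remain valid
    return recompute()    # re-evaluate results, safe exit points, and MR


# Usage with made-up state: only the second update touches the region.
region = {("n2", "d1"), ("q", "n2")}
calls = []

def fake_recompute():
    calls.append(1)
    return (["d2"], ["pse1", "pse2", "pse3"], {("q", "n3")})

ignored = monitor_edge_update(region, ("n1", "n6"), fake_recompute)
updated = monitor_edge_update(region, ("n2", "d1"), fake_recompute)
```

The point of the membership test is that the comparatively expensive recomputation runs only for updates that can actually affect the query.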
7. Performance Evaluation
In this section, we evaluate the performance of COSK through simulation experiments. We describe our experimental settings in Section 7.1, and we present our experimental results for static and dynamic road networks in Sections 7.2 and 7.3, respectively.
7.1. Experimental Settings. All of our experiments were performed using real road networks, namely, Oldenburg, San Francisco, and San Joaquin. All three road networks were obtained from [27]. The original road network of San Francisco had 21047 nodes and 21692 edges. We reformatted the network, pruned approximately 30% of the nodes, and adjusted the edges and their weights accordingly. This resulted in a network with 14732 nodes and 14316 edges. Both the directions of the edges and the data objects on the edges were generated randomly. The description of each data object was extracted from Twitter messages [28], and we assigned one tweet per data object. Table 4 presents the characteristics of the datasets used in the experimental evaluation. We simulated moving query objects by using a spatiotemporal data generator [29]. The input to the generator was the road network of the dataset used, and the output was the set of query objects moving on the road network. Each experiment had 100 moving queries, which were continuously monitored for 100 timestamps (1 timestamp = 1 second), and the average result was reported.
As a benchmark for COSK in static road networks, we implemented the CMTkSK+ algorithm [22], which also continuously monitors moving top-k spatial keyword queries in road networks. However, this algorithm was originally designed for undirected road networks. To make a fair comparison, we modified it to process top-k spatial keyword queries in directed road networks; throughout the experiments, CMTkSK+ refers to this modified version. Specifically, we modified the distance computation between two points so that, in directed road networks, $dist(p_1, p_2) \ne dist(p_2, p_1)$ in general. Since CMTkSK+ does not handle top-k spatial keyword queries in dynamic road networks, we compared the performance of COSK in that setting with a basic algorithm that recomputes the results whenever the query object changes its location. All algorithms were implemented in Java and were executed on a desktop PC (2.80-GHz Intel Core i5) with
Figure 7: Updating the weight of edges in a dynamic road network, where $t_i < t_j$. (a) Safe region at time $t_i$. (b) Updating the weights of $\overleftarrow{(n_1, n_2)}$ and $\overleftarrow{(n_1, n_6)}$ at time $t_j$.
Figure 8: Monitoring region and updated safe region at time $t_j$. (a) Monitoring region at time $t_i$. (b) New safe region at time $t_j$.
Input: monitoring region MR, updated edge $(n_i, n_j)$
Output: none
if $MR \cap (n_i, n_j) = \emptyset$ then
    /* edge $(n_i, n_j)$ is not part of the monitoring region */
    ignore the change in the weight of edge $(n_i, n_j)$
else
    $P_{SE} \leftarrow \emptyset$ /* set of safe exit points */
    $D^k_{upd} \leftarrow$ EvaluateSnapshotQuery($n_i$, $e_i$) /* update set of top-k results */
    $P^{upd}_{SE} \leftarrow$ ComputeSafeExit($P_a$, $n_\beta$) /* update safe exit points */
    $MR_{upd} \leftarrow$ ComputeMonitoringRegion($D^+_l$, $D^-_h$) /* update monitoring region */
end
Algorithm 5: MonitoringSafeRegion(MR, $(n_i, n_j)$).
Table 4: Summary of datasets.
Attribute                         Oldenburg   San Francisco   San Joaquin
Total no. of nodes                6104        14732           18262
Total no. of edges                7034        14316           23876
Percentage of directed edges      30          30              30
Total no. of objects              5627        11453           19098
Average no. of objects per edge   0.8         0.8             0.8
Total no. of words                49517       103649          166153
Table 5: Experimental parameter settings.
Parameter                               Range
Number of results (k)                   5, 10, 15, 20, 25
Number of keywords (n)                  1, 2, 3, 4, 5
Query parameter ($\alpha$)              0.01, 0.1, 1, 10, 100
Dataset                                 Oldenburg, San Francisco, San Joaquin
Number of data objects ($N_D$)          10, 20, 30, 40, 50 (x1000)
Speed of query objects ($V_{qry}$)      25, 50, 75, 100, 125 (km/h)
Mobility ($M_{qry}$)                    20, 40, 60, 80, 100
Ratio of directed edges ($E_{dir}$)     10, 20, 30, 40, 50
Ratio of updated edges ($E_{upd}$)      15, 30, 60, 80, 100
8 GB of memory. In the experiments, we compared (1) query processing times, (2) edges processed, i.e., the number of edges processed for retrieving query results, and (3) index sizes. Table 5 summarizes the parameters used in the experiments. In each experiment, we varied a single parameter within the range shown in Table 5 while maintaining the other parameters at the bolded default values.
We evaluated the performance of the algorithms by using the following measures: (1) the total amount of server CPU time, which indicates the query processing time, and (2) the total communication cost, measured as the total number of points (i.e., the location updates sent by query objects and the query results and safe exit points returned by the server) transferred between clients and the server. Battery power and wireless bandwidth consumption typically increase with the amount of data transferred between objects (clients) and servers. Thus, we used the amount of transferred data as a metric to evaluate the communication cost.
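The directed-network distance mentioned in the experimental settings, where the distance from one point to another need not equal the distance in the reverse direction, can be illustrated with a small shortest-path sketch based on Dijkstra's algorithm [26]. The toy graph, node names, and function name below are hypothetical and are not part of the evaluated implementations.

```python
import heapq


def network_distance(graph, source, target):
    """Shortest-path length from source to target in a directed graph.

    graph maps a node to a list of (neighbor, weight) pairs for its
    outgoing edges only, so edge direction is respected.
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > best.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < best.get(v, float("inf")):
                best[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")  # target unreachable


# Hypothetical toy network: a <-> b is bidirectional, b -> c and c -> a are one-way.
toy = {
    "a": [("b", 2.0)],
    "b": [("a", 2.0), ("c", 1.0)],
    "c": [("a", 4.0)],
}
forward = network_distance(toy, "a", "c")   # follows a -> b -> c
backward = network_distance(toy, "c", "a")  # must use the one-way edge c -> a
```

Because only outgoing edges are stored, the two calls traverse different paths and return different lengths, which is the asymmetry a directed road network introduces.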
7.2. Experimental Results of Top-k Spatial Keyword Queries in Static Road Networks
7.2.1. Effect of k. Figure 9 shows the effect of the number of results on the query processing time and communication cost for both algorithms. Figure 9(a) indicates that the query processing time increases for both algorithms as the value of k increases. This is expected because, with an increase in k, more data objects need to be explored and verified. Nevertheless, COSK significantly outperforms CMTkSK+ for two main reasons. First, the search for relevant objects is very efficient when using the highest significance factor, and second, COSK does not need to verify the set of answer objects as long as the query object lies in a safe region. On the other hand, the CMTkSK+ query processing time increases significantly because it has to monitor and verify the set of candidate objects periodically. In Figure 9(b), the communication costs for both algorithms increase as the number of objects increases. However, the proposed algorithm demonstrates superior performance compared to CMTkSK+ because client-server communication is not required while the query object lies within the safe exit points, whereas in CMTkSK+ the query object is required to report its location to the server whenever it moves.
7.2.2. Effect of $N_D$. This experiment was conducted on the San Joaquin dataset. This dataset included 19098 data objects; therefore, we randomly generated approximately 30000 additional data objects on different edges. In Figure 10, we evaluate the performance of COSK and CMTkSK+ by varying the cardinality of the data objects. Note that $N_D = 10K$ corresponds to a low density of data points, while $N_D = 50K$ corresponds to a high density. In Figure 10(a), it is interesting to note that the query processing times of both algorithms decrease as the cardinality of the data objects increases. For CMTkSK+, this is because the monitoring range of a query decreases with high density. For COSK, it is mainly because fewer edges need to be expanded when the data density is high, which decreases the query processing time. In Figure 10(b), we study the influence of the cardinality of the data objects on the communication costs. The experimental results indicate that CMTkSK+ incurs almost constant communication costs regardless of data object cardinality. However, the communication costs of COSK increase in proportion to the $N_D$ value. This is expected because the safe region becomes smaller as the density of the data objects increases, which increases the communication costs.
7.2.3. Effect of Query Keywords (n). Figure 11 shows the query processing time and communication cost for COSK and CMTkSK+ as a function of the number of query keywords. Figures 11(a) and 11(b) show that the performance of both algorithms degrades when the number of keywords increases. This is mainly because increasing the number of query keywords may also increase the number of relevant objects, resulting in a higher query processing time and communication cost. However, the safe-region-based algorithm COSK scales better than CMTkSK+ because of its less expensive monitoring technique.
7.2.4. Effect of $\alpha$. Figure 12 demonstrates the impact of the query parameter $\alpha$ on the query processing time and on the communication cost. A small value of $\alpha$ indicates a greater importance of textual relevance, whereas a high value of $\alpha$ gives more preference to spatial relevance. It is interesting to note that the query processing time is lower for higher
Figure 9: Effect of k on query processing time and number of edges processed. (a) Query processing time (s) versus k. (b) Communication cost (number of messages transferred) versus k. Both panels compare COSK and CMTkSK+.
Figure 10: Effect of $N_D$ on query processing time and communication cost. (a) Query processing time (s) versus $N_D$. (b) Communication cost (number of messages transferred) versus $N_D$. Both panels compare COSK and CMTkSK+.
values of $\alpha$, which give more importance to spatial relevance. This is mainly because, when spatial relevance is weighted more heavily, fewer edges and objects need to be explored and processed to determine the top-k data objects. Observe that in Figure 12(b) the number of messages sent by COSK decreases sharply with an increase in $\alpha$.
7.2.5. Effect of Speed. Figure 13(a) demonstrates the influence of the speed of the query objects on the query processing time of the COSK and CMTkSK+ algorithms. The experimental results indicate that the performance of CMTkSK+ is not significantly influenced by the speed of the query objects because the candidate objects must be monitored at regular time intervals regardless of the speed. On the other hand, the performance of COSK gradually degrades as the speed of the query objects increases because the objects leave their respective safe regions more frequently. Figure 13(b) shows the communication costs of COSK and CMTkSK+ with respect to the speed of the query objects. CMTkSK+ incurs almost constant communication costs because a server-initiated request to verify the candidate objects does not depend on the speed. For COSK, the query objects cross safe regions more frequently when the speed is high, which increases the communication costs.
Figure 11: Effect of number of keywords on query processing time and communication cost. (a) Query processing time (s) versus number of keywords. (b) Communication cost (number of messages transferred) versus number of keywords. Both panels compare COSK and CMTkSK+.
Figure 12: Effect of $\alpha$ on query processing time and communication cost. (a) Query processing time (s) versus $\alpha$. (b) Communication cost (number of messages transferred) versus $\alpha$. Both panels compare COSK and CMTkSK+.
7.2.6. Effect of Mobility. Figure 14 shows the effect of mobility $M_{qry}$ (mobility refers to the percentage of query objects that are moving at any timestamp) on the performance of the COSK and CMTkSK+ algorithms. As expected, the query processing time and communication costs of both algorithms increase with $M_{qry}$. Nevertheless, COSK performs better than CMTkSK+ in terms of both query processing time and communication costs.
7.2.7. Effect of Directed Edges. Figure 15 shows the impact of the percentage of directed edges $E_{dir}$ on the performance of the COSK and CMTkSK+ algorithms. The query processing time increases with $E_{dir}$ because the algorithms need to explore more edges to answer the top-k spatial keyword queries. However, the communication cost is not significantly affected by the value of $E_{dir}$ for either algorithm.
7.2.8. Effect of Datasets. Figure 16 shows the index sizes of the COSK and CMTkSK+ approaches for different datasets. As shown in Figure 16, both algorithms have similar index sizes. COSK has a minor space overhead because it stores additional information on the highest significance factor $\theta_t$ of edges. More importantly, this space overhead is minimal compared to the gain achieved by COSK in query processing time and communication costs.
Figure 13: Effect of speed on query processing time and communication cost. (a) Query processing time (s) versus $V_{qry}$. (b) Communication cost (number of messages transferred) versus $V_{qry}$. Both panels compare COSK and CMTkSK+.
Figure 14: Effect of mobility on query processing time and communication cost. (a) Query processing time (s) versus $M_{qry}$. (b) Communication cost (number of messages transferred) versus $M_{qry}$. Both panels compare COSK and CMTkSK+.
7.3. Experimental Results of Top-k Spatial Keyword Queries in Dynamic Road Networks
In this section, we evaluate the performance of COSK and the basic algorithm for dynamic road networks. $E_{upd}$ indicates the percentage of all edges that change their weight at each timestamp. The length of an updated edge is randomly selected between 0.1 and 10 times the original length. Figure 17(a) depicts the query processing time of COSK and the basic algorithm. It is evident from the figure that the query processing time of the basic algorithm is not significantly affected by $E_{upd}$. This is mainly because the query objects issue top-k spatial keyword queries at each timestamp. However, the query processing time of COSK increases with the value of $E_{upd}$ because the probability that an updated edge is associated with the monitoring region of query q increases with $E_{upd}$. Therefore, when $E_{upd}$ becomes large, the results need to be updated frequently, which increases the query processing time. Figure 17(b) shows the communication costs of COSK and the basic algorithm with respect to $E_{upd}$. The basic algorithm incurs almost constant communication costs regardless of the value of $E_{upd}$. In contrast, the communication cost of COSK increases with $E_{upd}$ because the query result and safe regions need to be updated frequently.
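The update workload described above can be sketched as follows. This is only one plausible reading of the setup: it assumes that the scaling factor is drawn uniformly from the stated range and that the updated edges are sampled without replacement at each timestamp. The function name, the toy edge lengths, and the random seed are hypothetical.

```python
import random


def apply_random_updates(original, current, e_upd, rng):
    """Pick e_upd percent of the edges and set each to a length drawn
    uniformly between 0.1 and 10 times its original length (assumed
    distribution).

    original, current: dicts mapping a directed edge (u, v) to its length
    Returns the list of edges whose weight changed at this timestamp.
    """
    edges = sorted(original)                   # deterministic order
    count = round(len(edges) * e_upd / 100)    # number of edges to update
    chosen = rng.sample(edges, count)
    for edge in chosen:
        current[edge] = original[edge] * rng.uniform(0.1, 10.0)
    return chosen


# Toy network with made-up lengths; 50 percent of the edges are updated.
original = {("n1", "n2"): 3.0, ("n1", "n6"): 5.0, ("n2", "n3"): 2.0, ("n4", "n5"): 1.0}
current = dict(original)
changed = apply_random_updates(original, current, 50, random.Random(7))
```

Each edge in the returned list would then be passed to the monitoring step, so a larger $E_{upd}$ means more updates that may fall inside a query's monitoring region.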
Figure 15: Effect of $E_{dir}$ on query processing time and communication cost. (a) Query processing time (s) versus $E_{dir}$. (b) Communication cost (number of messages transferred) versus $E_{dir}$.
Figure 16: Effect of dataset on index size. Index size (MB) of COSK and CMTkSK+ for the Oldenburg, San Francisco, and San Joaquin datasets.
8. Conclusion
In this paper, we investigated moving top-k spatial keyword queries in directed and dynamic road networks. We presented an efficient indexing framework using inverted files that indexes the data objects on edges, allowing for the effective searching of data objects relevant to queries in terms of both textual and spatial relevance. We also presented a safe-exit-based algorithm called COSK to monitor moving top-k spatial keyword queries. We demonstrated that the query results remain valid as long as the query object resides within a safe region. Furthermore, COSK can effectively monitor the validity of query results and safe regions in dynamic road networks. Finally, an experimental evaluation conducted on real road networks demonstrated that COSK significantly reduced the query processing time and communication costs compared to the CMTkSK+ algorithm.
Data Availability
The real road network data used in this study have also been used in many previous studies. The road network data are cited in the manuscript and are available at https://www.cs.utah.edu/~lifeifei/SpatialDataset.htm. To simulate the moving queries, the authors used a spatiotemporal data generator that has also been used in previous studies. The research article describing the generator is cited in the manuscript. The documentation and source
Figure 17: Effect of $E_{upd}$ on query processing time and communication cost. (a) Query processing time (s) versus $E_{upd}$. (b) Communication cost (number of messages transferred) versus $E_{upd}$. Both panels compare COSK and the basic algorithm.
files of the generator are available at https://iapg.jade-hs.de/personen/brinkhoff/generator/. The authors used Twitter tweets to generate the descriptions of the data objects and the query keywords. The tweets used are accessible at http://followthehashtag.com/datasets/free-twitter-dataset-usa-200000-free-usa-tweets/.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
Hyung-Ju Cho was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean Government (MSIP) (NRF-2016R1A2B4009793), and this research was partially supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2016R1D1A1B03934129).
References
[1] D. Papadias, N. Mamoulis, J. Zhang, and Y. Tao, "Query processing in spatial network databases," in Proceedings of the 29th International Conference on Very Large Data Bases (VLDB '03), pp. 802–813, September 2003.
[2] H.-J. Cho, K. Ryu, and T.-S. Chung, "An efficient algorithm for computing safe exit points of moving range queries in directed road networks," Information Systems, vol. 41, pp. 1–19, 2014.
[3] G. Tsatsanifos and A. Vlachou, "On processing top-k spatio-textual preference queries," in Proceedings of the 18th International Conference on Extending Database Technology (EDBT '15), pp. 433–444, March 2015.
[4] R. Li, A. X. Liu, A. L. Wang, and B. Bruhadeshwar, "Fast range query processing with strong privacy protection for cloud computing," Proceedings of the VLDB Endowment, vol. 7, no. 14, pp. 1953–1964, 2014.
[5] G. Cong, C. S. Jensen, and D. Wu, "Efficient retrieval of the top-k most relevant spatial web objects," Proceedings of the VLDB Endowment, vol. 2, no. 1, pp. 337–348, 2009.
[6] Z. Li, K. C. K. Lee, B. Zheng, W.-C. Lee, D. Lee, and X. Wang, "IR-tree: An efficient index for geographic document search," IEEE Transactions on Knowledge and Data Engineering, vol. 23, no. 4, pp. 585–599, 2011.
[7] Y. Zhou, X. Xie, C. Wang, Y. Gong, and W. Ma, "Hybrid index structures for location-based web search," in Proceedings of the 14th ACM International Conference on Information and Knowledge Management, pp. 155–162, Bremen, Germany, October 2005.
[8] J. Zobel and A. Moffat, "Inverted files for text search engines," ACM Computing Surveys, vol. 38, no. 2, 2006.
[9] N. Beckmann, H. Kriegel, R. Schneider, and B. Seeger, "The R*-tree: an efficient and robust access method for points and rectangles," in Proceedings of the ACM SIGMOD International Conference on Management of Data, vol. 19, pp. 322–331, May 1990.
[10] R. Hariharan, B. Hore, C. Li, and S. Mehrotra, "Processing spatial-keyword (SK) queries in geographic information retrieval (GIR) systems," in Proceedings of the 19th International Conference on Scientific and Statistical Database Management (SSDBM '07), July 2007.
[11] I. De Felipe, V. Hristidis, and N. Rishe, "Keyword search on spatial databases," in Proceedings of the 24th International Conference on Data Engineering (ICDE '08), pp. 656–665, April 2008.
[12] J. B. Rocha-Junior, O. Gkorgkas, S. Jonassen, and K. Nørvåg, "Efficient processing of top-k spatial keyword queries," in Proceedings of the International Symposium on Spatial and Temporal Databases, pp. 205–222, Springer, 2011.
[13] D. Zhang, K.-L. Tan, and A. K. Tung, "Scalable top-k spatial keyword search," in Proceedings of the 16th International Conference on Extending Database Technology, pp. 359–370, 2013.
[14] J. B. Rocha-Junior and K. Nørvåg, "Top-k spatial keyword queries on road networks," in Proceedings of the 15th International Conference on Extending Database Technology, pp. 168–179, Berlin, Germany, March 2012.
[15] H.-J. Cho, S. J. Kwon, and T.-S. Chung, "A safe exit algorithm for continuous nearest neighbor monitoring in road networks," Mobile Information Systems, vol. 9, no. 1, pp. 37–53, 2013.
[16] D. Yung, M. L. Yiu, and E. Lo, "A safe-exit approach for efficient network-based moving range queries," Data & Knowledge Engineering, vol. 72, pp. 126–147, 2012.
[17] M. Attique, H. Cho, R. Jin, and T. Chung, "Efficient processing of continuous reverse k nearest neighbor on moving objects in road networks," ISPRS International Journal of Geo-Information, vol. 5, no. 12, p. 247, 2016.
[18] H. G. Elmongui, M. F. Mokbel, and W. G. Aref, "Continuous aggregate nearest neighbor queries," GeoInformatica, vol. 17, no. 1, pp. 63–95, 2013.
[19] D. Wu, M. L. Yiu, C. S. Jensen, and G. Cong, "Efficient continuously moving top-k spatial keyword query processing," in Proceedings of the IEEE International Conference on Data Engineering (ICDE '11), pp. 541–552, Hannover, Germany, April 2011.
[20] W. Huang, G. Li, K.-L. Tan, and J. Feng, "Efficient safe-region construction for moving top-k spatial keyword queries," in Proceedings of the 21st ACM International Conference on Information and Knowledge Management, pp. 932–941, 2012.
[21] L. Guo, J. Shao, H. H. Aung, and K.-L. Tan, "Efficient continuous top-k spatial keyword queries on road networks," GeoInformatica, vol. 19, no. 1, pp. 29–60, 2014.
[22] Y. Li, G. Li, L. Shu, Q. Huang, and H. Jiang, "Continuous monitoring of top-k spatial keyword queries in road networks," Journal of Information Science and Engineering, vol. 31, no. 6, pp. 1831–1848, 2015.
[23] M. Attique, A. Khan, and T.-S. Chung, "ESPAK: Top-k spatial keyword query processing in directed road networks," in Proceedings of the Workshops of the International Conference on Extending Database Technology and the International Conference on Database Theory (EDBT/ICDT '17), March 2017.
[24] G. Salton and C. Buckley, "Term-weighting approaches in automatic text retrieval," Information Processing & Management, vol. 24, no. 5, pp. 513–523, 1988.
[25] V. N. Anh, O. de Kretser, and A. Moffat, "Vector-space ranking with effective early termination," in Proceedings of the 24th Annual International ACM SIGIR Conference, pp. 35–42, New Orleans, LA, USA, 2001.
[26] E. W. Dijkstra, "A note on two problems in connexion with graphs," Numerische Mathematik, vol. 1, pp. 269–271, 1959.
[27] "Real datasets for spatial databases," https://www.cs.utah.edu/~lifeifei/SpatialDataset.htm.
[28] "Twitter," https://twitter.com.
[29] T. Brinkhoff, "A framework for generating network-based moving objects," GeoInformatica, vol. 6, no. 2, pp. 153–180, 2002.
International Journal of
AerospaceEngineeringHindawiwwwhindawicom Volume 2018
RoboticsJournal of
Hindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom Volume 2018
Active and Passive Electronic Components
VLSI Design
Hindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom Volume 2018
Shock and Vibration
Hindawiwwwhindawicom Volume 2018
Civil EngineeringAdvances in
Acoustics and VibrationAdvances in
Hindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom Volume 2018
Electrical and Computer Engineering
Journal of
Advances inOptoElectronics
Hindawiwwwhindawicom
Volume 2018
Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom
The Scientific World Journal
Volume 2018
Control Scienceand Engineering
Journal of
Hindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom
Journal ofEngineeringVolume 2018
SensorsJournal of
Hindawiwwwhindawicom Volume 2018
International Journal of
RotatingMachinery
Hindawiwwwhindawicom Volume 2018
Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom Volume 2018
Chemical EngineeringInternational Journal of Antennas and
Propagation
International Journal of
Hindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom Volume 2018
Navigation and Observation
International Journal of
Hindawi
wwwhindawicom Volume 2018
Advances in
Multimedia
Submit your manuscripts atwwwhindawicom
12 Wireless Communications and Mobile Computing
3
q5 5
2 3
3
2
2 3 5
11
d3 (Chinese Restaurant)
n1
n6
n2 pse2
pse1
pse3
n4n5
n3d2 (Italian Restaurant)d1 (Italian Restaurant)
(a) Safe region at time 119905119894
9
q10 5
6 4
3
2
2 3 5
1
d3 (Chinese Restaurant)
n1
n6
n2 n3
n4n5
d2 (Italian Restaurant)d1 (Italian Restaurant)
(b) Updating weight oflarr997888997888997888997888997888997888(1198991 1198992) and
larr997888997888997888997888997888997888(1198991 1198996) at time 119905119895
Figure 7 Updating the weight of edges in a dynamic road network where 119905119894 lt 119905119895
3
q5 5
2 4
3
2
2 3 5
1
d3 (Chinese Restaurant)
n1
n6 n4n5
n2 n3d2 (Italian Restaurant)d1 (Italian Restaurant)
(a) Monitoring region at time 119905119894
9
q10 5
5 4
233
2
2 3 5
11
037
pse2pse1
pse3
d3 (Chinese Restaurant)n6 n4n5
n2 n3d2 (Italian Restaurant)n1 d1 (Italian Restaurant)
(b) New safe region at time 119905119895
Figure 8 Monitoring region and updated safe region at time 119905119895
(1) InputMonitoring regionMR updated edge (119899119894 119899119895)(2) Output none(3) if 119872119877cap (119899119894 119899119895) = 0 then(4) lowastedge (119899119894 119899119895) is not part of monitoring region(5) ignore the change in the weight of edge (119899119894 119899119895)(6) end(7) 119875119878119864 larr997888 0 lowastset of safe exit points(8) else(9) 119863119896119906119901119889 larr997888 119864V119886119897119906119886119905119890119878119899119886119901119904ℎ119900119905119876119906119890119903119910(119899119894 119890119894) lowastupdate set of
top-k results(10) 119875119878119864119906119901119889 larr997888 119862119900119898119901119906119905119890119878119886119891119890119864119909119894119905(119875119886 119899120573) lowastupdate safe exit
points(11) 119872119877119906119901119889 larr997888 119862119900119898119901119906119905119890119872119900119899119894119905119900119903119894119899119892119877119890119892119894119900119899(119863+119897 119863minusℎ )
lowastupdate monitoring region(12) end
Algorithm 5 MonitoringSafeRegion(MR(119899119894 119899119895))
Table 4 Summary of datasets
Attribute Oldenburg San Francisco San JoaquinTotal no of nodes 6104 14732 18262Total no of edges 7034 14316 23876Percentage of directed edges 30 30 30Total no of objects 5627 11453 19098Average no of objects per edge 08 08 08Total no of words 49517 103649 166153
Wireless Communications and Mobile Computing 13
Table 5 Experimental parameter settings
Parameter RangeNumber of results (k) 5 10 15 20 25Number of keywords (n) 1 2 3 4 5Query parameter (120572) 001 01 1 10 100Dataset Oldenburg San Francisco San JoaquinNumber of data objects (119873119863) 10 20 30 40 50 (x1000)Speed of query objects (119881119902119903119910) 25 50 75 100 125 (kmh)Mobility (119872119902119903119910) 20 40 60 80 100Ratio of directed edges (119864119889119894119903) 10 20 30 40 50Ratio of updated edges (119864119906119901119889) 15 30 60 80 100
8GB of memory In the experiments we compared (1) queryprocessing times (2) edges processed ie the number ofedges processed for retrieving query results and (3) indexsizes Table 5 summarizes the parameters used in the exper-iments In each experiment we varied a single parameterwithin the range that is shown in Table 5 while maintainingthe other parameters at the bolded default values
We evaluated the performance of the algorithms by usingthe following measures (1) total amount of server CPUtime which indicates the query processing time and (2)total communication cost as the total number of points (iethe location updates sent by query objects and the queryresults and safe exit points returned by the server) transferredbetween clients and the serverThebattery power andwirelessbandwidth consumption typically increase with the amountof data transferred between objects (clients) and serversThus we used the amount of transferred data as a metric toevaluate the communication cost
72 Experimental Results of Top-k Spatial KeywordQueries in Static Road Networks
721 Effect of k Figure 9 indicates the effect of the numberof results on the query processing time and communicationcost for both algorithms Figure 9(a) indicates that the queryprocessing time increases for both algorithms as the value ofk increases This is expected because with an increase in kmore data objects are required to be explored and verifiedNevertheless COSK significantly outperforms CMTkSK+ fortwo main reasons First a relevant object search is very effi-cient when using the highest significant factor and secondCOSKdoes not need to verify the set of answer objects as longas the query object lies in a safe region On the other handthe CMTkSK+ query processing time increases significantlybecause it has to monitor and verify the set of candidateobjects periodically In Figure 9(b) the communication costsfor both algorithms increase as the number of objects in-creases However the proposed algorithm demonstrates su-perior performance compared to CMTkSK+ because client-server communication is not required when the query objectlies within the safe exit points whereas in CMTkSK+ thequery object is required to report its location to the serverwhenever it moves
7.2.2. Effect of $N_D$. This experiment was conducted on the San Joaquin dataset. This dataset includes 19,098 data objects; therefore, we randomly generated approximately 30,000 additional data objects on different edges. In Figure 10, we evaluate the performance of COSK and CMTkSK+ by varying the cardinality of the data objects. Note that $N_D$ = 10K corresponds to a low density of data points, while $N_D$ = 50K corresponds to a high density. In Figure 10(a), it is interesting to note that the query processing times of both algorithms decrease as the cardinality of the data objects increases. For CMTkSK+, this is because the monitoring range of a query shrinks at high density. For COSK, it is mainly because fewer edges need to be expanded when the data density is high, which decreases the query processing time. In Figure 10(b), we study the influence of the cardinality of the data objects on the communication costs. The experimental results indicate that CMTkSK+ incurs almost constant communication costs regardless of the data object cardinality. However, the communication costs of COSK increase in proportion to $N_D$. This is expected because the safe region becomes smaller as the density of the data objects increases, which increases the communication costs.
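One way to generate additional data objects on edges, as done for this experiment, is sketched below. The text does not state the sampling scheme, so choosing an edge uniformly at random and then a uniform offset along it is an assumption; the function name and the edge tuple layout are likewise hypothetical.

```python
import random


def generate_objects_on_edges(edges, num_objects, seed=0):
    """Place num_objects points at random positions on randomly chosen edges.

    edges: list of (u, v, length) tuples.
    Returns a list of (u, v, offset) with 0 <= offset <= length, where offset
    is the distance from u along the edge. Uniform sampling is an assumption.
    """
    rng = random.Random(seed)
    objects = []
    for _ in range(num_objects):
        u, v, length = rng.choice(edges)
        objects.append((u, v, rng.uniform(0.0, length)))
    return objects


edges = [("a", "b", 4.0), ("b", "c", 2.5), ("c", "a", 6.0)]
extra = generate_objects_on_edges(edges, num_objects=5)
print(len(extra))  # 5
```

Sampling edges in proportion to their length would instead give a uniform density per unit of road length; which variant fits better depends on how the original objects are distributed.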
7.2.3. Effect of Query Keywords (n). Figure 11 shows the query processing time and communication cost for COSK and CMTkSK+ as a function of the number of query keywords. Figures 11(a) and 11(b) show that the performance of both algorithms degrades as the number of keywords increases. This is mainly because increasing the number of query keywords may also increase the number of relevant objects, resulting in a higher query processing time and communication cost. However, the safe-region-based algorithm COSK scales better than CMTkSK+ because of its less expensive monitoring technique.
7.2.4. Effect of $\alpha$. Figure 12 demonstrates the impact of the query parameter $\alpha$ on the query processing time and the communication cost. A small value of $\alpha$ gives greater importance to textual relevance, whereas a high value of $\alpha$ gives more preference to spatial relevance. It is interesting to note that the query processing time is lower for higher values of $\alpha$, which give more importance to spatial relevance. This is mainly because, when the spatial relevance is weighted more heavily, fewer edges and objects need to be explored and processed to determine the top-k data objects. Observe that, in Figure 12(b), the number of messages sent by COSK decreases sharply with an increase in $\alpha$.
Figure 9: Effect of k on query processing time and communication cost. Panels: (a) query processing time (s); (b) communication cost (number of messages transferred). x-axis: k (5 to 25). Series: COSK and CMTkSK+.
Figure 10: Effect of $N_D$ on query processing time and communication cost. Panels: (a) query processing time (s); (b) communication cost (number of messages transferred). x-axis: $N_D$ (10K to 50K). Series: COSK and CMTkSK+.
7.2.5. Effect of Speed. Figure 13(a) demonstrates the influence of the speed of the query objects on the query processing time of the COSK and CMTkSK+ algorithms. The experimental results indicate that the performance of CMTkSK+ is not significantly influenced by the speed of the query objects because the candidate objects must be monitored at regular time intervals regardless of the speed. In contrast, the performance of COSK gradually degrades as the speed of the query objects increases because the objects leave their respective safe regions more frequently. Figure 13(b) shows the communication costs of COSK and CMTkSK+ with respect to the speed of the query objects. CMTkSK+ incurs almost constant communication costs because a server-initiated request to verify the candidate objects does not depend on the speed. For COSK, the query objects cross safe regions more frequently when the speed is high, which increases the communication costs.
Figure 11: Effect of number of keywords on query processing time and communication cost. Panels: (a) query processing time (s); (b) communication cost (number of messages transferred). x-axis: number of keywords (1 to 5). Series: COSK and CMTkSK+.
Figure 12: Effect of $\alpha$ on query processing time and communication cost. Panels: (a) query processing time (s); (b) communication cost (number of messages transferred). x-axis: $\alpha$ (0.01 to 100). Series: COSK and CMTkSK+.
7.2.6. Effect of Mobility. Figure 14 shows the effect of mobility $M_{qry}$ (the percentage of query objects that are moving at any timestamp) on the performance of the COSK and CMTkSK+ algorithms. As expected, the query processing time and communication costs of both algorithms increase with $M_{qry}$. Nevertheless, COSK performs better than CMTkSK+ in terms of both query processing time and communication costs.
7.2.7. Effect of Directed Edges. Figure 15 shows the impact of the percentage of directed edges, $E_{dir}$, on the performance of the COSK and CMTkSK+ algorithms. The query processing time increases with $E_{dir}$ because each algorithm needs to explore more edges to retrieve the top-k results. However, the communication cost is not significantly affected by the value of $E_{dir}$ for either algorithm.
7.2.8. Effect of Datasets. Figure 16 shows the index sizes of the COSK and CMTkSK+ approaches for different datasets. As shown in Figure 16, both algorithms have similar index sizes. However, COSK has a minor space overhead because it stores additional information, namely the highest significance factor $\theta_t$ of edges. More importantly, this space overhead is minimal compared to the gain achieved by COSK in query processing time and communication costs.
Figure 13: Effect of speed on query processing time and communication cost. Panels: (a) query processing time (s); (b) communication cost (number of messages transferred). x-axis: $V_{qry}$ (25 to 125). Series: COSK and CMTkSK+.
Figure 14: Effect of mobility on query processing time and communication cost. Panels: (a) query processing time (s); (b) communication cost (number of messages transferred). x-axis: $M_{qry}$ (20 to 100). Series: COSK and CMTkSK+.
7.3. Experimental Results of Top-k Spatial Keyword Queries in Dynamic Road Networks
In this section, we evaluate the performance of COSK and the basic algorithm for dynamic road networks. $E_{upd}$ indicates the percentage of all edges that change their weight at each timestamp. The length of an updated edge is randomly selected between 0.1 and 10 times the original length. Figure 17(a) depicts the query processing times of COSK and the basic algorithm. It is evident from the figure that the query processing time of the basic algorithm is not significantly affected by $E_{upd}$. This is mainly because the query objects issue top-k spatial queries at each timestamp. However, the query processing time of COSK increases with $E_{upd}$ because the probability that an updated edge is associated with the monitoring region of a query q increases with $E_{upd}$. Therefore, when $E_{upd}$ becomes large, the results need to be updated frequently, which increases the query processing time. Figure 17(b) shows the communication costs of COSK and the basic algorithm with respect to $E_{upd}$. The basic algorithm incurs almost constant communication costs regardless of the value of $E_{upd}$. In contrast, the communication cost of COSK increases with $E_{upd}$ because the query results and safe regions need to be updated frequently.
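The dynamic-network setup described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' code: monitoring regions are modeled as plain sets of edge ids, "associated with" is interpreted as set intersection, the scaling factor is drawn uniformly, and all function names are hypothetical.

```python
import random


def update_edge_weights(weights, original, ratio, rng):
    """Re-weight a fraction `ratio` of edges to 0.1x-10x their original length.

    weights, original: dicts mapping edge id -> length.
    Returns the set of updated edge ids. Uniform scaling is an assumption.
    """
    num_updates = round(ratio * len(weights))
    updated = rng.sample(sorted(weights), num_updates)
    for e in updated:
        weights[e] = original[e] * rng.uniform(0.1, 10.0)
    return set(updated)


def queries_to_refresh(updated_edges, monitoring_regions):
    """Queries whose monitoring region contains at least one updated edge."""
    return {q for q, region in monitoring_regions.items()
            if region & updated_edges}


original = {e: 1.0 for e in range(20)}
weights = dict(original)
regions = {"q1": {0, 1, 2}, "q2": {10, 11}}  # hypothetical monitoring regions

changed = update_edge_weights(weights, original, ratio=0.15,
                              rng=random.Random(1))
stale = queries_to_refresh(changed, regions)
print(len(changed), sorted(stale))
```

Under this model, a larger update ratio makes it more likely that some updated edge falls inside a query's monitoring region, so more queries need their results and safe regions recomputed. A baseline that re-issues every query at each timestamp does the same work regardless of the ratio.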
Figure 15: Effect of $E_{dir}$ on query processing time and communication cost. Panels: (a) query processing time (s); (b) communication cost (number of messages transferred). x-axis: $E_{dir}$ (10 to 50).
Figure 16: Effect of dataset on index size. y-axis: index size (MB). x-axis: datasets (Oldenburg, San Francisco, San Joaquin). Series: COSK and CMTkSK+.
8. Conclusion
In this paper, we investigated moving top-k spatial keyword queries in directed and dynamic road networks. We presented an efficient indexing framework using inverted files that indexes the data objects on edges, allowing for the effective searching of data objects relevant to queries in terms of both textual and spatial relevance. We also presented a safe-exit-based algorithm called COSK to monitor moving top-k spatial keyword queries. We demonstrated that the query results remain valid as long as the query object resides within a safe region. Furthermore, COSK can effectively monitor the validity of query results and safe regions in dynamic road networks. Finally, an experimental evaluation conducted on real road networks demonstrated that COSK significantly reduced the query processing time and communication costs compared to the CMTkSK+ algorithm.
Figure 17: Effect of $E_{upd}$ on query processing time and communication cost. Panels: (a) query processing time (s); (b) communication cost (number of messages transferred). x-axis: $E_{upd}$ (15 to 75). Series: COSK and Basic.
Data Availability
The real road network data used in this study have also been used in many previous studies. The road network data are cited in the manuscript and are available at https://www.cs.utah.edu/~lifeifei/SpatialDataset.htm. To simulate the moving queries, the authors used a spatiotemporal data generator, which has also been used in previous studies. The research article describing the generator is cited in the manuscript. The documentation and source files of the generator are available at https://iapg.jade-hs.de/personen/brinkhoff/generator/. The authors used Twitter tweets to generate the descriptions of data objects as well as the query keywords. The tweets used can be accessed at http://followthehashtag.com/datasets/free-twitter-dataset-usa-200000-free-usa-tweets/.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
Hyung-Ju Cho was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean Government (MSIP) (NRF-2016R1A2B4009793), and this research was partially supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2016R1D1A1B03934129).
References
[1] D. Papadias, N. Mamoulis, J. Zhang, and Y. Tao, "Query processing in spatial network databases," in Proceedings of the 29th International Conference on Very Large Data Bases (VLDB '03), pp. 802–813, September 2003.
[2] H.-J. Cho, K. Ryu, and T.-S. Chung, "An efficient algorithm for computing safe exit points of moving range queries in directed road networks," Information Systems, vol. 41, pp. 1–19, 2014.
[3] G. Tsatsanifos and A. Vlachou, "On processing Top-k spatio-textual preference queries," in Proceedings of the 18th International Conference on Extending Database Technology (EDBT '15), pp. 433–444, March 2015.
[4] R. Li, A. X. Liu, A. L. Wang, and B. Bruhadeshwar, "Fast range query processing with strong privacy protection for cloud computing," Proceedings of the VLDB Endowment, vol. 7, no. 14, pp. 1953–1964, 2014.
[5] G. Cong, C. S. Jensen, and D. Wu, "Efficient retrieval of the Top-k most relevant spatial web objects," Proceedings of the VLDB Endowment, vol. 2, no. 1, pp. 337–348, 2009.
[6] Z. Li, K. C. K. Lee, B. Zheng, W.-C. Lee, D. Lee, and X. Wang, "IR-tree: An efficient index for geographic document search," IEEE Transactions on Knowledge and Data Engineering, vol. 23, no. 4, pp. 585–599, 2011.
[7] Y. Zhou, X. Xie, C. Wang, Y. Gong, and W. Ma, "Hybrid index structures for location-based web search," in Proceedings of the 14th ACM International Conference on Information and Knowledge Management, pp. 155–162, Bremen, Germany, October 2005.
[8] J. Zobel and A. Moffat, "Inverted files for text search engines," ACM Computing Surveys, vol. 38, no. 2, 2006.
[9] N. Beckmann, H. Kriegel, R. Schneider, and B. Seeger, "The R*-tree: an efficient and robust access method for points and rectangles," in Proceedings of the ACM SIGMOD International Conference on Management of Data, vol. 19, pp. 322–331, May 1990.
[10] R. Hariharan, B. Hore, C. Li, and S. Mehrotra, "Processing spatial-keyword (sk) queries in geographic information retrieval (gir) systems," in Proceedings of the 19th International Conference on Scientific and Statistical Database Management (SSDBM '07), July 2007.
[11] I. De Felipe, V. Hristidis, and N. Rishe, "Keyword search on spatial databases," in Proceedings of the 24th International Conference on Data Engineering (ICDE '08), pp. 656–665, April 2008.
[12] J. B. Rocha-Junior, O. Gkorgkas, S. Jonassen, and K. Nørvåg, "Efficient processing of top-k spatial keyword queries," in Proceedings of the International Symposium on Spatial and Temporal Databases, pp. 205–222, Springer, 2011.
[13] D. Zhang, K.-L. Tan, and A. K. Tung, "Scalable top-k spatial keyword search," in Proceedings of the 16th International Conference on Extending Database Technology, pp. 359–370, 2013.
[14] J. B. Rocha-Junior and K. Nørvåg, "Top-k spatial keyword queries on road networks," in Proceedings of the 15th International Conference on Extending Database Technology, pp. 168–179, Berlin, Germany, March 2012.
[15] H.-J. Cho, S. J. Kwon, and T.-S. Chung, "A safe exit algorithm for continuous nearest neighbor monitoring in road networks," Mobile Information Systems, vol. 9, no. 1, pp. 37–53, 2013.
[16] D. Yung, M. L. Yiu, and E. Lo, "A safe-exit approach for efficient network-based moving range queries," Data & Knowledge Engineering, vol. 72, pp. 126–147, 2012.
[17] M. Attique, H. Cho, R. Jin, and T. Chung, "Efficient Processing of Continuous Reverse k Nearest Neighbor on Moving Objects in Road Networks," ISPRS International Journal of Geo-Information, vol. 5, no. 12, p. 247, 2016.
[18] H. G. Elmongui, M. F. Mokbel, and W. G. Aref, "Continuous aggregate nearest neighbor queries," GeoInformatica, vol. 17, no. 1, pp. 63–95, 2013.
[19] D. Wu, M. L. Yiu, C. S. Jensen, and G. Cong, "Efficient continuously moving top-k spatial keyword query processing," in Proceedings of the IEEE International Conference on Data Engineering (ICDE '11), pp. 541–552, Hannover, Germany, April 2011.
[20] W. Huang, G. Li, K.-L. Tan, and J. Feng, "Efficient safe-region construction for moving top-k spatial keyword queries," in Proceedings of the 21st ACM International Conference on Information and Knowledge Management, pp. 932–941, 2012.
[21] L. Guo, J. Shao, H. H. Aung, and K.-L. Tan, "Efficient continuous top-k spatial keyword queries on road networks," GeoInformatica, vol. 19, no. 1, pp. 29–60, 2014.
[22] Y. Li, G. Li, L. Shu, Q. Huang, and H. Jiang, "Continuous monitoring of top-k spatial keyword queries in road networks," Journal of Information Science and Engineering, vol. 31, no. 6, pp. 1831–1848, 2015.
[23] M. Attique, A. Khan, and T.-S. Chung, "ESPAK: Top-k spatial keyword query processing in directed road networks," in Proceedings of the Workshops of the International Conference on Extending Database Technology and the International Conference on Database Theory (EDBT/ICDT '17), March 2017.
[24] G. Salton and C. Buckley, "Term-weighting approaches in automatic text retrieval," Information Processing & Management, vol. 24, no. 5, pp. 513–523, 1988.
[25] V. N. Anh, O. de Kretser, and A. Moffat, "Vector-space ranking with effective early termination," in Proceedings of the 24th Annual International ACM SIGIR Conference, pp. 35–42, New Orleans, LA, USA, 2001.
[26] E. W. Dijkstra, "A note on two problems in connexion with graphs," Numerische Mathematik, vol. 1, pp. 269–271, 1959.
[27] "Real datasets for spatial databases," https://www.cs.utah.edu/~lifeifei/SpatialDataset.htm.
[28] "Twitter," https://twitter.com.
[29] T. Brinkhoff, "A framework for generating network-based moving objects," GeoInformatica, vol. 6, no. 2, pp. 153–180, 2002.
Table 5: Experimental parameter settings.
Parameter | Range
Number of results (k) | 5, 10, 15, 20, 25
Number of keywords (n) | 1, 2, 3, 4, 5
Query parameter ($\alpha$) | 0.01, 0.1, 1, 10, 100
Dataset | Oldenburg, San Francisco, San Joaquin
Number of data objects ($N_D$) | 10, 20, 30, 40, 50 (×1000)
Speed of query objects ($V_{qry}$) | 25, 50, 75, 100, 125 (km/h)
Mobility ($M_{qry}$) | 20, 40, 60, 80, 100 (%)
Ratio of directed edges ($E_{dir}$) | 10, 20, 30, 40, 50 (%)
Ratio of updated edges ($E_{upd}$) | 15, 30, 60, 80, 100 (%)
8GB of memory In the experiments we compared (1) queryprocessing times (2) edges processed ie the number ofedges processed for retrieving query results and (3) indexsizes Table 5 summarizes the parameters used in the exper-iments In each experiment we varied a single parameterwithin the range that is shown in Table 5 while maintainingthe other parameters at the bolded default values
We evaluated the performance of the algorithms by usingthe following measures (1) total amount of server CPUtime which indicates the query processing time and (2)total communication cost as the total number of points (iethe location updates sent by query objects and the queryresults and safe exit points returned by the server) transferredbetween clients and the serverThebattery power andwirelessbandwidth consumption typically increase with the amountof data transferred between objects (clients) and serversThus we used the amount of transferred data as a metric toevaluate the communication cost
72 Experimental Results of Top-k Spatial KeywordQueries in Static Road Networks
721 Effect of k Figure 9 indicates the effect of the numberof results on the query processing time and communicationcost for both algorithms Figure 9(a) indicates that the queryprocessing time increases for both algorithms as the value ofk increases This is expected because with an increase in kmore data objects are required to be explored and verifiedNevertheless COSK significantly outperforms CMTkSK+ fortwo main reasons First a relevant object search is very effi-cient when using the highest significant factor and secondCOSKdoes not need to verify the set of answer objects as longas the query object lies in a safe region On the other handthe CMTkSK+ query processing time increases significantlybecause it has to monitor and verify the set of candidateobjects periodically In Figure 9(b) the communication costsfor both algorithms increase as the number of objects in-creases However the proposed algorithm demonstrates su-perior performance compared to CMTkSK+ because client-server communication is not required when the query objectlies within the safe exit points whereas in CMTkSK+ thequery object is required to report its location to the serverwhenever it moves
722 Effect of119873119863 This experimentwas conducted on datasetSan Joaquin This dataset included 19098 data objects there-fore we randomly generated approximately 30000 additionaldata objects on different edges In Figure 10 we evaluate theperformance of COSK and CMTkSK+ by varying the cardi-nality of the data objects Note that119873119863 = 10119870 corresponds toa low density of data points while119873119863 = 50119870 corresponds toa high density In Figure 10(a) it is interesting to notice thatthe query processing times of both algorithms decrease asthe cardinality of the data objects increases For CMTkSK+this is because with high density the monitoring range of aquery decreases However for COSK it is mainly becausewhen the data density is high fewer edges are required tobe expanded which decreases the query processing time InFigure 10(b) we study the influence of the cardinality of thedata objects on the communication costs The experimentalresults indicate that the communication costs of CMTkSK+incur almost constant communication costs regardless ofdata object cardinality However the communication costsof COSK increase in proportion to the 119873119863 value This isexpected because the safe region becomes smaller as thedensity of the data objects increases which increases thecommunication costs
723 Effect of Query Keywords (n) Figure 11 shows thequery processing time and communication for COSK andCMTkSK+ as a function of the number of query keywordsFigures 11(a) and 11(b) show the trend that the performanceof both algorithms degrades when the number of keywordsincreases This is mainly because by increasing the numberof query keywords the number of relevant objects may alsoincrease resulting in a higher query processing time andcommunication cost However the safe-region-based algo-rithm COSK scales better than CMTkSk+ because of its lessexpensive monitoring technique
724 Effect of 120572 Figure 12 demonstrates the impact of queryparameter 120572 on the query processing time and on the com-munication cost A small value of 120572 indicates a greater im-portance of textual relevance whereas a high value of 120572gives more preference to the spatial relevance It is interestingto note that the query processing time is lower for higher
14 Wireless Communications and Mobile Computing
k
50
10
10
15 20
20
30
Que
ry p
roce
ssin
g tim
e (s)
COSKCMTkSK+
40
25
(a) Query processing time
COSKCMTkSK+
100
1k
10k
100k
of
mes
sage
s tra
nsfe
rred
1M
k
5 10 15 20 25
(b) Communication cost
Figure 9 Effect of k on query processing time and number of edges processed
COSKCMTkSK+
0
10
20
30
Que
ry p
roce
ssin
g tim
e (s)
40
10k 20k 30k 40k 50kND
(a) Query processing time
COSKCMTkSK+
100
1k
10k
100k
of
tran
sferr
ed m
essa
ges
1M
10 20 30 40 50ND
(b) Communication cost
Figure 10 Effect of119873119863 on query processing time and communication cost
values of 120572 which indicates more importance to the spatialrelevance This is mainly because when the spatial relevanceis higher fewer edges and objects are required to be exploredand processed to determine the top-k data objects Observethat in Figure 12(b) the number of messages sent by COSKdecreases sharply with an increase in 120572725 Effect of Speed Figure 13(a) demonstrates the influenceof the speed of the query objects on the query processingtime of the COSK and CMTkSK+ algorithms The experi-mental results indicate that the performance of CMTkSK+is not significantly influenced by the speed of the query
objects because the candidate objects must be continuouslymonitored after a regular interval of time regardless ofthe speed On the other hand for COSK the performancegradually decreases as the speed of the query objects increasesbecause the objects leave their respective safe regions morefrequently Figure 13(b) shows the communication costs ofCOSK and CMTkSK+ with respect to the speed of the queryobjects CMTkSK+ incurs almost constant communicationcosts because a server-initiated request to verify the candidateobjects does not depend on the speed For COSK the queryobjects cross safe regions more frequently when the speed ishigh which increases the communication costs
Wireless Communications and Mobile Computing 15
Number of keywords1 2 3 4 5
COSKCMTkSK+
0
15
30
45
Que
ry p
roce
ssin
g tim
e (s)
60
(a) Query processing time
COSK
Number of keywords
CMTkSK+
100
1k
10k
100k
of
mes
sage
s tra
nsfe
rred
1M
1 2 3 4 5
(b) Communication cost
Figure 11 Effect of number of keywords on query processing time and communication cost
001 01 1 10 100
COSKCMTkSK+
0
10
20
30
Que
ry p
roce
ssin
g tim
e (s)
40
(a) Query processing time
COSKCMTkSK+
100
1k
10k
100k
of
mes
sage
s tra
nsfe
rred
1M
001 01 1 10 100
(b) Communication cost
Figure 12 Effect of 120572 on query processing time and communication cost
726 Effect of Mobility Figure 14 shows the effect of mobility119872119902119903119910 (mobility refers to the percentage of query objects thatare moving at any timestamp) on the performance of COSKand CMTkSK+ algorithms As expected the query pro-cessing time and communication costs for both algorithmsincrease with119872119902119903y Nevertheless COSK performs better thanCMTkSK+ in terms of query processing time and commu-nication costs
727 Effect of Directed Edges Figure 15 shows the impactof percentage of directed edges 119864119889119894119903 on the performance ofCOSK and CMTkSK+ algorithms The query processing time
increases with 119864119889119894119903 because algorithm needs to explore moreedges to retrieve the top-k keyword queries However thecommunication cost is not significantly affected by the valueof 119864119889119894119903 for both the algorithms
728 Effect of Datasets Figure 16 demonstrates the indexsizes of the COSK and CMTkSK+ approaches for differentdatasets As shown in Figure 16 both algorithms have similarindex sizes However COSK has minor space overheadbecause it stores additional information of the highest signifi-cance factor 120579119905 of edges More important this space overheadis minimal as compared to the gain achieved by COSK inquery processing time and communication costs
16 Wireless Communications and Mobile Computing
25 50 75 100 125
COSKCMTkSK+
0
10
20
30
Que
ry p
roce
ssin
g tim
e (s)
40
Vqry
(a) Query processing time
COSKCMTkSK+
100
1k
10k
100k
of
mes
sage
s tra
nsfe
rred
1M
25 50 75 100 125Vqry
(b) Communication cost
Figure 13 Effect of speed on query processing time and communication cost
20 40 60 80 100Mqry
COSKCMTkSK+
0
15
45
30
60
Que
ry p
roce
ssin
g tim
e (s)
(a) Query processing time
100
10k
100k
of
mes
sage
s tra
nsfe
rred
1M
20 40 60 80 100Mqry
1k
COSKCMTkSK+
(b) Communication cost
Figure 14 Effect of mobility on query processing time and communication cost
73 Experimental Results of Top-k Spatial Keyword Queriesin Dynamic Road Networks In this section we evaluate theperformance of COSK and basic algorithm for dynamic roadnetworks The 119864119906119901119889 indicates the percentage of all edges thatchange their weight at each timestamp The length of anupdated edge is randomly selected between 01 to 10 times theoriginal length Figure 17(a) depicts the query processing timeof COSK and basic algorithm It is evident from the figure thatquery processing time of basic algorithm is not significantlyaffected by 119864119906119901119889 This is mainly because the query objectsissue top-k spatial queries at each timestamp However query
processing time of COSK increases with the value of 119864119906119901119889because the probability that the updated edge may associatedwith the monitoring region of query q increases with 119864119906119901119889Therefore when 119864119906119901119889 becomes large the results need to befrequently updated which increases the query processingtime Figure 17(b) shows the communication costs of COSKand basic algorithm with respect to 119864119906119901119889 Basic algorithmincurs almost constant communication costs regardless of thevalue of 119864119906119901119889 In contrast the communication cost of COSKincreases with 119864119906119901119889 because the query result and safe regionsneeds to be frequently updated
Wireless Communications and Mobile Computing 17
COSKCMTkSK+
10 20 30 40 50Edir
0
10
20
30
Que
ry p
roce
ssin
g tim
e (s)
40
(a) Query processing time
100
10k
100k
of
mes
sage
s tra
nsfe
rred
1M
1k
10 20 30 40 50Edir
eSPAKCMTkSK+
(b) Communication cost
Figure 15 Effect of 119864119889119894119903 on query processing time and communication cost
COSKCMTkSK+
0
15
45
30
60
Inde
x siz
e (M
B)
OldenburgDatasets
San Francisco San Joaquin
Figure 16 Effect of dataset on index size
8 Conclusion
In this paper we investigated moving top-k spatial keywordqueries in directed and dynamic road networksWepresentedan efficient indexing framework using inverted files thatindexes the data objects on edges allowing for the effectivesearching of data objects relevant to queries in terms ofboth textual and spatial relevance We also presented a safe-exit-based algorithm called COSK to monitor moving top-k spatial keyword queries We demonstrated that the queryresults remain valid as long as the query object resides withina safe region Furthermore COSK can effectively monitor thevalidity of query results and safe regions in dynamic roadnetworks Finally an experimental evaluation conducted on
real road networks demonstrated that COSK significantlyreduced the query processing time and communication costscompared to the CMTkSK+ algorithm
Data Availability
The real road network data used in this study are also used inmany previous studies The road network data is cited in themanuscript and it is available at httpswwwcsutahedusimlifeifeiSpatialDatasethtm To simulate the moving queriesthe authors used the spatiotemporal data generator which isalso used in previous studiesThe research article of generatoris cited in the manuscript The documentation and source
18 Wireless Communications and Mobile Computing
0
20
40
60
Que
ry p
roce
ssin
g tim
e (s)
80
15 30 45 60 75Eupd
COSKBasic
(a) Query processing time
15 30 45 60 75Eupd
100
1k
10k
100k
of
mes
sage
s tra
nsfe
rred
1M
COSKBasic
(b) Communication cost
Figure 17 Effect of 119864119906119901119889 on query processing time and communication cost
files of generator are available at httpsiapgjade-hsdeper-sonenbrinkhoffgenerator They used the Twitter tweetsfor generating the description of data objects and also querykeywords The tweets used can be accessible at httpfollow-thehashtagcomdatasetsfree-twitter-dataset-usa-200000-free-usa-tweets
Conflicts of Interest
The authors declare that there is no conflicts of interestregarding the publication of this paper
Acknowledgments
Hyung-JuChowas supported by theNational Research Foun-dation of Korea (NRF) grant funded by the Korean Govern-ment (MSIP) (NRF-2016R1A2B4009793) and this researchwas partially supported by Basic Science Research Programthrough the National Research Foundation of Korea (NRF)fundedby theMinistry of Education (2016R1D1A1B03934129)
References
[1] D. Papadias, N. Mamoulis, J. Zhang, and Y. Tao, “Query processing in spatial network databases,” in Proceedings of the 29th International Conference on Very Large Data Bases (VLDB ’03), pp. 802–813, September 2003.
[2] H.-J. Cho, K. Ryu, and T.-S. Chung, “An efficient algorithm for computing safe exit points of moving range queries in directed road networks,” Information Systems, vol. 41, pp. 1–19, 2014.
[3] G. Tsatsanifos and A. Vlachou, “On processing Top-k spatio-textual preference queries,” in Proceedings of the 18th International Conference on Extending Database Technology (EDBT ’15), pp. 433–444, March 2015.
[4] R. Li, A. X. Liu, A. L. Wang, and B. Bruhadeshwar, “Fast range query processing with strong privacy protection for cloud computing,” Proceedings of the VLDB Endowment, vol. 7, no. 14, pp. 1953–1964, 2014.
[5] G. Cong, C. S. Jensen, and D. Wu, “Efficient retrieval of the Top-k most relevant spatial web objects,” Proceedings of the VLDB Endowment, vol. 2, no. 1, pp. 337–348, 2009.
[6] Z. Li, K. C. K. Lee, B. Zheng, W.-C. Lee, D. Lee, and X. Wang, “IR-tree: An efficient index for geographic document search,” IEEE Transactions on Knowledge and Data Engineering, vol. 23, no. 4, pp. 585–599, 2011.
[7] Y. Zhou, X. Xie, C. Wang, Y. Gong, and W. Ma, “Hybrid index structures for location-based web search,” in Proceedings of the 14th ACM International Conference on Information and Knowledge Management, pp. 155–162, Bremen, Germany, October 2005.
[8] J. Zobel and A. Moffat, “Inverted files for text search engines,” ACM Computing Surveys, vol. 38, no. 2, 2006.
[9] N. Beckmann, H. Kriegel, R. Schneider, and B. Seeger, “The R*-tree: an efficient and robust access method for points and rectangles,” in Proceedings of the ACM SIGMOD International Conference on Management of Data, vol. 19, pp. 322–331, May 1990.
[10] R. Hariharan, B. Hore, C. Li, and S. Mehrotra, “Processing spatial-keyword (sk) queries in geographic information retrieval (gir) systems,” in Proceedings of the 19th International Conference on Scientific and Statistical Database Management (SSDBM ’07), July 2007.
[11] I. De Felipe, V. Hristidis, and N. Rishe, “Keyword search on spatial databases,” in Proceedings of the 24th International Conference on Data Engineering (ICDE ’08), pp. 656–665, April 2008.
[12] J. B. Rocha-Junior, O. Gkorgkas, S. Jonassen, and K. Nørvåg, “Efficient processing of top-k spatial keyword queries,” in Proceedings of the International Symposium on Spatial and Temporal Databases, pp. 205–222, Springer, 2011.
[13] D. Zhang, K.-L. Tan, and A. K. Tung, “Scalable top-k spatial keyword search,” in Proceedings of the 16th International Conference on Extending Database Technology, pp. 359–370, 2013.
[14] J. B. Rocha-Junior and K. Nørvåg, “Top-k spatial keyword queries on road networks,” in Proceedings of the 15th International Conference on Extending Database Technology, pp. 168–179, Berlin, Germany, March 2012.
[15] H.-J. Cho, S. J. Kwon, and T.-S. Chung, “A safe exit algorithm for continuous nearest neighbor monitoring in road networks,” Mobile Information Systems, vol. 9, no. 1, pp. 37–53, 2013.
[16] D. Yung, M. L. Yiu, and E. Lo, “A safe-exit approach for efficient network-based moving range queries,” Data & Knowledge Engineering, vol. 72, pp. 126–147, 2012.
[17] M. Attique, H. Cho, R. Jin, and T. Chung, “Efficient Processing of Continuous Reverse k Nearest Neighbor on Moving Objects in Road Networks,” ISPRS International Journal of Geo-Information, vol. 5, no. 12, p. 247, 2016.
[18] H. G. Elmongui, M. F. Mokbel, and W. G. Aref, “Continuous aggregate nearest neighbor queries,” GeoInformatica, vol. 17, no. 1, pp. 63–95, 2013.
[19] D. Wu, M. L. Yiu, C. S. Jensen, and G. Cong, “Efficient continuously moving top-k spatial keyword query processing,” in Proceedings of the IEEE International Conference on Data Engineering (ICDE ’11), pp. 541–552, Hannover, Germany, April 2011.
[20] W. Huang, G. Li, K.-L. Tan, and J. Feng, “Efficient safe-region construction for moving top-k spatial keyword queries,” in Proceedings of the 21st ACM International Conference on Information and Knowledge Management, pp. 932–941, 2012.
[21] L. Guo, J. Shao, H. H. Aung, and K.-L. Tan, “Efficient continuous top-k spatial keyword queries on road networks,” GeoInformatica, vol. 19, no. 1, pp. 29–60, 2014.
[22] Y. Li, G. Li, L. Shu, Q. Huang, and H. Jiang, “Continuous monitoring of top-k spatial keyword queries in road networks,” Journal of Information Science and Engineering, vol. 31, no. 6, pp. 1831–1848, 2015.
[23] M. Attique, A. Khan, and T.-S. Chung, “ESPAK: Top-k spatial keyword query processing in directed road networks,” in Proceedings of the Workshops of the International Conference on Extending Database Technology and the International Conference on Database Theory (EDBT/ICDT ’17), March 2017.
[24] G. Salton and C. Buckley, “Term-weighting approaches in automatic text retrieval,” Information Processing & Management, vol. 24, no. 5, pp. 513–523, 1988.
[25] V. N. Anh, O. de Kretser, and A. Moffat, “Vector-space ranking with effective early termination,” in Proceedings of the 24th Annual International ACM SIGIR Conference, pp. 35–42, New Orleans, LA, USA, 2001.
[26] E. W. Dijkstra, “A note on two problems in connexion with graphs,” Numerische Mathematik, vol. 1, pp. 269–271, 1959.
[27] “Real datasets for spatial databases,” https://www.cs.utah.edu/~lifeifei/SpatialDataset.htm.
[28] “Twitter,” https://twitter.com/.
[29] T. Brinkhoff, “A framework for generating network-based moving objects,” GeoInformatica, vol. 6, no. 2, pp. 153–180, 2002.
Figure 9: Effect of k on query processing time and number of edges processed. (a) Query processing time (s) versus k (5 to 25); (b) communication cost (number of messages transferred, 100 to 1M) versus k. Series: COSK and CMTkSK+.
Figure 10: Effect of $N_D$ on query processing time and communication cost. (a) Query processing time (s) versus $N_D$ (10k to 50k); (b) communication cost (number of messages transferred, 100 to 1M) versus $N_D$. Series: COSK and CMTkSK+.
values of $\alpha$, which give more importance to spatial relevance. This is mainly because, when spatial relevance carries more weight, fewer edges and objects need to be explored and processed to determine the top-k data objects. Observe in Figure 12(b) that the number of messages sent by COSK decreases sharply as $\alpha$ increases.

7.2.5. Effect of Speed. Figure 13(a) demonstrates the influence of the speed of the query objects on the query processing time of the COSK and CMTkSK+ algorithms. The experimental results indicate that the performance of CMTkSK+ is not significantly influenced by the speed of the query objects because the candidate objects must be monitored at regular time intervals regardless of the speed. For COSK, on the other hand, the performance gradually degrades as the speed of the query objects increases because the objects leave their respective safe regions more frequently. Figure 13(b) shows the communication costs of COSK and CMTkSK+ with respect to the speed of the query objects. CMTkSK+ incurs almost constant communication costs because a server-initiated request to verify the candidate objects does not depend on the speed. For COSK, the query objects cross safe regions more frequently when the speed is high, which increases the communication costs.
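The contrast between the two monitoring styles described above can be illustrated with a toy simulation. The sketch below is not the COSK or CMTkSK+ implementation. It assumes a query moving along a straight line, safe regions modeled as intervals of a fixed length (`region_len`, a hypothetical parameter), and a periodic scheme that exchanges one message per timestamp. It only shows why the message count grows with speed under a safe-region scheme and stays constant under periodic monitoring.

```python
def periodic_messages(num_timestamps: int) -> int:
    """Periodic monitoring: one message per timestamp, independent of speed."""
    return num_timestamps


def safe_region_messages(speed: float, num_timestamps: int, region_len: float) -> int:
    """Safe-region monitoring: the client contacts the server only when it
    leaves its current safe region. The region is modeled as an interval of
    length region_len on a line, which is a simplifying assumption."""
    messages = 1  # initial query registration
    position = 0.0
    region_end = region_len  # exit point of the current safe region
    for _ in range(num_timestamps):
        position += speed
        if position > region_end:
            messages += 1
            # The server answers with a new safe region that starts
            # at the client's current position.
            region_end = position + region_len
    return messages


if __name__ == "__main__":
    for speed in (25, 50, 75, 100, 125):
        print(speed, periodic_messages(100), safe_region_messages(speed, 100, 100.0))
```

In this toy model a fast query can end up sending about as many messages as the periodic scheme, whereas a slow query sends far fewer. The measured costs in the paper depend on the actual road network and safe-region shapes, which this sketch does not attempt to reproduce.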
Figure 11: Effect of number of keywords on query processing time and communication cost. (a) Query processing time (s) versus number of keywords (1 to 5); (b) communication cost (number of messages transferred, 100 to 1M) versus number of keywords. Series: COSK and CMTkSK+.
Figure 12: Effect of $\alpha$ on query processing time and communication cost. (a) Query processing time (s) versus $\alpha$ (0.01 to 100); (b) communication cost (number of messages transferred, 100 to 1M) versus $\alpha$. Series: COSK and CMTkSK+.
7.2.6. Effect of Mobility. Figure 14 shows the effect of the mobility $M_{qry}$ (the percentage of query objects that are moving at any timestamp) on the performance of the COSK and CMTkSK+ algorithms. As expected, the query processing time and communication costs of both algorithms increase with $M_{qry}$. Nevertheless, COSK performs better than CMTkSK+ in terms of both query processing time and communication costs.
7.2.7. Effect of Directed Edges. Figure 15 shows the impact of the percentage of directed edges $E_{dir}$ on the performance of the COSK and CMTkSK+ algorithms. The query processing time increases with $E_{dir}$ because the algorithms need to explore more edges to answer the top-k spatial keyword queries. However, the communication cost is not significantly affected by the value of $E_{dir}$ for either algorithm.
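A small worked example may help show why directed edges can force a search to explore more of the network. The sketch below runs Dijkstra's algorithm [26] on a hypothetical four-vertex network in which some streets are one-way. The network, the vertex names, and the function `shortest_distances` are illustrative assumptions and are not taken from the paper's implementation.

```python
import heapq


def shortest_distances(graph, source):
    """Dijkstra's algorithm on a directed graph with non-negative weights.

    graph maps each vertex to a list of (neighbor, weight) pairs; an edge
    u -> v can be traversed only in that direction.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Hypothetical toy network: a one-way ring a -> b -> c -> a
# plus a two-way street between a and d.
toy_network = {
    "a": [("b", 1.0), ("d", 2.0)],
    "b": [("c", 1.0)],
    "c": [("a", 1.0)],
    "d": [("a", 2.0)],
}

dist_from_a = shortest_distances(toy_network, "a")
dist_from_b = shortest_distances(toy_network, "b")
```

Here the distance from a to b is 1, but the distance from b to a is 2 because the one-way ring forces a detour through c. In a directed network the distance from the query to an object and the distance back can therefore differ, and a search may have to traverse additional edges to reach an object that is nearby in the undirected sense.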
7.2.8. Effect of Datasets. Figure 16 shows the index sizes of the COSK and CMTkSK+ approaches for different datasets. As shown in Figure 16, both algorithms have similar index sizes. However, COSK has a minor space overhead because it stores additional information, namely the highest significance factor $\theta_t$ of edges. More importantly, this space overhead is minimal compared with the gains achieved by COSK in query processing time and communication costs.
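To make the source of this overhead concrete, the following sketch builds a small edge-based inverted index that keeps, for every keyword and edge, the largest keyword weight found among the objects on that edge. It is a minimal illustration under stated assumptions: the function names, the tuple layout, and the use of a per-edge maximum weight as a stand-in for the significance factor $\theta_t$ are all hypothetical, and the paper's actual index layout may differ.

```python
from collections import defaultdict


def build_edge_index(objects):
    """Build keyword -> edge -> postings, plus a per-(keyword, edge) maximum.

    objects is an iterable of (object_id, edge_id, {keyword: weight}) tuples.
    The stored maximum is an upper bound on the weight that any object on the
    edge can contribute for that keyword (an assumed stand-in for the
    significance factor mentioned in the text).
    """
    postings = defaultdict(lambda: defaultdict(list))
    max_weight = defaultdict(dict)
    for object_id, edge_id, weights in objects:
        for keyword, weight in weights.items():
            postings[keyword][edge_id].append((object_id, weight))
            previous = max_weight[keyword].get(edge_id, 0.0)
            max_weight[keyword][edge_id] = max(previous, weight)
    return postings, max_weight


def edge_upper_bound(max_weight, edge_id, query_keywords):
    """Upper bound on the summed keyword weight of any object on edge_id."""
    return sum(max_weight.get(k, {}).get(edge_id, 0.0) for k in query_keywords)


# Hypothetical data objects located on two edges.
sample_objects = [
    ("o1", "e1", {"pizza": 0.8, "pasta": 0.3}),
    ("o2", "e1", {"pizza": 0.5}),
    ("o3", "e2", {"pasta": 0.9}),
]
sample_postings, sample_max = build_edge_index(sample_objects)
```

The extra storage is one number per keyword and edge pair, which is small relative to the postings themselves. Such a bound lets a search skip an edge whose best possible textual score cannot improve the current top-k result.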
Figure 13: Effect of speed on query processing time and communication cost. (a) Query processing time (s) versus $V_{qry}$ (25 to 125); (b) communication cost (number of messages transferred, 100 to 1M) versus $V_{qry}$. Series: COSK and CMTkSK+.
Figure 14: Effect of mobility on query processing time and communication cost. (a) Query processing time (s) versus $M_{qry}$ (20 to 100); (b) communication cost (number of messages transferred, 100 to 1M) versus $M_{qry}$. Series: COSK and CMTkSK+.
7.3. Experimental Results of Top-k Spatial Keyword Queries in Dynamic Road Networks. In this section, we evaluate the performance of COSK and the basic algorithm for dynamic road networks. $E_{upd}$ indicates the percentage of all edges that change their weight at each timestamp. The length of an updated edge is randomly selected between 0.1 and 10 times the original length. Figure 17(a) depicts the query processing time of COSK and the basic algorithm. It is evident from the figure that the query processing time of the basic algorithm is not significantly affected by $E_{upd}$. This is mainly because the query objects issue top-k spatial queries at each timestamp. However, the query processing time of COSK increases with the value of $E_{upd}$ because the probability that an updated edge is associated with the monitoring region of a query q increases with $E_{upd}$. Therefore, when $E_{upd}$ becomes large, the results need to be updated frequently, which increases the query processing time. Figure 17(b) shows the communication costs of COSK and the basic algorithm with respect to $E_{upd}$. The basic algorithm incurs almost constant communication costs regardless of the value of $E_{upd}$. In contrast, the communication cost of COSK increases with $E_{upd}$ because the query results and safe regions need to be updated frequently.
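The update model in this experiment can be sketched in a few lines. The code below picks $E_{upd}$ percent of the edges, rescales each to between 0.1 and 10 times its original length, and then reports which queries have an updated edge inside their monitoring region. The function names, the dictionary layout, and the choice of a uniform distribution for the scaling factor are assumptions made for illustration; the paper states only the range of the factor.

```python
import random


def apply_edge_updates(original_lengths, e_upd_percent, rng):
    """Pick e_upd_percent % of the edges and rescale each one to between
    0.1x and 10x its original length (uniform factor, an assumption)."""
    edge_ids = sorted(original_lengths)
    num_updates = round(len(edge_ids) * e_upd_percent / 100)
    updated = {}
    for edge_id in rng.sample(edge_ids, num_updates):
        updated[edge_id] = original_lengths[edge_id] * rng.uniform(0.1, 10.0)
    return updated


def affected_queries(monitoring_regions, updated_edges):
    """Return the queries whose monitoring region contains an updated edge.

    monitoring_regions maps a query id to the set of edge ids it monitors.
    """
    return {
        query_id
        for query_id, region_edges in monitoring_regions.items()
        if not region_edges.isdisjoint(updated_edges)
    }


if __name__ == "__main__":
    rng = random.Random(42)
    lengths = {f"e{i}": 10.0 for i in range(100)}
    regions = {"q1": {"e1", "e2", "e3"}, "q2": {"e50"}}
    for e_upd in (15, 30, 45, 60, 75):
        updates = apply_edge_updates(lengths, e_upd, rng)
        print(e_upd, sorted(affected_queries(regions, updates)))
```

Only the queries returned by `affected_queries` would need their results and safe regions rechecked in such a scheme, so the work grows with the share of updated edges, while a method that re-issues every query at each timestamp does the same work regardless.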
Figure 15: Effect of $E_{dir}$ on query processing time and communication cost. (a) Query processing time (s) versus $E_{dir}$ (10 to 50); (b) communication cost (number of messages transferred, 100 to 1M) versus $E_{dir}$. Series: COSK and CMTkSK+ (the legend of panel (b) reads eSPAK and CMTkSK+).
Figure 16: Effect of dataset on index size. Index size (MB) for the Oldenburg, San Francisco, and San Joaquin datasets. Series: COSK and CMTkSK+.
8. Conclusion

In this paper, we investigated moving top-k spatial keyword queries in directed and dynamic road networks. We presented an efficient indexing framework using inverted files that indexes the data objects on edges, allowing for the effective searching of data objects relevant to queries in terms of both textual and spatial relevance. We also presented a safe-exit-based algorithm called COSK to monitor moving top-k spatial keyword queries. We demonstrated that the query results remain valid as long as the query object resides within a safe region. Furthermore, COSK can effectively monitor the validity of query results and safe regions in dynamic road networks. Finally, an experimental evaluation conducted on real road networks demonstrated that COSK significantly reduced the query processing time and communication costs compared to the CMTkSK+ algorithm.
Wireless Communications and Mobile Computing 15
Number of keywords1 2 3 4 5
COSKCMTkSK+
0
15
30
45
Que
ry p
roce
ssin
g tim
e (s)
60
(a) Query processing time
COSK
Number of keywords
CMTkSK+
100
1k
10k
100k
of
mes
sage
s tra
nsfe
rred
1M
1 2 3 4 5
(b) Communication cost
Figure 11 Effect of number of keywords on query processing time and communication cost
001 01 1 10 100
COSKCMTkSK+
0
10
20
30
Que
ry p
roce
ssin
g tim
e (s)
40
(a) Query processing time
COSKCMTkSK+
100
1k
10k
100k
of
mes
sage
s tra
nsfe
rred
1M
001 01 1 10 100
(b) Communication cost
Figure 12 Effect of 120572 on query processing time and communication cost
726 Effect of Mobility Figure 14 shows the effect of mobility119872119902119903119910 (mobility refers to the percentage of query objects thatare moving at any timestamp) on the performance of COSKand CMTkSK+ algorithms As expected the query pro-cessing time and communication costs for both algorithmsincrease with119872119902119903y Nevertheless COSK performs better thanCMTkSK+ in terms of query processing time and commu-nication costs
727 Effect of Directed Edges Figure 15 shows the impactof percentage of directed edges 119864119889119894119903 on the performance ofCOSK and CMTkSK+ algorithms The query processing time
increases with 119864119889119894119903 because algorithm needs to explore moreedges to retrieve the top-k keyword queries However thecommunication cost is not significantly affected by the valueof 119864119889119894119903 for both the algorithms
728 Effect of Datasets Figure 16 demonstrates the indexsizes of the COSK and CMTkSK+ approaches for differentdatasets As shown in Figure 16 both algorithms have similarindex sizes However COSK has minor space overheadbecause it stores additional information of the highest signifi-cance factor 120579119905 of edges More important this space overheadis minimal as compared to the gain achieved by COSK inquery processing time and communication costs
16 Wireless Communications and Mobile Computing
25 50 75 100 125
COSKCMTkSK+
0
10
20
30
Que
ry p
roce
ssin
g tim
e (s)
40
Vqry
(a) Query processing time
COSKCMTkSK+
100
1k
10k
100k
of
mes
sage
s tra
nsfe
rred
1M
25 50 75 100 125Vqry
(b) Communication cost
Figure 13 Effect of speed on query processing time and communication cost
20 40 60 80 100Mqry
COSKCMTkSK+
0
15
45
30
60
Que
ry p
roce
ssin
g tim
e (s)
(a) Query processing time
100
10k
100k
of
mes
sage
s tra
nsfe
rred
1M
20 40 60 80 100Mqry
1k
COSKCMTkSK+
(b) Communication cost
Figure 14 Effect of mobility on query processing time and communication cost
73 Experimental Results of Top-k Spatial Keyword Queriesin Dynamic Road Networks In this section we evaluate theperformance of COSK and basic algorithm for dynamic roadnetworks The 119864119906119901119889 indicates the percentage of all edges thatchange their weight at each timestamp The length of anupdated edge is randomly selected between 01 to 10 times theoriginal length Figure 17(a) depicts the query processing timeof COSK and basic algorithm It is evident from the figure thatquery processing time of basic algorithm is not significantlyaffected by 119864119906119901119889 This is mainly because the query objectsissue top-k spatial queries at each timestamp However query
processing time of COSK increases with the value of 119864119906119901119889because the probability that the updated edge may associatedwith the monitoring region of query q increases with 119864119906119901119889Therefore when 119864119906119901119889 becomes large the results need to befrequently updated which increases the query processingtime Figure 17(b) shows the communication costs of COSKand basic algorithm with respect to 119864119906119901119889 Basic algorithmincurs almost constant communication costs regardless of thevalue of 119864119906119901119889 In contrast the communication cost of COSKincreases with 119864119906119901119889 because the query result and safe regionsneeds to be frequently updated
Wireless Communications and Mobile Computing 17
COSKCMTkSK+
10 20 30 40 50Edir
0
10
20
30
Que
ry p
roce
ssin
g tim
e (s)
40
(a) Query processing time
100
10k
100k
of
mes
sage
s tra
nsfe
rred
1M
1k
10 20 30 40 50Edir
eSPAKCMTkSK+
(b) Communication cost
Figure 15 Effect of 119864119889119894119903 on query processing time and communication cost
COSKCMTkSK+
0
15
45
30
60
Inde
x siz
e (M
B)
OldenburgDatasets
San Francisco San Joaquin
Figure 16 Effect of dataset on index size
8 Conclusion
In this paper we investigated moving top-k spatial keywordqueries in directed and dynamic road networksWepresentedan efficient indexing framework using inverted files thatindexes the data objects on edges allowing for the effectivesearching of data objects relevant to queries in terms ofboth textual and spatial relevance We also presented a safe-exit-based algorithm called COSK to monitor moving top-k spatial keyword queries We demonstrated that the queryresults remain valid as long as the query object resides withina safe region Furthermore COSK can effectively monitor thevalidity of query results and safe regions in dynamic roadnetworks Finally an experimental evaluation conducted on
real road networks demonstrated that COSK significantlyreduced the query processing time and communication costscompared to the CMTkSK+ algorithm
Data Availability
The real road network data used in this study are also used inmany previous studies The road network data is cited in themanuscript and it is available at httpswwwcsutahedusimlifeifeiSpatialDatasethtm To simulate the moving queriesthe authors used the spatiotemporal data generator which isalso used in previous studiesThe research article of generatoris cited in the manuscript The documentation and source
18 Wireless Communications and Mobile Computing
0
20
40
60
Que
ry p
roce
ssin
g tim
e (s)
80
15 30 45 60 75Eupd
COSKBasic
(a) Query processing time
15 30 45 60 75Eupd
100
1k
10k
100k
of
mes
sage
s tra
nsfe
rred
1M
COSKBasic
(b) Communication cost
Figure 17 Effect of 119864119906119901119889 on query processing time and communication cost
files of generator are available at httpsiapgjade-hsdeper-sonenbrinkhoffgenerator They used the Twitter tweetsfor generating the description of data objects and also querykeywords The tweets used can be accessible at httpfollow-thehashtagcomdatasetsfree-twitter-dataset-usa-200000-free-usa-tweets
Conflicts of Interest
The authors declare that there is no conflicts of interestregarding the publication of this paper
Acknowledgments
Hyung-JuChowas supported by theNational Research Foun-dation of Korea (NRF) grant funded by the Korean Govern-ment (MSIP) (NRF-2016R1A2B4009793) and this researchwas partially supported by Basic Science Research Programthrough the National Research Foundation of Korea (NRF)fundedby theMinistry of Education (2016R1D1A1B03934129)
References
[1] D Papadias N Mamoulis J Zhang and Y Tao ldquoQuery pro-cessing in spatial network databasesrdquo in Proceedings of the 29thInternational Conference on Very Large Data Bases (VLDB rsquo03)pp 802ndash813 September 2003
[2] H-J Cho K Ryu and T-S Chung ldquoAn efficient algorithm forcomputing safe exit points of moving range queries in directedroad networksrdquo Information Systems vol 41 pp 1ndash19 2014
[3] G Tsatsanifos and A Vlachou ldquoOn processing Top-k spatio-textual preference queriesrdquo in Proceedings of the 18th Interna-tional Conference on ExtendingDatabase Technology (EDBT rsquo15)pp 433ndash444 March 2015
[4] R Li A X Liu A L Wang and B Bruhadeshwar ldquoFast rangequery processing with strong privacy protection for cloud com-putingrdquo Proceedings of the VLDB Endowment vol 7 no 14 pp1953ndash1964 2014
[5] G Cong C S Jensen andDWu ldquoEfficient retrieval of the Top-k most relevant spatial web objectsrdquo Proceedings of the VLDBEndowment vol 2 no 1 pp 337ndash348 2009
[6] Z Li K C K Lee B Zheng W-C Lee D Lee and X WangldquoIR-tree An efficient index for geographic document searchrdquoIEEE Transactions on Knowledge and Data Engineering vol 23no 4 pp 585ndash599 2011
[7] Y Zhou X Xie C Wang Y Gong and W Ma ldquoHybrid indexstructures for location-based web searchrdquo in Proceedings of the14th ACM International Conference on Information and Knowl-edge Management pp 155ndash162 Bremen Germany October2005
[8] J Zobel and A Moffat ldquoInverted files for text search enginesrdquoACM Computing Surveys vol 38 no 2 2006
[9] N Beckmann H Kriegel R Schneider and B Seeger ldquoR-anefficient and robust accessmethod for points and rectanglesrdquo inProceedings of the ACM SIGMOD International Conference onManagement of Data vol 19 pp 322ndash331 May 1990
[10] R Hariharan B Hore C Li and S Mehrotra ldquoProcessing spa-tial-keyword (sk) queries in geographic information retrieval(gir) systemsrdquo in Proceedings of the 19th International Confer-ence on Scientific and Statistical DatabaseManagement (SSDBMrsquo07) July 2007
[11] I De FelipeV Hristidis andN Rishe ldquoKeyword search on spa-tial databasesrdquo in Proceedings of the 24th International Confer-ence on Data Engineering (ICDE rsquo08) pp 656ndash665 April 2008
[12] J B Rocha-Junior O Gkorgkas S Jonassen and K NoslashrvagldquoEfficient processing of top-k spatial keyword queriesrdquo inProceedings of the International Symposium on Spatial andTemporal Databases pp 205ndash222 Springer 2011
[13] D Zhang K-L Tan andAK Tung ldquoScalable top-k spatial key-word searchrdquo in Proceedings of the 16th International Conferenceon Extending Database Technology pp 359ndash370 2013
Wireless Communications and Mobile Computing 19
Figure 13: Effect of speed on query processing time and communication cost. (a) Query processing time (s) versus $V_{qry}$ (25 to 125); (b) number of messages transferred (log scale, 100 to 1M) versus $V_{qry}$. Both panels compare COSK and CMTkSK+.
Figure 14: Effect of mobility on query processing time and communication cost. (a) Query processing time (s) versus $M_{qry}$ (20 to 100); (b) number of messages transferred (log scale, 100 to 1M) versus $M_{qry}$. Both panels compare COSK and CMTkSK+.
7.3. Experimental Results of Top-k Spatial Keyword Queries in Dynamic Road Networks. In this section, we evaluate the performance of COSK and the basic algorithm for dynamic road networks. $E_{upd}$ indicates the percentage of all edges that change their weight at each timestamp. The length of an updated edge is randomly selected between 0.1 and 10 times the original length. Figure 17(a) depicts the query processing time of COSK and the basic algorithm. It is evident from the figure that the query processing time of the basic algorithm is not significantly affected by $E_{upd}$. This is mainly because, in the basic algorithm, the query objects issue top-k spatial queries at each timestamp. However, the query processing time of COSK increases with the value of $E_{upd}$ because the probability that an updated edge is associated with the monitoring region of query q increases with $E_{upd}$. Therefore, when $E_{upd}$ becomes large, the results need to be updated frequently, which increases the query processing time. Figure 17(b) shows the communication costs of COSK and the basic algorithm with respect to $E_{upd}$. The basic algorithm incurs almost constant communication costs regardless of the value of $E_{upd}$. In contrast, the communication cost of COSK increases with $E_{upd}$ because the query result and safe regions need to be updated frequently.
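The edge-update setting described above can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the authors' implementation: the function names, the representation of the monitoring region as a plain set of edge identifiers, and the uniform sampling of the scaling factor are all hypothetical choices made for illustration. It counts how often a query would have to be refreshed when it is re-issued at every timestamp (the basic algorithm) versus only when an updated edge falls inside its monitoring region.

```python
import random


def apply_edge_updates(weights, base_weights, e_upd_percent, rng):
    """Rescale a random e_upd_percent% of the edges in place.

    Each selected edge gets a new length between 0.1x and 10x its ORIGINAL
    length (uniform sampling is an assumption of this sketch).
    Returns the set of edge ids whose weight changed.
    """
    edge_ids = sorted(weights)
    n_updates = round(len(edge_ids) * e_upd_percent / 100)
    updated = rng.sample(edge_ids, n_updates)
    for edge_id in updated:
        weights[edge_id] = base_weights[edge_id] * rng.uniform(0.1, 10.0)
    return set(updated)


def count_recomputations(base_weights, monitoring_region, e_upd_percent,
                         timestamps, seed=0):
    """Return (basic, monitored) refresh counts over a number of timestamps.

    basic:     the query is re-issued at every timestamp.
    monitored: the query is refreshed only when at least one updated edge
               belongs to the (hypothetical) monitoring region.
    """
    rng = random.Random(seed)
    weights = dict(base_weights)
    basic = 0
    monitored = 0
    for _ in range(timestamps):
        updated = apply_edge_updates(weights, base_weights, e_upd_percent, rng)
        basic += 1  # independent of how many edges changed
        if updated & monitoring_region:
            monitored += 1  # an update touched the monitored edges
    return basic, monitored


if __name__ == "__main__":
    base = {edge_id: 1.0 for edge_id in range(1000)}
    region = set(range(20))  # hypothetical monitoring region of 20 edges
    for e_upd in (15, 30, 45, 60, 75):
        print(e_upd, count_recomputations(base, region, e_upd, timestamps=100))
```

In this toy model the first count stays constant while the second grows with the update percentage, which mirrors the qualitative trend reported for Figure 17; the absolute numbers carry no meaning.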
Figure 15: Effect of $E_{dir}$ on query processing time and communication cost. (a) Query processing time (s) versus $E_{dir}$ (10 to 50); (b) number of messages transferred (log scale, 100 to 1M) versus $E_{dir}$. Both panels compare COSK and CMTkSK+.
Figure 16: Effect of dataset on index size. Index size (MB) of COSK and CMTkSK+ for the Oldenburg, San Francisco, and San Joaquin datasets.
8. Conclusion
In this paper, we investigated moving top-k spatial keyword queries in directed and dynamic road networks. We presented an efficient indexing framework using inverted files that indexes the data objects on edges, allowing for the effective searching of data objects relevant to queries in terms of both textual and spatial relevance. We also presented a safe-exit-based algorithm called COSK to monitor moving top-k spatial keyword queries. We demonstrated that the query results remain valid as long as the query object resides within a safe region. Furthermore, COSK can effectively monitor the validity of query results and safe regions in dynamic road networks. Finally, an experimental evaluation conducted on real road networks demonstrated that COSK significantly reduced the query processing time and communication costs compared to the CMTkSK+ algorithm.
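To make the idea of an inverted file that indexes data objects on edges more concrete, the following minimal sketch may help. It is an illustration under assumptions rather than the index structure used by COSK: the class name `EdgeInvertedIndex`, its methods, and the choice to store an offset along the edge for each object are hypothetical.

```python
from collections import defaultdict


class EdgeInvertedIndex:
    """Toy inverted file: keyword -> edge id -> [(object id, offset on edge)]."""

    def __init__(self):
        self._postings = defaultdict(lambda: defaultdict(list))

    def add(self, edge_id, object_id, offset, keywords):
        """Register an object located at `offset` along `edge_id`."""
        for keyword in set(keywords):  # one posting per distinct keyword
            self._postings[keyword][edge_id].append((object_id, offset))

    def objects_on_edge(self, edge_id, query_keywords):
        """Return {object id: offset} for objects on `edge_id` that contain
        at least one of the query keywords."""
        hits = {}
        for keyword in query_keywords:
            # .get avoids creating empty posting lists for unknown keywords
            for object_id, offset in self._postings.get(keyword, {}).get(edge_id, []):
                hits[object_id] = offset
        return hits


if __name__ == "__main__":
    index = EdgeInvertedIndex()
    index.add("e1", "o1", 0.3, ["pizza", "cafe"])
    index.add("e1", "o2", 0.8, ["sushi"])
    index.add("e2", "o3", 0.1, ["pizza"])
    print(index.objects_on_edge("e1", ["pizza", "sushi"]))
```

With such a layout, a network expansion that reaches an edge can look up only the objects on that edge that are textually relevant to the query, instead of scanning every object attached to the edge.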
Figure 17: Effect of $E_{upd}$ on query processing time and communication cost. (a) Query processing time (s) versus $E_{upd}$ (15 to 75); (b) number of messages transferred (log scale, 100 to 1M) versus $E_{upd}$. Both panels compare COSK and the basic algorithm.
Data Availability
The real road network data used in this study have also been used in many previous studies. The road network data are cited in the manuscript and are available at https://www.cs.utah.edu/~lifeifei/SpatialDataset.htm. To simulate the moving queries, the authors used a spatiotemporal data generator that has also been used in previous studies. The research article describing the generator is cited in the manuscript. The documentation and source files of the generator are available at https://iapg.jade-hs.de/personen/brinkhoff/generator/. Twitter tweets were used to generate the descriptions of the data objects and the query keywords. The tweets used can be accessed at http://followthehashtag.com/datasets/free-twitter-dataset-usa-200000-free-usa-tweets/.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
Hyung-Ju Cho was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean Government (MSIP) (NRF-2016R1A2B4009793), and this research was partially supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2016R1D1A1B03934129).
References
[1] D. Papadias, N. Mamoulis, J. Zhang, and Y. Tao, “Query processing in spatial network databases,” in Proceedings of the 29th International Conference on Very Large Data Bases (VLDB ’03), pp. 802–813, September 2003.
[2] H.-J. Cho, K. Ryu, and T.-S. Chung, “An efficient algorithm for computing safe exit points of moving range queries in directed road networks,” Information Systems, vol. 41, pp. 1–19, 2014.
[3] G. Tsatsanifos and A. Vlachou, “On processing Top-k spatio-textual preference queries,” in Proceedings of the 18th International Conference on Extending Database Technology (EDBT ’15), pp. 433–444, March 2015.
[4] R. Li, A. X. Liu, A. L. Wang, and B. Bruhadeshwar, “Fast range query processing with strong privacy protection for cloud computing,” Proceedings of the VLDB Endowment, vol. 7, no. 14, pp. 1953–1964, 2014.
[5] G. Cong, C. S. Jensen, and D. Wu, “Efficient retrieval of the Top-k most relevant spatial web objects,” Proceedings of the VLDB Endowment, vol. 2, no. 1, pp. 337–348, 2009.
[6] Z. Li, K. C. K. Lee, B. Zheng, W.-C. Lee, D. Lee, and X. Wang, “IR-tree: An efficient index for geographic document search,” IEEE Transactions on Knowledge and Data Engineering, vol. 23, no. 4, pp. 585–599, 2011.
[7] Y. Zhou, X. Xie, C. Wang, Y. Gong, and W. Ma, “Hybrid index structures for location-based web search,” in Proceedings of the 14th ACM International Conference on Information and Knowledge Management, pp. 155–162, Bremen, Germany, October 2005.
[8] J. Zobel and A. Moffat, “Inverted files for text search engines,” ACM Computing Surveys, vol. 38, no. 2, 2006.
[9] N. Beckmann, H. Kriegel, R. Schneider, and B. Seeger, “The R*-tree: an efficient and robust access method for points and rectangles,” in Proceedings of the ACM SIGMOD International Conference on Management of Data, vol. 19, pp. 322–331, May 1990.
[10] R. Hariharan, B. Hore, C. Li, and S. Mehrotra, “Processing spatial-keyword (sk) queries in geographic information retrieval (gir) systems,” in Proceedings of the 19th International Conference on Scientific and Statistical Database Management (SSDBM ’07), July 2007.
[11] I. De Felipe, V. Hristidis, and N. Rishe, “Keyword search on spatial databases,” in Proceedings of the 24th International Conference on Data Engineering (ICDE ’08), pp. 656–665, April 2008.
[12] J. B. Rocha-Junior, O. Gkorgkas, S. Jonassen, and K. Nørvåg, “Efficient processing of top-k spatial keyword queries,” in Proceedings of the International Symposium on Spatial and Temporal Databases, pp. 205–222, Springer, 2011.
[13] D. Zhang, K.-L. Tan, and A. K. Tung, “Scalable top-k spatial keyword search,” in Proceedings of the 16th International Conference on Extending Database Technology, pp. 359–370, 2013.
[14] J. B. Rocha-Junior and K. Nørvåg, “Top-k spatial keyword queries on road networks,” in Proceedings of the 15th International Conference on Extending Database Technology, pp. 168–179, Berlin, Germany, March 2012.
[15] H.-J. Cho, S. J. Kwon, and T.-S. Chung, “A safe exit algorithm for continuous nearest neighbor monitoring in road networks,” Mobile Information Systems, vol. 9, no. 1, pp. 37–53, 2013.
[16] D. Yung, M. L. Yiu, and E. Lo, “A safe-exit approach for efficient network-based moving range queries,” Data & Knowledge Engineering, vol. 72, pp. 126–147, 2012.
[17] M. Attique, H. Cho, R. Jin, and T. Chung, “Efficient Processing of Continuous Reverse k Nearest Neighbor on Moving Objects in Road Networks,” ISPRS International Journal of Geo-Information, vol. 5, no. 12, p. 247, 2016.
[18] H. G. Elmongui, M. F. Mokbel, and W. G. Aref, “Continuous aggregate nearest neighbor queries,” GeoInformatica, vol. 17, no. 1, pp. 63–95, 2013.
[19] D. Wu, M. L. Yiu, C. S. Jensen, and G. Cong, “Efficient continuously moving top-k spatial keyword query processing,” in Proceedings of the IEEE International Conference on Data Engineering (ICDE ’11), pp. 541–552, Hannover, Germany, April 2011.
[20] W. Huang, G. Li, K.-L. Tan, and J. Feng, “Efficient safe-region construction for moving top-k spatial keyword queries,” in Proceedings of the 21st ACM International Conference on Information and Knowledge Management, pp. 932–941, 2012.
[21] L. Guo, J. Shao, H. H. Aung, and K.-L. Tan, “Efficient continuous top-k spatial keyword queries on road networks,” GeoInformatica, vol. 19, no. 1, pp. 29–60, 2014.
[22] Y. Li, G. Li, L. Shu, Q. Huang, and H. Jiang, “Continuous monitoring of top-k spatial keyword queries in road networks,” Journal of Information Science and Engineering, vol. 31, no. 6, pp. 1831–1848, 2015.
[23] M. Attique, A. Khan, and T.-S. Chung, “ESPAK: Top-k spatial keyword query processing in directed road networks,” in Proceedings of the Workshops of the International Conference on Extending Database Technology and the International Conference on Database Theory (EDBT/ICDT ’17), March 2017.
[24] G. Salton and C. Buckley, “Term-weighting approaches in automatic text retrieval,” Information Processing & Management, vol. 24, no. 5, pp. 513–523, 1988.
[25] V. N. Anh, O. de Kretser, and A. Moffat, “Vector-space ranking with effective early termination,” in Proceedings of the 24th Annual International ACM SIGIR Conference, pp. 35–42, New Orleans, LA, USA, 2001.
[26] E. W. Dijkstra, “A note on two problems in connexion with graphs,” Numerische Mathematik, vol. 1, pp. 269–271, 1959.
[27] “Real datasets for spatial databases,” https://www.cs.utah.edu/~lifeifei/SpatialDataset.htm.
[28] “Twitter,” https://twitter.com.
[29] T. Brinkhoff, “A framework for generating network-based moving objects,” GeoInformatica, vol. 6, no. 2, pp. 153–180, 2002.
Wireless Communications and Mobile Computing 17
COSKCMTkSK+
10 20 30 40 50Edir
0
10
20
30
Que
ry p
roce
ssin
g tim
e (s)
40
(a) Query processing time
100
10k
100k
of
mes
sage
s tra
nsfe
rred
1M
1k
10 20 30 40 50Edir
eSPAKCMTkSK+
(b) Communication cost
Figure 15 Effect of 119864119889119894119903 on query processing time and communication cost
COSKCMTkSK+
0
15
45
30
60
Inde
x siz
e (M
B)
OldenburgDatasets
San Francisco San Joaquin
Figure 16 Effect of dataset on index size
8 Conclusion
In this paper we investigated moving top-k spatial keywordqueries in directed and dynamic road networksWepresentedan efficient indexing framework using inverted files thatindexes the data objects on edges allowing for the effectivesearching of data objects relevant to queries in terms ofboth textual and spatial relevance We also presented a safe-exit-based algorithm called COSK to monitor moving top-k spatial keyword queries We demonstrated that the queryresults remain valid as long as the query object resides withina safe region Furthermore COSK can effectively monitor thevalidity of query results and safe regions in dynamic roadnetworks Finally an experimental evaluation conducted on
real road networks demonstrated that COSK significantlyreduced the query processing time and communication costscompared to the CMTkSK+ algorithm
Data Availability
The real road network data used in this study are also used inmany previous studies The road network data is cited in themanuscript and it is available at httpswwwcsutahedusimlifeifeiSpatialDatasethtm To simulate the moving queriesthe authors used the spatiotemporal data generator which isalso used in previous studiesThe research article of generatoris cited in the manuscript The documentation and source
18 Wireless Communications and Mobile Computing
0
20
40
60
Que
ry p
roce
ssin
g tim
e (s)
80
15 30 45 60 75Eupd
COSKBasic
(a) Query processing time
15 30 45 60 75Eupd
100
1k
10k
100k
of
mes
sage
s tra
nsfe
rred
1M
COSKBasic
(b) Communication cost
Figure 17 Effect of 119864119906119901119889 on query processing time and communication cost
files of generator are available at httpsiapgjade-hsdeper-sonenbrinkhoffgenerator They used the Twitter tweetsfor generating the description of data objects and also querykeywords The tweets used can be accessible at httpfollow-thehashtagcomdatasetsfree-twitter-dataset-usa-200000-free-usa-tweets
Conflicts of Interest
The authors declare that there is no conflicts of interestregarding the publication of this paper
Acknowledgments
Hyung-JuChowas supported by theNational Research Foun-dation of Korea (NRF) grant funded by the Korean Govern-ment (MSIP) (NRF-2016R1A2B4009793) and this researchwas partially supported by Basic Science Research Programthrough the National Research Foundation of Korea (NRF)fundedby theMinistry of Education (2016R1D1A1B03934129)
References
[1] D Papadias N Mamoulis J Zhang and Y Tao ldquoQuery pro-cessing in spatial network databasesrdquo in Proceedings of the 29thInternational Conference on Very Large Data Bases (VLDB rsquo03)pp 802ndash813 September 2003
[2] H-J Cho K Ryu and T-S Chung ldquoAn efficient algorithm forcomputing safe exit points of moving range queries in directedroad networksrdquo Information Systems vol 41 pp 1ndash19 2014
[3] G Tsatsanifos and A Vlachou ldquoOn processing Top-k spatio-textual preference queriesrdquo in Proceedings of the 18th Interna-tional Conference on ExtendingDatabase Technology (EDBT rsquo15)pp 433ndash444 March 2015
[4] R Li A X Liu A L Wang and B Bruhadeshwar ldquoFast rangequery processing with strong privacy protection for cloud com-putingrdquo Proceedings of the VLDB Endowment vol 7 no 14 pp1953ndash1964 2014
[5] G Cong C S Jensen andDWu ldquoEfficient retrieval of the Top-k most relevant spatial web objectsrdquo Proceedings of the VLDBEndowment vol 2 no 1 pp 337ndash348 2009
[6] Z Li K C K Lee B Zheng W-C Lee D Lee and X WangldquoIR-tree An efficient index for geographic document searchrdquoIEEE Transactions on Knowledge and Data Engineering vol 23no 4 pp 585ndash599 2011
[7] Y Zhou X Xie C Wang Y Gong and W Ma ldquoHybrid indexstructures for location-based web searchrdquo in Proceedings of the14th ACM International Conference on Information and Knowl-edge Management pp 155ndash162 Bremen Germany October2005
[8] J Zobel and A Moffat ldquoInverted files for text search enginesrdquoACM Computing Surveys vol 38 no 2 2006
[9] N Beckmann H Kriegel R Schneider and B Seeger ldquoR-anefficient and robust accessmethod for points and rectanglesrdquo inProceedings of the ACM SIGMOD International Conference onManagement of Data vol 19 pp 322ndash331 May 1990
[10] R Hariharan B Hore C Li and S Mehrotra ldquoProcessing spa-tial-keyword (sk) queries in geographic information retrieval(gir) systemsrdquo in Proceedings of the 19th International Confer-ence on Scientific and Statistical DatabaseManagement (SSDBMrsquo07) July 2007
[11] I De FelipeV Hristidis andN Rishe ldquoKeyword search on spa-tial databasesrdquo in Proceedings of the 24th International Confer-ence on Data Engineering (ICDE rsquo08) pp 656ndash665 April 2008
[12] J B Rocha-Junior O Gkorgkas S Jonassen and K NoslashrvagldquoEfficient processing of top-k spatial keyword queriesrdquo inProceedings of the International Symposium on Spatial andTemporal Databases pp 205ndash222 Springer 2011
[13] D Zhang K-L Tan andAK Tung ldquoScalable top-k spatial key-word searchrdquo in Proceedings of the 16th International Conferenceon Extending Database Technology pp 359ndash370 2013
Wireless Communications and Mobile Computing 19
[14] J B Rocha-Junior andK Noslashrvag ldquoTop-k spatial keyword quer-ies on road networksrdquo in Proceedings of the 15th InternationalConference on Extending Database Technology pp 168ndash179Berlin Germany March 2012
[15] H-J Cho S J Kwon and T-S Chung ldquoA safe exit algorithmfor continuous nearest neighbor monitoring in road networksrdquoMobile Information Systems vol 9 no 1 pp 37ndash53 2013
[16] D Yung M L Yiu and E Lo ldquoA safe-exit approach for efficientnetwork-based moving range queriesrdquo Data amp KnowledgeEngineering vol 72 pp 126ndash147 2012
[17] M Attique H Cho R Jin and T Chung ldquoEfficient Processingof Continuous Reverse k Nearest Neighbor on Moving Objectsin Road Networksrdquo ISPRS International Journal of Geo-Infor-mation vol 5 no 12 p 247 2016
[18] H G Elmongui M F Mokbel and W G Aref ldquoContinuousaggregate nearest neighbor queriesrdquoGeoInformatica vol 17 no1 pp 63ndash95 2013
[19] D Wu M L Yiu C S Jensen and G Cong ldquoEfficient con-tinuously moving top-k spatial keyword query processingrdquo inProceedings of the IEEE International Conference on Data En-gineering (ICDE rsquo11) pp 541ndash552 Hannover Germany April2011
[20] W Huang G Li K-L Tan and J Feng ldquoEfficient safe-re-gion construction for moving top-k spatial keyword queriesrdquoin Proceedings of the 21st ACM International Conference onInformation and Knowledge Management pp 932ndash941 2012
[21] L Guo J ShaoHHAung andK-L Tan ldquoEfficient continuoustop-k spatial keyword queries on road networksrdquoGeoInformat-ica vol 19 no 1 pp 29ndash60 2014
[22] Y Li G Li L Shu Q Huang and H Jiang ldquoContinuous moni-toring of top-k spatial keyword queries in road networksrdquo Jour-nal of Information Science and Engineering vol 31 no 6 pp1831ndash1848 2015
[23] M Attique A Khan and T-S Chung ldquoESPAK Top-k spatialkeyword query processing in directed road networksrdquo in Pro-ceedings of the Workshops of the International Conference onExtending Database Technology and the International Confer-ence on DatabaseTheory (EDBTICDT rsquo17) March 2017
[24] G Salton and C Buckley ldquoTerm-weighting approaches in auto-matic text retrievalrdquo Information Processing ampManagement vol24 no 5 pp 513ndash523 1988
[25] V N Anh O de Kretser and A Moffat ldquoVector-space rankingwith effective early terminationrdquo in Proceedings of the 24th An-nual International ACM SIGIR Conference pp 35ndash42 NewOrleans LO USA 2001
[26] E W Dijkstra ldquoA note on two problems in connexion withgraphsrdquo Numerische Mathematik vol 1 pp 269ndash271 1959
[27] ldquoReal datasets for spatial databasesrdquo httpswwwcsutahedulifeifeiSpatialDatasethtm
[28] ldquoTwitterrdquo httpstwittercom[29] T Brinkhoff ldquoA framework for generating network-basedmov-
ing objectsrdquo GeoInformatica vol 6 no 2 pp 153ndash180 2002
International Journal of
AerospaceEngineeringHindawiwwwhindawicom Volume 2018
RoboticsJournal of
Hindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom Volume 2018
Active and Passive Electronic Components
VLSI Design
Hindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom Volume 2018
Shock and Vibration
Hindawiwwwhindawicom Volume 2018
Civil EngineeringAdvances in
Acoustics and VibrationAdvances in
Hindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom Volume 2018
Electrical and Computer Engineering
Journal of
Advances inOptoElectronics
Hindawiwwwhindawicom
Volume 2018
Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom
The Scientific World Journal
Volume 2018
Control Scienceand Engineering
Journal of
Hindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom
Journal ofEngineeringVolume 2018
SensorsJournal of
Hindawiwwwhindawicom Volume 2018
International Journal of
RotatingMachinery
Hindawiwwwhindawicom Volume 2018
Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom Volume 2018
Chemical EngineeringInternational Journal of Antennas and
Propagation
International Journal of
Hindawiwwwhindawicom Volume 2018
Hindawiwwwhindawicom Volume 2018
Navigation and Observation
International Journal of
Hindawi
wwwhindawicom Volume 2018
Advances in
Multimedia
Submit your manuscripts atwwwhindawicom
18 Wireless Communications and Mobile Computing
0
20
40
60
Que
ry p
roce
ssin
g tim
e (s)
80
15 30 45 60 75Eupd
COSKBasic
(a) Query processing time
15 30 45 60 75Eupd
100
1k
10k
100k
of
mes
sage
s tra
nsfe
rred
1M
COSKBasic
(b) Communication cost
Figure 17 Effect of 119864119906119901119889 on query processing time and communication cost
files of generator are available at httpsiapgjade-hsdeper-sonenbrinkhoffgenerator They used the Twitter tweetsfor generating the description of data objects and also querykeywords The tweets used can be accessible at httpfollow-thehashtagcomdatasetsfree-twitter-dataset-usa-200000-free-usa-tweets
Conflicts of Interest
The authors declare that there is no conflicts of interestregarding the publication of this paper
Acknowledgments
Hyung-JuChowas supported by theNational Research Foun-dation of Korea (NRF) grant funded by the Korean Govern-ment (MSIP) (NRF-2016R1A2B4009793) and this researchwas partially supported by Basic Science Research Programthrough the National Research Foundation of Korea (NRF)fundedby theMinistry of Education (2016R1D1A1B03934129)
References
[1] D Papadias N Mamoulis J Zhang and Y Tao ldquoQuery pro-cessing in spatial network databasesrdquo in Proceedings of the 29thInternational Conference on Very Large Data Bases (VLDB rsquo03)pp 802ndash813 September 2003
[2] H-J Cho K Ryu and T-S Chung ldquoAn efficient algorithm forcomputing safe exit points of moving range queries in directedroad networksrdquo Information Systems vol 41 pp 1ndash19 2014
[3] G Tsatsanifos and A Vlachou ldquoOn processing Top-k spatio-textual preference queriesrdquo in Proceedings of the 18th Interna-tional Conference on ExtendingDatabase Technology (EDBT rsquo15)pp 433ndash444 March 2015
[4] R Li A X Liu A L Wang and B Bruhadeshwar ldquoFast rangequery processing with strong privacy protection for cloud com-putingrdquo Proceedings of the VLDB Endowment vol 7 no 14 pp1953ndash1964 2014
[5] G Cong C S Jensen andDWu ldquoEfficient retrieval of the Top-k most relevant spatial web objectsrdquo Proceedings of the VLDBEndowment vol 2 no 1 pp 337ndash348 2009
[6] Z Li K C K Lee B Zheng W-C Lee D Lee and X WangldquoIR-tree An efficient index for geographic document searchrdquoIEEE Transactions on Knowledge and Data Engineering vol 23no 4 pp 585ndash599 2011
[7] Y Zhou X Xie C Wang Y Gong and W Ma ldquoHybrid indexstructures for location-based web searchrdquo in Proceedings of the14th ACM International Conference on Information and Knowl-edge Management pp 155ndash162 Bremen Germany October2005
[8] J Zobel and A Moffat ldquoInverted files for text search enginesrdquoACM Computing Surveys vol 38 no 2 2006
[9] N Beckmann H Kriegel R Schneider and B Seeger ldquoR-anefficient and robust accessmethod for points and rectanglesrdquo inProceedings of the ACM SIGMOD International Conference onManagement of Data vol 19 pp 322ndash331 May 1990
[10] R Hariharan B Hore C Li and S Mehrotra ldquoProcessing spa-tial-keyword (sk) queries in geographic information retrieval(gir) systemsrdquo in Proceedings of the 19th International Confer-ence on Scientific and Statistical DatabaseManagement (SSDBMrsquo07) July 2007
[11] I De FelipeV Hristidis andN Rishe ldquoKeyword search on spa-tial databasesrdquo in Proceedings of the 24th International Confer-ence on Data Engineering (ICDE rsquo08) pp 656ndash665 April 2008
[12] J B Rocha-Junior O Gkorgkas S Jonassen and K NoslashrvagldquoEfficient processing of top-k spatial keyword queriesrdquo inProceedings of the International Symposium on Spatial andTemporal Databases pp 205ndash222 Springer 2011
[13] D Zhang K-L Tan andAK Tung ldquoScalable top-k spatial key-word searchrdquo in Proceedings of the 16th International Conferenceon Extending Database Technology pp 359ndash370 2013
Wireless Communications and Mobile Computing 19
[14] J B Rocha-Junior andK Noslashrvag ldquoTop-k spatial keyword quer-ies on road networksrdquo in Proceedings of the 15th InternationalConference on Extending Database Technology pp 168ndash179Berlin Germany March 2012
[15] H-J Cho S J Kwon and T-S Chung ldquoA safe exit algorithmfor continuous nearest neighbor monitoring in road networksrdquoMobile Information Systems vol 9 no 1 pp 37ndash53 2013
[16] D Yung M L Yiu and E Lo ldquoA safe-exit approach for efficientnetwork-based moving range queriesrdquo Data amp KnowledgeEngineering vol 72 pp 126ndash147 2012
[17] M Attique H Cho R Jin and T Chung ldquoEfficient Processingof Continuous Reverse k Nearest Neighbor on Moving Objectsin Road Networksrdquo ISPRS International Journal of Geo-Infor-mation vol 5 no 12 p 247 2016
[18] H G Elmongui M F Mokbel and W G Aref ldquoContinuousaggregate nearest neighbor queriesrdquoGeoInformatica vol 17 no1 pp 63ndash95 2013
[19] D Wu M L Yiu C S Jensen and G Cong ldquoEfficient con-tinuously moving top-k spatial keyword query processingrdquo inProceedings of the IEEE International Conference on Data En-gineering (ICDE rsquo11) pp 541ndash552 Hannover Germany April2011
[20] W Huang G Li K-L Tan and J Feng ldquoEfficient safe-re-gion construction for moving top-k spatial keyword queriesrdquoin Proceedings of the 21st ACM International Conference onInformation and Knowledge Management pp 932ndash941 2012
[21] L Guo J ShaoHHAung andK-L Tan ldquoEfficient continuoustop-k spatial keyword queries on road networksrdquoGeoInformat-ica vol 19 no 1 pp 29ndash60 2014
[22] Y Li G Li L Shu Q Huang and H Jiang ldquoContinuous moni-toring of top-k spatial keyword queries in road networksrdquo Jour-nal of Information Science and Engineering vol 31 no 6 pp1831ndash1848 2015
[23] M Attique A Khan and T-S Chung ldquoESPAK Top-k spatialkeyword query processing in directed road networksrdquo in Pro-ceedings of the Workshops of the International Conference onExtending Database Technology and the International Confer-ence on DatabaseTheory (EDBTICDT rsquo17) March 2017
[24] G Salton and C Buckley ldquoTerm-weighting approaches in auto-matic text retrievalrdquo Information Processing ampManagement vol24 no 5 pp 513ndash523 1988
[25] V N Anh O de Kretser and A Moffat ldquoVector-space rankingwith effective early terminationrdquo in Proceedings of the 24th An-nual International ACM SIGIR Conference pp 35ndash42 NewOrleans LO USA 2001
[26] E W Dijkstra ldquoA note on two problems in connexion withgraphsrdquo Numerische Mathematik vol 1 pp 269ndash271 1959
[27] ldquoReal datasets for spatial databasesrdquo httpswwwcsutahedulifeifeiSpatialDatasethtm
[28] ldquoTwitterrdquo httpstwittercom[29] T Brinkhoff ldquoA framework for generating network-basedmov-
ing objectsrdquo GeoInformatica vol 6 no 2 pp 153ndash180 2002