Research Article

A Mobile Information System Based on Crowd-Sensed and Official Crime Data for Finding Safe Routes: A Case Study of Mexico City

Félix Mata, Miguel Torres-Ruiz, Giovanni Guzmán, Rolando Quintero, Roberto Zagal-Flores, Marco Moreno-Ibarra, and Eduardo Loza

Instituto Politécnico Nacional, UPALM, Zacatenco, 07320 Mexico City, DF, Mexico

Correspondence should be addressed to Miguel Torres-Ruiz; [email protected]

Received 20 November 2015; Revised 12 February 2016; Accepted 14 February 2016

Academic Editor: Salil Kanhere

Hindawi Publishing Corporation, Mobile Information Systems, Volume 2016, Article ID 8068209, 11 pages, http://dx.doi.org/10.1155/2016/8068209

Copyright © 2016 Félix Mata et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Mobile information systems agendas are increasingly becoming an essential part of human life and they play an important role in several daily activities. These have been developed for different contexts such as public facilities in smart cities, health care, traffic congestion, e-commerce, financial security, user-generated content, and crowdsourcing. In GIScience, problems related to routing systems have been deeply explored by using several techniques, but they are not focused on security or crime rates. In this paper, an approach to provide estimations defined by crime rates for generating safe routes in mobile devices is proposed. It consists of integrating crowd-sensed and official crime data with a mobile application. Thus, data are semantically processed by an ontology and classified by the Bayes algorithm. A geospatial repository was used to store tweets related to crime events of Mexico City and official reports that were geocoded for obtaining safe routes. A forecast of crime events that can occur in a certain place was performed with the collected information. The novelty is a hybrid approach based on semantic processing to retrieve relevant data from unstructured data sources and a classifier algorithm to collect relevant crime data from official government reports with a mobile application.

1. Introduction

Nowadays, millions of citizens go through the streets of Mexico City, taking specific routes that are planned by using either public or private transportation or even walking. Although there are well-known routes that citizens most often take for their traveling, new routes (probably unsafe) might be tried, especially by newcomers. Generating such routes is not an easy task; generally it is done by routing systems that search for the shortest or fastest paths [1]. Nevertheless, a feature like security is not taken into account when the route is generated by these systems. In particular, safety is a critical characteristic that should be considered for these mobile applications, especially in overcrowded large cities. Many routing systems are based on linguistic processing techniques and deal with place names to define the origin and destination. Thus, these works have faced well-known linguistic problems (e.g., polysemy). In [2], a word sense disambiguation method was applied to place names. Our work is oriented towards crime prevention by using information retrieval techniques and a clustering classification method. Other works are only focused on generating a better performance [3] and computing the shortest path [1]. Other approaches deal with crime information by using location recognition, network information, geographic information, and technologies such as augmented reality, short message service (SMS), and near field communication (NFC) [4]. However, clustering approaches are used to obtain the risk level in a specific area. So, the novelty of our proposal is the use of two heterogeneous data sources, a corpus of tweets and official government reports, which were integrated and processed by the Bayes algorithm for obtaining a risk level applied to routes in a mobile system.

There are different approaches for computing the safe path and analyzing crime events, which are based on mathematical models, machine learning, geospatial analysis, and
crowdsourcing methods in mobile information systems [5]. In [6], a risk model for urban road networks is proposed; it uses a mathematical model based on civic datasets of criminal activity for mobility traces in the city. However, it ignores the temporal dimension when filtering crime data. In [7, 8], historical crime data are processed to identify patterns that help in crime prediction. In our work, we identify the risk level without considering patterns directly. Commercial spatial analysis tools are presented in [9, 10]. They are very useful for studying and analyzing crime patterns in order to design strategies against crime. Crowdsourcing is an interesting approach for analyzing crime data, including social information [11]. According to [12], there are important relationships between crime and people's activities in the spatiotemporal context.

This paper introduces a framework based on an approach for integrating official crime reports given by public authorities with crowdsourcing data obtained from the Twitter stream. The goal is to provide safe routes with a mobile information system in order to increase the confidence level of the citizens who use the urban infrastructure of the city. In summary, the approach combines statistical data obtained from official information with the perception of people, derived on a daily basis and reflected by crowdsourcing data. Thus, this framework takes into account events for a particular time, and continuous updates are performed in order to provide recent events that occur at specific places and are reported by a social network community. The crowdsourcing-based method considers a large database of tweets, which were collected by the mobile application, while official data were given by local authorities. Thus, the approach applies a social mining technique in order to extract features from the Twitter dataset and enrich the characteristics that emerge from a statistical database that qualifies the safety in Mexico City.

The remainder of the paper is organized as follows. Section 2 presents a discussion of the related work. Section 3 describes the general framework of the mobile information system. Section 4 depicts the experimental results of the mobile application as well as the comparison with other systems. Section 5 highlights the conclusion and future work.

2. Discussion about the Related Work

The study of crime prevention in several community environments has been explored. The Department of Police Services of the California State University offers a list of safety mobile applications [5]. Another example is Sentinel Campus, a free mobile application that provides crime statistics for more than 4,400 universities [13]. In the social web context, Wikicrimes is a Brazilian site that presents a crime status based on hot spot maps, in which users can report regional crime data [14]. That proposal retrieves information in a structured way through a specialized system, whereas our approach retrieves information in an unstructured way through a nonspecialized system. In [15], a study of crime prevention systems with an analysis of the main web and mobile crime information systems is described. Here, the information retrieval task is collaborative; however, the authors do not use a hybrid approach that integrates social and official sources. Other applications have used spatial statistics for identifying factors that directly affect crime occurrence in order to generate a public map of crime prevention [10]. Nevertheless, this work does not consider the case in which a person needs to cross an area or point on foot or by car.

On the other hand, mobile applications for routing and planning in city environments are increasingly becoming essential for improving urban spaces. This indicates the appearance of the next generation of mobile information systems, in which recommendations are focused on decision making in order to adequately support the growth of big cities. Applications like CrowdPlanner [16] and DroidOppPathFinder [17] aim to generate crowd-sensed route recommendation systems, which ask users to evaluate candidate routes recommended by different sources and methods and determine the best route based on the feedback of those users. The routes generated by these applications are evaluated by people; in our case, the assessment is made by a clustering algorithm and by the crime occurrence.

Recently, a large range of experimental applications that take into account social network and crowd-sensed data have been developed. In [18], a spatio- and crowd-based routing system is proposed in order to improve the recommendation quality; however, it is only based on volunteered information, and the enrichment of the crime data sources is not considered. A similar work is presented in [11], in which sentiments are considered as the perception of users with respect to security in certain geographic places [19]. Moreover, a system for routing police patrols based on genetic algorithms is presented in [3]. It generates particular routes that minimize the number of crime occurrences in a given area. The route is generated by using the shortest path with data from a fixed time window.

In [20], crime mapping and prediction based on historical data is proposed. This work makes centered analyses with spatiotemporal techniques. The analysis of crime prevention considers the network traffic and nodes. Another proposal has included the position of CCTV cameras to detect crime events [21]. In [22], a police patrolling strategy based on the Bayesian method and the ant colony algorithm was developed in order to reduce the average time between two consecutive visits to hot spots. In [23], a crime ontology is described by using two scenarios of analysis: (1) the circumstances of a crime, its mechanism, information about criminals, and knowledge about the methods of crime investigation and (2) the penal procedure and other methodological and tactical recommendations of criminality, crime features, and events. Another ontology-based system for crime analysis is explained in [24], in which an ontology to represent digital incidents associated with digital investigation and legal requirements is described.

The cited approaches have used spatial crime data, but they are not able to analyze unstructured data. In this work, we propose a hybrid approach embedded in a mobile application, which automatically combines social and official crime data by using an ontology exploration method and the Bayes algorithm to classify crime activities according to the Mexican penal code.


Figure 1: The general framework with its stages. (1) Retrieval of crime data sources (tweets database and official reports); (2) crime data repository (integrated crime database); (3) semantic processing (crime ontology); (4) clustering approach (Bayes algorithm); (5) generation of safe routes (safe routing algorithm).

3. The General Framework

The framework is composed of five stages: (1) retrieval of crime data sources, (2) the crime data repository, (3) the semantic processing, (4) the clustering approach, and (5) generation of safe routes (see Figure 1).

In summary, the approach consists of retrieving tweets that are related to crime events, taking into account attributes that define the time and location at which an event occurs. These tweets are analyzed and integrated with crime data from official sources (government institutions). The integration process is performed automatically by using descriptions and spatiotemporal attributes of the data. An application ontology is proposed to define candidates for the integration task. Later, the Bayes algorithm is used to classify data that cannot be integrated automatically. For cases of synonymous place names, the GeoNames web service is used to resolve the conflict.

Figure 2: Hybrid approach: integrated crime database from the official and social sources. Elements shown: tweets database, official reports, semantic analysis, spatiotemporal analysis, data analysis, and integrated database.

The categorization is carried out by analyzing the spatiotemporal attributes and the description of words that appear in the tweets and/or official database records. For example, crime events could have occurred while walking or in a car, with or without violence. The description of these events is used to categorize each record or tweet. The crime data cannot be directly classified because this information could be imprecise or incomplete. For instance, if a tweet does not contain the address or any reference such as points of interest, a time definition, and a crime or theft, then the tweet is not taken into consideration. Thus, information that is incomplete or imprecise will be classified by using the Bayesian algorithm. Moreover, the categorization allows us to know what type of crime/theft occurred in a certain place or point (with or without violence), while the classification defines the confidence (security) level for an area or street; this level is defined by the sum of all crime/theft points of the area or street. Thus, all the categorized and classified data are used as input parameters for the safe routing algorithm. The final result is a safe route that does not cross or contain points with high crime rates.
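The filtering and aggregation just described can be illustrated with a minimal Python sketch. It is not the implementation used in the system: the `Report` record, its field names, and the street names in the usage note are hypothetical, and the only assumptions taken from the text are that a report needs a location, a time, and a crime type to be categorized directly and that the level of a street is the sum of its crime/theft points.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Report:
    """Hypothetical record for a tweet or an official report."""
    street: Optional[str]      # location reference, None when missing
    hour: Optional[int]        # time definition, None when missing
    crime_type: Optional[str]  # e.g. "theft", None when missing


def is_complete(report: Report) -> bool:
    """A report is directly usable only if all three attributes are present."""
    return None not in (report.street, report.hour, report.crime_type)


def crime_points_per_street(reports: List[Report]) -> Tuple[Counter, List[Report]]:
    """Count crime/theft points per street; set incomplete reports aside.

    The deferred reports are the ones that a later (Bayesian) step
    would have to classify.
    """
    complete = [r for r in reports if is_complete(r)]
    deferred = [r for r in reports if not is_complete(r)]
    return Counter(r.street for r in complete), deferred
```

For example, two complete reports on a street named "Eje 3" and one report without a location would yield a count of 2 for "Eje 3" and one deferred report.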

In summary, the first stage presents the crime data sources to be processed, particularly relevant tweets and official reports. The second stage consists of building a crime data repository, which stores data that were integrated from Twitter and official sources. The third stage is in charge of classifying crime records by their description; the process is driven by an application ontology. The fourth stage gathers the events by crime/theft type, time, and location where they happened. Later, classified areas are generated according to their confidence level (how often a crime occurs in a place, date, and time) in order to be categorized as secure or insecure clusters. The fifth stage performs an estimation based on the crime rates, which is processed by applying a Bayesian algorithm in order to obtain crime values with respect to points in safe routes. It visualizes the safe route on the mobile application according to the following crimes: robbery with violence to passengers, theft without violence to passengers, car theft with violence, and car theft without violence.

3.1. Stage 1: Retrieval of Crime Data Sources. In this stage, the data are retrieved and collected from two sources: (1) crime information from the tweets dataset and (2) crime information from the official reports (see Figure 2). The considerations for both processes are outlined as follows.


Table 1: Crime-related events from Twitter accounts covering Mexico City.

Twitter account  Location     Creation date  Followers  Number of tweets  Belongs to government  Website
SSPDFVIAL        Mexico City  07/14/2010     369,115    15,465            Yes                    http://ssp.df.gob.mx
PolloVial        Mexico City  01/31/2013     667        7,191             No                     No website
Trafico889       Mexico City  05/14/2009     137,099    9,054             No                     http://siempre889.com/trafico
Alertux          Mexico City  10/16/2012     179,574    3,559             No                     http://www.alertux.com
072AvialCDMX     Mexico City  10/20/2010     83,535     13,471            Yes                    http://www.agu.df.gob.mx
RedVial          Mexico City  03/09/2010     63,702     4,481             No                     http://rvial.mx

Table 2: Record structure for the official database.

Event type                                Year 2012               Year 2013
                                          Q1   Q2   Q3   Q4       Q1   Q2   Q3   Q4
Robbery of passenger in public transport  159  175  154  111      102  134  115  86
Robbery of passerby without violence      47   80   54   79       73   46   49   57

From the tweets dataset, it is important to identify the tweets that talk about crime-related events. The following features were considered: specialized and generic accounts; mentions of crimes on popular streets; common abbreviations of crimes; and popular places where crimes were committed, such as malls, entertainment sites, private and public locations, and historical monuments. This process determined the popular words and relevant concepts by using the ontology. The most common n-grams were obtained from the tweets and sorted by occurrence frequency, and the recurring n-grams with more than 100 mentions were selected. From the most frequent unigram, bigram, and trigram lists, we identified by hand 456 common crimes on popular streets, 150 common crime-related events, 135 common crime hashtags, 69 common nicknames, 65 common buildings, places, and monuments, 34 common abbreviations, and 26 common combinations of prepositions.
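The n-gram selection step can be sketched in a few lines of Python. This is only an illustration under simple assumptions: whitespace tokenization and lowercasing stand in for whatever preprocessing was actually used, and the function names `ngrams` and `frequent_ngrams` are hypothetical. The threshold follows the text (strictly more than 100 mentions).

```python
from collections import Counter
from typing import Iterable, List, Tuple


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def frequent_ngrams(tweets: Iterable[str], n: int,
                    min_mentions: int = 100) -> List[Tuple[Tuple[str, ...], int]]:
    """Count n-grams over all tweets and keep those mentioned
    more than `min_mentions` times, sorted by decreasing frequency."""
    counts: Counter = Counter()
    for text in tweets:
        counts.update(ngrams(text.lower().split(), n))
    return [(gram, c) for gram, c in counts.most_common() if c > min_mentions]
```

Calling `frequent_ngrams(tweets, 1)`, `frequent_ngrams(tweets, 2)`, and `frequent_ngrams(tweets, 3)` would produce the unigram, bigram, and trigram lists that are then reviewed by hand.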

The dataset contains 450,250 tweets collected over a period of six months, from January 7, 2015, until June 24, 2015, without considering retweets and posts with blank spaces. Tweets were collected from reliable Twitter profiles that correspond to known services and institutions (see Table 1). The APIs used to retrieve crime tweets were the Search API and Twitter4J.

The official reports were retrieved through the InfoDF system. The process consisted of requesting from the SSP and PGJ government agencies the records with crime information, such as robbery of passersby and vehicle theft.

The information retrieved from the SSP agency contains more details than the information from the PGJ, such as the type of offense, the colony or area where the event occurred, and the time. A processed record from the official police database (PGJ-DF) is shown in Table 2. It represents a crime/theft associated with a location or neighborhood, as well as the time by quarter.

3.2. Stage 2: A Crime Data Repository. In this stage, the most relevant tweets are identified in order to carry out a semantic matching, which is driven by an ontology adopted from [23]. A fragment of this ontology for semantically integrating crime tweets with official reports is depicted in Figure 3.

The ontology is explored by Algorithm 1, which uses hyperonymy and synonymy as semantic relationships in order to find a match between each tweet and crime report. That is, if a tweet is expressed in the domains of time and location, its hyperonyms and synonyms are searched within the ontology to contextualize it, and they are stored in a vector. The process is repeated for each record of the official institutions, and the obtained vectors are compared. A match means that a term or concept is the same or has the same parent, so the pair is considered a candidate to be unified. In other words, if a match is found, then the data are merged (tweets and databases from the PGJ and SSP).
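The matching rule (same concept, or same parent concept) can be sketched as follows. This is a simplified illustration, not the OntoExplore algorithm itself: the `PARENT` and `SYNONYMS` dictionaries are a hypothetical toy ontology whose hierarchy is assumed for the example, and the function names are invented.

```python
from typing import Optional

# Hypothetical toy ontology: each concept maps to its parent (hyperonym).
PARENT = {
    "carjacking": "violent theft",
    "robbery": "violent theft",
    "armed robbery": "robbery",
    "violent theft": "theft",
    "nonviolent theft": "theft",
}

# Hypothetical synonym table: surface term -> ontology concept.
SYNONYMS = {
    "mugging": "robbery",
    "holdup": "armed robbery",
}


def canonical(term: str) -> str:
    """Map a term to its ontology concept, resolving synonyms."""
    t = term.lower().strip()
    return SYNONYMS.get(t, t)


def parent_of(term: str) -> Optional[str]:
    """Return the hyperonym of a term, or None if it is not in the ontology."""
    return PARENT.get(canonical(term))


def is_candidate(tweet_term: str, record_term: str) -> bool:
    """Two terms are unification candidates if they denote the same
    concept or if their concepts share the same parent."""
    a, b = canonical(tweet_term), canonical(record_term)
    if a == b:
        return True
    pa, pb = PARENT.get(a), PARENT.get(b)
    return pa is not None and pa == pb
```

With this toy data, "mugging" in a tweet and "robbery" in an official record are candidates through synonymy, and "carjacking" and "robbery" are candidates because both have the parent "violent theft".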

3.3. Stage 3: The Semantic Processing. The process applied to tweets comprises the following tasks. (1) Tokenizer. It consists of separating and identifying the tweets as well as filtering them to remove the stop words from the token stream. This task was performed by the Lucene system [25]. (2) Processing Data. This task involves the analysis of tweets; they are described by spatial, temporal, and description attributes in order to identify where, when, and how a crime event occurred. (3) Categorization. It uses the semantic matching to group the tweets according to the crimes defined in the ontology. The semantic distance for each processed tweet is computed by using the weighted crime rate. The output of this stage is a set of tweets categorized by crime type. Figure 4 shows a general diagram that describes the integration of these tasks. To illustrate the above, two different records from the Twitter dataset at different levels of spatial granularity are presented as follows:

(1) SSPDFVIAL: crime report in public transport with a firearm, Iztapalapa.

(2) Alertux: theft of students in Ecatepec, streets 105 and 106, RedVial.
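The tokenizing and attribute identification tasks can be sketched on records like the two above. The sketch below is plain Python and does not use Lucene; the stop-word list, gazetteer, and crime vocabulary are tiny hypothetical stand-ins for the dictionary, the GeoNames gazetteer, and the ontology terms.

```python
import re
from typing import Dict, List, Optional

# Illustrative stand-ins; the real system relies on larger resources.
STOP_WORDS = {"a", "in", "with", "of", "the", "and", "at"}
GAZETTEER = {"iztapalapa", "ecatepec"}
CRIME_TERMS = {"theft", "robbery", "crime", "firearm"}


def tokenize(text: str) -> List[str]:
    """Lowercase, split on non-alphanumeric characters, drop stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower())
            if t not in STOP_WORDS]


def extract_attributes(text: str) -> Dict[str, object]:
    """Identify where, when, and what from a tweet-like text."""
    tokens = tokenize(text)
    time_match = re.search(r"\b(\d{1,2}):\d{2}\b", text)
    hour: Optional[int] = int(time_match.group(1)) if time_match else None
    return {
        "where": [t for t in tokens if t in GAZETTEER],
        "when": hour,
        "what": [t for t in tokens if t in CRIME_TERMS],
    }
```

Applied to the first record, the sketch finds a neighborhood ("iztapalapa") and descriptive terms ("crime", "firearm") but no time, which mirrors the kind of partial information the later stages have to deal with.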


Input: Tweets
Result: Unified and categorized tweet

(1) Let q[i] = elements of tweet
(2) n = 0
(3) while n ≠ i do
(4)     Parsing and identification(q[i])
(5)     node.start()
(6)     while node ≠ null do
(7)         j = 0, i = 0
(8)         if Hyperonomy or Similarity(concept name) then
(9)             conVec[j] = get_parent_and_children(node)
(10)        node.next()
(11)        CrowdVector[j] = eventType_search(conVec[j])
(12)        temporal_search(conVec[j])
(13)        j++, k++, n++

Algorithm 1: The OntoExplore algorithm.

Figure 3: A fragment of the crime ontology to semantically match tweets and official information. Concepts shown: Thing, Crime, Crime against property, Crime against person, Theft, Violent theft, Nonviolent theft, Carjacking, Robbery, Car stolen from parking, Armed robbery, and Highway robbery.

Figure 4: Tasks involved in the semantic processing. Elements shown: tweets analysis, spatial analysis (GeoNames and gazetteer), linguistic analysis (dictionary), temporal analysis, filtering and cleaning data (Lucene system), the OntoExplore algorithm execution, categorized crime records, and integrated crime database.

The cleaning process includes an adaptation of Algorithm 2 for removing stop words, according to the definition presented in [26].

On the other hand, a tweet is associated with a record of the official database according to its relevance, by using the ontology and the hyperonymy relation. The relevance is measured by contextualizing the record and the tweet. This means that each semantic item of the tweet and the record is identified and then searched in the crime ontology in order to find matching terms, synonyms, or related concepts.


Input: An arbitrary stop word dictionary T, the set of schema trees QI
Result: A maximal set of stop words T′

(1) T′ ← ∅
(2) W = the set of all words in the domain
(3) D = T ∩ W
(4) while ∃ word w ∈ D
(5)     for each interface q ∈ QI
(6)         remove the stop word constraints for the labels of sibling nodes
(7)     if no stop word constraint is violated then
(8)         T′ ← T′ ∪ {w}
(9)     else
(10)        remove antonyms of w appearing in D from T′
(11) return T′

Algorithm 2: Removing stop words.

For example, in the first tweet above, the location is not precise: only the name of the neighborhood is given, but the streets are not defined. However, these features enrich the description of the event.

Therefore, the tweet is contextualized and related to the concept "violence." Moreover, the second tweet has a precise location, but it is imprecise regarding the event description. This process generates a semantic classification matched to the official database categorization, in which each tweet is described according to a few categories, either Boolean values or descriptors (e.g., theft, crime, and violence). The location is derived from the neighborhood, street name, or spatial relation (e.g., near and far). The time is computed from explicit data (at 11:00 am) or temporal adverbs (e.g., now and afternoon). Thus, a parsing structure that contains the attributes identified by the semantic classification (event type, crime, and theft in a car or while walking) is generated for evaluating the domains. The parsing structure obtained by analyzing the tweets is presented as follows:

semantic classification {Theft, Crime, Theft with violence, Theft walking, Theft in car, Theft without violence}
attribute day {Mon, Tue, Wed, Thu, Fri, Sat, Sun}
attribute id location {30339461, 30339462, 30339493, 30339495, 30339496, 30339671}
attribute time {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24}
data
Mon, 30339461, 10, Theft
Fri, 30339671, 20, Crime

The label "semantic classification" represents the result of the semantic matching, "attribute" represents the valid attributes and their domains, and "data" denotes the classified data after syntactic and semantic-syntactical analysis.
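A data row of this structure can be validated against the declared domains with a short Python sketch. It assumes, for illustration only, that each row is a comma-separated line with the fields day, location id, time, and class; the `CrimeRecord` type and `parse_row` function are hypothetical names.

```python
from dataclasses import dataclass

# Domains copied from the structure above.
DAYS = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
CLASSES = ("Theft", "Crime", "Theft with violence", "Theft walking",
           "Theft in car", "Theft without violence")


@dataclass(frozen=True)
class CrimeRecord:
    day: str
    location_id: int
    hour: int
    label: str


def parse_row(row: str) -> CrimeRecord:
    """Parse 'day, location id, time, class' and check each domain.

    Raises ValueError when a field is missing or outside its domain.
    """
    day, location, hour, label = [field.strip() for field in row.split(",")]
    if day not in DAYS:
        raise ValueError(f"unknown day: {day}")
    hour_value = int(hour)
    if not 1 <= hour_value <= 24:
        raise ValueError(f"time out of range: {hour}")
    if label not in CLASSES:
        raise ValueError(f"unknown class: {label}")
    return CrimeRecord(day, int(location), hour_value, label)
```

For instance, `parse_row("Mon, 30339461, 10, Theft")` yields a record for Monday at hour 10 in location 30339461, while a row with hour 25 is rejected.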

3.4. Stage 4: The Clustering Approach. In this stage, an approach based on the Bayes algorithm is proposed for clustering the crime data that are stored in the repository. A bag of words is used to classify the crimes; that is, there is a set of words that describes each type of theft or crime. If a tweet or record contains these words, it is classified according to the type of theft or crime that these words represent in the database. The Bayes algorithm, in turn, classifies tweets or records that contain none or only one of these words. The goal is to classify tweets that are relevant but that the semantic processing cannot classify. Thus, a tweet that describes a theft or crime but does not have an exact location, time, or description will be classified by using the Bayes algorithm.

The Bayes algorithm takes into account certain features (the words that compose a tweet or a record from the official database), which are identified and assigned to a particular cluster. In this task, many clusters can appear, and they are filtered later. The Bayes theorem is useful not only to cluster data related to crime or theft but also to cluster data that contain words that do not belong to any type of crime; these clusters are omitted. The use of the Bayes algorithm had two specific goals: the first is to classify data that cannot be semantically classified, and the second is to perform an estimation. For example, a user requests a safe route in a certain area, but there are no theft/crime reports for this area in either Twitter or the official database. In this case, the approach finds the cluster to which the area belongs (e.g., a safe or unsafe cluster) and makes an estimation for this zone to determine which points are probably safer than others. Finally, weights are assigned to all points, and these weights are sent as parameters to the safe routing algorithm.
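As an illustration of how words can drive a Bayesian classification, the following Python sketch implements a small multinomial naive Bayes classifier with add-one smoothing. This specific variant is an assumption: the text only states that the Bayes algorithm is used, and the class name, training sentences, and labels in the usage note are hypothetical.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List


class NaiveBayesText:
    """Minimal multinomial naive Bayes over whitespace tokens
    (an assumed variant, with add-one smoothing)."""

    def fit(self, docs: List[str], labels: List[str]) -> "NaiveBayesText":
        self.class_counts: Counter = Counter(labels)
        self.word_counts: Dict[str, Counter] = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc: str) -> str:
        total = sum(self.class_counts.values())
        best_label, best_score = "", -math.inf
        for label, n_docs in self.class_counts.items():
            score = math.log(n_docs / total)  # log prior
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in doc.lower().split():
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Trained on a few labeled descriptions (for example, "gun robbery in metro" as violent and "car stolen from parking lot" as nonviolent), the classifier assigns a new text such as "robbery with gun on bus" to the class whose vocabulary it resembles most.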

Equation (1) was applied to the data in order to cluster them. It is also used to estimate a probability for each point that belongs to a route and for each crime event. In other words, the equation defines the likelihood that an event occurs on a given day and time and at a specific location:

\[ P(c \mid x) = \frac{P(c) \times P(x \mid c)}{P(x)} \qquad (1) \]


Input: $N = \{n_1, n_2, \ldots, n_k\}$: set of nodes; $n_s$: start node; $n_f$: finish node

Result: $D = \{d_1, d_2, \ldots, d_k\}$: set of distance values (weighted crime rate)

(1) Assign distance values $d(n_s) = 0$, $d(n_i) = \infty$ $\forall n_i \neq n_s$
(2) Let $U = N - \{n_s\}$
(3) Let current node $n_c = n_s$
(4) while exists($n_c$) do
(5)     foreach neighbor($n_c$) do
(6)         Let $n_j$ = neighbor($n_c$)
(7)         if $d(n_j) > \mathrm{length}(n_j, n_c) + d(n_c)$ then
(8)             $d(n_j) = \mathrm{length}(n_j, n_c) + d(n_c)$
(9)         $U = U - \{n_j\}$
(10)    if $n_f \notin U$ or $\min(\mathrm{length}(n_i, n_j)) = \infty$ $\forall n_i, n_j \in U$ then
(11)        return $d(n_k)$ $\forall n_k \in N$
(12)    else
(13)        $n_c = n_i$ such that $d(n_i) = \min(d(n_j))$ $\forall n_j \in U$

Algorithm 3: The safe routing algorithm.
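As an illustration of the idea behind Algorithm 3, the following is a minimal sketch of a Dijkstra search in which the cost of a path is the accumulated crime weight of the nodes it visits. The graph, the node names, the weights, and the choice of charging the weight of the node being entered are assumptions made for this example; the sketch uses a standard priority-queue formulation and is not a transcription of the authors' implementation.

```python
import heapq

def safest_route(graph, node_risk, start, goal):
    """graph: {node: [neighbor, ...]}; node_risk: {node: weight in [0, 1]}.
    Assumption: stepping onto a node costs that node's crime weight."""
    best = {start: 0.0}
    previous = {}
    queue = [(0.0, start)]
    visited = set()
    while queue:
        cost, node = heapq.heappop(queue)
        if node in visited:
            continue
        visited.add(node)
        if node == goal:
            break
        for neighbor in graph.get(node, []):
            new_cost = cost + node_risk.get(neighbor, 0.0)
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if goal not in best:
        return None, float("inf")
    # Rebuild the path by walking the predecessor links backwards
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return path[::-1], best[goal]

# Hypothetical street network: B is an untrusted point, C is comparatively safe
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
node_risk = {"A": 0.0, "B": 0.89, "C": 0.10, "D": 0.05}
route, total_risk = safest_route(graph, node_risk, "A", "D")
```

With these hypothetical weights the search prefers the route through C over the route through B, even though both have the same number of hops.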

The clustering approach obtains probabilities that represent the estimation for specific crimes that were previously classified semantically. These probabilities define possible hot spots, which represent events that are directly associated with streets. In addition, the method classifies incoming data generated by the categorization task by searching for patterns that were defined as words associated with a crime. It generates predictive spatiotemporal patterns according to the indicated parameters. Thus, the goal is to find, for specific locations on a given route, the probability that a criminal event occurs when a user travels on that route. The probabilities are based on the following combinations:

Let $P(\text{coordinate} = [x, y] \mid \text{event})$ be the probability of an event directly related to its location, and let $P(\text{day}, \text{time} \mid \text{event})$ be the probability of an event taking into account when it occurs. The terms used to compute these probabilities are as follows:

$P(c \mid x)$ is the posterior probability, that is, the probability that an event occurs considering past events.

$P(x)$ is the total probability, which denotes the number of times that some given attributes appear in the events (e.g., theft).

$P(x \mid c)$ is the conditional probability, which is denoted by the number of times that a target appears in each attribute (e.g., car), taking into account the total number of times that the target appears in all attributes.

$P(c)$ is the a priori probability, which represents the number of times that a target appears in all events.

Events, attributes, and locations are identified and classified by the Bayes algorithm. The clusters are generated according to the defined patterns that were used in the training process. In particular, the places are described as trusted or untrusted. The probability is used as a weighted value, and it is normalized to the range [0, 1], where 0 represents the lowest and 1 the highest likelihood. These values are then used in the route generation. An example of the clustering computation with the obtained likelihood is presented as follows:

Classified Event (“12/05/2014”, “3”, Monday, 12255928567), “Event”: “Theft”, “ID Coordinate”: “30339694”, “latitude”: 19.4038744, “longitude”: 99.150226, “Probability”: “0.0009295401”, “diagnosed weight 0.89”
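The text states that the probabilities are normalized to the range [0, 1] but does not give the normalization formula. A minimal sketch, assuming a simple min-max rescaling over a set of points (the point identifiers and raw values below are hypothetical), could look as follows.

```python
def normalize_weights(probabilities):
    """Min-max rescaling to [0, 1]; the choice of min-max is an assumption."""
    low = min(probabilities.values())
    high = max(probabilities.values())
    if high == low:
        # All points are equally likely, so no point stands out
        return {point: 0.0 for point in probabilities}
    return {point: (p - low) / (high - low) for point, p in probabilities.items()}

# Hypothetical raw probabilities per point
raw = {"p1": 0.2, "p2": 0.4, "p3": 0.8}
weights = normalize_weights(raw)
```

After rescaling, the least likely point receives weight 0 and the most likely point receives weight 1, matching the interpretation of the range given above.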

3.5. Stage 5: Generation of Safe Routes. The safe routes are visualized in the mobile mapping application, which was developed as a client by using REST (Representational State Transfer), a web service technology in which the client generates requests and parses the data returned by the server. The spatial features are supported by OpenStreetMap [12]. The safe route is obtained by an adaptation of the Dijkstra algorithm in which each node in the network is assigned an average weight obtained from the number of crimes for a specific point or geographic area. The values generated by the Bayes algorithm reflect the probability that an event such as “theft/crime” occurs or not.

Let $n_s$ be the starting node, called the initial node, and let $d(n_k)$ be the distance of node $n_k$ from the initial node.

The safe route algorithm assigns initial distance values based on the weighted crime rate for these points. The weighted crime rate is computed from the number of reported crime/theft occurrences at a specific point and time. Algorithm 3 describes the process for obtaining the safe routes.

Thus, this algorithm uses the data sent by the mobile application (origin and destination points) in order to return a route that avoids locations where crime events have occurred. The confidence level is a metric that is computed by considering the number of crime incidents that have occurred at a specific point and time. It is used as a weighted value by the safe routing algorithm. The displayed route is marked with the coordinates returned by the algorithm in order to visualize the route on the mobile mapping application.

The weights can also be relaxed by spatial and temporal values such as date (day, month, and year), location (point or area), and time (hour or period). At the interface level, users can configure the route search process by setting some parameters: the route can be generated by using the social, the official, or the integrated data source, and the search process also distinguishes between modes of transportation (e.g., walking or by car).
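The configuration options above amount to selecting which events contribute to the weights. A minimal sketch of such a selection is shown below; the record fields `date`, `mode`, and `source` and the sample events are hypothetical and only serve to illustrate filtering by period, mode of transportation, and data source.

```python
from datetime import date

def filter_events(events, start, end, mode=None, source=None):
    """Keep events inside [start, end], optionally restricted by transport mode and data source."""
    selected = []
    for event in events:
        if not (start <= event["date"] <= end):
            continue
        if mode is not None and event["mode"] != mode:
            continue
        if source is not None and event["source"] != source:
            continue
        selected.append(event)
    return selected

# Hypothetical event records
events = [
    {"date": date(2015, 4, 10), "mode": "walking", "source": "social"},
    {"date": date(2015, 6, 2), "mode": "car", "source": "official"},
    {"date": date(2014, 12, 5), "mode": "walking", "source": "official"},
]
walking_2015 = filter_events(events, date(2015, 4, 1), date(2015, 7, 31), mode="walking")
```

Leaving `mode` or `source` unset corresponds to using all modes of transportation or the integrated data source, respectively.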

4. Experimental Results and Evaluation

The mobile application was implemented in Android 4.0, and the tests were performed on mobile devices. In this section, the results based on information of Mexico City are presented. The repository is composed of 5441 events, which were collected and processed from the tweets and correlated with official reports. From this dataset, a frequency table is generated, which indicates the number of times that each attribute appears for given events such as theft and crime.

The frequency values are used as weights when deriving a safe route. All values are assigned to the corresponding nodes in the network. In addition, the probability that an event occurs at some location and on a specific date is computed and stored in a vector table. This also allows a comparison of crime probabilities at different places. This sort of summarization supports the evaluation of the probability that an event occurs at a specific location and/or a particular time (e.g., the probability that a crime or theft occurs on “Avenue Eje Central on Monday”). The probability that an event occurs at a given place and time is illustrated by the following example: “On Monday at 9:00 am in Iztapalapa.”

$P(x) = P(\text{Monday}) = 220/2149$

$P(x \mid c) = P(\text{Monday} \mid \text{Theft}) = 18/218$

$P(c) = P(\text{Theft}) = 218/2149$

$P(c \mid x) = P(\text{Theft} \mid \text{Monday}) = P(\text{Monday} \mid \text{Theft}) \times P(\text{Theft}) / P(\text{Monday}) = 0.0818$
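The arithmetic of this example can be checked with a short sketch. It reads the values above as the fractions 220/2149, 18/218, and 218/2149 (the variable names are chosen for this example only) and applies Bayes' theorem with exact rational arithmetic.

```python
from fractions import Fraction

# Counts read from the example above
total_events = 2149
monday_events = 220
theft_events = 218
monday_theft_events = 18

p_monday = Fraction(monday_events, total_events)                     # P(x)
p_theft = Fraction(theft_events, total_events)                       # P(c)
p_monday_given_theft = Fraction(monday_theft_events, theft_events)   # P(x | c)

# Bayes' theorem: P(c | x) = P(x | c) * P(c) / P(x)
p_theft_given_monday = p_monday_given_theft * p_theft / p_monday

print(p_theft_given_monday, round(float(p_theft_given_monday), 4))  # prints "9/110 0.0818"
```

The intermediate totals cancel, so the posterior reduces to 18/220, that is, the share of Monday events that are thefts.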

(1) The probabilities of some classified crime events are computed.

(2) The selected combinations (e.g., theft at a particular location and time) are defined. However, this problem is NP-hard; thus, a semantic classification has been applied to restrict the search space and decrease the computational complexity.

(3) This provides the probabilities for all events that occur for each particular domain value.

An example of computing the probability of crime events, such as the probability that “a theft occurs at a given location on Friday or a crime occurs on Tuesday at noon,” is presented as follows:

$P(\text{Theft}) = 0.104$, $P(\text{Crime}) = 0.065$, $P(\text{Crime}(w)) = 0.017$, $P(\text{Crime}(p)) = 0.812$

$P(\text{Theft} \mid \text{day} = \text{Monday}) = 0.032$, $P(\text{Crime} \mid \text{day} = \text{Monday}) = 0.13$, $P(\text{Crime}(w) \mid \text{day} = \text{Monday}) = 0.017$, $P(\text{Crime}(p) \mid \text{day} = \text{Monday}) = 0.813$

$P(\text{Theft} \mid \text{hour} = 9) = 0.1$, $P(\text{Crime} \mid \text{hour} = 9) = 0.05$, $P(\text{Crime}(w) \mid \text{hour} = 9) = 0.05$, $P(\text{Crime}(p) \mid \text{hour} = 9) = 0.8$

$P(\text{Theft} \mid \text{coordinate} = 1245612) = 0$, $P(\text{Crime} \mid \text{coordinate} = 1245612) = 0.2$, $P(\text{Crime}(w) \mid \text{coordinate} = 1245612) = 0$, $P(\text{Crime}(p) \mid \text{coordinate} = 1245612) = 0.8$

$\Rightarrow P(\text{Theft} \mid \text{time} = 9 \wedge \text{day} = \text{Monday} \wedge \text{coordinate} = 1245612) = P(\text{Theft}) \times P(\text{Theft} \mid \text{day} = \text{Monday}) \times P(\text{Theft} \mid \text{time} = 9) \times P(\text{Theft} \mid \text{coordinate} = 1245612) = 0.104 \times 0.032 \times 0.1 \times 0 = 0$
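The combined estimate above is a plain product of the individual factors, which the following sketch reproduces. The dictionary keys are labels invented for this example; the numeric values are the ones listed for theft, read with their decimal points (0.104, 0.032, 0.1, and 0).

```python
from math import prod  # available in Python 3.8 and later

# Factors for "theft on Monday at 9 at coordinate 1245612", as listed above
theft_factors = {
    "prior": 0.104,
    "day=Monday": 0.032,
    "hour=9": 0.1,
    "coordinate=1245612": 0.0,
}

combined = prod(theft_factors.values())
print(combined)  # prints 0.0
```

Note that a single zero factor forces the whole product to zero. A smoothing step (for example, add-one smoothing) is a common remedy in naive Bayes models, although the text does not say whether one was applied here.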

The example considers the defined points of a given area; that is, we have the estimation that an event occurs at point $x$ at 11 am, or at point $x$ at several hours of the day. The probability that a theft occurs increases or decreases depending on the hour and the day of the week. The user can also ask for all possible combinations near his or her location. For instance, for a point $y$, the user may want to know the probability of suffering a theft at different hours of the day, or which event (type of crime) is the most common at a point $x$ at 6 pm.

Figures 5 and 6 show safe routes between two points, where the circles on the map represent untrusted points. Figure 5 shows the route for a pedestrian. The route was generated by avoiding the untrusted points; although this means that the path could be longer, it is the safer route. Figure 6 depicts the safe route generated for the same two points when the user is a driver rather than a pedestrian. In this case, the route is different because the estimation is carried out by processing only the event types related to drivers.

Finally, it is possible to generate routes considering the data reports of different periods of time (e.g., thefts and crimes that occurred from April to July 2015). The safe routing algorithm generates a specific route that avoids the points where theft events have occurred in the past. Nevertheless, the generated route can change if the user changes the time period taken into account (e.g., from January to March 2015) or includes all the temporal data available (e.g., from 2013 to 2015). This allows us to know which locations are safer than others on a specific day and hour in the same geographic area.


Figure 5: Suggested safe route for a pedestrian.

Figure 6: Suggested safe route for a user in a car.

Figure 7 depicts the events that occurred to pedestrians and drivers, from official and social sources. The icons represent theft from pedestrians and drivers, with and without violence, as well as house theft.

On the other hand, the results obtained in Figures 5, 6, and 7 were compared with the Official Crime Map System (http://www.mapadelincuencial.org.mx). Figure 8 depicts only events that volunteers marked as points where a crime or theft with violence occurred. The difference in the data is evident, and the possibilities to configure that system are very limited in comparison with the proposed mobile information system (see Figure 9).

Figure 10 shows the hybrid map view from the web version. The events in yellow were reported by the social network source, and the events in blue come from the official source.

Figure 7: All events that occurred from 2013 to 2015.

Figure 8: Crime map (made by volunteers).

5. Conclusions and Future Work

In this paper, a hybrid approach for finding safe routes using semantic processing and classification algorithms with data provided from a social network and official crime reports is presented. As a case study, a mobile information system was developed. It generates safe routes based on crime reports of Mexico City from a large tweet repository and official databases. The data are semantically classified to determine whether a tweet describes a crime event or theft. Tweets that cannot be identified as crimes are evaluated by the Bayes algorithm, which clusters them according to the description they contain. The clusters are then used to make predictions regarding the possibility that a crime occurs at a specific place and hour. The spatiotemporal analysis determines the location where the crime events occurred. Moreover, the confidence level of a location was defined, and it was used as a parameter for computing the safer route.

Figure 9: Theft/crime events that occurred at a specific time, from the mobile information system.

Figure 10: The hybrid map view from the web version.

The main contributions of this work are as follows: (1) the design of a hybrid approach based on semantic processing to retrieve crime data from a social network source; (2) the integration of crowd-sensed data with official government sources; (3) the validation of a tweet, performed by comparing the sources using $k$-fold cross validation; (4) the estimation model based on the Bayes algorithm to obtain safe routes with data provided by the mobile device; and (5) the design of a mobile information system to generate safe routes.

According to the results of the estimation, the certainty degree is around 75% of effectiveness. It was tested on areas with crime data from which the records were intentionally removed while an original copy was kept. The results of the estimation were then compared with the original copy. We found that the estimation has a performance of 75% for all the points of the data sample.
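The evaluation procedure described above (hide known records, estimate them, and compare with the retained copy) can be sketched as a simple agreement rate. The point identifiers and the "safe"/"unsafe" labels below are hypothetical and only illustrate how such a percentage could be obtained; they are not the paper's data.

```python
def estimation_accuracy(original, estimated):
    """Fraction of held-out points whose estimated label matches the original copy."""
    matches = sum(1 for point, label in original.items() if estimated.get(point) == label)
    return matches / len(original)

# Hypothetical held-out points and the labels estimated for them
original = {"p1": "unsafe", "p2": "safe", "p3": "unsafe", "p4": "safe"}
estimated = {"p1": "unsafe", "p2": "safe", "p3": "safe", "p4": "safe"}
accuracy = estimation_accuracy(original, estimated)
```

In this toy case three of the four held-out points are recovered correctly, which gives an agreement rate of 0.75.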

In addition, a metric to measure the confidence level, or security, of certain points and areas of Mexico City has been proposed. It allows finding safe routes along paths with a low crime rate. Moreover, the mobile application gathers long-term statistical data together with almost real-time information from citizens, who act as sensors in the city. The results of the mobile system have been tested and compared with the Crime Map System.

Future work is oriented towards evaluating the cognitive perception of people, taking into consideration points or geographic places, for finding comfortable routes. Sentiment analysis will be addressed in order to incorporate this feature as a parameter in the computation of routes. Additionally, we propose the integration of our mobile application with the Mexican CCTV camera systems for sensing the dynamics of certain areas in the city. This contribution is focused on developing mobile information systems for routing and urban planning in city environments. Mobile applications are increasingly becoming essential for analyzing the urban dynamics of big cities. Thus, the next generation of mobile information systems will be designed around real-time road network conditions. In addition, this generation is oriented towards improving the quality of human life and increasing the sustainability of smart cities.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

This work was partially sponsored by the Instituto Politécnico Nacional (IPN), the Consejo Nacional de Ciencia y Tecnología (CONACYT), and the Secretaría de Investigación y Posgrado (SIP) under Grants 20162006, 20161899, 20161869, and 20161611.

References

[1] H. Zhang, Y. Xu, and X. Wen, “Optimal shortest path set problem in undirected graphs,” Journal of Combinatorial Optimization, vol. 29, no. 3, pp. 511–530, 2015.

[2] W. Templeton-Steadman and R. Williams, “Information delivery system and method for mobile appliances,” US Patent App. 11/562,054, 2006.

[3] D. Reis, A. Melo, A. L. V. Coelho, and V. Furtado, “Towards optimal police patrol routes with genetic algorithms,” in Intelligence and Security Informatics: IEEE International Conference on Intelligence and Security Informatics, ISI 2006, San Diego, CA, USA, May 23-24, 2006. Proceedings, vol. 3975 of Lecture Notes in Computer Science, pp. 485–491, Springer, Berlin, Germany, 2006.

[4] V. Ceikute and C. S. Jensen, “Routing service quality - local driver behavior versus routing services,” in Proceedings of the IEEE 14th International Conference on Mobile Data Management (MDM ’13), vol. 1, pp. 97–106, IEEE, June 2013.

[5] Safety Apps, October 2015, http://www.csun.edu/police/safety-apps.

[6] E. Galbrun, K. Pelechrinis, and E. Terzi, “Urban navigation beyond shortest route: the case of safe paths,” Information Systems, vol. 57, pp. 160–171, 2016.

[7] T. Wang, C. Rudin, D. Wagner, and R. Sevieri, “Learning to detect patterns of crime,” in Machine Learning and Knowledge Discovery in Databases, vol. 8190 of Lecture Notes in Computer Science, pp. 515–530, Springer, Berlin, Germany, 2013.

[8] C.-H. Yu, W. Ding, P. Chen, and M. Morabito, “Crime forecasting using spatio-temporal pattern with ensemble learning,” in Advances in Knowledge Discovery and Data Mining: 18th Pacific-Asia Conference, PAKDD 2014, Tainan, Taiwan, May 13–16, 2014. Proceedings, Part II, Lecture Notes in Computer Science, pp. 174–185, Springer, Berlin, Germany, 2014.

[9] L. Scott and N. Warmerdam, “Extend crime analysis with arcgis spatial statistics tools,” ArcUser Magazine, 2005.

[10] M. Leitner, Crime Modeling and Mapping Using Geospatial Technologies, vol. 8, Springer, Dordrecht, The Netherlands, 2013.

[11] J. Kim, M. Cha, and T. Sandholm, “Socroutes: safe routes based on tweet sentiments,” in Proceedings of the 23rd ACM International Conference on World Wide Web (WWW ’14), pp. 179–182, Seoul, South Korea, April 2014.

[12] M. Haklay and P. Weber, “Openstreetmap: user-generated street maps,” IEEE Pervasive Computing, vol. 7, no. 4, pp. 12–18, 2008.

[13] J. M. Sánchez Bernabeu, J. V. Berná Martínez, and F. Maciá Pérez, “Smart sentinel: monitoring and prevention system in the smart cities,” International Review on Computers and Software, vol. 9, no. 9, pp. 1554–1559, 2014.

[14] Wiki crimes, 2015, http://www.wikicrimes.org.

[15] T. Moon, S. Heo, and S. Lee, “Ubiquitous crime prevention system (UCPS) for a safer city,” Procedia Environmental Sciences, vol. 22, pp. 288–301, 2014.

[16] H. Su, K. Zheng, J. Huang, H. Jeung, L. Chen, and X. Zhou, “Crowdplanner: a crowd-based route recommendation system,” in Proceedings of the 30th IEEE International Conference on Data Engineering (ICDE ’14), pp. 1144–1155, IEEE, Chicago, Ill, USA, April 2014.

[17] V. Arnaboldi, M. Conti, and F. Delmastro, “Implementation of CAMEO: a context-aware middleware for opportunistic mobile social networks,” in Proceedings of the IEEE International Symposium on a World of Wireless, Mobile and Multimedia Networks (WoWMoM ’11), pp. 1–3, Lucca, Italy, June 2011.

[18] M. Nagarajan, K. Gomadam, A. P. Sheth, A. Ranabahu, R. Mutharaju, and A. Jadhav, “Spatio-temporal-thematic analysis of citizen sensor data: challenges and experiences,” in Web Information Systems Engineering - WISE 2009, vol. 5802 of Lecture Notes in Computer Science, pp. 539–553, Springer, Berlin, Germany, 2009.

[19] N. Powdthavee, “Unhappiness and crime: evidence from South Africa,” Economica, vol. 72, no. 287, pp. 531–547, 2005.

[20] J. Ratcliffe, “Crime mapping: spatial and temporal challenges,” in Handbook of Quantitative Criminology, pp. 5–24, Springer, New York, NY, USA, 2010.

[21] P. Gupta, G. N. Purohit, and A. Dadhich, “Crime prevention through alternate route finding in traffic surveillance using cctv cameras,” International Journal of Engineering and Advanced Technology, vol. 2, no. 5, pp. 414–418, 2013.

[22] H. Chen, T. Cheng, and S. Wise, “Designing daily patrol routes for policing based on ANT colony algorithm,” ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. II-4/W2, pp. 103–109, 2015.

[23] D. Dzemydiene and E. Kazemikaitiene, “Ontology-based decision support system for crime investigation processes,” in Information Systems Development, pp. 427–438, Springer, New York, NY, USA, 2005.

[24] Y. Chabot, A. Bertaux, C. Nicolle, and T. Kechadi, “A complete formalized knowledge representation model for advanced digital forensics timeline analysis,” Digital Investigation, vol. 15, pp. 83–100, 2015.

[25] E. Hatcher, O. Gospodnetic, and M. McCandless, Lucene in Action, Manning Publications, Greenwich, Conn, USA, 2004.

[26] F. Mata-Rivera, M. Torres-Ruiz, G. Guzmán, M. Moreno-Ibarra, and R. Quintero, “A collaborative learning approach for geographic information retrieval based on social networks,” Computers in Human Behavior, vol. 51, pp. 829–842, 2015.


crowdsourcing methods in mobile information systems [5]. In [6], a risk model for urban road networks is proposed; it uses a mathematical model based on civic datasets of criminal activity for mobility traces in the city. However, it ignores the temporal dimension for filtering crime data. In [7, 8], historical crime data are processed to identify patterns that help in crime prediction. In our work, we identify the risk level without considering patterns directly. Commercial spatial analysis tools are presented in [9, 10]. They are very useful for studying and analyzing crime patterns in order to devise strategies against crime. Crowdsourcing is an interesting approach for analyzing crime data, including social information [11]. According to [12], there are important relationships between crime and people’s activities in the spatiotemporal context.

This paper introduces a framework based on an approach for integrating official crime reports given by public authorities with crowdsourcing data obtained from the Twitter streaming. The goal is to provide safe routes with a mobile information system in order to increase the confidence level of the citizens who use the urban infrastructure of the city. In summary, the approach combines statistical data obtained from official information with the perception of people, derived on a daily basis and reflected by crowdsourcing data. Thus, this framework takes into account events for a particular time, and continuous updates are performed in order to provide recent events that occur at specific places and are reported by a social network community. The crowdsourcing-based method considers a large database of tweets, which were collected by the mobile application, while official data were given by local authorities. The approach applies a social mining technique in order to extract features from the Twitter dataset and enriches the characteristics that emerge from a statistical database that qualifies the safety in Mexico City.

The remainder of the paper is organized as follows. Section 2 presents a discussion of the related work. Section 3 describes the general framework of the mobile information system. Section 4 depicts the experimental results of the mobile application as well as the comparison with other systems. Section 5 highlights the conclusion and future work.

2. Discussion about the Related Work

The study of crime prevention in several community environments has been explored. The Department of Police Services of the California State University offers a list of safety mobile applications [5]. Another example is the Sentinel Campus, which is a free mobile application that provides crime statistics for more than 4400 universities [13]. In the social web context, Wikicrimes is a Brazilian site that presents a crime status based on hot spot maps in which users can report regional crime data [14]. That proposal retrieves information in a structured way through a specialized system, whereas our approach retrieves information in an unstructured way through a nonspecialized system. In [15], a study of crime prevention systems with an analysis of the main web and mobile crime information systems is described. Here, the information retrieval task is collaborative; however, the authors do not use a hybrid approach that integrates social and official sources. Other applications have used spatial statistics for identifying factors that directly affect crime occurrence in order to generate a public map of crime prevention [10]. Nevertheless, this work does not consider the case in which a person needs to cross an area or point walking or in a car.

On the other hand, mobile applications for routing and planning in city environments are increasingly becoming essential for improving urban spaces. This points to the appearance of the next generation of mobile information systems, in which recommendations are focused on decision making in order to adequately support the growth of big cities. Applications like CrowdPlanner [16] and DroidOppPathFinder [17] are crowd-sensed route recommendation systems, which ask users to evaluate candidate routes recommended by different sources and methods in order to determine the best route based on the feedback of those users. The routes generated by these applications are evaluated by people; in our case, the assessment is made by a clustering algorithm and by the crime occurrence.

Recently, a large range of experimental applications that take into account social network and crowd-sensed data were developed. In [18], a spatio- and crowd-based routing system is proposed in order to improve the recommendation quality; however, it is based only on volunteered information, and the enrichment of the crime data sources is not considered. A similar work is presented in [11], in which sentiments are considered as a perception of users with respect to the security in certain geographic places [19]. Moreover, a system for routing police patrols based on genetic algorithms is presented in [3]. It generates particular routes that minimize the number of crime occurrences in a given area. The route is generated by using the shortest path with data from a fixed time window.

In [20], crime mapping and prediction based on historical data is proposed. This work performs centered analyses with spatiotemporal techniques. The analysis of crime prevention considers the network traffic and nodes. Another proposal has included the position of CCTV cameras to detect crime events [21]. In addition, [22] developed a police patrolling strategy based on the Bayesian method and the ant colony algorithm in order to reduce the average time between two consecutive visits to hot spots. In [23], a crime ontology is described by using two scenarios of analysis: (1) the circumstances of a crime, its mechanism, information about criminals, and knowledge about the methods of crime investigation and (2) the penal procedure and other methodological and tactical recommendations of criminality, crime features, and events. Another ontology-based system for crime analysis is explained in [24], in which an ontology to represent digital incidents associated with digital investigation and legal requirements is described.

The cited approaches have used spatial data of crimes, but they are not able to analyze unstructured data. In this work, we propose a hybrid approach embedded in a mobile application, which automatically combines social and official crime data by using an ontology exploration method and the Bayes algorithm to classify crime activities according to the Mexican penal code.

Figure 1: The general framework with its stages: (1) retrieval of crime data sources (tweets database and official reports); (2) crime data repository (integrated crime database); (3) semantic processing (crime ontology); (4) clustering approach (Bayes algorithm); (5) generation of safe routes (safe routing algorithm).

3. The General Framework

The framework is composed of five stages: (1) retrieval of crime data sources, (2) the crime data repository, (3) the semantic processing, (4) the clustering approach, and (5) generation of safe routes (see Figure 1).

In summary, the approach consists of retrieving tweets that are related to crime events, taking into account attributes that define the time and location at which an event occurs. These tweets are analyzed and integrated with crime data from official sources (government institutions). The integration process is automatically performed by using descriptions and spatiotemporal attributes of the data. An application ontology is proposed to define candidates for the integration task. Later, the Bayes algorithm is used to classify data that cannot be automatically integrated. For cases of synonymous names, the GeoNames web service is used to resolve the conflict.

Figure 2: Hybrid approach: integrated crime database from the official and social sources.

The categorization is carried out by analyzing the spatiotemporal attributes and the descriptive words that appear in the tweets and/or official database records. For example, crime events could have occurred to a person walking or in a car, with or without violence. The description of these events is used to categorize each record or tweet. The crime data cannot always be directly classified because this information could be imprecise or incomplete. For instance, if a tweet does not contain an address or any reference such as a point of interest, a time definition, and a crime or theft, then the tweet is not taken into consideration. Information that is incomplete or imprecise is classified by using the Bayesian algorithm. Moreover, the categorization allows us to know what type of crime/theft occurred at a certain place or point (with or without violence), while the classification defines the confidence (security) level for an area or street, which is given by the sum of all crime/theft points of that area or street. All the categorized and classified data are used as input parameters for the safe routing algorithm. The final result is a safe route that does not cross or contain points with high crime rates.
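The confidence level described above, the sum of crime/theft points per area or street, can be sketched with a simple counter. The `street` field, the threshold value, and the "secure"/"insecure" labels are hypothetical choices for this example; the text does not specify how the boundary between secure and insecure areas is drawn.

```python
from collections import Counter

def confidence_levels(events, threshold):
    """Count crime/theft points per street and label each street.
    Assumption: a street with at least `threshold` events is labeled insecure."""
    counts = Counter(event["street"] for event in events)
    return {
        street: ("insecure" if count >= threshold else "secure", count)
        for street, count in counts.items()
    }

# Hypothetical categorized events
events = [
    {"street": "Eje Central", "type": "theft"},
    {"street": "Eje Central", "type": "robbery"},
    {"street": "Eje Central", "type": "theft"},
    {"street": "Reforma", "type": "theft"},
]
levels = confidence_levels(events, threshold=2)
```

Streets with no recorded events do not appear in the result, which corresponds to the situation in which the Bayes-based estimation described in the text would be needed.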

In summary, the first stage presents the crime data sources to be processed, particularly relevant tweets and official reports. The second stage consists of building a crime data repository, which stores the data integrated from Twitter and official sources. The third stage is in charge of classifying crime records by their description; the process is driven by an application ontology. The fourth stage groups the events by crime/theft type, time, and location where they happened. Later, classified areas are generated according to their confidence level (how often a crime occurs at a place, date, and time) in order to be categorized as secure or insecure clusters. The fifth stage performs an estimation based on the crime rates, which is processed by applying a Bayesian algorithm in order to obtain crime values with respect to points in safe routes. It visualizes the safe route on the mobile application according to the following crimes: robbery with violence to passengers, theft without violence to passengers, car theft with violence, and car theft without violence.

3.1. Stage 1: Retrieval of Crime Data Sources. In this stage, the data are retrieved and collected from two sources: (1) crime information from the tweets dataset and (2) crime information from the official reports (see Figure 2). The considerations for both processes are outlined as follows.


Table 1: Crime-related events from Twitter accounts covering Mexico City.

Twitter account  Location     Creation date  Followers  Number of tweets  Belongs to government  Website
SSPDFVIAL        Mexico City  07/14/2010     369,115    15,465            Yes                    http://ssp.df.gob.mx
PolloVial        Mexico City  01/31/2013     667        7,191             No                     No website
Trafico889       Mexico City  05/14/2009     137,099    9,054             No                     http://siempre889.com/trafico
Alertux          Mexico City  10/16/2012     179,574    3,559             No                     http://www.alertux.com
072AvialCDMX     Mexico City  10/20/2010     83,535     13,471            Yes                    http://www.agu.df.gob.mx
RedVial          Mexico City  03/09/2010     63,702     4,481             No                     http://rvial.mx

Table 2: Record structure for the official database.

Event type                                2012                 2013
                                          Q1   Q2   Q3   Q4    Q1   Q2   Q3   Q4
Robbery of passenger in public transport  159  175  154  111   102  134  115  86
Robbery of passerby without violence      47   80   54   79    73   46   49   57

Regarding the tweets dataset, it is important to identify the tweets that talk about crime-related events. The following features were considered: specialized and generic accounts; mentions of crimes on popular streets; common abbreviations of crimes; and popular places where crimes were committed, such as malls, entertainment sites, private and public locations, and historical monuments. This process determined the popular words and relevant concepts by using the ontology. Thus, the most common n-grams for each tweet were obtained by sorting them according to their occurrence frequency. The repetitive n-grams were selected with a threshold of more than 100 mentions. From the most frequent unigram, bigram, and trigram lists, we identified by hand 456 common crimes on popular streets, 150 common crime-related events, 135 common crime hashtags, 69 common nicknames, 65 common buildings, places, and monuments, 34 common abbreviations, and 26 common combinations of prepositions.
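The n-gram selection step can be illustrated with a minimal sketch. It is not the authors' implementation: the whitespace tokenizer, the function names `ngrams` and `frequent_ngrams`, and the sample data are assumptions; only the idea of counting unigrams, bigrams, and trigrams and keeping those above a mention threshold comes from the text.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return the n-grams (as tuples) contained in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def frequent_ngrams(tweets, max_n=3, threshold=100):
    """Count 1- to max_n-grams over all tweets and keep those mentioned
    more than `threshold` times, sorted by decreasing frequency."""
    counts = Counter()
    for text in tweets:
        tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
        for n in range(1, max_n + 1):
            counts.update(ngrams(tokens, n))
    return [(gram, c) for gram, c in counts.most_common() if c > threshold]
```

With the threshold of the paper (more than 100 mentions), the call would be `frequent_ngrams(tweets, threshold=100)`; the resulting list would still have to be reviewed by hand, as the authors describe.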

The dataset contains 450,250 tweets collected over a period of six months, from January 7, 2015 until June 24, 2015, without considering retweets and posts with blank spaces. Tweets were collected from reliable Twitter profiles that correspond to known services and institutions (see Table 1). The APIs used to retrieve crime tweets were the Search API and Twitter4J.

With respect to official reports, they were retrieved through the InfoDF system. The process consisted of requesting from the SSP and PGJ government agencies the records with crime information, such as robbery of passersby and vehicle theft.

On the other hand, the information retrieved from the SSP agency contains more details than the information from the PGJ, such as the type of offense, the colony (neighborhood) or area where the event occurred, and the time. A processed record from the official police database (PGJ-DF) is shown in Table 2. It represents a crime/theft associated with a location or neighborhood, as well as the time by quarter.

3.2. Stage 2: A Crime Data Repository. In this stage, the most relevant tweets are identified in order to carry out a semantic matching, which is driven by an ontology adopted from [23]. A fragment of this ontology, used for semantically integrating crime tweets with official reports, is depicted in Figure 3.

The ontology is explored by Algorithm 1, which uses hyperonymy and synonymy as semantic relationships in order to find a match between each tweet and each crime report. That is, if a tweet is expressed in the domains of time and location, its hypernyms and synonyms are searched within the ontology in order to contextualize them, and they are stored in a vector. The process is repeated for each record from the official institutions, and the resulting vectors are compared. A match means that a term or concept is the same or has the same parent; in that case, the pair is considered a candidate to be unified. In other words, if a match is found, then the data are merged (tweets and databases from the PGJ and SSP).
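The matching rule (same concept or same parent, after resolving synonyms) can be sketched as follows. This is only an illustration under stated assumptions: the parent links are loosely based on the concept names visible in Figure 3, the synonym table is invented, and `concept_vector` and `is_candidate_match` are hypothetical names rather than parts of the OntoExplore algorithm.

```python
# Hypothetical parent links loosely based on the concept names in Figure 3;
# the exact edges of the authors' ontology are an assumption.
PARENT = {
    "crime": "thing",
    "theft": "crime",
    "violent theft": "theft",
    "nonviolent theft": "theft",
    "carjacking": "violent theft",
    "robbery": "violent theft",
    "armed robbery": "robbery",
    "highway robbery": "robbery",
}

# Hypothetical synonym table mapping surface words to ontology concepts.
SYNONYMS = {"mugging": "robbery", "holdup": "armed robbery", "stealing": "theft"}

def concept_vector(term):
    """Return {concept, parent of concept} for a term, resolving synonyms.
    Terms that are not in the ontology yield an empty set."""
    concept = SYNONYMS.get(term, term)
    if concept not in PARENT:
        return set()
    return {concept, PARENT[concept]}

def is_candidate_match(tweet_term, report_term):
    """Two terms are candidates for unification when their vectors share
    a concept, i.e. the same concept or the same parent."""
    return bool(concept_vector(tweet_term) & concept_vector(report_term))
```

For example, `is_candidate_match("holdup", "highway robbery")` holds in this toy ontology because both terms share the parent "robbery", whereas "carjacking" and "stealing" do not match.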

3.3. Stage 3: The Semantic Processing. The process applied to the tweets comprises the following tasks. (1) Tokenizer: it consists of separating and identifying the tweets, as well as filtering them to remove the stop words from the token stream. This task was performed with the Lucene system [25]. (2) Processing Data: this task involves the analysis of tweets; they are characterized by spatial, temporal, and descriptive attributes in order to identify where, when, and how a crime event occurred. (3) Categorization: it uses the semantic matching to group the tweets according to the crimes defined in the ontology. The semantic distance for each processed tweet is computed by using the weighted crime rate. The output of this stage is a set of tweets categorized by crime type. Figure 4 shows a general diagram that describes the integration of these tasks. To illustrate the above, two records from the Twitter dataset at different levels of spatial granularity are presented as follows:

(1) SSPDFVIAL: crime report in public transport with a firearm, Iztapalapa.

(2) Alertux: theft of students in Ecatepec, streets 105 and 106, RedVial.
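The tokenizer task can be illustrated with a small stand-in written in plain Python. The authors used the Lucene system for this step; the sketch below does not reproduce Lucene, and both the stop-word list and the function name `tokenize` are assumptions made for the example.

```python
import re

# Tiny illustrative stop-word list; the authors' dictionary is not given.
STOP_WORDS = {"a", "an", "and", "in", "of", "the", "with"}

def tokenize(tweet):
    """Lowercase a tweet, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"\w+", tweet.lower())
    return [t for t in tokens if t not in STOP_WORDS]
```

Applied to the first record above, the remaining tokens ("crime", "report", "public", "transport", "firearm", "iztapalapa") are the items that the spatial, temporal, and descriptive analyses would then inspect.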


Input: Tweets
Result: Unified and categorized tweet

(1) Let q[i] = elements_of_tweet
(2) n = 0
(3) while n ≠ i do
(4) Parsing and identification(q[i])
(5) node.start()
(6) while node ≠ null do
(7) j = 0, i = 0
(8) if Hyperonymy or Similarity(concept_name) then
(9) conVec[j] = get_parent_and_children(node)
(10) node.next()
(11) CrowdVector[j] = eventType_search(conVec[j])
(12) temporal_search(conVec[j])
(13) j++, k++, n++

Algorithm 1: The OntoExplore algorithm.

[Figure 3 shows a concept hierarchy with the nodes Thing; Crime; Crime against property; Crime against person; Theft; Violent theft; Nonviolent theft; Carjacking; Robbery; Car stolen from parking; Armed robbery; Highway robbery.]

Figure 3: A fragment of the crime ontology to semantically match tweets and official information.

[Figure 4 shows the following blocks: tweets analysis; spatial analysis (GeoNames and gazetteer); linguistic analysis (dictionary); temporal analysis; filtering and cleaning data (Lucene system); the OntoExplore algorithm execution; categorized crime records; integrated crime database.]

Figure 4: Tasks involved in the semantic processing.

The cleaning process includes the adaptation of Algorithm 2 for removing stop words, according to the definition presented in [26].

On the other hand, a tweet is associated with a record of the official database according to its relevance, by using the ontology and the hyperonymy relation. The relevance is measured by contextualizing the record and the tweet. This means that each semantic item of the tweet and the record is identified and then searched in the crime ontology in order to find matching terms, synonyms, or related concepts.


Input: An arbitrary stop word dictionary T, the set of schema trees QI
Result: A maximal set of stop words T′

(1) T ← ∅
(2) W = the set of all words in the domain
(3) D = T ∩ W
(4) while ∃ word w ∈ D
(5) for each interface q ∈ QI
(6) remove the stop word constraints for the labels of sibling nodes
(7) if no stop word constraint is violated then
(8) T′ ← T ∪ {w}
(9) else
(10) remove antonyms of w appearing in D from T′
(11) return T′

Algorithm 2: Removing stop words.

For example, in the first tweet above, the location is not precise: only the name of the neighborhood is given, and the streets are not defined. However, the other features of the tweet enrich the description of the event.

Therefore, the tweet is contextualized and related to the concept “violence”. Moreover, the second tweet has a precise location, but it is imprecise regarding the event description. This process generates a semantic classification matched to the official database categorization, in which each tweet is described according to a few categories, either Boolean or descriptors (e.g., theft, crime, and violence). The location is derived from the neighborhood, street name, or spatial relation (e.g., near and far). The time is computed from explicit data (at 11:00 am) or temporal adverbs (e.g., now and afternoon). Thus, a parsing structure is generated that contains the attributes identified by the semantic classification (event type, crime, and theft in car or walking) for evaluating the domains. The parsing structure obtained by analyzing the tweets is presented as follows:

semantic classification: Theft, Crime, Theft with violence, Theft walking, Theft in car, Theft without violence
attribute day: Mon, Tue, Wed, Thu, Fri, Sat, Sun
attribute id location: 30339461, 30339462, 30339493, 30339495, 30339496, 30339671
attribute time: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24
data
Mon, 30339461, 10, Theft
Fri, 30339671, 20, Crime

The label semantic classification represents the result of the semantic matching, attribute represents the valid attributes and their domains, and data denotes the classified data after syntactic and semantic-syntactical analysis.

3.4. Stage 4: The Clustering Approach. In this stage, an approach based on the Bayes algorithm is proposed for clustering the crime data stored in the repository. A bag of words is used to classify the crimes; that is, there is a set of words that describes each type of theft or crime. If a tweet or record contains these words, it is classified according to the type of theft or crime that these words represent in the database. The Bayes algorithm, in turn, classifies the tweets or records that contain none or only one of these words. The goal is to classify tweets that are relevant but that the semantic processing cannot classify. Thus, a tweet describing a theft or crime that lacks an exact location, time, or description will be classified by using the Bayes algorithm.

The Bayes algorithm takes into account certain features (the words that compose a tweet or a record from the official database), which are identified and assigned to a particular cluster. In this task many clusters can appear, and they are filtered later. The Bayes theorem is useful not only for clustering data related to crime or theft but also for clustering data that contain words that do not belong to any type of crime, so that these clusters can be omitted. The use of the Bayes algorithm had two specific goals: the first is to classify data that cannot be semantically classified, and the second is to perform an estimation. For example, suppose that a user requests a safe route in a certain area, but there are no theft/crime reports for this area in either Twitter or the official database. In this case, the approach finds the cluster to which the area belongs (e.g., a safe or unsafe cluster) and makes an estimation for this zone in order to determine which points are probably safer than others. Finally, weights are assigned to all points, and these weights are sent as parameters to the safe route algorithm.
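The two-step classification (keyword bags first, Bayes as a fallback) can be sketched as follows. The sketch assumes a multinomial naive Bayes model with add-one smoothing, which the paper does not specify; the keyword bags, the class `NaiveBayes`, and the function `classify` are hypothetical and serve only as an illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical bags of words per crime type (not the authors' lists).
KEYWORDS = {
    "theft with violence": {"armed", "gun"},
    "theft without violence": {"pickpocket", "unnoticed"},
}

class NaiveBayes:
    """Multinomial naive Bayes over word counts with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for words, label in zip(docs, labels):
            self.word_counts[label].update(words)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, words):
        total_docs = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for w in words:
                score += math.log((counts[w] + 1) / denom)  # smoothed likelihood
            if score > best_score:
                best, best_score = label, score
        return best

def classify(words, model):
    """Use the keyword bags when at least two keywords of one type occur;
    otherwise (none or only one keyword) fall back to the Bayes model."""
    hits = {label: len(bag & set(words)) for label, bag in KEYWORDS.items()}
    label, n_hits = max(hits.items(), key=lambda kv: kv[1])
    return label if n_hits >= 2 else model.predict(words)
```

The fallback mirrors the rule stated above: records with none or only one of the describing words are the ones handed to the Bayes model.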

Equation (1) was applied to the data in order to cluster them. The estimation is also computed to establish a probability for each point that belongs to a route and for each crime event. In other words, the equation defines the likelihood that an event occurs on a given day and time and at a specific location:

$$P(c \mid x) = P(c) \times \frac{P(x \mid c)}{P(x)} \qquad (1)$$


Input: N = {n_1, n_2, ..., n_k}: set of nodes; n_s: start node; n_f: finish node
Result: D = {d_1, d_2, ..., d_k}: set of distance values (weighted crime rate)

(1) Assign distance values d(n_s) = 0, d(n_i) = ∞ ∀ n_i ≠ n_s
(2) Let U = N − {n_s}
(3) Let current node n_c = n_s
(4) while exists(n_c) do
(5) foreach neighbor(n_c) do
(6) Let n_j = neighbor(n_c)
(7) if d(n_j) > length(n_j, n_c) + d(n_c) then
(8) d(n_j) = length(n_j, n_c) + d(n_c)
(9) U = U − {n_j}
(10) if n_f ∉ U or min(length(n_i, n_j)) = ∞ ∀ n_i, n_j ∈ U then
(11) return d(n_k) ∀ n_k ∈ N
(12) else
(13) n_c = d(n_i) = min(d(n_j)) ∀ n_j ∈ U

Algorithm 3: The safe routing algorithm.
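A Dijkstra-style search with crime-weighted costs can be sketched in a few lines. The sketch is not the authors' code: the cost model (edge length multiplied by one plus `alpha` times the crime weight of the node being entered), the parameter `alpha`, and the function name `safest_route` are assumptions chosen for illustration.

```python
import heapq

def safest_route(graph, crime_weight, start, goal, alpha=1.0):
    """Dijkstra search in which entering node v over an edge of length L
    costs L * (1 + alpha * crime_weight[v]).

    graph: {node: {neighbor: length}}
    crime_weight: {node: value in [0, 1]}, missing nodes count as 0.
    Returns (total_cost, path), or (inf, []) if the goal is unreachable.
    """
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue  # stale queue entry
        visited.add(u)
        if u == goal:
            break
        for v, length in graph.get(u, {}).items():
            cost = d + length * (1.0 + alpha * crime_weight.get(v, 0.0))
            if cost < dist.get(v, float("inf")):
                dist[v] = cost
                prev[v] = u
                heapq.heappush(heap, (cost, v))
    if goal not in dist:
        return float("inf"), []
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return dist[goal], path[::-1]
```

With `alpha = 0` the search reduces to an ordinary shortest path; larger values of `alpha` make the route accept longer detours in order to avoid nodes with a high crime weight, which is the trade-off discussed for Figures 5 and 6.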

The clustering approach yields probabilities that estimate the occurrence of specific crimes that were previously classified semantically. These probabilities define possible hot spots, which represent events that are directly associated with streets. In addition, the method classifies incoming data generated by the categorization task by searching for patterns defined as words associated with a crime. It generates predictive spatiotemporal patterns according to the indicated parameters. Thus, the goal is to find the probability that a criminal event occurs at specific locations along a given route when a user travels on that route. The probabilities are based on the following combinations:

Let P(coordinate = [x, y] | event) be the probability of an event directly related to its location, and let P(day or time | event) be the probability of an event taking into account its temporality. The computation of such probabilities is presented as follows:

P(c | x) represents the posterior probability; it is defined as the probability that an event occurs considering past events.

P(x) represents the total probability, which denotes the number of times that some given attributes appear in the events (e.g., theft).

P(x | c) represents the conditional probability, which is denoted by the number of times that a target appears in each attribute (e.g., car), taking into account the total number of times that the target appears in all attributes.

P(c) is defined as the a priori probability, which represents the number of times that a target appears in all events.

Events, attributes, and locations are identified and classified by the Bayes algorithm. The clusters are generated according to the defined patterns that were used in the training process. In particular, the places are described as trusted or untrusted. The probability is used as a weighted value, and it is normalized to the range [0, 1], where 0 represents the lowest and 1 the highest likelihood. These values are then used in the route generation. An example of the clustering computation with the obtained likelihood is presented as follows:

Classified Event (“12/05/2014”, “3”, Monday, 12255928567), “Event”: “Theft”, “ID Coordinate”: “30339694”, “latitude”: 19.4038744, “longitude”: 99.150226, “Probability”: “0.0009295401”, “diagnosed weight: 0.89”

3.5. Stage 5: Generation of Safe Routes. The safe routes are visualized in the mobile mapping application, which was developed as a client by using REST (Representational State Transfer). It is a web service technology through which the application sends requests to the server and parses the data that it receives. The spatial features are supported by OpenStreetMap [12]. The safe route is obtained by an adaptation of the Dijkstra algorithm, in which each node in the network is assigned an average weight obtained from the number of crimes for a specific point or geographic area. The values generated by the Bayes algorithm reflect the probability that an event such as “theft/crime” occurs or not.

Let n_s be the starting node, called the initial node, and let d(n_k) be the distance of node n_k from the initial node. The safe route algorithm assigns some initial distance values based on the weighted crime rate for these points. The weighted crime rate is computed from the number of reported crime/theft occurrences at a specific point and time. Algorithm 3 describes the process for obtaining the safe routes.

Thus, this algorithm uses the data sent by the mobile application (origin and destination points) in order to return a route that avoids locations where crime events have occurred. The confidence level is a metric computed from the number of crime incidents that have occurred at a specific point and time. It is used as a weighted value for the safe routing algorithm. The displayed route is marked with the coordinates returned by the algorithm in order to visualize the route on the mobile mapping application.

The weights can also be relaxed by spatial and temporal values such as date (day, month, and year), location (point or area), and time (hour or period). At the interface level, users can configure the route search process by setting some parameters: the route can be generated by using the social, the official, or the integrated data source, and the search process also distinguishes between different modes of transportation (e.g., walking or by car).

4. Experimental Results and Evaluation

The mobile application was implemented in Android 4.0, and the tests were performed on mobile devices. In this section, the results based on information from Mexico City are presented. The repository is composed of 5,441 events, which were collected and processed from the tweets and correlated with official reports. From this dataset, a frequency table is generated, which indicates the number of times that each attribute appears given events such as theft and crime.

The frequency values are used as weights when deriving a safe route. All values are assigned to the corresponding nodes in the network. In addition, the probability that an event occurs at some location and on a specific date is computed and stored in a vector table. This also allows a comparison of crime probabilities at different places. This sort of summarization supports evaluating the probability that an event occurs at a specific location and/or at a particular time (e.g., the probability that a crime or theft occurs on “Avenue Eje Central on Monday”). The probability that an event occurs at a given place and time is illustrated by the following example, “On Monday at 9:00 am in Iztapalapa”:

$$\begin{aligned}
P(x) &= P(\text{Monday}) = 220/2149, \\
P(x \mid c) &= P(\text{Monday} \mid \text{Theft}) = 18/218, \\
P(c) &= P(\text{Theft}) = 218/2149, \\
P(c \mid x) &= P(\text{Theft} \mid \text{Monday}) = P(\text{Theft}) \times \frac{P(\text{Monday} \mid \text{Theft})}{P(\text{Monday})} = 0.0818.
\end{aligned}$$
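The arithmetic of this example can be checked with exact fractions. The sketch only re-evaluates Bayes' rule with the three values given above; the variable names are chosen for the illustration.

```python
from fractions import Fraction

p_x = Fraction(220, 2149)        # P(Monday)
p_x_given_c = Fraction(18, 218)  # P(Monday | Theft)
p_c = Fraction(218, 2149)        # P(Theft)

# Bayes' rule: P(Theft | Monday) = P(Theft) * P(Monday | Theft) / P(Monday)
posterior = p_c * p_x_given_c / p_x

print(posterior, round(float(posterior), 4))  # prints "9/110 0.0818"
```

The common factors cancel, so the posterior equals 18/220, which agrees with the value 0.0818 reported in the example.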

(1) The probabilities of some classified crime events are computed.

(2) The selected combinations (e.g., theft at a particular location and time) are defined. However, this problem is NP-hard; thus, a semantic classification has been applied to restrict the search space and decrease the computational complexity.

(3) This provides the probabilities for all events that occur for each particular domain value.

An example of computing the probability for crime events, such as the probability that “a theft occurs at a given location on Friday or a crime occurs on Tuesday at noon”, is presented as follows:

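$$\begin{aligned}
&P(\text{Theft}) = 0.104, \quad P(\text{Crime}) = 0.065, \quad P(\text{Crime}(w)) = 0.017, \quad P(\text{Crime}(p)) = 0.812, \\
&P(\text{Theft} \mid \text{day} = \text{Monday}) = 0.032, \quad P(\text{Crime} \mid \text{day} = \text{Monday}) = 0.13, \\
&P(\text{Crime}(w) \mid \text{day} = \text{Monday}) = 0.017, \quad P(\text{Crime}(p) \mid \text{day} = \text{Monday}) = 0.813, \\
&P(\text{Theft} \mid \text{hour} = 9) = 0.1, \quad P(\text{Crime} \mid \text{hour} = 9) = 0.05, \\
&P(\text{Crime}(w) \mid \text{hour} = 9) = 0.05, \quad P(\text{Crime}(p) \mid \text{hour} = 9) = 0.8, \\
&P(\text{Theft} \mid \text{coordinate} = 1245612) = 0, \quad P(\text{Crime} \mid \text{coordinate} = 1245612) = 0.2, \\
&P(\text{Crime}(w) \mid \text{coordinate} = 1245612) = 0, \quad P(\text{Crime}(p) \mid \text{coordinate} = 1245612) = 0.8 \\
&\Rightarrow P(\text{Theft} \mid \text{hour} = 9 \land \text{day} = \text{Monday} \land \text{coordinate} = 1245612) \\
&\quad = P(\text{Theft}) \times P(\text{Theft} \mid \text{day} = \text{Monday}) \times P(\text{Theft} \mid \text{hour} = 9) \times P(\text{Theft} \mid \text{coordinate} = 1245612) \\
&\quad = 0.104 \times 0.032 \times 0.1 \times 0 = 0.
\end{aligned}$$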
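The combination above is a plain product of the listed factors, which can be reproduced as follows. The function name `combined_score` is an assumption; the numbers are the ones listed in the example for the event types “Theft” and “Crime”.

```python
from math import prod

def combined_score(prior, *conditionals):
    """Multiply a prior by the per-attribute probabilities, as in the example."""
    return prior * prod(conditionals)

# Theft: 0.104 x 0.032 x 0.1 x 0 (day = Monday, hour = 9, coordinate = 1245612)
theft = combined_score(0.104, 0.032, 0.1, 0.0)

# Crime, using the corresponding values from the same list
crime = combined_score(0.065, 0.13, 0.05, 0.2)
```

Because the factors are multiplied, a single zero (here, no theft recorded at the coordinate) forces the whole estimate to zero, while the same point still receives a small nonzero score for “Crime”.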

The example takes into consideration the defined points of a given area; that is, we have the estimation that an event occurs at a point x at 11 am, or at the point x at several hours of the day. The probability that a theft occurs increases or decreases depending on the hour and the day of the week. The user can also ask for all possible combinations near his or her location. For instance, for a point y, the user may want to know the probability of suffering a theft at different hours of the day, or which event (type of crime) is the most likely to occur at a point x at 6 pm.

Figures 5 and 6 show safe routes between two points, where the circles on the map represent untrusted points. The route was generated by avoiding these points; although this means that the path could be longer, it is the safer route. Figure 5 shows the route for a transient (pedestrian) user, whereas Figure 6 depicts the safe route generated for the same points when the user is a driver. In this case the route is different, because the estimation is carried out by processing only the event types related to drivers.

Finally, it is possible to generate routes considering the data reports of different periods of time (e.g., thefts and crimes that occurred from April to July 2015). The safe routing algorithm generates a specific route that avoids the points where theft events have occurred in the past. Nevertheless, the generated route can change if the user changes the time period taken into account (e.g., from January to March 2015) or includes all the temporal data available (e.g., from 2013 to 2015). This allows us to know which locations are safer than others on a specific day and hour in the same geographic area.


Figure 5: Suggested safe route for a transient.

Figure 6: Suggested safe route for a user in a car.

Figure 7 depicts the events that occurred to transients and drivers, from official and social sources; the icons represent theft from transients and drivers, with and without violence, as well as house theft.

On the other hand, the results obtained in Figures 5, 6, and 7 were compared with the Official Crime Map System (http://www.mapadelincuencial.org.mx). Figure 8 only depicts events that volunteers marked as points where a crime or theft with violence occurred; the difference in data is evident, and the possibilities for configuring that system are very limited in comparison with the proposed mobile information system (see Figure 9).

Figure 10 shows the hybrid map view from the web version. The events in yellow were reported by the social network source, and the events in blue come from the official source.

5. Conclusions and Future Work

In this paper, a hybrid approach for finding safe routes is presented, which uses semantic processing and classification algorithms with data provided by a social network and official crime reports. As a case study, a mobile information system was developed. It generates safe routes based on crime reports of Mexico City from a large tweet repository and official databases. The data are semantically classified to determine whether a tweet describes a crime event or theft; tweets that cannot be identified as crimes are evaluated by the Bayes algorithm, which clusters them according to the description they contain. The clusters are then used to make predictions regarding the possibility that a crime occurs at a specific place and hour. The spatiotemporal analysis determined the locations where the crime events occurred. Moreover, the confidence level of a location was defined and used as a parameter for computing the safer route.

Figure 7: All events occurred from 2013 to 2015.

Figure 8: Crime map (made by volunteers).

The main contributions of this work are as follows: (1) the design of a hybrid approach based on semantic processing to retrieve crime data from a social network source; (2) the integration of crowd-sensed data with official government sources; (3) the validation of a tweet, performed by comparing the sources using k-fold cross validation; (4) the estimation model based on the Bayes algorithm to obtain safe routes with data provided by the mobile device; and (5) the design of a mobile information system to generate safe routes.

Figure 9: Theft/crime events that occurred in a specific time from the mobile information system.

Figure 10: The hybrid map view from the web version.

According to the results of the estimation, the certainty degree is around 75%. It was tested on areas with crime data: the records were intentionally removed, and an original copy was kept. The results of the estimation were then compared with the original copy. We found that the estimation achieves a performance of 75% for all the points of the data sample.

In addition, a metric to measure the confidence level, or security, of certain points and areas of Mexico City has been proposed. It allows safe routes to be found along paths with a low crime rate. Moreover, the mobile application gathers long-term statistical data with almost real information from citizens, who act as sensors in the city. The results of the mobile system have been tested and compared with the Crime Map System.

Future work is oriented towards evaluating the cognitive perception of people, taking into consideration points or geographic places, for finding comfortable routes. Sentiment analysis will be addressed in order to incorporate this feature as a parameter in the computation of routes. Additionally, we propose the integration of our mobile application with the Mexican CCTV camera systems for sensing the dynamics of certain areas in the city. This contribution is focused on developing mobile information systems for routing and urban planning in city environments. Mobile applications are increasingly becoming essential for analyzing the urban dynamics of big cities. Thus, the next generation of mobile information systems will be designed around real-time road network conditions. In addition, this generation is oriented towards improving the quality of human life and increasing the sustainability of smart cities.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

This work was partially sponsored by the Instituto Politécnico Nacional (IPN), the Consejo Nacional de Ciencia y Tecnología (CONACYT), and the Secretaría de Investigación y Posgrado (SIP) under Grants 20162006, 20161899, 20161869, and 20161611.

References

[1] H. Zhang, Y. Xu, and X. Wen, “Optimal shortest path set problem in undirected graphs,” Journal of Combinatorial Optimization, vol. 29, no. 3, pp. 511–530, 2015.

[2] W. Templeton-Steadman and R. Williams, “Information delivery system and method for mobile appliances,” US Patent App. 11/562,054, 2006.

[3] D. Reis, A. Melo, A. L. V. Coelho, and V. Furtado, “Towards optimal police patrol routes with genetic algorithms,” in Intelligence and Security Informatics: IEEE International Conference on Intelligence and Security Informatics, ISI 2006, San Diego, CA, USA, May 23-24, 2006. Proceedings, vol. 3975 of Lecture Notes in Computer Science, pp. 485–491, Springer, Berlin, Germany, 2006.

[4] V. Ceikute and C. S. Jensen, “Routing service quality - local driver behavior versus routing services,” in Proceedings of the IEEE 14th International Conference on Mobile Data Management (MDM ’13), vol. 1, pp. 97–106, IEEE, June 2013.

[5] Safety Apps, October 2015, http://www.csun.edu/police/safety-apps.

[6] E. Galbrun, K. Pelechrinis, and E. Terzi, “Urban navigation beyond shortest route: the case of safe paths,” Information Systems, vol. 57, pp. 160–171, 2016.

[7] T. Wang, C. Rudin, D. Wagner, and R. Sevieri, “Learning to detect patterns of crime,” in Machine Learning and Knowledge Discovery in Databases, vol. 8190 of Lecture Notes in Computer Science, pp. 515–530, Springer, Berlin, Germany, 2013.

[8] C.-H. Yu, W. Ding, P. Chen, and M. Morabito, “Crime forecasting using spatio-temporal pattern with ensemble learning,” in Advances in Knowledge Discovery and Data Mining: 18th Pacific-Asia Conference, PAKDD 2014, Tainan, Taiwan, May 13–16, 2014. Proceedings, Part II, Lecture Notes in Computer Science, pp. 174–185, Springer, Berlin, Germany, 2014.

[9] L. Scott and N. Warmerdam, “Extend crime analysis with arcgis spatial statistics tools,” ArcUser Magazine, 2005.

[10] M. Leitner, Crime Modeling and Mapping Using Geospatial Technologies, vol. 8, Springer, Dordrecht, The Netherlands, 2013.

[11] J. Kim, M. Cha, and T. Sandholm, “Socroutes: safe routes based on tweet sentiments,” in Proceedings of the 23rd ACM International Conference on World Wide Web (WWW ’14), pp. 179–182, Seoul, South Korea, April 2014.

[12] M. Haklay and P. Weber, “Openstreetmap: user-generated street maps,” IEEE Pervasive Computing, vol. 7, no. 4, pp. 12–18, 2008.

[13] J. M. Sánchez Bernabeu, J. V. Berna Martínez, and F. Maciá Pérez, “Smart sentinel: monitoring and prevention system in the smart cities,” International Review on Computers and Software, vol. 9, no. 9, pp. 1554–1559, 2014.

[14] Wiki crimes, 2015, http://www.wikicrimes.org.

[15] T. Moon, S. Heo, and S. Lee, “Ubiquitous crime prevention system (UCPS) for a safer city,” Procedia Environmental Sciences, vol. 22, pp. 288–301, 2014.

[16] H. Su, K. Zheng, J. Huang, H. Jeung, L. Chen, and X. Zhou, “Crowdplanner: a crowd-based route recommendation system,” in Proceedings of the 30th IEEE International Conference on Data Engineering (ICDE ’14), pp. 1144–1155, IEEE, Chicago, Ill, USA, April 2014.

[17] V. Arnaboldi, M. Conti, and F. Delmastro, “Implementation of CAMEO: a context-aware middleware for opportunistic mobile social networks,” in Proceedings of the IEEE International Symposium on a World of Wireless, Mobile and Multimedia Networks (WoWMoM ’11), pp. 1–3, Lucca, Italy, June 2011.

[18] M. Nagarajan, K. Gomadam, A. P. Sheth, A. Ranabahu, R. Mutharaju, and A. Jadhav, “Spatio-temporal-thematic analysis of citizen sensor data: challenges and experiences,” in Web Information Systems Engineering - WISE 2009, vol. 5802 of Lecture Notes in Computer Science, pp. 539–553, Springer, Berlin, Germany, 2009.

[19] N. Powdthavee, “Unhappiness and crime: evidence from South Africa,” Economica, vol. 72, no. 287, pp. 531–547, 2005.

[20] J. Ratcliffe, “Crime mapping: spatial and temporal challenges,” in Handbook of Quantitative Criminology, pp. 5–24, Springer, New York, NY, USA, 2010.

[21] P. Gupta, G. N. Purohit, and A. Dadhich, “Crime prevention through alternate route finding in traffic surveillance using cctv cameras,” International Journal of Engineering and Advanced Technology, vol. 2, no. 5, pp. 414–418, 2013.

[22] H. Chen, T. Cheng, and S. Wise, “Designing daily patrol routes for policing based on ANT colony algorithm,” ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. II-4/W2, pp. 103–109, 2015.

[23] D. Dzemydiene and E. Kazemikaitiene, “Ontology-based decision support system for crime investigation processes,” in Information Systems Development, pp. 427–438, Springer, New York, NY, USA, 2005.

[24] Y. Chabot, A. Bertaux, C. Nicolle, and T. Kechadi, “A complete formalized knowledge representation model for advanced digital forensics timeline analysis,” Digital Investigation, vol. 15, pp. 83–100, 2015.

[25] E. Hatcher, O. Gospodnetic, and M. McCandless, Lucene in Action, Manning Publications, Greenwich, Conn, USA, 2004.

[26] F. Mata-Rivera, M. Torres-Ruiz, G. Guzmán, M. Moreno-Ibarra, and R. Quintero, “A collaborative learning approach for geographic information retrieval based on social networks,” Computers in Human Behavior, vol. 51, pp. 829–842, 2015.


Page 3: A Mobile Information System Based on Crowd-Sensed and Official ...

Mobile Information Systems 3

Figure 1: The general framework with its stages. Stages shown: (1) retrieval of crime data sources (tweets database, official reports); (2) crime data repository (integrated crime database); (3) semantic processing (crime ontology); (4) clustering approach (Bayes algorithm); (5) generation of safe routes (safe routing algorithm).

3. The General Framework

The framework is composed of five stages: (1) retrieval of crime data sources, (2) the crime data repository, (3) the semantic processing, (4) the clustering approach, and (5) generation of safe routes (see Figure 1).

Summing up, the approach consists of retrieving tweets that are related to crime events, taking into account attributes that define the time and location at which an event occurs. These tweets are analyzed and integrated with crime data from official sources (government institutions). The integration process is automatically performed by using the descriptions and spatiotemporal attributes of the data. An application ontology is proposed to define candidates for the integration task. Later, the Bayes algorithm is used to classify data that cannot be automatically integrated. For the cases of synonymous names, the GeoNames web service is used to solve the conflict.

The categorization is carried out by analyzing the spatiotemporal attributes and the description of words that appear in the tweets and/or official database records. For example, crime events could have occurred while walking or by car, with or without violence. The description of these events is used to categorize each record or tweet. The crime data cannot be directly classified because this information could be imprecise or incomplete. For instance, if a tweet does not contain the address or any reference such as points of interest, time definition, and crime or theft, then the tweet is not taken into consideration. Information that is incomplete or imprecise will be classified by using the Bayesian algorithm. Moreover, the categorization allows us to know what type of crime/theft occurred in a certain place or point (with or without violence), while the classification defines the confidence (security) level for an area or street, which is given by the sum of all crime/theft points of that area or street. Thus, all the categorized and classified data are used as input parameters for the safe routing algorithm. The final result is a safe route that does not cross or contain points with high crime rates.

Figure 2: Hybrid approach: integrated crime database from the official and social sources. Blocks shown: tweets database, official reports, semantic analysis, spatiotemporal analysis, data analysis, integrated database.

In summary, the first stage presents the crime data sources to be processed, particularly relevant tweets and official reports. The second stage consists of building a crime data repository, which stores the data integrated from Twitter and official sources. The third stage is in charge of classifying crime records by their description; the process is driven by an application ontology. The fourth stage gathers the events by crime/theft type, time, and location where they happened. Later, classified areas are generated according to their confidence level (how often a crime occurs in a place, date, and time) in order to be categorized as secure or insecure clusters. The fifth stage performs an estimation based on the crime rates, which is processed by applying a Bayesian algorithm in order to obtain crime values with respect to points in safe routes. It visualizes the safe route on the mobile application according to the following crimes: robbery with violence to passengers, theft without violence to passengers, car theft with violence, and car theft without violence.

3.1. Stage 1: Retrieval of Crime Data Sources. In this stage, the data are retrieved and collected from two sources: (1) crime information from the tweets dataset and (2) crime information from the official reports (see Figure 2). Both processes are described as follows.

Table 1: Crime-related events from Twitter accounts covering Mexico City.

| Twitter account | Location | Creation date | Followers | Number of tweets | Belongs to government | Website |
|---|---|---|---|---|---|---|
| SSPDFVIAL | Mexico City | 07/14/2010 | 369115 | 15465 | Yes | http://ssp.df.gob.mx |
| PolloVial | Mexico City | 01/31/2013 | 667 | 7191 | No | No website |
| Trafico889 | Mexico City | 05/14/2009 | 137099 | 9054 | No | http://siempre889.com/trafico |
| Alertux | Mexico City | 10/16/2012 | 179574 | 3559 | No | http://www.alertux.com |
| 072AvialCDMX | Mexico City | 10/20/2010 | 83535 | 13471 | Yes | http://www.agu.df.gob.mx |
| RedVial | Mexico City | 03/09/2010 | 63702 | 4481 | No | http://rvial.mx |

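Table 2: Record structure for the official database.

| Event type | 2012 Q1 | 2012 Q2 | 2012 Q3 | 2012 Q4 | 2013 Q1 | 2013 Q2 | 2013 Q3 | 2013 Q4 |
|---|---|---|---|---|---|---|---|---|
| Robbery of passenger in public transport | 159 | 175 | 154 | 111 | 102 | 134 | 115 | 86 |
| Robbery of passer without violence | 47 | 80 | 54 | 79 | 73 | 46 | 49 | 57 |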

From the tweets dataset, it is important to identify the tweets that talk about crime-related events. The following features were considered: specialized and generic accounts, mentions of crimes in popular streets, common abbreviations of crimes, and popular places where some crimes were committed, such as malls, entertainment sites, private and public locations, and historical monuments. This process determined the popular words and relevant concepts by using the ontology. The most common n-grams for each tweet were obtained by sorting them according to their occurrence frequency, and the repetitive n-grams were selected with a threshold of more than 100 mentions. From the most frequent unigram, bigram, and trigram lists, we identified by hand 456 common crimes on popular streets, 150 common crime-related events, 135 common crime hashtags, 69 common nicknames, 65 common buildings, places, and monuments, 34 common abbreviations, and 26 common combinations of prepositions.
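The n-gram selection step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the whitespace tokenization, and the toy corpus are assumptions made for illustration, and the threshold is lowered in the demo only because the toy corpus is tiny.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def frequent_ngrams(tweets, n, min_mentions=100):
    """Count n-grams over all tweets and keep those mentioned more than
    min_mentions times, sorted by decreasing occurrence frequency."""
    counts = Counter()
    for text in tweets:
        # Assumption: simple lowercase + whitespace tokenization.
        counts.update(ngrams(text.lower().split(), n))
    kept = [(gram, c) for gram, c in counts.items() if c > min_mentions]
    return sorted(kept, key=lambda item: -item[1])

# Toy corpus (invented for this sketch).
tweets = [
    "theft on central avenue",
    "theft on subway",
    "assault on central avenue",
]
print(frequent_ngrams(tweets, 2, min_mentions=1))
```

With the threshold of 100 mentions reported in the text, the same function would be called with the default `min_mentions` on the full tweet collection, once for each of n = 1, 2, 3.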

The dataset contains 450,250 tweets collected over a period of six months, from January 7, 2015, until June 24, 2015, without considering retweets and posts with blank spaces. Tweets are collected from reliable Twitter profiles that correspond to known services and institutions (see Table 1). The APIs that were used to retrieve crime tweets were the Search API and Twitter4j.

With respect to official reports, they were retrieved through the InfoDF system. The process consisted of requesting from the SSP and PGJ government agencies the records with crime information, such as robbery of passersby and vehicle theft.

The information retrieved from the SSP agency contains more details than the information from the PGJ, such as the type of offense, the neighborhood (colony) or area where the event occurred, and the time. A processed record from the official police database (PGJ-DF) is shown in Table 2. It represents a crime/theft associated with a location or neighborhood, as well as the time by quarter.

3.2. Stage 2: A Crime Data Repository. In this stage, the most relevant tweets are identified in order to carry out a semantic matching, which is driven by an ontology that was adopted from [23]. A fragment of this ontology for semantically integrating crime tweets with official reports is depicted in Figure 3.

The ontology is explored by Algorithm 1, which uses hyperonymy and synonymy as semantic relationships in order to find a match between each tweet and crime report. That is, if a tweet is expressed by domains of time and location, their hypernyms and synonyms are searched within the ontology in order to contextualize them, and these synonyms and hypernyms are stored in a vector. The process is repeated for each record of the official institutions, and the obtained vectors are compared. A match means that a term or concept is the same or has the same parent; thus, it is considered a candidate to be unified. In other words, if a match is found, then the data are merged (tweets and databases from the PGJ and SSP).
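The matching idea (same concept, synonym, or shared parent) can be sketched as follows. This is only an illustration under stated assumptions: the child-to-parent links loosely follow the concept names listed in Figure 3, but the exact hierarchy, the synonym table, and the function names are hypothetical and are not taken from the paper.

```python
# Hypothetical child -> parent links (assumed hierarchy, for illustration only).
PARENT = {
    "armed robbery": "robbery",
    "highway robbery": "robbery",
    "robbery": "violent theft",
    "carjacking": "violent theft",
    "violent theft": "theft",
    "nonviolent theft": "theft",
    "theft": "crime",
}
# Hypothetical synonym table mapping surface terms to ontology concepts.
SYNONYMS = {"mugging": "robbery", "holdup": "armed robbery"}

def context_vector(term):
    """Collect the canonical concept of a term and its direct parent."""
    concept = SYNONYMS.get(term, term)
    vector = {concept}
    if concept in PARENT:
        vector.add(PARENT[concept])
    return vector

def is_candidate(tweet_term, record_term):
    """A tweet/record pair is a unification candidate when the two vectors
    share a concept (same term, synonym, or same parent)."""
    return bool(context_vector(tweet_term) & context_vector(record_term))

print(is_candidate("holdup", "highway robbery"))       # share the parent "robbery"
print(is_candidate("carjacking", "nonviolent theft"))  # no shared concept
```

Comparing only the direct parent keeps the match strict; walking further up the hierarchy would eventually relate every concept through "crime", which is why a shallow comparison is assumed here.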

3.3. Stage 3: The Semantic Processing. The process applied to tweets consists of the following tasks. (1) Tokenizer. It consists of separating and identifying the tweets, as well as filtering the tweets to remove the stop words from the token stream. This task was performed by the Lucene system [25]. (2) Processing Data. This task involves the analysis of tweets; they are identified by spatial, temporal, and description attributes in order to determine where, when, and how a crime event occurred. (3) Categorization. It uses the semantic matching by grouping the tweets according to the crimes defined in the ontology. The semantic distance for each processed tweet is computed by using the weighted crime rate. The output of this stage is a set of tweets categorized by crime type. Figure 4 shows a general diagram that describes the integration of these tasks. To illustrate the above, two different records from the Twitter dataset at different levels of spatial granularity are presented as follows:

(1) SSPDFVIAL: crime report in public transport with a firearm, Iztapalapa.

(2) Alertux: theft of students in Ecatepec, streets 105 and 106, RedVial.

Input: Tweets
Result: Unified and categorized tweet

(1) Let q[i] = elements of tweet
(2) n = 0
(3) while n ≠ i do
(4)     Parsing and identification(q[i])
(5)     node.start()
(6)     while node ≠ null do
(7)         j = 0, i = 0
(8)         if Hyperonymy or Similarity(concept name) then
(9)             conVec[j] = get_parent_and_children(node)
(10)        node.next()
(11)    CrowdVector[j] = eventType_search(conVec[j])
(12)    temporal_search(conVec[j])
(13)    j++, k++, n++

Algorithm 1: The OntoExplore algorithm.

Figure 3: A fragment of the crime ontology to semantically match tweets and official information. Concepts shown: Thing; Crime; Crime against property; Crime against person; Theft; Violent theft; Nonviolent theft; Carjacking; Robbery; Car stolen from parking; Armed robbery; Highway robbery.

Figure 4: Tasks involved in the semantic processing. Blocks shown: tweets analysis; filtering and cleaning data (Lucene system); spatial analysis (GeoNames and gazetteer); linguistic analysis (dictionary); temporal analysis; the OntoExplore algorithm execution; categorized crime records; integrated crime database.

The cleaning process includes the adaptation of Algorithm 2 for removing stop words, according to the definition presented in [26].

On the other hand, a tweet is associated with a record of the official database according to its relevance, by using the ontology and the hyperonymy relation. The relevance is measured by contextualizing the record and the tweet. This means that each semantic item of the tweet and the record is identified and then searched in the crime ontology in order to find matching terms, synonyms, or related concepts.

Input: An arbitrary stop word dictionary T, the set of schema trees QI
Result: A maximal set of stop words T′

(1) T′ ← ∅
(2) W = the set of all words in the domain
(3) D = T ∩ W
(4) while ∃ word w ∈ D
(5)     for each interface q ∈ QI
(6)         remove the stop word constraints for the labels of sibling nodes
(7)     if no stop word constraint is violated then
(8)         T′ ← T′ ∪ {w}
(9)     else
(10)        remove antonyms of w appearing in D from T′
(11) return T′

Algorithm 2: Removing stop words.

For example, in the first tweet above, the location is not precise: only the name of the neighborhood is given, and the streets are not defined. However, the remaining features enrich the description of the event.

Therefore, the tweet is contextualized and related to the concept "violence." The second tweet, in contrast, has a precise location, but it is imprecise regarding the event description. This process generates a semantic classification matched to the official database categorization, in which each tweet is described according to a few categories, either Boolean or descriptors (e.g., theft, crime, and violence). The location is derived from the neighborhood, street name, or spatial relation (e.g., near and far). The time is computed from explicit data (e.g., at 11:00 am) or temporal adverbs (e.g., now and afternoon). Thus, a parsing that contains the attributes identified by the semantic classification (event type, crime and theft, in car or walking) is generated for evaluating the domains. The parsing structure that was obtained by analyzing the tweets is presented as follows:

semantic classification: {Theft, Crime, Theft with violence, Theft walking, Theft in car, Theft without violence}
attribute day: {Mon, Tue, Wed, Thu, Fri, Sat, Sun}
attribute id location: {30339461, 30339462, 30339493, 30339495, 30339496, 30339671}
attribute time: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24}
data:
Mon, 30339461, 10, Theft
Fri, 30339671, 20, Crime

The label "semantic classification" represents the result of the semantic matching, "attribute" represents the valid attributes and their domains, and "data" denotes the classified data after the syntactic and semantic-syntactical analysis.

3.4. Stage 4: The Clustering Approach. In this stage, an approach based on the Bayes algorithm is proposed for clustering the crime data that are stored in the repository. A bag of words is used in order to classify the crimes; that is, there is a set of words that describes each type of theft or crime. If a tweet or record contains these words, it is classified according to the type of theft or crime that these words represent in the database. The Bayes algorithm, in turn, classifies the tweets or records that contain none or only one of these words. The goal is to classify tweets that are relevant and that the semantic processing cannot classify. Thus, a tweet that describes a theft or crime but does not have an exact location, time, or description will be classified by using the Bayes algorithm.

The Bayes algorithm takes into account certain features (words that compose a tweet or a record from the official database) that are identified and assigned to a particular cluster. In this task, many clusters can appear, and they will be filtered later. The Bayes theorem is useful not only to cluster data related to crime or theft but also to cluster data that contain words that do not belong to any type of crime, so that these clusters can be omitted. The use of the Bayes algorithm had two specific goals: the first is to classify data that cannot be semantically classified, and the second is oriented towards performing an estimation. For example, a user requests a safe route in a certain area, but there are no theft/crime reports for this area, neither from Twitter nor from the official database. In this case, the approach finds the cluster to which the area belongs (e.g., a safe or unsafe cluster) and makes an estimation for this zone in order to determine which points are probably safer than others. Finally, weights are assigned to all points, and these weights are sent as parameters to the safe route algorithm.
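A bag-of-words Bayes classifier of the kind described above can be sketched in a few lines of Python. This is a generic naive Bayes sketch, not the authors' model: the training samples are invented, and the add-one (Laplace) smoothing is an assumption introduced here so that unseen words do not produce zero probabilities; the paper does not state whether any smoothing was used.

```python
import math
from collections import Counter, defaultdict

def train(samples):
    """samples: list of (list_of_words, label). Returns the counts of a naive Bayes model."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, label in samples:
        label_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return label_counts, word_counts, vocab

def classify(words, model):
    """Return the label with the highest log posterior (add-one smoothing assumed)."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)  # log prior P(c)
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in words:
            score += math.log((word_counts[label][w] + 1) / denom)  # log P(word | c)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented training records (words already tokenized and cleaned).
samples = [
    (["gun", "robbed", "bus"], "theft with violence"),
    (["knife", "robbed", "street"], "theft with violence"),
    (["wallet", "stolen", "bus"], "theft without violence"),
    (["phone", "stolen", "subway"], "theft without violence"),
]
model = train(samples)
print(classify(["robbed", "subway"], model))
```

In this toy setting, a record that mentions only one descriptive word can still be assigned to the most likely crime type, which is the role the text gives to the Bayes step.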

Equation (1) was applied to the data in order to cluster them. An estimation is also computed for establishing a probability for each point that belongs to a route and a crime event. In other words, this equation defines the likelihood that an event occurs on a given day and time and at a specific location:

$$P(c \mid x) = \frac{P(x \mid c)\, P(c)}{P(x)} \qquad (1)$$

Input: N = {n_1, n_2, ..., n_k}: set of nodes
       n_s: start node
       n_f: finish node
Result: D = {d_1, d_2, ..., d_k}: set of distance values (weighted crime rate)

(1) Assign distance values d(n_s) = 0, d(n_i) = ∞ ∀ n_i ≠ n_s
(2) Let U = N − {n_s}
(3) Let current node n_c = n_s
(4) while exists(n_c) do
(5)     foreach neighbor(n_c) do
(6)         Let n_j = neighbor(n_c)
(7)         if d(n_j) > length(n_j, n_c) + d(n_c) then
(8)             d(n_j) = length(n_j, n_c) + d(n_c)
(9)     U = U − {n_j}
(10)    if n_f ∉ U or min(length(n_i, n_j)) = ∞ ∀ n_i, n_j ∈ U then
(11)        return d(n_k) ∀ n_k ∈ N
(12)    else
(13)        n_c = n_i such that d(n_i) = min(d(n_j)) ∀ n_j ∈ U

Algorithm 3: The safe routing algorithm.
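For readers who prefer runnable code, the following Python sketch shows a standard Dijkstra search in which the edge costs are crime weights instead of physical distances. It is an illustration rather than the authors' adaptation: the function name, the graph encoding, and the rule that the cost of entering a node equals that node's crime weight are assumptions made here.

```python
import heapq

def safest_route(graph, start, finish):
    """Dijkstra over a graph whose edge costs are crime weights.

    graph: dict node -> list of (neighbor, weight), with non-negative weights.
    Returns (total_weight, path), or (inf, []) if finish is unreachable.
    """
    dist = {start: 0.0}
    prev = {}
    queue = [(0.0, start)]
    visited = set()
    while queue:
        d, node = heapq.heappop(queue)
        if node in visited:
            continue  # stale queue entry
        visited.add(node)
        if node == finish:
            break
        for neighbor, weight in graph.get(node, []):
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(queue, (candidate, neighbor))
    if finish not in dist:
        return float("inf"), []
    # Rebuild the path by walking the predecessor links backwards.
    path = [finish]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return dist[finish], path[::-1]

# Toy network (invented): node B lies in a high-crime spot, so entering it is costly.
graph = {
    "A": [("B", 0.9), ("C", 0.1)],
    "B": [("D", 0.1)],
    "C": [("D", 0.2)],
    "D": [],
}
cost, path = safest_route(graph, "A", "D")
print(path, round(cost, 2))
```

In the toy network the route through C is preferred even if it were physically longer, because the accumulated crime weight is lower; this mirrors the behavior described for the safe routes.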

The clustering approach obtains probabilities that represent the estimation for specific crimes that were previously classified semantically. These probabilities define possible hot spots that represent events directly associated with streets. In addition, the method classifies incoming data generated by the categorization task by searching for patterns that were defined as words associated with a crime. It generates predictive spatiotemporal patterns according to the indicated parameters. Thus, the goal is to find, for specific locations in a given route, the probability that a criminal event occurs when a user travels on that route. The probabilities are based on the following combinations:

Let P(coordinate = [x, y] | event) be the probability of an event directly related to its location, and let P(day or time | event) be the probability of an event taking into account its temporal attributes. The computation of such probabilities is presented as follows:

P(c | x) represents the posterior probability; it is defined as the probability that an event occurs considering past events.

P(x) represents the total probability, which denotes the number of times that some given attributes appear in the events (e.g., theft).

P(x | c) represents the conditional probability, which is denoted by the number of times that a target appears in each attribute (e.g., car), taking into account the total number of times that the target appears in all attributes.

P(c) is defined as the a priori probability, which represents the number of times that a target appears in all events.

Events, attributes, and locations are identified and classified by the Bayes algorithm. The clusters are generated according to the defined patterns that were used in the training process. In particular, the places are described as trusted or untrusted. The probability is used as a weighted value, and it is normalized to the range [0, 1], where 0 represents the lowest and 1 the highest likelihood. These values are then used in the route generation. An example of the clustering computation with the obtained likelihood is presented as follows:

Classified Event: ("12/05/2014", "3", Monday, 12255928567), "Event": "Theft", "ID Coordinate": "30339694", "latitude": 19.4038744, "longitude": -99.150226, "Probability": "0.0009295401", "diagnosed weight": 0.89.

3.5. Stage 5: Generation of Safe Routes. The safe routes are visualized in the mobile mapping application, which was developed as a client by using REST (Representational State Transfer). It is a web service technology in which the client generates requests and parses the data received from the server. The spatial feature is supported by OpenStreetMap [12]. The safe route is obtained by an adaptation of the Dijkstra algorithm in which each node in the network is assigned an average weight obtained from the number of crimes for a specific point or geographic area. The values generated by the Bayes algorithm reflect the probability that an event such as "theft/crime" occurs or not.

Let n_s be the starting node, called the initial node, and let d(n_k) be the distance of node n_k from the initial node. The safe route algorithm assigns some initial distance values based on the weighted crime rate for these points. The weighted crime rate is computed from the number of reported crime/theft occurrences at a specific point and time. Algorithm 3 describes the process for obtaining the safe routes.

Thus, this algorithm uses the data sent by the mobile application (origin and destination points) in order to return a route that avoids locations where crime events have occurred. The confidence level is a metric that is computed by considering the number of crime incidents that have occurred at a specific point and time. This is used as a weighted value for the safe routing algorithm. The displayed route is marked with the coordinates returned by the algorithm in order to visualize the route on the mobile mapping application.

The weights can also be relaxed by spatial and temporal values such as date (day, month, and year), location (point or area), and time (hour or period). At the interface level, users can configure the route search process by setting some parameters: the route can be generated by using the social, official, or integrated data source, and the search process also distinguishes between different modes of transportation (e.g., walking or by car).

4. Experimental Results and Evaluation

The mobile application was implemented in Android 4.0, and the tests were performed on mobile devices. In this section, the results based on information from Mexico City are presented. The repository is composed of 5441 events, which were collected and processed from the tweets and correlated with official reports. From this dataset, a frequency table is generated, which indicates the number of times that each attribute appears for given events such as theft and crime.

The frequency values are defined as weights when deriving a safe route. All values are assigned to the corresponding nodes in the network. In addition, the probability that an event occurs at some location and specific date is computed and stored in a vector table. This also allows a comparison of crime probabilities at different places. This sort of summarization provides support to evaluate the probability of an event occurring at a specific location and/or particular time (e.g., the probability that a crime or theft can occur on "Avenue Eje Central on Monday"). The probability that an event occurs at a given place and time is illustrated by the following example: "On Monday at 9:00 am in Iztapalapa."

$$
\begin{aligned}
P(x) &= P(\text{Monday}) = \frac{220}{2149},\\
P(x \mid c) &= P(\text{Monday} \mid \text{Theft}) = \frac{18}{218},\\
P(c) &= P(\text{Theft}) = \frac{218}{2149},\\
P(c \mid x) &= P(\text{Theft} \mid \text{Monday}) = \frac{P(\text{Monday} \mid \text{Theft}) \times P(\text{Theft})}{P(\text{Monday})} = 0.0818.
\end{aligned}
$$
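The arithmetic of this example can be checked with a few lines of Python. The sketch assumes that the counts are read as 220 of 2149 events on Monday, 218 of 2149 events being thefts, and 18 of the 218 thefts occurring on Monday; the variable names are illustrative.

```python
from fractions import Fraction

p_monday = Fraction(220, 2149)             # P(x): share of all events that fall on Monday
p_monday_given_theft = Fraction(18, 218)   # P(x | c): share of thefts that fall on Monday
p_theft = Fraction(218, 2149)              # P(c): share of all events that are thefts

# Bayes' theorem: P(c | x) = P(x | c) * P(c) / P(x)
p_theft_given_monday = p_monday_given_theft * p_theft / p_monday
print(p_theft_given_monday, round(float(p_theft_given_monday), 4))  # 9/110 0.0818
```

The result reduces to 18/220: of the 220 events recorded on Monday, 18 are thefts, which agrees with the value 0.0818 reported in the example.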

(1) The probabilities of some classified crime events are computed.

(2) The selected combinations (e.g., theft at a particular location and time) are defined. However, this problem is NP-hard; thus, a semantic classification has been applied to restrict the search space and decrease the computational complexity.

(3) This provides the probabilities for all events that occur for each particular domain value.

An example of computing the probability for crime events, such as the probability that "a theft occurs at a given location on Friday or a crime occurs on Tuesday at noon," is presented as follows:

P(Theft) = 0.104
P(Crime) = 0.065
P(Crime(w)) = 0.017
P(Crime(p)) = 0.812

P(Theft | day = Monday) = 0.032
P(Crime | day = Monday) = 0.13
P(Crime(w) | day = Monday) = 0.017
P(Crime(p) | day = Monday) = 0.813
P(Theft | hour = 9) = 0.1
P(Crime | hour = 9) = 0.05
P(Crime(w) | hour = 9) = 0.05
P(Crime(p) | hour = 9) = 0.8
P(Theft | coordinate = 1245612) = 0
P(Crime | coordinate = 1245612) = 0.2
P(Crime(w) | coordinate = 1245612) = 0
P(Crime(p) | coordinate = 1245612) = 0.8

Therefore,

$$
\begin{aligned}
&P(\text{Theft} \mid \text{time} = 9,\ \text{day} = \text{Monday},\ \text{coordinate} = 1245612)\\
&\quad = P(\text{Theft}) \times P(\text{Theft} \mid \text{day} = \text{Monday}) \times P(\text{Theft} \mid \text{time} = 9) \times P(\text{Theft} \mid \text{coordinate} = 1245612)\\
&\quad = 0.104 \times 0.032 \times 0.1 \times 0 = 0.
\end{aligned}
$$
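The product used in this example can be reproduced with the short sketch below. The helper name is hypothetical, and the factors are the values listed above for "Theft" and "Crime" (for "Crime", the value listed for 9 o'clock is taken as the hour factor, which is an interpretation of the listing).

```python
from math import prod

def joint_estimate(prior, conditionals):
    """Multiply a prior by a list of per-attribute factors, as in the worked example."""
    return prior * prod(conditionals)

# Factors in the order: day = Monday, hour = 9, coordinate = 1245612.
theft = joint_estimate(0.104, [0.032, 0.1, 0.0])
crime = joint_estimate(0.065, [0.13, 0.05, 0.2])
print(theft, crime)
```

Note that a single factor equal to zero (here, no theft recorded at that coordinate) drives the whole product to zero, whatever the other factors are.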

The example takes into consideration the defined points of a given area; that is, we have the estimation that an event occurs at point x at 11 am, or at point x at several hours of the day. The probability that a theft occurs increases or decreases depending on the hour and the day of the week. The user can also ask for all possible combinations near his or her location. For instance, for point y, the user may want to know the probability of suffering a theft at different hours of the day, or which event (type of crime) is the most common at point x at 6 pm.

Figures 5 and 6 show safe routes between two points, where the circles on the map represent untrusted points. The route in Figure 5 was generated for a pedestrian by avoiding these points; although this means that the path could be longer, it is the safer route. Figure 6 depicts the safe route generated for the same points indicated in Figure 5 when the user is a driver rather than a pedestrian. In this case, the route is different because the estimation is carried out by processing only the event types related to drivers.

Finally, it is possible to generate routes considering the data reports of different periods of time (e.g., thefts and crimes that occurred from April to July 2015). The safe routing algorithm generates a specific route that avoids the points where theft events have occurred in the past. Nevertheless, the generated route can change if the user changes the time period taken into account (e.g., from January to March 2015) or includes all the temporal data available (e.g., from 2013 to 2015). This allows us to know which location is safer than others at a specific day and hour in the same geographic area.

Figure 5: Suggested safe route for a pedestrian.

Figure 6: Suggested safe route for a user in a car.

Figure 7 depicts the events that occurred to pedestrians and drivers according to official and social sources; the icons represent theft from pedestrians and drivers, with and without violence, as well as house theft.

On the other hand, the results obtained in Figures 5, 6, and 7 were compared with the Official Crime Map System (http://www.mapadelincuencial.org.mx). Figure 8 only depicts events that volunteers marked as points where a crime or theft with violence occurred; the data difference is evident, and the possibilities to configure the system are very limited in comparison with the proposed mobile information system (see Figure 9).

Figure 10 shows the hybrid map view from the web version. The events in yellow were reported by the social network source, and the events in blue come from the official source.

5. Conclusions and Future Work

In this paper, a hybrid approach for finding safe routes using semantic processing and classification algorithms, with data provided by a social network and official crime reports, is presented. As a case study, a mobile information system was developed. It generates safe routes based on crime reports of Mexico City from a large tweet repository and official databases. The data are semantically classified to determine whether a tweet describes a crime event or theft; tweets that cannot be identified as crimes are evaluated by the Bayes algorithm, which clusters them according to the contained description. The clusters are then used to make predictions regarding the possibility that a crime can occur in a specific place and hour. The spatiotemporal analysis determined the location where the crime events occurred. Moreover, the confidence level of a location was defined and used as a parameter for computing the safer route.

Figure 7: All events occurred from 2013 to 2015.

Figure 8: Crime map (made by volunteers).

The main contributions of this work are as follows: (1) the design of a hybrid approach based on semantic processing to retrieve crime data from a social network source; (2) the integration of crowd-sensed data with official government sources; (3) the validation of a tweet performed by comparing the sources using k-fold cross validation; (4) the estimation model based on the Bayes algorithm to obtain safe routes with data that were provided by the mobile device; and (5) the design of a mobile information system to generate safe routes.

Figure 9: Theft/crime events that occurred in a specific time from the mobile information system.

Figure 10: The hybrid map view from the web version.

According to the results of the estimation, the certainty degree is around 75%. It was tested on areas with crime data whose records were intentionally removed while an original copy was kept. The results of the estimation were then compared with the original copy. We found that the estimation has a performance of 75% for all the points of the data sample.
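The evaluation procedure described above (hide some records, estimate them, and compare with the kept copy) can be sketched as follows. The data, the function names, and the majority-vote estimator are invented for illustration; the paper's estimator is the Bayes model, and the toy output below is not a reproduction of the reported result.

```python
from collections import Counter

def holdout_accuracy(records, estimate, hidden_ids):
    """Hide the labels of some records, re-estimate them, and compare with the kept copy.

    records: dict id -> (area, label); estimate: callable(area, visible_records) -> label.
    """
    visible = {i: r for i, r in records.items() if i not in hidden_ids}
    hits = sum(estimate(records[i][0], visible) == records[i][1] for i in hidden_ids)
    return hits / len(hidden_ids)

def majority_by_area(area, visible):
    """Stand-in estimator: most frequent label among the visible records of the same area."""
    labels = [label for (a, label) in visible.values() if a == area]
    return Counter(labels).most_common(1)[0][0] if labels else None

# Invented records: id -> (area, label).
records = {
    1: ("north", "unsafe"), 2: ("north", "unsafe"), 3: ("north", "unsafe"),
    4: ("south", "safe"), 5: ("south", "safe"), 6: ("south", "unsafe"),
    7: ("north", "safe"),
}
print(holdout_accuracy(records, majority_by_area, hidden_ids=[3, 5, 6, 7]))
```

The returned fraction plays the role of the certainty degree: the share of intentionally removed records whose estimated label matches the original copy.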

In addition, a metric to measure the confidence level or security for certain points and areas of Mexico City has been proposed. It allows finding safe routes along paths with a low crime rate. Moreover, the mobile application gathers long-term statistical data with almost real-time information from citizens, who act as sensors in the city. The results of the mobile system have been tested and compared with the Crime Map System.

Future work is oriented towards evaluating the cognitive perception of people, taking into consideration points or geographic places for finding comfortable routes. Sentiment analysis will be addressed in order to incorporate this feature as a parameter in the computation of routes. Additionally, we propose the integration of our mobile application with the Mexican CCTV camera systems for sensing the dynamics of certain areas in the city. This contribution is focused on developing mobile information systems for routing and urban planning in city environments. Mobile applications are increasingly becoming essential for analyzing the urban dynamics of big cities. Thus, the next generation of mobile information systems is expected to take real-time road network conditions into account. In addition, this generation is oriented towards improving the quality of human life and increasing the sustainability of smart cities.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

This work was partially sponsored by the Instituto Politécnico Nacional (IPN), the Consejo Nacional de Ciencia y Tecnología (CONACYT), and the Secretaría de Investigación y Posgrado (SIP) under Grants 20162006, 20161899, 20161869, and 20161611.

References

[1] H. Zhang, Y. Xu, and X. Wen, "Optimal shortest path set problem in undirected graphs," Journal of Combinatorial Optimization, vol. 29, no. 3, pp. 511–530, 2015.

[2] W. Templeton-Steadman and R. Williams, "Information delivery system and method for mobile appliances," US Patent App. 11/562,054, 2006.

[3] D. Reis, A. Melo, A. L. V. Coelho, and V. Furtado, "Towards optimal police patrol routes with genetic algorithms," in Intelligence and Security Informatics: IEEE International Conference on Intelligence and Security Informatics, ISI 2006, San Diego, CA, USA, May 23-24, 2006. Proceedings, vol. 3975 of Lecture Notes in Computer Science, pp. 485–491, Springer, Berlin, Germany, 2006.

[4] V. Ceikute and C. S. Jensen, "Routing service quality - local driver behavior versus routing services," in Proceedings of the IEEE 14th International Conference on Mobile Data Management (MDM '13), vol. 1, pp. 97–106, IEEE, June 2013.

[5] Safety Apps, October 2015, http://www.csun.edu/police/safety-apps.

[6] E. Galbrun, K. Pelechrinis, and E. Terzi, "Urban navigation beyond shortest route: the case of safe paths," Information Systems, vol. 57, pp. 160–171, 2016.

[7] T. Wang, C. Rudin, D. Wagner, and R. Sevieri, "Learning to detect patterns of crime," in Machine Learning and Knowledge Discovery in Databases, vol. 8190 of Lecture Notes in Computer Science, pp. 515–530, Springer, Berlin, Germany, 2013.

[8] C.-H. Yu, W. Ding, P. Chen, and M. Morabito, "Crime forecasting using spatio-temporal pattern with ensemble learning," in Advances in Knowledge Discovery and Data Mining: 18th Pacific-Asia Conference, PAKDD 2014, Tainan, Taiwan, May 13–16, 2014. Proceedings, Part II, Lecture Notes in Computer Science, pp. 174–185, Springer, Berlin, Germany, 2014.

[9] L. Scott and N. Warmerdam, "Extend crime analysis with arcgis spatial statistics tools," ArcUser Magazine, 2005.

[10] M. Leitner, Crime Modeling and Mapping Using Geospatial Technologies, vol. 8, Springer, Dordrecht, The Netherlands, 2013.

[11] J. Kim, M. Cha, and T. Sandholm, "Socroutes: safe routes based on tweet sentiments," in Proceedings of the 23rd ACM International Conference on World Wide Web (WWW '14), pp. 179–182, Seoul, South Korea, April 2014.

[12] M. Haklay and P. Weber, "Openstreetmap: user-generated street maps," IEEE Pervasive Computing, vol. 7, no. 4, pp. 12–18, 2008.

[13] J. M. Sánchez Bernabeu, J. V. Berná Martínez, and F. Maciá Pérez, "Smart sentinel: monitoring and prevention system in the smart cities," International Review on Computers and Software, vol. 9, no. 9, pp. 1554–1559, 2014.

[14] Wiki crimes, 2015, http://www.wikicrimes.org/.

[15] T. Moon, S. Heo, and S. Lee, "Ubiquitous crime prevention system (UCPS) for a safer city," Procedia Environmental Sciences, vol. 22, pp. 288–301, 2014.

[16] H. Su, K. Zheng, J. Huang, H. Jeung, L. Chen, and X. Zhou, "Crowdplanner: a crowd-based route recommendation system," in Proceedings of the 30th IEEE International Conference on Data Engineering (ICDE '14), pp. 1144–1155, IEEE, Chicago, Ill, USA, April 2014.

[17] V. Arnaboldi, M. Conti, and F. Delmastro, "Implementation of CAMEO: a context-aware middleware for opportunistic mobile social networks," in Proceedings of the IEEE International Symposium on a World of Wireless, Mobile and Multimedia Networks (WoWMoM '11), pp. 1–3, Lucca, Italy, June 2011.

[18] M. Nagarajan, K. Gomadam, A. P. Sheth, A. Ranabahu, R. Mutharaju, and A. Jadhav, "Spatio-temporal-thematic analysis of citizen sensor data: challenges and experiences," in Web Information Systems Engineering - WISE 2009, vol. 5802 of Lecture Notes in Computer Science, pp. 539–553, Springer, Berlin, Germany, 2009.

[19] N. Powdthavee, "Unhappiness and crime: evidence from South Africa," Economica, vol. 72, no. 287, pp. 531–547, 2005.

[20] J. Ratcliffe, "Crime mapping: spatial and temporal challenges," in Handbook of Quantitative Criminology, pp. 5–24, Springer, New York, NY, USA, 2010.

[21] P. Gupta, G. N. Purohit, and A. Dadhich, "Crime prevention through alternate route finding in traffic surveillance using cctv cameras," International Journal of Engineering and Advanced Technology, vol. 2, no. 5, pp. 414–418, 2013.

[22] H. Chen, T. Cheng, and S. Wise, "Designing daily patrol routes for policing based on ANT colony algorithm," ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. II-4/W2, pp. 103–109, 2015.

[23] D. Dzemydiene and E. Kazemikaitiene, "Ontology-based decision support system for crime investigation processes," in Information Systems Development, pp. 427–438, Springer, New York, NY, USA, 2005.

[24] Y. Chabot, A. Bertaux, C. Nicolle, and T. Kechadi, "A complete formalized knowledge representation model for advanced digital forensics timeline analysis," Digital Investigation, vol. 15, pp. 83–100, 2015.

[25] E. Hatcher, O. Gospodnetic, and M. McCandless, Lucene in Action, Manning Publications, Greenwich, Conn, USA, 2004.

[26] F. Mata-Rivera, M. Torres-Ruiz, G. Guzmán, M. Moreno-Ibarra, and R. Quintero, "A collaborative learning approach for geographic information retrieval based on social networks," Computers in Human Behavior, vol. 51, pp. 829–842, 2015.


4 Mobile Information Systems

Table 1: Crime-related events from Twitter accounts covering Mexico City.

| Twitter account | Location | Creation date | Followers | Number of tweets | Belongs to government | Website |
|---|---|---|---|---|---|---|
| SSPDFVIAL | Mexico City | 07/14/2010 | 369,115 | 15,465 | Yes | http://ssp.df.gob.mx |
| PolloVial | Mexico City | 01/31/2013 | 667 | 7,191 | No | No website |
| Trafico889 | Mexico City | 05/14/2009 | 137,099 | 9,054 | No | http://siempre889.com/trafico |
| Alertux | Mexico City | 10/16/2012 | 179,574 | 3,559 | No | http://www.alertux.com |
| 072AvialCDMX | Mexico City | 10/20/2010 | 83,535 | 13,471 | Yes | http://www.agu.df.gob.mx |
| RedVial | Mexico City | 03/09/2010 | 63,702 | 4,481 | No | http://rvial.mx |

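Table 2: Record structure for the official database.

| Event type | 2012 Q1 | 2012 Q2 | 2012 Q3 | 2012 Q4 | 2013 Q1 | 2013 Q2 | 2013 Q3 | 2013 Q4 |
|---|---|---|---|---|---|---|---|---|
| Robbery of passenger in public transport | 159 | 175 | 154 | 111 | 102 | 134 | 115 | 86 |
| Robbery of passerby without violence | 47 | 80 | 54 | 79 | 73 | 46 | 49 | 57 |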

From the tweets dataset, it is important to extract the information conveyed by tweets that talk about crime-related events. The following features were considered: specialized and generic accounts; mentions of crimes on popular streets; common abbreviations of crimes; and popular places where crimes were committed, such as malls, entertainment sites, private and public locations, and historical monuments. This process determined the popular words and relevant concepts by using the ontology. The most common n-grams for each tweet were obtained by sorting them according to their occurrence frequency, and the recurring n-grams were selected with a threshold of more than 100 mentions. From the most frequent unigram, bigram, and trigram lists, we identified by hand 456 common crimes on popular streets, 150 common crime-related events, 135 common crime hashtags, 69 common nicknames, 65 common buildings, places, and monuments, 34 common abbreviations, and 26 common combinations of prepositions.
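The n-gram selection described above can be sketched as follows. This is only an illustration under stated assumptions: the whitespace tokenizer and the function names `ngrams` and `frequent_ngrams` are hypothetical and are not taken from the system described in this paper.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def frequent_ngrams(tweets: Iterable[str], n: int, min_mentions: int = 100):
    """Count n-grams over all tweets and keep those mentioned more than
    `min_mentions` times, sorted by decreasing frequency."""
    counts: Counter = Counter()
    for text in tweets:
        tokens = text.lower().split()  # naive tokenizer (assumption)
        counts.update(ngrams(tokens, n))
    return [(gram, c) for gram, c in counts.most_common() if c > min_mentions]
```

Calling `frequent_ngrams(tweets, 1)`, `frequent_ngrams(tweets, 2)`, and `frequent_ngrams(tweets, 3)` would yield the unigram, bigram, and trigram lists that are then reviewed by hand.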

The dataset contains 450,250 tweets collected over a period of six months, from January 7, 2015, until June 24, 2015, without considering retweets and posts with blank spaces. Tweets are collected from reliable Twitter profiles that correspond to known services and institutions (see Table 1). The crime tweets were retrieved by using the Search API and Twitter4j.

The official reports were retrieved through the InfoDF system. The process consisted of requesting from the SSP and PGJ government agencies the records on crimes such as robbery of passersby and vehicle theft.

The information retrieved from the SSP agency contains more details than the information from the PGJ, such as the type of offense, the colony or area where the event occurred, and the time. A processed record from the official police database (PGJ-DF) is shown in Table 2. It represents a crime/theft associated with a location or neighborhood, as well as the time by trimester.

3.2. Stage 2: A Crime Data Repository. In this stage, the most relevant tweets are identified in order to carry out a semantic matching, which is driven by an ontology adopted from [23]. A fragment of this ontology, used to semantically integrate crime tweets with official reports, is depicted in Figure 3.

The ontology is explored by Algorithm 1, which uses hyperonymy and synonymy as semantic relationships in order to find a match between each tweet and each crime report. That is, if a tweet is expressed in the domains of time and location, its hypernyms and synonyms are searched within the ontology in order to contextualize it, and they are stored in a vector. The process is repeated for each record from the official institutions, and the resulting vectors are compared. A match means that a term or concept is the same or has the same parent; in that case, it is considered a candidate to be unified. In other words, if a match is found, the data (tweets and the databases from the PGJ and SSP) are merged.
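The matching rule in this paragraph (same concept, or same parent) can be illustrated with a small sketch. The ontology fragment, the synonym table, and the function names below are hypothetical and serve only to show the idea; they are not the ontology of [23] nor the OntoExplore implementation.

```python
from typing import Dict, Optional, Set

# Hypothetical toy fragment of a crime ontology: child concept -> parent concept.
PARENT: Dict[str, str] = {
    "armed robbery": "violent theft",
    "carjacking": "violent theft",
    "violent theft": "theft",
    "nonviolent theft": "theft",
    "theft": "crime",
}

# Hypothetical synonym table mapping surface words to ontology concepts.
SYNONYM: Dict[str, str] = {"mugging": "armed robbery", "stealing": "theft"}


def concept_vector(term: str) -> Set[str]:
    """Collect the concept of a term together with all of its hypernyms."""
    concept: Optional[str] = SYNONYM.get(term, term)
    vector: Set[str] = set()
    while concept is not None:
        vector.add(concept)
        concept = PARENT.get(concept)  # climb one level; None at the root
    return vector


def is_candidate(tweet_term: str, report_term: str) -> bool:
    """A tweet term and a report term match if they denote the same concept
    or if their concepts share the same direct parent."""
    a = SYNONYM.get(tweet_term, tweet_term)
    b = SYNONYM.get(report_term, report_term)
    if a == b:
        return True
    parent_a = PARENT.get(a)
    return parent_a is not None and parent_a == PARENT.get(b)
```

With this toy data, `is_candidate("mugging", "carjacking")` holds because both concepts have the parent "violent theft", so the tweet and the report would be candidates for unification.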

3.3. Stage 3: The Semantic Processing. The process applied to tweets consists of the following tasks. (1) Tokenizer. It separates and identifies the tweets and filters them to remove the stop words from the token stream. This task was performed by the Lucene system [25]. (2) Processing Data. This task involves the analysis of tweets: their spatial, temporal, and descriptive attributes are identified in order to determine where, when, and how a crime event occurred. (3) Categorization. It uses the semantic matching to group the tweets according to the crimes defined in the ontology. The semantic distance for each processed tweet is computed by using the weighted crime rate. The output of this stage is a set of tweets categorized by crime type. Figure 4 shows a general diagram that describes the integration of these tasks. To illustrate the above, two records from the Twitter dataset at different levels of spatial granularity are presented as follows:

(1) SSPDFVIAL: crime report in public transport with a firearm, Iztapalapa.

(2) Alertux: theft of students in Ecatepec, streets 105 and 106, RedVial.


Input: Tweets
Result: Unified and categorized tweet

(1) Let q[i] = elements_of_tweet
(2) n = 0
(3) while n ≠ i do
(4)   Parsing and identification (q[i])
(5)   node.start()
(6)   while node ≠ null do
(7)     j = 0, i = 0
(8)     if Hyperonomy or Similarity(concept_name) then
(9)       conVec[j] = get_parent_and_children(node)
(10)    node.next()
(11)    CrowdVector[j] = eventType_search(conVec[j])
(12)    temporal_search(conVec[j])
(13)    j++, k++, n++

Algorithm 1: The OntoExplore algorithm.

Figure 3: A fragment of the crime ontology to semantically match tweets and official information. (Node labels in the figure: Thing; Crime; Crime against property; Crime against person; Theft; Violent theft; Nonviolent theft; Carjacking; Robbery; Car stolen from parking; Armed robbery; Highway robbery.)

Figure 4: Tasks involved in the semantic processing. (Blocks in the diagram: tweets analysis; spatial analysis (GeoNames and gazetteer); linguistic analysis (dictionary); temporal analysis; filtering and cleaning data (Lucene system); the OntoExplore algorithm execution; categorized crime records; integrated crime database.)

The cleaning process includes the adaptation of Algorithm 2 for removing stop words, according to the definition presented in [26].

On the other hand, a tweet is associated with a record of the official database according to its relevance, by using the ontology and the hyperonymy relation. The relevance is measured by contextualizing the record and the tweet. This means that each semantic item of the tweet and of the record is identified and then searched in the crime ontology in order to find matching terms, synonyms, or related concepts.


Input: An arbitrary stop word dictionary T, the set of schema trees QI
Result: A maximal set of stop words T′

(1) T′ ← ∅
(2) W = the set of all words in the domain
(3) D = T ∩ W
(4) while ∃ word w ∈ D do
(5)   for each interface q ∈ QI do
(6)     remove the stop word constraints for the labels of sibling nodes
(7)   if no stop word constraint is violated then
(8)     T′ ← T′ ∪ {w}
(9)   else
(10)    remove antonyms of w appearing in D from T′
(11) return T′

Algorithm 2: Removing stop words.

For example, in the first tweet above the location is not precise: only the name of the neighborhood is given, and the streets are not defined. However, the remaining features enrich the description of the event. Therefore, the tweet is contextualized and related to the concept "violence". The second tweet, in contrast, has a precise location but is imprecise regarding the event description. This process generates a semantic classification matched to the categorization of the official database, in which each tweet is described according to a few categories, either Boolean values or descriptors (e.g., theft, crime, and violence). The location is derived from the neighborhood, street name, or spatial relation (e.g., near and far). The time is computed from explicit data (at 11:00 am) or temporal adverbs (e.g., now and afternoon). Thus, a parsing that contains the attributes identified by the semantic classification (event type, crime, and theft in car or walking) is generated for evaluating the domains. The parsing structure obtained by analyzing the tweet application is presented as follows:

@semantic classification {Theft, Crime, Theft with violence, Theft walking, Theft in car, Theft without violence}
@attribute day {Mon, Tue, Wed, Thu, Fri, Sat, Sun}
@attribute id location {30339461, 30339462, 30339493, 30339495, 30339496, 30339671}
@attribute time {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24}
@data
Mon, 30339461, 10, Theft
Fri, 30339671, 20, Crime

The label @semantic classification represents the result of the semantic matching, @attribute represents the valid attributes and their domains, and @data denotes the classified data after the syntactic and semantic-syntactical analysis.

3.4. Stage 4: The Clustering Approach. In this stage, an approach based on the Bayes algorithm is proposed for clustering the crime data stored in the repository. A bag of words is used to classify the crimes; that is, there is a set of words that describes each type of theft or crime. If a tweet or record contains these words, it is classified according to the type of theft or crime that these words represent in the database. The Bayes algorithm, in turn, classifies the tweets or records that contain none or only one of these words. The goal is to classify tweets that are relevant but that the semantic processing cannot classify. Thus, a tweet that describes a theft or crime without an exact location, time, or description will be classified by using the Bayes algorithm.

The Bayes algorithm takes into account certain features (the words that compose a tweet or a record from the official database), which are identified and assigned to a particular cluster. In this task many clusters can appear, and they are filtered later. The Bayes theorem is useful not only to cluster data related to crime or theft but also to cluster data containing words that do not belong to any type of crime, so that these clusters can be omitted. The use of the Bayes algorithm had two specific goals: the first is to classify data that cannot be semantically classified, and the second is to perform an estimation. For example, a user requests a safe route in a certain area, but there are no theft/crime reports for this area in either Twitter or the official database. In that case, this approach finds the cluster to which the area belongs (e.g., a safe or unsafe cluster) and makes an estimation for this zone in order to determine which points are probably safer than others. Finally, weights are assigned to all points, and these weights are sent as parameters to the safe route algorithm.
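As an illustration of how a bag-of-words Bayes classifier assigns a category to a tweet, the following minimal sketch implements a multinomial naive Bayes model. It is not the implementation used in this work: the class name, the whitespace tokenizer, and the add-one (Laplace) smoothing are assumptions made only for this example.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Minimal multinomial naive Bayes over whitespace-separated words."""

    def __init__(self):
        self.class_counts = Counter()            # documents per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        self.vocab = set()

    def fit(self, docs, labels):
        for text, label in zip(docs, labels):
            words = text.lower().split()
            self.class_counts[label] += 1
            self.word_counts[label].update(words)
            self.vocab.update(words)
        return self

    def predict(self, text):
        words = text.lower().split()
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # log P(c) + sum of log P(word | c), with add-one smoothing
            # (the smoothing is an assumption of this sketch).
            score = math.log(n_docs / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for w in words:
                score += math.log((self.word_counts[label][w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

A tweet that contains only one of the descriptive words of a category can still be assigned to it, because every word contributes to the score of each class.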

Equation (1) was applied to the data in order to cluster them. It is also used to establish, for each point that belongs to a route, the probability of a crime event. In other words, the equation defines the likelihood that an event occurs on a given day and time and at a specific location:

$$P(c \mid x) = \frac{P(c) \times P(x \mid c)}{P(x)} \qquad (1)$$


Input: N = {n_1, n_2, ..., n_k}, set of nodes; n_s, start node; n_f, finish node
Result: D = {d_1, d_2, ..., d_k}, set of distance values (weighted crime rate)

(1) Assign distance values d(n_s) = 0, d(n_i) = ∞ ∀ n_i ≠ n_s
(2) Let U = N − {n_s}
(3) Let current node n_c = n_s
(4) while exists(n_c) do
(5)   foreach neighbor(n_c) do
(6)     Let be n_j = neighbor(n_c)
(7)     if d(n_j) > length(n_j, n_c) + d(n_c) then
(8)       d(n_j) = length(n_j, n_c) + d(n_c)
(9)   U = U − {n_j}
(10)  if n_f ∉ U or min(length(n_i, n_j)) = ∞ ∀ n_i, n_j ∈ U then
(11)    return d(n_k) ∀ n_k ∈ N
(12)  else
(13)    n_c = d(n_i) = min(d(n_j)) ∀ n_j ∈ U

Algorithm 3: The safe routing algorithm.

The clustering approach obtains probabilities that represent the estimation for specific crimes that were previously classified semantically. These probabilities define possible hot spots, which represent events that are directly associated with streets. In addition, the method classifies the incoming data generated by the categorization task by searching for patterns that were defined as words associated with a crime. It generates predictive spatiotemporal patterns according to the indicated parameters. Thus, the goal is to find, for specific locations in a given route, the probability that a criminal event occurs when a user travels on that route. The probabilities are based on the following combinations:

Let P(coordinate = [x, y] | event) be the probability of an event directly related to its location, and let P(day r time | event) be the probability of an event taking into account its temporality. The computation of these probabilities is presented as follows.

P(c | x) represents the posterior probability, that is, the probability that an event occurs considering past events.

P(x) represents the total probability, which denotes the number of times that some given attributes appear in the events (e.g., theft).

P(x | c) represents the conditional probability, which is denoted by the number of times that a target appears in each attribute (e.g., car), taking into account the total number of times that the target appears in all attributes.

P(c) is defined as the a priori probability, which represents the number of times that a target appears in all events.

Events, attributes, and locations are identified and classified by the Bayes algorithm. The clusters are generated according to the defined patterns that were used in the training process. In particular, the places are described as trusted or untrusted. The probability is used as a weighted value, and it is normalized to the range [0, 1], where 0 represents the lowest and 1 the highest likelihood. These values are then used in the route generation. An example of the clustering computation with the obtained likelihood is presented as follows:

Classified Event: ("12/05/2014", "3", Monday, 12255928567), "Event": "Theft", "ID Coordinate": "30339694", "latitude": 19.4038744, "longitude": 99.150226, "Probability": "0.0009295401", "diagnosed weight": 0.89
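The text does not state how the probabilities are mapped to the range [0, 1]. One common option is min-max scaling, sketched below purely as an assumption; the function name `normalize` and the sample values are hypothetical.

```python
from typing import Dict


def normalize(probabilities: Dict[str, float]) -> Dict[str, float]:
    """Min-max scale raw probabilities so that the lowest value maps to 0
    and the highest to 1 (assumed normalization, not taken from the paper)."""
    lo = min(probabilities.values())
    hi = max(probabilities.values())
    if hi == lo:
        # All points are equally likely; no point stands out.
        return {key: 0.0 for key in probabilities}
    return {key: (p - lo) / (hi - lo) for key, p in probabilities.items()}
```

The resulting values could then be attached to the corresponding points as weights for the route generation.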

3.5. Stage 5: Generation of Safe Routes. The safe routes are visualized in the mobile mapping application, which was developed as a client by using REST (Representational State Transfer), a web service technology in which the client generates requests and the data to be parsed are received by the server. The spatial features are supported by OpenStreetMap [12]. The safe route is obtained by an adaptation of the Dijkstra algorithm, in which each node in the network is assigned an average weight obtained from the number of crimes for a specific point or geographic area. The values generated by the Bayes algorithm reflect the probability that an event such as "theft/crime" does or does not occur.

Let n_s be the starting node, called the initial node, and let d be the distance of node n_k from the initial node to n_1. The safe route algorithm assigns initial distance values based on the weighted crime rate for these points. The weighted crime rate is computed from the number of reported crime/theft occurrences at a specific point and time. Algorithm 3 describes the process for obtaining the safe routes.
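To make the idea of a crime-weighted shortest path concrete, the sketch below runs a standard Dijkstra search in which entering a node costs the edge length plus a penalty proportional to the crime weight of that node. It is a textbook Dijkstra variant, not a transcription of Algorithm 3: the way length and crime weight are combined, the `penalty` parameter, and all names are assumptions.

```python
import heapq
from typing import Dict, List, Tuple

Graph = Dict[str, Dict[str, float]]  # node -> {neighbor: edge length}


def safest_route(graph: Graph, crime_weight: Dict[str, float],
                 start: str, goal: str,
                 penalty: float = 1.0) -> Tuple[float, List[str]]:
    """Return (cost, path) of the cheapest route from start to goal, where
    each step costs edge length + penalty * crime weight of the node entered."""
    dist = {start: 0.0}
    prev: Dict[str, str] = {}
    heap = [(0.0, start)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue  # stale queue entry
        visited.add(node)
        if node == goal:
            break
        for nbr, length in graph.get(node, {}).items():
            cost = d + length + penalty * crime_weight.get(nbr, 0.0)
            if cost < dist.get(nbr, float("inf")):
                dist[nbr] = cost
                prev[nbr] = node
                heapq.heappush(heap, (cost, nbr))
    if goal not in dist:
        return float("inf"), []  # goal is unreachable
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return dist[goal], path[::-1]
```

With `penalty = 0` the search reduces to the ordinary shortest path; increasing `penalty` makes the route detour around nodes with a high crime weight even when the detour is longer.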

Thus, this algorithm uses the data sent by the mobile application (origin and destination points) in order to return a route that avoids locations where crime events have occurred. The confidence level is a metric computed from the number of crime incidents that have occurred at a specific point and time. It is used as a weighted value for the safe routing algorithm. The displayed route is marked with the coordinates returned by the algorithm in order to visualize the route on the mobile mapping application.

The weights can also be relaxed by spatial and temporal values, such as date (day, month, and year), location (point or area), and time (hour or period). At the interface level, users can configure the route search process by setting several parameters: the route can be generated by using either the social, the official, or the integrated data source, and the search process also distinguishes between different means of transportation (e.g., walking or by car).

4. Experimental Results and Evaluation

The mobile application was implemented on Android 4.0, and the tests were performed on mobile devices. In this section, the results based on information from Mexico City are presented. The repository is composed of 5,441 events, which were collected and processed from the tweets and correlated with official reports. From this dataset, a frequency table is generated, which indicates the number of times that each attribute appears for given events such as theft and crime.

The frequency values are used as weights when deriving a safe route. All values are assigned to the corresponding nodes in the network. In addition, the probability that an event occurs at some location and on a specific date is computed and stored in a vector table. This also allows a comparison of crime probabilities at different places. This kind of summarization supports the evaluation of the probability that an event occurs at a specific location and/or a particular time (e.g., the probability that a crime or theft can occur on "Avenue Eje Central on Monday"). The probability that an event occurs at a given place and time is defined in the following example: "On Monday at 9:00 am in Iztapalapa".

$$
\begin{aligned}
P(x) &= P(\text{Monday}) = 220/2149, \\
P(x \mid c) &= P(\text{Monday} \mid \text{Theft}) = 18/218, \\
P(c) &= P(\text{Theft}) = 218/2149, \\
P(c \mid x) &= P(\text{Theft} \mid \text{Monday}) = \frac{P(\text{Theft}) \times P(\text{Monday} \mid \text{Theft})}{P(\text{Monday})} = 0.0818.
\end{aligned}
$$
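The value 0.0818 can be checked directly with Bayes' theorem. The short sketch below uses exact fractions; reading the figures as counts (2149 events in the sample, 218 thefts, 220 Monday events, and 18 Monday thefts) is an interpretation of the numbers above, and the function name `posterior` is hypothetical.

```python
from fractions import Fraction


def posterior(prior_c: Fraction, likelihood_x_given_c: Fraction,
              evidence_x: Fraction) -> Fraction:
    """Bayes' theorem: P(c | x) = P(c) * P(x | c) / P(x)."""
    return prior_c * likelihood_x_given_c / evidence_x


p_theft = Fraction(218, 2149)             # P(c) = P(Theft)
p_monday_given_theft = Fraction(18, 218)  # P(x | c) = P(Monday | Theft)
p_monday = Fraction(220, 2149)            # P(x) = P(Monday)

p_theft_given_monday = posterior(p_theft, p_monday_given_theft, p_monday)
print(p_theft_given_monday, round(float(p_theft_given_monday), 4))  # 9/110 0.0818
```

The sample size cancels out, so the posterior reduces to 18/220, the share of Monday events that are thefts.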

(1) The probabilities of some classified crime events are computed.

(2) The selected combinations (e.g., theft at a particular location and time) are defined. However, this problem is NP-hard; thus, a semantic classification has been applied to restrict the search space and decrease the computational complexity.

(3) This provides the probabilities for all events that occur for each particular domain value.

An example of computing the probability of crime events, such as the probability that "a theft occurs at a given location on Friday or a crime occurs on Tuesday at noon", is presented as follows:

P(Theft) = 0.104, P(Crime) = 0.065, P(Crime(w)) = 0.017, P(Crime(p)) = 0.812

P(Theft | day = Monday) = 0.032, P(Crime | day = Monday) = 0.13, P(Crime(w) | day = Monday) = 0.017, P(Crime(p) | day = Monday) = 0.813

P(Theft | hour = 9) = 0.1, P(Crime | hour = 9) = 0.05, P(Crime(w) | hour = 9) = 0.05, P(Crime(p) | hour = 9) = 0.8

P(Theft | coordinate = 1245612) = 0, P(Crime | coordinate = 1245612) = 0.2, P(Crime(w) | coordinate = 1245612) = 0, P(Crime(p) | coordinate = 1245612) = 0.8

⇒ P(Theft | time = 9 && day = Monday && coordinate = 1245612) = P(Theft) × P(Theft | day = Monday) × P(Theft | time = 9) × P(Theft | coordinate = 1245612) = 0.104 × 0.032 × 0.1 × 0 = 0
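The combination rule used in this example multiplies the prior by one factor per attribute. The following sketch reproduces that arithmetic; the function name `combined_score` and the dictionary keys are hypothetical.

```python
import math
from typing import Dict


def combined_score(prior: float, conditionals: Dict[str, float]) -> float:
    """Multiply the prior by the per-attribute probabilities, as in the
    worked example above."""
    return prior * math.prod(conditionals.values())


theft_score = combined_score(
    0.104,  # P(Theft)
    {"day=Monday": 0.032, "time=9": 0.1, "coordinate=1245612": 0.0},
)
```

Because the factors are multiplied, a single zero (here, no theft recorded at that coordinate) forces the combined value to zero regardless of the other factors.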

The example takes into consideration the defined points of a given area; that is, we have the estimation that an event occurs at point x at 11 am, or at point x at several hours of the day. The probability that a theft occurs increases or decreases depending on the hour and the day of the week. The user can also ask for all possible combinations near his or her location. For instance, for point y, the user may want to know the probability of suffering a theft at different hours of the day. The user can thus find out which event (type of crime) is the most likely to occur at point x at 6 pm.

Figures 5 and 6 show safe routes between two points; the circles on the map represent untrusted points. The route in Figure 5 was generated for a transient user (a pedestrian) by avoiding these points; although the path may be longer, it is the safer route. Figure 6 depicts the safe route generated for the same points as in Figure 5 when the user is a driver instead of a transient. In this case the route is different, because the estimation only processes the event types related to drivers.

Finally, it is possible to generate routes considering the data reports of different periods of time (e.g., thefts and crimes that occurred from April to July 2015). The safe routing algorithm generates a specific route that avoids the points where theft events have occurred in the past. Nevertheless, the generated route can change if the user changes the time period taken into account (e.g., from January to March 2015) or includes all the temporal data available (e.g., from 2013 to 2015). This allows us to know which location is safer than others on a specific day and hour in the same geographic area.
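Restricting the crime data to a period (and, optionally, to a means of transportation) before computing the weights could look like the sketch below. The event fields `date`, `location_id`, and `transport`, as well as the function name, are hypothetical and are not the schema of the repository described here.

```python
from collections import Counter
from datetime import date
from typing import Dict, Iterable, Optional


def crime_counts(events: Iterable[Dict], start: date, end: date,
                 transport: Optional[str] = None) -> Counter:
    """Count events per location within [start, end], optionally keeping
    only the events related to one means of transportation."""
    counts: Counter = Counter()
    for event in events:
        if not (start <= event["date"] <= end):
            continue  # outside the requested period
        if transport is not None and event["transport"] != transport:
            continue  # e.g. skip driver events for a pedestrian route
        counts[event["location_id"]] += 1
    return counts
```

Widening the period from a few months to several years changes the counts per location, which is why the suggested route can change when the user selects a different time window.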


Figure 5: Suggested safe route for a transient.

Figure 6: Suggested safe route for a user in a car.

Figure 7 depicts the events that occurred to transients and drivers, from official and social sources; the icons represent theft from transients and drivers, with and without violence, as well as house theft.

The results obtained in Figures 5, 6, and 7 were compared with the Official Crime Map System (http://www.mapadelincuencial.org.mx). Figure 8 only depicts events that volunteers marked as points where a crime or theft with violence occurred; the difference in the data is evident, and the possibilities to configure that system are very limited in comparison with the proposed mobile information system (see Figure 9).

Figure 10 shows the hybrid map view from the web version. The events in yellow were reported by the social network source, and the events in blue represent the official source.

5. Conclusions and Future Work

In this paper, a hybrid approach for finding safe routes, using semantic processing and classification algorithms with data provided by a social network and official crime reports, is presented. As a case study, a mobile information system was developed. It generates safe routes based on crime reports of Mexico City from a large tweet repository and official databases. The data are semantically classified to determine whether a tweet describes a crime event or theft; tweets that cannot be identified as crimes are evaluated by the Bayes algorithm, which clusters them according to the description they contain. The clusters are then used to make predictions regarding the possibility that a crime occurs in a specific place and hour. The spatiotemporal analysis determined the locations where the crime events occurred. Moreover, the confidence level of a location was defined, and it was used as a parameter for computing the safer route.

Figure 7: All events that occurred from 2013 to 2015.

Figure 8: Crime map (made by volunteers).

The main contributions of this work are as follows: (1) the design of a hybrid approach based on semantic processing to retrieve crime data from a social network source; (2) the integration of crowd-sensed data with official government sources; (3) the validation of a tweet performed by comparing the sources using k-fold cross validation; (4) the estimation model based on the Bayes algorithm to obtain safe routes with data that were provided by the mobile device; and (5) the design of a mobile information system to generate safe routes.

Figure 9: Theft/crime events that occurred in a specific time from the mobile information system.

Figure 10: The hybrid map view from the web version.

According to the results of the estimation, the certainty degree is around 75% of effectiveness. It was tested on areas with crime data: the records were intentionally removed and an original copy was kept. The results of the estimation were then compared with the original copy. We found that the estimation has a performance of 75% for all the points of the data sample.

In addition, a metric to measure the confidence level or security for certain points and areas of Mexico City has been proposed. It allows finding safe routes according to paths with a low crime rate. Moreover, the mobile application gathers long-term statistical data with almost real information from citizens, who act as sensors in the city. The results of the mobile system have been tested and compared with the Crime Map System.

Future works are oriented towards evaluating the cognitive perception of people, taking into consideration points or geographic places for finding comfortable routes. Sentiment analysis will be addressed in order to incorporate this feature as a parameter in the computation of routes. Additionally, we are proposing the integration of our mobile application with the Mexican CCTV camera systems for sensing the dynamics of certain areas in the city. This contribution is focused on developing mobile information systems for routing and urban planning in city environments. Mobile applications are increasingly becoming essential for analyzing the urban dynamics of big cities. Thus, the next generation of mobile information systems will be devised around real-time road network conditions. In addition, this generation is oriented towards improving the quality of human life and increasing the sustainability of smart cities.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

This work was partially sponsored by the Instituto Politécnico Nacional (IPN), the Consejo Nacional de Ciencia y Tecnología (CONACYT), and the Secretaría de Investigación y Posgrado (SIP) under Grants 20162006, 20161899, 20161869, and 20161611.

References

[1] H. Zhang, Y. Xu, and X. Wen, “Optimal shortest path set problem in undirected graphs,” Journal of Combinatorial Optimization, vol. 29, no. 3, pp. 511–530, 2015.

[2] W. Templeton-Steadman and R. Williams, “Information delivery system and method for mobile appliances,” US Patent App. 11/562,054, 2006.

[3] D. Reis, A. Melo, A. L. V. Coelho, and V. Furtado, “Towards optimal police patrol routes with genetic algorithms,” in Intelligence and Security Informatics: IEEE International Conference on Intelligence and Security Informatics, ISI 2006, San Diego, CA, USA, May 23-24, 2006. Proceedings, vol. 3975 of Lecture Notes in Computer Science, pp. 485–491, Springer, Berlin, Germany, 2006.

[4] V. Ceikute and C. S. Jensen, “Routing service quality - local driver behavior versus routing services,” in Proceedings of the IEEE 14th International Conference on Mobile Data Management (MDM ’13), vol. 1, pp. 97–106, IEEE, June 2013.

[5] Safety Apps, October 2015, http://www.csun.edu/police/safety-apps.

[6] E. Galbrun, K. Pelechrinis, and E. Terzi, “Urban navigation beyond shortest route: the case of safe paths,” Information Systems, vol. 57, pp. 160–171, 2016.

[7] T. Wang, C. Rudin, D. Wagner, and R. Sevieri, “Learning to detect patterns of crime,” in Machine Learning and Knowledge Discovery in Databases, vol. 8190 of Lecture Notes in Computer Science, pp. 515–530, Springer, Berlin, Germany, 2013.

[8] C.-H. Yu, W. Ding, P. Chen, and M. Morabito, “Crime forecasting using spatio-temporal pattern with ensemble learning,” in Advances in Knowledge Discovery and Data Mining: 18th Pacific-Asia Conference, PAKDD 2014, Tainan, Taiwan, May 13–16, 2014. Proceedings, Part II, Lecture Notes in Computer Science, pp. 174–185, Springer, Berlin, Germany, 2014.

[9] L. Scott and N. Warmerdam, “Extend crime analysis with ArcGIS spatial statistics tools,” ArcUser Magazine, 2005.

[10] M. Leitner, Crime Modeling and Mapping Using Geospatial Technologies, vol. 8, Springer, Dordrecht, The Netherlands, 2013.

[11] J. Kim, M. Cha, and T. Sandholm, “Socroutes: safe routes based on tweet sentiments,” in Proceedings of the 23rd ACM International Conference on World Wide Web (WWW ’14), pp. 179–182, Seoul, South Korea, April 2014.

[12] M. Haklay and P. Weber, “OpenStreetMap: user-generated street maps,” IEEE Pervasive Computing, vol. 7, no. 4, pp. 12–18, 2008.

[13] J. M. Sánchez Bernabeu, J. V. Berná Martínez, and F. Maciá Pérez, “Smart sentinel: monitoring and prevention system in the smart cities,” International Review on Computers and Software, vol. 9, no. 9, pp. 1554–1559, 2014.

[14] Wiki crimes, 2015, http://www.wikicrimes.org.

[15] T. Moon, S. Heo, and S. Lee, “Ubiquitous crime prevention system (UCPS) for a safer city,” Procedia Environmental Sciences, vol. 22, pp. 288–301, 2014.

[16] H. Su, K. Zheng, J. Huang, H. Jeung, L. Chen, and X. Zhou, “Crowdplanner: a crowd-based route recommendation system,” in Proceedings of the 30th IEEE International Conference on Data Engineering (ICDE ’14), pp. 1144–1155, IEEE, Chicago, Ill, USA, April 2014.

[17] V. Arnaboldi, M. Conti, and F. Delmastro, “Implementation of CAMEO: a context-aware middleware for opportunistic mobile social networks,” in Proceedings of the IEEE International Symposium on a World of Wireless, Mobile and Multimedia Networks (WoWMoM ’11), pp. 1–3, Lucca, Italy, June 2011.

[18] M. Nagarajan, K. Gomadam, A. P. Sheth, A. Ranabahu, R. Mutharaju, and A. Jadhav, “Spatio-temporal-thematic analysis of citizen sensor data: challenges and experiences,” in Web Information Systems Engineering - WISE 2009, vol. 5802 of Lecture Notes in Computer Science, pp. 539–553, Springer, Berlin, Germany, 2009.

[19] N. Powdthavee, “Unhappiness and crime: evidence from South Africa,” Economica, vol. 72, no. 287, pp. 531–547, 2005.

[20] J. Ratcliffe, “Crime mapping: spatial and temporal challenges,” in Handbook of Quantitative Criminology, pp. 5–24, Springer, New York, NY, USA, 2010.

[21] P. Gupta, G. N. Purohit, and A. Dadhich, “Crime prevention through alternate route finding in traffic surveillance using CCTV cameras,” International Journal of Engineering and Advanced Technology, vol. 2, no. 5, pp. 414–418, 2013.

[22] H. Chen, T. Cheng, and S. Wise, “Designing daily patrol routes for policing based on ANT colony algorithm,” ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. II-4/W2, pp. 103–109, 2015.

[23] D. Dzemydiene and E. Kazemikaitiene, “Ontology-based decision support system for crime investigation processes,” in Information Systems Development, pp. 427–438, Springer, New York, NY, USA, 2005.

[24] Y. Chabot, A. Bertaux, C. Nicolle, and T. Kechadi, “A complete formalized knowledge representation model for advanced digital forensics timeline analysis,” Digital Investigation, vol. 15, pp. 83–100, 2015.

[25] E. Hatcher, O. Gospodnetic, and M. McCandless, Lucene in Action, Manning Publications, Greenwich, Conn, USA, 2004.

[26] F. Mata-Rivera, M. Torres-Ruiz, G. Guzmán, M. Moreno-Ibarra, and R. Quintero, “A collaborative learning approach for geographic information retrieval based on social networks,” Computers in Human Behavior, vol. 51, pp. 829–842, 2015.



Input: Tweets
Result: Unified and categorized tweet

(1) Let q[i] = elements of tweet
(2) n = 0
(3) while n ≠ i do
(4) Parsing and identification (q[i])
(5) node.start()
(6) while node ≠ null do
(7) j = 0, i = 0
(8) if Hyperonymy or Similarity(concept.name) then
(9) conVec[j] = get_parent_and_children(node)
(10) node.next()
(11) CrowdVector[j] = eventType_search(conVec[j])
(12) temporal_search(conVec[j])
(13) j++, k++, n++

Algorithm 1: The OntoExplore algorithm.

Figure 3: A fragment of the crime ontology to semantically match tweets and official information. Concepts shown: Thing; Crime; Crime against property; Crime against person; Theft; Violent theft; Nonviolent theft; Carjacking; Robbery; Car stolen from parking; Armed robbery; Highway robbery.

Figure 4: Tasks involved in the semantic processing. Tasks shown: tweets analysis; spatial analysis (GeoNames and gazetteer); linguistic analysis (dictionary); temporal analysis; filtering and cleaning data (Lucene system); the OntoExplore algorithm execution; categorized crime records; integrated crime database.

The cleaning process includes the adaptation of Algorithm 2 for removing stop words according to the definition presented in [26].

On the other hand, a tweet is associated with a record of the official database according to its relevance by using the ontology and the hyperonymy relation. The relevance is measured by contextualizing the record and the tweet. This means that each semantic item of the tweet and the record is identified and then searched in the crime ontology in order to find matching terms, synonyms, or related concepts.
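The matching of a tweet concept against an official record concept through the hyperonymy relation can be sketched as follows. This is only an illustration under stated assumptions: the parent-child edges in `PARENT` are guessed from the labels of the ontology fragment, and the `relevance` score (inverse of the hypernym distance) is a hypothetical scoring rule, not the measure used by the authors.

```python
# Hypothetical fragment of the crime ontology: child concept -> parent concept.
# The edges are an assumption made for this sketch.
PARENT = {
    "crime": "thing",
    "theft": "crime",
    "violent theft": "theft",
    "nonviolent theft": "theft",
    "carjacking": "violent theft",
    "robbery": "violent theft",
    "armed robbery": "robbery",
    "highway robbery": "robbery",
}


def ancestors(concept):
    """Return the concept followed by all of its hypernyms up to the root."""
    chain = [concept]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def relevance(tweet_concept, record_concept):
    """Score 1.0 for identical concepts, decaying with hypernym distance.

    Returns 0.0 when the two concepts share no ancestor other than the
    root ("thing"). The scoring rule is an assumption of this sketch.
    """
    a, b = ancestors(tweet_concept), ancestors(record_concept)
    for steps_up, node in enumerate(a):
        if node in b and node != "thing":
            return 1.0 / (1 + steps_up + b.index(node))
    return 0.0


print(relevance("armed robbery", "robbery"))        # one step apart
print(relevance("carjacking", "nonviolent theft"))  # meet at "theft"
```

Under this rule, a tweet about an armed robbery is considered more relevant to a record labeled "robbery" than to one labeled "nonviolent theft", because the shared hypernym is closer.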


Input: An arbitrary stop word dictionary T, the set of schema trees QI
Result: A maximal set of stop words T′

(1) T′ ← ∅
(2) W = the set of all words in the domain
(3) D = T ∩ W
(4) while ∃ word w ∈ D
(5) for each interface q ∈ QI
(6) remove the stop word constraints for the labels of sibling nodes
(7) if no stop word constraint is violated then
(8) T′ ← T′ ∪ {w}
(9) else
(10) remove antonyms of w appearing in D from T′
(11) return T′

Algorithm 2: Removing stop words.

For example, in the first tweet above, the location is not precise: only the name of the neighborhood is given, but the streets are not defined. However, these features enrich the description of the event.

Therefore, the tweet is contextualized and related to the concept “violence.” Moreover, the second tweet has a precise location, but it is imprecise regarding the event description. This process generates a semantic classification matched to the official database categorization, in which each tweet is described according to a few categories, either Boolean or descriptors (e.g., theft, crime, and violence). The location is derived from the neighborhood, street name, or spatial relation (e.g., near and far). The time is computed from explicit data (at 11:00 am) or temporal adverbs (e.g., now and afternoon). Thus, a parsing that contains the attributes identified by the semantic classification (event type, crime, and theft in car or walking) for evaluating the domains is generated. The parsing structure that was obtained by analyzing the tweet application is presented as follows:

semantic classification: Theft, Crime, Theft with violence, Theft walking, Theft in car, Theft without violence

attribute day: Mon, Tue, Wed, Thu, Fri, Sat, Sun

attribute id location: 30339461, 30339462, 30339493, 30339495, 30339496, 30339671

attribute time: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24

data:

Mon, 30339461, 10, Theft

Fri, 30339671, 20, Crime

The label “semantic classification” represents the result of the semantic matching, “attribute” represents the valid attributes and their domains, and “data” denotes the classified data after syntactic and semantic-syntactical analysis.

3.4. Stage 4: The Clustering Approach. In this stage, an approach based on the Bayes algorithm is proposed for clustering the crime data that are stored in the repository. A bag of words is used to classify the crimes; that is, there is a set of words that describes each type of theft or crime. If a tweet or record contains these words, it is classified according to the type of theft or crime that these words represent in the database. The Bayes algorithm, in turn, classifies the tweets or records that contain none of these words or only one of them. The goal is to classify tweets that are relevant but that the semantic processing cannot classify. Thus, a tweet that describes a theft or crime but does not have an exact location, time, or description will be classified by using the Bayes algorithm.
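The two-step classification described above (bag of words first, Bayes as a fallback) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the keyword sets, the training tweets, the two-keyword threshold, and the add-one smoothing are all assumptions introduced for the example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical bag of words per event type (not taken from the paper).
KEYWORDS = {
    "Theft": {"stolen", "stole", "pickpocket"},
    "Crime": {"assault", "shooting", "kidnapping"},
}


def keyword_label(text):
    """Return the event type whose keywords occur at least twice, else None."""
    words = text.lower().split()
    for label, vocab in KEYWORDS.items():
        if sum(w in vocab for w in words) >= 2:
            return label
    return None


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # log P(c) + sum of log P(word | c)
            score = math.log(n / total)
            for w in text.lower().split():
                score += math.log((counts[w] + 1) / denom)
            if score > best_score:
                best, best_score = label, score
        return best


def classify(text, model):
    """Use the bag of words when it is decisive, otherwise fall back to Bayes."""
    return keyword_label(text) or model.predict(text)


# Hypothetical training data for illustration only.
model = NaiveBayes().fit(
    ["my phone was stolen on the bus", "they stole my wallet",
     "armed assault near the park", "shooting reported downtown"],
    ["Theft", "Theft", "Crime", "Crime"],
)
print(classify("someone took my wallet on the bus", model))
```

The fallback only runs when fewer than two keywords are found, which mirrors the statement that the Bayes algorithm handles tweets containing none or only one of the descriptive words.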

The Bayes algorithm takes into account certain features (words that compose a tweet or a record from the official database) that are identified and assigned to a particular cluster. In this task, many clusters can appear, and they are filtered later. The Bayes theorem is useful not only to cluster data related to crime or theft but also to cluster data that contain words that do not belong to any type of crime, so that these clusters can be omitted. The use of the Bayes algorithm had two specific goals: the first is to classify data that cannot be semantically classified, and the second is to perform an estimation. For example, a user requests a safe route in a certain area, but there are no theft/crime reports for this area in either Twitter or the official database. In that case, this approach finds the cluster to which the area belongs (e.g., a safe or unsafe cluster) and makes an estimation for this zone to determine which points are probably safer than others. Finally, weights are assigned to all points, and these weights are sent as parameters to the safe route algorithm.

Equation (1) was applied to the data in order to cluster them. The estimation is also computed to establish a probability for each point that belongs to a route and a crime event. In other words, the equation defines the likelihood that an event occurs on a given day and time and at a specific location:

$$P(c \mid x) = P(c) \times \frac{P(x \mid c)}{P(x)} \qquad (1)$$


Input: N = {n_1, n_2, ..., n_k}: set of nodes; n_s: start node; n_f: finish node
Result: D = {d_1, d_2, ..., d_k}: set of distance values (weighted crime rate)

(1) Assign distance values d(n_s) = 0, d(n_i) = ∞ ∀ n_i ≠ n_s
(2) Let U = N − {n_s}
(3) Let current node n_c = n_s
(4) while exists(n_c) do
(5) foreach neighbor(n_c) do
(6) Let n_j = neighbor(n_c)
(7) if d(n_j) > length(n_j, n_c) + d(n_c) then
(8) d(n_j) = length(n_j, n_c) + d(n_c)
(9) U = U − {n_j}
(10) if n_f ∉ U or min(length(n_i, n_j)) = ∞ ∀ n_i, n_j ∈ U then
(11) return d(n_k) ∀ n_k ∈ N
(12) else
(13) n_c = n_i with d(n_i) = min(d(n_j)) ∀ n_j ∈ U

Algorithm 3: The safe routing algorithm.
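A compact way to experiment with this kind of routing is a priority-queue version of Dijkstra's algorithm. The sketch below is an illustration under stated assumptions rather than the authors' code: the cost of a step is taken to be the crime weight of the node being entered (the paper's `length` function is not specified in detail), and the names `safest_route`, `graph`, and `crime_weight` are hypothetical.

```python
import heapq


def safest_route(graph, crime_weight, start, goal):
    """Dijkstra's algorithm where entering a node costs its crime weight.

    graph: dict mapping node -> iterable of neighbour nodes.
    crime_weight: dict mapping node -> non-negative weight (higher = less safe).
    Returns (total_cost, path), or (inf, []) if the goal is unreachable.
    """
    dist = {start: 0.0}
    prev = {}
    queue = [(0.0, start)]
    visited = set()
    while queue:
        d, node = heapq.heappop(queue)
        if node in visited:
            continue  # stale queue entry
        visited.add(node)
        if node == goal:
            break
        for nb in graph.get(node, ()):
            nd = d + crime_weight.get(nb, 0.0)
            if nd < dist.get(nb, float("inf")):
                dist[nb] = nd
                prev[nb] = node
                heapq.heappush(queue, (nd, nb))
    if goal not in dist:
        return float("inf"), []
    # Walk the predecessor links back from the goal to rebuild the path.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return dist[goal], path[::-1]


# Hypothetical four-node network: B is an untrusted point, C is not.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
crime_weight = {"A": 0.0, "B": 0.9, "C": 0.1, "D": 0.0}
print(safest_route(graph, crime_weight, "A", "D"))
```

In this toy network the route through C is preferred over the route through B because C carries the lower crime weight, even though both routes have the same number of steps.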

The clustering approach obtains probabilities that represent the estimation for specific crimes that were previously classified semantically. These probabilities define possible hot spots, which represent events that are directly associated with streets. In addition, the method classifies incoming data generated by the categorization task, searching for patterns that were defined as words associated with a crime. It generates predictive spatiotemporal patterns according to the indicated parameters. Thus, the goal is to find the probability that a criminal event occurs at specific locations along a given route when a user travels on that route. The probabilities are based on the following combinations:

Let P(coordinate = [x, y] | event) be the probability of an event directly related to its location, and let P(day or time | event) be the probability of an event taking into account its temporariness. The computation of such probabilities is presented as follows:

P(c | x) represents the posterior probability; it is defined as the probability that an event occurs considering past events.

P(x) represents the total probability, which denotes the number of times that some given attributes appear in the events (e.g., theft).

P(x | c) represents the conditional probability, which is denoted by the number of times that a target appears in each attribute (e.g., car), taking into account the total number of times that the target appears in all attributes.

P(c) is defined as the a priori probability, which represents the number of times that a target appears in all events.

Events, attributes, and locations are identified and classified by the Bayes algorithm. The clusters are generated according to the defined patterns that were used in the training process. In particular, the places are described as trusted or untrusted. The probability is used as a weighted value, and it is normalized to the range [0, 1], where 0 represents the lowest and 1 the highest likelihood. These values are then used in the route generation. An example of the clustering computation with the obtained likelihood is presented as follows:

Classified Event (“12/05/2014”, “3”, Monday, 12255928567), “Event”: “Theft”, “ID Coordinate”: “30339694”, “latitude”: 19.4038744, “longitude”: 99.150226, “Probability”: “0.0009295401”, “diagnosed weight: 0.89”
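The text does not state how the probabilities are mapped to the range [0, 1]. One common option is min-max scaling, sketched below purely as an assumption; the function name `normalize` and the sample values are hypothetical.

```python
def normalize(probabilities):
    """Min-max scale a dict of location -> probability into [0, 1]."""
    lo, hi = min(probabilities.values()), max(probabilities.values())
    if hi == lo:
        # All locations are equally likely; assign them all weight 0.0.
        return {k: 0.0 for k in probabilities}
    return {k: (v - lo) / (hi - lo) for k, v in probabilities.items()}


# Hypothetical raw probabilities for three locations.
weights = normalize({"a": 0.25, "b": 0.75, "c": 0.5})
print(weights)
```

With this choice, the least likely location receives weight 0 and the most likely location receives weight 1, which matches the stated interpretation of the two ends of the range.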

3.5. Stage 5: Generation of Safe Routes. The safe routes are visualized in the mobile mapping application, which was developed as a client by using REST (Representational State Transfer). It is a web service technology in which requests are generated and the parsed data are received by the server. The spatial feature is supported by OpenStreetMap [12]. The safe route is obtained by an adaptation of the Dijkstra algorithm in which each node in the network is assigned an average weight obtained from the number of crimes for a specific point or geographic area. The values generated by the Bayes algorithm reflect the probability that an event such as “theft/crime” does or does not occur.

Let n_s be the starting node, called the initial node, and let d be the distance of node n_k from the initial node to n_1. The safe route algorithm assigns initial distance values based on the weighted crime rate for these points. The weighted crime rate is computed from the number of reported crime/theft occurrences at a specific point and time. Algorithm 3 describes the process for obtaining the safe routes.

Thus, this algorithm uses the data sent by the mobile application (origin and destination points) in order to return a route that avoids locations where crime events have occurred. The confidence level is a metric that is computed by considering the number of crime incidents that have occurred at a specific point and time. It is used as a weighted value for the safe routing algorithm. The displayed route is marked with the coordinates returned by the algorithm in order to visualize the route on the mobile mapping application.

The weights can also be relaxed by spatial and temporal values such as date (day, month, and year), location (point or area), and time (hour or period). At the interface level, users can configure the route search process by setting some parameters, such as whether the route is generated from the social, official, or integrated data source. The search process also distinguishes between modes of transportation (e.g., walking or by car).

4. Experimental Results and Evaluation

The mobile application was implemented in Android 4.0, and the tests were performed on mobile devices. In this section, the results based on information of Mexico City are presented. The repository is composed of 5441 events, which were collected and processed from the tweets and correlated with official reports. From this dataset, a frequency table is generated, which indicates the number of times that each attribute appears given events such as theft and crime.

The frequency values are defined as weights when deriving a safe route. All values are assigned to the corresponding nodes in the network. In addition, the probability that an event occurs at some location and specific date is computed and stored in a vector table. This also allows a comparison of crime probabilities at different places. This sort of summarization supports evaluating the probability of an event occurring at a specific location and/or particular time (e.g., the probability that a crime or theft can occur in “Avenue Eje Central on Monday”). The probability that an event occurs at a given place and time is illustrated in the following example: “On Monday at 9:00 am in Iztapalapa.”

P(x) = P(Monday) = 220/2149

P(x | c) = P(Monday | Theft) = 18/218

P(c) = P(Theft) = 218/2149

P(c | x) = P(Theft | Monday) = P(Theft) × P(Monday | Theft) / P(Monday) = 0.0818
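The arithmetic of this example can be checked with exact fractions. The short sketch below only reproduces the numbers given above; the variable names are chosen for this illustration.

```python
from fractions import Fraction

p_monday = Fraction(220, 2149)             # P(x)
p_monday_given_theft = Fraction(18, 218)   # P(x | c)
p_theft = Fraction(218, 2149)              # P(c)

# Bayes' theorem: P(c | x) = P(c) * P(x | c) / P(x)
p_theft_given_monday = p_theft * p_monday_given_theft / p_monday
print(p_theft_given_monday, round(float(p_theft_given_monday), 4))
# prints: 9/110 0.0818
```

The common factors cancel, so the posterior reduces to 18/220, that is, the share of Monday events that are thefts.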

(1) The probabilities of some classified crime events are computed.

(2) The selected combinations (e.g., theft at a particular location and time) are defined. However, this problem is NP-hard; thus, a semantic classification has been applied to restrict the search space and decrease the computational complexity.

(3) This provides the probabilities for all events that occur for each particular domain value.

An example of computing the probability for crime events, such as the probability that “a theft occurs at a given location on Friday or a crime occurs on Tuesday at noon,” is presented as follows:

P(Theft) = 0.104; P(Crime) = 0.065; P(Crime(w)) = 0.017; P(Crime(p)) = 0.812

P(Theft | day = Monday) = 0.032; P(Crime | day = Monday) = 0.13; P(Crime(w) | day = Monday) = 0.017; P(Crime(p) | day = Monday) = 0.813

P(Theft | hour = 9) = 0.1; P(Crime | hour = 9) = 0.05; P(Crime(w) | hour = 9) = 0.05; P(Crime(p) | hour = 9) = 0.8

P(Theft | coordinate = 1245612) = 0; P(Crime | coordinate = 1245612) = 0.2; P(Crime(w) | coordinate = 1245612) = 0; P(Crime(p) | coordinate = 1245612) = 0.8

⇒ P(Theft | time = 9 && day = Monday && coordinate = 1245612) = P(Theft) × P(Theft | day = Monday) × P(Theft | time = 9) × P(Theft | coordinate = 1245612) = 0.104 × 0.032 × 0.1 × 0 = 0
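The combination step in this example is a plain product of the prior and the per-attribute values. The sketch below reproduces it; the helper name `joint_score` is hypothetical, and the second call simply applies the same product to the Crime values listed above.

```python
from math import prod


def joint_score(prior, conditionals):
    """Multiply a prior by per-attribute values, as in the worked example."""
    return prior * prod(conditionals)


# Theft: day = Monday, hour = 9, coordinate = 1245612
theft = joint_score(0.104, [0.032, 0.1, 0.0])
# Crime: the same three attributes
crime = joint_score(0.065, [0.13, 0.05, 0.2])
print(theft)  # prints: 0.0
print(crime)
```

Note that a single zero factor, here the value for the coordinate, forces the whole product to zero regardless of the other terms.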

The example takes into consideration the defined points of a given area; that is, we have the estimation that an event occurs at point x at 11 am, or at point x at several hours of the day. The probability that a theft occurs increases or decreases depending on the hour and the day of the week. The user can also ask for all possible combinations near his or her location. For instance, for point y, the user may want to know the probability of suffering a theft at different hours of the day, and can thus learn which event (type of crime) is the most common at point x at 6 pm.

Figures 5 and 6 show safe routes between two points, where the circles on the map represent untrusted points. Figure 5 shows the route for a transient user (a pedestrian). The route was generated by avoiding these points; although this means that the path could be longer, it is the safer route. Figure 6 depicts the safe route generated for the same points indicated in Figure 5 when the user is a driver rather than a transient. In this case, the route is different because the estimation is carried out by processing only the event types related to drivers.

Finally, it is possible to generate routes considering the data reports of different periods of time (e.g., thefts and crimes that occurred from April to July 2015). The safe routing algorithm generates a specific route that avoids the points where theft events have occurred in the past. Nevertheless, the generated route can change if the user changes the time period taken into account (e.g., from January to March 2015) or includes all temporal data available (e.g., from 2013 to 2015). This allows us to know which location is safer than others at a specific day and hour in the same geographic area.


Figure 5: Suggested safe route for a transient.

Figure 6: Suggested safe route for a user in a car.

Figure 7 depicts the events that occurred to transients and drivers from official and social sources; the icons represent theft from transients and drivers, with and without violence, as well as house theft.

On the other hand the obtained results in Figures 56 and 7 were compared with the Official Crime Map Sys-tem (httpwwwmapadelincuencialorgmx) Figure 8 onlydepicts events that volunteersmarked as points where a crimeor theft with violence occurred the data difference is evidentand the possibilities to configure the system are very limitedin comparison with the proposed mobile information system(see Figure 9)

Thus Figure 10 shows the hybrid map view from the webversion The events in yellow color were reported by thesocial network source and events with blue color representthe official source

5 Conclusions and Future Work

In this paper a hybrid approach for finding safe routesusing semantic processing and classification algorithms with

Figure 7 All events occurred from 2013 to 2015

Figure 8 Crime map (made by volunteers)

data provided from a social network and the official crimereports is presented As a case study a mobile informationsystem was developed It generates safe routes based oncrime reports of Mexico City from large tweet repositoryand official databases The data are semantically classifiedto determine whether the tweet describes a crime event ortheft in case of tweets which cannot be identified as crimesthey are evaluated by Bayes algorithm which clustered themaccording to the contained descriptionThus the clusters areused tomake prediction regarding the possibility that a crimecan occur in a specific place and hour The spatiotemporalanalysis determined the location where the crime eventsoccurred Moreover the confidence level of a location wasdefined and it was used as a parameter for computing the saferroute

Themain contributions of this work are as follows (1) thedesign of a hybrid approach based on semantic processingto retrieve crime data from a social network source (2) theintegration of crowd-sensed data with official governmentsources (3) the validation of a tweet performed by comparingthe sources using 119896-fold cross validation (4) the estimation

10 Mobile Information Systems

Figure 9 Theftcrime events that occurred in a specific time fromthe mobile information system

Figure 10 The hybrid map view from the web version

model based on the Bayes algorithm to obtain safe routes withdata that were provided by the mobile device and (5) thedesign of amobile information system to generate safe routes

According to the results of the estimation the certaintydegree is around 75 of effectiveness It was tested bycomparing areas with crime data but the records wereintentionally removed and original copy was keptThus withthe results of the estimation a comparison with the originalcopy was performed So we found that the estimation has aperformance of 75 for all the points of the data sample

In addition a metric to measure the confidence level orsecurity for certain points and areas of Mexico City has beenproposed It allows finding safe routes according to pathswith a low crime rate Moreover the mobile application gath-ers long-term statistical data with almost real informationfrom citizens which are acting as sensors in the city Theresults of the mobile system have been tested and comparedwith the Crime Map System

Future works are oriented towards evaluating the cogni-tive perception of people taking into consideration points

or geographic places for finding comfortable routes Thesentiment analysis will be treated in order to incorporate thisfeature as a parameter in the computation of routes Addi-tionally we are proposing the integration of ourmobile appli-cation with the Mexican CCTV camera systems for sensingthe dynamic of certain areas in the city This contribution isfocused on developing mobile information systems for rout-ing and urban planning in city environments Mobile appli-cations are increasingly becoming essential for analyzing theurban dynamic of big cities Thus the appearance of the nextgeneration of mobile information systems will be devised inreal-time road network conditions In addition this genera-tion is oriented towards improving the quality of human lifefor increasing the sustainability of the smart cities

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

Thisworkwas partially sponsored by the Instituto PolitecnicoNacional (IPN) the Consejo Nacional de Ciencia y Tec-nologıa (CONACYT) and the Secretarıa de Investigacion yPosgrado (SIP) under Grants 20162006 20161899 20161869and 20161611

References

[1] H Zhang Y Xu and X Wen ldquoOptimal shortest path setproblem in undirected graphsrdquo Journal of Combinatorial Opti-mization vol 29 no 3 pp 511ndash530 2015

[2] W Templeton-Steadman and R Williams ldquoInformation deliv-ery system and method for mobile appliancesrdquo US Patent App11562054 2006

[3] D Reis A Melo A L V Coelho and V Furtado ldquoTowardsoptimal police patrol routes with genetic algorithmsrdquo in Intel-ligence and Security Informatics IEEE International Conferenceon Intelligence and Security Informatics ISI 2006 SanDiego CAUSA May 23-24 2006 Proceedings vol 3975 of Lecture Notesin Computer Science pp 485ndash491 Springer Berlin Germany2006

[4] V Ceikute and C S Jensen ldquoRouting service qualitymdashlocaldriver behavior versus routing servicesrdquo in Proceedings of theIEEE 14th International Conference onMobileDataManagement(MDM rsquo13) vol 1 pp 97ndash106 IEEE June 2013

[5] Safety Apps October 2015 httpwwwcsunedupolicesafety-apps

[6] E Galbrun K Pelechrinis and E Terzi ldquoUrban navigationbeyond shortest route the case of safe pathsrdquo InformationSystems vol 57 pp 160ndash171 2016

[7] T Wang C Rudin D Wagner and R Sevieri ldquoLearning todetect patterns of crimerdquo in Machine Learning and KnowledgeDiscovery in Databases vol 8190 of Lecture Notes in ComputerScience pp 515ndash530 Springer Berlin Germany 2013

[8] C-H Yu W Ding P Chen and M Morabito ldquoCrime forecast-ing using spatio-temporal pattern with ensemble learningrdquo inAdvances in Knowledge Discovery andDataMining 18th Pacific-AsiaConference PAKDD2014 Tainan TaiwanMay 13ndash16 2014

Mobile Information Systems 11

Proceedings Part II Lecture Notes in Computer Science pp174ndash185 Springer Berlin Germany 2014

[9] L Scott and NWarmerdam ldquoExtend crime analysis with arcgisspatial statistics toolsrdquo ArcUser Magazine 2005

[10] M Leitner Crime Modeling and Mapping Using GeospatialTechnologies vol 8 Springer DordrechtTheNetherlands 2013

[11] J Kim M Cha and T Sandholm ldquoSocroutes safe routesbased on tweet sentimentsrdquo in Proceedings of the 23rd ACMInternational Conference on World Wide Web (WWW rsquo14) pp179ndash182 Seoul South Korea April 2014

[12] MHaklay and PWeber ldquoOpenstreetmap user-generated streetmapsrdquo IEEE Pervasive Computing vol 7 no 4 pp 12ndash18 2008

[13] J M Sanchez Bernabeu J V Berna Martınez and F MaciaPerez ldquoSmart sentinelmonitoring and prevention system in thesmart citiesrdquo International Review on Computers and Softwarevol 9 no 9 pp 1554ndash1559 2014

[14] Wiki crimes 2015 httpwwwwikicrimesorg[15] T Moon S Heo and S Lee ldquoUbiquitous crime prevention

system (UCPS) for a safer cityrdquoProcedia Environmental Sciencesvol 22 pp 288ndash301 2014

[16] H Su K Zheng J Huang H Jeung L Chen and X ZhouldquoCrowdplanner a crowd-based route recommendation systemrdquoinProceedings of the 30th IEEE International Conference onDataEngineering (ICDE rsquo14) pp 1144ndash1155 IEEE Chicago Ill USAApril 2014

[17] V Arnaboldi M Conti and F Delmastro ldquoImplementationof CAMEO a context-aware middleware for opportunisticmobile social networksrdquo inProceedings of the IEEE InternationalSymposium on a World of Wireless Mobile and MultimediaNetworks (WoWMoM rsquo11) pp 1ndash3 Lucca Italy June 2011

[18] M Nagarajan K Gomadam A P Sheth A Ranabahu RMutharaju and A Jadhav ldquoSpatio-temporal-thematic analysisof citizen sensor data challenges and experiencesrdquo in WebInformation Systems EngineeringmdashWISE 2009 vol 5802 of Lec-ture Notes in Computer Science pp 539ndash553 Springer BerlinGermany 2009

[19] N Powdthavee ldquoUnhappiness and crime evidence from SouthAfricardquo Economica vol 72 no 287 pp 531ndash547 2005

[20] J Ratcliffe ldquoCrime mapping spatial and temporal challengesrdquoin Handbook of Quantitative Criminology pp 5ndash24 SpringerNew York NY USA 2010

[21] P Gupta G N Purohit and A Dadhich ldquoCrime preventionthrough alternate route finding in traffic surveillance using cctvcamerasrdquo International Journal of Engineering and AdvancedTechnology vol 2 no 5 pp 414ndash418 2013

[22] H Chen T Cheng and S Wise ldquoDesigning daily patrol routesfor policing based on ANT colony algorithmrdquo ISPRS Annalsof the Photogrammetry Remote Sensing and Spatial InformationSciences vol II-4W2 pp 103ndash109 2015

[23] D Dzemydiene and E Kazemikaitiene ldquoOntology-based deci-sion support system for crime investigation processesrdquo inInformation Systems Development pp 427ndash438 Springer NewYork NY USA 2005

[24] Y Chabot A Bertaux C Nicolle and T Kechadi ldquoA completeformalized knowledge representation model for advanced dig-ital forensics timeline analysisrdquoDigital Investigation vol 15 pp83ndash100 2015

[25] E Hatcher O Gospodnetic and M McCandless Lucene inAction Manning Publications Greenwich Conn USA 2004

[26] FMata-RiveraMTorres-RuizGGuzmanMMoreno-Ibarraand R Quintero ldquoA collaborative learning approach for geo-graphic information retrieval based on social networksrdquo Com-puters in Human Behavior vol 51 pp 829ndash842 2015



Input: An arbitrary stop word dictionary $T$, the set of schema trees $QI$
Result: A maximal set of stop words $T'$
(1) $T' \leftarrow \emptyset$
(2) $W$ = the set of all words in the domain
(3) $D = T \cap W$
(4) while $\exists$ word $w \in D$
(5)     for each interface $q \in QI$
(6)         remove the stop word constraints for the labels of sibling nodes
(7)     if no stop word constraint is violated then
(8)         $T' \leftarrow T' \cup \{w\}$
(9)     else
(10)        remove antonyms of $w$ appearing in $D$ from $T'$
(11) return $T'$

Algorithm 2: Removing stop words.

For example, in the first tweet above, the location is not precise: only the name of the neighborhood is given, and the streets are not specified. However, these features enrich the description of the event.

Therefore, the tweet is contextualized and related to the concept “violence.” Moreover, the second tweet has a precise location, but its event description is imprecise. This process generates a semantic classification matched to the categorization of the official database, in which each tweet is described according to a few categories, either Boolean values or descriptors (e.g., theft, crime, and violence). The location is derived from the neighborhood, street name, or spatial relation (e.g., near and far). The time is computed from explicit data (e.g., at 11:00 am) or temporal adverbs (e.g., now and afternoon). Thus, a parsing structure is generated that contains the attributes identified by the semantic classification (event type: crime, or theft in a car or while walking) for evaluating the domains. The parsing structure obtained by analyzing the tweets is presented as follows.

semantic classification: Theft, Crime, Theft with violence, Theft walking, Theft in car, Theft without violence
attribute day: Mon, Tue, Wed, Thu, Fri, Sat, Sun
attribute id location: 30339461, 30339462, 30339493, 30339495, 30339496, 30339671
attribute time: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24
data:
Mon, 30339461, 10, Theft
Fri, 30339671, 20, Crime

The label semantic classification represents the result of the semantic matching, attribute represents the valid attributes and their domains, and data denotes the classified data after syntactic and semantic-syntactical analysis.

3.4. Stage 4: The Clustering Approach. In this stage, an approach based on the Bayes algorithm is proposed for clustering the crime data stored in the repository. A bag of words is used to classify the crimes; that is, a set of words describes each type of theft or crime. If a tweet or record contains these words, it is classified according to the type of theft or crime that these words represent in the database. The Bayes algorithm then classifies the tweets or records that contain none of these words or only one of them. The goal is to classify relevant tweets that the semantic processing cannot classify. Thus, a tweet that describes a theft or crime but lacks an exact location, time, or description is classified by the Bayes algorithm.
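The bag-of-words step described above can be sketched as follows. This is only an illustration: the keyword sets, the function name classify_by_keywords, and the threshold min_hits are assumptions, since the paper does not list its actual vocabulary or matching rule. Records that return None are the ones that would be handed over to the Bayes algorithm.

```python
# Hypothetical keyword sets; the paper does not publish its actual bag of words.
BAG_OF_WORDS = {
    "Theft in car": {"car", "stolen", "vehicle"},
    "Theft with violence": {"robbed", "gun", "assault"},
}


def classify_by_keywords(text, bags=BAG_OF_WORDS, min_hits=2):
    """Return the category with the most matching keywords.

    Returns None when fewer than `min_hits` keywords match, i.e. when the
    record contains none of the words or only one of them.
    """
    tokens = set(text.lower().split())
    best_label, best_hits = None, 0
    for label, words in bags.items():
        hits = len(tokens & words)
        if hits > best_hits:
            best_label, best_hits = label, hits
    return best_label if best_hits >= min_hits else None


print(classify_by_keywords("my car was stolen near the park"))
print(classify_by_keywords("someone got robbed downtown"))
```

The second call matches a single keyword only, so it returns None and would be left to the probabilistic classifier.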

The Bayes algorithm takes into account certain features (the words that compose a tweet or a record from the official database), which are identified and assigned to a particular cluster. Many clusters can appear in this task; they are filtered later. The Bayes theorem is useful not only for clustering data related to crime or theft but also for clustering data that contain words not belonging to any type of crime, so that these clusters can be omitted. The Bayes algorithm was used with two specific goals: the first is to classify data that cannot be semantically classified, and the second is to perform an estimation. For example, a user requests a safe route in a certain area, but neither Twitter nor the official database contains theft or crime reports for this area. In this case, the approach finds the cluster to which the area belongs (e.g., a safe or unsafe cluster) and makes an estimation for this zone to determine which points are probably safer than others. Finally, weights are assigned to all points, and these weights are sent as parameters to the safe routing algorithm.

Equation (1) was applied to the data in order to cluster them. The same estimation is also computed to establish a probability for each point that belongs to a route and a crime event. In other words, the equation defines the likelihood that an event occurs on a given day and time and at a specific location:

$$P(c \mid x) = \frac{P(c) \times P(x \mid c)}{P(x)} \qquad (1)$$


Input: $N = \{n_1, n_2, \ldots, n_k\}$: set of nodes; $n_s$: start node; $n_f$: finish node
Result: $D = \{d_1, d_2, \ldots, d_k\}$: set of distance values (weighted crime rate)
(1) Assign distance values $d(n_s) = 0$, $d(n_i) = \infty \;\forall n_i \neq n_s$
(2) Let $U = N - \{n_s\}$
(3) Let current node $n_c = n_s$
(4) while exists($n_c$) do
(5)     foreach neighbor($n_c$) do
(6)         Let $n_j$ = neighbor($n_c$)
(7)         if $d(n_j) > \mathrm{length}(n_j, n_c) + d(n_c)$ then
(8)             $d(n_j) = \mathrm{length}(n_j, n_c) + d(n_c)$
(9)     $U = U - \{n_j\}$
(10)    if $n_f \notin U$ or $\min(\mathrm{length}(n_i, n_j)) = \infty \;\forall n_i, n_j \in U$ then
(11)        return $d(n_k) \;\forall n_k \in N$
(12)    else
(13)        $n_c = n_i$ such that $d(n_i) = \min(d(n_j)) \;\forall n_j \in U$

Algorithm 3: The safe routing algorithm.
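For readers who prefer code, the routing step can be approximated by a standard priority-queue Dijkstra search. The sketch below is not the authors' implementation and may differ from Algorithm 3 in its details; the function names, the graph layout (a dictionary of dictionaries), and the example weights are assumptions made for illustration. Edge weights stand for crime-weighted costs, so a lower total means a safer route.

```python
import heapq
import math


def safe_route_distances(graph, start, finish=None):
    """Dijkstra search over crime-weighted, non-negative edge costs.

    graph: dict mapping node -> dict of neighbor -> weight; every node
    must appear as a key. Returns (distances, predecessors).
    """
    dist = {node: math.inf for node in graph}
    dist[start] = 0.0
    prev = {}
    visited = set()
    queue = [(0.0, start)]
    while queue:
        d, node = heapq.heappop(queue)
        if node in visited:
            continue  # stale queue entry
        visited.add(node)
        if node == finish:
            break  # the distance to the finish node is final
        for neighbor, weight in graph[node].items():
            candidate = d + weight
            if candidate < dist[neighbor]:
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(queue, (candidate, neighbor))
    return dist, prev


def rebuild_route(prev, start, finish):
    """Walk the predecessor links back from finish to start."""
    route = [finish]
    while route[-1] != start:
        if route[-1] not in prev:
            return []  # finish is not reachable from start
        route.append(prev[route[-1]])
    return route[::-1]


# Hypothetical network: the direct leg A-B crosses a high-crime point.
graph = {
    "A": {"B": 0.9, "C": 0.25},
    "B": {"D": 0.125},
    "C": {"D": 0.25},
    "D": {},
}
dist, prev = safe_route_distances(graph, "A", "D")
print(rebuild_route(prev, "A", "D"), dist["D"])
```

In this toy network the route through C is chosen because its accumulated weight (0.5) is lower than the one through B (1.025).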

The clustering approach yields probabilities that represent the estimation for specific crimes that were previously classified semantically. These probabilities define possible hot spots, which represent events directly associated with streets. In addition, the method classifies incoming data generated by the categorization task by searching for patterns defined as words associated with a crime. It generates predictive spatiotemporal patterns according to the indicated parameters. Thus, the goal is to estimate, for specific locations on a given route, the probability that a criminal event occurs when a user travels along that route. The probabilities are based on the following combinations.

Let $P(\text{coordinate} = [x, y] \mid \text{event})$ be the probability of an event directly related to its location, and let $P(\text{day}, \text{time} \mid \text{event})$ be the probability of an event taking into account its temporal attributes. These probabilities are computed as follows:

$P(c \mid x)$ is the posterior probability, that is, the probability that an event occurs given past events.

$P(x)$ is the total probability, which denotes the number of times that the given attributes appear in the events (e.g., theft).

$P(x \mid c)$ is the conditional probability, given by the number of times that a target appears with each attribute (e.g., car) relative to the total number of times that the target appears in all attributes.

$P(c)$ is the prior probability, which represents the number of times that a target appears in all events.

Events, attributes, and locations are identified and classified by the Bayes algorithm. The clusters are generated according to the defined patterns that were used in the training process. In particular, places are described as trusted or untrusted. The probability is used as a weight and is normalized to the range [0, 1], where 0 represents the lowest and 1 the highest likelihood. These values are then used in the route generation. An example of the clustering computation with the obtained likelihood is presented as follows.

Classified Event (“12/05/2014”, “3”, Monday, 12255928567): “Event”: “Theft”, “ID Coordinate”: “30339694”, “latitude”: 19.4038744, “longitude”: 99.150226, “Probability”: “0.0009295401”, “diagnosed weight: 0.89”
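The text states that the probability is normalized to the range [0, 1] before it is used as a weight, but it does not say how. One common choice is min-max scaling, sketched below purely as an assumption; the function name normalize_weights and the sample values are hypothetical.

```python
def normalize_weights(probabilities):
    """Min-max scale a mapping of node id -> probability into [0, 1].

    Assumption: min-max scaling is used; the paper does not specify
    its normalization method.
    """
    low = min(probabilities.values())
    high = max(probabilities.values())
    if high == low:
        # All points are equally likely; no point stands out.
        return {node: 0.0 for node in probabilities}
    return {node: (p - low) / (high - low) for node, p in probabilities.items()}


# Hypothetical raw probabilities for three points.
raw = {"n1": 0.001, "n2": 0.003, "n3": 0.002}
print(normalize_weights(raw))
```

With this scaling, the least likely point receives weight 0 and the most likely point receives weight 1, matching the convention described above.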

3.5. Stage 5: Generation of Safe Routes. The safe routes are visualized in the mobile mapping application, which was developed as a client using REST (Representational State Transfer), a web service technology through which the application sends requests to the server and parses the returned data. The spatial data are provided by OpenStreetMap [12]. The safe route is obtained by an adaptation of the Dijkstra algorithm in which each node in the network is assigned an average weight derived from the number of crimes at a specific point or geographic area. The values generated by the Bayes algorithm reflect the probability that an event such as “theft/crime” occurs or not.

Let $n_s$ be the starting node, called the initial node, and let $d$ be the distance of node $n_k$ from the initial node to $n_1$. The safe route algorithm assigns initial distance values based on the weighted crime rate for these points. The weighted crime rate is computed from the number of reported crime/theft occurrences (complaints) at a specific point and time. Algorithm 3 describes the process for obtaining the safe routes.

Thus, this algorithm uses the data sent by the mobile application (origin and destination points) to return a route that avoids locations where crime events have occurred. The confidence level is a metric computed from the number of crime incidents that have occurred at a specific point and time. It is used as a weight in the safe routing algorithm. The displayed route is marked with the coordinates returned by the algorithm so that the route can be visualized in the mobile mapping application.

The weights can also be relaxed by spatial and temporal values such as date (day, month, and year), location (point or area), and time (hour or period). At the interface level, users can configure the route search process by setting parameters such as the data source (social, official, or integrated) and the mode of transportation (e.g., walking or by car), which the search process treats differently.

4. Experimental Results and Evaluation

The mobile application was implemented for Android 4.0, and the tests were performed on mobile devices. This section presents the results based on information from Mexico City. The repository is composed of 5,441 events, which were collected and processed from tweets and correlated with official reports. From this dataset, a frequency table is generated that indicates the number of times each attribute appears for given events such as theft and crime.

The frequency values are used as weights when deriving a safe route. All values are assigned to the corresponding nodes in the network. In addition, the probability that an event occurs at some location and on a specific date is computed and stored in a vector table. This also allows crime probabilities at different places to be compared. This summarization supports evaluating the probability that an event occurs at a specific location and/or a particular time (e.g., the probability that a crime or theft occurs on “Avenue Eje Central on Monday”). The probability that an event occurs at a given place and time is illustrated by the following example: “On Monday at 9:00 am in Iztapalapa.”

$P(x) = P(\text{Monday}) = 220/2149$

$P(x \mid c) = P(\text{Monday} \mid \text{Theft}) = 18/218$

$P(c) = P(\text{Theft}) = 218/2149$

$P(c \mid x) = P(\text{Theft} \mid \text{Monday}) = P(\text{Theft}) \times P(\text{Monday} \mid \text{Theft}) / P(\text{Monday}) = 0.0818$
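The arithmetic of this example can be checked with a few lines of code. The sketch below assumes that the counts read as 220/2149, 18/218, and 218/2149 (the fraction bars are lost in the flattened text) and applies Bayes' theorem, posterior = prior × likelihood / evidence; the function name posterior is illustrative.

```python
from fractions import Fraction


def posterior(prior, likelihood, evidence):
    """Bayes' theorem: P(c | x) = P(c) * P(x | c) / P(x)."""
    return prior * likelihood / evidence


# Counts as read from the example (assumed fraction boundaries).
p_monday = Fraction(220, 2149)            # P(x)     = P(Monday)
p_monday_given_theft = Fraction(18, 218)  # P(x | c) = P(Monday | Theft)
p_theft = Fraction(218, 2149)             # P(c)     = P(Theft)

p_theft_given_monday = posterior(p_theft, p_monday_given_theft, p_monday)
print(p_theft_given_monday, round(float(p_theft_given_monday), 4))
```

The totals cancel, leaving 18/220, which agrees with the value 0.0818 reported in the example.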

(1) The probabilities of the classified crime events are computed.

(2) The selected combinations (e.g., theft at a particular location and time) are defined. However, this problem is NP-hard; thus, a semantic classification has been applied to restrict the search space and decrease the computational complexity.

(3) This provides the probabilities for all events that occur for each particular domain value.

An example of computing the probability of crime events, such as the probability that “a theft occurs at a given location on Friday” or “a crime occurs on Tuesday at noon,” is presented as follows.

$P(\text{Theft}) = 0.104$, $P(\text{Crime}) = 0.065$, $P(\text{Crime}(w)) = 0.017$, $P(\text{Crime}(p)) = 0.812$

$P(\text{Theft} \mid \text{day} = \text{Monday}) = 0.032$, $P(\text{Crime} \mid \text{day} = \text{Monday}) = 0.13$, $P(\text{Crime}(w) \mid \text{day} = \text{Monday}) = 0.017$, $P(\text{Crime}(p) \mid \text{day} = \text{Monday}) = 0.813$

$P(\text{Theft} \mid \text{hour} = 9) = 0.1$, $P(\text{Crime} \mid \text{hour} = 9) = 0.05$, $P(\text{Crime}(w) \mid \text{hour} = 9) = 0.05$, $P(\text{Crime}(p) \mid \text{hour} = 9) = 0.8$

$P(\text{Theft} \mid \text{coordinate} = 1245612) = 0$, $P(\text{Crime} \mid \text{coordinate} = 1245612) = 0.2$, $P(\text{Crime}(w) \mid \text{coordinate} = 1245612) = 0$, $P(\text{Crime}(p) \mid \text{coordinate} = 1245612) = 0.8$

$\Rightarrow P(\text{Theft} \mid \text{time} = 9 \wedge \text{day} = \text{Monday} \wedge \text{coordinate} = 1245612) = P(\text{Theft}) \times P(\text{Theft} \mid \text{day} = \text{Monday}) \times P(\text{Theft} \mid \text{time} = 9) \times P(\text{Theft} \mid \text{coordinate} = 1245612) = 0.104 \times 0.032 \times 0.1 \times 0 = 0$
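The combination used in this example is a plain product of the overall event probability and the per-attribute probabilities. A minimal sketch is given below, assuming the decimal readings 0.104, 0.032, 0.1, and 0 for theft and 0.065, 0.13, 0.05, and 0.2 for crime (the decimal points are lost in the flattened text); the function name combined_probability is illustrative.

```python
import math


def combined_probability(event_probability, attribute_probabilities):
    """Multiply the event probability by each per-attribute probability."""
    return event_probability * math.prod(attribute_probabilities)


# Theft on Monday at hour 9 at coordinate 1245612 (assumed decimals).
p_theft = combined_probability(0.104, [0.032, 0.1, 0.0])

# Crime under the same day, hour, and coordinate (assumed decimals).
p_crime = combined_probability(0.065, [0.13, 0.05, 0.2])

print(p_theft, p_crime)
```

Because the terms are multiplied, a single zero factor (here, no recorded theft at the coordinate) forces the whole estimate to zero, regardless of the other values.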

The example considers the defined points of a given area; that is, we have an estimation that an event occurs at point $x$ at 11 am, or at point $x$ at several hours of the day. The probability that a theft occurs increases or decreases depending on the hour and the day of the week. The user can also ask for all possible combinations near his or her location. For instance, for a point $y$, the user may want to know the probability of suffering a theft at different hours of the day, or which event (type of crime) is most common at a point $x$ at 6 pm.

Figures 5 and 6 show safe routes between two points; the circles on the map represent untrusted points. The route in Figure 5 was generated for a pedestrian by avoiding these points; although the path may be longer, it is the safer route. Figure 6 depicts the safe route generated for the same two points when the user is a driver rather than a pedestrian. In this case, the route is different because the estimation processes only the event types related to drivers.

Finally, it is possible to generate routes considering the reports of different periods of time (e.g., thefts and crimes that occurred from April to July 2015). The safe routing algorithm generates a specific route that avoids the points where theft events have occurred in the past. Nevertheless, the generated route can change if the user changes the time period taken into account (e.g., from January to March 2015) or includes all available temporal data (e.g., from 2013 to 2015). This makes it possible to know which locations are safer than others on a specific day and hour in the same geographic area.


Figure 5: Suggested safe route for a pedestrian.

Figure 6: Suggested safe route for a user in a car.

Figure 7 depicts the events that occurred to pedestrians and drivers according to official and social sources; the icons represent thefts from pedestrians and drivers, with and without violence, as well as house burglaries.

The results in Figures 5, 6, and 7 were compared with the Official Crime Map System (http://www.mapadelincuencial.org.mx). Figure 8 depicts only events that volunteers marked as points where a crime or theft with violence occurred; the difference in data is evident, and the possibilities for configuring that system are very limited in comparison with the proposed mobile information system (see Figure 9).

Figure 10 shows the hybrid map view from the web version. Events in yellow were reported by the social network source, and events in blue come from the official source.

5. Conclusions and Future Work

In this paper, a hybrid approach for finding safe routes is presented; it uses semantic processing and classification algorithms with data provided by a social network and official crime reports. As a case study, a mobile information system was developed. It generates safe routes based on crime reports of Mexico City from a large tweet repository and official databases. The data are semantically classified to determine whether a tweet describes a crime or theft event; tweets that cannot be identified as crimes are evaluated by the Bayes algorithm, which clusters them according to the description they contain. The clusters are then used to predict the possibility that a crime occurs at a specific place and hour. The spatiotemporal analysis determined the locations where the crime events occurred. Moreover, the confidence level of a location was defined and used as a parameter for computing the safer route.

Figure 7: All events that occurred from 2013 to 2015.

Figure 8: Crime map (made by volunteers).

The main contributions of this work are as follows: (1) the design of a hybrid approach based on semantic processing to retrieve crime data from a social network source, (2) the integration of crowd-sensed data with official government sources, (3) the validation of a tweet, performed by comparing the sources using $k$-fold cross validation, (4) the estimation model based on the Bayes algorithm to obtain safe routes with data provided by the mobile device, and (5) the design of a mobile information system to generate safe routes.

Figure 9: Theft/crime events that occurred in a specific time, from the mobile information system.

Figure 10: The hybrid map view from the web version.

According to the results of the estimation, the degree of certainty is around 75%. It was tested on areas with crime data: the records were intentionally removed while an original copy was kept, and the results of the estimation were then compared with the original copy. We found that the estimation has a performance of 75% for all the points of the data sample.

In addition, a metric to measure the confidence level or security of certain points and areas of Mexico City has been proposed. It allows safe routes to be found along paths with a low crime rate. Moreover, the mobile application gathers long-term statistical data with almost real-time information from citizens, who act as sensors in the city. The results of the mobile system have been tested and compared with the Crime Map System.

Future work is oriented towards evaluating the cognitive perception of people, taking into consideration points or geographic places for finding comfortable routes. Sentiment analysis will be addressed in order to incorporate this feature as a parameter in the computation of routes. Additionally, we propose integrating our mobile application with the Mexican CCTV camera systems for sensing the dynamics of certain areas in the city. This contribution is focused on developing mobile information systems for routing and urban planning in city environments. Mobile applications are increasingly becoming essential for analyzing the urban dynamics of big cities. Thus, the next generation of mobile information systems will be designed around real-time road network conditions. In addition, this generation is oriented towards improving the quality of human life and increasing the sustainability of smart cities.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

This work was partially sponsored by the Instituto Politécnico Nacional (IPN), the Consejo Nacional de Ciencia y Tecnología (CONACYT), and the Secretaría de Investigación y Posgrado (SIP) under Grants 20162006, 20161899, 20161869, and 20161611.

References

[1] H. Zhang, Y. Xu, and X. Wen, “Optimal shortest path set problem in undirected graphs,” Journal of Combinatorial Optimization, vol. 29, no. 3, pp. 511–530, 2015.

[2] W. Templeton-Steadman and R. Williams, “Information delivery system and method for mobile appliances,” US Patent App. 11/562,054, 2006.

[3] D. Reis, A. Melo, A. L. V. Coelho, and V. Furtado, “Towards optimal police patrol routes with genetic algorithms,” in Intelligence and Security Informatics: IEEE International Conference on Intelligence and Security Informatics, ISI 2006, San Diego, CA, USA, May 23-24, 2006. Proceedings, vol. 3975 of Lecture Notes in Computer Science, pp. 485–491, Springer, Berlin, Germany, 2006.

[4] V. Ceikute and C. S. Jensen, “Routing service quality - local driver behavior versus routing services,” in Proceedings of the IEEE 14th International Conference on Mobile Data Management (MDM ’13), vol. 1, pp. 97–106, IEEE, June 2013.

[5] Safety Apps, October 2015, http://www.csun.edu/police/safety-apps.

[6] E. Galbrun, K. Pelechrinis, and E. Terzi, “Urban navigation beyond shortest route: the case of safe paths,” Information Systems, vol. 57, pp. 160–171, 2016.

[7] T. Wang, C. Rudin, D. Wagner, and R. Sevieri, “Learning to detect patterns of crime,” in Machine Learning and Knowledge Discovery in Databases, vol. 8190 of Lecture Notes in Computer Science, pp. 515–530, Springer, Berlin, Germany, 2013.

[8] C.-H. Yu, W. Ding, P. Chen, and M. Morabito, “Crime forecasting using spatio-temporal pattern with ensemble learning,” in Advances in Knowledge Discovery and Data Mining: 18th Pacific-Asia Conference, PAKDD 2014, Tainan, Taiwan, May 13–16, 2014. Proceedings, Part II, Lecture Notes in Computer Science, pp. 174–185, Springer, Berlin, Germany, 2014.

[9] L. Scott and N. Warmerdam, “Extend crime analysis with ArcGIS spatial statistics tools,” ArcUser Magazine, 2005.

[10] M. Leitner, Crime Modeling and Mapping Using Geospatial Technologies, vol. 8, Springer, Dordrecht, The Netherlands, 2013.

[11] J. Kim, M. Cha, and T. Sandholm, “SocRoutes: safe routes based on tweet sentiments,” in Proceedings of the 23rd ACM International Conference on World Wide Web (WWW ’14), pp. 179–182, Seoul, South Korea, April 2014.

[12] M. Haklay and P. Weber, “OpenStreetMap: user-generated street maps,” IEEE Pervasive Computing, vol. 7, no. 4, pp. 12–18, 2008.

[13] J. M. Sánchez Bernabeu, J. V. Berná Martínez, and F. Maciá Pérez, “Smart sentinel: monitoring and prevention system in the smart cities,” International Review on Computers and Software, vol. 9, no. 9, pp. 1554–1559, 2014.

[14] Wiki crimes, 2015, http://www.wikicrimes.org.

[15] T. Moon, S. Heo, and S. Lee, “Ubiquitous crime prevention system (UCPS) for a safer city,” Procedia Environmental Sciences, vol. 22, pp. 288–301, 2014.

[16] H. Su, K. Zheng, J. Huang, H. Jeung, L. Chen, and X. Zhou, “CrowdPlanner: a crowd-based route recommendation system,” in Proceedings of the 30th IEEE International Conference on Data Engineering (ICDE ’14), pp. 1144–1155, IEEE, Chicago, Ill, USA, April 2014.

[17] V. Arnaboldi, M. Conti, and F. Delmastro, “Implementation of CAMEO: a context-aware middleware for opportunistic mobile social networks,” in Proceedings of the IEEE International Symposium on a World of Wireless, Mobile and Multimedia Networks (WoWMoM ’11), pp. 1–3, Lucca, Italy, June 2011.

[18] M. Nagarajan, K. Gomadam, A. P. Sheth, A. Ranabahu, R. Mutharaju, and A. Jadhav, “Spatio-temporal-thematic analysis of citizen sensor data: challenges and experiences,” in Web Information Systems Engineering - WISE 2009, vol. 5802 of Lecture Notes in Computer Science, pp. 539–553, Springer, Berlin, Germany, 2009.

[19] N. Powdthavee, “Unhappiness and crime: evidence from South Africa,” Economica, vol. 72, no. 287, pp. 531–547, 2005.

[20] J. Ratcliffe, “Crime mapping: spatial and temporal challenges,” in Handbook of Quantitative Criminology, pp. 5–24, Springer, New York, NY, USA, 2010.

[21] P. Gupta, G. N. Purohit, and A. Dadhich, “Crime prevention through alternate route finding in traffic surveillance using CCTV cameras,” International Journal of Engineering and Advanced Technology, vol. 2, no. 5, pp. 414–418, 2013.

[22] H. Chen, T. Cheng, and S. Wise, “Designing daily patrol routes for policing based on ANT colony algorithm,” ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. II-4/W2, pp. 103–109, 2015.

[23] D. Dzemydiene and E. Kazemikaitiene, “Ontology-based decision support system for crime investigation processes,” in Information Systems Development, pp. 427–438, Springer, New York, NY, USA, 2005.

[24] Y. Chabot, A. Bertaux, C. Nicolle, and T. Kechadi, “A complete formalized knowledge representation model for advanced digital forensics timeline analysis,” Digital Investigation, vol. 15, pp. 83–100, 2015.

[25] E. Hatcher, O. Gospodnetic, and M. McCandless, Lucene in Action, Manning Publications, Greenwich, Conn, USA, 2004.

[26] F. Mata-Rivera, M. Torres-Ruiz, G. Guzmán, M. Moreno-Ibarra, and R. Quintero, “A collaborative learning approach for geographic information retrieval based on social networks,” Computers in Human Behavior, vol. 51, pp. 829–842, 2015.


Page 7: A Mobile Information System Based on Crowd-Sensed and Official ...

Mobile Information Systems 7

Input: $N = \{n_1, n_2, \ldots, n_k\}$: set of nodes; $n_s$: start node; $n_f$: finish node
Result: $D = \{d_1, d_2, \ldots, d_k\}$: set of distance values (weighted crime rate)
(1) Assign distance values $d(n_s) = 0$, $d(n_i) = \infty\ \forall n_i \neq n_s$
(2) Let $U = N - \{n_s\}$
(3) Let current node $n_c = n_s$
(4) while exists($n_c$) do
(5)     foreach neighbor($n_c$) do
(6)         Let $n_j = \mathit{neighbor}(n_c)$
(7)         if $d(n_j) > \mathit{length}(n_j, n_c) + d(n_c)$ then
(8)             $d(n_j) = \mathit{length}(n_j, n_c) + d(n_c)$
(9)     $U = U - \{n_j\}$
(10)    if $n_f \notin U$ or $\min(\mathit{length}(n_i, n_j)) = \infty\ \forall n_i, n_j \in U$ then
(11)        return $d(n_k)\ \forall n_k \in N$
(12)    else
(13)        $n_c = d(n_i) = \min(d(n_j))\ \forall n_j \in U$

Algorithm 3: The safe routing algorithm.

The clustering approach obtains probabilities that represent the estimation for specific crimes that were previously classified semantically. These probabilities define possible hot spots that represent events directly associated with streets. In addition, the method classifies incoming data generated by the categorization task by searching for patterns that were defined as words associated with a crime. It generates predictive spatiotemporal patterns according to the indicated parameters. Thus, the goal is to find, for specific locations on a given route, the probability that a criminal event occurs when a user travels on that route. The probabilities are based on the following combinations:

Let $P(\mathit{coordinate} = [x, y] \mid \mathit{event})$ be the probability of an event directly related to its location, and let $P(\mathit{day}\ r\ \mathit{time} \mid \mathit{event})$ be the probability of an event taking into account its temporal component. The computation of these probabilities is presented as follows:

$P(c \mid x)$ represents the posterior probability; it is defined as the probability that an event occurs, considering past events.

$P(x)$ represents the total probability, which denotes the number of times that some given attributes appear in the events (e.g., theft).

$P(x \mid c)$ represents the conditional probability, which is denoted by the number of times that a target appears in each attribute (e.g., car), taking into account the total number of times that the target appears in all attributes.

$P(c)$ is defined as the a priori probability, which represents the number of times that a target appears in all events.
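The four quantities defined above are the terms of Bayes' rule. The block below only restates that relation in the same notation, with $c$ the target class (e.g., theft) and $x$ the observed attribute (e.g., a day of the week); it is the textbook identity and is not claimed to be the exact form used in the implementation.

$$P(c \mid x) = \frac{P(x \mid c)\, P(c)}{P(x)}$$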

Events, attributes, and locations are identified and classified by the Bayes algorithm. The clusters are generated according to the defined patterns that were used in the training process. In particular, the places are described as trusted or untrusted. The probability is used as a weight and is normalized to the range [0, 1], where 0 represents the lowest and 1 the highest likelihood. These values are then used in the route generation. An example of the clustering computation with the obtained likelihood is presented as follows:

Classified Event (“12052014”, “3”, Monday, 12255928567), “Event”: “Theft”, “ID Coordinate”: “30339694”, “latitude”: 19.4038744, “longitude”: 99.150226, “Probability”: “0.0009295401”, “diagnosed weight 0.89”.
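How the probabilities are rescaled to [0, 1] is not detailed here. As an illustration only, the sketch below assumes min-max scaling over the probabilities of a set of candidate points; the function name `normalize` and the sample values are hypothetical and are not taken from the system.

```python
def normalize(probabilities):
    """Min-max scale a list of probabilities into the range [0, 1].

    Assumption: the lowest probability maps to 0 and the highest to 1.
    """
    low, high = min(probabilities), max(probabilities)
    if high == low:
        # All points are equally likely; no point stands out as less trusted.
        return [0.0 for _ in probabilities]
    return [(p - low) / (high - low) for p in probabilities]


# Hypothetical raw probabilities for three points on a street.
raw = [0.0002, 0.0006, 0.0010]
weights = normalize(raw)
print(weights)
```

With this choice, a weight close to 1 marks the least trusted point among those compared, which matches the convention that 1 denotes the highest likelihood.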

3.5. Stage 5: Generation of Safe Routes. The safe routes are visualized in the mobile mapping application, which was developed as a client using REST (Representational State Transfer). REST is a web service technology in which the client generates requests and the server receives and parses the data. The spatial features are supported by OpenStreetMap [12]. The safe route is obtained by an adaptation of the Dijkstra algorithm in which each node in the network is assigned an average weight obtained from the number of crimes at a specific point or geographic area. The values generated by the Bayes algorithm reflect the probability that an event such as “theft/crime” does or does not occur.

Let $n_s$ be the starting node, called the initial node, and let $d$ be the distance of node $n_k$ from the initial node to $n_1$. The safe route algorithm assigns initial distance values based on the weighted crime rate for these points. The weighted crime rate is computed from the number of crime/theft complaints registered at a specific point and time. Algorithm 3 describes the process for obtaining the safe routes.

Thus, this algorithm uses the data sent by the mobile application (origin and destination points) to return a route that avoids locations where crime events have occurred. The confidence level is a metric computed from the number of crime incidents that have occurred at a specific point and time; it is used as a weighted value for the safe routing algorithm. The displayed route is marked with the coordinates returned by the algorithm in order to visualize the route on the mobile mapping application.
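The idea of running a shortest-path search over crime weights instead of physical distances can be sketched as follows. This is the textbook priority-queue form of Dijkstra's algorithm, not a transcription of Algorithm 3. It assumes that each edge carries the crime weight of the node it leads to, which is one possible way of mapping node weights onto edge costs; the graph `streets` and the function name `safest_route` are hypothetical.

```python
import heapq


def safest_route(graph, start, finish):
    """Dijkstra's algorithm over crime-weighted edges.

    graph: dict mapping node -> dict of neighbor -> non-negative weight.
    Returns (total_weight, path), or (inf, []) if finish is unreachable.
    """
    dist = {start: 0.0}
    prev = {}
    visited = set()
    queue = [(0.0, start)]
    while queue:
        d, node = heapq.heappop(queue)
        if node in visited:
            continue  # stale queue entry, a shorter path was already settled
        visited.add(node)
        if node == finish:
            break
        for neighbor, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(queue, (candidate, neighbor))
    if finish not in dist:
        return float("inf"), []
    # Walk the predecessor links back from finish to start.
    path = [finish]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return dist[finish], path[::-1]


# Hypothetical street graph: weights are normalized crime rates in [0, 1].
streets = {
    "A": {"B": 0.9, "C": 0.1},
    "B": {"D": 0.1},
    "C": {"E": 0.1},
    "E": {"D": 0.1},
    "D": {},
}
cost, route = safest_route(streets, "A", "D")
print(route, cost)
```

In this toy graph the two-hop path through B is shorter in hops but crosses a high-crime node, so the search prefers the longer path through C and E, which mirrors the trade-off between route length and safety described in the text.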

The weights can also be adjusted according to spatial and temporal values such as date (day, month, and year), location (point or area), and time (hour or period). At the interface level, users can configure the route search by setting some parameters: the data source used to generate the route (social, official, or integrated) and the mode of transportation (e.g., walking or by car).
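One way to picture these user-configurable parameters is as a query object that selects which events contribute to the weights. The sketch below is an assumption about how such a filter could look; the class names `Event` and `RouteQuery`, their fields, and the sample records are hypothetical and do not come from the application.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    kind: str      # e.g. "theft"
    source: str    # "social" or "official"
    victim: str    # "transient" or "driver"
    day: date


@dataclass(frozen=True)
class RouteQuery:
    source: str      # "social", "official", or "integrated"
    transport: str   # "walking" or "car"
    start: date      # first day of the period considered
    end: date        # last day of the period considered


def relevant_events(events, query):
    """Keep only the events that should influence the weights for this query."""
    victim = "driver" if query.transport == "car" else "transient"
    return [
        e for e in events
        if query.start <= e.day <= query.end
        and e.victim == victim
        and (query.source == "integrated" or e.source == query.source)
    ]


# Hypothetical events and a query for a walking user, April to July 2015.
events = [
    Event("theft", "social", "transient", date(2015, 5, 1)),
    Event("theft", "official", "driver", date(2015, 5, 2)),
    Event("crime", "official", "transient", date(2014, 1, 1)),
]
query = RouteQuery("integrated", "walking", date(2015, 4, 1), date(2015, 7, 31))
selected = relevant_events(events, query)
print(selected)
```

Changing the period, the source, or the transport mode changes the set of events, and therefore the weights that the routing step sees.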

4. Experimental Results and Evaluation

The mobile application was implemented in Android 4.0, and the tests were performed on mobile devices. In this section, the results based on information from Mexico City are presented. The repository is composed of 5441 events, which were collected and processed from the tweets and correlated with official reports. From this dataset, a frequency table is generated that indicates the number of times each attribute appears for given events such as theft and crime.
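A frequency table of this kind can be sketched with a counter keyed by event type and attribute value. The records below are invented for illustration, and the helper `conditional` is a hypothetical name; the sketch only shows how counts turn into an estimate of the probability of an attribute given an event type.

```python
from collections import Counter

# Hypothetical classified events: (event_type, day_of_week).
events = [
    ("theft", "Monday"),
    ("theft", "Friday"),
    ("crime", "Monday"),
    ("theft", "Monday"),
]

# Frequency table: how often each attribute value appears for each event type.
frequency = Counter(events)


def conditional(day, event_type):
    """Estimate P(day | event_type) from the frequency table."""
    total = sum(n for (etype, _), n in frequency.items() if etype == event_type)
    return frequency[(event_type, day)] / total if total else 0.0


print(frequency[("theft", "Monday")])
print(conditional("Monday", "theft"))
```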

The frequency values are defined as weights when deriving a safe route. All values are assigned to the corresponding nodes in the network. In addition, the probability that an event occurs at some location on a specific date is computed and stored in a vector table. This also allows a comparison of crime probabilities at different places. This sort of summarization makes it possible to evaluate the probability that an event occurs at a specific location and/or a particular time (e.g., the probability that a crime or theft occurs on “Avenue Eje Central on Monday”). The probability that an event occurs at a given place and time is illustrated by the following example: “On Monday at 9:00 am in Iztapalapa.”

$P(x) = P(\text{Monday}) = 220/2149$

$P(x \mid c) = P(\text{Monday} \mid \text{Theft}) = 18/218$

$P(c) = P(\text{Theft}) = 218/2149$

$P(c \mid x) = P(\text{Theft} \mid \text{Monday}) = \dfrac{P(\text{Theft}) \times P(\text{Monday} \mid \text{Theft})}{P(\text{Monday})} = 0.0818$
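The arithmetic of this example can be checked with exact fractions. The sketch assumes that the three inputs are the ratios 220/2149, 18/218, and 218/2149, that is, 2149 events in total, 220 of them on a Monday, and 218 thefts of which 18 fell on a Monday; the variable names are illustrative.

```python
from fractions import Fraction

# Ratios assumed from the worked example above.
p_monday = Fraction(220, 2149)             # P(x)
p_theft = Fraction(218, 2149)              # P(c)
p_monday_given_theft = Fraction(18, 218)   # P(x | c)

# Bayes' rule: P(c | x) = P(x | c) * P(c) / P(x)
p_theft_given_monday = p_monday_given_theft * p_theft / p_monday

print(p_theft_given_monday)                   # exact fraction
print(round(float(p_theft_given_monday), 4))  # decimal form
```

The totals cancel, so the result reduces to 18/220: of the 220 Monday events, 18 are thefts.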

(1) The probabilities of some classified crime events are computed.

(2) The selected combinations (e.g., theft at a particular location and time) are defined. However, this problem is NP-hard; thus, a semantic classification has been applied to restrict the search space and decrease the computational complexity.

(3) This provides the probabilities for all events that occur for each particular domain value.

An example of computing the probability for crime events, such as the probability that “a theft occurs at a given location on Friday or a crime occurs on Tuesday at noon,” is presented as follows:

$P(\text{Theft}) = 0.104$, $P(\text{Crime}) = 0.065$, $P(\text{Crime}(w)) = 0.017$, $P(\text{Crime}(p)) = 0.812$

$P(\text{Theft} \mid \text{day} = \text{Monday}) = 0.032$, $P(\text{Crime} \mid \text{day} = \text{Monday}) = 0.13$, $P(\text{Crime}(w) \mid \text{day} = \text{Monday}) = 0.017$, $P(\text{Crime}(p) \mid \text{day} = \text{Monday}) = 0.813$

$P(\text{Theft} \mid \text{hour} = 9) = 0.1$, $P(\text{Crime} \mid \text{hour} = 9) = 0.05$, $P(\text{Crime}(w) \mid \text{hour} = 9) = 0.05$, $P(\text{Crime}(p) \mid \text{hour} = 9) = 0.8$

$P(\text{Theft} \mid \text{coordinate} = 1245612) = 0$, $P(\text{Crime} \mid \text{coordinate} = 1245612) = 0.2$, $P(\text{Crime}(w) \mid \text{coordinate} = 1245612) = 0$, $P(\text{Crime}(p) \mid \text{coordinate} = 1245612) = 0.8$

$\Rightarrow P(\text{Theft} \mid \text{hour} = 9 \wedge \text{day} = \text{Monday} \wedge \text{coordinate} = 1245612) = P(\text{Theft}) \times P(\text{Theft} \mid \text{day} = \text{Monday}) \times P(\text{Theft} \mid \text{hour} = 9) \times P(\text{Theft} \mid \text{coordinate} = 1245612) = 0.104 \times 0.032 \times 0.1 \times 0 = 0$
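The combined estimate in this example is a plain product of the prior and the per-attribute factors. A minimal sketch of that product follows, assuming the decimal values 0.104, 0.032, 0.1, and 0 for the theft case; the function name `combined_score` and the dictionary keys are hypothetical.

```python
from math import prod

# Values assumed from the worked example; the keys are only labels.
p_theft = 0.104
conditionals = {
    "day=Monday": 0.032,
    "hour=9": 0.1,
    "coordinate=1245612": 0.0,
}


def combined_score(prior, factors):
    """Multiply the prior by every conditional factor, as in the example."""
    return prior * prod(factors)


score = combined_score(p_theft, conditionals.values())
print(score)
```

Because the factors are multiplied, a single zero (here, no theft recorded at that coordinate) drives the whole estimate to zero regardless of the day and hour factors.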

The example takes into consideration the defined points of a given area; that is, we have an estimation that an event occurs at point $x$ at 11 am, or at point $x$ at several hours of the day. The probability that a theft occurs increases or decreases depending on the hour and the day of the week. The user can also ask for all possible combinations near their location. For instance, for a point $y$, the user may want to know the probability of suffering a theft at different hours of the day, or which event (type of crime) is the most common at point $x$ at 6 pm.

Figures 5 and 6 show safe routes between two points; the circles on the map represent untrusted points. The route for a transient user (Figure 5) was generated by avoiding these points; although the path may be longer, it is the safer route. Figure 6 depicts the safe route generated for the same two points when the user is a driver rather than a transient. In this case the route is different because the estimation processes only the event types related to drivers.

Finally, it is possible to generate routes considering the data reports of different periods of time (e.g., thefts and crimes that occurred from April to July 2015). The safe routing algorithm generates a specific route that avoids the points where theft events have occurred in the past. Nevertheless, the generated route can change if the user changes the time period taken into account (e.g., from January to March 2015) or includes all the temporal data available (e.g., from 2013 to 2015). This allows us to know which locations are safer than others on a specific day and hour in the same geographic area.


Figure 5: Suggested safe route for a transient.

Figure 6: Suggested safe route for a user in a car.

Figure 7 depicts the events that occurred to transients and drivers, from official and social sources; the icons represent theft from transients and drivers, with and without violence, as well as house theft.

On the other hand, the results shown in Figures 5, 6, and 7 were compared with the Official Crime Map System (http://www.mapadelincuencial.org.mx). Figure 8 depicts only events that volunteers marked as points where a crime or theft with violence occurred; the difference in the data is evident, and the possibilities for configuring that system are very limited in comparison with the proposed mobile information system (see Figure 9).

Figure 10 shows the hybrid map view from the web version. The events in yellow were reported by the social network source, and the events in blue come from the official source.

5. Conclusions and Future Work

Figure 7: All events that occurred from 2013 to 2015.

Figure 8: Crime map (made by volunteers).

In this paper, a hybrid approach for finding safe routes using semantic processing and classification algorithms with data provided by a social network and official crime reports is presented. As a case study, a mobile information system was developed. It generates safe routes based on crime reports for Mexico City drawn from a large tweet repository and official databases. The data are semantically classified to determine whether a tweet describes a crime event or theft; tweets that cannot be identified as crimes are evaluated by the Bayes algorithm, which clusters them according to the description they contain. The clusters are then used to predict the possibility that a crime occurs at a specific place and hour. The spatiotemporal analysis determined the locations where the crime events occurred. Moreover, the confidence level of a location was defined and used as a parameter for computing the safer route.

The main contributions of this work are as follows: (1) the design of a hybrid approach based on semantic processing to retrieve crime data from a social network source; (2) the integration of crowd-sensed data with official government sources; (3) the validation of tweets, performed by comparing the sources using $k$-fold cross validation; (4) the estimation model based on the Bayes algorithm to obtain safe routes with data provided by the mobile device; and (5) the design of a mobile information system to generate safe routes.

Figure 9: Theft/crime events that occurred in a specific time, from the mobile information system.

Figure 10: The hybrid map view from the web version.

According to the results of the estimation, the certainty degree is around 75%. It was tested on areas with crime data: the records were intentionally removed while an original copy was kept. The results of the estimation were then compared with the original copy, and we found that the estimation achieves a performance of 75% over all the points of the data sample.
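The test procedure described here, hiding known records and comparing the estimates against the retained copy, can be sketched as follows. The matching criterion is an assumption: an estimate counts as a hit when it lies within a tolerance of the held-out value. The function name `holdout_score`, the tolerance, and all values are hypothetical toy data and do not reproduce the reported experiment.

```python
def holdout_score(original, estimate, tolerance=0.1):
    """Share of points whose estimate is within `tolerance` of the held-out value.

    original: dict mapping point id -> value kept in the original copy.
    estimate: callable returning the estimated value for a point id.
    """
    hits = sum(
        1 for point, truth in original.items()
        if abs(estimate(point) - truth) <= tolerance
    )
    return hits / len(original)


# Toy data: values removed from the working set, and the estimates produced.
held_out = {"p1": 0.8, "p2": 0.2, "p3": 0.5, "p4": 0.9, "p5": 0.1}
guesses = {"p1": 0.75, "p2": 0.25, "p3": 0.9, "p4": 0.85, "p5": 0.12}

score = holdout_score(held_out, guesses.get)
print(score)
```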

In addition, a metric to measure the confidence level or security of certain points and areas of Mexico City has been proposed. It allows safe routes to be found along paths with a low crime rate. Moreover, the mobile application gathers long-term statistical data with almost real-time information from citizens, who act as sensors in the city. The results of the mobile system have been tested and compared with the Crime Map System.

Future work is oriented towards evaluating the cognitive perception of people, taking into consideration points or geographic places, for finding comfortable routes. Sentiment analysis will be addressed in order to incorporate this feature as a parameter in the computation of routes. Additionally, we propose integrating our mobile application with the Mexican CCTV camera systems for sensing the dynamics of certain areas of the city. This contribution is focused on developing mobile information systems for routing and urban planning in city environments. Mobile applications are increasingly becoming essential for analyzing the urban dynamics of big cities. Thus, the next generation of mobile information systems will be designed around real-time road network conditions. In addition, this generation is oriented towards improving the quality of human life and increasing the sustainability of smart cities.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

This work was partially sponsored by the Instituto Politécnico Nacional (IPN), the Consejo Nacional de Ciencia y Tecnología (CONACYT), and the Secretaría de Investigación y Posgrado (SIP) under Grants 20162006, 20161899, 20161869, and 20161611.

References

[1] H. Zhang, Y. Xu, and X. Wen, “Optimal shortest path set problem in undirected graphs,” Journal of Combinatorial Optimization, vol. 29, no. 3, pp. 511–530, 2015.

[2] W. Templeton-Steadman and R. Williams, “Information delivery system and method for mobile appliances,” US Patent App. 11/562,054, 2006.

[3] D. Reis, A. Melo, A. L. V. Coelho, and V. Furtado, “Towards optimal police patrol routes with genetic algorithms,” in Intelligence and Security Informatics: IEEE International Conference on Intelligence and Security Informatics, ISI 2006, San Diego, CA, USA, May 23-24, 2006. Proceedings, vol. 3975 of Lecture Notes in Computer Science, pp. 485–491, Springer, Berlin, Germany, 2006.

[4] V. Ceikute and C. S. Jensen, “Routing service quality - local driver behavior versus routing services,” in Proceedings of the IEEE 14th International Conference on Mobile Data Management (MDM ’13), vol. 1, pp. 97–106, IEEE, June 2013.

[5] Safety Apps, October 2015, http://www.csun.edu/police/safety-apps.

[6] E. Galbrun, K. Pelechrinis, and E. Terzi, “Urban navigation beyond shortest route: the case of safe paths,” Information Systems, vol. 57, pp. 160–171, 2016.

[7] T. Wang, C. Rudin, D. Wagner, and R. Sevieri, “Learning to detect patterns of crime,” in Machine Learning and Knowledge Discovery in Databases, vol. 8190 of Lecture Notes in Computer Science, pp. 515–530, Springer, Berlin, Germany, 2013.

[8] C.-H. Yu, W. Ding, P. Chen, and M. Morabito, “Crime forecasting using spatio-temporal pattern with ensemble learning,” in Advances in Knowledge Discovery and Data Mining: 18th Pacific-Asia Conference, PAKDD 2014, Tainan, Taiwan, May 13–16, 2014. Proceedings, Part II, Lecture Notes in Computer Science, pp. 174–185, Springer, Berlin, Germany, 2014.

[9] L. Scott and N. Warmerdam, “Extend crime analysis with arcgis spatial statistics tools,” ArcUser Magazine, 2005.

[10] M. Leitner, Crime Modeling and Mapping Using Geospatial Technologies, vol. 8, Springer, Dordrecht, The Netherlands, 2013.

[11] J. Kim, M. Cha, and T. Sandholm, “Socroutes: safe routes based on tweet sentiments,” in Proceedings of the 23rd ACM International Conference on World Wide Web (WWW ’14), pp. 179–182, Seoul, South Korea, April 2014.

[12] M. Haklay and P. Weber, “Openstreetmap: user-generated street maps,” IEEE Pervasive Computing, vol. 7, no. 4, pp. 12–18, 2008.

[13] J. M. Sánchez Bernabeu, J. V. Berná Martínez, and F. Maciá Pérez, “Smart sentinel: monitoring and prevention system in the smart cities,” International Review on Computers and Software, vol. 9, no. 9, pp. 1554–1559, 2014.

[14] Wiki crimes, 2015, http://www.wikicrimes.org.

[15] T. Moon, S. Heo, and S. Lee, “Ubiquitous crime prevention system (UCPS) for a safer city,” Procedia Environmental Sciences, vol. 22, pp. 288–301, 2014.

[16] H. Su, K. Zheng, J. Huang, H. Jeung, L. Chen, and X. Zhou, “Crowdplanner: a crowd-based route recommendation system,” in Proceedings of the 30th IEEE International Conference on Data Engineering (ICDE ’14), pp. 1144–1155, IEEE, Chicago, Ill, USA, April 2014.

[17] V. Arnaboldi, M. Conti, and F. Delmastro, “Implementation of CAMEO: a context-aware middleware for opportunistic mobile social networks,” in Proceedings of the IEEE International Symposium on a World of Wireless, Mobile and Multimedia Networks (WoWMoM ’11), pp. 1–3, Lucca, Italy, June 2011.

[18] M. Nagarajan, K. Gomadam, A. P. Sheth, A. Ranabahu, R. Mutharaju, and A. Jadhav, “Spatio-temporal-thematic analysis of citizen sensor data: challenges and experiences,” in Web Information Systems Engineering - WISE 2009, vol. 5802 of Lecture Notes in Computer Science, pp. 539–553, Springer, Berlin, Germany, 2009.

[19] N. Powdthavee, “Unhappiness and crime: evidence from South Africa,” Economica, vol. 72, no. 287, pp. 531–547, 2005.

[20] J. Ratcliffe, “Crime mapping: spatial and temporal challenges,” in Handbook of Quantitative Criminology, pp. 5–24, Springer, New York, NY, USA, 2010.

[21] P. Gupta, G. N. Purohit, and A. Dadhich, “Crime prevention through alternate route finding in traffic surveillance using cctv cameras,” International Journal of Engineering and Advanced Technology, vol. 2, no. 5, pp. 414–418, 2013.

[22] H. Chen, T. Cheng, and S. Wise, “Designing daily patrol routes for policing based on ANT colony algorithm,” ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. II-4/W2, pp. 103–109, 2015.

[23] D. Dzemydiene and E. Kazemikaitiene, “Ontology-based decision support system for crime investigation processes,” in Information Systems Development, pp. 427–438, Springer, New York, NY, USA, 2005.

[24] Y. Chabot, A. Bertaux, C. Nicolle, and T. Kechadi, “A complete formalized knowledge representation model for advanced digital forensics timeline analysis,” Digital Investigation, vol. 15, pp. 83–100, 2015.

[25] E. Hatcher, O. Gospodnetic, and M. McCandless, Lucene in Action, Manning Publications, Greenwich, Conn, USA, 2004.

[26] F. Mata-Rivera, M. Torres-Ruiz, G. Guzmán, M. Moreno-Ibarra, and R. Quintero, “A collaborative learning approach for geographic information retrieval based on social networks,” Computers in Human Behavior, vol. 51, pp. 829–842, 2015.



[5] Safety Apps October 2015 httpwwwcsunedupolicesafety-apps

[6] E Galbrun K Pelechrinis and E Terzi ldquoUrban navigationbeyond shortest route the case of safe pathsrdquo InformationSystems vol 57 pp 160ndash171 2016

[7] T Wang C Rudin D Wagner and R Sevieri ldquoLearning todetect patterns of crimerdquo in Machine Learning and KnowledgeDiscovery in Databases vol 8190 of Lecture Notes in ComputerScience pp 515ndash530 Springer Berlin Germany 2013

[8] C-H Yu W Ding P Chen and M Morabito ldquoCrime forecast-ing using spatio-temporal pattern with ensemble learningrdquo inAdvances in Knowledge Discovery andDataMining 18th Pacific-AsiaConference PAKDD2014 Tainan TaiwanMay 13ndash16 2014

Mobile Information Systems 11

Proceedings Part II Lecture Notes in Computer Science pp174ndash185 Springer Berlin Germany 2014

[9] L Scott and NWarmerdam ldquoExtend crime analysis with arcgisspatial statistics toolsrdquo ArcUser Magazine 2005

[10] M Leitner Crime Modeling and Mapping Using GeospatialTechnologies vol 8 Springer DordrechtTheNetherlands 2013

[11] J Kim M Cha and T Sandholm ldquoSocroutes safe routesbased on tweet sentimentsrdquo in Proceedings of the 23rd ACMInternational Conference on World Wide Web (WWW rsquo14) pp179ndash182 Seoul South Korea April 2014

[12] MHaklay and PWeber ldquoOpenstreetmap user-generated streetmapsrdquo IEEE Pervasive Computing vol 7 no 4 pp 12ndash18 2008

[13] J M Sanchez Bernabeu J V Berna Martınez and F MaciaPerez ldquoSmart sentinelmonitoring and prevention system in thesmart citiesrdquo International Review on Computers and Softwarevol 9 no 9 pp 1554ndash1559 2014

[14] Wiki crimes 2015 httpwwwwikicrimesorg[15] T Moon S Heo and S Lee ldquoUbiquitous crime prevention

system (UCPS) for a safer cityrdquoProcedia Environmental Sciencesvol 22 pp 288ndash301 2014

[16] H Su K Zheng J Huang H Jeung L Chen and X ZhouldquoCrowdplanner a crowd-based route recommendation systemrdquoinProceedings of the 30th IEEE International Conference onDataEngineering (ICDE rsquo14) pp 1144ndash1155 IEEE Chicago Ill USAApril 2014

[17] V Arnaboldi M Conti and F Delmastro ldquoImplementationof CAMEO a context-aware middleware for opportunisticmobile social networksrdquo inProceedings of the IEEE InternationalSymposium on a World of Wireless Mobile and MultimediaNetworks (WoWMoM rsquo11) pp 1ndash3 Lucca Italy June 2011

[18] M Nagarajan K Gomadam A P Sheth A Ranabahu RMutharaju and A Jadhav ldquoSpatio-temporal-thematic analysisof citizen sensor data challenges and experiencesrdquo in WebInformation Systems EngineeringmdashWISE 2009 vol 5802 of Lec-ture Notes in Computer Science pp 539ndash553 Springer BerlinGermany 2009

[19] N Powdthavee ldquoUnhappiness and crime evidence from SouthAfricardquo Economica vol 72 no 287 pp 531ndash547 2005

[20] J Ratcliffe ldquoCrime mapping spatial and temporal challengesrdquoin Handbook of Quantitative Criminology pp 5ndash24 SpringerNew York NY USA 2010

[21] P Gupta G N Purohit and A Dadhich ldquoCrime preventionthrough alternate route finding in traffic surveillance using cctvcamerasrdquo International Journal of Engineering and AdvancedTechnology vol 2 no 5 pp 414ndash418 2013

[22] H Chen T Cheng and S Wise ldquoDesigning daily patrol routesfor policing based on ANT colony algorithmrdquo ISPRS Annalsof the Photogrammetry Remote Sensing and Spatial InformationSciences vol II-4W2 pp 103ndash109 2015

[23] D Dzemydiene and E Kazemikaitiene ldquoOntology-based deci-sion support system for crime investigation processesrdquo inInformation Systems Development pp 427ndash438 Springer NewYork NY USA 2005

[24] Y Chabot A Bertaux C Nicolle and T Kechadi ldquoA completeformalized knowledge representation model for advanced dig-ital forensics timeline analysisrdquoDigital Investigation vol 15 pp83ndash100 2015

[25] E Hatcher O Gospodnetic and M McCandless Lucene inAction Manning Publications Greenwich Conn USA 2004

[26] FMata-RiveraMTorres-RuizGGuzmanMMoreno-Ibarraand R Quintero ldquoA collaborative learning approach for geo-graphic information retrieval based on social networksrdquo Com-puters in Human Behavior vol 51 pp 829ndash842 2015



Figure 5: Suggested safe route for a pedestrian.

Figure 6: Suggested safe route for a user in a car.

Figure 7 depicts the events that affected pedestrians and drivers, as reported by official and social sources. The icons represent thefts from pedestrians and drivers, with and without violence, as well as house burglaries.

The results in Figures 5, 6, and 7 were compared with the Official Crime Map System (http://www.mapadelincuencial.org.mx). Figure 8 depicts only the events that volunteers marked as points where a crime or a theft with violence occurred. The difference in the data is evident, and the possibilities for configuring that system are very limited in comparison with the proposed mobile information system (see Figure 9).

Figure 10 shows the hybrid map view from the web version. Events in yellow were reported by the social network source, and events in blue come from the official source.

Figure 7: All events that occurred from 2013 to 2015.

Figure 8: Crime map (made by volunteers).

5. Conclusions and Future Work

In this paper, a hybrid approach for finding safe routes is presented. It applies semantic processing and classification algorithms to data provided by a social network and by official crime reports. As a case study, a mobile information system was developed. It generates safe routes based on crime reports for Mexico City drawn from a large tweet repository and official databases. The data are semantically classified to determine whether a tweet describes a crime event or theft. Tweets that cannot be identified as crimes are evaluated by the Bayes algorithm, which clusters them according to the description they contain. The clusters are then used to predict the possibility that a crime will occur at a specific place and hour. The spatiotemporal analysis determined the locations where the crime events occurred. Moreover, a confidence level was defined for each location and used as a parameter for computing the safer route.
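The Bayes step summarized above can be illustrated with a minimal Python sketch of a multinomial naive Bayes text classifier with add-one smoothing. The training tweets, the labels `crime` and `other`, and the function names are assumptions invented for this example; the features and implementation used by the actual system are not specified in this section.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(samples):
    """Count documents per label and words per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in samples:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return {"labels": label_counts, "words": word_counts, "vocab": vocabulary}


def classify(model, text):
    """Return the label with the highest posterior log-probability."""
    total_docs = sum(model["labels"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, doc_count in model["labels"].items():
        score = math.log(doc_count / total_docs)  # log prior
        total_words = sum(model["words"][label].values())
        for word in text.lower().split():
            count = model["words"][label][word]
            # Add-one (Laplace) smoothing keeps unseen words from zeroing the score.
            score += math.log((count + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy data, for illustration only.
TRAINING = [
    ("robbery at gunpoint near the metro station", "crime"),
    ("my phone was stolen on the bus", "crime"),
    ("car theft reported on the avenue", "crime"),
    ("great tacos near the metro station", "other"),
    ("heavy traffic on the avenue today", "other"),
    ("lovely concert in the park tonight", "other"),
]
model = train_naive_bayes(TRAINING)
print(classify(model, "my phone was stolen"))    # crime
print(classify(model, "great concert tonight"))  # other
```

In a setting like the one described, the predicted label (or the posterior score itself) could feed the per-location estimate; that coupling is not shown here.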

Figure 9: Theft/crime events that occurred at a specific time, as shown by the mobile information system.

Figure 10: The hybrid map view from the web version.

The main contributions of this work are as follows: (1) the design of a hybrid approach based on semantic processing to retrieve crime data from a social network source; (2) the integration of crowd-sensed data with official government sources; (3) the validation of tweets, performed by comparing the sources using k-fold cross-validation; (4) the estimation model based on the Bayes algorithm to obtain safe routes with data provided by the mobile device; and (5) the design of a mobile information system to generate safe routes.
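For readers unfamiliar with the k-fold cross-validation named in contribution (3), the following Python sketch shows the general procedure: the data are split into k folds, and each fold is used once for testing while the remaining folds are used for training. The lookup "classifier" and the sample keywords are hypothetical stand-ins chosen to keep the example short; they are not the validation procedure of the paper.

```python
def k_fold_splits(items, k):
    """Yield (train, test) lists; every item lands in exactly one test fold."""
    if not 2 <= k <= len(items):
        raise ValueError("k must be between 2 and len(items)")
    # Spread the remainder over the first folds so sizes differ by at most one.
    sizes = [len(items) // k + (1 if i < len(items) % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = items[start:start + size]
        train = items[:start] + items[start + size:]
        yield train, test
        start += size


def cross_validate(items, k, train_fn, predict_fn):
    """Return the mean accuracy over k folds of (features, label) pairs."""
    accuracies = []
    for train, test in k_fold_splits(items, k):
        fitted = train_fn(train)
        correct = sum(1 for x, y in test if predict_fn(fitted, x) == y)
        accuracies.append(correct / len(test))
    return sum(accuracies) / len(accuracies)


# Hypothetical stand-in classifier: remember the label seen for each keyword.
def train_lookup(train):
    return dict(train)


def predict_lookup(fitted, keyword):
    return fitted.get(keyword, "other")


# Invented toy samples, for illustration only.
SAMPLES = [
    ("robbery", "crime"), ("assault", "crime"), ("robbery", "crime"),
    ("party", "other"), ("assault", "crime"), ("party", "other"),
]
print(cross_validate(SAMPLES, 3, train_lookup, predict_lookup))  # 1.0
```

Any train/predict pair with the same two-function shape can be passed in, so the same harness would apply to a Bayes classifier.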

According to the results of the estimation, the certainty degree is around 75%. It was tested on areas with crime data: records were intentionally removed while an original copy was kept, and the results of the estimation were then compared with the original copy. We found that the estimation achieves a performance of 75% for all the points of the data sample.
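The evaluation just described (hide known records, estimate them, compare against the kept copy) can be sketched in Python as follows. The zone-based estimator, the grid-cell keys, and the risk labels are assumptions made for this illustration, and the toy data are invented, so the printed value says nothing about the figure reported above.

```python
def holdout_accuracy(original, hidden_keys, estimate_fn):
    """Hide some records, re-estimate them, and compare with the kept copy."""
    visible = {k: v for k, v in original.items() if k not in hidden_keys}
    hits = sum(1 for k in hidden_keys if estimate_fn(visible, k) == original[k])
    return hits / len(hidden_keys)


def estimate_from_zone(visible, cell):
    """Hypothetical estimator: majority label among visible cells of the same zone."""
    zone, _ = cell
    labels = [label for (z, _), label in visible.items() if z == zone]
    if not labels:
        return "unknown"
    # Sorting makes tie-breaking deterministic (alphabetical order).
    return max(sorted(set(labels)), key=labels.count)


# Invented toy records keyed by (zone, cell number), for illustration only.
ORIGINAL = {
    ("A", 1): "high", ("A", 2): "high", ("A", 3): "low", ("A", 4): "high",
    ("B", 1): "low", ("B", 2): "low", ("B", 3): "low", ("B", 4): "high",
}
HIDDEN = {("A", 1), ("A", 3), ("B", 1), ("B", 4)}
print(holdout_accuracy(ORIGINAL, HIDDEN, estimate_from_zone))  # 0.5
```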

In addition, a metric to measure the confidence level, or security, of certain points and areas of Mexico City has been proposed. It allows finding safe routes along paths with a low crime rate. Moreover, the mobile application gathers long-term statistical data with almost real-time information from citizens, who act as sensors in the city. The results of the mobile system have been tested and compared with the Crime Map System.
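One way a per-location confidence level could enter route computation is sketched below in Python: Dijkstra's algorithm is run over edge costs that grow as the confidence of the destination node drops. The cost formula `length * (1 + risk_weight * (1 - confidence))`, the parameter `risk_weight`, and the four-node graph are assumptions for illustration; the metric actually proposed is not defined in this section.

```python
import heapq


def safest_route(graph, confidence, start, goal, risk_weight=5.0):
    """Dijkstra over costs that penalize entering low-confidence nodes.

    graph: {node: [(neighbor, length_m), ...]}
    confidence: {node: value in [0, 1]}, where 1 means safest.
    Returns (path, cost), or (None, inf) if the goal is unreachable.
    """
    queue = [(0.0, start, [start])]
    best = {start: 0.0}
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return path, cost
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, length in graph.get(node, []):
            # Nodes without a confidence value are treated as least safe.
            risk = 1.0 - confidence.get(neighbor, 0.0)
            new_cost = cost + length * (1.0 + risk_weight * risk)
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor, path + [neighbor]))
    return None, float("inf")


# Invented toy street graph: the route through B is shorter but B has low confidence.
GRAPH = {
    "A": [("B", 100), ("C", 150)],
    "B": [("D", 100)],
    "C": [("D", 150)],
    "D": [],
}
CONFIDENCE = {"A": 0.9, "B": 0.2, "C": 0.9, "D": 0.9}

print(safest_route(GRAPH, CONFIDENCE, "A", "D")[0])                   # ['A', 'C', 'D']
print(safest_route(GRAPH, CONFIDENCE, "A", "D", risk_weight=0.0)[0])  # ['A', 'B', 'D']
```

Setting `risk_weight` to zero reduces the search to an ordinary shortest path, which makes the trade-off between distance and safety explicit.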

Future work is oriented towards evaluating the cognitive perception of people, taking into consideration points or geographic places, in order to find comfortable routes. Sentiment analysis will be addressed so that it can be incorporated as a parameter in the computation of routes. Additionally, we propose integrating our mobile application with the Mexican CCTV camera systems for sensing the dynamics of certain areas in the city. This contribution is focused on developing mobile information systems for routing and urban planning in city environments. Mobile applications are increasingly becoming essential for analyzing the urban dynamics of big cities. Thus, the next generation of mobile information systems will be designed around real-time road network conditions. In addition, this generation is oriented towards improving the quality of human life and increasing the sustainability of smart cities.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

This work was partially sponsored by the Instituto Politécnico Nacional (IPN), the Consejo Nacional de Ciencia y Tecnología (CONACYT), and the Secretaría de Investigación y Posgrado (SIP) under Grants 20162006, 20161899, 20161869, and 20161611.

References

[1] H Zhang Y Xu and X Wen ldquoOptimal shortest path setproblem in undirected graphsrdquo Journal of Combinatorial Opti-mization vol 29 no 3 pp 511ndash530 2015

[2] W Templeton-Steadman and R Williams ldquoInformation deliv-ery system and method for mobile appliancesrdquo US Patent App11562054 2006

[3] D Reis A Melo A L V Coelho and V Furtado ldquoTowardsoptimal police patrol routes with genetic algorithmsrdquo in Intel-ligence and Security Informatics IEEE International Conferenceon Intelligence and Security Informatics ISI 2006 SanDiego CAUSA May 23-24 2006 Proceedings vol 3975 of Lecture Notesin Computer Science pp 485ndash491 Springer Berlin Germany2006

[4] V Ceikute and C S Jensen ldquoRouting service qualitymdashlocaldriver behavior versus routing servicesrdquo in Proceedings of theIEEE 14th International Conference onMobileDataManagement(MDM rsquo13) vol 1 pp 97ndash106 IEEE June 2013

[5] Safety Apps October 2015 httpwwwcsunedupolicesafety-apps

[6] E Galbrun K Pelechrinis and E Terzi ldquoUrban navigationbeyond shortest route the case of safe pathsrdquo InformationSystems vol 57 pp 160ndash171 2016

[7] T Wang C Rudin D Wagner and R Sevieri ldquoLearning todetect patterns of crimerdquo in Machine Learning and KnowledgeDiscovery in Databases vol 8190 of Lecture Notes in ComputerScience pp 515ndash530 Springer Berlin Germany 2013

[8] C-H Yu W Ding P Chen and M Morabito ldquoCrime forecast-ing using spatio-temporal pattern with ensemble learningrdquo inAdvances in Knowledge Discovery andDataMining 18th Pacific-AsiaConference PAKDD2014 Tainan TaiwanMay 13ndash16 2014

Mobile Information Systems 11

Proceedings Part II Lecture Notes in Computer Science pp174ndash185 Springer Berlin Germany 2014

[9] L Scott and NWarmerdam ldquoExtend crime analysis with arcgisspatial statistics toolsrdquo ArcUser Magazine 2005

[10] M Leitner Crime Modeling and Mapping Using GeospatialTechnologies vol 8 Springer DordrechtTheNetherlands 2013

[11] J Kim M Cha and T Sandholm ldquoSocroutes safe routesbased on tweet sentimentsrdquo in Proceedings of the 23rd ACMInternational Conference on World Wide Web (WWW rsquo14) pp179ndash182 Seoul South Korea April 2014

[12] MHaklay and PWeber ldquoOpenstreetmap user-generated streetmapsrdquo IEEE Pervasive Computing vol 7 no 4 pp 12ndash18 2008

[13] J M Sanchez Bernabeu J V Berna Martınez and F MaciaPerez ldquoSmart sentinelmonitoring and prevention system in thesmart citiesrdquo International Review on Computers and Softwarevol 9 no 9 pp 1554ndash1559 2014

[14] Wiki crimes 2015 httpwwwwikicrimesorg[15] T Moon S Heo and S Lee ldquoUbiquitous crime prevention

system (UCPS) for a safer cityrdquoProcedia Environmental Sciencesvol 22 pp 288ndash301 2014

[16] H Su K Zheng J Huang H Jeung L Chen and X ZhouldquoCrowdplanner a crowd-based route recommendation systemrdquoinProceedings of the 30th IEEE International Conference onDataEngineering (ICDE rsquo14) pp 1144ndash1155 IEEE Chicago Ill USAApril 2014

[17] V Arnaboldi M Conti and F Delmastro ldquoImplementationof CAMEO a context-aware middleware for opportunisticmobile social networksrdquo inProceedings of the IEEE InternationalSymposium on a World of Wireless Mobile and MultimediaNetworks (WoWMoM rsquo11) pp 1ndash3 Lucca Italy June 2011

[18] M Nagarajan K Gomadam A P Sheth A Ranabahu RMutharaju and A Jadhav ldquoSpatio-temporal-thematic analysisof citizen sensor data challenges and experiencesrdquo in WebInformation Systems EngineeringmdashWISE 2009 vol 5802 of Lec-ture Notes in Computer Science pp 539ndash553 Springer BerlinGermany 2009

[19] N Powdthavee ldquoUnhappiness and crime evidence from SouthAfricardquo Economica vol 72 no 287 pp 531ndash547 2005

[20] J Ratcliffe ldquoCrime mapping spatial and temporal challengesrdquoin Handbook of Quantitative Criminology pp 5ndash24 SpringerNew York NY USA 2010

[21] P Gupta G N Purohit and A Dadhich ldquoCrime preventionthrough alternate route finding in traffic surveillance using cctvcamerasrdquo International Journal of Engineering and AdvancedTechnology vol 2 no 5 pp 414ndash418 2013

[22] H Chen T Cheng and S Wise ldquoDesigning daily patrol routesfor policing based on ANT colony algorithmrdquo ISPRS Annalsof the Photogrammetry Remote Sensing and Spatial InformationSciences vol II-4W2 pp 103ndash109 2015

[23] D Dzemydiene and E Kazemikaitiene ldquoOntology-based deci-sion support system for crime investigation processesrdquo inInformation Systems Development pp 427ndash438 Springer NewYork NY USA 2005

[24] Y Chabot A Bertaux C Nicolle and T Kechadi ldquoA completeformalized knowledge representation model for advanced dig-ital forensics timeline analysisrdquoDigital Investigation vol 15 pp83ndash100 2015

[25] E Hatcher O Gospodnetic and M McCandless Lucene inAction Manning Publications Greenwich Conn USA 2004

[26] FMata-RiveraMTorres-RuizGGuzmanMMoreno-Ibarraand R Quintero ldquoA collaborative learning approach for geo-graphic information retrieval based on social networksrdquo Com-puters in Human Behavior vol 51 pp 829ndash842 2015


Page 10: A Mobile Information System Based on Crowd-Sensed and Official ...

10 Mobile Information Systems

Figure 9 Theftcrime events that occurred in a specific time fromthe mobile information system

Figure 10 The hybrid map view from the web version

model based on the Bayes algorithm to obtain safe routes withdata that were provided by the mobile device and (5) thedesign of amobile information system to generate safe routes

According to the results of the estimation the certaintydegree is around 75 of effectiveness It was tested bycomparing areas with crime data but the records wereintentionally removed and original copy was keptThus withthe results of the estimation a comparison with the originalcopy was performed So we found that the estimation has aperformance of 75 for all the points of the data sample

In addition a metric to measure the confidence level orsecurity for certain points and areas of Mexico City has beenproposed It allows finding safe routes according to pathswith a low crime rate Moreover the mobile application gath-ers long-term statistical data with almost real informationfrom citizens which are acting as sensors in the city Theresults of the mobile system have been tested and comparedwith the Crime Map System

Future works are oriented towards evaluating the cogni-tive perception of people taking into consideration points

or geographic places for finding comfortable routes Thesentiment analysis will be treated in order to incorporate thisfeature as a parameter in the computation of routes Addi-tionally we are proposing the integration of ourmobile appli-cation with the Mexican CCTV camera systems for sensingthe dynamic of certain areas in the city This contribution isfocused on developing mobile information systems for rout-ing and urban planning in city environments Mobile appli-cations are increasingly becoming essential for analyzing theurban dynamic of big cities Thus the appearance of the nextgeneration of mobile information systems will be devised inreal-time road network conditions In addition this genera-tion is oriented towards improving the quality of human lifefor increasing the sustainability of the smart cities

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

Thisworkwas partially sponsored by the Instituto PolitecnicoNacional (IPN) the Consejo Nacional de Ciencia y Tec-nologıa (CONACYT) and the Secretarıa de Investigacion yPosgrado (SIP) under Grants 20162006 20161899 20161869and 20161611

References

[1] H Zhang Y Xu and X Wen ldquoOptimal shortest path setproblem in undirected graphsrdquo Journal of Combinatorial Opti-mization vol 29 no 3 pp 511ndash530 2015

[2] W Templeton-Steadman and R Williams ldquoInformation deliv-ery system and method for mobile appliancesrdquo US Patent App11562054 2006

[3] D Reis A Melo A L V Coelho and V Furtado ldquoTowardsoptimal police patrol routes with genetic algorithmsrdquo in Intel-ligence and Security Informatics IEEE International Conferenceon Intelligence and Security Informatics ISI 2006 SanDiego CAUSA May 23-24 2006 Proceedings vol 3975 of Lecture Notesin Computer Science pp 485ndash491 Springer Berlin Germany2006

[4] V Ceikute and C S Jensen ldquoRouting service qualitymdashlocaldriver behavior versus routing servicesrdquo in Proceedings of theIEEE 14th International Conference onMobileDataManagement(MDM rsquo13) vol 1 pp 97ndash106 IEEE June 2013

[5] Safety Apps October 2015 httpwwwcsunedupolicesafety-apps

[6] E Galbrun K Pelechrinis and E Terzi ldquoUrban navigationbeyond shortest route the case of safe pathsrdquo InformationSystems vol 57 pp 160ndash171 2016

[7] T Wang C Rudin D Wagner and R Sevieri ldquoLearning todetect patterns of crimerdquo in Machine Learning and KnowledgeDiscovery in Databases vol 8190 of Lecture Notes in ComputerScience pp 515ndash530 Springer Berlin Germany 2013

[8] C-H Yu W Ding P Chen and M Morabito ldquoCrime forecast-ing using spatio-temporal pattern with ensemble learningrdquo inAdvances in Knowledge Discovery andDataMining 18th Pacific-AsiaConference PAKDD2014 Tainan TaiwanMay 13ndash16 2014

Mobile Information Systems 11

Proceedings Part II Lecture Notes in Computer Science pp174ndash185 Springer Berlin Germany 2014

[9] L Scott and NWarmerdam ldquoExtend crime analysis with arcgisspatial statistics toolsrdquo ArcUser Magazine 2005

[10] M Leitner Crime Modeling and Mapping Using GeospatialTechnologies vol 8 Springer DordrechtTheNetherlands 2013

[11] J Kim M Cha and T Sandholm ldquoSocroutes safe routesbased on tweet sentimentsrdquo in Proceedings of the 23rd ACMInternational Conference on World Wide Web (WWW rsquo14) pp179ndash182 Seoul South Korea April 2014

[12] MHaklay and PWeber ldquoOpenstreetmap user-generated streetmapsrdquo IEEE Pervasive Computing vol 7 no 4 pp 12ndash18 2008

[13] J M Sanchez Bernabeu J V Berna Martınez and F MaciaPerez ldquoSmart sentinelmonitoring and prevention system in thesmart citiesrdquo International Review on Computers and Softwarevol 9 no 9 pp 1554ndash1559 2014

[14] Wiki crimes 2015 httpwwwwikicrimesorg[15] T Moon S Heo and S Lee ldquoUbiquitous crime prevention

system (UCPS) for a safer cityrdquoProcedia Environmental Sciencesvol 22 pp 288ndash301 2014

[16] H Su K Zheng J Huang H Jeung L Chen and X ZhouldquoCrowdplanner a crowd-based route recommendation systemrdquoinProceedings of the 30th IEEE International Conference onDataEngineering (ICDE rsquo14) pp 1144ndash1155 IEEE Chicago Ill USAApril 2014

[17] V Arnaboldi M Conti and F Delmastro ldquoImplementationof CAMEO a context-aware middleware for opportunisticmobile social networksrdquo inProceedings of the IEEE InternationalSymposium on a World of Wireless Mobile and MultimediaNetworks (WoWMoM rsquo11) pp 1ndash3 Lucca Italy June 2011

[18] M Nagarajan K Gomadam A P Sheth A Ranabahu RMutharaju and A Jadhav ldquoSpatio-temporal-thematic analysisof citizen sensor data challenges and experiencesrdquo in WebInformation Systems EngineeringmdashWISE 2009 vol 5802 of Lec-ture Notes in Computer Science pp 539ndash553 Springer BerlinGermany 2009

[19] N Powdthavee ldquoUnhappiness and crime evidence from SouthAfricardquo Economica vol 72 no 287 pp 531ndash547 2005

[20] J Ratcliffe ldquoCrime mapping spatial and temporal challengesrdquoin Handbook of Quantitative Criminology pp 5ndash24 SpringerNew York NY USA 2010

[21] P Gupta G N Purohit and A Dadhich ldquoCrime preventionthrough alternate route finding in traffic surveillance using cctvcamerasrdquo International Journal of Engineering and AdvancedTechnology vol 2 no 5 pp 414ndash418 2013

[22] H Chen T Cheng and S Wise ldquoDesigning daily patrol routesfor policing based on ANT colony algorithmrdquo ISPRS Annalsof the Photogrammetry Remote Sensing and Spatial InformationSciences vol II-4W2 pp 103ndash109 2015

[23] D Dzemydiene and E Kazemikaitiene ldquoOntology-based deci-sion support system for crime investigation processesrdquo inInformation Systems Development pp 427ndash438 Springer NewYork NY USA 2005

[24] Y Chabot A Bertaux C Nicolle and T Kechadi ldquoA completeformalized knowledge representation model for advanced dig-ital forensics timeline analysisrdquoDigital Investigation vol 15 pp83ndash100 2015

[25] E Hatcher O Gospodnetic and M McCandless Lucene inAction Manning Publications Greenwich Conn USA 2004

[26] FMata-RiveraMTorres-RuizGGuzmanMMoreno-Ibarraand R Quintero ldquoA collaborative learning approach for geo-graphic information retrieval based on social networksrdquo Com-puters in Human Behavior vol 51 pp 829ndash842 2015

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 11: A Mobile Information System Based on Crowd-Sensed and Official ...

Mobile Information Systems 11

Proceedings Part II Lecture Notes in Computer Science pp174ndash185 Springer Berlin Germany 2014

[9] L Scott and NWarmerdam ldquoExtend crime analysis with arcgisspatial statistics toolsrdquo ArcUser Magazine 2005

[10] M Leitner Crime Modeling and Mapping Using GeospatialTechnologies vol 8 Springer DordrechtTheNetherlands 2013

[11] J Kim M Cha and T Sandholm ldquoSocroutes safe routesbased on tweet sentimentsrdquo in Proceedings of the 23rd ACMInternational Conference on World Wide Web (WWW rsquo14) pp179ndash182 Seoul South Korea April 2014

[12] MHaklay and PWeber ldquoOpenstreetmap user-generated streetmapsrdquo IEEE Pervasive Computing vol 7 no 4 pp 12ndash18 2008

[13] J M Sanchez Bernabeu J V Berna Martınez and F MaciaPerez ldquoSmart sentinelmonitoring and prevention system in thesmart citiesrdquo International Review on Computers and Softwarevol 9 no 9 pp 1554ndash1559 2014

[14] Wiki crimes 2015 httpwwwwikicrimesorg[15] T Moon S Heo and S Lee ldquoUbiquitous crime prevention

system (UCPS) for a safer cityrdquoProcedia Environmental Sciencesvol 22 pp 288ndash301 2014

[16] H Su K Zheng J Huang H Jeung L Chen and X ZhouldquoCrowdplanner a crowd-based route recommendation systemrdquoinProceedings of the 30th IEEE International Conference onDataEngineering (ICDE rsquo14) pp 1144ndash1155 IEEE Chicago Ill USAApril 2014

[17] V Arnaboldi M Conti and F Delmastro ldquoImplementationof CAMEO a context-aware middleware for opportunisticmobile social networksrdquo inProceedings of the IEEE InternationalSymposium on a World of Wireless Mobile and MultimediaNetworks (WoWMoM rsquo11) pp 1ndash3 Lucca Italy June 2011

[18] M Nagarajan K Gomadam A P Sheth A Ranabahu RMutharaju and A Jadhav ldquoSpatio-temporal-thematic analysisof citizen sensor data challenges and experiencesrdquo in WebInformation Systems EngineeringmdashWISE 2009 vol 5802 of Lec-ture Notes in Computer Science pp 539ndash553 Springer BerlinGermany 2009

[19] N Powdthavee ldquoUnhappiness and crime evidence from SouthAfricardquo Economica vol 72 no 287 pp 531ndash547 2005

[20] J. Ratcliffe, “Crime mapping: spatial and temporal challenges,” in Handbook of Quantitative Criminology, pp. 5–24, Springer, New York, NY, USA, 2010.

[21] P. Gupta, G. N. Purohit, and A. Dadhich, “Crime prevention through alternate route finding in traffic surveillance using CCTV cameras,” International Journal of Engineering and Advanced Technology, vol. 2, no. 5, pp. 414–418, 2013.

[22] H. Chen, T. Cheng, and S. Wise, “Designing daily patrol routes for policing based on ANT colony algorithm,” ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. II-4/W2, pp. 103–109, 2015.

[23] D. Dzemydiene and E. Kazemikaitiene, “Ontology-based decision support system for crime investigation processes,” in Information Systems Development, pp. 427–438, Springer, New York, NY, USA, 2005.

[24] Y. Chabot, A. Bertaux, C. Nicolle, and T. Kechadi, “A complete formalized knowledge representation model for advanced digital forensics timeline analysis,” Digital Investigation, vol. 15, pp. 83–100, 2015.

[25] E. Hatcher, O. Gospodnetic, and M. McCandless, Lucene in Action, Manning Publications, Greenwich, Conn, USA, 2004.

[26] F. Mata-Rivera, M. Torres-Ruiz, G. Guzmán, M. Moreno-Ibarra, and R. Quintero, “A collaborative learning approach for geographic information retrieval based on social networks,” Computers in Human Behavior, vol. 51, pp. 829–842, 2015.

