
Hindawi Publishing Corporation
The Scientific World Journal
Volume 2013, Article ID 704504, 19 pages
http://dx.doi.org/10.1155/2013/704504

Review Article
A Review of Data Fusion Techniques

Federico Castanedo

Deusto Institute of Technology, DeustoTech, University of Deusto, Avenida de las Universidades 24, 48007 Bilbao, Spain

Correspondence should be addressed to Federico Castanedo; castanedo.fede@gmail.com

Received 9 August 2013; Accepted 11 September 2013

Academic Editors: Y. Takama and D. Ursino

Copyright © 2013 Federico Castanedo. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The integration of data and knowledge from several sources is known as data fusion. This paper summarizes the state of the data fusion field and describes the most relevant studies. We first enumerate and explain different classification schemes for data fusion. Then, the most common algorithms are reviewed. These methods and algorithms are presented using three different categories: (i) data association, (ii) state estimation, and (iii) decision fusion.

1. Introduction

In general, all tasks that demand any type of parameter estimation from multiple sources can benefit from the use of data/information fusion methods. The terms information fusion and data fusion are typically employed as synonyms, but in some scenarios, the term data fusion is used for raw data (obtained directly from the sensors) and the term information fusion is employed to define already-processed data. In this sense, the term information fusion implies a higher semantic level than data fusion. Other terms associated with data fusion that typically appear in the literature include decision fusion, data combination, data aggregation, multisensor data fusion, and sensor fusion.

Researchers in this field agree that the most accepted definition of data fusion was provided by the Joint Directors of Laboratories (JDL) workshop [1]: "A multi-level process dealing with the association, correlation, combination of data and information from single and multiple sources to achieve refined position, identity estimates and complete and timely assessments of situations, threats and their significance."

Hall and Llinas [2] provided the following well-known definition of data fusion: "data fusion techniques combine data from multiple sensors and related information from associated databases to achieve improved accuracy and more specific inferences than could be achieved by the use of a single sensor alone."

Briefly, we can define data fusion as the combination of multiple sources to obtain improved information; in this context, improved information means less expensive, higher-quality, or more relevant information.

Data fusion techniques have been extensively employed in multisensor environments with the aim of fusing and aggregating data from different sensors; however, these techniques can also be applied to other domains, such as text processing. The goal of using data fusion in multisensor environments is to obtain a lower detection error probability and a higher reliability by using data from multiple distributed sources.

The available data fusion techniques can be classified into three nonexclusive categories: (i) data association, (ii) state estimation, and (iii) decision fusion. Because of the large number of published papers on data fusion, this paper does not aim to provide an exhaustive review of all of the studies; instead, the objective is to highlight the main steps that are involved in the data fusion framework and to review the most common techniques for each step.

The remainder of this paper continues as follows. The next section provides various classification categories for data fusion techniques. Then, Section 3 describes the most common methods for data association tasks. Section 4 provides a review of techniques under the state estimation category. Next, the most common techniques for decision fusion are enumerated in Section 5. Finally, the conclusions obtained from reviewing the different methods are highlighted in Section 6.

2. Classification of Data Fusion Techniques

Data fusion is a multidisciplinary area that involves several fields, and it is difficult to establish a clear and strict classification. The employed methods and techniques can be divided according to the following criteria:

(1) attending to the relations between the input data sources, as proposed by Durrant-Whyte [3]; these relations can be defined as (a) complementary, (b) redundant, or (c) cooperative data;

(2) according to the input/output data types and their nature, as proposed by Dasarathy [4];

(3) following an abstraction level of the employed data: (a) raw measurements, (b) signals, and (c) characteristics or decisions;

(4) based on the different data fusion levels defined by the JDL;

(5) depending on the architecture type: (a) centralized, (b) decentralized, or (c) distributed.

2.1. Classification Based on the Relations between the Data Sources. Based on the relations of the sources (see Figure 1), Durrant-Whyte [3] proposed the following classification criteria:

(1) complementary: when the information provided by the input sources represents different parts of the scene and could thus be used to obtain more complete global information. For example, in the case of visual sensor networks, the information on the same target provided by two cameras with different fields of view is considered complementary;

(2) redundant: when two or more input sources provide information about the same target and could thus be fused to increase the confidence. For example, the data coming from overlapped areas in visual sensor networks are considered redundant;

(3) cooperative: when the provided information is combined into new information that is typically more complex than the original information. For example, multimodal (audio and video) data fusion is considered cooperative.

2.2. Dasarathy's Classification. One of the most well-known data fusion classification systems was provided by Dasarathy [4] and is composed of the following five categories (see Figure 2):

(1) data in-data out (DAI-DAO): this type is the most basic or elementary data fusion method considered in the classification. This type of data fusion process inputs and outputs raw data; the results are typically more reliable or accurate. Data fusion at this level is conducted immediately after the data are gathered from the sensors, and the algorithms employed at this level are based on signal and image processing;

(2) data in-feature out (DAI-FEO): at this level, the data fusion process employs raw data from the sources to extract features or characteristics that describe an entity in the environment;

(3) feature in-feature out (FEI-FEO): at this level, both the input and output of the data fusion process are features. Thus, the data fusion process addresses a set of features with the aim of improving, refining, or obtaining new features. This process is also known as feature fusion, symbolic fusion, information fusion, or intermediate-level fusion;

(4) feature in-decision out (FEI-DEO): this level obtains a set of features as input and provides a set of decisions as output. Most of the classification systems that perform a decision based on a sensor's inputs fall into this category;

(5) decision in-decision out (DEI-DEO): this type of classification is also known as decision fusion. It fuses input decisions to obtain better or new decisions.

The main contribution of Dasarathy's classification is the specification of the abstraction level, either as an input or an output, providing a framework to classify different methods or techniques.

2.3. Classification Based on the Abstraction Levels. Luo et al. [5] provided the following four abstraction levels:

(1) signal level: directly addresses the signals that are acquired from the sensors;

(2) pixel level: operates at the image level and could be used to improve image processing tasks;

(3) characteristic level: employs features that are extracted from the images or signals (i.e., shape or velocity);

(4) symbol level: at this level, information is represented as symbols; this level is also known as the decision level.

Information fusion typically addresses three levels of abstraction: (1) measurements, (2) characteristics, and (3) decisions. Other possible classifications of data fusion based on the abstraction levels are as follows:

(1) low-level fusion: the raw data are directly provided as an input to the data fusion process, which provides more accurate data (a lower signal-to-noise ratio) than the individual sources;

(2) medium-level fusion: characteristics or features (shape, texture, and position) are fused to obtain features that could be employed for other tasks. This level is also known as the feature or characteristic level;

Figure 1: Durrant-Whyte's classification based on the relations between the data sources: complementary fusion (a + b), redundant fusion (b), and cooperative fusion (c) over information sources S1-S5.

Figure 2: Dasarathy's classification: data in-data out (DAI-DAO), data in-feature out (DAI-FEO), feature in-feature out (FEI-FEO), feature in-decision out (FEI-DEO), and decision in-decision out (DEI-DEO).

(3) high-level fusion: this level, which is also known as decision fusion, takes symbolic representations as sources and combines them to obtain a more accurate decision. Bayesian methods are typically employed at this level;

(4) multiple-level fusion: this level addresses data provided from different levels of abstraction (i.e., when a measurement is combined with a feature to obtain a decision).

2.4. JDL Data Fusion Classification. This classification is the most popular conceptual model in the data fusion community. It was originally proposed by the JDL and the American Department of Defense (DoD) [1]. These organizations classified the data fusion process into five processing levels, an associated database, and an information bus that connects the five components (see Figure 3). The five levels can be grouped into two groups, low-level fusion and high-level fusion, and comprise the following components:

(i) sources: the sources are in charge of providing the input data. Different types of sources can be employed, such as sensors, a priori information (references or geographic data), databases, and human inputs;

(ii) human-computer interaction (HCI): HCI is an interface that allows inputs to the system from the operators and produces outputs to the operators. HCI includes queries, commands, and information on the obtained results and alarms;

(iii) database management system: the database management system stores the provided information and the fused results. This system is a critical component because of the large amount of highly diverse information that is stored.

In contrast, the five levels of data processing are defined as follows:

(1) level 0, source preprocessing: source preprocessing is the lowest level of the data fusion process, and it includes fusion at the signal and pixel levels. In the case of text sources, this level also includes the information extraction process. This level reduces the amount of data and maintains useful information for the high-level processes;

(2) level 1, object refinement: object refinement employs the processed data from the previous level. Common procedures of this level include spatio-temporal alignment, association, correlation, clustering or grouping techniques, state estimation, the removal of false positives, identity fusion, and the combining of features that were extracted from images. The output results of this stage are the object discrimination (classification and identification) and object tracking (state of the object and orientation). This stage transforms the input information into consistent data structures;

Figure 3: The JDL data fusion framework (sources; levels 0-4: source preprocessing, object refinement, situation assessment, threat assessment, and process refinement; the information bus; database management; and the user interface).

(3) level 2, situation assessment: this level focuses on a higher level of inference than level 1. Situation assessment aims to identify the likely situations given the observed events and obtained data. It establishes relationships between the objects. Relations (i.e., proximity, communication) are valued to determine the significance of the entities or objects in a specific environment. The aims of this level include performing high-level inferences and identifying significant activities and events (patterns in general). The output is a set of high-level inferences;

(4) level 3, impact assessment: this level evaluates the impact of the activities detected in level 2 to obtain a proper perspective. The current situation is evaluated, and a future projection is performed to identify possible risks, vulnerabilities, and operational opportunities. This level includes (1) an evaluation of the risk or threat and (2) a prediction of the logical outcome;

(5) level 4, process refinement: this level improves the process from level 0 to level 3 and provides resource and sensor management. The aim is to achieve efficient resource management while accounting for task priorities, scheduling, and the control of available resources.

High-level fusion typically starts at level 2 because the type, localization, movement, and quantity of the objects are known at that level. One of the limitations of the JDL method is how the uncertainty about previous or subsequent results could be employed to enhance the fusion process (feedback loop). Llinas et al. [6] proposed several refinements and extensions to the JDL model. Blasch and Plano [7] proposed adding a new level (user refinement) to support a human user in the data fusion loop. The JDL model represents the first effort to provide a detailed model and a common terminology for the data fusion domain. However, because its roots originate in the military domain, the employed terms are oriented to the risks that commonly occur in these scenarios. The Dasarathy model differs from the JDL model with regard to the adopted terminology and approach: the former is oriented toward the differences between the input and output results, independent of the employed fusion method. In summary, the Dasarathy model provides a method for understanding the relations between the fusion tasks and the employed data, whereas the JDL model presents an appropriate fusion perspective for designing data fusion systems.

2.5. Classification Based on the Type of Architecture. One of the main questions that arises when designing a data fusion system is where the data fusion process will be performed. Based on this criterion, the following types of architectures can be identified:

(1) centralized architecture: in a centralized architecture, the fusion node resides in the central processor that receives the information from all of the input sources. Therefore, all of the fusion processes are executed in a central processor that uses the raw measurements provided by the sources. In this schema, the sources obtain only the observations as measurements and transmit them to a central processor, where the data fusion process is performed. If we assume that data alignment and data association are performed correctly and that the required time to transfer the data is not significant, then the centralized scheme is theoretically optimal. However, the previous assumptions typically do not hold for real systems. Moreover, the large amount of bandwidth required to send raw data through the network is another disadvantage of the centralized approach. This issue becomes a bottleneck when this type of architecture is employed for fusing data in visual sensor networks. Finally, the time delays when transferring the information between the different sources are variable and affect the results in the centralized scheme to a greater degree than in other schemes;

(2) decentralized architecture a decentralized architec-ture is composed of a network of nodes in which eachnode has its own processing capabilities and there isno single point of data fusion Therefore each nodefuses its local information with the information thatis received from its peers Data fusion is performedautonomously with each node accounting for its localinformation and the information received from itspeers Decentralized data fusion algorithms typicallycommunicate information using the Fisher and Shan-non measurements instead of the objectrsquos state [8]The main disadvantage of this architecture is thecommunication cost which is 119874(1198992) at each com-munication step where 119899 is the number of nodesadditionally the extreme case is considered in whicheach node communicates with all of its peers Thusthis type of architecture could suffer from scalabilityproblems when the number of nodes is increased

(3) distributed architecture: in a distributed architecture, measurements from each source node are processed independently before the information is sent to the fusion node; the fusion node accounts for the information that is received from the other nodes. In other words, the data association and state estimation are performed in the source node before the information is communicated to the fusion node. Therefore, each node provides an estimation of the object state based only on its local views, and this information is the input to the fusion process, which provides a fused global view. This type of architecture offers different options and variations that range from only one fusion node to several intermediate fusion nodes;

(4) hierarchical architecture: other architectures comprise a combination of decentralized and distributed nodes, generating hierarchical schemes in which the data fusion process is performed at different levels of the hierarchy.

In principle, a decentralized data fusion system is more difficult to implement because of the computation and communication requirements. However, in practice, there is no single best architecture, and the selection of the most appropriate architecture should be made depending on the requirements, demand, existing networks, data availability, node processing capabilities, and organization of the data fusion system.

The reader might think that the decentralized and distributed architectures are similar; however, they have meaningful differences (see Figure 4). First, in a distributed architecture, a preprocessing of the obtained measurements is performed, which provides a vector of features as a result (the features are fused thereafter). In contrast, in the decentralized architecture, the complete data fusion process is conducted in each node, and each of the nodes provides a globally fused result. Second, the decentralized fusion algorithms typically communicate information employing the Fisher and Shannon measurements, whereas distributed algorithms typically share a common notion of state (position, velocity, and identity) with the associated probabilities, which are used to perform the fusion process [9]. Third, because the decentralized data fusion algorithms exchange information instead of states and probabilities, they have the advantage of easily separating old knowledge from new knowledge: the process is additive, and the order in which the information is received and fused is not relevant. However, in the distributed data fusion algorithms (i.e., distributed by the Kalman filter), the state to be fused is not associative, and when and how the fused estimates are computed is relevant. Nevertheless, in contrast to the centralized architectures, the distributed algorithms reduce the necessary communication and computational costs because some tasks are computed in the distributed nodes before data fusion is performed in the fusion node.

3. Data Association Techniques

The data association problem consists of determining the set of measurements that correspond to each target (see Figure 5). Let us suppose that there are $O$ targets being tracked by only one sensor in a cluttered environment (by a cluttered environment, we refer to an environment with several targets that are too close to each other). Then, the data association problem can be defined as follows:

(i) each sensor's observations are received in the fusion node at discrete time intervals;

(ii) the sensor might not provide observations at a specific interval;

(iii) some observations are noise, and other observations originate from the detected targets;

(iv) for any specific target and in every time interval, we do not know (a priori) the observations that will be generated by that target.

Therefore, the goal of data association is to establish the set of observations or measurements that are generated by the same target over time. Hall and Llinas [2] provided the following definition of data association: "the process of assigning and computing the weights that relate the observations or tracks (a track can be defined as an ordered set of points that follow a path and are generated by the same target) from one set to the observations or tracks of another set."

As an example of the complexity of the data association problem, if we take a frame-to-frame association and assume that $M$ possible points could be detected in all $n$ frames, then the number of possible association sets is $(M!)^{n-1}$. Note that, among all of these possible solutions, only one set establishes the true movement of the $M$ points.
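To make this combinatorial growth tangible, a two-line Python computation follows (the values of M and n are illustrative assumptions, and the formula is the $(M!)^{n-1}$ count given above):

```python
from math import factorial

M, n = 10, 4                    # illustrative: 10 points in each of 4 frames
print(factorial(M) ** (n - 1))  # approximately 4.78e19 possible sets
```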

Data association is often performed before the state estimation of the detected targets. Moreover, it is a key step because the estimation or classification will behave incorrectly if the data association phase does not work coherently. The data association process could also appear at all of the fusion levels, but the granularity varies depending on the objective of each level.

Figure 4: Classification based on the type of architecture (centralized, decentralized, and distributed). Each scheme combines preprocessing, alignment, association, and estimation stages over sources S1, ..., Sn to produce the state of the object.

In general, an exhaustive search of all possible combinations grows exponentially with the number of targets; thus, the data association problem becomes NP-complete. The most common techniques employed to solve the data association problem are presented in the following sections (Sections 3.1 to 3.7).

3.1. Nearest Neighbors and K-Means. Nearest neighbor (NN) is the simplest data association technique. NN is a well-known clustering algorithm that selects or groups the most similar values. How close one measurement is to another depends on the distance metric employed and typically on a threshold established by the designer. In general, the employed criteria could be based on (1) an absolute distance, (2) the Euclidean distance, or (3) a statistical function of the distance.

Figure 5: Conceptual overview of the data association process from multiple sensors and multiple targets. It is necessary to establish the set of observations over time from the same object that forms a track.

NN is a simple algorithm that can find a feasible (approximate) solution in a small amount of time. However, in a cluttered environment, it could provide many pairs that have the same probability and could thus produce undesirable error propagation [10]. Moreover, this algorithm performs poorly in environments in which false measurements are frequent, that is, in highly noisy environments.
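As a concrete illustration of this greedy scheme, the following Python sketch (the function name, the Euclidean metric, and the gate value are illustrative assumptions, not from the paper) assigns each predicted track position to its closest unused measurement inside a validation gate:

```python
import numpy as np

def nearest_neighbor_associate(tracks, measurements, gate=5.0):
    """Greedy NN association: each track takes the closest measurement
    inside its gate; a measurement already used is not reassigned.

    tracks, measurements: arrays of shape (T, d) and (M, d).
    Returns a dict {track_index: measurement_index}.
    """
    assignments = {}
    used = set()
    for t, pred in enumerate(tracks):
        # Euclidean distance from the predicted position to every measurement.
        dists = np.linalg.norm(measurements - pred, axis=1)
        for m in np.argsort(dists):
            if dists[m] > gate:   # outside the validation gate: stop searching
                break
            if m not in used:     # the closest unused measurement wins
                assignments[t] = int(m)
                used.add(m)
                break
    return assignments

# Example: two predicted track positions, three measurements (one is clutter).
tracks = np.array([[0.0, 0.0], [10.0, 10.0]])
measurements = np.array([[0.4, -0.2], [9.6, 10.3], [50.0, 50.0]])
print(nearest_neighbor_associate(tracks, measurements))  # {0: 0, 1: 1}
```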

The all-neighbors approach uses a similar technique in which all of the measurements inside a region are included in the tracks.

The K-Means method [11] is a well-known modification of the NN algorithm. K-Means divides the dataset values into $K$ different clusters. The K-Means algorithm finds the best localization of the cluster centroids, where best means a centroid that is at the center of the data cluster. K-Means is an iterative algorithm that can be divided into the following steps (a minimal sketch follows the list):

(1) obtain the input data and the number of desired clusters ($K$);

(2) randomly assign the centroid of each cluster;

(3) match each data point with the closest cluster centroid;

(4) move each cluster center to the centroid of its cluster;

(5) if the algorithm has not converged, return to step (3).
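A minimal NumPy sketch of these five steps follows (the random initialization from data points and the convergence test are common choices, assumed here for concreteness):

```python
import numpy as np

def kmeans(data, k, iters=100, seed=0):
    """Lloyd's K-Means following the steps above: random centroid
    initialization, point-to-centroid matching, centroid update, repeat."""
    rng = np.random.default_rng(seed)
    # Step (2): pick k distinct data points as the initial centroids.
    centroids = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(iters):
        # Step (3): match each point with the closest centroid.
        dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step (4): move each cluster center to the centroid of its points.
        new_centroids = np.array(
            [data[labels == j].mean(axis=0) if np.any(labels == j)
             else centroids[j] for j in range(k)])
        # Step (5): stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels

# Example: two well-separated 2D clouds recovered with K = 2.
pts = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 5.0])
centers, labels = kmeans(pts, k=2)
print(centers)
```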

K-Means is a popular algorithm that has been widely employed; however, it has the following disadvantages:

(i) the algorithm does not always find the optimal solution for the cluster centers;

(ii) the number of clusters must be known a priori, and one must assume that this number is the optimum;

(iii) the algorithm assumes that the covariance of the dataset is irrelevant or that it has already been normalized.

There are several options for overcoming these limitations. For the first, it is possible to execute the algorithm several times and retain the solution with the least variance. For the second, it is possible to start with a low value of $K$ and increment it until an adequate result is obtained. The third limitation can easily be overcome by multiplying the data with the inverse of the covariance matrix.

Many variations have been proposed to Lloyd's basic K-Means algorithm [11], which has a computational upper-bound cost of $O(Kn)$, where $n$ is the number of input points and $K$ is the number of desired clusters. Some algorithms modify the initial cluster assignments to improve the separations and reduce the number of iterations. Others introduce soft or multinomial clustering assignments using fuzzy logic, probabilistic, or Bayesian techniques. However, most of the previous variations still must perform several iterations through the data space to converge to a reasonable solution; this issue becomes a major disadvantage in several real-time applications. A new approach based on having a large (but still affordable) number of cluster candidates, compared to the desired $K$ clusters, is currently gaining attention. The idea behind this computational model is that the algorithm builds a good sketch of the original data while significantly reducing the dimensionality of the input space. In this manner, a weighted K-Means can be applied to the large set of candidate clusters to derive a good clustering of the original data. Using this idea, [12] presented an efficient and scalable K-Means algorithm based on random projections. This algorithm requires only one pass through the input data to build the clusters. More specifically, if the input data distribution satisfies some separability requirements, then the number of required candidate clusters grows only according to $O(\log n)$, where $n$ is the number of observations in the original data. This salient feature makes the algorithm scalable in terms of both the memory and computational requirements.

3.2. Probabilistic Data Association. The probabilistic data association (PDA) algorithm was proposed by Bar-Shalom and Tse [13] and is also known as the modified filter of all neighbors. This algorithm assigns an association probability to each hypothesis from a valid measurement of a target. A valid measurement refers to an observation that falls in the validation gate of the target at that time instant. The validation gate $\gamma$, which is centered around the predicted measurement of the target, is used to select the set of valid measurements and is defined as

$$\gamma \ge (z(k) - \hat{z}(k \mid k-1))^{T}\, S^{-1}(k)\, (z(k) - \hat{z}(k \mid k-1)) \quad (1)$$

where $k$ is the temporal index, $S(k)$ is the covariance gain, and $\gamma$ determines the gating or window size. The set of valid measurements at time instant $k$ is defined as

$$Z(k) = \{ z_i(k),\; i = 1, \ldots, m_k \} \quad (2)$$

where $z_i(k)$ is the $i$th measurement in the validation region at time instant $k$. We give the standard equations of the PDA algorithm next. For the state prediction, consider

$$\hat{x}(k \mid k-1) = F(k-1)\, \hat{x}(k-1 \mid k-1) \quad (3)$$

where $F(k-1)$ is the transition matrix at time instant $k-1$. To calculate the measurement prediction, consider

$$\hat{z}(k \mid k-1) = H(k)\, \hat{x}(k \mid k-1) \quad (4)$$

where $H(k)$ is the linearized measurement matrix. To compute the gain or the innovation of the $i$th measurement, consider

$$v_i(k) = z_i(k) - \hat{z}(k \mid k-1) \quad (5)$$

To calculate the covariance prediction, consider

$$P(k \mid k-1) = F(k-1)\, P(k-1 \mid k-1)\, F(k-1)^{T} + Q(k) \quad (6)$$

where $Q(k)$ is the process noise covariance matrix. To compute the innovation covariance $S$ and the Kalman gain $K$, consider

$$S(k) = H(k)\, P(k \mid k-1)\, H(k)^{T} + R$$
$$K(k) = P(k \mid k-1)\, H(k)^{T}\, S(k)^{-1} \quad (7)$$

To obtain the covariance update in the case in which the measurements originated by the target are known, consider

$$P^{0}(k \mid k) = P(k \mid k-1) - K(k)\, S(k)\, K(k)^{T} \quad (8)$$

The total update of the covariance is computed as

$$v(k) = \sum_{i=1}^{m_k} \beta_{i}(k)\, v_{i}(k)$$
$$P(k) = K(k) \left[ \sum_{i=1}^{m_k} \left( \beta_{i}(k)\, v_{i}(k)\, v_{i}(k)^{T} \right) - v(k)\, v(k)^{T} \right] K^{T}(k) \quad (9)$$

where $m_k$ is the number of valid measurements at instant $k$. The equation to update the estimated state, which is formed by the position and velocity, is given by

$$\hat{x}(k \mid k) = \hat{x}(k \mid k-1) + K(k)\, v(k) \quad (10)$$

Finally, the association probabilities of PDA are as follows:

$$\beta_{i}(k) = \frac{p_{i}(k)}{\sum_{i=0}^{m_k} p_{i}(k)} \quad (11)$$

where

$$p_{i}(k) = \begin{cases} (2\pi)^{M/2}\, \lambda\, \sqrt{\left| S_{i}(k) \right|}\; \dfrac{1 - P_{d} P_{g}}{P_{d}} & \text{if } i = 0 \\ \exp\left[ -\tfrac{1}{2}\, v_{i}^{T}(k)\, S^{-1}(k)\, v_{i}(k) \right] & \text{if } i \neq 0 \\ 0 & \text{in other cases} \end{cases} \quad (12)$$

where $M$ is the dimension of the measurement vector, $\lambda$ is the density of the clutter environment, $P_{d}$ is the detection probability of the correct measurement, and $P_{g}$ is the validation probability of a detected value.

In the PDA algorithm, the state estimation of the target is computed as a weighted sum of the estimated states under all of the hypotheses. The algorithm can associate different measurements to one specific target; this association helps PDA to estimate the target state, and the association probabilities are used as weights. The main disadvantages of the PDA algorithm are the following:

(i) loss of tracks: because PDA ignores the interference with other targets, it could sometimes wrongly classify the closest tracks. Therefore, it provides poor performance when the targets are close to each other or crossing;

(ii) suboptimal Bayesian approximation: when the source of information is uncertain, PDA is the suboptimal Bayesian approximation to the association problem;

(iii) one target: PDA was initially designed for the association of one target in a low-clutter environment. The number of false alarms is typically modeled with the Poisson distribution, and the false alarms are assumed to be distributed uniformly in space. PDA behaves incorrectly when there are multiple targets because the false alarm model does not work well;

(iv) track management: because PDA assumes that the track is already established, algorithms must be provided for track initialization and track deletion.

PDA is mainly suited to tracking targets that do not make abrupt changes in their movement patterns; PDA will most likely lose a target that changes its movement pattern abruptly.
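As an illustration of equations (11) and (12), the following Python sketch computes the PDA association weights for one target's validated measurements (the function name, parameter defaults, and clutter density are illustrative assumptions, not values from the paper):

```python
import numpy as np

def pda_weights(innovations, S, P_d=0.9, P_g=0.99, clutter_density=1e-3):
    """Association probabilities beta_i(k) of equation (11), built from the
    p_i(k) terms of equation (12).

    innovations: list of innovation vectors v_i(k) from equation (5).
    S: innovation covariance S(k) from equation (7).
    """
    M = S.shape[0]            # dimension of the measurement vector
    S_inv = np.linalg.inv(S)
    # i = 0: the hypothesis that no validated measurement is target-born.
    p = [(2 * np.pi) ** (M / 2) * clutter_density
         * np.sqrt(np.linalg.det(S)) * (1 - P_d * P_g) / P_d]
    # i >= 1: Gaussian likelihood of each validated innovation.
    p += [np.exp(-0.5 * v @ S_inv @ v) for v in innovations]
    p = np.asarray(p)
    return p / p.sum()        # normalization of equation (11)

# Two validated measurements: one close to the prediction, one farther away.
S = np.diag([0.5, 0.5])
print(pda_weights([np.array([0.2, -0.1]), np.array([1.5, 1.2])], S))
```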

3.3. Joint Probabilistic Data Association. Joint probabilistic data association (JPDA) is a suboptimal approach for tracking multiple targets in cluttered environments [14]. JPDA is similar to PDA, with the difference that the association probabilities are computed using all of the observations and all of the targets. Thus, in contrast to PDA, JPDA considers various hypotheses together and combines them. JPDA determines the probability $\beta_{i}^{t}(k)$ that measurement $i$ originated from target $t$, accounting for the fact that, under this hypothesis, the measurement cannot be generated by other targets. Therefore, for a known number of targets, it evaluates the different options of the measurement-target association (for the most recent set of measurements) and combines them into the corresponding state estimation. If the association probability is known, then the Kalman filter updating equation of track $t$ can be written as

$$\hat{x}^{t}(k \mid k) = \hat{x}^{t}(k \mid k-1) + K(k)\, v^{t}(k) \quad (13)$$

where $\hat{x}^{t}(k \mid k)$ and $\hat{x}^{t}(k \mid k-1)$ are the estimation and prediction of target $t$, and $K(k)$ is the filter gain. The weighted sum of the residuals associated with the $m(k)$ observations of target $t$ is

$$v^{t}(k) = \sum_{i=1}^{m(k)} \beta_{i}^{t}(k)\, v_{i}^{t}(k) \quad (14)$$

where $v_{i}^{t} = z_{i}(k) - H\,\hat{x}^{t}(k \mid k-1)$. Therefore, this method incorporates all of the observations (inside the neighborhood of the target's predicted position) to update the estimated position by using a posterior probability that is a weighted sum of residuals.

The main restrictions of JPDA are the following:

(i) a measurement cannot come from more than one target;

(ii) two measurements cannot originate from the same target (at one time instant);

(iii) the sum of the probabilities of all of the measurements assigned to one target must be 1: $\sum_{i=0}^{m(k)} \beta_{i}^{t}(k) = 1$.

The main disadvantages of JPDA are the following:

(i) it requires an explicit mechanism for track initialization; similar to PDA, JPDA cannot initialize new tracks or remove tracks that leave the observation area;

(ii) JPDA is a computationally expensive algorithm when applied in environments with multiple targets because the number of hypotheses grows exponentially with the number of targets.

In general, JPDA is more appropriate than MHT in situations in which the density of false measurements is high (i.e., sonar applications).

3.4. Multiple Hypothesis Test. The underlying idea of the multiple hypothesis test (MHT) is to use more than two consecutive observations to make an association with better results. Algorithms that use only two consecutive observations have a higher probability of generating an error. In contrast to PDA and JPDA, MHT estimates all of the possible hypotheses and maintains new hypotheses in each iteration.

MHT was developed to track multiple targets in cluttered environments; as a result, it combines the data association problem and tracking into a unified framework, becoming an estimation technique as well. The Bayes rule or Bayesian networks are commonly employed to calculate the MHT hypotheses. In general, researchers have claimed that MHT outperforms JPDA for lower densities of false positives. However, the main disadvantage of MHT is its computational cost when the number of tracks or false positives increases. Pruning the hypothesis tree using a window could overcome this limitation.

The Reid [15] tracking algorithm is considered the standard MHT algorithm, but the initial integer programming formulation of the problem is due to Morefield [16]. MHT is an iterative algorithm in which each iteration starts with a set of correspondence hypotheses. Each hypothesis is a collection of disjoint tracks, and the prediction of the target in the next time instant is computed for each hypothesis. Next, the predictions are compared with the new observations by using a distance metric. The set of associations established in each hypothesis (based on the distance) introduces new hypotheses in the next iteration. Each new hypothesis represents a new set of tracks based on the current observations.

Note that each new measurement could come from (i) a new target in the field of view, (ii) a target being tracked, or (iii) noise in the measurement process. It is also possible that a measurement is not assigned to a target because the target disappeared or because it was not possible to obtain a measurement of the target at that time instant.
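The branching-and-pruning cycle just described can be sketched compactly. The following Python fragment is an illustrative miniature (not the Reid algorithm itself; the function and the likelihood dictionary are assumptions) that expands every kept hypothesis over the possible origins of a new measurement and prunes the tree back to the best candidates:

```python
def expand_hypotheses(hypotheses, measurement_likelihoods, keep=10):
    """One MHT branching step in miniature: each kept hypothesis is extended
    with every possible origin of the new measurement (an existing track,
    a new target, or clutter), scored by the corresponding likelihood,
    normalized (the 1/c constant of equation (15) below), and pruned to
    the `keep` most probable hypotheses."""
    children = []
    for prob, history in hypotheses:
        for origin, likelihood in measurement_likelihoods.items():
            children.append((prob * likelihood, history + [origin]))
    total = sum(p for p, _ in children)
    children = [(p / total, h) for p, h in children]
    children.sort(key=lambda child: child[0], reverse=True)
    return children[:keep]

# One measurement arrives; it may belong to track 1, start a new track,
# or be clutter (likelihood values are illustrative).
hypotheses = [(1.0, [])]
hypotheses = expand_hypotheses(
    hypotheses, {"track_1": 0.6, "new_target": 0.1, "clutter": 0.05})
print(hypotheses[0])  # most probable hypothesis and its origin history
```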

MHT maintains several correspondence hypotheses for each target in each frame. If the set of hypotheses at instant $k$ is represented by $H(k) = \{h_{l}(k),\ l = 1, \ldots, n\}$, then the probability of hypothesis $h_{l}(k)$ can be expressed recursively using the Bayes rule as follows:

$$P(h_{l}(k) \mid Z(k)) = P(h_{g}(k-1), a_{i}(k) \mid Z(k)) = \frac{1}{c}\, P(Z(k) \mid h_{g}(k-1), a_{i}(k))\; P(a_{i}(k) \mid h_{g}(k-1))\; P(h_{g}(k-1)) \quad (15)$$

where $h_{g}(k-1)$ is hypothesis $g$ of the complete set up to time instant $k-1$, $a_{i}(k)$ is the $i$th possible association of the track to the object, $Z(k)$ is the set of detections of the current frame, and $c$ is a normalization constant.

The first term on the right side of the previous equation is the likelihood function of the measurement set $Z(k)$ given the joint likelihood and the current hypothesis. The second term is the probability of the association hypothesis of the current data given the previous hypothesis $h_{g}(k-1)$. The third term is the probability of the previous hypothesis from which the current hypothesis is calculated.

The MHT algorithm has the ability to detect a new track while maintaining the hypothesis tree structure. The probability of a true track is given by the Bayes decision model as

$$P(\lambda \mid Z) = \frac{P(Z \mid \lambda)\; P_{\circ}(\lambda)}{P(Z)} \quad (16)$$

where $P(Z \mid \lambda)$ is the probability of obtaining the set of measurements $Z$ given $\lambda$, $P_{\circ}(\lambda)$ is the a priori probability of the source signal, and $P(Z)$ is the probability of obtaining the set of detections $Z$.

MHT considers all of the possibilities, including both the track maintenance and the initialization and removal of tracks, in an integrated framework. MHT calculates the possibility of having an object after the generation of a set of measurements using an exhaustive approach, and the algorithm does not assume a fixed number of targets. The key challenge of MHT is effective hypothesis management. The baseline MHT algorithm can be extended as follows: (i) use hypothesis aggregation for missed targets, births, cardinality tracking, and closely spaced objects; (ii) apply a multistage MHT for improving the performance and robustness in challenging settings; and (iii) use a feature-aided MHT for extended object surveillance.

The main disadvantage of this algorithm is the computational cost, which grows exponentially with the number of tracks and measurements. Therefore, the practical implementation of this algorithm is limited because it is exponential in both time and memory.

With the aim of reducing the computational cost, [17] presented a probabilistic MHT algorithm, known as PMHT, in which the associations are considered to be statistically independent random variables and in which an exhaustive search enumeration is avoided. The PMHT algorithm assumes that the number of targets and measurements is known. With the same goal of reducing the computational cost, [18] presented an efficient implementation of the MHT algorithm. This implementation was the first version applied to perform tracking in visual environments. They employed the Murty [19] algorithm to determine the best set of $k$ hypotheses in polynomial time, with the goal of tracking points of interest.

MHT typically performs the tracking process by employing only one characteristic, commonly the position. The Bayesian combination of multiple characteristics was proposed by Liggins II et al. [20].

A linear-programming-based relaxation approach to the optimization problem in MHT tracking was proposed independently by Coraluppi et al. [21] and Storms and Spieksma [22]. Joo and Chellappa [23] proposed an association algorithm for tracking multiple targets in visual environments. Their algorithm is based on an MHT modification in which a measurement can be associated with more than one target and several targets can be associated with one measurement. They also proposed a combinatorial optimization algorithm to generate the best set of association hypotheses; their algorithm always finds the best hypothesis, in contrast to other models, which are approximate. Coraluppi and Carthel [24] presented a generalization of the MHT algorithm using a recursion over hypothesis classes rather than over a single hypothesis. This work has been applied in a special case of the multi-target tracking problem, called cardinality tracking, in which the number of sensor measurements is observed instead of the target states.

3.5. Distributed Joint Probabilistic Data Association. The distributed version of the joint probabilistic data association (JPDA-D) algorithm was presented by Chang et al. [25]. In this technique, the estimated state of the target (using two sensors), after being associated, is given by

$$E\{x \mid Z^{1}, Z^{2}\} = \sum_{j=0}^{m^{1}} \sum_{l=0}^{m^{2}} E\{x \mid \chi_{j}^{1}, \chi_{l}^{2}, Z^{1}, Z^{2}\}\; P\{\chi_{j}^{1}, \chi_{l}^{2} \mid Z^{1}, Z^{2}\} \quad (17)$$

where $m^{i}$, $i = 1, 2$, is the last set of measurements of sensors 1 and 2, $Z^{i}$, $i = 1, 2$, is the set of accumulative data, and $\chi$ is the association hypothesis. The first term on the right side of the equation is calculated from the associations that were made earlier. The second term is computed from the individual association probabilities as follows:

$$P(\chi_{j}^{1}, \chi_{l}^{2} \mid Z^{1}, Z^{2}) = \sum_{\chi^{1}} \sum_{\chi^{2}} P(\chi^{1}, \chi^{2} \mid Z^{1}, Z^{2})\; \mathbb{1}_{j}^{1}(\chi^{1})\; \mathbb{1}_{l}^{2}(\chi^{2})$$
$$P(\chi^{1}, \chi^{2} \mid Z^{1}, Z^{2}) = \frac{1}{c}\, P(\chi^{1} \mid Z^{1})\; P(\chi^{2} \mid Z^{2})\; \gamma(\chi^{1}, \chi^{2}) \quad (18)$$

where the $\chi^{i}$ are the joint hypotheses involving all of the measurements and all of the targets, and the $\mathbb{1}_{j}^{i}(\chi^{i})$ are the binary indicators of the measurement-target association. The additional term $\gamma(\chi^{1}, \chi^{2})$ depends on the correlation of the individual hypotheses and reflects the localization influence of the current measurements in the joint hypotheses.

These equations are obtained by assuming that communication occurs after every observation; only approximations exist for the case in which communication is sporadic and a substantial amount of noise occurs. Therefore, this algorithm is a theoretical model that has some limitations in practical applications.

3.6. Distributed Multiple Hypothesis Test. The distributed version of the MHT algorithm (MHT-D) [26, 27] follows a similar structure as the JPDA-D algorithm. Let us assume the case in which one node must fuse two sets of hypotheses and tracks. If the hypothesis and track sets are represented by $H^{i}(Z^{i})$ and $T^{i}(Z^{i})$, with $i = 1, 2$, the hypothesis probabilities are represented by $\lambda_{j}^{i}$, and the state distribution of the tracks $\tau_{j}^{i}$ is represented by $P(\lambda_{j}^{i})$ and $P(x \mid Z^{i}, \tau_{j}^{i})$, then the maximum available information in the fusion node is $Z = Z^{1} \cup Z^{2}$. The data fusion objective of the MHT-D is to obtain the set of hypotheses $H(Z)$, the set of tracks $T(Z)$, the hypothesis probabilities $P(\lambda \mid Z)$, and the state distribution $p(x \mid Z, \tau)$ for the observed data.

The MHT-D algorithm is composed of the following steps:

(1) hypothesis formation: for each hypothesis pair $\lambda_{j}^{1}$ and $\lambda_{k}^{2}$ that could be fused, a track $\tau$ is formed by associating the pair of tracks $\tau_{j}^{1}$ and $\tau_{k}^{2}$, where each pair comes from one node and could originate from the same target. The final result of this stage is a set of hypotheses, denoted by $H(Z)$, and the fused tracks $T(Z)$;

(2) hypothesis evaluation: in this stage, the association probability of each hypothesis and the estimated state of each fused track are obtained. The distributed estimation algorithm is employed to calculate the likelihood of the possible associations and the estimations obtained at each specific association.

Using the information model, the probability of each fused hypothesis is given by

$$P(\lambda \mid Z) = C^{-1} \prod_{j \in J} P(\lambda^{(j)} \mid Z^{(j)})^{\alpha(j)} \prod_{\tau \in \lambda} L(\tau \mid Z) \quad (19)$$

where $C$ is a normalizing constant and $L(\tau \mid Z)$ is the likelihood of each hypothesis pair.

The main disadvantage of the MHT-D is its high computational cost, which is on the order of $O(n^{M})$, where $n$ is the number of possible associations and $M$ is the number of variables to be estimated.

3.7. Graphical Models. Graphical models are a formalism for representing and reasoning with probabilities and independence. A graphical model represents a conditional decomposition of the joint probability. A graphical model can be represented as a graph in which the nodes denote random variables, the edges denote the possible dependence between the random variables, and the plates denote the replication of a substructure with the appropriate indexing of the relevant variables. The graph captures the joint distribution over the random variables, which can be decomposed into a product of factors that each depend on only a subset of the variables. There are two major classes of graphical models: (i) the Bayesian networks [28], which are also known as directed graphical models, and (ii) the Markov random fields, which are also known as undirected graphical models. The directed graphical models are useful for expressing causal relationships between random variables, whereas undirected models are better suited for expressing soft constraints between random variables. We refer the reader to the book of Koller and Friedman [29] for more information on graphical models.

A framework based on graphical models can solve the problem of distributed data association in synchronized sensor networks with overlapped areas in which each sensor receives noisy measurements; this solution was proposed by Chen et al. [30, 31]. Their work is based on graphical models that are used to represent the statistical dependence between random variables. The data association problem is treated as an inference problem and solved by using the max-product algorithm [32]. Graphical models represent statistical dependencies between variables as graphs, and the max-product algorithm converges when the graph is a tree structure. Moreover, the employed algorithm could be implemented in a distributed manner by exchanging messages between the source nodes in parallel. With this algorithm, if each sensor has $n$ possible combinations of associations and there are $M$ variables to be estimated, the complexity is $O(n^{2}M)$, which is reasonable and less than the $O(n^{M})$ complexity of the MHT-D algorithm. However, special attention must be given to the correlated variables when building the graphical model.

4. State Estimation Methods

State estimation techniques aim to determine the state of the target under movement (typically the position), given the observations or measurements. State estimation techniques are also known as tracking techniques. In their general form, it is not guaranteed that the target observations are relevant, which means that some of the observations could actually come from the target and others could be only noise. The state estimation phase is a common stage in data fusion algorithms because the target's observations could come from different sensors or sources, and the final goal is to obtain a global target state from the observations.

The estimation problem involves finding the values of the vector state (e.g., position, velocity, and size) that fit as much as possible with the observed data. From a mathematical perspective, we have a set of redundant observations, and the goal is to find the set of parameters that provides the best fit to the observed data. In general, these observations are corrupted by errors and by the propagation of noise in the measurement process. State estimation methods fall under level 1 of the JDL classification and can be divided into two broader groups:

(1) linear dynamics and measurements: here, the estimation problem has a standard solution. Specifically, when the equations of the object state and of the measurements are linear, the noise follows the Gaussian distribution, and the environment is not cluttered, the optimal theoretical solution is based on the Kalman filter;

(2) nonlinear dynamics: the state estimation problem becomes difficult, and there is no analytical solution for solving the problem in a general manner. In principle, there are no practical algorithms available to solve this problem satisfactorily.

Most of the state estimation methods are based on control theory and employ the laws of probability to compute a vector state from a vector measurement or a stream of vector measurements. Next, the most common estimation methods are presented, including maximum likelihood and maximum posterior (Section 4.1), the Kalman filter (Section 4.2), the particle filter (Section 4.3), the distributed Kalman filter (Section 4.4), the distributed particle filter (Section 4.5), and covariance consistency methods (Section 4.6).

4.1. Maximum Likelihood and Maximum Posterior. The maximum likelihood (ML) technique is an estimation method based on probabilistic theory. Probabilistic estimation methods are appropriate when the state variable follows an unknown probability distribution [33]. In the context of data fusion, $x$ is the state being estimated and $z = (z(1), \ldots, z(k))$ is a sequence of $k$ previous observations of $x$. The likelihood function $\lambda(x)$ is defined as a probability density function of the sequence of $z$ observations given the true value of the state $x$:

$$\lambda(x) = p(z \mid x) \quad (20)$$

The ML estimator finds the value of $x$ that maximizes the likelihood function:

$$\hat{x}(k) = \arg\max_{x}\, p(z \mid x) \quad (21)$$


which can be obtained from the analytical or empirical models of the sensors. This function expresses the probability of the observed data. The main disadvantage of this method in practice is that it requires the analytical or empirical model of the sensor to be known in order to provide the prior distribution and compute the likelihood function. This method can also systematically underestimate the variance of the distribution, which leads to a bias problem. However, the bias of the ML solution becomes less significant as the number $N$ of data points increases; in the limit $N \to \infty$, the estimated variance equals the true variance of the distribution that generated the data.

The maximum posterior (MAP) method is based on Bayesian theory. It is employed when the parameter x to be estimated is the output of a random variable that has a known probability density function p(x). In the context of data fusion, x is the state that is being estimated, and z = (z(1), ..., z(k)) is a sequence of k previous observations of x. The MAP estimator finds the value of x that maximizes the posterior probability distribution as follows:

\hat{x}(k) = \arg\max_{x} p(x \mid z)  (22)

Both methods (ML and MAP) aim to find the most likely value for the state x. However, ML assumes that x is a fixed but unknown point of the parameter space, whereas MAP considers x to be the output of a random variable with a known a priori probability density function. Both methods are equivalent when there is no a priori information about x, that is, when there are only observations.
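As a small worked illustration of equations (21) and (22), the following Python sketch (numpy assumed; all numbers are hypothetical) estimates a scalar state from Gaussian observations; the ML estimate uses only the data, whereas the MAP estimate also uses a Gaussian prior:

import numpy as np

# k noisy observations z(1), ..., z(k) of an unknown scalar state x,
# with known Gaussian sensor noise (hypothetical values).
z = np.array([10.2, 9.8, 10.5, 10.1])
sigma_sensor = 0.5                    # sensor noise standard deviation

# ML: maximizes p(z | x); for i.i.d. Gaussian noise this is the sample mean.
x_ml = z.mean()

# MAP: maximizes p(x | z) with a Gaussian prior p(x) = N(mu0, sigma0^2).
mu0, sigma0 = 9.0, 1.0                # assumed a priori knowledge of x
k = len(z)
precision = k / sigma_sensor**2 + 1.0 / sigma0**2
x_map = (z.sum() / sigma_sensor**2 + mu0 / sigma0**2) / precision

print(f"ML estimate:  {x_ml:.3f}")    # uses only the observations
print(f"MAP estimate: {x_map:.3f}")   # shrunk toward the prior mean

As the prior flattens (sigma0 → ∞) or as k grows, the MAP estimate converges to the ML estimate, which matches the equivalence noted above.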

4.2. The Kalman Filter. The Kalman filter is the most popular estimation technique. It was originally proposed by Kalman [34] and has been widely studied and applied since then. The Kalman filter estimates the state x of a discrete-time process governed by the following space-time model:

x(k + 1) = \Phi(k) x(k) + G(k) u(k) + w(k)  (23)

with the observations or measurements z at time k of the state x represented by

z(k) = H(k) x(k) + v(k)  (24)

where \Phi(k) is the state transition matrix, G(k) is the input transition matrix, u(k) is the input vector, H(k) is the measurement matrix, and w and v are random Gaussian variables with zero mean and covariance matrices Q(k) and R(k), respectively. Based on the measurements and on the system parameters, the estimation of x(k), which is represented by \hat{x}(k), and the prediction of x(k + 1), which is represented by \hat{x}(k + 1 \mid k), are given by the following:

\hat{x}(k) = \hat{x}(k \mid k - 1) + K(k) [z(k) - H(k) \hat{x}(k \mid k - 1)]

\hat{x}(k + 1 \mid k) = \Phi(k) \hat{x}(k \mid k) + G(k) u(k)  (25)

respectively, where K is the filter gain, determined by

K(k) = P(k \mid k - 1) H^{T}(k) [H(k) P(k \mid k - 1) H^{T}(k) + R(k)]^{-1}  (26)

where P(k \mid k - 1) is the prediction covariance matrix, which can be determined by

P(k + 1 \mid k) = \Phi(k) P(k) \Phi^{T}(k) + Q(k)  (27)

with

P(k) = P(k \mid k - 1) - K(k) H(k) P(k \mid k - 1)  (28)

The Kalman filter is mainly employed to fuse low-level data. If the system can be described as a linear model and the error can be modeled as Gaussian noise, then the recursive Kalman filter obtains optimal statistical estimations [35]. However, other methods are required to address nonlinear dynamic models and nonlinear measurements. The modified Kalman filter, known as the extended Kalman filter (EKF), is the standard approach for implementing nonlinear recursive filters [36]. The EKF is one of the most often employed methods for fusing data in robotic applications. However, it has some disadvantages, because the computations of the Jacobians are extremely expensive. Some attempts have been made to reduce the computational cost, such as linearization, but these attempts introduce errors in the filter and make it unstable.
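As an illustration of the recursion in equations (23)–(28), the following sketch tracks a one-dimensional constant-velocity target; the matrices, noise levels, and measurements are illustrative assumptions rather than values prescribed by the text, and the input term G(k)u(k) is omitted (numpy assumed):

import numpy as np

# State = [position, velocity] of a 1D constant-velocity target.
dt = 1.0
Phi = np.array([[1.0, dt], [0.0, 1.0]])   # state transition matrix, eq. (23)
H = np.array([[1.0, 0.0]])                # we observe position only, eq. (24)
Q = 0.01 * np.eye(2)                      # process noise covariance (assumed)
R = np.array([[0.25]])                    # measurement noise covariance (assumed)

x = np.zeros(2)                           # initial predicted state
P = np.eye(2)                             # initial prediction covariance

def kalman_step(x, P, z):
    # Gain: K = P H^T [H P H^T + R]^-1, eq. (26).
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    # Update with the measurement z, eqs. (25) and (28).
    x_filt = x + K @ (z - H @ x)
    P_filt = P - K @ H @ P
    # Predict the next state and covariance, eqs. (25) and (27).
    return Phi @ x_filt, Phi @ P_filt @ Phi.T + Q, x_filt

for z in [np.array([1.1]), np.array([1.9]), np.array([3.2])]:
    x, P, x_filt = kalman_step(x, P, z)
    print("filtered state:", x_filt)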

The unscented Kalman filter (UKF) [37] has gained popularity because it does not have the linearization step, and the associated errors, of the EKF [38]. The UKF employs a deterministic sampling strategy to establish the minimum set of points around the mean. This set of points captures the true mean and covariance completely. Then, these points are propagated through the nonlinear functions, and the covariance of the estimations can be recovered. Another advantage of the UKF is its ability to be employed in parallel implementations.
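The deterministic sampling strategy at the core of the UKF can be illustrated with a basic unscented transform; the sketch below is a simplified version, where the scaling parameter kappa and the polar-to-Cartesian example are illustrative choices, not part of the original formulation (numpy assumed):

import numpy as np

def unscented_transform(mean, cov, f, kappa=1.0):
    """Propagate (mean, cov) through a nonlinear function f using
    2n + 1 deterministically chosen sigma points."""
    n = len(mean)
    S = np.linalg.cholesky((n + kappa) * cov)          # matrix square root
    points = [mean] + [mean + S[:, i] for i in range(n)] \
                    + [mean - S[:, i] for i in range(n)]
    w = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    w[0] = kappa / (n + kappa)
    y = np.array([f(p) for p in points])               # propagate each point
    y_mean = w @ y                                     # recovered mean
    y_cov = sum(wi * np.outer(yi - y_mean, yi - y_mean)
                for wi, yi in zip(w, y))               # recovered covariance
    return y_mean, y_cov

# Example: a polar-to-Cartesian measurement function (range, bearing).
f = lambda p: np.array([p[0] * np.cos(p[1]), p[0] * np.sin(p[1])])
m, C = unscented_transform(np.array([1.0, np.pi / 4]),
                           np.diag([0.01, 0.05]), f, kappa=1.0)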

4.3. Particle Filter. Particle filters are recursive implementations of sequential Monte Carlo methods [39]. This method builds the posterior density function using several random samples, called particles. Particles are propagated over time with a combination of sampling and resampling steps. At each iteration, the sampling step is employed to discard some particles, increasing the relevance of regions with a higher posterior probability. In the filtering process, several particles of the same state variable are employed, and each particle has an associated weight that indicates the quality of the particle. Therefore, the estimation is the result of a weighted sum of all of the particles. The standard particle filter algorithm has two phases: (1) the predicting phase and (2) the updating phase. In the predicting phase, each particle is modified according to the existing model, adding random noise to simulate the noise effect. Then, in the updating phase, the weight of each particle is reevaluated using the last available sensor observation, and particles with lower weights are removed. Specifically, a generic particle filter comprises the following steps.


(1) Initialization of the particles:

(i) let N be equal to the number of particles;
(ii) X^{(i)}(1) = [x(1), y(1), 0, 0]^T for i = 1, ..., N.

(2) Prediction step:

(i) for each particle i = 1, ..., N, evaluate the state (k + 1 | k) of the system using the state at time instant k with the noise of the system at time k:

X^{(i)}(k + 1 \mid k) = F(k) X^{(i)}(k) + (\text{cauchy-distribution-noise})(k)  (29)

where F(k) is the transition matrix of the system.

(3) Evaluate the particle weight. For each particle i = 1, ..., N:

(i) compute the predicted observation of the system using the current predicted state and the noise at instant k:

\hat{z}^{(i)}(k + 1 \mid k) = H(k + 1) X^{(i)}(k + 1 \mid k) + (\text{gaussian-measurement-noise})(k + 1)  (30)

(ii) compute the likelihood (weights) according to the given distribution:

\text{likelihood}^{(i)} = N(\hat{z}^{(i)}(k + 1 \mid k);\ z(k + 1), \text{var})  (31)

(iii) normalize the weights as follows:

w^{(i)} = \frac{\text{likelihood}^{(i)}}{\sum_{j=1}^{N} \text{likelihood}^{(j)}}  (32)

(4) Resampling/Selection: multiply particles with higher weights and remove those with lower weights. The current state must be adjusted using the computed weights of the new particles:

(i) compute the cumulative weights:

\text{CumWt}^{(i)} = \sum_{j=1}^{i} w^{(j)}  (33)

(ii) generate uniformly distributed random variables U^{(i)} \sim U(0, 1), with the number of draws equal to the number of particles;

(iii) determine which particles should be multiplied and which ones removed.

(5) Propagation phase:

(i) incorporate the new values of the state after the resampling at instant k to calculate the value at instant k + 1:

x^{(1:N)}(k + 1 \mid k + 1) = x(k + 1 \mid k)  (34)

(ii) compute the posterior mean:

\hat{x}(k + 1) = \text{mean}[x^{(i)}(k + 1 \mid k + 1)], \quad i = 1, ..., N  (35)

(iii) repeat steps 2 to 5 for each time instant.

Particle filters are more flexible than the Kalman filters and can cope with nonlinear dependencies and non-Gaussian densities in the dynamic model and in the noise error. However, they have some disadvantages. A large number of particles are required to obtain a small variance in the estimator. It is also difficult to establish the optimal number of particles in advance, and the number of particles affects the computational cost significantly. Earlier versions of particle filters employed a fixed number of particles, but recent studies have started to use a dynamic number of particles [40].
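A compact bootstrap particle filter following the steps above is sketched below for a scalar state; the motion model, the noise levels (Gaussian process noise is used instead of the Cauchy noise of equation (29), for simplicity), and the measurements are illustrative assumptions (numpy assumed):

import numpy as np

rng = np.random.default_rng(0)
N = 1000                                   # number of particles (step 1)
particles = rng.normal(0.0, 1.0, N)        # initialization around a prior guess

def pf_step(particles, z, q=0.5, r=0.8):
    # Step 2 (prediction): propagate each particle through a simple motion
    # model (constant drift of +1 per step) and add process noise.
    particles = particles + 1.0 + rng.normal(0.0, q, N)
    # Step 3 (weighting): Gaussian likelihood of the measurement z, eq. (31),
    # normalized as in eq. (32).
    w = np.exp(-0.5 * ((z - particles) / r) ** 2)
    w /= w.sum()
    # Step 4 (resampling): draw new particles via the cumulative weights of
    # eq. (33), multiplying high-weight particles and dropping low ones.
    idx = np.searchsorted(np.cumsum(w), rng.uniform(size=N)).clip(max=N - 1)
    particles = particles[idx]
    # Step 5 (propagation/estimate): posterior mean over particles, eq. (35).
    return particles, particles.mean()

for z in [1.2, 2.1, 2.8]:                  # a short stream of measurements
    particles, estimate = pf_step(particles, z)
    print("posterior mean:", estimate)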

4.4. The Distributed Kalman Filter. The distributed Kalman filter requires a correct clock synchronization between each source, as demonstrated in [41]. In other words, to correctly use the distributed Kalman filter, the clocks of all of the sources must be synchronized. This synchronization is typically achieved by using protocols that employ a shared global clock, such as the network time protocol (NTP). Synchronization problems between clocks have been shown to have an effect on the accuracy of the Kalman filter, producing inaccurate estimations [42].

If the estimations are consistent and the cross-covariance is known (or the estimations are uncorrelated), then it is possible to use the distributed Kalman filters [43]. However, the cross-covariance must be determined exactly, or the observations must be consistent.

We refer the reader to Liggins II et al. [20] for more details about the Kalman filter in a distributed and hierarchical architecture.

4.5. Distributed Particle Filter. Distributed particle filters have gained attention recently [44–46]. Coates [45] used a distributed particle filter to monitor an environment that could be captured by a Markovian state-space model involving nonlinear dynamics and observations and non-Gaussian noise.

In contrast, earlier attempts to handle out-of-sequence measurements using particle filters are based on regenerating the probability density function at the time instant of the out-of-sequence measurement [47]. In a particle filter, this step requires a large computational cost, in addition to the space necessary to store the previous particles. To avoid this problem, Orton and Marrs [48] proposed to store the information on the particles at each time instant, saving the cost of recalculating this information. This technique is close


to optimal, and when the delay increases, the result is only slightly affected [49]. However, it requires a very large amount of space to store the state of the particles at each time instant.

4.6. Covariance Consistency Methods: Covariance Intersection/Union. Covariance consistency methods (intersection and union) were proposed by Uhlmann [43] and are general and fault-tolerant frameworks for maintaining covariance means and estimations in a distributed network. These methods do not comprise estimation techniques; instead, they are similar to an estimation fusion technique. The distributed Kalman filter requirement of independent measurements or known cross-covariances is not a constraint with this method.

4.6.1. Covariance Intersection. If the Kalman filter is employed to combine two estimations (a_1, A_1) and (a_2, A_2), then it is assumed that the joint covariance has the following form:

\begin{bmatrix} A_1 & X \\ X^T & A_2 \end{bmatrix}  (36)

where the cross-covariance X should be known exactly so that the Kalman filter can be applied without difficulty. Because the computation of the cross-covariances is computationally intensive, Uhlmann [43] proposed the covariance intersection (CI) algorithm.

Let us assume that a joint covariance M can be defined with the diagonal blocks M_{A_1} > A_1 and M_{A_2} > A_2:

M \geq \begin{bmatrix} A_1 & X \\ X^T & A_2 \end{bmatrix}  (37)

for every possible instance of the unknown cross-covariance X; then, the components of the matrix M could be employed in the Kalman filter equations to provide a fused estimation (c, C) that is considered consistent. The key point of this method relies on generating a joint covariance matrix M that can represent a useful fused estimation (in this context, useful refers to something with a lower associated uncertainty). In summary, the CI algorithm computes the joint covariance matrix M for which the Kalman filter provides the best fused estimation (c, C) with respect to a fixed measurement of the covariance matrix (i.e., the minimum determinant).

Specific covariance criteria must be established, because there is no unique minimum joint covariance in the order of the positive semidefinite matrices. Moreover, although the joint covariance is the basis of the formal analysis of the CI algorithm, the actual result is a nonlinear mixture of the information stored in the estimations being fused, following the equation:

C = (w_1 H_1^T A_1^{-1} H_1 + w_2 H_2^T A_2^{-1} H_2 + \cdots + w_n H_n^T A_n^{-1} H_n)^{-1}

c = C (w_1 H_1^T A_1^{-1} a_1 + w_2 H_2^T A_2^{-1} a_2 + \cdots + w_n H_n^T A_n^{-1} a_n)  (38)

where H_i is the transformation of the fused state-space estimation to the space of the estimated state i. The values of w can be calculated to minimize the covariance determinant using convex optimization packages and semipositive matrix programming. The result of the CI algorithm has different characteristics compared with the Kalman filter. For example, suppose two estimations (a, A) and (b, B) are provided and their covariances are equal, A = B; because the Kalman filter is based on the statistical independence assumption, it produces a fused estimation with covariance C = (1/2)A. In contrast, the CI method does not assume independence and thus must remain consistent even in the case in which the estimations are completely correlated, with the estimated fused covariance C = A. In the case of estimations where A < B, the CI algorithm does not take information from the estimation (b, B); thus, the fused result is (a, A).
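For the common case in which the estimates already live in the fused state space (each H_i equal to the identity), equation (38) reduces to C^{-1} = w A^{-1} + (1 - w) B^{-1} and c = C(w A^{-1} a + (1 - w) B^{-1} b). The sketch below picks w with a simple grid search over the determinant instead of a convex optimization package (illustrative values; numpy assumed):

import numpy as np

def covariance_intersection(a, A, b, B, steps=101):
    """CI for two estimates in the same state space (H_i = I):
    C^-1 = w A^-1 + (1 - w) B^-1, with w chosen to minimize det(C)."""
    Ai, Bi = np.linalg.inv(A), np.linalg.inv(B)
    best = None
    for w in np.linspace(0.0, 1.0, steps):             # scalar search over w
        C = np.linalg.inv(w * Ai + (1.0 - w) * Bi)
        if best is None or np.linalg.det(C) < best[0]:
            c = C @ (w * Ai @ a + (1.0 - w) * Bi @ b)
            best = (np.linalg.det(C), c, C)
    return best[1], best[2]

# Two possibly correlated track estimates of the same 2D state (illustrative).
a, A = np.array([1.0, 0.0]), np.array([[2.0, 0.0], [0.0, 0.5]])
b, B = np.array([1.2, 0.1]), np.array([[0.5, 0.0], [0.0, 2.0]])
c, C = covariance_intersection(a, A, b, B)
print(c, np.linalg.det(C))

Note that for A = B this formula yields C = A for any w, rather than the (1/2)A of the independence-based Kalman update, in line with the discussion above.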

Every joint-consistent covariance is sufficient to produce a fused estimation that guarantees consistency. However, it is also necessary to guarantee a lack of divergence. Divergence is avoided in the CI algorithm by choosing a specific measurement (i.e., the determinant) that is minimized in each fusion operation. This measurement represents a nondivergence criterion, because the size of the estimated covariance, according to this criterion, will not be incremented.

The application of the CI method guarantees consistency and nondivergence for every sequence of mean- and covariance-consistent estimations. However, this method does not work well when the measurements to be fused are inconsistent.

4.6.2. Covariance Union. CI solves the problem of correlated inputs but not the problem of inconsistent inputs (inconsistent inputs refer to different estimations, each of which has a high accuracy (small variance) but also a large difference from the states of the others); thus, the covariance union (CU) algorithm was proposed to solve the latter [43]. CU addresses the following problem: two estimations (a_1, A_1) and (a_2, A_2) relate to the state of an object and are mutually inconsistent with one another. This issue arises when the difference between the average estimations is larger than the provided covariance. Inconsistent inputs can be detected using the Mahalanobis distance [50] between them, which is defined as

M_d = (a_1 - a_2)^T (A_1 + A_2)^{-1} (a_1 - a_2)  (39)

and detecting whether this distance is larger than a given threshold.

The Mahalanobis distance accounts for the covariance information when computing the distance. If the difference between the estimations is high but their covariance is also high, the Mahalanobis distance yields a small value. In contrast, if the difference between the estimations is small but the covariances are small as well, it could produce a larger distance value. A high Mahalanobis distance could indicate that the estimations are inconsistent; however, a specific threshold is necessary, either established by the user or learned automatically.

The CU algorithm aims to solve the following problem: suppose that a filtering algorithm provides two observations with mean and covariance (a_1, A_1) and (a_2, A_2), respectively. It is known that one of the observations is correct and the other is erroneous. However, the identity of the correct estimation is unknown and cannot be determined. In this situation, if both estimations are employed as an input to the Kalman filter, there will be a problem, because the Kalman filter only guarantees a consistent output if it is updated with a measurement that is consistent with both of them. In the specific case in which the measurements correspond to the same object but are acquired from two different sensors, the Kalman filter can only guarantee that the output is consistent if it is consistent with both separately. Because it is not possible to know which estimation is correct, the only way to combine the two estimations rigorously is to provide an estimation (u, U) that is consistent with both estimations and obeys the following properties:

U \succeq A_1 + (u - a_1)(u - a_1)^T

U \succeq A_2 + (u - a_2)(u - a_2)^T  (40)

where some measurement of the size of the matrix U (i.e., the determinant) is minimized.

In other words, the previous equations indicate that if the estimation (a_1, A_1) is consistent, then the translation of the vector a_1 to u requires increasing the covariance by adding a matrix at least as large as the outer product of (u - a_1) in order to remain consistent. The same situation applies to the measurement (a_2, A_2) in order to be consistent.

A simple strategy is to choose, as the fused mean, the mean of one of the measurements (u = a_1). In this case, the value of U must be chosen such that the estimation is consistent in the worst case (i.e., the correct measurement is a_2). However, it is possible to assign u an intermediate value between a_1 and a_2 to decrease the value of U. Therefore, the CU algorithm establishes the fused mean value u that has the least covariance U that is still sufficiently large with respect to the two measurements (a_1 and a_2) to guarantee consistency.

Because the matrix inequalities presented in the previous equations are convex, convex optimization algorithms must be employed to solve them. The value of U can be computed with the iterative method described by Julier et al. [51]. The obtained covariance could be significantly larger than any of the initial covariances and is an indicator of the existing uncertainty between the initial estimations. One of the advantages of the CU method arises from the fact that the same process can easily be extended to N inputs.
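The inconsistency test of equation (39) and a deliberately simple covariance union are sketched below. Taking u = a_1 and U = U_1 + U_2, where U_1 and U_2 are the two bounds of equation (40), gives a consistent but oversized U (the sum dominates both bounds in the positive-semidefinite order); it is not the determinant-minimal U produced by the iterative method of Julier et al. [51]. The threshold and values are illustrative (numpy assumed):

import numpy as np

def mahalanobis(a1, A1, a2, A2):
    """Inconsistency test of equation (39)."""
    d = a1 - a2
    return float(d @ np.linalg.inv(A1 + A2) @ d)

def covariance_union_simple(a1, A1, a2, A2):
    """Feasible (not determinant-minimal) CU: u = a1 and an inflated U
    that dominates both bounds of equation (40)."""
    u = a1
    U1 = A1 + np.outer(u - a1, u - a1)     # equals A1 for u = a1
    U2 = A2 + np.outer(u - a2, u - a2)
    return u, U1 + U2                      # U1 + U2 >= U1 and >= U2 (PSD order)

a1, A1 = np.array([0.0, 0.0]), 0.1 * np.eye(2)
a2, A2 = np.array([3.0, 0.0]), 0.1 * np.eye(2)
if mahalanobis(a1, A1, a2, A2) > 9.0:      # illustrative threshold
    u, U = covariance_union_simple(a1, A1, a2, A2)
    print(u, U)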

5. Decision Fusion Methods

A decision is typically taken based on the knowledge of the perceived situation, which, in the data fusion domain, is provided by many sources. Decision fusion techniques aim to make a high-level inference about the events and activities that are produced from the detected targets. These techniques often use symbolic information, and the fusion process requires reasoning while accounting for the uncertainties and constraints. These methods fall under level 2 (situation assessment) and level 3 (impact assessment) of the JDL data fusion model.

5.1. The Bayesian Methods. Information fusion based on Bayesian inference provides a formalism for combining evidence according to the rules of probability theory. Uncertainty is represented using conditional probability terms that describe beliefs and take values in the interval [0, 1], where zero indicates a complete lack of belief and one indicates an absolute belief. Bayesian inference is based on the Bayes rule:

P(Y \mid X) = \frac{P(X \mid Y) P(Y)}{P(X)}  (41)

where the posterior probability P(Y | X) represents the belief in the hypothesis Y given the information X. This probability is obtained by multiplying the a priori probability of the hypothesis, P(Y), by the probability of having X given that Y is true, P(X | Y). The value P(X) is used as a normalizing constant. The main disadvantage of Bayesian inference is that the probabilities P(X) and P(X | Y) must be known. To estimate the conditional probabilities, Pan et al. [52] proposed the use of NNs, whereas Coue et al. [53] proposed Bayesian programming.
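As a small worked example of equation (41), consider fusing two conditionally independent sensor reports about a binary hypothesis Y ("target present"); all probabilities below are hypothetical:

# Bayes rule (eq. 41) for a binary hypothesis Y, fusing two
# conditionally independent sensor reports X1 and X2.
p_y = 0.10                                    # a priori probability P(Y)
p_x1_given_y, p_x1_given_not_y = 0.90, 0.20   # sensor 1 model (assumed)
p_x2_given_y, p_x2_given_not_y = 0.80, 0.30   # sensor 2 model (assumed)

likelihood_y = p_x1_given_y * p_x2_given_y
likelihood_not_y = p_x1_given_not_y * p_x2_given_not_y

# P(X) is the normalizing constant, summed over both hypotheses.
p_x = likelihood_y * p_y + likelihood_not_y * (1.0 - p_y)
posterior = likelihood_y * p_y / p_x
print(f"P(Y | X1, X2) = {posterior:.3f}")     # ~0.571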

Hall and Llinas [54] described the following problems associated with Bayesian inference:

(i) difficulty in establishing the value of a priori probabilities;

(ii) complexity when there are multiple potential hypotheses and a substantial number of events that depend on the conditions;

(iii) the hypotheses should be mutually exclusive;

(iv) difficulty in describing the uncertainty of the decisions.

5.2. The Dempster-Shafer Inference. The Dempster-Shafer inference is based on the mathematical theory introduced by Dempster [55] and Shafer [56], which generalizes the Bayesian theory. The Dempster-Shafer theory provides a formalism that can be used to represent incomplete knowledge, to update beliefs, and to combine evidence, and it allows us to represent the uncertainty explicitly [57].

A fundamental concept in Dempster-Shafer reasoning is the frame of discernment, which is defined as follows. Let Θ = {θ_1, θ_2, ..., θ_N} be the set of all possible states that define the system, and let Θ be exhaustive and mutually exclusive, because the system can be in only one state θ_i ∈ Θ, where 1 ≤ i ≤ N. The set Θ is called a frame of discernment because its elements are employed to discern the current state of the system.

The elements of the set 2^Θ are called hypotheses. In the Dempster-Shafer theory, based on the evidence E, a probability is assigned to each hypothesis H ∈ 2^Θ according to the basic probability assignment, or mass function, m: 2^Θ → [0, 1], which satisfies

m(\emptyset) = 0  (42)


Thus, the mass function of the empty set is zero. Furthermore, the mass function of a hypothesis is greater than or equal to zero for all hypotheses:

m(H) \geq 0, \quad \forall H \in 2^{\Theta}  (43)

The sum of the mass functions of all of the hypotheses is one:

\sum_{H \in 2^{\Theta}} m(H) = 1  (44)

To express incomplete beliefs in a hypothesis H, the Dempster-Shafer theory defines the belief function bel: 2^Θ → [0, 1] over Θ as

\text{bel}(H) = \sum_{A \subseteq H} m(A)  (45)

where bel(∅) = 0 and bel(Θ) = 1. The level of doubt in H can be expressed in terms of the belief function by

\text{dou}(H) = \text{bel}(\neg H) = \sum_{A \subseteq \neg H} m(A)  (46)

To express the plausibility of each hypothesis, the function pl: 2^Θ → [0, 1] over Θ is defined as

\text{pl}(H) = 1 - \text{dou}(H) = \sum_{A \cap H \neq \emptyset} m(A)  (47)

Intuitively, plausibility indicates that there is less uncertainty in hypothesis H if it is more plausible. The confidence interval [bel(H), pl(H)] defines the true belief in hypothesis H. To combine the effects of two mass functions m_1 and m_2, the Dempster-Shafer theory defines the rule m_1 ⊕ m_2 as

(m_1 \oplus m_2)(\emptyset) = 0

(m_1 \oplus m_2)(H) = \frac{\sum_{X \cap Y = H} m_1(X) m_2(Y)}{1 - \sum_{X \cap Y = \emptyset} m_1(X) m_2(Y)}  (48)

In contrast to Bayesian inference, a priori probabilities are not required in the Dempster-Shafer inference, because they are assigned at the instant that the information is provided. Several studies in the literature have compared the use of Bayesian inference and Dempster-Shafer inference, such as [58–60]. Wu et al. [61] used the Dempster-Shafer theory to fuse information in context-aware environments. This work was extended in [62] to dynamically modify the weights associated with the sensor measurements. Therefore, the fusion mechanism is calibrated according to the recent measurements of the sensors (in cases in which the ground truth is available). In the military domain [63], Dempster-Shafer reasoning is used with a priori information stored in a database for classifying military ships. Morbee et al. [64] described the use of the Dempster-Shafer theory to build 2D occupancy maps from several cameras and to evaluate the contribution of subsets of cameras to a specific task. Each task is the observation of an event of interest, and the goal is to assess the validity of a set of hypotheses that are fused using the Dempster-Shafer theory.
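A sketch of the combination rule in equation (48), with hypotheses represented as frozensets over the frame of discernment and illustrative mass assignments:

from itertools import product

def dempster_combine(m1, m2):
    """Dempster's rule (eq. 48): hypotheses are frozensets over the
    frame of discernment; each mass function must sum to 1."""
    combined, conflict = {}, 0.0
    for (X, mx), (Y, my) in product(m1.items(), m2.items()):
        inter = X & Y
        if inter:
            combined[inter] = combined.get(inter, 0.0) + mx * my
        else:
            conflict += mx * my            # mass falling on the empty set
    # Normalize by 1 minus the total conflicting mass.
    return {H: v / (1.0 - conflict) for H, v in combined.items()}

# Frame of discernment {a, b}; two illustrative bodies of evidence.
A, B, AB = frozenset("a"), frozenset("b"), frozenset("ab")
m1 = {A: 0.6, AB: 0.4}
m2 = {B: 0.5, AB: 0.5}
print(dempster_combine(m1, m2))            # masses ~0.43 on {a}, ~0.29 on {b}, ~0.29 on {a,b}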

5.3. Abductive Reasoning. Abductive reasoning, or inferring the best explanation, is a reasoning method in which a hypothesis is chosen under the assumption that, in case it is true, it explains the observed event most accurately [65]. In other words, when an event is observed, the abduction method attempts to find the best explanation.

In the context of probabilistic reasoning, abductive inference finds the maximum-likelihood a posteriori configuration of the system variables given some observed variables. Abductive reasoning is more a reasoning pattern than a data fusion technique; therefore, different inference methods, such as NNs [66] or fuzzy logic [67], can be employed.

5.4. Semantic Methods. Decision fusion techniques that employ semantic data from different sources as input can provide more accurate results than those that rely only on single sources. There is a growing interest in techniques that automatically determine the presence of semantic features in videos to bridge the semantic gap [68].

Semantic information fusion is essentially a scheme in which raw sensor data are processed such that the nodes exchange only the resulting semantic information. Semantic information fusion typically covers two phases: (i) building the knowledge and (ii) pattern matching (inference). The first phase (typically offline) incorporates the most appropriate knowledge into semantic information. Then, the second phase (typically online or in real time) fuses relevant attributes and provides a semantic interpretation of the sensor data [69–71].

Semantic fusion can be viewed as a means of integrating and translating sensor data into formal languages. The language obtained from the observations of the environment is then compared with similar languages that are stored in the database. The key to this strategy is that similar behaviors represented by formal languages are also semantically similar. This type of method provides savings in the cost of transmission, because the nodes need only transmit the formal language structure instead of the raw data. However, a known set of behaviors must be stored in a database in advance, which might be difficult in some scenarios.

6. Conclusions

This paper reviews the most popular methods and techniques for performing data/information fusion. To determine whether the application of data/information fusion methods is feasible, we must evaluate the computational cost of the process and the delay introduced in the communication. A centralized data fusion approach is theoretically optimal when there is no cost of transmission and there are sufficient computational resources; however, this situation typically does not hold in practical applications.

The selection of the most appropriate technique depends on the type of problem and on the assumptions established by each technique. Statistical data fusion methods (e.g., PDA, JPDA, MHT, and Kalman) are optimal under specific conditions [72]. First, the assumption that the targets are moving


independently and that the measurements are normally distributed around the predicted position typically does not hold. Second, because the statistical techniques model all of the events as probabilities, they typically have several parameters and a priori probabilities for false measurements and detection errors that are often difficult to obtain (at least in an optimal sense). For example, in the case of the MHT algorithm, specific parameters must be established that are nontrivial to determine and are very sensitive [73]. In addition, statistical methods that optimize over several frames are computationally intensive, and their complexity typically grows exponentially with the number of targets. For example, in the case of particle filters, tracking several targets can be accomplished jointly, as a group, or individually. If several targets are tracked jointly, the necessary number of particles grows exponentially. Therefore, in practice, it is better to track them individually, with the assumption that the targets do not interact between the particles.

In contrast to centralized systems, distributed data fusion methods introduce some challenges in the data fusion process, such as (i) the spatial and temporal alignment of the information, (ii) out-of-sequence measurements, and (iii) data correlation, as reported by Castanedo et al. [74, 75]. The inherent redundancy of distributed systems could be exploited with distributed reasoning techniques and cooperative algorithms to improve the individual node estimations, as reported by Castanedo et al. [76]. In addition to the previous studies, a new trend based on the geometric notion of a low-dimensional manifold is gaining attention in the data fusion community. An example is the work of Davenport et al. [77], which proposes a simple model that captures the correlation between the sensor observations by matching the parameter values for the different obtained manifolds.

Acknowledgments

The author would like to thank Jesús García, Miguel A. Patricio, and James Llinas for their interesting and related discussions on several topics that were presented in this paper.

References

[1] JDL, Data Fusion Lexicon, Technical Panel For C3, F. E. White, San Diego, Calif, USA, Code 420, 1991.
[2] D. L. Hall and J. Llinas, "An introduction to multisensor data fusion," Proceedings of the IEEE, vol. 85, no. 1, pp. 6–23, 1997.
[3] H. F. Durrant-Whyte, "Sensor models and multisensor integration," International Journal of Robotics Research, vol. 7, no. 6, pp. 97–113, 1988.
[4] B. V. Dasarathy, "Sensor fusion potential exploitation-innovative architectures and illustrative applications," Proceedings of the IEEE, vol. 85, no. 1, pp. 24–38, 1997.
[5] R. C. Luo, C.-C. Yih, and K. L. Su, "Multisensor fusion and integration: approaches, applications, and future research directions," IEEE Sensors Journal, vol. 2, no. 2, pp. 107–119, 2002.
[6] J. Llinas, C. Bowman, G. Rogova, A. Steinberg, E. Waltz, and F. White, "Revisiting the JDL data fusion model II," Technical Report, DTIC Document, 2004.
[7] E. P. Blasch and S. Plano, "JDL level 5 fusion model 'user refinement' issues and applications in group tracking," in Proceedings of the Signal Processing, Sensor Fusion, and Target Recognition XI, pp. 270–279, April 2002.
[8] H. F. Durrant-Whyte and M. Stevens, "Data fusion in decentralized sensing networks," in Proceedings of the 4th International Conference on Information Fusion, pp. 302–307, Montreal, Canada, 2001.
[9] J. Manyika and H. Durrant-Whyte, Data Fusion and Sensor Management: A Decentralized Information-Theoretic Approach, Prentice Hall, Upper Saddle River, NJ, USA, 1995.
[10] S. S. Blackman, "Association and fusion of multiple sensor data," in Multitarget-Multisensor Tracking: Advanced Applications, pp. 187–217, Artech House, 1990.
[11] S. Lloyd, "Least squares quantization in PCM," IEEE Transactions on Information Theory, vol. 28, no. 2, pp. 129–137, 1982.
[12] M. Shindler, A. Wong, and A. Meyerson, "Fast and accurate κ-means for large datasets," in Proceedings of the 25th Annual Conference on Neural Information Processing Systems (NIPS '11), pp. 2375–2383, December 2011.
[13] Y. Bar-Shalom and E. Tse, "Tracking in a cluttered environment with probabilistic data association," Automatica, vol. 11, no. 5, pp. 451–460, 1975.
[14] T. E. Fortmann, Y. Bar-Shalom, and M. Scheffe, "Multi-target tracking using joint probabilistic data association," in Proceedings of the 19th IEEE Conference on Decision and Control including the Symposium on Adaptive Processes, vol. 19, pp. 807–812, December 1980.
[15] D. B. Reid, "An algorithm for tracking multiple targets," IEEE Transactions on Automatic Control, vol. 24, no. 6, pp. 843–854, 1979.
[16] C. L. Morefield, "Application of 0-1 integer programming to multitarget tracking problems," IEEE Transactions on Automatic Control, vol. 22, no. 3, pp. 302–312, 1977.
[17] R. L. Streit and T. E. Luginbuhl, "Maximum likelihood method for probabilistic multihypothesis tracking," in Proceedings of the Signal and Data Processing of Small Targets, vol. 2235 of Proceedings of SPIE, p. 394, 1994.
[18] I. J. Cox and S. L. Hingorani, "Efficient implementation of Reid's multiple hypothesis tracking algorithm and its evaluation for the purpose of visual tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 2, pp. 138–150, 1996.
[19] K. G. Murty, "An algorithm for ranking all the assignments in order of increasing cost," Operations Research, vol. 16, no. 3, pp. 682–687, 1968.
[20] M. E. Liggins II, C.-Y. Chong, I. Kadar, et al., "Distributed fusion architectures and algorithms for target tracking," Proceedings of the IEEE, vol. 85, no. 1, pp. 95–106, 1997.
[21] S. Coraluppi, C. Carthel, M. Luettgen, and S. Lynch, "All-source track and identity fusion," in Proceedings of the National Symposium on Sensor and Data Fusion, 2000.
[22] P. Storms and F. Spieksma, "An LP-based algorithm for the data association problem in multitarget tracking," in Proceedings of the 3rd IEEE International Conference on Information Fusion, vol. 1, 2000.
[23] S.-W. Joo and R. Chellappa, "A multiple-hypothesis approach for multiobject visual tracking," IEEE Transactions on Image Processing, vol. 16, no. 11, pp. 2849–2854, 2007.
[24] S. Coraluppi and C. Carthel, "Aggregate surveillance: a cardinality tracking approach," in Proceedings of the 14th International Conference on Information Fusion (FUSION '11), July 2011.


[25] K. C. Chang, C. Y. Chong, and Y. Bar-Shalom, "Joint probabilistic data association in distributed sensor networks," IEEE Transactions on Automatic Control, vol. 31, no. 10, pp. 889–897, 1986.
[26] C. Y. Chong, S. Mori, and K. C. Chang, "Information fusion in distributed sensor networks," in Proceedings of the 4th American Control Conference, Boston, Mass, USA, June 1985.
[27] C. Y. Chong, S. Mori, and K. C. Chang, "Distributed multitarget multisensor tracking," in Multitarget-Multisensor Tracking: Advanced Applications, vol. 1, pp. 247–295, 1990.
[28] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, San Mateo, Calif, USA, 1988.
[29] D. Koller and N. Friedman, Probabilistic Graphical Models: Principles and Techniques, MIT Press, 2009.
[30] L. Chen, M. Cetin, and A. S. Willsky, "Distributed data association for multi-target tracking in sensor networks," in Proceedings of the 7th International Conference on Information Fusion (FUSION '05), pp. 9–16, July 2005.
[31] L. Chen, M. J. Wainwright, M. Cetin, and A. S. Willsky, "Data association based on optimization in graphical models with application to sensor networks," Mathematical and Computer Modelling, vol. 43, no. 9-10, pp. 1114–1135, 2006.
[32] Y. Weiss and W. T. Freeman, "On the optimality of solutions of the max-product belief-propagation algorithm in arbitrary graphs," IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 736–744, 2001.
[33] C. Brown, H. Durrant-Whyte, J. Leonard, B. Rao, and B. Steer, "Distributed data fusion using Kalman filtering: a robotics application," in Data Fusion in Robotics and Machine Intelligence, M. A. Abidi and R. C. Gonzalez, Eds., pp. 267–309, 1992.
[34] R. E. Kalman, "A new approach to linear filtering and prediction problems," Journal of Basic Engineering, vol. 82, no. 1, pp. 35–45, 1960.
[35] R. C. Luo and M. G. Kay, "Data fusion and sensor integration: state-of-the-art 1990s," in Data Fusion in Robotics and Machine Intelligence, pp. 7–135, 1992.
[36] G. Welch and G. Bishop, An Introduction to the Kalman Filter, ACM SIGGRAPH 2001 Course Notes, 2001.
[37] S. J. Julier and J. K. Uhlmann, "A new extension of the Kalman filter to nonlinear systems," in Proceedings of the International Symposium on Aerospace/Defense Sensing, Simulation and Controls, vol. 3, 1997.
[38] E. A. Wan and R. Van Der Merwe, "The unscented Kalman filter for nonlinear estimation," in Proceedings of the Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC '00), pp. 153–158, 2000.
[39] D. Crisan and A. Doucet, "A survey of convergence results on particle filtering methods for practitioners," IEEE Transactions on Signal Processing, vol. 50, no. 3, pp. 736–746, 2002.
[40] J. Martinez-del Rincon, C. Orrite-Urunuela, and J. E. Herrero-Jaraba, "An efficient particle filter for color-based tracking in complex scenes," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance, pp. 176–181, 2007.
[41] S. Ganeriwal, R. Kumar, and M. B. Srivastava, "Timing-sync protocol for sensor networks," in Proceedings of the 1st International Conference on Embedded Networked Sensor Systems (SenSys '03), pp. 138–149, November 2003.
[42] M. Manzo, T. Roosta, and S. Sastry, "Time synchronization in networks," in Proceedings of the 3rd ACM Workshop on Security of Ad Hoc and Sensor Networks (SASN '05), pp. 107–116, November 2005.
[43] J. K. Uhlmann, "Covariance consistency methods for fault-tolerant distributed data fusion," Information Fusion, vol. 4, no. 3, pp. 201–215, 2003.
[44] S. Bashi, V. P. Jilkov, X. R. Li, and H. Chen, "Distributed implementations of particle filters," in Proceedings of the 6th International Conference of Information Fusion, pp. 1164–1171, 2003.
[45] M. Coates, "Distributed particle filters for sensor networks," in Proceedings of the 3rd International Symposium on Information Processing in Sensor Networks (ACM '04), pp. 99–107, New York, NY, USA, 2004.
[46] D. Gu, "Distributed particle filter for target tracking," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '07), pp. 3856–3861, April 2007.
[47] Y. Bar-Shalom, "Update with out-of-sequence measurements in tracking: exact solution," IEEE Transactions on Aerospace and Electronic Systems, vol. 38, no. 3, pp. 769–778, 2002.
[48] M. Orton and A. Marrs, "A Bayesian approach to multi-target tracking and data fusion with out-of-sequence measurements," IEE Colloquium, no. 174, pp. 151–155, 2001.
[49] M. L. Hernandez, A. D. Marrs, S. Maskell, and M. R. Orton, "Tracking and fusion for wireless sensor networks," in Proceedings of the 5th International Conference on Information Fusion, 2002.
[50] P. C. Mahalanobis, "On the generalized distance in statistics," Proceedings of the National Institute of Sciences of India, vol. 2, no. 1, pp. 49–55, 1936.
[51] S. J. Julier, J. K. Uhlmann, and D. Nicholson, "A method for dealing with assignment ambiguity," in Proceedings of the American Control Conference (ACC '04), vol. 5, pp. 4102–4107, July 2004.
[52] H. Pan, Z.-P. Liang, T. J. Anastasio, and T. S. Huang, "Hybrid NN-Bayesian architecture for information fusion," in Proceedings of the International Conference on Image Processing (ICIP '98), pp. 368–371, October 1998.
[53] C. Coue, T. Fraichard, P. Bessiere, and E. Mazer, "Multi-sensor data fusion using Bayesian programming: an automotive application," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 141–146, October 2002.
[54] D. L. Hall and J. Llinas, Handbook of Multisensor Data Fusion, CRC Press, Boca Raton, Fla, USA, 2001.
[55] A. P. Dempster, "A generalization of Bayesian inference," Journal of the Royal Statistical Society B, vol. 30, no. 2, pp. 205–247, 1968.
[56] G. Shafer, A Mathematical Theory of Evidence, Princeton University Press, Princeton, NJ, USA, 1976.
[57] G. M. Provan, "The validity of Dempster-Shafer belief functions," International Journal of Approximate Reasoning, vol. 6, no. 3, pp. 389–399, 1992.
[58] D. M. Buede, "Shafer-Dempster and Bayesian reasoning: a response to 'Shafer-Dempster reasoning with applications to multisensor target identification systems'," IEEE Transactions on Systems, Man and Cybernetics, vol. 18, no. 6, pp. 1009–1011, 1988.
[59] Y. Cheng and R. L. Kashyap, "Comparison of Bayesian and Dempster's rules in evidence combination," in Maximum-Entropy and Bayesian Methods in Science and Engineering, 1988.
[60] B. R. Cobb and P. P. Shenoy, "A comparison of Bayesian and belief function reasoning," Information Systems Frontiers, vol. 5, no. 4, pp. 345–358, 2003.
[61] H. Wu, M. Siegel, R. Stiefelhagen, and J. Yang, "Sensor fusion using Dempster-Shafer theory," in Proceedings of the 19th IEEE Instrumentation and Measurement Technology Conference (IMTC '02), pp. 7–11, May 2002.


[62] H. Wu, M. Siegel, and S. Ablay, "Sensor fusion using Dempster-Shafer theory II: static weighting and Kalman filter-like dynamic weighting," in Proceedings of the 20th IEEE Instrumentation and Measurement Technology Conference (IMTC '03), pp. 907–912, May 2003.
[63] E. Bosse, P. Valin, A.-C. Boury-Brisset, and D. Grenier, "Exploitation of a priori knowledge for information fusion," Information Fusion, vol. 7, no. 2, pp. 161–175, 2006.
[64] M. Morbee, L. Tessens, H. Aghajan, and W. Philips, "Dempster-Shafer based multi-view occupancy maps," Electronics Letters, vol. 46, no. 5, pp. 341–343, 2010.
[65] C. S. Peirce, Abduction and Induction: Philosophical Writings of Peirce, vol. 156, Dover, New York, NY, USA, 1955.
[66] A. M. Abdelbar, E. A. M. Andrews, and D. C. Wunsch II, "Abductive reasoning with recurrent neural networks," Neural Networks, vol. 16, no. 5-6, pp. 665–673, 2003.
[67] J. R. Aguero and A. Vargas, "Inference of operative configuration of distribution networks using fuzzy logic techniques. Part II: extended real-time model," IEEE Transactions on Power Systems, vol. 20, no. 3, pp. 1562–1569, 2005.
[68] A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, "Content-based image retrieval at the end of the early years," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1349–1380, 2000.
[69] D. S. Friedlander and S. Phoha, "Semantic information fusion for coordinated signal processing in mobile sensor networks," International Journal of High Performance Computing Applications, vol. 16, no. 3, pp. 235–241, 2002.
[70] D. S. Friedlander, "Semantic information extraction," in Distributed Sensor Networks, 2005.
[71] K. Whitehouse, J. Liu, and F. Zhao, "Semantic Streams: a framework for composable inference over sensor data," in Proceedings of the 3rd European Workshop on Wireless Sensor Networks, Lecture Notes in Computer Science, Springer, February 2006.
[72] I. J. Cox, "A review of statistical data association techniques for motion correspondence," International Journal of Computer Vision, vol. 10, no. 1, pp. 53–66, 1993.
[73] C. J. Veenman, M. J. T. Reinders, and E. Backer, "Resolving motion correspondence for densely moving points," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 1, pp. 54–72, 2001.
[74] F. Castanedo, M. A. Patricio, J. García, and J. M. Molina, "Bottom-up/top-down coordination in a multiagent visual sensor network," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS '07), pp. 93–98, September 2007.
[75] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Analysis of distributed fusion alternatives in coordinated vision agents," in Proceedings of the 11th International Conference on Information Fusion (FUSION '08), July 2008.
[76] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Data fusion to improve trajectory tracking in a cooperative surveillance multi-agent architecture," Information Fusion, vol. 11, no. 3, pp. 243–255, 2010.
[77] M. A. Davenport, C. Hegde, M. F. Duarte, and R. G. Baraniuk, "Joint manifolds for data fusion," IEEE Transactions on Image Processing, vol. 19, no. 10, pp. 2580–2594, 2010.


The conclusions obtained from reviewing the different methods are highlighted in Section 6.

2. Classification of Data Fusion Techniques

Data fusion is a multidisciplinary area that involves several fields, and it is difficult to establish a clear and strict classification. The employed methods and techniques can be divided according to the following criteria:

(1) attending to the relations between the input data sources, as proposed by Durrant-Whyte [3]; these relations can be defined as (a) complementary, (b) redundant, or (c) cooperative data;

(2) according to the input/output data types and their nature, as proposed by Dasarathy [4];

(3) following an abstraction level of the employed data: (a) raw measurements, (b) signals, and (c) characteristics or decisions;

(4) based on the different data fusion levels defined by the JDL;

(5) depending on the architecture type: (a) centralized, (b) decentralized, or (c) distributed.

2.1. Classification Based on the Relations between the Data Sources. Based on the relations of the sources (see Figure 1), Durrant-Whyte [3] proposed the following classification criteria:

(1) complementary: when the information provided by the input sources represents different parts of the scene and could thus be used to obtain more complete global information. For example, in the case of visual sensor networks, the information on the same target provided by two cameras with different fields of view is considered complementary;

(2) redundant: when two or more input sources provide information about the same target and could thus be fused to increase the confidence. For example, the data coming from overlapped areas in visual sensor networks are considered redundant;

(3) cooperative: when the provided information is combined into new information that is typically more complex than the original information. For example, multimodal (audio and video) data fusion is considered cooperative.

2.2. Dasarathy's Classification. One of the most well-known data fusion classification systems was provided by Dasarathy [4] and is composed of the following five categories (see Figure 2):

(1) data in-data out (DAI-DAO): this type is the most basic, or elementary, data fusion method that is considered in the classification. This type of data fusion process inputs and outputs raw data; the results are typically more reliable or accurate. Data fusion at this level is conducted immediately after the data are gathered from the sensors. The algorithms employed at this level are based on signal and image processing algorithms;

(2) data in-feature out (DAI-FEO): at this level, the data fusion process employs raw data from the sources to extract features or characteristics that describe an entity in the environment;

(3) feature in-feature out (FEI-FEO): at this level, both the input and output of the data fusion process are features. Thus, the data fusion process addresses a set of features with the aim of improving or refining them, or obtaining new features. This process is also known as feature fusion, symbolic fusion, information fusion, or intermediate-level fusion;

(4) feature in-decision out (FEI-DEO): this level obtains a set of features as input and provides a set of decisions as output. Most of the classification systems that perform a decision based on a sensor's inputs fall into this category of classification;

(5) decision in-decision out (DEI-DEO): this type of classification is also known as decision fusion. It fuses input decisions to obtain better or new decisions.

The main contribution of Dasarathy's classification is the specification of the abstraction level, either as an input or an output, providing a framework to classify different methods or techniques.

2.3. Classification Based on the Abstraction Levels. Luo et al. [5] provided the following four abstraction levels:

(1) signal level: directly addresses the signals that are acquired from the sensors;

(2) pixel level: operates at the image level and could be used to improve image processing tasks;

(3) characteristic level: employs features that are extracted from the images or signals (i.e., shape or velocity);

(4) symbol level: at this level, information is represented as symbols; this level is also known as the decision level.

Information fusion typically addresses three levels of abstraction: (1) measurements, (2) characteristics, and (3) decisions. Other possible classifications of data fusion based on the abstraction levels are as follows:

(1) low-level fusion: the raw data are directly provided as an input to the data fusion process, which provides more accurate data (a lower signal-to-noise ratio) than the individual sources;

(2) medium-level fusion: characteristics or features (shape, texture, and position) are fused to obtain features that could be employed for other tasks. This level is also known as the feature or characteristic level;


[Figure 1: Whyte's classification based on the relations between the data sources.]

[Figure 2: Dasarathy's classification.]

(3) high-level fusion: this level, which is also known as decision fusion, takes symbolic representations as sources and combines them to obtain a more accurate decision. Bayesian methods are typically employed at this level;

(4) multiple-level fusion: this level addresses data provided from different levels of abstraction (i.e., when a measurement is combined with a feature to obtain a decision).

2.4. JDL Data Fusion Classification. This classification is the most popular conceptual model in the data fusion community. It was originally proposed by the JDL and the American Department of Defense (DoD) [1]. These organizations classified the data fusion process into five processing levels, an associated database, and an information bus that connects the five components (see Figure 3). The five levels can be grouped into low-level fusion and high-level fusion, and the model comprises the following components:

(i) sources: the sources are in charge of providing the input data. Different types of sources can be employed, such as sensors, a priori information (references or geographic data), databases, and human inputs;

(ii) human-computer interaction (HCI): HCI is an interface that allows inputs to the system from the operators and produces outputs to the operators. HCI includes queries, commands, and information on the obtained results and alarms;

(iii) database management system: the database management system stores the provided information and the fused results. This system is a critical component because of the large amount of highly diverse information that is stored.

In contrast, the five levels of data processing are defined as follows:

(1) level 0 (source preprocessing): source preprocessing is the lowest level of the data fusion process, and it includes fusion at the signal and pixel levels. In the case of text sources, this level also includes the information extraction process. This level reduces the amount of data and maintains useful information for the high-level processes;

(2) level 1 (object refinement): object refinement employs the processed data from the previous level. Common procedures of this level include spatiotemporal alignment, association, correlation, clustering or grouping techniques, state estimation, the removal of false positives, identity fusion, and the combining of features that were extracted from images. The output results of this stage are the object discrimination (classification and identification) and object tracking (state of the object and orientation). This stage transforms the input information into consistent data structures;

[Figure 3: The JDL data fusion framework.]

(3) level 2 (situation assessment): this level focuses on a higher level of inference than level 1. Situation assessment aims to identify the likely situations given the observed events and obtained data. It establishes relationships between the objects. Relations (i.e., proximity, communication) are valued to determine the significance of the entities or objects in a specific environment. The aims of this level include performing high-level inferences and identifying significant activities and events (patterns in general). The output is a set of high-level inferences;

(4) level 3 (impact assessment): this level evaluates the impact of the activities detected in level 2 to obtain a proper perspective. The current situation is evaluated, and a future projection is performed to identify possible risks, vulnerabilities, and operational opportunities. This level includes (1) an evaluation of the risk or threat and (2) a prediction of the logical outcome;

(5) level 4 (process refinement): this level improves the process from level 0 to level 3 and provides resource and sensor management. The aim is to achieve efficient resource management while accounting for task priorities, scheduling, and the control of the available resources.

High-level fusion typically starts at level 2, because the type, localization, movement, and quantity of the objects are known at that level. One of the limitations of the JDL model is how the uncertainty about previous or subsequent results could be employed to enhance the fusion process (feedback loop). Llinas et al. [6] proposed several refinements and extensions to the JDL model. Blasch and Plano [7] proposed adding a new level (user refinement) to support a human user in the data fusion loop. The JDL model represents the first effort to provide a detailed model and a common terminology for the data fusion domain. However, because its roots originate in the military domain, the employed terms are oriented to the risks that commonly occur in these scenarios. The Dasarathy model differs from the JDL model with regard to the adopted terminology and approach: the former is oriented toward the differences among the input and output results, independent of the employed fusion method. In summary, the Dasarathy model provides a method for understanding the relations between the fusion tasks and the employed data, whereas the JDL model presents an appropriate fusion perspective for designing data fusion systems.

2.5. Classification Based on the Type of Architecture. One of the main questions that arises when designing a data fusion system is where the data fusion process will be performed. Based on this criterion, the following types of architectures can be identified:

(1) centralized architecture: in a centralized architecture, the fusion node resides in the central processor that receives the information from all of the input sources. Therefore, all of the fusion processes are executed in a central processor that uses the raw measurements provided by the sources. In this schema, the sources obtain only the observations as measurements and transmit them to a central processor, where the data fusion process is performed. If we assume that data alignment and data association are performed correctly and that the required time to transfer the data is not significant, then the centralized scheme is theoretically optimal. However, the previous assumptions typically do not hold for real systems. Moreover, the large amount of bandwidth required to send raw data through the network is another disadvantage of the centralized approach. This issue becomes a bottleneck when this type of architecture is employed for fusing data in visual sensor networks. Finally, the time delays when transferring the information between the different sources are variable and affect the results in the centralized scheme to a greater degree than in other schemes;

(2) decentralized architecture: a decentralized architecture is composed of a network of nodes in which each node has its own processing capabilities, and there is no single point of data fusion. Therefore, each node fuses its local information with the information that is received from its peers. Data fusion is performed autonomously, with each node accounting for its local information and the information received from its peers. Decentralized data fusion algorithms typically communicate information using the Fisher and Shannon measures instead of the object's state [8]. The main disadvantage of this architecture is the communication cost, which is O(n^2) at each communication step, where n is the number of nodes, in the extreme case in which each node communicates with all of its peers. Thus, this type of architecture could suffer from scalability problems when the number of nodes is increased;

(3) distributed architecture: in a distributed architecture, measurements from each source node are processed independently before the information is sent to the fusion node; the fusion node accounts for the information that is received from the other nodes. In other words, the data association and state estimation are performed in the source node before the information is communicated to the fusion node. Therefore, each node provides an estimation of the object state based only on its local view, and this information is the input to the fusion process, which provides a fused global view. This type of architecture provides different options and variations that range from only one fusion node to several intermediate fusion nodes;

(4) hierarchical architecture: other architectures comprise a combination of decentralized and distributed nodes, generating hierarchical schemes in which the data fusion process is performed at different levels of the hierarchy.

In principle, a decentralized data fusion system is more difficult to implement because of the computation and communication requirements. However, in practice, there is no single best architecture, and the selection of the most appropriate architecture should be made depending on the requirements, demands, existing networks, data availability, node processing capabilities, and organization of the data fusion system.

The reader might think that the decentralized and distributed architectures are similar; however, they have meaningful differences (see Figure 4). First, in a distributed architecture, a preprocessing of the obtained measurements is performed, which provides a vector of features as a result (the features are fused thereafter). In contrast, in the decentralized architecture, the complete data fusion process is conducted in each node, and each of the nodes provides a globally fused result. Second, the decentralized fusion algorithms typically communicate information employing the Fisher and Shannon measures. In contrast, distributed algorithms typically share a common notion of state (position, velocity, and identity) with the associated probabilities, which are used to perform the fusion process [9]. Third, because the decentralized data fusion algorithms exchange information instead of states and probabilities, they have the advantage of easily separating old knowledge from new knowledge: the process is additive, and the order of association is not relevant when the information is received and fused. However, in the distributed data fusion algorithms (e.g., the distributed Kalman filter), the state that is going to be fused is not associative, so when and how the fused estimates are computed is relevant. Nevertheless, in contrast to the centralized architectures, the distributed algorithms reduce the necessary communication and computational costs because some tasks are computed in the distributed nodes before data fusion is performed in the fusion node.

3. Data Association Techniques

The data association problem consists of determining the set of measurements that correspond to each target (see Figure 5). Let us suppose that there are O targets that are being tracked by only one sensor in a cluttered environment (by a cluttered environment, we refer to an environment in which several targets are too close to each other). Then, the data association problem can be defined as follows:

(i) each sensor's observation is received in the fusion node at discrete time intervals;

(ii) the sensor might not provide observations at a specific interval;

(iii) some observations are noise, and other observations originate from the detected targets;

(iv) for any specific target and in every time interval, we do not know (a priori) the observations that will be generated by that target.

Therefore, the goal of data association is to establish the set of observations or measurements that are generated by the same target over time. Hall and Llinas [2] provided the following definition of data association: "the process of assigning and computing the weights that relate the observations or tracks (a track can be defined as an ordered set of points that follow a path and are generated by the same target) from one set to the observations or tracks of another set."

As an example of the complexity of the data association problem, if we take a frame-to-frame association and assume that M possible points could be detected in all n frames, then the number of possible association sets is (M!)^(n-1). Note that, from all of these possible solutions, only one set establishes the true movement of the M points.

Data association is often performed before the state estimation of the detected targets. Moreover, it is a key step because the estimation or classification will behave incorrectly if the data association phase does not work coherently. The data association process can also appear at all of the fusion levels, but the granularity varies depending on the objective of each level.

Figure 4: Classification based on the type of architecture. In the centralized architecture, the sources S1, ..., Sn send their preprocessed measurements to a single fusion node that performs alignment, association, and estimation and outputs the state of the object; in the decentralized architecture, every node runs the full preprocessing, alignment, association, and estimation chain and provides its own state of the object; in the distributed architecture, each source estimates the state locally, and a fusion node combines the results into the fused state of the object.

In general, an exhaustive search of all possible combinations grows exponentially with the number of targets; thus, the data association problem becomes NP-complete. The most common techniques employed to solve the data association problem are presented in the following sections (Sections 3.1 to 3.7).

3.1. Nearest Neighbors and K-Means. Nearest neighbor (NN) is the simplest data association technique. NN is a well-known clustering algorithm that selects or groups the most similar values. How close one measurement is to another depends on the employed distance metric and typically on the threshold established by the designer. In general, the employed criteria could be based on (1) an absolute distance, (2) the Euclidean distance, or (3) a statistical function of the distance.

NN is a simple algorithm that can find a feasible (approximate) solution in a small amount of time. However, in a cluttered environment, it could provide many pairs that have the same probability and could thus produce undesirable error propagation [10]. Moreover, this algorithm performs poorly in environments in which false measurements are frequent, that is, in highly noisy environments.

Figure 5: Conceptual overview of the data association process from multiple sensors and multiple targets. It is necessary to establish the set of observations over time from the same object that forms a track.
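To make the NN criterion concrete, the following Python sketch (assuming NumPy) performs a greedy nearest-neighbor association between predicted track positions and incoming measurements with a Euclidean gate; the function name, array layout, and gate value are illustrative assumptions rather than a prescribed implementation.

```python
import numpy as np

def nn_associate(predicted_tracks, measurements, gate=3.0):
    """Greedy nearest-neighbor association: each track takes the closest
    unused measurement whose distance falls below the gate threshold."""
    assignments = {}  # track index -> measurement index (or None)
    used = set()
    for t, pred in enumerate(predicted_tracks):
        dists = np.linalg.norm(measurements - pred, axis=1)
        assignments[t] = None
        for m in np.argsort(dists):
            if dists[m] > gate:
                break  # remaining candidates are even farther away
            if int(m) not in used:
                assignments[t] = int(m)
                used.add(int(m))
                break
    return assignments
```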

The all-neighbors approach employs a similar technique in which all of the measurements inside a region are included in the tracks. The K-Means method [11] is a well-known modification of the NN algorithm. K-Means divides the dataset values into K different clusters. The K-Means algorithm finds the best localization of the cluster centroids, where best means a centroid that is in the center of its data cluster. K-Means is an iterative algorithm that can be divided into the following steps (a Python sketch follows the list):

(1) obtain the input data and the number of desired clusters (K);

(2) randomly assign the centroid of each cluster;

(3) match each data point with the centroid of the closest cluster;

(4) move the cluster centers to the centroid of their assigned data;

(5) if the algorithm does not converge, return to step (3).
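A minimal Python sketch of steps (1)-(5), assuming NumPy; initializing the centroids by sampling input points and testing convergence with a tolerance are illustrative choices:

```python
import numpy as np

def k_means(data, k, max_iter=100, seed=0):
    """Lloyd's K-Means following steps (1)-(5). data: (n, d) array."""
    rng = np.random.default_rng(seed)
    # (2) randomly pick k input points as the initial centroids
    centroids = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(max_iter):
        # (3) match each data point with the closest centroid
        dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # (4) move each cluster center to the centroid of its assigned points
        new_centroids = np.array([
            data[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # (5) stop once the centroids no longer move; otherwise iterate again
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels
```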

K-Means is a popular algorithm that has been widely employed; however, it has the following disadvantages:

(i) the algorithm does not always find the optimal solution for the cluster centers;

(ii) the number of clusters must be known a priori, and one must assume that this number is the optimum;

(iii) the algorithm assumes that the covariance of the dataset is irrelevant or that it has already been normalized.

There are several options for overcoming these limitations. For the first one, it is possible to execute the algorithm several times and keep the solution that has the least variance. For the second one, it is possible to start with a low value of K and increment the value of K until an adequate result is obtained. The third limitation can be easily overcome by multiplying the data with the inverse of the covariance matrix.

Many variations of Lloyd's basic K-Means algorithm [11], which has a computational upper bound cost of O(Kn), where n is the number of input points and K is the number of desired clusters, have been proposed. Some algorithms modify the initial cluster assignments to improve the separations and reduce the number of iterations. Others introduce soft or multinomial clustering assignments using fuzzy logic, probabilistic, or Bayesian techniques. However, most of the previous variations still must perform several iterations through the data space to converge to a reasonable solution. This issue becomes a major disadvantage in several real-time applications. A new approach that is based on having a large (but still affordable) number of cluster candidates compared to the desired K clusters is currently gaining attention. The idea behind this computational model is that the algorithm builds a good sketch of the original data while reducing the dimensionality of the input space significantly. In this manner, a weighted K-Means can be applied to the large set of candidate clusters to derive a good clustering of the original data. Using this idea, [12] presented an efficient and scalable K-Means algorithm that is based on random projections. This algorithm requires only one pass through the input data to build the clusters. More specifically, if the input data distribution holds some separability requirements, then the number of required candidate clusters grows only according to O(log n), where n is the number of observations in the original data. This salient feature makes the algorithm scalable in terms of both the memory and computational requirements.

3.2. Probabilistic Data Association. The probabilistic data association (PDA) algorithm was proposed by Bar-Shalom and Tse [13] and is also known as the modified filter of all neighbors. This algorithm assigns an association probability to each hypothesis arising from a valid measurement of a target. A valid measurement refers to an observation that falls in the validation gate of the target at that time instant. The validation gate gamma, which is centered around the predicted measurement of the target, is used to select the set of valid measurements and is defined as

\gamma \ge (z(k) - \hat{z}(k | k - 1))^T S^{-1}(k) (z(k) - \hat{z}(k | k - 1))   (1)

where k is the temporal index, S(k) is the innovation covariance, and \gamma determines the gating or window size. The set of valid measurements at time instant k is defined as

Z(k) = \{ z_i(k), i = 1, \ldots, m_k \}   (2)


where z_i(k) is the i-th measurement in the validation region at time instant k. The standard equations of the PDA algorithm are given next. For the state prediction, consider

\hat{x}(k | k - 1) = F(k - 1) \hat{x}(k - 1 | k - 1)   (3)

where F(k - 1) is the transition matrix at time instant k - 1. To calculate the measurement prediction, consider

\hat{z}(k | k - 1) = H(k) \hat{x}(k | k - 1)   (4)

where H(k) is the linearized measurement matrix. To compute the gain or the innovation of the i-th measurement, consider

v_i(k) = z_i(k) - \hat{z}(k | k - 1)   (5)

To calculate the covariance prediction, consider

P(k | k - 1) = F(k - 1) P(k - 1 | k - 1) F(k - 1)^T + Q(k)   (6)

where Q(k) is the process noise covariance matrix. To compute the innovation covariance S and the Kalman gain K, consider

S(k) = H(k) P(k | k - 1) H(k)^T + R

K(k) = P(k | k - 1) H(k)^T S(k)^{-1}   (7)

To obtain the covariance update in the case in which the measurements originated by the target are known, consider

P^0(k | k) = P(k | k - 1) - K(k) S(k) K(k)^T   (8)

The total update of the covariance is computed as

v(k) = \sum_{i=1}^{m_k} \beta_i(k) v_i(k)

P(k) = K(k) \left[ \sum_{i=1}^{m_k} \beta_i(k) v_i(k) v_i(k)^T - v(k) v(k)^T \right] K^T(k)   (9)

where m_k is the number of valid measurements at instant k. The equation to update the estimated state, which is formed by the position and velocity, is given by

\hat{x}(k | k) = \hat{x}(k | k - 1) + K(k) v(k)   (10)

Finally, the association probabilities of PDA are as follows:

\beta_i(k) = \frac{p_i(k)}{\sum_{j=0}^{m_k} p_j(k)}   (11)

where

p_i(k) = \begin{cases} (2\pi)^{M/2} \lambda \sqrt{|S_i(k)|} \, \frac{1 - P_d P_g}{P_d}, & \text{if } i = 0 \\ \exp\left[-\frac{1}{2} v_i^T(k) S^{-1}(k) v_i(k)\right], & \text{if } i \neq 0 \\ 0, & \text{in other cases} \end{cases}   (12)

where M is the dimension of the measurement vector, \lambda is the density of the clutter environment, P_d is the detection probability of the correct measurement, and P_g is the validation probability of a detected value.

In the PDA algorithm, the state estimation of the target is computed as a weighted sum of the estimated states under all of the hypotheses. The algorithm can associate different measurements to one specific target; this association of different measurements to a specific target helps PDA estimate the target state, and the association probabilities are used as weights. The main disadvantages of the PDA algorithm are the following:

(i) loss of tracks: because PDA ignores the interference with other targets, it sometimes could wrongly classify the closest tracks. Therefore, it provides poor performance when the targets are close to each other or crossing;

(ii) suboptimal Bayesian approximation: when the source of information is uncertain, PDA is a suboptimal Bayesian approximation to the association problem;

(iii) one target: PDA was initially designed for the association of one target in a low-clutter environment. The number of false alarms is typically modeled with the Poisson distribution, and the false alarms are assumed to be distributed uniformly in space. PDA behaves incorrectly when there are multiple targets because the false alarm model does not work well;

(iv) track management: because PDA assumes that the track is already established, algorithms must be provided for track initialization and track deletion.

PDA is mainly suited to tracking targets that do not make abrupt changes in their movement patterns; if a target maneuvers abruptly, PDA will most likely lose it.
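To show how equations (3)-(12) fit together, the following Python sketch performs one PDA update for a single target under a linear Gaussian model. It is a simplified reading of the algorithm, not a reference implementation: the final covariance combines (8) and (9) in the standard PDA filter form, and the default values of P_d, P_g, and the clutter density are illustrative.

```python
import numpy as np

def pda_update(x_pred, P_pred, H, R, measurements, Pd=0.9, Pg=0.99, clutter=1e-3):
    """One PDA update. x_pred: predicted state; P_pred: its covariance;
    measurements: (m, M) array of gated observations."""
    z_pred = H @ x_pred                                   # (4) measurement prediction
    S = H @ P_pred @ H.T + R                              # (7) innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)                   # (7) Kalman gain
    v = measurements - z_pred                             # (5) innovations
    M = measurements.shape[1]
    # (12) unnormalized hypothesis weights; i = 0 means "no valid measurement"
    p0 = (2 * np.pi) ** (M / 2) * clutter * np.sqrt(np.linalg.det(S)) * (1 - Pd * Pg) / Pd
    pi = np.exp(-0.5 * np.einsum('ij,jk,ik->i', v, np.linalg.inv(S), v))
    beta = np.append(p0, pi)
    beta /= beta.sum()                                    # (11) association probabilities
    v_comb = (beta[1:, None] * v).sum(axis=0)             # (9) combined innovation
    x_new = x_pred + K @ v_comb                           # (10) state update
    P0 = P_pred - K @ S @ K.T                             # (8) update if origin is known
    spread = (beta[1:, None, None] * np.einsum('ij,ik->ijk', v, v)).sum(axis=0) \
        - np.outer(v_comb, v_comb)                        # (9) spread of innovations
    P_new = beta[0] * P_pred + (1 - beta[0]) * P0 + K @ spread @ K.T
    return x_new, P_new
```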

3.3. Joint Probabilistic Data Association. Joint probabilistic data association (JPDA) is a suboptimal approach for tracking multiple targets in cluttered environments [14]. JPDA is similar to PDA, with the difference that the association probabilities are computed using all of the observations and all of the targets. Thus, in contrast to PDA, JPDA considers various hypotheses together and combines them. JPDA determines the probability \beta_i^t(k) that measurement i originated from target t, accounting for the fact that, under this hypothesis, the measurement cannot be generated by other targets. Therefore, for a known number of targets, it evaluates the different options of the measurement-target association (for the most recent set of measurements) and combines them into the corresponding state estimation. If the association probability is known, then the Kalman filter updating equation of track t can be written as

\hat{x}^t(k | k) = \hat{x}^t(k | k - 1) + K(k) v^t(k)   (13)

where \hat{x}^t(k | k) and \hat{x}^t(k | k - 1) are the estimation and prediction of target t, and K(k) is the filter gain. The weighted sum of the residuals associated with the m(k) observations of target t is as follows:

v^t(k) = \sum_{i=1}^{m(k)} \beta_i^t(k) v_i^t(k)   (14)

where v_i^t = z_i(k) - H \hat{x}^t(k | k - 1). Therefore, this method incorporates all of the observations (inside the neighborhood of the target's predicted position) to update the estimated position by using a posterior probability that is a weighted sum of residuals.

The main restrictions of JPDA are the following

(i) a measurement cannot come from more than one target;

(ii) two measurements cannot be originated by the same target (at one time instant);

(iii) the sum of all of the measurements' probabilities that are assigned to one target must be 1: \sum_{i=0}^{m(k)} \beta_i^t(k) = 1.

The main disadvantages of JPDA are the following

(i) it requires an explicit mechanism for track initialization: similar to PDA, JPDA cannot initialize new tracks or remove tracks that are out of the observation area;

(ii) JPDA is a computationally expensive algorithm when it is applied in environments that have multiple targets because the number of hypotheses increases exponentially with the number of targets.

In general, JPDA is more appropriate than MHT in situations in which the density of false measurements is high (i.e., sonar applications).

3.4. Multiple Hypothesis Test. The underlying idea of the multiple hypothesis test (MHT) is to use more than two consecutive observations to make an association with better results. Other algorithms that use only two consecutive observations have a higher probability of generating an error. In contrast to PDA and JPDA, MHT estimates all of the possible hypotheses and maintains new hypotheses in each iteration.

MHT was developed to track multiple targets in cluttered environments; as a result, it combines the data association problem and tracking into a unified framework, becoming an estimation technique as well. The Bayes rule or Bayesian networks are commonly employed to calculate the MHT hypotheses. In general, researchers have claimed that MHT outperforms JPDA for lower densities of false positives. However, the main disadvantage of MHT is the computational cost when the number of tracks or false positives increases. Pruning the hypothesis tree using a window could address this limitation.

The Reid [15] tracking algorithm is considered the standard MHT algorithm, but the initial integer programming formulation of the problem is due to Morefield [16]. MHT is an iterative algorithm in which each iteration starts with a set of correspondence hypotheses. Each hypothesis is a collection of disjoint tracks, and the prediction of the target in the next time instant is computed for each hypothesis. Next, the predictions are compared with the new observations by using a distance metric. The set of associations established in each hypothesis (based on the distance) introduces new hypotheses in the next iteration. Each new hypothesis represents a new set of tracks that is based on the current observations.

Note that each new measurement could come from (i) a new target in the visual field of view, (ii) a target being tracked, or (iii) noise in the measurement process. It is also possible that a measurement is not assigned to a target, because the target disappears or because it is not possible to obtain a target measurement at that time instant.

MHT maintains several correspondence hypotheses for each target in each frame. If the set of hypotheses at instant k is represented by H(k) = [h_l(k), l = 1, \ldots, n], then the probability of hypothesis h_l(k) can be represented recursively, using the Bayes rule, as follows:

P(h_l(k) | Z(k)) = P(h_g(k - 1), a_i(k) | Z(k)) = \frac{1}{c} P(Z(k) | h_g(k - 1), a_i(k)) \, P(a_i(k) | h_g(k - 1)) \, P(h_g(k - 1))   (15)

where h_g(k - 1) is hypothesis g of the complete set up to time instant k - 1, a_i(k) is the i-th possible association of the track to the object, Z(k) is the set of detections of the current frame, and c is a normalization constant.

The first term on the right side of the previous equation is the likelihood function of the measurement set Z(k) given the joint likelihood and the current hypothesis. The second term is the probability of the association hypothesis of the current data given the previous hypothesis h_g(k - 1). The third term is the probability of the previous hypothesis from which the current hypothesis is calculated.
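As a toy numerical illustration of the recursion in equation (15), the following Python sketch multiplies the three terms for every parent hypothesis and candidate association and normalizes by c; the container layout (parallel lists indexed by parent hypothesis g and association i) is an assumption made for the example.

```python
def update_hypothesis_probs(parent_probs, assoc_priors, likelihoods):
    """Equation (15): posterior(g, i) is proportional to
    P(Z(k) | h_g(k-1), a_i(k)) * P(a_i(k) | h_g(k-1)) * P(h_g(k-1)),
    normalized over all pairs (g, i)."""
    posterior = {}
    for g, p_parent in enumerate(parent_probs):
        for i, (p_assoc, lik) in enumerate(zip(assoc_priors[g], likelihoods[g])):
            posterior[(g, i)] = lik * p_assoc * p_parent
    c = sum(posterior.values())  # the normalization constant c of (15)
    return {h: p / c for h, p in posterior.items()}
```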

The MHT algorithm has the ability to detect a new track while maintaining the hypothesis tree structure. The probability of a true track is given by the Bayes decision model as

P(\lambda | Z) = \frac{P(Z | \lambda) P^{\circ}(\lambda)}{P(Z)}   (16)

where P(Z | \lambda) is the probability of obtaining the set of measurements Z given \lambda, P^{\circ}(\lambda) is the a priori probability of the source signal, and P(Z) is the probability of obtaining the set of detections Z.

MHT considers all of the possibilities, including both the track maintenance and the initialization and removal of tracks, in an integrated framework. MHT calculates the possibility of having an object after the generation of a set of measurements using an exhaustive approach, and the algorithm does not assume a fixed number of targets. The key challenge of MHT is effective hypothesis management. The baseline MHT algorithm can be extended as follows: (i) use hypothesis aggregation for missed targets, births, cardinality tracking, and closely spaced objects; (ii) apply a multistage MHT to improve the performance and robustness in challenging settings; and (iii) use a feature-aided MHT for extended object surveillance.

The main disadvantage of this algorithm is the computational cost, which grows exponentially with the number of tracks and measurements. Therefore, the practical implementation of this algorithm is limited because it is exponential in both time and memory.

With the aim of reducing the computational cost, [17] presented a probabilistic MHT algorithm, known as PMHT, in which the associations are considered to be statistically independent random variables and in which an exhaustive search enumeration is avoided. The PMHT algorithm assumes that the number of targets and measurements is known. With the same goal of reducing the computational cost, [18] presented an efficient implementation of the MHT algorithm. This implementation was the first version to be applied to perform tracking in visual environments. They employed the Murty [19] algorithm to determine the best set of k hypotheses in polynomial time, with the goal of tracking the points of interest.

MHT typically performs the tracking process by employing only one characteristic, commonly the position. The Bayesian combination of multiple characteristics was proposed by Liggins II et al. [20].

A linear-programming-based relaxation approach to the optimization problem in MHT tracking was proposed independently by Coraluppi et al. [21] and Storms and Spieksma [22]. Joo and Chellappa [23] proposed an association algorithm for tracking multiple targets in visual environments. Their algorithm is based on an MHT modification in which a measurement can be associated with more than one target and several targets can be associated with one measurement. They also proposed a combinatorial optimization algorithm to generate the best set of association hypotheses. Their algorithm always finds the best hypothesis, in contrast to other models, which are approximate. Coraluppi and Carthel [24] presented a generalization of the MHT algorithm using a recursion over hypothesis classes rather than over single hypotheses. This work has been applied to a special case of the multitarget tracking problem, called cardinality tracking, in which the number of sensor measurements is observed instead of the target states.

3.5. Distributed Joint Probabilistic Data Association. The distributed version of the joint probabilistic data association (JPDA-D) was presented by Chang et al. [25]. In this technique, the estimated state of the target (using two sensors), after being associated, is given by

E\{x | Z^1, Z^2\} = \sum_{j=0}^{m_1} \sum_{l=0}^{m_2} E\{x | \chi_j^1, \chi_l^2, Z^1, Z^2\} \, P\{\chi_j^1, \chi_l^2 | Z^1, Z^2\}   (17)

where m_i (i = 1, 2) denotes the last set of measurements of sensors 1 and 2, Z^i (i = 1, 2) is the set of accumulated data, and \chi is the association hypothesis. The first term on the right side of the equation is calculated from the associations that were made earlier. The second term is computed from the individual association probabilities as follows:

P(\chi_j^1, \chi_l^2 | Z^1, Z^2) = \sum_{\chi^1} \sum_{\chi^2} P(\chi^1, \chi^2 | Z^1, Z^2) \, \hat{1}_j^1(\chi^1) \, \hat{1}_l^2(\chi^2)

P(\chi^1, \chi^2 | Z^1, Z^2) = \frac{1}{c} P(\chi^1 | Z^1) \, P(\chi^2 | Z^2) \, \gamma(\chi^1, \chi^2)   (18)

where \chi^i are the joint hypotheses involving all of the measurements and all of the targets, and \hat{1}_j^i(\chi^i) are the binary indicators of the measurement-target association. The additional term \gamma(\chi^1, \chi^2) depends on the correlation of the individual hypotheses and reflects the localization influence of the current measurements on the joint hypotheses.

These equations are obtained assuming that communication exists after every observation; there are only approximations for the case in which communication is sporadic and a substantial amount of noise occurs. Therefore, this algorithm is a theoretical model that has some limitations in practical applications.

3.6. Distributed Multiple Hypothesis Test. The distributed version of the MHT algorithm (MHT-D) [26, 27] follows a similar structure to the JPDA-D algorithm. Let us assume the case in which one node must fuse two sets of hypotheses and tracks. If the hypothesis and track sets are represented by H^i(Z^i) and T^i(Z^i), with i = 1, 2, the hypotheses by \lambda_j^i, the hypothesis probabilities by P(\lambda_j^i), and the state distributions of the tracks \tau_j^i by p(x | Z^i, \tau_j^i), then the maximum available information at the fusion node is Z = Z^1 \cup Z^2. The data fusion objective of the MHT-D is to obtain the set of hypotheses H(Z), the set of tracks T(Z), the hypothesis probabilities P(\lambda | Z), and the state distributions p(x | Z, \tau) for the observed data.

The MHT-D algorithm is composed of the following steps:

(1) hypothesis formation: for each hypothesis pair \lambda_j^1 and \lambda_k^2 that could be fused, a track \tau is formed by associating the pair of tracks \tau_j^1 and \tau_k^2, where each track comes from one node and could originate from the same target. The final result of this stage is a set of hypotheses, denoted by H(Z), and the fused tracks T(Z);

(2) hypothesis evaluation: in this stage, the association probability of each hypothesis and the estimated state of each fused track are obtained. The distributed estimation algorithm is employed to calculate the likelihood of the possible associations and the obtained estimations for each specific association.


Using the information model, the probability of each fused hypothesis is given by

P(\lambda | Z) = C^{-1} \prod_{j \in J} P(\lambda^{(j)} | Z^{(j)})^{\alpha(j)} \prod_{\tau \in \lambda} L(\tau | Z)   (19)

where C is a normalizing constant and L(\tau | Z) is the likelihood of each hypothesis pair.

The main disadvantage of the MHT-D is its high computational cost, which is on the order of O(n^M), where n is the number of possible associations and M is the number of variables to be estimated.

3.7. Graphical Models. Graphical models are a formalism for representing and reasoning with probabilities and independence. A graphical model represents a conditional decomposition of the joint probability. A graphical model can be represented as a graph in which the nodes denote random variables, the edges denote the possible dependence between the random variables, and the plates denote the replication of a substructure with the appropriate indexing of the relevant variables. The graph captures the joint distribution over the random variables, which can be decomposed into a product of factors that each depend on only a subset of the variables. There are two major classes of graphical models: (i) Bayesian networks [28], which are also known as directed graphical models, and (ii) Markov random fields, which are also known as undirected graphical models. The directed graphical models are useful for expressing causal relationships between random variables, whereas the undirected models are better suited for expressing soft constraints between random variables. We refer the reader to the book of Koller and Friedman [29] for more information on graphical models.

A framework based on graphical models can solve the problem of distributed data association in synchronized sensor networks with overlapped areas in which each sensor receives noisy measurements; this solution was proposed by Chen et al. [30, 31]. Their work is based on graphical models that are used to represent the statistical dependence between random variables. The data association problem is treated as an inference problem and solved by using the max-product algorithm [32]. Graphical models represent statistical dependencies between variables as graphs, and the max-product algorithm converges when the graph is a tree structure. Moreover, the employed algorithm could be implemented in a distributed manner by exchanging messages between the source nodes in parallel. With this algorithm, if each sensor has n possible combinations of associations and there are M variables to be estimated, the complexity is O(n^2 M), which is reasonable and less than the O(n^M) complexity of the MHT-D algorithm. However, special attention must be given to the correlated variables when building the graphical model.

4. State Estimation Methods

State estimation techniques aim to determine the state of the target under movement (typically the position) given the observations or measurements. State estimation techniques are also known as tracking techniques. In their general form, it is not guaranteed that the target observations are relevant, which means that some of the observations could actually come from the target and others could be only noise. The state estimation phase is a common stage in data fusion algorithms because the target's observations could come from different sensors or sources, and the final goal is to obtain a global target state from the observations.

The estimation problem involves finding the values of the vector state (e.g., position, velocity, and size) that fit as much as possible with the observed data. From a mathematical perspective, we have a set of redundant observations, and the goal is to find the set of parameters that provides the best fit to the observed data. In general, these observations are corrupted by errors and by the propagation of noise in the measurement process. State estimation methods fall under level 1 of the JDL classification and can be divided into two broader groups:

(1) linear dynamics and measurements: here, the estimation problem has a standard solution. Specifically, when the equations of the object state and the measurements are linear, the noise follows the Gaussian distribution, and we do not refer to a clutter environment, the optimal theoretical solution is based on the Kalman filter;

(2) nonlinear dynamics: the state estimation problem becomes difficult, and there is no analytical solution to solve the problem in a general manner. In principle, there are no practical algorithms available to solve this problem satisfactorily.

Most of the state estimation methods are based on control theory and employ the laws of probability to compute a vector state from a vector measurement or a stream of vector measurements. Next, the most common estimation methods are presented, including maximum likelihood and maximum posterior (Section 4.1), the Kalman filter (Section 4.2), the particle filter (Section 4.3), the distributed Kalman filter (Section 4.4), the distributed particle filter (Section 4.5), and covariance consistency methods (Section 4.6).

4.1. Maximum Likelihood and Maximum Posterior. The maximum likelihood (ML) technique is an estimation method based on probabilistic theory. Probabilistic estimation methods are appropriate when the state variable follows an unknown probability distribution [33]. In the context of data fusion, x is the state being estimated, and z = (z(1), \ldots, z(k)) is a sequence of k previous observations of x. The likelihood function \lambda(x) is defined as the probability density function of the sequence of z observations given the true value of the state x:

\lambda(x) = p(z | x)   (20)

The ML estimator finds the value of x that maximizes the likelihood function:

\hat{x}(k) = \arg\max_x p(z | x)   (21)


which can be obtained from the analytical or empirical models of the sensors. This function expresses the probability of the observed data. The main disadvantage of this method in practice is that it requires the analytical or empirical model of the sensor to be known in order to provide the prior distribution and compute the likelihood function. This method can also systematically underestimate the variance of the distribution, which leads to a bias problem. However, the bias of the ML solution becomes less significant as the number N of data points increases, and the estimated variance equals the true variance of the distribution that generated the data in the limit N \to \infty.

The maximum posterior (MAP) method is based on Bayesian theory. It is employed when the parameter x to be estimated is the output of a random variable that has a known probability density function p(x). In the context of data fusion, x is the state being estimated, and z = (z(1), \ldots, z(k)) is a sequence of k previous observations of x. The MAP estimator finds the value of x that maximizes the posterior probability distribution:

\hat{x}(k) = \arg\max_x p(x | z)   (22)

Both methods (ML and MAP) aim to find the most likely value of the state x. However, ML assumes that x is a fixed but unknown point of the parameter space, whereas MAP considers x to be the output of a random variable with a known a priori probability density function. Both methods are equivalent when there is no a priori information about x, that is, when there are only observations.
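As a deliberately simple illustration of equations (21) and (22), assume scalar observations z(i) ~ N(x, sigma^2); the ML estimate is then the sample mean, and under a Gaussian prior x ~ N(mu0, tau^2) (an assumption introduced for this example), the MAP estimate is the precision-weighted combination of the prior and the data:

```python
import numpy as np

def ml_estimate(z):
    """argmax_x p(z | x) for z(i) ~ N(x, sigma^2): the sample mean."""
    return np.mean(z)

def map_estimate(z, sigma2, mu0, tau2):
    """argmax_x p(x | z) with prior x ~ N(mu0, tau2): blend the prior and
    the data by their precisions (inverse variances)."""
    precision = len(z) / sigma2 + 1.0 / tau2
    return (np.sum(z) / sigma2 + mu0 / tau2) / precision
```

With no prior information (tau2 tending to infinity), the MAP estimate reduces to the ML estimate, matching the equivalence noted above.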

4.2. The Kalman Filter. The Kalman filter is the most popular estimation technique. It was originally proposed by Kalman [34] and has been widely studied and applied since then. The Kalman filter estimates the state x of a discrete-time process governed by the following space-time model:

x(k + 1) = \Phi(k) x(k) + G(k) u(k) + w(k)   (23)

with the observations or measurements z at time k of the state x represented by

z(k) = H(k) x(k) + v(k)   (24)

where \Phi(k) is the state transition matrix, G(k) is the input transition matrix, u(k) is the input vector, H(k) is the measurement matrix, and w and v are random Gaussian variables with zero mean and covariance matrices Q(k) and R(k), respectively. Based on the measurements and on the system parameters, the estimation of x(k), which is represented by \hat{x}(k), and the prediction of x(k + 1), which is represented by \hat{x}(k + 1 | k), are given by

\hat{x}(k) = \hat{x}(k | k - 1) + K(k) [z(k) - H(k) \hat{x}(k | k - 1)]

\hat{x}(k + 1 | k) = \Phi(k) \hat{x}(k | k) + G(k) u(k)   (25)

respectively, where K is the filter gain, determined by

K(k) = P(k | k - 1) H^T(k) [H(k) P(k | k - 1) H^T(k) + R(k)]^{-1}   (26)

where P(k | k - 1) is the prediction covariance matrix, which can be determined by

P(k + 1 | k) = \Phi(k) P(k) \Phi^T(k) + Q(k)   (27)

with

P(k) = P(k | k - 1) - K(k) H(k) P(k | k - 1)   (28)
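Equations (23)-(28) translate directly into a predict-update cycle; the following Python sketch (assuming NumPy, with an illustrative function name and argument layout) performs one step of the filter:

```python
import numpy as np

def kalman_step(x, P, z, Phi, G, u, H, Q, R):
    """One Kalman predict-update cycle using the notation of (23)-(28)."""
    # Prediction of the state (25) and of the covariance (27)
    x_pred = Phi @ x + G @ u
    P_pred = Phi @ P @ Phi.T + Q
    # Filter gain (26)
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)
    # State update (25) and covariance update (28)
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = P_pred - K @ H @ P_pred
    return x_new, P_new
```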

The Kalman filter is mainly employed to fuse low-level data. If the system can be described with a linear model and the error can be modeled as Gaussian noise, then the recursive Kalman filter obtains optimal statistical estimations [35]. However, other methods are required to address nonlinear dynamic models and nonlinear measurements. The modified Kalman filter known as the extended Kalman filter (EKF) is an optimal approach for implementing nonlinear recursive filters [36]. The EKF is one of the most often employed methods for fusing data in robotic applications. However, it has some disadvantages because the computation of the Jacobians is extremely expensive. Some attempts have been made to reduce the computational cost, such as linearization, but these attempts introduce errors in the filter and make it unstable.

The unscented Kalman filter (UKF) [37] has gained popularity because it does not have the linearization step and the associated errors of the EKF [38]. The UKF employs a deterministic sampling strategy to establish the minimum set of points around the mean. This set of points captures the true mean and covariance completely. Then, these points are propagated through the nonlinear functions, and the covariance of the estimations can be recovered. Another advantage of the UKF is its ability to be employed in parallel implementations.

4.3. Particle Filter. Particle filters are recursive implementations of the sequential Monte Carlo methods [39]. This method builds the posterior density function using several random samples called particles. Particles are propagated over time with a combination of sampling and resampling steps. At each iteration, the resampling step is employed to discard some particles, increasing the relevance of regions with a higher posterior probability. In the filtering process, several particles of the same state variable are employed, and each particle has an associated weight that indicates the quality of the particle. Therefore, the estimation is the result of a weighted sum of all of the particles. The standard particle filter algorithm has two phases: (1) the prediction phase and (2) the updating phase. In the prediction phase, each particle is modified according to the existing model, including the addition of random noise to simulate the noise effect. Then, in the updating phase, the weight of each particle is reevaluated using the latest available sensor observation, and particles with lower weights are removed. Specifically, a generic particle filter comprises the following steps (a condensed sketch follows the list).


(1) Initialization of the particles:

(i) let N be the number of particles;

(ii) X^{(i)}(1) = [x(1), y(1), 0, 0]^T for i = 1, \ldots, N.

(2) Prediction step:

(i) for each particle i = 1, \ldots, N, evaluate the state (k + 1 | k) of the system using the state at time instant k and the noise of the system at time k:

X^{(i)}(k + 1 | k) = F(k) X^{(i)}(k) + (cauchy-distribution-noise)(k)   (29)

where F(k) is the transition matrix of the system.

(3) Evaluate the particle weights. For each particle i = 1, \ldots, N:

(i) compute the predicted observation of the system using the current predicted state and the noise at instant k:

\hat{z}^{(i)}(k + 1 | k) = H(k + 1) X^{(i)}(k + 1 | k) + (gaussian-measurement-noise)(k + 1)   (30)

(ii) compute the likelihood (weights) according to the given distribution:

likelihood^{(i)} = N(\hat{z}^{(i)}(k + 1 | k); z(k + 1), var)   (31)

(iii) normalize the weights as follows:

w^{(i)} = \frac{likelihood^{(i)}}{\sum_{j=1}^{N} likelihood^{(j)}}   (32)

(4) Resampling/selection: multiply particles with higher weights and remove those with lower weights. The current state must be adjusted using the computed weights of the new particles:

(i) compute the cumulative weights:

Cum\_Wt^{(i)} = \sum_{j=1}^{i} w^{(j)}   (33)

(ii) generate uniformly distributed random variables U^{(i)} ~ W(0, 1), with the number of draws equal to the number of particles;

(iii) determine which particles should be multiplied and which ones removed.

(5) Propagation phase:

(i) incorporate the new values of the state after the resampling at instant k to calculate the value at instant k + 1:

x^{(1 \ldots N)}(k + 1 | k + 1) = x(k + 1 | k)   (34)

(ii) compute the posterior mean:

\hat{x}(k + 1) = \text{mean}[x^{(i)}(k + 1 | k + 1)], i = 1, \ldots, N   (35)

(iii) repeat steps (2) to (5) for each time instant.
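The following Python sketch condenses steps (2)-(5) into a single iteration, assuming NumPy, a linear transition matrix F, a linear measurement matrix H, and scalar Gaussian measurement noise; these simplifications, and resampling via a weighted random choice (equivalent to the cumulative-weight scheme of step (4)), are illustrative assumptions.

```python
import numpy as np

def particle_filter_step(particles, z, F, H, sigma_meas, rng):
    """One iteration of the generic particle filter. particles: (N, d)."""
    N = particles.shape[0]
    # (2) prediction: propagate through the model plus Cauchy process noise
    particles = particles @ F.T + rng.standard_cauchy(particles.shape)
    # (3) weights: Gaussian likelihood of the measurement given each particle
    z_pred = particles @ H.T
    w = np.exp(-0.5 * np.sum((z_pred - z) ** 2, axis=1) / sigma_meas ** 2)
    w /= w.sum()
    # (4) resampling: multiply high-weight particles, drop low-weight ones
    particles = particles[rng.choice(N, size=N, p=w)]
    # (5) propagation: the posterior mean is the state estimate
    return particles, particles.mean(axis=0)
```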

Particle filters are more flexible than the Kalman filters and can cope with nonlinear dependencies and non-Gaussian densities in the dynamic model and in the noise error. However, they have some disadvantages. A large number of particles are required to obtain a small variance in the estimator. It is also difficult to establish the optimal number of particles in advance, and the number of particles affects the computational cost significantly. Earlier versions of particle filters employed a fixed number of particles, but recent studies have started to use a dynamic number of particles [40].

4.4. The Distributed Kalman Filter. The distributed Kalman filter requires a correct clock synchronization between the sources, as demonstrated in [41]. In other words, to correctly use the distributed Kalman filter, the clocks of all of the sources must be synchronized. This synchronization is typically achieved by using protocols that employ a shared global clock, such as the network time protocol (NTP). Synchronization problems between clocks have been shown to have an effect on the accuracy of the Kalman filter, producing inaccurate estimations [42].

If the estimations are consistent and the cross-covariance is known (or the estimations are uncorrelated), then it is possible to use the distributed Kalman filter [43]. However, the cross-covariance must be determined exactly, or the observations must be consistent.

We refer the reader to Liggins II et al. [20] for more details about the Kalman filter in distributed and hierarchical architectures.

4.5. Distributed Particle Filter. Distributed particle filters have gained attention recently [44-46]. Coates [45] used a distributed particle filter to monitor an environment that could be captured by a Markovian state-space model involving nonlinear dynamics and observations and non-Gaussian noise.

In contrast, earlier attempts to handle out-of-sequence measurements with particle filters are based on regenerating the probability density function at the time instant of the out-of-sequence measurement [47]. In a particle filter, this step requires a large computational cost, in addition to the space necessary to store the previous particles. To avoid this problem, Orton and Marrs [48] proposed storing the information on the particles at each time instant, saving the cost of recalculating this information. This technique is close to optimal, and when the delay increases, the result is only slightly affected [49]. However, it requires a very large amount of space to store the state of the particles at each time instant.

4.6. Covariance Consistency Methods: Covariance Intersection/Union. Covariance consistency methods (intersection and union) were proposed by Uhlmann [43] and are general and fault-tolerant frameworks for maintaining covariance means and estimations in a distributed network. These methods are not estimation techniques per se; rather, they act as an estimation fusion technique. The distributed Kalman filter requirement of independent measurements or known cross-covariances is not a constraint with these methods.

4.6.1. Covariance Intersection. If the Kalman filter is employed to combine two estimations (a_1, A_1) and (a_2, A_2), then it is assumed that the joint covariance has the following form:

\begin{bmatrix} A_1 & X \\ X^T & A_2 \end{bmatrix}   (36)

where the cross-covariance X should be known exactly so that the Kalman filter can be applied without difficulty. Because the computation of the cross-covariances is computationally intensive, Uhlmann [43] proposed the covariance intersection (CI) algorithm.

Let us assume that a joint covariance M can be defined with the diagonal blocks M_{A_1} > A_1 and M_{A_2} > A_2 such that

M \ge \begin{bmatrix} A_1 & X \\ X^T & A_2 \end{bmatrix}   (37)

for every possible instance of the unknown cross-covariance X; then, the components of the matrix M can be employed in the Kalman filter equations to provide a fused estimation (c, C) that is considered consistent. The key point of this method relies on generating a joint covariance matrix M that can represent a useful fused estimation (in this context, useful refers to an estimation with a lower associated uncertainty). In summary, the CI algorithm computes the joint covariance matrix M for which the Kalman filter provides the best fused estimation (c, C) with respect to a fixed measure of the covariance matrix (i.e., the minimum determinant).

Specific covariance criteria must be established because there is no unique minimum joint covariance in the order of the positive semidefinite matrices. Moreover, although the joint covariance is the basis of the formal analysis of the CI algorithm, the actual result is a nonlinear mixture of the information stored in the estimations being fused, following the equation below:

C = \left( w_1 H_1^T A_1^{-1} H_1 + w_2 H_2^T A_2^{-1} H_2 + \cdots + w_n H_n^T A_n^{-1} H_n \right)^{-1}

c = C \left( w_1 H_1^T A_1^{-1} a_1 + w_2 H_2^T A_2^{-1} a_2 + \cdots + w_n H_n^T A_n^{-1} a_n \right)   (38)

where H_i is the transformation from the fused state space to the space of estimate i. The values of w can be calculated to minimize the covariance determinant by using convex optimization packages and semidefinite programming. The result of the CI algorithm has different characteristics compared to the Kalman filter. For example, suppose that two estimations (a, A) and (b, B) are provided and that their covariances are equal, A = B. Because the Kalman filter is based on the statistical independence assumption, it produces a fused estimation with covariance C = (1/2)A. In contrast, the CI method does not assume independence and thus must remain consistent even in the case in which the estimations are completely correlated, giving the estimated fused covariance C = A. In the case of estimations in which A < B, the CI algorithm does not use any information from the estimation (b, B); thus, the fused result is (a, A).
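For two estimations with H_1 = H_2 = I and weights w and 1 - w, equation (38) reduces to the form implemented in the following Python sketch (assuming NumPy and SciPy; selecting w by minimizing the determinant of the fused covariance follows the criterion discussed above):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def covariance_intersection(a, A, b, B):
    """Fuse (a, A) and (b, B) without knowing their cross-covariance."""
    Ai, Bi = np.linalg.inv(A), np.linalg.inv(B)

    def fuse(w):
        C = np.linalg.inv(w * Ai + (1 - w) * Bi)   # fused covariance, eq. (38)
        c = C @ (w * Ai @ a + (1 - w) * Bi @ b)    # fused mean, eq. (38)
        return c, C

    # choose the weight that minimizes the determinant of C
    res = minimize_scalar(lambda w: np.linalg.det(fuse(w)[1]),
                          bounds=(0.0, 1.0), method='bounded')
    return fuse(res.x)
```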

Every consistent joint covariance is sufficient to produce a fused estimation that guarantees consistency. However, it is also necessary to guarantee a lack of divergence. Divergence is avoided in the CI algorithm by choosing a specific measure (i.e., the determinant), which is minimized in each fusion operation. This measure represents a nondivergence criterion because the size of the estimated covariance, according to this criterion, will not be incremented.

The application of the CI method guarantees consistency and nondivergence for every sequence of mean- and covariance-consistent estimations. However, this method does not work well when the measurements to be fused are inconsistent.

4.6.2. Covariance Union. CI solves the problem of correlated inputs but not the problem of inconsistent inputs (inconsistent inputs refer to different estimations, each of which has a high accuracy (small variance) but also a large difference from the states of the others); thus, the covariance union (CU) algorithm was proposed to solve the latter [43]. CU addresses the following problem: two estimations (a_1, A_1) and (a_2, A_2) relate to the state of an object and are mutually inconsistent with one another. This issue arises when the difference between the average estimations is larger than the provided covariance. Inconsistent inputs can be detected using the Mahalanobis distance [50] between them, which is defined as

M_d = (a_1 - a_2)^T (A_1 + A_2)^{-1} (a_1 - a_2)   (39)

and detecting whether this distance is larger than a given threshold.

The Mahalanobis distance accounts for the covariance information when computing the distance. If the difference between the estimations is high but their covariances are also high, the Mahalanobis distance yields a small value. In contrast, if the difference between the estimations is small but the covariances are also small, it could produce a larger distance value. A high Mahalanobis distance could indicate that the estimations are inconsistent; however, a specific threshold must be established by the user or learned automatically.
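A minimal sketch of this inconsistency test based on equation (39); the threshold value is an illustrative user-chosen parameter:

```python
import numpy as np

def inconsistent(a1, A1, a2, A2, threshold=3.0):
    """Flag two estimates as mutually inconsistent when their (squared)
    Mahalanobis distance (39) exceeds a user-defined threshold."""
    d = a1 - a2
    md = d @ np.linalg.inv(A1 + A2) @ d
    return md > threshold
```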

The CU algorithm aims to solve the following problem: let us suppose that a filtering algorithm provides two observations with means and covariances (a_1, A_1) and (a_2, A_2), respectively. It is known that one of the observations is correct and the other is erroneous. However, the identity of the correct estimation is unknown and cannot be determined. In this situation, if both estimations are employed as input to the Kalman filter, there will be a problem because the Kalman filter only guarantees a consistent output if the observation is updated with a measurement that is consistent with both of them. In the specific case in which the measurements correspond to the same object but are acquired from two different sensors, the Kalman filter can only guarantee that the output is consistent if it is consistent with both measurements separately. Because it is not possible to know which estimation is correct, the only way to combine the two estimations rigorously is to provide an estimation (u, U) that is consistent with both estimations and obeys the following properties:

U \ge A_1 + (u - a_1)(u - a_1)^T

U \ge A_2 + (u - a_2)(u - a_2)^T   (40)

where some measure of the matrix size U (i.e., the determinant) is minimized.

In other words, the previous equations indicate that if the estimation (a_1, A_1) is consistent, then translating the vector a_1 to u requires increasing the covariance by the addition of a matrix at least as large as the outer product of (u - a_1) in order to remain consistent. The same situation applies to the measurement (a_2, A_2) in order for it to be consistent.

A simple strategy is to choose as the fused mean the value of one of the measurements (u = a_1). In this case, the value of U must be chosen such that the estimation is consistent in the worst case (i.e., when the correct measurement is a_2). However, it is possible to assign u an intermediate value between a_1 and a_2 to decrease the value of U. Therefore, the CU algorithm establishes the fused mean value u that has the least covariance U that is still sufficiently large, with respect to the two measurements a_1 and a_2, to guarantee consistency.

Because the matrix inequalities presented in the previous equations are convex, convex optimization algorithms must be employed to solve them. The value of U can be computed with the iterative method described by Julier et al. [51]. The obtained covariance could be significantly larger than any of the initial covariances, which is an indicator of the uncertainty existing between the initial estimations. One of the advantages of the CU method arises from the fact that the same process can easily be extended to N inputs.

5. Decision Fusion Methods

In the data fusion domain, a decision is typically taken based on the knowledge of the perceived situation, which is provided by many sources. Decision fusion techniques aim to make a high-level inference about the events and activities that are produced from the detected targets. These techniques often use symbolic information, and the fusion process requires reasoning while accounting for uncertainties and constraints. These methods fall under level 2 (situation assessment) and level 3 (impact assessment) of the JDL data fusion model.

51 The Bayesian Methods Information fusion based on theBayesian inference provides a formalism for combining evi-dence according to the probability theory rules Uncertaintyis represented using the conditional probability terms thatdescribe beliefs and take on values in the interval [0 1] wherezero indicates a complete lack of belief and one indicates anabsolute belief The Bayesian inference is based on the Bayesrule as follows

P(Y | X) = P(X | Y) P(Y) / P(X), (41)

where the posterior probability P(Y | X) represents the belief in the hypothesis Y given the information X. This probability is obtained by multiplying the a priori probability of the hypothesis, P(Y), by the probability of observing X given that Y is true, P(X | Y). The value P(X) is used as a normalizing constant. The main disadvantage of Bayesian inference is that the probabilities P(X) and P(X | Y) must be known. To estimate the conditional probabilities, Pan et al. [52] proposed the use of NNs, whereas Coue et al. [53] proposed Bayesian programming.
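As a minimal numerical sketch of (41), the following Python snippet fuses the reports of two sensors over three mutually exclusive hypotheses, assuming the sensor likelihoods are conditionally independent given Y; the prior and likelihood values are illustrative only.

```python
import numpy as np

# Three mutually exclusive hypotheses for the target class.
prior = np.array([0.5, 0.3, 0.2])        # P(Y)

# Likelihoods P(X_s | Y) reported by two sensors, assumed
# conditionally independent given Y.
lik_sensor1 = np.array([0.7, 0.2, 0.1])
lik_sensor2 = np.array([0.6, 0.3, 0.1])

posterior = prior * lik_sensor1 * lik_sensor2
posterior /= posterior.sum()             # P(X) acts as the normalizer
print(posterior)                         # fused belief P(Y | X_1, X_2)
```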

Hall and Llinas [54] described the following problems associated with Bayesian inference:

(i) difficulty in establishing the value of a priori probabilities;

(ii) complexity when there are multiple potential hypotheses and a substantial number of condition-dependent events;

(iii) the requirement that the hypotheses be mutually exclusive;

(iv) difficulty in describing the uncertainty of the decisions.

5.2. The Dempster-Shafer Inference. The Dempster-Shafer inference is based on the mathematical theory introduced by Dempster [55] and Shafer [56], which generalizes Bayesian theory. The Dempster-Shafer theory provides a formalism that can be used to represent incomplete knowledge, update beliefs, and combine evidence, and it allows the uncertainty to be represented explicitly [57].

A fundamental concept in Dempster-Shafer reasoning is the frame of discernment, which is defined as follows. Let Θ = {θ_1, θ_2, …, θ_N} be the set of all possible states that define the system, and let Θ be exhaustive and mutually exclusive, since the system can only be in one state θ_i ∈ Θ, where 1 ≤ i ≤ N. The set Θ is called a frame of discernment because its elements are employed to discern the current state of the system.

The elements of the set 2^Θ are called hypotheses. In the Dempster-Shafer theory, based on the evidence E, a probability is assigned to each hypothesis H ∈ 2^Θ according to the basic probability assignment, or mass function, m: 2^Θ → [0, 1], which satisfies

m(∅) = 0. (42)

Thus, the mass function of the empty set is zero. Furthermore, the mass function of a hypothesis is larger than or equal to zero for all of the hypotheses. Consider

m(H) ≥ 0, ∀H ∈ 2^Θ. (43)

The sum of the mass functions of all the hypotheses is one. Consider

∑_{H ∈ 2^Θ} m(H) = 1. (44)

To express incomplete beliefs in a hypothesis H, the Dempster-Shafer theory defines the belief function bel: 2^Θ → [0, 1] over Θ as

bel(H) = ∑_{A ⊆ H} m(A), (45)

where bel(∅) = 0 and bel(Θ) = 1. The level of doubt in H can be expressed in terms of the belief function by

dou(H) = bel(¬H) = ∑_{A ⊆ ¬H} m(A). (46)

To express the plausibility of each hypothesis, the function pl: 2^Θ → [0, 1] over Θ is defined as

pl(H) = 1 − dou(H) = ∑_{A ∩ H ≠ ∅} m(A). (47)

Intuitively, plausibility indicates that there is less uncertainty in hypothesis H if it is more plausible. The confidence interval [bel(H), pl(H)] defines the true belief in hypothesis H. To combine the effects of two mass functions m_1 and m_2, the Dempster-Shafer theory defines the rule m_1 ⊕ m_2 as

(m_1 ⊕ m_2)(∅) = 0,

(m_1 ⊕ m_2)(H) = ∑_{X ∩ Y = H} m_1(X) m_2(Y) / (1 − ∑_{X ∩ Y = ∅} m_1(X) m_2(Y)). (48)

In contrast to Bayesian inference, a priori probabilities are not required in the Dempster-Shafer inference because they are assigned at the instant that the information is provided. Several studies in the literature have compared the use of the Bayesian inference and the Dempster-Shafer inference, such as [58–60]. Wu et al. [61] used the Dempster-Shafer theory to fuse information in context-aware environments. This work was extended in [62] to dynamically modify the weights associated with the sensor measurements; the fusion mechanism is therefore calibrated according to the recent measurements of the sensors (in cases in which the ground truth is available). In the military domain [63], Dempster-Shafer reasoning is used with a priori information stored in a database to classify military ships. Morbee et al. [64] described the use of the Dempster-Shafer theory to build 2D occupancy maps from several cameras and to evaluate the contribution of subsets of cameras to a specific task. Each task is the observation of an event of interest, and the goal is to assess the validity of a set of hypotheses that are fused using the Dempster-Shafer theory.

5.3. Abductive Reasoning. Abductive reasoning, or inferring the best explanation, is a reasoning method in which a hypothesis is chosen under the assumption that, in case it is true, it explains the observed event most accurately [65]. In other words, when an event is observed, the abduction method attempts to find the best explanation.

In the context of probabilistic reasoning, abductive inference finds the posterior maximum likelihood of the system variables given some observed variables. Abductive reasoning is more a reasoning pattern than a data fusion technique; therefore, different inference methods, such as NNs [66] or fuzzy logic [67], can be employed.

5.4. Semantic Methods. Decision fusion techniques that employ semantic data from different sources as input could provide more accurate results than those that rely on only a single source. There is a growing interest in techniques that automatically determine the presence of semantic features in videos to bridge the semantic gap [68].

Semantic information fusion is essentially a scheme in which raw sensor data are processed such that the nodes exchange only the resultant semantic information. Semantic information fusion typically covers two phases: (i) building the knowledge and (ii) pattern matching (inference). The first phase (typically offline) incorporates the most appropriate knowledge into semantic information. The second phase (typically online or in real time) then fuses the relevant attributes and provides a semantic interpretation of the sensor data [69–71].

Semantic fusion can be viewed as a way of integrating and translating sensor data into formal languages. The resulting language obtained from the observations of the environment is then compared with similar languages that are stored in the database. The key to this strategy is that similar behaviors represented by formal languages are also semantically similar. This type of method reduces transmission costs because the nodes need only transmit the formal language structure instead of the raw data. However, a known set of behaviors must be stored in a database in advance, which might be difficult in some scenarios.

6. Conclusions

This paper reviews the most popular methods and techniques for performing data/information fusion. To determine whether the application of data/information fusion methods is feasible, we must evaluate the computational cost of the process and the delay introduced in the communication. A centralized data fusion approach is theoretically optimal when there is no transmission cost and there are sufficient computational resources; however, this situation typically does not hold in practical applications.

The selection of the most appropriate technique depends on the type of problem and the established assumptions of each technique. Statistical data fusion methods (e.g., PDA, JPDA, MHT, and Kalman filtering) are optimal only under specific conditions [72]. First, the assumption that the targets are moving independently and that the measurements are normally distributed around the predicted position typically does not hold. Second, because the statistical techniques model all of the events as probabilities, they typically have several parameters and a priori probabilities for false measurements and detection errors that are often difficult to obtain (at least in an optimal sense). For example, in the case of the MHT algorithm, specific parameters must be established that are nontrivial to determine and are very sensitive [73]. Moreover, statistical methods that optimize over several frames are computationally intensive, and their complexity typically grows exponentially with the number of targets. For example, in the case of particle filters, several targets can be tracked either jointly as a group or individually. If several targets are tracked jointly, the necessary number of particles grows exponentially. Therefore, in practice, it is better to track them individually, under the assumption that the targets do not interact between the particles.

In contrast to centralized systems, distributed data fusion methods introduce some challenges into the data fusion process, such as (i) spatial and temporal alignment of the information, (ii) out-of-sequence measurements, and (iii) data correlation, as reported by Castanedo et al. [74, 75]. The inherent redundancy of distributed systems could be exploited with distributed reasoning techniques and cooperative algorithms to improve the individual node estimations, as reported by Castanedo et al. [76]. In addition to the previous studies, a new trend based on the geometric notion of a low-dimensional manifold is gaining attention in the data fusion community. An example is the work of Davenport et al. [77], which proposes a simple model that captures the correlation between the sensor observations by matching the parameter values for the different obtained manifolds.

Acknowledgments

The author would like to thank Jesús García, Miguel A. Patricio, and James Llinas for their interesting discussions on several of the topics presented in this paper.

References

[1] JDL, Data Fusion Lexicon, Technical Panel for C3, F. E. White, San Diego, Calif, USA, Code 420, 1991.
[2] D. L. Hall and J. Llinas, "An introduction to multisensor data fusion," Proceedings of the IEEE, vol. 85, no. 1, pp. 6–23, 1997.
[3] H. F. Durrant-Whyte, "Sensor models and multisensor integration," International Journal of Robotics Research, vol. 7, no. 6, pp. 97–113, 1988.
[4] B. V. Dasarathy, "Sensor fusion potential exploitation-innovative architectures and illustrative applications," Proceedings of the IEEE, vol. 85, no. 1, pp. 24–38, 1997.
[5] R. C. Luo, C.-C. Yih, and K. L. Su, "Multisensor fusion and integration: approaches, applications, and future research directions," IEEE Sensors Journal, vol. 2, no. 2, pp. 107–119, 2002.
[6] J. Llinas, C. Bowman, G. Rogova, A. Steinberg, E. Waltz, and F. White, "Revisiting the JDL data fusion model II," Technical Report, DTIC Document, 2004.
[7] E. P. Blasch and S. Plano, "JDL level 5 fusion model 'user refinement' issues and applications in group tracking," in Proceedings of Signal Processing, Sensor Fusion, and Target Recognition XI, pp. 270–279, April 2002.
[8] H. F. Durrant-Whyte and M. Stevens, "Data fusion in decentralized sensing networks," in Proceedings of the 4th International Conference on Information Fusion, pp. 302–307, Montreal, Canada, 2001.
[9] J. Manyika and H. Durrant-Whyte, Data Fusion and Sensor Management: A Decentralized Information-Theoretic Approach, Prentice Hall, Upper Saddle River, NJ, USA, 1995.
[10] S. S. Blackman, "Association and fusion of multiple sensor data," in Multitarget-Multisensor Tracking: Advanced Applications, pp. 187–217, Artech House, 1990.
[11] S. Lloyd, "Least squares quantization in PCM," IEEE Transactions on Information Theory, vol. 28, no. 2, pp. 129–137, 1982.
[12] M. Shindler, A. Wong, and A. Meyerson, "Fast and accurate k-means for large datasets," in Proceedings of the 25th Annual Conference on Neural Information Processing Systems (NIPS '11), pp. 2375–2383, December 2011.
[13] Y. Bar-Shalom and E. Tse, "Tracking in a cluttered environment with probabilistic data association," Automatica, vol. 11, no. 5, pp. 451–460, 1975.
[14] T. E. Fortmann, Y. Bar-Shalom, and M. Scheffe, "Multi-target tracking using joint probabilistic data association," in Proceedings of the 19th IEEE Conference on Decision and Control including the Symposium on Adaptive Processes, vol. 19, pp. 807–812, December 1980.
[15] D. B. Reid, "An algorithm for tracking multiple targets," IEEE Transactions on Automatic Control, vol. 24, no. 6, pp. 843–854, 1979.
[16] C. L. Morefield, "Application of 0-1 integer programming to multitarget tracking problems," IEEE Transactions on Automatic Control, vol. 22, no. 3, pp. 302–312, 1977.
[17] R. L. Streit and T. E. Luginbuhl, "Maximum likelihood method for probabilistic multihypothesis tracking," in Signal and Data Processing of Small Targets, vol. 2235 of Proceedings of SPIE, p. 394, 1994.
[18] I. J. Cox and S. L. Hingorani, "Efficient implementation of Reid's multiple hypothesis tracking algorithm and its evaluation for the purpose of visual tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 2, pp. 138–150, 1996.
[19] K. G. Murty, "An algorithm for ranking all the assignments in order of increasing cost," Operations Research, vol. 16, no. 3, pp. 682–687, 1968.
[20] M. E. Liggins II, C.-Y. Chong, I. Kadar et al., "Distributed fusion architectures and algorithms for target tracking," Proceedings of the IEEE, vol. 85, no. 1, pp. 95–106, 1997.
[21] S. Coraluppi, C. Carthel, M. Luettgen, and S. Lynch, "All-source track and identity fusion," in Proceedings of the National Symposium on Sensor and Data Fusion, 2000.
[22] P. Storms and F. Spieksma, "An LP-based algorithm for the data association problem in multitarget tracking," in Proceedings of the 3rd IEEE International Conference on Information Fusion, vol. 1, 2000.
[23] S.-W. Joo and R. Chellappa, "A multiple-hypothesis approach for multiobject visual tracking," IEEE Transactions on Image Processing, vol. 16, no. 11, pp. 2849–2854, 2007.
[24] S. Coraluppi and C. Carthel, "Aggregate surveillance: a cardinality tracking approach," in Proceedings of the 14th International Conference on Information Fusion (FUSION '11), July 2011.

[25] K. C. Chang, C. Y. Chong, and Y. Bar-Shalom, "Joint probabilistic data association in distributed sensor networks," IEEE Transactions on Automatic Control, vol. 31, no. 10, pp. 889–897, 1986.
[26] C. Y. Chong, S. Mori, and K. C. Chang, "Information fusion in distributed sensor networks," in Proceedings of the 4th American Control Conference, Boston, Mass, USA, June 1985.
[27] C. Y. Chong, S. Mori, and K. C. Chang, "Distributed multitarget multisensor tracking," in Multitarget-Multisensor Tracking: Advanced Applications, vol. 1, pp. 247–295, 1990.
[28] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, San Mateo, Calif, USA, 1988.
[29] D. Koller and N. Friedman, Probabilistic Graphical Models: Principles and Techniques, MIT Press, 2009.
[30] L. Chen, M. Cetin, and A. S. Willsky, "Distributed data association for multi-target tracking in sensor networks," in Proceedings of the 7th International Conference on Information Fusion (FUSION '05), pp. 9–16, July 2005.
[31] L. Chen, M. J. Wainwright, M. Cetin, and A. S. Willsky, "Data association based on optimization in graphical models with application to sensor networks," Mathematical and Computer Modelling, vol. 43, no. 9-10, pp. 1114–1135, 2006.
[32] Y. Weiss and W. T. Freeman, "On the optimality of solutions of the max-product belief-propagation algorithm in arbitrary graphs," IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 736–744, 2001.
[33] C. Brown, H. Durrant-Whyte, J. Leonard, B. Rao, and B. Steer, "Distributed data fusion using Kalman filtering: a robotics application," in Data Fusion in Robotics and Machine Intelligence, M. A. Abidi and R. C. Gonzalez, Eds., pp. 267–309, 1992.
[34] R. E. Kalman, "A new approach to linear filtering and prediction problems," Journal of Basic Engineering, vol. 82, no. 1, pp. 35–45, 1960.
[35] R. C. Luo and M. G. Kay, "Data fusion and sensor integration: state-of-the-art 1990s," in Data Fusion in Robotics and Machine Intelligence, pp. 7–135, 1992.
[36] G. Welch and G. Bishop, An Introduction to the Kalman Filter, ACM SIGGRAPH 2001 Course Notes, 2001.
[37] S. J. Julier and J. K. Uhlmann, "A new extension of the Kalman filter to nonlinear systems," in Proceedings of the International Symposium on Aerospace/Defense Sensing, Simulation and Controls, vol. 3, 1997.
[38] E. A. Wan and R. Van Der Merwe, "The unscented Kalman filter for nonlinear estimation," in Proceedings of the Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC '00), pp. 153–158, 2000.
[39] D. Crisan and A. Doucet, "A survey of convergence results on particle filtering methods for practitioners," IEEE Transactions on Signal Processing, vol. 50, no. 3, pp. 736–746, 2002.
[40] J. Martinez-del Rincon, C. Orrite-Urunuela, and J. E. Herrero-Jaraba, "An efficient particle filter for color-based tracking in complex scenes," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance, pp. 176–181, 2007.
[41] S. Ganeriwal, R. Kumar, and M. B. Srivastava, "Timing-sync protocol for sensor networks," in Proceedings of the 1st International Conference on Embedded Networked Sensor Systems (SenSys '03), pp. 138–149, November 2003.
[42] M. Manzo, T. Roosta, and S. Sastry, "Time synchronization in networks," in Proceedings of the 3rd ACM Workshop on Security of Ad Hoc and Sensor Networks (SASN '05), pp. 107–116, November 2005.
[43] J. K. Uhlmann, "Covariance consistency methods for fault-tolerant distributed data fusion," Information Fusion, vol. 4, no. 3, pp. 201–215, 2003.
[44] S. Bashi, V. P. Jilkov, X. R. Li, and H. Chen, "Distributed implementations of particle filters," in Proceedings of the 6th International Conference of Information Fusion, pp. 1164–1171, 2003.
[45] M. Coates, "Distributed particle filters for sensor networks," in Proceedings of the 3rd International Symposium on Information Processing in Sensor Networks, pp. 99–107, ACM, New York, NY, USA, 2004.
[46] D. Gu, "Distributed particle filter for target tracking," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '07), pp. 3856–3861, April 2007.
[47] Y. Bar-Shalom, "Update with out-of-sequence measurements in tracking: exact solution," IEEE Transactions on Aerospace and Electronic Systems, vol. 38, no. 3, pp. 769–778, 2002.
[48] M. Orton and A. Marrs, "A Bayesian approach to multi-target tracking and data fusion with out-of-sequence measurements," IEE Colloquium, no. 174, pp. 151–155, 2001.
[49] M. L. Hernandez, A. D. Marrs, S. Maskell, and M. R. Orton, "Tracking and fusion for wireless sensor networks," in Proceedings of the 5th International Conference on Information Fusion, 2002.
[50] P. C. Mahalanobis, "On the generalized distance in statistics," Proceedings of the National Institute of Sciences of India, vol. 2, no. 1, pp. 49–55, 1936.
[51] S. J. Julier, J. K. Uhlmann, and D. Nicholson, "A method for dealing with assignment ambiguity," in Proceedings of the American Control Conference (ACC '04), vol. 5, pp. 4102–4107, July 2004.
[52] H. Pan, Z.-P. Liang, T. J. Anastasio, and T. S. Huang, "Hybrid NN-Bayesian architecture for information fusion," in Proceedings of the International Conference on Image Processing (ICIP '98), pp. 368–371, October 1998.
[53] C. Coue, T. Fraichard, P. Bessiere, and E. Mazer, "Multi-sensor data fusion using Bayesian programming: an automotive application," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 141–146, October 2002.
[54] D. L. Hall and J. Llinas, Handbook of Multisensor Data Fusion, CRC Press, Boca Raton, Fla, USA, 2001.
[55] A. P. Dempster, "A generalization of Bayesian inference," Journal of the Royal Statistical Society B, vol. 30, no. 2, pp. 205–247, 1968.
[56] G. Shafer, A Mathematical Theory of Evidence, Princeton University Press, Princeton, NJ, USA, 1976.
[57] G. M. Provan, "The validity of Dempster-Shafer belief functions," International Journal of Approximate Reasoning, vol. 6, no. 3, pp. 389–399, 1992.
[58] D. M. Buede, "Shafer-Dempster and Bayesian reasoning: a response to 'Shafer-Dempster reasoning with applications to multisensor target identification systems'," IEEE Transactions on Systems, Man and Cybernetics, vol. 18, no. 6, pp. 1009–1011, 1988.
[59] Y. Cheng and R. L. Kashyap, "Comparison of Bayesian and Dempster's rules in evidence combination," in Maximum-Entropy and Bayesian Methods in Science and Engineering, 1988.
[60] B. R. Cobb and P. P. Shenoy, "A comparison of Bayesian and belief function reasoning," Information Systems Frontiers, vol. 5, no. 4, pp. 345–358, 2003.
[61] H. Wu, M. Siegel, R. Stiefelhagen, and J. Yang, "Sensor fusion using Dempster-Shafer theory," in Proceedings of the 19th IEEE Instrumentation and Measurement Technology Conference (IMTC '02), pp. 7–11, May 2002.

[62] H. Wu, M. Siegel, and S. Ablay, "Sensor fusion using Dempster-Shafer theory II: static weighting and Kalman filter-like dynamic weighting," in Proceedings of the 20th IEEE Instrumentation and Measurement Technology Conference (IMTC '03), pp. 907–912, May 2003.
[63] E. Bosse, P. Valin, A.-C. Boury-Brisset, and D. Grenier, "Exploitation of a priori knowledge for information fusion," Information Fusion, vol. 7, no. 2, pp. 161–175, 2006.
[64] M. Morbee, L. Tessens, H. Aghajan, and W. Philips, "Dempster-Shafer based multi-view occupancy maps," Electronics Letters, vol. 46, no. 5, pp. 341–343, 2010.
[65] C. S. Peirce, Abduction and Induction: Philosophical Writings of Peirce, vol. 156, Dover, New York, NY, USA, 1955.
[66] A. M. Abdelbar, E. A. M. Andrews, and D. C. Wunsch II, "Abductive reasoning with recurrent neural networks," Neural Networks, vol. 16, no. 5-6, pp. 665–673, 2003.
[67] J. R. Aguero and A. Vargas, "Inference of operative configuration of distribution networks using fuzzy logic techniques. Part II: extended real-time model," IEEE Transactions on Power Systems, vol. 20, no. 3, pp. 1562–1569, 2005.
[68] A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, "Content-based image retrieval at the end of the early years," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1349–1380, 2000.
[69] D. S. Friedlander and S. Phoha, "Semantic information fusion for coordinated signal processing in mobile sensor networks," International Journal of High Performance Computing Applications, vol. 16, no. 3, pp. 235–241, 2002.
[70] D. S. Friedlander, "Semantic information extraction," in Distributed Sensor Networks, 2005.
[71] K. Whitehouse, J. Liu, and F. Zhao, "Semantic Streams: a framework for composable inference over sensor data," in Proceedings of the 3rd European Workshop on Wireless Sensor Networks, Lecture Notes in Computer Science, Springer, February 2006.
[72] I. J. Cox, "A review of statistical data association techniques for motion correspondence," International Journal of Computer Vision, vol. 10, no. 1, pp. 53–66, 1993.
[73] C. J. Veenman, M. J. T. Reinders, and E. Backer, "Resolving motion correspondence for densely moving points," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 1, pp. 54–72, 2001.
[74] F. Castanedo, M. A. Patricio, J. García, and J. M. Molina, "Bottom-up/top-down coordination in a multiagent visual sensor network," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS '07), pp. 93–98, September 2007.
[75] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Analysis of distributed fusion alternatives in coordinated vision agents," in Proceedings of the 11th International Conference on Information Fusion (FUSION '08), July 2008.
[76] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Data fusion to improve trajectory tracking in a cooperative surveillance multi-agent architecture," Information Fusion, vol. 11, no. 3, pp. 243–255, 2010.
[77] M. A. Davenport, C. Hegde, M. F. Duarte, and R. G. Baraniuk, "Joint manifolds for data fusion," IEEE Transactions on Image Processing, vol. 19, no. 10, pp. 2580–2594, 2010.



Figure 1: Whyte's classification based on the relations between the data sources: complementary fusion, redundant fusion, and cooperative fusion.

Figure 2: Dasarathy's classification: data in-data out (DAI-DAO), data in-feature out (DAI-FEO), feature in-feature out (FEI-FEO), feature in-decision out (FEI-DEO), and decision in-decision out (DEI-DEO).

(3) high-level fusion: this level, which is also known as decision fusion, takes symbolic representations as sources and combines them to obtain a more accurate decision. Bayesian methods are typically employed at this level;

(4) multiple-level fusion: this level addresses data provided from different levels of abstraction (i.e., when a measurement is combined with a feature to obtain a decision).

2.4. JDL Data Fusion Classification. This classification is the most popular conceptual model in the data fusion community. It was originally proposed by the JDL and the American Department of Defense (DoD) [1]. These organizations classified the data fusion process into five processing levels, an associated database, and an information bus that connects the five components (see Figure 3). The five levels can be grouped into low-level fusion and high-level fusion, and the architecture comprises the following components:

(i) sources: the sources are in charge of providing the input data. Different types of sources can be employed, such as sensors, a priori information (references or geographic data), databases, and human inputs;

(ii) human-computer interaction (HCI): HCI is an interface that allows inputs to the system from the operators and produces outputs to the operators. HCI includes queries, commands, and information on the obtained results and alarms;

(iii) database management system: the database management system stores the provided information and the fused results. This system is a critical component because of the large amount of highly diverse information that is stored.

In contrast, the five levels of data processing are defined as follows:

(1) level 0 (source preprocessing): source preprocessing is the lowest level of the data fusion process, and it includes fusion at the signal and pixel levels. In the case of text sources, this level also includes the information extraction process. This level reduces the amount of data and maintains useful information for the high-level processes.

(2) level 1 (object refinement): object refinement employs the processed data from the previous level. Common procedures of this level include spatiotemporal alignment, association, correlation, clustering or grouping techniques, state estimation, the removal of false positives, identity fusion, and the combination of features that were extracted from images. The output results of this stage are object discrimination (classification and identification) and object tracking (state of the object and orientation). This stage transforms the input information into consistent data structures.

Figure 3: The JDL data fusion framework: levels 0-3 (source preprocessing, object refinement, situation assessment, and threat assessment) connected by an information bus to the sources, the user interface, the database management system, and level 4 (process refinement).

(3) level 2 (situation assessment): this level focuses on a higher level of inference than level 1. Situation assessment aims to identify the likely situations given the observed events and obtained data. It establishes relationships between the objects. Relations (i.e., proximity, communication) are valued to determine the significance of the entities or objects in a specific environment. The aims of this level include performing high-level inferences and identifying significant activities and events (patterns in general). The output is a set of high-level inferences.

(4) level 3 (impact assessment): this level evaluates the impact of the activities detected in level 2 to obtain a proper perspective. The current situation is evaluated, and a future projection is performed to identify possible risks, vulnerabilities, and operational opportunities. This level includes (1) an evaluation of the risk or threat and (2) a prediction of the logical outcome.

(5) level 4 (process refinement): this level improves the process from level 0 to level 3 and provides resource and sensor management. The aim is to achieve efficient resource management while accounting for task priorities, scheduling, and the control of available resources. A minimal pipeline sketch of these five levels follows this list.
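For illustration only, the following Python skeleton shows one way the JDL levels could be chained as processing stages; all function names, data shapes, and thresholds are hypothetical, and the JDL model itself does not prescribe any particular implementation.

```python
from typing import Any, List

def level0_preprocess(raw: List[Any]) -> List[Any]:
    """Signal/pixel-level reduction of the raw source data."""
    return [r for r in raw if r is not None]

def level1_object_refinement(data: List[Any]) -> List[dict]:
    """Alignment, association, and state estimation produce tracks."""
    return [{"track_id": i, "state": d} for i, d in enumerate(data)]

def level2_situation_assessment(tracks: List[dict]) -> dict:
    """Relations between objects yield high-level inferences."""
    return {"n_objects": len(tracks), "relations": []}

def level3_impact_assessment(situation: dict) -> dict:
    """Project the situation forward to estimate risks."""
    return {"risk": "low" if situation["n_objects"] < 10 else "high"}

def level4_process_refinement(impact: dict) -> dict:
    """Feed back sensor/resource management decisions."""
    return {"reallocate_sensors": impact["risk"] == "high"}

raw_inputs = [0.3, None, 0.7]
impact = level3_impact_assessment(
    level2_situation_assessment(
        level1_object_refinement(level0_preprocess(raw_inputs))))
print(level4_process_refinement(impact))
```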

High-level fusion typically starts at level 2 because the type, localization, movement, and quantity of the objects are known at that level. One of the limitations of the JDL model is the question of how the uncertainty about previous or subsequent results could be employed to enhance the fusion process (feedback loop). Llinas et al. [6] proposed several refinements and extensions to the JDL model. Blasch and Plano [7] proposed adding a new level (user refinement) to support a human user in the data fusion loop. The JDL model represents the first effort to provide a detailed model and a common terminology for the data fusion domain. However, because its roots originate in the military domain, the employed terms are oriented toward the risks that commonly occur in those scenarios. The Dasarathy model differs from the JDL model with regard to the adopted terminology and approach: the former is oriented toward the differences between the input and output results, independent of the employed fusion method. In summary, the Dasarathy model provides a method for understanding the relations between the fusion tasks and the employed data, whereas the JDL model presents an appropriate fusion perspective for designing data fusion systems.

2.5. Classification Based on the Type of Architecture. One of the main questions that arises when designing a data fusion system is where the data fusion process will be performed. Based on this criterion, the following types of architectures can be identified:

(1) centralized architecture: in a centralized architecture, the fusion node resides in the central processor that receives the information from all of the input sources. Therefore, all of the fusion processes are executed in a central processor that uses the raw measurements provided by the sources. In this schema, the sources obtain only the observations as measurements and transmit them to a central processor, where the data fusion process is performed. If we assume that data alignment and data association are performed correctly and that the time required to transfer the data is not significant, then the centralized scheme is theoretically optimal. However, these assumptions typically do not hold for real systems. Moreover, the large amount of bandwidth that is required to send raw data through the network is another disadvantage of the centralized approach. This issue becomes a bottleneck when this type of architecture is employed for fusing data in visual sensor networks. Finally, the time delays when transferring the information between the different sources are variable and affect the results in the centralized scheme to a greater degree than in other schemes.

(2) decentralized architecture: a decentralized architecture is composed of a network of nodes in which each node has its own processing capabilities and there is no single point of data fusion. Therefore, each node fuses its local information with the information that is received from its peers. Data fusion is performed autonomously, with each node accounting for its local information and the information received from its peers. Decentralized data fusion algorithms typically communicate information using the Fisher and Shannon measures instead of the object's state [8]. The main disadvantage of this architecture is the communication cost, which is O(n^2) at each communication step in the extreme case in which each node communicates with all of its peers, where n is the number of nodes. Thus, this type of architecture could suffer from scalability problems when the number of nodes is increased.

(3) distributed architecture: in a distributed architecture, measurements from each source node are processed independently before the information is sent to the fusion node, and the fusion node accounts for the information that is received from the other nodes. In other words, the data association and state estimation are performed in the source node before the information is communicated to the fusion node. Therefore, each node provides an estimation of the object state based only on its local view, and this information is the input to the fusion process, which provides a fused global view. This type of architecture allows different options and variations that range from only one fusion node to several intermediate fusion nodes;

(4) hierarchical architecture: other architectures comprise a combination of decentralized and distributed nodes, generating hierarchical schemes in which the data fusion process is performed at different levels of the hierarchy.

In principle, a decentralized data fusion system is more difficult to implement because of the computation and communication requirements. However, in practice, there is no single best architecture, and the selection of the most appropriate architecture should be made depending on the requirements, demand, existing networks, data availability, node processing capabilities, and organization of the data fusion system.

The reader might think that the decentralized and distributed architectures are similar; however, they have meaningful differences (see Figure 4). First, in a distributed architecture, a preprocessing of the obtained measurements is performed, which provides a vector of features as a result (the features are fused thereafter). In contrast, in the decentralized architecture, the complete data fusion process is conducted in each node, and each of the nodes provides a globally fused result. Second, the decentralized fusion algorithms typically communicate information employing the Fisher and Shannon measures, whereas distributed algorithms typically share a common notion of state (position, velocity, and identity) with the associated probabilities, which are used to perform the fusion process [9]. Third, because the decentralized data fusion algorithms exchange information instead of states and probabilities, they have the advantage of easily separating old knowledge from new knowledge: the process is additive, and the order of association is not relevant when the information is received and fused. However, in the distributed data fusion algorithms (e.g., the distributed Kalman filter), the state that is going to be fused is not associative, and when and how the fused estimates are computed is relevant. Nevertheless, in contrast to the centralized architectures, the distributed algorithms reduce the necessary communication and computational costs because some tasks are computed in the distributed nodes before data fusion is performed in the fusion node.

3. Data Association Techniques

The data association problem must determine the set of measurements that correspond to each target (see Figure 5). Let us suppose that there are O targets that are being tracked by only one sensor in a cluttered environment (by a cluttered environment, we refer to an environment in which several targets are too close to each other). Then, the data association problem can be defined as follows:

(i) each sensor's observation is received in the fusion node at discrete time intervals;

(ii) the sensor might not provide observations at a specific interval;

(iii) some observations are noise, and other observations originate from the detected targets;

(iv) for any specific target and in every time interval, we do not know (a priori) the observations that will be generated by that target.

Therefore, the goal of data association is to establish the set of observations or measurements that are generated by the same target over time. Hall and Llinas [2] provided the following definition of data association: "the process of assigning and computing the weights that relate the observations or tracks (a track can be defined as an ordered set of points that follow a path and are generated by the same target) from one set to the observations or tracks of another set."

As an example of the complexity of the data association problem, if we take a frame-to-frame association and assume that M possible points could be detected in all n frames, then the number of possible assignment sets is on the order of (M!)^(n−1). Note that, from all of these possible solutions, only one set establishes the true movement of the M points.

Data association is often performed before the state estimation of the detected targets. Moreover, it is a key step because the estimation or classification will behave incorrectly if the data association phase does not work coherently. The data association process can also appear at all of the fusion levels, but the granularity varies depending on the objective of each level.

Figure 4: Classification based on the type of architecture: centralized, decentralized, and distributed schemes, showing the preprocessing, alignment, association, and estimation stages for sources S1, …, Sn.

In general, an exhaustive search of all possible combinations grows exponentially with the number of targets; thus, the data association problem becomes NP-complete. The most common techniques that are employed to solve the data association problem are presented in the following sections (Sections 3.1 to 3.7).

3.1. Nearest Neighbors and K-Means. Nearest neighbor (NN) is the simplest data association technique. NN is a well-known clustering algorithm that selects or groups the most similar values. How close one measurement is to another depends on the employed distance metric and typically on a threshold that is established by the designer. In general, the employed criteria could be based on (1) an absolute distance, (2) the Euclidean distance, or (3) a statistical function of the distance.

NN is a simple algorithm that can find a feasible (approximate) solution in a small amount of time. However, in a cluttered environment, it could provide many pairs that have the same probability and could thus produce undesirable error propagation [10]. Moreover, this algorithm performs poorly in environments in which false measurements are frequent, that is, in highly noisy environments.

Figure 5: Conceptual overview of the data association process from multiple sensors and multiple targets. It is necessary to establish the set of observations over time from the same object that forms a track.
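A minimal sketch of gated nearest-neighbor association follows; the function name, the Euclidean gate, and the greedy one-to-one assignment are illustrative choices and are not prescribed by [10].

```python
import numpy as np

def nearest_neighbor_associate(tracks, measurements, gate=9.0):
    """Greedy NN association: each track takes the closest unused
    measurement within a squared-Euclidean gate (illustrative value)."""
    assignments = {}
    used = set()
    for t_id, predicted in tracks.items():
        best, best_d2 = None, gate
        for m_id, z in enumerate(measurements):
            if m_id in used:
                continue
            d2 = float(np.sum((np.asarray(z) - np.asarray(predicted)) ** 2))
            if d2 < best_d2:
                best, best_d2 = m_id, d2
        if best is not None:
            assignments[t_id] = best
            used.add(best)
    return assignments

tracks = {"t1": [0.0, 0.0], "t2": [5.0, 5.0]}
measurements = [[4.8, 5.1], [0.2, -0.1], [9.0, 9.0]]
print(nearest_neighbor_associate(tracks, measurements))  # {'t1': 1, 't2': 0}
```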

The all-neighbors approach uses a similar technique in which all of the measurements inside a region are included in the tracks.

The K-Means method [11] is a well-known modification of the NN algorithm. K-Means divides the dataset values into K different clusters and finds the best localization of the cluster centroids, where best means a centroid that is in the center of its data cluster. K-Means is an iterative algorithm that can be divided into the following steps (a code sketch follows the list):

(1) obtain the input data and the number of desired clusters (K);

(2) randomly assign the centroid of each cluster;

(3) assign each data point to the nearest cluster centroid;

(4) move each cluster center to the centroid of its cluster;

(5) if the algorithm has not converged, return to step (3).

K-Means is a popular algorithm that has been widely employed; however, it has the following disadvantages:

(i) the algorithm does not always find the optimal solution for the cluster centers;

(ii) the number of clusters must be known a priori, and one must assume that this number is the optimum;

(iii) the algorithm assumes that the covariance of the dataset is irrelevant or that it has already been normalized.

There are several options for overcoming these limitations. For the first one, it is possible to execute the algorithm several times and keep the solution that has the least variance. For the second one, it is possible to start with a low value of K and increment the value of K until an adequate result is obtained. The third limitation can be easily overcome by multiplying the data with the inverse of the covariance matrix.

Many variations of Lloyd's basic K-Means algorithm [11] have been proposed; the basic algorithm has a computational upper-bound cost of O(Kn) per iteration, where n is the number of input points and K is the number of desired clusters. Some algorithms modify the initial cluster assignments to improve the separation and reduce the number of iterations. Others introduce soft or multinomial clustering assignments using fuzzy logic, probabilistic, or Bayesian techniques. However, most of the previous variations still must perform several iterations through the data space to converge to a reasonable solution. This issue becomes a major disadvantage in several real-time applications. A new approach that is based on having a large (but still affordable) number of cluster candidates, compared to the desired K clusters, is currently gaining attention. The idea behind this computational model is that the algorithm builds a good sketch of the original data while reducing the dimensionality of the input space significantly. In this manner, a weighted K-Means can be applied to the large set of candidate clusters to derive a good clustering of the original data. Using this idea, [12] presented an efficient and scalable K-Means algorithm that is based on random projections. This algorithm requires only one pass through the input data to build the clusters. More specifically, if the input data distribution satisfies some separability requirements, then the number of required candidate clusters grows only according to O(log n), where n is the number of observations in the original data. This salient feature makes the algorithm scalable in terms of both the memory and computational requirements.

3.2. Probabilistic Data Association. The probabilistic data association (PDA) algorithm was proposed by Bar-Shalom and Tse [13] and is also known as the modified all-neighbors filter. This algorithm assigns an association probability to each hypothesis from a valid measurement of a target. A valid measurement refers to an observation that falls inside the validation gate of the target at that time instant. The validation gate γ, centered on the predicted measurement of the target, is used to select the set of valid measurements and is defined by

γ ≥ (z(k) − ẑ(k | k − 1))^T S^{-1}(k) (z(k) − ẑ(k | k − 1)), (1)

where k is the time index, S(k) is the innovation covariance, and γ determines the gating (window) size. The set of valid measurements at time instant k is defined as

Z(k) = {z_i(k), i = 1, …, m_k}, (2)
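As a small sketch of the gate test in (1), the following function checks whether a measurement falls inside the validation region; the default threshold is an illustrative chi-square value (roughly the 99% point for 2D measurements), not a value prescribed by [13].

```python
import numpy as np

def in_validation_gate(z, z_pred, S, gamma=9.21):
    """Eq. (1): squared Mahalanobis distance between measurement z and
    the predicted measurement z_pred, tested against the gate gamma."""
    nu = np.asarray(z, float) - np.asarray(z_pred, float)
    d2 = float(nu @ np.linalg.inv(S) @ nu)
    return d2 <= gamma
```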

where z_i(k) is the i-th measurement in the validation region at time instant k. The standard equations of the PDA algorithm are given next. For the state prediction, consider

x̂(k | k − 1) = F(k − 1) x̂(k − 1 | k − 1), (3)

where F(k − 1) is the transition matrix at time instant k − 1. To calculate the measurement prediction, consider

ẑ(k | k − 1) = H(k) x̂(k | k − 1), (4)

where H(k) is the linearized measurement matrix. To compute the innovation of the i-th measurement, consider

ν_i(k) = z_i(k) − ẑ(k | k − 1). (5)

To calculate the covariance prediction, consider

P(k | k − 1) = F(k − 1) P(k − 1 | k − 1) F(k − 1)^T + Q(k), (6)

where Q(k) is the process noise covariance matrix. To compute the innovation covariance S and the Kalman gain K, consider

S(k) = H(k) P(k | k − 1) H(k)^T + R,
K(k) = P(k | k − 1) H(k)^T S(k)^{-1}. (7)

To obtain the covariance update in the case in which the measurements originated by the target are known, consider

P_0(k | k) = P(k | k − 1) − K(k) S(k) K(k)^T. (8)

The total update of the covariance is computed as

ν(k) = ∑_{i=1}^{m_k} β_i(k) ν_i(k),
P̃(k) = K(k) [∑_{i=1}^{m_k} β_i(k) ν_i(k) ν_i(k)^T − ν(k) ν(k)^T] K(k)^T, (9)

where m_k is the number of valid measurements at instant k. The equation to update the estimated state, which is formed by the position and velocity, is given by

x̂(k | k) = x̂(k | k − 1) + K(k) ν(k). (10)

Finally, the association probabilities of PDA are as follows:

β_i(k) = p_i(k) / ∑_{j=0}^{m_k} p_j(k), (11)

where

p_i(k) = (2π)^{M/2} λ √|S(k)| (1 − P_d P_g) / P_d, if i = 0,
p_i(k) = exp[−(1/2) ν_i(k)^T S^{-1}(k) ν_i(k)], if i ≠ 0,
p_i(k) = 0, in other cases, (12)

where M is the dimension of the measurement vector, λ is the density of the clutter environment, P_d is the detection probability of the correct measurement, and P_g is the validation probability of a detected value.

In the PDA algorithm, the state estimation of the target is computed as a weighted sum of the estimated states under all of the hypotheses. The algorithm can associate different measurements with one specific target; this association of the different measurements to a specific target helps PDA estimate the target state, with the association probabilities used as weights. The main disadvantages of the PDA algorithm are the following:

(i) loss of tracks: because PDA ignores the interference with other targets, it sometimes could wrongly associate the closest tracks. Therefore, it provides poor performance when the targets are close to each other or crossing;

(ii) suboptimal Bayesian approximation: when the source of information is uncertain, PDA is a suboptimal Bayesian approximation to the association problem;

(iii) one target: PDA was initially designed for the association of one target in a low-clutter environment. The number of false alarms is typically modeled with a Poisson distribution, and they are assumed to be distributed uniformly in space. PDA behaves incorrectly when there are multiple targets because the false alarm model does not hold;

(iv) track management: because PDA assumes that the track is already established, algorithms must be provided for track initialization and track deletion.

PDA is mainly suited for tracking targets that do not make abrupt changes in their movement patterns; PDA will most likely lose a target that maneuvers abruptly.
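Putting equations (3)-(12) together, the following Python sketch performs one PDA update cycle for a single target, assuming linear models, at least one validated measurement, and illustrative default values for P_d, P_g, and the clutter density λ; the final covariance combines (8) and (9) weighted by β_0, in the standard PDA form.

```python
import numpy as np

def pda_update(x, P, F, H, Q, R, Z, Pd=0.9, Pg=0.99, lam=1e-3):
    """One PDA cycle for a single target, following eqs. (3)-(12).
    Z is a nonempty list of validated measurement vectors."""
    x_pred = F @ x                                  # state prediction (3)
    P_pred = F @ P @ F.T + Q                        # covariance prediction (6)
    z_pred = H @ x_pred                             # measurement prediction (4)
    S = H @ P_pred @ H.T + R                        # innovation covariance (7)
    S_inv = np.linalg.inv(S)
    K = P_pred @ H.T @ S_inv                        # Kalman gain (7)
    nus = [np.asarray(z, float) - z_pred for z in Z]  # innovations (5)
    M = z_pred.shape[0]
    p0 = (2*np.pi)**(M/2) * lam * np.sqrt(np.linalg.det(S)) * (1 - Pd*Pg) / Pd
    p = [p0] + [float(np.exp(-0.5 * nu @ S_inv @ nu)) for nu in nus]   # (12)
    beta = np.array(p) / np.sum(p)                  # association probs (11)
    nu_tot = sum(b * nu for b, nu in zip(beta[1:], nus))
    x_new = x_pred + K @ nu_tot                     # state update (10)
    P0 = P_pred - K @ S @ K.T                       # covariance update (8)
    spread = (sum(b * np.outer(nu, nu) for b, nu in zip(beta[1:], nus))
              - np.outer(nu_tot, nu_tot))           # spread term of (9)
    P_new = beta[0] * P_pred + (1 - beta[0]) * P0 + K @ spread @ K.T
    return x_new, P_new
```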

3.3. Joint Probabilistic Data Association. Joint probabilistic data association (JPDA) is a suboptimal approach for tracking multiple targets in cluttered environments [14]. JPDA is similar to PDA, with the difference that the association probabilities are computed using all of the observations and all of the targets. Thus, in contrast to PDA, JPDA considers various hypotheses together and combines them. JPDA determines the probability β_i^t(k) that measurement i originated from target t, accounting for the fact that, under this hypothesis, the measurement cannot be generated by other targets. Therefore, for a known number of targets, it evaluates the different options of measurement-target association (for the most recent set of measurements) and combines them into the corresponding state estimation. If the association probabilities are known, then the Kalman filter update equation of track t can be written as

x̂^t(k | k) = x̂^t(k | k − 1) + K(k) ν^t(k), (13)

where x̂^t(k | k) and x̂^t(k | k − 1) are the estimation and prediction of target t, and K(k) is the filter gain. The weighted sum of the residuals associated with the m(k) observations of target t is as follows:

ν^t(k) = ∑_{i=1}^{m(k)} β_i^t(k) ν_i^t(k), (14)

where ν_i^t = z_i(k) − H x̂^t(k | k − 1). Therefore, this method incorporates all of the observations (inside the neighborhood of the target's predicted position) to update the estimated position by using a posterior probability that is a weighted sum of residuals.

The main restrictions of JPDA are the following:

(i) a measurement cannot come from more than one target;

(ii) two measurements cannot originate from the same target (at one time instant);

(iii) the sum of all of the measurement probabilities that are assigned to one target must be 1: ∑_{i=0}^{m(k)} β_i^t(k) = 1.

The main disadvantages of JPDA are the following:

(i) it requires an explicit mechanism for track initialization; similar to PDA, JPDA cannot initialize new tracks or remove tracks that are out of the observation area;

(ii) JPDA is a computationally expensive algorithm when it is applied in environments that have multiple targets because the number of hypotheses increases exponentially with the number of targets.

In general, JPDA is more appropriate than MHT in situations in which the density of false measurements is high (i.e., sonar applications).
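For small problems, the joint association probabilities β_i^t(k) can be obtained by brute-force enumeration of the feasible joint events (each target takes at most one measurement and each measurement is used at most once). The following sketch does exactly that for T targets and M measurements, given a matrix of measurement likelihoods; it omits the clutter-density term of the full JPDA formulation for brevity, and all names and default values are illustrative.

```python
import numpy as np

def jpda_betas(L, Pd=0.9):
    """beta[t, i]: column 0 is the missed-detection probability; column
    m + 1 is the probability that measurement m originated from target t.
    L[t, m] is the likelihood of measurement m under target t."""
    T, M = L.shape
    betas = np.zeros((T, M + 1))

    def events(t, used):
        # Enumerate joint events: a measurement index or None per target,
        # with no measurement assigned twice.
        if t == T:
            yield []
            return
        for o in [None] + list(range(M)):
            if o is not None and o in used:
                continue
            nxt = used | {o} if o is not None else used
            for rest in events(t + 1, nxt):
                yield [o] + rest

    for ev in events(0, frozenset()):
        w = 1.0
        for t, o in enumerate(ev):
            w *= (1 - Pd) if o is None else Pd * L[t, o]
        for t, o in enumerate(ev):
            betas[t, 0 if o is None else o + 1] += w
    return betas / betas.sum(axis=1, keepdims=True)
```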

3.4. Multiple Hypothesis Test. The underlying idea of the multiple hypothesis test (MHT) is to use more than two consecutive observations to make an association with better results. Algorithms that use only two consecutive observations have a higher probability of generating an error. In contrast to PDA and JPDA, MHT estimates all of the possible hypotheses and maintains new hypotheses in each iteration.

MHT was developed to track multiple targets in cluttered environments; as a result, it combines the data association problem and tracking into a unified framework, becoming an estimation technique as well. The Bayes rule or Bayesian networks are commonly employed to calculate the MHT hypotheses. In general, researchers have claimed that MHT outperforms JPDA for lower densities of false positives. However, the main disadvantage of MHT is its computational cost when the number of tracks or false positives increases. Pruning the hypothesis tree using a window could alleviate this limitation.

The Reid [15] tracking algorithm is considered the standard MHT algorithm, but the initial integer programming formulation of the problem is due to Morefield [16]. MHT is an iterative algorithm in which each iteration starts with a set of correspondence hypotheses. Each hypothesis is a collection of disjoint tracks, and the prediction of each target in the next time instant is computed for each hypothesis. Next, the predictions are compared with the new observations by using a distance metric. The set of associations established in each hypothesis (based on a distance) introduces new hypotheses in the next iteration. Each new hypothesis represents a new set of tracks that is based on the current observations.

Note that each new measurement could come from (i) a new target in the visual field of view, (ii) a target being tracked, or (iii) noise in the measurement process. It is also possible that a measurement is not assigned to a target, either because the target disappears or because it is not possible to obtain a measurement of the target at that time instant.

MHT maintains several correspondence hypotheses for each target in each frame. If the set of hypotheses at instant k is represented by H(k) = {h_l(k), l = 1, …, n}, then the probability of hypothesis h_l(k) can be represented recursively using the Bayes rule as follows:

P(h_l(k) | Z(k)) = P(h_g(k − 1), a_i(k) | Z(k))
 = (1/c) P(Z(k) | h_g(k − 1), a_i(k)) · P(a_i(k) | h_g(k − 1)) · P(h_g(k − 1)), (15)

where h_g(k − 1) is hypothesis g of the complete set up to time instant k − 1, a_i(k) is the i-th possible association of the track to the object, Z(k) is the set of detections of the current frame, and c is a normalization constant.

The first term on the right side of the previous equation is the likelihood function of the measurement set Z(k) given the joint likelihood and current hypothesis. The second term is the probability of the association hypothesis of the current data given the previous hypothesis h_g(k − 1). The third term is the probability of the previous hypothesis from which the current hypothesis is calculated.
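As an illustration of how (15) drives the hypothesis tree, the following Python sketch expands every parent hypothesis with every feasible association, scores the children with the three factors of (15), normalizes them, and prunes to the best few; the data structures, the pruning rule, and the assoc_options callback are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Tuple

@dataclass
class Hypothesis:
    prob: float                                  # P(h_l(k) | Z(k))
    assignments: List[int] = field(default_factory=list)

def mht_iteration(parents: List[Hypothesis],
                  assoc_options: Callable[[Hypothesis],
                                          Iterable[Tuple[int, float, float]]],
                  max_keep: int = 10) -> List[Hypothesis]:
    """One MHT scan: expand each parent with every feasible association
    a_i(k), score children as likelihood * P(a_i | h_g) * P(h_g) per
    eq. (15), normalize (the constant c), and prune to max_keep."""
    children = []
    for h in parents:
        for a_id, p_assoc, likelihood in assoc_options(h):
            children.append(Hypothesis(h.prob * p_assoc * likelihood,
                                       h.assignments + [a_id]))
    total = sum(c.prob for c in children) or 1.0
    for c in children:
        c.prob /= total
    children.sort(key=lambda c: c.prob, reverse=True)
    return children[:max_keep]                   # pruning keeps it tractable
```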

The MHT algorithm has the ability to detect a new track while maintaining the hypothesis tree structure. The probability of a true track is given by the Bayes decision model as

P(λ | Z) = P(Z | λ) P_0(λ) / P(Z), (16)

where P(Z | λ) is the probability of obtaining the set of measurements Z given λ, P_0(λ) is the a priori probability of the source signal, and P(Z) is the probability of obtaining the set of detections Z.

MHT considers all of the possibilities, including track maintenance, track initialization, and track removal, in an integrated framework. MHT calculates the possibility of having an object after the generation of a set of measurements using an exhaustive approach, and the algorithm does not assume a fixed number of targets. The key challenge of MHT is effective hypothesis management. The baseline MHT algorithm can be extended as follows: (i) use hypothesis aggregation for missed targets, births, cardinality tracking, and closely spaced objects; (ii) apply a multistage MHT to improve the performance and robustness in challenging settings; and (iii) use a feature-aided MHT for extended object surveillance.

The main disadvantage of this algorithm is its computational cost, which grows exponentially with the number of tracks and measurements. Therefore, the practical implementation of this algorithm is limited because it is exponential in both time and memory.
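To make the hypothesis-management mechanics concrete, the following minimal sketch (in Python with NumPy; the one-dimensional tracks, the Gaussian measurement likelihood, and all parameter values are illustrative assumptions, not part of the original MHT formulations) expands every hypothesis with all feasible joint associations of the new detections, scores each child with the Bayes recursion of (15), and prunes the tree to the K most probable leaves:

    import numpy as np
    from itertools import product

    def mht_step(hypotheses, detections, sigma=1.0, p_det=0.9, top_k=50):
        # Each hypothesis is (positions, log_prob): `positions` holds the
        # current estimate of every track. Each track is associated with one
        # detection or with None (missed detection); every joint assignment
        # spawns a child hypothesis scored with the recursion of Eq. (15).
        options = list(detections) + [None]
        children = []
        for positions, log_p in hypotheses:
            for assign in product(options, repeat=len(positions)):
                used = [a for a in assign if a is not None]
                if len(used) != len(set(used)):
                    continue  # disjoint tracks: a detection feeds one track at most
                log_lik = 0.0
                new_pos = []
                for x, z in zip(positions, assign):
                    if z is None:
                        log_lik += np.log(1.0 - p_det)
                        new_pos.append(x)
                    else:
                        log_lik += (np.log(p_det)
                                    - 0.5 * ((z - x) / sigma) ** 2
                                    - np.log(sigma * np.sqrt(2.0 * np.pi)))
                        new_pos.append(z)
                children.append((new_pos, log_p + log_lik))
        logs = np.array([lp for _, lp in children])
        logs -= np.logaddexp.reduce(logs)        # the 1/c normalization
        best = np.argsort(logs)[::-1][:top_k]    # prune the hypothesis tree
        return [(children[i][0], logs[i]) for i in best]

    # usage: two tracks, three detections in one frame
    hyps = [([0.0, 5.0], 0.0)]
    hyps = mht_step(hyps, [0.2, 4.8, 9.9])
    print(hyps[0])

Pruning after normalization is what keeps the otherwise exponential tree tractable, at the price of discarding low-probability association histories.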

With the aim of reducing the computational cost, Streit and Luginbuhl [17] presented a probabilistic MHT algorithm, known as PMHT, in which the associations are considered to be statistically independent random variables, thereby avoiding an exhaustive search enumeration. The PMHT algorithm assumes that the number of targets and measurements is known. With the same goal of reducing the computational cost, Cox and Hingorani [18] presented an efficient implementation of the MHT algorithm. This implementation was the first version to be applied to tracking in visual environments. They employed Murty's algorithm [19] to determine the best set of $k$ hypotheses in polynomial time, with the goal of tracking points of interest.

MHT typically performs the tracking process by employing only one characteristic, commonly the position. The Bayesian combination of multiple characteristics was proposed by Liggins II et al. [20].

A linear-programming-based relaxation approach to the optimization problem in MHT tracking was proposed independently by Coraluppi et al. [21] and Storms and Spieksma [22]. Joo and Chellappa [23] proposed an association algorithm for tracking multiple targets in visual environments. Their algorithm is based on an MHT modification in which a measurement can be associated with more than one target, and several targets can be associated with one measurement. They also proposed a combinatorial optimization algorithm to generate the best set of association hypotheses. Their algorithm always finds the best hypothesis, in contrast to other models, which are approximate. Coraluppi and Carthel [24] presented a generalization of the MHT algorithm using a recursion over hypothesis classes rather than over single hypotheses. This work has been applied to a special case of the multitarget tracking problem, called cardinality tracking, in which the number of sensor measurements is observed instead of the target states.

3.5. Distributed Joint Probabilistic Data Association. The distributed version of the joint probabilistic data association (JPDA-D) was presented by Chang et al. [25]. In this technique, the estimated state of the target (using two sensors) after association is given by

\[ E[x \mid Z^1, Z^2] = \sum_{j=0}^{m_1} \sum_{l=0}^{m_2} E[x \mid \chi^1_j, \chi^2_l, Z^1, Z^2]\; P(\chi^1_j, \chi^2_l \mid Z^1, Z^2), \quad (17) \]

where $m_i$ ($i = 1, 2$) denotes the last set of measurements of sensors 1 and 2, $Z^i$ ($i = 1, 2$) is the set of accumulated data, and $\chi$ is the association hypothesis. The first term on the right-hand side of the equation is calculated from the associations made earlier. The second term is computed from the individual association probabilities as follows:

\[ P(\chi^1_j, \chi^2_l \mid Z^1, Z^2) = \sum_{\chi^1} \sum_{\chi^2} P(\chi^1, \chi^2 \mid Z^1, Z^2)\; \hat{\omega}^1_j(\chi^1)\; \hat{\omega}^2_l(\chi^2), \]
\[ P(\chi^1, \chi^2 \mid Z^1, Z^2) = \frac{1}{c}\, P(\chi^1 \mid Z^1)\; P(\chi^2 \mid Z^2)\; \gamma(\chi^1, \chi^2), \quad (18) \]

where $\chi^i$ are the joint hypotheses involving all of the measurements and all of the targets, and $\hat{\omega}^i_j(\chi^i)$ are the binary indicators of the measurement-target association. The additional term $\gamma(\chi^1, \chi^2)$ depends on the correlation of the individual hypotheses and reflects the influence of the localization of the current measurements on the joint hypotheses.

These equations are obtained under the assumption that communication takes place after every observation; only approximations exist for the case in which communication is sporadic or a substantial amount of noise occurs. Therefore, this algorithm is a theoretical model with some limitations in practical applications.

3.6. Distributed Multiple Hypothesis Test. The distributed version of the MHT algorithm (MHT-D) [26, 27] follows a structure similar to that of the JPDA-D algorithm. Let us assume the case in which one node must fuse two sets of hypotheses and tracks. If the hypothesis and track sets are represented by $H^i(Z^i)$ and $T^i(Z^i)$ with $i = 1, 2$, the hypothesis probabilities are represented by $P(\lambda^i_j)$, and the state distributions of the tracks $\tau^i_j$ are represented by $p(x \mid Z^i, \tau^i_j)$, then the maximum available information in the fusion node is $Z = Z^1 \cup Z^2$. The data fusion objective of the MHT-D is to obtain the set of hypotheses $H(Z)$, the set of tracks $T(Z)$, the hypothesis probabilities $P(\lambda \mid Z)$, and the state distributions $p(x \mid Z, \tau)$ for the observed data.

The MHT-D algorithm is composed of the following steps:

(1) hypothesis formation: for each pair of hypotheses $\lambda^1_j$ and $\lambda^2_k$ that could be fused, a track $\tau$ is formed by associating the pair of tracks $\tau^1_j$ and $\tau^2_k$, where each track comes from one node and both could originate from the same target. The final result of this stage is a set of hypotheses, denoted by $H(Z)$, and the fused tracks $T(Z)$;

(2) hypothesis evaluation: in this stage, the association probability of each hypothesis and the estimated state of each fused track are obtained. The distributed estimation algorithm is employed to calculate the likelihood of the possible associations and the estimations obtained for each specific association.

Using the information model, the probability of each fused hypothesis is given by

\[ P(\lambda \mid Z) = C^{-1} \prod_{j \in J} P(\lambda^{(j)} \mid Z^{(j)})^{\alpha(j)} \prod_{\tau \in \lambda} L(\tau \mid Z), \quad (19) \]

where $C$ is a normalizing constant and $L(\tau \mid Z)$ is the likelihood of each hypothesis pair.

The main disadvantage of the MHT-D is its high computational cost, which is on the order of $O(n^M)$, where $n$ is the number of possible associations and $M$ is the number of variables to be estimated.

3.7. Graphical Models. Graphical models are a formalism for representing and reasoning with probabilities and independence. A graphical model represents a conditional decomposition of the joint probability. A graphical model can be represented as a graph in which the nodes denote random variables, the edges denote the possible dependences between the random variables, and the plates denote the replication of a substructure with the appropriate indexing of the relevant variables. The graph captures the joint distribution over the random variables, which can be decomposed into a product of factors that each depend on only a subset of the variables. There are two major classes of graphical models: (i) Bayesian networks [28], which are also known as directed graphical models, and (ii) Markov random fields, which are also known as undirected graphical models. Directed graphical models are useful for expressing causal relationships between random variables, whereas undirected models are better suited for expressing soft constraints between random variables. We refer the reader to the book by Koller and Friedman [29] for more information on graphical models.

A framework based on graphical models can solve the problem of distributed data association in synchronized sensor networks with overlapping areas in which each sensor receives noisy measurements; this solution was proposed by Chen et al. [30, 31]. Their work uses graphical models to represent the statistical dependence between random variables. The data association problem is treated as an inference problem and solved using the max-product algorithm [32]. Graphical models represent statistical dependencies between variables as graphs, and the max-product algorithm converges when the graph is a tree structure. Moreover, the employed algorithm can be implemented in a distributed manner by exchanging messages between the source nodes in parallel. With this algorithm, if each sensor has $n$ possible combinations of associations and there are $M$ variables to be estimated, the complexity is $O(n^2 M)$, which is reasonable and less than the $O(n^M)$ complexity of the MHT-D algorithm. However, special attention must be given to correlated variables when building the graphical model.

4. State Estimation Methods

State estimation techniques aim to determine the state of the target under movement (typically its position) given the observations or measurements. State estimation techniques are also known as tracking techniques. In their general form, it is not guaranteed that the target observations are relevant, which means that some of the observations could actually come from the target while others could be only noise. The state estimation phase is a common stage in data fusion algorithms because the target's observations could come from different sensors or sources, and the final goal is to obtain a global target state from the observations.

The estimation problem involves finding the values of the state vector (e.g., position, velocity, and size) that fit as closely as possible with the observed data. From a mathematical perspective, we have a set of redundant observations, and the goal is to find the set of parameters that provides the best fit to the observed data. In general, these observations are corrupted by errors and by the propagation of noise in the measurement process. State estimation methods fall under level 1 of the JDL classification and can be divided into two broader groups:

(1) linear dynamics and measurements: here, the estimation problem has a standard solution. Specifically, when the equations of the object state and of the measurements are linear, the noise follows a Gaussian distribution, and the environment is not cluttered, the optimal theoretical solution is based on the Kalman filter;

(2) nonlinear dynamics: the state estimation problem becomes difficult, and there is no analytical solution for solving the problem in a general manner. In principle, there are no practical algorithms available to solve this problem satisfactorily.

Most of the state estimation methods are based on control theory and employ the laws of probability to compute a state vector from a measurement vector or a stream of measurement vectors. The most common estimation methods are presented next: maximum likelihood and maximum a posteriori (Section 4.1), the Kalman filter (Section 4.2), the particle filter (Section 4.3), the distributed Kalman filter (Section 4.4), the distributed particle filter (Section 4.5), and covariance consistency methods (Section 4.6).

4.1. Maximum Likelihood and Maximum A Posteriori. The maximum likelihood (ML) technique is an estimation method based on probabilistic theory. Probabilistic estimation methods are appropriate when the state variable follows an unknown probability distribution [33]. In the context of data fusion, $x$ is the state being estimated and $z = \{z(1), \ldots, z(k)\}$ is a sequence of $k$ previous observations of $x$. The likelihood function $\lambda(x)$ is defined as the probability density function of the sequence of observations $z$ given the true value of the state $x$:

\[ \lambda(x) = p(z \mid x). \quad (20) \]

The ML estimator finds the value of $x$ that maximizes the likelihood function:

\[ \hat{x}(k) = \arg\max_x\, p(z \mid x), \quad (21) \]


which can be obtained from analytical or empirical models of the sensors. This function expresses the probability of the observed data. The main disadvantage of this method in practice is that the analytical or empirical model of the sensor must be known in order to provide the prior distribution and compute the likelihood function. This method can also systematically underestimate the variance of the distribution, which leads to a bias problem. However, the bias of the ML solution becomes less significant as the number $N$ of data points increases, and the estimated variance equals the true variance of the distribution that generated the data in the limit $N \rightarrow \infty$.

The maximum a posteriori (MAP) method is based on Bayesian theory. It is employed when the parameter $x$ to be estimated is the output of a random variable that has a known probability density function $p(x)$. In the context of data fusion, $x$ is the state being estimated, and $z = \{z(1), \ldots, z(k)\}$ is a sequence of $k$ previous observations of $x$. The MAP estimator finds the value of $x$ that maximizes the posterior probability distribution:

\[ \hat{x}(k) = \arg\max_x\, p(x \mid z). \quad (22) \]

Both methods (ML and MAP) aim to find the most likely value of the state $x$. However, ML assumes that $x$ is a fixed but unknown point of the parameter space, whereas MAP considers $x$ to be the output of a random variable with a known a priori probability density function. The two methods are equivalent when there is no a priori information about $x$, that is, when only the observations are available.
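As a worked illustration of (21) and (22), consider a constant scalar state observed $k$ times under Gaussian noise, together with an assumed Gaussian prior; both estimators then have closed forms, and the sketch below (all models and parameter values are illustrative assumptions) shows how the MAP estimate collapses to the ML estimate as the prior becomes flat:

    import numpy as np

    def ml_map_estimates(z, meas_var, prior_mean=0.0, prior_var=1.0):
        # ML and MAP estimates of a constant scalar state x observed k times
        # under Gaussian noise, z(i) = x + v(i), v ~ N(0, meas_var), with an
        # assumed Gaussian prior x ~ N(prior_mean, prior_var). The ML estimate
        # is the sample mean; MAP shrinks it toward the prior mean.
        z = np.asarray(z, dtype=float)
        k = z.size
        x_ml = z.mean()                                   # argmax_x p(z | x)
        w = (k / meas_var) / (k / meas_var + 1.0 / prior_var)
        x_map = w * x_ml + (1.0 - w) * prior_mean         # argmax_x p(x | z)
        return x_ml, x_map

    # with a very wide prior the two estimates coincide
    z = [1.2, 0.8, 1.1, 0.9]
    print(ml_map_estimates(z, meas_var=0.04))                  # informative prior
    print(ml_map_estimates(z, meas_var=0.04, prior_var=1e9))   # ML and MAP agree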

4.2. The Kalman Filter. The Kalman filter is the most popular estimation technique. It was originally proposed by Kalman [34] and has been widely studied and applied since then. The Kalman filter estimates the state $x$ of a discrete-time process governed by the space-time model

\[ x(k+1) = \Phi(k)\, x(k) + G(k)\, u(k) + w(k), \quad (23) \]

with the observations or measurements $z$ at time $k$ of the state $x$ represented by

\[ z(k) = H(k)\, x(k) + v(k), \quad (24) \]

where $\Phi(k)$ is the state transition matrix, $G(k)$ is the input transition matrix, $u(k)$ is the input vector, $H(k)$ is the measurement matrix, and $w$ and $v$ are random Gaussian variables with zero mean and covariance matrices $Q(k)$ and $R(k)$, respectively. Based on the measurements and on the system parameters, the estimation of $x(k)$, represented by $\hat{x}(k)$, and the prediction of $x(k+1)$, represented by $\hat{x}(k+1 \mid k)$, are given by

\[ \hat{x}(k) = \hat{x}(k \mid k-1) + K(k)\left[z(k) - H(k)\,\hat{x}(k \mid k-1)\right], \]
\[ \hat{x}(k+1 \mid k) = \Phi(k)\,\hat{x}(k \mid k) + G(k)\, u(k), \quad (25) \]

respectively, where $K$ is the filter gain, determined by

\[ K(k) = P(k \mid k-1)\, H^T(k) \left[H(k)\, P(k \mid k-1)\, H^T(k) + R(k)\right]^{-1}, \quad (26) \]

where $P(k \mid k-1)$ is the prediction covariance matrix, which can be determined by

\[ P(k+1 \mid k) = \Phi(k)\, P(k)\, \Phi^T(k) + Q(k), \quad (27) \]

with

\[ P(k) = P(k \mid k-1) - K(k)\, H(k)\, P(k \mid k-1). \quad (28) \]
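The following minimal sketch implements one predict-update cycle of (25)-(28); the constant-velocity model and the noise covariances in the usage example are illustrative assumptions rather than part of the filter itself:

    import numpy as np

    def kalman_step(x, P, z, Phi, G, u, H, Q, R):
        # One predict-update cycle of Eqs. (25)-(28).
        # x, P: previous filtered state estimate and its covariance; z: measurement.
        x_pred = Phi @ x + G @ u                 # prediction, Eq. (25b)
        P_pred = Phi @ P @ Phi.T + Q             # Eq. (27)
        S = H @ P_pred @ H.T + R
        K = P_pred @ H.T @ np.linalg.inv(S)      # gain, Eq. (26)
        x_new = x_pred + K @ (z - H @ x_pred)    # update, Eq. (25a)
        P_new = P_pred - K @ H @ P_pred          # Eq. (28)
        return x_new, P_new

    # usage: constant-velocity target in 1D, position-only measurements
    dt = 1.0
    Phi = np.array([[1.0, dt], [0.0, 1.0]])
    G = np.zeros((2, 1))
    u = np.zeros(1)                              # no control input
    H = np.array([[1.0, 0.0]])
    Q = 0.01 * np.eye(2)
    R = np.array([[0.25]])
    x, P = np.zeros(2), np.eye(2)
    for z in [0.9, 2.1, 2.9, 4.2]:
        x, P = kalman_step(x, P, np.array([z]), Phi, G, u, H, Q, R)
    print(x)  # fused position/velocity estimate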

The Kalman filter is mainly employed to fuse low-level data. If the system can be described by a linear model and the error can be modeled as Gaussian noise, then the recursive Kalman filter obtains optimal statistical estimations [35]. However, other methods are required to address nonlinear dynamic models and nonlinear measurements. The modified Kalman filter, known as the extended Kalman filter (EKF), is an optimal approach for implementing nonlinear recursive filters [36]. The EKF is one of the most often employed methods for fusing data in robotic applications. However, it has some disadvantages because the computation of the Jacobians is extremely expensive. Some attempts have been made to reduce the computational cost, such as linearization, but these attempts introduce errors in the filter and make it unstable.

The unscented Kalman filter (UKF) [37] has gained popularity because it does not require the linearization step of the EKF and thus avoids the associated errors [38]. The UKF employs a deterministic sampling strategy to establish a minimal set of points around the mean. This set of points captures the true mean and covariance completely. These points are then propagated through the nonlinear functions, and the covariance of the estimations can be recovered. Another advantage of the UKF is its suitability for parallel implementations.

4.3. Particle Filter. Particle filters are recursive implementations of sequential Monte Carlo methods [39]. This method builds the posterior density function using several random samples called particles. Particles are propagated over time through a combination of sampling and resampling steps. At each iteration, the resampling step is employed to discard some particles, increasing the relevance of regions with a higher posterior probability. In the filtering process, several particles of the same state variable are employed, and each particle has an associated weight that indicates the quality of the particle. Therefore, the estimation is the result of a weighted sum of all of the particles. The standard particle filter algorithm has two phases: (1) the prediction phase and (2) the updating phase. In the prediction phase, each particle is modified according to the existing model, including the addition of random noise to simulate the noise effect. Then, in the updating phase, the weight of each particle is reevaluated using the last available sensor observation, and particles with lower weights are removed. Specifically, a generic particle filter comprises the following steps.


(1) Initialization of the particles:

(i) let $N$ be equal to the number of particles;
(ii) $X^{(i)}(1) = [x(1), y(1), 0, 0]^T$ for $i = 1, \ldots, N$.

(2) Prediction step:

(i) for each particle $i = 1, \ldots, N$, evaluate the state $(k+1 \mid k)$ of the system using the state at time instant $k$ and the system noise at time $k$:

\[ X^{(i)}(k+1 \mid k) = F(k)\, X^{(i)}(k) + (\text{Cauchy-distribution noise})(k), \quad (29) \]

where $F(k)$ is the transition matrix of the system.

(3) Evaluation of the particle weights. For each particle $i = 1, \ldots, N$:

(i) compute the predicted observation of the system using the current predicted state and the measurement noise at instant $k+1$:

\[ \hat{z}^{(i)}(k+1 \mid k) = H(k+1)\, X^{(i)}(k+1 \mid k) + (\text{Gaussian measurement noise})(k+1); \quad (30) \]

(ii) compute the likelihood (weight) of each particle according to the given distribution:

\[ \text{likelihood}^{(i)} = N\left(\hat{z}^{(i)}(k+1 \mid k);\; z(k+1),\, \text{var}\right); \quad (31) \]

(iii) normalize the weights as follows:

\[ w^{(i)} = \frac{\text{likelihood}^{(i)}}{\sum_{j=1}^{N} \text{likelihood}^{(j)}}. \quad (32) \]

(4) Resampling/selection: multiply the particles with higher weights and remove those with lower weights. The current state must be adjusted using the computed weights of the new particles:

(i) compute the cumulative weights:

\[ \text{CumWt}^{(i)} = \sum_{j=1}^{i} w^{(j)}; \quad (33) \]

(ii) generate uniformly distributed random variables $U^{(i)} \sim \mathcal{U}(0, 1)$, with the number of draws equal to the number of particles;

(iii) determine which particles should be multiplied and which ones removed.

(5) Propagation phase:

(i) incorporate the new values of the state after the resampling at instant $k$ to calculate the value at instant $k+1$:

\[ x^{(1, \ldots, N)}(k+1 \mid k+1) = x(k+1 \mid k); \quad (34) \]

(ii) compute the posterior mean:

\[ \hat{x}(k+1) = \text{mean}\left[x^{(i)}(k+1 \mid k+1)\right], \quad i = 1, \ldots, N; \quad (35) \]

(iii) repeat steps 2 to 5 for each time instant.
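A minimal bootstrap implementation of steps (2)-(5) is sketched below; the constant-velocity model, the Cauchy process noise (echoing (29)), and the Gaussian measurement likelihood are illustrative assumptions:

    import numpy as np

    def particle_filter_step(particles, z, F, H, sys_noise, meas_var, rng):
        # One iteration of the generic particle filter above.
        # particles: (N, d) array of state samples; z: measurement vector;
        # sys_noise: callable returning (N, d) process-noise samples.
        N = len(particles)
        # (2) prediction: propagate each particle through the dynamic model
        particles = particles @ F.T + sys_noise(N)
        # (3) weights: Gaussian likelihood of the measurement for each particle
        z_pred = particles @ H.T
        sq = np.sum((z_pred - z) ** 2, axis=1)
        likelihood = np.exp(-0.5 * sq / meas_var)
        weights = likelihood / likelihood.sum()
        # (4) resampling: multiply high-weight particles, drop low-weight ones
        idx = rng.choice(N, size=N, p=weights)
        particles = particles[idx]
        # (5) posterior mean as the state estimate
        return particles, particles.mean(axis=0)

    # usage: 1D constant-velocity state [pos, vel], position measurements
    rng = np.random.default_rng(0)
    F = np.array([[1.0, 1.0], [0.0, 1.0]])
    H = np.array([[1.0, 0.0]])
    noise = lambda n: rng.standard_cauchy((n, 2)) * 0.05   # heavy-tailed, as in (29)
    parts = rng.normal(0.0, 1.0, size=(500, 2))
    for z in [1.0, 2.1, 2.9]:
        parts, est = particle_filter_step(parts, np.array([z]), F, H, noise, 0.25, rng)
    print(est)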

Particle filters are more flexible than Kalman filters and can cope with nonlinear dependencies and non-Gaussian densities in both the dynamic model and the noise error. However, they have some disadvantages. A large number of particles is required to obtain a small variance in the estimator. It is also difficult to establish the optimal number of particles in advance, and the number of particles affects the computational cost significantly. Earlier versions of particle filters employed a fixed number of particles, but recent studies have started to use a dynamic number of particles [40].

4.4. The Distributed Kalman Filter. The distributed Kalman filter requires correct clock synchronization between the sources, as demonstrated in [41]. In other words, to correctly use the distributed Kalman filter, the clocks of all of the sources must be synchronized. This synchronization is typically achieved using protocols that employ a shared global clock, such as the network time protocol (NTP). Synchronization problems between clocks have been shown to affect the accuracy of the Kalman filter, producing inaccurate estimations [42].

If the estimations are consistent and the cross-covariance is known (or the estimations are uncorrelated), then it is possible to use distributed Kalman filters [43]. However, the cross-covariance must be determined exactly, or the observations must be consistent.

We refer the reader to Liggins II et al. [20] for more details about the Kalman filter in distributed and hierarchical architectures.

4.5. Distributed Particle Filter. Distributed particle filters have gained attention recently [44-46]. Coates [45] used a distributed particle filter to monitor an environment that could be captured by a Markovian state-space model involving nonlinear dynamics and observations and non-Gaussian noise.

In contrast, earlier attempts to handle out-of-sequence measurements with particle filters are based on regenerating the probability density function at the time instant of the out-of-sequence measurement [47]. In a particle filter, this step incurs a large computational cost in addition to the space necessary to store the previous particles. To avoid this problem, Orton and Marrs [48] proposed storing the information on the particles at each time instant, saving the cost of recalculating this information. This technique is close to optimal, and when the delay increases, the result is only slightly affected [49]. However, it requires a very large amount of space to store the state of the particles at each time instant.

4.6. Covariance Consistency Methods: Covariance Intersection/Union. Covariance consistency methods (intersection and union) were proposed by Uhlmann [43] and are general and fault-tolerant frameworks for maintaining covariance means and estimations in a distributed network. These methods are not estimation techniques in themselves; rather, they constitute a technique for fusing estimations. The distributed Kalman filter requirement of independent measurements or known cross-covariances is not a constraint of this method.

4.6.1. Covariance Intersection. If the Kalman filter is employed to combine two estimations $(a_1, A_1)$ and $(a_2, A_2)$, then it is assumed that the joint covariance has the following form:

\[ \begin{bmatrix} A_1 & X \\ X^T & A_2 \end{bmatrix}, \quad (36) \]

where the cross-covariance $X$ should be known exactly so that the Kalman filter can be applied without difficulty. Because the computation of the cross-covariances is computationally intensive, Uhlmann [43] proposed the covariance intersection (CI) algorithm.

Let us assume that a joint covariance $M$ can be defined with the diagonal blocks $M_{A_1} > A_1$ and $M_{A_2} > A_2$, such that

\[ M \geq \begin{bmatrix} A_1 & X \\ X^T & A_2 \end{bmatrix} \quad (37) \]

for every possible instance of the unknown cross-covariance $X$; then, the components of the matrix $M$ can be employed in the Kalman filter equations to provide a fused estimation $(c, C)$ that is considered consistent. The key point of this method relies on generating a joint covariance matrix $M$ that represents a useful fused estimation (in this context, useful refers to an estimation with a lower associated uncertainty). In summary, the CI algorithm computes the joint covariance matrix $M$ for which the Kalman filter provides the best fused estimation $(c, C)$ with respect to a fixed measure of the covariance matrix (i.e., the minimum determinant).

A specific covariance criterion must be established because there is no unique minimum joint covariance in the order of the positive semidefinite matrices. Moreover, although the joint covariance is the basis of the formal analysis of the CI algorithm, the actual result is a nonlinear mixture of the information stored in the estimations being fused, following the equation:

\[ C = \left( w_1 H_1^T A_1^{-1} H_1 + w_2 H_2^T A_2^{-1} H_2 + \cdots + w_n H_n^T A_n^{-1} H_n \right)^{-1}, \]
\[ c = C \left( w_1 H_1^T A_1^{-1} a_1 + w_2 H_2^T A_2^{-1} a_2 + \cdots + w_n H_n^T A_n^{-1} a_n \right), \quad (38) \]

where $H_i$ is the transformation of the fused state-space estimation to the space of the estimated state $i$. The values of $w$ can be calculated to minimize the determinant of the covariance using convex optimization packages and semidefinite programming. The result of the CI algorithm has different characteristics compared with the Kalman filter. For example, suppose that two estimations $(a, A)$ and $(b, B)$ are provided and that their covariances are equal, $A = B$. Because the Kalman filter is based on the statistical independence assumption, it produces a fused estimation with covariance $C = (1/2)A$. In contrast, the CI method does not assume independence and thus must remain consistent even in the case in which the estimations are completely correlated, yielding the estimated fused covariance $C = A$. In the case of estimations where $A < B$, the CI algorithm takes no information from the estimation $(b, B)$; thus, the fused result is $(a, A)$.
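For the common two-estimate case with $H_1 = H_2 = I$ and $w_1 + w_2 = 1$, (38) reduces to $C = (wA^{-1} + (1-w)B^{-1})^{-1}$, and the weight can be chosen by minimizing $\det(C)$. The sketch below uses a simple grid search instead of a convex optimization package; the grid resolution is an illustrative assumption:

    import numpy as np

    def covariance_intersection(a, A, b, B, grid=200):
        # Fuse two consistent estimates (a, A) and (b, B) with unknown
        # cross-covariance: Eq. (38) with H_i = I and w_1 + w_2 = 1.
        # The weight w is chosen by grid search minimizing det(C).
        Ai, Bi = np.linalg.inv(A), np.linalg.inv(B)
        best = None
        for w in np.linspace(0.0, 1.0, grid):
            C = np.linalg.inv(w * Ai + (1.0 - w) * Bi)
            d = np.linalg.det(C)
            if best is None or d < best[0]:
                c = C @ (w * Ai @ a + (1.0 - w) * Bi @ b)
                best = (d, c, C)
        return best[1], best[2]

    # two estimates of a 2D state with possibly correlated errors
    a, A = np.array([1.0, 0.0]), np.diag([1.0, 4.0])
    b, B = np.array([1.2, 0.2]), np.diag([4.0, 1.0])
    c, C = covariance_intersection(a, A, b, B)
    print(c, np.diag(C))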

Every consistent joint covariance is sufficient to produce a fused estimation that guarantees consistency. However, it is also necessary to guarantee a lack of divergence. Divergence is avoided in the CI algorithm by choosing a specific measure (i.e., the determinant), which is minimized in each fusion operation. This measure represents a nondivergence criterion because the size of the estimated covariance, according to this criterion, will not increase.

The application of the CI method guarantees consistency and nondivergence for every sequence of mean- and covariance-consistent estimations. However, this method does not work well when the measurements to be fused are inconsistent.

4.6.2. Covariance Union. CI solves the problem of correlated inputs but not the problem of inconsistent inputs (inconsistent inputs refer to different estimations, each of which has a high accuracy (small variance) but also a large difference from the states of the others); thus, the covariance union (CU) algorithm was proposed to solve the latter [43]. CU addresses the following problem: two estimations $(a_1, A_1)$ and $(a_2, A_2)$ relate to the state of an object and are mutually inconsistent. This issue arises when the difference between the estimated means is larger than the provided covariances. Inconsistent inputs can be detected using the Mahalanobis distance [50] between them, which is defined as

\[ M_d = (a_1 - a_2)^T (A_1 + A_2)^{-1} (a_1 - a_2), \quad (39) \]

and checking whether this distance is larger than a given threshold.

The Mahalanobis distance accounts for the covariance information when computing the distance. If the difference between the estimations is large but their covariances are also large, the Mahalanobis distance yields a small value. In contrast, if the difference between the estimations is moderate but the covariances are small, it could produce a larger distance value. A high Mahalanobis distance could indicate that the estimations are inconsistent; however, a specific threshold must be established by the user or learned automatically.
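A direct implementation of the consistency check of (39) is straightforward; the threshold below is an illustrative assumption (a chi-square quantile for the state dimension is a common choice):

    import numpy as np

    def is_inconsistent(a1, A1, a2, A2, threshold=9.0):
        # Flag mutually inconsistent estimates via Eq. (39).
        # The threshold is an assumption, not prescribed by the method.
        d = a1 - a2
        m_d = d @ np.linalg.inv(A1 + A2) @ d
        return m_d > threshold, m_d

    # far means with large covariances are consistent;
    # the same gap with tight covariances is not
    print(is_inconsistent(np.array([0.0]), np.eye(1) * 4.0,
                          np.array([2.0]), np.eye(1) * 4.0))
    print(is_inconsistent(np.array([0.0]), np.eye(1) * 0.1,
                          np.array([2.0]), np.eye(1) * 0.1))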

The CU algorithm aims to solve the following problem: let us suppose that a filtering algorithm provides two observations with means and covariances $(a_1, A_1)$ and $(a_2, A_2)$, respectively. It is known that one of the observations is correct and the other is erroneous; however, the identity of the correct estimation is unknown and cannot be determined. In this situation, if both estimations are employed as input to the Kalman filter, there will be a problem because the Kalman filter only guarantees a consistent output if the observation is updated with a measurement that is consistent with it. In the specific case in which the measurements correspond to the same object but are acquired from two different sensors, the Kalman filter can only guarantee that the output is consistent if it is consistent with both measurements separately. Because it is not possible to know which estimation is correct, the only way to combine the two estimations rigorously is to provide an estimation $(u, U)$ that is consistent with both and obeys the following properties:

\[ U \succeq A_1 + (u - a_1)(u - a_1)^T, \]
\[ U \succeq A_2 + (u - a_2)(u - a_2)^T, \quad (40) \]

where some measure of the size of the matrix $U$ (i.e., the determinant) is minimized.

In other words, the previous equations indicate that if the estimation $(a_1, A_1)$ is consistent, then translating the vector $a_1$ to $u$ requires increasing the covariance by a matrix at least as large as the outer product of $(u - a_1)$ with itself in order to remain consistent. The same condition applies to the measurement $(a_2, A_2)$.

A simple strategy is to choose the mean of one of the measurements as the fused value ($u = a_1$). In this case, the value of $U$ must be chosen such that the estimation is consistent in the worst case (i.e., when the correct measurement is $(a_2, A_2)$). However, it is possible to assign $u$ an intermediate value between $a_1$ and $a_2$ to decrease the value of $U$. Therefore, the CU algorithm establishes the fused mean value $u$ that has the smallest covariance $U$ that is still sufficiently large with respect to the two measurements ($a_1$ and $a_2$) to guarantee consistency.

Because the matrix inequalities presented in the previous equations are convex, convex optimization algorithms must be employed to solve them. The value of $U$ can be computed with the iterative method described by Julier et al. [51]. The obtained covariance could be significantly larger than any of the initial covariances and is an indicator of the existing uncertainty between the initial estimations. One of the advantages of the CU method is that the same process can easily be extended to $N$ inputs.

5. Decision Fusion Methods

A decision is typically made based on the knowledge of the perceived situation, which, in the data fusion domain, is provided by many sources. Decision fusion techniques aim to make a high-level inference about the events and activities produced by the detected targets. These techniques often use symbolic information, and the fusion process requires reasoning while accounting for uncertainties and constraints. These methods fall under level 2 (situation assessment) and level 3 (impact assessment) of the JDL data fusion model.

5.1. The Bayesian Methods. Information fusion based on Bayesian inference provides a formalism for combining evidence according to the rules of probability theory. Uncertainty is represented using conditional probability terms that describe beliefs and take on values in the interval $[0, 1]$, where zero indicates a complete lack of belief and one indicates an absolute belief. Bayesian inference is based on the Bayes rule:

\[ P(Y \mid X) = \frac{P(X \mid Y)\, P(Y)}{P(X)}, \quad (41) \]

where the posterior probability $P(Y \mid X)$ represents the belief in the hypothesis $Y$ given the information $X$. This probability is obtained by multiplying the a priori probability of the hypothesis, $P(Y)$, by the probability of observing $X$ given that $Y$ is true, $P(X \mid Y)$. The value $P(X)$ is used as a normalizing constant. The main disadvantage of Bayesian inference is that the probabilities $P(X)$ and $P(X \mid Y)$ must be known. To estimate the conditional probabilities, Pan et al. [52] proposed the use of NNs, whereas Coue et al. [53] proposed Bayesian programming.
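The sketch below applies (41) to fuse the reports of several sources over a discrete hypothesis set, under the common (and here assumed) simplification that the sources are conditionally independent given the hypothesis, so that the likelihoods multiply and $P(X)$ enters only as a normalization:

    import numpy as np

    def bayes_fuse(prior, likelihoods):
        # Naive Bayes decision fusion over a discrete hypothesis set.
        # prior: P(Y) for each hypothesis; likelihoods: one P(X_s | Y)
        # vector per source s, assumed conditionally independent given Y.
        post = np.asarray(prior, dtype=float)
        for lik in likelihoods:
            post = post * np.asarray(lik)
        return post / post.sum()     # P(X) acts only as a normalizing constant

    # two sensors reporting evidence about hypotheses {target, no target}
    prior = [0.5, 0.5]
    sensor1 = [0.9, 0.2]   # P(X1 | Y)
    sensor2 = [0.7, 0.4]   # P(X2 | Y)
    print(bayes_fuse(prior, [sensor1, sensor2]))  # fused posterior belief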

Hall and Llinas [54] described the following problems associated with Bayesian inference:

(i) difficulty in establishing the values of the a priori probabilities;

(ii) complexity when there are multiple potential hypotheses and a substantial number of condition-dependent events;

(iii) the requirement that the hypotheses be mutually exclusive;

(iv) difficulty in describing the uncertainty of the decisions.

5.2. The Dempster-Shafer Inference. The Dempster-Shafer inference is based on the mathematical theory introduced by Dempster [55] and Shafer [56], which generalizes Bayesian theory. The Dempster-Shafer theory provides a formalism that can be used to represent incomplete knowledge, update beliefs, and combine evidence, and it allows uncertainty to be represented explicitly [57].

A fundamental concept in Dempster-Shafer reasoning is the frame of discernment, which is defined as follows. Let $\Theta = \{\theta_1, \theta_2, \ldots, \theta_N\}$ be the set of all possible states that define the system, and let $\Theta$ be exhaustive and mutually exclusive, given that the system can be in only one state $\theta_i \in \Theta$, where $1 \leq i \leq N$. The set $\Theta$ is called a frame of discernment because its elements are employed to discern the current state of the system.

The elements of the power set $2^\Theta$ are called hypotheses. In the Dempster-Shafer theory, based on the evidence $E$, a probability is assigned to each hypothesis $H \in 2^\Theta$ according to the basic probability assignment, or mass function, $m: 2^\Theta \rightarrow [0, 1]$, which satisfies

\[ m(\emptyset) = 0. \quad (42) \]


Thus, the mass function of the empty set is zero. Furthermore, the mass function of a hypothesis is larger than or equal to zero for all hypotheses:

\[ m(H) \geq 0, \quad \forall H \in 2^\Theta. \quad (43) \]

The sum of the mass functions of all the hypotheses is one:

\[ \sum_{H \in 2^\Theta} m(H) = 1. \quad (44) \]

To express incomplete beliefs in a hypothesis $H$, the Dempster-Shafer theory defines the belief function $\mathrm{bel}: 2^\Theta \rightarrow [0, 1]$ over $\Theta$ as

\[ \mathrm{bel}(H) = \sum_{A \subseteq H} m(A), \quad (45) \]

where $\mathrm{bel}(\emptyset) = 0$ and $\mathrm{bel}(\Theta) = 1$. The level of doubt in $H$ can be expressed in terms of the belief function by

\[ \mathrm{dou}(H) = \mathrm{bel}(\neg H) = \sum_{A \subseteq \neg H} m(A). \quad (46) \]

To express the plausibility of each hypothesis, the function $\mathrm{pl}: 2^\Theta \rightarrow [0, 1]$ over $\Theta$ is defined as

\[ \mathrm{pl}(H) = 1 - \mathrm{dou}(H) = \sum_{A \cap H \neq \emptyset} m(A). \quad (47) \]

Intuitively, plausibility indicates that there is less uncertainty in hypothesis $H$ when it is more plausible. The confidence interval $[\mathrm{bel}(H), \mathrm{pl}(H)]$ defines the true belief in hypothesis $H$. To combine the effects of two mass functions $m_1$ and $m_2$, the Dempster-Shafer theory defines the combination rule $m_1 \oplus m_2$ as

\[ m_1 \oplus m_2(\emptyset) = 0, \]
\[ m_1 \oplus m_2(H) = \frac{\sum_{X \cap Y = H} m_1(X)\, m_2(Y)}{1 - \sum_{X \cap Y = \emptyset} m_1(X)\, m_2(Y)}. \quad (48) \]

In contrast to Bayesian inference, a priori probabilities are not required in Dempster-Shafer inference because they are assigned at the instant that the information is provided. Several studies in the literature have compared the use of Bayesian inference and Dempster-Shafer inference, such as [58-60]. Wu et al. [61] used the Dempster-Shafer theory to fuse information in context-aware environments. This work was extended in [62] to dynamically modify the weights associated with the sensor measurements. Therefore, the fusion mechanism is calibrated according to the recent measurements of the sensors (in cases in which the ground truth is available). In the military domain [63], Dempster-Shafer reasoning is used with a priori information stored in a database to classify military ships. Morbee et al. [64] described the use of the Dempster-Shafer theory to build 2D occupancy maps from several cameras and to evaluate the contribution of subsets of cameras to a specific task. Each task is the observation of an event of interest, and the goal is to assess the validity of a set of hypotheses that are fused using the Dempster-Shafer theory.
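The combination rule (48) can be implemented directly over a small frame of discernment by representing each hypothesis as a set; the masses in the usage example are illustrative assumptions:

    def dempster_combine(m1, m2):
        # Dempster's rule of combination, Eq. (48).
        # m1, m2: dicts mapping frozenset hypotheses to masses summing to 1.
        # Assumes the two bodies of evidence are not totally conflicting.
        fused, conflict = {}, 0.0
        for X, mx in m1.items():
            for Y, my in m2.items():
                inter = X & Y
                if inter:
                    fused[inter] = fused.get(inter, 0.0) + mx * my
                else:
                    conflict += mx * my    # mass assigned to the empty set
        return {H: v / (1.0 - conflict) for H, v in fused.items()}

    # frame of discernment {a, b}; two independent bodies of evidence
    a, b, ab = frozenset('a'), frozenset('b'), frozenset('ab')
    m1 = {a: 0.6, ab: 0.4}    # evidence mostly supporting 'a'
    m2 = {b: 0.3, ab: 0.7}    # weak evidence for 'b'
    print(dempster_combine(m1, m2))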

5.3. Abductive Reasoning. Abductive reasoning, or inference to the best explanation, is a reasoning method in which a hypothesis is chosen under the assumption that, if it is true, it explains the observed event most accurately [65]. In other words, when an event is observed, the abduction method attempts to find the best explanation for it.

In the context of probabilistic reasoning, abductive inference finds the a posteriori most likely configuration of the system variables given some observed variables. Abductive reasoning is more of a reasoning pattern than a data fusion technique; therefore, different inference methods, such as NNs [66] or fuzzy logic [67], can be employed.

5.4. Semantic Methods. Decision fusion techniques that employ semantic data from different sources as input can provide more accurate results than those that rely on single sources only. There is a growing interest in techniques that automatically determine the presence of semantic features in videos to bridge the semantic gap [68].

Semantic information fusion is essentially a scheme in which raw sensor data are processed such that the nodes exchange only the resultant semantic information. Semantic information fusion typically covers two phases: (i) building the knowledge and (ii) pattern matching (inference). The first phase (typically performed offline) incorporates the most appropriate knowledge into semantic information. Then, the second phase (typically performed online or in real time) fuses the relevant attributes and provides a semantic interpretation of the sensor data [69-71].

Semantic fusion can be viewed as a means of integrating and translating sensor data into formal languages. The resulting language obtained from the observations of the environment is then compared with similar languages stored in a database. The key of this strategy is that behaviors represented by similar formal languages are also semantically similar. This type of method provides savings in transmission cost because the nodes need only transmit the formal language structure instead of the raw data. However, a known set of behaviors must be stored in a database in advance, which might be difficult in some scenarios.

6. Conclusions

This paper reviews the most popular methods and techniques for performing data/information fusion. To determine whether the application of data/information fusion methods is feasible, we must evaluate the computational cost of the process and the delay introduced in the communication. A centralized data fusion approach is theoretically optimal when there is no transmission cost and there are sufficient computational resources; however, this situation typically does not hold in practical applications.

The selection of the most appropriate technique depends on the type of problem and the established assumptions of each technique. Statistical data fusion methods (e.g., PDA, JPDA, MHT, and Kalman) are optimal under specific conditions [72]. First, the assumption that the targets move independently and that the measurements are normally distributed around the predicted position typically does not hold. Second, because the statistical techniques model all of the events as probabilities, they typically have several parameters and a priori probabilities for false measurements and detection errors that are often difficult to obtain (at least in an optimal sense). For example, in the case of the MHT algorithm, specific parameters must be established that are nontrivial to determine and are very sensitive [73]. Moreover, statistical methods that optimize over several frames are computationally intensive, and their complexity typically grows exponentially with the number of targets. For example, in the case of particle filters, several targets can be tracked either jointly, as a group, or individually. If several targets are tracked jointly, the necessary number of particles grows exponentially; therefore, in practice, it is better to track them individually under the assumption that the targets do not interact between the particles.

In contrast to centralized systems, distributed data fusion methods introduce some challenges into the data fusion process, such as (i) the spatial and temporal alignment of the information, (ii) out-of-sequence measurements, and (iii) data correlation, as reported by Castanedo et al. [74, 75]. The inherent redundancy of distributed systems could be exploited with distributed reasoning techniques and cooperative algorithms to improve the individual node estimations, as reported by Castanedo et al. [76]. In addition to the previous studies, a new trend based on the geometric notion of a low-dimensional manifold is gaining attention in the data fusion community. An example is the work of Davenport et al. [77], which proposes a simple model that captures the correlation between the sensor observations by matching the parameter values of the different obtained manifolds.

Acknowledgments

The author would like to thank Jesús García, Miguel A. Patricio, and James Llinas for their interesting discussions on several of the topics presented in this paper.

References

[1] JDL, Data Fusion Lexicon, Technical Panel for C3, F. E. White, Code 420, San Diego, Calif, USA, 1991.
[2] D. L. Hall and J. Llinas, "An introduction to multisensor data fusion," Proceedings of the IEEE, vol. 85, no. 1, pp. 6–23, 1997.
[3] H. F. Durrant-Whyte, "Sensor models and multisensor integration," International Journal of Robotics Research, vol. 7, no. 6, pp. 97–113, 1988.
[4] B. V. Dasarathy, "Sensor fusion potential exploitation-innovative architectures and illustrative applications," Proceedings of the IEEE, vol. 85, no. 1, pp. 24–38, 1997.
[5] R. C. Luo, C.-C. Yih, and K. L. Su, "Multisensor fusion and integration: approaches, applications, and future research directions," IEEE Sensors Journal, vol. 2, no. 2, pp. 107–119, 2002.
[6] J. Llinas, C. Bowman, G. Rogova, A. Steinberg, E. Waltz, and F. White, "Revisiting the JDL data fusion model II," Technical Report, DTIC Document, 2004.
[7] E. P. Blasch and S. Plano, "JDL level 5 fusion model 'user refinement' issues and applications in group tracking," in Proceedings of Signal Processing, Sensor Fusion, and Target Recognition XI, pp. 270–279, April 2002.
[8] H. F. Durrant-Whyte and M. Stevens, "Data fusion in decentralized sensing networks," in Proceedings of the 4th International Conference on Information Fusion, pp. 302–307, Montreal, Canada, 2001.
[9] J. Manyika and H. Durrant-Whyte, Data Fusion and Sensor Management: A Decentralized Information-Theoretic Approach, Prentice Hall, Upper Saddle River, NJ, USA, 1995.
[10] S. S. Blackman, "Association and fusion of multiple sensor data," in Multitarget-Multisensor Tracking: Advanced Applications, pp. 187–217, Artech House, 1990.
[11] S. Lloyd, "Least squares quantization in PCM," IEEE Transactions on Information Theory, vol. 28, no. 2, pp. 129–137, 1982.
[12] M. Shindler, A. Wong, and A. Meyerson, "Fast and accurate k-means for large datasets," in Proceedings of the 25th Annual Conference on Neural Information Processing Systems (NIPS '11), pp. 2375–2383, December 2011.
[13] Y. Bar-Shalom and E. Tse, "Tracking in a cluttered environment with probabilistic data association," Automatica, vol. 11, no. 5, pp. 451–460, 1975.
[14] T. E. Fortmann, Y. Bar-Shalom, and M. Scheffe, "Multi-target tracking using joint probabilistic data association," in Proceedings of the 19th IEEE Conference on Decision and Control including the Symposium on Adaptive Processes, vol. 19, pp. 807–812, December 1980.
[15] D. B. Reid, "An algorithm for tracking multiple targets," IEEE Transactions on Automatic Control, vol. 24, no. 6, pp. 843–854, 1979.
[16] C. L. Morefield, "Application of 0-1 integer programming to multitarget tracking problems," IEEE Transactions on Automatic Control, vol. 22, no. 3, pp. 302–312, 1977.
[17] R. L. Streit and T. E. Luginbuhl, "Maximum likelihood method for probabilistic multihypothesis tracking," in Signal and Data Processing of Small Targets, vol. 2235 of Proceedings of SPIE, p. 394, 1994.
[18] I. J. Cox and S. L. Hingorani, "Efficient implementation of Reid's multiple hypothesis tracking algorithm and its evaluation for the purpose of visual tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 2, pp. 138–150, 1996.
[19] K. G. Murty, "An algorithm for ranking all the assignments in order of increasing cost," Operations Research, vol. 16, no. 3, pp. 682–687, 1968.
[20] M. E. Liggins II, C.-Y. Chong, I. Kadar, et al., "Distributed fusion architectures and algorithms for target tracking," Proceedings of the IEEE, vol. 85, no. 1, pp. 95–106, 1997.
[21] S. Coraluppi, C. Carthel, M. Luettgen, and S. Lynch, "All-source track and identity fusion," in Proceedings of the National Symposium on Sensor and Data Fusion, 2000.
[22] P. Storms and F. Spieksma, "An LP-based algorithm for the data association problem in multitarget tracking," in Proceedings of the 3rd IEEE International Conference on Information Fusion, vol. 1, 2000.
[23] S.-W. Joo and R. Chellappa, "A multiple-hypothesis approach for multiobject visual tracking," IEEE Transactions on Image Processing, vol. 16, no. 11, pp. 2849–2854, 2007.
[24] S. Coraluppi and C. Carthel, "Aggregate surveillance: a cardinality tracking approach," in Proceedings of the 14th International Conference on Information Fusion (FUSION '11), July 2011.
[25] K. C. Chang, C. Y. Chong, and Y. Bar-Shalom, "Joint probabilistic data association in distributed sensor networks," IEEE Transactions on Automatic Control, vol. 31, no. 10, pp. 889–897, 1986.
[26] C. Y. Chong, S. Mori, and K. C. Chang, "Information fusion in distributed sensor networks," in Proceedings of the 4th American Control Conference, Boston, Mass, USA, June 1985.
[27] C. Y. Chong, S. Mori, and K. C. Chang, "Distributed multitarget multisensor tracking," in Multitarget-Multisensor Tracking: Advanced Applications, vol. 1, pp. 247–295, 1990.
[28] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, San Mateo, Calif, USA, 1988.
[29] D. Koller and N. Friedman, Probabilistic Graphical Models: Principles and Techniques, MIT Press, 2009.
[30] L. Chen, M. Cetin, and A. S. Willsky, "Distributed data association for multi-target tracking in sensor networks," in Proceedings of the 7th International Conference on Information Fusion (FUSION '05), pp. 9–16, July 2005.
[31] L. Chen, M. J. Wainwright, M. Cetin, and A. S. Willsky, "Data association based on optimization in graphical models with application to sensor networks," Mathematical and Computer Modelling, vol. 43, no. 9-10, pp. 1114–1135, 2006.
[32] Y. Weiss and W. T. Freeman, "On the optimality of solutions of the max-product belief-propagation algorithm in arbitrary graphs," IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 736–744, 2001.
[33] C. Brown, H. Durrant-Whyte, J. Leonard, B. Rao, and B. Steer, "Distributed data fusion using Kalman filtering: a robotics application," in Data Fusion in Robotics and Machine Intelligence, M. A. Abidi and R. C. Gonzalez, Eds., pp. 267–309, 1992.
[34] R. E. Kalman, "A new approach to linear filtering and prediction problems," Journal of Basic Engineering, vol. 82, no. 1, pp. 35–45, 1960.
[35] R. C. Luo and M. G. Kay, "Data fusion and sensor integration: state-of-the-art 1990s," in Data Fusion in Robotics and Machine Intelligence, pp. 7–135, 1992.
[36] G. Welch and G. Bishop, An Introduction to the Kalman Filter, ACM SIGGRAPH 2001 Course Notes, 2001.
[37] S. J. Julier and J. K. Uhlmann, "A new extension of the Kalman filter to nonlinear systems," in Proceedings of the International Symposium on Aerospace/Defense Sensing, Simulation and Controls, vol. 3, 1997.
[38] E. A. Wan and R. Van Der Merwe, "The unscented Kalman filter for nonlinear estimation," in Proceedings of the Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC '00), pp. 153–158, 2000.
[39] D. Crisan and A. Doucet, "A survey of convergence results on particle filtering methods for practitioners," IEEE Transactions on Signal Processing, vol. 50, no. 3, pp. 736–746, 2002.
[40] J. Martinez-del Rincon, C. Orrite-Urunuela, and J. E. Herrero-Jaraba, "An efficient particle filter for color-based tracking in complex scenes," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance, pp. 176–181, 2007.
[41] S. Ganeriwal, R. Kumar, and M. B. Srivastava, "Timing-sync protocol for sensor networks," in Proceedings of the 1st International Conference on Embedded Networked Sensor Systems (SenSys '03), pp. 138–149, November 2003.
[42] M. Manzo, T. Roosta, and S. Sastry, "Time synchronization in networks," in Proceedings of the 3rd ACM Workshop on Security of Ad Hoc and Sensor Networks (SASN '05), pp. 107–116, November 2005.
[43] J. K. Uhlmann, "Covariance consistency methods for fault-tolerant distributed data fusion," Information Fusion, vol. 4, no. 3, pp. 201–215, 2003.
[44] A. S. Bashi, V. P. Jilkov, X. R. Li, and H. Chen, "Distributed implementations of particle filters," in Proceedings of the 6th International Conference of Information Fusion, pp. 1164–1171, 2003.
[45] M. Coates, "Distributed particle filters for sensor networks," in Proceedings of the 3rd International Symposium on Information Processing in Sensor Networks, ACM, pp. 99–107, New York, NY, USA, 2004.
[46] D. Gu, "Distributed particle filter for target tracking," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '07), pp. 3856–3861, April 2007.
[47] Y. Bar-Shalom, "Update with out-of-sequence measurements in tracking: exact solution," IEEE Transactions on Aerospace and Electronic Systems, vol. 38, no. 3, pp. 769–778, 2002.
[48] M. Orton and A. Marrs, "A Bayesian approach to multi-target tracking and data fusion with out-of-sequence measurements," IEE Colloquium, no. 174, pp. 151–155, 2001.
[49] M. L. Hernandez, A. D. Marrs, S. Maskell, and M. R. Orton, "Tracking and fusion for wireless sensor networks," in Proceedings of the 5th International Conference on Information Fusion, 2002.
[50] P. C. Mahalanobis, "On the generalized distance in statistics," Proceedings of the National Institute of Sciences of India, vol. 2, no. 1, pp. 49–55, 1936.
[51] S. J. Julier, J. K. Uhlmann, and D. Nicholson, "A method for dealing with assignment ambiguity," in Proceedings of the American Control Conference (ACC '04), vol. 5, pp. 4102–4107, July 2004.
[52] H. Pan, Z.-P. Liang, T. J. Anastasio, and T. S. Huang, "Hybrid NN-Bayesian architecture for information fusion," in Proceedings of the International Conference on Image Processing (ICIP '98), pp. 368–371, October 1998.
[53] C. Coue, T. Fraichard, P. Bessiere, and E. Mazer, "Multi-sensor data fusion using Bayesian programming: an automotive application," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 141–146, October 2002.
[54] D. L. Hall and J. Llinas, Handbook of Multisensor Data Fusion, CRC Press, Boca Raton, Fla, USA, 2001.
[55] A. P. Dempster, "A generalization of Bayesian inference," Journal of the Royal Statistical Society B, vol. 30, no. 2, pp. 205–247, 1968.
[56] G. Shafer, A Mathematical Theory of Evidence, Princeton University Press, Princeton, NJ, USA, 1976.
[57] G. M. Provan, "The validity of Dempster-Shafer belief functions," International Journal of Approximate Reasoning, vol. 6, no. 3, pp. 389–399, 1992.
[58] D. M. Buede, "Shafer-Dempster and Bayesian reasoning: a response to 'Shafer-Dempster reasoning with applications to multisensor target identification systems'," IEEE Transactions on Systems, Man, and Cybernetics, vol. 18, no. 6, pp. 1009–1011, 1988.
[59] Y. Cheng and R. L. Kashyap, "Comparison of Bayesian and Dempster's rules in evidence combination," in Maximum-Entropy and Bayesian Methods in Science and Engineering, 1988.
[60] B. R. Cobb and P. P. Shenoy, "A comparison of Bayesian and belief function reasoning," Information Systems Frontiers, vol. 5, no. 4, pp. 345–358, 2003.
[61] H. Wu, M. Siegel, R. Stiefelhagen, and J. Yang, "Sensor fusion using Dempster-Shafer theory," in Proceedings of the 19th IEEE Instrumentation and Measurement Technology Conference (IMTC '02), pp. 7–11, May 2002.
[62] H. Wu, M. Siegel, and S. Ablay, "Sensor fusion using Dempster-Shafer theory II: static weighting and Kalman filter-like dynamic weighting," in Proceedings of the 20th IEEE Instrumentation and Measurement Technology Conference (IMTC '03), pp. 907–912, May 2003.
[63] E. Bosse, P. Valin, A.-C. Boury-Brisset, and D. Grenier, "Exploitation of a priori knowledge for information fusion," Information Fusion, vol. 7, no. 2, pp. 161–175, 2006.
[64] M. Morbee, L. Tessens, H. Aghajan, and W. Philips, "Dempster-Shafer based multi-view occupancy maps," Electronics Letters, vol. 46, no. 5, pp. 341–343, 2010.
[65] C. S. Peirce, "Abduction and induction," in Philosophical Writings of Peirce, vol. 156, Dover, New York, NY, USA, 1955.
[66] A. M. Abdelbar, E. A. M. Andrews, and D. C. Wunsch II, "Abductive reasoning with recurrent neural networks," Neural Networks, vol. 16, no. 5-6, pp. 665–673, 2003.
[67] J. R. Aguero and A. Vargas, "Inference of operative configuration of distribution networks using fuzzy logic techniques. Part II: extended real-time model," IEEE Transactions on Power Systems, vol. 20, no. 3, pp. 1562–1569, 2005.
[68] A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, "Content-based image retrieval at the end of the early years," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1349–1380, 2000.
[69] D. S. Friedlander and S. Phoha, "Semantic information fusion for coordinated signal processing in mobile sensor networks," International Journal of High Performance Computing Applications, vol. 16, no. 3, pp. 235–241, 2002.
[70] D. S. Friedlander, "Semantic information extraction," in Distributed Sensor Networks, 2005.
[71] K. Whitehouse, J. Liu, and F. Zhao, "Semantic Streams: a framework for composable inference over sensor data," in Proceedings of the 3rd European Workshop on Wireless Sensor Networks, Lecture Notes in Computer Science, Springer, February 2006.
[72] I. J. Cox, "A review of statistical data association techniques for motion correspondence," International Journal of Computer Vision, vol. 10, no. 1, pp. 53–66, 1993.
[73] C. J. Veenman, M. J. T. Reinders, and E. Backer, "Resolving motion correspondence for densely moving points," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 1, pp. 54–72, 2001.
[74] F. Castanedo, M. A. Patricio, J. García, and J. M. Molina, "Bottom-up/top-down coordination in a multiagent visual sensor network," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS '07), pp. 93–98, September 2007.
[75] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Analysis of distributed fusion alternatives in coordinated vision agents," in Proceedings of the 11th International Conference on Information Fusion (FUSION '08), July 2008.
[76] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Data fusion to improve trajectory tracking in a cooperative surveillance multi-agent architecture," Information Fusion, vol. 11, no. 3, pp. 243–255, 2010.
[77] M. A. Davenport, C. Hegde, M. F. Duarte, and R. G. Baraniuk, "Joint manifolds for data fusion," IEEE Transactions on Image Processing, vol. 19, no. 10, pp. 2580–2594, 2010.




Figure 3: The JDL data fusion framework.

results of this stage are the object discrimination (classification and identification) and object tracking (state of the object and orientation). This stage transforms the input information into consistent data structures;

(3) level 2 (situation assessment): this level focuses on a higher level of inference than level 1. Situation assessment aims to identify the likely situations given the observed events and obtained data. It establishes relationships between the objects. Relations (i.e., proximity, communication) are evaluated to determine the significance of the entities or objects in a specific environment. The aim of this level includes performing high-level inferences and identifying significant activities and events (patterns in general). The output is a set of high-level inferences;

(4) level 3 (impact assessment): this level evaluates the impact of the activities detected in level 2 to obtain a proper perspective. The current situation is evaluated, and a future projection is performed to identify possible risks, vulnerabilities, and operational opportunities. This level includes (1) an evaluation of the risk or threat and (2) a prediction of the logical outcome;

(5) level 4 (process refinement): this level improves the process from level 0 to level 3 and provides resource and sensor management. The aim is to achieve efficient resource management while accounting for task priorities, scheduling, and the control of available resources.

High-level fusion typically starts at level 2 because the type, localization, movement, and quantity of the objects are known at that level. One of the limitations of the JDL model is that it does not specify how the uncertainty about previous or subsequent results could be employed to enhance the fusion process (feedback loop). Llinas et al. [6] proposed several refinements and extensions to the JDL model. Blasch and Plano [7] proposed adding a new level (user refinement) to support a human user in the data fusion loop. The JDL model represents the first effort to provide a detailed model and a common terminology for the data fusion domain. However, because its roots originate in the military domain, the employed terms are oriented to the risks that commonly occur in those scenarios. The Dasarathy model differs from the JDL model with regard to the adopted terminology and employed approach. The former is oriented toward the differences between the input and output results, independent of the employed fusion method. In summary, the Dasarathy model provides a method for understanding the relations between the fusion tasks and the employed data, whereas the JDL model presents an appropriate fusion perspective for designing data fusion systems.

2.5. Classification Based on the Type of Architecture. One of the main questions that arises when designing a data fusion system is where the data fusion process will be performed. Based on this criterion, the following types of architectures can be identified:

(1) centralized architecture: in a centralized architecture, the fusion node resides in the central processor that receives the information from all of the input sources. Therefore, all of the fusion processes are executed in a central processor that uses the raw measurements provided by the sources. In this schema, the sources obtain only the observations as measurements and transmit them to a central processor, where the data fusion process is performed. If we assume that data alignment and data association are performed correctly and that the time required to transfer the data is not significant, then the centralized scheme is theoretically optimal. However, these assumptions typically do not hold for real systems. Moreover, the large amount of bandwidth required to send raw data through the network is another disadvantage of the centralized approach. This issue becomes a bottleneck when this type of architecture is employed for fusing data in visual sensor networks. Finally, the time delays when transferring the information between the different sources are variable and affect


the results in the centralized scheme to a greater degree than in other schemes;

(2) decentralized architecture: a decentralized architecture is composed of a network of nodes in which each node has its own processing capabilities and there is no single point of data fusion. Therefore, each node fuses its local information with the information received from its peers. Data fusion is performed autonomously, with each node accounting for its local information and the information received from its peers. Decentralized data fusion algorithms typically communicate information using the Fisher and Shannon measurements instead of the object's state [8]. The main disadvantage of this architecture is the communication cost, which is $O(n^2)$ at each communication step, where $n$ is the number of nodes, in the extreme case in which each node communicates with all of its peers. Thus, this type of architecture could suffer from scalability problems when the number of nodes is increased;

(3) distributed architecture: in a distributed architecture, measurements from each source node are processed independently before the information is sent to the fusion node; the fusion node accounts for the information received from the other nodes. In other words, the data association and state estimation are performed in the source node before the information is communicated to the fusion node. Therefore, each node provides an estimation of the object state based only on its local views, and this information is the input to the fusion process, which provides a fused global view. This type of architecture provides different options and variations that range from only one fusion node to several intermediate fusion nodes;

(4) hierarchical architecture: other architectures comprise a combination of decentralized and distributed nodes, generating hierarchical schemes in which the data fusion process is performed at different levels of the hierarchy.

In principle, a decentralized data fusion system is more difficult to implement because of the computation and communication requirements. However, in practice, there is no single best architecture, and the selection of the most appropriate architecture should be made depending on the requirements, demand, existing networks, data availability, node processing capabilities, and organization of the data fusion system.

The reader might think that the decentralized and distributed architectures are similar; however, they have meaningful differences (see Figure 4). First, in a distributed architecture, a preprocessing of the obtained measurements is performed, which provides a vector of features as a result (the features are fused thereafter). In contrast, in the decentralized architecture, the complete data fusion process is conducted in each node, and each of the nodes provides a globally fused result. Second, the decentralized fusion algorithms typically communicate information employing the Fisher and Shannon measurements. In contrast, distributed algorithms typically share a common notion of state (position, velocity, and identity) with the associated probabilities, which are used to perform the fusion process [9]. Third, because the decentralized data fusion algorithms exchange information instead of states and probabilities, they have the advantage of easily separating old knowledge from new knowledge; the process is additive, and the order in which the information is received and fused is not relevant. However, in the distributed data fusion algorithms (i.e., the distributed Kalman filter), the state that is going to be fused is not associative, and when and how the fused estimates are computed is relevant. Nevertheless, in contrast to the centralized architectures, the distributed algorithms reduce the necessary communication and computational costs because some tasks are computed in the distributed nodes before data fusion is performed in the fusion node.

3. Data Association Techniques

The data association problem must determine the set of measurements that correspond to each target (see Figure 5). Let us suppose that there are $O$ targets being tracked by only one sensor in a cluttered environment (by a cluttered environment, we refer to an environment that has several targets that are too close to each other). Then, the data association problem can be defined as follows:

(i) each sensor's observation is received in the fusion node at discrete time intervals;

(ii) the sensor might not provide observations at a specific interval;

(iii) some observations are noise, and other observations originate from the detected targets;

(iv) for any specific target and in every time interval, we do not know (a priori) which observations will be generated by that target.

Therefore, the goal of data association is to establish the set of observations or measurements that are generated by the same target over time. Hall and Llinas [2] provided the following definition of data association: "the process of assigning and computing the weights that relate the observations or tracks (a track can be defined as an ordered set of points that follow a path and are generated by the same target) from one set to the observations or tracks of another set."

As an example of the complexity of the data association problem, if we take a frame-to-frame association and assume that $M$ possible points could be detected in all $n$ frames, then the number of possible sets is $(M!)^{n-1}$. Note that, from all of these possible solutions, only one set establishes the true movement of the $M$ points.

Data association is often performed before the state estimation of the detected targets. Moreover, it is a key step because the estimation or classification will behave incorrectly if the data association phase does not work coherently. The data association process could also appear at all of the fusion levels, but the granularity varies depending on the objective of each level.


Figure 4: Classification based on the type of architecture.

In general, an exhaustive search of all possible combinations grows exponentially with the number of targets; thus, the data association problem becomes NP-complete. The most common techniques employed to solve the data association problem are presented in the following sections (Sections 3.1 to 3.7).

3.1. Nearest Neighbors and K-Means. Nearest neighbor (NN) is the simplest data association technique. NN is a well-known clustering algorithm that selects or groups the most similar values. How close one measurement is to another depends on the employed distance metric and typically on a threshold established by the designer. In general, the employed criteria could be based on (1) an absolute distance, (2) the Euclidean distance, or (3) a statistical function of the distance.

NN is a simple algorithm that can find a feasible (approximate) solution in a small amount of time. However, in a cluttered environment, it could provide many pairs that have the same probability and could thus produce undesirable


Figure 5: Conceptual overview of the data association process from multiple sensors and multiple targets. It is necessary to establish the set of observations over time from the same object that forms a track.

error propagation [10]. Moreover, this algorithm has poor performance in environments in which false measurements are frequent, that is, in highly noisy environments.
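As an illustration of the technique, the following is a minimal sketch of gated nearest-neighbor association; the Euclidean metric, the greedy pairing strategy, and the gate value are assumptions made for this example, not prescribed by the text:

```python
import numpy as np

def nearest_neighbor_association(tracks, measurements, gate=5.0):
    """Greedy gated nearest-neighbor association.

    tracks:        (T, d) array of predicted track positions
    measurements:  (M, d) array of received measurements
    gate:          maximum allowed Euclidean distance (designer threshold)
    Returns a dict {track_index: measurement_index} for the accepted pairs.
    """
    # Pairwise Euclidean distances between every track and measurement.
    dists = np.linalg.norm(tracks[:, None, :] - measurements[None, :, :], axis=2)
    assignments = {}
    for _ in range(min(len(tracks), len(measurements))):
        # Greedily pick the globally closest remaining (track, measurement) pair.
        t, m = np.unravel_index(np.argmin(dists), dists.shape)
        if dists[t, m] > gate:       # remaining pairs fall outside the gate
            break
        assignments[t] = m
        dists[t, :] = np.inf         # a track takes at most one measurement
        dists[:, m] = np.inf         # a measurement feeds at most one track
    return assignments

# Example: two predicted tracks and three measurements (one is clutter).
tracks = np.array([[0.0, 0.0], [10.0, 10.0]])
measurements = np.array([[0.4, -0.2], [9.6, 10.3], [50.0, 50.0]])
print(nearest_neighbor_association(tracks, measurements))  # {0: 0, 1: 1}
```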

The all-neighbors approach uses a similar technique in which all of the measurements inside a region are included in the tracks. The K-Means [11] method is a well-known modification of the NN algorithm. K-Means divides the dataset values into $K$ different clusters. The K-Means algorithm finds the best localization of the cluster centroids, where best means a centroid that is at the center of the data cluster. K-Means is an iterative algorithm that can be divided into the following steps:

(1) obtain the input data and the number of desired clusters ($K$);

(2) randomly assign the centroid of each cluster;

(3) match each data point with the centroid of the closest cluster;

(4) move the cluster centers to the centroid of the cluster;

(5) if the algorithm does not converge, return to step (3).
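The steps above translate almost directly into code; the following is a minimal NumPy sketch (the random initialization and the convergence test used here are common implementation choices, not part of the algorithm definition above):

```python
import numpy as np

def k_means(data, k, iters=100, seed=0):
    """Minimal K-Means: returns (centroids, labels) for (N, d) data."""
    rng = np.random.default_rng(seed)
    # Step 2: initialize centroids at k randomly chosen data points.
    centroids = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(iters):
        # Step 3: assign each point to its closest centroid.
        dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 4: move each centroid to the mean of its assigned points.
        new_centroids = np.array([
            data[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Step 5: stop when the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels

# Example: two well-separated clouds.
data = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 8.0])
centroids, labels = k_means(data, k=2)
```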

K-Means is a popular algorithm that has been widely employed; however, it has the following disadvantages:

(i) the algorithm does not always find the optimal solution for the cluster centers;

(ii) the number of clusters must be known a priori, and one must assume that this number is the optimum;

(iii) the algorithm assumes that the covariance of the dataset is irrelevant or that it has already been normalized.

There are several options for overcoming these limitations. For the first one, it is possible to execute the algorithm several times and keep the solution that has less variance. For the second one, it is possible to start with a low value of $K$ and increment the value of $K$ until an adequate result is obtained. The third limitation can be easily overcome by multiplying the data by the inverse of the covariance matrix.

Many variations have been proposed to Lloyd's basic K-Means algorithm [11], which has a computational upper-bound cost of $O(Kn)$, where $n$ is the number of input points and $K$ is the number of desired clusters. Some algorithms modify the initial cluster assignments to improve the separations and reduce the number of iterations. Others introduce soft or multinomial clustering assignments using fuzzy logic, probabilistic, or Bayesian techniques. However, most of the previous variations still must perform several iterations through the data space to converge to a reasonable solution. This issue becomes a major disadvantage in several real-time applications. A new approach that is based on having a large (but still affordable) number of cluster candidates compared to the desired $K$ clusters is currently gaining attention. The idea behind this computational model is that the algorithm builds a good sketch of the original data while significantly reducing the dimensionality of the input space. In this manner, a weighted K-Means can be applied to the large set of candidate clusters to derive a good clustering of the original data. Using this idea, [12] presented an efficient and scalable K-Means algorithm that is based on random projections. This algorithm requires only one pass through the input data to build the clusters. More specifically, if the input data distribution holds some separability requirements, then the number of required candidate clusters grows only according to $O(\log n)$, where $n$ is the number of observations in the original data. This salient feature makes the algorithm scalable in terms of both the memory and computational requirements.

3.2. Probabilistic Data Association. The probabilistic data association (PDA) algorithm was proposed by Bar-Shalom and Tse [13] and is also known as the modified filter of all neighbors. This algorithm assigns an association probability to each hypothesis from a valid measurement of a target. A valid measurement refers to an observation that falls in the validation gate of the target at that time instant. The validation gate $\gamma$, which is centered around the predicted measurement of the target, is used to select the set of valid measurements and is defined by

$$\gamma \ge (z(k) - \hat{z}(k \mid k-1))^T S^{-1}(k) (z(k) - \hat{z}(k \mid k-1)), \quad (1)$$

where $k$ is the temporal index, $S(k)$ is the innovation covariance, and $\gamma$ determines the gating or window size. The set of valid measurements at time instant $k$ is defined as

$$Z(k) = \{z_i(k), \; i = 1, \ldots, m_k\}, \quad (2)$$


where $z_i(k)$ is the $i$th measurement in the validation region at time instant $k$. We give the standard equations of the PDA algorithm next. For the state prediction, consider

$$\hat{x}(k \mid k-1) = F(k-1)\hat{x}(k-1 \mid k-1), \quad (3)$$

where $F(k-1)$ is the transition matrix at time instant $k-1$. To calculate the measurement prediction, consider

$$\hat{z}(k \mid k-1) = H(k)\hat{x}(k \mid k-1), \quad (4)$$

where $H(k)$ is the linearized measurement matrix. To compute the gain or the innovation of the $i$th measurement, consider

$$v_i(k) = z_i(k) - \hat{z}(k \mid k-1). \quad (5)$$

To calculate the covariance prediction, consider

$$P(k \mid k-1) = F(k-1)P(k-1 \mid k-1)F(k-1)^T + Q(k), \quad (6)$$

where $Q(k)$ is the process noise covariance matrix. To compute the innovation covariance ($S$) and the Kalman gain ($K$), consider

$$S(k) = H(k)P(k \mid k-1)H(k)^T + R,$$

$$K(k) = P(k \mid k-1)H(k)^T S(k)^{-1}. \quad (7)$$

To obtain the covariance update in the case in which the measurements originated by the target are known, consider

$$P^0(k \mid k) = P(k \mid k-1) - K(k)S(k)K(k)^T. \quad (8)$$

The total update of the covariance is computed as

$$v(k) = \sum_{i=1}^{m_k} \beta_i(k)\, v_i(k),$$

$$P(k) = K(k)\left[\sum_{i=1}^{m_k} \left(\beta_i(k)\, v_i(k)\, v_i(k)^T\right) - v(k)\, v(k)^T\right] K^T(k), \quad (9)$$

where $m_k$ is the number of valid measurements at instant $k$. The equation to update the estimated state, which is formed by the position and velocity, is given by

$$\hat{x}(k \mid k) = \hat{x}(k \mid k-1) + K(k)v(k). \quad (10)$$

Finally, the association probabilities of PDA are as follows:

$$\beta_i(k) = \frac{p_i(k)}{\sum_{i=0}^{m_k} p_i(k)}, \quad (11)$$

where

$$p_i(k) = \begin{cases} (2\pi)^{M/2}\,\lambda\,\sqrt{|S(k)|}\,\dfrac{1 - P_d P_g}{P_d}, & \text{if } i = 0,\\[4pt] \exp\left[-\frac{1}{2}\,v_i^T(k)\,S^{-1}(k)\,v_i(k)\right], & \text{if } 1 \le i \le m_k,\\[4pt] 0, & \text{in other cases,} \end{cases} \quad (12)$$

where $M$ is the dimension of the measurement vector, $\lambda$ is the density of the clutter environment, $P_d$ is the detection probability of the correct measurement, and $P_g$ is the validation probability of a detected value.

In the PDA algorithm, the state estimation of the target is computed as a weighted sum of the estimated state under all of the hypotheses. The algorithm can associate different measurements to one specific target. Thus, the association of the different measurements to a specific target helps PDA to estimate the target state, and the association probabilities are used as weights. The main disadvantages of the PDA algorithm are the following:

(i) loss of tracks: because PDA ignores the interference with other targets, it sometimes could wrongly classify the closest tracks. Therefore, it provides poor performance when the targets are close to each other or crossing;

(ii) suboptimal Bayesian approximation: when the source of information is uncertain, PDA is a suboptimal Bayesian approximation to the association problem;

(iii) one target: PDA was initially designed for the association of one target in a low-cluttered environment. The number of false alarms is typically modeled with the Poisson distribution, and they are assumed to be distributed uniformly in space. PDA behaves incorrectly when there are multiple targets because the false alarm model does not work well;

(iv) track management: because PDA assumes that the track is already established, algorithms must be provided for track initialization and track deletion.

PDA is mainly suited for tracking targets that do not make abrupt changes in their movement patterns; PDA will most likely lose a target that makes abrupt changes in its movement pattern.
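Once the gated innovations and the innovation covariance are available, the association probabilities of (11)-(12) are simple to compute. The following is a minimal sketch; the clutter density, detection, and gating probabilities are illustrative values, not taken from the text:

```python
import numpy as np

def pda_weights(innovations, S, lam=1e-3, Pd=0.9, Pg=0.99):
    """Association probabilities beta_i of PDA, following (11)-(12).

    innovations: list of innovation vectors v_i(k) for the m_k gated measurements
    S:           innovation covariance S(k), shape (M, M)
    lam:         clutter density (lambda); Pd, Pg: detection/gating probabilities
    Returns [beta_0, beta_1, ..., beta_mk]; beta_0 is "no correct measurement".
    """
    M = S.shape[0]
    S_inv = np.linalg.inv(S)
    # p_0: the hypothesis that none of the gated measurements is target-originated.
    p = [(2 * np.pi) ** (M / 2) * lam * np.sqrt(np.linalg.det(S)) * (1 - Pd * Pg) / Pd]
    # p_i, i >= 1: Gaussian likelihood of each gated innovation.
    for v in innovations:
        p.append(np.exp(-0.5 * v @ S_inv @ v))
    p = np.array(p)
    return p / p.sum()   # normalization of (11)

# Example: two gated measurements in a 2D measurement space.
S = np.eye(2)
betas = pda_weights([np.array([0.5, -0.2]), np.array([1.8, 1.1])], S)
print(betas)  # beta_0 plus one weight per measurement, summing to 1
```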

3.3. Joint Probabilistic Data Association. Joint probabilistic data association (JPDA) is a suboptimal approach for tracking multiple targets in cluttered environments [14]. JPDA is similar to PDA, with the difference that the association probabilities are computed using all of the observations and all of the targets. Thus, in contrast to PDA, JPDA considers various hypotheses together and combines them. JPDA determines the probability $\beta_i^t(k)$ that measurement $i$ originated from target $t$, accounting for the fact that, under this hypothesis, the measurement cannot be generated by other targets. Therefore, for a known number of targets, it evaluates the different options of the measurement-target association (for the most recent set of measurements) and combines them into the corresponding state estimation. If the association probability is known, then the Kalman filter updating equation of track $t$ can be written as

$$\hat{x}_t(k \mid k) = \hat{x}_t(k \mid k-1) + K(k)\, v_t(k), \quad (13)$$

where $\hat{x}_t(k \mid k)$ and $\hat{x}_t(k \mid k-1)$ are the estimation and prediction of target $t$, respectively, and $K(k)$ is the filter gain. The weighted


sum of the residuals associated with the $m(k)$ observations of target $t$ is as follows:

$$v_t(k) = \sum_{i=1}^{m(k)} \beta_i^t(k)\, v_i^t(k), \quad (14)$$

where $v_i^t = z_i(k) - H\hat{x}_t(k \mid k-1)$. Therefore, this method incorporates all of the observations (inside the neighborhood of the target's predicted position) to update the estimated position by using a posterior probability that is a weighted sum of residuals.

The main restrictions of JPDA are the following:

(i) a measurement cannot come from more than one target;

(ii) two measurements cannot originate from the same target (at one time instant);

(iii) the sum of the probabilities of all of the measurements assigned to one target must be 1: $\sum_{i=0}^{m(k)} \beta_i^t(k) = 1$.

The main disadvantages of JPDA are the following:

(i) it requires an explicit mechanism for track initialization. Similar to PDA, JPDA cannot initialize new tracks or remove tracks that leave the observation area;

(ii) JPDA is a computationally expensive algorithm when it is applied in environments that have multiple targets because the number of hypotheses increases exponentially with the number of targets.

In general, JPDA is more appropriate than MHT in situations in which the density of false measurements is high (i.e., sonar applications).

3.4. Multiple Hypothesis Test. The underlying idea of the multiple hypothesis test (MHT) is based on using more than two consecutive observations to make an association with better results. Other algorithms that use only two consecutive observations have a higher probability of generating an error. In contrast to PDA and JPDA, MHT estimates all of the possible hypotheses and maintains new hypotheses in each iteration.

MHT was developed to track multiple targets in cluttered environments; as a result, it combines the data association problem and tracking into a unified framework, becoming an estimation technique as well. The Bayes rule or Bayesian networks are commonly employed to calculate the MHT hypotheses. In general, researchers have claimed that MHT outperforms JPDA for lower densities of false positives. However, the main disadvantage of MHT is the computational cost when the number of tracks or false positives is incremented. Pruning the hypothesis tree using a window could alleviate this limitation.

The Reid [15] tracking algorithm is considered the standard MHT algorithm, but the initial integer programming formulation of the problem is due to Morefield [16]. MHT is an iterative algorithm in which each iteration starts with a set of correspondence hypotheses. Each hypothesis is a collection of disjoint tracks, and the prediction of the target at the next time instant is computed for each hypothesis. Next, the predictions are compared with the new observations by using a distance metric. The set of associations established in each hypothesis (based on a distance) introduces new hypotheses in the next iteration. Each new hypothesis represents a new set of tracks that is based on the current observations.

Note that each new measurement could come from (i) a new target in the visual field of view, (ii) a target being tracked, or (iii) noise in the measurement process. It is also possible that a measurement is not assigned to a target because the target disappears or because it is not possible to obtain a target measurement at that time instant.

MHT maintains several correspondence hypotheses for each target in each frame. If the set of hypotheses at instant $k$ is represented by $H(k) = [h_l(k), \; l = 1, \ldots, n]$, then the probability of the hypothesis $h_l(k)$ can be represented recursively, using the Bayes rule, as follows:

$$P(h_l(k) \mid Z(k)) = P(h_g(k-1), a_i(k) \mid Z(k)) = \frac{1}{c}\, P(Z(k) \mid h_g(k-1), a_i(k)) \cdot P(a_i(k) \mid h_g(k-1)) \cdot P(h_g(k-1)), \quad (15)$$

where $h_g(k-1)$ is the hypothesis $g$ of the complete set until time instant $k-1$, $a_i(k)$ is the $i$th possible association of the track to the object, $Z(k)$ is the set of detections of the current frame, and $c$ is a normalization constant.

The first term on the right side of the previous equation is the likelihood function of the measurement set $Z(k)$ given the joint likelihood and the current hypothesis. The second term is the probability of the association hypothesis of the current data given the previous hypothesis $h_g(k-1)$. The third term is the probability of the previous hypothesis from which the current hypothesis is calculated.

The MHT algorithm has the ability to detect a new track while maintaining the hypothesis tree structure. The probability of a true track is given by the Bayes decision model as

$$P(\lambda \mid Z) = \frac{P(Z \mid \lambda) \cdot P_{\circ}(\lambda)}{P(Z)}, \quad (16)$$

where $P(Z \mid \lambda)$ is the probability of obtaining the set of measurements $Z$ given $\lambda$, $P_{\circ}(\lambda)$ is the a priori probability of the source signal, and $P(Z)$ is the probability of obtaining the set of detections $Z$.

MHT considers all of the possibilities, including both track maintenance and the initialization and removal of tracks, in an integrated framework. MHT calculates the possibility of having an object after the generation of a set of measurements using an exhaustive approach, and the algorithm does not assume a fixed number of targets. The key challenge of MHT is effective hypothesis management. The baseline MHT algorithm can be extended as follows: (i) use hypothesis aggregation for missed targets, births,


cardinality tracking, and closely spaced objects; (ii) apply a multistage MHT to improve the performance and robustness in challenging settings; and (iii) use a feature-aided MHT for extended object surveillance.

The main disadvantage of this algorithm is its computational cost, which grows exponentially with the number of tracks and measurements. Therefore, the practical implementation of this algorithm is limited because it is exponential in both time and memory.
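To make this combinatorial growth concrete, the following is a minimal sketch that exhaustively enumerates the feasible association hypotheses of a single scan (each measurement is clutter or is assigned to at most one target) and scores them with Gaussian likelihoods. The shared innovation covariance and clutter density are illustrative assumptions, and a real MHT would maintain and prune these hypotheses across scans:

```python
import itertools
import numpy as np

def enumerate_hypotheses(predictions, measurements, S, lam=1e-3):
    """Exhaustive single-scan hypothesis enumeration (brute-force MHT step).

    predictions:  (T, d) predicted target positions
    measurements: (M, d) received measurements
    S:            shared innovation covariance (d, d); lam: clutter density
    Returns a list of (assignment, normalized probability); assignment[m] is the
    target index for measurement m, or None if it is declared clutter.
    """
    T, M = len(predictions), len(measurements)
    S_inv = np.linalg.inv(S)
    norm = 1.0 / np.sqrt((2 * np.pi) ** S.shape[0] * np.linalg.det(S))
    hypotheses = []
    # Each measurement is clutter (None) or one of the T targets ...
    for assign in itertools.product([None] + list(range(T)), repeat=M):
        targets = [a for a in assign if a is not None]
        if len(targets) != len(set(targets)):
            continue  # ... but a target may generate at most one measurement
        score = 1.0
        for m, a in enumerate(assign):
            if a is None:
                score *= lam  # clutter measurements are scored by their density
            else:
                v = measurements[m] - predictions[a]
                score *= norm * np.exp(-0.5 * v @ S_inv @ v)
        hypotheses.append((assign, score))
    total = sum(s for _, s in hypotheses)
    return [(a, s / total) for a, s in hypotheses]

# Two targets and two measurements already yield 7 feasible hypotheses per scan.
preds = np.array([[0.0, 0.0], [5.0, 5.0]])
meas = np.array([[0.2, 0.1], [5.3, 4.8]])
for assign, p in enumerate_hypotheses(preds, meas, np.eye(2)):
    print(assign, round(p, 4))
```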

With the aim of reducing the computational cost, [17] presented a probabilistic MHT algorithm in which the associations are considered to be statistically independent random variables and in which performing an exhaustive search enumeration is avoided. This algorithm is known as PMHT. The PMHT algorithm assumes that the number of targets and measurements is known. With the same goal of reducing the computational cost, [18] presented an efficient implementation of the MHT algorithm. This implementation was the first version to be applied to perform tracking in visual environments. They employed the Murty [19] algorithm to determine the best set of $k$ hypotheses in polynomial time, with the goal of tracking the points of interest.

MHT typically performs the tracking process by employing only one characteristic, commonly the position. A Bayesian combination that uses multiple characteristics was proposed by Liggins II et al. [20].

A linear-programming-based relaxation approach to the optimization problem in MHT tracking was proposed independently by Coraluppi et al. [21] and Storms and Spieksma [22]. Joo and Chellappa [23] proposed an association algorithm for tracking multiple targets in visual environments. Their algorithm is based on an MHT modification in which a measurement can be associated with more than one target, and several targets can be associated with one measurement. They also proposed a combinatorial optimization algorithm to generate the best set of association hypotheses. Their algorithm always finds the best hypothesis, in contrast to other models, which are approximate. Coraluppi and Carthel [24] presented a generalization of the MHT algorithm using a recursion over hypothesis classes rather than over a single hypothesis. This work has been applied to a special case of the multi-target tracking problem, called cardinality tracking, in which the number of sensor measurements is observed instead of the target states.

3.5. Distributed Joint Probabilistic Data Association. The distributed version of the joint probabilistic data association (JPDA-D) was presented by Chang et al. [25]. In this technique, the estimated state of the target (using two sensors) after being associated is given by

$$E\{x \mid Z^1, Z^2\} = \sum_{j=0}^{m_1} \sum_{l=0}^{m_2} E\{x \mid \chi_j^1, \chi_l^2, Z^1, Z^2\} \cdot P\{\chi_j^1, \chi_l^2 \mid Z^1, Z^2\}, \quad (17)$$

where $m_i$, $i = 1, 2$, is the last set of measurements of sensors 1 and 2, $Z^i$, $i = 1, 2$, is the set of accumulated data, and $\chi$ is the association hypothesis. The first term on the right side of the equation is calculated from the associations that were made earlier. The second term is computed from the individual association probabilities as follows:

$$P(\chi_j^1, \chi_l^2 \mid Z^1, Z^2) = \sum_{\chi^1} \sum_{\chi^2} P(\chi^1, \chi^2 \mid Z^1, Z^2)\, \hat{1}_j(\chi^1)\, \hat{2}_l(\chi^2),$$

$$P(\chi^1, \chi^2 \mid Z^1, Z^2) = \frac{1}{c}\, P(\chi^1 \mid Z^1)\, P(\chi^2 \mid Z^2)\, \gamma(\chi^1, \chi^2), \quad (18)$$

where the $\chi^i$ are the joint hypotheses involving all of the measurements and all of the objectives, and $\hat{i}_j(\chi^i)$ are the binary indicators of the measurement-target association. The additional term $\gamma(\chi^1, \chi^2)$ depends on the correlation of the individual hypotheses and reflects the localization influence of the current measurements on the joint hypotheses.

These equations are obtained assuming that communication exists after every observation; only approximations exist for the case in which communication is sporadic and a substantial amount of noise occurs. Therefore, this algorithm is a theoretical model that has some limitations in practical applications.

3.6. Distributed Multiple Hypothesis Test. The distributed version of the MHT algorithm (MHT-D) [26, 27] follows a similar structure as the JPDA-D algorithm. Let us assume the case in which one node must fuse two sets of hypotheses and tracks. If the hypothesis and track sets are represented by $H^i(Z^i)$ and $T^i(Z^i)$, with $i = 1, 2$, the hypothesis probabilities are represented by $P(\lambda_j^i)$, and the state distributions of the tracks $\tau_j^i$ are represented by $p(x \mid Z^i, \tau_j^i)$, then the maximum available information in the fusion node is $Z = Z^1 \cup Z^2$. The data fusion objective of the MHT-D is to obtain the set of hypotheses $H(Z)$, the set of tracks $T(Z)$, the hypothesis probabilities $P(\lambda \mid Z)$, and the state distributions $p(x \mid Z, \tau)$ for the observed data.

The MHT-D algorithm is composed of the following steps:

(1) hypothesis formation: for each hypothesis pair $\lambda_j^1$ and $\lambda_k^2$ that could be fused, a track $\tau$ is formed by associating the pair of tracks $\tau_j^1$ and $\tau_k^2$, where each track comes from one node and could originate from the same target. The final result of this stage is a set of hypotheses, denoted by $H(Z)$, and the fused tracks $T(Z)$;

(2) hypothesis evaluation: in this stage, the association probability of each hypothesis and the estimated state of each fused track are obtained. The distributed estimation algorithm is employed to calculate the likelihood of the possible associations and the estimations obtained for each specific association.


Using the information model, the probability of each fused hypothesis is given by

$$P(\lambda \mid Z) = C^{-1} \prod_{j \in J} P(\lambda^{(j)} \mid Z^{(j)})^{\alpha(j)} \prod_{\tau \in \lambda} L(\tau \mid Z), \quad (19)$$

where $C$ is a normalizing constant and $L(\tau \mid Z)$ is the likelihood of each hypothesis pair.

The main disadvantage of the MHT-D is its high computational cost, which is on the order of $O(n^M)$, where $n$ is the number of possible associations and $M$ is the number of variables to be estimated.

3.7. Graphical Models. Graphical models are a formalism for representing and reasoning with probabilities and independence. A graphical model represents a conditional decomposition of the joint probability. A graphical model can be represented as a graph in which the nodes denote random variables, the edges denote the possible dependence between the random variables, and the plates denote the replication of a substructure with the appropriate indexing of the relevant variables. The graph captures the joint distribution over the random variables, which can be decomposed into a product of factors that each depend on only a subset of the variables. There are two major classes of graphical models: (i) Bayesian networks [28], which are also known as directed graphical models, and (ii) Markov random fields, which are also known as undirected graphical models. Directed graphical models are useful for expressing causal relationships between random variables, whereas undirected models are better suited for expressing soft constraints between random variables. We refer the reader to the book of Koller and Friedman [29] for more information on graphical models.

A framework based on graphical models can solve the problem of distributed data association in synchronized sensor networks with overlapping areas and where each sensor receives noisy measurements; this solution was proposed by Chen et al. [30, 31]. Their work is based on graphical models that are used to represent the statistical dependence between random variables. The data association problem is treated as an inference problem and solved by using the max-product algorithm [32]. Graphical models represent statistical dependencies between variables as graphs, and the max-product algorithm converges when the graph is a tree structure. Moreover, the employed algorithm can be implemented in a distributed manner by exchanging messages between the source nodes in parallel. With this algorithm, if each sensor has $n$ possible combinations of associations and there are $M$ variables to be estimated, the complexity is $O(n^2 M)$, which is reasonable and less than the $O(n^M)$ complexity of the MHT-D algorithm. However, special attention must be given to the correlated variables when building the graphical model.

4. State Estimation Methods

State estimation techniques aim to determine the state of the target under movement (typically the position) given the observations or measurements. State estimation techniques are also known as tracking techniques. In their general form, it is not guaranteed that the target observations are relevant, which means that some of the observations could actually come from the target and others could be only noise. The state estimation phase is a common stage in data fusion algorithms because the target's observations could come from different sensors or sources, and the final goal is to obtain a global target state from the observations.

The estimation problem involves finding the values of the vector state (e.g., position, velocity, and size) that fit as much as possible with the observed data. From a mathematical perspective, we have a set of redundant observations, and the goal is to find the set of parameters that provides the best fit to the observed data. In general, these observations are corrupted by errors and by the propagation of noise in the measurement process. State estimation methods fall under level 1 of the JDL classification and can be divided into two broader groups:

(1) linear dynamics and measurements: here, the estimation problem has a standard solution. Specifically, when the equations of the object state and the measurements are linear, the noise follows the Gaussian distribution, and we do not refer to it as a clutter environment; in this case, the optimal theoretical solution is based on the Kalman filter;

(2) nonlinear dynamics: the state estimation problem becomes difficult, and there is no analytical solution to solve the problem in a general manner. In principle, there are no practical algorithms available to solve this problem satisfactorily.

Most of the state estimation methods are based on control theory and employ the laws of probability to compute a vector state from a vector measurement or a stream of vector measurements. Next, the most common estimation methods are presented, including maximum likelihood and maximum posterior (Section 4.1), the Kalman filter (Section 4.2), the particle filter (Section 4.3), the distributed Kalman filter (Section 4.4), the distributed particle filter (Section 4.5), and covariance consistency methods (Section 4.6).

4.1. Maximum Likelihood and Maximum Posterior. The maximum likelihood (ML) technique is an estimation method that is based on probabilistic theory. Probabilistic estimation methods are appropriate when the state variable follows an unknown probability distribution [33]. In the context of data fusion, $x$ is the state being estimated, and $z = (z(1), \ldots, z(k))$ is a sequence of $k$ previous observations of $x$. The likelihood function $\lambda(x)$ is defined as a probability density function of the sequence of $z$ observations given the true value of the state $x$:

$$\lambda(x) = p(z \mid x). \quad (20)$$

The ML estimator finds the value of $x$ that maximizes the likelihood function:

$$\hat{x}(k) = \arg\max_x\, p(z \mid x), \quad (21)$$


which can be obtained from the analytical or empirical models of the sensors. This function expresses the probability of the observed data. The main disadvantage of this method in practice is that it requires the analytical or empirical model of the sensor to be known in order to provide the prior distribution and compute the likelihood function. This method can also systematically underestimate the variance of the distribution, which leads to a bias problem. However, the bias of the ML solution becomes less significant as the number $N$ of data points increases; in the limit $N \to \infty$, the estimated variance equals the true variance of the distribution that generated the data.

The maximum posterior (MAP) method is based on Bayesian theory. It is employed when the parameter $x$ to be estimated is the output of a random variable that has a known probability density function $p(x)$. In the context of data fusion, $x$ is the state being estimated, and $z = (z(1), \ldots, z(k))$ is a sequence of $k$ previous observations of $x$. The MAP estimator finds the value of $x$ that maximizes the posterior probability distribution:

$$\hat{x}(k) = \arg\max_x\, p(x \mid z). \quad (22)$$

Both methods (ML and MAP) aim to find the most likely value for the state $x$. However, ML assumes that $x$ is a fixed but unknown point of the parameter space, whereas MAP considers $x$ to be the output of a random variable with a known a priori probability density function. Both methods are equivalent when there is no a priori information about $x$, that is, when there are only observations.
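As a concrete illustration, consider estimating a constant scalar state from noisy observations; the Gaussian measurement model and Gaussian prior below are assumptions made for this example. The sketch shows how the MAP estimate blends the prior with the data and reduces to the ML estimate as the prior becomes uninformative:

```python
import numpy as np

def ml_mean(z):
    """ML estimate of a constant state x from z(i) = x + noise, noise ~ N(0, sigma2):
    the likelihood p(z | x) is maximized by the sample mean."""
    return np.mean(z)

def map_mean(z, sigma2, mu0, sigma0_2):
    """MAP estimate with a Gaussian prior x ~ N(mu0, sigma0_2):
    the posterior mode is a precision-weighted blend of prior and data."""
    precision = len(z) / sigma2 + 1.0 / sigma0_2
    return (np.sum(z) / sigma2 + mu0 / sigma0_2) / precision

z = np.array([2.1, 1.9, 2.3, 2.0])                 # noisy observations of the state
print(ml_mean(z))                                  # 2.075
print(map_mean(z, 0.1, mu0=0.0, sigma0_2=1.0))     # pulled slightly toward the prior
# With a nearly flat prior (sigma0_2 -> infinity), MAP converges to the ML estimate.
print(map_mean(z, 0.1, mu0=0.0, sigma0_2=1e9))
```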

4.2. The Kalman Filter. The Kalman filter is the most popular estimation technique. It was originally proposed by Kalman [34] and has been widely studied and applied since then. The Kalman filter estimates the state $x$ of a discrete-time process governed by the following space-time model:

$$x(k+1) = \Phi(k)x(k) + G(k)u(k) + w(k), \quad (23)$$

with the observations or measurements $z$ at time $k$ of the state $x$ represented by

$$z(k) = H(k)x(k) + v(k), \quad (24)$$

where $\Phi(k)$ is the state transition matrix, $G(k)$ is the input transition matrix, $u(k)$ is the input vector, $H(k)$ is the measurement matrix, and $w$ and $v$ are random Gaussian variables with zero mean and covariance matrices $Q(k)$ and $R(k)$, respectively. Based on the measurements and on the system parameters, the estimation of $x(k)$, represented by $\hat{x}(k)$, and the prediction of $x(k+1)$, represented by $\hat{x}(k+1 \mid k)$, are given by

$$\hat{x}(k) = \hat{x}(k \mid k-1) + K(k)\left[z(k) - H(k)\hat{x}(k \mid k-1)\right],$$

$$\hat{x}(k+1 \mid k) = \Phi(k)\hat{x}(k \mid k) + G(k)u(k), \quad (25)$$

respectively, where $K$ is the filter gain, determined by

$$K(k) = P(k \mid k-1)H^T(k)\left[H(k)P(k \mid k-1)H^T(k) + R(k)\right]^{-1}, \quad (26)$$

where $P(k \mid k-1)$ is the prediction covariance matrix, which can be determined by

$$P(k+1 \mid k) = \Phi(k)P(k)\Phi^T(k) + Q(k), \quad (27)$$

with

$$P(k) = P(k \mid k-1) - K(k)H(k)P(k \mid k-1). \quad (28)$$
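The predict and update cycle of (25)-(28) is compact in code. The following is a minimal sketch for a constant-velocity model with a position measurement; the model matrices and noise levels are illustrative assumptions (the control input $u(k)$ is omitted):

```python
import numpy as np

def kalman_step(x, P, z, Phi, H, Q, R):
    """One predict/update cycle of the Kalman filter, following (25)-(28)."""
    # Prediction (no control input u(k) in this example).
    x_pred = Phi @ x                          # x(k | k-1)
    P_pred = Phi @ P @ Phi.T + Q              # P(k | k-1), eq. (27)
    # Update.
    S = H @ P_pred @ H.T + R                  # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)       # filter gain, eq. (26)
    x_new = x_pred + K @ (z - H @ x_pred)     # state estimate, eq. (25)
    P_new = P_pred - K @ H @ P_pred           # covariance, eq. (28)
    return x_new, P_new

# Constant-velocity model: state [position, velocity], position measured.
dt = 1.0
Phi = np.array([[1.0, dt], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
Q = 0.01 * np.eye(2)
R = np.array([[0.5]])
x, P = np.zeros(2), np.eye(2)
for z in [1.1, 2.0, 2.9, 4.2]:                # noisy positions of a moving target
    x, P = kalman_step(x, P, np.array([z]), Phi, H, Q, R)
print(x)  # estimated position and velocity
```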

The Kalman filter is mainly employed to fuse low-level data. If the system can be described with a linear model and the error can be modeled as Gaussian noise, then the recursive Kalman filter obtains optimal statistical estimations [35]. However, other methods are required to address nonlinear dynamic models and nonlinear measurements. The modified Kalman filter known as the extended Kalman filter (EKF) is an optimal approach for implementing nonlinear recursive filters [36]. The EKF is one of the most often employed methods for fusing data in robotic applications. However, it has some disadvantages because the computation of the Jacobians is extremely expensive. Some attempts have been made to reduce the computational cost, such as linearization, but these attempts introduce errors in the filter and make it unstable.

The unscented Kalman filter (UKF) [37] has gained popularity because it does not have the linearization step of the EKF and its associated errors [38]. The UKF employs a deterministic sampling strategy to establish the minimum set of points around the mean. This set of points completely captures the true mean and covariance. Then, these points are propagated through the nonlinear functions, and the covariance of the estimations can be recovered. Another advantage of the UKF is its suitability for parallel implementations.

4.3. Particle Filter. Particle filters are recursive implementations of the sequential Monte Carlo methods [39]. This method builds the posterior density function using several random samples called particles. Particles are propagated over time with a combination of sampling and resampling steps. At each iteration, the resampling step is employed to discard some particles, increasing the relevance of regions with a higher posterior probability. In the filtering process, several particles of the same state variable are employed, and each particle has an associated weight that indicates the quality of the particle. Therefore, the estimation is the result of a weighted sum of all of the particles. The standard particle filter algorithm has two phases: (1) the prediction phase and (2) the updating phase. In the prediction phase, each particle is modified according to the existing model, adding random noise to simulate the noise effect. Then, in the updating phase, the weight of each particle is reevaluated using the latest available sensor observation, and particles with lower weights are removed. Specifically, a generic particle filter comprises the following steps:


(1) Initialization of the particles:

(i) let $N$ be equal to the number of particles;

(ii) $X^{(i)}(1) = [x(1), y(1), 0, 0]^T$ for $i = 1, \ldots, N$.

(2) Prediction step:

(i) for each particle $i = 1, \ldots, N$, evaluate the state $(k+1 \mid k)$ of the system using the state at time instant $k$ and the noise of the system at time $k$:

$$X^{(i)}(k+1 \mid k) = F(k)X^{(i)}(k) + (\text{Cauchy-distribution noise})(k), \quad (29)$$

where $F(k)$ is the transition matrix of the system.

(3) Evaluate the particle weights. For each particle $i = 1, \ldots, N$:

(i) compute the predicted observation of the system using the current predicted state and the noise at instant $k$:

$$\hat{z}^{(i)}(k+1 \mid k) = H(k+1)X^{(i)}(k+1 \mid k) + (\text{Gaussian measurement noise})(k+1); \quad (30)$$

(ii) compute the likelihood (weights) according to the given distribution:

$$\text{likelihood}^{(i)} = N\left(\hat{z}^{(i)}(k+1 \mid k);\, z^{(i)}(k+1),\, \text{var}\right); \quad (31)$$

(iii) normalize the weights as follows:

$$w^{(i)} = \frac{\text{likelihood}^{(i)}}{\sum_{j=1}^{N} \text{likelihood}^{(j)}}. \quad (32)$$

(4) Resampling/selection: multiply particles with higher weights and remove those with lower weights. The current state must be adjusted using the computed weights of the new particles:

(i) compute the cumulative weights:

$$\text{Cum\_Wt}^{(i)} = \sum_{j=1}^{i} w^{(j)}; \quad (33)$$

(ii) generate uniformly distributed random variables $U^{(i)} \sim U(0, 1)$, with the number of draws equal to the number of particles;

(iii) determine which particles should be multiplied and which ones removed.

(5) Propagation phase:

(i) incorporate the new values of the state after the resampling at instant $k$ to calculate the value at instant $k+1$:

$$x^{(1, \ldots, N)}(k+1 \mid k+1) = x(k+1 \mid k); \quad (34)$$

(ii) compute the posterior mean:

$$\hat{x}(k+1) = \text{mean}\left[x^{i}(k+1 \mid k+1)\right], \quad i = 1, \ldots, N; \quad (35)$$

(iii) repeat steps 2 to 5 for each time instant.
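The following is a minimal bootstrap particle filter sketch of these steps for a one-dimensional state; the Gaussian transition and measurement models and the noise levels are illustrative assumptions (the Cauchy process noise of (29) could be substituted directly):

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, z, proc_std=0.5, meas_std=1.0):
    """One predict/weight/resample cycle of a bootstrap particle filter."""
    N = len(particles)
    # (2) Prediction: propagate each particle through the dynamics plus noise.
    particles = particles + rng.normal(0.0, proc_std, size=N)
    # (3) Weighting: Gaussian likelihood of the measurement for each particle.
    w = np.exp(-0.5 * ((z - particles) / meas_std) ** 2)
    w /= w.sum()                                   # normalization, eq. (32)
    # (4) Resampling: draw N particles in proportion to their weights
    #     (multinomial resampling via the cumulative weights of eq. (33)).
    idx = np.searchsorted(np.cumsum(w), rng.uniform(size=N))
    particles = particles[idx]
    # (5) The posterior mean is the state estimate, eq. (35).
    return particles, particles.mean()

# Track a target drifting to the right from noisy position measurements.
particles = rng.normal(0.0, 1.0, size=1000)       # (1) initialization
for z in [0.5, 1.1, 1.4, 2.2, 2.8]:
    particles, estimate = particle_filter_step(particles, z)
    print(round(estimate, 3))
```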

Particle filters are more flexible than the Kalman filters and can cope with nonlinear dependencies and non-Gaussian densities in the dynamic model and in the noise error. However, they have some disadvantages. A large number of particles are required to obtain a small variance in the estimator. It is also difficult to establish the optimal number of particles in advance, and the number of particles affects the computational cost significantly. Earlier versions of particle filters employed a fixed number of particles, but recent studies have started to use a dynamic number of particles [40].

4.4. The Distributed Kalman Filter. The distributed Kalman filter requires correct clock synchronization between each source, as demonstrated in [41]. In other words, to correctly use the distributed Kalman filter, the clocks of all of the sources must be synchronized. This synchronization is typically achieved by using protocols that employ a shared global clock, such as the network time protocol (NTP). Synchronization problems between clocks have been shown to have an effect on the accuracy of the Kalman filter, producing inaccurate estimations [42].

If the estimations are consistent and the cross-covariance is known (or the estimations are uncorrelated), then it is possible to use distributed Kalman filters [43]. However, the cross-covariance must be determined exactly, or the observations must be consistent.

We refer the reader to Liggins II et al. [20] for more details about the Kalman filter in distributed and hierarchical architectures.

4.5. Distributed Particle Filter. Distributed particle filters have gained attention recently [44–46]. Coates [45] used a distributed particle filter to monitor an environment that could be captured by a Markovian state-space model involving nonlinear dynamics and observations and non-Gaussian noise.

In contrast, earlier attempts to solve out-of-sequence measurements using particle filters are based on regenerating the probability density function at the time instant of the out-of-sequence measurement [47]. In a particle filter, this step requires a large computational cost, in addition to the space necessary to store the previous particles. To avoid this problem, Orton and Marrs [48] proposed storing the information on the particles at each time instant, saving the cost of recalculating this information. This technique is close


to optimal, and when the delay increases, the result is only slightly affected [49]. However, it requires a very large amount of space to store the state of the particles at each time instant.

4.6. Covariance Consistency Methods: Covariance Intersection/Union. Covariance consistency methods (intersection and union) were proposed by Uhlmann [43] and are general and fault-tolerant frameworks for maintaining covariance means and estimations in a distributed network. These methods are not estimation techniques themselves; rather, they are similar to an estimation fusion technique. The distributed Kalman filter requirement of independent measurements or known cross-covariances is not a constraint with these methods.

4.6.1. Covariance Intersection. If the Kalman filter is employed to combine two estimations $(a_1, A_1)$ and $(a_2, A_2)$, then it is assumed that the joint covariance has the following form:

$$\begin{bmatrix} A_1 & X \\ X^T & A_2 \end{bmatrix}, \quad (36)$$

where the cross-covariance $X$ should be known exactly so that the Kalman filter can be applied without difficulty. Because the computation of the cross-covariances is computationally intensive, Uhlmann [43] proposed the covariance intersection (CI) algorithm.

Let us assume that a joint covariance $M$ can be defined with the diagonal blocks $M_{A_1} > A_1$ and $M_{A_2} > A_2$ such that

$$M \ge \begin{bmatrix} A_1 & X \\ X^T & A_2 \end{bmatrix} \quad (37)$$

for every possible instance of the unknown cross-covariance $X$; then the components of the matrix $M$ can be employed in the Kalman filter equations to provide a fused estimation $(c, C)$ that is considered consistent. The key point of this method relies on generating a joint covariance matrix $M$ that can represent a useful fused estimation (in this context, useful refers to something with a lower associated uncertainty). In summary, the CI algorithm computes the joint covariance matrix $M$ for which the Kalman filter provides the best fused estimation $(c, C)$ with respect to a fixed measurement of the covariance matrix (i.e., the minimum determinant).

Specific covariance criteria must be established because there is no unique minimum joint covariance in the partial order of the positive semidefinite matrices. Moreover, although the joint covariance is the basis of the formal analysis of the CI algorithm, the actual result is a nonlinear mixture of the information stored in the estimations being fused, following the equation:

$$C = \left(w_1 H_1^T A_1^{-1} H_1 + w_2 H_2^T A_2^{-1} H_2 + \cdots + w_n H_n^T A_n^{-1} H_n\right)^{-1},$$

$$c = C\left(w_1 H_1^T A_1^{-1} a_1 + w_2 H_2^T A_2^{-1} a_2 + \cdots + w_n H_n^T A_n^{-1} a_n\right), \quad (38)$$

where $H_i$ is the transformation of the fused state-space estimation to the space of the estimated state $i$. The values of $w$ can be calculated to minimize the covariance determinant using convex optimization packages and semidefinite programming. The result of the CI algorithm has different characteristics compared to the Kalman filter. For example, if two estimations $(a, A)$ and $(b, B)$ are provided and their covariances are equal, $A = B$, then, since the Kalman filter is based on the statistical independence assumption, it produces a fused estimation with covariance $C = (1/2)A$. In contrast, the CI method does not assume independence and thus must remain consistent even in the case in which the estimations are completely correlated, yielding an estimated fused covariance $C = A$. In the case of estimations where $A < B$, the CI algorithm does not take information from the estimation $(b, B)$; thus, the fused result is $(a, A)$.
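For two full-state estimates (each $H_i$ equal to the identity), (38) reduces to an information-weighted blend governed by a single scalar $w$. The following is a minimal sketch that, instead of a convex optimization package, picks $w$ by a simple grid scan for the minimum determinant:

```python
import numpy as np

def covariance_intersection(a1, A1, a2, A2, steps=100):
    """Fuse two consistent estimates with unknown cross-covariance (CI).

    Implements C = (w*A1^-1 + (1-w)*A2^-1)^-1 and
               c = C (w*A1^-1 a1 + (1-w)*A2^-1 a2),
    choosing w in [0, 1] to minimize det(C) by a simple grid scan.
    """
    A1_inv, A2_inv = np.linalg.inv(A1), np.linalg.inv(A2)
    best = None
    for w in np.linspace(0.0, 1.0, steps):
        C = np.linalg.inv(w * A1_inv + (1 - w) * A2_inv)
        if best is None or np.linalg.det(C) < best[0]:
            c = C @ (w * A1_inv @ a1 + (1 - w) * A2_inv @ a2)
            best = (np.linalg.det(C), c, C)
    return best[1], best[2]

# Two estimates of the same 2D state with possibly correlated errors.
a1, A1 = np.array([1.0, 0.0]), np.array([[2.0, 0.0], [0.0, 0.5]])
a2, A2 = np.array([1.2, 0.2]), np.array([[0.5, 0.0], [0.0, 2.0]])
c, C = covariance_intersection(a1, A1, a2, A2)
print(c, np.linalg.det(C))
```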

Every joint-consistent covariance is sufficient to produce a fused estimation that guarantees consistency. However, it is also necessary to guarantee a lack of divergence. Divergence is avoided in the CI algorithm by choosing a specific measurement (i.e., the determinant), which is minimized in each fusion operation. This measurement represents a nondivergence criterion because the size of the estimated covariance, according to this criterion, will not be incremented.

The application of the CI method guarantees consistency and nondivergence for every sequence of mean- and covariance-consistent estimations. However, this method does not work well when the measurements to be fused are inconsistent.

4.6.2. Covariance Union. CI solves the problem of correlated inputs but not the problem of inconsistent inputs (inconsistent inputs refer to different estimations, each of which has a high accuracy (small variance) but also a large difference from the states of the others); thus, the covariance union (CU) algorithm was proposed to solve the latter [43]. CU addresses the following problem: two estimations $(a_1, A_1)$ and $(a_2, A_2)$ relate to the state of an object and are mutually inconsistent with one another. This issue arises when the difference between the average estimations is larger than the provided covariance. Inconsistent inputs can be detected using the Mahalanobis distance [50] between them, which is defined as

$$M_d = (a_1 - a_2)^T (A_1 + A_2)^{-1} (a_1 - a_2) \qquad (39)$$

and detecting whether this distance is larger than a given threshold.

The Mahalanobis distance accounts for the covariance information when computing the distance. If the difference between the estimations is high but their covariances are also high, the Mahalanobis distance yields a small value. In contrast, if the difference between the estimations is small and the covariances are also small, it could produce a larger distance value. A high Mahalanobis distance could indicate that the estimations are inconsistent; however, a specific threshold is necessary, either established by the user or learned automatically.

The CU algorithm aims to solve the following problem: let us suppose that a filtering algorithm provides two observations with mean and covariance $(a_1, A_1)$ and $(a_2, A_2)$, respectively. It is known that one of the observations is correct and the other is erroneous; however, the identity of the correct estimation is unknown and cannot be determined. In this situation, if both estimations are employed as inputs to the Kalman filter, there will be a problem, because the Kalman filter only guarantees a consistent output if the observation is updated with a measurement that is consistent with both of them. In the specific case in which the measurements correspond to the same object but are acquired from two different sensors, the Kalman filter can only guarantee that the output is consistent if it is consistent with both separately. Because it is not possible to know which estimation is correct, the only way to combine the two estimations rigorously is to provide an estimation $(u, U)$ that is consistent with both estimations and obeys the following properties:

$$U \succeq A_1 + (u - a_1)(u - a_1)^T$$

$$U \succeq A_2 + (u - a_2)(u - a_2)^T \qquad (40)$$

where some measurement of the matrix size of $U$ (e.g., the determinant) is minimized.

In other words, the previous equations indicate that if the estimation $(a_1, A_1)$ is consistent, then translating the vector $a_1$ to $u$ requires increasing the covariance by a matrix at least as large as the outer product of $(u - a_1)$ with itself in order to remain consistent. The same applies to the measurement $(a_2, A_2)$.

A simple strategy is to choose the mean of one of the measurements as the fused value ($u = a_1$). In this case, the value of $U$ must be chosen such that the estimation is consistent in the worst case (i.e., when the correct measurement is $a_2$). However, it is possible to assign $u$ an intermediate value between $a_1$ and $a_2$ to decrease the value of $U$. Therefore, the CU algorithm establishes the fused mean value $u$ with the smallest covariance $U$ that is still sufficiently large with respect to the two measurements ($a_1$ and $a_2$) to guarantee consistency.

Because the matrix inequalities presented in the previous equations are convex, convex optimization algorithms must be employed to solve them. The value of $U$ can be computed with the iterative method described by Julier et al. [51]. The obtained covariance could be significantly larger than any of the initial covariances and is an indicator of the uncertainty that exists between the initial estimations. One of the advantages of the CU method is that the same process can easily be extended to $N$ inputs.

5. Decision Fusion Methods

A decision is typically made based on the knowledge of the perceived situation, which is provided by many sources in the data fusion domain. Decision fusion techniques aim to make a high-level inference about the events and activities that are produced by the detected targets. These techniques often use symbolic information, and the fusion process requires reasoning while accounting for the uncertainties and constraints. These methods fall under level 2 (situation assessment) and level 3 (impact assessment) of the JDL data fusion model.

5.1. The Bayesian Methods. Information fusion based on Bayesian inference provides a formalism for combining evidence according to the rules of probability theory. Uncertainty is represented using conditional probability terms that describe beliefs and take values in the interval [0, 1], where zero indicates a complete lack of belief and one indicates an absolute belief. Bayesian inference is based on the Bayes rule as follows:

$$P(Y \mid X) = \frac{P(X \mid Y)\, P(Y)}{P(X)} \qquad (41)$$

where the posterior probability $P(Y \mid X)$ represents the belief in the hypothesis $Y$ given the information $X$. This probability is obtained by multiplying the a priori probability of the hypothesis, $P(Y)$, by the probability of having $X$ given that $Y$ is true, $P(X \mid Y)$. The value $P(X)$ is used as a normalizing constant. The main disadvantage of Bayesian inference is that the probabilities $P(X)$ and $P(X \mid Y)$ must be known. To estimate the conditional probabilities, Pan et al. [52] proposed the use of NNs, whereas Coue et al. [53] proposed Bayesian programming.

Hall and Llinas [54] described the following problems associated with Bayesian inference:

(i) difficulty in establishing the value of the a priori probabilities;

(ii) complexity when there are multiple potential hypotheses and a substantial number of events that depend on the conditions;

(iii) the requirement that the hypotheses be mutually exclusive;

(iv) difficulty in describing the uncertainty of the decisions.

5.2. The Dempster-Shafer Inference. The Dempster-Shafer inference is based on the mathematical theory introduced by Dempster [55] and Shafer [56], which generalizes Bayesian theory. The Dempster-Shafer theory provides a formalism that can be used to represent incomplete knowledge, to update beliefs, and to combine evidence, and it allows us to represent uncertainty explicitly [57].

A fundamental concept in Dempster-Shafer reasoning is the frame of discernment, which is defined as follows. Let $\Theta = \{\theta_1, \theta_2, \ldots, \theta_N\}$ be the set of all possible states that define the system, with $\Theta$ exhaustive and mutually exclusive, because the system is in only one state $\theta_i \in \Theta$, where $1 \leq i \leq N$. The set $\Theta$ is called a frame of discernment because its elements are employed to discern the current state of the system.

The elements of the power set $2^\Theta$ are called hypotheses. In the Dempster-Shafer theory, based on the evidence $E$, a probability is assigned to each hypothesis $H \in 2^\Theta$ according to the basic probability assignment, or mass function, $m: 2^\Theta \to [0, 1]$, which satisfies

$$m(\emptyset) = 0 \qquad (42)$$

Thus, the mass function of the empty set is zero. Furthermore, the mass function of each hypothesis is larger than or equal to zero. Consider

$$m(H) \geq 0 \quad \forall H \in 2^\Theta \qquad (43)$$

The sum of the mass functions of all of the hypotheses is one. Consider

$$\sum_{H \in 2^\Theta} m(H) = 1 \qquad (44)$$

To express incomplete beliefs in a hypothesis $H$, the Dempster-Shafer theory defines the belief function $\mathrm{bel}: 2^\Theta \to [0, 1]$ over $\Theta$ as

$$\mathrm{bel}(H) = \sum_{A \subseteq H} m(A) \qquad (45)$$

where $\mathrm{bel}(\emptyset) = 0$ and $\mathrm{bel}(\Theta) = 1$. The level of doubt in $H$ can be expressed in terms of the belief function by

$$\mathrm{dou}(H) = \mathrm{bel}(\neg H) = \sum_{A \subseteq \neg H} m(A) \qquad (46)$$

To express the plausibility of each hypothesis, the function $\mathrm{pl}: 2^\Theta \to [0, 1]$ over $\Theta$ is defined as

$$\mathrm{pl}(H) = 1 - \mathrm{dou}(H) = \sum_{A \cap H \neq \emptyset} m(A) \qquad (47)$$

Intuitively, plausibility indicates that there is less uncertainty in hypothesis $H$ if it is more plausible. The confidence interval $[\mathrm{bel}(H), \mathrm{pl}(H)]$ defines the true belief in hypothesis $H$. To combine the effects of two mass functions $m_1$ and $m_2$, the Dempster-Shafer theory defines the combination rule $m_1 \oplus m_2$ as

$$(m_1 \oplus m_2)(\emptyset) = 0$$

$$(m_1 \oplus m_2)(H) = \frac{\sum_{X \cap Y = H} m_1(X)\, m_2(Y)}{1 - \sum_{X \cap Y = \emptyset} m_1(X)\, m_2(Y)} \qquad (48)$$

In contrast to Bayesian inference, a priori probabilities are not required in the Dempster-Shafer inference because they are assigned at the instant that the information is provided. Several studies in the literature have compared the use of Bayesian inference and Dempster-Shafer inference, such as [58-60]. Wu et al. [61] used the Dempster-Shafer theory to fuse information in context-aware environments. This work was extended in [62] to dynamically modify the weights associated with the sensor measurements; the fusion mechanism is thereby calibrated according to the recent measurements of the sensors (in cases in which the ground truth is available). In the military domain [63], Dempster-Shafer reasoning is used with the a priori information stored in a database for classifying military ships. Morbee et al. [64] described the use of the Dempster-Shafer theory to build 2D occupancy maps from several cameras and to evaluate the contribution of subsets of cameras to a specific task. Each task is the observation of an event of interest, and the goal is to assess the validity of a set of hypotheses that are fused using the Dempster-Shafer theory.

5.3. Abductive Reasoning. Abductive reasoning, or inferring the best explanation, is a reasoning method in which a hypothesis is chosen under the assumption that, in case it is true, it explains the observed event most accurately [65]. In other words, when an event is observed, the abduction method attempts to find the best explanation.

In the context of probabilistic reasoning, abductive inference finds the maximum-likelihood configuration of the system variables, a posteriori, given some observed variables. Abductive reasoning is more a reasoning pattern than a data fusion technique; therefore, different inference methods, such as NNs [66] or fuzzy logic [67], can be employed.

5.4. Semantic Methods. Decision fusion techniques that employ semantic data from different sources as an input can provide more accurate results than those that rely on only single sources. There is a growing interest in techniques that automatically determine the presence of semantic features in videos in order to bridge the semantic gap [68].

Semantic information fusion is essentially a scheme in which raw sensor data are processed such that the nodes exchange only the resultant semantic information. Semantic information fusion typically covers two phases: (i) building the knowledge and (ii) pattern matching (inference). The first phase (typically offline) incorporates the most appropriate knowledge into semantic information. Then, the second phase (typically online or in real time) fuses relevant attributes and provides a semantic interpretation of the sensor data [69-71].

Semantic fusion can be viewed as a means of integrating and translating sensor data into formal languages. The language obtained from the observations of the environment is then compared with similar languages that are stored in a database. The key to this strategy is that similar behaviors, represented by formal languages, are also semantically similar. This type of method provides savings in transmission cost because the nodes need only transmit the formal language structure instead of the raw data. However, a known set of behaviors must be stored in a database in advance, which might be difficult in some scenarios.

6. Conclusions

This paper reviews the most popular methods and techniques for performing data/information fusion. To determine whether the application of data/information fusion methods is feasible, we must evaluate the computational cost of the process and the delay introduced in the communication. A centralized data fusion approach is theoretically optimal when there is no cost of transmission and there are sufficient computational resources; however, this situation typically does not hold in practical applications.

The selection of the most appropriate technique depends on the type of problem and on the assumptions established by each technique. Statistical data fusion methods (e.g., PDA, JPDA, MHT, and the Kalman filter) are optimal under specific conditions [72]. First, the assumption that the targets move independently and that the measurements are normally distributed around the predicted position typically does not hold. Second, because the statistical techniques model all of the events as probabilities, they typically have several parameters and a priori probabilities for false measurements and detection errors that are often difficult to obtain (at least in an optimal sense). For example, in the case of the MHT algorithm, specific parameters must be established that are nontrivial to determine and are very sensitive [73]. Moreover, statistical methods that optimize over several frames are computationally intensive, and their complexity typically grows exponentially with the number of targets. For example, in the case of particle filters, several targets can be tracked either jointly, as a group, or individually. If several targets are tracked jointly, the necessary number of particles grows exponentially; therefore, in practice, it is better to track them individually, under the assumption that the targets do not interact, so that the particle sets remain independent.

In contrast to centralized systems, distributed data fusion methods introduce some challenges into the data fusion process, such as (i) the spatial and temporal alignment of the information, (ii) out-of-sequence measurements, and (iii) data correlation, as reported by Castanedo et al. [74, 75]. The inherent redundancy of distributed systems could be exploited with distributed reasoning techniques and cooperative algorithms to improve the individual node estimations, as reported by Castanedo et al. [76]. In addition to the previous studies, a new trend based on the geometric notion of a low-dimensional manifold is gaining attention in the data fusion community. An example is the work of Davenport et al. [77], which proposes a simple model that captures the correlation between the sensor observations by matching the parameter values of the different obtained manifolds.

Acknowledgments

The author would like to thank Jesús García, Miguel A. Patricio, and James Llinas for their interesting and related discussions on several topics that were presented in this paper.

References

[1] JDL, Data Fusion Lexicon, Technical Panel for C3, F. E. White, Code 420, San Diego, Calif, USA, 1991.
[2] D. L. Hall and J. Llinas, "An introduction to multisensor data fusion," Proceedings of the IEEE, vol. 85, no. 1, pp. 6–23, 1997.
[3] H. F. Durrant-Whyte, "Sensor models and multisensor integration," International Journal of Robotics Research, vol. 7, no. 6, pp. 97–113, 1988.
[4] B. V. Dasarathy, "Sensor fusion potential exploitation: innovative architectures and illustrative applications," Proceedings of the IEEE, vol. 85, no. 1, pp. 24–38, 1997.
[5] R. C. Luo, C.-C. Yih, and K. L. Su, "Multisensor fusion and integration: approaches, applications, and future research directions," IEEE Sensors Journal, vol. 2, no. 2, pp. 107–119, 2002.
[6] J. Llinas, C. Bowman, G. Rogova, A. Steinberg, E. Waltz, and F. White, "Revisiting the JDL data fusion model II," Technical Report, DTIC Document, 2004.
[7] E. P. Blasch and S. Plano, "JDL level 5 fusion model: 'user refinement' issues and applications in group tracking," in Proceedings of Signal Processing, Sensor Fusion, and Target Recognition XI, pp. 270–279, April 2002.
[8] H. F. Durrant-Whyte and M. Stevens, "Data fusion in decentralized sensing networks," in Proceedings of the 4th International Conference on Information Fusion, pp. 302–307, Montreal, Canada, 2001.
[9] J. Manyika and H. Durrant-Whyte, Data Fusion and Sensor Management: A Decentralized Information-Theoretic Approach, Prentice Hall, Upper Saddle River, NJ, USA, 1995.
[10] S. S. Blackman, "Association and fusion of multiple sensor data," in Multitarget-Multisensor Tracking: Advanced Applications, pp. 187–217, Artech House, 1990.
[11] S. Lloyd, "Least squares quantization in PCM," IEEE Transactions on Information Theory, vol. 28, no. 2, pp. 129–137, 1982.
[12] M. Shindler, A. Wong, and A. Meyerson, "Fast and accurate k-means for large datasets," in Proceedings of the 25th Annual Conference on Neural Information Processing Systems (NIPS '11), pp. 2375–2383, December 2011.
[13] Y. Bar-Shalom and E. Tse, "Tracking in a cluttered environment with probabilistic data association," Automatica, vol. 11, no. 5, pp. 451–460, 1975.
[14] T. E. Fortmann, Y. Bar-Shalom, and M. Scheffe, "Multi-target tracking using joint probabilistic data association," in Proceedings of the 19th IEEE Conference on Decision and Control including the Symposium on Adaptive Processes, vol. 19, pp. 807–812, December 1980.
[15] D. B. Reid, "An algorithm for tracking multiple targets," IEEE Transactions on Automatic Control, vol. 24, no. 6, pp. 843–854, 1979.
[16] C. L. Morefield, "Application of 0-1 integer programming to multitarget tracking problems," IEEE Transactions on Automatic Control, vol. 22, no. 3, pp. 302–312, 1977.
[17] R. L. Streit and T. E. Luginbuhl, "Maximum likelihood method for probabilistic multihypothesis tracking," in Signal and Data Processing of Small Targets, vol. 2235 of Proceedings of SPIE, p. 394, 1994.
[18] I. J. Cox and S. L. Hingorani, "Efficient implementation of Reid's multiple hypothesis tracking algorithm and its evaluation for the purpose of visual tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 2, pp. 138–150, 1996.
[19] K. G. Murty, "An algorithm for ranking all the assignments in order of increasing cost," Operations Research, vol. 16, no. 3, pp. 682–687, 1968.
[20] M. E. Liggins II, C.-Y. Chong, I. Kadar et al., "Distributed fusion architectures and algorithms for target tracking," Proceedings of the IEEE, vol. 85, no. 1, pp. 95–106, 1997.
[21] S. Coraluppi, C. Carthel, M. Luettgen, and S. Lynch, "All-source track and identity fusion," in Proceedings of the National Symposium on Sensor and Data Fusion, 2000.
[22] P. Storms and F. Spieksma, "An LP-based algorithm for the data association problem in multitarget tracking," in Proceedings of the 3rd IEEE International Conference on Information Fusion, vol. 1, 2000.
[23] S.-W. Joo and R. Chellappa, "A multiple-hypothesis approach for multiobject visual tracking," IEEE Transactions on Image Processing, vol. 16, no. 11, pp. 2849–2854, 2007.
[24] S. Coraluppi and C. Carthel, "Aggregate surveillance: a cardinality tracking approach," in Proceedings of the 14th International Conference on Information Fusion (FUSION '11), July 2011.

[25] K. C. Chang, C. Y. Chong, and Y. Bar-Shalom, "Joint probabilistic data association in distributed sensor networks," IEEE Transactions on Automatic Control, vol. 31, no. 10, pp. 889–897, 1986.
[26] C. Y. Chong, S. Mori, and K. C. Chang, "Information fusion in distributed sensor networks," in Proceedings of the 4th American Control Conference, Boston, Mass, USA, June 1985.
[27] C. Y. Chong, S. Mori, and K. C. Chang, "Distributed multitarget multisensor tracking," in Multitarget-Multisensor Tracking: Advanced Applications, vol. 1, pp. 247–295, 1990.
[28] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, San Mateo, Calif, USA, 1988.
[29] D. Koller and N. Friedman, Probabilistic Graphical Models: Principles and Techniques, MIT Press, 2009.
[30] L. Chen, M. Cetin, and A. S. Willsky, "Distributed data association for multi-target tracking in sensor networks," in Proceedings of the 7th International Conference on Information Fusion (FUSION '05), pp. 9–16, July 2005.
[31] L. Chen, M. J. Wainwright, M. Cetin, and A. S. Willsky, "Data association based on optimization in graphical models with application to sensor networks," Mathematical and Computer Modelling, vol. 43, no. 9-10, pp. 1114–1135, 2006.
[32] Y. Weiss and W. T. Freeman, "On the optimality of solutions of the max-product belief-propagation algorithm in arbitrary graphs," IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 736–744, 2001.
[33] C. Brown, H. Durrant-Whyte, J. Leonard, B. Rao, and B. Steer, "Distributed data fusion using Kalman filtering: a robotics application," in Data Fusion in Robotics and Machine Intelligence, M. A. Abidi and R. C. Gonzalez, Eds., pp. 267–309, 1992.
[34] R. E. Kalman, "A new approach to linear filtering and prediction problems," Journal of Basic Engineering, vol. 82, no. 1, pp. 35–45, 1960.
[35] R. C. Luo and M. G. Kay, "Data fusion and sensor integration: state-of-the-art 1990s," in Data Fusion in Robotics and Machine Intelligence, pp. 7–135, 1992.
[36] G. Welch and G. Bishop, An Introduction to the Kalman Filter, ACM SIGGRAPH 2001 Course Notes, 2001.
[37] S. J. Julier and J. K. Uhlmann, "A new extension of the Kalman filter to nonlinear systems," in Proceedings of the International Symposium on Aerospace/Defense Sensing, Simulation and Controls, vol. 3, 1997.
[38] E. A. Wan and R. Van Der Merwe, "The unscented Kalman filter for nonlinear estimation," in Proceedings of the Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC '00), pp. 153–158, 2000.
[39] D. Crisan and A. Doucet, "A survey of convergence results on particle filtering methods for practitioners," IEEE Transactions on Signal Processing, vol. 50, no. 3, pp. 736–746, 2002.
[40] J. Martínez-del-Rincón, C. Orrite-Uruñuela, and J. E. Herrero-Jaraba, "An efficient particle filter for color-based tracking in complex scenes," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance, pp. 176–181, 2007.
[41] S. Ganeriwal, R. Kumar, and M. B. Srivastava, "Timing-sync protocol for sensor networks," in Proceedings of the 1st International Conference on Embedded Networked Sensor Systems (SenSys '03), pp. 138–149, November 2003.
[42] M. Manzo, T. Roosta, and S. Sastry, "Time synchronization in networks," in Proceedings of the 3rd ACM Workshop on Security of Ad Hoc and Sensor Networks (SASN '05), pp. 107–116, November 2005.
[43] J. K. Uhlmann, "Covariance consistency methods for fault-tolerant distributed data fusion," Information Fusion, vol. 4, no. 3, pp. 201–215, 2003.
[44] A. S. Bashi, V. P. Jilkov, X. R. Li, and H. Chen, "Distributed implementations of particle filters," in Proceedings of the 6th International Conference on Information Fusion, pp. 1164–1171, 2003.
[45] M. Coates, "Distributed particle filters for sensor networks," in Proceedings of the 3rd International Symposium on Information Processing in Sensor Networks (ACM '04), pp. 99–107, New York, NY, USA, 2004.
[46] D. Gu, "Distributed particle filter for target tracking," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '07), pp. 3856–3861, April 2007.
[47] Y. Bar-Shalom, "Update with out-of-sequence measurements in tracking: exact solution," IEEE Transactions on Aerospace and Electronic Systems, vol. 38, no. 3, pp. 769–778, 2002.
[48] M. Orton and A. Marrs, "A Bayesian approach to multi-target tracking and data fusion with out-of-sequence measurements," IEE Colloquium, no. 174, pp. 151–155, 2001.
[49] M. L. Hernandez, A. D. Marrs, S. Maskell, and M. R. Orton, "Tracking and fusion for wireless sensor networks," in Proceedings of the 5th International Conference on Information Fusion, 2002.
[50] P. C. Mahalanobis, "On the generalized distance in statistics," Proceedings of the National Institute of Sciences of India, vol. 2, no. 1, pp. 49–55, 1936.
[51] S. J. Julier, J. K. Uhlmann, and D. Nicholson, "A method for dealing with assignment ambiguity," in Proceedings of the American Control Conference (ACC '04), vol. 5, pp. 4102–4107, July 2004.
[52] H. Pan, Z.-P. Liang, T. J. Anastasio, and T. S. Huang, "Hybrid NN-Bayesian architecture for information fusion," in Proceedings of the International Conference on Image Processing (ICIP '98), pp. 368–371, October 1998.
[53] C. Coue, T. Fraichard, P. Bessiere, and E. Mazer, "Multi-sensor data fusion using Bayesian programming: an automotive application," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 141–146, October 2002.
[54] D. L. Hall and J. Llinas, Handbook of Multisensor Data Fusion, CRC Press, Boca Raton, Fla, USA, 2001.
[55] A. P. Dempster, "A generalization of Bayesian inference," Journal of the Royal Statistical Society B, vol. 30, no. 2, pp. 205–247, 1968.
[56] G. Shafer, A Mathematical Theory of Evidence, Princeton University Press, Princeton, NJ, USA, 1976.
[57] G. M. Provan, "The validity of Dempster-Shafer belief functions," International Journal of Approximate Reasoning, vol. 6, no. 3, pp. 389–399, 1992.
[58] D. M. Buede, "Shafer-Dempster and Bayesian reasoning: a response to 'Shafer-Dempster reasoning with applications to multisensor target identification systems'," IEEE Transactions on Systems, Man, and Cybernetics, vol. 18, no. 6, pp. 1009–1011, 1988.
[59] Y. Cheng and R. L. Kashyap, "Comparison of Bayesian and Dempster's rules in evidence combination," in Maximum-Entropy and Bayesian Methods in Science and Engineering, 1988.
[60] B. R. Cobb and P. P. Shenoy, "A comparison of Bayesian and belief function reasoning," Information Systems Frontiers, vol. 5, no. 4, pp. 345–358, 2003.
[61] H. Wu, M. Siegel, R. Stiefelhagen, and J. Yang, "Sensor fusion using Dempster-Shafer theory," in Proceedings of the 19th IEEE Instrumentation and Measurement Technology Conference (IMTC '02), pp. 7–11, May 2002.

[62] H. Wu, M. Siegel, and S. Ablay, "Sensor fusion using Dempster-Shafer theory II: static weighting and Kalman filter-like dynamic weighting," in Proceedings of the 20th IEEE Instrumentation and Measurement Technology Conference (IMTC '03), pp. 907–912, May 2003.
[63] E. Bosse, P. Valin, A.-C. Boury-Brisset, and D. Grenier, "Exploitation of a priori knowledge for information fusion," Information Fusion, vol. 7, no. 2, pp. 161–175, 2006.
[64] M. Morbee, L. Tessens, H. Aghajan, and W. Philips, "Dempster-Shafer based multi-view occupancy maps," Electronics Letters, vol. 46, no. 5, pp. 341–343, 2010.
[65] C. S. Peirce, Abduction and Induction: Philosophical Writings of Peirce, vol. 156, Dover, New York, NY, USA, 1955.
[66] A. M. Abdelbar, E. A. M. Andrews, and D. C. Wunsch II, "Abductive reasoning with recurrent neural networks," Neural Networks, vol. 16, no. 5-6, pp. 665–673, 2003.
[67] J. R. Aguero and A. Vargas, "Inference of operative configuration of distribution networks using fuzzy logic techniques. Part II: extended real-time model," IEEE Transactions on Power Systems, vol. 20, no. 3, pp. 1562–1569, 2005.
[68] A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, "Content-based image retrieval at the end of the early years," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1349–1380, 2000.
[69] D. S. Friedlander and S. Phoha, "Semantic information fusion for coordinated signal processing in mobile sensor networks," International Journal of High Performance Computing Applications, vol. 16, no. 3, pp. 235–241, 2002.
[70] D. S. Friedlander, "Semantic information extraction," in Distributed Sensor Networks, 2005.
[71] K. Whitehouse, J. Liu, and F. Zhao, "Semantic Streams: a framework for composable inference over sensor data," in Proceedings of the 3rd European Workshop on Wireless Sensor Networks, Lecture Notes in Computer Science, Springer, February 2006.
[72] I. J. Cox, "A review of statistical data association techniques for motion correspondence," International Journal of Computer Vision, vol. 10, no. 1, pp. 53–66, 1993.
[73] C. J. Veenman, M. J. T. Reinders, and E. Backer, "Resolving motion correspondence for densely moving points," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 1, pp. 54–72, 2001.
[74] F. Castanedo, M. A. Patricio, J. García, and J. M. Molina, "Bottom-up/top-down coordination in a multiagent visual sensor network," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS '07), pp. 93–98, September 2007.
[75] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Analysis of distributed fusion alternatives in coordinated vision agents," in Proceedings of the 11th International Conference on Information Fusion (FUSION '08), July 2008.
[76] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Data fusion to improve trajectory tracking in a cooperative surveillance multi-agent architecture," Information Fusion, vol. 11, no. 3, pp. 243–255, 2010.
[77] M. A. Davenport, C. Hegde, M. F. Duarte, and R. G. Baraniuk, "Joint manifolds for data fusion," IEEE Transactions on Image Processing, vol. 19, no. 10, pp. 2580–2594, 2010.



(1) centralized architecture: the fusion node resides in a central processor that receives the raw measurements from all of the sources; the delays and bandwidth involved in transferring this raw data can affect the results in the centralized scheme to a greater degree than in other schemes;

(2) decentralized architecture: a decentralized architecture is composed of a network of nodes in which each node has its own processing capabilities and there is no single point of data fusion. Therefore, each node fuses its local information with the information that is received from its peers; data fusion is performed autonomously, with each node accounting for its local information and the information received from its peers. Decentralized data fusion algorithms typically communicate information using the Fisher and Shannon measurements instead of the object's state [8]. The main disadvantage of this architecture is the communication cost, which is $O(n^2)$ at each communication step, where $n$ is the number of nodes, in the extreme case in which each node communicates with all of its peers. Thus, this type of architecture could suffer from scalability problems when the number of nodes is increased;

(3) distributed architecture: in a distributed architecture, measurements from each source node are processed independently before the information is sent to the fusion node, and the fusion node accounts for the information that is received from the other nodes. In other words, the data association and state estimation are performed in the source node before the information is communicated to the fusion node. Therefore, each node provides an estimation of the object state based on only its local views, and this information is the input to the fusion process, which provides a fused global view. This type of architecture admits different options and variations that range from only one fusion node to several intermediate fusion nodes;

(4) hierarchical architecture: other architectures comprise a combination of decentralized and distributed nodes, generating hierarchical schemes in which the data fusion process is performed at different levels of the hierarchy.

In principle, a decentralized data fusion system is more difficult to implement because of its computation and communication requirements. However, in practice, there is no single best architecture, and the selection of the most appropriate one should depend on the requirements, demand, existing networks, data availability, node processing capabilities, and organization of the data fusion system.

The reader might think that the decentralized and distributed architectures are similar; however, they have meaningful differences (see Figure 4). First, in a distributed architecture, a preprocessing of the obtained measurements is performed, which provides a vector of features as a result (the features are fused thereafter). In contrast, in the decentralized architecture, the complete data fusion process is conducted in each node, and each of the nodes provides a globally fused result. Second, the decentralized fusion algorithms typically communicate information employing the Fisher and Shannon measurements. In contrast, distributed algorithms typically share a common notion of state (position, velocity, and identity) with its associated probabilities, which are used to perform the fusion process [9]. Third, because the decentralized data fusion algorithms exchange information instead of states and probabilities, they have the advantage of easily separating old knowledge from new knowledge: the process is additive, and the order in which the information is received and fused is not relevant. However, in the distributed data fusion algorithms (e.g., a distributed Kalman filter), the states being fused are not associative, and when and how the fused estimates are computed is relevant. Nevertheless, in contrast to the centralized architectures, the distributed algorithms reduce the necessary communication and computational costs because some tasks are computed in the distributed nodes before data fusion is performed in the fusion node.

3. Data Association Techniques

The data association problem consists of determining the set of measurements that correspond to each target (see Figure 5). Let us suppose that there are $O$ targets being tracked by only one sensor in a cluttered environment (by a cluttered environment, we refer to an environment with several targets that are too close to each other). Then, the data association problem can be characterized as follows:

(i) each sensor's observation is received at the fusion node at discrete time intervals;

(ii) the sensor might not provide observations at a specific interval;

(iii) some observations are noise, and other observations originate from the detected targets;

(iv) for any specific target and in every time interval, we do not know (a priori) which observations will be generated by that target.

Therefore, the goal of data association is to establish the set of observations or measurements that are generated by the same target over time. Hall and Llinas [2] provided the following definition of data association: "the process of assigning and computing the weights that relate the observations or tracks (a track can be defined as an ordered set of points that follow a path and are generated by the same target) from one set to the observations or tracks of another set."

As an example of the complexity of the data association problem, if we take a frame-to-frame association and assume that $M$ possible points could be detected in all $n$ frames, then the number of possible track sets is $(M!)^{n-1}$. Note that, of all of these possible solutions, only one set establishes the true movement of the $M$ points.

Data association is often performed before the state estimation of the detected targets. Moreover, it is a key step, because the estimation or classification will behave incorrectly if the data association phase does not work coherently. The data association process can also appear at all of the fusion levels, but the granularity varies depending on the objective of each level.


Figure 4: Classification based on the type of architecture. (The figure depicts the centralized, decentralized, and distributed architectures: sources S1, S2, ..., Sn feed preprocessing, alignment, association, and estimation stages, routed through a fusion node in the centralized and distributed cases, to produce the state of the object.)

In general, an exhaustive search over all possible combinations grows exponentially with the number of targets; thus, the data association problem becomes NP-complete. The most common techniques employed to solve the data association problem are presented in the following sections (Sections 3.1 to 3.7).

3.1. Nearest Neighbors and K-Means. Nearest neighbor (NN) is the simplest data association technique. NN is a well-known clustering algorithm that selects or groups the most similar values. How close one measurement is to another depends on the employed distance metric and typically on the threshold established by the designer. In general, the employed criteria could be based on (1) an absolute distance, (2) the Euclidean distance, or (3) a statistical function of the distance.

NN is a simple algorithm that can find a feasible (approximate) solution in a small amount of time. However, in a cluttered environment, it could provide many pairs that have the same probability and could thus produce undesirable error propagation [10].

Figure 5: Conceptual overview of the data association process with multiple sensors (S1, S2, ..., Sn), multiple targets, observations (y1, y2, ..., yn), tracks, and false alarms. It is necessary to establish the set of observations generated over time by the same object, which forms a track.

Moreover, this algorithm performs poorly in environments in which false measurements are frequent, that is, in highly noisy environments.

The all-neighbors approach uses a similar technique in which all of the measurements inside a region are included in the tracks. The K-Means method [11] is a well-known modification of the NN algorithm. K-Means divides the dataset values into $K$ different clusters, and the algorithm finds the best localization of the cluster centroids, where best means a centroid that lies at the center of its data cluster. K-Means is an iterative algorithm that can be divided into the following steps:

(1) obtain the input data and the number of desired clusters ($K$);

(2) randomly assign the centroid of each cluster;

(3) match each data point with the centroid of the closest cluster;

(4) move each cluster center to the centroid of its cluster;

(5) if the algorithm has not converged, return to step (3).

K-Means is a popular algorithm that has been widely employed; however, it has the following disadvantages:

(i) the algorithm does not always find the optimal solution for the cluster centers;

(ii) the number of clusters must be known a priori, and one must assume that this number is the optimum;

(iii) the algorithm assumes that the covariance of the dataset is irrelevant or that it has already been normalized.

There are several options for overcoming these limitations. For the first, it is possible to execute the algorithm several times and to retain the solution with the least variance. For the second, it is possible to start with a low value of $K$ and increase it until an adequate result is obtained. The third limitation can easily be overcome by multiplying the data by the inverse of the covariance matrix.

Many variations of Lloyd's basic K-Means algorithm [11], which has a computational upper-bound cost of $O(Kn)$, where $n$ is the number of input points and $K$ is the number of desired clusters, have been proposed. Some algorithms modify the initial cluster assignments to improve the separations and reduce the number of iterations. Others introduce soft or multinomial clustering assignments using fuzzy logic, probabilistic, or Bayesian techniques. However, most of these variations must still perform several iterations through the data space to converge to a reasonable solution, which becomes a major disadvantage in several real-time applications. A new approach that is based on having a large (but still affordable) number of cluster candidates compared to the desired $K$ clusters is currently gaining attention. The idea behind this computational model is that the algorithm builds a good sketch of the original data while significantly reducing the dimensionality of the input space. In this manner, a weighted K-Means can be applied to the large set of candidate clusters to derive a good clustering of the original data. Using this idea, [12] presented an efficient and scalable K-Means algorithm that is based on random projections. This algorithm requires only one pass through the input data to build the clusters. More specifically, if the input data distribution holds some separability requirements, then the number of required candidate clusters grows only according to $O(\log n)$, where $n$ is the number of observations in the original data. This salient feature makes the algorithm scalable in terms of both the memory and computational requirements.

3.2. Probabilistic Data Association. The probabilistic data association (PDA) algorithm was proposed by Bar-Shalom and Tse [13] and is also known as the modified filter of all neighbors. This algorithm assigns an association probability to each hypothesis originating from a valid measurement of a target. A valid measurement is an observation that falls inside the validation gate of the target at that time instant. The validation gate $\gamma$, centered around the predicted measurement of the target, is used to select the set of valid measurements and is defined by

$$\gamma \geq \left( z(k) - \hat{z}(k \mid k-1) \right)^T S^{-1}(k) \left( z(k) - \hat{z}(k \mid k-1) \right) \qquad (1)$$

where $k$ is the temporal index, $S(k)$ is the innovation covariance, and $\gamma$ determines the gating or window size. The set of valid measurements at time instant $k$ is defined as

$$Z(k) = \{ z_i(k),\ i = 1, \ldots, m_k \} \qquad (2)$$

where $z_i(k)$ is the $i$th measurement in the validation region at time instant $k$. We give the standard equations of the PDA algorithm next. For the state prediction, consider

$$\hat{x}(k \mid k-1) = F(k-1)\, \hat{x}(k-1 \mid k-1) \qquad (3)$$

where $F(k-1)$ is the transition matrix at time instant $k-1$. To calculate the measurement prediction, consider

$$\hat{z}(k \mid k-1) = H(k)\, \hat{x}(k \mid k-1) \qquad (4)$$

where $H(k)$ is the linearized measurement matrix. To compute the gain, or innovation, of the $i$th measurement, consider

$$v_i(k) = z_i(k) - \hat{z}(k \mid k-1) \qquad (5)$$

To calculate the covariance prediction, consider

$$\hat{P}(k \mid k-1) = F(k-1)\, \hat{P}(k-1 \mid k-1)\, F(k-1)^T + Q(k) \qquad (6)$$

where $Q(k)$ is the process noise covariance matrix. To compute the innovation covariance ($S$) and the Kalman gain ($K$), consider

$$S(k) = H(k)\, \hat{P}(k \mid k-1)\, H(k)^T + R$$

$$K(k) = \hat{P}(k \mid k-1)\, H(k)^T S(k)^{-1} \qquad (7)$$

To obtain the covariance update in the case in which the measurements originated by the target are known, consider

$$P_0(k \mid k) = \hat{P}(k \mid k-1) - K(k)\, S(k)\, K(k)^T \qquad (8)$$

The total update of the covariance is computed as

$$v(k) = \sum_{i=1}^{m_k} \beta_i(k)\, v_i(k)$$

$$P(k) = K(k) \left[ \sum_{i=1}^{m_k} \beta_i(k)\, v_i(k)\, v_i(k)^T - v(k)\, v(k)^T \right] K^T(k) \qquad (9)$$

where $m_k$ is the number of valid measurements at instant $k$. The equation to update the estimated state, which is formed by the position and velocity, is given by

$$\hat{x}(k \mid k) = \hat{x}(k \mid k-1) + K(k)\, v(k) \qquad (10)$$

Finally, the association probabilities of PDA are as follows:

$$\beta_i(k) = \frac{p_i(k)}{\sum_{j=0}^{m_k} p_j(k)} \qquad (11)$$

where

$$p_i(k) = \begin{cases} (2\pi)^{M/2}\, \lambda \sqrt{|S(k)|}\, \dfrac{1 - P_d P_g}{P_d}, & \text{if } i = 0 \\ \exp\left[ -\dfrac{1}{2}\, v_i^T(k)\, S^{-1}(k)\, v_i(k) \right], & \text{if } 1 \leq i \leq m_k \\ 0, & \text{otherwise} \end{cases} \qquad (12)$$

where $M$ is the dimension of the measurement vector, $\lambda$ is the density of the clutter environment, $P_d$ is the detection probability of the correct measurement, and $P_g$ is the validation probability of a detected value.

In the PDA algorithm, the state estimation of the target is computed as a weighted sum of the estimated states under all of the hypotheses. The algorithm can associate different measurements with one specific target; this association helps PDA to estimate the target state, and the association probabilities are used as weights. The main disadvantages of the PDA algorithm are the following:

(i) loss of tracks: because PDA ignores the interference with other targets, it can wrongly classify the closest tracks; therefore, it performs poorly when targets are close to each other or cross paths;

(ii) suboptimal Bayesian approximation: when the source of the information is uncertain, PDA is a suboptimal Bayesian approximation to the association problem;

(iii) one target: PDA was initially designed for the association of one target in a low-clutter environment. The number of false alarms is typically modeled with a Poisson distribution, and the false alarms are assumed to be distributed uniformly in space; PDA behaves incorrectly when there are multiple targets because this false alarm model no longer holds;

(iv) track management: because PDA assumes that the track is already established, algorithms must be provided for track initialization and track deletion.

PDA is mainly suited to tracking targets that do not make abrupt changes in their movement patterns; PDA will most likely lose a target that maneuvers abruptly.

3.3. Joint Probabilistic Data Association. Joint probabilistic data association (JPDA) is a suboptimal approach for tracking multiple targets in cluttered environments [14]. JPDA is similar to PDA, with the difference that the association probabilities are computed using all of the observations and all of the targets. Thus, in contrast to PDA, JPDA considers various hypotheses together and combines them. JPDA determines the probability $\beta_i^t(k)$ that measurement $i$ originated from target $t$, accounting for the fact that, under this hypothesis, the measurement cannot be generated by other targets. Therefore, for a known number of targets, it evaluates the different measurement-target association options (for the most recent set of measurements) and combines them into the corresponding state estimation. If the association probabilities are known, then the Kalman filter updating equation of track $t$ can be written as

$$\hat{x}^t(k \mid k) = \hat{x}^t(k \mid k-1) + K(k)\, v^t(k) \qquad (13)$$

where $\hat{x}^t(k \mid k)$ and $\hat{x}^t(k \mid k-1)$ are the estimation and prediction of target $t$, and $K(k)$ is the filter gain. The weighted sum of the residuals associated with the $m(k)$ observations of target $t$ is as follows:

$$v^t(k) = \sum_{i=1}^{m(k)} \beta_i^t(k)\, v_i^t(k) \qquad (14)$$

where $v_i^t = z_i(k) - H \hat{x}^t(k \mid k-1)$. Therefore, this method incorporates all of the observations (inside the neighborhood of the target's predicted position) to update the estimated position by using a posterior probability that is a weighted sum of residuals.

The main restrictions of JPDA are the following:

(i) a measurement cannot come from more than one target;

(ii) two measurements cannot originate from the same target (at one time instant);

(iii) the sum of the probabilities of all of the measurements assigned to one target must be 1: $\sum_{i=0}^{m(k)} \beta_i^t(k) = 1$.

The main disadvantages of JPDA are the following:

(i) it requires an explicit mechanism for track initialization; similar to PDA, JPDA cannot initialize new tracks or remove tracks that leave the observation area;

(ii) JPDA is computationally expensive when it is applied in environments with multiple targets, because the number of hypotheses grows exponentially with the number of targets.

In general, JPDA is more appropriate than MHT in situations in which the density of false measurements is high (e.g., sonar applications).

3.4. Multiple Hypothesis Test. The underlying idea of the multiple hypothesis test (MHT) is to use more than two consecutive observations to make an association with better results. Algorithms that use only two consecutive observations have a higher probability of generating an erroneous association. In contrast to PDA and JPDA, MHT evaluates all of the possible hypotheses and maintains the new hypotheses generated in each iteration.

MHT was developed to track multiple targets in cluttered environments; as a result, it combines the data association problem and tracking into a unified framework, becoming an estimation technique as well. The Bayes rule or Bayesian networks are commonly employed to calculate the MHT hypothesis probabilities. In general, researchers have claimed that MHT outperforms JPDA for lower densities of false positives. However, the main disadvantage of MHT is its computational cost when the number of tracks or false positives increases. Pruning the hypothesis tree using a window can mitigate this limitation.

The Reid [15] tracking algorithm is considered the standard MHT algorithm, but the initial integer programming formulation of the problem is due to Morefield [16]. MHT is an iterative algorithm in which each iteration starts with a set of correspondence hypotheses. Each hypothesis is a collection of disjoint tracks, and the prediction of each target in the next time instant is computed for each hypothesis. Next, the predictions are compared with the new observations by using a distance metric. The set of associations established in each hypothesis (based on the distance) introduces new hypotheses in the next iteration. Each new hypothesis represents a new set of tracks that is based on the current observations.

Note that each new measurement could come from (i) a new target in the visual field of view, (ii) a target being tracked, or (iii) noise in the measurement process. It is also possible that a measurement is not assigned to a target, either because the target disappears or because it is not possible to obtain a target measurement at that time instant.

MHT maintains several correspondence hypotheses for each target in each frame. If the set of hypotheses at instant $k$ is represented by $H(k) = [h_l(k), l = 1, \ldots, n]$, then the probability of the hypothesis $h_l(k)$ can be represented recursively, using the Bayes rule, as follows:

$$P(h_l(k) \mid Z(k)) = P(h_g(k-1), a_i(k) \mid Z(k)) = \frac{1}{c}\, P(Z(k) \mid h_g(k-1), a_i(k)) \cdot P(a_i(k) \mid h_g(k-1)) \cdot P(h_g(k-1)) \qquad (15)$$

where $h_g(k-1)$ is hypothesis $g$ of the complete set up to time instant $k-1$, $a_i(k)$ is the $i$th possible association of the track to the object, $Z(k)$ is the set of detections of the current frame, and $c$ is a normalization constant.

The first term on the right side of the previous equation is the likelihood function of the measurement set $Z(k)$ given the joint likelihood and current hypothesis. The second term is the probability of the association hypothesis of the current data given the previous hypothesis $h_g(k-1)$. The third term is the probability of the previous hypothesis from which the current hypothesis is calculated.

The MHT algorithm has the ability to detect a new track while maintaining the hypothesis tree structure. The probability of a true track is given by the Bayes decision model as

$$P(\lambda \mid Z) = \frac{P(Z \mid \lambda)\, P_\circ(\lambda)}{P(Z)} \qquad (16)$$

119875 (119885) (16)

where 119875(119885 | 120582) is the probability of obtaining the set ofmeasurements 119885 given 120582 119875

∘(120582) is the a priori probability of

the source signal and 119875(119885) is the probability of obtaining theset of detections 119885

MHT considers all of the possibilities, including track maintenance as well as the initialization and removal of tracks, in an integrated framework. MHT calculates the possibility of having an object after the generation of a set of measurements using an exhaustive approach, and the algorithm does not assume a fixed number of targets. The key challenge of MHT is effective hypothesis management. The baseline MHT algorithm can be extended as follows: (i) use hypothesis aggregation for missed targets, births, cardinality tracking, and closely spaced objects; (ii) apply a multistage MHT to improve the performance and robustness in challenging settings; and (iii) use a feature-aided MHT for extended object surveillance.

The main disadvantage of this algorithm is the computational cost, which grows exponentially with the number of tracks and measurements. Therefore, the practical implementation of this algorithm is limited because it is exponential in both time and memory.

With the aim of reducing the computational cost, Streit and Luginbuhl [17] presented a probabilistic MHT algorithm, known as PMHT, in which the associations are considered to be statistically independent random variables and an exhaustive enumeration is avoided. The PMHT algorithm assumes that the number of targets and measurements is known. With the same goal of reducing the computational cost, Cox and Hingorani [18] presented an efficient implementation of the MHT algorithm; theirs was the first version applied to tracking in visual environments. They employed the Murty [19] algorithm to determine the best set of $k$ hypotheses in polynomial time, with the goal of tracking points of interest.

MHT typically performs the tracking process by employing only one characteristic, commonly the position. A Bayesian combination that uses multiple characteristics was proposed by Liggins II et al. [20].

A linear-programming-based relaxation approach to the optimization problem in MHT tracking was proposed independently by Coraluppi et al. [21] and Storms and Spieksma [22]. Joo and Chellappa [23] proposed an association algorithm for tracking multiple targets in visual environments. Their algorithm is based on an MHT modification in which a measurement can be associated with more than one target and several targets can be associated with one measurement. They also proposed a combinatorial optimization algorithm to generate the best set of association hypotheses. Their algorithm always finds the best hypothesis, in contrast to other models, which are approximate. Coraluppi and Carthel [24] presented a generalization of the MHT algorithm using a recursion over hypothesis classes rather than over a single hypothesis. This work has been applied to a special case of the multi-target tracking problem, called cardinality tracking, in which the number of sensor measurements is observed instead of the target states.

3.5. Distributed Joint Probabilistic Data Association. The distributed version of the joint probabilistic data association (JPDA-D) was presented by Chang et al. [25]. In this technique, the estimated state of the target (using two sensors) after being associated is given by

\[
E[x \mid Z^1, Z^2] = \sum_{j=0}^{m_1} \sum_{l=0}^{m_2} E[x \mid \chi^1_j, \chi^2_l, Z^1, Z^2] \cdot P(\chi^1_j, \chi^2_l \mid Z^1, Z^2) \tag{17}
\]

where $m_i$, $i = 1, 2$, is the last set of measurements of sensors 1 and 2, $Z^i$, $i = 1, 2$, is the set of accumulative data, and $\chi$ is the association hypothesis. The first term of the right side of the equation is calculated from the associations that were made earlier. The second term is computed from the individual association probabilities as follows:

\[
P(\chi^1_j, \chi^2_l \mid Z^1, Z^2) = \sum_{\chi^1} \sum_{\chi^2} P(\chi^1, \chi^2 \mid Z^1, Z^2)\, \mathbb{1}^1_j(\chi^1)\, \mathbb{1}^2_l(\chi^2),
\]
\[
P(\chi^1, \chi^2 \mid Z^1, Z^2) = \frac{1}{c}\, P(\chi^1 \mid Z^1)\, P(\chi^2 \mid Z^2)\, \gamma(\chi^1, \chi^2) \tag{18}
\]

where the $\chi^i$ are the joint hypotheses involving all of the measurements and all of the objectives, and $\mathbb{1}^i_j(\chi^i)$ are the binary indicators of the measurement-target association. The additional term $\gamma(\chi^1, \chi^2)$ depends on the correlation of the individual hypotheses and reflects the localization influence of the current measurements in the joint hypotheses.

These equations are obtained by assuming that communication occurs after every observation; when communication is sporadic or a substantial amount of noise occurs, only approximations are available. Therefore, this algorithm is a theoretical model that has some limitations in practical applications.
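A minimal sketch of the fused estimate in (17), assuming the conditional estimates and the joint association probabilities have already been computed (all names and values are illustrative):

```python
import numpy as np

def jpda_d_estimate(cond_estimates, assoc_probs):
    """Fused state E[x | Z1, Z2] as in (17): a sum of the conditional
    estimates over all joint hypothesis pairs (j, l), weighted by
    P(chi1_j, chi2_l | Z1, Z2)."""
    m1, m2 = assoc_probs.shape
    x = np.zeros_like(cond_estimates[0][0], dtype=float)
    for j in range(m1):
        for l in range(m2):
            x += assoc_probs[j, l] * cond_estimates[j][l]
    return x

# Two association hypotheses per sensor over a 2D state:
est = [[np.array([1.0, 2.0]), np.array([1.2, 2.1])],
       [np.array([0.9, 1.8]), np.array([1.1, 2.0])]]
probs = np.array([[0.4, 0.1], [0.2, 0.3]])   # must sum to 1
print(jpda_d_estimate(est, probs))
```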

3.6. Distributed Multiple Hypothesis Test. The distributed version of the MHT algorithm (MHT-D) [26, 27] follows a similar structure as the JPDA-D algorithm. Let us assume the case in which one node must fuse two sets of hypotheses and tracks. If the hypothesis and track sets are represented by $H^i(Z^i)$ and $T^i(Z^i)$, with $i = 1, 2$, the hypothesis probabilities by $P(\lambda^i_j)$, and the state distributions of the tracks $\tau^i_j$ by $P(x \mid Z^i, \tau^i_j)$, then the maximum available information in the fusion node is $Z = Z^1 \cup Z^2$. The data fusion objective of the MHT-D is to obtain the set of hypotheses $H(Z)$, the set of tracks $T(Z)$, the hypothesis probabilities $P(\lambda \mid Z)$, and the state distribution $p(x \mid Z, \tau)$ for the observed data.

The MHT-D algorithm is composed of the following steps:

(1) hypothesis formation: for each hypothesis pair $\lambda^1_j$ and $\lambda^2_k$ that could be fused, a track $\tau$ is formed by associating the pair of tracks $\tau^1_j$ and $\tau^2_k$, where each pair comes from one node and could originate from the same target. The final result of this stage is a set of hypotheses, denoted by $H(Z)$, and the fused tracks $T(Z)$;

(2) hypothesis evaluation: in this stage, the association probability of each hypothesis and the estimated state of each fused track are obtained. The distributed estimation algorithm is employed to calculate the likelihood of the possible associations and the estimations obtained at each specific association.

Using the information model, the probability of each fused hypothesis is given by

\[
P(\lambda \mid Z) = C^{-1} \prod_{j \in J} P(\lambda^{(j)} \mid Z^{(j)})^{\alpha(j)} \prod_{\tau \in \lambda} L(\tau \mid Z) \tag{19}
\]

where $C$ is a normalizing constant and $L(\tau \mid Z)$ is the likelihood of each hypothesis pair.

The main disadvantage of the MHT-D is its high computational cost, which is on the order of $O(n^M)$, where $n$ is the number of possible associations and $M$ is the number of variables to be estimated.

3.7. Graphical Models. Graphical models are a formalism for representing and reasoning with probabilities and independence. A graphical model represents a conditional decomposition of the joint probability. A graphical model can be represented as a graph in which the nodes denote random variables, the edges denote the possible dependence between the random variables, and the plates denote the replication of a substructure with the appropriate indexing of the relevant variables. The graph captures the joint distribution over the random variables, which can be decomposed into a product of factors that each depend on only a subset of variables. There are two major classes of graphical models: (i) the Bayesian networks [28], which are also known as directed graphical models, and (ii) the Markov random fields, which are also known as undirected graphical models. The directed graphical models are useful for expressing causal relationships between random variables, whereas undirected models are better suited for expressing soft constraints between random variables. We refer the reader to the book of Koller and Friedman [29] for more information on graphical models.

A framework based on graphical models can solve the problem of distributed data association in synchronized sensor networks with overlapping areas and where each sensor receives noisy measurements; this solution was proposed by Chen et al. [30, 31]. Their work is based on graphical models that are used to represent the statistical dependence between random variables. The data association problem is treated as an inference problem and solved by using the max-product algorithm [32]. Graphical models represent statistical dependencies between variables as graphs, and the max-product algorithm converges when the graph is a tree structure. Moreover, the employed algorithm could be implemented in a distributed manner by exchanging messages between the source nodes in parallel. With this algorithm, if each sensor has $n$ possible combinations of associations and there are $M$ variables to be estimated, it has a complexity of $O(n^2 M)$, which is reasonable and less than the $O(n^M)$ complexity of the MHT-D algorithm. However, special attention must be given to the correlated variables when building the graphical model.
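For intuition, the toy sketch below (all potentials invented) runs max-product message passing on a three-variable chain, the tree-structured case in which the algorithm is exact; in the sensor-network setting these messages would be exchanged between nodes:

```python
import numpy as np

# Chain x1 - x2 - x3 with binary states; psi are node potentials and
# `pair` is a shared pairwise potential favoring equal neighboring states.
psi = [np.array([0.6, 0.4]), np.array([0.5, 0.5]), np.array([0.3, 0.7])]
pair = np.array([[0.9, 0.1],
                 [0.1, 0.9]])

# Forward messages: m_{1->2}(x2) = max_{x1} psi1(x1) * pair(x1, x2), etc.
m12 = np.max(psi[0][:, None] * pair, axis=0)
m23 = np.max((psi[1] * m12)[:, None] * pair, axis=0)

# Backtrack from the root to decode the maximizing (MAP) assignment.
x3 = int(np.argmax(psi[2] * m23))
x2 = int(np.argmax(psi[1] * m12 * pair[:, x3]))
x1 = int(np.argmax(psi[0] * pair[:, x2]))
print("MAP assignment:", x1, x2, x3)
```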

4. State Estimation Methods

State estimation techniques aim to determine the state of the target under movement (typically the position) given the observations or measurements. State estimation techniques are also known as tracking techniques. In their general form, it is not guaranteed that the target observations are relevant, which means that some of the observations could actually come from the target and others could be only noise. The state estimation phase is a common stage in data fusion algorithms because the target's observations could come from different sensors or sources, and the final goal is to obtain a global target state from the observations.

The estimation problem involves finding the values of the vector state (e.g., position, velocity, and size) that fit as much as possible with the observed data. From a mathematical perspective, we have a set of redundant observations, and the goal is to find the set of parameters that provides the best fit to the observed data. In general, these observations are corrupted by errors and by the propagation of noise in the measurement process. State estimation methods fall under level 1 of the JDL classification and can be divided into two broader groups:

(1) linear dynamics and measurements: here, the estimation problem has a standard solution. Specifically, when the equations of the object state and of the measurements are linear, the noise follows the Gaussian distribution, and the environment is not cluttered, the optimal theoretical solution is based on the Kalman filter;

(2) nonlinear dynamics: the state estimation problem becomes difficult, and there is no analytical solution that solves the problem in a general manner. In principle, there are no practical algorithms available to solve this problem satisfactorily.

Most of the state estimation methods are based on control theory and employ the laws of probability to compute a vector state from a vector measurement or a stream of vector measurements. The most common estimation methods are presented next, including maximum likelihood and maximum posterior (Section 4.1), the Kalman filter (Section 4.2), the particle filter (Section 4.3), the distributed Kalman filter (Section 4.4), the distributed particle filter (Section 4.5), and covariance consistency methods (Section 4.6).

4.1. Maximum Likelihood and Maximum Posterior. The maximum likelihood (ML) technique is an estimation method that is based on probabilistic theory. Probabilistic estimation methods are appropriate when the state variable follows an unknown probability distribution [33]. In the context of data fusion, $x$ is the state that is being estimated, and $z = \{z(1), \dots, z(k)\}$ is a sequence of $k$ previous observations of $x$. The likelihood function $\lambda(x)$ is defined as a probability density function of the sequence of $z$ observations given the true value of the state $x$:

\[
\lambda(x) = p(z \mid x) \tag{20}
\]

The ML estimator finds the value of $x$ that maximizes the likelihood function:

\[
\hat{x}(k) = \arg\max_x\, p(z \mid x) \tag{21}
\]


which can be obtained from the analytical or empirical models of the sensors. This function expresses the probability of the observed data. The main disadvantage of this method in practice is that the analytical or empirical model of the sensor must be known to provide the prior distribution and compute the likelihood function. This method can also systematically underestimate the variance of the distribution, which leads to a bias problem. However, the bias of the ML solution becomes less significant as the number $N$ of data points increases, and the estimated variance approaches the true variance of the distribution that generated the data in the limit $N \to \infty$.

The maximum posterior (MAP) method is based on Bayesian theory. It is employed when the parameter $x$ to be estimated is the output of a random variable that has a known probability density function $p(x)$. In the context of data fusion, $x$ is the state that is being estimated, and $z = \{z(1), \dots, z(k)\}$ is a sequence of $k$ previous observations of $x$. The MAP estimator finds the value of $x$ that maximizes the posterior probability distribution as follows:

\[
\hat{x}(k) = \arg\max_x\, p(x \mid z) \tag{22}
\]

Both methods (ML and MAP) aim to find the most likely value for the state $x$. However, ML assumes that $x$ is a fixed but unknown point of the parameter space, whereas MAP considers $x$ to be the output of a random variable with a known a priori probability density function. Both methods are equivalent when there is no a priori information about $x$, that is, when there are only observations.
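The contrast between the two estimators is easiest to see in the scalar Gaussian case, where both have closed forms; the values below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
x_true, sigma = 2.0, 1.0            # true state and measurement noise
mu0, sigma0 = 0.0, 0.5              # a priori density p(x) = N(mu0, sigma0^2)
z = x_true + sigma * rng.standard_normal(10)   # k = 10 observations

# ML (21): argmax_x p(z | x) reduces to the sample mean.
x_ml = z.mean()

# MAP (22): argmax_x p(x | z) weights prior and likelihood by their precisions.
prec = len(z) / sigma**2 + 1 / sigma0**2
x_map = (z.sum() / sigma**2 + mu0 / sigma0**2) / prec

print(x_ml, x_map)   # MAP is pulled toward mu0; they agree as the prior flattens
```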

4.2. The Kalman Filter. The Kalman filter is the most popular estimation technique. It was originally proposed by Kalman [34] and has been widely studied and applied since then. The Kalman filter estimates the state $x$ of a discrete-time process governed by the following space-time model:

\[
x(k+1) = \Phi(k)\, x(k) + G(k)\, u(k) + w(k) \tag{23}
\]

with the observations or measurements $z$ at time $k$ of the state $x$ represented by

\[
z(k) = H(k)\, x(k) + v(k) \tag{24}
\]

where $\Phi(k)$ is the state transition matrix, $G(k)$ is the input transition matrix, $u(k)$ is the input vector, $H(k)$ is the measurement matrix, and $w$ and $v$ are random Gaussian variables with zero mean and covariance matrices $Q(k)$ and $R(k)$, respectively. Based on the measurements and on the system parameters, the estimation of $x(k)$, which is represented by $\hat{x}(k)$, and the prediction of $x(k+1)$, which is represented by $\hat{x}(k+1 \mid k)$, are given by

\[
\hat{x}(k) = \hat{x}(k \mid k-1) + K(k)\,[z(k) - H(k)\,\hat{x}(k \mid k-1)],
\]
\[
\hat{x}(k+1 \mid k) = \Phi(k)\,\hat{x}(k \mid k) + G(k)\,u(k) \tag{25}
\]

respectively, where $K$ is the filter gain, determined by

\[
K(k) = P(k \mid k-1)\,H^T(k)\,\left[H(k)\,P(k \mid k-1)\,H^T(k) + R(k)\right]^{-1} \tag{26}
\]

where $P(k \mid k-1)$ is the prediction covariance matrix, which can be determined by

\[
P(k+1 \mid k) = \Phi(k)\,P(k)\,\Phi^T(k) + Q(k) \tag{27}
\]

with

\[
P(k) = P(k \mid k-1) - K(k)\,H(k)\,P(k \mid k-1) \tag{28}
\]
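A compact sketch of one update/prediction cycle built directly from (25)-(28), applied to a hypothetical 1D constant-velocity target (the noise levels and measurements are invented):

```python
import numpy as np

def kalman_step(x, P, z, Phi, H, Q, R):
    """Update with z using the gain (26) and equations (25), (28), then
    predict with (25) and (27); the control input u(k) is omitted here."""
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)     # gain (26)
    x_upd = x + K @ (z - H @ x)                      # update (25)
    P_upd = P - K @ H @ P                            # covariance (28)
    x_pred = Phi @ x_upd                             # prediction (25)
    P_pred = Phi @ P_upd @ Phi.T + Q                 # covariance (27)
    return x_pred, P_pred

dt = 1.0
Phi = np.array([[1.0, dt], [0.0, 1.0]])              # state: [position, velocity]
H = np.array([[1.0, 0.0]])                           # only position is measured
Q, R = 0.01 * np.eye(2), np.array([[0.5]])
x, P = np.zeros(2), np.eye(2)
for z in [1.0, 2.1, 2.9, 4.2]:                       # noisy position measurements
    x, P = kalman_step(x, P, np.array([z]), Phi, H, Q, R)
print("predicted state:", x)
```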

The Kalman filter is mainly employed to fuse low-level data. If the system can be described with a linear model and the error can be modeled as Gaussian noise, then the recursive Kalman filter obtains optimal statistical estimations [35]. However, other methods are required to address nonlinear dynamic models and nonlinear measurements. The modified Kalman filter known as the extended Kalman filter (EKF) is an optimal approach for implementing nonlinear recursive filters [36]. The EKF is one of the most often employed methods for fusing data in robotic applications. However, it has some disadvantages because the computations of the Jacobians are extremely expensive. Some attempts have been made to reduce the computational cost, such as linearization, but these attempts introduce errors in the filter and make it unstable.

The unscented Kalman filter (UKF) [37] has gained popularity because it does not have the linearization step and the associated errors of the EKF [38]. The UKF employs a deterministic sampling strategy to establish a minimum set of points around the mean. This set of points completely captures the true mean and covariance. The points are then propagated through the nonlinear functions, and the covariance of the estimations can be recovered. Another advantage of the UKF is its ability to be employed in parallel implementations.
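The core of the UKF is the unscented transform; a basic symmetric sigma-point version (scaling parameterizations vary across the literature, so this is only one common choice) can be sketched as:

```python
import numpy as np

def unscented_transform(mean, cov, f, kappa=1.0):
    """Propagate a Gaussian (mean, cov) through a nonlinear function f using
    2n+1 deterministically chosen sigma points, then recover mean and cov."""
    n = len(mean)
    L = np.linalg.cholesky((n + kappa) * cov)
    sigma = [mean] + [mean + L[:, i] for i in range(n)] \
                   + [mean - L[:, i] for i in range(n)]
    w = np.full(2 * n + 1, 1.0 / (2 * (n + kappa)))
    w[0] = kappa / (n + kappa)
    y = np.array([f(s) for s in sigma])
    y_mean = w @ y
    y_cov = sum(wi * np.outer(yi - y_mean, yi - y_mean) for wi, yi in zip(w, y))
    return y_mean, y_cov

# Example: polar-to-Cartesian conversion, a classic nonlinear measurement map.
f = lambda s: np.array([s[0] * np.cos(s[1]), s[0] * np.sin(s[1])])
m, C = unscented_transform(np.array([10.0, 0.3]), np.diag([0.25, 0.01]), f)
```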

4.3. Particle Filter. Particle filters are recursive implementations of the sequential Monte Carlo methods [39]. This method builds the posterior density function using several random samples called particles. Particles are propagated over time through a combination of sampling and resampling steps. At each iteration, the sampling step is employed to discard some particles, increasing the relevance of the regions with a higher posterior probability. In the filtering process, several particles of the same state variable are employed, and each particle has an associated weight that indicates the quality of the particle. Therefore, the estimation is the result of a weighted sum of all of the particles. The standard particle filter algorithm has two phases: (1) the predicting phase and (2) the updating phase. In the predicting phase, each particle is modified according to the existing model, adding random noise to simulate the noise effect. Then, in the updating phase, the weight of each particle is reevaluated using the last available sensor observation, and particles with lower weights are removed. Specifically, a generic particle filter comprises the following steps.


(1) Initialization of the particles:

(i) let $N$ be equal to the number of particles;
(ii) $X^{(i)}(1) = [x(1), y(1), 0, 0]^T$ for $i = 1, \dots, N$.

(2) Prediction step:

(i) for each particle $i = 1, \dots, N$, evaluate the state $(k+1 \mid k)$ of the system using the state at time instant $k$ with the noise of the system at time $k$:

\[
X^{(i)}(k+1 \mid k) = F(k)\,X^{(i)}(k) + (\text{cauchy-distribution-noise})(k) \tag{29}
\]

where $F(k)$ is the transition matrix of the system.

(3) Evaluate the particle weight. For each particle $i = 1, \dots, N$:

(i) compute the predicted observation of the system using the current predicted state and the noise at instant $k$:

\[
\hat{z}^{(i)}(k+1 \mid k) = H(k+1)\,X^{(i)}(k+1 \mid k) + (\text{gaussian-measurement-noise})(k+1) \tag{30}
\]

(ii) compute the likelihood (weights) according to the given distribution:

\[
\text{likelihood}^{(i)} = N\!\left(\hat{z}^{(i)}(k+1 \mid k);\; z(k+1),\; \text{var}\right) \tag{31}
\]

(iii) normalize the weights as follows:

\[
w^{(i)} = \frac{\text{likelihood}^{(i)}}{\sum_{j=1}^{N} \text{likelihood}^{(j)}} \tag{32}
\]

(4) Resampling/selection: multiply particles with higher weights and remove those with lower weights. The current state must be adjusted using the computed weights of the new particles.

(i) Compute the cumulative weights:

\[
\text{CumWt}^{(i)} = \sum_{j=1}^{i} w^{(j)} \tag{33}
\]

(ii) Generate uniformly distributed random variables $U^{(i)} \sim U(0, 1)$, with the number of draws equal to the number of particles.

(iii) Determine which particles should be multiplied and which ones removed.

(5) Propagation phase:

(i) incorporate the new values of the state after the resampling at instant $k$ to calculate the value at instant $k+1$:

\[
x^{(1:N)}(k+1 \mid k+1) = \hat{x}(k+1 \mid k) \tag{34}
\]

(ii) compute the posterior mean:

\[
\hat{x}(k+1) = \text{mean}\left[x^{(i)}(k+1 \mid k+1)\right], \quad i = 1, \dots, N \tag{35}
\]

(iii) repeat steps 2 to 5 for each time instant.
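Putting the steps together, a minimal bootstrap-style sketch (with Gaussian rather than Cauchy system noise, and with all model parameters invented) might look like:

```python
import numpy as np

rng = np.random.default_rng(1)

def particle_filter_step(X, z, F, H, sys_noise, meas_var):
    """One cycle of the generic filter above: predict (29), weight (30)-(32),
    resample (step 4), and return the posterior mean (35)."""
    N = len(X)
    X = (F @ X.T).T + sys_noise * rng.standard_normal(X.shape)   # predict
    z_pred = (H @ X.T).T                                         # predicted obs.
    lik = np.exp(-0.5 * np.sum((z_pred - z) ** 2, axis=1) / meas_var)
    w = lik / lik.sum()                                          # normalize (32)
    X = X[rng.choice(N, size=N, p=w)]                            # resample
    return X, X.mean(axis=0)                                     # posterior mean

F = np.array([[1.0, 1.0], [0.0, 1.0]])     # 1D constant-velocity: [pos, vel]
H = np.array([[1.0, 0.0]])                 # position is observed
X = rng.standard_normal((500, 2))          # step (1): N = 500 particles
for z in [0.9, 2.0, 3.1, 3.9]:
    X, est = particle_filter_step(X, np.array([z]), F, H, 0.1, 0.25)
print("estimated [pos, vel]:", est)
```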

Particle filters are more flexible than the Kalman filters and can cope with nonlinear dependencies and non-Gaussian densities in the dynamic model and in the noise error. However, they have some disadvantages. A large number of particles is required to obtain a small variance in the estimator. It is also difficult to establish the optimal number of particles in advance, and the number of particles affects the computational cost significantly. Earlier versions of particle filters employed a fixed number of particles, but recent studies have started to use a dynamic number of particles [40].

4.4. The Distributed Kalman Filter. The distributed Kalman filter requires a correct clock synchronization between each source, as demonstrated in [41]. In other words, to correctly use the distributed Kalman filter, the clocks of all of the sources must be synchronized. This synchronization is typically achieved by using protocols that employ a shared global clock, such as the network time protocol (NTP). Synchronization problems between clocks have been shown to have an effect on the accuracy of the Kalman filter, producing inaccurate estimations [42].

If the estimations are consistent and the cross covariance is known (or the estimations are uncorrelated), then it is possible to use the distributed Kalman filters [43]. However, the cross covariance must be determined exactly, or the observations must be consistent.

We refer the reader to Liggins II et al. [20] for more details about the Kalman filter in a distributed and hierarchical architecture.

4.5. Distributed Particle Filter. Distributed particle filters have recently gained attention [44–46]. Coates [45] used a distributed particle filter to monitor an environment that could be captured by a Markovian state-space model involving nonlinear dynamics and observations and non-Gaussian noise.

In contrast, earlier attempts to solve out-of-sequence measurements using particle filters are based on regenerating the probability density function at the time instant of the out-of-sequence measurement [47]. In a particle filter, this step entails a large computational cost in addition to the space necessary to store the previous particles. To avoid this problem, Orton and Marrs [48] proposed storing the information on the particles at each time instant, saving the cost of recalculating this information. This technique is close to optimal, and when the delay increases, the result is only slightly affected [49]. However, it requires a very large amount of space to store the state of the particles at each time instant.

4.6. Covariance Consistency Methods: Covariance Intersection/Union. Covariance consistency methods (intersection and union) were proposed by Uhlmann [43] and are general and fault-tolerant frameworks for maintaining covariance means and estimations in a distributed network. These methods do not comprise estimation techniques; instead, they are similar to an estimation fusion technique. The distributed Kalman filter requirement of independent measurements or known cross-covariances is not a constraint with this method.

4.6.1. Covariance Intersection. If the Kalman filter is employed to combine two estimations $(a_1, A_1)$ and $(a_2, A_2)$, then it is assumed that the joint covariance has the following form:

\[
\begin{bmatrix} A_1 & X \\ X^T & A_2 \end{bmatrix} \tag{36}
\]

where the cross-covariance $X$ should be known exactly so that the Kalman filter can be applied without difficulty. Because the computation of the cross-covariances is computationally intensive, Uhlmann [43] proposed the covariance intersection (CI) algorithm.

Let us assume that a joint covariance $M$ can be defined with the diagonal blocks $M_{A_1} > A_1$ and $M_{A_2} > A_2$:

\[
M \geq \begin{bmatrix} A_1 & X \\ X^T & A_2 \end{bmatrix} \tag{37}
\]

for every possible instance of the unknown cross-covariance $X$; then, the components of the matrix $M$ could be employed in the Kalman filter equations to provide a fused estimation $(c, C)$ that is considered consistent. The key point of this method relies on generating a joint covariance matrix $M$ that can represent a useful fused estimation (in this context, useful refers to something with a lower associated uncertainty). In summary, the CI algorithm computes the joint covariance matrix $M$ for which the Kalman filter provides the best fused estimation $(c, C)$ with respect to a fixed measurement of the covariance matrix (i.e., the minimum determinant).

Specific covariance criteria must be established because there is no unique minimum joint covariance in the ordering of the positive semidefinite matrices. Moreover, although the joint covariance is the basis of the formal analysis of the CI algorithm, the actual result is a nonlinear mixture of the information stored in the estimations being fused, following the equation below:

\[
C = \left( w_1 H_1^T A_1^{-1} H_1 + w_2 H_2^T A_2^{-1} H_2 + \cdots + w_n H_n^T A_n^{-1} H_n \right)^{-1},
\]
\[
c = C \left( w_1 H_1^T A_1^{-1} a_1 + w_2 H_2^T A_2^{-1} a_2 + \cdots + w_n H_n^T A_n^{-1} a_n \right) \tag{38}
\]

where $H_i$ is the transformation of the fused state-space estimation to the space of the estimated state $i$. The values of $w$ can be calculated to minimize the covariance determinant using convex optimization packages and semidefinite programming. The result of the CI algorithm has different characteristics compared with the Kalman filter. For example, suppose that two estimations $(a, A)$ and $(b, B)$ are provided and that their covariances are equal, $A = B$. Because the Kalman filter is based on the statistical independence assumption, it produces a fused estimation with covariance $C = (1/2)A$. In contrast, the CI method does not assume independence and thus must remain consistent even in the case in which the estimations are completely correlated, yielding the estimated fused covariance $C = A$. In the case of estimations where $A < B$, the CI algorithm does not take information from the estimation $(b, B)$; thus, the fused result is $(a, A)$.
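A sketch of CI for two estimates with identity observation matrices ($H_i = I$ in (38)), using a crude grid search over the scalar weight where a convex optimizer would normally be used:

```python
import numpy as np

def covariance_intersection(a, A, b, B, grid=100):
    """Fuse (a, A) and (b, B) per (38) with w1 = omega, w2 = 1 - omega,
    choosing omega to minimize det(C)."""
    Ai, Bi = np.linalg.inv(A), np.linalg.inv(B)
    best = None
    for omega in np.linspace(0.0, 1.0, grid + 1):
        C = np.linalg.inv(omega * Ai + (1 - omega) * Bi)
        if best is None or np.linalg.det(C) < best[0]:
            c = C @ (omega * Ai @ a + (1 - omega) * Bi @ b)
            best = (np.linalg.det(C), c, C)
    return best[1], best[2]

a, A = np.array([1.0, 0.0]), np.array([[2.0, 0.0], [0.0, 0.5]])
b, B = np.array([1.2, 0.2]), np.array([[0.5, 0.0], [0.0, 2.0]])
c, C = covariance_intersection(a, A, b, B)   # consistent for any unknown X
```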

Every jointly consistent covariance is sufficient to produce a fused estimation that guarantees consistency. However, it is also necessary to guarantee a lack of divergence. Divergence is avoided in the CI algorithm by choosing a specific measurement (i.e., the determinant) that is minimized in each fusion operation. This measurement represents a nondivergence criterion because the size of the estimated covariance, according to this criterion, will not be incremented.

The application of the CI method guarantees consistency and nondivergence for every sequence of mean- and covariance-consistent estimations. However, this method does not work well when the measurements to be fused are inconsistent.

4.6.2. Covariance Union. CI solves the problem of correlated inputs but not the problem of inconsistent inputs (inconsistent inputs refer to different estimations, each of which has a high accuracy (small variance) but also a large difference from the states of the others); thus, the covariance union (CU) algorithm was proposed to solve the latter [43]. CU addresses the following problem: two estimations $(a_1, A_1)$ and $(a_2, A_2)$ relate to the state of an object and are mutually inconsistent with one another. This issue arises when the difference between the average estimations is larger than the provided covariance. Inconsistent inputs can be detected using the Mahalanobis distance [50] between them, which is defined as

\[
M_d = (a_1 - a_2)^T (A_1 + A_2)^{-1} (a_1 - a_2) \tag{39}
\]

and detecting whether this distance is larger than a given threshold.

The Mahalanobis distance accounts for the covariance information when computing the distance. If the difference between the estimations is high but their covariances are also high, the Mahalanobis distance yields a small value. In contrast, if the difference between the estimations is small but the covariances are small as well, it could produce a larger distance value. A high Mahalanobis distance could indicate that the estimations are inconsistent; however, a specific threshold is necessary, established by the user or learned automatically.
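The inconsistency test in (39) is direct to implement; the threshold below is arbitrary and would be set by the user or learned:

```python
import numpy as np

def mahalanobis_inconsistent(a1, A1, a2, A2, threshold=9.0):
    """Flag two estimates as mutually inconsistent when the Mahalanobis
    distance (39) exceeds a chosen threshold."""
    d = a1 - a2
    m = float(d @ np.linalg.inv(A1 + A2) @ d)
    return m, m > threshold

# Close means with very small covariances can still be inconsistent:
m, flag = mahalanobis_inconsistent(np.array([0.0, 0.0]), 0.01 * np.eye(2),
                                   np.array([0.5, 0.5]), 0.01 * np.eye(2))
print(m, flag)   # distance 25.0 -> inconsistent
```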

The CU algorithm aims to solve the following problem: let us suppose that a filtering algorithm provides two observations with mean and covariance $(a_1, A_1)$ and $(a_2, A_2)$, respectively. It is known that one of the observations is correct and the other is erroneous; however, the identity of the correct estimation is unknown and cannot be determined. In this situation, if both estimations are employed as an input to the Kalman filter, there will be a problem because the Kalman filter only guarantees a consistent output if the observation is updated with a measurement that is consistent with it. In the specific case in which the measurements correspond to the same object but are acquired from two different sensors, the Kalman filter can only guarantee that the output is consistent if it is consistent with both separately. Because it is not possible to know which estimation is correct, the only way to combine the two estimations rigorously is to provide an estimation $(u, U)$ that is consistent with both estimations and obeys the following properties:

\[
U \geq A_1 + (u - a_1)(u - a_1)^T,
\]
\[
U \geq A_2 + (u - a_2)(u - a_2)^T \tag{40}
\]

where some measurement of the size of the matrix $U$ (i.e., the determinant) is minimized.

In other words, the previous equations indicate that if the estimation $(a_1, A_1)$ is consistent, then translating the vector $a_1$ to $u$ requires increasing the covariance by a matrix at least as large as the outer product of $(u - a_1)$ with itself in order to remain consistent. The same situation applies to the measurement $(a_2, A_2)$.

A simple strategy is to choose the mean of one of the measurements as the fused value ($u = a_1$). In this case, the value of $U$ must be chosen such that the estimation is consistent in the worst case (i.e., the correct measurement is $a_2$). However, it is possible to assign $u$ an intermediate value between $a_1$ and $a_2$ to decrease the value of $U$. Therefore, the CU algorithm establishes the mean fused value $u$ that has the least covariance $U$ that is still sufficiently large with respect to the two measurements ($a_1$ and $a_2$) to guarantee consistency.

Because the matrix inequalities presented in the previous equations are convex, convex optimization algorithms must be employed to solve them. The value of $U$ can be computed with the iterative method described by Julier et al. [51]. The obtained covariance could be significantly larger than any of the initial covariances and is an indicator of the existing uncertainty between the initial estimations. One of the advantages of the CU method arises from the fact that the same process can easily be extended to $N$ inputs.

5. Decision Fusion Methods

A decision is typically made based on the knowledge of the perceived situation, which, in the data fusion domain, is provided by many sources. Decision fusion techniques aim to make a high-level inference about the events and activities produced by the detected targets. These techniques often use symbolic information, and the fusion process requires reasoning while accounting for uncertainties and constraints. These methods fall under level 2 (situation assessment) and level 4 (impact assessment) of the JDL data fusion model.

5.1. The Bayesian Methods. Information fusion based on Bayesian inference provides a formalism for combining evidence according to the rules of probability theory. Uncertainty is represented using conditional probability terms that describe beliefs and take on values in the interval [0, 1], where zero indicates a complete lack of belief and one indicates an absolute belief. Bayesian inference is based on the Bayes rule as follows:

\[
P(Y \mid X) = \frac{P(X \mid Y)\, P(Y)}{P(X)} \tag{41}
\]

where the posterior probability $P(Y \mid X)$ represents the belief in the hypothesis $Y$ given the information $X$. This probability is obtained by multiplying the a priori probability of the hypothesis, $P(Y)$, by the probability of having $X$ given that $Y$ is true, $P(X \mid Y)$. The value $P(X)$ is used as a normalizing constant. The main disadvantage of Bayesian inference is that the probabilities $P(X)$ and $P(X \mid Y)$ must be known. To estimate the conditional probabilities, Pan et al. [52] proposed the use of NNs, whereas Coue et al. [53] proposed Bayesian programming.
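For a discrete hypothesis, (41) reduces to a multiply-and-normalize step; the toy two-sensor fusion below (all likelihood values invented) assumes conditionally independent observations:

```python
import numpy as np

prior = np.array([0.5, 0.5])        # P(Y) for Y in {target absent, present}
lik_s1 = np.array([0.2, 0.8])       # P(X1 | Y) for the value observed by sensor 1
lik_s2 = np.array([0.4, 0.9])       # P(X2 | Y) for the value observed by sensor 2

posterior = prior * lik_s1 * lik_s2   # numerator of (41) per hypothesis
posterior /= posterior.sum()          # P(X) acts as the normalizing constant
print(posterior)                      # [0.1, 0.9]: belief after fusing both
```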

Hall and Llinas [54] described the following problems associated with Bayesian inference:

(i) difficulty in establishing the value of a priori probabilities;
(ii) complexity when there are multiple potential hypotheses and a substantial number of events that depend on the conditions;
(iii) the requirement that the hypotheses be mutually exclusive;
(iv) difficulty in describing the uncertainty of the decisions.

5.2. The Dempster-Shafer Inference. The Dempster-Shafer inference is based on the mathematical theory introduced by Dempster [55] and Shafer [56], which generalizes the Bayesian theory. The Dempster-Shafer theory provides a formalism that can be used to represent incomplete knowledge, update beliefs, and combine evidence, and it allows the uncertainty to be represented explicitly [57].

A fundamental concept in Dempster-Shafer reasoning is the frame of discernment, which is defined as follows. Let $\Theta = \{\theta_1, \theta_2, \dots, \theta_N\}$ be the set of all possible states that define the system, and let $\Theta$ be exhaustive and mutually exclusive because the system can be in only one state $\theta_i \in \Theta$, where $1 \leq i \leq N$. The set $\Theta$ is called a frame of discernment because its elements are employed to discern the current state of the system.

The elements of the set $2^\Theta$ are called hypotheses. In the Dempster-Shafer theory, based on the evidence $E$, a probability is assigned to each hypothesis $H \in 2^\Theta$ according to the basic probability assignment, or mass function, $m: 2^\Theta \to [0, 1]$, which satisfies

\[
m(\emptyset) = 0 \tag{42}
\]


Thus, the mass function of the empty set is zero. Furthermore, the mass function of a hypothesis is larger than or equal to zero for all of the hypotheses:

\[
m(H) \geq 0 \quad \forall H \in 2^\Theta \tag{43}
\]

The sum of the mass functions of all of the hypotheses is one:

\[
\sum_{H \in 2^\Theta} m(H) = 1 \tag{44}
\]

To express incomplete beliefs in a hypothesis $H$, the Dempster-Shafer theory defines the belief function $\text{bel}: 2^\Theta \to [0, 1]$ over $\Theta$ as

\[
\text{bel}(H) = \sum_{A \subseteq H} m(A) \tag{45}
\]

where $\text{bel}(\emptyset) = 0$ and $\text{bel}(\Theta) = 1$. The level of doubt in $H$ can be expressed in terms of the belief function by

\[
\text{dou}(H) = \text{bel}(\neg H) = \sum_{A \subseteq \neg H} m(A) \tag{46}
\]

To express the plausibility of each hypothesis, the function $\text{pl}: 2^\Theta \to [0, 1]$ over $\Theta$ is defined as

\[
\text{pl}(H) = 1 - \text{dou}(H) = \sum_{A \cap H \neq \emptyset} m(A) \tag{47}
\]

Intuitively, plausibility indicates that there is less uncertainty in hypothesis $H$ if it is more plausible. The confidence interval $[\text{bel}(H), \text{pl}(H)]$ defines the true belief in hypothesis $H$. To combine the effects of two mass functions $m_1$ and $m_2$, the Dempster-Shafer theory defines the combination rule $m_1 \oplus m_2$ as

\[
(m_1 \oplus m_2)(\emptyset) = 0,
\]
\[
(m_1 \oplus m_2)(H) = \frac{\sum_{X \cap Y = H} m_1(X)\, m_2(Y)}{1 - \sum_{X \cap Y = \emptyset} m_1(X)\, m_2(Y)} \tag{48}
\]
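The combination rule (48) operates directly on sets; a small sketch with invented masses over the frame Θ = {a, b}:

```python
def dempster_combine(m1, m2):
    """Dempster's rule (48): each mass function maps frozenset hypotheses to
    mass; conflicting mass (empty intersections) is renormalized away."""
    combined, conflict = {}, 0.0
    for X, mx in m1.items():
        for Y, my in m2.items():
            inter = X & Y
            if inter:
                combined[inter] = combined.get(inter, 0.0) + mx * my
            else:
                conflict += mx * my
    return {H: v / (1.0 - conflict) for H, v in combined.items()}

m1 = {frozenset('a'): 0.6, frozenset('ab'): 0.4}   # source 1 leans toward a
m2 = {frozenset('b'): 0.3, frozenset('ab'): 0.7}   # source 2 weakly supports b
print(dempster_combine(m1, m2))
# {frozenset({'a'}): ~0.51, frozenset({'b'}): ~0.15, frozenset({'a','b'}): ~0.34}
```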

In contrast to Bayesian inference, a priori probabilities are not required in the Dempster-Shafer inference because they are assigned at the instant that the information is provided. Several studies in the literature have compared the use of Bayesian inference and Dempster-Shafer inference, such as [58–60]. Wu et al. [61] used the Dempster-Shafer theory to fuse information in context-aware environments. This work was extended in [62] to dynamically modify the weights associated with the sensor measurements. Thereby, the fusion mechanism is calibrated according to the recent measurements of the sensors (in cases in which the ground truth is available). In the military domain [63], Dempster-Shafer reasoning is used with a priori information stored in a database to classify military ships. Morbee et al. [64] described the use of the Dempster-Shafer theory to build 2D occupancy maps from several cameras and to evaluate the contribution of subsets of cameras to a specific task. Each task is the observation of an event of interest, and the goal is to assess the validity of a set of hypotheses that are fused using the Dempster-Shafer theory.

5.3. Abductive Reasoning. Abductive reasoning, or inferring the best explanation, is a reasoning method in which a hypothesis is chosen under the assumption that, if it is true, it explains the observed event most accurately [65]. In other words, when an event is observed, the abduction method attempts to find the best explanation.

In the context of probabilistic reasoning, abductive inference finds the posterior ML of the system variables given some observed variables. Abductive reasoning is more a reasoning pattern than a data fusion technique; therefore, different inference methods, such as NNs [66] or fuzzy logic [67], can be employed.

5.4. Semantic Methods. Decision fusion techniques that employ semantic data from different sources as an input could provide more accurate results than those that rely only on single sources. There is a growing interest in techniques that automatically determine the presence of semantic features in videos to bridge the semantic gap [68].

Semantic information fusion is essentially a scheme in which raw sensor data are processed such that the nodes exchange only the resultant semantic information. Semantic information fusion typically covers two phases: (i) building the knowledge and (ii) pattern matching (inference). The first phase (typically offline) incorporates the most appropriate knowledge into semantic information. Then, the second phase (typically online or in real time) fuses the relevant attributes and provides a semantic interpretation of the sensor data [69–71].

Semantic fusion can be viewed as an approach for integrating and translating sensor data into formal languages. The language obtained from the observations of the environment is then compared with similar languages that are stored in the database. The key to this strategy is that similar behaviors represented by formal languages are also semantically similar. This type of method provides savings in the cost of transmission because the nodes need only transmit the formal language structure instead of the raw data. However, a known set of behaviors must be stored in a database in advance, which might be difficult in some scenarios.

6. Conclusions

This paper reviews the most popular methods and techniques for performing data/information fusion. To determine whether the application of data/information fusion methods is feasible, we must evaluate the computational cost of the process and the delay introduced in the communication. A centralized data fusion approach is theoretically optimal when there is no cost of transmission and there are sufficient computational resources; however, this situation typically does not hold in practical applications.

The selection of the most appropriate technique depends on the type of problem and the established assumptions of each technique. Statistical data fusion methods (e.g., PDA, JPDA, MHT, and the Kalman filter) are optimal under specific conditions [72]. First, the assumption that the targets move independently and that the measurements are normally distributed around the predicted position typically does not hold. Second, because the statistical techniques model all of the events as probabilities, they typically have several parameters and a priori probabilities for false measurements and detection errors that are often difficult to obtain (at least in an optimal sense). For example, in the case of the MHT algorithm, specific parameters must be established that are nontrivial to determine and are very sensitive [73]. Moreover, statistical methods that optimize over several frames are computationally intensive, and their complexity typically grows exponentially with the number of targets. For example, in the case of particle filters, several targets can be tracked jointly as a group or individually. If several targets are tracked jointly, the necessary number of particles grows exponentially; therefore, in practice, it is better to track them individually, with the assumption that the targets do not interact between the particles.

In contrast to centralized systems, distributed data fusion methods introduce some challenges into the data fusion process, such as (i) the spatial and temporal alignment of the information, (ii) out-of-sequence measurements, and (iii) data correlation, as reported by Castanedo et al. [74, 75]. The inherent redundancy of distributed systems could be exploited with distributed reasoning techniques and cooperative algorithms to improve the individual node estimations, as reported by Castanedo et al. [76]. In addition to the previous studies, a new trend based on the geometric notion of a low-dimensional manifold is gaining attention in the data fusion community. An example is the work of Davenport et al. [77], which proposes a simple model that captures the correlation between the sensor observations by matching the parameter values for the different obtained manifolds.

Acknowledgments

The author would like to thank Jesús García, Miguel A. Patricio, and James Llinas for their interesting and related discussions on several topics that were presented in this paper.

References

[1] JDL, Data Fusion Lexicon, Technical Panel for C3, F. E. White, San Diego, Calif, USA, Code 420, 1991.
[2] D. L. Hall and J. Llinas, "An introduction to multisensor data fusion," Proceedings of the IEEE, vol. 85, no. 1, pp. 6–23, 1997.
[3] H. F. Durrant-Whyte, "Sensor models and multisensor integration," International Journal of Robotics Research, vol. 7, no. 6, pp. 97–113, 1988.
[4] B. V. Dasarathy, "Sensor fusion potential exploitation-innovative architectures and illustrative applications," Proceedings of the IEEE, vol. 85, no. 1, pp. 24–38, 1997.
[5] R. C. Luo, C.-C. Yih, and K. L. Su, "Multisensor fusion and integration: approaches, applications, and future research directions," IEEE Sensors Journal, vol. 2, no. 2, pp. 107–119, 2002.
[6] J. Llinas, C. Bowman, G. Rogova, A. Steinberg, E. Waltz, and F. White, "Revisiting the JDL data fusion model II," Technical Report, DTIC Document, 2004.
[7] E. P. Blasch and S. Plano, "JDL level 5 fusion model 'user refinement': issues and applications in group tracking," in Proceedings of the Signal Processing, Sensor Fusion, and Target Recognition XI, pp. 270–279, April 2002.
[8] H. F. Durrant-Whyte and M. Stevens, "Data fusion in decentralized sensing networks," in Proceedings of the 4th International Conference on Information Fusion, pp. 302–307, Montreal, Canada, 2001.
[9] J. Manyika and H. Durrant-Whyte, Data Fusion and Sensor Management: A Decentralized Information-Theoretic Approach, Prentice Hall, Upper Saddle River, NJ, USA, 1995.
[10] S. S. Blackman, "Association and fusion of multiple sensor data," in Multitarget-Multisensor Tracking: Advanced Applications, pp. 187–217, Artech House, 1990.
[11] S. Lloyd, "Least squares quantization in PCM," IEEE Transactions on Information Theory, vol. 28, no. 2, pp. 129–137, 1982.
[12] M. Shindler, A. Wong, and A. Meyerson, "Fast and accurate κ-means for large datasets," in Proceedings of the 25th Annual Conference on Neural Information Processing Systems (NIPS '11), pp. 2375–2383, December 2011.
[13] Y. Bar-Shalom and E. Tse, "Tracking in a cluttered environment with probabilistic data association," Automatica, vol. 11, no. 5, pp. 451–460, 1975.
[14] T. E. Fortmann, Y. Bar-Shalom, and M. Scheffe, "Multi-target tracking using joint probabilistic data association," in Proceedings of the 19th IEEE Conference on Decision and Control including the Symposium on Adaptive Processes, vol. 19, pp. 807–812, December 1980.
[15] D. B. Reid, "An algorithm for tracking multiple targets," IEEE Transactions on Automatic Control, vol. 24, no. 6, pp. 843–854, 1979.
[16] C. L. Morefield, "Application of 0-1 integer programming to multitarget tracking problems," IEEE Transactions on Automatic Control, vol. 22, no. 3, pp. 302–312, 1977.
[17] R. L. Streit and T. E. Luginbuhl, "Maximum likelihood method for probabilistic multihypothesis tracking," in Proceedings of the Signal and Data Processing of Small Targets, vol. 2235 of Proceedings of SPIE, p. 394, 1994.
[18] I. J. Cox and S. L. Hingorani, "Efficient implementation of Reid's multiple hypothesis tracking algorithm and its evaluation for the purpose of visual tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 2, pp. 138–150, 1996.
[19] K. G. Murty, "An algorithm for ranking all the assignments in order of increasing cost," Operations Research, vol. 16, no. 3, pp. 682–687, 1968.
[20] M. E. Liggins II, C.-Y. Chong, I. Kadar et al., "Distributed fusion architectures and algorithms for target tracking," Proceedings of the IEEE, vol. 85, no. 1, pp. 95–106, 1997.
[21] S. Coraluppi, C. Carthel, M. Luettgen, and S. Lynch, "All-source track and identity fusion," in Proceedings of the National Symposium on Sensor and Data Fusion, 2000.
[22] P. Storms and F. Spieksma, "An LP-based algorithm for the data association problem in multitarget tracking," in Proceedings of the 3rd IEEE International Conference on Information Fusion, vol. 1, 2000.
[23] S.-W. Joo and R. Chellappa, "A multiple-hypothesis approach for multiobject visual tracking," IEEE Transactions on Image Processing, vol. 16, no. 11, pp. 2849–2854, 2007.
[24] S. Coraluppi and C. Carthel, "Aggregate surveillance: a cardinality tracking approach," in Proceedings of the 14th International Conference on Information Fusion (FUSION '11), July 2011.
[25] K. C. Chang, C. Y. Chong, and Y. Bar-Shalom, "Joint probabilistic data association in distributed sensor networks," IEEE Transactions on Automatic Control, vol. 31, no. 10, pp. 889–897, 1986.
[26] C. Y. Chong, S. Mori, and K. C. Chang, "Information fusion in distributed sensor networks," in Proceedings of the 4th American Control Conference, Boston, Mass, USA, June 1985.
[27] C. Y. Chong, S. Mori, and K. C. Chang, "Distributed multitarget multisensor tracking," in Multitarget-Multisensor Tracking: Advanced Applications, vol. 1, pp. 247–295, 1990.
[28] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, San Mateo, Calif, USA, 1988.
[29] D. Koller and N. Friedman, Probabilistic Graphical Models: Principles and Techniques, MIT Press, 2009.
[30] L. Chen, M. Cetin, and A. S. Willsky, "Distributed data association for multi-target tracking in sensor networks," in Proceedings of the 7th International Conference on Information Fusion (FUSION '05), pp. 9–16, July 2005.
[31] L. Chen, M. J. Wainwright, M. Cetin, and A. S. Willsky, "Data association based on optimization in graphical models with application to sensor networks," Mathematical and Computer Modelling, vol. 43, no. 9-10, pp. 1114–1135, 2006.
[32] Y. Weiss and W. T. Freeman, "On the optimality of solutions of the max-product belief-propagation algorithm in arbitrary graphs," IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 736–744, 2001.
[33] C. Brown, H. Durrant-Whyte, J. Leonard, B. Rao, and B. Steer, "Distributed data fusion using Kalman filtering: a robotics application," in Data Fusion in Robotics and Machine Intelligence, M. A. Abidi and R. C. Gonzalez, Eds., pp. 267–309, 1992.
[34] R. E. Kalman, "A new approach to linear filtering and prediction problems," Journal of Basic Engineering, vol. 82, no. 1, pp. 35–45, 1960.
[35] R. C. Luo and M. G. Kay, "Data fusion and sensor integration: state-of-the-art 1990s," in Data Fusion in Robotics and Machine Intelligence, pp. 7–135, 1992.
[36] G. Welch and G. Bishop, An Introduction to the Kalman Filter, ACM SIGGRAPH 2001 Course Notes, 2001.
[37] S. J. Julier and J. K. Uhlmann, "A new extension of the Kalman filter to nonlinear systems," in Proceedings of the International Symposium on Aerospace/Defense Sensing, Simulation and Controls, vol. 3, 1997.
[38] E. A. Wan and R. Van Der Merwe, "The unscented Kalman filter for nonlinear estimation," in Proceedings of the Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC '00), pp. 153–158, 2000.
[39] D. Crisan and A. Doucet, "A survey of convergence results on particle filtering methods for practitioners," IEEE Transactions on Signal Processing, vol. 50, no. 3, pp. 736–746, 2002.
[40] J. Martinez-del Rincon, C. Orrite-Urunuela, and J. E. Herrero-Jaraba, "An efficient particle filter for color-based tracking in complex scenes," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance, pp. 176–181, 2007.
[41] S. Ganeriwal, R. Kumar, and M. B. Srivastava, "Timing-sync protocol for sensor networks," in Proceedings of the 1st International Conference on Embedded Networked Sensor Systems (SenSys '03), pp. 138–149, November 2003.
[42] M. Manzo, T. Roosta, and S. Sastry, "Time synchronization in networks," in Proceedings of the 3rd ACM Workshop on Security of Ad Hoc and Sensor Networks (SASN '05), pp. 107–116, November 2005.
[43] J. K. Uhlmann, "Covariance consistency methods for fault-tolerant distributed data fusion," Information Fusion, vol. 4, no. 3, pp. 201–215, 2003.
[44] A. S. Bashi, V. P. Jilkov, X. R. Li, and H. Chen, "Distributed implementations of particle filters," in Proceedings of the 6th International Conference of Information Fusion, pp. 1164–1171, 2003.
[45] M. Coates, "Distributed particle filters for sensor networks," in Proceedings of the 3rd International Symposium on Information Processing in Sensor Networks (ACM '04), pp. 99–107, New York, NY, USA, 2004.
[46] D. Gu, "Distributed particle filter for target tracking," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '07), pp. 3856–3861, April 2007.
[47] Y. Bar-Shalom, "Update with out-of-sequence measurements in tracking: exact solution," IEEE Transactions on Aerospace and Electronic Systems, vol. 38, no. 3, pp. 769–778, 2002.
[48] M. Orton and A. Marrs, "A Bayesian approach to multi-target tracking and data fusion with out-of-sequence measurements," IEE Colloquium, no. 174, pp. 151–155, 2001.
[49] M. L. Hernandez, A. D. Marrs, S. Maskell, and M. R. Orton, "Tracking and fusion for wireless sensor networks," in Proceedings of the 5th International Conference on Information Fusion, 2002.
[50] P. C. Mahalanobis, "On the generalized distance in statistics," Proceedings of the National Institute of Sciences of India, vol. 2, no. 1, pp. 49–55, 1936.
[51] S. J. Julier, J. K. Uhlmann, and D. Nicholson, "A method for dealing with assignment ambiguity," in Proceedings of the American Control Conference (AAC '04), vol. 5, pp. 4102–4107, July 2004.
[52] H. Pan, Z.-P. Liang, T. J. Anastasio, and T. S. Huang, "Hybrid NN-Bayesian architecture for information fusion," in Proceedings of the International Conference on Image Processing (ICIP '98), pp. 368–371, October 1998.
[53] C. Coue, T. Fraichard, P. Bessiere, and E. Mazer, "Multi-sensor data fusion using Bayesian programming: an automotive application," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 141–146, October 2002.
[54] D. L. Hall and J. Llinas, Handbook of Multisensor Data Fusion, CRC Press, Boca Raton, Fla, USA, 2001.
[55] A. P. Dempster, "A generalization of Bayesian inference," Journal of the Royal Statistical Society B, vol. 30, no. 2, pp. 205–247, 1968.
[56] G. Shafer, A Mathematical Theory of Evidence, Princeton University Press, Princeton, NJ, USA, 1976.
[57] G. M. Provan, "The validity of Dempster-Shafer belief functions," International Journal of Approximate Reasoning, vol. 6, no. 3, pp. 389–399, 1992.
[58] D. M. Buede, "Shafer-Dempster and Bayesian reasoning: a response to 'Shafer-Dempster reasoning with applications to multisensor target identification systems'," IEEE Transactions on Systems, Man, and Cybernetics, vol. 18, no. 6, pp. 1009–1011, 1988.
[59] Y. Cheng and R. L. Kashyap, "Comparison of Bayesian and Dempster's rules in evidence combination," in Maximum-Entropy and Bayesian Methods in Science and Engineering, 1988.
[60] B. R. Cobb and P. P. Shenoy, "A comparison of Bayesian and belief function reasoning," Information Systems Frontiers, vol. 5, no. 4, pp. 345–358, 2003.
[61] H. Wu, M. Siegel, R. Stiefelhagen, and J. Yang, "Sensor fusion using Dempster-Shafer theory," in Proceedings of the 19th IEEE Instrumentation and Measurement Technology Conference (IMTC '02), pp. 7–11, May 2002.
[62] H. Wu, M. Siegel, and S. Ablay, "Sensor fusion using Dempster-Shafer theory II: static weighting and Kalman filter-like dynamic weighting," in Proceedings of the 20th IEEE Instrumentation and Measurement Technology Conference (IMTC '03), pp. 907–912, May 2003.
[63] E. Bosse, P. Valin, A.-C. Boury-Brisset, and D. Grenier, "Exploitation of a priori knowledge for information fusion," Information Fusion, vol. 7, no. 2, pp. 161–175, 2006.
[64] M. Morbee, L. Tessens, H. Aghajan, and W. Philips, "Dempster-Shafer based multi-view occupancy maps," Electronics Letters, vol. 46, no. 5, pp. 341–343, 2010.
[65] C. S. Peirce, Abduction and Induction: Philosophical Writings of Peirce, vol. 156, Dover, New York, NY, USA, 1955.
[66] A. M. Abdelbar, E. A. M. Andrews, and D. C. Wunsch II, "Abductive reasoning with recurrent neural networks," Neural Networks, vol. 16, no. 5-6, pp. 665–673, 2003.
[67] J. R. Aguero and A. Vargas, "Inference of operative configuration of distribution networks using fuzzy logic techniques. Part II: extended real-time model," IEEE Transactions on Power Systems, vol. 20, no. 3, pp. 1562–1569, 2005.
[68] A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, "Content-based image retrieval at the end of the early years," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1349–1380, 2000.
[69] D. S. Friedlander and S. Phoha, "Semantic information fusion for coordinated signal processing in mobile sensor networks," International Journal of High Performance Computing Applications, vol. 16, no. 3, pp. 235–241, 2002.
[70] D. S. Friedlander, "Semantic information extraction," in Distributed Sensor Networks, 2005.
[71] K. Whitehouse, J. Liu, and F. Zhao, "Semantic Streams: a framework for composable inference over sensor data," in Proceedings of the 3rd European Workshop on Wireless Sensor Networks, Lecture Notes in Computer Science, Springer, February 2006.
[72] I. J. Cox, "A review of statistical data association techniques for motion correspondence," International Journal of Computer Vision, vol. 10, no. 1, pp. 53–66, 1993.
[73] C. J. Veenman, M. J. T. Reinders, and E. Backer, "Resolving motion correspondence for densely moving points," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 1, pp. 54–72, 2001.
[74] F. Castanedo, M. A. Patricio, J. García, and J. M. Molina, "Bottom-up/top-down coordination in a multiagent visual sensor network," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS '07), pp. 93–98, September 2007.
[75] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Analysis of distributed fusion alternatives in coordinated vision agents," in Proceedings of the 11th International Conference on Information Fusion (FUSION '08), July 2008.
[76] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Data fusion to improve trajectory tracking in a cooperative surveillance multi-agent architecture," Information Fusion, vol. 11, no. 3, pp. 243–255, 2010.
[77] M. A. Davenport, C. Hegde, M. F. Duarte, and R. G. Baraniuk, "Joint manifolds for data fusion," IEEE Transactions on Image Processing, vol. 19, no. 10, pp. 2580–2594, 2010.


Page 6: Review Article A Review of Data Fusion Techniquesdownloads.hindawi.com/journals/tswj/2013/704504.pdf · Classification of Data Fusion Techniques Data fusion is a ... to extract features

6 The Scientific World Journal

[Figure 4: Classification based on the type of architecture. Three panels depict the centralized, decentralized, and distributed architectures: sensors S1, S2, ..., Sn feed preprocessing, alignment, association, and estimation stages that produce the state of the object; in the centralized case, these stages reside in a single fusion node.]

In general, an exhaustive search of all possible combinations grows exponentially with the number of targets; thus, the data association problem becomes NP-complete. The most common techniques that are employed to solve the data association problem are presented in the following sections (Sections 3.1 to 3.7).

3.1. Nearest Neighbors and K-Means. Nearest neighbor (NN) is the simplest data association technique. NN is a well-known clustering algorithm that selects or groups the most similar values. How close one measurement is to another depends on the employed distance metric and typically on the threshold that is established by the designer. In general, the employed criteria could be based on (1) an absolute distance, (2) the Euclidean distance, or (3) a statistical function of the distance.

NN is a simple algorithm that can find a feasible (approximate) solution in a small amount of time. However, in a cluttered environment, it could provide many pairs that have the same probability and could thus produce undesirable error propagation [10]. Moreover, this algorithm performs poorly in environments in which false measurements are frequent, that is, in highly noisy environments.

[Figure 5: Conceptual overview of the data association process from multiple sensors (S1, S2, ..., Sn) and multiple targets. The observations y1, y2, ..., yn must be associated over time into the tracks (Track 1, Track 2, ..., Track n) that originate from the same object, while false alarms are discarded.]
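As an illustration, the following minimal Python sketch (the function name, the Euclidean metric, and the gate threshold are assumptions of this example) greedily associates each measurement with the nearest unassigned predicted track position:

```python
import numpy as np

def nn_associate(track_predictions, measurements, gate=5.0):
    """Greedy nearest-neighbor association: each measurement is matched to the
    closest free track prediction if the distance falls within the gate."""
    pairs = []
    free = set(range(len(track_predictions)))
    for j, z in enumerate(measurements):
        if not free:
            break
        dists = {i: np.linalg.norm(z - track_predictions[i]) for i in free}
        i_best = min(dists, key=dists.get)
        if dists[i_best] <= gate:       # criterion (2): Euclidean distance
            pairs.append((i_best, j))
            free.remove(i_best)
    return pairs                        # measurements left unpaired may be clutter
```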

The all-neighbors approach uses a similar technique in which all of the measurements inside a region are included in the tracks.

The K-Means method [11] is a well-known modification of the NN algorithm. K-Means divides the dataset values into $K$ different clusters and finds the best localization of the cluster centroids, where best means a centroid that lies in the center of its data cluster. K-Means is an iterative algorithm that can be divided into the following steps (a minimal sketch follows the list):

(1) obtain the input data and the number of desired clusters ($K$);
(2) randomly assign the centroid of each cluster;
(3) match each data point with the centroid of the nearest cluster;
(4) move the cluster centers to the centroid of their assigned points;
(5) if the algorithm has not converged, return to step (3).
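A compact version of these steps in Python/NumPy might look as follows (the initialization seed and the convergence test are assumptions of this sketch, which also assumes that no cluster becomes empty during the iterations):

```python
import numpy as np

def k_means(points, K, iters=100):
    """Basic Lloyd K-Means, following steps (1)-(5) above."""
    rng = np.random.default_rng(0)
    centroids = points[rng.choice(len(points), K, replace=False)]      # step (2)
    for _ in range(iters):
        # Step (3): assign every point to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step (4): move each centroid to the mean of its assigned points.
        new_centroids = np.array([points[labels == k].mean(axis=0) for k in range(K)])
        if np.allclose(new_centroids, centroids):                      # step (5)
            break
        centroids = new_centroids
    return centroids, labels
```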

K-Means is a popular algorithm that has been widely employed; however, it has the following disadvantages:

(i) the algorithm does not always find the optimal solution for the cluster centers;
(ii) the number of clusters must be known a priori, and one must assume that this number is the optimum;
(iii) the algorithm assumes that the covariance of the dataset is irrelevant or that it has already been normalized.

There are several options for overcoming these limitations. For the first, it is possible to execute the algorithm several times and keep the solution with the least variance. For the second, it is possible to start with a low value of $K$ and increment it until an adequate result is obtained. The third limitation can be easily overcome by multiplying the data by the inverse of the covariance matrix.

Many variations of Lloyd's basic K-Means algorithm [11], which has a computational upper-bound cost of $O(Kn)$, where $n$ is the number of input points and $K$ is the number of desired clusters, have been proposed. Some algorithms modify the initial cluster assignments to improve the separations and reduce the number of iterations. Others introduce soft or multinomial clustering assignments using fuzzy logic, probabilistic, or Bayesian techniques. However, most of these variations must still perform several iterations through the data space to converge to a reasonable solution. This issue becomes a major disadvantage in several real-time applications. A new approach that is based on having a large (but still affordable) number of cluster candidates, compared to the desired $K$ clusters, is currently gaining attention. The idea behind this computational model is that the algorithm builds a good sketch of the original data while significantly reducing the dimensionality of the input space. In this manner, a weighted K-Means can be applied to the large set of candidate clusters to derive a good clustering of the original data. Using this idea, [12] presented an efficient and scalable K-Means algorithm that is based on random projections. This algorithm requires only one pass through the input data to build the clusters. More specifically, if the input data distribution satisfies some separability requirements, then the number of required candidate clusters grows only according to $O(\log n)$, where $n$ is the number of observations in the original data. This salient feature makes the algorithm scalable in terms of both the memory and computational requirements.

3.2. Probabilistic Data Association. The probabilistic data association (PDA) algorithm was proposed by Bar-Shalom and Tse [13] and is also known as the modified filter of all neighbors. This algorithm assigns an association probability to each hypothesis arising from a valid measurement of a target. A valid measurement is an observation that falls in the validation gate of the target at that time instant. The validation gate $\gamma$, which is centered around the predicted measurement of the target, is used to select the set of valid measurements and is defined by

$$\gamma \ge \left(z(k) - \hat{z}(k \mid k-1)\right)^{T} S^{-1}(k) \left(z(k) - \hat{z}(k \mid k-1)\right), \quad (1)$$

where $k$ is the temporal index, $S(k)$ is the innovation covariance, and $\gamma$ determines the gating or window size. The set of valid measurements at time instant $k$ is defined as

$$Z(k) = \left\{ z_{i}(k),\; i = 1, \ldots, m_{k} \right\}, \quad (2)$$


where $z_{i}(k)$ is the $i$th measurement in the validation region at time instant $k$. We give the standard equations of the PDA algorithm next. For the state prediction, consider

$$\hat{x}(k \mid k-1) = F(k-1)\, \hat{x}(k-1 \mid k-1), \quad (3)$$

where $F(k-1)$ is the transition matrix at time instant $k-1$. To calculate the measurement prediction, consider

$$\hat{z}(k \mid k-1) = H(k)\, \hat{x}(k \mid k-1), \quad (4)$$

where $H(k)$ is the linearized measurement matrix. To compute the innovation of the $i$th measurement, consider

$$\nu_{i}(k) = z_{i}(k) - \hat{z}(k \mid k-1). \quad (5)$$

To calculate the covariance prediction, consider

$$P(k \mid k-1) = F(k-1)\, P(k-1 \mid k-1)\, F(k-1)^{T} + Q(k), \quad (6)$$

where $Q(k)$ is the process noise covariance matrix. To compute the innovation covariance ($S$) and the Kalman gain ($K$), consider

$$S(k) = H(k)\, P(k \mid k-1)\, H(k)^{T} + R, \qquad K(k) = P(k \mid k-1)\, H(k)^{T}\, S(k)^{-1}. \quad (7)$$

To obtain the covariance update in the case in which the measurement originated by the target is known, consider

$$P_{0}(k \mid k) = P(k \mid k-1) - K(k)\, S(k)\, K(k)^{T}. \quad (8)$$

The total update of the covariance is computed as

$$\nu(k) = \sum_{i=1}^{m_{k}} \beta_{i}(k)\, \nu_{i}(k), \qquad \tilde{P}(k) = K(k) \left[ \sum_{i=1}^{m_{k}} \beta_{i}(k)\, \nu_{i}(k)\, \nu_{i}(k)^{T} - \nu(k)\, \nu(k)^{T} \right] K^{T}(k), \quad (9)$$

where $m_{k}$ is the number of valid measurements at instant $k$. The equation to update the estimated state, which is formed by the position and velocity, is given by

$$\hat{x}(k \mid k) = \hat{x}(k \mid k-1) + K(k)\, \nu(k). \quad (10)$$

Finally, the association probabilities of PDA are as follows:

$$\beta_{i}(k) = \frac{p_{i}(k)}{\sum_{j=0}^{m_{k}} p_{j}(k)}, \quad (11)$$

where

$$p_{i}(k) = \begin{cases} \dfrac{(2\pi)^{M/2}\, \lambda\, \sqrt{|S(k)|}\, \left(1 - P_{d}\, P_{g}\right)}{P_{d}}, & \text{if } i = 0, \\ \exp\left[ -\dfrac{1}{2}\, \nu_{i}^{T}(k)\, S^{-1}(k)\, \nu_{i}(k) \right], & \text{if } i \neq 0, \\ 0, & \text{in other cases,} \end{cases} \quad (12)$$

where $M$ is the dimension of the measurement vector, $\lambda$ is the density of the clutter environment, $P_{d}$ is the detection probability of the correct measurement, and $P_{g}$ is the validation probability of a detected value.

In the PDA algorithm, the state estimation of the target is computed as a weighted sum of the estimated states under all of the hypotheses. The algorithm can associate different measurements to one specific target; this association of the different measurements to a specific target helps PDA to estimate the target state, and the association probabilities are used as weights. The main disadvantages of the PDA algorithm are the following:

(i) loss of tracks: because PDA ignores the interference with other targets, it can wrongly classify the closest tracks; therefore, it provides poor performance when the targets are close to each other or cross paths;

(ii) suboptimal Bayesian approximation: when the source of information is uncertain, PDA is only a suboptimal Bayesian approximation to the association problem;

(iii) one target: PDA was initially designed for the association of one target in a low-clutter environment; the number of false alarms is typically modeled with a Poisson distribution, and the false alarms are assumed to be distributed uniformly in space; PDA behaves incorrectly when there are multiple targets, because this false alarm model no longer holds;

(iv) track management: because PDA assumes that the track is already established, additional algorithms must be provided for track initialization and track deletion.

PDA is mainly suitable for tracking targets that do not make abrupt changes in their movement patterns; it will most likely lose a target that maneuvers abruptly.
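The following Python/NumPy sketch combines equations (4)-(12) into a single PDA update step (the function signature and default parameter values are assumptions of this example; the covariance update combines equations (8) and (9) in the standard PDA form, and at least one gated measurement is assumed):

```python
import numpy as np

def pda_update(x_pred, P_pred, z_list, H, R, Pd=0.9, Pg=0.99, lam=0.1):
    """One PDA measurement update.  x_pred, P_pred: predicted state and
    covariance; z_list: measurements that passed the validation gate (1)."""
    z_hat = H @ x_pred                                   # eq. (4)
    S = H @ P_pred @ H.T + R                             # eq. (7)
    K = P_pred @ H.T @ np.linalg.inv(S)                  # eq. (7)
    nus = [z - z_hat for z in z_list]                    # eq. (5)
    M = len(z_hat)
    # Hypothesis likelihoods, eq. (12); i = 0 means "no measurement is correct".
    p0 = (2 * np.pi) ** (M / 2) * lam * np.sqrt(np.linalg.det(S)) * (1 - Pd * Pg) / Pd
    S_inv = np.linalg.inv(S)
    p = [p0] + [float(np.exp(-0.5 * nu @ S_inv @ nu)) for nu in nus]
    beta = np.array(p) / sum(p)                          # eq. (11)
    nu_tot = sum(b * nu for b, nu in zip(beta[1:], nus))  # combined innovation, eq. (9)
    x_upd = x_pred + K @ nu_tot                          # eq. (10)
    P0 = P_pred - K @ S @ K.T                            # eq. (8)
    spread = sum(b * np.outer(nu, nu) for b, nu in zip(beta[1:], nus)) - np.outer(nu_tot, nu_tot)
    P_upd = beta[0] * P_pred + (1 - beta[0]) * P0 + K @ spread @ K.T
    return x_upd, P_upd
```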

3.3. Joint Probabilistic Data Association. Joint probabilistic data association (JPDA) is a suboptimal approach for tracking multiple targets in cluttered environments [14]. JPDA is similar to PDA, with the difference that the association probabilities are computed using all of the observations and all of the targets. Thus, in contrast to PDA, JPDA considers various hypotheses together and combines them. JPDA determines the probability $\beta^{t}_{i}(k)$ that measurement $i$ originates from target $t$, accounting for the fact that, under this hypothesis, the measurement cannot be generated by other targets. Therefore, for a known number of targets, it evaluates the different options of the measurement-target association (for the most recent set of measurements) and combines them into the corresponding state estimation. If the association probability is known, then the Kalman filter updating equation of track $t$ can be written as

$$\hat{x}^{t}(k \mid k) = \hat{x}^{t}(k \mid k-1) + K(k)\, \nu^{t}(k), \quad (13)$$

where $\hat{x}^{t}(k \mid k)$ and $\hat{x}^{t}(k \mid k-1)$ are the estimation and prediction of target $t$ and $K(k)$ is the filter gain. The weighted sum of the residuals associated with the $m(k)$ observations of target $t$ is as follows:

$$\nu^{t}(k) = \sum_{i=1}^{m(k)} \beta^{t}_{i}(k)\, \nu^{t}_{i}(k), \quad (14)$$

where $\nu^{t}_{i} = z_{i}(k) - H \hat{x}^{t}(k \mid k-1)$. Therefore, this method incorporates all of the observations (inside the neighborhood of the target's predicted position) to update the estimated position by using a posterior probability that is a weighted sum of residuals.

The main restrictions of JPDA are the following:

(i) a measurement cannot come from more than one target;
(ii) two measurements cannot originate from the same target (at one time instant);
(iii) the sum of all of the measurements' probabilities that are assigned to one target must be one: $\sum_{i=0}^{m(k)} \beta^{t}_{i}(k) = 1$.

The main disadvantages of JPDA are the following:

(i) it requires an explicit mechanism for track initialization; similar to PDA, JPDA cannot initialize new tracks or remove tracks that leave the observation area;
(ii) JPDA is a computationally expensive algorithm when it is applied in environments that have multiple targets, because the number of hypotheses grows exponentially with the number of targets.

In general, JPDA is more appropriate than MHT in situations in which the density of false measurements is high (e.g., sonar applications).
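To make the joint enumeration concrete, the following illustrative sketch (the function name and the single p_miss term, which stands in for the detection and clutter priors of a full JPDA implementation, are assumptions) enumerates all feasible joint association events for a small problem and accumulates the association probabilities $\beta^{t}_{i}(k)$; note how the feasibility test encodes restrictions (i) and (ii), while the normalization enforces restriction (iii):

```python
import itertools
import numpy as np

def jpda_betas(likelihood, p_miss=0.1):
    """likelihood[t][i]: likelihood that measurement i comes from target t.
    Returns beta[t][i] for i = 0..m, where i = 0 means 'no measurement'."""
    n_targets, n_meas = likelihood.shape
    options = range(n_meas + 1)                   # 0 = missed detection
    beta = np.zeros((n_targets, n_meas + 1))
    total = 0.0
    for event in itertools.product(options, repeat=n_targets):
        assigned = [a for a in event if a > 0]
        if len(assigned) != len(set(assigned)):   # a measurement serves one target at most
            continue
        w = 1.0
        for t, a in enumerate(event):
            w *= p_miss if a == 0 else likelihood[t, a - 1]
        total += w
        for t, a in enumerate(event):
            beta[t, a] += w
    return beta / total
```

The exhaustive loop over $(m+1)^{T}$ joint events also illustrates why JPDA becomes expensive as the number of targets grows.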

3.4. Multiple Hypothesis Test. The underlying idea of the multiple hypothesis test (MHT) is based on using more than two consecutive observations to make an association with better results. Other algorithms that use only two consecutive observations have a higher probability of generating an error. In contrast to PDA and JPDA, MHT estimates all of the possible hypotheses and maintains new hypotheses in each iteration.

MHT was developed to track multiple targets in cluttered environments; as a result, it combines the data association problem and tracking into a unified framework, becoming an estimation technique as well. The Bayes rule or Bayesian networks are commonly employed to calculate the MHT hypotheses. In general, researchers have claimed that MHT outperforms JPDA at lower densities of false positives. However, the main disadvantage of MHT is its computational cost when the number of tracks or false positives increases. Pruning the hypothesis tree using a window could mitigate this limitation.

The Reid [15] tracking algorithm is considered the standard MHT algorithm, but the initial integer programming formulation of the problem is due to Morefield [16]. MHT is an iterative algorithm in which each iteration starts with a set of correspondence hypotheses. Each hypothesis is a collection of disjoint tracks, and the prediction of each target in the next time instant is computed for each hypothesis. Next, the predictions are compared with the new observations by using a distance metric. The set of associations established in each hypothesis (based on a distance) introduces new hypotheses in the next iteration. Each new hypothesis represents a new set of tracks that is based on the current observations.

Note that each new measurement could come from (i) a new target in the visual field of view, (ii) a target being tracked, or (iii) noise in the measurement process. It is also possible that a measurement is not assigned to a target, because the target disappears or because it is not possible to obtain a target measurement at that time instant.
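An illustrative sketch of one MHT iteration is given below (this is not Reid's full algorithm: the score function, the pruning threshold, and the omission of explicit birth/clutter priors are assumptions of this example). Each parent hypothesis is extended with every feasible interpretation of the new measurements, covering the three measurement origins above:

```python
import itertools

def expand_hypotheses(hypotheses, measurements, score, max_keep=100):
    """hypotheses: list of (tracks, log_prob) pairs, where tracks is a list of
    measurement sequences; score(track, z) is an assumed log-likelihood that
    measurement z continues the given track."""
    children = []
    for tracks, log_prob in hypotheses:
        # Each measurement may continue one track, start a new track ('birth'),
        # or be labeled noise ('clutter'); their priors are omitted for brevity.
        choices = list(range(len(tracks))) + ['birth', 'clutter']
        for assignment in itertools.product(choices, repeat=len(measurements)):
            used = [a for a in assignment if isinstance(a, int)]
            if len(used) != len(set(used)):      # tracks must stay disjoint
                continue
            new_tracks = [list(t) for t in tracks]
            lp = log_prob
            for z, a in zip(measurements, assignment):
                if isinstance(a, int):
                    lp += score(new_tracks[a], z)
                    new_tracks[a].append(z)
                elif a == 'birth':
                    new_tracks.append([z])
            children.append((new_tracks, lp))
    children.sort(key=lambda h: h[1], reverse=True)
    return children[:max_keep]                   # prune the hypothesis tree
```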

MHT maintains several correspondence hypotheses for each target in each frame. If the set of hypotheses at instant $k$ is represented by $H(k) = \{h_{l}(k),\; l = 1, \ldots, n\}$, then the probability of the hypothesis $h_{l}(k)$ can be represented recursively, using the Bayes rule, as follows:

$$P(h_{l}(k) \mid Z(k)) = P(h_{g}(k-1), a_{i}(k) \mid Z(k)) = \frac{1}{c}\, P(Z(k) \mid h_{g}(k-1), a_{i}(k)) \cdot P(a_{i}(k) \mid h_{g}(k-1)) \cdot P(h_{g}(k-1)), \quad (15)$$

where $h_{g}(k-1)$ is the hypothesis $g$ of the complete set up to time instant $k-1$, $a_{i}(k)$ is the $i$th possible association of the track to the object, $Z(k)$ is the set of detections of the current frame, and $c$ is a normalizing constant.

The first term on the right side of the previous equation is the likelihood function of the measurement set $Z(k)$, given the joint likelihood and the current hypothesis. The second term is the probability of the association hypothesis of the current data, given the previous hypothesis $h_{g}(k-1)$. The third term is the probability of the previous hypothesis, from which the current hypothesis is calculated.

The MHT algorithm has the ability to detect a new track while maintaining the hypothesis tree structure. The probability of a true track is given by the Bayes decision model as

$$P(\lambda \mid Z) = \frac{P(Z \mid \lambda)\, P_{\circ}(\lambda)}{P(Z)}, \quad (16)$$

where $P(Z \mid \lambda)$ is the probability of obtaining the set of measurements $Z$ given $\lambda$, $P_{\circ}(\lambda)$ is the a priori probability of the source signal, and $P(Z)$ is the probability of obtaining the set of detections $Z$.

MHT considers all of the possibilities, including both the track maintenance and the initialization and removal of tracks, in an integrated framework. MHT calculates the possibility of having an object after the generation of a set of measurements using an exhaustive approach, and the algorithm does not assume a fixed number of targets. The key challenge of MHT is effective hypothesis management. The baseline MHT algorithm can be extended as follows: (i) use hypothesis aggregation for missed targets, births, cardinality tracking, and closely spaced objects; (ii) apply a multistage MHT for improving the performance and robustness in challenging settings; and (iii) use a feature-aided MHT for extended object surveillance.

The main disadvantage of this algorithm is the computational cost, which grows exponentially with the number of tracks and measurements. Therefore, the practical implementation of this algorithm is limited, because it is exponential in both time and memory.

With the aim of reducing the computational cost, [17] presented a probabilistic MHT algorithm, known as PMHT, in which the associations are considered to be statistically independent random variables and in which an exhaustive search enumeration is avoided. The PMHT algorithm assumes that the number of targets and measurements is known. With the same goal of reducing the computational cost, [18] presented an efficient implementation of the MHT algorithm. This implementation was the first version to be applied to perform tracking in visual environments. They employed the Murty [19] algorithm to determine the best set of $k$ hypotheses in polynomial time, with the goal of tracking the points of interest.

MHT typically performs the tracking process by employing only one characteristic, commonly the position. A Bayesian combination that uses multiple characteristics was proposed by Liggins II et al. [20].

A linear-programming-based relaxation approach to the optimization problem in MHT tracking was proposed independently by Coraluppi et al. [21] and Storms and Spieksma [22]. Joo and Chellappa [23] proposed an association algorithm for tracking multiple targets in visual environments. Their algorithm is based on an MHT modification in which a measurement can be associated with more than one target and several targets can be associated with one measurement. They also proposed a combinatorial optimization algorithm to generate the best set of association hypotheses. Their algorithm always finds the best hypothesis, in contrast to other models, which are approximate. Coraluppi and Carthel [24] presented a generalization of the MHT algorithm using a recursion over hypothesis classes rather than over a single hypothesis. This work has been applied to a special case of the multi-target tracking problem, called cardinality tracking, in which the number of sensor measurements is observed instead of the target states.

3.5. Distributed Joint Probabilistic Data Association. The distributed version of the joint probabilistic data association (JPDA-D) was presented by Chang et al. [25]. In this technique, the estimated state of the target (using two sensors), after being associated, is given by

$$E\{x \mid Z^{1}, Z^{2}\} = \sum_{j=0}^{m^{1}} \sum_{l=0}^{m^{2}} E\{x \mid \chi^{1}_{j}, \chi^{2}_{l}, Z^{1}, Z^{2}\} \cdot P\{\chi^{1}_{j}, \chi^{2}_{l} \mid Z^{1}, Z^{2}\}, \quad (17)$$

where $m^{i}$, $i = 1, 2$, denotes the last set of measurements of sensors 1 and 2, $Z^{i}$, $i = 1, 2$, is the set of accumulated data, and $\chi$ is the association hypothesis. The first term on the right side of the equation is calculated from the associations that were made earlier. The second term is computed from the individual association probabilities as follows:

$$P(\chi^{1}_{j}, \chi^{2}_{l} \mid Z^{1}, Z^{2}) = \sum_{\chi^{1}} \sum_{\chi^{2}} P(\chi^{1}, \chi^{2} \mid Z^{1}, Z^{2})\, \hat{1}_{j}(\chi^{1})\, \hat{2}_{l}(\chi^{2}),$$

$$P(\chi^{1}, \chi^{2} \mid Z^{1}, Z^{2}) = \frac{1}{c}\, P(\chi^{1} \mid Z^{1})\, P(\chi^{2} \mid Z^{2})\, \gamma(\chi^{1}, \chi^{2}), \quad (18)$$

where $\chi^{i}$ are the joint hypotheses involving all of the measurements and all of the targets and $\hat{i}_{j}(\chi^{i})$ are the binary indicators of the measurement-target association. The additional term $\gamma(\chi^{1}, \chi^{2})$ depends on the correlation of the individual hypotheses and reflects the localization influence of the current measurements on the joint hypotheses.

These equations are obtained by assuming that communication occurs after every observation; there are only approximations for the case in which communication is sporadic and a substantial amount of noise occurs. Therefore, this algorithm is a theoretical model that has some limitations in practical applications.

3.6. Distributed Multiple Hypothesis Test. The distributed version of the MHT algorithm (MHT-D) [26, 27] follows a similar structure as the JPDA-D algorithm. Let us assume the case in which one node must fuse two sets of hypotheses and tracks. If the hypothesis and track sets are represented by $H^{i}(Z^{i})$ and $T^{i}(Z^{i})$, with $i = 1, 2$, the hypothesis probabilities are represented by $P(\lambda^{i}_{j})$, and the state distributions of the tracks $\tau^{i}_{j}$ are represented by $P(x \mid Z^{i}, \tau^{i}_{j})$, then the maximum available information at the fusion node is $Z = Z^{1} \cup Z^{2}$. The data fusion objective of the MHT-D is to obtain the set of hypotheses $H(Z)$, the set of tracks $T(Z)$, the hypothesis probabilities $P(\lambda \mid Z)$, and the state distributions $p(x \mid Z, \tau)$ for the observed data.

The MHT-D algorithm is composed of the following steps:

(1) hypothesis formation: for each hypothesis pair $\lambda^{1}_{j}$ and $\lambda^{2}_{k}$ that could be fused, a track $\tau$ is formed by associating the pair of tracks $\tau^{1}_{j}$ and $\tau^{2}_{k}$, where each track comes from one node and both could originate from the same target; the final result of this stage is a set of hypotheses, denoted by $H(Z)$, and the fused tracks $T(Z)$;

(2) hypothesis evaluation: in this stage, the association probability of each hypothesis and the estimated state of each fused track are obtained; the distributed estimation algorithm is employed to calculate the likelihood of the possible associations and the estimations obtained for each specific association.


Using the information model, the probability of each fused hypothesis is given by

$$P(\lambda \mid Z) = C^{-1} \prod_{j \in J} P\left(\lambda^{(j)} \mid Z^{(j)}\right)^{\alpha^{(j)}} \prod_{\tau \in \lambda} L(\tau \mid Z), \quad (19)$$

where $C$ is a normalizing constant and $L(\tau \mid Z)$ is the likelihood of each fused track pair.

The main disadvantage of the MHT-D is its high computational cost, which is on the order of $O(n^{M})$, where $n$ is the number of possible associations and $M$ is the number of variables to be estimated.

3.7. Graphical Models. Graphical models are a formalism for representing and reasoning with probabilities and independence. A graphical model represents a conditional decomposition of the joint probability. A graphical model can be represented as a graph in which the nodes denote random variables, the edges denote the possible dependence between the random variables, and the plates denote the replication of a substructure with the appropriate indexing of the relevant variables. The graph captures the joint distribution over the random variables, which can be decomposed into a product of factors that each depend on only a subset of the variables. There are two major classes of graphical models: (i) Bayesian networks [28], which are also known as directed graphical models, and (ii) Markov random fields, which are also known as undirected graphical models. The directed graphical models are useful for expressing causal relationships between random variables, whereas undirected models are better suited for expressing soft constraints between random variables. We refer the reader to the book of Koller and Friedman [29] for more information on graphical models.

A framework based on graphical models can solve the problem of distributed data association in synchronized sensor networks with overlapped areas in which each sensor receives noisy measurements; this solution was proposed by Chen et al. [30, 31]. Their work is based on graphical models that are used to represent the statistical dependence between random variables. The data association problem is treated as an inference problem and solved by using the max-product algorithm [32]. Graphical models represent statistical dependencies between variables as graphs, and the max-product algorithm converges when the graph is a tree structure. Moreover, the employed algorithm could be implemented in a distributed manner by exchanging messages between the source nodes in parallel. With this algorithm, if each sensor has $n$ possible combinations of associations and there are $M$ variables to be estimated, the complexity is $O(n^{2}M)$, which is reasonable and less than the $O(n^{M})$ complexity of the MHT-D algorithm. However, special attention must be given to the correlated variables when building the graphical model.

4. State Estimation Methods

State estimation techniques aim to determine the state of the target under movement (typically the position), given the observations or measurements. State estimation techniques are also known as tracking techniques. In their general form, it is not guaranteed that the target observations are relevant, which means that some of the observations could actually come from the target and others could be only noise. The state estimation phase is a common stage in data fusion algorithms, because the target's observations could come from different sensors or sources, and the final goal is to obtain a global target state from the observations.

The estimation problem involves finding the values of the vector state (e.g., position, velocity, and size) that fit as much as possible with the observed data. From a mathematical perspective, we have a set of redundant observations, and the goal is to find the set of parameters that provides the best fit to the observed data. In general, these observations are corrupted by errors and by the propagation of noise in the measurement process. State estimation methods fall under level 1 of the JDL classification and can be divided into two broader groups:

(1) linear dynamics and measurements: here, the estimation problem has a standard solution; specifically, when the equations of the object state and the measurements are linear, the noise follows the Gaussian distribution, and we do not refer to a clutter environment, the optimal theoretical solution is based on the Kalman filter;

(2) nonlinear dynamics: the state estimation problem becomes difficult, and there is no analytical solution to solve the problem in a general manner; in principle, there are no practical algorithms available to solve this problem satisfactorily.

Most of the state estimation methods are based on control theory and employ the laws of probability to compute a vector state from a vector measurement or a stream of vector measurements. Next, the most common estimation methods are presented, including maximum likelihood and maximum posterior (Section 4.1), the Kalman filter (Section 4.2), the particle filter (Section 4.3), the distributed Kalman filter (Section 4.4), the distributed particle filter (Section 4.5), and covariance consistency methods (Section 4.6).

4.1. Maximum Likelihood and Maximum Posterior. The maximum likelihood (ML) technique is an estimation method that is based on probabilistic theory. Probabilistic estimation methods are appropriate when the state variable follows an unknown probability distribution [33]. In the context of data fusion, $x$ is the state that is being estimated and $z = (z(1), \ldots, z(k))$ is a sequence of $k$ previous observations of $x$. The likelihood function $\lambda(x)$ is defined as a probability density function of the sequence of $z$ observations, given the true value of the state $x$. Consider

$$\lambda(x) = p(z \mid x). \quad (20)$$

The ML estimator finds the value of $x$ that maximizes the likelihood function:

$$\hat{x}(k) = \arg\max_{x}\, p(z \mid x), \quad (21)$$

which can be obtained from the analytical or empirical models of the sensors. This function expresses the probability of the observed data. The main disadvantage of this method in practice is that the analytical or empirical model of the sensor must be known to provide the prior distribution and compute the likelihood function. This method can also systematically underestimate the variance of the distribution, which leads to a bias problem. However, the bias of the ML solution becomes less significant as the number $N$ of data points increases, and the estimated variance converges to the true variance of the distribution that generated the data in the limit $N \to \infty$.

The maximum posterior (MAP) method is based on Bayesian theory. It is employed when the parameter $x$ to be estimated is the output of a random variable that has a known probability density function $p(x)$. In the context of data fusion, $x$ is the state that is being estimated and $z = (z(1), \ldots, z(k))$ is a sequence of $k$ previous observations of $x$. The MAP estimator finds the value of $x$ that maximizes the posterior probability distribution, as follows:

$$\hat{x}(k) = \arg\max_{x}\, p(x \mid z). \quad (22)$$

Both methods (ML and MAP) aim to find the most likely value of the state $x$. However, ML assumes that $x$ is a fixed but unknown point of the parameter space, whereas MAP considers $x$ to be the output of a random variable with a known a priori probability density function. Both methods are equivalent when there is no a priori information about $x$, that is, when there are only observations.
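The contrast between the two estimators can be illustrated with a scalar Gaussian example (all numeric values below are assumptions of this sketch): for Gaussian likelihoods, the ML estimate is the precision-weighted mean of the observations, while the MAP estimate additionally treats the Gaussian prior as one extra pseudo-observation:

```python
import numpy as np

z = np.array([10.2, 9.8, 10.5])     # observations of the same state x
var = np.array([0.4, 0.3, 0.6])     # per-sensor noise variances

# ML, eq. (21): maximize p(z | x) -> precision-weighted mean.
x_ml = np.sum(z / var) / np.sum(1.0 / var)

# MAP, eq. (22): maximize p(x | z) with the prior p(x) = N(mu0, var0).
mu0, var0 = 9.0, 1.0
x_map = (mu0 / var0 + np.sum(z / var)) / (1.0 / var0 + np.sum(1.0 / var))

print(x_ml, x_map)   # the MAP estimate is pulled toward the prior mean
```

As var0 grows (an increasingly uninformative prior), x_map approaches x_ml, matching the equivalence noted above.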

4.2. The Kalman Filter. The Kalman filter is the most popular estimation technique. It was originally proposed by Kalman [34] and has been widely studied and applied since then. The Kalman filter estimates the state $x$ of a discrete-time process governed by the following space-time model:

$$x(k+1) = \Phi(k)\, x(k) + G(k)\, u(k) + w(k), \quad (23)$$

with the observations or measurements $z$ at time $k$ of the state $x$ represented by

$$z(k) = H(k)\, x(k) + v(k), \quad (24)$$

where $\Phi(k)$ is the state transition matrix, $G(k)$ is the input transition matrix, $u(k)$ is the input vector, $H(k)$ is the measurement matrix, and $w$ and $v$ are random Gaussian variables with zero mean and covariance matrices $Q(k)$ and $R(k)$, respectively. Based on the measurements and on the system parameters, the estimation of $x(k)$, which is represented by $\hat{x}(k)$, and the prediction of $x(k+1)$, which is represented by $\hat{x}(k+1 \mid k)$, are given by

$$\hat{x}(k) = \hat{x}(k \mid k-1) + K(k) \left[ z(k) - H(k)\, \hat{x}(k \mid k-1) \right],$$

$$\hat{x}(k+1 \mid k) = \Phi(k)\, \hat{x}(k \mid k) + G(k)\, u(k), \quad (25)$$

respectively, where $K$ is the filter gain, determined by

$$K(k) = P(k \mid k-1)\, H^{T}(k) \left[ H(k)\, P(k \mid k-1)\, H^{T}(k) + R(k) \right]^{-1}, \quad (26)$$

where $P(k \mid k-1)$ is the prediction covariance matrix, which can be determined by

$$P(k+1 \mid k) = \Phi(k)\, P(k)\, \Phi^{T}(k) + Q(k), \quad (27)$$

with

$$P(k) = P(k \mid k-1) - K(k)\, H(k)\, P(k \mid k-1). \quad (28)$$
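One predict-update cycle of equations (23)-(28) can be written compactly as follows (the function name and argument order are assumptions of this sketch):

```python
import numpy as np

def kalman_step(x, P, z, Phi, G, u, H, Q, R):
    """x, P: previous state estimate and covariance; z: current measurement."""
    # Prediction, eqs. (25) and (27).
    x_pred = Phi @ x + G @ u
    P_pred = Phi @ P @ Phi.T + Q
    # Gain and update, eqs. (25), (26), and (28).
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = P_pred - K @ H @ P_pred
    return x_new, P_new
```

For instance, a constant-velocity tracker would set Phi to integrate velocity into position and H to select the observed position components.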

The Kalman filter is mainly employed to fuse low-level data. If the system can be described as a linear model and the error can be modeled as Gaussian noise, then the recursive Kalman filter obtains optimal statistical estimations [35]. However, other methods are required to address nonlinear dynamic models and nonlinear measurements. The modified Kalman filter, known as the extended Kalman filter (EKF), is an optimal approach for implementing nonlinear recursive filters [36]. The EKF is one of the most often employed methods for fusing data in robotic applications. However, it has some disadvantages, because the computations of the Jacobians are extremely expensive. Some attempts have been made to reduce the computational cost, such as linearization, but these attempts introduce errors in the filter and make it unstable.

The unscented Kalman filter (UKF) [37] has gained popularity because it does not have the linearization step and the associated errors of the EKF [38]. The UKF employs a deterministic sampling strategy to establish the minimum set of points around the mean. This set of points captures the true mean and covariance completely. Then, these points are propagated through the nonlinear functions, and the covariance of the estimations can be recovered. Another advantage of the UKF is its suitability for parallel implementations.

4.3. Particle Filter. Particle filters are recursive implementations of the sequential Monte Carlo methods [39]. This method builds the posterior density function using several random samples called particles. Particles are propagated over time with a combination of sampling and resampling steps. At each iteration, the sampling step is employed to discard some particles, increasing the relevance of regions with a higher posterior probability. In the filtering process, several particles of the same state variable are employed, and each particle has an associated weight that indicates the quality of the particle. Therefore, the estimation is the result of a weighted sum of all of the particles. The standard particle filter algorithm has two phases: (1) the predicting phase and (2) the updating phase. In the predicting phase, each particle is modified according to the existing model, including the addition of random noise to simulate the noise effect. Then, in the updating phase, the weight of each particle is reevaluated using the last available sensor observation, and particles with lower weights are removed. Specifically, a generic particle filter comprises the following steps.


(1) Initialization of the particles:

(i) let $N$ be equal to the number of particles;
(ii) $X^{(i)}(1) = [x(1), y(1), 0, 0]^{T}$ for $i = 1, \ldots, N$.

(2) Prediction step:

(i) for each particle $i = 1, \ldots, N$, evaluate the state $(k+1 \mid k)$ of the system using the state at time instant $k$ with the noise of the system at time $k$. Consider

$$X^{(i)}(k+1 \mid k) = F(k)\, X^{(i)}(k) + (\text{cauchy-distribution-noise})(k), \quad (29)$$

where $F(k)$ is the transition matrix of the system.

(3) Evaluate the particle weights. For each particle $i = 1, \ldots, N$:

(i) compute the predicted observation of the system using the current predicted state and the noise at instant $k$. Consider

$$\hat{z}^{(i)}(k+1 \mid k) = H(k+1)\, X^{(i)}(k+1 \mid k) + (\text{gaussian-measurement-noise})(k+1), \quad (30)$$

(ii) compute the likelihood (weights) according to the given distribution. Consider

$$\text{likelihood}^{(i)} = N\left( \hat{z}^{(i)}(k+1 \mid k);\; z^{(i)}(k+1),\; \text{var} \right), \quad (31)$$

(iii) normalize the weights as follows:

$$w^{(i)} = \frac{\text{likelihood}^{(i)}}{\sum_{j=1}^{N} \text{likelihood}^{(j)}}. \quad (32)$$

(4) Resampling/selection: multiply particles with higher weights and remove those with lower weights. The current state must be adjusted using the computed weights of the new particles.

(i) Compute the cumulative weights. Consider

$$\text{Cum\_Wt}^{(i)} = \sum_{j=1}^{i} w^{(j)}. \quad (33)$$

(ii) Generate uniformly distributed random variables $U^{(i)} \sim U(0, 1)$, with the number of draws equal to the number of particles.

(iii) Determine which particles should be multiplied and which ones removed.

(5) Propagation phase:

(i) incorporate the new values of the state after the resampling at instant $k$ to calculate the value at instant $k+1$. Consider

$$x^{(1, \ldots, N)}(k+1 \mid k+1) = x(k+1 \mid k). \quad (34)$$

(ii) compute the posterior mean. Consider

$$\hat{x}(k+1) = \text{mean}\left[ x^{(i)}(k+1 \mid k+1) \right], \quad i = 1, \ldots, N. \quad (35)$$

(iii) repeat steps 2 to 5 for each time instant.
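These steps map onto a short Python/NumPy routine as follows (the noise scale, the Gaussian likelihood, and the multinomial resampling scheme are assumptions of this sketch, and the weights are assumed not to underflow to zero):

```python
import numpy as np

def particle_filter_step(particles, z, F, H, meas_var, rng):
    """One predict-weight-resample cycle.  particles: (N, d) array of states;
    z: current measurement vector."""
    N = particles.shape[0]
    # Step (2): propagate through the dynamics plus heavy-tailed (Cauchy) noise.
    particles = particles @ F.T + 0.05 * rng.standard_cauchy(particles.shape)
    # Step (3): Gaussian measurement likelihood, eqs. (30)-(32).
    z_pred = particles @ H.T
    sq_err = np.sum((z_pred - z) ** 2, axis=1)
    weights = np.exp(-0.5 * sq_err / meas_var)
    weights /= weights.sum()
    # Step (4): multinomial resampling proportional to the weights (eq. (33)).
    idx = rng.choice(N, size=N, p=weights)
    particles = particles[idx]
    # Step (5): posterior mean, eq. (35).
    return particles, particles.mean(axis=0)
```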

Particle filters are more flexible than the Kalman filters and can cope with nonlinear dependencies and non-Gaussian densities in the dynamic model and in the noise error. However, they have some disadvantages. A large number of particles is required to obtain a small variance in the estimator. It is also difficult to establish the optimal number of particles in advance, and the number of particles affects the computational cost significantly. Earlier versions of particle filters employed a fixed number of particles, but recent studies have started to use a dynamic number of particles [40].

4.4. The Distributed Kalman Filter. The distributed Kalman filter requires a correct clock synchronization between each source, as demonstrated in [41]. In other words, to correctly use the distributed Kalman filter, the clocks of all of the sources must be synchronized. This synchronization is typically achieved by using protocols that employ a shared global clock, such as the network time protocol (NTP). Synchronization problems between clocks have been shown to have an effect on the accuracy of the Kalman filter, producing inaccurate estimations [42].

If the estimations are consistent and the cross-covariance is known (or the estimations are uncorrelated), then it is possible to use the distributed Kalman filters [43]. However, the cross-covariance must be determined exactly, or the observations must be consistent.

We refer the reader to Liggins II et al. [20] for more details about the Kalman filter in a distributed and hierarchical architecture.

4.5. Distributed Particle Filter. Distributed particle filters have gained attention recently [44–46]. Coates [45] used a distributed particle filter to monitor an environment that could be captured by a Markovian state-space model involving nonlinear dynamics and observations and non-Gaussian noise.

In contrast, earlier attempts to handle out-of-sequence measurements with particle filters are based on regenerating the probability density function at the time instant of the out-of-sequence measurement [47]. In a particle filter, this step incurs a large computational cost, in addition to the space necessary to store the previous particles. To avoid this problem, Orton and Marrs [48] proposed storing the information on the particles at each time instant, saving the cost of recalculating this information. This technique is close to optimal, and when the delay increases, the result is only slightly affected [49]. However, it requires a very large amount of space to store the state of the particles at each time instant.

4.6. Covariance Consistency Methods: Covariance Intersection/Union. Covariance consistency methods (intersection and union) were proposed by Uhlmann [43] and are general and fault-tolerant frameworks for maintaining covariance means and estimations in a distributed network. These methods do not comprise estimation techniques; instead, they are similar to an estimation fusion technique. The distributed Kalman filter requirement of independent measurements or known cross-covariances is not a constraint with these methods.

4.6.1. Covariance Intersection. If the Kalman filter is employed to combine two estimations $(a_{1}, A_{1})$ and $(a_{2}, A_{2})$, then it is assumed that the joint covariance has the following form:

$$\begin{bmatrix} A_{1} & X \\ X^{T} & A_{2} \end{bmatrix}, \quad (36)$$

where the cross-covariance $X$ should be known exactly so that the Kalman filter can be applied without difficulty. Because the computation of the cross-covariances is computationally intensive, Uhlmann [43] proposed the covariance intersection (CI) algorithm.

Let us assume that a joint covariance $M$ can be defined with the diagonal blocks $M_{A_{1}} > A_{1}$ and $M_{A_{2}} > A_{2}$. Consider

$$M \ge \begin{bmatrix} A_{1} & X \\ X^{T} & A_{2} \end{bmatrix} \quad (37)$$

for every possible instance of the unknown cross-covariance $X$; then, the components of the matrix $M$ can be employed in the Kalman filter equations to provide a fused estimation $(c, C)$ that is considered consistent. The key point of this method relies on generating a joint covariance matrix $M$ that can represent a useful fused estimation (in this context, useful refers to something with a lower associated uncertainty). In summary, the CI algorithm computes the joint covariance matrix $M$ for which the Kalman filter provides the best fused estimation $(c, C)$ with respect to a fixed measurement of the covariance matrix (i.e., the minimum determinant).

Specific covariance criteria must be established, because there is not a specific minimum joint covariance in the order of the positive semidefinite matrices. Moreover, although the joint covariance is the basis of the formal analysis of the CI algorithm, the actual result is a nonlinear mixture of the information stored in the estimations being fused, following the equation

$$C = \left( w_{1} H_{1}^{T} A_{1}^{-1} H_{1} + w_{2} H_{2}^{T} A_{2}^{-1} H_{2} + \cdots + w_{n} H_{n}^{T} A_{n}^{-1} H_{n} \right)^{-1},$$

$$c = C \left( w_{1} H_{1}^{T} A_{1}^{-1} a_{1} + w_{2} H_{2}^{T} A_{2}^{-1} a_{2} + \cdots + w_{n} H_{n}^{T} A_{n}^{-1} a_{n} \right), \quad (38)$$

where $H_{i}$ is the transformation from the fused state space to the space of the estimated state $i$. The values of $w$ can be calculated to minimize the covariance determinant by using convex optimization packages and semidefinite programming. The result of the CI algorithm has different characteristics compared to the Kalman filter. For example, suppose that two estimations, $(a, A)$ and $(b, B)$, are provided and their covariances are equal, $A = B$; since the Kalman filter is based on the statistical independence assumption, it produces a fused estimation with covariance $C = (1/2)A$. In contrast, the CI method does not assume independence and thus must be consistent even in the case in which the estimations are completely correlated, with the estimated fused covariance $C = A$. In the case of estimations where $A < B$, the CI algorithm does not take information from the estimation $(b, B)$; thus, the fused result is $(a, A)$.
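For two estimates with $H_{i} = I$, equation (38) reduces to a one-parameter family $w_{1} = w$, $w_{2} = 1 - w$, and the determinant can be minimized with a bounded scalar search; the following sketch (the function name and the use of SciPy's optimizer are assumptions of this example) implements that special case:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def covariance_intersection(a1, A1, a2, A2):
    """Fuse two consistent estimates with unknown cross-covariance (eq. (38),
    H_i = I), choosing w to minimize the fused covariance determinant."""
    I1, I2 = np.linalg.inv(A1), np.linalg.inv(A2)

    def fuse(w):
        C = np.linalg.inv(w * I1 + (1 - w) * I2)
        c = C @ (w * I1 @ a1 + (1 - w) * I2 @ a2)
        return c, C

    res = minimize_scalar(lambda w: np.linalg.det(fuse(w)[1]),
                          bounds=(0.0, 1.0), method='bounded')
    return fuse(res.x)
```

With $A_{1} = A_{2}$, the search reproduces the behavior described above: the fused covariance equals $A_{1}$ rather than half of it.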

Every jointly consistent covariance is sufficient to produce a fused estimation that guarantees consistency. However, it is also necessary to guarantee a lack of divergence. Divergence is avoided in the CI algorithm by choosing a specific measurement (i.e., the determinant), which is minimized in each fusion operation. This measurement represents a nondivergence criterion, because the size of the estimated covariance according to this criterion is not incremented.

The application of the CI method guarantees consistency and nondivergence for every sequence of mean- and covariance-consistent estimations. However, this method does not work well when the measurements to be fused are inconsistent.

4.6.2. Covariance Union. CI solves the problem of correlated inputs but not the problem of inconsistent inputs (inconsistent inputs refer to different estimations, each of which has a high accuracy (small variance) but also a large difference from the states of the others); thus, the covariance union (CU) algorithm was proposed to solve the latter [43]. CU addresses the following problem: two estimations, $(a_{1}, A_{1})$ and $(a_{2}, A_{2})$, relate to the state of an object and are mutually inconsistent with one another. This issue arises when the difference between the estimated means is larger than the provided covariance. Inconsistent inputs can be detected using the Mahalanobis distance [50] between them, which is defined as

$$M_{d} = \left( a_{1} - a_{2} \right)^{T} \left( A_{1} + A_{2} \right)^{-1} \left( a_{1} - a_{2} \right), \quad (39)$$

and detecting whether this distance is larger than a given threshold.

The Mahalanobis distance accounts for the covariance information when computing the distance. If the difference between the estimations is large but their covariances are also large, the Mahalanobis distance yields a small value. In contrast, if the difference between the estimations is small and the covariances are small, it could produce a larger distance value. A high Mahalanobis distance could indicate that the estimations are inconsistent; however, a specific threshold must be established by the user or learned automatically.
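A direct implementation of this test is shown below (the function name and the default threshold, chosen here in the spirit of a chi-square quantile, are assumptions of this sketch):

```python
import numpy as np

def mahalanobis_inconsistent(a1, A1, a2, A2, threshold=9.0):
    """Flag two estimates (a1, A1) and (a2, A2) as mutually inconsistent
    when the Mahalanobis distance of eq. (39) exceeds the threshold."""
    d = a1 - a2
    m_d = float(d @ np.linalg.inv(A1 + A2) @ d)
    return m_d > threshold
```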

The CU algorithm aims to solve the following problem: let us suppose that a filtering algorithm provides two observations with means and covariances $(a_{1}, A_{1})$ and $(a_{2}, A_{2})$, respectively. It is known that one of the observations is correct and the other is erroneous. However, the identity of the correct estimation is unknown and cannot be determined. In this situation, if both estimations are employed as an input to the Kalman filter, there will be a problem, because the Kalman filter only guarantees a consistent output if the observation is updated with a measurement that is consistent with both of them. In the specific case in which the measurements correspond to the same object but are acquired from two different sensors, the Kalman filter can only guarantee that the output is consistent if it is consistent with both separately. Because it is not possible to know which estimation is correct, the only way to combine the two estimations rigorously is to provide an estimation $(u, U)$ that is consistent with both estimations and obeys the following properties:

$$U \ge A_{1} + \left( u - a_{1} \right) \left( u - a_{1} \right)^{T},$$

$$U \ge A_{2} + \left( u - a_{2} \right) \left( u - a_{2} \right)^{T}, \quad (40)$$

where some measurement of the matrix size $U$ (i.e., the determinant) is minimized.

In other words, the previous equations indicate that if the estimation $(a_{1}, A_{1})$ is consistent, then translating the vector $a_{1}$ to $u$ requires increasing the covariance by the addition of a matrix at least as large as the outer product of $(u - a_{1})$ in order to remain consistent. The same situation applies to the measurement $(a_{2}, A_{2})$ in order to be consistent.

A simple strategy is to choose as the fused mean the value of one of the measurements ($u = a_{1}$). In this case, the value of $U$ must be chosen such that the estimation is consistent with the worst case (the correct measurement is $a_{2}$). However, it is possible to assign $u$ an intermediate value between $a_{1}$ and $a_{2}$ to decrease the value of $U$. Therefore, the CU algorithm establishes the fused mean value $u$ that has the least covariance $U$ that is still sufficiently large with respect to the two measurements ($a_{1}$ and $a_{2}$) for consistency.

Because the matrix inequalities presented in the previous equations are convex, convex optimization algorithms must be employed to solve them. The value of $U$ can be computed with the iterative method described by Julier et al. [51]. The obtained covariance could be significantly larger than any of the initial covariances and is an indicator of the existing uncertainty between the initial estimations. One of the advantages of the CU method arises from the fact that the same process can easily be extended to $N$ inputs.

5. Decision Fusion Methods

A decision is typically taken based on the knowledge of the perceived situation, which, in the data fusion domain, is provided by many sources. Decision fusion techniques aim to make a high-level inference about the events and activities that are produced from the detected targets. These techniques often use symbolic information, and the fusion process requires reasoning while accounting for uncertainties and constraints. These methods fall under level 2 (situation assessment) and level 4 (impact assessment) of the JDL data fusion model.

5.1. The Bayesian Methods. Information fusion based on Bayesian inference provides a formalism for combining evidence according to the rules of probability theory. Uncertainty is represented using conditional probability terms that describe beliefs and take on values in the interval $[0, 1]$, where zero indicates a complete lack of belief and one indicates an absolute belief. Bayesian inference is based on the Bayes rule, as follows:

$$P(Y \mid X) = \frac{P(X \mid Y)\, P(Y)}{P(X)}, \quad (41)$$

where the posterior probability $P(Y \mid X)$ represents the belief in the hypothesis $Y$ given the information $X$. This probability is obtained by multiplying the a priori probability of the hypothesis, $P(Y)$, by the probability of having $X$ given that $Y$ is true, $P(X \mid Y)$. The value $P(X)$ is used as a normalizing constant. The main disadvantage of Bayesian inference is that the probabilities $P(X)$ and $P(X \mid Y)$ must be known. To estimate the conditional probabilities, Pan et al. [52] proposed the use of NNs, whereas Coue et al. [53] proposed Bayesian programming.
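A minimal sketch of this combination rule over a discrete hypothesis space is given below (the helper name and all numeric values are assumptions of this example); normalizing after each piece of evidence plays the role of $P(X)$:

```python
def bayes_fuse(prior, likelihoods):
    """Combine a prior P(Y) with independent evidence terms P(X_j | Y), eq. (41).
    prior: dict hypothesis -> probability; likelihoods: list of such dicts."""
    posterior = dict(prior)
    for lik in likelihoods:
        posterior = {y: posterior[y] * lik[y] for y in posterior}
        norm = sum(posterior.values())      # the normalizing constant P(X)
        posterior = {y: p / norm for y, p in posterior.items()}
    return posterior

# Two sensors report evidence about whether a target is present.
prior = {'present': 0.3, 'absent': 0.7}
sensor1 = {'present': 0.8, 'absent': 0.2}   # P(reading | Y) for each hypothesis
sensor2 = {'present': 0.6, 'absent': 0.3}
print(bayes_fuse(prior, [sensor1, sensor2]))
```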

Hall and Llinas [54] described the following problems associated with Bayesian inference:

(i) difficulty in establishing the value of a priori probabilities;
(ii) complexity when there are multiple potential hypotheses and a substantial number of condition-dependent events;
(iii) the hypotheses must be mutually exclusive;
(iv) difficulty in describing the uncertainty of the decisions.

5.2. The Dempster-Shafer Inference. The Dempster-Shafer inference is based on the mathematical theory introduced by Dempster [55] and Shafer [56], which generalizes the Bayesian theory. The Dempster-Shafer theory provides a formalism that can be used to represent incomplete knowledge, update beliefs, and combine evidence, and it allows us to represent the uncertainty explicitly [57].

A fundamental concept in Dempster-Shafer reasoning is the frame of discernment, which is defined as follows. Let $\Theta = \{\theta_{1}, \theta_{2}, \ldots, \theta_{N}\}$ be the set of all possible states that define the system, and let $\Theta$ be exhaustive and mutually exclusive, because the system can be in only one state $\theta_{i} \in \Theta$, where $1 \le i \le N$. The set $\Theta$ is called a frame of discernment, because its elements are employed to discern the current state of the system.

The elements of the set $2^{\Theta}$ are called hypotheses. In the Dempster-Shafer theory, based on the evidence $E$, a probability is assigned to each hypothesis $H \in 2^{\Theta}$ according to the basic probability assignment, or mass function, $m: 2^{\Theta} \rightarrow [0, 1]$, which satisfies

$$m(\emptyset) = 0. \quad (42)$$


Thus, the mass function of the empty set is zero. Furthermore, the mass function of a hypothesis is larger than or equal to zero for all of the hypotheses. Consider

$$m(H) \ge 0, \quad \forall H \in 2^{\Theta}. \quad (43)$$

The sum of the mass functions of all of the hypotheses is one. Consider

$$\sum_{H \in 2^{\Theta}} m(H) = 1. \quad (44)$$

To express incomplete beliefs in a hypothesis $H$, the Dempster-Shafer theory defines the belief function $\text{bel}: 2^{\Theta} \rightarrow [0, 1]$ over $\Theta$ as

$$\text{bel}(H) = \sum_{A \subseteq H} m(A), \quad (45)$$

where $\text{bel}(\emptyset) = 0$ and $\text{bel}(\Theta) = 1$. The doubt level in $H$ can be expressed in terms of the belief function by

$$\text{dou}(H) = \text{bel}(\neg H) = \sum_{A \subseteq \neg H} m(A). \quad (46)$$

To express the plausibility of each hypothesis, the function $\text{pl}: 2^{\Theta} \rightarrow [0, 1]$ over $\Theta$ is defined as

$$\text{pl}(H) = 1 - \text{dou}(H) = \sum_{A \cap H \neq \emptyset} m(A). \quad (47)$$

Intuitively, plausibility indicates that there is less uncertainty in hypothesis $H$ if it is more plausible. The confidence interval $[\text{bel}(H), \text{pl}(H)]$ defines the true belief in hypothesis $H$. To combine the effects of two mass functions, $m_{1}$ and $m_{2}$, the Dempster-Shafer theory defines the rule $m_{1} \oplus m_{2}$ as

$$m_{1} \oplus m_{2}(\emptyset) = 0,$$

$$m_{1} \oplus m_{2}(H) = \frac{\sum_{X \cap Y = H} m_{1}(X)\, m_{2}(Y)}{1 - \sum_{X \cap Y = \emptyset} m_{1}(X)\, m_{2}(Y)}. \quad (48)$$
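The combination rule (48) can be written in a few lines (the representation of hypotheses as frozenset subsets of the frame and the example mass values are assumptions of this sketch):

```python
from itertools import product

def dempster_combine(m1, m2):
    """Dempster's rule of combination, eq. (48).  m1, m2: dicts mapping
    frozenset hypotheses (subsets of the frame) to mass values."""
    combined, conflict = {}, 0.0
    for (X, mx), (Y, my) in product(m1.items(), m2.items()):
        inter = X & Y
        if inter:
            combined[inter] = combined.get(inter, 0.0) + mx * my
        else:
            conflict += mx * my             # mass falling on the empty set
    return {H: v / (1.0 - conflict) for H, v in combined.items()}

# Frame {a, b}; assigning mass to the full frame expresses ignorance.
m1 = {frozenset('a'): 0.6, frozenset('ab'): 0.4}
m2 = {frozenset('b'): 0.3, frozenset('ab'): 0.7}
print(dempster_combine(m1, m2))
```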

In contrast to the Bayesian inference, a priori probabilities are not required in the Dempster-Shafer inference, because they are assigned at the instant that the information is provided. Several studies in the literature have compared the use of the Bayesian inference and the Dempster-Shafer inference, such as [58–60]. Wu et al. [61] used the Dempster-Shafer theory to fuse information in context-aware environments. This work was extended in [62] to dynamically modify the weights associated with the sensor measurements. Therefore, the fusion mechanism is calibrated according to the recent measurements of the sensors (in cases in which the ground truth is available). In the military domain [63], the Dempster-Shafer reasoning is used with the a priori information stored in a database for classifying military ships. Morbee et al. [64] described the use of the Dempster-Shafer theory to build 2D occupancy maps from several cameras and to evaluate the contribution of subsets of cameras to a specific task. Each task is the observation of an event of interest, and the goal is to assess the validity of a set of hypotheses that are fused using the Dempster-Shafer theory.

5.3. Abductive Reasoning. Abductive reasoning, or inferring the best explanation, is a reasoning method in which a hypothesis is chosen under the assumption that, in case it is true, it explains the observed event most accurately [65]. In other words, when an event is observed, the abduction method attempts to find the best explanation.

In the context of probabilistic reasoning, abductive inference finds the posterior ML of the system variables, given some observed variables. Abductive reasoning is more a reasoning pattern than a data fusion technique. Therefore, different inference methods, such as NNs [66] or fuzzy logic [67], can be employed.

54 Semantic Methods Decision fusion techniques thatemploy semantic data fromdifferent sources as an input couldprovide more accurate results than those that rely on onlysingle sources There is a growing interest in techniques thatautomatically determine the presence of semantic features invideos to solve the semantic gap [68]

Semantic information fusion is essentially a scheme inwhich raw sensor data are processed such that the nodesexchange only the resultant semantic information Semanticinformation fusion typically covers two phases (i) build-ing the knowledge and (ii) pattern matching (inference)The first phase (typically offline) incorporates the mostappropriate knowledge into semantic information Then thesecond phase (typically online or in real-time) fuses relevantattributes and provides a semantic interpretation of thesensor data [69ndash71]

Semantic fusion can be viewed as a means of integrating and translating sensor data into formal languages. The resulting language obtained from the observations of the environment is then compared with similar languages that are stored in the database. The key to this strategy is that similar behaviors represented by formal languages are also semantically similar. This type of method reduces transmission costs because the nodes need only transmit the formal language structure instead of the raw data. However, a known set of behaviors must be stored in a database in advance, which might be difficult in some scenarios.

6. Conclusions

This paper reviews the most popular methods and techniques for performing data/information fusion. To determine whether the application of data/information fusion methods is feasible, we must evaluate the computational cost of the process and the delay introduced in the communication. A centralized data fusion approach is theoretically optimal when there is no transmission cost and there are sufficient computational resources; however, this situation typically does not hold in practical applications.

The selection of the most appropriate technique depends on the type of problem and the assumptions established by each technique. Statistical data fusion methods (e.g., PDA, JPDA, MHT, and Kalman) are optimal under specific conditions [72]. First, the assumption that the targets are moving independently and that the measurements are normally distributed around the predicted position typically does not hold. Second, because the statistical techniques model all of the events as probabilities, they typically have several parameters and a priori probabilities for false measurements and detection errors that are often difficult to obtain (at least in an optimal sense). For example, in the case of the MHT algorithm, specific parameters must be established that are nontrivial to determine and are very sensitive [73]. Moreover, statistical methods that optimize over several frames are computationally intensive, and their complexity typically grows exponentially with the number of targets. For example, in the case of particle filters, several targets can be tracked jointly as a group or individually. If several targets are tracked jointly, the necessary number of particles grows exponentially; therefore, in practice, it is better to track them individually, under the assumption that the targets do not interact.

In contrast to centralized systems, distributed data fusion methods introduce some challenges into the data fusion process, such as (i) the spatial and temporal alignment of the information, (ii) out-of-sequence measurements, and (iii) data correlation, as reported by Castanedo et al. [74, 75]. The inherent redundancy of distributed systems could be exploited with distributed reasoning techniques and cooperative algorithms to improve the individual node estimations, as reported by Castanedo et al. [76]. In addition to the previous studies, a new trend based on the geometric notion of a low-dimensional manifold is gaining attention in the data fusion community. An example is the work of Davenport et al. [77], which proposes a simple model that captures the correlation between the sensor observations by matching the parameter values for the different obtained manifolds.

Acknowledgments

The author would like to thank Jesús García, Miguel A. Patricio, and James Llinas for their interesting discussions on several of the topics presented in this paper.

References

[1] JDL, Data Fusion Lexicon, Technical Panel for C3, F. E. White, Code 420, San Diego, Calif, USA, 1991.

[2] D. L. Hall and J. Llinas, "An introduction to multisensor data fusion," Proceedings of the IEEE, vol. 85, no. 1, pp. 6–23, 1997.

[3] H. F. Durrant-Whyte, "Sensor models and multisensor integration," International Journal of Robotics Research, vol. 7, no. 6, pp. 97–113, 1988.

[4] B. V. Dasarathy, "Sensor fusion potential exploitation: innovative architectures and illustrative applications," Proceedings of the IEEE, vol. 85, no. 1, pp. 24–38, 1997.

[5] R. C. Luo, C.-C. Yih, and K. L. Su, "Multisensor fusion and integration: approaches, applications, and future research directions," IEEE Sensors Journal, vol. 2, no. 2, pp. 107–119, 2002.

[6] J. Llinas, C. Bowman, G. Rogova, A. Steinberg, E. Waltz, and F. White, "Revisiting the JDL data fusion model II," Technical Report, DTIC Document, 2004.

[7] E. P. Blasch and S. Plano, "JDL level 5 fusion model 'user refinement' issues and applications in group tracking," in Proceedings of Signal Processing, Sensor Fusion, and Target Recognition XI, pp. 270–279, April 2002.

[8] H. F. Durrant-Whyte and M. Stevens, "Data fusion in decentralized sensing networks," in Proceedings of the 4th International Conference on Information Fusion, pp. 302–307, Montreal, Canada, 2001.

[9] J. Manyika and H. Durrant-Whyte, Data Fusion and Sensor Management: A Decentralized Information-Theoretic Approach, Prentice Hall, Upper Saddle River, NJ, USA, 1995.

[10] S. S. Blackman, "Association and fusion of multiple sensor data," in Multitarget-Multisensor Tracking: Advanced Applications, pp. 187–217, Artech House, 1990.

[11] S. Lloyd, "Least squares quantization in PCM," IEEE Transactions on Information Theory, vol. 28, no. 2, pp. 129–137, 1982.

[12] M. Shindler, A. Wong, and A. Meyerson, "Fast and accurate k-means for large datasets," in Proceedings of the 25th Annual Conference on Neural Information Processing Systems (NIPS '11), pp. 2375–2383, December 2011.

[13] Y. Bar-Shalom and E. Tse, "Tracking in a cluttered environment with probabilistic data association," Automatica, vol. 11, no. 5, pp. 451–460, 1975.

[14] T. E. Fortmann, Y. Bar-Shalom, and M. Scheffe, "Multi-target tracking using joint probabilistic data association," in Proceedings of the 19th IEEE Conference on Decision and Control including the Symposium on Adaptive Processes, vol. 19, pp. 807–812, December 1980.

[15] D. B. Reid, "An algorithm for tracking multiple targets," IEEE Transactions on Automatic Control, vol. 24, no. 6, pp. 843–854, 1979.

[16] C. L. Morefield, "Application of 0-1 integer programming to multitarget tracking problems," IEEE Transactions on Automatic Control, vol. 22, no. 3, pp. 302–312, 1977.

[17] R. L. Streit and T. E. Luginbuhl, "Maximum likelihood method for probabilistic multihypothesis tracking," in Signal and Data Processing of Small Targets, vol. 2235 of Proceedings of SPIE, p. 394, 1994.

[18] I. J. Cox and S. L. Hingorani, "Efficient implementation of Reid's multiple hypothesis tracking algorithm and its evaluation for the purpose of visual tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 2, pp. 138–150, 1996.

[19] K. G. Murty, "An algorithm for ranking all the assignments in order of increasing cost," Operations Research, vol. 16, no. 3, pp. 682–687, 1968.

[20] M. E. Liggins II, C.-Y. Chong, I. Kadar et al., "Distributed fusion architectures and algorithms for target tracking," Proceedings of the IEEE, vol. 85, no. 1, pp. 95–106, 1997.

[21] S. Coraluppi, C. Carthel, M. Luettgen, and S. Lynch, "All-source track and identity fusion," in Proceedings of the National Symposium on Sensor and Data Fusion, 2000.

[22] P. Storms and F. Spieksma, "An LP-based algorithm for the data association problem in multitarget tracking," in Proceedings of the 3rd IEEE International Conference on Information Fusion, vol. 1, 2000.

[23] S.-W. Joo and R. Chellappa, "A multiple-hypothesis approach for multiobject visual tracking," IEEE Transactions on Image Processing, vol. 16, no. 11, pp. 2849–2854, 2007.

[24] S. Coraluppi and C. Carthel, "Aggregate surveillance: a cardinality tracking approach," in Proceedings of the 14th International Conference on Information Fusion (FUSION '11), July 2011.


[25] K. C. Chang, C. Y. Chong, and Y. Bar-Shalom, "Joint probabilistic data association in distributed sensor networks," IEEE Transactions on Automatic Control, vol. 31, no. 10, pp. 889–897, 1986.

[26] C. Y. Chong, S. Mori, and K. C. Chang, "Information fusion in distributed sensor networks," in Proceedings of the 4th American Control Conference, Boston, Mass, USA, June 1985.

[27] C. Y. Chong, S. Mori, and K. C. Chang, "Distributed multitarget multisensor tracking," in Multitarget-Multisensor Tracking: Advanced Applications, vol. 1, pp. 247–295, 1990.

[28] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, San Mateo, Calif, USA, 1988.

[29] D. Koller and N. Friedman, Probabilistic Graphical Models: Principles and Techniques, MIT Press, 2009.

[30] L. Chen, M. Cetin, and A. S. Willsky, "Distributed data association for multi-target tracking in sensor networks," in Proceedings of the 7th International Conference on Information Fusion (FUSION '05), pp. 9–16, July 2005.

[31] L. Chen, M. J. Wainwright, M. Cetin, and A. S. Willsky, "Data association based on optimization in graphical models with application to sensor networks," Mathematical and Computer Modelling, vol. 43, no. 9-10, pp. 1114–1135, 2006.

[32] Y. Weiss and W. T. Freeman, "On the optimality of solutions of the max-product belief-propagation algorithm in arbitrary graphs," IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 736–744, 2001.

[33] C. Brown, H. Durrant-Whyte, J. Leonard, B. Rao, and B. Steer, "Distributed data fusion using Kalman filtering: a robotics application," in Data Fusion in Robotics and Machine Intelligence, M. A. Abidi and R. C. Gonzalez, Eds., pp. 267–309, 1992.

[34] R. E. Kalman, "A new approach to linear filtering and prediction problems," Journal of Basic Engineering, vol. 82, no. 1, pp. 35–45, 1960.

[35] R. C. Luo and M. G. Kay, "Data fusion and sensor integration: state-of-the-art 1990s," in Data Fusion in Robotics and Machine Intelligence, pp. 7–135, 1992.

[36] G. Welch and G. Bishop, An Introduction to the Kalman Filter, ACM SIGGRAPH 2001 Course Notes, 2001.

[37] S. J. Julier and J. K. Uhlmann, "A new extension of the Kalman filter to nonlinear systems," in Proceedings of the International Symposium on Aerospace/Defense Sensing, Simulation and Controls, vol. 3, 1997.

[38] E. A. Wan and R. van der Merwe, "The unscented Kalman filter for nonlinear estimation," in Proceedings of the Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC '00), pp. 153–158, 2000.

[39] D. Crisan and A. Doucet, "A survey of convergence results on particle filtering methods for practitioners," IEEE Transactions on Signal Processing, vol. 50, no. 3, pp. 736–746, 2002.

[40] J. Martinez-del Rincon, C. Orrite-Urunuela, and J. E. Herrero-Jaraba, "An efficient particle filter for color-based tracking in complex scenes," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance, pp. 176–181, 2007.

[41] S. Ganeriwal, R. Kumar, and M. B. Srivastava, "Timing-sync protocol for sensor networks," in Proceedings of the 1st International Conference on Embedded Networked Sensor Systems (SenSys '03), pp. 138–149, November 2003.

[42] M. Manzo, T. Roosta, and S. Sastry, "Time synchronization in networks," in Proceedings of the 3rd ACM Workshop on Security of Ad Hoc and Sensor Networks (SASN '05), pp. 107–116, November 2005.

[43] J. K. Uhlmann, "Covariance consistency methods for fault-tolerant distributed data fusion," Information Fusion, vol. 4, no. 3, pp. 201–215, 2003.

[44] A. S. Bashi, V. P. Jilkov, X. R. Li, and H. Chen, "Distributed implementations of particle filters," in Proceedings of the 6th International Conference of Information Fusion, pp. 1164–1171, 2003.

[45] M. Coates, "Distributed particle filters for sensor networks," in Proceedings of the 3rd International Symposium on Information Processing in Sensor Networks, pp. 99–107, ACM, New York, NY, USA, 2004.

[46] D. Gu, "Distributed particle filter for target tracking," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '07), pp. 3856–3861, April 2007.

[47] Y. Bar-Shalom, "Update with out-of-sequence measurements in tracking: exact solution," IEEE Transactions on Aerospace and Electronic Systems, vol. 38, no. 3, pp. 769–778, 2002.

[48] M. Orton and A. Marrs, "A Bayesian approach to multi-target tracking and data fusion with out-of-sequence measurements," IEE Colloquium, no. 174, pp. 151–155, 2001.

[49] M. L. Hernandez, A. D. Marrs, S. Maskell, and M. R. Orton, "Tracking and fusion for wireless sensor networks," in Proceedings of the 5th International Conference on Information Fusion, 2002.

[50] P. C. Mahalanobis, "On the generalized distance in statistics," Proceedings of the National Institute of Science, India, vol. 2, no. 1, pp. 49–55, 1936.

[51] S. J. Julier, J. K. Uhlmann, and D. Nicholson, "A method for dealing with assignment ambiguity," in Proceedings of the American Control Conference (ACC '04), vol. 5, pp. 4102–4107, July 2004.

[52] H. Pan, Z.-P. Liang, T. J. Anastasio, and T. S. Huang, "Hybrid NN-Bayesian architecture for information fusion," in Proceedings of the International Conference on Image Processing (ICIP '98), pp. 368–371, October 1998.

[53] C. Coue, T. Fraichard, P. Bessiere, and E. Mazer, "Multi-sensor data fusion using Bayesian programming: an automotive application," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 141–146, October 2002.

[54] D. L. Hall and J. Llinas, Handbook of Multisensor Data Fusion, CRC Press, Boca Raton, Fla, USA, 2001.

[55] A. P. Dempster, "A generalization of Bayesian inference," Journal of the Royal Statistical Society B, vol. 30, no. 2, pp. 205–247, 1968.

[56] G. Shafer, A Mathematical Theory of Evidence, Princeton University Press, Princeton, NJ, USA, 1976.

[57] G. M. Provan, "The validity of Dempster-Shafer belief functions," International Journal of Approximate Reasoning, vol. 6, no. 3, pp. 389–399, 1992.

[58] D. M. Buede, "Shafer-Dempster and Bayesian reasoning: a response to 'Shafer-Dempster reasoning with applications to multisensor target identification systems'," IEEE Transactions on Systems, Man, and Cybernetics, vol. 18, no. 6, pp. 1009–1011, 1988.

[59] Y. Cheng and R. L. Kashyap, "Comparison of Bayesian and Dempster's rules in evidence combination," in Maximum-Entropy and Bayesian Methods in Science and Engineering, 1988.

[60] B. R. Cobb and P. P. Shenoy, "A comparison of Bayesian and belief function reasoning," Information Systems Frontiers, vol. 5, no. 4, pp. 345–358, 2003.

[61] H. Wu, M. Siegel, R. Stiefelhagen, and J. Yang, "Sensor fusion using Dempster-Shafer theory," in Proceedings of the 19th IEEE Instrumentation and Measurement Technology Conference (IMTC '02), pp. 7–11, May 2002.


[62] H. Wu, M. Siegel, and S. Ablay, "Sensor fusion using Dempster-Shafer theory II: static weighting and Kalman filter-like dynamic weighting," in Proceedings of the 20th IEEE Instrumentation and Measurement Technology Conference (IMTC '03), pp. 907–912, May 2003.

[63] E. Bosse, P. Valin, A.-C. Boury-Brisset, and D. Grenier, "Exploitation of a priori knowledge for information fusion," Information Fusion, vol. 7, no. 2, pp. 161–175, 2006.

[64] M. Morbee, L. Tessens, H. Aghajan, and W. Philips, "Dempster-Shafer based multi-view occupancy maps," Electronics Letters, vol. 46, no. 5, pp. 341–343, 2010.

[65] C. S. Peirce, Abduction and Induction: Philosophical Writings of Peirce, vol. 156, Dover, New York, NY, USA, 1955.

[66] A. M. Abdelbar, E. A. M. Andrews, and D. C. Wunsch II, "Abductive reasoning with recurrent neural networks," Neural Networks, vol. 16, no. 5-6, pp. 665–673, 2003.

[67] J. R. Aguero and A. Vargas, "Inference of operative configuration of distribution networks using fuzzy logic techniques, part II: extended real-time model," IEEE Transactions on Power Systems, vol. 20, no. 3, pp. 1562–1569, 2005.

[68] A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, "Content-based image retrieval at the end of the early years," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1349–1380, 2000.

[69] D. S. Friedlander and S. Phoha, "Semantic information fusion for coordinated signal processing in mobile sensor networks," International Journal of High Performance Computing Applications, vol. 16, no. 3, pp. 235–241, 2002.

[70] D. S. Friedlander, "Semantic information extraction," in Distributed Sensor Networks, 2005.

[71] K. Whitehouse, J. Liu, and F. Zhao, "Semantic Streams: a framework for composable inference over sensor data," in Proceedings of the 3rd European Workshop on Wireless Sensor Networks, Lecture Notes in Computer Science, Springer, February 2006.

[72] I. J. Cox, "A review of statistical data association techniques for motion correspondence," International Journal of Computer Vision, vol. 10, no. 1, pp. 53–66, 1993.

[73] C. J. Veenman, M. J. T. Reinders, and E. Backer, "Resolving motion correspondence for densely moving points," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 1, pp. 54–72, 2001.

[74] F. Castanedo, M. A. Patricio, J. García, and J. M. Molina, "Bottom-up/top-down coordination in a multiagent visual sensor network," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS '07), pp. 93–98, September 2007.

[75] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Analysis of distributed fusion alternatives in coordinated vision agents," in Proceedings of the 11th International Conference on Information Fusion (FUSION '08), July 2008.

[76] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Data fusion to improve trajectory tracking in a cooperative surveillance multi-agent architecture," Information Fusion, vol. 11, no. 3, pp. 243–255, 2010.

[77] M. A. Davenport, C. Hegde, M. F. Duarte, and R. G. Baraniuk, "Joint manifolds for data fusion," IEEE Transactions on Image Processing, vol. 19, no. 10, pp. 2580–2594, 2010.



Figure 5: Conceptual overview of the data association process from multiple sensors and multiple targets. It is necessary to establish the set of observations over time from the same object that forms a track. (The figure depicts targets observed by sensors $S_1, S_2, \ldots, S_n$, whose observations $y_1, y_2, \ldots, y_n$ are associated into Track 1, Track 2, ..., Track n, or declared false alarms.)

error propagation [10]. Moreover, this algorithm performs poorly in environments in which false measurements are frequent, that is, in highly noisy environments.

The all-neighbors approach is a similar technique in which all of the measurements inside a region are included in the tracks. The K-Means [11] method is a well-known modification of the NN algorithm. K-Means divides the dataset values into $K$ different clusters. The K-Means algorithm finds the best localization of the cluster centroids, where best means a centroid that is in the center of the data cluster. K-Means is an iterative algorithm that can be divided into the following steps:

(1) obtain the input data and the number of desired clusters ($K$);

(2) randomly assign the centroid of each cluster;

(3) match each data point with the centroid of the nearest cluster;

(4) move each cluster center to the centroid of its assigned points;

(5) if the algorithm has not converged, return to step (3).

K-Means is a popular algorithm that has been widely employed; however, it has the following disadvantages:

(i) the algorithm does not always find the optimal solution for the cluster centers;

(ii) the number of clusters must be known a priori, and one must assume that this number is the optimum;

(iii) the algorithm assumes that the covariance of the dataset is irrelevant or that it has already been normalized.

There are several options for overcoming these limitations. For the first, it is possible to execute the algorithm several times and keep the solution that has the least variance. For the second, it is possible to start with a low value of $K$ and increment it until an adequate result is obtained. The third limitation can easily be overcome by multiplying the data by the inverse of the covariance matrix.

Many variations of Lloyd's basic K-Means algorithm [11] have been proposed; the basic algorithm has a computational upper-bound cost of $O(Kn)$, where $n$ is the number of input points and $K$ is the number of desired clusters. Some algorithms modify the initial cluster assignments to improve the separations and reduce the number of iterations. Others introduce soft or multinomial clustering assignments using fuzzy logic, probabilistic, or Bayesian techniques. However, most of these variations must still perform several iterations through the data space to converge to a reasonable solution. This issue becomes a major disadvantage in several real-time applications. A new approach based on having a large (but still affordable) number of cluster candidates, compared to the desired $K$ clusters, is currently gaining attention. The idea behind this computational model is that the algorithm builds a good sketch of the original data while significantly reducing the dimensionality of the input space. In this manner, a weighted K-Means can be applied to the large set of candidate clusters to derive a good clustering of the original data. Using this idea, [12] presented an efficient and scalable K-Means algorithm that is based on random projections. This algorithm requires only one pass through the input data to build the clusters. More specifically, if the input data distribution satisfies some separability requirements, then the number of required candidate clusters grows only according to $O(\log n)$, where $n$ is the number of observations in the original data. This salient feature makes the algorithm scalable in terms of both memory and computational requirements.

3.2. Probabilistic Data Association. The probabilistic data association (PDA) algorithm was proposed by Bar-Shalom and Tse [13] and is also known as the modified filter of all neighbors. This algorithm assigns an association probability to each hypothesis from a valid measurement of a target. A valid measurement refers to an observation that falls within the validation gate of the target at that time instant. The validation gate $\gamma$, which is centered around the predicted measurement of the target, is used to select the set of valid measurements and is defined as

$$\gamma \ge (z(k) - \hat{z}(k \mid k-1))^{T} S^{-1}(k)\, (z(k) - \hat{z}(k \mid k-1)), \quad (1)$$

where $k$ is the time index, $S(k)$ is the innovation covariance, and $\gamma$ determines the gating or window size. The set of valid measurements at time instant $k$ is defined as

$$Z(k) = \{z_i(k),\ i = 1, \ldots, m_k\}, \quad (2)$$


where $z_i(k)$ is the $i$th measurement in the validation region at time instant $k$. We give the standard equations of the PDA algorithm next. For the state prediction, consider

$$\hat{x}(k \mid k-1) = F(k-1)\, \hat{x}(k-1 \mid k-1), \quad (3)$$

where $F(k-1)$ is the transition matrix at time instant $k-1$. To calculate the measurement prediction, consider

$$\hat{z}(k \mid k-1) = H(k)\, \hat{x}(k \mid k-1), \quad (4)$$

where $H(k)$ is the linearized measurement matrix. To compute the gain or the innovation of the $i$th measurement, consider

$$\nu_i(k) = z_i(k) - \hat{z}(k \mid k-1). \quad (5)$$

To calculate the covariance prediction, consider

$$P(k \mid k-1) = F(k-1)\, P(k-1 \mid k-1)\, F(k-1)^{T} + Q(k), \quad (6)$$

where $Q(k)$ is the process noise covariance matrix. To compute the innovation covariance ($S$) and the Kalman gain ($K$), consider

$$S(k) = H(k)\, P(k \mid k-1)\, H(k)^{T} + R,$$
$$K(k) = P(k \mid k-1)\, H(k)^{T} S(k)^{-1}. \quad (7)$$

To obtain the covariance update in the case in which the measurements originated by the target are known, consider

$$P_0(k \mid k) = P(k \mid k-1) - K(k)\, S(k)\, K(k)^{T}. \quad (8)$$

The total update of the covariance is computed as

$$\nu(k) = \sum_{i=1}^{m_k} \beta_i(k)\, \nu_i(k),$$
$$P(k) = K(k) \left[ \sum_{i=1}^{m_k} \left( \beta_i(k)\, \nu_i(k)\, \nu_i(k)^{T} \right) - \nu(k)\, \nu(k)^{T} \right] K^{T}(k), \quad (9)$$

where $m_k$ is the number of valid measurements at instant $k$. The equation to update the estimated state, which is formed by the position and velocity, is given by

$$\hat{x}(k \mid k) = \hat{x}(k \mid k-1) + K(k)\, \nu(k). \quad (10)$$

Finally, the association probabilities of PDA are as follows:

$$\beta_i(k) = \frac{p_i(k)}{\sum_{j=0}^{m_k} p_j(k)}, \quad (11)$$

where

$$p_i(k) = \begin{cases} \dfrac{(2\pi)^{M/2}\, \lambda \sqrt{|S_i(k)|}\, (1 - P_d P_g)}{P_d}, & \text{if } i = 0,\\ \exp\left[-\frac{1}{2}\, \nu_i^{T}(k)\, S^{-1}(k)\, \nu_i(k)\right], & \text{if } i = 1, \ldots, m_k,\\ 0, & \text{otherwise}, \end{cases} \quad (12)$$

where $M$ is the dimension of the measurement vector, $\lambda$ is the density of the clutter environment, $P_d$ is the detection probability of the correct measurement, and $P_g$ is the validation probability of a detected value.

In the PDA algorithm, the state estimation of the target is computed as a weighted sum of the estimated states under all of the hypotheses. The algorithm can associate different measurements to one specific target; this association helps PDA to estimate the target state, with the association probabilities used as weights. The main disadvantages of the PDA algorithm are the following:

(i) loss of tracks: because PDA ignores the interference with other targets, it can sometimes wrongly classify the closest tracks. Therefore, it performs poorly when the targets are close to each other or cross paths;

(ii) suboptimal Bayesian approximation: when the source of information is uncertain, PDA is a suboptimal Bayesian approximation to the association problem;

(iii) one target: PDA was initially designed for the association of one target in a low-clutter environment. The number of false alarms is typically modeled with a Poisson distribution, and the false alarms are assumed to be distributed uniformly in space. PDA behaves incorrectly when there are multiple targets because the false alarm model does not work well;

(iv) track management: because PDA assumes that the track is already established, algorithms must be provided for track initialization and track deletion.

PDA is mainly suited to tracking targets that do not make abrupt changes in their movement patterns; it will most likely lose a target that maneuvers abruptly.
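As a concrete illustration of (11) and (12), the following sketch computes the PDA association probabilities for a set of validated innovations; the detection and gating probabilities, the clutter density, and the innovation values are illustrative assumptions.

```python
import numpy as np

def pda_association_probabilities(innovations, S, Pd=0.9, Pg=0.99, lam=0.1):
    """Association probabilities beta_i(k) of PDA, following (11)-(12).

    innovations: innovation vectors nu_i(k) of the validated measurements;
    S: innovation covariance (assumed common to all measurements here);
    lam: clutter density. All numeric defaults are illustrative."""
    M = S.shape[0]
    S_inv = np.linalg.inv(S)
    # p_0: the hypothesis that none of the measurements is target-originated.
    p = [((2 * np.pi) ** (M / 2)) * lam * np.sqrt(np.linalg.det(S))
         * (1 - Pd * Pg) / Pd]
    # p_i, i = 1..m_k: Gaussian likelihood of each validated innovation.
    for nu in innovations:
        p.append(float(np.exp(-0.5 * nu @ S_inv @ nu)))
    p = np.asarray(p)
    return p / p.sum()  # beta_0, beta_1, ..., beta_{m_k}

S = np.eye(2) * 2.0
innovations = [np.array([0.5, -0.2]), np.array([1.5, 1.0])]
print(pda_association_probabilities(innovations, S))
```

The returned vector weights the "no valid measurement" hypothesis against each validated measurement; the weighted innovation in (9)-(10) is then the sum of the innovations scaled by these probabilities.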

3.3. Joint Probabilistic Data Association. Joint probabilistic data association (JPDA) is a suboptimal approach for tracking multiple targets in cluttered environments [14]. JPDA is similar to PDA, with the difference that the association probabilities are computed using all of the observations and all of the targets. Thus, in contrast to PDA, JPDA considers various hypotheses together and combines them. JPDA determines the probability $\beta_i^t(k)$ that measurement $i$ originated from target $t$, accounting for the fact that, under this hypothesis, the measurement cannot be generated by other targets. Therefore, for a known number of targets, it evaluates the different options of the measurement-target association (for the most recent set of measurements) and combines them into the corresponding state estimation. If the association probability is known, then the Kalman filter updating equation of track $t$ can be written as

$$\hat{x}^{t}(k \mid k) = \hat{x}^{t}(k \mid k-1) + K(k)\, \nu^{t}(k), \quad (13)$$

where $\hat{x}^{t}(k \mid k)$ and $\hat{x}^{t}(k \mid k-1)$ are the estimation and prediction of target $t$, respectively, and $K(k)$ is the filter gain. The weighted sum of the residuals associated with the $m(k)$ observations of target $t$ is as follows:

$$\nu^{t}(k) = \sum_{i=1}^{m(k)} \beta_i^{t}(k)\, \nu_i^{t}(k), \quad (14)$$

where $\nu_i^{t} = z_i(k) - H \hat{x}^{t}(k \mid k-1)$. Therefore, this method incorporates all of the observations (inside the neighborhood of the target's predicted position) to update the estimated position by using a posterior probability that is a weighted sum of residuals.

The main restrictions of JPDA are the following:

(i) a measurement cannot come from more than one target;

(ii) two measurements cannot originate from the same target (at one time instant);

(iii) the sum of the probabilities of all of the measurements assigned to one target must be 1: $\sum_{i=0}^{m(k)} \beta_i^{t}(k) = 1$.

The main disadvantages of JPDA are the following:

(i) it requires an explicit mechanism for track initialization. Similar to PDA, JPDA cannot initialize new tracks or remove tracks that leave the observation area;

(ii) JPDA is a computationally expensive algorithm when it is applied in environments that have multiple targets, because the number of hypotheses increases exponentially with the number of targets.

In general, JPDA is more appropriate than MHT in situations in which the density of false measurements is high (i.e., sonar applications).

3.4. Multiple Hypothesis Test. The underlying idea of the multiple hypothesis test (MHT) is to use more than two consecutive observations to make an association with better results. Algorithms that use only two consecutive observations have a higher probability of producing an erroneous association. In contrast to PDA and JPDA, MHT evaluates all of the possible hypotheses and maintains new hypotheses at each iteration.

MHT was developed to track multiple targets in cluttered environments; as a result, it combines the data association problem and tracking into a unified framework, becoming an estimation technique as well. The Bayes rule or Bayesian networks are commonly employed to calculate the MHT hypotheses. In general, researchers have claimed that MHT outperforms JPDA at lower densities of false positives. However, the main disadvantage of MHT is its computational cost when the number of tracks or false positives increases. Pruning the hypothesis tree using a window could mitigate this limitation.

The Reid [15] tracking algorithm is considered the standard MHT algorithm, but the initial integer programming formulation of the problem is due to Morefield [16]. MHT is an iterative algorithm in which each iteration starts with a set of correspondence hypotheses. Each hypothesis is a collection of disjoint tracks, and the prediction of each target in the next time instant is computed for each hypothesis. Next, the predictions are compared with the new observations by using a distance metric. The set of associations established in each hypothesis (based on a distance) introduces new hypotheses in the next iteration. Each new hypothesis represents a new set of tracks that is based on the current observations.

Note that each new measurement could come from (i) a new target in the field of view, (ii) a target being tracked, or (iii) noise in the measurement process. It is also possible that a measurement is not assigned to a target, either because the target disappears or because it is not possible to obtain a target measurement at that time instant.

MHT maintains several correspondence hypotheses for each target in each frame. If the set of hypotheses at instant $k$ is represented by $H(k) = [h_l(k),\ l = 1, \ldots, n]$, then the probability of hypothesis $h_l(k)$ can be represented recursively, using the Bayes rule, as follows:

$$P(h_l(k) \mid Z(k)) = P(h_g(k-1), a_i(k) \mid Z(k)) = \frac{1}{c}\, P(Z(k) \mid h_g(k-1), a_i(k)) \cdot P(a_i(k) \mid h_g(k-1)) \cdot P(h_g(k-1)), \quad (15)$$

where $h_g(k-1)$ is hypothesis $g$ of the complete set up to time instant $k-1$, $a_i(k)$ is the $i$th possible association of the track to the object, $Z(k)$ is the set of detections of the current frame, and $c$ is a normalization constant.

The first term on the right side of the previous equation is the likelihood function of the measurement set $Z(k)$ given the joint likelihood and the current hypothesis. The second term is the probability of the association hypothesis of the current data given the previous hypothesis $h_g(k-1)$. The third term is the probability of the previous hypothesis from which the current hypothesis is calculated.

The MHT algorithm has the ability to detect a new track while maintaining the hypothesis tree structure. The probability of a true track is given by the Bayes decision model as

$$P(\lambda \mid Z) = \frac{P(Z \mid \lambda) \cdot P_{\circ}(\lambda)}{P(Z)}, \quad (16)$$

where $P(Z \mid \lambda)$ is the probability of obtaining the set of measurements $Z$ given $\lambda$, $P_{\circ}(\lambda)$ is the a priori probability of the source signal, and $P(Z)$ is the probability of obtaining the set of detections $Z$.

MHT considers all of the possibilities, including track maintenance, initialization, and removal, in an integrated framework. MHT calculates the possibility of having an object after the generation of a set of measurements using an exhaustive approach, and the algorithm does not assume a fixed number of targets. The key challenge of MHT is effective hypothesis management. The baseline MHT algorithm can be extended as follows: (i) use hypothesis aggregation for missed targets, births, cardinality tracking, and closely spaced objects; (ii) apply a multistage MHT to improve performance and robustness in challenging settings; and (iii) use a feature-aided MHT for extended object surveillance.

The main disadvantage of this algorithm is its computational cost, which grows exponentially with the number of tracks and measurements. Therefore, the practical implementation of this algorithm is limited because it is exponential in both time and memory.
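This exponential growth can be made tangible with a toy enumeration of association hypotheses; the sketch below is not Reid's full algorithm, only an exhaustive labeling in which each measurement is assigned to an existing track, a track birth, or a false alarm (the track identifiers and counts are hypothetical).

```python
import itertools

def enumerate_hypotheses(n_measurements, track_ids):
    """Exhaustively enumerate association hypotheses: each measurement is
    labeled with an existing track id, 'new' (track birth), or 'FA'
    (false alarm). A hypothesis is valid if its tracks are disjoint,
    i.e., each track id is used at most once."""
    labels = list(track_ids) + ["new", "FA"]
    for assignment in itertools.product(labels, repeat=n_measurements):
        used = [a for a in assignment if a in track_ids]
        if len(used) == len(set(used)):  # disjoint-track constraint
            yield assignment

# 3 measurements and 2 existing tracks already yield dozens of hypotheses;
# each extra measurement multiplies the count, hence the need for pruning.
hyps = list(enumerate_hypotheses(3, ["T1", "T2"]))
print(len(hyps))
```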

With the aim of reducing the computational cost, [17] presented a probabilistic MHT algorithm, known as PMHT, in which the associations are considered to be statistically independent random variables and an exhaustive search enumeration is avoided. The PMHT algorithm assumes that the number of targets and measurements is known. With the same goal of reducing the computational cost, [18] presented an efficient implementation of the MHT algorithm. This implementation was the first version to be applied to tracking in visual environments. They employed the Murty [19] algorithm to determine the best set of $k$ hypotheses in polynomial time, with the goal of tracking points of interest.

MHT typically performs the tracking process by employing only one characteristic, commonly the position. A Bayesian combination of multiple characteristics was proposed by Liggins II et al. [20].

A linear-programming-based relaxation approach to the optimization problem in MHT tracking was proposed independently by Coraluppi et al. [21] and Storms and Spieksma [22]. Joo and Chellappa [23] proposed an association algorithm for tracking multiple targets in visual environments. Their algorithm is based on an MHT modification in which a measurement can be associated with more than one target and several targets can be associated with one measurement. They also proposed a combinatorial optimization algorithm to generate the best set of association hypotheses. Their algorithm always finds the best hypothesis, in contrast to other models, which are approximate. Coraluppi and Carthel [24] presented a generalization of the MHT algorithm using a recursion over hypothesis classes rather than over single hypotheses. This work has been applied to a special case of the multi-target tracking problem, called cardinality tracking, in which the number of sensor measurements is observed instead of the target states.

3.5. Distributed Joint Probabilistic Data Association. The distributed version of the joint probabilistic data association (JPDA-D) algorithm was presented by Chang et al. [25]. In this technique, the estimated state of the target (using two sensors), after the association, is given by

$$E\{x \mid Z^{1}, Z^{2}\} = \sum_{j=0}^{m^{1}} \sum_{l=0}^{m^{2}} E\{x \mid \chi_j^{1}, \chi_l^{2}, Z^{1}, Z^{2}\} \cdot P\{\chi_j^{1}, \chi_l^{2} \mid Z^{1}, Z^{2}\}, \quad (17)$$

where $m^{i}$, $i = 1, 2$, is the last set of measurements of sensors 1 and 2, $Z^{i}$, $i = 1, 2$, is the set of accumulated data, and $\chi$ is the association hypothesis. The first term on the right side of the equation is calculated from the associations that were made earlier. The second term is computed from the individual association probabilities as follows:

$$P(\chi_j^{1}, \chi_l^{2} \mid Z^{1}, Z^{2}) = \sum_{\chi^{1}} \sum_{\chi^{2}} P(\chi^{1}, \chi^{2} \mid Z^{1}, Z^{2})\, \hat{\chi}_j^{1}(\chi^{1})\, \hat{\chi}_l^{2}(\chi^{2}),$$
$$P(\chi^{1}, \chi^{2} \mid Z^{1}, Z^{2}) = \frac{1}{c}\, P(\chi^{1} \mid Z^{1})\, P(\chi^{2} \mid Z^{2})\, \gamma(\chi^{1}, \chi^{2}), \quad (18)$$

where $\chi^{i}$ are the joint hypotheses involving all of the measurements and all of the targets, and $\hat{\chi}_j^{i}(\chi^{i})$ are the binary indicators of the measurement-target association. The additional term $\gamma(\chi^{1}, \chi^{2})$ depends on the correlation of the individual hypotheses and reflects the localization influence of the current measurements on the joint hypotheses.

These equations are obtained by assuming that communication takes place after every observation; only approximations exist for the case in which communication is sporadic or a substantial amount of noise occurs. Therefore, this algorithm is a theoretical model that has some limitations in practical applications.

3.6. Distributed Multiple Hypothesis Test. The distributed version of the MHT algorithm (MHT-D) [26, 27] follows a similar structure to the JPDA-D algorithm. Let us assume the case in which one node must fuse two sets of hypotheses and tracks. If the hypothesis and track sets are represented by $H^{i}(Z^{i})$ and $T^{i}(Z^{i})$, with $i = 1, 2$, the hypothesis probabilities are represented by $P(\lambda_j^{i})$, and the state distribution of the tracks $\tau_j^{i}$ is represented by $P(x \mid Z^{i}, \tau_j^{i})$, then the maximum available information at the fusion node is $Z = Z^{1} \cup Z^{2}$. The data fusion objective of the MHT-D is to obtain the set of hypotheses $H(Z)$, the set of tracks $T(Z)$, the hypothesis probabilities $P(\lambda \mid Z)$, and the state distribution $p(x \mid Z, \tau)$ for the observed data.

The MHT-D algorithm is composed of the following steps:

(1) hypothesis formation: for each hypothesis pair $\lambda_j^{1}$ and $\lambda_k^{2}$ that could be fused, a track $\tau$ is formed by associating the pair of tracks $\tau_j^{1}$ and $\tau_k^{2}$, where each pair comes from one node and could originate from the same target. The final result of this stage is a set of hypotheses, denoted by $H(Z)$, and the fused tracks $T(Z)$;

(2) hypothesis evaluation: in this stage, the association probability of each hypothesis and the estimated state of each fused track are obtained. The distributed estimation algorithm is employed to calculate the likelihood of the possible associations and the estimations obtained for each specific association.


Using the information model, the probability of each fused hypothesis is given by

$$P(\lambda \mid Z) = C^{-1} \prod_{j \in J} P(\lambda^{(j)} \mid Z^{(j)})^{\alpha(j)} \prod_{\tau \in \lambda} L(\tau \mid Z), \quad (19)$$

where $C$ is a normalizing constant and $L(\tau \mid Z)$ is the likelihood of each hypothesis pair.

The main disadvantage of the MHT-D is its high computational cost, which is on the order of $O(n^{M})$, where $n$ is the number of possible associations and $M$ is the number of variables to be estimated.

3.7. Graphical Models. Graphical models are a formalism for representing and reasoning with probabilities and independence. A graphical model represents a conditional decomposition of the joint probability. A graphical model can be represented as a graph in which the nodes denote random variables, the edges denote the possible dependence between the random variables, and the plates denote the replication of a substructure with the appropriate indexing of the relevant variables. The graph captures the joint distribution over the random variables, which can be decomposed into a product of factors that each depend on only a subset of the variables. There are two major classes of graphical models: (i) Bayesian networks [28], also known as directed graphical models, and (ii) Markov random fields, also known as undirected graphical models. Directed graphical models are useful for expressing causal relationships between random variables, whereas undirected models are better suited for expressing soft constraints between random variables. We refer the reader to the book of Koller and Friedman [29] for more information on graphical models.

A framework based on graphical models can solve the problem of distributed data association in synchronized sensor networks with overlapping areas in which each sensor receives noisy measurements; this solution was proposed by Chen et al. [30, 31]. Their work uses graphical models to represent the statistical dependence between random variables. The data association problem is treated as an inference problem and solved by using the max-product algorithm [32]. Graphical models represent the statistical dependencies between variables as graphs, and the max-product algorithm converges when the graph is a tree structure. Moreover, the employed algorithm could be implemented in a distributed manner by exchanging messages between the source nodes in parallel. With this algorithm, if each sensor has $n$ possible combinations of associations and there are $M$ variables to be estimated, the complexity is $O(n^{2} M)$, which is reasonable and lower than the $O(n^{M})$ complexity of the MHT-D algorithm. However, special attention must be given to the correlated variables when building the graphical model.
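As a small illustration of the max-product algorithm on a tree-structured (here, chain-structured) graphical model, the following sketch computes the jointly most probable assignment by message passing; the node and edge potentials are illustrative assumptions and do not model any particular sensor network.

```python
import numpy as np

def max_product_chain(unaries, pairwise):
    """Max-product message passing on a chain-structured model.

    unaries: list of (K,) node potentials; pairwise: list of (K, K)
    potentials linking consecutive nodes. On a tree (a chain is a
    tree), max-product recovers the exact MAP assignment [32]."""
    n = len(unaries)
    msgs = [unaries[0]] + [None] * (n - 1)   # forward max-messages
    back = [None] * n                        # argmax pointers
    for i in range(1, n):
        scores = msgs[i - 1][:, None] * pairwise[i - 1]   # K x K
        back[i] = scores.argmax(axis=0)
        msgs[i] = unaries[i] * scores.max(axis=0)
    # Backtrack the jointly most probable assignment.
    assignment = [int(msgs[-1].argmax())]
    for i in range(n - 1, 0, -1):
        assignment.append(int(back[i][assignment[-1]]))
    return assignment[::-1]

# Three binary association variables with smoothing edge potentials:
unaries = [np.array([0.6, 0.4]), np.array([0.5, 0.5]), np.array([0.2, 0.8])]
pairwise = [np.array([[0.9, 0.1], [0.1, 0.9]])] * 2
print(max_product_chain(unaries, pairwise))
```

In a distributed implementation, each message would be computed locally at a node and exchanged with its neighbors, which is what makes this approach attractive for sensor networks.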

4. State Estimation Methods

State estimation techniques aim to determine the state of the target under movement (typically the position), given the observations or measurements. State estimation techniques are also known as tracking techniques. In their general form, it is not guaranteed that the target observations are relevant, which means that some of the observations could actually come from the target and others could be only noise. The state estimation phase is a common stage in data fusion algorithms because the target's observations could come from different sensors or sources, and the final goal is to obtain a global target state from the observations.

The estimation problem involves finding the values of the state vector (e.g., position, velocity, and size) that fit as much as possible with the observed data. From a mathematical perspective, we have a set of redundant observations, and the goal is to find the set of parameters that provides the best fit to the observed data. In general, these observations are corrupted by errors and by the propagation of noise in the measurement process. State estimation methods fall under level 1 of the JDL classification and can be divided into two broader groups:

(1) linear dynamics and measurements: here, the estimation problem has a standard solution. Specifically, when the equations of the object state and the measurements are linear, the noise follows the Gaussian distribution, and the environment is not cluttered, the optimal theoretical solution is based on the Kalman filter;

(2) nonlinear dynamics: the state estimation problem becomes difficult, and there is no analytical solution to the problem in a general manner. In principle, there are no practical algorithms available to solve this problem satisfactorily.

Most of the state estimation methods are based on control theory and employ the laws of probability to compute a state vector from a measurement vector or a stream of measurement vectors. The most common estimation methods are presented next, including maximum likelihood and maximum posterior (Section 4.1), the Kalman filter (Section 4.2), the particle filter (Section 4.3), the distributed Kalman filter (Section 4.4), the distributed particle filter (Section 4.5), and covariance consistency methods (Section 4.6).

4.1. Maximum Likelihood and Maximum Posterior. The maximum likelihood (ML) technique is an estimation method that is based on probabilistic theory. Probabilistic estimation methods are appropriate when the state variable follows an unknown probability distribution [33]. In the context of data fusion, $x$ is the state being estimated, and $z = (z(1), \ldots, z(k))$ is a sequence of $k$ previous observations of $x$. The likelihood function $\lambda(x)$ is defined as a probability density function of the sequence of $z$ observations given the true value of the state $x$:

$$\lambda(x) = p(z \mid x). \quad (20)$$

The ML estimator finds the value of $x$ that maximizes the likelihood function:

$$\hat{x}(k) = \arg\max_{x}\, p(z \mid x), \quad (21)$$


which can be obtained from the analytical or empirical models of the sensors. This function expresses the probability of the observed data. The main disadvantage of this method in practice is that the analytical or empirical model of the sensor must be known to provide the prior distribution and compute the likelihood function. This method can also systematically underestimate the variance of the distribution, which leads to a bias problem. However, the bias of the ML solution becomes less significant as the number $N$ of data points increases, and the estimated variance equals the true variance of the distribution that generated the data in the limit $N \rightarrow \infty$.

The maximum posterior (MAP) method is based on Bayesian theory. It is employed when the parameter $x$ to be estimated is the output of a random variable that has a known probability density function $p(x)$. In the context of data fusion, $x$ is the state being estimated, and $z = (z(1), \ldots, z(k))$ is a sequence of $k$ previous observations of $x$. The MAP estimator finds the value of $x$ that maximizes the posterior probability distribution:

$$\hat{x}(k) = \arg\max_{x}\, p(x \mid z). \quad (22)$$

Both methods (ML and MAP) aim to find the most likely value for the state $x$. However, ML assumes that $x$ is a fixed but unknown point of the parameter space, whereas MAP considers $x$ to be the output of a random variable with a known a priori probability density function. The two methods are equivalent when there is no a priori information about $x$, that is, when there are only observations.
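The difference between the two estimators can be seen in a one-dimensional Gaussian example; the sketch below assumes a Gaussian prior and Gaussian measurement noise, and all numeric values are illustrative.

```python
import numpy as np

# ML vs. MAP estimation of a scalar state x from noisy observations
# z_i = x + v_i, with v_i ~ N(0, r). For MAP, the prior is x ~ N(mu0, p0).
r, mu0, p0 = 0.5, 0.0, 2.0
z = np.array([1.1, 0.9, 1.3, 1.0])

# ML: maximizes p(z | x); for Gaussian noise this is the sample mean.
x_ml = z.mean()

# MAP: maximizes p(x | z), a precision-weighted combination of the
# prior mean and the data (the closed form for the Gaussian case).
n = len(z)
x_map = (mu0 / p0 + z.sum() / r) / (1.0 / p0 + n / r)

print(x_ml, x_map)  # MAP is pulled slightly toward the prior mean
```

With a flat prior ($p_0 \rightarrow \infty$), the MAP estimate collapses onto the ML estimate, which is exactly the equivalence noted above.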

4.2. The Kalman Filter. The Kalman filter is the most popular estimation technique. It was originally proposed by Kalman [34] and has been widely studied and applied since then. The Kalman filter estimates the state $x$ of a discrete-time process governed by the following space-time model:

$$x(k+1) = \Phi(k)\, x(k) + G(k)\, u(k) + w(k), \quad (23)$$

with the observations or measurements $z$ at time $k$ of the state $x$ represented by

$$z(k) = H(k)\, x(k) + v(k), \quad (24)$$

where $\Phi(k)$ is the state transition matrix, $G(k)$ is the input transition matrix, $u(k)$ is the input vector, $H(k)$ is the measurement matrix, and $w$ and $v$ are random Gaussian variables with zero mean and covariance matrices $Q(k)$ and $R(k)$, respectively. Based on the measurements and on the system parameters, the estimation of $x(k)$, which is represented by $\hat{x}(k)$, and the prediction of $x(k+1)$, which is represented by $\hat{x}(k+1 \mid k)$, are given by

$$\hat{x}(k) = \hat{x}(k \mid k-1) + K(k)\, [z(k) - H(k)\, \hat{x}(k \mid k-1)],$$
$$\hat{x}(k+1 \mid k) = \Phi(k)\, \hat{x}(k \mid k) + G(k)\, u(k), \quad (25)$$

respectively, where $K$ is the filter gain, determined by

$$K(k) = P(k \mid k-1)\, H^{T}(k) \left[ H(k)\, P(k \mid k-1)\, H^{T}(k) + R(k) \right]^{-1}, \quad (26)$$

where $P(k \mid k-1)$ is the prediction covariance matrix, which can be determined by

$$P(k+1 \mid k) = \Phi(k)\, P(k)\, \Phi^{T}(k) + Q(k), \quad (27)$$

with

$$P(k) = P(k \mid k-1) - K(k)\, H(k)\, P(k \mid k-1). \quad (28)$$

The Kalman filter is mainly employed to fuse low-level data. If the system can be described with a linear model and the error can be modeled as Gaussian noise, then the recursive Kalman filter obtains optimal statistical estimations [35]. However, other methods are required to address nonlinear dynamic models and nonlinear measurements. The modified Kalman filter, known as the extended Kalman filter (EKF), is a popular approach for implementing nonlinear recursive filters [36]. The EKF is one of the most frequently employed methods for fusing data in robotic applications. However, it has some disadvantages because the computation of the Jacobians is extremely expensive. Some attempts have been made to reduce the computational cost, such as linearization, but these attempts introduce errors into the filter and make it unstable.

The unscented Kalman filter (UKF) [37] has gained popularity because it does not require the linearization step and avoids the associated errors of the EKF [38]. The UKF employs a deterministic sampling strategy to establish a minimal set of points around the mean. This set of points completely captures the true mean and covariance. These points are then propagated through the nonlinear functions, and the covariance of the estimations can be recovered. Another advantage of the UKF is its suitability for parallel implementations.

4.3. Particle Filter. Particle filters are recursive implementations of sequential Monte Carlo methods [39]. This method builds the posterior density function using several random samples called particles. Particles are propagated over time with a combination of sampling and resampling steps. At each iteration, the resampling step is employed to discard some particles, increasing the relevance of regions with a higher posterior probability. In the filtering process, several particles of the same state variable are employed, and each particle has an associated weight that indicates the quality of the particle. Therefore, the estimation is the result of a weighted sum of all of the particles. The standard particle filter algorithm has two phases: (1) the prediction phase and (2) the updating phase. In the prediction phase, each particle is modified according to the existing model, with random noise added to simulate the noise effect. Then, in the updating phase, the weight of each particle is reevaluated using the latest available sensor observation, and particles with lower weights are removed. Specifically, a generic particle filter comprises the following steps (a compact sketch in code is given after the list).


(1) Initialization of the particles:

(i) let $N$ be equal to the number of particles;

(ii) $X^{(i)}(1) = [x(1), y(1), 0, 0]^{T}$ for $i = 1, \ldots, N$.

(2) Prediction step:

(i) for each particle $i = 1, \ldots, N$, evaluate the predicted state $(k+1 \mid k)$ of the system using the state at time instant $k$ plus the system noise at time $k$:

$$X^{(i)}(k+1 \mid k) = F(k)\, X^{(i)}(k) + (\text{Cauchy-distribution noise})(k), \quad (29)$$

where $F(k)$ is the transition matrix of the system.

(3) Evaluate the particle weights. For each particle $i = 1, \ldots, N$:

(i) compute the predicted observation of the system using the current predicted state and the measurement noise at instant $k+1$:

$$\hat{z}^{(i)}(k+1 \mid k) = H(k+1)\, X^{(i)}(k+1 \mid k) + (\text{Gaussian measurement noise})(k+1); \quad (30)$$

(ii) compute the likelihood (weights) according to the given distribution:

$$\text{likelihood}^{(i)} = N\left(\hat{z}^{(i)}(k+1 \mid k);\ z(k+1),\ \text{var}\right); \quad (31)$$

(iii) normalize the weights as follows:

$$w^{(i)} = \frac{\text{likelihood}^{(i)}}{\sum_{j=1}^{N} \text{likelihood}^{(j)}}. \quad (32)$$

(4) Resampling/selection: multiply particles with higher weights and remove those with lower weights. The current state must be adjusted using the computed weights of the new particles:

(i) compute the cumulative weights:

$$\text{Cum\_Wt}^{(i)} = \sum_{j=1}^{i} w^{(j)}; \quad (33)$$

(ii) generate uniformly distributed random variables $U^{(i)} \sim U(0, 1)$, with the number of draws equal to the number of particles;

(iii) determine which particles should be multiplied and which ones removed.

(5) Propagation phase:

(i) incorporate the new values of the state after the resampling at instant $k$ to calculate the value at instant $k+1$:

$$x^{(1, \ldots, N)}(k+1 \mid k+1) = \hat{x}(k+1 \mid k); \quad (34)$$

(ii) compute the posterior mean:

$$\hat{x}(k+1) = \text{mean}\left[x^{i}(k+1 \mid k+1)\right], \quad i = 1, \ldots, N; \quad (35)$$

(iii) repeat steps (2) to (5) for each time instant.
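A compact sketch of one cycle of the steps above is given next; as in the description, it uses Cauchy-distributed system noise and a Gaussian likelihood for the weights, but the dynamic model, noise scales, and number of particles are illustrative assumptions.

```python
import numpy as np

def particle_filter_step(particles, z, F, H, sys_noise, meas_var, rng):
    """One predict/weight/resample cycle of the generic particle filter.

    particles: (N, d) state samples; z: current measurement;
    sys_noise: callable generating system-noise samples."""
    N = len(particles)
    # Step (2): propagate each particle through the dynamics plus noise.
    particles = particles @ F.T + sys_noise(size=particles.shape)
    # Step (3): Gaussian likelihood of each predicted observation, eq. (31).
    z_pred = particles @ H.T
    w = np.exp(-0.5 * np.sum((z_pred - z) ** 2, axis=1) / meas_var)
    w /= w.sum()
    # Step (4): resample, multiplying high-weight particles.
    particles = particles[rng.choice(N, size=N, p=w)]
    # Step (5): the posterior mean is the state estimate, eq. (35).
    return particles, particles.mean(axis=0)

rng = np.random.default_rng(0)
F = np.array([[1.0, 1.0], [0.0, 1.0]])     # constant-velocity model
H = np.array([[1.0, 0.0]])                 # observe position only
cauchy = lambda size: 0.05 * rng.standard_cauchy(size)
particles = rng.normal(size=(500, 2))
particles, x_hat = particle_filter_step(particles, np.array([1.0]),
                                        F, H, cauchy, meas_var=0.25, rng=rng)
```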

Particle filters are more flexible than Kalman filters and can cope with nonlinear dependencies and non-Gaussian densities in the dynamic model and in the noise error. However, they have some disadvantages. A large number of particles is required to obtain a small variance in the estimator. It is also difficult to establish the optimal number of particles in advance, and the number of particles significantly affects the computational cost. Earlier versions of particle filters employed a fixed number of particles, but recent studies have started to use a dynamic number of particles [40].

4.4. The Distributed Kalman Filter. The distributed Kalman filter requires correct clock synchronization between the sources, as demonstrated in [41]. In other words, to correctly use the distributed Kalman filter, the clocks of all of the sources must be synchronized. This synchronization is typically achieved by using protocols that employ a shared global clock, such as the network time protocol (NTP). Synchronization problems between clocks have been shown to affect the accuracy of the Kalman filter, producing inaccurate estimations [42].

If the estimations are consistent and the cross-covariance is known (or the estimations are uncorrelated), then it is possible to use the distributed Kalman filter [43]. However, the cross-covariance must be determined exactly, or the observations must be consistent.

We refer the reader to Liggins II et al. [20] for more details about the Kalman filter in a distributed and hierarchical architecture.

4.5. Distributed Particle Filter. Distributed particle filters have gained attention recently [44-46]. Coates [45] used a distributed particle filter to monitor an environment that could be captured by a Markovian state-space model involving nonlinear dynamics and observations and non-Gaussian noise.

In contrast, earlier attempts to handle out-of-sequence measurements using particle filters were based on regenerating the probability density function at the time instant of the out-of-sequence measurement [47]. In a particle filter, this step requires a large computational cost, in addition to the space necessary to store the previous particles. To avoid this problem, Orton and Marrs [48] proposed storing the information on the particles at each time instant, saving the cost of recalculating this information. This technique is close to optimal, and when the delay increases, the result is only slightly affected [49]. However, it requires a very large amount of space to store the state of the particles at each time instant.

4.6. Covariance Consistency Methods: Covariance Intersection/Union. Covariance consistency methods (intersection and union) were proposed by Uhlmann [43] and are general and fault-tolerant frameworks for maintaining covariance means and estimations in a distributed network. These methods are not estimation techniques in themselves; rather, they act as an estimation fusion technique. The distributed Kalman filter requirement of independent measurements or known cross-covariances is not a constraint with this method.

4.6.1. Covariance Intersection. If the Kalman filter is employed to combine two estimations (a_1, A_1) and (a_2, A_2), then it is assumed that the joint covariance has the following form:

    [A_1, X; X^T, A_2],    (36)

where the cross-covariance X should be known exactly so that the Kalman filter can be applied without difficulty. Because the computation of the cross-covariances is computationally intensive, Uhlmann [43] proposed the covariance intersection (CI) algorithm.

Let us assume that a joint covariance M can be defined with the diagonal blocks M_{A_1} > A_1 and M_{A_2} > A_2. Consider

    M ⪰ [A_1, X; X^T, A_2]    (37)

for every possible instance of the unknown cross-covariance X; then, the components of the matrix M can be employed in the Kalman filter equations to provide a fused estimation (c, C) that is considered consistent. The key point of this method relies on generating a joint covariance matrix M that can represent a useful fused estimation (in this context, useful refers to an estimation with a lower associated uncertainty). In summary, the CI algorithm computes the joint covariance matrix M for which the Kalman filter provides the best fused estimation (c, C) with respect to a fixed measure of the covariance matrix (i.e., the minimum determinant).

Specific covariance criteria must be established because there is no unique minimum joint covariance in the order of the positive semidefinite matrices. Moreover, although the joint covariance is the basis of the formal analysis of the CI algorithm, the actual result is a nonlinear mixture of the information stored in the estimations being fused, given by the following equations:

    C = (w_1 H_1^T A_1^{-1} H_1 + w_2 H_2^T A_2^{-1} H_2 + ... + w_n H_n^T A_n^{-1} H_n)^{-1},
    c = C (w_1 H_1^T A_1^{-1} a_1 + w_2 H_2^T A_2^{-1} a_2 + ... + w_n H_n^T A_n^{-1} a_n),    (38)

where H_i is the transformation of the fused state-space estimation to the space of the estimated state i. The values of w can be calculated to minimize the covariance determinant by using convex optimization packages and semidefinite programming. The result of the CI algorithm has different characteristics compared with the Kalman filter. For example, if two estimations (a, A) and (b, B) are provided and their covariances are equal, A = B, then the Kalman filter, which is based on the statistical independence assumption, produces a fused estimation with covariance C = (1/2)A. In contrast, the CI method does not assume independence and thus must remain consistent even in the case in which the estimations are completely correlated, yielding the estimated fused covariance C = A. In the case of estimations where A < B, the CI algorithm takes no information from the estimation (b, B); thus, the fused result is (a, A).
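As an illustration, the following sketch (Python/NumPy with SciPy) implements the two-estimate case of (38) with H_1 = H_2 = I, choosing the weight w by a bounded scalar search that minimizes the determinant of the fused covariance; the function name is illustrative.

    import numpy as np
    from scipy.optimize import minimize_scalar

    def covariance_intersection(a1, A1, a2, A2):
        inv1, inv2 = np.linalg.inv(A1), np.linalg.inv(A2)
        # Fused covariance for a given weight w, as in (38) with H_i = I.
        fused_cov = lambda w: np.linalg.inv(w * inv1 + (1.0 - w) * inv2)
        # Choose w to minimize the determinant of the fused covariance.
        w = minimize_scalar(lambda w: np.linalg.det(fused_cov(w)),
                            bounds=(0.0, 1.0), method="bounded").x
        C = fused_cov(w)
        c = C @ (w * inv1 @ a1 + (1.0 - w) * inv2 @ a2)
        return c, C, w

Note that when A_1 = A_2 = A, this yields C = A for any weight, in line with the consistency behavior described above.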

Every jointly consistent covariance is sufficient to produce a fused estimation that guarantees consistency. However, it is also necessary to guarantee a lack of divergence. Divergence is avoided in the CI algorithm by choosing a specific measure (i.e., the determinant) that is minimized in each fusion operation. This measure represents a nondivergence criterion because the size of the estimated covariance, according to this criterion, is never increased.

The application of the CI method guarantees consistency and nondivergence for every sequence of mean- and covariance-consistent estimations. However, this method does not work well when the measurements to be fused are inconsistent.

4.6.2. Covariance Union. CI solves the problem of correlated inputs but not the problem of inconsistent inputs (inconsistent inputs refer to different estimations, each of which has a high accuracy (small variance) but also a large difference from the states of the others); thus, the covariance union (CU) algorithm was proposed to solve the latter [43]. CU addresses the following problem: two estimations (a_1, A_1) and (a_2, A_2) relate to the state of an object and are mutually inconsistent. This issue arises when the difference between the mean estimations is larger than the provided covariance. Inconsistent inputs can be detected by using the Mahalanobis distance [50] between them, which is defined as

    M_d = (a_1 - a_2)^T (A_1 + A_2)^{-1} (a_1 - a_2),    (39)

and by detecting whether this distance is larger than a given threshold.

The Mahalanobis distance accounts for the covariance information when computing the distance. If the difference between the estimations is large but their covariances are also large, the Mahalanobis distance yields a small value. In contrast, if the difference between the estimations is small and the covariances are small, it could produce a larger distance value. A high Mahalanobis distance could indicate that the estimations are inconsistent; however, a specific threshold is necessary, either established by the user or learned automatically.
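A minimal sketch of this consistency test (Python/NumPy) follows; the threshold value is an assumption to be set by the user (e.g., a chi-square gate), not a constant from the text.

    import numpy as np

    def is_inconsistent(a1, A1, a2, A2, threshold):
        d = a1 - a2
        # Squared Mahalanobis distance between the two estimates, as in (39).
        m2 = float(d @ np.linalg.solve(A1 + A2, d))
        return m2 > threshold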

The CU algorithm aims to solve the following problem: let us suppose that a filtering algorithm provides two observations with means and covariances (a_1, A_1) and (a_2, A_2), respectively. It is known that one of the observations is correct and the other is erroneous; however, the identity of the correct estimation is unknown and cannot be determined. In this situation, if both estimations are employed as an input to the Kalman filter, there will be a problem because the Kalman filter only guarantees a consistent output if the observation is updated with a measurement consistent with it. In the specific case in which the measurements correspond to the same object but are acquired from two different sensors, the Kalman filter can only guarantee that the output is consistent if it is consistent with both measurements separately. Because it is not possible to know which estimation is correct, the only way to combine the two estimations rigorously is to provide an estimation (u, U) that is consistent with both estimations and obeys the following properties:

    U ⪰ A_1 + (u - a_1)(u - a_1)^T,
    U ⪰ A_2 + (u - a_2)(u - a_2)^T,    (40)

where some measure of the size of the matrix U (i.e., the determinant) is minimized.

In other words, the previous equations indicate that if the estimation (a_1, A_1) is consistent, then translating the vector a_1 to u requires increasing the covariance by adding a matrix at least as large as the outer product of (u - a_1) in order to remain consistent. The same situation applies to the measurement (a_2, A_2).

A simple strategy is to choose the mean of one of the measurements as the fused value (u = a_1). In this case, the value of U must be chosen such that the estimation is consistent in the worst case (when the correct measurement is a_2). However, it is possible to assign u an intermediate value between a_1 and a_2 to decrease the value of U. Therefore, the CU algorithm establishes the fused mean value u that has the least covariance U that is still sufficiently large with respect to the two measurements (a_1 and a_2) to guarantee consistency.

Because the matrix inequalities presented in the previous equations are convex, convex optimization algorithms must be employed to solve them. The value of U can be computed with the iterative method described by Julier et al. [51]. The obtained covariance could be significantly larger than either of the initial covariances, which is an indicator of the uncertainty between the initial estimations. One of the advantages of the CU method arises from the fact that the same process can easily be extended to N inputs.
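Given this convexity, the CU problem can be posed as a small semidefinite program. The following sketch (Python with the cvxpy package, assumed available) encodes the constraints of (40) through their Schur-complement form and minimizes the trace of U as a convex surrogate for the determinant-based size measure discussed above; it is an illustrative formulation, not the iterative method of [51].

    import cvxpy as cp
    import numpy as np

    def covariance_union(a1, A1, a2, A2):
        n = a1.shape[0]
        u = cp.Variable(n)
        U = cp.Variable((n, n), PSD=True)
        constraints = []
        for a, A in ((a1, A1), (a2, A2)):
            # Schur-complement form of U >= A + (u - a)(u - a)^T, as in (40).
            d = cp.reshape(u - a, (n, 1))
            constraints.append(cp.bmat([[U - A, d], [d.T, np.eye(1)]]) >> 0)
        cp.Problem(cp.Minimize(cp.trace(U)), constraints).solve()
        return u.value, U.value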

5. Decision Fusion Methods

In the data fusion domain, a decision is typically made based on knowledge of the perceived situation, which is provided by many sources. Decision fusion techniques aim to make a high-level inference about the events and activities produced by the detected targets. These techniques often use symbolic information, and the fusion process requires reasoning while accounting for uncertainties and constraints. These methods fall under level 2 (situation assessment) and level 4 (impact assessment) of the JDL data fusion model.

5.1. The Bayesian Methods. Information fusion based on Bayesian inference provides a formalism for combining evidence according to the rules of probability theory. Uncertainty is represented using conditional probability terms that describe beliefs and take values in the interval [0, 1], where zero indicates a complete lack of belief and one indicates an absolute belief. Bayesian inference is based on the Bayes rule:

    P(Y | X) = P(X | Y) P(Y) / P(X),    (41)

where the posterior probability P(Y | X) represents the belief in the hypothesis Y given the information X. This probability is obtained by multiplying the a priori probability of the hypothesis, P(Y), by the probability of observing X given that Y is true, P(X | Y). The value P(X) is used as a normalizing constant. The main disadvantage of Bayesian inference is that the probabilities P(X) and P(X | Y) must be known. To estimate the conditional probabilities, Pan et al. [52] proposed the use of NNs, whereas Coue et al. [53] proposed Bayesian programming.
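As a small worked example of (41), the following sketch (Python) fuses the reports of two sensors over a discrete set of hypotheses by applying the Bayes rule sequentially; the prior and likelihood values are invented for illustration.

    def bayes_update(prior, likelihood):
        # Multiply the prior P(Y) by the likelihood P(X | Y), then normalize by P(X).
        unnorm = {h: prior[h] * likelihood[h] for h in prior}
        evidence = sum(unnorm.values())          # P(X), the normalizing constant
        return {h: p / evidence for h, p in unnorm.items()}

    prior = {"target": 0.3, "no target": 0.7}
    posterior = bayes_update(prior, {"target": 0.8, "no target": 0.1})      # sensor 1
    posterior = bayes_update(posterior, {"target": 0.7, "no target": 0.2})  # sensor 2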

Hall and Llinas [54] described the following problems associated with Bayesian inference:

(i) difficulty in establishing the value of the a priori probabilities;

(ii) complexity when there are multiple potential hypotheses and a substantial number of events that depend on the conditions;

(iii) the requirement that the hypotheses be mutually exclusive;

(iv) difficulty in describing the uncertainty of the decisions.

5.2. The Dempster-Shafer Inference. The Dempster-Shafer inference is based on the mathematical theory introduced by Dempster [55] and Shafer [56], which generalizes the Bayesian theory. The Dempster-Shafer theory provides a formalism that can be used to represent incomplete knowledge, update beliefs, and combine evidence, and it allows the uncertainty to be represented explicitly [57].

A fundamental concept in Dempster-Shafer reasoning is the frame of discernment, which is defined as follows. Let Θ = {θ_1, θ_2, ..., θ_N} be the set of all possible states that define the system, and let Θ be exhaustive and mutually exclusive, because the system can be in only one state θ_i ∈ Θ, where 1 ≤ i ≤ N. The set Θ is called a frame of discernment because its elements are employed to discern the current state of the system.

The elements of the set 2^Θ are called hypotheses. In the Dempster-Shafer theory, based on the evidence E, a probability is assigned to each hypothesis H ∈ 2^Θ according to the basic probability assignment, or mass function, m: 2^Θ → [0, 1], which satisfies

    m(∅) = 0.    (42)


Thus, the mass function of the empty set is zero. Furthermore, the mass function of a hypothesis is greater than or equal to zero for all hypotheses:

    m(H) ≥ 0  ∀H ∈ 2^Θ.    (43)

The sum of the mass functions of all of the hypotheses is one:

    Σ_{H ∈ 2^Θ} m(H) = 1.    (44)

To express incomplete beliefs in a hypothesis H, the Dempster-Shafer theory defines the belief function bel: 2^Θ → [0, 1] over Θ as

    bel(H) = Σ_{A ⊆ H} m(A),    (45)

where bel(∅) = 0 and bel(Θ) = 1. The level of doubt in H can be expressed in terms of the belief function by

    dou(H) = bel(¬H) = Σ_{A ⊆ ¬H} m(A).    (46)

To express the plausibility of each hypothesis, the function pl: 2^Θ → [0, 1] over Θ is defined as

    pl(H) = 1 - dou(H) = Σ_{A ∩ H ≠ ∅} m(A).    (47)

Intuitively, plausibility indicates that there is less uncertainty in hypothesis H if it is more plausible. The confidence interval [bel(H), pl(H)] defines the true belief in hypothesis H. To combine the effects of the two mass functions m_1 and m_2, the Dempster-Shafer theory defines the combination rule m_1 ⊕ m_2 as

    m_1 ⊕ m_2 (∅) = 0,
    m_1 ⊕ m_2 (H) = ( Σ_{X ∩ Y = H} m_1(X) m_2(Y) ) / ( 1 - Σ_{X ∩ Y = ∅} m_1(X) m_2(Y) ).    (48)
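The definitions in (42)-(48) translate directly into code. The sketch below (Python) represents hypotheses as frozensets over the frame of discernment; the function names and the example mass assignments are illustrative, not taken from the text.

    def dempster_combine(m1, m2):
        # Combination rule (48); masses map frozenset hypotheses to values in [0, 1].
        fused, conflict = {}, 0.0
        for X, mx in m1.items():
            for Y, my in m2.items():
                inter = X & Y
                if inter:
                    fused[inter] = fused.get(inter, 0.0) + mx * my
                else:
                    conflict += mx * my      # mass falling on the empty set
        return {H: v / (1.0 - conflict) for H, v in fused.items()}

    def bel(m, H):   # belief function (45)
        return sum(v for A, v in m.items() if A <= H)

    def pl(m, H):    # plausibility function (47)
        return sum(v for A, v in m.items() if A & H)

    theta = frozenset({"ship", "decoy"})
    m1 = {frozenset({"ship"}): 0.6, theta: 0.4}
    m2 = {frozenset({"ship"}): 0.5, frozenset({"decoy"}): 0.2, theta: 0.3}
    m12 = dempster_combine(m1, m2)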

In contrast to the Bayesian inference, a priori probabilities are not required in the Dempster-Shafer inference because they are assigned at the instant that the information is provided. Several studies in the literature have compared the use of the Bayesian inference and the Dempster-Shafer inference, such as [58-60]. Wu et al. [61] used the Dempster-Shafer theory to fuse information in context-aware environments. This work was extended in [62] to dynamically modify the weights associated with the sensor measurements; therefore, the fusion mechanism is calibrated according to the recent measurements of the sensors (in cases in which the ground truth is available). In the military domain [63], Dempster-Shafer reasoning is used with a priori information stored in a database for classifying military ships. Morbee et al. [64] described the use of the Dempster-Shafer theory to build 2D occupancy maps from several cameras and to evaluate the contribution of subsets of cameras to a specific task. Each task is the observation of an event of interest, and the goal is to assess the validity of a set of hypotheses that are fused using the Dempster-Shafer theory.

5.3. Abductive Reasoning. Abductive reasoning, or inferring the best explanation, is a reasoning method in which a hypothesis is chosen under the assumption that, in case it is true, it explains the observed event most accurately [65]. In other words, when an event is observed, the abduction method attempts to find the best explanation.

In the context of probabilistic reasoning, abductive inference finds the posterior maximum likelihood of the system variables given some observed variables. Abductive reasoning is more a reasoning pattern than a data fusion technique; therefore, different inference methods, such as NNs [66] or fuzzy logic [67], can be employed.
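A brute-force illustration of this idea (Python) is sketched below: it scores each candidate explanation by P(observation | hypothesis) P(hypothesis) and returns the best one; the scenario and the probabilities are invented for illustration.

    def best_explanation(hypotheses, prior, likelihood, observation):
        # Abduction as choosing the hypothesis with the highest posterior score.
        return max(hypotheses, key=lambda h: prior[h] * likelihood[(observation, h)])

    hypotheses = ["rain", "sprinkler"]
    prior = {"rain": 0.4, "sprinkler": 0.6}
    likelihood = {("wet grass", "rain"): 0.9, ("wet grass", "sprinkler"): 0.7}
    print(best_explanation(hypotheses, prior, likelihood, "wet grass"))  # -> "sprinkler"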

5.4. Semantic Methods. Decision fusion techniques that employ semantic data from different sources as input could provide more accurate results than those that rely on only single sources. There is a growing interest in techniques that automatically determine the presence of semantic features in videos to solve the semantic gap [68].

Semantic information fusion is essentially a scheme in which raw sensor data are processed such that the nodes exchange only the resultant semantic information. Semantic information fusion typically covers two phases: (i) building the knowledge and (ii) pattern matching (inference). The first phase (typically offline) incorporates the most appropriate knowledge into semantic information. Then, the second phase (typically online, or in real time) fuses the relevant attributes and provides a semantic interpretation of the sensor data [69-71].

Semantic fusion can be viewed as an approach for integrating and translating sensor data into formal languages. The language obtained from the observations of the environment is then compared with similar languages that are stored in a database. The key to this strategy is that behaviors represented by similar formal languages are also semantically similar. This type of method reduces transmission costs because the nodes need only transmit the formal language structure instead of the raw data. However, a known set of behaviors must be stored in a database in advance, which might be difficult in some scenarios.

6. Conclusions

This paper has reviewed the most popular methods and techniques for performing data/information fusion. To determine whether the application of data/information fusion methods is feasible, we must evaluate the computational cost of the process and the delay introduced in the communication. A centralized data fusion approach is theoretically optimal when there is no transmission cost and there are sufficient computational resources; however, this situation typically does not hold in practical applications.

The selection of the most appropriate technique depends on the type of problem and the established assumptions of each technique. Statistical data fusion methods (e.g., PDA, JPDA, MHT, and the Kalman filter) are optimal under specific conditions [72]. First, the assumption that the targets move independently and that the measurements are normally distributed around the predicted position typically does not hold. Second, because the statistical techniques model all of the events as probabilities, they typically require several parameters and a priori probabilities for false measurements and detection errors that are often difficult to obtain (at least in an optimal sense). For example, in the case of the MHT algorithm, specific parameters must be established that are nontrivial to determine and very sensitive [73]. Moreover, statistical methods that optimize over several frames are computationally intensive, and their complexity typically grows exponentially with the number of targets. For example, in the case of particle filters, tracking several targets can be accomplished either jointly, as a group, or individually. If several targets are tracked jointly, the necessary number of particles grows exponentially; therefore, in practice, it is better to track them individually, under the assumption that the targets do not interact.

In contrast to centralized systems, distributed data fusion methods introduce some challenges into the data fusion process, such as (i) spatial and temporal alignment of the information, (ii) out-of-sequence measurements, and (iii) data correlation, as reported by Castanedo et al. [74, 75]. The inherent redundancy of distributed systems can be exploited with distributed reasoning techniques and cooperative algorithms to improve the individual node estimations, as reported by Castanedo et al. [76]. In addition to the previous studies, a new trend based on the geometric notion of a low-dimensional manifold is gaining attention in the data fusion community. An example is the work of Davenport et al. [77], which proposes a simple model that captures the correlation between the sensor observations by matching the parameter values of the different obtained manifolds.

Acknowledgments

The author would like to thank Jesús García, Miguel A. Patricio, and James Llinas for their interesting discussions related to several topics presented in this paper.

References

[1] JDL, Data Fusion Lexicon, Technical Panel for C3, F. E. White, San Diego, Calif, USA, Code 420, 1991.
[2] D. L. Hall and J. Llinas, "An introduction to multisensor data fusion," Proceedings of the IEEE, vol. 85, no. 1, pp. 6-23, 1997.
[3] H. F. Durrant-Whyte, "Sensor models and multisensor integration," International Journal of Robotics Research, vol. 7, no. 6, pp. 97-113, 1988.
[4] B. V. Dasarathy, "Sensor fusion potential exploitation-innovative architectures and illustrative applications," Proceedings of the IEEE, vol. 85, no. 1, pp. 24-38, 1997.
[5] R. C. Luo, C.-C. Yih, and K. L. Su, "Multisensor fusion and integration: approaches, applications, and future research directions," IEEE Sensors Journal, vol. 2, no. 2, pp. 107-119, 2002.
[6] J. Llinas, C. Bowman, G. Rogova, A. Steinberg, E. Waltz, and F. White, "Revisiting the JDL data fusion model II," Technical Report, DTIC Document, 2004.
[7] E. P. Blasch and S. Plano, "JDL level 5 fusion model: 'user refinement' issues and applications in group tracking," in Proceedings of Signal Processing, Sensor Fusion, and Target Recognition XI, pp. 270-279, April 2002.
[8] H. F. Durrant-Whyte and M. Stevens, "Data fusion in decentralized sensing networks," in Proceedings of the 4th International Conference on Information Fusion, pp. 302-307, Montreal, Canada, 2001.
[9] J. Manyika and H. Durrant-Whyte, Data Fusion and Sensor Management: A Decentralized Information-Theoretic Approach, Prentice Hall, Upper Saddle River, NJ, USA, 1995.
[10] S. S. Blackman, "Association and fusion of multiple sensor data," in Multitarget-Multisensor Tracking: Advanced Applications, pp. 187-217, Artech House, 1990.
[11] S. Lloyd, "Least squares quantization in PCM," IEEE Transactions on Information Theory, vol. 28, no. 2, pp. 129-137, 1982.
[12] M. Shindler, A. Wong, and A. Meyerson, "Fast and accurate k-means for large datasets," in Proceedings of the 25th Annual Conference on Neural Information Processing Systems (NIPS '11), pp. 2375-2383, December 2011.
[13] Y. Bar-Shalom and E. Tse, "Tracking in a cluttered environment with probabilistic data association," Automatica, vol. 11, no. 5, pp. 451-460, 1975.
[14] T. E. Fortmann, Y. Bar-Shalom, and M. Scheffe, "Multi-target tracking using joint probabilistic data association," in Proceedings of the 19th IEEE Conference on Decision and Control including the Symposium on Adaptive Processes, vol. 19, pp. 807-812, December 1980.
[15] D. B. Reid, "An algorithm for tracking multiple targets," IEEE Transactions on Automatic Control, vol. 24, no. 6, pp. 843-854, 1979.
[16] C. L. Morefield, "Application of 0-1 integer programming to multitarget tracking problems," IEEE Transactions on Automatic Control, vol. 22, no. 3, pp. 302-312, 1977.
[17] R. L. Streit and T. E. Luginbuhl, "Maximum likelihood method for probabilistic multihypothesis tracking," in Proceedings of Signal and Data Processing of Small Targets, vol. 2235 of Proceedings of SPIE, p. 394, 1994.
[18] I. J. Cox and S. L. Hingorani, "Efficient implementation of Reid's multiple hypothesis tracking algorithm and its evaluation for the purpose of visual tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 2, pp. 138-150, 1996.
[19] K. G. Murty, "An algorithm for ranking all the assignments in order of increasing cost," Operations Research, vol. 16, no. 3, pp. 682-687, 1968.
[20] M. E. Liggins II, C.-Y. Chong, I. Kadar, et al., "Distributed fusion architectures and algorithms for target tracking," Proceedings of the IEEE, vol. 85, no. 1, pp. 95-106, 1997.
[21] S. Coraluppi, C. Carthel, M. Luettgen, and S. Lynch, "All-source track and identity fusion," in Proceedings of the National Symposium on Sensor and Data Fusion, 2000.
[22] P. Storms and F. Spieksma, "An LP-based algorithm for the data association problem in multitarget tracking," in Proceedings of the 3rd IEEE International Conference on Information Fusion, vol. 1, 2000.
[23] S.-W. Joo and R. Chellappa, "A multiple-hypothesis approach for multiobject visual tracking," IEEE Transactions on Image Processing, vol. 16, no. 11, pp. 2849-2854, 2007.
[24] S. Coraluppi and C. Carthel, "Aggregate surveillance: a cardinality tracking approach," in Proceedings of the 14th International Conference on Information Fusion (FUSION '11), July 2011.
[25] K. C. Chang, C. Y. Chong, and Y. Bar-Shalom, "Joint probabilistic data association in distributed sensor networks," IEEE Transactions on Automatic Control, vol. 31, no. 10, pp. 889-897, 1986.
[26] C. Y. Chong, S. Mori, and K. C. Chang, "Information fusion in distributed sensor networks," in Proceedings of the 4th American Control Conference, Boston, Mass, USA, June 1985.
[27] C. Y. Chong, S. Mori, and K. C. Chang, "Distributed multitarget multisensor tracking," in Multitarget-Multisensor Tracking: Advanced Applications, vol. 1, pp. 247-295, 1990.
[28] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, San Mateo, Calif, USA, 1988.
[29] D. Koller and N. Friedman, Probabilistic Graphical Models: Principles and Techniques, MIT Press, 2009.
[30] L. Chen, M. Cetin, and A. S. Willsky, "Distributed data association for multi-target tracking in sensor networks," in Proceedings of the 7th International Conference on Information Fusion (FUSION '05), pp. 9-16, July 2005.
[31] L. Chen, M. J. Wainwright, M. Cetin, and A. S. Willsky, "Data association based on optimization in graphical models with application to sensor networks," Mathematical and Computer Modelling, vol. 43, no. 9-10, pp. 1114-1135, 2006.
[32] Y. Weiss and W. T. Freeman, "On the optimality of solutions of the max-product belief-propagation algorithm in arbitrary graphs," IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 736-744, 2001.
[33] C. Brown, H. Durrant-Whyte, J. Leonard, B. Rao, and B. Steer, "Distributed data fusion using Kalman filtering: a robotics application," in Data Fusion in Robotics and Machine Intelligence, M. A. Abidi and R. C. Gonzalez, Eds., pp. 267-309, 1992.
[34] R. E. Kalman, "A new approach to linear filtering and prediction problems," Journal of Basic Engineering, vol. 82, no. 1, pp. 35-45, 1960.
[35] R. C. Luo and M. G. Kay, "Data fusion and sensor integration: state-of-the-art 1990s," in Data Fusion in Robotics and Machine Intelligence, pp. 7-135, 1992.
[36] G. Welch and G. Bishop, An Introduction to the Kalman Filter, ACM SIGGRAPH 2001 Course Notes, 2001.
[37] S. J. Julier and J. K. Uhlmann, "A new extension of the Kalman filter to nonlinear systems," in Proceedings of the International Symposium on Aerospace/Defense Sensing, Simulation and Controls, vol. 3, 1997.
[38] E. A. Wan and R. Van Der Merwe, "The unscented Kalman filter for nonlinear estimation," in Proceedings of the Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC '00), pp. 153-158, 2000.
[39] D. Crisan and A. Doucet, "A survey of convergence results on particle filtering methods for practitioners," IEEE Transactions on Signal Processing, vol. 50, no. 3, pp. 736-746, 2002.
[40] J. Martinez-del Rincon, C. Orrite-Urunuela, and J. E. Herrero-Jaraba, "An efficient particle filter for color-based tracking in complex scenes," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance, pp. 176-181, 2007.
[41] S. Ganeriwal, R. Kumar, and M. B. Srivastava, "Timing-sync protocol for sensor networks," in Proceedings of the 1st International Conference on Embedded Networked Sensor Systems (SenSys '03), pp. 138-149, November 2003.
[42] M. Manzo, T. Roosta, and S. Sastry, "Time synchronization in networks," in Proceedings of the 3rd ACM Workshop on Security of Ad Hoc and Sensor Networks (SASN '05), pp. 107-116, November 2005.
[43] J. K. Uhlmann, "Covariance consistency methods for fault-tolerant distributed data fusion," Information Fusion, vol. 4, no. 3, pp. 201-215, 2003.
[44] A. S. Bashi, V. P. Jilkov, X. R. Li, and H. Chen, "Distributed implementations of particle filters," in Proceedings of the 6th International Conference of Information Fusion, pp. 1164-1171, 2003.
[45] M. Coates, "Distributed particle filters for sensor networks," in Proceedings of the 3rd International Symposium on Information Processing in Sensor Networks (ACM '04), pp. 99-107, New York, NY, USA, 2004.
[46] D. Gu, "Distributed particle filter for target tracking," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '07), pp. 3856-3861, April 2007.
[47] Y. Bar-Shalom, "Update with out-of-sequence measurements in tracking: exact solution," IEEE Transactions on Aerospace and Electronic Systems, vol. 38, no. 3, pp. 769-778, 2002.
[48] M. Orton and A. Marrs, "A Bayesian approach to multi-target tracking and data fusion with out-of-sequence measurements," IEE Colloquium, no. 174, pp. 151-155, 2001.
[49] M. L. Hernandez, A. D. Marrs, S. Maskell, and M. R. Orton, "Tracking and fusion for wireless sensor networks," in Proceedings of the 5th International Conference on Information Fusion, 2002.
[50] P. C. Mahalanobis, "On the generalized distance in statistics," Proceedings of the National Institute of Sciences of India, vol. 2, no. 1, pp. 49-55, 1936.
[51] S. J. Julier, J. K. Uhlmann, and D. Nicholson, "A method for dealing with assignment ambiguity," in Proceedings of the American Control Conference (ACC '04), vol. 5, pp. 4102-4107, July 2004.
[52] H. Pan, Z.-P. Liang, T. J. Anastasio, and T. S. Huang, "Hybrid NN-Bayesian architecture for information fusion," in Proceedings of the International Conference on Image Processing (ICIP '98), pp. 368-371, October 1998.
[53] C. Coue, T. Fraichard, P. Bessiere, and E. Mazer, "Multi-sensor data fusion using Bayesian programming: an automotive application," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 141-146, October 2002.
[54] D. L. Hall and J. Llinas, Handbook of Multisensor Data Fusion, CRC Press, Boca Raton, Fla, USA, 2001.
[55] A. P. Dempster, "A generalization of Bayesian inference," Journal of the Royal Statistical Society B, vol. 30, no. 2, pp. 205-247, 1968.
[56] G. Shafer, A Mathematical Theory of Evidence, Princeton University Press, Princeton, NJ, USA, 1976.
[57] G. M. Provan, "The validity of Dempster-Shafer belief functions," International Journal of Approximate Reasoning, vol. 6, no. 3, pp. 389-399, 1992.
[58] D. M. Buede, "Shafer-Dempster and Bayesian reasoning: a response to 'Shafer-Dempster reasoning with applications to multisensor target identification systems'," IEEE Transactions on Systems, Man, and Cybernetics, vol. 18, no. 6, pp. 1009-1011, 1988.
[59] Y. Cheng and R. L. Kashyap, "Comparison of Bayesian and Dempster's rules in evidence combination," in Maximum-Entropy and Bayesian Methods in Science and Engineering, 1988.
[60] B. R. Cobb and P. P. Shenoy, "A comparison of Bayesian and belief function reasoning," Information Systems Frontiers, vol. 5, no. 4, pp. 345-358, 2003.
[61] H. Wu, M. Siegel, R. Stiefelhagen, and J. Yang, "Sensor fusion using Dempster-Shafer theory," in Proceedings of the 19th IEEE Instrumentation and Measurement Technology Conference (IMTC '02), pp. 7-11, May 2002.


[62] H. Wu, M. Siegel, and S. Ablay, "Sensor fusion using Dempster-Shafer theory II: static weighting and Kalman filter-like dynamic weighting," in Proceedings of the 20th IEEE Instrumentation and Measurement Technology Conference (IMTC '03), pp. 907-912, May 2003.
[63] E. Bosse, P. Valin, A.-C. Boury-Brisset, and D. Grenier, "Exploitation of a priori knowledge for information fusion," Information Fusion, vol. 7, no. 2, pp. 161-175, 2006.
[64] M. Morbee, L. Tessens, H. Aghajan, and W. Philips, "Dempster-Shafer based multi-view occupancy maps," Electronics Letters, vol. 46, no. 5, pp. 341-343, 2010.
[65] C. S. Peirce, Abduction and Induction: Philosophical Writings of Peirce, vol. 156, Dover, New York, NY, USA, 1955.
[66] A. M. Abdelbar, E. A. M. Andrews, and D. C. Wunsch II, "Abductive reasoning with recurrent neural networks," Neural Networks, vol. 16, no. 5-6, pp. 665-673, 2003.
[67] J. R. Aguero and A. Vargas, "Inference of operative configuration of distribution networks using fuzzy logic techniques. Part II: extended real-time model," IEEE Transactions on Power Systems, vol. 20, no. 3, pp. 1562-1569, 2005.
[68] A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, "Content-based image retrieval at the end of the early years," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1349-1380, 2000.
[69] D. S. Friedlander and S. Phoha, "Semantic information fusion for coordinated signal processing in mobile sensor networks," International Journal of High Performance Computing Applications, vol. 16, no. 3, pp. 235-241, 2002.
[70] D. S. Friedlander, "Semantic information extraction," in Distributed Sensor Networks, 2005.
[71] K. Whitehouse, J. Liu, and F. Zhao, "Semantic Streams: a framework for composable inference over sensor data," in Proceedings of the 3rd European Workshop on Wireless Sensor Networks, Lecture Notes in Computer Science, Springer, February 2006.
[72] I. J. Cox, "A review of statistical data association techniques for motion correspondence," International Journal of Computer Vision, vol. 10, no. 1, pp. 53-66, 1993.
[73] C. J. Veenman, M. J. T. Reinders, and E. Backer, "Resolving motion correspondence for densely moving points," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 1, pp. 54-72, 2001.
[74] F. Castanedo, M. A. Patricio, J. García, and J. M. Molina, "Bottom-up/top-down coordination in a multiagent visual sensor network," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS '07), pp. 93-98, September 2007.
[75] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Analysis of distributed fusion alternatives in coordinated vision agents," in Proceedings of the 11th International Conference on Information Fusion (FUSION '08), July 2008.
[76] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Data fusion to improve trajectory tracking in a cooperative surveillance multi-agent architecture," Information Fusion, vol. 11, no. 3, pp. 243-255, 2010.
[77] M. A. Davenport, C. Hegde, M. F. Duarte, and R. G. Baraniuk, "Joint manifolds for data fusion," IEEE Transactions on Image Processing, vol. 19, no. 10, pp. 2580-2594, 2010.


Page 8: Review Article A Review of Data Fusion Techniquesdownloads.hindawi.com/journals/tswj/2013/704504.pdf · Classification of Data Fusion Techniques Data fusion is a ... to extract features

8 The Scientific World Journal

where 119911119894(119896) is the 119894-measurement in the validation region at

time instant 119896 We give the standard equations of the PDAalgorithm next For the state prediction consider

119909 (119896 | 119896 minus 1) = 119865 (119896 minus 1) 119909 (119896 minus 1 | 119896 minus 1) (3)

where 119865(119896 minus 1) is the transition matrix at time instant 119896 minus 1To calculate the measurement prediction consider

(119896 | 119896 minus 1) = 119867 (119896) 119909 (119896 | 119896 minus 1) (4)

where 119867(119896) is the linearization measurement matrix Tocompute the gain or the innovation of the 119894-measurementconsider

V119894(119896) = 119911

119894(119896) minus (119896 | 119896 minus 1) (5)

To calculate the covariance prediction consider

(119896 | 119896 minus 1) = 119865 (119896 minus 1) (119896 minus 1 | 119896 minus 1) 119865(119896 minus 1)119879+ 119876 (119896)

(6)

where 119876(119896) is the process noise covariance matrix To com-pute the innovation covariance (119878) and the Kalman gain (119870)

119878 (119896) = 119867 (119896) (119896 | 119896 minus 1)119867(119896)119879+ 119877

119870 (119896) = (119896 | 119896 minus 1)119867(119896)119879119878(119896)minus1

(7)

To obtain the covariance update in the case in which themea-surements originated by the target are known consider

1198750(119896 | 119896) = (119896 | 119896 minus 1) minus 119870 (119896) 119878 (119896)119870(119896)

119879 (8)

The total update of the covariance is computed as

V (119896) =

119898119896

sum

119894=1

120573119894(119896) V119894(119896)

119875 (119896) = 119870 (119896) [

119898119896

sum

119894=1

(120573119894(119896) V119894(119896) V119894(119896)119879) minus V (119896) V(119896)

119879]119870119879(119896)

(9)

where119898119896is the number of valid measurements in the instant

119896The equation to update the estimated state which is formedby the position and velocity is given by

119909 (119896 | 119896) = 119909 (119896 | 119896 minus 1) + 119870 (119896) V (119896) (10)

Finally the association probabilities of PDA are as follows

120573119894(119896) =

119901119894(119896)

sum119898119896

119894=0119901119894(119896) (11)

where

119901119894(119896) =

(2Π)1198722120582radic1003816100381610038161003816119878119894 (119896)

1003816100381610038161003816 (1 minus 119875119889119875119892)

119875119889

if 119894 = 0

exp [minus12V119879 (119896) 119878minus1 (119896) V (119896)] if 119894 = 0

0 in other cases(12)

where119872 is the dimension of themeasurement vector 120582 is thedensity of the clutter environment 119875

119889is the detection prob-

ability of the correct measurement and 119875119892is the validation

probability of a detected valueIn the PDA algorithm the state estimation of the target is

computed as a weighted sum of the estimated state under allof the hypothesesThe algorithm can associate different mea-surements to one specific target Thus the association of thedifferent measurements to a specific target helps PDA toestimate the target state and the association probabilitiesare used as weights The main disadvantages of the PDAalgorithm are the following

(i) loss of tracks because PDA ignores the interferencewith other targets it sometimes could wrongly clas-sify the closest tracks Therefore it provides a poorperformance when the targets are close to each otheror crossed

(ii) the suboptimal Bayesian approximation when thesource of information is uncertain PDA is the sub-optimal Bayesian approximation to the associationproblem

(iii) one target PDA was initially designed for the asso-ciation of one target in a low-cluttered environmentThe number of false alarms is typically modeled withthe Poisson distribution and they are assumed to bedistributed uniformly in space PDA behaves incor-rectlywhen there aremultiple targets because the falsealarm model does not work well

(iv) track management because PDA assumes that thetrack is already established algorithms must be pro-vided for track initialization and track deletion

PDA is mainly good for tracking targets that do notmake abrupt changes in their movement patterns PDA willmost likely lose the target if it makes abrupt changes in itsmovement patterns

33 Joint Probabilistic Data Association Joint probabilisticdata association (JPDA) is a suboptimal approach for trackingmultiple targets in cluttered environments [14] JPDA issimilar to PDA with the difference that the associationprobabilities are computed using all of the observationsand all of the targets Thus in contrast to PDA JPDAconsiders various hypotheses together and combines themJPDA determines the probability 120573119905

119894(119896) that measurement 119894 is

originated from target 119905 accounting for the fact that underthis hypothesis the measurement cannot be generated byother targets Therefore for a known number of targets itevaluates the different options of the measurement-targetassociation (for the most recent set of measurements) andcombines them into the corresponding state estimation Ifthe association probability is known then the Kalman filterupdating equation of the track 119905 can be written as

119909119905(119896 | 119896) = 119909

119905(119896 | 119896 minus 1) + 119870 (119896) V

119905(119896) (13)

where 119909119905(119896 | 119896) and 119909119905(119896 | 119896 minus 1) are the estimation andprediction of target 119905 and119870(119896) is the filter gainTheweighted

The Scientific World Journal 9

sum of the residuals associated with the observation 119898(119896) oftarget 119905 is as follows

V119905(119896) =

119898(119896)

sum

119894=1

120573119905

119894(119896) V119905

119894(119896) (14)

where V119905119894= 119911119894(119896) minus 119867119909

119905(119896 | 119896 minus 1) Therefore this method

incorporates all of the observations (inside the neighborhoodof the targetrsquos predicted position) to update the estimatedposition by using a posterior probability that is a weightedsum of residuals

The main restrictions of JPDA are the following

(i) a measurement cannot come from more than onetarget

(ii) two measurements cannot be originated by the sametarget (at one time instant)

(iii) the sum of all of the measurementsrsquo probabilities thatare assigned to one target must be 1 sum119898(119896)

119894=0120573119905

119894(119896) = 1

The main disadvantages of JPDA are the following

(i) it requires an explicit mechanism for track initial-ization Similar to PDA JPDA cannot initialize newtracks or remove tracks that are out of the observationarea

(ii) JPDA is a computationally expensive algorithm whenit is applied in environments that havemultiple targetsbecause the number of hypotheses is incrementedexponentially with the number of targets

In general JPDA is more appropriate than MHT insituations in which the density of false measurements is high(ie sonar applications)

34 Multiple Hypothesis Test The underlying idea of themultiple hypothesis test (MHT) is based on using more thantwo consecutive observations to make an association withbetter results Other algorithms that use only two consecutiveobservations have a higher probability of generating an errorIn contrast to PDA and JPDA MHT estimates all of thepossible hypotheses and maintains new hypotheses in eachiteration

MHTwas developed to track multiple targets in clutteredenvironments as a result it combines the data associationproblem and tracking into a unified framework becomingan estimation technique as well The Bayes rule or theBayesian networks are commonly employed to calculate theMHT hypothesis In general researchers have claimed thatMHT outperforms JPDA for the lower densities of falsepositives However the main disadvantage of MHT is thecomputational cost when the number of tracks or falsepositives is incremented Pruning the hypothesis tree usinga window could solve this limitation

The Reid [15] tracking algorithm is considered the stan-dard MHT algorithm but the initial integer programmingformulation of the problem is due to Morefield [16] MHT isan iterative algorithm in which each iteration starts with a set

of correspondence hypotheses Each hypothesis is a collec-tion of disjoint tracks and the prediction of the target in thenext time instant is computed for each hypothesis Next thepredictions are compared with the new observations by usinga distance metric The set of associations established in eachhypothesis (based on a distance) introduces new hypothesesin the next iteration Each new hypothesis represents a newset of tracks that is based on the current observations

Note that each new measurement could come from (i) anew target in the visual field of view (ii) a target being trackedor (iii) noise in the measurement process It is also possiblethat a measurement is not assigned to a target because thetarget disappears or because it is not possible to obtain atarget measurement at that time instant

MHT maintains several correspondence hypotheses foreach target in each frame If the hypothesis in the instant119896 is represented by 119867(119896) = [ℎ

119897(119896) 119896 = 1 119899] then

the probability of the hypothesis ℎ119897(119896) could be represented

recursively using the Bayes rule as follows

119875 (ℎ119897(119896) | 119885 (119896)) = 119875 (ℎ

119892(119896 minus 1) 119886

119894(119896) | 119885 (119896))

=1

119888119875 (119885 (119896) | ℎ

119892(119896 minus 1) 119886

119894(119896))

lowast 119875 (119886119894(119896) | ℎ

119892(119896 minus 1)) lowast 119875 (ℎ

119892(119896 minus 1))

(15)

where ℎ119892(119896 minus 1) is the hypothesis 119892 of the complete set until

the time instant 119896minus1 119886119894(119896) is the 119894th possible association of the

track to the object 119885(119896) is the set of detections of the currentframe and 119888 is a normal constant

The first term on the right side of the previous equationis the likelihood function of the measurement set 119885(119896) giventhe joint likelihood and current hypothesis The second termis the probability of the association hypothesis of the currentdata given the previous hypothesis ℎ

119892(119896 minus 1) The third term

is the probability of the previous hypothesis from which thecurrent hypothesis is calculated

The MHT algorithm has the ability to detect a newtrack while maintaining the hypothesis tree structure Theprobability of a true track is given by theBayes decisionmodelas

119875 (120582 | 119885) =119875 (119885 | 120582) lowast 119875

∘(120582)

119875 (119885) (16)

where 119875(119885 | 120582) is the probability of obtaining the set ofmeasurements 119885 given 120582 119875

∘(120582) is the a priori probability of

the source signal and 119875(119885) is the probability of obtaining theset of detections 119885

MHT considers all of the possibilities including boththe track maintenance and the initialization and removalof tracks in an integrated framework MHT calculates thepossibility of having an object after the generation of a setof measurements using an exhaustive approach and thealgorithm does not assume a fixed number of targetsThe keychallenge of MHT is the effective hypothesis managementThe baseline MHT algorithm can be extended as follows(i) use the hypothesis aggregation for missed targets births

10 The Scientific World Journal

cardinality tracking and closely spaced objects (ii) applya multistage MHT for improving the performance androbustness in challenging settings and (iii) use a feature-aided MHT for extended object surveillance

The main disadvantage of this algorithm is the compu-tational cost which grows exponentially with the number oftracks andmeasurementsTherefore the practical implemen-tation of this algorithm is limited because it is exponential inboth time and memory

With the aim of reducing the computational cost [17]presented a probabilistic MHT algorithm in which theassociations are considered to be random variables thatare statistically independent and in which performing anexhaustive search enumeration is avoided This algorithm isknown as PMHT The PMHT algorithm assumes that thenumber of targets and measurements is known With thesame goal of reducing the computational cost [18] presentedan efficient implementation of the MHT algorithm Thisimplementation was the first version to be applied to performtracking in visual environments They employed the Murty[19] algorithm to determine the best set of 119896 hypothesesin polynomial time with the goal of tracking the points ofinterest

MHT typically performs the tracking process by employ-ing only one characteristic commonly the position TheBayesian combination to use multiple characteristics wasproposed by Liggins II et al [20]

A linear-programming-based relaxation approach to theoptimization problem in MHT tracking was proposed inde-pendently by Coraluppi et al [21] and Storms and Spieksma[22] Joo and Chellappa [23] proposed an association algo-rithm for tracking multiple targets in visual environmentsTheir algorithm is based on in MHT modification in whicha measurement can be associated with more than one targetand several targets can be associated with one measurementThey also proposed a combinatorial optimization algorithmto generate the best set of association hypotheses Theiralgorithm always finds the best hypothesis in contrast toother models which are approximate Coraluppi and Carthel[24] presented a generalization of the MHT algorithm usinga recursion over hypothesis classes rather than over a singlehypothesis This work has been applied in a special case ofthemulti-target tracking problem called cardinality trackingin which they observed the number of sensor measurementsinstead of the target states

35 Distributed Joint Probabilistic Data Association The dis-tributed version of the joint probabilistic data association(JPDA-D) was presented by Chang et al [25] In this tech-nique the estimated state of the target (using two sensors)after being associated is given by

119864 119909 | 1198851 1198852 =

1198981

sum

119895=0

1198982

sum

119897=0

119864 119909 | 1205941

119895 1205942

119897 1198851 1198852

lowast 119875 1205941

119895 1205942

119897| 1198851 1198852

(17)

where 119898119894 119894 = 1 2 is the last set of measurements of

sensor 1 and 2 119885119894 119894 = 1 2 is the set of accumulative data

and 120594 is the association hypothesisThe first term of the rightside of the equation is calculated from the associations thatwere made earlier The second term is computed from theindividual association probabilities as follows

119875 (1205941

119895 1205942

119897| 1198851 1198852) = sum

1199091

sum

1199092

= 119875 (1205941 1205942| 1198851 1198852) 1

119895(1205941) 2

119897(1205942)

119875 (1205941 1205942| 1198851 1198852) =

1

119888119875 (1205941| 1198851) 119875 (120594

2| 1198852) 120574 (120594

1 1205942)

(18)

where 120594119894 are the joint hypotheses involving all of themeasurements and all of the objectives and 119894

119895(120594119894) are the

binary indicators of the measurement-target associationTheadditional term 120574(1205941 1205942) depends on the correlation of theindividual hypothesis and reflects the localization influenceof the current measurements in the joint hypotheses

These equations are obtained assuming that commu-nication exists after every observation and there are onlyapproximations in the case in which communication issporadic and when a substantial amount of noise occursTherefore this algorithm is a theoretical model that has somelimitations in practical applications

36 Distributed Multiple Hypothesis Test The distributedversion of the MHT algorithm (MHT-D) [26 27] follows asimilar structure as the JPDA-D algorithm Let us assume thecase in which one node must fuse two sets of hypotheses andtracks If the hypotheses and track sets are represented by119867119894(119885119894) and 119879119894(119885119894) with 119894 = 1 2 the hypothesis probabilities

are represented by 120582119894119895 and the state distribution of the tracks

(120591119894119895) is represented by 119875(120582119894

119895) and 119875(119909 | 119885

119894 120591119894

119895) then the

maximum available information in the fusion node is 119885 =1198851cup 1198852 The data fusion objective of the MHT-D is to

obtain the set of hypotheses119867(119885) the set of tracks 119879(119885) thehypothesis probabilities 119875(120582 | 119885) and the state distribution119901(119909 | 119885 120591) for the observed data

The MHT-D algorithm is composed of the followingsteps

(1) hypothesis formation for each hypothesis pair 1205821119895and

1205822

119896 which could be fused a track 120591 is formed by

associating the pair of tracks 1205911119895and 1205912

119896 where each

pair comes from one node and could originate fromthe same target The final result of this stage is a setof hypotheses denoted by 119867(119885) and the fused tracks119879(119885)

(2) hypothesis evaluation in this stage the associationprobability of each hypothesis and the estimatedstate of each fused track are obtained The dis-tributed estimation algorithm is employed to calcu-late the likelihood of the possible associations andthe obtained estimations at each specific association

The Scientific World Journal 11

Using the information model the probability of eachfused hypothesis is given by

119875 (120582 | 119885) = 119862minus1prod

119895isin119869

119875(120582(119895)| 119885(119895))120572(119895)

prod

120591isin120582

119871 (120591 | 119885) (19)

where 119862 is a normalizing constant and 119871(120591 | 119885) is thelikelihood of each hypothesis pair

The main disadvantage of the MHT-D is the high com-putational cost that is in the order of 119874(119899119872) where 119899 is thenumber of possible associations and 119872 is the number ofvariables to be estimated

37 Graphical Models Graphical models are a formalism forrepresenting and reasoning with probabilities and indepen-dence A graphical model represents a conditional decom-position of the joint probability A graphical model can berepresented as a graph in which the nodes denote randomvariables the edges denote the possible dependence betweenthe random variables and the plates denote the replication ofa substructure with the appropriate indexing of the relevantvariables The graph captures the joint distribution over therandom variables which can be decomposed into a productof factors that each dependononly a subset of variablesThereare two major classes of graphical models (i) the Bayesiannetworks [28] which are also known as the directed graphicalmodels and (ii) the Markov random fields which are alsoknown as undirected graphical models The directed graph-ical models are useful for expressing causal relationshipsbetween random variables whereas undirected models arebetter suited for expressing soft constraints between randomvariables We refer the reader to the book of Koller andFriedman [29] for more information on graphical models

A framework based on graphical models can solve the problem of distributed data association in synchronized sensor networks with overlapped areas and where each sensor receives noisy measurements; this solution was proposed by Chen et al. [30, 31]. Their work is based on graphical models that are used to represent the statistical dependence between random variables. The data association problem is treated as an inference problem and solved by using the max-product algorithm [32]. Graphical models represent statistical dependencies between variables as graphs, and the max-product algorithm converges when the graph is a tree structure. Moreover, the employed algorithm can be implemented in a distributed manner by exchanging messages between the source nodes in parallel. With this algorithm, if each sensor has $n$ possible combinations of associations and there are $M$ variables to be estimated, the complexity is $O(n^2 M)$, which is reasonable and less than the $O(n^M)$ complexity of the MHT-D algorithm. However, special attention must be given to the correlated variables when building the graphical model.
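As an illustration of the message-passing machinery, the following Python sketch runs max-product on a small chain-structured model. The chain topology, the potentials, and all numerical values are assumptions chosen for the example; this is not the full distributed data association formulation of Chen et al. [30, 31].

```python
import numpy as np

# Max-product on a chain x1 - x2 - x3 of binary variables. psi_i are
# unary potentials (e.g., local sensor evidence); phi is a shared
# pairwise compatibility. On a tree the algorithm is exact.
psi = [np.array([0.7, 0.3]),      # evidence at node 1 (assumed values)
       np.array([0.4, 0.6]),      # evidence at node 2
       np.array([0.5, 0.5])]      # evidence at node 3
phi = np.array([[0.9, 0.1],
                [0.1, 0.9]])      # neighboring nodes prefer agreement

# Forward pass: message arriving at node i from node i-1
m_fwd = [np.ones(2)]
for i in range(2):
    m_fwd.append(np.max(phi * (psi[i] * m_fwd[-1])[:, None], axis=0))

# Backward pass: message arriving at node i from node i+1
m_bwd = [np.ones(2)]
for i in range(2, 0, -1):
    m_bwd.insert(0, np.max(phi * (psi[i] * m_bwd[0])[:, None], axis=0))

# Max-marginals and MAP state at each node
for i in range(3):
    belief = psi[i] * m_fwd[i] * m_bwd[i]
    print(f"x{i+1}: max-marginal {belief}, MAP state {np.argmax(belief)}")
```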

4. State Estimation Methods

State estimation techniques aim to determine the state of the target under movement (typically the position) given the observations or measurements. State estimation techniques are also known as tracking techniques. In their general form, it is not guaranteed that the target observations are relevant, which means that some of the observations could actually come from the target and others could be only noise. The state estimation phase is a common stage in data fusion algorithms, because the target's observations could come from different sensors or sources, and the final goal is to obtain a global target state from the observations.

The estimation problem involves finding the values of the vector state (e.g., position, velocity, and size) that fits the observed data as closely as possible. From a mathematical perspective, we have a set of redundant observations, and the goal is to find the set of parameters that provides the best fit to the observed data. In general, these observations are corrupted by errors and by the propagation of noise in the measurement process. State estimation methods fall under level 1 of the JDL classification and can be divided into two broader groups:

(1) Linear dynamics and measurements: here, the estimation problem has a standard solution. Specifically, when the equations of the object state and the measurements are linear, the noise follows the Gaussian distribution, and the environment is not cluttered, the optimal theoretical solution is based on the Kalman filter.

(2) Nonlinear dynamics: the state estimation problem becomes difficult, and there is no analytical solution to solve the problem in a general manner. In principle, there are no practical algorithms available to solve this problem satisfactorily.

Most of the state estimation methods are based on control theory and employ the laws of probability to compute a vector state from a vector measurement or a stream of vector measurements. Next, the most common estimation methods are presented, including maximum likelihood and maximum posterior (Section 4.1), the Kalman filter (Section 4.2), the particle filter (Section 4.3), the distributed Kalman filter (Section 4.4), the distributed particle filter (Section 4.5), and covariance consistency methods (Section 4.6).

4.1. Maximum Likelihood and Maximum Posterior. The maximum likelihood (ML) technique is an estimation method that is based on probabilistic theory. Probabilistic estimation methods are appropriate when the state variable follows an unknown probability distribution [33]. In the context of data fusion, $x$ is the state that is being estimated, and $z = (z(1), \ldots, z(k))$ is a sequence of $k$ previous observations of $x$. The likelihood function $\lambda(x)$ is defined as a probability density function of the sequence of $z$ observations given the true value of the state $x$:

$$\lambda(x) = p(z \mid x). \qquad (20)$$

The ML estimator finds the value of $x$ that maximizes the likelihood function:

$$\hat{x}(k) = \arg\max_x \, p(z \mid x), \qquad (21)$$


which can be obtained from the analytical or empirical models of the sensors. This function expresses the probability of the observed data. The main disadvantage of this method in practice is that the analytical or empirical model of the sensor must be known in order to provide the prior distribution and compute the likelihood function. This method can also systematically underestimate the variance of the distribution, which leads to a bias problem. However, the bias of the ML solution becomes less significant as the number $N$ of data points increases, and the estimated variance equals the true variance of the distribution that generated the data in the limit $N \rightarrow \infty$.

The maximum posterior (MAP) method is based on Bayesian theory. It is employed when the parameter $x$ to be estimated is the output of a random variable that has a known probability density function $p(x)$. In the context of data fusion, $x$ is the state that is being estimated, and $z = (z(1), \ldots, z(k))$ is a sequence of $k$ previous observations of $x$. The MAP estimator finds the value of $x$ that maximizes the posterior probability distribution as follows:

$$\hat{x}(k) = \arg\max_x \, p(x \mid z). \qquad (22)$$

Both methods (ML and MAP) aim to find the most likely value of the state $x$. However, ML assumes that $x$ is a fixed but unknown point of the parameter space, whereas MAP considers $x$ to be the output of a random variable with a known a priori probability density function. Both methods are equivalent when there is no a priori information about $x$, that is, when there are only observations.
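As a minimal illustration of the difference between the two estimators, the following Python sketch estimates a scalar state from repeated Gaussian observations; the noise level and the Gaussian prior are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed setup: a fixed scalar state x observed k times through
# Gaussian noise, z(i) = x + v(i).
x_true, sigma_v, k = 2.0, 0.5, 10
z = x_true + sigma_v * rng.standard_normal(k)

# ML estimate, eq. (21): for i.i.d. Gaussian noise the maximizer of
# p(z | x) is the sample mean of the observations.
x_ml = z.mean()

# MAP estimate, eq. (22), with an assumed Gaussian prior
# p(x) = N(mu0, sigma0^2): the posterior mode weights the prior and
# the data by their precisions (inverse variances).
mu0, sigma0 = 0.0, 1.0
w_prior = 1.0 / sigma0**2
w_data = k / sigma_v**2
x_map = (w_prior * mu0 + w_data * x_ml) / (w_prior + w_data)

print(f"ML estimate:  {x_ml:.3f}")   # relies only on the observations
print(f"MAP estimate: {x_map:.3f}")  # shrunk toward the prior mean mu0
```

As the prior becomes uninformative (large sigma0) or $k$ grows, the MAP estimate converges to the ML estimate, which matches the equivalence noted above.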

4.2. The Kalman Filter. The Kalman filter is the most popular estimation technique. It was originally proposed by Kalman [34] and has been widely studied and applied since then. The Kalman filter estimates the state $x$ of a discrete-time process governed by the following space-time model:

$$x(k+1) = \Phi(k)x(k) + G(k)u(k) + w(k), \qquad (23)$$

with the observations or measurements $z$ at time $k$ of the state $x$ represented by

$$z(k) = H(k)x(k) + v(k), \qquad (24)$$

where $\Phi(k)$ is the state transition matrix, $G(k)$ is the input transition matrix, $u(k)$ is the input vector, $H(k)$ is the measurement matrix, and $w$ and $v$ are random Gaussian variables with zero mean and covariance matrices $Q(k)$ and $R(k)$, respectively. Based on the measurements and on the system parameters, the estimation of $x(k)$, which is represented by $\hat{x}(k)$, and the prediction of $x(k+1)$, which is represented by $\hat{x}(k+1 \mid k)$, are given by

$$\hat{x}(k) = \hat{x}(k \mid k-1) + K(k)\left[z(k) - H(k)\hat{x}(k \mid k-1)\right],$$
$$\hat{x}(k+1 \mid k) = \Phi(k)\hat{x}(k \mid k) + G(k)u(k), \qquad (25)$$

respectively, where $K$ is the filter gain, determined by

$$K(k) = P(k \mid k-1)H^{T}(k)\left[H(k)P(k \mid k-1)H^{T}(k) + R(k)\right]^{-1}, \qquad (26)$$

where $P(k \mid k-1)$ is the prediction covariance matrix, which can be determined by

$$P(k+1 \mid k) = \Phi(k)P(k)\Phi^{T}(k) + Q(k), \qquad (27)$$

with

$$P(k) = P(k \mid k-1) - K(k)H(k)P(k \mid k-1). \qquad (28)$$

The Kalman filter is mainly employed to fuse low-level data. If the system can be described with a linear model and the error can be modeled as Gaussian noise, then the recursive Kalman filter obtains optimal statistical estimations [35]. However, other methods are required to address nonlinear dynamic models and nonlinear measurements. The modified Kalman filter known as the extended Kalman filter (EKF) is a standard approach for implementing nonlinear recursive filters [36]. The EKF is one of the most often employed methods for fusing data in robotic applications. However, it has some disadvantages, because the computation of the Jacobians is extremely expensive. Some attempts have been made to reduce the computational cost, such as linearization, but these attempts introduce errors in the filter and make it unstable.
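A minimal sketch of the recursion in (25)-(28) is given below for a one-dimensional constant-velocity target with position-only measurements; the model matrices, noise covariances, and initial values are illustrative assumptions, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed model: state x = [position, velocity], no control input u(k).
dt = 1.0
Phi = np.array([[1.0, dt], [0.0, 1.0]])   # state transition, eq. (23)
H = np.array([[1.0, 0.0]])                # position-only measurement, eq. (24)
Q = 0.01 * np.eye(2)                      # process noise covariance
R = np.array([[0.25]])                    # measurement noise covariance

x_hat = np.array([0.0, 1.0])              # prior estimate x_hat(k | k-1)
P = np.eye(2)                             # prior covariance P(k | k-1)
x_true = np.array([0.0, 1.0])

for k in range(10):
    # Simulate the true system and a noisy measurement
    x_true = Phi @ x_true + rng.multivariate_normal(np.zeros(2), Q)
    z = H @ x_true + rng.multivariate_normal(np.zeros(1), R)

    # Update with z(k): gain (26), corrected estimate (25), covariance (28)
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x_hat = x_hat + K @ (z - H @ x_hat)
    P = P - K @ H @ P
    print(f"k={k}: filtered position {x_hat[0]:.2f}, true {x_true[0]:.2f}")

    # Predict x_hat(k+1 | k) and P(k+1 | k): eqs. (25) and (27)
    x_hat = Phi @ x_hat
    P = Phi @ P @ Phi.T + Q
```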

The unscented Kalman filter (UKF) [37] has gained popularity because it does not have the linearization step and the associated errors of the EKF [38]. The UKF employs a deterministic sampling strategy to establish the minimum set of points around the mean. This set of points captures the true mean and covariance completely. Then, these points are propagated through the nonlinear functions, and the covariance of the estimations can be recovered. Another advantage of the UKF is its suitability for parallel implementations.

4.3. Particle Filter. Particle filters are recursive implementations of the sequential Monte Carlo methods [39]. This method builds the posterior density function using several random samples, called particles. Particles are propagated over time with a combination of sampling and resampling steps. At each iteration, the sampling step is employed to discard some particles, increasing the relevance of regions with a higher posterior probability. In the filtering process, several particles of the same state variable are employed, and each particle has an associated weight that indicates its quality. Therefore, the estimation is the result of a weighted sum of all of the particles. The standard particle filter algorithm has two phases: (1) the prediction phase and (2) the updating phase. In the prediction phase, each particle is modified according to the existing model, including the addition of random noise to simulate the noise effect. Then, in the updating phase, the weight of each particle is reevaluated using the last available sensor observation, and particles with lower weights are removed. Specifically, a generic particle filter comprises the following steps.


(1) Initialization of the particles:

(i) let $N$ be the number of particles;
(ii) $X^{(i)}(1) = [x(1), y(1), 0, 0]^{T}$ for $i = 1, \ldots, N$.

(2) Prediction step:

(i) for each particle $i = 1, \ldots, N$, evaluate the state $(k+1 \mid k)$ of the system using the state at time instant $k$ with the noise of the system at time $k$:

$$X^{(i)}(k+1 \mid k) = F(k)X^{(i)}(k) + (\text{cauchy-distribution-noise})(k), \qquad (29)$$

where $F(k)$ is the transition matrix of the system.

(3) Evaluate the particle weights. For each particle $i = 1, \ldots, N$:

(i) compute the predicted observation state of the system using the current predicted state and the noise at instant $k$:

$$\hat{z}^{(i)}(k+1 \mid k) = H(k+1)X^{(i)}(k+1 \mid k) + (\text{gaussian-measurement-noise})(k+1); \qquad (30)$$

(ii) compute the likelihood (weights) according to the given distribution:

$$\text{likelihood}^{(i)} = N(\hat{z}^{(i)}(k+1 \mid k); z^{(i)}(k+1), \text{var}); \qquad (31)$$

(iii) normalize the weights as follows:

$$w^{(i)} = \frac{\text{likelihood}^{(i)}}{\sum_{j=1}^{N} \text{likelihood}^{(j)}}. \qquad (32)$$

(4) Resampling/selection: multiply particles with higher weights and remove those with lower weights. The current state must be adjusted using the computed weights of the new particles.

(i) Compute the cumulative weights:

$$\text{CumWt}^{(i)} = \sum_{j=1}^{i} w^{(j)}. \qquad (33)$$

(ii) Generate uniformly distributed random variables $U^{(i)} \sim U(0, 1)$, with the number of steps equal to the number of particles.

(iii) Determine which particles should be multiplied and which ones removed.

(5) Propagation phase:

(i) incorporate the new values of the state after the resampling of instant $k$ to calculate the value at instant $k+1$:

$$x^{(1 \ldots N)}(k+1 \mid k+1) = x(k+1 \mid k); \qquad (34)$$

(ii) compute the posterior mean:

$$\hat{x}(k+1) = \text{mean}\left[x^{i}(k+1 \mid k+1)\right], \quad i = 1, \ldots, N; \qquad (35)$$

(iii) repeat steps 2 to 5 for each time instant.

Particle filters are more flexible than the Kalman filters and can cope with nonlinear dependencies and non-Gaussian densities in the dynamic model and in the noise error. However, they have some disadvantages. A large number of particles is required to obtain a small variance in the estimator. It is also difficult to establish the optimal number of particles in advance, and the number of particles affects the computational cost significantly. Earlier versions of particle filters employed a fixed number of particles, but recent studies have started to use a dynamic number of particles [40].
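The following Python sketch implements a bootstrap particle filter along the lines of steps (1)-(5) above. For simplicity it uses a scalar random-walk model with Gaussian process noise (the listing above uses a Cauchy-distributed system noise), and all parameter values are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed model: scalar random-walk state with Gaussian noise.
N = 500                                # number of particles
sigma_proc, sigma_meas = 0.3, 0.5

particles = rng.normal(0.0, 1.0, N)    # step (1): initialization
x_true = 0.0

for k in range(20):
    # Simulate the true state and a noisy measurement
    x_true += rng.normal(0.0, sigma_proc)
    z = x_true + rng.normal(0.0, sigma_meas)

    # Step (2): prediction -- propagate each particle through the model
    particles += rng.normal(0.0, sigma_proc, N)

    # Step (3): weight each particle by the measurement likelihood,
    # then normalize as in eq. (32)
    likelihood = np.exp(-0.5 * ((z - particles) / sigma_meas) ** 2)
    weights = likelihood / likelihood.sum()

    # Step (4): systematic resampling using the cumulative weights (33)
    cum_wt = np.cumsum(weights)
    u = (rng.uniform() + np.arange(N)) / N
    idx = np.minimum(np.searchsorted(cum_wt, u), N - 1)
    particles = particles[idx]

    # Step (5): posterior mean estimate, eq. (35)
    x_hat = particles.mean()
    print(f"k={k}: estimate {x_hat:.2f}, true {x_true:.2f}")
```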

4.4. The Distributed Kalman Filter. The distributed Kalman filter requires a correct clock synchronization between each source, as demonstrated in [41]. In other words, to correctly use the distributed Kalman filter, the clocks of all of the sources must be synchronized. This synchronization is typically achieved by using protocols that employ a shared global clock, such as the network time protocol (NTP). Synchronization problems between clocks have been shown to have an effect on the accuracy of the Kalman filter, producing inaccurate estimations [42].

If the estimations are consistent and the cross-covariance is known (or the estimations are uncorrelated), then it is possible to use distributed Kalman filters [43]. However, the cross-covariance must be determined exactly, or the observations must be consistent.

We refer the reader to Liggins II et al. [20] for more details about the Kalman filter in a distributed and hierarchical architecture.

4.5. Distributed Particle Filter. Distributed particle filters have recently gained attention [44-46]. Coates [45] used a distributed particle filter to monitor an environment that could be captured by a Markovian state-space model involving nonlinear dynamics and observations and non-Gaussian noise.

In contrast, earlier attempts to solve out-of-sequence measurements using particle filters are based on regenerating the probability density function at the time instant of the out-of-sequence measurement [47]. In a particle filter, this step requires a large computational cost, in addition to the space necessary to store the previous particles. To avoid this problem, Orton and Marrs [48] proposed to store the information on the particles at each time instant, saving the cost of recalculating this information. This technique is close to optimal, and when the delay increases, the result is only slightly affected [49]. However, it requires a very large amount of space to store the state of the particles at each time instant.

4.6. Covariance Consistency Methods: Covariance Intersection/Union. Covariance consistency methods (intersection and union) were proposed by Uhlmann [43] and are general and fault-tolerant frameworks for maintaining covariance means and estimations in a distributed network. These methods do not comprise estimation techniques; instead, they constitute a fusion technique for estimates. The distributed Kalman filter requirement of independent measurements or known cross-covariances is not a constraint with these methods.

4.6.1. Covariance Intersection. If the Kalman filter is employed to combine two estimations $(a_1, A_1)$ and $(a_2, A_2)$, then it is assumed that the joint covariance has the following form:

$$\begin{bmatrix} A_1 & X \\ X^{T} & A_2 \end{bmatrix}, \qquad (36)$$

where the cross-covariance $X$ should be known exactly so that the Kalman filter can be applied without difficulty. Because the computation of the cross-covariances is computationally intensive, Uhlmann [43] proposed the covariance intersection (CI) algorithm.

Let us assume that a joint covariance $M$ can be defined with the diagonal blocks $M_{A_1} > A_1$ and $M_{A_2} > A_2$, such that

$$M \geq \begin{bmatrix} A_1 & X \\ X^{T} & A_2 \end{bmatrix} \qquad (37)$$

for every possible instance of the unknown cross-covariance $X$; then, the components of the matrix $M$ can be employed in the Kalman filter equations to provide a fused estimation $(c, C)$ that is considered consistent. The key point of this method relies on generating a joint covariance matrix $M$ that can represent a useful fused estimation (in this context, useful refers to an estimation with a lower associated uncertainty). In summary, the CI algorithm computes the joint covariance matrix $M$ for which the Kalman filter provides the best fused estimation $(c, C)$ with respect to a fixed measurement of the covariance matrix (i.e., the minimum determinant).

Specific covariance criteria must be established because there is no unique minimum joint covariance in the order of the positive semidefinite matrices. Moreover, although the joint covariance is the basis of the formal analysis of the CI algorithm, the actual result is a nonlinear mixture of the information stored in the estimations being fused, given by the following equations:

$$C = \left(w_1 H_1^{T} A_1^{-1} H_1 + w_2 H_2^{T} A_2^{-1} H_2 + \cdots + w_n H_n^{T} A_n^{-1} H_n\right)^{-1},$$
$$c = C\left(w_1 H_1^{T} A_1^{-1} a_1 + w_2 H_2^{T} A_2^{-1} a_2 + \cdots + w_n H_n^{T} A_n^{-1} a_n\right), \qquad (38)$$

where $H_i$ is the transformation of the fused state-space estimation to the space of the estimated state $i$. The values of $w$ can be calculated to minimize the covariance determinant using convex optimization packages and semidefinite programming. The result of the CI algorithm has different characteristics compared with the Kalman filter. For example, suppose that two estimations $(a, A)$ and $(b, B)$ are provided and their covariances are equal, $A = B$. Since the Kalman filter is based on the statistical independence assumption, it produces a fused estimation with covariance $C = (1/2)A$. In contrast, the CI method does not assume independence and thus must remain consistent even in the case in which the estimations are completely correlated, yielding the estimated fused covariance $C = A$. In the case of estimations where $A < B$, the CI algorithm does not use information from the estimation $(b, B)$; thus, the fused result is $(a, A)$.

Every jointly consistent covariance is sufficient to produce a fused estimation that guarantees consistency. However, it is also necessary to guarantee a lack of divergence. Divergence is avoided in the CI algorithm by choosing a specific measurement (i.e., the determinant) that is minimized in each fusion operation. This measurement represents a nondivergence criterion, because the size of the estimated covariance, according to this criterion, will not be incremented.

The application of the CI method guarantees consistency and nondivergence for every sequence of mean- and covariance-consistent estimations. However, this method does not work well when the measurements to be fused are inconsistent.
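A minimal sketch of CI for two estimates, assuming identity transformations $H_i = I$ in (38) and choosing the weight that minimizes the determinant of the fused covariance, could look as follows; all input values are illustrative.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def covariance_intersection(a1, A1, a2, A2):
    """Fuse two consistent estimates with unknown cross-covariance.

    Implements eq. (38) for two inputs with H_i = I, with weights
    w1 = w and w2 = 1 - w chosen to minimize det(C).
    """
    A1_inv, A2_inv = np.linalg.inv(A1), np.linalg.inv(A2)

    def fused(w):
        C = np.linalg.inv(w * A1_inv + (1.0 - w) * A2_inv)
        c = C @ (w * A1_inv @ a1 + (1.0 - w) * A2_inv @ a2)
        return c, C

    # det(C) is the nondivergence criterion minimized at each fusion
    res = minimize_scalar(lambda w: np.linalg.det(fused(w)[1]),
                          bounds=(0.0, 1.0), method="bounded")
    return fused(res.x)

# Illustrative inputs (assumed values): two 2D position estimates
a1, A1 = np.array([1.0, 2.0]), np.diag([1.0, 4.0])
a2, A2 = np.array([1.5, 1.5]), np.diag([4.0, 1.0])
c, C = covariance_intersection(a1, A1, a2, A2)
print("fused mean:", c, "\nfused covariance:\n", C)
```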

4.6.2. Covariance Union. CI solves the problem of correlated inputs but not the problem of inconsistent inputs (inconsistent inputs refer to different estimations, each of which has a high accuracy (small variance) but also a large difference from the states of the others); thus, the covariance union (CU) algorithm was proposed to solve the latter [43]. CU addresses the following problem: two estimations $(a_1, A_1)$ and $(a_2, A_2)$ relate to the state of an object and are mutually inconsistent with one another. This issue arises when the difference between the average estimations is larger than the provided covariance. Inconsistent inputs can be detected using the Mahalanobis distance [50] between them, which is defined as

$$M_d = (a_1 - a_2)^{T}(A_1 + A_2)^{-1}(a_1 - a_2), \qquad (39)$$

and detecting whether this distance is larger than a given threshold.

The Mahalanobis distance accounts for the covariance information when computing the distance. If the difference between the estimations is large but their covariances are also large, the Mahalanobis distance yields a small value. In contrast, if the difference between the estimations is small but the covariances are small as well, it could produce a larger distance value. A high Mahalanobis distance could indicate that the estimations are inconsistent; however, a specific threshold must be established by the user or learned automatically.
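A small Python sketch of this consistency check, using eq. (39) with an assumed chi-squared threshold, is given below.

```python
import numpy as np

def mahalanobis_inconsistency(a1, A1, a2, A2, threshold):
    """Flag two estimates as mutually inconsistent using eq. (39).

    The threshold is application specific (assumed here); a chi-squared
    quantile for the state dimension is a common choice.
    """
    d = a1 - a2
    m_d = d @ np.linalg.inv(A1 + A2) @ d
    return m_d, m_d > threshold

# Two precise estimates that disagree strongly -> inconsistent
a1, A1 = np.array([0.0, 0.0]), 0.1 * np.eye(2)
a2, A2 = np.array([3.0, 3.0]), 0.1 * np.eye(2)
m_d, flag = mahalanobis_inconsistency(a1, A1, a2, A2, threshold=9.21)
print(f"M_d = {m_d:.1f}, inconsistent: {flag}")  # large M_d -> True
```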

The CU algorithm aims to solve the following problem: let us suppose that a filtering algorithm provides two observations with means and covariances $(a_1, A_1)$ and $(a_2, A_2)$, respectively. It is known that one of the observations is correct and the other is erroneous; however, the identity of the correct estimation is unknown and cannot be determined. In this situation, if both estimations are employed as an input to the Kalman filter, there will be a problem, because the Kalman filter only guarantees a consistent output if the observation is updated with a measurement that is consistent with it. In the specific case in which the measurements correspond to the same object but are acquired from two different sensors, the Kalman filter can only guarantee that the output is consistent if it is consistent with both separately. Because it is not possible to know which estimation is correct, the only way to combine the two estimations rigorously is to provide an estimation $(u, U)$ that is consistent with both estimations and obeys the following properties:

$$U \succeq A_1 + (u - a_1)(u - a_1)^{T},$$
$$U \succeq A_2 + (u - a_2)(u - a_2)^{T}, \qquad (40)$$

where some measurement of the matrix size $U$ (i.e., the determinant) is minimized.

In other words, the previous equations indicate that if the estimation $(a_1, A_1)$ is consistent, then the translation of the vector $a_1$ to $u$ requires increasing the covariance by the addition of a matrix at least as large as the outer product of $(u - a_1)$ in order to remain consistent. The same requirement applies to the measurement $(a_2, A_2)$.

A simple strategy is to choose the mean of one of the measurements as the fused value ($u = a_1$). In this case, the value of $U$ must be chosen such that the estimation is consistent with the worst case (the correct measurement being $a_2$). However, it is possible to assign $u$ an intermediate value between $a_1$ and $a_2$ to decrease the value of $U$. Therefore, the CU algorithm establishes the fused mean value $u$ that has the least covariance $U$ that is still sufficiently large with respect to both measurements ($a_1$ and $a_2$) to ensure consistency.

Because the matrix inequalities presented in the previous equations are convex, convex optimization algorithms must be employed to solve them. The value of $U$ can be computed with the iterative method described by Julier et al. [51]. The obtained covariance could be significantly larger than any of the initial covariances and is an indicator of the existing uncertainty between the initial estimations. One of the advantages of the CU method is that the same process can easily be extended to $N$ inputs.
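The following sketch illustrates the CU consistency conditions (40) with a deliberately conservative (non-minimal) choice of $U$; it is not the determinant-minimizing iteration of Julier et al. [51], and all input values are assumptions.

```python
import numpy as np

def covariance_union_sketch(a1, A1, a2, A2):
    """Conservative (non-minimal) covariance union sketch.

    Picks u midway between the means and returns a U that satisfies
    both inequalities in (40). U1 + U2 dominates each of U1 and U2
    because both terms are positive semidefinite; a tighter U requires
    the iterative optimization of Julier et al. [51].
    """
    u = 0.5 * (a1 + a2)
    U1 = A1 + np.outer(u - a1, u - a1)
    U2 = A2 + np.outer(u - a2, u - a2)
    U = U1 + U2   # simple upper bound satisfying eq. (40)
    return u, U

a1, A1 = np.array([0.0, 0.0]), 0.1 * np.eye(2)
a2, A2 = np.array([3.0, 3.0]), 0.1 * np.eye(2)
u, U = covariance_union_sketch(a1, A1, a2, A2)
print("fused mean:", u, "\nconservative covariance:\n", U)
```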

5. Decision Fusion Methods

In the data fusion domain, a decision is typically taken based on the knowledge of the perceived situation, which is provided by many sources. Decision fusion techniques aim to make a high-level inference about the events and activities that are produced from the detected targets. These techniques often use symbolic information, and the fusion process requires reasoning while accounting for uncertainties and constraints. These methods fall under level 2 (situation assessment) and level 4 (impact assessment) of the JDL data fusion model.

5.1. The Bayesian Methods. Information fusion based on Bayesian inference provides a formalism for combining evidence according to the rules of probability theory. Uncertainty is represented using conditional probability terms that describe beliefs and take on values in the interval [0, 1], where zero indicates a complete lack of belief and one indicates an absolute belief. Bayesian inference is based on the Bayes rule, as follows:

$$P(Y \mid X) = \frac{P(X \mid Y)\,P(Y)}{P(X)}, \qquad (41)$$

where the posterior probability $P(Y \mid X)$ represents the belief in the hypothesis $Y$ given the information $X$. This probability is obtained by multiplying the a priori probability of the hypothesis, $P(Y)$, by the probability of having $X$ given that $Y$ is true, $P(X \mid Y)$. The value $P(X)$ is used as a normalizing constant. The main disadvantage of Bayesian inference is that the probabilities $P(X)$ and $P(X \mid Y)$ must be known. To estimate the conditional probabilities, Pan et al. [52] proposed the use of NNs, whereas Coue et al. [53] proposed Bayesian programming.
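As a small illustration, the following sketch applies eq. (41) recursively to fuse the reports of two conditionally independent detectors of a binary hypothesis; the prior and the likelihoods are assumed values.

```python
import numpy as np

# Bayesian decision fusion sketch: two conditionally independent
# sensors report detections about a binary hypothesis Y ("target
# present"). All probabilities below are illustrative assumptions.
p_y = 0.1                              # prior P(Y)
# Per sensor: (P("detect" | Y), P("detect" | not Y))
likelihoods = [(0.9, 0.2), (0.8, 0.1)]
reports = [True, True]                 # both sensors report a detection

post = np.array([p_y, 1.0 - p_y])      # [P(Y), P(not Y)]
for (p_det_y, p_det_not_y), detect in zip(likelihoods, reports):
    # Bayes rule (41): multiply by P(X | Y), then renormalize by P(X)
    lik = np.array([p_det_y, p_det_not_y] if detect
                   else [1 - p_det_y, 1 - p_det_not_y])
    post = lik * post
    post /= post.sum()

print(f"P(target | both reports) = {post[0]:.3f}")
```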

Hall and Llinas [54] described the following problems associated with Bayesian inference:

(i) difficulty in establishing the values of the a priori probabilities;

(ii) complexity when there are multiple potential hypotheses and a substantial number of condition-dependent events;

(iii) the requirement that the hypotheses be mutually exclusive;

(iv) difficulty in describing the uncertainty of the decisions.

5.2. The Dempster-Shafer Inference. The Dempster-Shafer inference is based on the mathematical theory introduced by Dempster [55] and Shafer [56], which generalizes the Bayesian theory. The Dempster-Shafer theory provides a formalism that can be used to represent incomplete knowledge, update beliefs, and combine evidence, and it allows the uncertainty to be represented explicitly [57].

A fundamental concept in Dempster-Shafer reasoning is the frame of discernment, which is defined as follows. Let $\Theta = \{\theta_1, \theta_2, \ldots, \theta_N\}$ be the set of all possible states that define the system, and let $\Theta$ be exhaustive and mutually exclusive, because the system can only be in one state $\theta_i \in \Theta$, where $1 \leq i \leq N$. The set $\Theta$ is called a frame of discernment, because its elements are employed to discern the current state of the system.

The elements of the power set $2^{\Theta}$ are called hypotheses. In the Dempster-Shafer theory, based on the evidence $E$, a probability is assigned to each hypothesis $H \in 2^{\Theta}$ according to the basic probability assignment, or mass function, $m: 2^{\Theta} \rightarrow [0, 1]$, which satisfies

$$m(\emptyset) = 0. \qquad (42)$$


Thus, the mass function of the empty set is zero. Furthermore, the mass function of a hypothesis is larger than or equal to zero for all of the hypotheses:

$$m(H) \geq 0, \quad \forall H \in 2^{\Theta}. \qquad (43)$$

Finally, the sum of the mass functions of all of the hypotheses is one:

$$\sum_{H \in 2^{\Theta}} m(H) = 1. \qquad (44)$$

To express incomplete beliefs in a hypothesis $H$, the Dempster-Shafer theory defines the belief function $\text{bel}: 2^{\Theta} \rightarrow [0, 1]$ over $\Theta$ as

$$\text{bel}(H) = \sum_{A \subseteq H} m(A), \qquad (45)$$

where $\text{bel}(\emptyset) = 0$ and $\text{bel}(\Theta) = 1$. The level of doubt in $H$ can be expressed in terms of the belief function by

$$\text{dou}(H) = \text{bel}(\neg H) = \sum_{A \subseteq \neg H} m(A). \qquad (46)$$

To express the plausibility of each hypothesis, the function $\text{pl}: 2^{\Theta} \rightarrow [0, 1]$ over $\Theta$ is defined as

$$\text{pl}(H) = 1 - \text{dou}(H) = \sum_{A \cap H \neq \emptyset} m(A). \qquad (47)$$

Intuitively, plausibility indicates that there is less uncertainty in hypothesis $H$ if it is more plausible. The confidence interval $[\text{bel}(H), \text{pl}(H)]$ defines the true belief in hypothesis $H$. To combine the effects of two mass functions $m_1$ and $m_2$, the Dempster-Shafer theory defines the combination rule $m_1 \oplus m_2$ as

$$m_1 \oplus m_2(\emptyset) = 0,$$
$$m_1 \oplus m_2(H) = \frac{\sum_{X \cap Y = H} m_1(X)\, m_2(Y)}{1 - \sum_{X \cap Y = \emptyset} m_1(X)\, m_2(Y)}. \qquad (48)$$

In contrast to Bayesian inference, a priori probabilities are not required in the Dempster-Shafer inference, because they are assigned at the instant at which the information is provided. Several studies in the literature have compared the use of the Bayesian inference and the Dempster-Shafer inference, such as [58-60]. Wu et al. [61] used the Dempster-Shafer theory to fuse information in context-aware environments. This work was extended in [62] to dynamically modify the weights associated with the sensor measurements. Therefore, the fusion mechanism is calibrated according to the recent measurements of the sensors (in cases in which the ground truth is available). In the military domain [63], Dempster-Shafer reasoning is used with a priori information stored in a database to classify military ships. Morbee et al. [64] described the use of the Dempster-Shafer theory to build 2D occupancy maps from several cameras and to evaluate the contribution of subsets of cameras to a specific task. Each task is the observation of an event of interest, and the goal is to assess the validity of a set of hypotheses that are fused using the Dempster-Shafer theory.
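A compact implementation of the combination rule (48) for small frames of discernment could look as follows; the frame and the mass assignments are assumptions for the example.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Dempster's combination rule, eq. (48).

    Mass functions are dicts mapping frozensets (hypotheses over the
    frame of discernment) to masses.
    """
    combined, conflict = {}, 0.0
    for (X, mx), (Y, my) in product(m1.items(), m2.items()):
        inter = X & Y
        if inter:
            combined[inter] = combined.get(inter, 0.0) + mx * my
        else:
            conflict += mx * my   # mass assigned to the empty set
    # Renormalize by 1 - conflict so that the masses sum to one
    return {H: m / (1.0 - conflict) for H, m in combined.items()}

# Assumed frame {a, b} ("target of type a" vs. "type b"); Theta
# carries the unassigned (ignorant) mass.
a, b, theta = frozenset("a"), frozenset("b"), frozenset("ab")
m1 = {a: 0.6, theta: 0.4}          # sensor 1: some evidence for a
m2 = {a: 0.5, b: 0.3, theta: 0.2}  # sensor 2: mixed evidence
for H, m in dempster_combine(m1, m2).items():
    print(sorted(H), round(m, 3))
```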

5.3. Abductive Reasoning. Abductive reasoning, or inferring the best explanation, is a reasoning method in which a hypothesis is chosen under the assumption that, in case it is true, it explains the observed event most accurately [65]. In other words, when an event is observed, the abduction method attempts to find the best explanation for it.

In the context of probabilistic reasoning, abductive inference finds the posterior maximum likelihood configuration of the system variables given some observed variables. Abductive reasoning is more a reasoning pattern than a data fusion technique; therefore, different inference methods, such as NNs [66] or fuzzy logic [67], can be employed.

5.4. Semantic Methods. Decision fusion techniques that employ semantic data from different sources as an input could provide more accurate results than those that rely only on single sources. There is a growing interest in techniques that automatically determine the presence of semantic features in videos to bridge the semantic gap [68].

Semantic information fusion is essentially a scheme in which raw sensor data are processed such that the nodes exchange only the resultant semantic information. Semantic information fusion typically covers two phases: (i) building the knowledge and (ii) pattern matching (inference). The first phase (typically offline) incorporates the most appropriate knowledge into semantic information. Then, the second phase (typically online or in real time) fuses the relevant attributes and provides a semantic interpretation of the sensor data [69-71].

Semantic fusion can be viewed as a means of integrating and translating sensor data into formal languages. The resulting language obtained from the observations of the environment is then compared with similar languages that are stored in the database. The key to this strategy is that similar behaviors represented by formal languages are also semantically similar. This type of method provides savings in the cost of transmission, because the nodes need only transmit the formal language structure instead of the raw data. However, a known set of behaviors must be stored in a database in advance, which might be difficult in some scenarios.

6. Conclusions

This paper reviews the most popular methods and techniques for performing data/information fusion. To determine whether the application of data/information fusion methods is feasible, we must evaluate the computational cost of the process and the delay introduced by the communication. A centralized data fusion approach is theoretically optimal when there is no cost of transmission and there are sufficient computational resources; however, this situation typically does not hold in practical applications.

The selection of the most appropriate technique depends on the type of the problem and the established assumptions of each technique. Statistical data fusion methods (e.g., PDA, JPDA, MHT, and Kalman) are optimal under specific conditions [72]. First, the assumption that the targets are moving independently and that the measurements are normally distributed around the predicted position typically does not hold. Second, because the statistical techniques model all of the events as probabilities, they typically have several parameters and a priori probabilities for false measurements and detection errors that are often difficult to obtain (at least in an optimal sense). For example, in the case of the MHT algorithm, specific parameters must be established that are nontrivial to determine and are very sensitive [73]. Moreover, statistical methods that optimize over several frames are computationally intensive, and their complexity typically grows exponentially with the number of targets. For example, in the case of particle filters, tracking several targets can be accomplished jointly, as a group, or individually. If several targets are tracked jointly, the necessary number of particles grows exponentially. Therefore, in practice, it is better to track them individually, under the assumption that the targets do not interact between the particles.

In contrast to centralized systems, the distributed data fusion methods introduce some challenges into the data fusion process, such as (i) the spatial and temporal alignment of the information, (ii) out-of-sequence measurements, and (iii) data correlation, as reported by Castanedo et al. [74, 75]. The inherent redundancy of distributed systems could be exploited with distributed reasoning techniques and cooperative algorithms to improve the individual node estimations, as reported by Castanedo et al. [76]. In addition to the previous studies, a new trend based on the geometric notion of a low-dimensional manifold is gaining attention in the data fusion community. An example is the work of Davenport et al. [77], which proposes a simple model that captures the correlation between the sensor observations by matching the parameter values for the different obtained manifolds.

Acknowledgments

The author would like to thank Jesús García, Miguel A. Patricio, and James Llinas for their interesting discussions on several of the topics that are presented in this paper.

References

[1] JDL, Data Fusion Lexicon, Technical Panel for C3, F. E. White, Code 420, San Diego, Calif, USA, 1991.

[2] D. L. Hall and J. Llinas, "An introduction to multisensor data fusion," Proceedings of the IEEE, vol. 85, no. 1, pp. 6–23, 1997.

[3] H. F. Durrant-Whyte, "Sensor models and multisensor integration," International Journal of Robotics Research, vol. 7, no. 6, pp. 97–113, 1988.

[4] B. V. Dasarathy, "Sensor fusion potential exploitation: innovative architectures and illustrative applications," Proceedings of the IEEE, vol. 85, no. 1, pp. 24–38, 1997.

[5] R. C. Luo, C.-C. Yih, and K. L. Su, "Multisensor fusion and integration: approaches, applications, and future research directions," IEEE Sensors Journal, vol. 2, no. 2, pp. 107–119, 2002.

[6] J. Llinas, C. Bowman, G. Rogova, A. Steinberg, E. Waltz, and F. White, "Revisiting the JDL data fusion model II," Technical Report, DTIC Document, 2004.

[7] E. P. Blasch and S. Plano, "JDL level 5 fusion model 'user refinement' issues and applications in group tracking," in Proceedings of Signal Processing, Sensor Fusion, and Target Recognition XI, pp. 270–279, April 2002.

[8] H. F. Durrant-Whyte and M. Stevens, "Data fusion in decentralized sensing networks," in Proceedings of the 4th International Conference on Information Fusion, pp. 302–307, Montreal, Canada, 2001.

[9] J. Manyika and H. Durrant-Whyte, Data Fusion and Sensor Management: A Decentralized Information-Theoretic Approach, Prentice Hall, Upper Saddle River, NJ, USA, 1995.

[10] S. S. Blackman, "Association and fusion of multiple sensor data," in Multitarget-Multisensor Tracking: Advanced Applications, pp. 187–217, Artech House, 1990.

[11] S. Lloyd, "Least squares quantization in PCM," IEEE Transactions on Information Theory, vol. 28, no. 2, pp. 129–137, 1982.

[12] M. Shindler, A. Wong, and A. Meyerson, "Fast and accurate k-means for large datasets," in Proceedings of the 25th Annual Conference on Neural Information Processing Systems (NIPS '11), pp. 2375–2383, December 2011.

[13] Y. Bar-Shalom and E. Tse, "Tracking in a cluttered environment with probabilistic data association," Automatica, vol. 11, no. 5, pp. 451–460, 1975.

[14] T. E. Fortmann, Y. Bar-Shalom, and M. Scheffe, "Multi-target tracking using joint probabilistic data association," in Proceedings of the 19th IEEE Conference on Decision and Control including the Symposium on Adaptive Processes, vol. 19, pp. 807–812, December 1980.

[15] D. B. Reid, "An algorithm for tracking multiple targets," IEEE Transactions on Automatic Control, vol. 24, no. 6, pp. 843–854, 1979.

[16] C. L. Morefield, "Application of 0-1 integer programming to multitarget tracking problems," IEEE Transactions on Automatic Control, vol. 22, no. 3, pp. 302–312, 1977.

[17] R. L. Streit and T. E. Luginbuhl, "Maximum likelihood method for probabilistic multihypothesis tracking," in Signal and Data Processing of Small Targets, vol. 2235 of Proceedings of SPIE, p. 394, 1994.

[18] I. J. Cox and S. L. Hingorani, "Efficient implementation of Reid's multiple hypothesis tracking algorithm and its evaluation for the purpose of visual tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 2, pp. 138–150, 1996.

[19] K. G. Murty, "An algorithm for ranking all the assignments in order of increasing cost," Operations Research, vol. 16, no. 3, pp. 682–687, 1968.

[20] M. E. Liggins II, C.-Y. Chong, I. Kadar, et al., "Distributed fusion architectures and algorithms for target tracking," Proceedings of the IEEE, vol. 85, no. 1, pp. 95–106, 1997.

[21] S. Coraluppi, C. Carthel, M. Luettgen, and S. Lynch, "All-source track and identity fusion," in Proceedings of the National Symposium on Sensor and Data Fusion, 2000.

[22] P. Storms and F. Spieksma, "An LP-based algorithm for the data association problem in multitarget tracking," in Proceedings of the 3rd IEEE International Conference on Information Fusion, vol. 1, 2000.

[23] S.-W. Joo and R. Chellappa, "A multiple-hypothesis approach for multiobject visual tracking," IEEE Transactions on Image Processing, vol. 16, no. 11, pp. 2849–2854, 2007.

[24] S. Coraluppi and C. Carthel, "Aggregate surveillance: a cardinality tracking approach," in Proceedings of the 14th International Conference on Information Fusion (FUSION '11), July 2011.

[25] K. C. Chang, C. Y. Chong, and Y. Bar-Shalom, "Joint probabilistic data association in distributed sensor networks," IEEE Transactions on Automatic Control, vol. 31, no. 10, pp. 889–897, 1986.

[26] C. Y. Chong, S. Mori, and K. C. Chang, "Information fusion in distributed sensor networks," in Proceedings of the 4th American Control Conference, Boston, Mass, USA, June 1985.

[27] C. Y. Chong, S. Mori, and K. C. Chang, "Distributed multitarget multisensor tracking," in Multitarget-Multisensor Tracking: Advanced Applications, vol. 1, pp. 247–295, 1990.

[28] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, San Mateo, Calif, USA, 1988.

[29] D. Koller and N. Friedman, Probabilistic Graphical Models: Principles and Techniques, MIT Press, 2009.

[30] L. Chen, M. Cetin, and A. S. Willsky, "Distributed data association for multi-target tracking in sensor networks," in Proceedings of the 7th International Conference on Information Fusion (FUSION '05), pp. 9–16, July 2005.

[31] L. Chen, M. J. Wainwright, M. Cetin, and A. S. Willsky, "Data association based on optimization in graphical models with application to sensor networks," Mathematical and Computer Modelling, vol. 43, no. 9-10, pp. 1114–1135, 2006.

[32] Y. Weiss and W. T. Freeman, "On the optimality of solutions of the max-product belief-propagation algorithm in arbitrary graphs," IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 736–744, 2001.

[33] C. Brown, H. Durrant-Whyte, J. Leonard, B. Rao, and B. Steer, "Distributed data fusion using Kalman filtering: a robotics application," in Data Fusion in Robotics and Machine Intelligence, M. A. Abidi and R. C. Gonzalez, Eds., pp. 267–309, 1992.

[34] R. E. Kalman, "A new approach to linear filtering and prediction problems," Journal of Basic Engineering, vol. 82, no. 1, pp. 35–45, 1960.

[35] R. C. Luo and M. G. Kay, "Data fusion and sensor integration: state-of-the-art 1990s," in Data Fusion in Robotics and Machine Intelligence, pp. 7–135, 1992.

[36] G. Welch and G. Bishop, An Introduction to the Kalman Filter, ACM SIGGRAPH 2001 Course Notes, 2001.

[37] S. J. Julier and J. K. Uhlmann, "A new extension of the Kalman filter to nonlinear systems," in Proceedings of the International Symposium on Aerospace/Defense Sensing, Simulation and Controls, vol. 3, 1997.

[38] E. A. Wan and R. Van Der Merwe, "The unscented Kalman filter for nonlinear estimation," in Proceedings of the Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC '00), pp. 153–158, 2000.

[39] D. Crisan and A. Doucet, "A survey of convergence results on particle filtering methods for practitioners," IEEE Transactions on Signal Processing, vol. 50, no. 3, pp. 736–746, 2002.

[40] J. Martinez-del Rincon, C. Orrite-Urunuela, and J. E. Herrero-Jaraba, "An efficient particle filter for color-based tracking in complex scenes," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance, pp. 176–181, 2007.

[41] S. Ganeriwal, R. Kumar, and M. B. Srivastava, "Timing-sync protocol for sensor networks," in Proceedings of the 1st International Conference on Embedded Networked Sensor Systems (SenSys '03), pp. 138–149, November 2003.

[42] M. Manzo, T. Roosta, and S. Sastry, "Time synchronization in networks," in Proceedings of the 3rd ACM Workshop on Security of Ad Hoc and Sensor Networks (SASN '05), pp. 107–116, November 2005.

[43] J. K. Uhlmann, "Covariance consistency methods for fault-tolerant distributed data fusion," Information Fusion, vol. 4, no. 3, pp. 201–215, 2003.

[44] A. S. Bashi, V. P. Jilkov, X. R. Li, and H. Chen, "Distributed implementations of particle filters," in Proceedings of the 6th International Conference of Information Fusion, pp. 1164–1171, 2003.

[45] M. Coates, "Distributed particle filters for sensor networks," in Proceedings of the 3rd International Symposium on Information Processing in Sensor Networks (IPSN '04), ACM, pp. 99–107, New York, NY, USA, 2004.

[46] D. Gu, "Distributed particle filter for target tracking," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '07), pp. 3856–3861, April 2007.

[47] Y. Bar-Shalom, "Update with out-of-sequence measurements in tracking: exact solution," IEEE Transactions on Aerospace and Electronic Systems, vol. 38, no. 3, pp. 769–778, 2002.

[48] M. Orton and A. Marrs, "A Bayesian approach to multi-target tracking and data fusion with out-of-sequence measurements," IEE Colloquium, no. 174, pp. 151–155, 2001.

[49] M. L. Hernandez, A. D. Marrs, S. Maskell, and M. R. Orton, "Tracking and fusion for wireless sensor networks," in Proceedings of the 5th International Conference on Information Fusion, 2002.

[50] P. C. Mahalanobis, "On the generalized distance in statistics," Proceedings of the National Institute of Sciences of India, vol. 2, no. 1, pp. 49–55, 1936.

[51] S. J. Julier, J. K. Uhlmann, and D. Nicholson, "A method for dealing with assignment ambiguity," in Proceedings of the American Control Conference (ACC '04), vol. 5, pp. 4102–4107, July 2004.

[52] H. Pan, Z.-P. Liang, T. J. Anastasio, and T. S. Huang, "Hybrid NN-Bayesian architecture for information fusion," in Proceedings of the International Conference on Image Processing (ICIP '98), pp. 368–371, October 1998.

[53] C. Coue, T. Fraichard, P. Bessiere, and E. Mazer, "Multi-sensor data fusion using Bayesian programming: an automotive application," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 141–146, October 2002.

[54] D. L. Hall and J. Llinas, Handbook of Multisensor Data Fusion, CRC Press, Boca Raton, Fla, USA, 2001.

[55] A. P. Dempster, "A generalization of Bayesian inference," Journal of the Royal Statistical Society B, vol. 30, no. 2, pp. 205–247, 1968.

[56] G. Shafer, A Mathematical Theory of Evidence, Princeton University Press, Princeton, NJ, USA, 1976.

[57] G. M. Provan, "The validity of Dempster-Shafer belief functions," International Journal of Approximate Reasoning, vol. 6, no. 3, pp. 389–399, 1992.

[58] D. M. Buede, "Shafer-Dempster and Bayesian reasoning: a response to 'Shafer-Dempster reasoning with applications to multisensor target identification systems'," IEEE Transactions on Systems, Man, and Cybernetics, vol. 18, no. 6, pp. 1009–1011, 1988.

[59] Y. Cheng and R. L. Kashyap, "Comparison of Bayesian and Dempster's rules in evidence combination," in Maximum-Entropy and Bayesian Methods in Science and Engineering, 1988.

[60] B. R. Cobb and P. P. Shenoy, "A comparison of Bayesian and belief function reasoning," Information Systems Frontiers, vol. 5, no. 4, pp. 345–358, 2003.

[61] H. Wu, M. Siegel, R. Stiefelhagen, and J. Yang, "Sensor fusion using Dempster-Shafer theory," in Proceedings of the 19th IEEE Instrumentation and Measurement Technology Conference (IMTC '02), pp. 7–11, May 2002.

[62] H. Wu, M. Siegel, and S. Ablay, "Sensor fusion using Dempster-Shafer theory II: static weighting and Kalman filter-like dynamic weighting," in Proceedings of the 20th IEEE Instrumentation and Measurement Technology Conference (IMTC '03), pp. 907–912, May 2003.

[63] E. Bosse, P. Valin, A.-C. Boury-Brisset, and D. Grenier, "Exploitation of a priori knowledge for information fusion," Information Fusion, vol. 7, no. 2, pp. 161–175, 2006.

[64] M. Morbee, L. Tessens, H. Aghajan, and W. Philips, "Dempster-Shafer based multi-view occupancy maps," Electronics Letters, vol. 46, no. 5, pp. 341–343, 2010.

[65] C. S. Peirce, Abduction and Induction: Philosophical Writings of Peirce, vol. 156, Dover, New York, NY, USA, 1955.

[66] A. M. Abdelbar, E. A. M. Andrews, and D. C. Wunsch II, "Abductive reasoning with recurrent neural networks," Neural Networks, vol. 16, no. 5-6, pp. 665–673, 2003.

[67] J. R. Aguero and A. Vargas, "Inference of operative configuration of distribution networks using fuzzy logic techniques. Part II: extended real-time model," IEEE Transactions on Power Systems, vol. 20, no. 3, pp. 1562–1569, 2005.

[68] A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, "Content-based image retrieval at the end of the early years," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1349–1380, 2000.

[69] D. S. Friedlander and S. Phoha, "Semantic information fusion for coordinated signal processing in mobile sensor networks," International Journal of High Performance Computing Applications, vol. 16, no. 3, pp. 235–241, 2002.

[70] D. S. Friedlander, "Semantic information extraction," in Distributed Sensor Networks, 2005.

[71] K. Whitehouse, J. Liu, and F. Zhao, "Semantic Streams: a framework for composable inference over sensor data," in Proceedings of the 3rd European Workshop on Wireless Sensor Networks, Lecture Notes in Computer Science, Springer, February 2006.

[72] I. J. Cox, "A review of statistical data association techniques for motion correspondence," International Journal of Computer Vision, vol. 10, no. 1, pp. 53–66, 1993.

[73] C. J. Veenman, M. J. T. Reinders, and E. Backer, "Resolving motion correspondence for densely moving points," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 1, pp. 54–72, 2001.

[74] F. Castanedo, M. A. Patricio, J. García, and J. M. Molina, "Bottom-up/top-down coordination in a multiagent visual sensor network," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS '07), pp. 93–98, September 2007.

[75] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Analysis of distributed fusion alternatives in coordinated vision agents," in Proceedings of the 11th International Conference on Information Fusion (FUSION '08), July 2008.

[76] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Data fusion to improve trajectory tracking in a cooperative surveillance multi-agent architecture," Information Fusion, vol. 11, no. 3, pp. 243–255, 2010.

[77] M. A. Davenport, C. Hegde, M. F. Duarte, and R. G. Baraniuk, "Joint manifolds for data fusion," IEEE Transactions on Image Processing, vol. 19, no. 10, pp. 2580–2594, 2010.


sum of the residuals associated with the observations $m(k)$ of target $t$, as follows:

$$v^{t}(k) = \sum_{i=1}^{m(k)} \beta^{t}_{i}(k)\, v^{t}_{i}(k), \qquad (14)$$

where $v^{t}_{i} = z_{i}(k) - H\hat{x}^{t}(k \mid k-1)$. Therefore, this method incorporates all of the observations (inside the neighborhood of the target's predicted position) to update the estimated position by using a posterior probability that is a weighted sum of residuals.
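As a minimal numerical illustration of eq. (14), the following sketch combines the residuals of two validated measurements using assumed association probabilities.

```python
import numpy as np

# Sketch of the JPDA combined innovation in eq. (14): the residuals of
# all validated measurements are weighted by their association
# probabilities beta and summed. All values are assumed for the demo.
H = np.eye(2)                                   # measurement matrix
x_pred = np.array([2.0, 3.0])                   # x_hat^t(k | k-1)
z = np.array([[2.2, 3.1], [1.7, 2.8]])          # validated measurements
beta = np.array([0.6, 0.3])                     # association probabilities
                                                # (beta_0 = 0.1 for "none")

residuals = z - (H @ x_pred)                    # v^t_i(k)
v_combined = (beta[:, None] * residuals).sum(axis=0)  # eq. (14)
print("combined innovation:", v_combined)
```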

The main restrictions of JPDA are the following:

(i) a measurement cannot come from more than one target;

(ii) two measurements cannot originate from the same target (at one time instant);

(iii) the sum of the probabilities of all of the measurements that are assigned to one target must be 1: $\sum_{i=0}^{m(k)} \beta^{t}_{i}(k) = 1$.

The main disadvantages of JPDA are the following:

(i) it requires an explicit mechanism for track initialization. Similar to PDA, JPDA cannot initialize new tracks or remove tracks that are out of the observation area;

(ii) JPDA is a computationally expensive algorithm when it is applied in environments that have multiple targets, because the number of hypotheses increases exponentially with the number of targets.

In general, JPDA is more appropriate than MHT in situations in which the density of false measurements is high (e.g., sonar applications).

3.4. Multiple Hypothesis Test. The underlying idea of the multiple hypothesis test (MHT) is to use more than two consecutive observations to make an association with better results. Algorithms that use only two consecutive observations have a higher probability of generating an erroneous association. In contrast to PDA and JPDA, MHT estimates all of the possible hypotheses and maintains the new hypotheses in each iteration.

MHT was developed to track multiple targets in cluttered environments; as a result, it combines the data association problem and tracking into a unified framework, becoming an estimation technique as well. The Bayes rule or Bayesian networks are commonly employed to calculate the MHT hypotheses. In general, researchers have claimed that MHT outperforms JPDA for lower densities of false positives. However, the main disadvantage of MHT is its computational cost when the number of tracks or false positives is incremented. Pruning the hypothesis tree using a window could alleviate this limitation.

The Reid [15] tracking algorithm is considered the standard MHT algorithm, but the initial integer programming formulation of the problem is due to Morefield [16]. MHT is an iterative algorithm in which each iteration starts with a set of correspondence hypotheses. Each hypothesis is a collection of disjoint tracks, and the prediction of each target in the next time instant is computed for each hypothesis. Next, the predictions are compared with the new observations by using a distance metric. The set of associations established in each hypothesis (based on the distance) introduces new hypotheses in the next iteration. Each new hypothesis represents a new set of tracks that is based on the current observations.
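The following sketch caricatures one iteration of this process: it enumerates the association hypotheses between two predicted tracks and two measurements, scores them with a Gaussian likelihood, and keeps the best ones. Track birth and death, false alarms, and tree pruning, which a real MHT must handle, are omitted, and all values are assumptions.

```python
import numpy as np
from itertools import permutations

# Assumed data: predicted track positions and new measurements
preds = np.array([[0.0, 0.0], [5.0, 5.0]])    # predicted track positions
meas = np.array([[0.3, -0.2], [5.2, 4.9]])    # new measurements
sigma, k_best = 0.5, 2

def hypothesis_score(assignment):
    # Product of Gaussian likelihoods over the track-measurement pairs
    score = 1.0
    for t, m in enumerate(assignment):
        d2 = np.sum((preds[t] - meas[m]) ** 2)
        score *= np.exp(-0.5 * d2 / sigma**2)
    return score

# Each hypothesis assigns a distinct measurement to each track
hyps = [(p, hypothesis_score(p)) for p in permutations(range(len(meas)))]
hyps.sort(key=lambda h: h[1], reverse=True)
for assignment, score in hyps[:k_best]:
    print(f"tracks->measurements {assignment}: score {score:.4f}")
```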

Note that each new measurement could come from (i) a new target in the field of view, (ii) a target being tracked, or (iii) noise in the measurement process. It is also possible that a measurement is not assigned to a target, because the target disappears or because it is not possible to obtain a target measurement at that time instant.

MHT maintains several correspondence hypotheses for each target in each frame. If the set of hypotheses at instant $k$ is represented by $H(k) = [h_l(k), k = 1, \ldots, n]$, then the probability of the hypothesis $h_l(k)$ can be represented recursively, using the Bayes rule, as follows:

$$P(h_l(k) \mid Z(k)) = P(h_g(k-1), a_i(k) \mid Z(k)) = \frac{1}{c}\, P(Z(k) \mid h_g(k-1), a_i(k)) \cdot P(a_i(k) \mid h_g(k-1)) \cdot P(h_g(k-1)), \qquad (15)$$

where $h_g(k-1)$ is the hypothesis $g$ of the complete set until the time instant $k-1$, $a_i(k)$ is the $i$th possible association of the track to the object, $Z(k)$ is the set of detections of the current frame, and $c$ is a normalization constant.

The first term on the right side of the previous equation is the likelihood function of the measurement set $Z(k)$, given the joint likelihood and the current hypothesis. The second term is the probability of the association hypothesis of the current data, given the previous hypothesis $h_g(k-1)$. The third term is the probability of the previous hypothesis from which the current hypothesis is calculated.

The MHT algorithm has the ability to detect a new track while maintaining the hypothesis tree structure. The probability of a true track is given by the Bayes decision model as

$$P(\lambda \mid Z) = \frac{P(Z \mid \lambda) \cdot P_{\circ}(\lambda)}{P(Z)}, \qquad (16)$$

where $P(Z \mid \lambda)$ is the probability of obtaining the set of measurements $Z$ given $\lambda$, $P_{\circ}(\lambda)$ is the a priori probability of the source signal, and $P(Z)$ is the probability of obtaining the set of detections $Z$.

MHT considers all of the possibilities, including both the track maintenance and the initialization and removal of tracks, in an integrated framework. MHT calculates the possibility of having an object after the generation of a set of measurements using an exhaustive approach, and the algorithm does not assume a fixed number of targets. The key challenge of MHT is effective hypothesis management. The baseline MHT algorithm can be extended as follows: (i) use hypothesis aggregation for missed targets, births, cardinality tracking, and closely spaced objects; (ii) apply a multistage MHT to improve the performance and robustness in challenging settings; and (iii) use a feature-aided MHT for extended object surveillance.

The main disadvantage of this algorithm is the computational cost, which grows exponentially with the number of tracks and measurements. Therefore, the practical implementation of this algorithm is limited because it is exponential in both time and memory.

With the aim of reducing the computational cost, [17] presented a probabilistic MHT algorithm in which the associations are considered to be random variables that are statistically independent and in which performing an exhaustive search enumeration is avoided. This algorithm is known as PMHT. The PMHT algorithm assumes that the number of targets and measurements is known. With the same goal of reducing the computational cost, [18] presented an efficient implementation of the MHT algorithm. This implementation was the first version to be applied to perform tracking in visual environments. They employed the Murty [19] algorithm to determine the best set of $k$ hypotheses in polynomial time, with the goal of tracking the points of interest.

MHT typically performs the tracking process by employing only one characteristic, commonly the position. The Bayesian combination of multiple characteristics was proposed by Liggins II et al. [20].

A linear-programming-based relaxation approach to the optimization problem in MHT tracking was proposed independently by Coraluppi et al. [21] and Storms and Spieksma [22]. Joo and Chellappa [23] proposed an association algorithm for tracking multiple targets in visual environments. Their algorithm is based on an MHT modification in which a measurement can be associated with more than one target and several targets can be associated with one measurement. They also proposed a combinatorial optimization algorithm to generate the best set of association hypotheses. Their algorithm always finds the best hypothesis, in contrast to other models, which are approximate. Coraluppi and Carthel [24] presented a generalization of the MHT algorithm using a recursion over hypothesis classes rather than over a single hypothesis. This work has been applied in a special case of the multi-target tracking problem, called cardinality tracking, in which they observed the number of sensor measurements instead of the target states.

3.5. Distributed Joint Probabilistic Data Association. The distributed version of the joint probabilistic data association (JPDA-D) was presented by Chang et al. [25]. In this technique, the estimated state of the target (using two sensors) after being associated is given by

$$
E[x \mid Z^1, Z^2] = \sum_{j=0}^{m_1} \sum_{l=0}^{m_2} E[x \mid \chi^1_j, \chi^2_l, Z^1, Z^2] \, P(\chi^1_j, \chi^2_l \mid Z^1, Z^2), \quad (17)
$$

where $m_i$, $i = 1, 2$, denotes the last set of measurements of sensors 1 and 2, $Z^i$, $i = 1, 2$, is the set of accumulated data, and $\chi$ is the association hypothesis. The first term on the right side of the equation is calculated from the associations that were made earlier. The second term is computed from the individual association probabilities as follows:

$$
P(\chi^1_j, \chi^2_l \mid Z^1, Z^2) = \sum_{\chi^1} \sum_{\chi^2} P(\chi^1, \chi^2 \mid Z^1, Z^2)\, \mathbb{1}^1_j(\chi^1)\, \mathbb{1}^2_l(\chi^2),
$$
$$
P(\chi^1, \chi^2 \mid Z^1, Z^2) = \frac{1}{c}\, P(\chi^1 \mid Z^1)\, P(\chi^2 \mid Z^2)\, \gamma(\chi^1, \chi^2), \quad (18)
$$

where $\chi^i$ are the joint hypotheses involving all of the measurements and all of the targets, and $\mathbb{1}^i_j(\chi^i)$ are the binary indicators of the measurement-target association. The additional term $\gamma(\chi^1, \chi^2)$ depends on the correlation of the individual hypotheses and reflects the localization influence of the current measurements on the joint hypotheses.

These equations are obtained assuming that communication exists after every observation; there are only approximations for the case in which communication is sporadic and when a substantial amount of noise occurs. Therefore, this algorithm is a theoretical model that has some limitations in practical applications.

3.6. Distributed Multiple Hypothesis Test. The distributed version of the MHT algorithm (MHT-D) [26, 27] follows a similar structure as the JPDA-D algorithm. Let us assume the case in which one node must fuse two sets of hypotheses and tracks. If the hypothesis and track sets are represented by $H^i(Z^i)$ and $T^i(Z^i)$ with $i = 1, 2$, the hypothesis probabilities are represented by $P(\lambda^i_j)$, and the state distribution of the tracks $\tau^i_j$ is represented by $p(x \mid Z^i, \tau^i_j)$, then the maximum available information in the fusion node is $Z = Z^1 \cup Z^2$. The data fusion objective of the MHT-D is to obtain the set of hypotheses $H(Z)$, the set of tracks $T(Z)$, the hypothesis probabilities $P(\lambda \mid Z)$, and the state distribution $p(x \mid Z, \tau)$ for the observed data.

The MHT-D algorithm is composed of the following steps:

(1) hypothesis formation: for each hypothesis pair $\lambda^1_j$ and $\lambda^2_k$ that could be fused, a track $\tau$ is formed by associating the pair of tracks $\tau^1_j$ and $\tau^2_k$, where each pair comes from one node and could originate from the same target. The final result of this stage is a set of hypotheses, denoted by $H(Z)$, and the fused tracks $T(Z)$;

(2) hypothesis evaluation: in this stage, the association probability of each hypothesis and the estimated state of each fused track are obtained. The distributed estimation algorithm is employed to calculate the likelihood of the possible associations and the obtained estimations at each specific association.


Using the information model, the probability of each fused hypothesis is given by

$$
P(\lambda \mid Z) = C^{-1} \prod_{j \in J} P(\lambda^{(j)} \mid Z^{(j)})^{\alpha(j)} \prod_{\tau \in \lambda} L(\tau \mid Z), \quad (19)
$$

where $C$ is a normalizing constant and $L(\tau \mid Z)$ is the likelihood of each hypothesis pair.

The main disadvantage of the MHT-D is the high computational cost, which is on the order of $O(n^M)$, where $n$ is the number of possible associations and $M$ is the number of variables to be estimated.

3.7. Graphical Models. Graphical models are a formalism for representing and reasoning with probabilities and independence. A graphical model represents a conditional decomposition of the joint probability. A graphical model can be represented as a graph in which the nodes denote random variables, the edges denote the possible dependence between the random variables, and the plates denote the replication of a substructure, with the appropriate indexing of the relevant variables. The graph captures the joint distribution over the random variables, which can be decomposed into a product of factors that each depend on only a subset of variables. There are two major classes of graphical models: (i) Bayesian networks [28], which are also known as directed graphical models, and (ii) Markov random fields, which are also known as undirected graphical models. The directed graphical models are useful for expressing causal relationships between random variables, whereas undirected models are better suited for expressing soft constraints between random variables. We refer the reader to the book of Koller and Friedman [29] for more information on graphical models.

A framework based on graphical models can solve the problem of distributed data association in synchronized sensor networks with overlapping areas and where each sensor receives noisy measurements; this solution was proposed by Chen et al. [30, 31]. Their work is based on graphical models that are used to represent the statistical dependence between random variables. The data association problem is treated as an inference problem and solved by using the max-product algorithm [32]. Graphical models represent statistical dependencies between variables as graphs, and the max-product algorithm converges when the graph is a tree structure. Moreover, the employed algorithm could be implemented in a distributed manner by exchanging messages between the source nodes in parallel. With this algorithm, if each sensor has $n$ possible combinations of associations and there are $M$ variables to be estimated, it has a complexity of $O(n^2 M)$, which is reasonable and less than the $O(n^M)$ complexity of the MHT-D algorithm. However, special attention must be given to the correlated variables when building the graphical model.
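As a toy illustration of the message-passing idea (a minimal sketch, not the distributed algorithm of [30, 31]), the following Python code runs max-product on a three-variable chain with invented potentials; on a tree, the decoded assignment coincides with the exact MAP found by exhaustive enumeration.

```python
import numpy as np

# Pairwise MRF on a chain x1 - x2 - x3, each variable taking one of K states.
# phi[i] are unary potentials; psi12[a, b] scores (x1=a, x2=b), and so on.
K = 3
rng = np.random.default_rng(0)
phi = [rng.random(K) for _ in range(3)]   # unary potentials (invented)
psi12 = rng.random((K, K))                # edge potential x1-x2
psi23 = rng.random((K, K))                # edge potential x2-x3

# Max-product: pass messages inward from the leaves to the root (x2),
# then decode the root and backtrack. Exact on tree-structured graphs.
m1_to_2 = np.max(phi[0][:, None] * psi12, axis=0)      # message x1 -> x2
m3_to_2 = np.max(phi[2][:, None] * psi23.T, axis=0)    # message x3 -> x2
x2 = int(np.argmax(phi[1] * m1_to_2 * m3_to_2))        # decode root
x1 = int(np.argmax(phi[0] * psi12[:, x2]))             # backtrack to x1
x3 = int(np.argmax(phi[2] * psi23[x2, :]))             # backtrack to x3

# Sanity check against exhaustive enumeration of all K^3 assignments.
best = max(((a, b, c) for a in range(K) for b in range(K) for c in range(K)),
           key=lambda s: phi[0][s[0]] * phi[1][s[1]] * phi[2][s[2]]
                         * psi12[s[0], s[1]] * psi23[s[1], s[2]])
assert (x1, x2, x3) == best
print("MAP assignment:", (x1, x2, x3))
```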

4. State Estimation Methods

State estimation techniques aim to determine the state of the target under movement (typically the position) given the observations or measurements. State estimation techniques are also known as tracking techniques. In their general form, it is not guaranteed that the target observations are relevant, which means that some of the observations could actually come from the target and others could be only noise. The state estimation phase is a common stage in data fusion algorithms because the target's observations could come from different sensors or sources, and the final goal is to obtain a global target state from the observations.

The estimation problem involves finding the values of the vector state (e.g., position, velocity, and size) that fit as much as possible with the observed data. From a mathematical perspective, we have a set of redundant observations, and the goal is to find the set of parameters that provides the best fit to the observed data. In general, these observations are corrupted by errors and by the propagation of noise in the measurement process. State estimation methods fall under level 1 of the JDL classification and can be divided into two broader groups:

(1) linear dynamics and measurements: here, the estimation problem has a standard solution. Specifically, when the equations of the object state and the measurements are linear, the noise follows the Gaussian distribution, and the environment is not cluttered, the optimal theoretical solution is based on the Kalman filter;

(2) nonlinear dynamics: the state estimation problem becomes difficult, and there is no analytical solution to solve the problem in a general manner. In principle, there are no practical algorithms available to solve this problem satisfactorily.

Most of the state estimation methods are based on control theory and employ the laws of probability to compute a vector state from a vector measurement or a stream of vector measurements. Next, the most common estimation methods are presented, including maximum likelihood and maximum posterior (Section 4.1), the Kalman filter (Section 4.2), the particle filter (Section 4.3), the distributed Kalman filter (Section 4.4), the distributed particle filter (Section 4.5), and covariance consistency methods (Section 4.6).

4.1. Maximum Likelihood and Maximum Posterior. The maximum likelihood (ML) technique is an estimation method that is based on probabilistic theory. Probabilistic estimation methods are appropriate when the state variable follows an unknown probability distribution [33]. In the context of data fusion, $x$ is the state that is being estimated, and $z = (z(1), \ldots, z(k))$ is a sequence of $k$ previous observations of $x$. The likelihood function $\lambda(x)$ is defined as a probability density function of the sequence of $z$ observations given the true value of the state $x$. Consider

$$
\lambda(x) = p(z \mid x). \quad (20)
$$

The ML estimator finds the value of $x$ that maximizes the likelihood function:

$$
\hat{x}(k) = \arg\max_x \, p(z \mid x), \quad (21)
$$


which can be obtained from the analytical or empirical models of the sensors. This function expresses the probability of the observed data. The main disadvantage of this method in practice is that it requires the analytical or empirical model of the sensor to be known to provide the prior distribution and compute the likelihood function. This method can also systematically underestimate the variance of the distribution, which leads to a bias problem. However, the bias of the ML solution becomes less significant as the number $N$ of data points increases, and it is equal to the true variance of the distribution that generated the data in the limit $N \to \infty$.

The maximum posterior (MAP) method is based on Bayesian theory. It is employed when the parameter $x$ to be estimated is the output of a random variable that has a known probability density function $p(x)$. In the context of data fusion, $x$ is the state that is being estimated, and $z = (z(1), \ldots, z(k))$ is a sequence of $k$ previous observations of $x$. The MAP estimator finds the value of $x$ that maximizes the posterior probability distribution as follows:

$$
\hat{x}(k) = \arg\max_x \, p(x \mid z). \quad (22)
$$

Both methods (ML and MAP) aim to find the most likely value for the state $x$. However, ML assumes that $x$ is a fixed but unknown point from the parameter space, whereas MAP considers $x$ to be the output of a random variable with a known a priori probability density function. Both of these methods are equivalent when there is no a priori information about $x$, that is, when there are only observations.
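The following minimal Python sketch illustrates this relationship for a Gaussian toy model (all numerical values are invented for the example): the ML estimate of a constant state observed in Gaussian noise is the sample mean, whereas the MAP estimate shrinks it toward the mean of an assumed Gaussian prior; as the prior becomes uninformative, the two coincide.

```python
import numpy as np

# Estimate a scalar state x from k noisy observations z(i) = x + v(i),
# with v ~ N(0, sigma_z^2). For this model the ML estimate is the sample
# mean; MAP additionally assumes a prior x ~ N(mu0, sigma0^2).
rng = np.random.default_rng(1)
x_true, sigma_z = 5.0, 2.0
z = x_true + sigma_z * rng.standard_normal(20)   # simulated observations

x_ml = z.mean()                                  # argmax_x p(z | x)

mu0, sigma0 = 0.0, 3.0                           # assumed prior on x
# Posterior mode of a Gaussian prior/likelihood pair: a precision-weighted
# average of the sample mean and the prior mean.
w = (len(z) / sigma_z**2) / (len(z) / sigma_z**2 + 1 / sigma0**2)
x_map = w * z.mean() + (1 - w) * mu0             # argmax_x p(x | z)

print(f"ML estimate: {x_ml:.3f}   MAP estimate: {x_map:.3f}")
```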

4.2. The Kalman Filter. The Kalman filter is the most popular estimation technique. It was originally proposed by Kalman [34] and has been widely studied and applied since then. The Kalman filter estimates the state $x$ of a discrete-time process governed by the following space-time model:

$$
x(k+1) = \Phi(k)\, x(k) + G(k)\, u(k) + w(k), \quad (23)
$$

with the observations or measurements $z$ at time $k$ of the state $x$ represented by

$$
z(k) = H(k)\, x(k) + v(k), \quad (24)
$$

where $\Phi(k)$ is the state transition matrix, $G(k)$ is the input transition matrix, $u(k)$ is the input vector, $H(k)$ is the measurement matrix, and $w$ and $v$ are random Gaussian variables with zero mean and covariance matrices $Q(k)$ and $R(k)$, respectively. Based on the measurements and on the system parameters, the estimation of $x(k)$, which is represented by $\hat{x}(k)$, and the prediction of $x(k+1)$, which is represented by $\hat{x}(k+1 \mid k)$, are given by

$$
\hat{x}(k) = \hat{x}(k \mid k-1) + K(k)\left[ z(k) - H(k)\, \hat{x}(k \mid k-1) \right],
$$
$$
\hat{x}(k+1 \mid k) = \Phi(k)\, \hat{x}(k \mid k) + G(k)\, u(k), \quad (25)
$$

respectively, where $K$ is the filter gain, determined by

$$
K(k) = P(k \mid k-1)\, H^T(k) \left[ H(k)\, P(k \mid k-1)\, H^T(k) + R(k) \right]^{-1}, \quad (26)
$$

where $P(k \mid k-1)$ is the prediction covariance matrix, which can be determined by

$$
P(k+1 \mid k) = \Phi(k)\, P(k)\, \Phi^T(k) + Q(k), \quad (27)
$$

with

$$
P(k) = P(k \mid k-1) - K(k)\, H(k)\, P(k \mid k-1). \quad (28)
$$
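To make the recursion concrete, the following is a minimal sketch of the predict/update cycle of equations (25)-(28) for an assumed one-dimensional constant-velocity target with no control input; the matrices and measurement values are illustrative only.

```python
import numpy as np

# One-dimensional constant-velocity target; state x = [position, velocity]^T.
dt = 1.0
Phi = np.array([[1.0, dt], [0.0, 1.0]])   # state transition matrix Phi(k)
H = np.array([[1.0, 0.0]])                # measurement matrix: position only
Q = 0.01 * np.eye(2)                      # process noise covariance Q(k)
R = np.array([[0.5]])                     # measurement noise covariance R(k)

x = np.array([[0.0], [1.0]])              # initial state estimate
P = np.eye(2)                             # initial covariance

for z in [1.1, 1.9, 3.2]:                 # stream of position measurements
    # Prediction: equations (25) and (27), with no input u(k)
    x = Phi @ x
    P = Phi @ P @ Phi.T + Q
    # Update: gain (26), corrected state (25), corrected covariance (28)
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    x = x + K @ (np.array([[z]]) - H @ x)
    P = P - K @ H @ P
    print(f"z={z:.1f} -> pos={x[0, 0]:.2f}, vel={x[1, 0]:.2f}")
```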

The Kalman filter is mainly employed to fuse low-level data. If the system can be described with a linear model and the error can be modeled as Gaussian noise, then the recursive Kalman filter obtains optimal statistical estimations [35]. However, other methods are required to address nonlinear dynamic models and nonlinear measurements. The modified Kalman filter known as the extended Kalman filter (EKF) is an optimal approach for implementing nonlinear recursive filters [36]. The EKF is one of the most often employed methods for fusing data in robotic applications. However, it has some disadvantages because the computations of the Jacobians are extremely expensive. Some attempts have been made to reduce the computational cost, such as linearization, but these attempts introduce errors in the filter and make it unstable.

The unscented Kalman filter (UKF) [37] has gained popularity because it does not have the linearization step and the associated errors of the EKF [38]. The UKF employs a deterministic sampling strategy to establish the minimum set of points around the mean. This set of points captures the true mean and covariance completely. Then, these points are propagated through the nonlinear functions, and the covariance of the estimations can be recovered. Another advantage of the UKF is its ability to be employed in parallel implementations.
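A small sketch of the deterministic sampling step just described (the unscented transform), using the simple symmetric sigma-point set with a scaling parameter kappa; the test function and all numerical values are invented for the example.

```python
import numpy as np

# Unscented transform: 2n+1 sigma points that exactly capture the mean
# and covariance of the state before propagation through a nonlinearity.
def sigma_points(mean, cov, kappa=1.0):
    n = len(mean)
    S = np.linalg.cholesky((n + kappa) * cov)     # matrix square root
    pts = [mean] + [mean + S[:, i] for i in range(n)] \
                 + [mean - S[:, i] for i in range(n)]
    w = np.full(2 * n + 1, 1.0 / (2 * (n + kappa)))
    w[0] = kappa / (n + kappa)                    # weights sum to one
    return np.array(pts), w

mean = np.array([1.0, 2.0])
cov = np.array([[1.0, 0.3], [0.3, 2.0]])
pts, w = sigma_points(mean, cov)

# Propagate the points through a nonlinear function, then recover the
# transformed mean and covariance from the weighted samples.
y = np.array([[p[0] ** 2, np.sin(p[1])] for p in pts])
y_mean = w @ y
y_cov = (y - y_mean).T @ np.diag(w) @ (y - y_mean)
print(y_mean, y_cov, sep="\n")
```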

4.3. Particle Filter. Particle filters are recursive implementations of the sequential Monte Carlo methods [39]. This method builds the posterior density function using several random samples called particles. Particles are propagated over time with a combination of sampling and resampling steps. At each iteration, the sampling step is employed to discard some particles, increasing the relevance of regions with a higher posterior probability. In the filtering process, several particles of the same state variable are employed, and each particle has an associated weight that indicates the quality of the particle. Therefore, the estimation is the result of a weighted sum of all of the particles. The standard particle filter algorithm has two phases: (1) the predicting phase and (2) the updating phase. In the predicting phase, each particle is modified according to the existing model, including the addition of random noise to simulate the noise effect. Then, in the updating phase, the weight of each particle is reevaluated using the last available sensor observation, and particles with lower weights are removed. Specifically, a generic particle filter comprises the following steps.

(1) Initialization of the particles:

(i) let $N$ be the number of particles;
(ii) $X^{(i)}(1) = [x(1), y(1), 0, 0]^T$ for $i = 1, \ldots, N$.

(2) Prediction step:

(i) for each particle $i = 1, \ldots, N$, evaluate the state $(k+1 \mid k)$ of the system using the state at time instant $k$ with the noise of the system at time $k$. Consider

$$
X^{(i)}(k+1 \mid k) = F(k)\, X^{(i)}(k) + (\text{cauchy-distribution-noise})(k), \quad (29)
$$

where $F(k)$ is the transition matrix of the system.

(3) Evaluate the particle weights. For each particle $i = 1, \ldots, N$:

(i) compute the predicted observation state of the system using the current predicted state and the noise at instant $k$. Consider

$$
\hat{z}^{(i)}(k+1 \mid k) = H(k+1)\, X^{(i)}(k+1 \mid k) + (\text{gaussian-measurement-noise})(k+1). \quad (30)
$$

(ii) compute the likelihood (weights) according to the given distribution. Consider

$$
\text{likelihood}^{(i)} = N\left(\hat{z}^{(i)}(k+1 \mid k);\ z(k+1),\ \text{var}\right). \quad (31)
$$

(iii) normalize the weights as follows:

$$
w^{(i)} = \frac{\text{likelihood}^{(i)}}{\sum_{j=1}^{N} \text{likelihood}^{(j)}}. \quad (32)
$$

(4) Resampling/Selection: multiply particles with higher weights and remove those with lower weights. The current state must be adjusted using the computed weights of the new particles.

(i) Compute the cumulative weights. Consider

$$
\text{CumWt}^{(i)} = \sum_{j=1}^{i} w^{(j)}. \quad (33)
$$

(ii) Generate uniformly distributed random variables $U^{(i)} \sim U(0, 1)$, with the number of draws equal to the number of particles.

(iii) Determine which particles should be multiplied and which ones removed.

(5) Propagation phase:

(i) incorporate the new values of the state after the resampling at instant $k$ to calculate the value at instant $k+1$. Consider

$$
x^{(1:N)}(k+1 \mid k+1) = x(k+1 \mid k). \quad (34)
$$

(ii) compute the posterior mean. Consider

$$
\hat{x}(k+1) = \text{mean}\left[ x^{(i)}(k+1 \mid k+1) \right], \quad i = 1, \ldots, N. \quad (35)
$$

(iii) repeat steps 2 to 5 for each time instant.

Particle filters are more flexible than the Kalman filters and can cope with nonlinear dependencies and non-Gaussian densities in the dynamic model and in the noise error. However, they have some disadvantages. A large number of particles are required to obtain a small variance in the estimator. It is also difficult to establish the optimal number of particles in advance, and the number of particles affects the computational cost significantly. Earlier versions of particle filters employed a fixed number of particles, but recent studies have started to use a dynamic number of particles [40].
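The following is a compact sketch of the generic predict-weight-resample loop described above, for an assumed scalar random-walk state observed in Gaussian noise; for clarity it uses Gaussian process noise and multinomial resampling, and all model parameters are illustrative.

```python
import numpy as np

# Generic SIR particle filter for a scalar random-walk state observed
# in Gaussian noise; every parameter below is an illustrative choice.
rng = np.random.default_rng(2)
N, steps = 500, 30
q, r = 0.5, 1.0                           # process / measurement std dev

x_true = 0.0
particles = rng.standard_normal(N)        # (1) initialization
for k in range(steps):
    x_true += q * rng.standard_normal()             # simulate the target
    z = x_true + r * rng.standard_normal()          # noisy observation
    particles += q * rng.standard_normal(N)         # (2) prediction
    w = np.exp(-0.5 * ((z - particles) / r) ** 2)   # (3) likelihood weights
    w /= w.sum()                                    #     normalization
    idx = rng.choice(N, size=N, p=w)                # (4) resampling
    particles = particles[idx]
    estimate = particles.mean()                     # (5) posterior mean
print(f"true={x_true:.2f}  estimate={estimate:.2f}")
```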

4.4. The Distributed Kalman Filter. The distributed Kalman filter requires correct clock synchronization between each source, as demonstrated in [41]. In other words, to correctly use the distributed Kalman filter, the clocks from all of the sources must be synchronized. This synchronization is typically achieved through protocols that employ a shared global clock, such as the network time protocol (NTP). Synchronization problems between clocks have been shown to have an effect on the accuracy of the Kalman filter, producing inaccurate estimations [42].

If the estimations are consistent and the cross covariance is known (or the estimations are uncorrelated), then it is possible to use the distributed Kalman filters [43]. However, the cross covariance must be determined exactly, or the observations must be consistent.

We refer the reader to Liggins II et al. [20] for more details about the Kalman filter in a distributed and hierarchical architecture.

4.5. Distributed Particle Filter. Distributed particle filters have gained attention recently [44-46]. Coates [45] used a distributed particle filter to monitor an environment that could be captured by a Markovian state-space model involving nonlinear dynamics and observations and non-Gaussian noise.

In contrast, earlier attempts to solve out-of-sequence measurements using particle filters are based on regenerating the probability density function to the time instant of the out-of-sequence measurement [47]. In a particle filter, this step requires a large computational cost, in addition to the space necessary to store the previous particles. To avoid this problem, Orton and Marrs [48] proposed to store the information on the particles at each time instant, saving the cost of recalculating this information. This technique is close to optimal, and when the delay increases, the result is only slightly affected [49]. However, it requires a very large amount of space to store the state of the particles at each time instant.

4.6. Covariance Consistency Methods: Covariance Intersection/Union. Covariance consistency methods (intersection and union) were proposed by Uhlmann [43] and are general and fault-tolerant frameworks for maintaining covariance means and estimations in a distributed network. These methods do not comprise estimation techniques; instead, they are similar to an estimation fusion technique. The distributed Kalman filter requirement of independent measurements or known cross-covariances is not a constraint with this method.

4.6.1. Covariance Intersection. If the Kalman filter is employed to combine two estimations $(a_1, A_1)$ and $(a_2, A_2)$, then it is assumed that the joint covariance is of the following form:

$$
\begin{bmatrix} A_1 & X \\ X^T & A_2 \end{bmatrix}, \quad (36)
$$

where the cross-covariance $X$ should be known exactly so that the Kalman filter can be applied without difficulty. Because the computation of the cross-covariances is computationally intensive, Uhlmann [43] proposed the covariance intersection (CI) algorithm.

Let us assume that a joint covariance $M$ can be defined with the diagonal blocks $M_{A_1} > A_1$ and $M_{A_2} > A_2$. Consider

$$
M \geq \begin{bmatrix} A_1 & X \\ X^T & A_2 \end{bmatrix} \quad (37)
$$

for every possible instance of the unknown cross-covariance $X$; then, the components of the matrix $M$ can be employed in the Kalman filter equations to provide a fused estimation $(c, C)$ that is considered consistent. The key point of this method relies on generating a joint covariance matrix $M$ that can represent a useful fused estimation (in this context, useful refers to something with a lower associated uncertainty). In summary, the CI algorithm computes the joint covariance matrix $M$ for which the Kalman filter provides the best fused estimation $(c, C)$ with respect to a fixed measurement of the covariance matrix (i.e., the minimum determinant).

Specific covariance criteria must be established because there is no unique minimum joint covariance in the order of the positive semidefinite matrices. Moreover, although the joint covariance is the basis of the formal analysis of the CI algorithm, the actual result is a nonlinear mixture of the information stored in the estimations being fused, following the equation

$$
C = \left( w_1 H_1^T A_1^{-1} H_1 + w_2 H_2^T A_2^{-1} H_2 + \cdots + w_n H_n^T A_n^{-1} H_n \right)^{-1},
$$
$$
c = C \left( w_1 H_1^T A_1^{-1} a_1 + w_2 H_2^T A_2^{-1} a_2 + \cdots + w_n H_n^T A_n^{-1} a_n \right), \quad (38)
$$

where $H_i$ is the transformation from the fused state-space estimation to the space of the estimated state $i$. The values of $w$ can be calculated to minimize the covariance determinant using convex optimization packages and semipositive matrix programming. The result of the CI algorithm has different characteristics compared to the Kalman filter. For example, if two estimations $(a, A)$ and $(b, B)$ are provided and their covariances are equal, $A = B$, then, because the Kalman filter is based on the statistical independence assumption, it produces a fused estimation with covariance $C = (1/2)A$. In contrast, the CI method does not assume independence and thus must be consistent even in the case in which the estimations are completely correlated, with the estimated fused covariance $C = A$. In the case of estimations where $A < B$, the CI algorithm does not provide information about the estimation $(b, B)$; thus, the fused result is $(a, A)$.
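As an illustration, the following is a minimal sketch of covariance intersection for two estimates with $H_i = I$, choosing the scalar weight by a simple grid search over the determinant of the fused covariance instead of a convex optimization package; the estimates themselves are invented for the example.

```python
import numpy as np

# Covariance intersection of two estimates (a, A) and (b, B), assuming
# identity observation matrices. The weight w is picked by grid search
# to minimize det(C), the fixed covariance measurement of equation (38).
def covariance_intersection(a, A, b, B, grid=101):
    best = None
    for w in np.linspace(0.01, 0.99, grid):
        C = np.linalg.inv(w * np.linalg.inv(A) + (1 - w) * np.linalg.inv(B))
        if best is None or np.linalg.det(C) < best[0]:
            c = C @ (w * np.linalg.inv(A) @ a + (1 - w) * np.linalg.inv(B) @ b)
            best = (np.linalg.det(C), c, C)
    return best[1], best[2]

a, A = np.array([1.0, 0.0]), np.array([[2.0, 0.5], [0.5, 1.0]])
b, B = np.array([1.5, 0.4]), np.array([[1.0, 0.0], [0.0, 3.0]])
c, C = covariance_intersection(a, A, b, B)
print("fused mean:", c)
print("fused covariance:\n", C)
```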

Every jointly consistent covariance is sufficient to produce a fused estimation, which guarantees consistency. However, it is also necessary to guarantee a lack of divergence. Divergence is avoided in the CI algorithm by choosing a specific measurement (i.e., the determinant), which is minimized in each fusion operation. This measurement represents a nondivergence criterion because the size of the estimated covariance, according to this criterion, would not be incremented.

The application of the CI method guarantees consistency and nondivergence for every sequence of mean- and covariance-consistent estimations. However, this method does not work well when the measurements to be fused are inconsistent.

4.6.2. Covariance Union. CI solves the problem of correlated inputs but not the problem of inconsistent inputs (inconsistent inputs refer to different estimations, each of which has a high accuracy (small variance) but also a large difference from the states of the others); thus, the covariance union (CU) algorithm was proposed to solve the latter [43]. CU addresses the following problem: two estimations $(a_1, A_1)$ and $(a_2, A_2)$ relate to the state of an object and are mutually inconsistent with one another. This issue arises when the difference between the average estimations is larger than the provided covariance. Inconsistent inputs can be detected using the Mahalanobis distance [50] between them, which is defined as

$$
M_d = (a_1 - a_2)^T (A_1 + A_2)^{-1} (a_1 - a_2), \quad (39)
$$

and detecting whether this distance is larger than a given threshold.

The Mahalanobis distance accounts for the covariance information to obtain the distance. If the difference between the estimations is high but their covariance is also high, the Mahalanobis distance yields a small value. In contrast, if the difference between the estimations is small and the covariances are small, it could produce a larger distance value. A high Mahalanobis distance could indicate that the estimations are inconsistent; however, it is necessary to have a specific threshold, established by the user or learned automatically.
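A minimal sketch of this consistency test, equation (39), follows; the threshold value is an assumption that would normally be set by the user or learned from data.

```python
import numpy as np

# Inconsistency test of equation (39): flag two estimates as mutually
# inconsistent when their Mahalanobis distance exceeds a threshold.
def inconsistent(a1, A1, a2, A2, threshold=9.0):   # threshold is illustrative
    d = a1 - a2
    md = float(d @ np.linalg.inv(A1 + A2) @ d)
    return md, md > threshold

# Two confident estimates (small covariances) that disagree strongly.
a1, A1 = np.array([0.0, 0.0]), 0.1 * np.eye(2)
a2, A2 = np.array([3.0, 0.0]), 0.1 * np.eye(2)
md, bad = inconsistent(a1, A1, a2, A2)
print(f"M_d = {md:.1f} -> inconsistent: {bad}")
```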

The CU algorithm aims to solve the following problem: let us suppose that a filtering algorithm provides two observations with mean and covariance $(a_1, A_1)$ and $(a_2, A_2)$, respectively. It is known that one of the observations is correct and the other is erroneous. However, the identity of the correct estimation is unknown and cannot be determined. In this situation, if both estimations are employed as an input to the Kalman filter, there will be a problem because the Kalman filter only guarantees a consistent output if the observation is updated with a measurement consistent with both of them. In the specific case in which the measurements correspond to the same object but are acquired from two different sensors, the Kalman filter can only guarantee that the output is consistent if it is consistent with both separately. Because it is not possible to know which estimation is correct, the only way to combine the two estimations rigorously is to provide an estimation $(u, U)$ that is consistent with both estimations and obeys the following properties:

$$
U \succeq A_1 + (u - a_1)(u - a_1)^T,
$$
$$
U \succeq A_2 + (u - a_2)(u - a_2)^T, \quad (40)
$$

where some measurement of the matrix size $U$ (i.e., the determinant) is minimized.

In other words, the previous equations indicate that if the estimation $(a_1, A_1)$ is consistent, then the translation of the vector $a_1$ to $u$ requires increasing the covariance by the addition of a matrix at least as large as the outer product $(u - a_1)(u - a_1)^T$ in order to remain consistent. The same situation applies to the measurement $(a_2, A_2)$ in order to be consistent.

A simple strategy is to choose as the fused mean the mean of one of the measurements ($u = a_1$). In this case, the value of $U$ must be chosen such that the estimation is consistent in the worst case (the correct measurement is $a_2$). However, it is possible to assign $u$ an intermediate value between $a_1$ and $a_2$ to decrease the value of $U$. Therefore, the CU algorithm establishes the fused mean value $u$ that has the least covariance $U$ that is still sufficiently large with respect to the two measurements ($a_1$ and $a_2$) for consistency.

Because the matrix inequalities presented in the previous equations are convex, convex optimization algorithms must be employed to solve them. The value of $U$ can be computed with the iterative method described by Julier et al. [51]. The obtained covariance could be significantly larger than any of the initial covariances and is an indicator of the existing uncertainty between the initial estimations. One of the advantages of the CU method arises from the fact that the same process can easily be extended to $N$ inputs.

5. Decision Fusion Methods

A decision is typically taken based on the knowledge of the perceived situation, which, in the data fusion domain, is provided by many sources. Decision fusion techniques aim to make a high-level inference about the events and activities that are produced from the detected targets. These techniques often use symbolic information, and the fusion process requires reasoning while accounting for the uncertainties and constraints. These methods fall under level 2 (situation assessment) and level 4 (impact assessment) of the JDL data fusion model.

5.1. The Bayesian Methods. Information fusion based on Bayesian inference provides a formalism for combining evidence according to the rules of probability theory. Uncertainty is represented using conditional probability terms that describe beliefs and take on values in the interval $[0, 1]$, where zero indicates a complete lack of belief and one indicates an absolute belief. Bayesian inference is based on the Bayes rule as follows:

$$
P(Y \mid X) = \frac{P(X \mid Y)\, P(Y)}{P(X)}, \quad (41)
$$

where the posterior probability $P(Y \mid X)$ represents the belief in the hypothesis $Y$ given the information $X$. This probability is obtained by multiplying the a priori probability of the hypothesis, $P(Y)$, by the probability of having $X$ given that $Y$ is true, $P(X \mid Y)$. The value $P(X)$ is used as a normalizing constant. The main disadvantage of Bayesian inference is that the probabilities $P(X)$ and $P(X \mid Y)$ must be known. To estimate the conditional probabilities, Pan et al. [52] proposed the use of NNs, whereas Coue et al. [53] proposed Bayesian programming.
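As a small worked example of equation (41), the following sketch fuses two conditionally independent binary sensor reports about a single hypothesis; all probability values are invented for the illustration.

```python
# Bayes rule of equation (41) for a binary hypothesis Y ("target present")
# given two conditionally independent sensor alarms X1 and X2.
prior = 0.30                              # P(Y), illustrative
p_x1_given = {True: 0.80, False: 0.10}    # P(X1 = alarm | Y), P(X1 = alarm | not Y)
p_x2_given = {True: 0.70, False: 0.20}    # P(X2 = alarm | Y), P(X2 = alarm | not Y)

# Numerator and the complementary term of the normalizing constant P(X).
joint_y = p_x1_given[True] * p_x2_given[True] * prior
joint_not_y = p_x1_given[False] * p_x2_given[False] * (1 - prior)

posterior = joint_y / (joint_y + joint_not_y)   # P(Y | X1, X2)
print(f"P(Y | both sensors alarm) = {posterior:.3f}")
```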

Hall and Llinas [54] described the following problems associated with Bayesian inference:

(i) Difficulty in establishing the value of a priori probabilities.

(ii) Complexity when there are multiple potential hypotheses and a substantial number of events that depend on the conditions.

(iii) The hypotheses should be mutually exclusive.

(iv) Difficulty in describing the uncertainty of the decisions.

5.2. The Dempster-Shafer Inference. The Dempster-Shafer inference is based on the mathematical theory introduced by Dempster [55] and Shafer [56], which generalizes the Bayesian theory. The Dempster-Shafer theory provides a formalism that can be used to represent incomplete knowledge, update beliefs, and combine evidence, and it allows us to represent uncertainty explicitly [57].

A fundamental concept in Dempster-Shafer reasoning is the frame of discernment, which is defined as follows. Let $\Theta = \{\theta_1, \theta_2, \ldots, \theta_N\}$ be the set of all possible states that define the system, and let $\Theta$ be exhaustive and mutually exclusive, because the system can be in only one state $\theta_i \in \Theta$, where $1 \leq i \leq N$. The set $\Theta$ is called a frame of discernment because its elements are employed to discern the current state of the system.

The elements of the set $2^\Theta$ are called hypotheses. In the Dempster-Shafer theory, based on the evidence $E$, a probability is assigned to each hypothesis $H \in 2^\Theta$ according to the basic probability assignment, or mass function, $m: 2^\Theta \to [0, 1]$, which satisfies

$$
m(\emptyset) = 0. \quad (42)
$$


Thus, the mass function of the empty set is zero. Furthermore, the mass function of a hypothesis is larger than or equal to zero for all of the hypotheses. Consider

$$
m(H) \geq 0 \quad \forall H \in 2^\Theta. \quad (43)
$$

The sum of the mass function over all the hypotheses is one. Consider

$$
\sum_{H \in 2^\Theta} m(H) = 1. \quad (44)
$$

To express incomplete beliefs in a hypothesis $H$, the Dempster-Shafer theory defines the belief function $\text{bel}: 2^\Theta \to [0, 1]$ over $\Theta$ as

$$
\text{bel}(H) = \sum_{A \subseteq H} m(A), \quad (45)
$$

where $\text{bel}(\emptyset) = 0$ and $\text{bel}(\Theta) = 1$. The doubt level in $H$ can be expressed in terms of the belief function by

$$
\text{dou}(H) = \text{bel}(\neg H) = \sum_{A \subseteq \neg H} m(A). \quad (46)
$$

To express the plausibility of each hypothesis, the function $\text{pl}: 2^\Theta \to [0, 1]$ over $\Theta$ is defined as

$$
\text{pl}(H) = 1 - \text{dou}(H) = \sum_{A \cap H \neq \emptyset} m(A). \quad (47)
$$

Intuitively, plausibility indicates that there is less uncertainty in hypothesis $H$ if it is more plausible. The confidence interval $[\text{bel}(H), \text{pl}(H)]$ defines the true belief in hypothesis $H$. To combine the effects of two mass functions $m_1$ and $m_2$, the Dempster-Shafer theory defines the combination rule $m_1 \oplus m_2$ as

$$
m_1 \oplus m_2(\emptyset) = 0,
$$
$$
m_1 \oplus m_2(H) = \frac{\sum_{X \cap Y = H} m_1(X)\, m_2(Y)}{1 - \sum_{X \cap Y = \emptyset} m_1(X)\, m_2(Y)}. \quad (48)
$$

In contrast to Bayesian inference, a priori probabilities are not required in the Dempster-Shafer inference, because they are assigned at the instant that the information is provided. Several studies in the literature have compared the use of the Bayesian inference and the Dempster-Shafer inference, such as [58-60]. Wu et al. [61] used the Dempster-Shafer theory to fuse information in context-aware environments. This work was extended in [62] to dynamically modify the weights associated with the sensor measurements. Therefore, the fusion mechanism is calibrated according to the recent measurements of the sensors (in cases in which the ground truth is available). In the military domain [63], Dempster-Shafer reasoning is used with the a priori information stored in a database for classifying military ships. Morbee et al. [64] described the use of the Dempster-Shafer theory to build 2D occupancy maps from several cameras and to evaluate the contribution of subsets of cameras to a specific task. Each task is the observation of an event of interest, and the goal is to assess the validity of a set of hypotheses that are fused using the Dempster-Shafer theory.
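A minimal sketch of Dempster's rule of combination, equation (48), over a two-element frame of discernment; the mass assignments are invented for the example.

```python
from itertools import product

# Dempster's rule of combination, equation (48). Hypotheses are subsets
# of the frame of discernment, represented here as frozensets.
def combine(m1, m2):
    fused, conflict = {}, 0.0
    for (X, mx), (Y, my) in product(m1.items(), m2.items()):
        inter = X & Y
        if inter:
            fused[inter] = fused.get(inter, 0.0) + mx * my
        else:
            conflict += mx * my        # mass assigned to the empty set
    # Normalize by 1 - conflict so that the fused masses sum to one.
    return {H: v / (1.0 - conflict) for H, v in fused.items()}

A, B, AB = frozenset("a"), frozenset("b"), frozenset("ab")
m1 = {A: 0.6, B: 0.1, AB: 0.3}         # evidence from source 1 (invented)
m2 = {A: 0.5, B: 0.3, AB: 0.2}         # evidence from source 2 (invented)
for H, v in combine(m1, m2).items():
    print(set(H), round(v, 3))
```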

5.3. Abductive Reasoning. Abductive reasoning, or inferring the best explanation, is a reasoning method in which a hypothesis is chosen under the assumption that, in case it is true, it explains the observed event most accurately [65]. In other words, when an event is observed, the abduction method attempts to find the best explanation.

In the context of probabilistic reasoning, abductive inference finds the maximum a posteriori assignment of the system variables given some observed variables. Abductive reasoning is more a reasoning pattern than a data fusion technique. Therefore, different inference methods, such as NNs [66] or fuzzy logic [67], can be employed.

5.4. Semantic Methods. Decision fusion techniques that employ semantic data from different sources as an input can provide more accurate results than those that rely on only single sources. There is a growing interest in techniques that automatically determine the presence of semantic features in videos to solve the semantic gap [68].

Semantic information fusion is essentially a scheme in which raw sensor data are processed such that the nodes exchange only the resultant semantic information. Semantic information fusion typically covers two phases: (i) building the knowledge and (ii) pattern matching (inference). The first phase (typically offline) incorporates the most appropriate knowledge into semantic information. Then, the second phase (typically online or in real time) fuses relevant attributes and provides a semantic interpretation of the sensor data [69-71].

Semantic fusion can be viewed as an approach for integrating and translating sensor data into formal languages. The resulting language obtained from the observations of the environment is then compared with similar languages that are stored in the database. The key to this strategy is that similar behaviors represented by formal languages are also semantically similar. This type of method provides savings in the cost of transmission because the nodes need only transmit the formal language structure instead of the raw data. However, a known set of behaviors must be stored in a database in advance, which might be difficult in some scenarios.

6. Conclusions

This paper reviews the most popular methods and techniques for performing data/information fusion. To determine whether the application of data/information fusion methods is feasible, we must evaluate the computational cost of the process and the delay introduced in the communication. A centralized data fusion approach is theoretically optimal when there is no cost of transmission and there are sufficient computational resources; however, this situation typically does not hold in practical applications.

The selection of the most appropriate technique depends on the type of problem and the established assumptions of each technique. Statistical data fusion methods (e.g., PDA, JPDA, MHT, and the Kalman filter) are optimal under specific conditions [72]. First, the assumption that the targets are moving independently and that the measurements are normally distributed around the predicted position typically does not hold. Second, because the statistical techniques model all of the events as probabilities, they typically have several parameters and a priori probabilities for false measurements and detection errors that are often difficult to obtain (at least in an optimal sense). For example, in the case of the MHT algorithm, specific parameters must be established that are nontrivial to determine and are very sensitive [73]. In addition, statistical methods that optimize over several frames are computationally intensive, and their complexity typically grows exponentially with the number of targets. For example, in the case of particle filters, tracking several targets can be accomplished jointly, as a group, or individually. If several targets are tracked jointly, the necessary number of particles grows exponentially. Therefore, in practice, it is better to perform tracking on them individually, with the assumption that targets do not interact between the particles.

In contrast to centralized systems, the distributed data fusion methods introduce some challenges in the data fusion process, such as (i) spatial and temporal alignment of the information, (ii) out-of-sequence measurements, and (iii) data correlation, as reported by Castanedo et al. [74, 75]. The inherent redundancy of distributed systems could be exploited with distributed reasoning techniques and cooperative algorithms to improve the individual node estimations, as reported by Castanedo et al. [76]. In addition to the previous studies, a new trend based on the geometric notion of a low-dimensional manifold is gaining attention in the data fusion community. An example is the work of Davenport et al. [77], which proposes a simple model that captures the correlation between the sensor observations by matching the parameter values for the different obtained manifolds.

Acknowledgments

The author would like to thank Jesús García, Miguel A. Patricio, and James Llinas for their interesting and related discussions on several topics that were presented in this paper.

References

[1] JDL, Data Fusion Lexicon, Technical Panel for C3, F. E. White, San Diego, Calif, USA, Code 420, 1991.

[2] D. L. Hall and J. Llinas, "An introduction to multisensor data fusion," Proceedings of the IEEE, vol. 85, no. 1, pp. 6-23, 1997.

[3] H. F. Durrant-Whyte, "Sensor models and multisensor integration," International Journal of Robotics Research, vol. 7, no. 6, pp. 97-113, 1988.

[4] B. V. Dasarathy, "Sensor fusion potential exploitation-innovative architectures and illustrative applications," Proceedings of the IEEE, vol. 85, no. 1, pp. 24-38, 1997.

[5] R. C. Luo, C.-C. Yih, and K. L. Su, "Multisensor fusion and integration: approaches, applications, and future research directions," IEEE Sensors Journal, vol. 2, no. 2, pp. 107-119, 2002.

[6] J. Llinas, C. Bowman, G. Rogova, A. Steinberg, E. Waltz, and F. White, "Revisiting the JDL data fusion model II," Technical Report, DTIC Document, 2004.

[7] E. P. Blasch and S. Plano, "JDL level 5 fusion model 'user refinement' issues and applications in group tracking," in Proceedings of the Signal Processing, Sensor Fusion, and Target Recognition XI, pp. 270-279, April 2002.

[8] H. F. Durrant-Whyte and M. Stevens, "Data fusion in decentralized sensing networks," in Proceedings of the 4th International Conference on Information Fusion, pp. 302-307, Montreal, Canada, 2001.

[9] J. Manyika and H. Durrant-Whyte, Data Fusion and Sensor Management: A Decentralized Information-Theoretic Approach, Prentice Hall, Upper Saddle River, NJ, USA, 1995.

[10] S. S. Blackman, "Association and fusion of multiple sensor data," in Multitarget-Multisensor Tracking: Advanced Applications, pp. 187-217, Artech House, 1990.

[11] S. Lloyd, "Least squares quantization in PCM," IEEE Transactions on Information Theory, vol. 28, no. 2, pp. 129-137, 1982.

[12] M. Shindler, A. Wong, and A. Meyerson, "Fast and accurate k-means for large datasets," in Proceedings of the 25th Annual Conference on Neural Information Processing Systems (NIPS '11), pp. 2375-2383, December 2011.

[13] Y. Bar-Shalom and E. Tse, "Tracking in a cluttered environment with probabilistic data association," Automatica, vol. 11, no. 5, pp. 451-460, 1975.

[14] T. E. Fortmann, Y. Bar-Shalom, and M. Scheffe, "Multi-target tracking using joint probabilistic data association," in Proceedings of the 19th IEEE Conference on Decision and Control including the Symposium on Adaptive Processes, vol. 19, pp. 807-812, December 1980.

[15] D. B. Reid, "An algorithm for tracking multiple targets," IEEE Transactions on Automatic Control, vol. 24, no. 6, pp. 843-854, 1979.

[16] C. L. Morefield, "Application of 0-1 integer programming to multitarget tracking problems," IEEE Transactions on Automatic Control, vol. 22, no. 3, pp. 302-312, 1977.

[17] R. L. Streit and T. E. Luginbuhl, "Maximum likelihood method for probabilistic multihypothesis tracking," in Proceedings of the Signal and Data Processing of Small Targets, vol. 2235 of Proceedings of SPIE, p. 394, 1994.

[18] I. J. Cox and S. L. Hingorani, "Efficient implementation of Reid's multiple hypothesis tracking algorithm and its evaluation for the purpose of visual tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 2, pp. 138-150, 1996.

[19] K. G. Murty, "An algorithm for ranking all the assignments in order of increasing cost," Operations Research, vol. 16, no. 3, pp. 682-687, 1968.

[20] M. E. Liggins II, C.-Y. Chong, I. Kadar et al., "Distributed fusion architectures and algorithms for target tracking," Proceedings of the IEEE, vol. 85, no. 1, pp. 95-106, 1997.

[21] S. Coraluppi, C. Carthel, M. Luettgen, and S. Lynch, "All-source track and identity fusion," in Proceedings of the National Symposium on Sensor and Data Fusion, 2000.

[22] P. Storms and F. Spieksma, "An LP-based algorithm for the data association problem in multitarget tracking," in Proceedings of the 3rd IEEE International Conference on Information Fusion, vol. 1, 2000.

[23] S.-W. Joo and R. Chellappa, "A multiple-hypothesis approach for multiobject visual tracking," IEEE Transactions on Image Processing, vol. 16, no. 11, pp. 2849-2854, 2007.

[24] S. Coraluppi and C. Carthel, "Aggregate surveillance: a cardinality tracking approach," in Proceedings of the 14th International Conference on Information Fusion (FUSION '11), July 2011.

[25] K. C. Chang, C. Y. Chong, and Y. Bar-Shalom, "Joint probabilistic data association in distributed sensor networks," IEEE Transactions on Automatic Control, vol. 31, no. 10, pp. 889-897, 1986.

[26] C. Y. Chong, S. Mori, and K. C. Chang, "Information fusion in distributed sensor networks," in Proceedings of the 4th American Control Conference, Boston, Mass, USA, June 1985.

[27] C. Y. Chong, S. Mori, and K. C. Chang, "Distributed multitarget multisensor tracking," in Multitarget-Multisensor Tracking: Advanced Applications, vol. 1, pp. 247-295, 1990.

[28] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, San Mateo, Calif, USA, 1988.

[29] D. Koller and N. Friedman, Probabilistic Graphical Models: Principles and Techniques, MIT Press, 2009.

[30] L. Chen, M. Cetin, and A. S. Willsky, "Distributed data association for multi-target tracking in sensor networks," in Proceedings of the 7th International Conference on Information Fusion (FUSION '05), pp. 9-16, July 2005.

[31] L. Chen, M. J. Wainwright, M. Cetin, and A. S. Willsky, "Data association based on optimization in graphical models with application to sensor networks," Mathematical and Computer Modelling, vol. 43, no. 9-10, pp. 1114-1135, 2006.

[32] Y. Weiss and W. T. Freeman, "On the optimality of solutions of the max-product belief-propagation algorithm in arbitrary graphs," IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 736-744, 2001.

[33] C. Brown, H. Durrant-Whyte, J. Leonard, B. Rao, and B. Steer, "Distributed data fusion using Kalman filtering: a robotics application," in Data Fusion in Robotics and Machine Intelligence, M. A. Abidi and R. C. Gonzalez, Eds., pp. 267-309, 1992.

[34] R. E. Kalman, "A new approach to linear filtering and prediction problems," Journal of Basic Engineering, vol. 82, no. 1, pp. 35-45, 1960.

[35] R. C. Luo and M. G. Kay, "Data fusion and sensor integration: state-of-the-art 1990s," in Data Fusion in Robotics and Machine Intelligence, pp. 7-135, 1992.

[36] G. Welch and G. Bishop, An Introduction to the Kalman Filter, ACM SIGGRAPH 2001 Course Notes, 2001.

[37] S. J. Julier and J. K. Uhlmann, "A new extension of the Kalman filter to nonlinear systems," in Proceedings of the International Symposium on Aerospace/Defense Sensing, Simulation and Controls, vol. 3, 1997.

[38] E. A. Wan and R. Van Der Merwe, "The unscented Kalman filter for nonlinear estimation," in Proceedings of the Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC '00), pp. 153-158, 2000.

[39] D. Crisan and A. Doucet, "A survey of convergence results on particle filtering methods for practitioners," IEEE Transactions on Signal Processing, vol. 50, no. 3, pp. 736-746, 2002.

[40] J. Martinez-del Rincon, C. Orrite-Urunuela, and J. E. Herrero-Jaraba, "An efficient particle filter for color-based tracking in complex scenes," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance, pp. 176-181, 2007.

[41] S. Ganeriwal, R. Kumar, and M. B. Srivastava, "Timing-sync protocol for sensor networks," in Proceedings of the 1st International Conference on Embedded Networked Sensor Systems (SenSys '03), pp. 138-149, November 2003.

[42] M. Manzo, T. Roosta, and S. Sastry, "Time synchronization in networks," in Proceedings of the 3rd ACM Workshop on Security of Ad Hoc and Sensor Networks (SASN '05), pp. 107-116, November 2005.

[43] J. K. Uhlmann, "Covariance consistency methods for fault-tolerant distributed data fusion," Information Fusion, vol. 4, no. 3, pp. 201-215, 2003.

[44] A. S. Bashi, V. P. Jilkov, X. R. Li, and H. Chen, "Distributed implementations of particle filters," in Proceedings of the 6th International Conference of Information Fusion, pp. 1164-1171, 2003.

[45] M. Coates, "Distributed particle filters for sensor networks," in Proceedings of the 3rd International Symposium on Information Processing in Sensor Networks (IPSN '04), pp. 99-107, ACM, New York, NY, USA, 2004.

[46] D. Gu, "Distributed particle filter for target tracking," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '07), pp. 3856-3861, April 2007.

[47] Y. Bar-Shalom, "Update with out-of-sequence measurements in tracking: exact solution," IEEE Transactions on Aerospace and Electronic Systems, vol. 38, no. 3, pp. 769-778, 2002.

[48] M. Orton and A. Marrs, "A Bayesian approach to multi-target tracking and data fusion with out-of-sequence measurements," IEE Colloquium, no. 174, pp. 151-155, 2001.

[49] M. L. Hernandez, A. D. Marrs, S. Maskell, and M. R. Orton, "Tracking and fusion for wireless sensor networks," in Proceedings of the 5th International Conference on Information Fusion, 2002.

[50] P. C. Mahalanobis, "On the generalized distance in statistics," Proceedings of the National Institute of Science, India, vol. 2, no. 1, pp. 49-55, 1936.

[51] S. J. Julier, J. K. Uhlmann, and D. Nicholson, "A method for dealing with assignment ambiguity," in Proceedings of the American Control Conference (ACC '04), vol. 5, pp. 4102-4107, July 2004.

[52] H. Pan, Z.-P. Liang, T. J. Anastasio, and T. S. Huang, "Hybrid NN-Bayesian architecture for information fusion," in Proceedings of the International Conference on Image Processing (ICIP '98), pp. 368-371, October 1998.

[53] C. Coue, T. Fraichard, P. Bessiere, and E. Mazer, "Multi-sensor data fusion using Bayesian programming: an automotive application," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 141-146, October 2002.

[54] D. L. Hall and J. Llinas, Handbook of Multisensor Data Fusion, CRC Press, Boca Raton, Fla, USA, 2001.

[55] A. P. Dempster, "A generalization of Bayesian inference," Journal of the Royal Statistical Society B, vol. 30, no. 2, pp. 205-247, 1968.

[56] G. Shafer, A Mathematical Theory of Evidence, Princeton University Press, Princeton, NJ, USA, 1976.

[57] G. M. Provan, "The validity of Dempster-Shafer belief functions," International Journal of Approximate Reasoning, vol. 6, no. 3, pp. 389-399, 1992.

[58] D. M. Buede, "Shafer-Dempster and Bayesian reasoning: a response to 'Shafer-Dempster reasoning with applications to multisensor target identification systems'," IEEE Transactions on Systems, Man and Cybernetics, vol. 18, no. 6, pp. 1009-1011, 1988.

[59] Y. Cheng and R. L. Kashyap, "Comparison of Bayesian and Dempster's rules in evidence combination," in Maximum-Entropy and Bayesian Methods in Science and Engineering, 1988.

[60] B. R. Cobb and P. P. Shenoy, "A comparison of Bayesian and belief function reasoning," Information Systems Frontiers, vol. 5, no. 4, pp. 345-358, 2003.

[61] H. Wu, M. Siegel, R. Stiefelhagen, and J. Yang, "Sensor fusion using Dempster-Shafer theory," in Proceedings of the 19th IEEE Instrumentation and Measurement Technology Conference (IMTC '02), pp. 7-11, May 2002.

[62] H. Wu, M. Siegel, and S. Ablay, "Sensor fusion using Dempster-Shafer theory II: static weighting and Kalman filter-like dynamic weighting," in Proceedings of the 20th IEEE Instrumentation and Measurement Technology Conference (IMTC '03), pp. 907-912, May 2003.

[63] E. Bosse, P. Valin, A.-C. Boury-Brisset, and D. Grenier, "Exploitation of a priori knowledge for information fusion," Information Fusion, vol. 7, no. 2, pp. 161-175, 2006.

[64] M. Morbee, L. Tessens, H. Aghajan, and W. Philips, "Dempster-Shafer based multi-view occupancy maps," Electronics Letters, vol. 46, no. 5, pp. 341-343, 2010.

[65] C. S. Peirce, Abduction and Induction: Philosophical Writings of Peirce, vol. 156, Dover, New York, NY, USA, 1955.

[66] A. M. Abdelbar, E. A. M. Andrews, and D. C. Wunsch II, "Abductive reasoning with recurrent neural networks," Neural Networks, vol. 16, no. 5-6, pp. 665-673, 2003.

[67] J. R. Aguero and A. Vargas, "Inference of operative configuration of distribution networks using fuzzy logic techniques. Part II: extended real-time model," IEEE Transactions on Power Systems, vol. 20, no. 3, pp. 1562-1569, 2005.

[68] A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, "Content-based image retrieval at the end of the early years," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1349-1380, 2000.

[69] D. S. Friedlander and S. Phoha, "Semantic information fusion for coordinated signal processing in mobile sensor networks," International Journal of High Performance Computing Applications, vol. 16, no. 3, pp. 235-241, 2002.

[70] D. S. Friedlander, "Semantic information extraction," in Distributed Sensor Networks, 2005.

[71] K. Whitehouse, J. Liu, and F. Zhao, "Semantic Streams: a framework for composable inference over sensor data," in Proceedings of the 3rd European Workshop on Wireless Sensor Networks, Lecture Notes in Computer Science, Springer, February 2006.

[72] I. J. Cox, "A review of statistical data association techniques for motion correspondence," International Journal of Computer Vision, vol. 10, no. 1, pp. 53-66, 1993.

[73] C. J. Veenman, M. J. T. Reinders, and E. Backer, "Resolving motion correspondence for densely moving points," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 1, pp. 54-72, 2001.

[74] F. Castanedo, M. A. Patricio, J. García, and J. M. Molina, "Bottom-up/top-down coordination in a multiagent visual sensor network," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS '07), pp. 93-98, September 2007.

[75] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Analysis of distributed fusion alternatives in coordinated vision agents," in Proceedings of the 11th International Conference on Information Fusion (FUSION '08), July 2008.

[76] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Data fusion to improve trajectory tracking in a cooperative surveillance multi-agent architecture," Information Fusion, vol. 11, no. 3, pp. 243-255, 2010.

[77] M. A. Davenport, C. Hegde, M. F. Duarte, and R. G. Baraniuk, "Joint manifolds for data fusion," IEEE Transactions on Image Processing, vol. 19, no. 10, pp. 2580-2594, 2010.

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 10: Review Article A Review of Data Fusion Techniquesdownloads.hindawi.com/journals/tswj/2013/704504.pdf · Classification of Data Fusion Techniques Data fusion is a ... to extract features

10 The Scientific World Journal

cardinality tracking and closely spaced objects; (ii) apply a multistage MHT for improving the performance and robustness in challenging settings; and (iii) use a feature-aided MHT for extended object surveillance.

The main disadvantage of this algorithm is the computational cost, which grows exponentially with the number of tracks and measurements. Therefore, the practical implementation of this algorithm is limited because it is exponential in both time and memory.

With the aim of reducing the computational cost, [17] presented a probabilistic MHT algorithm in which the associations are considered to be random variables that are statistically independent and in which performing an exhaustive search enumeration is avoided. This algorithm is known as PMHT. The PMHT algorithm assumes that the number of targets and measurements is known. With the same goal of reducing the computational cost, [18] presented an efficient implementation of the MHT algorithm. This implementation was the first version to be applied to perform tracking in visual environments. They employed the Murty [19] algorithm to determine the best set of $k$ hypotheses in polynomial time, with the goal of tracking the points of interest.

MHT typically performs the tracking process by employing only one characteristic, commonly the position. The Bayesian combination to use multiple characteristics was proposed by Liggins II et al. [20].

A linear-programming-based relaxation approach to the optimization problem in MHT tracking was proposed independently by Coraluppi et al. [21] and Storms and Spieksma [22]. Joo and Chellappa [23] proposed an association algorithm for tracking multiple targets in visual environments. Their algorithm is based on an MHT modification in which a measurement can be associated with more than one target and several targets can be associated with one measurement. They also proposed a combinatorial optimization algorithm to generate the best set of association hypotheses. Their algorithm always finds the best hypothesis, in contrast to other models, which are approximate. Coraluppi and Carthel [24] presented a generalization of the MHT algorithm using a recursion over hypothesis classes rather than over a single hypothesis. This work has been applied to a special case of the multi-target tracking problem, called cardinality tracking, in which they observed the number of sensor measurements instead of the target states.

3.5. Distributed Joint Probabilistic Data Association. The distributed version of the joint probabilistic data association (JPDA-D) was presented by Chang et al. [25]. In this technique, the estimated state of the target (using two sensors) after being associated is given by

$$E[x \mid Z^1, Z^2] = \sum_{j=0}^{m_1} \sum_{l=0}^{m_2} E[x \mid \chi^1_j, \chi^2_l, Z^1, Z^2] \, P(\chi^1_j, \chi^2_l \mid Z^1, Z^2) \qquad (17)$$

where $m_i$, $i = 1, 2$, is the last set of measurements of sensors 1 and 2, $Z^i$, $i = 1, 2$, is the set of accumulated data, and $\chi$ is the association hypothesis. The first term on the right side of the equation is calculated from the associations that were made earlier. The second term is computed from the individual association probabilities as follows:

$$P(\chi^1_j, \chi^2_l \mid Z^1, Z^2) = \sum_{\chi^1} \sum_{\chi^2} P(\chi^1, \chi^2 \mid Z^1, Z^2) \, \omega^1_j(\chi^1) \, \omega^2_l(\chi^2)$$
$$P(\chi^1, \chi^2 \mid Z^1, Z^2) = \frac{1}{c}\, P(\chi^1 \mid Z^1)\, P(\chi^2 \mid Z^2)\, \gamma(\chi^1, \chi^2) \qquad (18)$$

where $\chi^i$ are the joint hypotheses involving all of the measurements and all of the objectives, and $\omega^i_j(\chi^i)$ are the binary indicators of the measurement-target association. The additional term $\gamma(\chi^1, \chi^2)$ depends on the correlation of the individual hypotheses and reflects the localization influence of the current measurements in the joint hypotheses.

These equations are obtained assuming that communication exists after every observation; there are only approximations for the case in which communication is sporadic and when a substantial amount of noise occurs. Therefore, this algorithm is a theoretical model that has some limitations in practical applications.

3.6. Distributed Multiple Hypothesis Test. The distributed version of the MHT algorithm (MHT-D) [26, 27] follows a similar structure as the JPDA-D algorithm. Let us assume the case in which one node must fuse two sets of hypotheses and tracks. If the hypothesis and track sets are represented by $H^i(Z^i)$ and $T^i(Z^i)$ with $i = 1, 2$, the hypothesis probabilities are represented by $\lambda^i_j$, and the state distribution of the tracks $(\tau^i_j)$ is represented by $P(\lambda^i_j)$ and $P(x \mid Z^i, \tau^i_j)$, then the maximum available information in the fusion node is $Z = Z^1 \cup Z^2$. The data fusion objective of the MHT-D is to obtain the set of hypotheses $H(Z)$, the set of tracks $T(Z)$, the hypothesis probabilities $P(\lambda \mid Z)$, and the state distribution $p(x \mid Z, \tau)$ for the observed data.

The MHT-D algorithm is composed of the following steps:

(1) Hypothesis formation: for each hypothesis pair $\lambda^1_j$ and $\lambda^2_k$ that could be fused, a track $\tau$ is formed by associating the pair of tracks $\tau^1_j$ and $\tau^2_k$, where each pair comes from one node and could originate from the same target. The final result of this stage is a set of hypotheses, denoted by $H(Z)$, and the fused tracks $T(Z)$.

(2) Hypothesis evaluation: in this stage, the association probability of each hypothesis and the estimated state of each fused track are obtained. The distributed estimation algorithm is employed to calculate the likelihood of the possible associations and the obtained estimations at each specific association. Using the information model, the probability of each fused hypothesis is given by

$$P(\lambda \mid Z) = C^{-1} \prod_{j \in J} P(\lambda^{(j)} \mid Z^{(j)})^{\alpha(j)} \prod_{\tau \in \lambda} L(\tau \mid Z) \qquad (19)$$

where $C$ is a normalizing constant and $L(\tau \mid Z)$ is the likelihood of each hypothesis pair.

The main disadvantage of the MHT-D is its high computational cost, which is on the order of $O(n^M)$, where $n$ is the number of possible associations and $M$ is the number of variables to be estimated.

3.7. Graphical Models. Graphical models are a formalism for representing and reasoning with probabilities and independence. A graphical model represents a conditional decomposition of the joint probability. A graphical model can be represented as a graph in which the nodes denote random variables, the edges denote the possible dependence between the random variables, and the plates denote the replication of a substructure with the appropriate indexing of the relevant variables. The graph captures the joint distribution over the random variables, which can be decomposed into a product of factors that each depend on only a subset of the variables. There are two major classes of graphical models: (i) Bayesian networks [28], which are also known as directed graphical models, and (ii) Markov random fields, which are also known as undirected graphical models. Directed graphical models are useful for expressing causal relationships between random variables, whereas undirected models are better suited for expressing soft constraints between random variables. We refer the reader to the book of Koller and Friedman [29] for more information on graphical models.
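As a brief illustration of this factorized representation, the following Python sketch encodes a small hypothetical Bayesian network over three binary variables; all variable names and probability values are assumptions introduced for the example, not taken from the works cited above.

```python
# A hypothetical three-variable Bayesian network (directed graphical model):
# Rain -> Sprinkler, (Rain, Sprinkler) -> WetGrass.
# The joint decomposes as p(r, s, w) = p(r) * p(s | r) * p(w | r, s).

p_rain = {True: 0.2, False: 0.8}
p_sprinkler = {True: {True: 0.01, False: 0.99},    # keyed by the rain value
               False: {True: 0.4, False: 0.6}}
p_wet = {(True, True): {True: 0.99, False: 0.01},  # keyed by (rain, sprinkler)
         (True, False): {True: 0.8, False: 0.2},
         (False, True): {True: 0.9, False: 0.1},
         (False, False): {True: 0.0, False: 1.0}}

def joint(r, s, w):
    """Joint probability as the product of the local factors."""
    return p_rain[r] * p_sprinkler[r][s] * p_wet[(r, s)][w]

# The factors define a valid distribution: the joint sums to 1.
print(sum(joint(r, s, w)
          for r in (True, False) for s in (True, False) for w in (True, False)))
```

Each factor touches only a node and its parents, which is exactly the locality that inference algorithms such as max-product exploit.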

A framework based on graphical models can solve the problem of distributed data association in synchronized sensor networks with overlapped areas in which each sensor receives noisy measurements; this solution was proposed by Chen et al. [30, 31]. Their work is based on graphical models that are used to represent the statistical dependence between random variables. The data association problem is treated as an inference problem and solved by using the max-product algorithm [32]. Graphical models represent statistical dependencies between variables as graphs, and the max-product algorithm converges when the graph is a tree structure. Moreover, the employed algorithm could be implemented in a distributed manner by exchanging messages between the source nodes in parallel. With this algorithm, if each sensor has $n$ possible combinations of associations and there are $M$ variables to be estimated, it has a complexity of $O(n^2 M)$, which is reasonable and less than the $O(n^M)$ complexity of the MHT-D algorithm. However, special attention must be given to the correlated variables when building the graphical model.

4. State Estimation Methods

State estimation techniques aim to determine the state of the target under movement (typically the position) given the observations or measurements. State estimation techniques are also known as tracking techniques. In their general form, it is not guaranteed that the target observations are relevant, which means that some of the observations could actually come from the target and others could be only noise. The state estimation phase is a common stage in data fusion algorithms because the target's observations could come from different sensors or sources, and the final goal is to obtain a global target state from the observations.

The estimation problem involves finding the values of the vector state (e.g., position, velocity, and size) that fit as closely as possible with the observed data. From a mathematical perspective, we have a set of redundant observations, and the goal is to find the set of parameters that provides the best fit to the observed data. In general, these observations are corrupted by errors and by the propagation of noise in the measurement process. State estimation methods fall under level 1 of the JDL classification and can be divided into two broader groups:

(1) Linear dynamics and measurements: here, the estimation problem has a standard solution. Specifically, when the equations of the object state and the measurements are linear, the noise follows the Gaussian distribution, and we do not refer to a clutter environment, the optimal theoretical solution is based on the Kalman filter.

(2) Nonlinear dynamics: the state estimation problem becomes difficult, and there is no analytical solution to solve the problem in a general manner. In principle, there are no practical algorithms available to solve this problem satisfactorily.

Most of the state estimation methods are based on control theory and employ the laws of probability to compute a vector state from a vector measurement or a stream of vector measurements. Next, the most common estimation methods are presented, including maximum likelihood and maximum posterior (Section 4.1), the Kalman filter (Section 4.2), the particle filter (Section 4.3), the distributed Kalman filter (Section 4.4), the distributed particle filter (Section 4.5), and covariance consistency methods (Section 4.6).

4.1. Maximum Likelihood and Maximum Posterior. The maximum likelihood (ML) technique is an estimation method that is based on probabilistic theory. Probabilistic estimation methods are appropriate when the state variable follows an unknown probability distribution [33]. In the context of data fusion, $x$ is the state that is being estimated and $z = (z(1), \ldots, z(k))$ is a sequence of $k$ previous observations of $x$. The likelihood function $\lambda(x)$ is defined as a probability density function of the sequence of $z$ observations given the true value of the state $x$:

$$\lambda(x) = p(z \mid x) \qquad (20)$$

The ML estimator finds the value of $x$ that maximizes the likelihood function:

$$\hat{x}(k) = \arg\max_x \, p(z \mid x) \qquad (21)$$


which can be obtained from the analytical or empirical models of the sensors. This function expresses the probability of the observed data. The main disadvantage of this method in practice is that it requires the analytical or empirical model of the sensor to be known to provide the prior distribution and compute the likelihood function. This method can also systematically underestimate the variance of the distribution, which leads to a bias problem. However, the bias of the ML solution becomes less significant as the number $N$ of data points increases, and the estimate equals the true variance of the distribution that generated the data in the limit $N \rightarrow \infty$.

The maximum posterior (MAP) method is based on Bayesian theory. It is employed when the parameter $x$ to be estimated is the output of a random variable that has a known probability density function $p(x)$. In the context of data fusion, $x$ is the state that is being estimated and $z = (z(1), \ldots, z(k))$ is a sequence of $k$ previous observations of $x$. The MAP estimator finds the value of $x$ that maximizes the posterior probability distribution:

$$\hat{x}(k) = \arg\max_x \, p(x \mid z) \qquad (22)$$

Both methods (ML and MAP) aim to find the most likely value for the state $x$. However, ML assumes that $x$ is a fixed but unknown point from the parameter space, whereas MAP considers $x$ to be the output of a random variable with a known a priori probability density function. Both of these methods are equivalent when there is no a priori information about $x$, that is, when there are only observations.
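To illustrate the difference between the two estimators, the following Python sketch computes the ML and MAP estimates of a scalar state observed through Gaussian noise; the Gaussian prior used for MAP and all numeric values are assumptions of the example.

```python
import numpy as np

# Hypothetical setup: k noisy observations of a scalar state x,
# z(i) = x + v(i), with v ~ N(0, sigma_z^2).
rng = np.random.default_rng(0)
x_true, sigma_z = 5.0, 2.0
z = x_true + sigma_z * rng.standard_normal(10)

# ML: maximizes p(z | x); for a Gaussian likelihood this is the sample mean.
x_ml = z.mean()

# MAP: maximizes p(x | z) with an assumed Gaussian prior x ~ N(mu0, sigma0^2).
# The posterior mean is the precision-weighted combination of data and prior.
mu0, sigma0 = 0.0, 10.0
w = (len(z) / sigma_z**2) / (len(z) / sigma_z**2 + 1 / sigma0**2)
x_map = w * z.mean() + (1 - w) * mu0

print(x_ml, x_map)  # MAP is pulled slightly toward the prior mean
```

With a vague prior (large sigma0), the weight w approaches one and the two estimates coincide, matching the equivalence noted above.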

4.2. The Kalman Filter. The Kalman filter is the most popular estimation technique. It was originally proposed by Kalman [34] and has been widely studied and applied since then. The Kalman filter estimates the state $x$ of a discrete time process governed by the following space-time model:

$$x(k+1) = \Phi(k)\,x(k) + G(k)\,u(k) + w(k) \qquad (23)$$

with the observations or measurements $z$ at time $k$ of the state $x$ represented by

$$z(k) = H(k)\,x(k) + v(k) \qquad (24)$$

where $\Phi(k)$ is the state transition matrix, $G(k)$ is the input transition matrix, $u(k)$ is the input vector, $H(k)$ is the measurement matrix, and $w$ and $v$ are random Gaussian variables with zero mean and covariance matrices $Q(k)$ and $R(k)$, respectively. Based on the measurements and on the system parameters, the estimation of $x(k)$, which is represented by $\hat{x}(k)$, and the prediction of $x(k+1)$, which is represented by $\hat{x}(k+1 \mid k)$, are given by

$$\hat{x}(k) = \hat{x}(k \mid k-1) + K(k)\left[z(k) - H(k)\,\hat{x}(k \mid k-1)\right]$$
$$\hat{x}(k+1 \mid k) = \Phi(k)\,\hat{x}(k \mid k) + G(k)\,u(k) \qquad (25)$$

respectively, where $K$ is the filter gain, determined by

$$K(k) = P(k \mid k-1)\,H^T(k)\left[H(k)\,P(k \mid k-1)\,H^T(k) + R(k)\right]^{-1} \qquad (26)$$

where $P(k \mid k-1)$ is the prediction covariance matrix, which can be determined by

$$P(k+1 \mid k) = \Phi(k)\,P(k)\,\Phi^T(k) + Q(k) \qquad (27)$$

with

$$P(k) = P(k \mid k-1) - K(k)\,H(k)\,P(k \mid k-1) \qquad (28)$$

The Kalman filter is mainly employed to fuse low-level data. If the system can be described with a linear model and the error can be modeled as Gaussian noise, then the recursive Kalman filter obtains optimal statistical estimations [35]. However, other methods are required to address nonlinear dynamic models and nonlinear measurements. The modified Kalman filter known as the extended Kalman filter (EKF) is an optimal approach for implementing nonlinear recursive filters [36]. The EKF is one of the most often employed methods for fusing data in robotic applications. However, it has some disadvantages because the computations of the Jacobians are extremely expensive. Some attempts have been made to reduce the computational cost, such as linearization, but these attempts introduce errors in the filter and make it unstable.

The unscented Kalman filter (UKF) [37] has gained popularity because it does not have the linearization step and the associated errors of the EKF [38]. The UKF employs a deterministic sampling strategy to establish the minimum set of points around the mean. This set of points captures the true mean and covariance completely. These points are then propagated through the nonlinear functions, and the covariance of the estimations can be recovered. Another advantage of the UKF is its ability to be employed in parallel implementations.
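A minimal sketch of the predict-update cycle defined by (23)-(28) is shown below in Python, for an assumed constant-velocity target observed in position only; the model matrices and measurement values are illustrative assumptions.

```python
import numpy as np

# Constant-velocity model: state x = [position, velocity], time step dt = 1.
Phi = np.array([[1.0, 1.0], [0.0, 1.0]])   # state transition Phi(k)
H = np.array([[1.0, 0.0]])                 # measurement matrix H(k)
Q = 0.01 * np.eye(2)                       # process noise covariance Q(k)
R = np.array([[1.0]])                      # measurement noise covariance R(k)

x = np.array([0.0, 1.0])                   # initial state estimate
P = np.eye(2)                              # initial covariance

def kalman_step(x, P, z):
    # Prediction, as in (23) and (27) (no control input G*u in this example).
    x_pred = Phi @ x
    P_pred = Phi @ P @ Phi.T + Q
    # Gain and update, as in (26), (25), and (28).
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = P_pred - K @ H @ P_pred
    return x_new, P_new

for z in [1.1, 2.0, 2.9, 4.2]:             # noisy position measurements
    x, P = kalman_step(x, P, np.array([z]))
print(x)  # estimated [position, velocity]
```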

4.3. Particle Filter. Particle filters are recursive implementations of the sequential Monte Carlo methods [39]. This method builds the posterior density function using several random samples called particles. Particles are propagated over time with a combination of sampling and resampling steps. At each iteration, the sampling step is employed to discard some particles, increasing the relevance of regions with a higher posterior probability. In the filtering process, several particles of the same state variable are employed, and each particle has an associated weight that indicates the quality of the particle. Therefore, the estimation is the result of a weighted sum of all of the particles. The standard particle filter algorithm has two phases: (1) the predicting phase and (2) the updating phase. In the predicting phase, each particle is modified according to the existing model, adding random noise to simulate the noise effect. Then, in the updating phase, the weight of each particle is reevaluated using the last available sensor observation, and particles with lower weights are removed. Specifically, a generic particle filter comprises the following steps.


(1) Initialization of the particles:

(i) let $N$ be equal to the number of particles;

(ii) $X^{(i)}(1) = [x(1), y(1), 0, 0]^T$ for $i = 1, \ldots, N$.

(2) Prediction step:

(i) for each particle $i = 1, \ldots, N$, evaluate the state $(k+1 \mid k)$ of the system using the state at time instant $k$ with the noise of the system at time $k$:

$$X^{(i)}(k+1 \mid k) = F(k)\,X^{(i)}(k) + (\text{cauchy-distribution-noise})(k) \qquad (29)$$

where $F(k)$ is the transition matrix of the system.

(3) Evaluate the particle weights. For each particle $i = 1, \ldots, N$:

(i) compute the predicted observation state of the system using the current predicted state and the noise at instant $k$:

$$\hat{z}^{(i)}(k+1 \mid k) = H(k+1)\,X^{(i)}(k+1 \mid k) + (\text{gaussian-measurement-noise})(k+1) \qquad (30)$$

(ii) compute the likelihood (weights) according to the given distribution:

$$\text{likelihood}^{(i)} = N\left(\hat{z}^{(i)}(k+1 \mid k),\, z^{(i)}(k+1),\, \text{var}\right) \qquad (31)$$

(iii) normalize the weights as follows:

$$w^{(i)} = \frac{\text{likelihood}^{(i)}}{\sum_{j=1}^{N} \text{likelihood}^{(j)}} \qquad (32)$$

(4) Resampling/selection: multiply particles with higher weights and remove those with lower weights. The current state must be adjusted using the computed weights of the new particles.

(i) Compute the cumulative weights:

$$\text{CumWt}^{(i)} = \sum_{j=1}^{i} w^{(j)} \qquad (33)$$

(ii) Generate uniformly distributed random variables $U^{(i)} \sim U(0, 1)$, with the number of steps equal to the number of particles.

(iii) Determine which particles should be multiplied and which ones removed.

(5) Propagation phase:

(i) incorporate the new values of the state after the resampling of instant $k$ to calculate the value at instant $k+1$:

$$x^{(1:N)}(k+1 \mid k+1) = x(k+1 \mid k) \qquad (34)$$

(ii) compute the posterior mean:

$$\hat{x}(k+1) = \text{mean}\left[x_i(k+1 \mid k+1)\right], \quad i = 1, \ldots, N \qquad (35)$$

(iii) repeat steps 2 to 5 for each time instant.

Particle filters are more flexible than the Kalman filters and can cope with nonlinear dependencies and non-Gaussian densities in the dynamic model and in the noise error. However, they have some disadvantages. A large number of particles are required to obtain a small variance in the estimator. It is also difficult to establish the optimal number of particles in advance, and the number of particles affects the computational cost significantly. Earlier versions of particle filters employed a fixed number of particles, but recent studies have started to use a dynamic number of particles [40].
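The loop above can be condensed as the following Python sketch of a scalar bootstrap particle filter; Gaussian process noise replaces the Cauchy noise of the listing for simplicity, and all model parameters are assumptions of the example.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 1000                                  # number of particles
F, sigma_proc, sigma_meas = 1.0, 0.5, 1.0

particles = rng.normal(0.0, 1.0, N)       # (1) initialization

def pf_step(particles, z):
    # (2) prediction: propagate each particle through the dynamic model.
    particles = F * particles + rng.normal(0.0, sigma_proc, N)
    # (3) weighting: likelihood of the measurement under each particle.
    likelihood = np.exp(-0.5 * ((z - particles) / sigma_meas) ** 2)
    weights = likelihood / likelihood.sum()
    # (4) resampling: multiply high-weight particles, drop low-weight ones.
    idx = rng.choice(N, size=N, p=weights)
    particles = particles[idx]
    # (5) estimate: posterior mean of the resampled set.
    return particles, particles.mean()

for z in [0.9, 1.8, 3.1, 3.9]:            # incoming measurements
    particles, x_hat = pf_step(particles, z)
print(x_hat)
```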

4.4. The Distributed Kalman Filter. The distributed Kalman filter requires correct clock synchronization between the sources, as demonstrated in [41]. In other words, to correctly use the distributed Kalman filter, the clocks of all of the sources must be synchronized. This synchronization is typically achieved by using protocols that employ a shared global clock, such as the network time protocol (NTP). Synchronization problems between clocks have been shown to have an effect on the accuracy of the Kalman filter, producing inaccurate estimations [42].

If the estimations are consistent and the cross-covariance is known (or the estimations are uncorrelated), then it is possible to use distributed Kalman filters [43]. However, the cross-covariance must be determined exactly, or the observations must be consistent.

We refer the reader to Liggins II et al. [20] for more details about the Kalman filter in a distributed and hierarchical architecture.

4.5. Distributed Particle Filter. Distributed particle filters have gained attention recently [44-46]. Coates [45] used a distributed particle filter to monitor an environment that could be captured by a Markovian state-space model involving nonlinear dynamics and observations and non-Gaussian noise.

In contrast, earlier attempts to solve out-of-sequence measurements using particle filters are based on regenerating the probability density function to the time instant of the out-of-sequence measurement [47]. In a particle filter, this step requires a large computational cost, in addition to the space necessary to store the previous particles. To avoid this problem, Orton and Marrs [48] proposed to store the information on the particles at each time instant, saving the cost of recalculating this information. This technique is close to optimal, and when the delay increases, the result is only slightly affected [49]. However, it requires a very large amount of space to store the state of the particles at each time instant.

4.6. Covariance Consistency Methods: Covariance Intersection/Union. Covariance consistency methods (intersection and union) were proposed by Uhlmann [43] and are general and fault-tolerant frameworks for maintaining covariance means and estimations in a distributed network. These methods are not estimation techniques in themselves; rather, they act as an estimation fusion technique. The distributed Kalman filter requirement of independent measurements or known cross-covariances is not a constraint with these methods.

4.6.1. Covariance Intersection. If the Kalman filter is employed to combine two estimations $(a_1, A_1)$ and $(a_2, A_2)$, then it is assumed that the joint covariance has the following form:

$$\begin{bmatrix} A_1 & X \\ X^T & A_2 \end{bmatrix} \qquad (36)$$

where the cross-covariance $X$ should be known exactly so that the Kalman filter can be applied without difficulty. Because the computation of the cross-covariances is computationally intensive, Uhlmann [43] proposed the covariance intersection (CI) algorithm.

Let us assume that a joint covariance $M$ can be defined with the diagonal blocks $M_{A_1} > A_1$ and $M_{A_2} > A_2$ such that

$$M \geq \begin{bmatrix} A_1 & X \\ X^T & A_2 \end{bmatrix} \qquad (37)$$

for every possible instance of the unknown cross-covariance $X$; then, the components of the matrix $M$ could be employed in the Kalman filter equations to provide a fused estimation $(c, C)$ that is considered consistent. The key point of this method relies on generating a joint covariance matrix $M$ that can represent a useful fused estimation (in this context, useful refers to something with a lower associated uncertainty). In summary, the CI algorithm computes the joint covariance matrix $M$ for which the Kalman filter provides the best fused estimation $(c, C)$ with respect to a fixed measurement of the covariance matrix (i.e., the minimum determinant).

Specific covariance criteria must be established because there is no unique minimum joint covariance in the order of the positive semidefinite matrices. Moreover, although the joint covariance is the basis of the formal analysis of the CI algorithm, the actual result is a nonlinear mixture of the information stored in the estimations being fused, according to the following equation:

$$C = \left(w_1 H_1^T A_1^{-1} H_1 + w_2 H_2^T A_2^{-1} H_2 + \cdots + w_n H_n^T A_n^{-1} H_n\right)^{-1}$$
$$c = C\left(w_1 H_1^T A_1^{-1} a_1 + w_2 H_2^T A_2^{-1} a_2 + \cdots + w_n H_n^T A_n^{-1} a_n\right) \qquad (38)$$

where $H_i$ is the transformation of the fused state-space estimation to the space of the estimated state $i$. The values of $w$ can be calculated to minimize the covariance determinant using convex optimization packages and semidefinite matrix programming. The result of the CI algorithm has different characteristics compared with the Kalman filter. For example, suppose that two estimations $(a, A)$ and $(b, B)$ are provided and their covariances are equal, $A = B$. Since the Kalman filter is based on the statistical independence assumption, it produces a fused estimation with covariance $C = (1/2)A$. In contrast, the CI method does not assume independence and thus must be consistent even in the case in which the estimations are completely correlated, with the estimated fused covariance $C = A$. In the case of estimations where $A < B$, the CI algorithm gains no information from the estimation $(b, B)$; thus, the fused result is $(a, A)$.

Every jointly consistent covariance is sufficient to produce a fused estimation that guarantees consistency. However, it is also necessary to guarantee a lack of divergence. Divergence is avoided in the CI algorithm by choosing a specific measurement (i.e., the determinant) that is minimized in each fusion operation. This measurement represents a nondivergence criterion because the size of the estimated covariance, according to this criterion, will not increase.

The application of the CI method guarantees consistency and nondivergence for every sequence of mean- and covariance-consistent estimations. However, this method does not work well when the measurements to be fused are inconsistent.
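A minimal Python sketch of CI for two estimates expressed in a common state space (i.e., $H_i = I$) is given below; the weight $\omega$ is chosen by a simple scan that minimizes the determinant of the fused covariance, and the input values are illustrative assumptions rather than code from [43].

```python
import numpy as np

def covariance_intersection(a1, A1, a2, A2, steps=100):
    """Fuse (a1, A1) and (a2, A2) without knowing their cross-covariance.
    Scans omega in [0, 1] and keeps the fusion with minimum det(C)."""
    best = None
    A1_inv, A2_inv = np.linalg.inv(A1), np.linalg.inv(A2)
    for w in np.linspace(0.0, 1.0, steps + 1):
        C = np.linalg.inv(w * A1_inv + (1 - w) * A2_inv)
        c = C @ (w * A1_inv @ a1 + (1 - w) * A2_inv @ a2)
        if best is None or np.linalg.det(C) < best[2]:
            best = (c, C, np.linalg.det(C))
    return best[0], best[1]

a1, A1 = np.array([1.0, 0.0]), np.array([[2.0, 0.5], [0.5, 1.0]])
a2, A2 = np.array([1.5, 0.4]), np.array([[1.0, 0.0], [0.0, 3.0]])
c, C = covariance_intersection(a1, A1, a2, A2)
print(c, C, sep="\n")
```

The scan over omega is a simple stand-in for the convex optimization mentioned above; a production implementation would minimize the determinant with a proper solver.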

4.6.2. Covariance Union. CI solves the problem of correlated inputs but not the problem of inconsistent inputs (inconsistent inputs refer to different estimations, each of which has a high accuracy (small variance) but also a large difference from the states of the others); thus, the covariance union (CU) algorithm was proposed to solve the latter [43]. CU addresses the following problem: two estimations $(a_1, A_1)$ and $(a_2, A_2)$ relate to the state of an object and are mutually inconsistent with one another. This issue arises when the difference between the estimated means is larger than the provided covariance. Inconsistent inputs can be detected using the Mahalanobis distance [50] between them, which is defined as

$$M_d = (a_1 - a_2)^T (A_1 + A_2)^{-1} (a_1 - a_2) \qquad (39)$$

and detecting whether this distance is larger than a given threshold.

The Mahalanobis distance accounts for the covariance information when computing the distance. If the difference between the estimations is large but their covariance is also large, the Mahalanobis distance yields a small value. In contrast, if the difference between the estimations is small and the covariances are small, it could produce a larger distance value. A high Mahalanobis distance could indicate that the estimations are inconsistent; however, a specific threshold must be established by the user or learned automatically.
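A short sketch of this inconsistency test, following (39) with an assumed threshold, is given below in Python.

```python
import numpy as np

def inconsistent(a1, A1, a2, A2, threshold=9.0):
    """Flags two estimates as mutually inconsistent when their
    Mahalanobis distance (39) exceeds an assumed threshold."""
    d = a1 - a2
    m_d = d @ np.linalg.inv(A1 + A2) @ d
    return m_d > threshold

a1, A1 = np.array([0.0, 0.0]), 0.1 * np.eye(2)
a2, A2 = np.array([3.0, 3.0]), 0.1 * np.eye(2)
print(inconsistent(a1, A1, a2, A2))  # True: precise estimates, far apart
```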

The CU algorithm aims to solve the following problem: let us suppose that a filtering algorithm provides two observations with mean and covariance $(a_1, A_1)$ and $(a_2, A_2)$, respectively. It is known that one of the observations is correct and the other is erroneous. However, the identity of the correct estimation is unknown and cannot be determined. In this situation, if both estimations are employed as an input to the Kalman filter, there will be a problem because the Kalman filter only guarantees a consistent output if the observation is updated with a measurement that is consistent with both of them. In the specific case in which the measurements correspond to the same object but are acquired from two different sensors, the Kalman filter can only guarantee that the output is consistent if it is consistent with both separately. Because it is not possible to know which estimation is correct, the only way to combine the two estimations rigorously is to provide an estimation $(u, U)$ that is consistent with both estimations and obeys the following properties:

$$U \succeq A_1 + (u - a_1)(u - a_1)^T$$
$$U \succeq A_2 + (u - a_2)(u - a_2)^T \qquad (40)$$

where some measurement of the matrix size $U$ (i.e., the determinant) is minimized.

In other words, the previous equations indicate that if the estimation $(a_1, A_1)$ is consistent, then translating the vector $a_1$ to $u$ requires increasing the covariance by a matrix at least as large as the outer product of $(u - a_1)$ in order to remain consistent. The same situation applies to the measurement $(a_2, A_2)$ in order for it to be consistent.

A simple strategy is to choose as the fused mean the value of one of the measurements ($u = a_1$). In this case, the value of $U$ must be chosen such that the estimation is consistent in the worst case (the correct measurement is $a_2$). However, it is possible to assign $u$ an intermediate value between $a_1$ and $a_2$ to decrease the value of $U$. Therefore, the CU algorithm establishes the fused mean value $u$ with the least covariance $U$ that is still sufficiently large with respect to the two measurements ($a_1$ and $a_2$) for consistency.

Because the matrix inequalities presented in the previous equations are convex, convex optimization algorithms must be employed to solve them. The value of $U$ can be computed with the iterative method described by Julier et al. [51]. The obtained covariance could be significantly larger than any of the initial covariances; this is an indicator of the uncertainty that exists between the initial estimations. One of the advantages of the CU method arises from the fact that the same process can easily be extended to $N$ inputs.
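The minimal $U$ requires the iterative method of [51]; as a hedged illustration, the Python sketch below instead uses the deliberately conservative (non-minimal) choice $U = S_1 + S_2$, which satisfies both inequalities in (40) because each $S_i$ is positive semidefinite.

```python
import numpy as np

def covariance_union_conservative(a1, A1, a2, A2):
    """Consistent (but deliberately non-minimal) covariance union:
    picks u midway and inflates the covariance enough for both inputs."""
    u = 0.5 * (a1 + a2)
    S1 = A1 + np.outer(u - a1, u - a1)   # right-hand side of (40) for input 1
    S2 = A2 + np.outer(u - a2, u - a2)   # right-hand side of (40) for input 2
    U = S1 + S2                          # U >= S1 and U >= S2 since S1, S2 >= 0
    return u, U

a1, A1 = np.array([0.0]), np.array([[0.2]])
a2, A2 = np.array([4.0]), np.array([[0.3]])
u, U = covariance_union_conservative(a1, A1, a2, A2)
print(u, U)  # the large U reflects not knowing which input is correct
```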

5. Decision Fusion Methods

In the data fusion domain, a decision is typically made based on the knowledge of the perceived situation, which is provided by many sources. Decision fusion techniques aim to make a high-level inference about the events and activities that are produced from the detected targets. These techniques often use symbolic information, and the fusion process requires reasoning while accounting for uncertainties and constraints. These methods fall under level 2 (situation assessment) and level 4 (impact assessment) of the JDL data fusion model.

5.1. The Bayesian Methods. Information fusion based on Bayesian inference provides a formalism for combining evidence according to the rules of probability theory. Uncertainty is represented using conditional probability terms that describe beliefs and take on values in the interval $[0, 1]$, where zero indicates a complete lack of belief and one indicates an absolute belief. Bayesian inference is based on the Bayes rule:

$$P(Y \mid X) = \frac{P(X \mid Y)\,P(Y)}{P(X)} \qquad (41)$$

where the posterior probability $P(Y \mid X)$ represents the belief in the hypothesis $Y$ given the information $X$. This probability is obtained by multiplying the a priori probability of the hypothesis, $P(Y)$, by the probability of having $X$ given that $Y$ is true, $P(X \mid Y)$. The value $P(X)$ is used as a normalizing constant. The main disadvantage of Bayesian inference is that the probabilities $P(X)$ and $P(X \mid Y)$ must be known. To estimate the conditional probabilities, Pan et al. [52] proposed the use of NNs, whereas Coue et al. [53] proposed Bayesian programming.
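As a small worked example of (41), the following Python sketch fuses two conditionally independent binary sensor reports; all probability values are illustrative assumptions.

```python
# Binary hypothesis Y ("target present") with prior P(Y).
p_y = 0.3

# Assumed sensor models: P(X_i = detect | Y) and P(X_i = detect | not Y).
p_x1_given_y, p_x1_given_not_y = 0.9, 0.2
p_x2_given_y, p_x2_given_not_y = 0.7, 0.1

# Both sensors report a detection; assuming conditional independence,
# the likelihoods multiply before applying the Bayes rule (41).
num = p_x1_given_y * p_x2_given_y * p_y
den = num + p_x1_given_not_y * p_x2_given_not_y * (1 - p_y)
posterior = num / den
print(posterior)  # belief in "target present" after fusing both reports
```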

Hall and Llinas [54] described the following problems associated with Bayesian inference:

(i) Difficulty in establishing the value of a priori probabilities.

(ii) Complexity when there are multiple potential hypotheses and a substantial number of events that depend on the conditions.

(iii) The hypotheses should be mutually exclusive.

(iv) Difficulty in describing the uncertainty of the decisions.

5.2. The Dempster-Shafer Inference. The Dempster-Shafer inference is based on the mathematical theory introduced by Dempster [55] and Shafer [56], which generalizes Bayesian theory. The Dempster-Shafer theory provides a formalism that can be used to represent incomplete knowledge, update beliefs, and combine evidence, and it allows us to represent uncertainty explicitly [57].

A fundamental concept in Dempster-Shafer reasoning is the frame of discernment, which is defined as follows. Let $\Theta = \{\theta_1, \theta_2, \ldots, \theta_N\}$ be the set of all possible states that define the system, and let $\Theta$ be exhaustive and mutually exclusive, because the system can be in only one state $\theta_i \in \Theta$, where $1 \leq i \leq N$. The set $\Theta$ is called a frame of discernment because its elements are employed to discern the current state of the system.

The elements of the set $2^\Theta$ are called hypotheses. In the Dempster-Shafer theory, based on the evidence $E$, a probability is assigned to each hypothesis $H \in 2^\Theta$ according to the basic probability assignment, or mass function, $m: 2^\Theta \rightarrow [0, 1]$, which satisfies

$$m(\emptyset) = 0 \qquad (42)$$


Thus, the mass function of the empty set is zero. Furthermore, the mass function of a hypothesis is larger than or equal to zero for all of the hypotheses:

$$m(H) \geq 0 \quad \forall H \in 2^\Theta \qquad (43)$$

The sum of the mass functions of all the hypotheses is one:

$$\sum_{H \in 2^\Theta} m(H) = 1 \qquad (44)$$

To express incomplete beliefs in a hypothesis $H$, the Dempster-Shafer theory defines the belief function $\mathrm{bel}: 2^\Theta \rightarrow [0, 1]$ over $\Theta$ as

$$\mathrm{bel}(H) = \sum_{A \subseteq H} m(A) \qquad (45)$$

where $\mathrm{bel}(\emptyset) = 0$ and $\mathrm{bel}(\Theta) = 1$. The level of doubt in $H$ can be expressed in terms of the belief function by

$$\mathrm{dou}(H) = \mathrm{bel}(\neg H) = \sum_{A \subseteq \neg H} m(A) \qquad (46)$$

To express the plausibility of each hypothesis, the function $\mathrm{pl}: 2^\Theta \rightarrow [0, 1]$ over $\Theta$ is defined as

$$\mathrm{pl}(H) = 1 - \mathrm{dou}(H) = \sum_{A \cap H \neq \emptyset} m(A) \qquad (47)$$

Intuitively, plausibility indicates that there is less uncertainty in hypothesis $H$ when $H$ is more plausible. The confidence interval $[\mathrm{bel}(H), \mathrm{pl}(H)]$ defines the true belief in hypothesis $H$. To combine the effects of two mass functions $m_1$ and $m_2$, the Dempster-Shafer theory defines the rule $m_1 \oplus m_2$ as

$$m_1 \oplus m_2(\emptyset) = 0$$
$$m_1 \oplus m_2(H) = \frac{\sum_{X \cap Y = H} m_1(X)\,m_2(Y)}{1 - \sum_{X \cap Y = \emptyset} m_1(X)\,m_2(Y)} \qquad (48)$$

In contrast to Bayesian inference, a priori probabilities are not required in the Dempster-Shafer inference because they are assigned at the instant that the information is provided. Several studies in the literature have compared the use of Bayesian inference and Dempster-Shafer inference, such as [58-60]. Wu et al. [61] used the Dempster-Shafer theory to fuse information in context-aware environments. This work was extended in [62] to dynamically modify the weights associated with the sensor measurements. Therefore, the fusion mechanism is calibrated according to the recent measurements of the sensors (in cases in which the ground-truth is available). In the military domain [63], Dempster-Shafer reasoning is used with a priori information stored in a database for classifying military ships. Morbee et al. [64] described the use of the Dempster-Shafer theory to build 2D occupancy maps from several cameras and to evaluate the contribution of subsets of cameras to a specific task. Each task is the observation of an event of interest, and the goal is to assess the validity of a set of hypotheses that are fused using the Dempster-Shafer theory.
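A compact Python sketch of the combination rule (48) over a two-state frame of discernment is shown below; the mass assignments are illustrative assumptions.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions given as {frozenset: mass} dicts
    over the same frame of discernment, following (48)."""
    fused, conflict = {}, 0.0
    for (X, mx), (Y, my) in product(m1.items(), m2.items()):
        inter = X & Y
        if inter:
            fused[inter] = fused.get(inter, 0.0) + mx * my
        else:
            conflict += mx * my            # mass assigned to the empty set
    return {H: v / (1.0 - conflict) for H, v in fused.items()}

# Frame of discernment: {friend, hostile}; two sources, partial ignorance.
friend, hostile = frozenset({"friend"}), frozenset({"hostile"})
theta = friend | hostile                   # full ignorance hypothesis
m1 = {friend: 0.6, theta: 0.4}
m2 = {friend: 0.5, hostile: 0.3, theta: 0.2}
print(dempster_combine(m1, m2))            # masses renormalized after conflict
```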

5.3. Abductive Reasoning. Abductive reasoning, or inferring the best explanation, is a reasoning method in which a hypothesis is chosen under the assumption that, in case it is true, it explains the observed event most accurately [65]. In other words, when an event is observed, the abduction method attempts to find the best explanation.

In the context of probabilistic reasoning, abductive inference finds the posterior maximum likelihood of the system variables given some observed variables. Abductive reasoning is more of a reasoning pattern than a data fusion technique. Therefore, different inference methods, such as NNs [66] or fuzzy logic [67], can be employed.

5.4. Semantic Methods. Decision fusion techniques that employ semantic data from different sources as input can provide more accurate results than those that rely on only single sources. There is a growing interest in techniques that automatically determine the presence of semantic features in videos to bridge the semantic gap [68].

Semantic information fusion is essentially a scheme in which the raw sensor data are processed such that the nodes exchange only the resultant semantic information. Semantic information fusion typically covers two phases: (i) building the knowledge and (ii) pattern matching (inference). The first phase (typically performed offline) incorporates the most appropriate knowledge into semantic information. Then, the second phase (typically performed online or in real time) fuses relevant attributes and provides a semantic interpretation of the sensor data [69-71].

Semantic fusion can be viewed as a way of integrating and translating sensor data into formal languages. The language obtained from the observations of the environment is then compared with similar languages that are stored in a database. The key to this strategy is that similar behaviors represented by formal languages are also semantically similar. This type of method provides savings in transmission cost because the nodes need only transmit the formal language structure instead of the raw data. However, a known set of behaviors must be stored in a database in advance, which might be difficult in some scenarios.

6. Conclusions

This paper has reviewed the most popular methods and techniques for performing data/information fusion. To determine whether the application of data/information fusion methods is feasible, we must evaluate the computational cost of the process and the delay introduced in the communication. A centralized data fusion approach is theoretically optimal when there is no transmission cost and there are sufficient computational resources; however, this situation typically does not hold in practical applications.

The selection of the most appropriate technique depends on the type of problem and the established assumptions of each technique. Statistical data fusion methods (e.g., PDA, JPDA, MHT, and Kalman) are optimal under specific conditions [72]. First, the assumption that the targets are moving independently and that the measurements are normally distributed around the predicted position typically does not hold. Second, because the statistical techniques model all of the events as probabilities, they typically have several parameters and a priori probabilities for false measurements and detection errors that are often difficult to obtain (at least in an optimal sense). For example, in the case of the MHT algorithm, specific parameters must be established that are nontrivial to determine and are very sensitive [73]. In addition, statistical methods that optimize over several frames are computationally intensive, and their complexity typically grows exponentially with the number of targets. For example, in the case of particle filters, several targets can be tracked jointly as a group or individually. If several targets are tracked jointly, the necessary number of particles grows exponentially. Therefore, in practice, it is better to track them individually, with the assumption that the targets do not interact between the particles.

In contrast to centralized systems, distributed data fusion methods introduce some challenges into the data fusion process, such as (i) spatial and temporal alignment of the information, (ii) out-of-sequence measurements, and (iii) data correlation, as reported by Castanedo et al. [74, 75]. The inherent redundancy of distributed systems could be exploited with distributed reasoning techniques and cooperative algorithms to improve the individual node estimations, as reported by Castanedo et al. [76]. In addition to the previous studies, a new trend based on the geometric notion of a low-dimensional manifold is gaining attention in the data fusion community. An example is the work of Davenport et al. [77], which proposes a simple model that captures the correlation between the sensor observations by matching the parameter values of the different obtained manifolds.

Acknowledgments

The author would like to thank Jesús García, Miguel A. Patricio, and James Llinas for their interesting discussions on several topics that were presented in this paper.

References

[1] JDL, Data Fusion Lexicon, Technical Panel for C3, F. E. White, San Diego, Calif, USA, Code 420, 1991.

[2] D. L. Hall and J. Llinas, "An introduction to multisensor data fusion," Proceedings of the IEEE, vol. 85, no. 1, pp. 6-23, 1997.

[3] H. F. Durrant-Whyte, "Sensor models and multisensor integration," International Journal of Robotics Research, vol. 7, no. 6, pp. 97-113, 1988.

[4] B. V. Dasarathy, "Sensor fusion potential exploitation-innovative architectures and illustrative applications," Proceedings of the IEEE, vol. 85, no. 1, pp. 24-38, 1997.

[5] R. C. Luo, C.-C. Yih, and K. L. Su, "Multisensor fusion and integration: approaches, applications, and future research directions," IEEE Sensors Journal, vol. 2, no. 2, pp. 107-119, 2002.

[6] J. Llinas, C. Bowman, G. Rogova, A. Steinberg, E. Waltz, and F. White, "Revisiting the JDL data fusion model II," Technical Report, DTIC Document, 2004.

[7] E. P. Blasch and S. Plano, "JDL level 5 fusion model 'user refinement' issues and applications in group tracking," in Proceedings of Signal Processing, Sensor Fusion, and Target Recognition XI, pp. 270-279, April 2002.

[8] H. F. Durrant-Whyte and M. Stevens, "Data fusion in decentralized sensing networks," in Proceedings of the 4th International Conference on Information Fusion, pp. 302-307, Montreal, Canada, 2001.

[9] J. Manyika and H. Durrant-Whyte, Data Fusion and Sensor Management: A Decentralized Information-Theoretic Approach, Prentice Hall, Upper Saddle River, NJ, USA, 1995.

[10] S. S. Blackman, "Association and fusion of multiple sensor data," in Multitarget-Multisensor Tracking: Advanced Applications, pp. 187-217, Artech House, 1990.

[11] S. Lloyd, "Least squares quantization in PCM," IEEE Transactions on Information Theory, vol. 28, no. 2, pp. 129-137, 1982.

[12] M. Shindler, A. Wong, and A. Meyerson, "Fast and accurate k-means for large datasets," in Proceedings of the 25th Annual Conference on Neural Information Processing Systems (NIPS '11), pp. 2375-2383, December 2011.

[13] Y. Bar-Shalom and E. Tse, "Tracking in a cluttered environment with probabilistic data association," Automatica, vol. 11, no. 5, pp. 451-460, 1975.

[14] T. E. Fortmann, Y. Bar-Shalom, and M. Scheffe, "Multi-target tracking using joint probabilistic data association," in Proceedings of the 19th IEEE Conference on Decision and Control including the Symposium on Adaptive Processes, vol. 19, pp. 807-812, December 1980.

[15] D. B. Reid, "An algorithm for tracking multiple targets," IEEE Transactions on Automatic Control, vol. 24, no. 6, pp. 843-854, 1979.

[16] C. L. Morefield, "Application of 0-1 integer programming to multitarget tracking problems," IEEE Transactions on Automatic Control, vol. 22, no. 3, pp. 302-312, 1977.

[17] R. L. Streit and T. E. Luginbuhl, "Maximum likelihood method for probabilistic multihypothesis tracking," in Signal and Data Processing of Small Targets, vol. 2235 of Proceedings of SPIE, p. 394, 1994.

[18] I. J. Cox and S. L. Hingorani, "Efficient implementation of Reid's multiple hypothesis tracking algorithm and its evaluation for the purpose of visual tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 2, pp. 138-150, 1996.

[19] K. G. Murty, "An algorithm for ranking all the assignments in order of increasing cost," Operations Research, vol. 16, no. 3, pp. 682-687, 1968.

[20] M. E. Liggins II, C.-Y. Chong, I. Kadar et al., "Distributed fusion architectures and algorithms for target tracking," Proceedings of the IEEE, vol. 85, no. 1, pp. 95-106, 1997.

[21] S. Coraluppi, C. Carthel, M. Luettgen, and S. Lynch, "All-source track and identity fusion," in Proceedings of the National Symposium on Sensor and Data Fusion, 2000.

[22] P. Storms and F. Spieksma, "An LP-based algorithm for the data association problem in multitarget tracking," in Proceedings of the 3rd IEEE International Conference on Information Fusion, vol. 1, 2000.

[23] S.-W. Joo and R. Chellappa, "A multiple-hypothesis approach for multiobject visual tracking," IEEE Transactions on Image Processing, vol. 16, no. 11, pp. 2849-2854, 2007.

[24] S. Coraluppi and C. Carthel, "Aggregate surveillance: a cardinality tracking approach," in Proceedings of the 14th International Conference on Information Fusion (FUSION '11), July 2011.

[25] K. C. Chang, C. Y. Chong, and Y. Bar-Shalom, "Joint probabilistic data association in distributed sensor networks," IEEE Transactions on Automatic Control, vol. 31, no. 10, pp. 889-897, 1986.

[26] C. Y. Chong, S. Mori, and K. C. Chang, "Information fusion in distributed sensor networks," in Proceedings of the 4th American Control Conference, Boston, Mass, USA, June 1985.

[27] C. Y. Chong, S. Mori, and K. C. Chang, "Distributed multitarget multisensor tracking," in Multitarget-Multisensor Tracking: Advanced Applications, vol. 1, pp. 247-295, 1990.

[28] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, San Mateo, Calif, USA, 1988.

[29] D. Koller and N. Friedman, Probabilistic Graphical Models: Principles and Techniques, MIT Press, 2009.

[30] L. Chen, M. Cetin, and A. S. Willsky, "Distributed data association for multi-target tracking in sensor networks," in Proceedings of the 7th International Conference on Information Fusion (FUSION '05), pp. 9-16, July 2005.

[31] L. Chen, M. J. Wainwright, M. Cetin, and A. S. Willsky, "Data association based on optimization in graphical models with application to sensor networks," Mathematical and Computer Modelling, vol. 43, no. 9-10, pp. 1114-1135, 2006.

[32] Y. Weiss and W. T. Freeman, "On the optimality of solutions of the max-product belief-propagation algorithm in arbitrary graphs," IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 736-744, 2001.

[33] C. Brown, H. Durrant-Whyte, J. Leonard, B. Rao, and B. Steer, "Distributed data fusion using Kalman filtering: a robotics application," in Data Fusion in Robotics and Machine Intelligence, M. A. Abidi and R. C. Gonzalez, Eds., pp. 267-309, 1992.

[34] R. E. Kalman, "A new approach to linear filtering and prediction problems," Journal of Basic Engineering, vol. 82, no. 1, pp. 35-45, 1960.

[35] R. C. Luo and M. G. Kay, "Data fusion and sensor integration: state-of-the-art 1990s," in Data Fusion in Robotics and Machine Intelligence, pp. 7-135, 1992.

[36] G. Welch and G. Bishop, An Introduction to the Kalman Filter, ACM SIGGRAPH 2001 Course Notes, 2001.

[37] S. J. Julier and J. K. Uhlmann, "A new extension of the Kalman filter to nonlinear systems," in Proceedings of the International Symposium on Aerospace/Defense Sensing, Simulation and Controls, vol. 3, 1997.

[38] E. A. Wan and R. Van Der Merwe, "The unscented Kalman filter for nonlinear estimation," in Proceedings of the Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC '00), pp. 153-158, 2000.

[39] D. Crisan and A. Doucet, "A survey of convergence results on particle filtering methods for practitioners," IEEE Transactions on Signal Processing, vol. 50, no. 3, pp. 736-746, 2002.

[40] J. Martinez-del Rincon, C. Orrite-Urunuela, and J. E. Herrero-Jaraba, "An efficient particle filter for color-based tracking in complex scenes," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance, pp. 176-181, 2007.

[41] S. Ganeriwal, R. Kumar, and M. B. Srivastava, "Timing-sync protocol for sensor networks," in Proceedings of the 1st International Conference on Embedded Networked Sensor Systems (SenSys '03), pp. 138-149, November 2003.

[42] M. Manzo, T. Roosta, and S. Sastry, "Time synchronization in networks," in Proceedings of the 3rd ACM Workshop on Security of Ad Hoc and Sensor Networks (SASN '05), pp. 107-116, November 2005.

[43] J. K. Uhlmann, "Covariance consistency methods for fault-tolerant distributed data fusion," Information Fusion, vol. 4, no. 3, pp. 201-215, 2003.

[44] S. Bashi, V. P. Jilkov, X. R. Li, and H. Chen, "Distributed implementations of particle filters," in Proceedings of the 6th International Conference of Information Fusion, pp. 1164-1171, 2003.

[45] M. Coates, "Distributed particle filters for sensor networks," in Proceedings of the 3rd International Symposium on Information Processing in Sensor Networks, ACM, pp. 99-107, New York, NY, USA, 2004.

[46] D. Gu, "Distributed particle filter for target tracking," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '07), pp. 3856-3861, April 2007.

[47] Y. Bar-Shalom, "Update with out-of-sequence measurements in tracking: exact solution," IEEE Transactions on Aerospace and Electronic Systems, vol. 38, no. 3, pp. 769-778, 2002.

[48] M. Orton and A. Marrs, "A Bayesian approach to multi-target tracking and data fusion with out-of-sequence measurements," IEE Colloquium, no. 174, pp. 151-155, 2001.

[49] M. L. Hernandez, A. D. Marrs, S. Maskell, and M. R. Orton, "Tracking and fusion for wireless sensor networks," in Proceedings of the 5th International Conference on Information Fusion, 2002.

[50] P. C. Mahalanobis, "On the generalized distance in statistics," Proceedings of the National Institute of Sciences of India, vol. 2, no. 1, pp. 49-55, 1936.

[51] S. J. Julier, J. K. Uhlmann, and D. Nicholson, "A method for dealing with assignment ambiguity," in Proceedings of the American Control Conference (ACC '04), vol. 5, pp. 4102-4107, July 2004.

[52] H. Pan, Z.-P. Liang, T. J. Anastasio, and T. S. Huang, "Hybrid NN-Bayesian architecture for information fusion," in Proceedings of the International Conference on Image Processing (ICIP '98), pp. 368-371, October 1998.

[53] C. Coue, T. Fraichard, P. Bessiere, and E. Mazer, "Multi-sensor data fusion using Bayesian programming: an automotive application," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 141-146, October 2002.

[54] D. L. Hall and J. Llinas, Handbook of Multisensor Data Fusion, CRC Press, Boca Raton, Fla, USA, 2001.

[55] A. P. Dempster, "A generalization of Bayesian inference," Journal of the Royal Statistical Society B, vol. 30, no. 2, pp. 205-247, 1968.

[56] G. Shafer, A Mathematical Theory of Evidence, Princeton University Press, Princeton, NJ, USA, 1976.

[57] G. M. Provan, "The validity of Dempster-Shafer belief functions," International Journal of Approximate Reasoning, vol. 6, no. 3, pp. 389-399, 1992.

[58] D. M. Buede, "Shafer-Dempster and Bayesian reasoning: a response to 'Shafer-Dempster reasoning with applications to multisensor target identification systems'," IEEE Transactions on Systems, Man and Cybernetics, vol. 18, no. 6, pp. 1009-1011, 1988.

[59] Y. Cheng and R. L. Kashyap, "Comparison of Bayesian and Dempster's rules in evidence combination," in Maximum-Entropy and Bayesian Methods in Science and Engineering, 1988.

[60] B. R. Cobb and P. P. Shenoy, "A comparison of Bayesian and belief function reasoning," Information Systems Frontiers, vol. 5, no. 4, pp. 345-358, 2003.

[61] H. Wu, M. Siegel, R. Stiefelhagen, and J. Yang, "Sensor fusion using Dempster-Shafer theory," in Proceedings of the 19th IEEE Instrumentation and Measurement Technology Conference (IMTC '02), pp. 7-11, May 2002.

[62] H. Wu, M. Siegel, and S. Ablay, "Sensor fusion using Dempster-Shafer theory II: static weighting and Kalman filter-like dynamic weighting," in Proceedings of the 20th IEEE Instrumentation and Measurement Technology Conference (IMTC '03), pp. 907-912, May 2003.

[63] E. Bosse, P. Valin, A.-C. Boury-Brisset, and D. Grenier, "Exploitation of a priori knowledge for information fusion," Information Fusion, vol. 7, no. 2, pp. 161-175, 2006.

[64] M. Morbee, L. Tessens, H. Aghajan, and W. Philips, "Dempster-Shafer based multi-view occupancy maps," Electronics Letters, vol. 46, no. 5, pp. 341-343, 2010.

[65] C. S. Peirce, Abduction and Induction: Philosophical Writings of Peirce, vol. 156, Dover, New York, NY, USA, 1955.

[66] A. M. Abdelbar, E. A. M. Andrews, and D. C. Wunsch II, "Abductive reasoning with recurrent neural networks," Neural Networks, vol. 16, no. 5-6, pp. 665-673, 2003.

[67] J. R. Aguero and A. Vargas, "Inference of operative configuration of distribution networks using fuzzy logic techniques. Part II: extended real-time model," IEEE Transactions on Power Systems, vol. 20, no. 3, pp. 1562-1569, 2005.

[68] A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, "Content-based image retrieval at the end of the early years," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1349-1380, 2000.

[69] D. S. Friedlander and S. Phoha, "Semantic information fusion for coordinated signal processing in mobile sensor networks," International Journal of High Performance Computing Applications, vol. 16, no. 3, pp. 235-241, 2002.

[70] D. S. Friedlander, "Semantic information extraction," in Distributed Sensor Networks, 2005.

[71] K. Whitehouse, J. Liu, and F. Zhao, "Semantic Streams: a framework for composable inference over sensor data," in Proceedings of the 3rd European Workshop on Wireless Sensor Networks, Lecture Notes in Computer Science, Springer, February 2006.

[72] I. J. Cox, "A review of statistical data association techniques for motion correspondence," International Journal of Computer Vision, vol. 10, no. 1, pp. 53-66, 1993.

[73] C. J. Veenman, M. J. T. Reinders, and E. Backer, "Resolving motion correspondence for densely moving points," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 1, pp. 54-72, 2001.

[74] F. Castanedo, M. A. Patricio, J. García, and J. M. Molina, "Bottom-up/top-down coordination in a multiagent visual sensor network," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS '07), pp. 93-98, September 2007.

[75] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Analysis of distributed fusion alternatives in coordinated vision agents," in Proceedings of the 11th International Conference on Information Fusion (FUSION '08), July 2008.

[76] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Data fusion to improve trajectory tracking in a cooperative surveillance multi-agent architecture," Information Fusion, vol. 11, no. 3, pp. 243-255, 2010.

[77] M. A. Davenport, C. Hegde, M. F. Duarte, and R. G. Baraniuk, "Joint manifolds for data fusion," IEEE Transactions on Image Processing, vol. 19, no. 10, pp. 2580-2594, 2010.

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 11: Review Article A Review of Data Fusion Techniquesdownloads.hindawi.com/journals/tswj/2013/704504.pdf · Classification of Data Fusion Techniques Data fusion is a ... to extract features

The Scientific World Journal 11

Using the information model, the probability of each fused hypothesis is given by

$$P(\lambda \mid Z) = C^{-1} \prod_{j \in J} P(\lambda^{(j)} \mid Z^{(j)})^{\alpha(j)} \prod_{\tau \in \lambda} L(\tau \mid Z), \quad (19)$$

where $C$ is a normalizing constant and $L(\tau \mid Z)$ is the likelihood of each hypothesis pair.

The main disadvantage of the MHT-D is its high computational cost, which is on the order of $O(n^M)$, where $n$ is the number of possible associations and $M$ is the number of variables to be estimated.

3.7. Graphical Models. Graphical models are a formalism for representing and reasoning with probabilities and independence. A graphical model represents a conditional decomposition of the joint probability and can be depicted as a graph in which the nodes denote random variables, the edges denote possible dependences between the random variables, and the plates denote the replication of a substructure with the appropriate indexing of the relevant variables. The graph captures the joint distribution over the random variables, which can be decomposed into a product of factors that each depend on only a subset of the variables. There are two major classes of graphical models: (i) Bayesian networks [28], also known as directed graphical models, and (ii) Markov random fields, also known as undirected graphical models. Directed graphical models are useful for expressing causal relationships between random variables, whereas undirected models are better suited for expressing soft constraints between random variables. We refer the reader to the book by Koller and Friedman [29] for more information on graphical models.

A framework based on graphical models can solve the problem of distributed data association in synchronized sensor networks with overlapping areas in which each sensor receives noisy measurements; this solution was proposed by Chen et al. [30, 31]. Their work is based on graphical models that are used to represent the statistical dependence between random variables. The data association problem is treated as an inference problem and solved by using the max-product algorithm [32]. Graphical models represent statistical dependencies between variables as graphs, and the max-product algorithm converges when the graph is a tree structure. Moreover, the employed algorithm can be implemented in a distributed manner by exchanging messages between the source nodes in parallel. With this algorithm, if each sensor has $n$ possible combinations of associations and there are $M$ variables to be estimated, the complexity is $O(n^2 M)$, which is reasonable and less than the $O(n^M)$ complexity of the MHT-D algorithm. However, special attention must be given to the correlated variables when building the graphical model.
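To illustrate the inference machinery involved, the following minimal sketch runs max-product in its Viterbi-like dynamic-programming form, which is exact on tree-structured graphs, over a three-node chain of discrete variables. The unary and pairwise potentials are randomly generated assumptions for illustration; this is not the distributed sensor-network formulation of Chen et al. [30, 31].

```python
import numpy as np

# Max-product (Viterbi-style) MAP inference on a chain MRF x1 - x2 - x3,
# each variable taking one of K discrete states. On a tree, max-product
# belief propagation recovers the exact MAP assignment.
K = 4
rng = np.random.default_rng(0)
unary = [rng.random(K) for _ in range(3)]        # phi_i(x_i), assumed potentials
pair = [rng.random((K, K)) for _ in range(2)]    # psi_{i,i+1}(x_i, x_{i+1})

# Forward pass: m_{i->i+1}(x_{i+1}) = max_{x_i} phi_i(x_i) psi(x_i, x_{i+1}) m_{i-1->i}(x_i)
m12 = np.max(unary[0][:, None] * pair[0], axis=0)            # message 1 -> 2
m23 = np.max((unary[1] * m12)[:, None] * pair[1], axis=0)    # message 2 -> 3

# Backward pass: backtrack the maximizing states (the MAP configuration)
x3 = int(np.argmax(unary[2] * m23))
x2 = int(np.argmax(unary[1] * m12 * pair[1][:, x3]))
x1 = int(np.argmax(unary[0] * pair[0][:, x2]))
print("MAP assignment:", (x1, x2, x3))
```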

4. State Estimation Methods

State estimation techniques aim to determine the state of the target under movement (typically the position) given the observations or measurements. State estimation techniques are also known as tracking techniques. In their general form, it is not guaranteed that the target observations are relevant, which means that some of the observations could actually come from the target and others could be only noise. The state estimation phase is a common stage in data fusion algorithms because the target's observations could come from different sensors or sources, and the final goal is to obtain a global target state from the observations.

The estimation problem involves finding the values of the vector state (e.g., position, velocity, and size) that fit as closely as possible with the observed data. From a mathematical perspective, we have a set of redundant observations, and the goal is to find the set of parameters that provides the best fit to the observed data. In general, these observations are corrupted by errors and by the propagation of noise in the measurement process. State estimation methods fall under level 1 of the JDL classification and can be divided into two broader groups:

(1) linear dynamics and measurements: here, the estimation problem has a standard solution; specifically, when the equations of the object state and the measurements are linear, the noise follows the Gaussian distribution, and we do not refer to it as a clutter environment, the optimal theoretical solution is based on the Kalman filter;

(2) nonlinear dynamics: the state estimation problem becomes difficult, and there is no analytical solution that solves the problem in a general manner; in principle, there are no practical algorithms available to solve this problem satisfactorily.

Most of the state estimation methods are based on control theory and employ the laws of probability to compute a vector state from a vector measurement or a stream of vector measurements. Next, the most common estimation methods are presented, including maximum likelihood and maximum posterior (Section 4.1), the Kalman filter (Section 4.2), the particle filter (Section 4.3), the distributed Kalman filter (Section 4.4), the distributed particle filter (Section 4.5), and covariance consistency methods (Section 4.6).

4.1. Maximum Likelihood and Maximum Posterior. The maximum likelihood (ML) technique is an estimation method based on probabilistic theory. Probabilistic estimation methods are appropriate when the state variable follows an unknown probability distribution [33]. In the context of data fusion, $x$ is the state being estimated and $z = (z(1), \ldots, z(k))$ is a sequence of $k$ previous observations of $x$. The likelihood function $\lambda(x)$ is defined as a probability density function of the sequence of $z$ observations given the true value of the state $x$:

$$\lambda(x) = p(z \mid x). \quad (20)$$

The ML estimator finds the value of $x$ that maximizes the likelihood function:

$$\hat{x}(k) = \arg\max_x p(z \mid x), \quad (21)$$


which can be obtained from the analytical or empirical models of the sensors. This function expresses the probability of the observed data. The main disadvantage of this method in practice is that it requires the analytical or empirical model of the sensor to be known in order to provide the prior distribution and compute the likelihood function. This method can also systematically underestimate the variance of the distribution, which leads to a bias problem. However, the bias of the ML solution becomes less significant as the number $N$ of data points increases, and the estimated variance converges to the true variance of the distribution that generated the data in the limit $N \to \infty$.

The maximum posterior (MAP) method is based on Bayesian theory. It is employed when the parameter $x$ to be estimated is the output of a random variable with a known probability density function $p(x)$. In the context of data fusion, $x$ is the state being estimated and $z = (z(1), \ldots, z(k))$ is a sequence of $k$ previous observations of $x$. The MAP estimator finds the value of $x$ that maximizes the posterior probability distribution:

$$\hat{x}(k) = \arg\max_x p(x \mid z). \quad (22)$$

Both methods (ML and MAP) aim to find the most likely value of the state $x$. However, ML assumes that $x$ is a fixed but unknown point of the parameter space, whereas MAP considers $x$ to be the output of a random variable with a known a priori probability density function. Both methods are equivalent when there is no a priori information about $x$, that is, when there are only observations.
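To make the distinction concrete, the following minimal sketch estimates a scalar state from noisy observations under a Gaussian model; the true state, noise levels, and prior parameters are illustrative assumptions. The ML estimate uses only the observations, while the MAP estimate shrinks toward the prior mean; as the prior becomes uninformative ($\sigma_0 \to \infty$), the two coincide, as noted above.

```python
import numpy as np

# ML vs. MAP estimation of a scalar state x from observations
# z(i) = x + v(i), with v ~ N(0, sigma_v^2). All numbers are assumptions.
rng = np.random.default_rng(1)
x_true, sigma_v = 2.0, 0.5
z = x_true + sigma_v * rng.standard_normal(20)   # k = 20 observations

# ML: maximize p(z | x); for Gaussian noise this is the sample mean.
x_ml = z.mean()

# MAP: maximize p(x | z) with a known Gaussian prior x ~ N(mu0, sigma0^2);
# the posterior mode is a precision-weighted mean of data and prior.
mu0, sigma0 = 0.0, 1.0
w = (len(z) / sigma_v**2) / (len(z) / sigma_v**2 + 1.0 / sigma0**2)
x_map = w * z.mean() + (1.0 - w) * mu0

print(f"ML estimate {x_ml:.3f}, MAP estimate {x_map:.3f}")
```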

4.2. The Kalman Filter. The Kalman filter is the most popular estimation technique. It was originally proposed by Kalman [34] and has been widely studied and applied since then. The Kalman filter estimates the state $x$ of a discrete-time process governed by the following space-time model:

$$x(k+1) = \Phi(k) x(k) + G(k) u(k) + w(k), \quad (23)$$

with the observations or measurements $z$ at time $k$ of the state $x$ represented by

$$z(k) = H(k) x(k) + v(k), \quad (24)$$

where $\Phi(k)$ is the state transition matrix, $G(k)$ is the input transition matrix, $u(k)$ is the input vector, $H(k)$ is the measurement matrix, and $w$ and $v$ are random Gaussian variables with zero mean and covariance matrices $Q(k)$ and $R(k)$, respectively. Based on the measurements and on the system parameters, the estimation of $x(k)$, which is represented by $\hat{x}(k)$, and the prediction of $x(k+1)$, which is represented by $\hat{x}(k+1 \mid k)$, are given by

$$\hat{x}(k) = \hat{x}(k \mid k-1) + K(k) [z(k) - H(k) \hat{x}(k \mid k-1)],$$
$$\hat{x}(k+1 \mid k) = \Phi(k) \hat{x}(k \mid k) + G(k) u(k), \quad (25)$$

respectively, where $K$ is the filter gain, determined by

$$K(k) = P(k \mid k-1) H^T(k) [H(k) P(k \mid k-1) H^T(k) + R(k)]^{-1}, \quad (26)$$

where $P(k \mid k-1)$ is the prediction covariance matrix, which can be determined by

$$P(k+1 \mid k) = \Phi(k) P(k) \Phi^T(k) + Q(k), \quad (27)$$

with

$$P(k) = P(k \mid k-1) - K(k) H(k) P(k \mid k-1). \quad (28)$$
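The following minimal sketch implements the recursion (23)-(28) for an assumed constant-velocity model with a position-only measurement and no control input ($u = 0$); the model matrices, noise levels, and measurement sequence are illustrative assumptions.

```python
import numpy as np

# Linear Kalman filter on a 2-D state (position, velocity); illustrative setup.
dt = 1.0
Phi = np.array([[1.0, dt], [0.0, 1.0]])   # state transition matrix, (23)
H = np.array([[1.0, 0.0]])                # measurement matrix: position only, (24)
Q = 0.01 * np.eye(2)                      # process noise covariance
R = np.array([[0.25]])                    # measurement noise covariance

x = np.zeros(2)   # initial state estimate
P = np.eye(2)     # initial covariance

for z in [0.9, 2.1, 2.8, 4.2]:            # assumed position measurements z(k)
    # Prediction: x(k+1|k) = Phi x(k|k), P(k+1|k) = Phi P Phi^T + Q -- (25), (27)
    x = Phi @ x
    P = Phi @ P @ Phi.T + Q
    # Gain: K = P H^T (H P H^T + R)^{-1} -- (26)
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    # Update: x(k) = x + K(z - H x), P(k) = P - K H P -- (25), (28)
    x = x + K @ (np.array([z]) - H @ x)
    P = P - K @ H @ P
    print(f"z={z:.1f} -> position {x[0]:.2f}, velocity {x[1]:.2f}")
```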

The Kalman filter is mainly employed to fuse low-level data. If the system can be described by a linear model and the error can be modeled as Gaussian noise, then the recursive Kalman filter obtains optimal statistical estimations [35]. However, other methods are required to address nonlinear dynamic models and nonlinear measurements. The modified Kalman filter, known as the extended Kalman filter (EKF), is a classical approach for implementing nonlinear recursive filters [36]. The EKF is one of the most often employed methods for fusing data in robotic applications. However, it has some disadvantages because the computation of the Jacobians is extremely expensive. Some attempts have been made to reduce this computational cost, such as linearization, but these attempts introduce errors in the filter and make it unstable.

The unscented Kalman filter (UKF) [37] has gained popularity because it avoids the linearization step of the EKF and its associated errors [38]. The UKF employs a deterministic sampling strategy to establish a minimum set of points around the mean. This set of points captures the true mean and covariance completely. These points are then propagated through the nonlinear functions, and the covariance of the estimations can be recovered. Another advantage of the UKF is its ability to be employed in parallel implementations.
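A minimal sketch of the deterministic sampling step at the core of the UKF, the unscented transform, is given below: sigma points around the mean are propagated through a nonlinear function, and the transformed mean and covariance are recovered from weighted sums. The scaling parameter $\kappa$, the polar-to-Cartesian test function, and the input moments are illustrative assumptions; this is the transform only, not the full UKF recursion.

```python
import numpy as np

def unscented_transform(m, P, f, kappa=1.0):
    """Propagate mean m and covariance P through a nonlinear function f
    using 2n+1 sigma points (a basic, unscaled formulation)."""
    n = len(m)
    S = np.linalg.cholesky((n + kappa) * P)   # matrix square root of (n+kappa) P
    sigma = np.vstack([m, m + S.T, m - S.T])  # sigma points: mean +/- columns of S
    w = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    w[0] = kappa / (n + kappa)                # weights sum to one
    y = np.array([f(s) for s in sigma])       # propagate through the nonlinearity
    mean = w @ y
    cov = (w[:, None] * (y - mean)).T @ (y - mean)
    return mean, cov

# Example: polar-to-Cartesian conversion, a classic nonlinear measurement map.
f = lambda s: np.array([s[0] * np.cos(s[1]), s[0] * np.sin(s[1])])
m, P = np.array([1.0, np.pi / 4]), np.diag([0.01, 0.05])
mean, cov = unscented_transform(m, P, f)
print("transformed mean:", mean, "\ntransformed covariance:\n", cov)
```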

4.3. Particle Filter. Particle filters are recursive implementations of sequential Monte Carlo methods [39]. This method builds the posterior density function using several random samples called particles. Particles are propagated over time with a combination of sampling and resampling steps. At each iteration, the resampling step is employed to discard some particles, increasing the relevance of regions with a higher posterior probability. In the filtering process, several particles of the same state variable are employed, and each particle has an associated weight that indicates the quality of the particle. Therefore, the estimation is the result of a weighted sum of all the particles. The standard particle filter algorithm has two phases: (1) the predicting phase and (2) the updating phase. In the predicting phase, each particle is modified according to the existing model, including the addition of random noise to simulate the noise effect. Then, in the updating phase, the weight of each particle is reevaluated using the latest available sensor observation, and particles with lower weights are removed. Specifically, a generic particle filter comprises the following steps.


(1) Initialization of the particles:

(i) let $N$ be the number of particles;
(ii) $X^{(i)}(1) = [x(1), y(1), 0, 0]^T$ for $i = 1, \ldots, N$.

(2) Prediction step:

(i) for each particle $i = 1, \ldots, N$, evaluate the state $(k+1 \mid k)$ of the system using the state at time instant $k$ and the noise of the system at time $k$:

$$X^{(i)}(k+1 \mid k) = F(k) X^{(i)}(k) + (\text{cauchy-distribution-noise})(k), \quad (29)$$

where $F(k)$ is the transition matrix of the system.

(3) Evaluate the particle weights. For each particle $i = 1, \ldots, N$:

(i) compute the predicted observation of the system using the current predicted state and the noise at instant $k$:

$$\hat{z}^{(i)}(k+1 \mid k) = H(k+1) X^{(i)}(k+1 \mid k) + (\text{gaussian-measurement-noise})(k+1); \quad (30)$$

(ii) compute the likelihood (weights) according to the given distribution:

$$\text{likelihood}^{(i)} = N(\hat{z}^{(i)}(k+1 \mid k);\, z(k+1), \text{var}); \quad (31)$$

(iii) normalize the weights as follows:

$$w^{(i)} = \frac{\text{likelihood}^{(i)}}{\sum_{j=1}^{N} \text{likelihood}^{(j)}}. \quad (32)$$

(4) Resampling/Selection: multiply particles with higher weights and remove those with lower weights. The current state must be adjusted using the computed weights of the new particles.

(i) Compute the cumulative weights:

$$\text{CumWt}^{(i)} = \sum_{j=1}^{i} w^{(j)}. \quad (33)$$

(ii) Generate uniformly distributed random variables $U^{(i)} \sim U(0, 1)$, with the number of draws equal to the number of particles.

(iii) Determine which particles should be multiplied and which ones removed.

(5) Propagation phase:

(i) incorporate the new values of the state after the resampling at instant $k$ to calculate the value at instant $k+1$:

$$x^{(1,\ldots,N)}(k+1 \mid k+1) = x(k+1 \mid k); \quad (34)$$

(ii) compute the posterior mean:

$$\hat{x}(k+1) = \text{mean}[x^{(i)}(k+1 \mid k+1)], \quad i = 1, \ldots, N. \quad (35)$$

(iii) Repeat steps 2 to 5 for each time instant.

Particle filters are more flexible than Kalman filters and can cope with nonlinear dependencies and non-Gaussian densities in the dynamic model and in the noise error. However, they have some disadvantages. A large number of particles is required to obtain a small variance in the estimator. It is also difficult to establish the optimal number of particles in advance, and the number of particles affects the computational cost significantly. Earlier versions of particle filters employed a fixed number of particles, but recent studies have started to use a dynamic number of particles [40].
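The following sketch is a bootstrap particle filter for a one-dimensional state, following the predict/weight/resample steps above. The random-walk motion model, the Gaussian measurement model, and the measurement sequence are illustrative assumptions; note that it substitutes Gaussian process noise for the Cauchy noise of step (2) and uses multinomial resampling rather than the cumulative-weight scheme of step (4).

```python
import numpy as np

rng = np.random.default_rng(2)
N = 500                                  # number of particles
particles = rng.normal(0.0, 1.0, N)      # (1) initialization
sigma_q, sigma_r = 0.3, 0.5              # process / measurement noise std (assumed)

for z in [0.2, 0.7, 1.3, 1.6]:           # incoming measurements z(k+1)
    # (2) Prediction: propagate each particle through the motion model
    particles = particles + sigma_q * rng.standard_normal(N)
    # (3) Weights: Gaussian likelihood of z given each predicted particle
    w = np.exp(-0.5 * ((z - particles) / sigma_r) ** 2)
    w /= w.sum()                          # normalization, (32)
    # (4) Resampling: draw N particles with probability proportional to weight
    particles = particles[rng.choice(N, size=N, p=w)]
    # (5) Posterior mean as the state estimate, (35)
    print(f"z={z:.1f} -> estimate {particles.mean():.3f}")
```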

4.4. The Distributed Kalman Filter. The distributed Kalman filter requires correct clock synchronization between the sources, as demonstrated in [41]. In other words, to correctly use the distributed Kalman filter, the clocks of all the sources must be synchronized. This synchronization is typically achieved by using protocols that employ a shared global clock, such as the network time protocol (NTP). Synchronization problems between clocks have been shown to affect the accuracy of the Kalman filter, producing inaccurate estimations [42].

If the estimations are consistent and the cross covariance is known (or the estimations are uncorrelated), then it is possible to use distributed Kalman filters [43]. However, the cross covariance must be determined exactly, or the observations must be consistent.

We refer the reader to Liggins II et al. [20] for more details about the Kalman filter in a distributed and hierarchical architecture.

4.5. Distributed Particle Filter. Distributed particle filters have gained attention recently [44-46]. Coates [45] used a distributed particle filter to monitor an environment that could be captured by a Markovian state-space model involving nonlinear dynamics and observations and non-Gaussian noise.

In contrast, earlier attempts to handle out-of-sequence measurements using particle filters are based on regenerating the probability density function at the time instant of the out-of-sequence measurement [47]. In a particle filter, this step incurs a large computational cost, in addition to the space necessary to store the previous particles. To avoid this problem, Orton and Marrs [48] proposed storing the information on the particles at each time instant, saving the cost of recalculating this information. This technique is close to optimal, and when the delay increases, the result is only slightly affected [49]. However, it requires a very large amount of space to store the state of the particles at each time instant.

4.6. Covariance Consistency Methods: Covariance Intersection/Union. Covariance consistency methods (intersection and union) were proposed by Uhlmann [43] and are general and fault-tolerant frameworks for maintaining covariance means and estimations in a distributed network. These methods do not comprise estimation techniques per se; instead, they resemble an estimation fusion technique. The distributed Kalman filter requirement of independent measurements or known cross-covariances is not a constraint with this method.

4.6.1. Covariance Intersection. If the Kalman filter is employed to combine two estimations $(a_1, A_1)$ and $(a_2, A_2)$, then it is assumed that the joint covariance has the following form:

$$\begin{bmatrix} A_1 & X \\ X^T & A_2 \end{bmatrix}, \quad (36)$$

where the cross-covariance $X$ should be known exactly so that the Kalman filter can be applied without difficulty. Because the computation of the cross-covariances is computationally intensive, Uhlmann [43] proposed the covariance intersection (CI) algorithm.

Let us assume that a joint covariance $M$ can be defined with the diagonal blocks $M_{A_1} > A_1$ and $M_{A_2} > A_2$ such that

$$M \geq \begin{bmatrix} A_1 & X \\ X^T & A_2 \end{bmatrix} \quad (37)$$

for every possible instance of the unknown cross-covariance $X$; then the components of the matrix $M$ can be employed in the Kalman filter equations to provide a fused estimation $(c, C)$ that is considered consistent. The key point of this method relies on generating a joint covariance matrix $M$ that can represent a useful fused estimation (in this context, useful refers to something with a lower associated uncertainty). In summary, the CI algorithm computes the joint covariance matrix $M$ for which the Kalman filter provides the best fused estimation $(c, C)$ with respect to a fixed measure of the covariance matrix (i.e., the minimum determinant).

Specific covariance criteria must be established because there is no unique minimum joint covariance in the order of the positive semidefinite matrices. Moreover, although the joint covariance is the basis of the formal analysis of the CI algorithm, the actual result is a nonlinear mixture of the information stored in the estimations being fused, following the equation:

$$C = (w_1 H_1^T A_1^{-1} H_1 + w_2 H_2^T A_2^{-1} H_2 + \cdots + w_n H_n^T A_n^{-1} H_n)^{-1},$$
$$c = C (w_1 H_1^T A_1^{-1} a_1 + w_2 H_2^T A_2^{-1} a_2 + \cdots + w_n H_n^T A_n^{-1} a_n), \quad (38)$$

where $H_i$ is the transformation of the fused state-space estimation to the space of the estimated state $i$. The values of $w$ can be calculated to minimize the covariance determinant using convex optimization packages and semidefinite programming. The result of the CI algorithm has different characteristics compared with the Kalman filter. For example, if two estimations $(a, A)$ and $(b, B)$ are provided and their covariances are equal, $A = B$, then the Kalman filter, which is based on the statistical independence assumption, produces a fused estimation with covariance $C = (1/2)A$. In contrast, the CI method does not assume independence and thus must be consistent even in the case in which the estimations are completely correlated, yielding the estimated fused covariance $C = A$. In the case of estimations where $A < B$, the CI algorithm takes no information from the estimation $(b, B)$; thus, the fused result is $(a, A)$.

Every jointly consistent covariance is sufficient to produce a fused estimation that guarantees consistency. However, it is also necessary to guarantee a lack of divergence. Divergence is avoided in the CI algorithm by choosing a specific measure (i.e., the determinant) that is minimized in each fusion operation. This measure represents a nondivergence criterion because the size of the estimated covariance according to this criterion will not be incremented.

The application of the CI method guarantees consistency and nondivergence for every sequence of mean- and covariance-consistent estimations. However, this method does not work well when the measurements to be fused are inconsistent.
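As an illustration, the following sketch applies (38) for two estimates in a common state space (so $H_1 = H_2 = I$) with weights $w_1 = \omega$ and $w_2 = 1 - \omega$, choosing $\omega$ to minimize the determinant of the fused covariance. The example means and covariances are assumptions, and scipy is used for the bounded scalar minimization.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def covariance_intersection(a1, A1, a2, A2):
    """Fuse two estimates (a1, A1), (a2, A2) with CI, H_i = I."""
    A1i, A2i = np.linalg.inv(A1), np.linalg.inv(A2)

    def det_C(w):
        # Determinant of the fused covariance for a given omega in [0, 1]
        return np.linalg.det(np.linalg.inv(w * A1i + (1.0 - w) * A2i))

    w = minimize_scalar(det_C, bounds=(0.0, 1.0), method="bounded").x
    C = np.linalg.inv(w * A1i + (1.0 - w) * A2i)           # fused covariance
    c = C @ (w * A1i @ a1 + (1.0 - w) * A2i @ a2)           # fused mean
    return c, C, w

# Illustrative estimates with complementary uncertainty in each axis
a1, A1 = np.array([1.0, 0.0]), np.diag([1.0, 4.0])
a2, A2 = np.array([1.5, 0.5]), np.diag([4.0, 1.0])
c, C, w = covariance_intersection(a1, A1, a2, A2)
print("omega =", round(w, 3), "\nfused mean:", c, "\nfused covariance:\n", C)
```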

4.6.2. Covariance Union. CI solves the problem of correlated inputs but not the problem of inconsistent inputs (inconsistent inputs refer to different estimations, each of which has a high accuracy (small variance) but also a large difference from the states of the others); thus, the covariance union (CU) algorithm was proposed to solve the latter [43]. CU addresses the following problem: two estimations $(a_1, A_1)$ and $(a_2, A_2)$ relate to the state of an object but are mutually inconsistent. This issue arises when the difference between the mean estimations is larger than the provided covariance. Inconsistent inputs can be detected using the Mahalanobis distance [50] between them, which is defined as

$$M_d = (a_1 - a_2)^T (A_1 + A_2)^{-1} (a_1 - a_2), \quad (39)$$

and checking whether this distance is larger than a given threshold.

The Mahalanobis distance accounts for the covariance information when computing the distance. If the difference between the estimations is large but their covariances are also large, the Mahalanobis distance yields a small value. In contrast, if the difference between the estimations is small but the covariances are also small, it can produce a larger distance value. A high Mahalanobis distance can indicate that the estimations are inconsistent; however, a specific threshold must be established by the user or learned automatically.
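A minimal sketch of this inconsistency test, directly implementing (39), is given below; the threshold value and the example estimates are illustrative assumptions.

```python
import numpy as np

def inconsistent(a1, A1, a2, A2, threshold=9.0):
    """Return the Mahalanobis distance (39) and whether it exceeds
    the (assumed) threshold, flagging the pair as inconsistent."""
    d = a1 - a2
    m_d = float(d.T @ np.linalg.inv(A1 + A2) @ d)
    return m_d, m_d > threshold

a1, A1 = np.array([0.0, 0.0]), 0.04 * np.eye(2)   # accurate estimate near origin
a2, A2 = np.array([2.0, 1.5]), 0.04 * np.eye(2)   # accurate estimate far away
m_d, flag = inconsistent(a1, A1, a2, A2)
print(f"Mahalanobis distance {m_d:.1f}, inconsistent: {flag}")
```

Because both covariances are small while the means differ widely, the distance is large and the pair is flagged; inflating either covariance shrinks the distance, matching the intuition described above.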

The CU algorithm aims to solve the following problem: let us suppose that a filtering algorithm provides two observations with mean and covariance $(a_1, A_1)$ and $(a_2, A_2)$, respectively. It is known that one of the observations is correct and the other is erroneous; however, the identity of the correct estimation is unknown and cannot be determined. In this situation, if both estimations are employed as an input to the Kalman filter, there will be a problem, because the Kalman filter only guarantees a consistent output if the observation is updated with a measurement that is consistent with it. In the specific case in which the measurements correspond to the same object but are acquired from two different sensors, the Kalman filter can only guarantee that the output is consistent if it is consistent with both separately. Because it is not possible to know which estimation is correct, the only way to combine the two estimations rigorously is to provide an estimation $(u, U)$ that is consistent with both estimations and obeys the following properties:

$$U \geq A_1 + (u - a_1)(u - a_1)^T,$$
$$U \geq A_2 + (u - a_2)(u - a_2)^T, \quad (40)$$

where some measure of the size of the matrix $U$ (i.e., the determinant) is minimized.

In other words, the previous equations indicate that if the estimation $(a_1, A_1)$ is consistent, then translating the vector $a_1$ to $u$ requires enlarging the covariance by a matrix at least as large as the outer product of $(u - a_1)$ in order to remain consistent. The same applies to the measurement $(a_2, A_2)$.

A simple strategy is to choose as the fused mean the mean of one of the measurements ($u = a_1$). In this case, the value of $U$ must be chosen such that the estimation is consistent in the worst case (i.e., the correct measurement is $a_2$). However, it is possible to assign $u$ an intermediate value between $a_1$ and $a_2$ to decrease the value of $U$. Therefore, the CU algorithm establishes the fused mean value $u$ that has the least covariance $U$ that is still sufficiently large with respect to the two measurements ($a_1$ and $a_2$) to guarantee consistency.

Because the matrix inequalities presented in the previous equations are convex, convex optimization algorithms must be employed to solve them. The value of $U$ can be computed with the iterative method described by Julier et al. [51]. The obtained covariance could be significantly larger than any of the initial covariances, which is an indicator of the existing uncertainty between the initial estimations. One of the advantages of the CU method arises from the fact that the same process can easily be extended to $N$ inputs.

5. Decision Fusion Methods

In the data fusion domain, a decision is typically made based on the knowledge of the perceived situation, which is provided by many sources. Decision fusion techniques aim to make a high-level inference about the events and activities produced from the detected targets. These techniques often use symbolic information, and the fusion process requires reasoning while accounting for uncertainties and constraints. These methods fall under level 2 (situation assessment) and level 4 (impact assessment) of the JDL data fusion model.

5.1. The Bayesian Methods. Information fusion based on Bayesian inference provides a formalism for combining evidence according to the rules of probability theory. Uncertainty is represented using conditional probability terms that describe beliefs and take values in the interval [0, 1], where zero indicates a complete lack of belief and one indicates an absolute belief. Bayesian inference is based on the Bayes rule:

$$P(Y \mid X) = \frac{P(X \mid Y) P(Y)}{P(X)}, \quad (41)$$

where the posterior probability $P(Y \mid X)$ represents the belief in the hypothesis $Y$ given the information $X$. This probability is obtained by multiplying the a priori probability of the hypothesis, $P(Y)$, by the probability of observing $X$ given that $Y$ is true, $P(X \mid Y)$. The value $P(X)$ is used as a normalizing constant. The main disadvantage of Bayesian inference is that the probabilities $P(X)$ and $P(X \mid Y)$ must be known. To estimate the conditional probabilities, Pan et al. [52] proposed the use of NNs, whereas Coue et al. [53] proposed Bayesian programming.
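For concreteness, the following sketch applies (41) to a binary hypothesis ($Y$ = "target present", $X$ = "sensor reports a detection"); the prior and the sensor's detection and false-alarm probabilities are illustrative assumptions.

```python
# Bayesian update for a binary hypothesis, following (41). All numbers assumed.
p_target = 0.1                    # P(Y), a priori probability of a target
p_det_given_target = 0.9          # P(X | Y), detection probability
p_det_given_no_target = 0.05      # P(X | not Y), false-alarm probability

# Normalizing constant P(X) via the law of total probability
p_det = (p_det_given_target * p_target
         + p_det_given_no_target * (1.0 - p_target))

# Posterior belief P(Y | X) after observing a detection
p_target_given_det = p_det_given_target * p_target / p_det
print(f"P(target | detection) = {p_target_given_det:.3f}")   # ~0.667
```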

Hall and Llinas [54] described the following problems associated with Bayesian inference:

(i) difficulty in establishing the value of a priori probabilities;

(ii) complexity when there are multiple potential hypotheses and a substantial number of events that depend on the conditions;

(iii) the requirement that the hypotheses be mutually exclusive;

(iv) difficulty in describing the uncertainty of the decisions.

5.2. The Dempster-Shafer Inference. The Dempster-Shafer inference is based on the mathematical theory introduced by Dempster [55] and Shafer [56], which generalizes Bayesian theory. The Dempster-Shafer theory provides a formalism that can be used to represent incomplete knowledge, update beliefs, and combine evidence, and it allows us to represent uncertainty explicitly [57].

A fundamental concept in Dempster-Shafer reasoning is the frame of discernment, which is defined as follows: let $\Theta = \{\theta_1, \theta_2, \ldots, \theta_N\}$ be the set of all possible states that define the system, and let $\Theta$ be exhaustive and mutually exclusive, because the system can be in only one state $\theta_i \in \Theta$, where $1 \leq i \leq N$. The set $\Theta$ is called the frame of discernment because its elements are employed to discern the current state of the system.

The elements of the power set $2^\Theta$ are called hypotheses. In the Dempster-Shafer theory, based on the evidence $E$, a probability is assigned to each hypothesis $H \in 2^\Theta$ according to the basic probability assignment, or mass function, $m: 2^\Theta \to [0, 1]$, which satisfies

$$m(\emptyset) = 0. \quad (42)$$

Thus, the mass function of the empty set is zero. Furthermore, the mass function of each hypothesis is larger than or equal to zero:

$$m(H) \geq 0 \quad \forall H \in 2^\Theta. \quad (43)$$

Finally, the sum of the mass functions of all the hypotheses is one:

$$\sum_{H \in 2^\Theta} m(H) = 1. \quad (44)$$

To express incomplete beliefs in a hypothesis $H$, the Dempster-Shafer theory defines the belief function $\mathrm{bel}: 2^\Theta \to [0, 1]$ over $\Theta$ as

$$\mathrm{bel}(H) = \sum_{A \subseteq H} m(A), \quad (45)$$

where $\mathrm{bel}(\emptyset) = 0$ and $\mathrm{bel}(\Theta) = 1$. The level of doubt in $H$ can be expressed in terms of the belief function by

$$\mathrm{dou}(H) = \mathrm{bel}(\neg H) = \sum_{A \subseteq \neg H} m(A). \quad (46)$$

To express the plausibility of each hypothesis, the function $\mathrm{pl}: 2^\Theta \to [0, 1]$ over $\Theta$ is defined as

$$\mathrm{pl}(H) = 1 - \mathrm{dou}(H) = \sum_{A \cap H \neq \emptyset} m(A). \quad (47)$$

Intuitively, plausibility indicates that there is less uncertainty in hypothesis $H$ if it is more plausible. The confidence interval $[\mathrm{bel}(H), \mathrm{pl}(H)]$ defines the true belief in hypothesis $H$. To combine the effects of two mass functions $m_1$ and $m_2$, the Dempster-Shafer theory defines the combination rule $m_1 \oplus m_2$ as

$$(m_1 \oplus m_2)(\emptyset) = 0,$$
$$(m_1 \oplus m_2)(H) = \frac{\sum_{X \cap Y = H} m_1(X)\, m_2(Y)}{1 - \sum_{X \cap Y = \emptyset} m_1(X)\, m_2(Y)}. \quad (48)$$
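A minimal sketch of the combination rule (48) over a two-element frame of discernment is given below; the frame and the two mass assignments are illustrative assumptions.

```python
# Dempster's combination rule (48) with hypotheses encoded as frozensets.
def combine(m1, m2):
    fused, conflict = {}, 0.0
    for x, mx in m1.items():
        for y, my in m2.items():
            inter = x & y
            if inter:
                fused[inter] = fused.get(inter, 0.0) + mx * my
            else:
                conflict += mx * my   # mass falling on the empty set
    # Renormalize by 1 minus the total conflicting mass, as in (48)
    return {h: v / (1.0 - conflict) for h, v in fused.items()}

A, B, AB = frozenset("a"), frozenset("b"), frozenset("ab")
m1 = {A: 0.6, B: 0.1, AB: 0.3}   # evidence from source 1 (assumed)
m2 = {A: 0.5, B: 0.3, AB: 0.2}   # evidence from source 2 (assumed)
for h, v in combine(m1, m2).items():
    print(set(h), round(v, 3))
```

In this example the conflicting mass $\sum_{X \cap Y = \emptyset} m_1(X) m_2(Y)$ is $0.23$, and the remaining masses are renormalized by $1 - 0.23$, concentrating belief on the hypothesis supported by both sources.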

In contrast to Bayesian inference, a priori probabilities are not required in the Dempster-Shafer inference because they are assigned at the instant that the information is provided. Several studies in the literature have compared the use of Bayesian inference and Dempster-Shafer inference, such as [58-60]. Wu et al. [61] used the Dempster-Shafer theory to fuse information in context-aware environments. This work was extended in [62] to dynamically modify the weights associated with the sensor measurements; the fusion mechanism is therefore calibrated according to the recent measurements of the sensors (in cases in which the ground truth is available). In the military domain [63], Dempster-Shafer reasoning is used with a priori information stored in a database for classifying military ships. Morbee et al. [64] described the use of the Dempster-Shafer theory to build 2D occupancy maps from several cameras and to evaluate the contribution of subsets of cameras to a specific task. Each task is the observation of an event of interest, and the goal is to assess the validity of a set of hypotheses that are fused using the Dempster-Shafer theory.

5.3. Abductive Reasoning. Abductive reasoning, or inference to the best explanation, is a reasoning method in which a hypothesis is chosen under the assumption that, if it is true, it explains the observed event most accurately [65]. In other words, when an event is observed, the abduction method attempts to find the best explanation.

In the context of probabilistic reasoning, abductive inference finds the posterior maximum likelihood of the system variables given some observed variables. Abductive reasoning is more a reasoning pattern than a data fusion technique; therefore, different inference methods, such as NNs [66] or fuzzy logic [67], can be employed.

5.4. Semantic Methods. Decision fusion techniques that take semantic data from different sources as an input can provide more accurate results than those that rely on only single sources. There is a growing interest in techniques that automatically determine the presence of semantic features in videos to bridge the semantic gap [68].

Semantic information fusion is essentially a scheme in which raw sensor data are processed such that the nodes exchange only the resultant semantic information. Semantic information fusion typically covers two phases: (i) building the knowledge base and (ii) pattern matching (inference). The first phase (typically offline) incorporates the most appropriate knowledge into semantic information. Then, the second phase (typically online or in real time) fuses the relevant attributes and provides a semantic interpretation of the sensor data [69-71].

Semantic fusion can be viewed as a means of integrating and translating sensor data into formal languages. The resulting language obtained from the observations of the environment is then compared with similar languages that are stored in a database. The key to this strategy is that similar behaviors represented by formal languages are also semantically similar. This type of method provides savings in transmission cost because the nodes need only transmit the formal language structure instead of the raw data. However, a known set of behaviors must be stored in a database in advance, which might be difficult in some scenarios.

6. Conclusions

This paper reviews the most popular methods and techniques for performing data/information fusion. To determine whether the application of data/information fusion methods is feasible, we must evaluate the computational cost of the process and the delay introduced in the communication. A centralized data fusion approach is theoretically optimal when there is no transmission cost and there are sufficient computational resources; however, this situation typically does not hold in practical applications.

The selection of the most appropriate technique depends on the type of problem and the established assumptions of each technique. Statistical data fusion methods (e.g., PDA, JPDA, MHT, and Kalman) are optimal under specific conditions [72]. First, the assumption that the targets are moving independently and that the measurements are normally distributed around the predicted position typically does not hold. Second, because the statistical techniques model all of the events as probabilities, they typically have several parameters and a priori probabilities for false measurements and detection errors that are often difficult to obtain (at least in an optimal sense). For example, in the case of the MHT algorithm, specific parameters must be established that are nontrivial to determine and are very sensitive [73]. Moreover, statistical methods that optimize over several frames are computationally intensive, and their complexity typically grows exponentially with the number of targets. For example, in the case of particle filters, several targets can be tracked jointly as a group or individually. If several targets are tracked jointly, the necessary number of particles grows exponentially; therefore, in practice, it is better to track them individually under the assumption that the targets do not interact between the particles.

In contrast to centralized systems, distributed data fusion methods introduce some challenges in the data fusion process, such as (i) spatial and temporal alignment of the information, (ii) out-of-sequence measurements, and (iii) data correlation, as reported by Castanedo et al. [74, 75]. The inherent redundancy of distributed systems could be exploited with distributed reasoning techniques and cooperative algorithms to improve the individual node estimations, as reported by Castanedo et al. [76]. In addition to the previous studies, a new trend based on the geometric notion of a low-dimensional manifold is gaining attention in the data fusion community. An example is the work of Davenport et al. [77], which proposes a simple model that captures the correlation between the sensor observations by matching the parameter values for the different obtained manifolds.

Acknowledgments

The author would like to thank Jesús García, Miguel A. Patricio, and James Llinas for their interesting and related discussions on several topics that were presented in this paper.

References

[1] JDL, Data Fusion Lexicon, Technical Panel For C3, F. E. White, San Diego, Calif, USA, Code 420, 1991.

[2] D. L. Hall and J. Llinas, "An introduction to multisensor data fusion," Proceedings of the IEEE, vol. 85, no. 1, pp. 6–23, 1997.

[3] H. F. Durrant-Whyte, "Sensor models and multisensor integration," International Journal of Robotics Research, vol. 7, no. 6, pp. 97–113, 1988.

[4] B. V. Dasarathy, "Sensor fusion potential exploitation—innovative architectures and illustrative applications," Proceedings of the IEEE, vol. 85, no. 1, pp. 24–38, 1997.

[5] R. C. Luo, C.-C. Yih, and K. L. Su, "Multisensor fusion and integration: approaches, applications, and future research directions," IEEE Sensors Journal, vol. 2, no. 2, pp. 107–119, 2002.

[6] J. Llinas, C. Bowman, G. Rogova, A. Steinberg, E. Waltz, and F. White, "Revisiting the JDL data fusion model II," Technical Report, DTIC Document, 2004.

[7] E. P. Blasch and S. Plano, "JDL level 5 fusion model 'user refinement' issues and applications in group tracking," in Proceedings of Signal Processing, Sensor Fusion, and Target Recognition XI, pp. 270–279, April 2002.

[8] H. F. Durrant-Whyte and M. Stevens, "Data fusion in decentralized sensing networks," in Proceedings of the 4th International Conference on Information Fusion, pp. 302–307, Montreal, Canada, 2001.

[9] J. Manyika and H. Durrant-Whyte, Data Fusion and Sensor Management: A Decentralized Information-Theoretic Approach, Prentice Hall, Upper Saddle River, NJ, USA, 1995.

[10] S. S. Blackman, "Association and fusion of multiple sensor data," in Multitarget-Multisensor Tracking: Advanced Applications, pp. 187–217, Artech House, 1990.

[11] S. Lloyd, "Least squares quantization in PCM," IEEE Transactions on Information Theory, vol. 28, no. 2, pp. 129–137, 1982.

[12] M. Shindler, A. Wong, and A. Meyerson, "Fast and accurate k-means for large datasets," in Proceedings of the 25th Annual Conference on Neural Information Processing Systems (NIPS '11), pp. 2375–2383, December 2011.

[13] Y. Bar-Shalom and E. Tse, "Tracking in a cluttered environment with probabilistic data association," Automatica, vol. 11, no. 5, pp. 451–460, 1975.

[14] T. E. Fortmann, Y. Bar-Shalom, and M. Scheffe, "Multi-target tracking using joint probabilistic data association," in Proceedings of the 19th IEEE Conference on Decision and Control including the Symposium on Adaptive Processes, vol. 19, pp. 807–812, December 1980.

[15] D. B. Reid, "An algorithm for tracking multiple targets," IEEE Transactions on Automatic Control, vol. 24, no. 6, pp. 843–854, 1979.

[16] C. L. Morefield, "Application of 0-1 integer programming to multitarget tracking problems," IEEE Transactions on Automatic Control, vol. 22, no. 3, pp. 302–312, 1977.

[17] R. L. Streit and T. E. Luginbuhl, "Maximum likelihood method for probabilistic multihypothesis tracking," in Proceedings of Signal and Data Processing of Small Targets, vol. 2235 of Proceedings of SPIE, p. 394, 1994.

[18] I. J. Cox and S. L. Hingorani, "Efficient implementation of Reid's multiple hypothesis tracking algorithm and its evaluation for the purpose of visual tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 2, pp. 138–150, 1996.

[19] K. G. Murty, "An algorithm for ranking all the assignments in order of increasing cost," Operations Research, vol. 16, no. 3, pp. 682–687, 1968.

[20] M. E. Liggins II, C.-Y. Chong, I. Kadar, et al., "Distributed fusion architectures and algorithms for target tracking," Proceedings of the IEEE, vol. 85, no. 1, pp. 95–106, 1997.

[21] S. Coraluppi, C. Carthel, M. Luettgen, and S. Lynch, "All-source track and identity fusion," in Proceedings of the National Symposium on Sensor and Data Fusion, 2000.

[22] P. Storms and F. Spieksma, "An LP-based algorithm for the data association problem in multitarget tracking," in Proceedings of the 3rd IEEE International Conference on Information Fusion, vol. 1, 2000.

[23] S.-W. Joo and R. Chellappa, "A multiple-hypothesis approach for multiobject visual tracking," IEEE Transactions on Image Processing, vol. 16, no. 11, pp. 2849–2854, 2007.

[24] S. Coraluppi and C. Carthel, "Aggregate surveillance: a cardinality tracking approach," in Proceedings of the 14th International Conference on Information Fusion (FUSION '11), July 2011.

[25] K. C. Chang, C. Y. Chong, and Y. Bar-Shalom, "Joint probabilistic data association in distributed sensor networks," IEEE Transactions on Automatic Control, vol. 31, no. 10, pp. 889–897, 1986.

[26] Y. Chong, S. Mori, and K. C. Chang, "Information fusion in distributed sensor networks," in Proceedings of the 4th American Control Conference, Boston, Mass, USA, June 1985.

[27] Y. Chong, S. Mori, and K. C. Chang, "Distributed multitarget multisensor tracking," in Multitarget-Multisensor Tracking: Advanced Applications, vol. 1, pp. 247–295, 1990.

[28] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, San Mateo, Calif, USA, 1988.

[29] D. Koller and N. Friedman, Probabilistic Graphical Models: Principles and Techniques, MIT Press, 2009.

[30] L. Chen, M. Cetin, and A. S. Willsky, "Distributed data association for multi-target tracking in sensor networks," in Proceedings of the 7th International Conference on Information Fusion (FUSION '05), pp. 9–16, July 2005.

[31] L. Chen, M. J. Wainwright, M. Cetin, and A. S. Willsky, "Data association based on optimization in graphical models with application to sensor networks," Mathematical and Computer Modelling, vol. 43, no. 9-10, pp. 1114–1135, 2006.

[32] Y. Weiss and W. T. Freeman, "On the optimality of solutions of the max-product belief-propagation algorithm in arbitrary graphs," IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 736–744, 2001.

[33] C. Brown, H. Durrant-Whyte, J. Leonard, B. Rao, and B. Steer, "Distributed data fusion using Kalman filtering: a robotics application," in Data Fusion in Robotics and Machine Intelligence, M. A. Abidi and R. C. Gonzalez, Eds., pp. 267–309, 1992.

[34] R. E. Kalman, "A new approach to linear filtering and prediction problems," Journal of Basic Engineering, vol. 82, no. 1, pp. 35–45, 1960.

[35] R. C. Luo and M. G. Kay, "Data fusion and sensor integration: state-of-the-art 1990s," in Data Fusion in Robotics and Machine Intelligence, pp. 7–135, 1992.

[36] G. Welch and G. Bishop, An Introduction to the Kalman Filter, ACM SIGGRAPH 2001 Course Notes, 2001.

[37] S. J. Julier and J. K. Uhlmann, "A new extension of the Kalman filter to nonlinear systems," in Proceedings of the International Symposium on Aerospace/Defense Sensing, Simulation and Controls, vol. 3, 1997.

[38] E. A. Wan and R. Van Der Merwe, "The unscented Kalman filter for nonlinear estimation," in Proceedings of the Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC '00), pp. 153–158, 2000.

[39] D. Crisan and A. Doucet, "A survey of convergence results on particle filtering methods for practitioners," IEEE Transactions on Signal Processing, vol. 50, no. 3, pp. 736–746, 2002.

[40] J. Martinez-del Rincon, C. Orrite-Urunuela, and J. E. Herrero-Jaraba, "An efficient particle filter for color-based tracking in complex scenes," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance, pp. 176–181, 2007.

[41] S. Ganeriwal, R. Kumar, and M. B. Srivastava, "Timing-sync protocol for sensor networks," in Proceedings of the 1st International Conference on Embedded Networked Sensor Systems (SenSys '03), pp. 138–149, November 2003.

[42] M. Manzo, T. Roosta, and S. Sastry, "Time synchronization in networks," in Proceedings of the 3rd ACM Workshop on Security of Ad Hoc and Sensor Networks (SASN '05), pp. 107–116, November 2005.

[43] J. K. Uhlmann, "Covariance consistency methods for fault-tolerant distributed data fusion," Information Fusion, vol. 4, no. 3, pp. 201–215, 2003.

[44] A. S. Bashi, V. P. Jilkov, X. R. Li, and H. Chen, "Distributed implementations of particle filters," in Proceedings of the 6th International Conference of Information Fusion, pp. 1164–1171, 2003.

[45] M. Coates, "Distributed particle filters for sensor networks," in Proceedings of the 3rd International Symposium on Information Processing in Sensor Networks (ACM '04), pp. 99–107, New York, NY, USA, 2004.

[46] D. Gu, "Distributed particle filter for target tracking," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '07), pp. 3856–3861, April 2007.

[47] Y. Bar-Shalom, "Update with out-of-sequence measurements in tracking: exact solution," IEEE Transactions on Aerospace and Electronic Systems, vol. 38, no. 3, pp. 769–778, 2002.

[48] M. Orton and A. Marrs, "A Bayesian approach to multi-target tracking and data fusion with out-of-sequence measurements," IEE Colloquium, no. 174, pp. 151–155, 2001.

[49] M. L. Hernandez, A. D. Marrs, S. Maskell, and M. R. Orton, "Tracking and fusion for wireless sensor networks," in Proceedings of the 5th International Conference on Information Fusion, 2002.

[50] P. C. Mahalanobis, "On the generalized distance in statistics," Proceedings of the National Institute of Science, India, vol. 2, no. 1, pp. 49–55, 1936.

[51] S. J. Julier, J. K. Uhlmann, and D. Nicholson, "A method for dealing with assignment ambiguity," in Proceedings of the American Control Conference (ACC '04), vol. 5, pp. 4102–4107, July 2004.

[52] H. Pan, Z.-P. Liang, T. J. Anastasio, and T. S. Huang, "Hybrid NN-Bayesian architecture for information fusion," in Proceedings of the International Conference on Image Processing (ICIP '98), pp. 368–371, October 1998.

[53] C. Coue, T. Fraichard, P. Bessiere, and E. Mazer, "Multi-sensor data fusion using Bayesian programming: an automotive application," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 141–146, October 2002.

[54] D. L. Hall and J. Llinas, Handbook of Multisensor Data Fusion, CRC Press, Boca Raton, Fla, USA, 2001.

[55] A. P. Dempster, "A generalization of Bayesian inference," Journal of the Royal Statistical Society B, vol. 30, no. 2, pp. 205–247, 1968.

[56] G. Shafer, A Mathematical Theory of Evidence, Princeton University Press, Princeton, NJ, USA, 1976.

[57] G. M. Provan, "The validity of Dempster-Shafer belief functions," International Journal of Approximate Reasoning, vol. 6, no. 3, pp. 389–399, 1992.

[58] D. M. Buede, "Shafer-Dempster and Bayesian reasoning: a response to 'Shafer-Dempster reasoning with applications to multisensor target identification systems'," IEEE Transactions on Systems, Man, and Cybernetics, vol. 18, no. 6, pp. 1009–1011, 1988.

[59] Y. Cheng and R. L. Kashyap, "Comparison of Bayesian and Dempster's rules in evidence combination," in Maximum-Entropy and Bayesian Methods in Science and Engineering, 1988.

[60] B. R. Cobb and P. P. Shenoy, "A comparison of Bayesian and belief function reasoning," Information Systems Frontiers, vol. 5, no. 4, pp. 345–358, 2003.

[61] H. Wu, M. Siegel, R. Stiefelhagen, and J. Yang, "Sensor fusion using Dempster-Shafer theory," in Proceedings of the 19th IEEE Instrumentation and Measurement Technology Conference (IMTC '02), pp. 7–11, May 2002.

[62] H. Wu, M. Siegel, and S. Ablay, "Sensor fusion using Dempster-Shafer theory II: static weighting and Kalman filter-like dynamic weighting," in Proceedings of the 20th IEEE Instrumentation and Measurement Technology Conference (IMTC '03), pp. 907–912, May 2003.

[63] E. Bosse, P. Valin, A.-C. Boury-Brisset, and D. Grenier, "Exploitation of a priori knowledge for information fusion," Information Fusion, vol. 7, no. 2, pp. 161–175, 2006.

[64] M. Morbee, L. Tessens, H. Aghajan, and W. Philips, "Dempster-Shafer based multi-view occupancy maps," Electronics Letters, vol. 46, no. 5, pp. 341–343, 2010.

[65] C. S. Peirce, Abduction and Induction: Philosophical Writings of Peirce, vol. 156, Dover, New York, NY, USA, 1955.

[66] A. M. Abdelbar, E. A. M. Andrews, and D. C. Wunsch II, "Abductive reasoning with recurrent neural networks," Neural Networks, vol. 16, no. 5-6, pp. 665–673, 2003.

[67] J. R. Aguero and A. Vargas, "Inference of operative configuration of distribution networks using fuzzy logic techniques. Part II: extended real-time model," IEEE Transactions on Power Systems, vol. 20, no. 3, pp. 1562–1569, 2005.

[68] A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, "Content-based image retrieval at the end of the early years," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1349–1380, 2000.

[69] D. S. Friedlander and S. Phoha, "Semantic information fusion for coordinated signal processing in mobile sensor networks," International Journal of High Performance Computing Applications, vol. 16, no. 3, pp. 235–241, 2002.

[70] D. S. Friedlander, "Semantic information extraction," in Distributed Sensor Networks, 2005.

[71] K. Whitehouse, J. Liu, and F. Zhao, "Semantic Streams: a framework for composable inference over sensor data," in Proceedings of the 3rd European Workshop on Wireless Sensor Networks, Lecture Notes in Computer Science, Springer, February 2006.

[72] I. J. Cox, "A review of statistical data association techniques for motion correspondence," International Journal of Computer Vision, vol. 10, no. 1, pp. 53–66, 1993.

[73] C. J. Veenman, M. J. T. Reinders, and E. Backer, "Resolving motion correspondence for densely moving points," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 1, pp. 54–72, 2001.

[74] F. Castanedo, M. A. Patricio, J. García, and J. M. Molina, "Bottom-up/top-down coordination in a multiagent visual sensor network," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS '07), pp. 93–98, September 2007.

[75] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Analysis of distributed fusion alternatives in coordinated vision agents," in Proceedings of the 11th International Conference on Information Fusion (FUSION '08), July 2008.

[76] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Data fusion to improve trajectory tracking in a cooperative surveillance multi-agent architecture," Information Fusion, vol. 11, no. 3, pp. 243–255, 2010.

[77] M. A. Davenport, C. Hegde, M. F. Duarte, and R. G. Baraniuk, "Joint manifolds for data fusion," IEEE Transactions on Image Processing, vol. 19, no. 10, pp. 2580–2594, 2010.

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 12: Review Article A Review of Data Fusion Techniquesdownloads.hindawi.com/journals/tswj/2013/704504.pdf · Classification of Data Fusion Techniques Data fusion is a ... to extract features

12 The Scientific World Journal

which can be obtained from the analytical or empiricalmodels of the sensorsThis function expresses the probabilityof the observed data The main disadvantage of this methodin practice is that it requires the analytical or empirical modelof the sensor to be known to provide the prior distributionand compute the likelihood function This method can alsosystematically underestimate the variance of the distributionwhich leads to a bias problem However the bias of the MLsolution becomes less significant as the number 119873 of datapoints increases and is equal to the true variance of thedistribution that generated the data at the limit119873 rarr infin

The maximum posterior (MAP) method is based on theBayesian theory It is employed when the parameter 119909 tobe estimated is the output of a random variable that has aknown probability density function 119901(119909) In the context ofdata fusion 119909 is the state that is being estimated and 119911 =(119911(1) 119911(119896)) is a sequence of 119896 previous observations of 119909The MAP estimator finds the value of 119909 that maximizes theposterior probability distribution as follows

$\hat{x}(k) = \arg\max_{x} p(x \mid z)$. (22)

Both methods (ML and MAP) aim to find the most likely value for the state $x$. However, ML assumes that $x$ is a fixed but unknown point of the parameter space, whereas MAP considers $x$ to be the output of a random variable with a known a priori probability density function. The two methods are equivalent when there is no a priori information about $x$, that is, when only the observations are available.
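As a brief illustration of this difference, the following sketch (a hypothetical numerical example, assuming scalar Gaussian noise and a Gaussian prior; all variable names and values are illustrative, not taken from the text) contrasts the two estimators on the same observations:

```python
import numpy as np

rng = np.random.default_rng(0)

# True (unknown) state and N noisy observations z(1), ..., z(N).
x_true, sigma = 2.0, 0.5
z = x_true + sigma * rng.normal(size=20)

# ML estimate: x is a fixed unknown, so maximizing the Gaussian
# likelihood reduces to the sample mean of the observations.
x_ml = z.mean()

# MAP estimate: x is a random variable with a known prior p(x) = N(mu0, s0^2);
# for a Gaussian prior and likelihood the posterior mode is a
# precision-weighted combination of prior mean and observations.
mu0, s0 = 0.0, 1.0
prec = len(z) / sigma**2 + 1.0 / s0**2
x_map = (z.sum() / sigma**2 + mu0 / s0**2) / prec

print(x_ml, x_map)  # as the prior flattens (s0 -> inf), x_map -> x_ml
```

With a flat prior the MAP estimate collapses to the ML estimate, matching the equivalence noted above.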

4.2. The Kalman Filter. The Kalman filter is the most popular estimation technique. It was originally proposed by Kalman [34] and has been widely studied and applied since then. The Kalman filter estimates the state $x$ of a discrete-time process governed by the following space-time model:

$x(k+1) = \Phi(k)\,x(k) + G(k)\,u(k) + w(k)$, (23)

with the observations or measurements $z$ at time $k$ of the state $x$ represented by

$z(k) = H(k)\,x(k) + v(k)$, (24)

where $\Phi(k)$ is the state transition matrix, $G(k)$ is the input transition matrix, $u(k)$ is the input vector, $H(k)$ is the measurement matrix, and $w$ and $v$ are random Gaussian variables with zero mean and covariance matrices $Q(k)$ and $R(k)$, respectively. Based on the measurements and on the system parameters, the estimation of $x(k)$, which is represented by $\hat{x}(k)$, and the prediction of $x(k+1)$, which is represented by $\hat{x}(k+1 \mid k)$, are given by the following:

$\hat{x}(k) = \hat{x}(k \mid k-1) + K(k)\left[z(k) - H(k)\,\hat{x}(k \mid k-1)\right]$,
$\hat{x}(k+1 \mid k) = \Phi(k)\,\hat{x}(k \mid k) + G(k)\,u(k)$, (25)

respectively, where $K$ is the filter gain, determined by

$K(k) = P(k \mid k-1)\,H^{T}(k)\left[H(k)\,P(k \mid k-1)\,H^{T}(k) + R(k)\right]^{-1}$, (26)

where $P(k \mid k-1)$ is the prediction covariance matrix, which can be determined by

$P(k+1 \mid k) = \Phi(k)\,P(k)\,\Phi^{T}(k) + Q(k)$, (27)

with

$P(k) = P(k \mid k-1) - K(k)\,H(k)\,P(k \mid k-1)$. (28)

The Kalman filter is mainly employed to fuse low-level data. If the system can be described by a linear model and the error can be modeled as Gaussian noise, then the recursive Kalman filter obtains optimal statistical estimations [35]. However, other methods are required to address nonlinear dynamic models and nonlinear measurements. The modified Kalman filter known as the extended Kalman filter (EKF) is a standard approach for implementing nonlinear recursive filters [36]. The EKF is one of the most frequently employed methods for fusing data in robotic applications. However, it has some disadvantages because the computation of the Jacobians is extremely expensive. Some attempts have been made to reduce the computational cost, such as linearization, but these attempts introduce errors in the filter and make it unstable.
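To make equations (23)-(28) concrete, the following minimal sketch (in Python with NumPy; the constant-velocity model and all numeric values are illustrative assumptions, not taken from the text) implements one predict/update cycle of the linear Kalman filter:

```python
import numpy as np

def kalman_step(x, P, z, Phi, G, u, H, Q, R):
    """One predict/update cycle following equations (23)-(28)."""
    # Prediction (25), (27): x(k+1|k) = Phi x(k|k) + G u(k), P = Phi P Phi^T + Q.
    x_pred = Phi @ x + G @ u
    P_pred = Phi @ P @ Phi.T + Q
    # Gain (26): K = P_pred H^T (H P_pred H^T + R)^{-1}.
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    # Update (25), (28): fold in the measurement z.
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = P_pred - K @ H @ P_pred
    return x_new, P_new

# Toy 1D constant-velocity example (state = [position, velocity]).
dt = 1.0
Phi = np.array([[1.0, dt], [0.0, 1.0]])
G = np.zeros((2, 1)); u = np.zeros(1)      # no control input
H = np.array([[1.0, 0.0]])                 # only position is observed
Q = 0.01 * np.eye(2); R = np.array([[0.25]])
x, P = np.zeros(2), np.eye(2)
for z in [1.1, 2.0, 2.9, 4.2]:             # noisy position measurements
    x, P = kalman_step(x, P, np.array([z]), Phi, G, u, H, Q, R)
print(x)  # estimated position and velocity
```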

The unscented Kalman filter (UKF) [37] has gained popularity because it avoids the linearization step of the EKF and the associated errors [38]. The UKF employs a deterministic sampling strategy to establish a minimal set of points around the mean. This set of points captures the true mean and covariance completely. These points are then propagated through the nonlinear functions, and the covariance of the estimations can be recovered. Another advantage of the UKF is that it lends itself to parallel implementations.

4.3. Particle Filter. Particle filters are recursive implementations of sequential Monte Carlo methods [39]. This method builds the posterior density function using several random samples called particles. Particles are propagated over time through a combination of sampling and resampling steps. At each iteration, the sampling step is employed to discard some particles, increasing the relevance of regions with a higher posterior probability. In the filtering process, several particles of the same state variable are employed, and each particle has an associated weight that indicates its quality. Therefore, the estimation is the result of a weighted sum of all of the particles. The standard particle filter algorithm has two phases: (1) the prediction phase and (2) the updating phase. In the prediction phase, each particle is modified according to the existing model, adding random noise to simulate the noise effect. Then, in the updating phase, the weight of each particle is reevaluated using the last available sensor observation, and particles with lower weights are removed. Specifically, a generic particle filter comprises the following steps.

(1) Initialization of the particles:

(i) let $N$ be the number of particles;
(ii) $X^{(i)}(1) = [x(1), y(1), 0, 0]^{T}$ for $i = 1, \ldots, N$.

(2) Prediction step:

(i) for each particle $i = 1, \ldots, N$, evaluate the state $(k+1 \mid k)$ of the system using the state at time instant $k$ and the system noise at time $k$:

$X^{(i)}(k+1 \mid k) = F(k)\,X^{(i)}(k) + (\text{cauchy-distribution-noise})(k)$, (29)

where $F(k)$ is the transition matrix of the system.

(3) Evaluate the particle weights. For each particle $i = 1, \ldots, N$:

(i) compute the predicted observation of the system using the current predicted state and the measurement noise at instant $k$:

$\hat{z}^{(i)}(k+1 \mid k) = H(k+1)\,X^{(i)}(k+1 \mid k) + (\text{gaussian-measurement-noise})(k+1)$; (30)

(ii) compute the likelihood (weights) according to the given distribution:

$\text{likelihood}^{(i)} = N\left(\hat{z}^{(i)}(k+1 \mid k);\, z(k+1),\, \text{var}\right)$; (31)

(iii) normalize the weights as follows:

$w^{(i)} = \dfrac{\text{likelihood}^{(i)}}{\sum_{j=1}^{N} \text{likelihood}^{(j)}}$. (32)

(4) Resampling/selection: multiply particles with higher weights and remove those with lower weights. The current state must be adjusted using the computed weights of the new particles.

(i) Compute the cumulative weights: $\text{CumWt}^{(i)} = \sum_{j=1}^{i} w^{(j)}$. (33)

(ii) Generate uniformly distributed random variables $U^{(i)} \sim U(0, 1)$, with the number of draws equal to the number of particles.

(iii) Determine which particles should be multiplied and which ones removed.

(5) Propagation phase:

(i) incorporate the new values of the state after the resampling at instant $k$ to calculate the value at instant $k+1$:

$x^{(1,\ldots,N)}(k+1 \mid k+1) = x(k+1 \mid k)$; (34)

(ii) compute the posterior mean:

$\hat{x}(k+1) = \text{mean}\left[x^{(i)}(k+1 \mid k+1)\right], \quad i = 1, \ldots, N$; (35)

(iii) repeat steps 2 to 5 for each time instant.
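The steps above can be condensed into a short bootstrap particle filter sketch (Python/NumPy; the constant-velocity model, noise scales, and particle count are illustrative assumptions, not taken from the text):

```python
import numpy as np

rng = np.random.default_rng(1)

def particle_filter_step(X, z, F, H, meas_var, scale=0.1):
    """One iteration of the generic particle filter described above."""
    N = X.shape[0]
    # (2) Prediction: propagate each particle through the model plus
    # heavy-tailed (Cauchy) process noise, as in equation (29).
    X_pred = X @ F.T + scale * rng.standard_cauchy(size=X.shape)
    # (3) Weights: Gaussian likelihood of the measurement z, eqs. (30)-(32).
    z_pred = X_pred @ H.T
    lik = np.exp(-0.5 * np.sum((z - z_pred) ** 2, axis=1) / meas_var)
    w = lik / lik.sum()
    # (4) Resampling via the cumulative weights, eq. (33): particles with
    # higher weights are multiplied, those with lower weights removed.
    cum_w = np.cumsum(w)
    idx = np.searchsorted(cum_w, rng.uniform(size=N))
    X_new = X_pred[idx]
    # (5) Posterior mean, eq. (35).
    return X_new, X_new.mean(axis=0)

# Toy 1D constant-velocity tracking (state = [position, velocity]).
F = np.array([[1.0, 1.0], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
X = np.zeros((500, 2))                     # (1) initialization of the particles
for z in [1.0, 2.1, 2.9, 4.0]:
    X, x_hat = particle_filter_step(X, np.array([z]), F, H, meas_var=0.25)
print(x_hat)
```

Sampling indices from the cumulative weights implements multinomial resampling for step 4; systematic resampling is a common lower-variance alternative.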

Particle filters are more flexible than Kalman filters and can cope with nonlinear dependencies and non-Gaussian densities in the dynamic model and in the noise error. However, they have some disadvantages. A large number of particles are required to obtain a small variance in the estimator. It is also difficult to establish the optimal number of particles in advance, and the number of particles affects the computational cost significantly. Earlier versions of particle filters employed a fixed number of particles, but recent studies have started to use a dynamic number of particles [40].

4.4. The Distributed Kalman Filter. The distributed Kalman filter requires correct clock synchronization between the sources, as demonstrated in [41]. In other words, to correctly use the distributed Kalman filter, the clocks of all of the sources must be synchronized. This synchronization is typically achieved using protocols that employ a shared global clock, such as the network time protocol (NTP). Synchronization problems between clocks have been shown to affect the accuracy of the Kalman filter, producing inaccurate estimations [42].

If the estimations are consistent and the cross-covariance is known (or the estimations are uncorrelated), then it is possible to use distributed Kalman filters [43]. However, the cross-covariance must be determined exactly, or the observations must be consistent.

We refer the reader to Liggins II et al. [20] for more details about the Kalman filter in a distributed and hierarchical architecture.

4.5. Distributed Particle Filter. Distributed particle filters have gained attention recently [44–46]. Coates [45] used a distributed particle filter to monitor an environment that could be captured by a Markovian state-space model involving nonlinear dynamics and observations and non-Gaussian noise.

In contrast, earlier attempts to handle out-of-sequence measurements using particle filters are based on regenerating the probability density function at the time instant of the out-of-sequence measurement [47]. In a particle filter, this step incurs a large computational cost, in addition to the space necessary to store the previous particles. To avoid this problem, Orton and Marrs [48] proposed storing the information on the particles at each time instant, saving the cost of recalculating this information. This technique is close to optimal, and when the delay increases, the result is only slightly affected [49]. However, it requires a very large amount of space to store the state of the particles at each time instant.

4.6. Covariance Consistency Methods: Covariance Intersection/Union. Covariance consistency methods (intersection and union) were proposed by Uhlmann [43] and are general and fault-tolerant frameworks for maintaining covariance means and estimations in a distributed network. These methods are not estimation techniques in themselves; rather, they act as an estimation fusion technique. The distributed Kalman filter requirement of independent measurements or known cross-covariances is not a constraint for these methods.

4.6.1. Covariance Intersection. If the Kalman filter is employed to combine two estimations $(a_1, A_1)$ and $(a_2, A_2)$, then it is assumed that the joint covariance has the following form:

$\begin{bmatrix} A_1 & X \\ X^{T} & A_2 \end{bmatrix}$, (36)

where the cross-covariance $X$ must be known exactly so that the Kalman filter can be applied without difficulty. Because the computation of the cross-covariances is computationally intensive, Uhlmann [43] proposed the covariance intersection (CI) algorithm.

Let us assume that a joint covariance $M$ can be defined with the diagonal blocks $M_{A_1} > A_1$ and $M_{A_2} > A_2$. Consider

$M \geq \begin{bmatrix} A_1 & X \\ X^{T} & A_2 \end{bmatrix}$ (37)

for every possible instance of the unknown cross-covariance $X$; then the components of the matrix $M$ can be employed in the Kalman filter equations to provide a fused estimation $(c, C)$ that is considered consistent. The key point of this method relies on generating a joint covariance matrix $M$ that can represent a useful fused estimation (in this context, useful refers to an estimation with a lower associated uncertainty). In summary, the CI algorithm computes the joint covariance matrix $M$ for which the Kalman filter provides the best fused estimation $(c, C)$ with respect to a fixed measurement of the covariance matrix (i.e., the minimum determinant).

Specific covariance criteria must be established because there is no unique minimum joint covariance in the order of the positive semidefinite matrices. Moreover, the joint covariance is the basis of the formal analysis of the CI algorithm; the actual result is a nonlinear mixture of the information stored in the estimations being fused, according to the following equation:

$C = \left(w_1 H_1^{T} A_1^{-1} H_1 + w_2 H_2^{T} A_2^{-1} H_2 + \cdots + w_n H_n^{T} A_n^{-1} H_n\right)^{-1}$,
$c = C\left(w_1 H_1^{T} A_1^{-1} a_1 + w_2 H_2^{T} A_2^{-1} a_2 + \cdots + w_n H_n^{T} A_n^{-1} a_n\right)$, (38)

where $H_i$ is the transformation of the fused state-space estimation to the space of the estimated state $i$. The values of $w$ can be calculated to minimize the covariance determinant using convex optimization packages and semidefinite programming. The result of the CI algorithm has different characteristics compared to the Kalman filter. For example, if two estimations $(a, A)$ and $(b, B)$ are provided and their covariances are equal, $A = B$, then the Kalman filter, which is based on the statistical independence assumption, produces a fused estimation with covariance $C = (1/2)A$. In contrast, the CI method does not assume independence and thus must remain consistent even in the case in which the estimations are completely correlated, giving an estimated fused covariance $C = A$. In the case of estimations where $A < B$, the CI algorithm does not take information from the estimation $(b, B)$; thus, the fused result is $(a, A)$.
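A minimal sketch of equation (38) for two estimates follows (Python/NumPy, assuming $H_i = I$ and choosing the weight $w$ by a simple grid search rather than a convex optimization package; all numeric values are illustrative):

```python
import numpy as np

def covariance_intersection(a1, A1, a2, A2, grid=200):
    """CI fusion of two estimates with unknown cross-covariance.

    Sketch of equation (38) with H_i = I; the scalar weight w is chosen
    by grid search over (0, 1) to minimize det(C)."""
    I1, I2 = np.linalg.inv(A1), np.linalg.inv(A2)
    best = None
    for w in np.linspace(0.01, 0.99, grid):
        C = np.linalg.inv(w * I1 + (1 - w) * I2)
        if best is None or np.linalg.det(C) < best[0]:
            c = C @ (w * I1 @ a1 + (1 - w) * I2 @ a2)
            best = (np.linalg.det(C), c, C)
    return best[1], best[2]

a1, A1 = np.array([1.0, 0.0]), np.diag([1.0, 4.0])
a2, A2 = np.array([1.5, 0.5]), np.diag([4.0, 1.0])
c, C = covariance_intersection(a1, A1, a2, A2)
print(c, np.linalg.det(C))  # consistent even if the inputs are correlated
```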

Every jointly consistent covariance is sufficient to produce a fused estimation that guarantees consistency. However, it is also necessary to guarantee a lack of divergence. Divergence is avoided in the CI algorithm by choosing a specific measurement (i.e., the determinant) that is minimized in each fusion operation. This measurement represents a nondivergence criterion because the size of the estimated covariance, according to this criterion, is never incremented.

The application of the CI method guarantees consistency and nondivergence for every sequence of mean- and covariance-consistent estimations. However, this method does not work well when the measurements to be fused are inconsistent.

4.6.2. Covariance Union. CI solves the problem of correlated inputs but not the problem of inconsistent inputs (inconsistent inputs refer to different estimations, each of which has a high accuracy (small variance) but also a large difference from the states of the others); thus, the covariance union (CU) algorithm was proposed to solve the latter [43]. CU addresses the following problem: two estimations $(a_1, A_1)$ and $(a_2, A_2)$ relate to the state of an object but are mutually inconsistent. This issue arises when the difference between the mean estimations is larger than the provided covariance. Inconsistent inputs can be detected using the Mahalanobis distance [50] between them, which is defined as

$M_d = (a_1 - a_2)^{T} (A_1 + A_2)^{-1} (a_1 - a_2)$ (39)

and detecting whether this distance is larger than a given threshold.

The Mahalanobis distance accounts for the covariance information when computing the distance. If the difference between the estimations is high but their covariances are also high, the Mahalanobis distance yields a small value. In contrast, if the difference between the estimations is small but the covariances are also small, it can produce a larger distance value. A high Mahalanobis distance can indicate that the estimations are inconsistent; however, a specific threshold is necessary, either established by the user or learned automatically.

The CU algorithm aims to solve the following problem: let us suppose that a filtering algorithm provides two observations with means and covariances $(a_1, A_1)$ and $(a_2, A_2)$, respectively. It is known that one of the observations is correct and the other is erroneous; however, the identity of the correct estimation is unknown and cannot be determined. In this situation, if both estimations are employed as input to the Kalman filter, there will be a problem, because the Kalman filter only guarantees a consistent output if the observation is updated with a measurement consistent with both of them. In the specific case in which the measurements correspond to the same object but are acquired from two different sensors, the Kalman filter can only guarantee that the output is consistent if it is consistent with both separately. Because it is not possible to know which estimation is correct, the only way to combine the two estimations rigorously is to provide an estimation $(u, U)$ that is consistent with both estimations and obeys the following properties:

$U \succeq A_1 + (u - a_1)(u - a_1)^{T}$,
$U \succeq A_2 + (u - a_2)(u - a_2)^{T}$, (40)

where some measurement of the size of the matrix $U$ (i.e., the determinant) is minimized.

In other words, the previous equations indicate that if the estimation $(a_1, A_1)$ is consistent, then translating the vector $a_1$ to $u$ requires increasing the covariance by adding a matrix at least as large as the outer product of $(u - a_1)$ with itself in order to remain consistent. The same applies to the measurement $(a_2, A_2)$.

A simple strategy is to choose the mean of one of the measurements as the fused mean ($u = a_1$). In this case, the value of $U$ must be chosen such that the estimation is consistent with the worst case (the correct measurement being $a_2$). However, it is possible to assign $u$ an intermediate value between $a_1$ and $a_2$ to decrease the value of $U$. Therefore, the CU algorithm establishes the fused mean $u$ with the smallest covariance $U$ that is still sufficiently large with respect to both measurements ($a_1$ and $a_2$) to guarantee consistency.

Because the matrix inequalities presented in the previous equations are convex, convex optimization algorithms must be employed to solve them. The value of $U$ can be computed with the iterative method described by Julier et al. [51]. The obtained covariance can be significantly larger than any of the initial covariances and is an indicator of the uncertainty existing between the initial estimations. One of the advantages of the CU method is that the same process can easily be extended to $N$ inputs.
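The following sketch (Python/NumPy) illustrates the CU idea under deliberately conservative assumptions: it gates the inputs with the Mahalanobis test of equation (39) and then takes $U = B_1 + B_2$, which satisfies the inequalities in (40) by construction but is larger than the minimal-determinant solution produced by the iterative method of Julier et al. [51]; all numeric values are illustrative:

```python
import numpy as np

def mahalanobis(a1, A1, a2, A2):
    """Inconsistency test of equation (39)."""
    d = a1 - a2
    return float(d @ np.linalg.inv(A1 + A2) @ d)

def covariance_union(a1, A1, a2, A2, grid=100):
    """Conservative CU sketch: slide u between the two means and take
    U = B1 + B2, which dominates both B_i = A_i + (u - a_i)(u - a_i)^T,
    so equation (40) holds by construction; det(U) is minimized over u."""
    best = None
    for t in np.linspace(0.0, 1.0, grid):
        u = (1 - t) * a1 + t * a2
        B1 = A1 + np.outer(u - a1, u - a1)
        B2 = A2 + np.outer(u - a2, u - a2)
        U = B1 + B2
        if best is None or np.linalg.det(U) < best[0]:
            best = (np.linalg.det(U), u, U)
    return best[1], best[2]

a1, A1 = np.array([0.0, 0.0]), 0.1 * np.eye(2)
a2, A2 = np.array([3.0, 0.0]), 0.1 * np.eye(2)   # accurate but far apart
if mahalanobis(a1, A1, a2, A2) > 9.0:            # threshold set by the user
    u, U = covariance_union(a1, A1, a2, A2)
    print(u, U)                                  # inflated, consistent output
```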

5. Decision Fusion Methods

A decision is typically taken based on the knowledge of the perceived situation, which, in the data fusion domain, is provided by many sources. Decision fusion techniques aim to make a high-level inference about the events and activities produced from the detected targets. These techniques often use symbolic information, and the fusion process requires reasoning while accounting for uncertainties and constraints. These methods fall under level 2 (situation assessment) and level 3 (impact assessment) of the JDL data fusion model.

5.1. The Bayesian Methods. Information fusion based on Bayesian inference provides a formalism for combining evidence according to the rules of probability theory. Uncertainty is represented using conditional probability terms that describe beliefs and take on values in the interval $[0, 1]$, where zero indicates a complete lack of belief and one indicates an absolute belief. Bayesian inference is based on the Bayes rule as follows:

$P(Y \mid X) = \dfrac{P(X \mid Y)\,P(Y)}{P(X)}$, (41)

where the posterior probability $P(Y \mid X)$ represents the belief in the hypothesis $Y$ given the information $X$. This probability is obtained by multiplying the a priori probability of the hypothesis, $P(Y)$, by the probability of observing $X$ given that $Y$ is true, $P(X \mid Y)$. The value $P(X)$ is used as a normalizing constant. The main disadvantage of Bayesian inference is that the probabilities $P(X)$ and $P(X \mid Y)$ must be known. To estimate the conditional probabilities, Pan et al. [52] proposed the use of NNs, whereas Coue et al. [53] proposed Bayesian programming.
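A minimal sketch of equation (41) for fusing two sensor reports over discrete hypotheses follows (Python/NumPy; the priors and likelihoods are illustrative numbers, and conditional independence of the sensors given $Y$ is an added assumption):

```python
import numpy as np

# Discrete hypotheses Y (e.g., target classes) with a priori probabilities P(Y).
prior = np.array([0.5, 0.3, 0.2])

# Likelihoods P(X | Y) of the observation reported by each of two sensors;
# the numbers are illustrative, not taken from a real sensor model.
lik_sensor_1 = np.array([0.1, 0.6, 0.3])
lik_sensor_2 = np.array([0.2, 0.7, 0.1])

# Bayes rule (41): posterior is proportional to likelihood times prior;
# assuming the sensors are conditionally independent given Y,
# their likelihoods multiply.
posterior = lik_sensor_1 * lik_sensor_2 * prior
posterior /= posterior.sum()   # P(X) acts as the normalizing constant

print(posterior)  # belief in each hypothesis after fusing both sensors
```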

Hall and Llinas [54] described the following problems associated with Bayesian inference:

(i) difficulty in establishing the value of a priori probabilities;

(ii) complexity when there are multiple potential hypotheses and a substantial number of condition-dependent events;

(iii) the requirement that the hypotheses be mutually exclusive;

(iv) difficulty in describing the uncertainty of the decisions.

5.2. The Dempster-Shafer Inference. The Dempster-Shafer inference is based on the mathematical theory introduced by Dempster [55] and Shafer [56], which generalizes the Bayesian theory. The Dempster-Shafer theory provides a formalism that can be used to represent incomplete knowledge, update beliefs, and combine evidence, and it allows the uncertainty to be represented explicitly [57].

A fundamental concept in Dempster-Shafer reasoning is the frame of discernment, which is defined as follows. Let $\Theta = \{\theta_1, \theta_2, \ldots, \theta_N\}$ be the set of all possible states that define the system, and let $\Theta$ be exhaustive and mutually exclusive, because the system can be in only one state $\theta_i \in \Theta$, where $1 \leq i \leq N$. The set $\Theta$ is called a frame of discernment because its elements are employed to discern the current state of the system.

The elements of the power set $2^{\Theta}$ are called hypotheses. In the Dempster-Shafer theory, based on the evidence $E$, a probability is assigned to each hypothesis $H \in 2^{\Theta}$ according to the basic probability assignment, or mass function, $m: 2^{\Theta} \to [0, 1]$, which satisfies

$m(\emptyset) = 0$. (42)


Thus, the mass function of the empty set is zero. Furthermore, the mass function of every hypothesis is greater than or equal to zero:

$m(H) \geq 0 \quad \forall H \in 2^{\Theta}$. (43)

The sum of the mass functions of all of the hypotheses is one:

$\sum_{H \in 2^{\Theta}} m(H) = 1$. (44)

To express incomplete beliefs in a hypothesis $H$, the Dempster-Shafer theory defines the belief function $\text{bel}: 2^{\Theta} \to [0, 1]$ over $\Theta$ as

$\text{bel}(H) = \sum_{A \subseteq H} m(A)$, (45)

where $\text{bel}(\emptyset) = 0$ and $\text{bel}(\Theta) = 1$. The level of doubt in $H$ can be expressed in terms of the belief function by

$\text{dou}(H) = \text{bel}(\neg H) = \sum_{A \subseteq \neg H} m(A)$. (46)

To express the plausibility of each hypothesis, the function $\text{pl}: 2^{\Theta} \to [0, 1]$ over $\Theta$ is defined as

$\text{pl}(H) = 1 - \text{dou}(H) = \sum_{A \cap H \neq \emptyset} m(A)$. (47)

Intuitively, plausibility indicates that there is less uncertainty in hypothesis $H$ when $H$ is more plausible. The confidence interval $[\text{bel}(H), \text{pl}(H)]$ defines the true belief in hypothesis $H$. To combine the effects of two mass functions $m_1$ and $m_2$, the Dempster-Shafer theory defines the combination rule $m_1 \oplus m_2$ as

$(m_1 \oplus m_2)(\emptyset) = 0$,

$(m_1 \oplus m_2)(H) = \dfrac{\sum_{X \cap Y = H} m_1(X)\, m_2(Y)}{1 - \sum_{X \cap Y = \emptyset} m_1(X)\, m_2(Y)}$. (48)
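A minimal sketch of the combination rule (48) follows (Python; the frame of discernment and the mass assignments are illustrative):

```python
from itertools import product

def dempster_combine(m1, m2):
    """Dempster's rule (48) for mass functions over frozenset hypotheses."""
    combined, conflict = {}, 0.0
    for (X, mx), (Y, my) in product(m1.items(), m2.items()):
        inter = X & Y
        if inter:
            combined[inter] = combined.get(inter, 0.0) + mx * my
        else:
            conflict += mx * my          # mass that falls on the empty set
    # Normalize by 1 minus the total conflicting mass, as in (48).
    return {H: v / (1.0 - conflict) for H, v in combined.items()}

# Frame of discernment Theta = {a, b, c}; the masses are illustrative.
m1 = {frozenset('a'): 0.6, frozenset('ab'): 0.3, frozenset('abc'): 0.1}
m2 = {frozenset('b'): 0.5, frozenset('ab'): 0.4, frozenset('abc'): 0.1}

fused = dempster_combine(m1, m2)
print(fused)   # fused belief mass over the surviving hypotheses
```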

In contrast to Bayesian inference, a priori probabilities are not required in the Dempster-Shafer inference because they are assigned at the instant that the information is provided. Several studies in the literature have compared the use of Bayesian inference and Dempster-Shafer inference, such as [58–60]. Wu et al. [61] used the Dempster-Shafer theory to fuse information in context-aware environments. This work was extended in [62] to dynamically modify the weights associated with the sensor measurements. Therefore, the fusion mechanism is calibrated according to the recent measurements of the sensors (in cases in which the ground truth is available). In the military domain, [63] used Dempster-Shafer reasoning with a priori information stored in a database to classify military ships. Morbee et al. [64] described the use of the Dempster-Shafer theory to build 2D occupancy maps from several cameras and to evaluate the contribution of subsets of cameras to a specific task. Each task is the observation of an event of interest, and the goal is to assess the validity of a set of hypotheses that are fused using the Dempster-Shafer theory.

5.3. Abductive Reasoning. Abductive reasoning, or inference to the best explanation, is a reasoning method in which a hypothesis is chosen under the assumption that, if true, it explains the observed event most accurately [65]. In other words, when an event is observed, the abduction method attempts to find the best explanation.

In the context of probabilistic reasoning, abductive inference finds the maximum a posteriori configuration of the system variables given some observed variables. Abductive reasoning is more a reasoning pattern than a data fusion technique. Therefore, different inference methods, such as NNs [66] or fuzzy logic [67], can be employed.

5.4. Semantic Methods. Decision fusion techniques that take semantic data from different sources as input can provide more accurate results than those that rely on single sources alone. There is a growing interest in techniques that automatically determine the presence of semantic features in videos to bridge the semantic gap [68].

Semantic information fusion is essentially a scheme in which raw sensor data are processed such that the nodes exchange only the resulting semantic information. Semantic information fusion typically covers two phases: (i) building the knowledge base and (ii) pattern matching (inference). The first phase (typically offline) incorporates the most appropriate knowledge into semantic information. The second phase (typically online or in real time) fuses the relevant attributes and provides a semantic interpretation of the sensor data [69–71].

Semantic fusion can be viewed as a means of integrating and translating sensor data into formal languages. The language obtained from the observations of the environment is then compared with similar languages stored in a database. The key to this strategy is that behaviors represented by similar formal languages are also semantically similar. This type of method reduces transmission costs because the nodes need only transmit the formal language structure instead of the raw data. However, a known set of behaviors must be stored in a database in advance, which might be difficult in some scenarios.

6. Conclusions

This paper reviews the most popular methods and techniques for performing data/information fusion. To determine whether the application of data/information fusion methods is feasible, we must evaluate the computational cost of the process and the delay introduced in the communication. A centralized data fusion approach is theoretically optimal when there is no transmission cost and there are sufficient computational resources; however, this situation typically does not hold in practical applications.

The selection of the most appropriate technique depends on the type of problem and the assumptions established by each technique. Statistical data fusion methods (e.g., PDA, JPDA, MHT, and the Kalman filter) are optimal only under specific conditions [72]. First, the assumption that the targets are moving independently and that the measurements are normally distributed around the predicted position typically does not hold. Second, because the statistical techniques model all of the events as probabilities, they typically have several parameters and a priori probabilities for false measurements and detection errors that are often difficult to obtain (at least in an optimal sense). For example, in the case of the MHT algorithm, specific parameters must be established that are nontrivial to determine and are very sensitive [73]. Moreover, statistical methods that optimize over several frames are computationally intensive, and their complexity typically grows exponentially with the number of targets. For example, in the case of particle filters, several targets can be tracked jointly as a group or individually. If several targets are tracked jointly, the necessary number of particles grows exponentially. Therefore, in practice, it is better to track them individually, under the assumption that the targets do not interact between the particles.

In contrast to centralized systems, distributed data fusion methods introduce some challenges into the data fusion process, such as (i) spatial and temporal alignment of the information, (ii) out-of-sequence measurements, and (iii) data correlation, as reported by Castanedo et al. [74, 75]. The inherent redundancy of distributed systems can be exploited with distributed reasoning techniques and cooperative algorithms to improve the individual node estimations, as reported by Castanedo et al. [76]. In addition to the previous studies, a new trend based on the geometric notion of a low-dimensional manifold is gaining attention in the data fusion community. An example is the work of Davenport et al. [77], which proposes a simple model that captures the correlation between the sensor observations by matching the parameter values of the different obtained manifolds.

Acknowledgments

The author would like to thank Jesús García, Miguel A. Patricio, and James Llinas for their interesting and related discussions on several topics that were presented in this paper.

References

[1] JDL, Data Fusion Lexicon, Technical Panel for C3, F. E. White, San Diego, Calif, USA, Code 420, 1991.
[2] D. L. Hall and J. Llinas, "An introduction to multisensor data fusion," Proceedings of the IEEE, vol. 85, no. 1, pp. 6–23, 1997.
[3] H. F. Durrant-Whyte, "Sensor models and multisensor integration," International Journal of Robotics Research, vol. 7, no. 6, pp. 97–113, 1988.
[4] B. V. Dasarathy, "Sensor fusion potential exploitation—innovative architectures and illustrative applications," Proceedings of the IEEE, vol. 85, no. 1, pp. 24–38, 1997.
[5] R. C. Luo, C.-C. Yih, and K. L. Su, "Multisensor fusion and integration: approaches, applications, and future research directions," IEEE Sensors Journal, vol. 2, no. 2, pp. 107–119, 2002.
[6] J. Llinas, C. Bowman, G. Rogova, A. Steinberg, E. Waltz, and F. White, "Revisiting the JDL data fusion model II," Technical Report, DTIC Document, 2004.
[7] E. P. Blasch and S. Plano, "JDL level 5 fusion model 'user refinement' issues and applications in group tracking," in Proceedings of the Signal Processing, Sensor Fusion, and Target Recognition XI, pp. 270–279, April 2002.
[8] H. F. Durrant-Whyte and M. Stevens, "Data fusion in decentralized sensing networks," in Proceedings of the 4th International Conference on Information Fusion, pp. 302–307, Montreal, Canada, 2001.
[9] J. Manyika and H. Durrant-Whyte, Data Fusion and Sensor Management: A Decentralized Information-Theoretic Approach, Prentice Hall, Upper Saddle River, NJ, USA, 1995.
[10] S. S. Blackman, "Association and fusion of multiple sensor data," in Multitarget-Multisensor Tracking: Advanced Applications, pp. 187–217, Artech House, 1990.
[11] S. Lloyd, "Least squares quantization in PCM," IEEE Transactions on Information Theory, vol. 28, no. 2, pp. 129–137, 1982.
[12] M. Shindler, A. Wong, and A. Meyerson, "Fast and accurate k-means for large datasets," in Proceedings of the 25th Annual Conference on Neural Information Processing Systems (NIPS '11), pp. 2375–2383, December 2011.
[13] Y. Bar-Shalom and E. Tse, "Tracking in a cluttered environment with probabilistic data association," Automatica, vol. 11, no. 5, pp. 451–460, 1975.
[14] T. E. Fortmann, Y. Bar-Shalom, and M. Scheffe, "Multi-target tracking using joint probabilistic data association," in Proceedings of the 19th IEEE Conference on Decision and Control including the Symposium on Adaptive Processes, vol. 19, pp. 807–812, December 1980.
[15] D. B. Reid, "An algorithm for tracking multiple targets," IEEE Transactions on Automatic Control, vol. 24, no. 6, pp. 843–854, 1979.
[16] C. L. Morefield, "Application of 0-1 integer programming to multitarget tracking problems," IEEE Transactions on Automatic Control, vol. 22, no. 3, pp. 302–312, 1977.
[17] R. L. Streit and T. E. Luginbuhl, "Maximum likelihood method for probabilistic multihypothesis tracking," in Proceedings of the Signal and Data Processing of Small Targets, vol. 2235 of Proceedings of SPIE, p. 394, 1994.
[18] I. J. Cox and S. L. Hingorani, "Efficient implementation of Reid's multiple hypothesis tracking algorithm and its evaluation for the purpose of visual tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 2, pp. 138–150, 1996.
[19] K. G. Murty, "An algorithm for ranking all the assignments in order of increasing cost," Operations Research, vol. 16, no. 3, pp. 682–687, 1968.
[20] M. E. Liggins II, C.-Y. Chong, I. Kadar, et al., "Distributed fusion architectures and algorithms for target tracking," Proceedings of the IEEE, vol. 85, no. 1, pp. 95–106, 1997.
[21] S. Coraluppi, C. Carthel, M. Luettgen, and S. Lynch, "All-source track and identity fusion," in Proceedings of the National Symposium on Sensor and Data Fusion, 2000.
[22] P. Storms and F. Spieksma, "An LP-based algorithm for the data association problem in multitarget tracking," in Proceedings of the 3rd IEEE International Conference on Information Fusion, vol. 1, 2000.
[23] S.-W. Joo and R. Chellappa, "A multiple-hypothesis approach for multiobject visual tracking," IEEE Transactions on Image Processing, vol. 16, no. 11, pp. 2849–2854, 2007.
[24] S. Coraluppi and C. Carthel, "Aggregate surveillance: a cardinality tracking approach," in Proceedings of the 14th International Conference on Information Fusion (FUSION '11), July 2011.
[25] K. C. Chang, C. Y. Chong, and Y. Bar-Shalom, "Joint probabilistic data association in distributed sensor networks," IEEE Transactions on Automatic Control, vol. 31, no. 10, pp. 889–897, 1986.
[26] C. Y. Chong, S. Mori, and K. C. Chang, "Information fusion in distributed sensor networks," in Proceedings of the 4th American Control Conference, Boston, Mass, USA, June 1985.
[27] C. Y. Chong, S. Mori, and K. C. Chang, "Distributed multitarget multisensor tracking," in Multitarget-Multisensor Tracking: Advanced Applications, vol. 1, pp. 247–295, 1990.
[28] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, San Mateo, Calif, USA, 1988.
[29] D. Koller and N. Friedman, Probabilistic Graphical Models: Principles and Techniques, MIT Press, 2009.
[30] L. Chen, M. Cetin, and A. S. Willsky, "Distributed data association for multi-target tracking in sensor networks," in Proceedings of the 7th International Conference on Information Fusion (FUSION '05), pp. 9–16, July 2005.
[31] L. Chen, M. J. Wainwright, M. Cetin, and A. S. Willsky, "Data association based on optimization in graphical models with application to sensor networks," Mathematical and Computer Modelling, vol. 43, no. 9-10, pp. 1114–1135, 2006.
[32] Y. Weiss and W. T. Freeman, "On the optimality of solutions of the max-product belief-propagation algorithm in arbitrary graphs," IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 736–744, 2001.
[33] C. Brown, H. Durrant-Whyte, J. Leonard, B. Rao, and B. Steer, "Distributed data fusion using Kalman filtering: a robotics application," in Data Fusion in Robotics and Machine Intelligence, M. A. Abidi and R. C. Gonzalez, Eds., pp. 267–309, 1992.
[34] R. E. Kalman, "A new approach to linear filtering and prediction problems," Journal of Basic Engineering, vol. 82, no. 1, pp. 35–45, 1960.
[35] R. C. Luo and M. G. Kay, "Data fusion and sensor integration: state-of-the-art 1990s," in Data Fusion in Robotics and Machine Intelligence, pp. 7–135, 1992.
[36] G. Welch and G. Bishop, An Introduction to the Kalman Filter, ACM SIGGRAPH 2001 Course Notes, 2001.
[37] S. J. Julier and J. K. Uhlmann, "A new extension of the Kalman filter to nonlinear systems," in Proceedings of the International Symposium on Aerospace/Defense Sensing, Simulation and Controls, vol. 3, 1997.
[38] E. A. Wan and R. Van Der Merwe, "The unscented Kalman filter for nonlinear estimation," in Proceedings of the Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC '00), pp. 153–158, 2000.
[39] D. Crisan and A. Doucet, "A survey of convergence results on particle filtering methods for practitioners," IEEE Transactions on Signal Processing, vol. 50, no. 3, pp. 736–746, 2002.
[40] J. Martinez-del Rincon, C. Orrite-Urunuela, and J. E. Herrero-Jaraba, "An efficient particle filter for color-based tracking in complex scenes," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance, pp. 176–181, 2007.
[41] S. Ganeriwal, R. Kumar, and M. B. Srivastava, "Timing-sync protocol for sensor networks," in Proceedings of the 1st International Conference on Embedded Networked Sensor Systems (SenSys '03), pp. 138–149, November 2003.
[42] M. Manzo, T. Roosta, and S. Sastry, "Time synchronization in networks," in Proceedings of the 3rd ACM Workshop on Security of Ad Hoc and Sensor Networks (SASN '05), pp. 107–116, November 2005.
[43] J. K. Uhlmann, "Covariance consistency methods for fault-tolerant distributed data fusion," Information Fusion, vol. 4, no. 3, pp. 201–215, 2003.
[44] A. S. Bashi, V. P. Jilkov, X. R. Li, and H. Chen, "Distributed implementations of particle filters," in Proceedings of the 6th International Conference of Information Fusion, pp. 1164–1171, 2003.
[45] M. Coates, "Distributed particle filters for sensor networks," in Proceedings of the 3rd International Symposium on Information Processing in Sensor Networks (IPSN '04), pp. 99–107, New York, NY, USA, 2004.
[46] D. Gu, "Distributed particle filter for target tracking," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '07), pp. 3856–3861, April 2007.
[47] Y. Bar-Shalom, "Update with out-of-sequence measurements in tracking: exact solution," IEEE Transactions on Aerospace and Electronic Systems, vol. 38, no. 3, pp. 769–778, 2002.
[48] M. Orton and A. Marrs, "A Bayesian approach to multi-target tracking and data fusion with out-of-sequence measurements," IEE Colloquium, no. 174, pp. 151–155, 2001.
[49] M. L. Hernandez, A. D. Marrs, S. Maskell, and M. R. Orton, "Tracking and fusion for wireless sensor networks," in Proceedings of the 5th International Conference on Information Fusion, 2002.
[50] P. C. Mahalanobis, "On the generalized distance in statistics," Proceedings of the National Institute of Sciences of India, vol. 2, no. 1, pp. 49–55, 1936.
[51] S. J. Julier, J. K. Uhlmann, and D. Nicholson, "A method for dealing with assignment ambiguity," in Proceedings of the American Control Conference (ACC '04), vol. 5, pp. 4102–4107, July 2004.
[52] H. Pan, Z.-P. Liang, T. J. Anastasio, and T. S. Huang, "Hybrid NN-Bayesian architecture for information fusion," in Proceedings of the International Conference on Image Processing (ICIP '98), pp. 368–371, October 1998.
[53] C. Coue, T. Fraichard, P. Bessiere, and E. Mazer, "Multi-sensor data fusion using Bayesian programming: an automotive application," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 141–146, October 2002.
[54] D. L. Hall and J. Llinas, Handbook of Multisensor Data Fusion, CRC Press, Boca Raton, Fla, USA, 2001.
[55] A. P. Dempster, "A generalization of Bayesian inference," Journal of the Royal Statistical Society B, vol. 30, no. 2, pp. 205–247, 1968.
[56] G. Shafer, A Mathematical Theory of Evidence, Princeton University Press, Princeton, NJ, USA, 1976.
[57] G. M. Provan, "The validity of Dempster-Shafer belief functions," International Journal of Approximate Reasoning, vol. 6, no. 3, pp. 389–399, 1992.
[58] D. M. Buede, "Shafer-Dempster and Bayesian reasoning: a response to 'Shafer-Dempster reasoning with applications to multisensor target identification systems'," IEEE Transactions on Systems, Man and Cybernetics, vol. 18, no. 6, pp. 1009–1011, 1988.
[59] Y. Cheng and R. L. Kashyap, "Comparison of Bayesian and Dempster's rules in evidence combination," in Maximum-Entropy and Bayesian Methods in Science and Engineering, 1988.
[60] B. R. Cobb and P. P. Shenoy, "A comparison of Bayesian and belief function reasoning," Information Systems Frontiers, vol. 5, no. 4, pp. 345–358, 2003.
[61] H. Wu, M. Siegel, R. Stiefelhagen, and J. Yang, "Sensor fusion using Dempster-Shafer theory," in Proceedings of the 19th IEEE Instrumentation and Measurement Technology Conference (IMTC '02), pp. 7–11, May 2002.
[62] H. Wu, M. Siegel, and S. Ablay, "Sensor fusion using Dempster-Shafer theory II: static weighting and Kalman filter-like dynamic weighting," in Proceedings of the 20th IEEE Instrumentation and Measurement Technology Conference (IMTC '03), pp. 907–912, May 2003.
[63] E. Bosse, P. Valin, A.-C. Boury-Brisset, and D. Grenier, "Exploitation of a priori knowledge for information fusion," Information Fusion, vol. 7, no. 2, pp. 161–175, 2006.
[64] M. Morbee, L. Tessens, H. Aghajan, and W. Philips, "Dempster-Shafer based multi-view occupancy maps," Electronics Letters, vol. 46, no. 5, pp. 341–343, 2010.
[65] C. S. Peirce, Abduction and Induction: Philosophical Writings of Peirce, vol. 156, Dover, New York, NY, USA, 1955.
[66] A. M. Abdelbar, E. A. M. Andrews, and D. C. Wunsch II, "Abductive reasoning with recurrent neural networks," Neural Networks, vol. 16, no. 5-6, pp. 665–673, 2003.
[67] J. R. Aguero and A. Vargas, "Inference of operative configuration of distribution networks using fuzzy logic techniques. Part II: extended real-time model," IEEE Transactions on Power Systems, vol. 20, no. 3, pp. 1562–1569, 2005.
[68] A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, "Content-based image retrieval at the end of the early years," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1349–1380, 2000.
[69] D. S. Friedlander and S. Phoha, "Semantic information fusion for coordinated signal processing in mobile sensor networks," International Journal of High Performance Computing Applications, vol. 16, no. 3, pp. 235–241, 2002.
[70] D. S. Friedlander, "Semantic information extraction," in Distributed Sensor Networks, 2005.
[71] K. Whitehouse, J. Liu, and F. Zhao, "Semantic Streams: a framework for composable inference over sensor data," in Proceedings of the 3rd European Workshop on Wireless Sensor Networks, Lecture Notes in Computer Science, Springer, February 2006.
[72] I. J. Cox, "A review of statistical data association techniques for motion correspondence," International Journal of Computer Vision, vol. 10, no. 1, pp. 53–66, 1993.
[73] C. J. Veenman, M. J. T. Reinders, and E. Backer, "Resolving motion correspondence for densely moving points," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 1, pp. 54–72, 2001.
[74] F. Castanedo, M. A. Patricio, J. García, and J. M. Molina, "Bottom-up/top-down coordination in a multiagent visual sensor network," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS '07), pp. 93–98, September 2007.
[75] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Analysis of distributed fusion alternatives in coordinated vision agents," in Proceedings of the 11th International Conference on Information Fusion (FUSION '08), July 2008.
[76] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Data fusion to improve trajectory tracking in a cooperative surveillance multi-agent architecture," Information Fusion, vol. 11, no. 3, pp. 243–255, 2010.
[77] M. A. Davenport, C. Hegde, M. F. Duarte, and R. G. Baraniuk, "Joint manifolds for data fusion," IEEE Transactions on Image Processing, vol. 19, no. 10, pp. 2580–2594, 2010.

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 13: Review Article A Review of Data Fusion Techniquesdownloads.hindawi.com/journals/tswj/2013/704504.pdf · Classification of Data Fusion Techniques Data fusion is a ... to extract features

The Scientific World Journal 13

(1) Initialization of the particles

(i) let119873 be equal to the number of particles(ii) 119883(119894)(1) = [119909(1) 119910(1) 0 0]119879 for 119894 = 1 119873

(2) Prediction step

(i) for each particle 119894 = 1 119873 evaluate the state(119896 + 1 | 119896) of the system using the state at timeinstant 119896 with the noise of the system at time 119896Consider

119883(119894)(119896 + 1 | 119896) = 119865 (119896)119883

(119894)(119896)

+ (cauchy-distribution-noise)(119896)

(29)

where 119865(119896) is the transition matrix of the sys-tem

(3) Evaluate the particle weight For each particle 119894 =1 119873

(i) compute the predicted observation state of thesystem using the current predicted state and thenoise at instant 119896 Consider

(119894)(119896 + 1 | 119896) = 119867 (119896 + 1)119883

(119894)(119896 + 1 | 119896)

+ (gaussian-measurement-noise)(119896+1)

(30)

(ii) compute the likelihood (weights) according tothe given distribution Consider

likelihood(119894) = 119873((119894) (119896 + 1 | 119896) 119911(119894) (119896 + 1) var) (31)

(iii) normalize the weights as follows

119908(119894)=

likelihood(119894)

sum119873

119895=1likelihood(119895)

(32)

(4) ResamplingSelection multiply particles with higherweights and remove those with lower weights Thecurrent state must be adjusted using the computedweights of the new particles

(i) Compute the cumulative weights Consider

Cum Wt(119894) =119894

sum

119895=1

119908(119895) (33)

(ii) Generate uniform distributed random variablesfrom 119880(119894) sim 119882(0 1) with the number of stepsequal to the number of particles

(iii) Determine which particles should be multipliedand which ones removed

(5) Propagation phase

(i) incorporate the new values of the state after theresampling of instant 119896 to calculate the value atinstant 119896 + 1 Consider

119909(1119873)

(119896 + 1 | 119896 + 1) = 119909 (119896 + 1 | 119896) (34)

(ii) compute the posterior mean Consider

119909 (119896 + 1) = mean [119909119894 (119896 + 1 | 119896 + 1)] 119894 = 1 119873 (35)

(iii) repeat steps 2 to 5 for each time instant

Particle filters are more flexible than the Kalman filtersand can copewith nonlinear dependencies and non-Gaussiandensities in the dynamic model and in the noise errorHowever they have some disadvantages A large numberof particles are required to obtain a small variance in theestimator It is also difficult to establish the optimal number ofparticles in advance and the number of particles affects thecomputational cost significantly Earlier versions of particlefilters employed a fixed number of particles but recent studieshave started to use a dynamic number of particles [40]

44 The Distributed Kalman Filter The distributed Kalmanfilter requires a correct clock synchronization between eachsource as demonstrated in [41] In other words to correctlyuse the distributed Kalman filter the clocks from all ofthe sources must be synchronized This synchronization istypically achieved through using protocols that employ ashared global clock such as the network time protocol (NTP)Synchronization problems between clocks have been shownto have an effect on the accuracy of the Kalman filterproducing inaccurate estimations [42]

If the estimations are consistent and the cross covarianceis known (or the estimations are uncorrelated) then it ispossible to use the distributed Kalman filters [43] Howeverthe cross covariance must be determined exactly or theobservations must be consistent

We refer the reader to Liggins II et al [20] formore detailsabout the Kalman filter in a distributed and hierarchicalarchitecture

45 Distributed Particle Filter Distributed particle filtershave gained attention recently [44ndash46] Coates [45] used adistributed particle filter to monitor an environment thatcould be captured by the Markovian state-space modelinvolving nonlinear dynamics and observations and non-Gaussian noise

In contrast earlier attempts to solve out-of-sequencemeasurements using particle filters are based on regeneratingthe probability density function to the time instant of theout-of-sequence measurement [47] In a particle filter thisstep requires a large computational cost in addition to thenecessary space to store the previous particles To avoidthis problem Orton and Marrs [48] proposed to store theinformation on the particles at each time instant saving thecost of recalculating this information This technique is close

14 The Scientific World Journal

to optimal and when the delay increases the result is onlyslightly affected [49]However it requires a very large amountof space to store the state of the particles at each time instant

46 Covariance Consistency Methods Covariance Intersec-tionUnion Covariance consistency methods (intersectionand union) were proposed by Uhlmann [43] and are generaland fault-tolerant frameworks for maintaining covariancemeans and estimations in a distributed networkThese meth-ods do not comprise estimation techniques instead they aresimilar to an estimation fusion technique The distributedKalman filter requirement of independent measurements orknown cross-covariances is not a constraintwith thismethod

461 Covariance Intersection If the Kalman filter is employ-ed to combine two estimations (119886

1 1198601) and (119886

2 1198602) then it

is assumed that the joint covariance is in the following form

[

[

1198601119883

1198831198791198602

]

]

(36)

where the cross-covariance 119883 should be known exactly sothat the Kalman filter can be applied without difficultyBecause the computation of the cross-covariances is compu-tationally intensive Uhlmann [43] proposed the covarianceintersection (CI) algorithm

Let us assume that a joint covariance 119872 can be definedwith the diagonal blocks119872

1198601gt 1198601and119872

1198602gt 1198602 Consider

119872 ⩾ [

[

1198601119883

1198831198791198602

]

]

(37)

for every possible instance of the unknown cross-covariance119883 then the components of the matrix119872 could be employedin the Kalman filter equations to provide a fused estimation(119888 119862) that is considered consistent The key point of thismethod relies on generating a joint covariance matrix119872 thatcan represent a useful fused estimation (in this context usefulrefers to something with a lower associated uncertainty) Insummary the CI algorithm computes the joint covariancematrix 119872 where the Kalman filter provides the best fusedestimation (119888 119862) with respect to a fixed measurement of thecovariance matrix (ie the minimum determinant)

Specific covariance criteria must be established becausethere is not a specific minimum joint covariance in theorder of the positive semidefinite matrices Moreover thejoint covariance is the basis of the formal analysis of theCI algorithm the actual result is a nonlinear mixture of theinformation stored on the estimations being fused followingthe following equation

119862 = (1199081119867119879

1119860minus1

11198671+ 1199082119867119879

2119860minus1

21198672+ sdot sdot sdot + 119908

119899119867119879

119899119860minus1

119899119867119899)minus1

119888 = 119862(1199081119867119879

1119860minus1

11198861+ 1199082119867119879

2119860minus1

21198862+ sdot sdot sdot + 119908

119899119867119879

119899119860minus1

119899119886119899)minus1

(38)

where 119867119894is the transformation of the fused state-space

estimation to the space of the estimated state 119894 The values

of 119908 can be calculated to minimize the covariance determi-nant using convex optimization packages and semipositivematrix programming The result of the CI algorithm hasdifferent characteristics compared to the Kalman filter Forexample if two estimations are provided (119886 119860) and (119887 119861)and their covariances are equal 119860 = 119861 since the Kalmanfilter is based on the statistical independence assumption itproduces a fused estimation with covariance 119862 = (12)119860In contrast the CI method does not assume independenceand thus must be consistent even in the case in whichthe estimations are completely correlated with the estimatedfused covariance 119862 = 119860 In the case of estimations where119860 lt 119861 the CI algorithm does not provide information aboutthe estimation (119887 119861) thus the fused result is (119886 119860)

Every joint-consistent covariance is sufficient to producea fused estimation which guarantees consistency Howeverit is also necessary to guarantee a lack of divergence Diver-gence is avoided in the CI algorithm by choosing a specificmeasurement (ie the determinant) which is minimized ineach fusion operation This measurement represents a non-divergence criterion because the size of the estimated covari-ance according to this criterion would not be incremented

The application of the CI method guarantees consis-tency and nondivergence for every sequence of mean andcovariance-consistent estimations However this methoddoes not work well when the measurements to be fused areinconsistent

462 Covariance Union CI solves the problem of correlatedinputs but not the problemof inconsistent inputs (inconsistentinputs refer to different estimations each of which has ahigh accuracy (small variance) but also a large differencefrom the states of the others) thus the covariance union(CU) algorithm was proposed to solve the latter [43] CUaddresses the following problem two estimations (119886

1 1198601)

and (1198862 1198602) relate to the state of an object and are mutually

inconsistent from one another This issue arises when thedifference between the average estimations is larger thanthe provided covariance Inconsistent inputs can be detectedusing the Mahalanobis distance [50] between them which isdefined as

119872119889= (1198861minus 1198862)119879

(1198601+ 1198602)minus1

(1198861minus 1198862) (39)

and detecting whether this distance is larger than a giventhreshold

The Mahalanobis distance accounts for the covarianceinformation to obtain the distance If the difference betweenthe estimations is high but their covariance is also highthe Mahalanobis distance yields a small value In contrastif the difference between the estimations is small and thecovariances are small it could produce a larger distancevalue A high Mahalanobis distance could indicate that theestimations are inconsistent however it is necessary tohave a specific threshold established by the user or learnedautomatically

The CU algorithm aims to solve the following problem: let us suppose that a filtering algorithm provides two observations with mean and covariance $(a_1, A_1)$ and $(a_2, A_2)$, respectively. It is known that one of the observations is correct and the other is erroneous; however, the identity of the correct estimation is unknown and cannot be determined. In this situation, if both estimations are employed as an input to the Kalman filter, there will be a problem, because the Kalman filter only guarantees a consistent output if the observation is updated with a measurement that is consistent with it. In the specific case in which the measurements correspond to the same object but are acquired from two different sensors, the Kalman filter can only guarantee that the output is consistent if it is consistent with both separately. Because it is not possible to know which estimation is correct, the only way to combine the two estimations rigorously is to provide an estimation $(u, U)$ that is consistent with both estimations and obeys the following properties:

$$U \succeq A_1 + (u - a_1)(u - a_1)^T,$$
$$U \succeq A_2 + (u - a_2)(u - a_2)^T, \tag{40}$$

where some measurement of the matrix size of $U$ (i.e., the determinant) is minimized.

In other words, the previous equations indicate that if the estimation $(a_1, A_1)$ is consistent, then translating the vector $a_1$ to $u$ requires increasing the covariance by a matrix at least as large as the outer product $(u - a_1)(u - a_1)^T$ in order to remain consistent. The same applies to the measurement $(a_2, A_2)$.

A simple strategy is to choose as the fused mean the mean of one of the measurements ($u = a_1$). In this case, the value of $U$ must be chosen such that the estimation is consistent in the worst case (i.e., the correct measurement is $a_2$). However, it is possible to assign $u$ an intermediate value between $a_1$ and $a_2$ to decrease the value of $U$. Therefore, the CU algorithm establishes the fused mean value $u$ with the least covariance $U$ that is still sufficiently large with respect to the two measurements ($a_1$ and $a_2$) to guarantee consistency.

Because the matrix inequalities presented in the previous equations are convex, convex optimization algorithms must be employed to solve them. The value of $U$ can be computed with the iterative method described by Julier et al. [51]. The obtained covariance can be significantly larger than any of the initial covariances and is an indicator of the existing uncertainty between the initial estimations. One of the advantages of the CU method is that the same process can easily be extended to $N$ inputs.
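As an illustration (this is not the iterative method of Julier et al. [51]), the constraints of equation (40) can be handed to a convex solver directly. The sketch below uses the CVXPY modeling package, fixes the fused mean at the midpoint of the two means, and minimizes the trace of $U$ as a tractable stand-in for the determinant criterion; all names are illustrative.

```python
import numpy as np
import cvxpy as cp

def covariance_union(a1, A1, a2, A2):
    """Solve the semidefinite constraints of (40) for a fixed mean u."""
    u = 0.5 * (a1 + a2)          # simple (suboptimal) choice of fused mean
    n = len(u)
    U = cp.Variable((n, n), PSD=True)
    constraints = [
        U >> A1 + np.outer(u - a1, u - a1),  # consistent if (a1, A1) is correct
        U >> A2 + np.outer(u - a2, u - a2),  # consistent if (a2, A2) is correct
    ]
    cp.Problem(cp.Minimize(cp.trace(U)), constraints).solve()
    return u, U.value
```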

5. Decision Fusion Methods

In the data fusion domain, a decision is typically taken based on the knowledge of the perceived situation, which is provided by many sources. Decision fusion techniques aim to make a high-level inference about the events and activities produced by the detected targets. These techniques often use symbolic information, and the fusion process requires reasoning while accounting for uncertainties and constraints. These methods fall under level 2 (situation assessment) and level 4 (impact assessment) of the JDL data fusion model.

5.1. Bayesian Methods. Information fusion based on Bayesian inference provides a formalism for combining evidence according to the rules of probability theory. Uncertainty is represented using conditional probabilities that describe beliefs and take values in the interval $[0, 1]$, where zero indicates a complete lack of belief and one indicates an absolute belief. Bayesian inference is based on Bayes' rule:

$$P(Y \mid X) = \frac{P(X \mid Y)\, P(Y)}{P(X)} \tag{41}$$

where the posterior probability $P(Y \mid X)$ represents the belief in the hypothesis $Y$ given the information $X$. This probability is obtained by multiplying the a priori probability of the hypothesis, $P(Y)$, by the probability of observing $X$ given that $Y$ is true, $P(X \mid Y)$. The value $P(X)$ is used as a normalizing constant. The main disadvantage of Bayesian inference is that the probabilities $P(X)$ and $P(X \mid Y)$ must be known. To estimate the conditional probabilities, Pan et al. [52] proposed the use of NNs, whereas Coue et al. [53] proposed Bayesian programming.
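As a brief sketch of decision fusion with equation (41), the following combines the reports of two sensors over three mutually exclusive hypotheses, assuming conditionally independent observations; all numeric values are purely illustrative.

```python
import numpy as np

prior = np.array([0.5, 0.3, 0.2])     # a priori probabilities P(Y)
lik_s1 = np.array([0.9, 0.2, 0.1])    # P(X1 | Y) reported by sensor 1
lik_s2 = np.array([0.7, 0.4, 0.1])    # P(X2 | Y) reported by sensor 2

# Bayes' rule with independent observations:
# P(Y | X1, X2) is proportional to P(X1 | Y) P(X2 | Y) P(Y).
posterior = prior * lik_s1 * lik_s2
posterior /= posterior.sum()          # P(X) acts as the normalizing constant

print(posterior, posterior.argmax())  # fused belief and the fused decision
```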

Hall and Llinas [54] described the following problems associated with Bayesian inference:

(i) difficulty in establishing the value of a priori probabilities;

(ii) complexity when there are multiple potential hypotheses and a substantial number of condition-dependent events;

(iii) the hypotheses must be mutually exclusive;

(iv) difficulty in describing the uncertainty of the decisions.

5.2. The Dempster-Shafer Inference. The Dempster-Shafer inference is based on the mathematical theory introduced by Dempster [55] and Shafer [56], which generalizes Bayesian theory. The Dempster-Shafer theory provides a formalism that can be used to represent incomplete knowledge, update beliefs, and combine evidence, and it allows us to represent uncertainty explicitly [57].

A fundamental concept in Dempster-Shafer reasoning is the frame of discernment, which is defined as follows. Let $\Theta = \{\theta_1, \theta_2, \ldots, \theta_N\}$ be the set of all possible states that define the system, and let $\Theta$ be exhaustive and mutually exclusive, since the system can only be in one state $\theta_i \in \Theta$, where $1 \le i \le N$. The set $\Theta$ is called a frame of discernment because its elements are employed to discern the current state of the system.

The elements of the power set $2^\Theta$ are called hypotheses. In the Dempster-Shafer theory, based on the evidence $E$, a probability is assigned to each hypothesis $H \in 2^\Theta$ according to the basic probability assignment, or mass function, $m: 2^\Theta \rightarrow [0, 1]$, which satisfies

$$m(\emptyset) = 0 \tag{42}$$

Thus, the mass function of the empty set is zero. Furthermore, the mass function of each hypothesis is larger than or equal to zero:

$$m(H) \ge 0 \quad \forall H \in 2^\Theta \tag{43}$$

Finally, the sum of the mass functions of all the hypotheses is one:

$$\sum_{H \in 2^\Theta} m(H) = 1 \tag{44}$$
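As a small sketch, a mass function over $2^\Theta$ can be represented as a dictionary mapping frozenset hypotheses to masses and checked against equations (42)-(44); the tolerance is an illustrative choice.

```python
def is_valid_mass(m, tol=1e-9):
    """Check (42)-(44): zero mass on the empty set, nonnegative masses,
    and masses summing to one."""
    if m.get(frozenset(), 0.0) != 0.0:
        return False
    if any(v < 0.0 for v in m.values()):
        return False
    return abs(sum(m.values()) - 1.0) < tol
```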

To express incomplete beliefs in a hypothesis $H$, the Dempster-Shafer theory defines the belief function $\mathrm{bel}: 2^\Theta \rightarrow [0, 1]$ over $\Theta$ as

$$\mathrm{bel}(H) = \sum_{A \subseteq H} m(A) \tag{45}$$

where $\mathrm{bel}(\emptyset) = 0$ and $\mathrm{bel}(\Theta) = 1$. The level of doubt in $H$ can be expressed in terms of the belief function by

$$\mathrm{dou}(H) = \mathrm{bel}(\neg H) = \sum_{A \subseteq \neg H} m(A) \tag{46}$$

To express the plausibility of each hypothesis, the function $\mathrm{pl}: 2^\Theta \rightarrow [0, 1]$ over $\Theta$ is defined as

$$\mathrm{pl}(H) = 1 - \mathrm{dou}(H) = \sum_{A \cap H \ne \emptyset} m(A) \tag{47}$$
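Using the same dictionary representation as above, belief and plausibility follow equations (45) and (47) directly; this is an illustrative sketch.

```python
def belief(H, m):
    """bel(H): sum the masses of all subsets of H, as in (45)."""
    return sum(v for A, v in m.items() if A <= H)

def plausibility(H, m):
    """pl(H): sum the masses of all sets intersecting H, as in (47)."""
    return sum(v for A, v in m.items() if A & H)
```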

Intuitively, plausibility indicates that there is less uncertainty in hypothesis $H$ if it is more plausible. The confidence interval $[\mathrm{bel}(H), \mathrm{pl}(H)]$ defines the true belief in hypothesis $H$. To combine the effects of two mass functions $m_1$ and $m_2$, the Dempster-Shafer theory defines the combination rule $m_1 \oplus m_2$ as

$$m_1 \oplus m_2(\emptyset) = 0$$
$$m_1 \oplus m_2(H) = \frac{\sum_{X \cap Y = H} m_1(X)\, m_2(Y)}{1 - \sum_{X \cap Y = \emptyset} m_1(X)\, m_2(Y)} \tag{48}$$
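The following is a direct sketch of the combination rule of equation (48) in the same dictionary representation; the two example mass functions are illustrative.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule, equation (48)."""
    combined, conflict = {}, 0.0
    for (X, mx), (Y, my) in product(m1.items(), m2.items()):
        inter = X & Y
        if inter:
            combined[inter] = combined.get(inter, 0.0) + mx * my
        else:
            conflict += mx * my              # mass falling on the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: sources are incompatible")
    return {H: v / (1.0 - conflict) for H, v in combined.items()}

# Frame of discernment {a, b}: two sources with partially overlapping beliefs.
m1 = {frozenset("a"): 0.6, frozenset("ab"): 0.4}
m2 = {frozenset("b"): 0.3, frozenset("ab"): 0.7}
print(dempster_combine(m1, m2))
```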

In contrast to Bayesian inference, a priori probabilities are not required in the Dempster-Shafer inference because they are assigned at the instant that the information is provided. Several studies in the literature have compared the use of Bayesian inference and Dempster-Shafer inference, such as [58-60]. Wu et al. [61] used the Dempster-Shafer theory to fuse information in context-aware environments. This work was extended in [62] to dynamically modify the weights associated with the sensor measurements; the fusion mechanism is thus calibrated according to the recent measurements of the sensors (in cases in which the ground truth is available). In the military domain [63], Dempster-Shafer reasoning is used with a priori information stored in a database to classify military ships. Morbee et al. [64] described the use of the Dempster-Shafer theory to build 2D occupancy maps from several cameras and to evaluate the contribution of subsets of cameras to a specific task. Each task is the observation of an event of interest, and the goal is to assess the validity of a set of hypotheses that are fused using the Dempster-Shafer theory.

5.3. Abductive Reasoning. Abductive reasoning, or inferring the best explanation, is a reasoning method in which a hypothesis is chosen under the assumption that, if it is true, it explains the observed event most accurately [65]. In other words, when an event is observed, the abduction method attempts to find the best explanation.

In the context of probabilistic reasoning, abductive inference finds the maximum a posteriori configuration of the system variables given some observed variables. Abductive reasoning is more a reasoning pattern than a data fusion technique; therefore, different inference methods, such as NNs [66] or fuzzy logic [67], can be employed.

5.4. Semantic Methods. Decision fusion techniques that employ semantic data from different sources as input can provide more accurate results than those that rely on only single sources. There is a growing interest in techniques that automatically determine the presence of semantic features in videos to bridge the semantic gap [68].

Semantic information fusion is essentially a scheme in which raw sensor data are processed such that the nodes exchange only the resultant semantic information. Semantic information fusion typically covers two phases: (i) building the knowledge and (ii) pattern matching (inference). The first phase (typically offline) incorporates the most appropriate knowledge into semantic information. Then, the second phase (typically online or in real time) fuses relevant attributes and provides a semantic interpretation of the sensor data [69-71].

Semantic fusion can be viewed as a way of integrating and translating sensor data into formal languages. The language obtained from the observations of the environment is then compared with similar languages that are stored in a database. The key to this strategy is that behaviors represented by similar formal languages are also semantically similar. This type of method provides savings in transmission cost because the nodes need only transmit the formal language structure instead of the raw data. However, a known set of behaviors must be stored in a database in advance, which might be difficult in some scenarios.

6. Conclusions

This paper reviews the most popular methods and techniques for performing data/information fusion. To determine whether the application of data/information fusion methods is feasible, we must evaluate the computational cost of the process and the delay introduced in the communication. A centralized data fusion approach is theoretically optimal when there is no transmission cost and there are sufficient computational resources; however, this situation typically does not hold in practical applications.

The selection of the most appropriate technique depends on the type of problem and the assumptions established by each technique. Statistical data fusion methods (e.g., PDA, JPDA, MHT, and Kalman) are optimal under specific conditions [72]. First, the assumption that the targets are moving independently and that the measurements are normally distributed around the predicted position typically does not hold. Second, because the statistical techniques model all of the events as probabilities, they typically have several parameters and a priori probabilities for false measurements and detection errors that are often difficult to obtain (at least in an optimal sense). For example, in the case of the MHT algorithm, specific parameters must be established that are nontrivial to determine and are very sensitive [73]. Moreover, statistical methods that optimize over several frames are computationally intensive, and their complexity typically grows exponentially with the number of targets. For example, in the case of particle filters, several targets can be tracked jointly as a group or individually. If several targets are tracked jointly, the necessary number of particles grows exponentially; therefore, in practice, it is better to track them individually under the assumption that the targets do not interact between the particles.

In contrast to centralized systems, distributed data fusion methods introduce some challenges in the data fusion process, such as (i) spatial and temporal alignment of the information, (ii) out-of-sequence measurements, and (iii) data correlation, as reported by Castanedo et al. [74, 75]. The inherent redundancy of distributed systems can be exploited with distributed reasoning techniques and cooperative algorithms to improve the individual node estimations, as reported by Castanedo et al. [76]. In addition to the previous studies, a new trend based on the geometric notion of a low-dimensional manifold is gaining attention in the data fusion community. An example is the work of Davenport et al. [77], which proposes a simple model that captures the correlation between the sensor observations by matching the parameter values for the different obtained manifolds.

Acknowledgments

The author would like to thank Jesús García, Miguel A. Patricio, and James Llinas for their interesting and related discussions on several topics that were presented in this paper.

References

[1] JDL, Data Fusion Lexicon, Technical Panel for C3, F. E. White, Code 420, San Diego, Calif, USA, 1991.
[2] D. L. Hall and J. Llinas, "An introduction to multisensor data fusion," Proceedings of the IEEE, vol. 85, no. 1, pp. 6-23, 1997.
[3] H. F. Durrant-Whyte, "Sensor models and multisensor integration," International Journal of Robotics Research, vol. 7, no. 6, pp. 97-113, 1988.
[4] B. V. Dasarathy, "Sensor fusion potential exploitation: innovative architectures and illustrative applications," Proceedings of the IEEE, vol. 85, no. 1, pp. 24-38, 1997.
[5] R. C. Luo, C.-C. Yih, and K. L. Su, "Multisensor fusion and integration: approaches, applications, and future research directions," IEEE Sensors Journal, vol. 2, no. 2, pp. 107-119, 2002.
[6] J. Llinas, C. Bowman, G. Rogova, A. Steinberg, E. Waltz, and F. White, "Revisiting the JDL data fusion model II," Technical Report, DTIC Document, 2004.
[7] E. P. Blasch and S. Plano, "JDL level 5 fusion model: 'user refinement' issues and applications in group tracking," in Proceedings of Signal Processing, Sensor Fusion, and Target Recognition XI, pp. 270-279, April 2002.
[8] H. F. Durrant-Whyte and M. Stevens, "Data fusion in decentralized sensing networks," in Proceedings of the 4th International Conference on Information Fusion, pp. 302-307, Montreal, Canada, 2001.
[9] J. Manyika and H. Durrant-Whyte, Data Fusion and Sensor Management: A Decentralized Information-Theoretic Approach, Prentice Hall, Upper Saddle River, NJ, USA, 1995.
[10] S. S. Blackman, "Association and fusion of multiple sensor data," in Multitarget-Multisensor Tracking: Advanced Applications, pp. 187-217, Artech House, 1990.
[11] S. Lloyd, "Least squares quantization in PCM," IEEE Transactions on Information Theory, vol. 28, no. 2, pp. 129-137, 1982.
[12] M. Shindler, A. Wong, and A. Meyerson, "Fast and accurate k-means for large datasets," in Proceedings of the 25th Annual Conference on Neural Information Processing Systems (NIPS '11), pp. 2375-2383, December 2011.
[13] Y. Bar-Shalom and E. Tse, "Tracking in a cluttered environment with probabilistic data association," Automatica, vol. 11, no. 5, pp. 451-460, 1975.
[14] T. E. Fortmann, Y. Bar-Shalom, and M. Scheffe, "Multi-target tracking using joint probabilistic data association," in Proceedings of the 19th IEEE Conference on Decision and Control including the Symposium on Adaptive Processes, vol. 19, pp. 807-812, December 1980.
[15] D. B. Reid, "An algorithm for tracking multiple targets," IEEE Transactions on Automatic Control, vol. 24, no. 6, pp. 843-854, 1979.
[16] C. L. Morefield, "Application of 0-1 integer programming to multitarget tracking problems," IEEE Transactions on Automatic Control, vol. 22, no. 3, pp. 302-312, 1977.
[17] R. L. Streit and T. E. Luginbuhl, "Maximum likelihood method for probabilistic multihypothesis tracking," in Signal and Data Processing of Small Targets, vol. 2235 of Proceedings of SPIE, p. 394, 1994.
[18] I. J. Cox and S. L. Hingorani, "Efficient implementation of Reid's multiple hypothesis tracking algorithm and its evaluation for the purpose of visual tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 2, pp. 138-150, 1996.
[19] K. G. Murty, "An algorithm for ranking all the assignments in order of increasing cost," Operations Research, vol. 16, no. 3, pp. 682-687, 1968.
[20] M. E. Liggins II, C.-Y. Chong, I. Kadar, et al., "Distributed fusion architectures and algorithms for target tracking," Proceedings of the IEEE, vol. 85, no. 1, pp. 95-106, 1997.
[21] S. Coraluppi, C. Carthel, M. Luettgen, and S. Lynch, "All-source track and identity fusion," in Proceedings of the National Symposium on Sensor and Data Fusion, 2000.
[22] P. Storms and F. Spieksma, "An LP-based algorithm for the data association problem in multitarget tracking," in Proceedings of the 3rd IEEE International Conference on Information Fusion, vol. 1, 2000.
[23] S.-W. Joo and R. Chellappa, "A multiple-hypothesis approach for multiobject visual tracking," IEEE Transactions on Image Processing, vol. 16, no. 11, pp. 2849-2854, 2007.
[24] S. Coraluppi and C. Carthel, "Aggregate surveillance: a cardinality tracking approach," in Proceedings of the 14th International Conference on Information Fusion (FUSION '11), July 2011.
[25] K. C. Chang, C. Y. Chong, and Y. Bar-Shalom, "Joint probabilistic data association in distributed sensor networks," IEEE Transactions on Automatic Control, vol. 31, no. 10, pp. 889-897, 1986.
[26] C. Y. Chong, S. Mori, and K. C. Chang, "Information fusion in distributed sensor networks," in Proceedings of the 4th American Control Conference, Boston, Mass, USA, June 1985.
[27] C. Y. Chong, S. Mori, and K. C. Chang, "Distributed multitarget multisensor tracking," in Multitarget-Multisensor Tracking: Advanced Applications, vol. 1, pp. 247-295, 1990.
[28] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, San Mateo, Calif, USA, 1988.
[29] D. Koller and N. Friedman, Probabilistic Graphical Models: Principles and Techniques, MIT Press, 2009.
[30] L. Chen, M. Cetin, and A. S. Willsky, "Distributed data association for multi-target tracking in sensor networks," in Proceedings of the 7th International Conference on Information Fusion (FUSION '05), pp. 9-16, July 2005.
[31] L. Chen, M. J. Wainwright, M. Cetin, and A. S. Willsky, "Data association based on optimization in graphical models with application to sensor networks," Mathematical and Computer Modelling, vol. 43, no. 9-10, pp. 1114-1135, 2006.
[32] Y. Weiss and W. T. Freeman, "On the optimality of solutions of the max-product belief-propagation algorithm in arbitrary graphs," IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 736-744, 2001.
[33] C. Brown, H. Durrant-Whyte, J. Leonard, B. Rao, and B. Steer, "Distributed data fusion using Kalman filtering: a robotics application," in Data Fusion in Robotics and Machine Intelligence, M. A. Abidi and R. C. Gonzalez, Eds., pp. 267-309, 1992.
[34] R. E. Kalman, "A new approach to linear filtering and prediction problems," Journal of Basic Engineering, vol. 82, no. 1, pp. 35-45, 1960.
[35] R. C. Luo and M. G. Kay, "Data fusion and sensor integration: state-of-the-art 1990s," in Data Fusion in Robotics and Machine Intelligence, pp. 7-135, 1992.
[36] G. Welch and G. Bishop, An Introduction to the Kalman Filter, ACM SIGGRAPH 2001 Course Notes, 2001.
[37] S. J. Julier and J. K. Uhlmann, "A new extension of the Kalman filter to nonlinear systems," in Proceedings of the International Symposium on Aerospace/Defense Sensing, Simulation and Controls, vol. 3, 1997.
[38] E. A. Wan and R. Van Der Merwe, "The unscented Kalman filter for nonlinear estimation," in Proceedings of the Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC '00), pp. 153-158, 2000.
[39] D. Crisan and A. Doucet, "A survey of convergence results on particle filtering methods for practitioners," IEEE Transactions on Signal Processing, vol. 50, no. 3, pp. 736-746, 2002.
[40] J. Martinez-del Rincon, C. Orrite-Urunuela, and J. E. Herrero-Jaraba, "An efficient particle filter for color-based tracking in complex scenes," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance, pp. 176-181, 2007.
[41] S. Ganeriwal, R. Kumar, and M. B. Srivastava, "Timing-sync protocol for sensor networks," in Proceedings of the 1st International Conference on Embedded Networked Sensor Systems (SenSys '03), pp. 138-149, November 2003.
[42] M. Manzo, T. Roosta, and S. Sastry, "Time synchronization in networks," in Proceedings of the 3rd ACM Workshop on Security of Ad Hoc and Sensor Networks (SASN '05), pp. 107-116, November 2005.
[43] J. K. Uhlmann, "Covariance consistency methods for fault-tolerant distributed data fusion," Information Fusion, vol. 4, no. 3, pp. 201-215, 2003.
[44] S. Bashi, V. P. Jilkov, X. R. Li, and H. Chen, "Distributed implementations of particle filters," in Proceedings of the 6th International Conference of Information Fusion, pp. 1164-1171, 2003.
[45] M. Coates, "Distributed particle filters for sensor networks," in Proceedings of the 3rd International Symposium on Information Processing in Sensor Networks (ACM '04), pp. 99-107, New York, NY, USA, 2004.
[46] D. Gu, "Distributed particle filter for target tracking," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '07), pp. 3856-3861, April 2007.
[47] Y. Bar-Shalom, "Update with out-of-sequence measurements in tracking: exact solution," IEEE Transactions on Aerospace and Electronic Systems, vol. 38, no. 3, pp. 769-778, 2002.
[48] M. Orton and A. Marrs, "A Bayesian approach to multi-target tracking and data fusion with out-of-sequence measurements," IEE Colloquium, no. 174, pp. 151-155, 2001.
[49] M. L. Hernandez, A. D. Marrs, S. Maskell, and M. R. Orton, "Tracking and fusion for wireless sensor networks," in Proceedings of the 5th International Conference on Information Fusion, 2002.
[50] P. C. Mahalanobis, "On the generalized distance in statistics," Proceedings of the National Institute of Sciences of India, vol. 2, no. 1, pp. 49-55, 1936.
[51] S. J. Julier, J. K. Uhlmann, and D. Nicholson, "A method for dealing with assignment ambiguity," in Proceedings of the American Control Conference (ACC '04), vol. 5, pp. 4102-4107, July 2004.
[52] H. Pan, Z.-P. Liang, T. J. Anastasio, and T. S. Huang, "Hybrid NN-Bayesian architecture for information fusion," in Proceedings of the International Conference on Image Processing (ICIP '98), pp. 368-371, October 1998.
[53] C. Coue, T. Fraichard, P. Bessiere, and E. Mazer, "Multi-sensor data fusion using Bayesian programming: an automotive application," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 141-146, October 2002.
[54] D. L. Hall and J. Llinas, Handbook of Multisensor Data Fusion, CRC Press, Boca Raton, Fla, USA, 2001.
[55] A. P. Dempster, "A generalization of Bayesian inference," Journal of the Royal Statistical Society B, vol. 30, no. 2, pp. 205-247, 1968.
[56] G. Shafer, A Mathematical Theory of Evidence, Princeton University Press, Princeton, NJ, USA, 1976.
[57] G. M. Provan, "The validity of Dempster-Shafer belief functions," International Journal of Approximate Reasoning, vol. 6, no. 3, pp. 389-399, 1992.
[58] D. M. Buede, "Shafer-Dempster and Bayesian reasoning: a response to 'Shafer-Dempster reasoning with applications to multisensor target identification systems'," IEEE Transactions on Systems, Man, and Cybernetics, vol. 18, no. 6, pp. 1009-1011, 1988.
[59] Y. Cheng and R. L. Kashyap, "Comparison of Bayesian and Dempster's rules in evidence combination," in Maximum-Entropy and Bayesian Methods in Science and Engineering, 1988.
[60] B. R. Cobb and P. P. Shenoy, "A comparison of Bayesian and belief function reasoning," Information Systems Frontiers, vol. 5, no. 4, pp. 345-358, 2003.
[61] H. Wu, M. Siegel, R. Stiefelhagen, and J. Yang, "Sensor fusion using Dempster-Shafer theory," in Proceedings of the 19th IEEE Instrumentation and Measurement Technology Conference (IMTC '02), pp. 7-11, May 2002.
[62] H. Wu, M. Siegel, and S. Ablay, "Sensor fusion using Dempster-Shafer theory II: static weighting and Kalman filter-like dynamic weighting," in Proceedings of the 20th IEEE Instrumentation and Measurement Technology Conference (IMTC '03), pp. 907-912, May 2003.
[63] E. Bosse, P. Valin, A.-C. Boury-Brisset, and D. Grenier, "Exploitation of a priori knowledge for information fusion," Information Fusion, vol. 7, no. 2, pp. 161-175, 2006.
[64] M. Morbee, L. Tessens, H. Aghajan, and W. Philips, "Dempster-Shafer based multi-view occupancy maps," Electronics Letters, vol. 46, no. 5, pp. 341-343, 2010.
[65] C. S. Peirce, Abduction and Induction, Philosophical Writings of Peirce, vol. 156, Dover, New York, NY, USA, 1955.
[66] A. M. Abdelbar, E. A. M. Andrews, and D. C. Wunsch II, "Abductive reasoning with recurrent neural networks," Neural Networks, vol. 16, no. 5-6, pp. 665-673, 2003.
[67] J. R. Aguero and A. Vargas, "Inference of operative configuration of distribution networks using fuzzy logic techniques. Part II: extended real-time model," IEEE Transactions on Power Systems, vol. 20, no. 3, pp. 1562-1569, 2005.
[68] A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, "Content-based image retrieval at the end of the early years," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1349-1380, 2000.
[69] D. S. Friedlander and S. Phoha, "Semantic information fusion for coordinated signal processing in mobile sensor networks," International Journal of High Performance Computing Applications, vol. 16, no. 3, pp. 235-241, 2002.
[70] D. S. Friedlander, "Semantic information extraction," in Distributed Sensor Networks, 2005.
[71] K. Whitehouse, J. Liu, and F. Zhao, "Semantic Streams: a framework for composable inference over sensor data," in Proceedings of the 3rd European Workshop on Wireless Sensor Networks, Lecture Notes in Computer Science, Springer, February 2006.
[72] I. J. Cox, "A review of statistical data association techniques for motion correspondence," International Journal of Computer Vision, vol. 10, no. 1, pp. 53-66, 1993.
[73] C. J. Veenman, M. J. T. Reinders, and E. Backer, "Resolving motion correspondence for densely moving points," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 1, pp. 54-72, 2001.
[74] F. Castanedo, M. A. Patricio, J. García, and J. M. Molina, "Bottom-up/top-down coordination in a multiagent visual sensor network," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS '07), pp. 93-98, September 2007.
[75] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Analysis of distributed fusion alternatives in coordinated vision agents," in Proceedings of the 11th International Conference on Information Fusion (FUSION '08), July 2008.
[76] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Data fusion to improve trajectory tracking in a cooperative surveillance multi-agent architecture," Information Fusion, vol. 11, no. 3, pp. 243-255, 2010.
[77] M. A. Davenport, C. Hegde, M. F. Duarte, and R. G. Baraniuk, "Joint manifolds for data fusion," IEEE Transactions on Image Processing, vol. 19, no. 10, pp. 2580-2594, 2010.


[66] A M Abdelbar E A M Andrews and D C Wunsch IIldquoAbductive reasoning with recurrent neural networksrdquo NeuralNetworks vol 16 no 5-6 pp 665ndash673 2003

[67] J R Aguero and A Vargas ldquoInference of operative configu-ration of distribution networks using fuzzy logic techniquesPart II extended real-time modelrdquo IEEE Transactions on PowerSystems vol 20 no 3 pp 1562ndash1569 2005

[68] A W M Smeulders M Worring S Santini A Gupta and RJain ldquoContent-based image retrieval at the end of the earlyyearsrdquo IEEE Transactions on Pattern Analysis and MachineIntelligence vol 22 no 12 pp 1349ndash1380 2000

[69] D S Friedlander and S Phoha ldquoSemantic information fusionfor coordinated signal processing in mobile sensor networksrdquoInternational Journal of High Performance Computing Applica-tions vol 16 no 3 pp 235ndash241 2002

[70] S Friedlander ldquoSemantic information extractionrdquo in Dis-tributed Sensor Networks 2005

[71] KWhitehouse J Liu and F Zhao ldquoSemantic Streams a frame-work for composable inference over sensor datardquo in Proceedingsof the 3rd European Workshop on Wireless Sensor NetworksLecture Notes in Computer Science Springer February 2006

[72] J Cox ldquoA review of statistical data association techniques formotion correspondencerdquo International Journal of ComputerVision vol 10 no 1 pp 53ndash66 1993

[73] C J Veenman M J T Reinders and E Backer ldquoResolvingmotion correspondence for densely moving pointsrdquo IEEETransactions on Pattern Analysis and Machine Intelligence vol23 no 1 pp 54ndash72 2001

[74] F Castanedo M A Patricio J Garcıa and J M MolinaldquoBottom-uptop-down coordination in a multiagent visualsensor networkrdquo in Proceedings of the IEEE Conference onAdvanced Video and Signal Based Surveillance (AVSS rsquo07) pp93ndash98 September 2007

[75] F Castanedo J Garcıa M A Patricio and J M MolinaldquoAnalysis of distributed fusion alternatives in coordinated visionagentsrdquo in Proceedings of the 11th International Conference onInformation Fusion (FUSION rsquo08) July 2008

[76] F Castanedo J Garcıa M A Patricio and J M Molina ldquoDatafusion to improve trajectory tracking in a cooperative surveil-lance multi-agent architecturerdquo Information Fusion vol 11 no3 pp 243ndash255 2010

[77] M A Davenport C Hegde M F Duarte and R G BaraniukldquoJoint manifolds for data fusionrdquo IEEE Transactions on ImageProcessing vol 19 no 10 pp 2580ndash2594 2010

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 15: Review Article A Review of Data Fusion Techniquesdownloads.hindawi.com/journals/tswj/2013/704504.pdf · Classification of Data Fusion Techniques Data fusion is a ... to extract features

The Scientific World Journal 15

Suppose that two observations, with means and covariances (a_1, A_1) and (a_2, A_2), respectively, are reported for the same state. It is known that one of the observations is correct and the other is erroneous; however, the identity of the correct estimate is unknown and cannot be determined. In this situation, feeding both estimates to the Kalman filter is problematic, because the Kalman filter guarantees a consistent output only if the state is updated with a measurement that is consistent with it. In the specific case in which the measurements correspond to the same object but are acquired from two different sensors, the Kalman filter can only guarantee a consistent output if the update is consistent with both measurements separately. Because it is not possible to know which estimate is correct, the only rigorous way to combine the two estimates is to provide an estimate (u, U) that is consistent with both estimates and obeys the following properties:

U ⪰ A_1 + (u − a_1)(u − a_1)^T,
U ⪰ A_2 + (u − a_2)(u − a_2)^T,        (40)

where the inequalities are understood in the positive semidefinite sense and some measure of the size of the matrix U (i.e., the determinant) is minimized.

In other words, the previous equations indicate that if the estimate (a_1, A_1) is consistent, then translating the vector a_1 to u requires increasing the covariance by at least the matrix (u − a_1)(u − a_1)^T in order for the result to remain consistent. The same situation applies to the measurement (a_2, A_2).

A simple strategy is to choose as the fused mean the mean of one of the measurements (u = a_1). In this case, the value of U must be chosen such that the estimate is consistent with the worst case (i.e., the correct measurement is a_2). However, it is possible to assign u an intermediate value between a_1 and a_2 to decrease the value of U. The CU algorithm therefore establishes the fused mean u with the smallest covariance U that is still large enough to be consistent with both measurements (a_1 and a_2).

Because the matrix inequalities presented in the previous equations are convex, convex optimization algorithms must be employed to solve them. The value of U can be computed with the iterative method described by Julier et al. [51]. The obtained covariance could be significantly larger than either of the initial covariances, which indicates the uncertainty that exists between the initial estimates. One advantage of the CU method is that the same process can easily be extended to N inputs.
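As an illustration, the consistency constraints in (40) can be posed directly as a small semidefinite program. The sketch below is not the iterative method of Julier et al. [51]: it fixes the fused mean u at the midpoint (one of the intermediate choices discussed above) and minimizes trace(U) as a convex surrogate for the determinant criterion; the numpy and cvxpy packages are assumed.

```python
import numpy as np
import cvxpy as cp

def covariance_union(a1, A1, a2, A2):
    """Covariance-union sketch: fuse two estimates (a_i, A_i), one of
    which may be erroneous, into a single consistent estimate (u, U)."""
    u = 0.5 * (a1 + a2)                  # one simple intermediate choice of fused mean
    B1 = A1 + np.outer(u - a1, u - a1)   # lower bounds from (40) for the chosen u
    B2 = A2 + np.outer(u - a2, u - a2)
    n = len(u)
    U = cp.Variable((n, n), PSD=True)
    # trace(U) is a linear, convex surrogate for the determinant criterion
    problem = cp.Problem(cp.Minimize(cp.trace(U)), [U - B1 >> 0, U - B2 >> 0])
    problem.solve()
    return u, U.value

u, U = covariance_union(np.array([0.0, 0.0]), np.eye(2),
                        np.array([3.0, 0.0]), 2.0 * np.eye(2))
```

Extending the constraint list to N inputs is immediate, which mirrors the extensibility of the CU method noted above.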

5. Decision Fusion Methods

A decision is typically taken based on the knowledge of the perceived situation, which, in the data fusion domain, is provided by many sources. These techniques aim to make a high-level inference about the events and activities produced by the detected targets. They often use symbolic information, and the fusion process requires reasoning while accounting for uncertainties and constraints. These methods fall under level 2 (situation assessment) and level 4 (impact assessment) of the JDL data fusion model.

5.1. The Bayesian Methods. Information fusion based on Bayesian inference provides a formalism for combining evidence according to the rules of probability theory. Uncertainty is represented using conditional probability terms that describe beliefs and take values in the interval [0, 1], where zero indicates a complete lack of belief and one indicates an absolute belief. Bayesian inference is based on the Bayes rule, as follows:

P(Y | X) = P(X | Y) P(Y) / P(X),        (41)

where the posterior probability P(Y | X) represents the belief in the hypothesis Y given the information X. This probability is obtained by multiplying the a priori probability of the hypothesis, P(Y), by the probability of observing X given that Y is true, P(X | Y). The value P(X) acts as a normalizing constant. The main disadvantage of Bayesian inference is that the probabilities P(X) and P(X | Y) must be known. To estimate the conditional probabilities, Pan et al. [52] proposed the use of NNs, whereas Coue et al. [53] proposed Bayesian programming.
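For concreteness, the following sketch applies (41) over a discrete hypothesis space and fuses the evidence of several sensors under a conditional-independence assumption; all numbers are invented for illustration.

```python
import numpy as np

def bayes_fuse(prior, likelihoods):
    """Posterior over a discrete hypothesis Y after fusing evidence
    X_1..X_k, assumed conditionally independent given Y (eq. (41))."""
    posterior = np.asarray(prior, dtype=float).copy()
    for lik in likelihoods:              # lik[j] = P(X_i | Y = j)
        posterior *= lik
    return posterior / posterior.sum()   # normalization plays the role of P(X)

prior = np.array([0.5, 0.5])             # target present / target absent
sensor1 = np.array([0.8, 0.3])           # illustrative likelihoods
sensor2 = np.array([0.7, 0.4])
print(bayes_fuse(prior, [sensor1, sensor2]))
```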

Hall and Llinas [54] described the following problems associated with Bayesian inference:

(i) difficulty in establishing the values of the a priori probabilities;

(ii) complexity when there are multiple potential hypotheses and a substantial number of condition-dependent events;

(iii) the requirement that the hypotheses be mutually exclusive;

(iv) difficulty in describing the uncertainty of the decisions.

5.2. The Dempster-Shafer Inference. The Dempster-Shafer inference is based on the mathematical theory introduced by Dempster [55] and Shafer [56], which generalizes Bayesian theory. The Dempster-Shafer theory provides a formalism that can be used to represent incomplete knowledge, update beliefs, and combine evidence, and it allows the uncertainty to be represented explicitly [57].

A fundamental concept in Dempster-Shafer reasoning is the frame of discernment, which is defined as follows. Let Θ = {θ_1, θ_2, ..., θ_N} be the set of all possible states that define the system, and let Θ be exhaustive and mutually exclusive, because the system can be in only one state θ_i ∈ Θ, where 1 ≤ i ≤ N. The set Θ is called a frame of discernment because its elements are employed to discern the current state of the system.

The elements of the power set 2^Θ are called hypotheses. In the Dempster-Shafer theory, based on the evidence E, a probability is assigned to each hypothesis H ∈ 2^Θ according to the basic probability assignment, or mass function, m: 2^Θ → [0, 1], which satisfies

m(∅) = 0.        (42)


Thus, the mass function of the empty set is zero. Furthermore, the mass function of a hypothesis is larger than or equal to zero for all of the hypotheses:

m(H) ≥ 0  ∀H ∈ 2^Θ.        (43)

Finally, the sum of the mass functions of all the hypotheses is one:

∑_{H ∈ 2^Θ} m(H) = 1.        (44)

To express incomplete beliefs in a hypothesis H, the Dempster-Shafer theory defines the belief function bel: 2^Θ → [0, 1] over Θ as

bel(H) = ∑_{A ⊆ H} m(A),        (45)

where bel(∅) = 0 and bel(Θ) = 1. The level of doubt in H can be expressed in terms of the belief function by

dou(H) = bel(¬H) = ∑_{A ⊆ ¬H} m(A).        (46)

To express the plausibility of each hypothesis, the function pl: 2^Θ → [0, 1] over Θ is defined as

pl(H) = 1 − dou(H) = ∑_{A ∩ H ≠ ∅} m(A).        (47)
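Equations (45) and (47) translate almost literally into code. In the sketch below, a mass function is represented as a dictionary keyed by frozensets, which is an implementation choice rather than anything prescribed by the theory:

```python
def bel(m, H):
    """Belief in H: total mass of all subsets of H, eq. (45)."""
    return sum(v for A, v in m.items() if A <= H)

def pl(m, H):
    """Plausibility of H: total mass of all sets intersecting H, eq. (47)."""
    return sum(v for A, v in m.items() if A & H)

m = {frozenset("a"): 0.5, frozenset("b"): 0.2, frozenset("ab"): 0.3}
H = frozenset("a")
print(bel(m, H), pl(m, H))   # confidence interval [bel(H), pl(H)] = [0.5, 0.8]
```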

Intuitively, plausibility indicates that there is less uncertainty in hypothesis H if it is more plausible. The confidence interval [bel(H), pl(H)] defines the true belief in hypothesis H. To combine the effects of two mass functions m_1 and m_2, the Dempster-Shafer theory defines the combination rule m_1 ⊕ m_2 as

(m_1 ⊕ m_2)(∅) = 0,

(m_1 ⊕ m_2)(H) = (∑_{X ∩ Y = H} m_1(X) m_2(Y)) / (1 − ∑_{X ∩ Y = ∅} m_1(X) m_2(Y)).        (48)

In contrast to Bayesian inference, a priori probabilities are not required in the Dempster-Shafer inference because they are assigned at the instant that the information is provided. Several studies in the literature have compared Bayesian inference and Dempster-Shafer inference, such as [58–60]. Wu et al. [61] used the Dempster-Shafer theory to fuse information in context-aware environments. This work was extended in [62] to dynamically modify the weights associated with the sensor measurements; the fusion mechanism is thus calibrated according to the recent measurements of the sensors (in cases in which the ground truth is available). In the military domain [63], Dempster-Shafer reasoning is used with a priori information stored in a database for classifying military ships. Morbee et al. [64] described the use of the Dempster-Shafer theory to build 2D occupancy maps from several cameras and to evaluate the contribution of subsets of cameras to a specific task. Each task is the observation of an event of interest, and the goal is to assess the validity of a set of hypotheses that are fused using the Dempster-Shafer theory.
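A minimal sketch of the combination rule in (48), reusing the dictionary representation of mass functions from the previous sketch; the check for total conflict guards against a zero denominator:

```python
def dempster_combine(m1, m2):
    """Dempster's rule of combination, eq. (48), for two mass functions."""
    fused, conflict = {}, 0.0
    for X, mx in m1.items():
        for Y, my in m2.items():
            Z = X & Y
            if Z:
                fused[Z] = fused.get(Z, 0.0) + mx * my
            else:
                conflict += mx * my            # mass falling on the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources are contradictory")
    return {H: v / (1.0 - conflict) for H, v in fused.items()}

m1 = {frozenset("a"): 0.6, frozenset("ab"): 0.4}
m2 = {frozenset("b"): 0.5, frozenset("ab"): 0.5}
print(dempster_combine(m1, m2))   # masses renormalized by 1 - conflict
```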

5.3. Abductive Reasoning. Abductive reasoning, or inference to the best explanation, is a reasoning method in which a hypothesis is chosen under the assumption that, if it is true, it explains the observed event most accurately [65]. In other words, when an event is observed, the abduction method attempts to find the best explanation for it.

In the context of probabilistic reasoning, abductive inference finds the most probable configuration of the system variables (the maximum of the posterior) given some observed variables. Abductive reasoning is more a reasoning pattern than a data fusion technique; therefore, different inference methods, such as NNs [66] or fuzzy logic [67], can be employed.
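As a toy illustration of this pattern, the following sketch scores a small set of invented hypotheses by their (unnormalized) posterior and returns the best explanation of the observation:

```python
import numpy as np

hypotheses = ["pedestrian", "vehicle", "clutter"]   # invented labels
prior = np.array([0.3, 0.3, 0.4])
likelihood = np.array([0.7, 0.2, 0.1])              # P(observation | hypothesis)

posterior = prior * likelihood                      # unnormalized posterior
best = hypotheses[int(np.argmax(posterior))]        # abduction: best explanation
print(best, posterior / posterior.sum())
```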

5.4. Semantic Methods. Decision fusion techniques that take semantic data from different sources as input could provide more accurate results than those that rely on a single source. There is growing interest in techniques that automatically determine the presence of semantic features in videos to bridge the semantic gap [68].

Semantic information fusion is essentially a scheme in which raw sensor data are processed so that the nodes exchange only the resulting semantic information. Semantic information fusion typically covers two phases: (i) building the knowledge and (ii) pattern matching (inference). The first phase (typically offline) incorporates the most appropriate knowledge into semantic information. The second phase (typically online, or in real time) fuses the relevant attributes and provides a semantic interpretation of the sensor data [69–71].

Semantic fusion can be viewed as a way of integrating and translating sensor data into formal languages. The language obtained from the observations of the environment is then compared with similar languages stored in a database. The key to this strategy is that behaviors whose formal-language representations are similar are also semantically similar. This type of method reduces transmission costs because the nodes need only transmit the formal language structure instead of the raw data. However, a known set of behaviors must be stored in the database in advance, which might be difficult in some scenarios.
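As a rough illustration of the matching step, the sketch below encodes observations as symbol strings and retrieves the most similar stored behavior; the encoding, the behavior database, and the use of difflib's ratio as the similarity measure are all assumptions made for this example:

```python
from difflib import SequenceMatcher

behavior_db = {                  # stored behaviors encoded as formal strings
    "loitering": "SSSSS",        # S = stationary (assumed encoding)
    "crossing":  "MMMMM",        # M = moving
    "wandering": "MSMSM",
}

def semantic_match(observed):
    """Return the stored behavior whose symbol language is most similar."""
    similarity = lambda name: SequenceMatcher(None, observed,
                                              behavior_db[name]).ratio()
    return max(behavior_db, key=similarity)

print(semantic_match("SMSSS"))   # -> "loitering"
```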

6. Conclusions

This paper reviews the most popular methods and techniques for performing data/information fusion. To determine whether the application of data/information fusion methods is feasible, we must evaluate the computational cost of the process and the delay introduced in the communication. A centralized data fusion approach is theoretically optimal when there is no transmission cost and there are sufficient computational resources; however, this situation typically does not hold in practical applications.

The selection of the most appropriate technique depends on the type of problem and on the assumptions established by each technique. Statistical data fusion methods (e.g., PDA, JPDA, MHT, and the Kalman filter) are optimal under specific conditions [72]. First, the assumption that the targets are moving independently and that the measurements are normally distributed around the predicted position typically does not hold. Second, because the statistical techniques model all of the events as probabilities, they typically have several parameters and a priori probabilities for false measurements and detection errors that are often difficult to obtain (at least in an optimal sense). For example, in the case of the MHT algorithm, specific parameters must be established that are nontrivial to determine and very sensitive [73]. Moreover, statistical methods that optimize over several frames are computationally intensive, and their complexity typically grows exponentially with the number of targets. For example, in the case of particle filters, several targets can be tracked jointly as a group or individually. If several targets are tracked jointly, the necessary number of particles grows exponentially; therefore, in practice, it is better to track them individually under the assumption that the targets do not interact between the particles.

In contrast to centralized systems, distributed data fusion methods introduce additional challenges into the data fusion process, such as (i) spatial and temporal alignment of the information, (ii) out-of-sequence measurements, and (iii) data correlation, as reported by Castanedo et al. [74, 75]. The inherent redundancy of distributed systems could be exploited with distributed reasoning techniques and cooperative algorithms to improve the individual node estimates, as reported by Castanedo et al. [76]. In addition to the previous studies, a new trend based on the geometric notion of a low-dimensional manifold is gaining attention in the data fusion community. An example is the work of Davenport et al. [77], which proposes a simple model that captures the correlation between the sensor observations by matching the parameter values of the different obtained manifolds.

Acknowledgments

The author would like to thank Jesús García, Miguel A. Patricio, and James Llinas for their interesting discussions related to several topics presented in this paper.

References

[1] JDL, Data Fusion Lexicon, Technical Panel for C3, F. E. White, San Diego, Calif, USA, Code 420, 1991.

[2] D. L. Hall and J. Llinas, "An introduction to multisensor data fusion," Proceedings of the IEEE, vol. 85, no. 1, pp. 6–23, 1997.

[3] H. F. Durrant-Whyte, "Sensor models and multisensor integration," International Journal of Robotics Research, vol. 7, no. 6, pp. 97–113, 1988.

[4] B. V. Dasarathy, "Sensor fusion potential exploitation: innovative architectures and illustrative applications," Proceedings of the IEEE, vol. 85, no. 1, pp. 24–38, 1997.

[5] R. C. Luo, C.-C. Yih, and K. L. Su, "Multisensor fusion and integration: approaches, applications, and future research directions," IEEE Sensors Journal, vol. 2, no. 2, pp. 107–119, 2002.

[6] J. Llinas, C. Bowman, G. Rogova, A. Steinberg, E. Waltz, and F. White, "Revisiting the JDL data fusion model II," Technical Report, DTIC Document, 2004.

[7] E. P. Blasch and S. Plano, "JDL level 5 fusion model: 'user refinement' issues and applications in group tracking," in Proceedings of Signal Processing, Sensor Fusion, and Target Recognition XI, pp. 270–279, April 2002.

[8] H. F. Durrant-Whyte and M. Stevens, "Data fusion in decentralized sensing networks," in Proceedings of the 4th International Conference on Information Fusion, pp. 302–307, Montreal, Canada, 2001.

[9] J. Manyika and H. Durrant-Whyte, Data Fusion and Sensor Management: A Decentralized Information-Theoretic Approach, Prentice Hall, Upper Saddle River, NJ, USA, 1995.

[10] S. S. Blackman, "Association and fusion of multiple sensor data," in Multitarget-Multisensor Tracking: Advanced Applications, pp. 187–217, Artech House, 1990.

[11] S. Lloyd, "Least squares quantization in PCM," IEEE Transactions on Information Theory, vol. 28, no. 2, pp. 129–137, 1982.

[12] M. Shindler, A. Wong, and A. Meyerson, "Fast and accurate k-means for large datasets," in Proceedings of the 25th Annual Conference on Neural Information Processing Systems (NIPS '11), pp. 2375–2383, December 2011.

[13] Y. Bar-Shalom and E. Tse, "Tracking in a cluttered environment with probabilistic data association," Automatica, vol. 11, no. 5, pp. 451–460, 1975.

[14] T. E. Fortmann, Y. Bar-Shalom, and M. Scheffe, "Multi-target tracking using joint probabilistic data association," in Proceedings of the 19th IEEE Conference on Decision and Control including the Symposium on Adaptive Processes, vol. 19, pp. 807–812, December 1980.

[15] D. B. Reid, "An algorithm for tracking multiple targets," IEEE Transactions on Automatic Control, vol. 24, no. 6, pp. 843–854, 1979.

[16] C. L. Morefield, "Application of 0-1 integer programming to multitarget tracking problems," IEEE Transactions on Automatic Control, vol. 22, no. 3, pp. 302–312, 1977.

[17] R. L. Streit and T. E. Luginbuhl, "Maximum likelihood method for probabilistic multihypothesis tracking," in Signal and Data Processing of Small Targets, vol. 2235 of Proceedings of SPIE, p. 394, 1994.

[18] I. J. Cox and S. L. Hingorani, "Efficient implementation of Reid's multiple hypothesis tracking algorithm and its evaluation for the purpose of visual tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 2, pp. 138–150, 1996.

[19] K. G. Murty, "An algorithm for ranking all the assignments in order of increasing cost," Operations Research, vol. 16, no. 3, pp. 682–687, 1968.

[20] M. E. Liggins II, C.-Y. Chong, I. Kadar, et al., "Distributed fusion architectures and algorithms for target tracking," Proceedings of the IEEE, vol. 85, no. 1, pp. 95–106, 1997.

[21] S. Coraluppi, C. Carthel, M. Luettgen, and S. Lynch, "All-source track and identity fusion," in Proceedings of the National Symposium on Sensor and Data Fusion, 2000.

[22] P. Storms and F. Spieksma, "An LP-based algorithm for the data association problem in multitarget tracking," in Proceedings of the 3rd IEEE International Conference on Information Fusion, vol. 1, 2000.

[23] S.-W. Joo and R. Chellappa, "A multiple-hypothesis approach for multiobject visual tracking," IEEE Transactions on Image Processing, vol. 16, no. 11, pp. 2849–2854, 2007.

[24] S. Coraluppi and C. Carthel, "Aggregate surveillance: a cardinality tracking approach," in Proceedings of the 14th International Conference on Information Fusion (FUSION '11), July 2011.

[25] K. C. Chang, C. Y. Chong, and Y. Bar-Shalom, "Joint probabilistic data association in distributed sensor networks," IEEE Transactions on Automatic Control, vol. 31, no. 10, pp. 889–897, 1986.

[26] C. Y. Chong, S. Mori, and K. C. Chang, "Information fusion in distributed sensor networks," in Proceedings of the 4th American Control Conference, Boston, Mass, USA, June 1985.

[27] C. Y. Chong, S. Mori, and K. C. Chang, "Distributed multitarget multisensor tracking," in Multitarget-Multisensor Tracking: Advanced Applications, vol. 1, pp. 247–295, 1990.

[28] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, San Mateo, Calif, USA, 1988.

[29] D. Koller and N. Friedman, Probabilistic Graphical Models: Principles and Techniques, MIT Press, 2009.

[30] L. Chen, M. Cetin, and A. S. Willsky, "Distributed data association for multi-target tracking in sensor networks," in Proceedings of the 7th International Conference on Information Fusion (FUSION '05), pp. 9–16, July 2005.

[31] L. Chen, M. J. Wainwright, M. Cetin, and A. S. Willsky, "Data association based on optimization in graphical models with application to sensor networks," Mathematical and Computer Modelling, vol. 43, no. 9-10, pp. 1114–1135, 2006.

[32] Y. Weiss and W. T. Freeman, "On the optimality of solutions of the max-product belief-propagation algorithm in arbitrary graphs," IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 736–744, 2001.

[33] C. Brown, H. Durrant-Whyte, J. Leonard, B. Rao, and B. Steer, "Distributed data fusion using Kalman filtering: a robotics application," in Data Fusion in Robotics and Machine Intelligence, M. A. Abidi and R. C. Gonzalez, Eds., pp. 267–309, 1992.

[34] R. E. Kalman, "A new approach to linear filtering and prediction problems," Journal of Basic Engineering, vol. 82, no. 1, pp. 35–45, 1960.

[35] R. C. Luo and M. G. Kay, "Data fusion and sensor integration: state-of-the-art 1990s," in Data Fusion in Robotics and Machine Intelligence, pp. 7–135, 1992.

[36] G. Welch and G. Bishop, An Introduction to the Kalman Filter, ACM SIGGRAPH 2001 Course Notes, 2001.

[37] S. J. Julier and J. K. Uhlmann, "A new extension of the Kalman filter to nonlinear systems," in Proceedings of the International Symposium on Aerospace/Defense Sensing, Simulation and Controls, vol. 3, 1997.

[38] E. A. Wan and R. Van Der Merwe, "The unscented Kalman filter for nonlinear estimation," in Proceedings of the Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC '00), pp. 153–158, 2000.

[39] D. Crisan and A. Doucet, "A survey of convergence results on particle filtering methods for practitioners," IEEE Transactions on Signal Processing, vol. 50, no. 3, pp. 736–746, 2002.

[40] J. Martinez-del Rincon, C. Orrite-Urunuela, and J. E. Herrero-Jaraba, "An efficient particle filter for color-based tracking in complex scenes," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance, pp. 176–181, 2007.

[41] S. Ganeriwal, R. Kumar, and M. B. Srivastava, "Timing-sync protocol for sensor networks," in Proceedings of the 1st International Conference on Embedded Networked Sensor Systems (SenSys '03), pp. 138–149, November 2003.

[42] M. Manzo, T. Roosta, and S. Sastry, "Time synchronization in networks," in Proceedings of the 3rd ACM Workshop on Security of Ad Hoc and Sensor Networks (SASN '05), pp. 107–116, November 2005.

[43] J. K. Uhlmann, "Covariance consistency methods for fault-tolerant distributed data fusion," Information Fusion, vol. 4, no. 3, pp. 201–215, 2003.

[44] A. S. Bashi, V. P. Jilkov, X. R. Li, and H. Chen, "Distributed implementations of particle filters," in Proceedings of the 6th International Conference of Information Fusion, pp. 1164–1171, 2003.

[45] M. Coates, "Distributed particle filters for sensor networks," in Proceedings of the 3rd International Symposium on Information Processing in Sensor Networks (IPSN '04), pp. 99–107, New York, NY, USA, 2004.

[46] D. Gu, "Distributed particle filter for target tracking," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '07), pp. 3856–3861, April 2007.

[47] Y. Bar-Shalom, "Update with out-of-sequence measurements in tracking: exact solution," IEEE Transactions on Aerospace and Electronic Systems, vol. 38, no. 3, pp. 769–778, 2002.

[48] M. Orton and A. Marrs, "A Bayesian approach to multi-target tracking and data fusion with out-of-sequence measurements," IEE Colloquium, no. 174, pp. 151–155, 2001.

[49] M. L. Hernandez, A. D. Marrs, S. Maskell, and M. R. Orton, "Tracking and fusion for wireless sensor networks," in Proceedings of the 5th International Conference on Information Fusion, 2002.

[50] P. C. Mahalanobis, "On the generalized distance in statistics," Proceedings of the National Institute of Sciences of India, vol. 2, no. 1, pp. 49–55, 1936.

[51] S. J. Julier, J. K. Uhlmann, and D. Nicholson, "A method for dealing with assignment ambiguity," in Proceedings of the American Control Conference (ACC '04), vol. 5, pp. 4102–4107, July 2004.

[52] H. Pan, Z.-P. Liang, T. J. Anastasio, and T. S. Huang, "Hybrid NN-Bayesian architecture for information fusion," in Proceedings of the International Conference on Image Processing (ICIP '98), pp. 368–371, October 1998.

[53] C. Coue, T. Fraichard, P. Bessiere, and E. Mazer, "Multi-sensor data fusion using Bayesian programming: an automotive application," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 141–146, October 2002.

[54] D. L. Hall and J. Llinas, Handbook of Multisensor Data Fusion, CRC Press, Boca Raton, Fla, USA, 2001.

[55] A. P. Dempster, "A generalization of Bayesian inference," Journal of the Royal Statistical Society B, vol. 30, no. 2, pp. 205–247, 1968.

[56] G. Shafer, A Mathematical Theory of Evidence, Princeton University Press, Princeton, NJ, USA, 1976.

[57] G. M. Provan, "The validity of Dempster-Shafer belief functions," International Journal of Approximate Reasoning, vol. 6, no. 3, pp. 389–399, 1992.

[58] D. M. Buede, "Shafer-Dempster and Bayesian reasoning: a response to 'Shafer-Dempster reasoning with applications to multisensor target identification systems'," IEEE Transactions on Systems, Man, and Cybernetics, vol. 18, no. 6, pp. 1009–1011, 1988.

[59] Y. Cheng and R. L. Kashyap, "Comparison of Bayesian and Dempster's rules in evidence combination," in Maximum-Entropy and Bayesian Methods in Science and Engineering, 1988.

[60] B. R. Cobb and P. P. Shenoy, "A comparison of Bayesian and belief function reasoning," Information Systems Frontiers, vol. 5, no. 4, pp. 345–358, 2003.

[61] H. Wu, M. Siegel, R. Stiefelhagen, and J. Yang, "Sensor fusion using Dempster-Shafer theory," in Proceedings of the 19th IEEE Instrumentation and Measurement Technology Conference (IMTC '02), pp. 7–11, May 2002.

[62] H. Wu, M. Siegel, and S. Ablay, "Sensor fusion using Dempster-Shafer theory II: static weighting and Kalman filter-like dynamic weighting," in Proceedings of the 20th IEEE Instrumentation and Measurement Technology Conference (IMTC '03), pp. 907–912, May 2003.

[63] E. Bosse, P. Valin, A.-C. Boury-Brisset, and D. Grenier, "Exploitation of a priori knowledge for information fusion," Information Fusion, vol. 7, no. 2, pp. 161–175, 2006.

[64] M. Morbee, L. Tessens, H. Aghajan, and W. Philips, "Dempster-Shafer based multi-view occupancy maps," Electronics Letters, vol. 46, no. 5, pp. 341–343, 2010.

[65] C. S. Peirce, Abduction and Induction, Philosophical Writings of Peirce, vol. 156, Dover, New York, NY, USA, 1955.

[66] A. M. Abdelbar, E. A. M. Andrews, and D. C. Wunsch II, "Abductive reasoning with recurrent neural networks," Neural Networks, vol. 16, no. 5-6, pp. 665–673, 2003.

[67] J. R. Aguero and A. Vargas, "Inference of operative configuration of distribution networks using fuzzy logic techniques. Part II: extended real-time model," IEEE Transactions on Power Systems, vol. 20, no. 3, pp. 1562–1569, 2005.

[68] A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, "Content-based image retrieval at the end of the early years," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1349–1380, 2000.

[69] D. S. Friedlander and S. Phoha, "Semantic information fusion for coordinated signal processing in mobile sensor networks," International Journal of High Performance Computing Applications, vol. 16, no. 3, pp. 235–241, 2002.

[70] D. S. Friedlander, "Semantic information extraction," in Distributed Sensor Networks, 2005.

[71] K. Whitehouse, J. Liu, and F. Zhao, "Semantic Streams: a framework for composable inference over sensor data," in Proceedings of the 3rd European Workshop on Wireless Sensor Networks, Lecture Notes in Computer Science, Springer, February 2006.

[72] I. J. Cox, "A review of statistical data association techniques for motion correspondence," International Journal of Computer Vision, vol. 10, no. 1, pp. 53–66, 1993.

[73] C. J. Veenman, M. J. T. Reinders, and E. Backer, "Resolving motion correspondence for densely moving points," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 1, pp. 54–72, 2001.

[74] F. Castanedo, M. A. Patricio, J. García, and J. M. Molina, "Bottom-up/top-down coordination in a multiagent visual sensor network," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS '07), pp. 93–98, September 2007.

[75] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Analysis of distributed fusion alternatives in coordinated vision agents," in Proceedings of the 11th International Conference on Information Fusion (FUSION '08), July 2008.

[76] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Data fusion to improve trajectory tracking in a cooperative surveillance multi-agent architecture," Information Fusion, vol. 11, no. 3, pp. 243–255, 2010.

[77] M. A. Davenport, C. Hegde, M. F. Duarte, and R. G. Baraniuk, "Joint manifolds for data fusion," IEEE Transactions on Image Processing, vol. 19, no. 10, pp. 2580–2594, 2010.


ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 17: Review Article A Review of Data Fusion Techniquesdownloads.hindawi.com/journals/tswj/2013/704504.pdf · Classification of Data Fusion Techniques Data fusion is a ... to extract features

The Scientific World Journal 17

First, the assumption that the targets move independently and that the measurements are normally distributed around the predicted positions typically does not hold. Second, because the statistical techniques model all of the events as probabilities, they typically require several parameters and a priori probabilities for false measurements and detection errors, which are often difficult to obtain (at least in an optimal sense). For example, in the case of the MHT algorithm, specific parameters must be established that are nontrivial to determine and are very sensitive [73]. Moreover, statistical methods that optimize over several frames are computationally intensive, and their complexity typically grows exponentially with the number of targets. For example, in the case of particle filters, several targets can be tracked either jointly as a group or individually. If several targets are tracked jointly, the necessary number of particles grows exponentially with the number of targets; therefore, in practice, it is better to track them individually under the assumption that the targets do not interact.
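To make this complexity argument concrete, the following minimal sketch (in Python, using a hypothetical one-dimensional random-walk motion model and invented measurement values) runs one independent bootstrap particle filter per target. The total particle count then grows only linearly with the number of targets, whereas a joint filter over the concatenated state space would require a particle count that grows exponentially with the number of targets.

import numpy as np

def bootstrap_filter_step(particles, weights, measurement, motion_std=1.0, meas_std=2.0):
    # Predict: propagate each particle through a random-walk motion model.
    particles = particles + np.random.normal(0.0, motion_std, size=particles.shape)
    # Update: reweight particles by the Gaussian likelihood of the measurement.
    weights = weights * np.exp(-0.5 * ((measurement - particles) / meas_std) ** 2)
    weights = weights / weights.sum()
    # Resample: draw particles with probability proportional to their weights.
    idx = np.random.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))

# One independent filter per target (3 * 500 particles in total), under the
# assumption that the targets do not interact with one another.
n_targets, n_particles = 3, 500
filters = [(np.random.normal(0.0, 5.0, n_particles),
            np.full(n_particles, 1.0 / n_particles)) for _ in range(n_targets)]
measurements = [0.7, -2.1, 4.3]  # hypothetical per-target measurements after association
filters = [bootstrap_filter_step(p, w, z) for (p, w), z in zip(filters, measurements)]
estimates = [np.average(p, weights=w) for p, w in filters]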

In contrast to centralized systems, distributed data fusion methods introduce additional challenges into the fusion process, such as (i) the spatial and temporal alignment of the information, (ii) out-of-sequence measurements, and (iii) data correlation, as reported by Castanedo et al. [74, 75]. The inherent redundancy of distributed systems can be exploited with distributed reasoning techniques and cooperative algorithms to improve the individual node estimations, as reported by Castanedo et al. [76]. In addition to the previous studies, a new trend based on the geometric notion of a low-dimensional manifold is gaining attention in the data fusion community. An example is the work of Davenport et al. [77], which proposes a simple model that captures the correlation between the sensor observations by matching the parameter values of the manifolds obtained from the different sensors.
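As an illustration of the data correlation challenge, the sketch below (Python; the two node estimates are invented for the example) fuses two estimates whose cross-correlation is unknown by means of covariance intersection, the covariance consistency approach discussed by Uhlmann [43]. Because the convex combination of the inverse covariances yields a consistent fused estimate for any degree of correlation between the inputs, a node can fuse information received from its neighbors without tracking the pedigree of common information.

import numpy as np

def covariance_intersection(x1, P1, x2, P2, n_grid=101):
    # Search the scalar weight omega in [0, 1] that minimizes the trace of the
    # fused covariance; the fused information matrix is a convex combination of
    # the two input information matrices, which is consistent for any correlation.
    best_x, best_P = None, None
    I1, I2 = np.linalg.inv(P1), np.linalg.inv(P2)
    for w in np.linspace(0.0, 1.0, n_grid):
        P = np.linalg.inv(w * I1 + (1.0 - w) * I2)
        if best_P is None or np.trace(P) < np.trace(best_P):
            best_x = P @ (w * I1 @ x1 + (1.0 - w) * I2 @ x2)
            best_P = P
    return best_x, best_P

# Two node estimates of the same two-dimensional state (values are invented).
x1, P1 = np.array([1.0, 2.0]), np.array([[4.0, 0.5], [0.5, 3.0]])
x2, P2 = np.array([1.5, 1.7]), np.array([[3.0, -0.2], [-0.2, 5.0]])
x_fused, P_fused = covariance_intersection(x1, P1, x2, P2)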

Acknowledgments

The author would like to thank Jesús García, Miguel A. Patricio, and James Llinas for their interesting discussions on several of the topics presented in this paper.

References

[1] JDL, Data Fusion Lexicon, Technical Panel for C3, F. E. White, Code 420, San Diego, Calif, USA, 1991.

[2] D. L. Hall and J. Llinas, "An introduction to multisensor data fusion," Proceedings of the IEEE, vol. 85, no. 1, pp. 6–23, 1997.

[3] H. F. Durrant-Whyte, "Sensor models and multisensor integration," International Journal of Robotics Research, vol. 7, no. 6, pp. 97–113, 1988.

[4] B. V. Dasarathy, "Sensor fusion potential exploitation: innovative architectures and illustrative applications," Proceedings of the IEEE, vol. 85, no. 1, pp. 24–38, 1997.

[5] R. C. Luo, C.-C. Yih, and K. L. Su, "Multisensor fusion and integration: approaches, applications, and future research directions," IEEE Sensors Journal, vol. 2, no. 2, pp. 107–119, 2002.

[6] J. Llinas, C. Bowman, G. Rogova, A. Steinberg, E. Waltz, and F. White, "Revisiting the JDL data fusion model II," Technical Report, DTIC Document, 2004.

[7] E. P. Blasch and S. Plano, "JDL level 5 fusion model 'user refinement': issues and applications in group tracking," in Proceedings of Signal Processing, Sensor Fusion, and Target Recognition XI, pp. 270–279, April 2002.

[8] H. F. Durrant-Whyte and M. Stevens, "Data fusion in decentralized sensing networks," in Proceedings of the 4th International Conference on Information Fusion, pp. 302–307, Montreal, Canada, 2001.

[9] J. Manyika and H. Durrant-Whyte, Data Fusion and Sensor Management: A Decentralized Information-Theoretic Approach, Prentice Hall, Upper Saddle River, NJ, USA, 1995.

[10] S. S. Blackman, "Association and fusion of multiple sensor data," in Multitarget-Multisensor Tracking: Advanced Applications, pp. 187–217, Artech House, 1990.

[11] S. Lloyd, "Least squares quantization in PCM," IEEE Transactions on Information Theory, vol. 28, no. 2, pp. 129–137, 1982.

[12] M. Shindler, A. Wong, and A. Meyerson, "Fast and accurate k-means for large datasets," in Proceedings of the 25th Annual Conference on Neural Information Processing Systems (NIPS '11), pp. 2375–2383, December 2011.

[13] Y. Bar-Shalom and E. Tse, "Tracking in a cluttered environment with probabilistic data association," Automatica, vol. 11, no. 5, pp. 451–460, 1975.

[14] T. E. Fortmann, Y. Bar-Shalom, and M. Scheffe, "Multi-target tracking using joint probabilistic data association," in Proceedings of the 19th IEEE Conference on Decision and Control including the Symposium on Adaptive Processes, vol. 19, pp. 807–812, December 1980.

[15] D. B. Reid, "An algorithm for tracking multiple targets," IEEE Transactions on Automatic Control, vol. 24, no. 6, pp. 843–854, 1979.

[16] C. L. Morefield, "Application of 0-1 integer programming to multitarget tracking problems," IEEE Transactions on Automatic Control, vol. 22, no. 3, pp. 302–312, 1977.

[17] R. L. Streit and T. E. Luginbuhl, "Maximum likelihood method for probabilistic multihypothesis tracking," in Signal and Data Processing of Small Targets, vol. 2235 of Proceedings of SPIE, p. 394, 1994.

[18] I. J. Cox and S. L. Hingorani, "Efficient implementation of Reid's multiple hypothesis tracking algorithm and its evaluation for the purpose of visual tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 2, pp. 138–150, 1996.

[19] K. G. Murty, "An algorithm for ranking all the assignments in order of increasing cost," Operations Research, vol. 16, no. 3, pp. 682–687, 1968.

[20] M. E. Liggins II, C.-Y. Chong, I. Kadar, et al., "Distributed fusion architectures and algorithms for target tracking," Proceedings of the IEEE, vol. 85, no. 1, pp. 95–106, 1997.

[21] S. Coraluppi, C. Carthel, M. Luettgen, and S. Lynch, "All-source track and identity fusion," in Proceedings of the National Symposium on Sensor and Data Fusion, 2000.

[22] P. Storms and F. Spieksma, "An LP-based algorithm for the data association problem in multitarget tracking," in Proceedings of the 3rd IEEE International Conference on Information Fusion, vol. 1, 2000.

[23] S.-W. Joo and R. Chellappa, "A multiple-hypothesis approach for multiobject visual tracking," IEEE Transactions on Image Processing, vol. 16, no. 11, pp. 2849–2854, 2007.

[24] S. Coraluppi and C. Carthel, "Aggregate surveillance: a cardinality tracking approach," in Proceedings of the 14th International Conference on Information Fusion (FUSION '11), July 2011.


[25] K. C. Chang, C. Y. Chong, and Y. Bar-Shalom, "Joint probabilistic data association in distributed sensor networks," IEEE Transactions on Automatic Control, vol. 31, no. 10, pp. 889–897, 1986.

[26] C. Y. Chong, S. Mori, and K. C. Chang, "Information fusion in distributed sensor networks," in Proceedings of the 4th American Control Conference, Boston, Mass, USA, June 1985.

[27] C. Y. Chong, S. Mori, and K. C. Chang, "Distributed multitarget multisensor tracking," in Multitarget-Multisensor Tracking: Advanced Applications, vol. 1, pp. 247–295, 1990.

[28] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, San Mateo, Calif, USA, 1988.

[29] D. Koller and N. Friedman, Probabilistic Graphical Models: Principles and Techniques, MIT Press, 2009.

[30] L. Chen, M. Cetin, and A. S. Willsky, "Distributed data association for multi-target tracking in sensor networks," in Proceedings of the 7th International Conference on Information Fusion (FUSION '05), pp. 9–16, July 2005.

[31] L. Chen, M. J. Wainwright, M. Cetin, and A. S. Willsky, "Data association based on optimization in graphical models with application to sensor networks," Mathematical and Computer Modelling, vol. 43, no. 9-10, pp. 1114–1135, 2006.

[32] Y. Weiss and W. T. Freeman, "On the optimality of solutions of the max-product belief-propagation algorithm in arbitrary graphs," IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 736–744, 2001.

[33] C. Brown, H. Durrant-Whyte, J. Leonard, B. Rao, and B. Steer, "Distributed data fusion using Kalman filtering: a robotics application," in Data Fusion in Robotics and Machine Intelligence, M. A. Abidi and R. C. Gonzalez, Eds., pp. 267–309, 1992.

[34] R. E. Kalman, "A new approach to linear filtering and prediction problems," Journal of Basic Engineering, vol. 82, no. 1, pp. 35–45, 1960.

[35] R. C. Luo and M. G. Kay, "Data fusion and sensor integration: state-of-the-art 1990s," in Data Fusion in Robotics and Machine Intelligence, pp. 7–135, 1992.

[36] G. Welch and G. Bishop, An Introduction to the Kalman Filter, ACM SIGGRAPH 2001 Course Notes, 2001.

[37] S. J. Julier and J. K. Uhlmann, "A new extension of the Kalman filter to nonlinear systems," in Proceedings of the International Symposium on Aerospace/Defense Sensing, Simulation and Controls, vol. 3, 1997.

[38] E. A. Wan and R. Van Der Merwe, "The unscented Kalman filter for nonlinear estimation," in Proceedings of the Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC '00), pp. 153–158, 2000.

[39] D. Crisan and A. Doucet, "A survey of convergence results on particle filtering methods for practitioners," IEEE Transactions on Signal Processing, vol. 50, no. 3, pp. 736–746, 2002.

[40] J. Martinez-del Rincon, C. Orrite-Urunuela, and J. E. Herrero-Jaraba, "An efficient particle filter for color-based tracking in complex scenes," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance, pp. 176–181, 2007.

[41] S. Ganeriwal, R. Kumar, and M. B. Srivastava, "Timing-sync protocol for sensor networks," in Proceedings of the 1st International Conference on Embedded Networked Sensor Systems (SenSys '03), pp. 138–149, November 2003.

[42] M. Manzo, T. Roosta, and S. Sastry, "Time synchronization in networks," in Proceedings of the 3rd ACM Workshop on Security of Ad Hoc and Sensor Networks (SASN '05), pp. 107–116, November 2005.

[43] J. K. Uhlmann, "Covariance consistency methods for fault-tolerant distributed data fusion," Information Fusion, vol. 4, no. 3, pp. 201–215, 2003.

[44] S. Bashi, V. P. Jilkov, X. R. Li, and H. Chen, "Distributed implementations of particle filters," in Proceedings of the 6th International Conference on Information Fusion, pp. 1164–1171, 2003.

[45] M. Coates, "Distributed particle filters for sensor networks," in Proceedings of the 3rd International Symposium on Information Processing in Sensor Networks (IPSN '04), pp. 99–107, New York, NY, USA, 2004.

[46] D. Gu, "Distributed particle filter for target tracking," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '07), pp. 3856–3861, April 2007.

[47] Y. Bar-Shalom, "Update with out-of-sequence measurements in tracking: exact solution," IEEE Transactions on Aerospace and Electronic Systems, vol. 38, no. 3, pp. 769–778, 2002.

[48] M. Orton and A. Marrs, "A Bayesian approach to multi-target tracking and data fusion with out-of-sequence measurements," IEE Colloquium, no. 174, pp. 151–155, 2001.

[49] M. L. Hernandez, A. D. Marrs, S. Maskell, and M. R. Orton, "Tracking and fusion for wireless sensor networks," in Proceedings of the 5th International Conference on Information Fusion, 2002.

[50] P. C. Mahalanobis, "On the generalized distance in statistics," Proceedings of the National Institute of Sciences of India, vol. 2, no. 1, pp. 49–55, 1936.

[51] S. J. Julier, J. K. Uhlmann, and D. Nicholson, "A method for dealing with assignment ambiguity," in Proceedings of the American Control Conference (ACC '04), vol. 5, pp. 4102–4107, July 2004.

[52] H. Pan, Z.-P. Liang, T. J. Anastasio, and T. S. Huang, "Hybrid NN-Bayesian architecture for information fusion," in Proceedings of the International Conference on Image Processing (ICIP '98), pp. 368–371, October 1998.

[53] C. Coue, T. Fraichard, P. Bessiere, and E. Mazer, "Multi-sensor data fusion using Bayesian programming: an automotive application," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 141–146, October 2002.

[54] D. L. Hall and J. Llinas, Handbook of Multisensor Data Fusion, CRC Press, Boca Raton, Fla, USA, 2001.

[55] A. P. Dempster, "A generalization of Bayesian inference," Journal of the Royal Statistical Society B, vol. 30, no. 2, pp. 205–247, 1968.

[56] G. Shafer, A Mathematical Theory of Evidence, Princeton University Press, Princeton, NJ, USA, 1976.

[57] G. M. Provan, "The validity of Dempster-Shafer belief functions," International Journal of Approximate Reasoning, vol. 6, no. 3, pp. 389–399, 1992.

[58] D. M. Buede, "Shafer-Dempster and Bayesian reasoning: a response to 'Shafer-Dempster reasoning with applications to multisensor target identification systems'," IEEE Transactions on Systems, Man, and Cybernetics, vol. 18, no. 6, pp. 1009–1011, 1988.

[59] Y. Cheng and R. L. Kashyap, "Comparison of Bayesian and Dempster's rules in evidence combination," in Maximum-Entropy and Bayesian Methods in Science and Engineering, 1988.

[60] B. R. Cobb and P. P. Shenoy, "A comparison of Bayesian and belief function reasoning," Information Systems Frontiers, vol. 5, no. 4, pp. 345–358, 2003.

[61] H. Wu, M. Siegel, R. Stiefelhagen, and J. Yang, "Sensor fusion using Dempster-Shafer theory," in Proceedings of the 19th IEEE Instrumentation and Measurement Technology Conference (IMTC '02), pp. 7–11, May 2002.


[62] H. Wu, M. Siegel, and S. Ablay, "Sensor fusion using Dempster-Shafer theory II: static weighting and Kalman filter-like dynamic weighting," in Proceedings of the 20th IEEE Instrumentation and Measurement Technology Conference (IMTC '03), pp. 907–912, May 2003.

[63] E. Bosse, P. Valin, A.-C. Boury-Brisset, and D. Grenier, "Exploitation of a priori knowledge for information fusion," Information Fusion, vol. 7, no. 2, pp. 161–175, 2006.

[64] M. Morbee, L. Tessens, H. Aghajan, and W. Philips, "Dempster-Shafer based multi-view occupancy maps," Electronics Letters, vol. 46, no. 5, pp. 341–343, 2010.

[65] C. S. Peirce, "Abduction and induction," in Philosophical Writings of Peirce, vol. 156, Dover, New York, NY, USA, 1955.

[66] A. M. Abdelbar, E. A. M. Andrews, and D. C. Wunsch II, "Abductive reasoning with recurrent neural networks," Neural Networks, vol. 16, no. 5-6, pp. 665–673, 2003.

[67] J. R. Aguero and A. Vargas, "Inference of operative configuration of distribution networks using fuzzy logic techniques. Part II: extended real-time model," IEEE Transactions on Power Systems, vol. 20, no. 3, pp. 1562–1569, 2005.

[68] A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, "Content-based image retrieval at the end of the early years," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1349–1380, 2000.

[69] D. S. Friedlander and S. Phoha, "Semantic information fusion for coordinated signal processing in mobile sensor networks," International Journal of High Performance Computing Applications, vol. 16, no. 3, pp. 235–241, 2002.

[70] D. S. Friedlander, "Semantic information extraction," in Distributed Sensor Networks, 2005.

[71] K. Whitehouse, J. Liu, and F. Zhao, "Semantic Streams: a framework for composable inference over sensor data," in Proceedings of the 3rd European Workshop on Wireless Sensor Networks, Lecture Notes in Computer Science, Springer, February 2006.

[72] I. J. Cox, "A review of statistical data association techniques for motion correspondence," International Journal of Computer Vision, vol. 10, no. 1, pp. 53–66, 1993.

[73] C. J. Veenman, M. J. T. Reinders, and E. Backer, "Resolving motion correspondence for densely moving points," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 1, pp. 54–72, 2001.

[74] F. Castanedo, M. A. Patricio, J. García, and J. M. Molina, "Bottom-up/top-down coordination in a multiagent visual sensor network," in Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS '07), pp. 93–98, September 2007.

[75] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Analysis of distributed fusion alternatives in coordinated vision agents," in Proceedings of the 11th International Conference on Information Fusion (FUSION '08), July 2008.

[76] F. Castanedo, J. García, M. A. Patricio, and J. M. Molina, "Data fusion to improve trajectory tracking in a cooperative surveillance multi-agent architecture," Information Fusion, vol. 11, no. 3, pp. 243–255, 2010.

[77] M. A. Davenport, C. Hegde, M. F. Duarte, and R. G. Baraniuk, "Joint manifolds for data fusion," IEEE Transactions on Image Processing, vol. 19, no. 10, pp. 2580–2594, 2010.
