Target Detection and Recognition Improvements by Use of Spatiotemporal Fusion
Hai-Wen Chen, Surachai Sutha, and Teresa Olson

The authors are with Lockheed Martin, Missiles and Fire Control–Orlando, 5600 Sand Lake Road, MP-916, Orlando, Florida 32819. H.-W. Chen's e-mail address is [email protected]. Received 1 May 2003; revised manuscript received 24 July 2003; accepted 25 August 2003.

We developed spatiotemporal fusion techniques for improving target detection and automatic target recognition. We also investigated real IR (infrared) sensor clutter noise. The sensor noise was collected by an IR (256 × 256) sensor looking at various scenes (trees, grass, roads, buildings, etc.). More than 95% of the sensor pixels showed near-stationary sensor clutter noise that was uncorrelated between pixels as well as across time frames. However, in a few pixels (covering the grass near the road) the sensor noise showed nonstationary properties (with increasing or decreasing mean across time frames). The natural noise extracted from the IR sensor, as well as computer-generated noise with Gaussian and Rayleigh distributions, was used to test and compare different spatiotemporal fusion strategies. Finally, we propose two advanced detection schemes: the double-thresholding and the reverse-thresholding techniques. These techniques may be applied to complicated clutter situations (e.g., very-high-clutter or nonstationary clutter situations) where the traditional constant-false-alarm-ratio technique may fail. © 2004 Optical Society of America

OCIS codes: 110.2970, 110.4280, 100.5010, 100.2000, 100.2960.

1. Introduction

In this paper our previous results [1,2] for spatiotemporal fusion for target classification are developed further for target detection. In our previous work on target classification, fusion was conducted in the likelihood-function reading domain. In general, the likelihood functions (or probability density functions) are obtained from training data based on single-sensor and single-frame measurements. Therefore, if we conduct fusion with the likelihood readings of the features extracted from measurements of a single sensor and a single frame, we need to store only one set of likelihood functions for that sensor and frame, regardless of the number of sensors and frames we use for fusion. On the other hand, we show in this paper that the detection process uses thresholding techniques instead of likelihood functions; thus we can directly fuse the extracted feature values from different sensors and time frames in the feature domain for target detection.

As discussed in our previous papers [1,2], so-called spatial fusion is defined as the fusion between different sensors, whereas temporal fusion is defined as the temporal integration across different time frames within a single sensor. Different spatial fusion and temporal integration (fusion) strategies have been developed and compared, including predetection integration (such as additive, multiplicative, maximum (MAX), and minimum (MIN) fusions) as well as traditional postdetection integration (the persistency test). Predetection integration is conducted by fusing the feature values from different time frames before the thresholding process (i.e., the detection process), whereas postdetection integration is conducted after the thresholding process.

Although our techniques are aimed mainly toward improving target detection, they can be used for other applications that also use thresholding techniques. Automatic target recognition (ATR) research has recently received considerable attention.
One popular ATR approach uses matched filtering and correlation techniques in which the postcorrelation features (e.g., the peak-to-sidelobe ratio) are subjected to threshold screening to pick and classify the recognized targets [3,4]. Therefore both the predetection and postdetection temporal integration methods can be used to improve target recognition when multiple temporal frames are involved.

In our second study [2], the temporal correlation and nonstationary properties of sensor noise were investigated by use of sequences of imagery collected by an IR (256 × 256) sensor looking at different scenes (trees, grass, roads, buildings, etc.).
The natural noise extracted from the IR sensor, as well as the computer-generated noise with Gaussian and Rayleigh distributions, was used to test and compare different temporal integration strategies. The simulation results show that both the predetection and postdetection temporal integrations can considerably improve target detection by integrating only approximately 3–5 time frames (tested by real sensor noise and by computer-generated noise). Moreover, the detection results can be further improved by combining both the predetection and postdetection temporal integrations. Finally, we propose two advanced thresholding techniques that may outperform the traditional constant-false-alarm-ratio (CFAR) technique in severe and complicated clutter situations.

2. Spatiotemporal Fusion

For a physical sensor the sensing errors are caused mainly by the measurement noise, which is generally described as a random variable (RV). For example, for an IR (infrared) sensor, the measurement noise (temporal noise) may originate from a number of sources including the scene background, atmosphere transmission, path radiance, optics, filters, sensor housing and shield, detector dark current, pixel phasing, quantization, and amplifier and readout electronics.

For target detection at the feature level, different features are extracted from the original physical measurements. In IR sensor detection of a resolved target occupying multiple pixels or of an unresolved target occupying only a single pixel, a spatial matched-filtering process is generally conducted before the detection (thresholding) process. The filter can be a Sobel edge extractor, a difference-of-Gaussian filter, a specific tuned basis function, or an optical point-spread function. The output of the filter is used as the extracted feature values for detection.

The extracted features affected by the measurement noise are also RVs. The probability density function (pdf) of a feature RV may or may not have the same distribution as the original measurement noise. For example, if the measurement noise has a Gaussian distribution and the extracted feature is a linear transform of the physical measurement (e.g., the mean or average of multiple data points is a linear feature), then the distribution of the feature RV will still be Gaussian. On the other hand, if the relationship between the extracted feature and the original measurement is nonlinear, the feature distribution will generally differ from the original one. Consider a radar sensor with Gaussian-distributed measurement noise. If we use the amplitude of the radar return's real and imaginary signals as the extracted feature, the distribution of the feature RV will be Rayleigh.

To increase the probability of detection (Pd) we must reduce the influence of the feature RVs. The influence of RVs can be decreased by reducing the variances (σ²) of the RVs, by increasing the distance (d) between the means of the two feature RVs related to the target and the clutter, or by a combination of the two. A reduction in the feature variances or an increase in the feature distances (or both) results in an increase in the signal-to-clutter-noise ratio, leading to better performance in the receiver operating characteristics (ROC), i.e., a higher Pd for a specific (same) probability of false alarms (Pfa).

There are two approaches for reducing the variance of RVs: (1) a temporal integration between time frames that averages the RVs in a window of time frames (predetection integration) and (2) a binomial persistency test with a window of time frames (postdetection integration). In 1938 Wold [5] proposed and proved a theorem that provides some insight into how temporal integration can be useful.

Wold's Fundamental Theorem: any stationary discrete-time stochastic process x(n) may be expressed in the form x(n) = u(n) + s(n), where u(n) and s(n) are uncorrelated processes, u(n) is a RV, and s(n) is a deterministic process.

Therefore, if u(n) is less temporally correlated, temporal integration will be more useful for reducing the variance of u(n). In this case temporal integration across multiple time frames (temporal fusion) can improve detection and classification results. The integrated spatiotemporal fusion is shown in Fig. 1.

Fig. 1. Spatiotemporal fusion.

Besides the temporally uncorrelated noise that is important for effective temporal integration (fusion), there is another condition that needs to be addressed. In many realistic situations the target may be moving, and the sensor platform may also be moving, relative to the background clutter. Therefore another critical condition for effective temporal fusion is the accurate tracking and association of targets and clutter objects (i.e., detected objects) at different time frames by use of navigation inertial trackers or image-based trackers (or both) or by any effective image (object) registration, association, and correlation techniques.

3. Different Fusion Strategies

In this section we present four fusion (RV combination) strategies: (1) additive, (2) multiplicative, (3) MIN, and (4) MAX. Please refer to Refs. 1 and 2 for a more detailed description of additive fusion and its advantages when adaptively weighting different sensors.


A. Additive Fusion

The additive fusion rule for two sensors (or 2 time frames) is

p(t) = p(t_1) + p(t_2),   p(c) = p(c_1) + p(c_2),   (1)

where p(t) is the fused target feature value, p(t_1) and p(t_2) are the target feature values at sensor 1 and sensor 2 (or time frames 1 and 2), respectively, p(c) is the fused clutter feature value, and p(c_1) and p(c_2) are the clutter feature values at sensor 1 and sensor 2 (or time frames 1 and 2), respectively. In a frame there are generally many more clutter feature values at different pixel locations.

Additive fusion can easily be extended to include more than two sensors (spatial fusion) or more than 2 time frames (temporal integration):

p(t) = p(t_1) + p(t_2) + ... + p(t_n),
p(c) = p(c_1) + p(c_2) + ... + p(c_n).   (2)

For two independent RVs X and Y, the combined pdf of the summation of these two RVs (Z = X + Y) is calculated as the convolution of the two individual pdf's [6,7]:

f_Z(z) = \int_0^\infty f_X(x) f_Y(z - x) dx.   (3)

In our additive fusion case (with two sensors or 2 frames) p(t) = z, p(t_1) = x, and p(t_2) = y (or p(c) = z, p(c_1) = x, and p(c_2) = y). From Eq. (3) we have

f_{p(t)}(p(t)) = \int_0^\infty f_{p(t_1)}(p(t_1)) f_{p(t_2)}(p(t) - p(t_1)) dp(t_1),   (4)

f_{p(c)}(p(c)) = \int_0^\infty f_{p(c_1)}(p(c_1)) f_{p(c_2)}(p(c) - p(c_1)) dp(c_1).   (5)

Equations (4) and (5) can be used to predict the detection performance of additive fusion because the ROC curves that result from additive fusion can be estimated from the combined pdf's in Eqs. (4) and (5).
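The following minimal numerical sketch (added here for illustration; it is not from the paper, and the grid spacing, means, and SDs are arbitrary choices) checks Eq. (3) by convolving two Gaussian feature pdf's on a grid and comparing the result against a Monte Carlo histogram of summed samples:

```python
import numpy as np
from scipy.stats import norm

# Check Eq. (3): the pdf of Z = X + Y (independent RVs) is the
# convolution of the two individual pdf's.
dx = 0.01
x = np.arange(0.0, 40.0, dx)           # feature axis (non-negative values)
f_x = norm.pdf(x, loc=10.0, scale=2.0)
f_y = norm.pdf(x, loc=12.0, scale=3.0)

f_z = np.convolve(f_x, f_y) * dx       # discrete form of Eq. (3)
z = np.arange(f_z.size) * dx

# Monte Carlo reference: histogram of summed samples.
rng = np.random.default_rng(0)
s = rng.normal(10.0, 2.0, 200_000) + rng.normal(12.0, 3.0, 200_000)
hist, edges = np.histogram(s, bins=200, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
print(np.max(np.abs(np.interp(centers, z, f_z) - hist)))  # small residual
```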

B. Multiplicative Fusion

The multiplicative fusion rule for two sensors (or 2 time frames) is

p(t) = p(t_1) × p(t_2),   p(c) = p(c_1) × p(c_2).   (6)

For two independent RVs X and Y, as shown in Eq. (7), the combined pdf of the multiplication of these two RVs (Z = X × Y) is calculated as the nonlinear convolution (with divisions of a RV) of the two individual pdf's. In Davenport's book [7] this equation appears twice (once as a homework problem); however, he does not present a proof for it. Nevertheless, we have used different ways to prove that this equation is correct:

f_Z(z) = \int_0^\infty (1/|x|) f_X(x) f_Y(z/x) dx.   (7)

In our two-sensor multiplication fusion case, from Eq. (7), we have

f_{p(t)}(p(t)) = \int_0^\infty (1/|p(t_1)|) f_{p(t_1)}(p(t_1)) f_{p(t_2)}(p(t)/p(t_1)) dp(t_1),   (8)

f_{p(c)}(p(c)) = \int_0^\infty (1/|p(c_1)|) f_{p(c_1)}(p(c_1)) f_{p(c_2)}(p(c)/p(c_1)) dp(c_1).   (9)

C. Relationship between Additive and Multiplicative Fusions

If we take the logarithm of both sides of the multiplicative fusion equation (6), we have

ln p(t) = ln p(t_1) + ln p(t_2),
ln p(c) = ln p(c_1) + ln p(c_2).   (10)

The multiplication term becomes two additive terms of logarithm functions in each of the equations. If we have two RVs with log-normal pdf's, the equations above indicate that the multiplicative fusion of two RVs with log-normal distributions is equivalent to the additive fusion of two RVs with normal distributions.

D. Minimum and Maximum Fusions

The conjunction "AND" and disjunction "OR" are two frequently used combination rules in fuzzy logic. As derived in Ref. 6, for two independent RVs X and Y, the combined pdf of the conjunction of these two RVs (Z = min(X, Y)) is given as

f_Z(z) = f_X(z)[1 - F_Y(z)] + f_Y(z)[1 - F_X(z)],   (11)

where F(z) is the cumulative distribution function. Similarly, for two independent RVs X and Y, the combined pdf of the disjunction of these two RVs (Z = max(X, Y)) is given as

f_Z(z) = f_X(z) F_Y(z) + f_Y(z) F_X(z).   (12)

For our two-object problem, the MIN (conjunction) fusion is

p(t) = min(p(t_1), p(t_2)),
p(c) = min(p(c_1), p(c_2)).   (13)


The MAX (disjunction) fusion is

p(t) = max(p(t_1), p(t_2)),
p(c) = max(p(c_1), p(c_2)).   (14)
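As an illustration (not from the paper; the means and SDs below are arbitrary), Eqs. (11) and (12) can be checked numerically against Monte Carlo samples of min(X, Y) and max(X, Y):

```python
import numpy as np
from scipy.stats import norm

# Check Eqs. (11) and (12): pdf's of the MIN (conjunction) and MAX
# (disjunction) of two independent Gaussian RVs.
z = np.linspace(-10.0, 30.0, 401)
f_x, F_x = norm.pdf(z, 8.0, 3.0), norm.cdf(z, 8.0, 3.0)
f_y, F_y = norm.pdf(z, 12.0, 4.0), norm.cdf(z, 12.0, 4.0)

f_min = f_x * (1 - F_y) + f_y * (1 - F_x)   # Eq. (11)
f_max = f_x * F_y + f_y * F_x               # Eq. (12)

rng = np.random.default_rng(1)
xs = rng.normal(8.0, 3.0, 200_000)
ys = rng.normal(12.0, 4.0, 200_000)
for f_pred, samples in ((f_min, np.minimum(xs, ys)),
                        (f_max, np.maximum(xs, ys))):
    hist, _ = np.histogram(samples, bins=z, density=True)
    mid = 0.5 * (f_pred[:-1] + f_pred[1:])   # pdf at bin centers (approx.)
    print(np.max(np.abs(mid - hist)))        # both residuals are small
```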

E. Postdetection Integration (Persistency Test)

The terms predetection and postdetection integration were originally used in radar sensor detection [8]; however, they can equally be applied to IR sensor detection. For both detection methods a temporal moving integration window (typically containing several frames N, e.g., N = 5 or N = 7) is first selected. In the predetection method, one of the fusion strategies discussed in the previous section is applied to the frames within the window. The fused feature values are then used for detection (with thresholding). In the postdetection method (also called the persistency test), detection (thresholding) is first performed on each image frame within the moving window (with N frames). Then the number of detections k (k ≤ N) for a detected object is evaluated over the N frames. For example, for a criterion of 5 out of 7, if an object was detected in 5 or more frames within a moving window of 7 frames, the detected object is considered a target; otherwise, it is considered a noise or clutter detection.

Figure 2(a) shows the pdf's for the noise and a target in a single frame with a SD (standard deviation) of 5. Figure 2(b) shows the pdf's after averaging 25 frames (the predetection integration, which is equivalent to additive fusion). The SD of the pdf's in Fig. 2(b) is reduced by a factor of 5. The accumulated probability curves (the error functions) of the pdf's for the target and the noise in Figs. 2(a) and 2(b) are plotted in Fig. 2(c), where the solid curves denote the single frame and the dashed curves represent the average of 25 frames. For the predetection integration, the ROC curves are obtained by directly plotting the accumulated probability curves of the target and the noise shown in Fig. 2(c) as the y and x axes, respectively, in Fig. 2(d). For a k-out-of-N postdetection integration, the accumulated probability curves must be transferred to postdetection accumulated probability curves by use of the following binomial equation:

P(k:N) = \sum_{j=k}^{N} \binom{N}{j} p^j (1 - p)^{N-j},   (15)

where p is a specific probability value from the accumulated probability curves in Fig. 2(c). Therefore all the values of a curve in Fig. 2(c) can be transferred to a new curve by use of Eq. (15). It can be seen that Eq. (15) contains all the probabilities of k out of N, (k + 1) out of N, . . . , up to N out of N. A ROC curve for the postdetection integration is obtained by directly plotting the transferred accumulated probability curves of the target and the noise as the y and x axes, respectively, in Fig. 2(d).

Several ROC curves are plotted in Fig. 2(d). The top and bottom (solid and dashed) curves are a 7-frame average and a 3-frame average, respectively. The middle two (dash-dotted and dotted) curves are 5-out-of-7 and 6-out-of-7 persistency test results, respectively. It can be seen from Fig. 2(d) that for a same-size frame window (e.g., 7 frames), the predetection integration performs a little better than the postdetection integration.

Fig. 2. ROC performances of predetection and postdetection integration. The pdf's for (a) a single frame and (b) a 25-frame average. (c) Accumulated probability curves for a single frame (solid curves) and a 25-frame average (dashed curves). (d) ROC curves: solid curve (top), predetection averaging 7 frames; dashed curve (bottom), predetection averaging 3 frames; dash-dotted curve (upper middle), postdetection 5 out of 7; dotted curve (lower middle), postdetection 6 out of 7.
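A small worked example of Eq. (15) (illustrative only; the probability values are arbitrary) shows how a k-out-of-N persistency test separates a moderately detectable target from noise:

```python
from math import comb

# Eq. (15): probability of at least k detections in N frames when a
# single frame yields a detection with probability p.
def post_detection(p: float, k: int, n: int) -> float:
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

# A 5-out-of-7 test boosts a p = 0.8 target and suppresses p = 0.1 noise.
print(post_detection(0.8, 5, 7))  # ~0.85    (target persists)
print(post_detection(0.1, 5, 7))  # ~0.00018 (noise rarely persists)
```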


4. Simulations of Target Detection

As shown in Fig. 3, we simulated both IR and RF sensors for target detection improvements by use of spatiotemporal fusion. Spatial fusion was conducted between the IR and the RF frames (predetection integration only), whereas temporal fusion (integration) was conducted across several time frames for each sensor (both predetection and postdetection integrations). Two target situations were simulated: (1) one target in the scene and (2) two targets in the scene. In general, the single-target case has fewer adjustable parameters and thus would be easier to use to compare performances from different fusion strategies than would the multiple-target case. However, the multiple-target case occurs in many realistic situations. A two-target case is shown in Fig. 3. In this simulation we used static targets and clutter and presumed perfect object tracking or registration (or both) across multiple time frames.

Fig. 3. Target detection in a two-target situation: squares, target 1; circles, target 2; triangles, clutter noise.

Fifty random data samples (related to 50 time frames) were generated as performance data sets for each object (target or clutter noise) to evaluate the detection performance. Detection was conducted with the traditional CFAR strategy. For a specific CFAR threshold, each detected target at one of the 50 frames counts for 2% of the Pd in the single-target case and 1% in the two-target case. The noise in IR is simulated as a normal distribution with a SD of 20, and the noise in RF is simulated as a Rayleigh distribution with a SD of 10. Figure 4 shows the pdf's of a target and clutter noise, both with normal distributions. In the single-target case the separation of the means between the target and the clutter-noise group is set as S = m_t - m_c = 19 for IR and S = 10 for RF. In the two-target case, S_1 = 19 and S_2 = 25 for IR; S_1 = 10 and S_2 = 17 for RF.

Fig. 4. Gaussian pdf's of target versus clutter noise. Arrowed line represents the single threshold.

The detection ROC performance curves without any temporal integration (single frame) are shown in Fig. 5 as a baseline performance to compare with different temporal fusion strategies. Figure 5(a) shows the baseline result from the IR sensor, whereas Fig. 5(b) shows the same result from the RF sensor. The y axis is the Pd, and the x axis is the false-alarm number per frame. The curve with circles denotes the result from the single-target case, and the curve with squares denotes the result from the two-target case. It is apparent that for a false-alarm rate of 2 false alarms per frame the Pd is approximately 75% for IR and 87% for RF and that the single-target case performs a little better than the two-target case.

Fig. 5. Baseline (single-frame) detection ROC performance for (a) IR and (b) RF sensors.
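A rough sketch of the baseline single-frame experiment follows (the 50 frames, SD of 20, and separation S = 19 come from the text; the number of clutter points per frame and the threshold sweep are assumptions made for illustration):

```python
import numpy as np

# Baseline single-frame detection: sweep a fixed threshold over target
# and clutter feature values and report Pd and false alarms per frame.
rng = np.random.default_rng(2)
n_frames, n_clutter = 50, 100                 # n_clutter is an assumption
clutter = rng.normal(0.0, 20.0, (n_frames, n_clutter))
target = rng.normal(19.0, 20.0, n_frames)     # S = m_t - m_c = 19 (IR case)

for thr in np.linspace(-20.0, 80.0, 6):
    pd = np.mean(target > thr)                # each frame counts 2% of Pd
    fa = np.mean(np.sum(clutter > thr, axis=1))
    print(f"thr={thr:6.1f}  Pd={pd:4.2f}  false alarms/frame={fa:6.2f}")
```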

A. Additive Spatial Fusion versus Additive Temporal Fusion

For the four different fusion strategies discussed in Section 3, our simulation results for target detection have shown that multiplicative fusion performs the same as additive fusion and that MIN fusion performs better than MAX fusion. In this paper we show the results from the additive and MIN fusion strategies.

The detection ROC performance curves for the single-target IR-sensor case are shown in Fig. 6(a), and the detection ROC performance curves for the two-target IR-sensor case are shown in Fig. 6(b).


The curve with the circles shows the baseline performance (single frame). The curve with the triangles shows the result of spatial additive fusion between the IR and the RF sensors, whereas the curve with the squares shows the result of additive temporal fusion by integrating a time-frame window of 3. Similar results for the RF sensor are shown in Figs. 6(c) and 6(d). It is found that the spatial fusion improves detection and performs better than a single sensor used alone. The IR (the worse sensor) improved more than the RF (the better sensor) did. Furthermore, the temporal fusion, which uses three time frames, outperforms the spatial fusion, which uses only two sensors. In general, if the noise is uncorrelated across frames, a temporal fusion with N = 2, 3, . . . frames would perform similarly to a spatial fusion with N sensors. In Section 5 we discuss in greater detail the noise correlation properties between frames. The results of additive temporal fusion using five time frames are shown in Fig. 7. It can be seen that target detection can be further improved by increasing the time window of integration.

Fig. 6. Additive spatiotemporal fusion (window = 3 frames) for IR one-target (a) and two-target (b) cases and RF one-target (c) and two-target (d) cases.

B. Additive Temporal Fusion versus Minimum Temporal Fusion

The results comparing additive fusion with MIN fusion for an integration window of five frames are shown in Fig. 8. Both additive and MIN fusions with multiple frames improve target detection. For the IR sensor (with a normal noise distribution) additive fusion always outperforms MIN fusion in both the single-target and two-target cases, as shown in Figs. 8(a) and 8(b); however, for the RF sensor (with a Rayleigh noise distribution) MIN and additive fusion perform equally well in both the single- and two-target cases, as shown in Figs. 8(c) and 8(d).

C. Postdetection Integration (Persistency Test)

The persistency test has been discussed and shown in Subsection 3.E and in Fig. 2. Persistency test results for both IR and RF sensors are shown in Fig. 9. The three curves in each figure represent the persistency test for K-out-of-N frames (K = 2, 3, 4 and N = 5). The three curves in Fig. 9 show detection improvements similar to those in Fig. 2(d).

D. Additive Fusion versus Persistency Test

Figure 10 shows the results of additive fusion (curve with the squares) and the persistency test (curve with the triangles) for both the IR and RF sensors by integrating a time-frame window of 5. It is found from Fig. 10 that by integrating only five frames, use of both additive fusion and the persistency test can significantly improve target detection from the baseline (single frame), with additive fusion performing a little better than the persistency test.

Additive fusion and the persistency test can also be complementary to each other and can be combined to further improve target detection. Results with an integration window of five frames are shown in Fig. 11. The curves with the triangles show the ROC performance of the persistency test, the curves with the squares show the ROC performance of the additive fusion, and the curves with the circles show the combined ROC performance of additive fusion and the persistency test.

5. Temporal Correlation Properties of Real IR Noise

As discussed in Section 4, the performance of temporal integration depends on the temporal correlation properties of the sensor noise. Better performance can be achieved if the noise across the time frames is less correlated. In the simulation results presented in the previous section, we used computer-generated random noise that is generally uncorrelated between frames.


What is the temporal correlation property of the real sensor noise? To answer this question, we extracted and studied multiple-frame noise from an InSb IR focal-plane array with 256 × 256 pixels. Several imagery sequences (with 50 time frames) were collected with this IR sensor looking at different clutter scenes (trees, grass, roads, buildings, etc.).

Studies of the natural IR noise have revealed that (1) the sensor noise in most (>95%) of the sensor pixels is nearly stationary, is uncorrelated between pixels, and is almost uncorrelated across time frames; and (2) the noise in a few pixels (e.g., the grass along the road) has shown nonstationary properties (with increasing or decreasing mean across time). Figure 12(b) shows a typical stationary and uncorrelated noise sequence (50 frames) from a specific pixel. Its autocorrelation function is shown in Fig. 12(a). Figure 12(d) shows a typical nonstationary noise sequence with a decreasing mean across time. Its autocorrelation function, with high temporal correlation, is shown in Fig. 12(c).

Fig. 7. Additive spatiotemporal fusion (window = 5 frames) for IR one-target (a) and two-target (b) cases and RF one-target (c) and two-target (d) cases.

Fig. 8. Additive and MIN fusions (window = 5 frames) for IR one-target (a) and two-target (b) cases and RF one-target (c) and two-target (d) cases.


Figure 12(e) shows the autocorrelation function of a Gaussian random-noise sequence (50 frames) generated by a computer (this noise was also used in the simulation discussed in the previous section). It can be seen in Figs. 12(a) and 12(e) that the natural noise from most of the IR pixels and the computer-generated noise have similar autocorrelation functions and that both are highly uncorrelated across time frames.

From the natural IR noise we notice that the nonstationary noise at a specific pixel always shows high values off the center peak in the correlation function. To understand whether these high values are caused by the nonstationary properties alone or arise in conjunction with temporal correlation, we detrended the nonstationary noise sequences and removed the increasing or decreasing means. We then found that the detrended noise (a stationary process) becomes temporally uncorrelated, with low values off the center peak in the correlation function. This finding indicates that the noise at pixels with high off-center correlation values is nonstationary but not temporally correlated. One such example of the noise detrend is shown in Fig. 13.

Fig. 9. Persistency test (window = 5 frames) for IR one-target (a) and two-target (b) cases and RF one-target (c) and two-target (d) cases.

Fig. 10. Additive fusion and persistency test (window = 5 frames) for IR one-target (a) and two-target (b) cases and RF one-target (c) and two-target (d) cases.
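The detrend test described above can be sketched as follows (illustrative only; the trend slope and noise SD are made-up values, and a least-squares line stands in for whatever detrending procedure the authors used):

```python
import numpy as np

# Detrend a nonstationary 50-frame pixel sequence and compare
# autocorrelations before and after removing the changing mean.
rng = np.random.default_rng(3)
t = np.arange(50)
seq = 0.4 * t + rng.normal(0.0, 1.0, 50)       # rising mean + white noise

trend = np.polyval(np.polyfit(t, seq, 1), t)   # least-squares linear fit
detrended = seq - trend

def autocorr(x):
    x = x - x.mean()
    r = np.correlate(x, x, mode="full")
    return r / r.max()                          # lag 0 normalized to 1

print(np.round(autocorr(seq)[50:55], 2))        # high off-center values
print(np.round(autocorr(detrended)[50:55], 2))  # near zero off the peak
```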

Figure 13(a) shows a nonstationary noise sequence with an increasing mean, whose autocorrelation function is shown in Fig. 13(b). Figure 13(c) shows the same noise after the detrend process, and its autocorrelation function is shown in Fig. 13(d). It is apparent that the autocorrelation function in Fig. 13(d) has much lower off-center-peak values than that in Fig. 13(b). That is, the detrended noise is temporally uncorrelated.

We have applied real IR noise to test our different temporal fusion strategies in predetection integration and in postdetection temporal integration. The performances with the stationary real IR noise are similar to those with the computer-generated noise, as shown in the previous section.

Fig. 11. Combination of additive fusion and persistency test (window = 5 frames) for IR one-target (a) and two-target (b) cases and RF one-target (c) and two-target (d) cases.

Fig. 12. Autocorrelations of real and computer-generated noise: (a) of the stationary IR noise sequence in (b), (c) of the nonstationary IR noise sequence in (d), and (e) of a computer-generated stationary noise sequence. GWN denotes Gaussian white noise.


Figure 14(b) shows a stationary target-noise sequence (50 frames, solid curve) and a stationary clutter-noise sequence (dashed curve). These noise sequences were extracted from different pixels of the IR sensor. The separation of the means between the target and the clutter-noise groups is set as S = m_t - m_c = 2.3 for the single-target case and S_1 = 2.3, S_2 = 2.3 for the two-target case. The target detection ROC performances for the single-target case are shown in Fig. 14(a). The curve with the circles shows the baseline (single-frame) performance. The curve with the triangles shows the performance of the persistency test with an integration window of 3 frames (2 out of 3), and the curve with the squares shows the performance of additive fusion with an integration window of 3 frames. It is apparent that both the predetection and postdetection integration methods can considerably improve target-detection performance.

Figure 14(d) shows a nonstationary target-noise sequence (solid curve) with a decreasing mean and a stationary clutter-noise sequence (dashed curve). The target detection ROC performances are shown in Fig. 14(c). It can be seen that the detection performances are much worse than the results shown in Fig. 14(a). We discuss the nonstationary noise situation more fully in Section 6.

Fig. 13. (a) Nonstationary noise sequence before detrend, with autocorrelation function (b). im(143,:) denotes the 1D slice cut from the 143rd row of the 2D image. (c) Same noise sequence after detrend, with autocorrelation function (d).

Fig. 14. Target detection with real IR sensor noise: (a) single-target detection performance; (b) stationary target noise sequence; (c) two-target detection performance; (d) nonstationary target noise sequence.

Page 11: Target Detection and Recognition Improvements by Use of Spatiotemporal Fusion

ttrRgstftt

6

AC

IdtsaiefiNdff

evaitlvoaa

B

IsvdFatrc�l�ofuftTw

tttawestvc

t

FIt

The results of combining predetection and postdetection integration with real IR noise for the single- and two-target cases are shown in Figs. 15(a) and 15(b), respectively. The curves with triangles show the ROC performance of the persistency test with an integration window of 3 frames, the curves with squares show the ROC performance of the additive fusion, and the curves with circles show the combined ROC performance of the additive fusion and the persistency test. It is found that use of this combination can further improve target detection performance.

6. Discussion and Some Further Thoughts

A. Temporal Fusion and IR Sensor Nonuniformity Correction

In the traditional nonuniformity correction (NUC) design, frame subtraction is generally used to subtract out the fixed-pattern noise. However, direct subtraction of two adjacent frames doubles the variance of the temporal noise. To avoid a large increase in temporal noise, the NUC design applies an iteration feedback loop, and only a small fraction of the fixed-pattern noise is subtracted out at each iteration. Nevertheless, if we apply temporal integration to the detection system after the NUC process, we can afford the direct subtraction between two nearby frames and can reduce the noise even further. For example, the sum of n original frames results in a variance of n × v, where v is the single-frame variance. On the other hand, because all the variances in the middle frames cancel out and only the two variances in the first and the last frames are left over, the sum of n subtracted frames results in a variance of 2 × v. Therefore, for an average of n original frames, the resulting variance is v/n; when averaging n subtracted frames, the resulting variance is 2v/n², and 2v/n² < v/n when n > 2.

B. Double-Thresholding Detection Scheme

If the feature values of all the different clutters in a scene are larger (or smaller) than the target feature value, as indicated in Fig. 4, the traditional CFAR detection scheme still works. For the example in Fig. 4, the CFAR scheme always treats an object with a feature value below the threshold as clutter and one above the threshold as a target. However, in reality, the clutter situations may be very complicated. As shown in Fig. 16, some clutter groups (e.g., some trees or roads) may have feature values lower than the target, whereas other clutter groups (e.g., some decoy-like objects or countermeasure objects) may look more like the target and thus have feature values higher than the target. In these situations the traditional CFAR scheme will partially fail because it uses only a single-thresholding scheme that can threshold out only one of the clutter groups. This increases the likelihood that the other groups will be incorrectly detected as targets.

To address the situation in which some clutter feature values are larger and some smaller than the target feature value, we propose a double-thresholding scheme with one upper-bound threshold and one lower-bound threshold. When combined with temporal integration, this technique can considerably improve target detection. For example, as shown in Fig. 16, suppose the two clutter groups and the target have Gaussian distributions with the same variance and the separation of the target from each of the two clutter groups is 2σ (i.e., 2 SDs):

m_t - m_{c1} = m_{c2} - m_t = 2σ.

If we set the double thresholds at 1σ below and above the target mean m_t, as indicated by the two arrowed lines in Fig. 16, then the detection criterion is set so that only an object with a feature value larger than the lower-bound threshold and smaller than the upper-bound threshold is assigned as a detection.

Fig. 15. Combination of predetection and postdetection with real IR sensor noise (integration window = 3 frames) for (a) single-target and (b) two-target cases.

Fig. 16. Gaussian pdf's of multiple clutter types versus the target.

Page 12: Target Detection and Recognition Improvements by Use of Spatiotemporal Fusion

ltthTtcctatuadcfT9rsfm

C

Astaigtmaodsfflttchtm1

eSclrBtcccfwmwo

onuon

7

Sttmdtfi

serntlstd�gtmttcd

sntmvftbFsfbf

sitbnti

tttpCs

4

This is a 2σ probability: the Pd for a Gaussian distribution is approximately 68%, and the Pfa caused by the two clutter groups is approximately 32% (16% + 16%). These percentages represent the baseline performance for traditional single-frame detection. However, if we apply the temporal integration of 9 frames (assuming that the noise across the time frames is temporally uncorrelated) with the additive fusion (equivalent to averaging 9 frames; see Refs. 1 and 2 for a detailed description), then the standard deviations for the clutter groups and the target will be reduced by a factor of 3. This then becomes a 6σ probability: the Pd is correspondingly increased to more than 99%, and the Pfa caused by the two clutter groups is reduced to less than 2%. To appropriately select the two thresholds with this technique, we prefer to have prior knowledge of the target mean, which may be available from good training data.
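The percentages above can be reproduced directly from the Gaussian cdf (a minimal sketch of the arithmetic, with m_t and σ set to 0 and 1 for convenience):

```python
from scipy.stats import norm

# Double thresholding: target at m_t, clutter groups at m_t +/- 2*sigma,
# thresholds at m_t +/- 1*sigma; a detection must fall between them.
m_t, sigma = 0.0, 1.0
lo, hi = m_t - sigma, m_t + sigma

def p_inside(mean, sd):
    return norm.cdf(hi, mean, sd) - norm.cdf(lo, mean, sd)

# Single frame: Pd ~ 68%, Pfa from the two clutter groups ~ 32%.
print(p_inside(m_t, sigma),
      p_inside(m_t - 2 * sigma, sigma) + p_inside(m_t + 2 * sigma, sigma))

# Averaging 9 uncorrelated frames divides each SD by 3: Pd > 99%, Pfa < 2%.
print(p_inside(m_t, sigma / 3),
      p_inside(m_t - 2 * sigma, sigma / 3) + p_inside(m_t + 2 * sigma, sigma / 3))
```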

C. Reverse-Thresholding Detection Scheme

Another situation in which the traditional CFAR scheme will fail is when nonstationary targets or clutter groups exist. This is shown in Fig. 14(d), where a nonstationary target with a decreasing mean exists. At an earlier time moment the target mean is greater than the clutter mean, whereas at a later time moment the target mean is less than the clutter mean. In a traditional CFAR single-thresholding approach, we set a single threshold and assign any object with a feature value above this threshold as a detected target. Note that for the traditional CFAR scheme, the threshold itself changes (floats) from frame to frame in order to maintain a constant false-alarm rate. This approach works at earlier time moments, when the target mean is larger than the clutter mean. However, it fails when the target mean falls below the clutter mean. In this case the false detection of the clutter group as a target has a higher probability than the correct detection of the real target. That is why the detection performances in Fig. 14(c) are much worse than those in Fig. 14(a).

Similarly, a nonstationary clutter situation can be easily understood from the results shown in Fig. 16. Suppose that at an earlier moment the nonstationary clutter with an increasing mean was at the clutter 1 location, and at a later time moment it moved to the right-hand side of the target, at the clutter 2 location. Based on these observations, we propose a reverse-thresholding scheme to deal with the nonstationary case. As shown in Fig. 16, when the nonstationary clutter mean is less than the target mean, we set the criterion for a detection assignment as when the object's feature value is above the threshold. However, when the clutter mean is greater than the target mean, we set the detection-assignment criterion as when the object's feature value is below the threshold. Real-time measurements of the changing mean of a nonstationary process are necessary for this technique. These measurements may be obtained by use of a temporal moving window or with the Wiener or Kalman filtering techniques. Further research is needed in this area.
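A minimal sketch of the reverse-thresholding decision rule follows (illustrative only; the running mean estimates are assumed to come from a moving window or a Wiener/Kalman tracker, as suggested above):

```python
# Reverse thresholding: flip the comparison direction whenever the
# estimated clutter mean rises above the estimated target mean.
def reverse_threshold_detect(feature: float, threshold: float,
                             clutter_mean: float, target_mean: float) -> bool:
    if clutter_mean < target_mean:
        return feature > threshold   # usual single-threshold comparison
    return feature < threshold       # reversed for nonstationary clutter

print(reverse_threshold_detect(5.2, 4.0, clutter_mean=2.0, target_mean=6.0))  # True
print(reverse_threshold_detect(3.1, 4.0, clutter_mean=7.0, target_mean=2.5))  # True
```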

7. Summary

Sensor and data fusion techniques have proved to be effective ways to improve target detection and recognition [9,10]. Current research in this field concentrates mainly in the direction of spatial fusion (fusion from different sensors). In this paper we have shown that temporal fusion (i.e., fusion across multiple time frames within a specific sensor) can also considerably improve target detection and recognition.

A critical parameter for temporal fusion is the fusion window size of multiple time frames. In general, the larger the window size, the better the fused results that are achieved. However, under some nonstationary situations or in the presence of large tracking errors (or both), a large window will cause large uncorrelated errors. In this paper we have shown that both the predetection and postdetection temporal integrations considerably improve target detection by integrating only 3–5 time frames (tested by real sensor noise as well as by computer-generated noise). The newly developed predetection temporal integration techniques (additive, multiplicative, or MIN fusion) perform a little better than the traditional postdetection temporal integration technique (the persistency test). Detection results can be further improved by combining both the predetection and postdetection temporal integrations.

As discussed in Section 1, although most examples shown in this paper are for target detection, the techniques developed in this study can also be used for target recognition (such as the ATR approach with matched filtering and correlation techniques), provided multiple time frames are available. Note that fusion is conducted in the feature domain by fusing tracked object features across different time frames; it is not conducted in the original image domain. For example, if the extracted feature is the peak-to-sidelobe ratio of ATR correlation, the ATR with fused features across multiple time frames will perform better than the ATR with a feature from only a single frame.

Clutter noise from an IR sensor looking at real scenes (trees, grass, roads, buildings, etc.) was studied. The sensor clutter noise in more than 95% of the sensor pixels is near stationary and uncorrelated between pixels as well as across time frames. The noise in a few pixels (those looking at the grass near the road edge) shows nonstationary properties (with increasing or decreasing mean across time frames).

Based on observations of the real IR sensor clutter noise, we proposed two advanced thresholding techniques: double thresholding and reverse thresholding. They may perform well in some complicated clutter situations in which the traditional CFAR single-thresholding technique may fail. Subsection 6.B presents a simple example of the double-thresholding technique in a complicated clutter situation with a mix of two clutter types.


The double-thresholding technique, in combination with temporal fusion of multiple time frames, can improve the Pd from 68% to 99%. One difficulty in the actual application of the double-thresholding technique is that we must have some prior knowledge of the target mean and distribution to set the upper- and lower-bound thresholds. In general this information can be obtained from reliable training data. The results of this simple example should be used with caution because, in reality, the clutter types may number more than two and the noise across the time frames may not be totally temporally uncorrelated.

The training data reveal that, if we encounter clutter groups with a pdf that is broader than that for the target, we must then investigate whether the broad clutter pdf is caused by nonstationary noise with a time-variant mean or by a mix of different clutter types with different stationary means. Once this is known, we can accordingly select different detection techniques, such as the newly proposed double-thresholding or reverse-thresholding schemes discussed in Subsections 6.B and 6.C. Finally, we should point out that it is critical to further investigate and understand the nonstationary noise property under different weather conditions and different background scenes and textures, such as grass, trees, sand, and water surfaces.

References

1. H.-W. Chen and T. Olson, "Integrated spatiotemporal multiple sensor fusion system design," in Sensor Fusion: Architectures, Algorithms, and Applications VI, B. V. Dasarathy, ed., Proc. SPIE 4731, 204–215 (2002).
2. H.-W. Chen and T. Olson, "Adaptive spatiotemporal multiple sensor fusion," Opt. Eng. 42, 1481–1495 (2003).
3. A. Mahalanobis, B. V. K. Vijaya Kumar, S. R. F. Sims, and J. Epperson, "Unconstrained correlation filters," Appl. Opt. 33, 3751–3759 (1994).
4. A. Mahalanobis, B. V. K. Vijaya Kumar, and S. R. F. Sims, "Distance-classifier correlation filters for multiclass target recognition," Appl. Opt. 35, 3127–3133 (1996).
5. S. Haykin, Adaptive Filter Theory (Prentice-Hall, Englewood Cliffs, N.J., 1986).
6. A. Papoulis, Probability, Random Variables, and Stochastic Processes, 3rd ed. (McGraw-Hill, New York, 1991).
7. W. B. Davenport, Probability and Random Processes (McGraw-Hill, New York, 1970).
8. M. I. Skolnik, Radar Handbook (McGraw-Hill, New York, 1970).
9. E. Waltz and J. Llinas, Multisensor Data Fusion (Artech House, Norwood, Mass., 1990).
10. L. A. Klein, Sensor and Data Fusion Concepts and Applications, 2nd ed. (SPIE Press, Bellingham, Wash., 1999).
