Research Article

FPGA-Based Real-Time Moving Target Detection System for Unmanned Aerial Vehicle Application

Jia Wei Tang, Nasir Shaikh-Husin, Usman Ullah Sheikh, and M. N. Marsono

Faculty of Electrical Engineering, Universiti Teknologi Malaysia (UTM), 81310 Skudai, Johor Bahru, Malaysia

Correspondence should be addressed to Nasir Shaikh-Husin; [email protected]

Received 15 November 2015; Revised 5 February 2016; Accepted 10 March 2016

Academic Editor: João Cardoso

International Journal of Reconfigurable Computing, Volume 2016, Article ID 8457908, 16 pages. http://dx.doi.org/10.1155/2016/8457908

Copyright © 2016 Jia Wei Tang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Moving target detection is the most common task for an Unmanned Aerial Vehicle (UAV): finding and tracking objects of interest from a bird's-eye view in mobile aerial surveillance for civilian applications such as search and rescue operations. The complex detection algorithm can be implemented in a real-time embedded system using a Field Programmable Gate Array (FPGA). This paper presents the development of a real-time moving target detection System-on-Chip (SoC) using an FPGA for deployment on a UAV. The detection algorithm utilizes an area-based image registration technique which includes motion estimation and object segmentation processes. The moving target detection system has been prototyped on a low-cost Terasic DE2-115 board mounted with a TRDB-D5M camera. The system consists of a Nios II processor and stream-oriented dedicated hardware accelerators running at a 100 MHz clock rate, achieving a 30 frames per second processing speed for 640 × 480 pixel greyscale video.

1. Introduction

Unmanned Aerial Vehicles (UAVs) play an important role in mobile aerial monitoring operations and have been widely applied in diverse applications such as aerial surveillance, border patrol, resource exploration, and combat and military applications. Due to their mobility, UAVs have also been deployed for search and rescue operations [1] by acquiring high-resolution images of disaster areas. Apart from that, several studies [2, 3] have also addressed traffic monitoring using UAVs. As most monitoring systems require detecting and tracking objects of interest, moving target detection is a typical process in a UAV monitoring system [4].

Moving target detection is the process of locating moving objects (foreground) residing in a static scene (background) from a series of visual images captured by a camera. As the displacement of an object in subsequent video frames defines its movement, at least two successive video frames are required for processing. An object is defined as a moving target if it is located in two different positions relative to the background in two frames taken at different times. Thus, a background model is required to represent the static scene in incoming video frames prior to the segmentation of moving objects.

Background models can be categorized by the type of camera movement [5]: stationary camera, pan-tilt-zoom camera, free camera motion with a planar scene, and free camera motion with complex scene geometry. Detection and segmentation of moving objects against a stationary background (static camera) can be performed easily using background subtraction techniques [6–11], whereas an image registration technique is required for a moving background (moving camera), involving ego-motion (camera motion) estimation and compensation to align the backgrounds of selected video frames prior to object segmentation. The scene in aerial imagery in UAV video is assumed to be planar [12]. The ego-motion for a planar scene can be estimated using a homography transformation such as the affine model. Hence, a moving object can be detected by registering the video frame to the estimated model and employing background subtraction with this registered model. This approach does not handle scenes with significant depth variations, as these cause incorrect registration due to parallax.

Due to the complexity of computer vision algorithms, moving target detection in aerial imagery is a time-consuming process. It is also not practical to rely on a ground processing station via a radio link, as video quality greatly depends on the wireless communication speed and stability.


In addition, a fully autonomous UAV is desirable, as it can operate and react towards detected targets with minimal human intervention [13]. Thus, an autonomous UAV demands a system with high mobility and high computing capability to perform detection on the platform itself. The use of a Field Programmable Gate Array (FPGA) satisfies the low power consumption, high computing power, and small circuitry requirements of a UAV system. An FPGA-based system is a good solution for real-time computer vision problems on mobile platforms [14] and can be reconfigured to handle different tasks according to the desired application.

This paper presents an FPGA implementation of a real-time moving target detection system for UAV applications. The detection algorithm utilizes an image registration technique which first estimates the ego-motion between two subsequent frames using block matching (area-based matching) and the Random Sample Consensus (RANSAC) algorithm. After compensating for the ego-motion, frame differencing, median filtering, and morphological processing are utilized to segment the moving object. The contributions of this paper are as follows:

(i) Development of real-time moving target detection in a System-on-Chip (SoC), attaining a 30 frames per second (fps) processing rate on 640 × 480 pixel video.

(ii) Prototyping of the proposed system on a low-cost FPGA board (Terasic DE2-115) mounted with a 5-megapixel camera (TRDB-D5M), occupying only 13% of the total combinational functions and 13% of the total memory bits.

(iii) Partitioning and pipeline scheduling of the detection algorithm in a hardware/software (HW/SW) codesign for maximum processing throughput.

(iv) Stream-oriented hardware accelerators, including the block matching and object segmentation modules, which are able to operate at one cycle per pixel.

(v) Analysis of detection performance with different densities of area-based ego-motion estimation and frame differencing thresholds.

The rest of the paper is organized as follows. Section 2 discusses the literature on moving target detection. Section 3 describes the moving target detection algorithm, while Section 4 describes the SoC development and the specialized hardware architecture of moving target detection. Section 5 presents the detection results from the complete prototype. Section 6 concludes this paper.

2. Related Work

Moving target detection targeting aerial videos or UAV applications has been widely researched in the past few decades. A framework consisting of ego-motion compensation, motion detection, and object tracking was developed in [15]. The authors used a combination of feature- and gradient-based techniques to compensate the ego-motion, while utilizing accumulative frame differencing and background subtraction to detect moving vehicles in aerial videos.

Table 1: Comparison between related works on FPGA-based object detection systems for different applications and the proposed system.

Related work | Camera platform | Detection technique and application
[6–11] | Static camera | Background subtraction using GMM, ViBE, and so forth
[23] | Moving robot | Detecting moving objects using optical flow and frame differencing
[24] | UAV | Detecting and tracking object features
[13] | UAV | Car detection based on shape, size, and colour
[25] | UAV | Detecting moving objects using regional phase correlation; does not prototype the complete system
[26] | UAV | Real-time ego-motion estimation
Proposed | UAV | Moving target detection using area-based image registration; prototypes the complete system

The research in [16] presented two different approaches to detect and track moving vehicles and persons using a Histogram of Oriented Gradients (HoG) based classifier. The work in [17] proposed a moving target detection method that performs motion compensation, motion detection, and tracking in parallel, by including data capture and collaboration control modules. A multiple target detection algorithm was proposed in [18], catering for a large number of moving targets in wide area surveillance applications. Moving target detection and tracking at different altitudes were presented and demonstrated on UAV-captured videos in [19]. A feature-based image registration technique was proposed in [20] to detect moving objects in UAV video. The authors utilized corner points in subsequent video frames as features to perform ego-motion estimation and compensation. In [21], a multimodel estimation for aerial video was proposed to detect moving objects in complex backgrounds, which is able to remove buildings, trees, and other false alarms in detection. As these works focused on improving the detection algorithm for different cases and did not consider autonomous UAV deployment, they developed their systems on common desktop computers [17, 19–21] or in Graphics Processing Unit (GPU) accelerated [22] environments.

In the context of FPGA-based object detection systems, most works in the literature target static cameras [6–11], as illustrated in Table 1. They utilize background subtraction techniques, such as the Gaussian Mixture Model (GMM) and ViBE (Visual Background Extractor), to perform foreground object segmentation in static background video. The work in [23] proposed FPGA-based moving object detection for a walking robot, implementing ego-motion estimation using an optical flow technique and frame differencing in a hardware/software codesign system.

Several works have also proposed FPGA-based detection for UAV applications. The research in [24] proposed a hardware/software codesign using an FPGA for feature detection and tracking in UAV applications.


The authors implemented a Harris feature detector in dedicated hardware to extract object features from aerial video, while tracking of objects based on these features is executed in software. An implementation of real-time object detection for a UAV is described in [13], detecting cars based on their shape, size, and colour. However, both works in [13, 24] performed detection and tracking based on object features and did not focus on moving targets. A moving target detection algorithm suitable for FPGA, targeting a sense-and-avoid system in UAVs, was proposed in [25] using a regional phase correlation technique, but the authors did not prototype the complete system on an FPGA device. In addition, the research in [26] presented the hardware design and architecture of real-time ego-motion estimation for UAV video. Hence, there are only a limited number of works in the literature focusing on the development of a complete prototype performing real-time moving target detection for UAV applications using an FPGA.

3. Moving Target Detection Algorithm

As the UAV is a moving platform, the proposed moving target detection algorithm employs an image registration technique to compensate the ego-motion prior to object segmentation. Image registration algorithms can be classified into feature-based and area-based (intensity-based) methods [27, 28].

In feature-based methods, detected features such as corners [29, 30] or SURF [31] from two subsequent frames are cross-correlated to find the motion of each feature from one frame to another. Feature-based image registration is reported to have faster computation in software implementations, as it uses only a small number of points for feature matching regardless of the number of pixels. However, the number of detected features is unpredictable, as it depends on the captured scene of the frames; the amount of computation and memory resources is therefore also unpredictable, making feature-based registration difficult to implement in highly parallel hardware. The number of features can be reduced to a predictable constant with an additional step of selecting the strongest features based on their score (i.e., feature strength) by sorting or priority queuing [24]. However, this presents some limitations, as only pixels of highly textured areas are selected while homogeneous areas are neglected [32]. Moreover, feature-based methods require irregular memory accesses, which are not suitable for streaming hardware.

On the contrary, area-based techniques construct a point-to-point correspondence between frames by finding the most similar texture of a block (area) from one frame to another. This approach is suitable for parallelism and stream processing, as it offers several benefits for hardware implementation:

(i) It has highly parallel operations, making it suitable for parallel processing in hardware implementation.

(ii) It allows a simple control flow and does not require irregular accesses of image pixels.

(iii) It has a predictable memory requirement with a fixed size of computation data.

The overall flow of the proposed algorithm is illustrated in Figure 1. It consists of two main processes: motion estimation and object segmentation.

[Figure 1: Overall algorithm of moving target detection using image registration technique. The previous and current frames feed block matching and RANSAC (motion estimation, producing affine parameters), followed by affine transformation, frame differencing, median filtering, and morphological processing (object segmentation), yielding the detected moving object region.]

Area-based image registration is utilized in this work. The inputs to the system are two consecutive greyscale video frames: the current frame and the previous frame. First, block matching is performed on these two frames to produce the point-to-point motion between the frames. As aerial imagery in UAV video is assumed to have free camera motion with a planar scene [5], an affine model is employed to estimate the ego-motion. RANSAC is then used to remove inconsistent motions (outliers) among all points, resulting in the ego-motion in terms of an affine transformation matrix.

After the previous frame is aligned with the current frame using the parameters of the affine transformation matrix, frame differencing is performed by pixel-by-pixel subtraction on both aligned frames, followed by thresholding to produce a binary image. Median filtering and morphological processes are applied to the binary image to remove noise, resulting in only the detected moving target.

The proposed algorithm is intended for SoC implementation, consisting of a Nios II embedded software processor running at 100 MHz. However, most processes running on the Nios II are too slow to achieve real-time capability. In order to realize a real-time moving target detection system, all processes in this work are implemented in fully dedicated hardware accelerators, except RANSAC, which is partially accelerated in hardware.


3.1. Block Matching. Block matching involves two steps, extraction and matching, for which two consecutive frames are required. The extraction process stores several blocks or patches of image pixels from one frame as templates, while the matching process finds their most similar blocks in the second frame. By considering the centre points of blocks as reference points, this algorithm yields numerous pairs of corresponding points which indicate the point-to-point motion (movement of the pixels) between two consecutive frames. The paired points from these two frames are used in RANSAC to estimate the ego-motion.

Block extraction is the process of storing numerous blocks of 9 × 9 pixels from predefined locations in a video frame. These blocks are used as templates in the matching process. The positions of the template blocks are distributed evenly over the image. There is no mathematical computation in the extraction process, as it involves only direct copying of image patches from the video stream into temporary memory.

The matching process finds the most similar block in the current frame for every template block extracted from the previous frame. This is done by correlating the template blocks with the next frame to find their corresponding positions based on a similarity measure. Due to the simplicity of its hardware implementation, the Sum of Absolute Differences (SAD) is chosen as the matching criterion for the correlation process. SAD generates a similarity error rating from the pixel-to-pixel correlation between each template block (from the previous frame) and matching block (from the current frame); it yields zero if both blocks are identical pixel-by-pixel.

Block matching is computation intensive, as each template block has to search for its most similar pair by performing SAD with each block within its search region. Several search techniques have been proposed in the literature to reduce the computation by minimizing the search region, such as the Three-Step Search Technique [33, 34], the Four-Step Search Technique [35], and Diamond Search [36]. However, most of these techniques are targeted at general purpose processors, which read the image in an irregular way, and are not suitable for a streaming hardware architecture. This work uses the traditional full search technique [37], as it is efficient in stream-oriented hardware due to its regular accessing of the image.

The number of required matching computations is proportional to the number of blocks (density) and their corresponding search areas. A higher density of block matching provides more points for ego-motion estimation, reducing the image registration error, but requires a higher hardware cost (number of hardware computation units). To reduce hardware cost, this work employs only low-density block (area-based) matching and does not estimate the frame-to-frame motion of every pixel.

To further optimize hardware resources in the stream-oriented architecture, best-fit and nonoverlapping search areas are utilized to ensure that only one SAD computation is performed for each incoming pixel. For a number of row blocks m and a number of column blocks n, the search areas are evenly distributed, with s_m × s_n pixels for each block, as formulated in

$$s_m = \left\lfloor \frac{W}{m} \right\rfloor, \qquad s_n = \left\lfloor \frac{H}{n} \right\rfloor, \tag{1}$$

where W and H represent the image width and image height, respectively.

The template block positions (blue) and their corresponding search areas (green) are illustrated in Figure 2. In each clock cycle, only one template block is matched with one block from its corresponding search area. As each template block searches only in its dedicated search area without intruding into other regions, the whole block matching process shares only one SAD computation unit for processing the whole image, allowing m and n to be context-switched at run-time.

The proposed approach is able to perform different densities of area-based registration at the same hardware cost. However, a higher density reduces the search area of each block, thus limiting the flow displacement (travel distance of each point). The displacement limitations in the horizontal, d_m, and vertical, d_n, directions are given as d_m = ±W/2m and d_n = ±H/2n, respectively. As the position and movement of the UAV (height, velocity, etc.) as well as the frame rate of the captured aerial video affect the point-to-point displacement between two successive frames, the proposed technique will produce wrong image registration results if the point-to-point displacement between frames exceeds d_m horizontally and/or d_n vertically.
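To make the scheme concrete, the following C sketch models the behaviour described above: the frame is partitioned into m × n nonoverlapping search areas per (1), one 9 × 9 template is taken from the previous frame in each area (here at the area centre, an assumed placement for illustration), and full-search SAD matching is performed inside that area in the current frame. This is a software reference model, not the streaming hardware itself.

    #include <stdint.h>
    #include <stdlib.h>

    #define W 640                     /* frame width                  */
    #define H 480                     /* frame height                 */
    #define B 9                       /* template block size (9 x 9)  */

    /* SAD between a 9 x 9 template and a candidate block, both in
       row-major W-wide frames. Zero means a pixel-identical match.  */
    static uint32_t sad9x9(const uint8_t *tpl, const uint8_t *cand)
    {
        uint32_t sad = 0;
        for (int r = 0; r < B; r++)
            for (int c = 0; c < B; c++)
                sad += (uint32_t)abs(tpl[r * W + c] - cand[r * W + c]);
        return sad;
    }

    /* Full-search block matching over m x n nonoverlapping search
       areas of s_m x s_n pixels (eq. (1)). The centre coordinates of
       each matched pair are written to (px, py) -> (qx, qy).        */
    void block_match(const uint8_t *prev, const uint8_t *curr, int m, int n,
                     int px[], int py[], int qx[], int qy[])
    {
        int sm = W / m, sn = H / n;            /* search area size        */
        for (int j = 0; j < n; j++)
            for (int i = 0; i < m; i++) {
                int ax = i * sm, ay = j * sn;  /* search area origin      */
                int tx = ax + (sm - B) / 2;    /* template at area centre */
                int ty = ay + (sn - B) / 2;    /* (assumed placement)     */
                uint32_t best = UINT32_MAX;
                int k = j * m + i;
                for (int y = ay; y <= ay + sn - B; y++)
                    for (int x = ax; x <= ax + sm - B; x++) {
                        uint32_t s = sad9x9(&prev[ty * W + tx],
                                            &curr[y * W + x]);
                        if (s < best) {
                            best = s;
                            qx[k] = x + B / 2;
                            qy[k] = y + B / 2;
                        }
                    }
                px[k] = tx + B / 2;
                py[k] = ty + B / 2;
            }
    }

In the hardware, the same computation is reorganized so that exactly one SAD window is evaluated per incoming pixel, as described in Section 4.2.1.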

3.2. RANSAC. After the block matching stage, a set of point pairs (the point-to-point motion) between two successive frames has been identified. Based on these point pairs, ego-motion estimation can be performed. As outliers (inconsistent motions) usually appear among these point pairs, the RANSAC algorithm is applied to remove them from the data. RANSAC is an iterative algorithm that finds the affine model which best describes the transformation between the two subsequent frames. Unlike conventional RANSAC [38], this work uses an upper bound time to terminate the RANSAC computation (similar to [39]), regardless of the number of iterations, due to the real-time constraint, as illustrated in Algorithm 1.

At each iteration, the RANSAC algorithm randomly chooses three distinct point pairs as samples. A hypothesis model of the affine transformation is then generated from the selected samples based on

$$\begin{bmatrix} x'_1 & x'_2 & x'_3 \\ y'_1 & y'_2 & y'_3 \\ 1 & 1 & 1 \end{bmatrix} = \begin{bmatrix} h_0 & h_1 & h_2 \\ h_3 & h_4 & h_5 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \\ 1 & 1 & 1 \end{bmatrix}, \tag{2}$$

where h_i denote the parameters of the affine model to be estimated, x_i and y_i are the coordinates of the chosen sample points, and x'_i and y'_i represent their corresponding point pairs.
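Solving (2) amounts to two 3 × 3 linear systems that share the same coefficient matrix: one for (h_0, h_1, h_2) and one for (h_3, h_4, h_5). A minimal C sketch using Cramer's rule is given below; the function name and the degenerate-sample handling are illustrative assumptions, as the paper does not detail this software step.

    #include <math.h>
    #include <stdbool.h>

    /* Generate an affine hypothesis (eq. (2)) from three sampled point
       pairs (x[j], y[j]) -> (xp[j], yp[j]) by solving two 3x3 linear
       systems with Cramer's rule. Returns false for a degenerate
       (collinear) sample, which a RANSAC iteration would discard.    */
    bool affine_from_samples(const float x[3], const float y[3],
                             const float xp[3], const float yp[3],
                             float h[6])
    {
        /* determinant of [[x0,y0,1],[x1,y1,1],[x2,y2,1]] */
        float det = x[0] * (y[1] - y[2]) - y[0] * (x[1] - x[2])
                  + (x[1] * y[2] - x[2] * y[1]);
        if (fabsf(det) < 1e-6f) return false;

        for (int r = 0; r < 2; r++) {          /* r=0: h0..h2, r=1: h3..h5 */
            const float *b = (r == 0) ? xp : yp;
            h[3*r+0] = (b[0] * (y[1] - y[2]) - y[0] * (b[1] - b[2])
                      + (b[1] * y[2] - b[2] * y[1])) / det;
            h[3*r+1] = (x[0] * (b[1] - b[2]) - b[0] * (x[1] - x[2])
                      + (x[1] * b[2] - x[2] * b[1])) / det;
            h[3*r+2] = (x[0] * (y[1] * b[2] - y[2] * b[1])
                      - y[0] * (x[1] * b[2] - x[2] * b[1])
                      + b[0] * (x[1] * y[2] - x[2] * y[1])) / det;
        }
        return true;
    }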


[Figure 2: Positions of template blocks (blue) and search areas (green) in the video frame for different densities (m × n) of block matching with the same hardware cost. (a) 6 × 4 blocks; (b) 8 × 6 blocks.]

while time taken < upper bound time do
    (1) Randomly select 3 distinct point pairs as samples
    (2) Generate hypothesis model (affine parameters) based on the chosen samples
    (3) Apply T_dd test on the hypothesis model
    (4) Calculate the fitness score of the model
    (5) Update and store best scored parameters
end while

Algorithm 1: RANSAC algorithm.

The T_dd test proposed in [40] is applied in the algorithm to speed up the RANSAC computation by skipping the subsequent steps (steps (4) and (5)) if the hypothesis model is far from the truth. The fitness of the hypothesis is then evaluated and scored by fitting its parameters to all point pairs. The best hypothesis model is constantly updated in each iteration and emerges as the final result when RANSAC is terminated upon reaching the upper bound time. As RANSAC involves the least computation among the overall moving target detection processes, it is implemented as a software program, with only the fitness scoring step (step (4)) being hardware accelerated. Fitness scoring is the calculation of the fitness of a hypothesis model against all input data (point pairs from block matching), as described in Algorithm 2. Each data point is considered an inlier if its fitting error is smaller than a predefined distance threshold th_dist, and an outlier otherwise. An inlier's fitness score is its fitting error, while an outlier's score is fixed at th_dist as a constant penalty. The total fitness score is calculated by accumulating the individual scores of all data points, such that a perfect fit has a zero fitness score. As fitness scoring is an iterative process over all data, the number of computations increases with the data size. As RANSAC is a stochastic algorithm, it may not produce the best-fit affine model when given limited iterations.

3.3. Object Segmentation. After estimating the ego-motion, the camera movement between two successive frames is compensated prior to foreground object detection.

fitness_score = 0
for all data i do
    asub_x = abs(x_2i − (x_1i · H_0 + y_1i · H_1 + H_2))
    asub_y = abs(y_2i − (x_1i · H_3 + y_1i · H_4 + H_5))
    score = min((asub_x^2 + asub_y^2), th_dist^2)
    fitness_score = fitness_score + score
end for

where each data point i contains a point pair (x_1i, y_1i, x_2i, y_2i); H_0, H_1, H_2, H_3, H_4, H_5 are the affine parameters of the hypothesis model; and th_dist^2 is the squared predefined distance threshold.

Algorithm 2: Fitness scoring in RANSAC algorithm.
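A direct C rendering of Algorithm 2, usable as a reference model for the hardware datapath of Section 4.2.2, is given below. The struct and function names are illustrative; the hardware uses the fixed-point formats of Table 4, and floating point is used here only for readability.

    #include <math.h>

    typedef struct { float x1, y1, x2, y2; } PointPair;

    /* Fitness scoring (Algorithm 2): accumulate the squared fitting
       error of every point pair under the affine hypothesis H[0..5],
       saturating each score at th2_dist (= th_dist^2) so that outliers
       add a constant penalty. A perfect fit scores zero.              */
    float fitness_score(const PointPair *p, int n,
                        const float H[6], float th2_dist)
    {
        float total = 0.0f;
        for (int i = 0; i < n; i++) {
            float ax = fabsf(p[i].x2 - (p[i].x1*H[0] + p[i].y1*H[1] + H[2]));
            float ay = fabsf(p[i].y2 - (p[i].x1*H[3] + p[i].y1*H[4] + H[5]));
            float s  = ax * ax + ay * ay;
            total += (s < th2_dist) ? s : th2_dist;   /* min(err^2, th^2) */
        }
        return total;
    }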

The previous frame is transformed and mosaicked with the current frame using the estimated affine parameters from the RANSAC algorithm. A reverse mapping technique is applied, calculating the corresponding location in the source image based on the destination pixel location. The affine transformation is given in

$$\begin{bmatrix} x'_i \\ y'_i \\ 1 \end{bmatrix} = \begin{bmatrix} h_0 & h_1 & h_2 \\ h_3 & h_4 & h_5 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_i \\ y_i \\ 1 \end{bmatrix}, \tag{3}$$

where x_i and y_i are the pixel coordinates of the destination image, x'_i and y'_i denote the corresponding pixel coordinates in the source image, and h_i are the best-fit affine parameters from RANSAC.

As the transformation may produce fractional results, nearest neighbour interpolation is utilized due to its efficiency in hardware design. The ego-motion compensation is performed pixel-by-pixel in raster scan order, generating a stream of the transformed previous frame for the next process.
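The reverse mapping of (3) with nearest-neighbour interpolation can be modelled in C as follows; the hardware equivalent (the affine PE of Section 4.3) instead emits one frame-buffer address per destination pixel. The zero border policy used here is an assumption.

    #include <stdint.h>
    #include <math.h>

    #define W 640
    #define H 480

    /* Ego-motion compensation by reverse mapping (eq. (3)): for every
       destination pixel (x, y), fetch the nearest source pixel of the
       previous frame at the affine-transformed location. Writing 0 for
       out-of-frame sources is an assumed border policy.               */
    void affine_warp(const uint8_t *src, uint8_t *dst, const float h[6])
    {
        for (int y = 0; y < H; y++)
            for (int x = 0; x < W; x++) {
                int sx = (int)lrintf(h[0]*x + h[1]*y + h[2]);  /* nearest   */
                int sy = (int)lrintf(h[3]*x + h[4]*y + h[5]);  /* neighbour */
                dst[y * W + x] = (sx >= 0 && sx < W && sy >= 0 && sy < H)
                                     ? src[sy * W + sx] : 0;
            }
    }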

Frame differencing is executed on the current frame and the transformed (ego-motion compensated) previous frame by pixel-to-pixel absolute subtraction of both frames.


Table 2: Pipeline scheduling for processing subsequent frames.

Processes | t_0 | t_1 | t_2 | t_3 | t_4 | ... | t_i
Motion estimation:
(i) Block matching | F_0 | F_1←F_0 | F_2←F_1 | F_3←F_2 | F_4←F_3 | ... | F_i←F_{i−1}
(ii) RANSAC | — | — | F_1←F_0 | F_2←F_1 | F_3←F_2 | ... | F_{i−1}←F_{i−2}
Object segmentation:
(i) Affine transformation, (ii) frame differencing, (iii) median filtering, (iv) morphological | — | — | — | F_1←F_0 | F_2←F_1 | ... | F_{i−2}←F_{i−3}

F_i ← F_j denotes detection of moving objects from the jth frame to the ith frame.

The pixels of the resultant image are thresholded with a constant value th_fd to produce a binary image. A lower value of th_fd may induce more false alarms in detection, while a higher value causes missed detections. Both the subtraction and thresholding can be done as soon as two pixels at the same coordinate are obtained from the two frames, yielding one binary pixel for the next process. Lastly, 7 × 7 binary median filtering and dilation are performed on the binary image to remove noise and improve the detected region of the moving target.
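The differencing and thresholding step can be modelled in C as below (the 7 × 7 median filter counterpart is sketched in Section 4.3); the buffer layout and function name are illustrative assumptions.

    #include <stdint.h>
    #include <stdlib.h>

    /* Frame differencing: absolute pixel difference between the current
       frame and the warped previous frame, thresholded with th_fd into
       a binary image (1 = candidate moving-object pixel).              */
    void frame_difference(const uint8_t *curr, const uint8_t *warped_prev,
                          uint8_t *binary, int npixels, int th_fd)
    {
        for (int i = 0; i < npixels; i++)
            binary[i] = (abs(curr[i] - warped_prev[i]) > th_fd) ? 1 : 0;
    }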

3.4. Pipeline Scheduling. In order to establish a real-time moving target detection system for streaming video, proper pipeline scheduling is utilized to fully maximize the overall system throughput. The algorithm is split into several subprocesses, with each hardware accelerator working on different frames independently, transferring the intermediate result from one process to another until the end of the detection cycle. Hence, the system always produces output after a fixed latency. The overall process is divided into four pipeline stages, as shown in Table 2.

Due to the data dependencies of the streaming algorithm, all processes must be done sequentially to produce one detection result. Block matching requires two successive video frames for computation. The first frame is streamed in for the block extraction process and stored into the frame buffer. Block matching is performed once the next frame is obtained, using the extracted blocks of the previous frame. RANSAC can only begin its computation after block matching has finished processing the entire frame. Lastly, two original frames (F_{i−2} and F_{i−3}) are read from the frame buffer for object segmentation to produce the final result. The object segmentation computation is performed in stream without further frame buffering. The overall pipeline processing of the streaming system has a latency of four frames. Hence, at least four frames (F_{i−3} to F_i) must be stored in the frame buffer at all times for a complete moving target detection process.

4. Proposed Moving Target Detection SoC

The moving target detection SoC is developed and prototyped on a Terasic DE2-115 board with an Altera Cyclone IV FPGA device. The system is a hardware/software codesign of the algorithm, where the hardware computation is executed in dedicated accelerators coded in the Verilog Hardware Description Language (HDL), while the software program runs on a soft-core Nios II processor with SDRAM as software memory. The system architecture of the proposed moving target detection SoC is illustrated in Figure 3.

The camera interface handles the image acquisition tasks, providing the raw image for processing, while the VGA interface manages the video display task. Apart from being the software memory, part of the SDRAM is also reserved as the video display buffer. Thus, a Direct Memory Access (DMA) technique is applied to read and write the display frame in SDRAM, ensuring high-throughput image transfer.

As multiple frames are required at the same time to detect moving targets, a frame buffer is needed to temporarily store the frames for processing. Hence, SRAM is utilized as the frame buffer due to its low-latency access. Since most computations are performed in the dedicated hardware, the Nios II handles only the RANSAC process (except the fitness scoring step, as described in Section 3.2) and auxiliary firmware controls. A USB controller is included in the SoC to enable data transfer with a USB mass storage device for verification and debugging purposes. In addition, an embedded operating system (Nios2-linux) is booted on the system to provide file system and driver support.

The real-time video is streamed directly into the moving target detector for processing. Both the Nios II and the hardware accelerator modules compute the result as a hardware/software codesign system and transfer the output frame to SDRAM via DMA. The VGA interface constantly reads and displays the output frame in SDRAM. All operations are performed in real-time, attaining a 30 fps moving target detection system.

4.1. Moving Target Detection Hardware Accelerator. The hardware architecture of the moving target detector is shown in Figure 4. It is composed of a motion estimation core, an object segmentation core, a frame grabber, and other interfaces. The overall moving target detection is performed according to the following sequence:

(1) The frame grabber receives the input video stream and stores the four most recent frames (F_{i−3} to F_i) into the frame buffer through its interface. At the same time, the frame grabber also provides the current frame (F_i) to the motion estimation core.


[Figure 3: System architecture of moving target detection. The FPGA system bus connects the Nios II CPU (running embedded Linux for firmware control and software processing), the moving target detector, and the SDRAM, camera, VGA, and USB controllers and interfaces; SDRAM, SRAM, the camera, the VGA display, and a USB mass storage device are attached as memory and I/O components.]

[Figure 4: Hardware architecture of moving target detector. The frame grabber stores the video stream input into the frame buffer (SRAM) through the frame buffer interface; the motion estimation core exchanges RANSAC computation with the software processor over the slave bus interface and passes affine parameters to the object segmentation core, which reads F_{i−2} and F_{i−3} and writes the video result to SDRAM via the master bus interface (DMA).]


(2) The motion estimation core performs block matching and RANSAC computation. Since RANSAC is computed in both hardware and software, the software processor constantly accesses this core via the system bus interface to calculate the affine parameters.

(3) After RANSAC, the affine parameters are transferred from software to the object segmentation core. Two previous frames (F_{i−2} and F_{i−3}) are read from the frame buffer by the object segmentation core for processing.

(4) Several processes involving affine transformation, frame differencing, median filtering, and dilation are then performed on both frames, resulting in the detected moving target.

(5) Lastly, the bus interface (master) provides DMA access for the object segmentation core to transfer the end result into SDRAM for displaying and verification purposes.

As the frame buffer (SRAM) is a single-port 16-bit memory, the frame grabber concatenates two neighbouring 8-bit greyscale pixels to be stored in one memory location. Since the frame grabber and the object segmentation core share the frame buffer for writing and reading frames, respectively, the frame buffer interface provides priority arbitration, giving the frame grabber the highest priority and granting its every write request. However, as the frame buffer may be busy for a couple of clock cycles due to SRAM read operations by other modules, a small FIFO with a depth of 4 is utilized in the frame grabber to temporarily buffer the incoming image pixels.
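For illustration, the pixel packing performed by the frame grabber might look as follows in C; the byte order is an assumption, as the paper does not specify it.

    #include <stdint.h>

    /* Pack two neighbouring 8-bit greyscale pixels into one 16-bit
       frame-buffer word (even pixel in the low byte, by assumption). */
    static inline uint16_t pack_pixels(uint8_t even_px, uint8_t odd_px)
    {
        return (uint16_t)(((uint16_t)odd_px << 8) | even_px);
    }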

4.2. Motion Estimation Hardware Accelerator. The motion estimation core consists of the block matching and RANSAC hardware accelerators. Since RANSAC requires the entire set of point pairs provided by block matching to begin its computation, additional buffers are needed to temporarily store the corresponding point pairs for every two subsequent frames. The hardware architecture of the motion estimation process is shown in Figure 5.

To enable high-throughput data (point pair) sharing between block matching and RANSAC, a double buffering technique is applied using two buffers (Buffer 1 and Buffer 2) as data storage. At any instant, one buffer is written by block matching while the other is used for computation by RANSAC. The buffer controller swaps the roles of these two buffers for each incoming frame, thereby allowing both processes to be pipelined by reading and writing each buffer alternately. Buffer swapping is initiated at each completion of block matching, while RANSAC is performed during the time gap between each swap and is terminated before the next swap.

4.2.1. Block Matching Hardware Accelerator. Figure 7 shows the architecture of the proposed block matching hardware accelerator, which performs template block extraction from one frame and matching of these template blocks in their corresponding search areas in the next frame. The overall process is completed in stream to yield the point-to-point motion (point pairs) of two subsequent frames without buffering an entire frame.

[Figure 5: Hardware architecture of motion estimation core. A buffer controller alternates Buffer 1 and Buffer 2 between block matching (fed by the video stream input) and the RANSAC accelerator, which communicates with the software processor.]

As a 9 × 9 block size is utilized in block matching, a 9-tap line buffer is designed such that a 9 × 9 moving window of pixels can be obtained in every clock cycle. These 9 × 9 pixels are shared by both the block extraction and matching processes and are read one by one in pipeline from the line buffer at each valid cycle, resulting in a total of 81 cycles to obtain a complete window.

The block extractor keeps track of the coordinate of the current pixel in the video stream as a reference for the extraction process. Template blocks from incoming frames are extracted and stored temporarily in block memory. As each block is extracted line-by-line in raster scan, the block memory is divided into nine row memories, as illustrated in Figure 6(a), each of which stores one pixel row of the template blocks. When the video stream reaches the block position, each pixel row is loaded into a row memory from the corresponding tap of the line buffer. Block coordinates are also stored in a separate FIFO to keep track of their positions.

Since only one SAD processor is used for matching m × n blocks, as mentioned in Section 3.1, the template block has to be swapped according to the corresponding search area during raster scan. Hence, each row memory is constructed from two FIFOs, an upper and a lower FIFO, as illustrated in Figure 6(b), to enable block swapping during the matching process. Template blocks are stored into the upper FIFO during the extraction process. During the matching process, each line of the raster scan enters eight different search areas to match eight different template blocks, respectively. Hence, one row of template blocks is cached in the lower FIFO and is repeatedly used until the end of their search areas (reaching the next row of search areas). Upon reaching each new row of search areas, the template blocks in the lower FIFO are replaced with a new row of template blocks from the upper FIFO. At the last line of the raster scan, the lower FIFO is flushed to prevent overflow.


[Figure 6: Block memory architecture for storing template blocks. (a) Block memory consisting of nine row memories, each fed by one tap of the line buffer and storing one pixel row of the template blocks, with the block coordinates kept in a separate coordinate FIFO and a Control Vector (CV) passed from one row memory to the next. (b) Each row memory contains an upper FIFO and a lower FIFO selected via control registers and multiplexers (wr_upper, rd_upper, wr_lower, rd_lower, sel1, sel2).]

[Figure 7: Stream-oriented hardware architecture of block matching. The video stream input feeds a line buffer; the block extractor issues the control vector, the block memory holds template blocks and block coordinates, and the SAD processor correlates the 9 × 9 window pixels with the template pixels, passing matching scores to the best score tracker, which outputs the point pairs.]

In order to efficiently extract and match all blocks, a Control Vector (CV), as illustrated in Table 3, is issued to perform different reading and writing operations in block memory based on the current position of the raster scan. Reads and writes are independent of each other and can be executed at the same time. Pixels are processed one by one over 81 cycles to complete a window.

Table 3: Control Vector (CV) for different read and write operations of block memory.

Position of raster scan | Write upper | Read upper | Write lower | Read lower | sel1 | sel2
Entering template block position | 1 | x | x | x | x | x
Entering first search area row | x | 1 | 1 | 0 | 1 | 1
Entering next search area row | x | 1 | 1 | 1 | 1 | 1
Reentering same search area row | x | 0 | 1 | 1 | 0 | 0
Leaving last search area row | x | 0 | 0 | 1 | 0 | 0

Both writing and reading processes require 9 cycles for each row memory, passing the CV from the first row memory to the next until the end, completing an 81-pixel write or read operation of a template block.

The SAD processor performs the correlation of the template blocks from the previous frame with all candidate blocks from the current frame according to the search area. Extracted block pixels are read from block memory, while window pixels in the search areas are provided from the taps of the line buffer. The total number of required processing elements (PEs) is the total number of pixels in a window. The process is pipelined such that each pixel is computed in a PE as soon as it is obtained from the line buffer. The matching score of each window is obtained every cycle after a fixed latency.

Lastly, the best score tracker constantly stores and updates the best matching score for each template block within its corresponding search area.


[Figure 8: Hardware datapath of fitness scoring in the RANSAC accelerator. Multiplier and subtractor stages compute x_2 − (x_1·H_0 + y_1·H_1 + H_2) and y_2 − (x_1·H_3 + y_1·H_4 + H_5), followed by absolute value, squaring, a min with th_dist^2, and an accumulator, separated by three stages of pipeline registers.]

Matching scores are compared within the same search area, and the coordinates of the best-scored blocks are preserved. At the end of each search area, the coordinates of the best pairs (template blocks and their best-scored blocks) are sent to the RANSAC module for the next processing stage. Hence, the proposed block matching hardware is able to produce the point-to-point motion (point pairs) of every two successive frames in streaming video at line rate.

4.2.2. RANSAC Hardware Accelerator. The RANSAC hardware design in [39] is utilized in this work, accelerating only the fitness scoring step. As described in Algorithm 2, fitness scoring is an iterative process which performs similar computations on all data samples based on the hypothesis model. Hence, this data-intensive process is executed in a pipelined datapath, as illustrated in Figure 8. A control unit reads the input data provided by block matching from the buffer and streams these inputs to the datapath unit at every clock cycle.

The datapath unit utilizes three stages of pipelining with the aim of isolating the multiplication processes, thus allowing a faster clock rate. The first-stage pipeline registers are located right after the first multiplication, while the other two stages of pipeline registers enclose the squaring processes. The individual scores are accumulated in the last stage, producing the final total fitness score. The accumulator is reset on each new hypothesis.

Table 4: Fixed-point precision of fitness scoring inputs.

Parameter | Integer bits | Fraction bits | Number range
x1, y1, x2, y2 | 11 | 0 | [−1024, 1024)
H0, H1, H3, H4 | 4 | 12 | [−8, 8)
H2, H5 | 11 | 5 | [−1024, 1024)

[Figure 9: Hardware architecture for object segmentation. The affine PE receives the affine parameters from software and generates frame-buffer addresses; the frame reader fetches F_{i−2} and F_{i−3} from the frame buffer, producing synchronized streams F_{i−2} and F′_{i−3} for the frame differencing PE, whose binary image stream passes through line buffers, the median PE, and the dilation PE, yielding the detected moving target.]

Thus, the total number of cycles required for the fitness score computation is the total number of data points plus the four-cycle latency.

Although fitness scoring could require floating point computations, the datapath unit uses a suitable fixed-point precision for each stage. Since the Nios II is a 32-bit processor, the affine parameters of the hypothesis model (H_0 to H_5) are scaled to different 16-bit fixed-point precisions, as described in Table 4, so that two affine parameters can be assigned in a single 32-bit write instruction. As this system targets 640 × 480 pixel video, all input coordinates (x_1, y_1, x_2, and y_2) are scaled to 11 bits.
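A software-side sketch of this parameter transfer is given below: each parameter is converted to its Q-format from Table 4 and two 16-bit values are packed per 32-bit write. The helper names and register layout are illustrative assumptions; the actual Nios II driver is not given in the paper.

    #include <stdint.h>
    #include <math.h>

    /* Convert a float to signed 16-bit fixed point with 'frac' fraction
       bits (per Table 4: H0,H1,H3,H4 use Q4.12; H2,H5 use Q11.5).       */
    static int16_t to_fixed16(float v, int frac)
    {
        return (int16_t)lrintf(v * (float)(1 << frac));
    }

    /* Pack two fixed-point affine parameters into one 32-bit word so a
       single 32-bit store delivers both to the accelerator.            */
    static uint32_t pack_params(int16_t lo, int16_t hi)
    {
        return ((uint32_t)(uint16_t)hi << 16) | (uint16_t)lo;
    }

    /* Example: write H[0..5] as three 32-bit words (hypothetical layout) */
    void write_hypothesis(volatile uint32_t *reg, const float H[6])
    {
        reg[0] = pack_params(to_fixed16(H[0], 12), to_fixed16(H[1], 12));
        reg[1] = pack_params(to_fixed16(H[3], 12), to_fixed16(H[4], 12));
        reg[2] = pack_params(to_fixed16(H[2], 5),  to_fixed16(H[5], 5));
    }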

4.3. Object Segmentation Hardware Architecture. As object segmentation can be performed in one raster scan, a stream-oriented architecture is proposed, as illustrated in Figure 9. All subprocesses are executed in pipeline on the streaming video without additional frame buffering. The object segmentation process is initiated by the software processor after it provides the affine parameters from RANSAC to the affine PE. Two frames (F_{i−2} and F_{i−3}, as described in Table 2) from the frame buffer (SRAM) are required to segment the moving target.

Based on the affine parameters from RANSAC, the affine PE uses the reverse mapping technique to find each pixel location in the previous frame (F_{i−3}) using (3) and generates their addresses in the frame buffer (SRAM). The frame reader fetches the previous frame (F_{i−3}) pixel-by-pixel according to the generated addresses from the frame buffer, thus constructing a stream of the transformed frame, denoted as F′_{i−3}.


[Figure 10: Hardware architecture of the median PE. Adder trees sum the leftmost and rightmost column pixels of the 7 × 7 window from the binary image stream; the running pixel count is updated by an add/subtract and compared against 24 to produce the median output stream.]

By synchronizing the streams of both frames, frame differencing can be executed in pipeline as soon as one pixel from each frame is obtained. Hence, one pixel of the current frame (F_{i−2}) and one pixel of the transformed frame (F′_{i−3}) are fetched alternately from their corresponding memory locations by the frame reader, constructing two synchronized streams of the F_{i−2} and F′_{i−3} frames. The frame differencing PE performs pixel-to-pixel absolute subtraction and thresholding on the streams, computing one pixel per cycle. A configurable threshold value th_fd is applied after the subtraction, yielding a binary image stream without buffering the whole frame.

After frame differencing, the binary image is streamed into the 7 × 7 median filter. Seven lines of the image are buffered in the line buffer, providing a 7 × 7 pixel window for the median PE to perform the median computation. The median computation is performed in one clock cycle per processing window due to the short propagation delay, as only binary pixels are involved. Figure 10 shows the hardware logic design of the median PE.

Median filtering is computed by counting the number of asserted (binary 1) pixels in the window: if more than half the pixels in the window (24 pixels) are asserted, the resultant pixel is "1", and "0" otherwise. Since the processing window moves only one pixel to the right for each computation during raster scan, the current pixel count is computed by adding the rightmost column pixels of the current window to the previous pixel count while subtracting the leftmost column pixels of the previous window. The final binary output pixel is produced by thresholding the current pixel count with 24 (half of the window size).
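The sliding-count computation can be modelled in C as follows. This is a behavioural sketch of the median PE over a whole frame in row-major order, not the line-buffered hardware; zeroing pixels that lack a full window is an assumed border policy.

    #include <stdint.h>
    #include <string.h>

    #define W 640
    #define H 480

    /* 7x7 binary median by sliding count: when the window moves right,
       add the incoming rightmost column and subtract the outgoing
       leftmost column instead of recounting all 49 pixels.            */
    void median7x7(const uint8_t *in, uint8_t *out)
    {
        memset(out, 0, (size_t)W * H);        /* border pixels default to 0 */
        for (int y = 3; y < H - 3; y++) {
            int count = 0;
            for (int r = -3; r <= 3; r++)     /* full count, first window  */
                for (int c = 0; c < 7; c++)
                    count += in[(y + r) * W + c];
            out[y * W + 3] = (count > 24);
            for (int x = 4; x < W - 3; x++) {
                for (int r = -3; r <= 3; r++) {
                    count += in[(y + r) * W + (x + 3)];  /* new right column */
                    count -= in[(y + r) * W + (x - 4)];  /* old left column  */
                }
                out[y * W + x] = (count > 24);
            }
        }
    }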

As dilation is also a 7 × 7 window-based process, it uses a similar line buffering technique as median filtering. However, only a simple logical OR operation is performed over all window pixels. Due to its simplicity, the dilation PE also computes in one clock cycle, resulting in a stream of binary image with the detected regions of moving targets.

5. Experimental Results

5.1. Verification of Proposed SoC. The proposed moving target detection SoC is verified in offline detection mode using the database in [41]. Test videos are 640 × 480 pixels in size and are greyscaled prior to the verification process. The test videos are transferred to the system for computation via a USB mass storage device. After performing the detection in the SoC, the image results are displayed on VGA and also stored on the USB drive for verification. Figure 11 shows the moving target detection results from the proposed SoC on different sample videos. The detected regions (red) are overlaid on the input frame. In most cases, the proposed SoC is able to detect the moving target in consecutive frames.

However, there are several limitations in this work. Block matching may not give a good motion estimation result if the extracted blocks lack texture (the pixel intensities are similar). Moreover, the detected region of a moving target may appear with cavities or split into multiple smaller regions, as only simple frame differencing is applied in the proposed system. Additional postprocessing to produce better detected blobs by merging split regions is out of the scope of this work.

As the stochastic RANSAC algorithm is terminated after a constant time step for each frame, image registration errors may occur, producing incorrect ego-motion estimation. This could be mitigated by accelerating the RANSAC algorithm to allow more iterations using dedicated hardware or a high-performance general purpose processor.

5.2. Performance Evaluation of Detection Algorithm. The performance evaluation of the implemented detection algorithm uses the mathematical performance metrics in [42], which involve the following parameters:

(i) True positive, TP: a detected moving object.

(ii) False positive, FP: detected regions that do not correspond to any moving object.

(iii) False negative, FN: a nondetected moving object.

(iv) Detection rate, DR: the ratio of TP to the combination of TP and FN, as formulated in

$$\mathrm{DR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}. \tag{4}$$

(v) False alarm rate, FAR: the ratio of FP to all positive detections, as defined in

$$\mathrm{FAR} = \frac{\mathrm{FP}}{\mathrm{TP} + \mathrm{FP}}. \tag{5}$$


[Figure 11: Detected regions from the proposed moving target detection SoC on different sample videos in [41]. Videos (a)–(d): V3V100003 004, frames 255, 275, 295, and 315; videos (e)–(h): V3V100004 003, frames 1000, 1020, 1040, and 1060; videos (i)–(l): V4V100007 017, frames 600, 620, 640, and 660.]

[Figure 12: Evaluation of performance metrics TP, FP, and FN based on ground truth boxes (blue) and the detected regions (red).]

To obtain the performance metrics, ground truth regions are manually labelled in several frames of the test videos. A bounding box is drawn across each moving object to indicate the ground truth region in every frame, as depicted in Figure 12. Simple postprocessing is performed on the detected regions by filtering out regions smaller than 15 pixels in width or 15 pixels in height prior to the evaluation. A detected moving object (TP) has detected regions within its bounding ground truth area, while a nondetected moving object (FN) has no detected region overlapping its ground truth area. A detected region that does not overlap any ground truth region is considered a false positive (FP).

The detection performance is evaluated over different parameter configurations.

The DR and FAR for 1000 test frames using different numbers of blocks (density of ego-motion estimation, m × n) in area-based registration and different frame differencing thresholds th_fd are depicted in Table 5 and Figure 13.

The experimental results show that DR is almost the same for different densities of ego-motion estimation but decreases with th_fd. Although a higher density in the proposed work has lower displacement limitations d_m and d_n, as discussed in Section 3.1, most of the point-to-point displacements do not exceed the limitation, due to slow UAV movement in most frames of the test dataset. On the contrary, a higher value of th_fd may filter out the moving object if the intensities of the object pixels and the background pixels are almost similar.

FAR decreases with the density of ego-motion estimation due to the higher quality of the image registration process, but increases if most frames exceed the displacement limitations d_m and d_n. However, false registration due to the displacement limitation results in a huge blob of foreground and does not greatly increase FAR. Although higher values of th_fd decrease the false detection rate, they also produce smaller foreground areas for all detected moving objects, as pixels with intensity similar to the background are thresholded out.

5.3. Speed Comparison with Full Software Implementation. The computation speed of the proposed moving target detection SoC is compared with software computation on different platforms, including a modern CPU (Intel Core i5) in a desktop computer and an embedded processor (ARM). Table 6 illustrates the comparison of computation frame rate and hardware speed-up between the proposed system and the software implementations using the test videos in [41].


[Figure 13: Detection rate DR (a) and false alarm rate FAR (b) versus density of ego-motion estimation m × n, for frame differencing thresholds th_fd = 15, 20, and 25.]

Table 5: Performance evaluation in terms of DR and FAR for 1000 frames using different densities of ego-motion estimation m × n and frame differencing thresholds th_fd.

m × n | th_fd | DR | FAR
12 | 15 | 0.958 | 0.643
12 | 20 | 0.954 | 0.331
12 | 25 | 0.949 | 0.194
24 | 15 | 0.957 | 0.568
24 | 20 | 0.950 | 0.324
24 | 25 | 0.945 | 0.101
35 | 15 | 0.958 | 0.548
35 | 20 | 0.952 | 0.215
35 | 25 | 0.947 | 0.090
48 | 15 | 0.959 | 0.539
48 | 20 | 0.952 | 0.253
48 | 25 | 0.944 | 0.079
70 | 15 | 0.958 | 0.509
70 | 20 | 0.951 | 0.188
70 | 25 | 0.946 | 0.075
88 | 15 | 0.960 | 0.489
88 | 20 | 0.951 | 0.219
88 | 25 | 0.947 | 0.074
108 | 15 | 0.958 | 0.483
108 | 20 | 0.952 | 0.168
108 | 25 | 0.946 | 0.058
140 | 15 | 0.958 | 0.499
140 | 20 | 0.951 | 0.187
140 | 25 | 0.946 | 0.059
165 | 15 | 0.958 | 0.474
165 | 20 | 0.953 | 0.214
165 | 25 | 0.947 | 0.068
192 | 15 | 0.959 | 0.478
192 | 20 | 0.952 | 0.169
192 | 25 | 0.946 | 0.092

Table 6: Computation speed comparison of the proposed system with different software implementations using area-based and feature-based registrations.

Platform | Frequency | Registration technique | Frame rate (fps) | Hardware speed-up
Proposed SoC | 100 MHz | Area-based | 30 | 1
Intel Core i5-4210U | 1.70 GHz | Area-based | 4.26 | 7.04
Intel Core i5-4210U | 1.70 GHz | Feature-based | 13.11 | 2.29
ARM1176JZF | 700 MHz | Area-based | 0.20 | 150
ARM1176JZF | 700 MHz | Feature-based | 0.56 | 53.57

As feature-based image registration has faster computation in software implementations compared to area-based registration, the speed performance of the feature-based method is also included for comparison. In the feature-based implementation, features are first detected in each frame. The detected features from the current frame are cross-correlated with features from the previous frame, while the RANSAC algorithm is used to estimate the ego-motion between frames. After compensating the ego-motion, the segmentation of moving objects uses the same processes as the proposed system. To further optimize the software implementation in terms of speed, a fast feature detection algorithm [30] is utilized. As the number of features affects the computation time of the feature matching step, only the 100 strongest features in each frame are selected for processing. However, the performance evaluation does not consider multithreaded software execution.


Table 7: Resource usage of the proposed moving target detection SoC.

Logic units                     Used      Utilization (%)
Total combinational functions   15,161    13
Total registers                 10,803    9
Total memory bits               521,054   13
Embedded multipliers            27        5
FPGA device: Altera Cyclone IV

Based on the experimental results, the speed performance of the proposed moving target detection SoC surpasses the optimized software computation by 2.29 times and 53.57 times compared with the implementations on the modern CPU and the embedded CPU, respectively. The software computation (RANSAC) in the HW/SW codesign of the proposed system creates a speed bottleneck, thus limiting the maximum throughput to 30 fps. The processing frame rate of the proposed system could be further improved by using fully dedicated hardware.
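A rough throughput bound supports this observation. Assuming the stream-oriented accelerators consume one pixel per clock cycle at 100 MHz, and ignoring line-buffer latency and inter-frame overhead (an idealization), the hardware alone could sustain

\[
f_{\mathrm{hw}} \approx \frac{100 \times 10^{6}\ \text{cycles/s}}{640 \times 480\ \text{pixels/frame}} \approx 325\ \text{frames/s},
\]

an order of magnitude above the delivered 30 fps, which is consistent with the software-side RANSAC, rather than the accelerators, setting the frame rate ceiling.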

5.4. Resource Utilization. The overall hardware resource utilization of the complete system is shown in Table 7. This prototype of the real-time moving object detection system utilizes less than 20 percent of the total resources in the Altera Cyclone IV FPGA device. As the proposed system uses off-chip memory components for frame buffering, FPGA on-chip memory is utilized only for line buffering in the streaming process (e.g., block matching and median filtering) and for storing intermediate results (e.g., point pairs after block matching). Thus, the low resource usage of the proposed system provides abundant hardware space for other processes, such as target tracking or classification, to be developed in the future.

6. Conclusions

Moving target detection is a crucial step in most computer vision problems, especially for UAV applications. On-chip detection without the need for real-time video transmission to the ground provides immense benefit to diverse applications such as military surveillance and resource exploration. In order to perform this complex embedded video processing on-chip, an FPGA-based system is desirable due to the potential parallelism of the algorithm.

This paper proposed a moving target detection system using FPGA to enable an autonomous UAV which is able to perform the computer vision algorithm on the flying platform. The proposed system is prototyped using an Altera Cyclone IV FPGA device on a Terasic DE2-115 development board mounted with a TRDB-D5M camera. The system is developed as a HW/SW codesign using dedicated hardware with a Nios II software processor (booted with embedded Linux) running at a 100 MHz clock rate. As stream-oriented hardware with pipeline processing is utilized, the proposed system achieves real-time capability with a processing speed of 30 frames per second on 640 × 480 live video. Experimental results show that the proposed SoC performs 2.29 times and 53.57 times faster than optimized software computation on a modern desktop computer (Intel Core i5) and an embedded processor (ARM), respectively. In addition, the proposed moving target detection uses less than 20 percent of the total resources in the FPGA device, allowing other hardware accelerators to be implemented in the future.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

The authors would like to express their gratitude to Universiti Teknologi Malaysia (UTM) and the Ministry of Science, Technology and Innovation (MOSTI), Malaysia, for supporting this research work under research Grants 01-01-06-SF1197 and 01-01-06-SF1229.

References

[1] A. Ahmed, M. Nagai, C. Tianen, and R. Shibasaki, "UAV based monitoring system and object detection technique development for a disaster area," International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. 37, pp. 373–377, 2008.
[2] B. Coifman, M. McCord, R. Mishalani, M. Iswalt, and Y. Ji, "Roadway traffic monitoring from an unmanned aerial vehicle," IEE Proceedings - Intelligent Transport Systems, vol. 153, no. 1, pp. 11–20, 2006.
[3] K. Kanistras, G. Martins, M. J. Rutherford, and K. P. Valavanis, "Survey of unmanned aerial vehicles (UAVs) for traffic monitoring," in Handbook of Unmanned Aerial Vehicles, pp. 2643–2666, Springer, 2015.
[4] K. Nordberg, P. Doherty, G. Farneback et al., "Vision for a UAV helicopter," in Proceedings of the International Conference on Intelligent Robots and Systems (IROS '02), Workshop on Aerial Robotics, pp. 29–34, Lausanne, Switzerland, October 2002.
[5] D. Zamalieva and A. Yilmaz, "Background subtraction for the moving camera: a geometric approach," Computer Vision and Image Understanding, vol. 127, pp. 73–85, 2014.
[6] M. Genovese and E. Napoli, "ASIC and FPGA implementation of the Gaussian mixture model algorithm for real-time segmentation of high definition video," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 22, no. 3, pp. 537–547, 2014.
[7] F. Kristensen, H. Hedberg, H. Jiang, P. Nilsson, and V. Owall, "An embedded real-time surveillance system: implementation and evaluation," Journal of Signal Processing Systems, vol. 52, no. 1, pp. 75–94, 2008.
[8] H. Jiang, H. Ardo, and V. Owall, "A hardware architecture for real-time video segmentation utilizing memory reduction techniques," IEEE Transactions on Circuits and Systems for Video Technology, vol. 19, no. 2, pp. 226–236, 2009.
[9] M. Genovese and E. Napoli, "FPGA-based architecture for real time segmentation and denoising of HD video," Journal of Real-Time Image Processing, vol. 8, no. 4, pp. 389–401, 2013.
[10] A. Lopez-Bravo, J. Diaz-Carmona, A. Ramirez-Agundis, A. Padilla-Medina, and J. Prado-Olivarez, "FPGA-based video system for real time moving object detection," in Proceedings of the 23rd International Conference on Electronics, Communications and Computing (CONIELECOMP '13), pp. 92–97, IEEE, Cholula, Mexico, March 2013.
[11] T. Kryjak, M. Komorkiewicz, and M. Gorgon, "Real-time moving object detection for video surveillance system in FPGA," in Proceedings of the Conference on Design and Architectures for Signal and Image Processing (DASIP '11), pp. 1–8, IEEE, Tampere, Finland, November 2011.
[12] A. Mittal and D. Huttenlocher, "Scene modeling for wide area surveillance and image synthesis," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 160–167, IEEE, June 2000.
[13] A. Price, J. Pyke, D. Ashiri, and T. Cornall, "Real time object detection for an unmanned aerial vehicle using an FPGA based vision system," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '06), pp. 2854–2859, IEEE, Orlando, Fla, USA, May 2006.
[14] G. J. Garcia, C. A. Jara, J. Pomares, A. Alabdo, L. M. Poggi, and F. Torres, "A survey on FPGA-based sensor systems: towards intelligent and reconfigurable low-power sensors for computer vision, control and signal processing," Sensors, vol. 14, no. 4, pp. 6247–6278, 2014.
[15] S. Ali and M. Shah, "Cocoa: tracking in aerial imagery," in Airborne Intelligence, Surveillance, Reconnaissance (ISR) Systems and Applications III, vol. 6209 of Proceedings of SPIE, Orlando, Fla, USA, April 2006.
[16] J. Xiao, C. Yang, F. Han, and H. Cheng, "Vehicle and person tracking in aerial videos," in Multimodal Technologies for Perception of Humans, pp. 203–214, Springer, 2008.
[17] W. Yu, X. Yu, P. Zhang, and J. Zhou, "A new framework of moving target detection and tracking for UAV video application," in Proceedings of the International Archives of the Photogrammetry, Remote Sensing and Spatial Information Science, vol. 37, Beijing, China, 2008.
[18] V. Reilly, H. Idrees, and M. Shah, "Detection and tracking of large number of targets in wide area surveillance," in Computer Vision – ECCV 2010: 11th European Conference on Computer Vision, Heraklion, Crete, Greece, September 5–11, 2010, Proceedings, Part III, vol. 6313 of Lecture Notes in Computer Science, pp. 186–199, Springer, Berlin, Germany, 2010.
[19] J. Wang, Y. Zhang, J. Lu, and W. Xu, "A framework for moving target detection, recognition and tracking in UAV videos," in Affective Computing and Intelligent Interaction, vol. 137 of Advances in Intelligent and Soft Computing, pp. 69–76, Springer, Berlin, Germany, 2012.
[20] S. A. Cheraghi and U. U. Sheikh, "Moving object detection using image registration for a moving camera platform," in Proceedings of the IEEE International Conference on Control System, Computing and Engineering (ICCSCE '12), pp. 355–359, IEEE, Penang, Malaysia, November 2012.
[21] Y. Zhang, X. Tong, T. Yang, and W. Ma, "Multi-model estimation based moving object detection for aerial video," Sensors, vol. 15, no. 4, pp. 8214–8231, 2015.
[22] Q. Yu and G. Medioni, "A GPU-based implementation of motion detection from a moving platform," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPR '08), pp. 1–6, Anchorage, Alaska, USA, June 2008.
[23] A. Laika, J. Paul, C. Claus, W. Stechele, A. E. S. Auf, and E. Maehle, "FPGA-based real-time moving object detection for walking robots," in Proceedings of the 8th IEEE International Workshop on Safety, Security and Rescue Robotics (SSRR '10), pp. 1–8, IEEE, Bremen, Germany, July 2010.
[24] B. Tippetts, S. Fowers, K. Lillywhite, D.-J. Lee, and J. Archibald, "FPGA implementation of a feature detection and tracking algorithm for real-time applications," in Advances in Visual Computing, pp. 682–691, Springer, 2007.
[25] K. May and N. Krouglicof, "Moving target detection for sense and avoid using regional phase correlation," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '13), pp. 4767–4772, IEEE, Karlsruhe, Germany, May 2013.
[26] M. E. Angelopoulou and C.-S. Bouganis, "Vision-based ego-motion estimation on FPGA for unmanned aerial vehicle navigation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 24, no. 6, pp. 1070–1083, 2014.
[27] I. M. El-Emary and M. M. A. El-Kareem, "On the application of genetic algorithms in finger prints registration," World Applied Sciences Journal, vol. 5, no. 3, pp. 276–281, 2008.
[28] A. A. Goshtasby, 2-D and 3-D Image Registration for Medical, Remote Sensing, and Industrial Applications, John Wiley & Sons, New York, NY, USA, 2005.
[29] C. Harris and M. Stephens, "A combined corner and edge detector," in Proceedings of the 4th Alvey Vision Conference, vol. 15, pp. 147–151, 1988.
[30] E. Rosten and T. Drummond, "Machine learning for high-speed corner detection," in Computer Vision – ECCV 2006, pp. 430–443, Springer, 2006.
[31] H. Bay, T. Tuytelaars, and L. Van Gool, "SURF: speeded up robust features," in Computer Vision – ECCV 2006, A. Leonardis, H. Bischof, and A. Pinz, Eds., vol. 3951 of Lecture Notes in Computer Science, pp. 404–417, Springer, 2006.
[32] G. R. Rodriguez-Canosa, S. Thomas, J. del Cerro, A. Barrientos, and B. MacDonald, "A real-time method to detect and track moving objects (DATMO) from unmanned aerial vehicles (UAVs) using a single camera," Remote Sensing, vol. 4, no. 4, pp. 1090–1111, 2012.
[33] B. Liu and A. Zaccarin, "New fast algorithms for the estimation of block motion vectors," IEEE Transactions on Circuits and Systems for Video Technology, vol. 3, no. 2, pp. 148–157, 1993.
[34] R. Li, B. Zeng, and M. L. Liou, "New three-step search algorithm for block motion estimation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 4, no. 4, pp. 438–442, 1994.
[35] L.-M. Po and W.-C. Ma, "A novel four-step search algorithm for fast block motion estimation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 6, no. 3, pp. 313–317, 1996.
[36] S. Zhu and K.-K. Ma, "A new diamond search algorithm for fast block-matching motion estimation," IEEE Transactions on Image Processing, vol. 9, no. 2, pp. 287–290, 2000.
[37] L. De Vos and M. Stegherr, "Parameterizable VLSI architectures for the full-search block-matching algorithm," IEEE Transactions on Circuits and Systems, vol. 36, no. 10, pp. 1309–1316, 1989.
[38] M. A. Fischler and R. C. Bolles, "Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography," Communications of the ACM, vol. 24, no. 6, pp. 381–395, 1981.
[39] J. W. Tang, N. Shaikh-Husin, and U. U. Sheikh, "FPGA implementation of RANSAC algorithm for real-time image geometry estimation," in Proceedings of the 11th IEEE Student Conference on Research and Development (SCOReD '13), pp. 290–294, IEEE, Putrajaya, Malaysia, December 2013.
[40] O. Chum and J. Matas, "Randomized RANSAC with T_{d,d} test," in Proceedings of the British Machine Vision Conference, vol. 2, pp. 448–457, September 2002.
[41] DARPA, SDMS Public Web Site, 2003, https://www.sdms.afrl.af.mil.
[42] A. F. M. S. Saif, A. S. Prabuwono, and Z. R. Mahayuddin, "Motion analysis for moving object detection from UAV aerial images: a review," in Proceedings of the International Conference on Informatics, Electronics and Vision (ICIEV '14), pp. 1–6, IEEE, Dhaka, Bangladesh, May 2014.


2 International Journal of Reconfigurable Computing

full autonomous UAV is desirable as it can operate and reacttowards detected target with minimal human intervention[13] Thus an autonomous UAV demands a system with highmobility and high computing capability to perform detectionon the platform itself The use of Field Programmable GateArray (FPGA) will satisfy the low power consumption highcomputing power and small circuitry requirements of a UAVsystem FPGA-based system is a good solution in real-timecomputer vision problem for mobile platform [14] and canbe reconfigured to handle different tasks according to desiredapplicationsThis paper presents a FPGA implementation of real-

time moving target detection system for UAV applicationsThe detection algorithm utilizes image registration techniquewhich first estimates the ego-motion from two subsequentframes using block matching (area-based matching) andRandom Sample Consensus (RANSAC) algorithm Aftercompensating the ego-motion frame differencing medianfiltering and morphological process are utilized to segmentthe moving object The contributions of this paper are asfollows

(i) Development of real-time moving target detectionin a System-on-Chip (SoC) attaining 30 frames persecond (fps) processing rate for 640 times 480 pixelsrsquovideo

(ii) Prototyping of the proposed system in a low-costFPGA board (Terasic DE2-115) mounted with a 5megapixelsrsquo camera (TRDB-D5M) occupying only13 of total combinational function and 13 of totalmemory bits

(iii) Partitioning and pipeline scheduling of the detectionalgorithm in a hardwaresoftware (HWSW) code-sign for maximum processing throughput

(iv) Stream-oriented hardware accelerators includingblock matching and object segmentation modulewhich are able to operate in one cycle per pixel

(v) Analyzing detection performance with different den-sity of area-based ego-motion estimation and framedifferencing threshold

The rest of the paper is as follows Section 2 discusses theliteratures in moving target detection Section 3 discusses themoving target detection algorithm while Section 4 describesthe SoC development and the specialized hardware architec-ture of moving target detection Section 5 presents the detec-tion result from the complete prototype Section 6 concludesthis paper

2 Related Work

Moving target detection targeting for aerial videos or UAVapplications has been widely researched in the past fewdecades A framework consisting of ego-motion compensa-tion motion detection and object tracking was developed in[15] The authors used combination of feature and gradient-based techniques to compensate ego-motion while utilizingaccumulative frame differencing and background subtraction

Table 1 Comparison between related works on FPGA-based objectdetection system for different applications with proposed system

Relatedwork

Cameraplatform Detection technique and application

[6ndash11] Staticcamera

(i) Background subtraction(ii) Using GMM ViBE and so forth

[23] Movingrobot

(i) Detecting moving object using(ii) Optical flow and frame differencing

[24] UAV (i) Detecting and tracking object feature

[13] UAV (i) Car detection(ii) Based on shape size and colour

[25] UAV(i) Detecting moving object(ii) Using regional phase correlation(iii) Does not prototype the complete system

[26] UAV (i) Real-time ego-motion estimation

Proposed UAV(i) Moving target detection(ii) Using area-based image registration(iii) Prototyping the complete system

to detect moving vehicle in aerial videosThe research in [16]presented two different approaches to detect and track mov-ing vehicle and person using a Hierarchy of Gradient (HoG)based classifier The work in [17] has proposed a moving tar-get detection method that performs motion compensationmotion detection and tracking in parallel by including datacapture and collaboration control modules Multiple targetdetection algorithm was proposed in [18] catering for largenumber of moving targets in wide area surveillance appli-cation Moving target detection and tracking for differentaltitude were presented and demonstrated on UAV-capturedvideos in [19] Feature-based image registration techniquewas proposed in [20] to detect moving object in UAV videoThe authors utilized corner points in subsequent video framesas features to perform ego-motion estimation and compen-sation In [21] a multimodel estimation for aerial video wasproposed to detect moving objects in complex backgroundthat is able to remove buildings trees and other false alarmsin detection As these literature works focused on improvingthe detection algorithm for different cases and did notconsider autonomousUAVdeployment they developed theirsystem in a common desktop computer [17 19ndash21] or GraphicProcessing Unit (GPU) accelerated [22] environmentIn the context of FPGA-based object detection system

most works in the literature were targeted for static camera[6ndash11] as illustrated in Table 1 They utilized background sub-traction techniques such as GaussianMixtureModel (GMM)and ViBE (Visual Background Extractor) to perform fore-ground object segmentation in static background video Thework in [23] has proposed FPGA-basedmoving object detec-tion for a walking robotThey implemented ego-motion esti-mation using optical flow technique and framedifferencing inhardwaresoftware codesign systemThere are also several literatures proposing FPGA-based

detection for UAV applications The research in [24] hasproposed a hardwaresoftware codesign using FPGA for fea-ture detection and tracking in UAV applicationsThe authors

International Journal of Reconfigurable Computing 3

implemented Harris feature detector in dedicated hardwareto extract object features from aerial video while trackingof object based on the features is executed in softwareImplementation of real-time object detection for UAV isdescribed in [13] to detect cars based on their shape size andcolour However both works in [13 24] performed detectionand tracking based on object features and did not focus onmoving targets A suitable moving target detection algorithmfor FPGA targeting sense and avoid system in UAV has beenproposed in [25] by using regional phase correlation tech-nique but the authors did not prototype the complete systemin FPGA device In addition research in [26] also presentedthe hardware design and architecture of real-time ego-motionestimation for UAV video Hence there are limited numbersof works in the literature focusing on the development ofa complete prototype to perform real-time moving targetdetection for UAV applications using FPGA

3 Moving Target Detection Algorithm

As UAV is a moving platform the proposed moving targetdetection algorithm employs image registration technique tocompensate the ego-motion prior to object segmentationImage registration algorithms can be classified into feature-based and area-based (intensity-based) methods [27 28]In feature-based method detected features such as cor-

ners [29 30] or SURF [31] from two subsequent framesare cross-correlated to find the motion of each feature fromone frame to another Feature-based image registration isreported to have faster computation in software implemen-tation as it uses only a small number of points for featurematching regardless of the number of pixels The number ofdetected features is unpredictable as it depends on the cap-tured scene of the frames thus having unpredictable amountof computation and memory resource making it difficultto be implemented in highly parallel hardware Number offeatures can be reduced to a predictable constant with anadditional step of selecting strongest features based on theirscore (ie feature strength) by sorting or priority queuing[24] However it presents some limitations as only pixels ofthe highly textured areas would be selected while neglectingthe homogeneous area [32] Moreover feature-based methodrequires irregular access of memory which is not suitable forstreaming hardwareOn the contrary area-based technique construct point-

to-point correspondence between frames by finding themostsimilar texture of a block (area) from one frame to another Itis suitable for parallelism and stream processing as it offersseveral benefits for hardware implementation

(i) It has highly parallel operations that make it suitablefor parallel processing in hardware implementation

(ii) It allows simple control-flow and does not requireirregular accessing of image pixels

(iii) It has predictablememory requirementwith fixed sizeof computation data

The overall flow of the proposed algorithm is illustratedin Figure 1 It consists of two main processes which are

Motion estimation

Object segmentation

Previous frame Current frame

Detected moving object region

Median filtering

Frame differencing

Affine transformation

RANSAC

Block matching

Affine parameters

Morphological process

Figure 1 Overall algorithm of moving target detection using imageregistration technique

motion estimation and object segmentation Area-basedimage registration is utilized in this work The inputs to thesystem are two consecutive greyscale video frames which arethe current and the previous frames First block matchingis performed on these two frames to produce point-to-pointmotion between frames As aerial imagery in UAV videois assumed to have free camera motion with planar scene[5] affine model is employed to estimate the ego-motionRANSAC is then used to remove insignificant motion (out-liers) among all points resulting in the ego-motion in termsof affine transformation matrixAfter the previous frame is aligned with current frame

using parameters in the affine transformation matrix framedifferencing can be performed with pixel-by-pixel subtrac-tion on both aligned frames followed by thresholding toproduce a binary image Median filtering and morphologicalprocesses are done on the binary image to remove noisesresulting in only the detected moving targetThe proposed algorithm is intended for SoC implementa-

tion consisting of aNios II embedded software processor run-ning at 100MHz However most processes running on NiosII are slow and insufficient to achieve real-time capability Inorder to realize a real-time moving target detection systemall processes in this work are implemented in fully dedicated

4 International Journal of Reconfigurable Computing

hardware accelerators except RANSAC which is partiallyaccelerated in hardware

31 Block Matching Block matching involves two stepsextraction and matching where two consecutive framesare required Extraction process will store several blocksor patches of image pixels from one frame as templatewhile matching process will find their most similar blocksin the second frame By considering the center points ofblocks as reference this algorithm will yield numerous pairsof corresponding points which indicate the point-to-pointmotion (movement of the pixels) between two consecutiveframesThe paired points from these two frames will be usedin RANSAC to estimate the ego-motionBlock extraction is the process of storing numerous

blocks of 9 times 9 pixels from a predefined location from a videoframeThese blocks will be used as templates in the matchingprocess The positions of the template blocks are distributedevenly over the imageThere is nomathematical computationin the extraction process as it involves only direct copying ofimage patches from video stream into temporary memoryMatching process plays the role of finding the most sim-

ilar blocks from current frame for every extracted templateblock from the previous frameThis is done by correlating thetemplate blocks with next frame to find their correspondingposition based on similarity measure Due to simplicityof hardware implementation Sum of Absolute Difference(SAD) is chosen as the matching criterion for the correlationprocess SAD will generate a similarity error rating of pixel-to-pixel correlation between each template block (from pre-vious frame) and matching block (from current frame) SADwill yield zero result if both blocks are pixel-by-pixel identicalBlockmatching is computation intensive as each template

block has to search for its most similar pair by performingSAD with each block within its search region Several searchtechniques had been proposed in the literatures to reducethe computation by minimizing the search region suchas Three-Step Search Technique [33 34] Four-Step SearchTechnique [35] and Diamond Search [36] However most ofthese techniques are targeted for general purpose processorwhich reads image in irregular way and are not suitable forstreaming hardware architecture This work uses traditionalfull search technique [37] as it is efficient to be performedin stream-oriented hardware due to its regular accessing ofimageThe number of required matching computations is pro-

portional to the number of blocks (density) and their corre-sponding search areas Higher density of blockmatching pro-vides more points for ego-motion estimation to reduce imageregistration error but with higher hardware cost require-ment (number of hardware computation units) To reducehardware cost this work employs only a low density block(area-based) matching and does not estimate frame-to-framemotion of every pixelTo further optimize hardware resources in stream-

oriented architecture best-fit and nonoverlapping searchareas are utilized to ensure only one SAD computation isperformed for each incoming pixel For a number of rowblocks 119898 and a number of column blocks 119899 search areas

are evenly distributed for each block with 119904119898times 119904119899pixels

formulated in

119904119898= lfloor

119882

119898

rfloor

119904119899= lfloor

119867

119899

rfloor

(1)

where 119882 and 119867 represent image width and image heightrespectivelyThe template block positions (blue) and their correspond-

ing search areas (green) are illustrated in Figure 2 In eachclock cycle only one template block is matched with oneblock from its corresponding search area As each templateblock will only search in its dedicated search area withoutintruding other regions the whole block matching processshares only one SAD computation unit for processing thewhole image allowing119898 and 119899 to be context-switched in run-timeThe proposed approach is able to perform different

densities of area-based registration using the same hardwarecost However higher density reduces the search areas of eachblock thus limiting the flow displacement (travel distance ofeach point) The displacement limitations in horizontal 119889

119898

and vertical 119889119899are given as 119889

119898= plusmn1198822119898 and 119889

119898= plusmn1198672119899

respectively As the position and movement of UAV (heightvelocity etc) as well as frame rate of captured aerial videoaffect the point-to-point displacement between two succes-sive frames the proposed technique will produce wrongimage registration result if the point-to-point displacementbetween frames exceeds 119889

119898in horizontal orand 119889

119899in

vertical

32 RANSAC After the block matching stage a set of pointpairs (point-to-point motion) from two successive frames areidentified Based on these point pairs ego-motion estimationcan be performed As outliers (inconsistent motions) usuallyappear in these point pairs RANSAC algorithm is appliedto remove outliers from the data RANSAC is an iterativealgorithm to find the affine model that best describes thetransformation of the two subsequent frames Unlike the con-ventional RANSAC [38] this work uses an upper bound timeto terminate RANSAC computation (similar to [39]) regard-less of the number of iterations due to the real-time constraintas illustrated in Algorithm 1At each iteration RANSAC algorithm chooses three

distinct point pairs randomly as samples Hypothesis modelof affine transformation is then generated from the selectedsamples based on

[[[

[

1199091015840

11199091015840

21199091015840

3

1199101015840

11199101015840

21199101015840

3

1 1 1

]]]

]

=[[

[

ℎ0ℎ1ℎ2

ℎ3ℎ4ℎ5

0 0 1

]]

]

[[

[

119909111990921199093

119910111991021199103

1 1 1

]]

]

(2)

where ℎ119894denote the parameters of the affine model to be

estimated 119909119894and 119910

119894are the coordinates of chosen sample

points and 1199091015840119894and 1199101015840

119894represent their corresponding point

pairs

International Journal of Reconfigurable Computing 5

(a) 6 times 4 blocks (b) 8 times 6 blocks

Figure 2 Positions of template blocks (blue) and search areas (green) in video frame for different densities (119898 times 119899) of block matching withsame hardware cost

while time taken lt upper bound time do(1) Randomly select 3 distinct point pairs as samples(2) Generate hypothesis model (affine parameters) basedon the chosen samples(3) Apply 119879

119889119889test on the hypothesis model

(4) Calculate the fitness score of the model(5) Update and store best scored parameters

end while

Algorithm 1 RANSAC algorithm

119879119889119889test proposed in [40] is applied in the algorithm to

speed up RANSAC computation by skipping the followingsteps (step (4) and (5)) if the hypothesis model is far fromthe truth Fitness of the hypothesis is then evaluated andscored by fitting its parameters to all point pairs The besthypothesis model is constantly updated in each iteration andemerges as the final result when the RANSAC is terminatedupon reaching an upper bound time As RANSAC has theleast computation among overall moving target detectionalgorithms it is implemented as software program with onlythe fitness scoring step (step (4)) being hardware acceleratedFitness scoring is the calculation of the fitness for a hypothesismodel towards all input data (point pairs from block match-ing) as described in Algorithm 2Each data is considered as an inlier if its fitting error is

smaller than a predefined distance threshold thdist or viceversa Inlier fitness score is its fitting error while outlier scoreis fixed to thdist as a constant penalty The total fitness scoreis calculated by accumulating all individual scores for eachdata where a perfect fit will have zero fitness score As fitnessscoring is an iterative process for all data the number ofcomputations increases with size of data As RANSAC is astochastic algorithm it may not produce the best-fit affinemodel when given limited iteration

33 Object Segmentation After estimating ego-motion thecamera movement between two successive frames is tobe compensated prior to object foreground detection The

fitness score = 0for all data

119894do

asub119909 = abs(1199092119894minus (1199091119894sdot 1198670+ 1199101119894sdot 1198671+ 1198672))

asub119910 = abs(1199102119894minus (1199091119894sdot 1198673+ 1199101119894sdot 1198674+ 1198675))

score = min((asub1199092 + asub1199102) th2dist)fitnessscore = fitnessscore + score

end forWhereEach data

119894contains a point pair (119909

1119894 1199092119894 1199101119894 and 119910

2119894)

1198670 1198671 1198672 1198673 1198674 1198675are affine parameters of hypothesis

modelth2dist is the predefined distance threshold

Algorithm 2 Fitness scoring in RANSAC algorithm

previous frame is transformed and mosaic with currentframe using the estimated affine parameters from RANSACalgorithm Reverse mapping technique is applied by calcu-lating the corresponding location in the source image basedon the destination pixel location The equation of affinetransformation is shown in

[[[

[

1199091015840

119894

1199101015840

119894

1

]]]

]

=[[

[

ℎ0ℎ1ℎ2

ℎ3ℎ4ℎ5

0 0 1

]]

]

[[

[

119909119894

119910119894

1

]]

]

(3)

where 119909119894and 119910

119894are the pixel coordinates of destination

image 1199091015840119894and 1199101015840

119894denote the corresponding pixel coordinates

in source image and ℎ119894are best-fit affine parameters from

RANSACAs the transformation may produce fractional result

nearest neighbour interpolation is utilized due to its efficiencyin hardware design The ego-motion compensation is per-formed pixel-by-pixel in raster scan generating a stream ofthe transformed previous frame to the next processFrame differencing is executed on the current frame and

the transformed (ego-motion compensated) previous frameby pixel-to-pixel absolute subtraction of both frames Thepixels in the resultant image are threshold with constant

6 International Journal of Reconfigurable Computing

Table 2 Pipeline scheduling for processing subsequent frames

Processes Processing frame at frame period 119905119894

1199050

1199051

1199052

1199053

1199054

sdot sdot sdot 119905119894

Motion estimation(i) Block matching 119865

01198651larr 1198650

1198652larr 1198651

1198653larr 1198652

1198654larr 1198653

sdot sdot sdot 119865119894larr 119865119894minus1

(ii) RANSAC mdash mdash 1198651larr 1198650

1198652larr 1198651

1198653larr 1198652

sdot sdot sdot 119865119894minus1larr 119865119894minus2

Object segmentation(i) Affine transformation

mdash mdash mdash 1198651larr 1198650

1198652larr 1198651

sdot sdot sdot 119865119894minus2larr 119865119894minus3

(ii) Frame differencing(iii) Median filtering(iv) Morphological119865119894larr 119865119895is detection of moving object from 119895th frame to 119894th frame

value of thfd to produce binary image Lower value of thfdmay induce more false alarm in detection while higher valuecauses the miss detection Both subtraction and thresholdingprocesses can be done as soon as two pixels for the samecoordinate from these frames are obtained to yield one binarypixel for the next process Lastly 7 times 7 binary median filterand dilation processes are performed on the binary imageto remove noise and improve the detected region of movingtarget

34 Pipeline Scheduling In order to establish a real-timemoving target detection system for streaming video properpipeline scheduling is utilized to fully maximize the overallsystem throughputThe algorithm is split into several subpro-cesses with each hardware accelerator working on differentframes independently transferring the intermediate resultfrom one process to another until the end of the detectioncycle Hence the system will always produce output everytime after a fixed latency The overall process is divided intofour stages of pipeline as shown in Table 2Due to data dependencies of the streaming algorithm all

processesmust be done sequentially to produce one detectionresult Block matching requires two successive video framesfor computation The first frame is streamed in for blockextraction process and stored into frame buffer Blockmatch-ing is performed after the next frame is obtained with theextracted block of previous frame RANSAC can only beginits computation after block matching has finished processingon the entire frame Lastly two original frames (119865

119894minus2and

119865119894minus3) are read from frame buffer for object segmentation to

produce the final result Object segmentation computationcan be performed in stream without further frame bufferingThe overall pipeline processing of the streaming system hasfour framesrsquo latency Hence at least four frames (119865

119894minus3to 119865119894)

must be stored in frame buffer at all time for a completemoving target detection process

4 Proposed Moving Target Detection SoC

Themoving target detection SoC is developed andprototypedin Terasic DE2-115 board with Altera Cyclone IV FPGAdevice The system consists of hardwaresoftware codesignof the algorithm of where the hardware computation is

executed in dedicated accelerator coded in Verilog HardwareDescription Language (HDL) while software program isperformed using a soft-core Nios II processor with SDRAMas software memoryThe system architecture of the proposedmoving target detection SoC is illustrated in Figure 3Camera interface handles the image acquisition tasks to

provide the raw image for processing while VGA interfacemanages video displaying task Apart from being a softwarememory part of SDRAM is also reserved as video displaybuffer Thus Direct Memory Access (DMA) technique isapplied to read and write the displaying frame in SDRAM toensure the high throughput image transferAs multiple frames are required at the same time to

detect moving target frame buffer is required to temporarilystore the frames for processing Hence SRAM is utilizedas frame buffer due to its low latency access Since mostcomputations are performed in the dedicated hardware NiosII handles only RANSAC process (except fitness scoring stepas described in Section 32) and auxiliary firmware controlsUSB controller is included in the SoC to enable data transferwith USB mass storage device for verification and debuggingpurposes In addition embedded operating system (Nios2-linux) is booted in the system to provide file system anddrivers supportThe real-time video is streamed directly into the mov-

ing target detector for processing Both Nios II and hard-ware accelerator modules compute the result as a hard-waresoftware codesign system and transfer the output frameto SDRAM via DMA VGA interface constantly reads anddisplays the output frame in SDRAM All operations are ableto be performed in real-time attaining a 30 fps moving targetdetection system

41Moving Target DetectionHardware Accelerator Thehard-ware architecture of the moving target detector is shown inFigure 4 It is composed of motion estimation core objectsegmentation core frame grabber and other interfaces Theoverall moving target detection is performed according to thefollowing sequences

(1) Frame grabber receives the input video stream andstores four most recent frames (119865

119894minus3to 119865119894) into frame

buffer through its interface At the same time frame

International Journal of Reconfigurable Computing 7

Moving targetdetector

NIOS II CPU

SDRAMcontroller

Camerainterface

FPGA

Syste

m b

us

Embedded Linux

(i) Firmware control

(ii) Software processing

SDRAM

VGAinterface

VGAdisplay

SRAM

USBcontroller

USB mass storage device

Camera

Software processor Hardware accelerator Controllers and interfaces

Memory components IO components

Figure 3 System architecture of moving target detection

Framebuffer

interface

Motionestimation

core

Framegrabber

Objectsegmentation

core

Video streaminput

Businterface(slave)

Software processor

Frame buffer(SRAM)

Businterface(master)

Videoresult

DMA to SDRAM

Affineparameters

RANSACcomputation

Fiminus2

Fi

and Fiminus3

Figure 4 Hardware architecture of moving target detector

8 International Journal of Reconfigurable Computing

grabber also provides the current frame (119865119894) tomotion

estimation core(2) Motion estimation core performs blockmatching andRANSAC computation Since RANSAC is computedin both hardware and software software processor isconstantly accessing this core via system bus interfaceto calculate the affine parameters

(3) After RANSAC the affine parameters are transferredfrom software to object segmentation core Two pre-vious frames (119865

119894minus2and 119865

119894minus3) are read from the frame

buffer by object segmentation core for processing(4) Several processes involving affine transformationframe differencing median filter and dilation arethen performed on both frames resulting in thedetected moving target

(5) Lastly the bus interface (master) provides DMAaccess for object segmentation core to transfer theend result into SDRAMfor displaying and verificationpurposes

As the frame buffer (SRAM) is a single port 16-bitmemory frame grabber concatenates two neighbouring 8-bitgreyscale pixels to store in one memory location Since framegrabber and object segmentation core share the frame bufferto write and read frames respectively frame buffer interfaceprovides priority arbitration and gives frame grabber thehighest priority granting everywrite request However framebuffer may be busy for a couple of clock cycles due to readoperation of SRAM by other modules a small FIFO withdepth of 4 is utilized in frame grabber to temporarily bufferthe incoming image pixels

42 Motion Estimation Hardware Accelerator Motion esti-mation core consists of block matching and RANSAC hard-ware accelerators Since RANSAC requires the entire dataof point pairs provided by block matching to begin itscomputation additional buffers are needed to temporarilystore the corresponding point pairs for every two subsequentframes The hardware architecture for motion estimationprocess is shown in Figure 5To enable high throughput data (point pairs) sharing

for both block matching and RANSAC double bufferingtechnique is applied by using two buffers (Buffer 1 and Buffer2) as data storage For any instance one buffer is writtenby block matching while the other is used for computationby RANSAC Buffer controller swaps the roles of these twobuffers for each incoming new frame therefore ensuringboth processes to be pipelined by reading and writing oneach buffer subsequently Buffer swapping is initiated at eachcompletion of block matching modules while RANSAC isperformed during the time gap between each swap and isterminated before the next swap

421 Block Matching Hardware Accelerator Figure 7 showsthe architecture of the proposed block matching hardwareaccelerator performing template blocks extraction fromone frame and matching of these template blocks in theircorresponding search areas from next frame The overall

Buffer controllerBuffer 1

Block matchingVideo stream

input

RANSAC acceleratorTo softwareprocessor

Buffer 2

Figure 5 Hardware architecture of motion estimation core

process can be completed in stream to yield the point-to-point motion (point pairs) of two subsequent frames withoutbuffering an entire frameAs 9 times 9 block size is utilized in block matching a 9-

tap line buffer is designed in such a way that 9 times 9 pixels ofmoving window can be obtained in every clock cycleThese 9times 9 pixels are shared for both block extraction and matchingprocesses and are read one by one in pipeline from the linebuffer at each valid cycle resulting in a total of 81 cycles toobtain a complete windowThe block extractor keeps track of the coordinate of

current pixel in video stream as a reference for extractionprocess Template blocks from incoming frames are extractedand stored temporarily into block memory As each block isextracted line-by-line in raster scan blockmemory is dividedinto nine-rowmemories as illustrated in Figure 6(a)with eachof which being used to store one pixel row in template blocksWhen video stream reaches the block position each pixelrow is loaded into each rowmemory from the correspondingtap of the line buffer Block coordinates are also stored in aseparate FIFO to keep track of its positionSince only one SAD processor is used for matching 119898 times119899 blocks as mentioned in Section 31 the template blockhas to be swapped according to the corresponding searcharea during raster scan Hence row memory is constructedwith two FIFOs upper and lower FIFO as illustrated inFigure 6(b) to enable block swapping during matchingprocess Template blocks are stored into upper FIFO duringextraction process During matching process each line ofraster scan enters eight different search areas to match eightdifferent template blocks respectively Hence one row oftemplate blocks is cached in lower FIFO and is repeatedlyused until the end of their search areas (reaching next row ofsearch areas) Upon reaching each new row of search areastemplate blocks in lower FIFO are replaced with new row oftemplate blocks from upper FIFO At the last line of rasterscan the lower FIFO is flushed to prevent overflow

International Journal of Reconfigurable Computing 9

CV ControlVector(CV)

Tap 0 Tap 1

Row 1pixels

Template blockscoordinate

Template blockscoordinate

Template blocks pixels

Row 2pixels

Tap 8

Row 8pixels

CV

Rowmemory

Rowmemory

Coordinate memory (FIFO)

Row memory

middot middot middot

middot middot middot

middot middot middot

(a) Block memory consisting of nine-row memories

wr_upper

rd_upper

wr_lower

rd_lower

sel1

sel2

CV_in CV_out

Tap

Row pixels

UpperFIFO

LowerFIFO

1 0

0

Controlregisters

1 0

(b) Row memory contains an upper FIFO and lower FIFO

Figure 6 Block memory architecture for storing template blocks

Blockextractor

Bestscore

tracker

Line buffer

Block memory

SAD processor

Video stream input

Control vector

Matching score

Point pairs

Template blocks Blocks coordinate

9 times 9 window pixels

Figure 7 Stream-oriented hardware architecture of blockmatching

In order to efficiently extract and match all blocksdifferent Control Vector (CV) as illustrated in Table 3 is sentto perform different reading and writing operations in blockmemory based on the current position in raster scan Bothreads andwrites are independent of each other and are able tobe executed at the same time Pixels are processed one by onein 81 cycles to complete a window Both writing and reading

Table 3 Control Vector (CV) for different read andwrite operationsof block memory

Position of raster scan WriteupperReadupper

Writelower

Readlower sel1 sel2

Entering templateblock position 1 x x x x x

Entering first searcharea row x 1 1 0 1 1

Entering next searcharea row x 1 1 1 1 1

Reentering samesearch area row x 0 1 1 0 0

Leaving last searcharea row x 0 0 1 0 0

processes require 9 cycles for each row memory passing CVfrom the first row memory to the next row memory untilthe end to complete a 81-pixel write or read operation of atemplate blockSAD processor performs the correlation of the template

blocks from previous frame with all possible blocks fromcurrent frame according to the search area Extracted blockpixels are read from block memory while window pixels insearch areas are provided from the taps of the line bufferThetotal number of required PEs is the total number of pixelsin a window The process is pipelined such that each pixelis computed in each PE as soon as it is obtained from theline buffer Matching score of each window can be obtainedin every cycle after a fixed latencyLastly the best score tracker constantly stores and updates

the best matching score for each template block within itscorresponding search area The matching score is compared

10 International Journal of Reconfigurable Computing

x2 H2y1 H1

H0y2 H5 x1x1 H3

y1 H4

minus times times minus

minus minus

times times

+

+

+

abs abs

sqr sqr

min

acc

Pipelineregister

Pipelineregister

Pipelineregister

Fitness score

th2dist

Figure 8 Hardware datapath of fitness scoring in RANSAC accel-erator

among the same search area and the coordinates of the best-scored blocks are preserved At the end of each search areathe coordinates of the best pairs (template blocks and theirbest-scored blocks) are sent to RANSAC module for nextprocessing Hence the proposed block matching hardware isable to produce point-to-point motion (point pairs) of everytwo successive frames in streaming video at line rate

422 RANSAC Hardware Accelerator RANSAC hardwaredesign in [39] is utilized in this work which acceleratesonly fitness scoring step As described in Algorithm 2fitness scoring is an iterative process which performs similarcomputation to all data samples based on hypothesis modelHence this data intensive process is executed in pipelineddatapath as illustrated in Figure 8 A control unit is utilizedto read input data provided by block matching from bufferand stream these inputs to the datapath unit at every clockcycleThe datapath unit utilizes three stages of pipeline with

the aim of isolating multiplication processes thus allowingfaster clock rate The first stage pipeline registers are locatedright after the first multiplication while the other two stagesof pipeline registers enclose the squaring processes Theindividual score is accumulated in the last stage producingtotal final fitness score The accumulator is reset on each newset of hypothesis Thus the total number of cycles required

Table 4 Fixed point precision of fitness scoring inputs

Parameter Number of bits Number rangeInteger Fraction

1199091 1199101 1199092 1199102

11 0 [minus1024 1024)1198670119867111986731198674

4 12 [minus8 8)11986721198675

11 5 [minus1024 1024)

Detected movingtarget

Affineparameters

fromsoftware

AffinePE

Framereader

Address

Framedifferencing PE

Binary image stream

MedianPE

Line buffer

DilationPE

Line buffer

from framebuffer

Fiminus2 and Fiminus3

Fiminus2F998400iminus3

Figure 9 Hardware architecture for object segmentation

for fitness score computation is the number of overall dataplus the four-cycle latencyAlthough fitness scoring could require floating point

computations the datapath unit uses suitable fixed pointprecision for each stage SinceNios II is a 32-bit processor theaffineparameters in hypothesismodel (119867

0to1198676) are properly

scaled to different precision of 16-bit fixed points as describedin Table 4 so that two affine parameters can be assigned in asingle 32-bit write instruction As this system is targeted for640 times 480 pixelsrsquo video all input coordinates (119909

1 1199101 1199092 and

1199102) are scaled to 11 bits

43 Object Segmentation Hardware Architecture As objectsegmentation can be performed in one raster scan a stream-oriented architecture is proposed as illustrated in Figure 9 Allsubprocesses are executed in pipeline on the streaming videowithout additional frame buffering Object segmentationprocess is initiated by software processor after providing theaffine parameters from RANSAC to affine PE Two frames(119865119894minus2and 119865

119894minus3as described in Table 2) from frame buffer

(SRAM) are required to segment the moving targetBased on the affine parameters from RANSAC affine PE

uses reverse mapping technique to find each pixel location inprevious frame (119865

119894minus3) using (3) and generates their addresses

in frame buffer (SRAM) Frame readers fetch the previ-ous frame (119865

119894minus3) pixel-by-pixel according to the generated

addresses from frame buffer thus constructing a stream oftransformed frame which is denoted as 1198651015840

119894minus3

International Journal of Reconfigurable Computing 11

Leftmostcolumn pixels

Rightmostcolumn pixels

Median output stream

Addertree

Addertree

24

pixelsTo line buffer

Binary image stream

7 times 7 window

minus

+

gtgt

Figure 10 Hardware architecture of median PE

By synchronizing the streams of both frames framedifferencing can be executed in pipeline as soon as one pixelfrom each frame is obtained Hence one pixel in currentframe (119865

119894minus2) and one pixel in transformed frame (1198651015840

119894minus3) are

fetched alternatively from their corresponding memory loca-tions by frame reader constructing two synchronized streamsof 119865119894minus2and 1198651015840

119894minus3frames Frame differencing PE performs

pixel-to-pixel absolute subtraction and thresholding on thestreams The frame differencing PE is able to compute inone cycle per pixel A configurable threshold value thfd isused after the subtraction yielding a stream of binary imagewithout buffering the whole frameAfter frame differencing the binary image is streamed

into 7 times 7 median filtering Seven lines of the image arebuffered in the line buffer providing 7 times 7 pixels window forthe median PE to perform the median computation Mediancomputation can be performed in one clock cycle for eachprocessing window due to short propagation delay as onlybinary pixels are involved Figure 10 shows the hardware logicdesign of median PEMedian filtering can be computed by counting the num-

ber of asserted (binary 1) pixels in the window If morethan half the pixels in the window (24 pixels) are assertedthe resultant pixel is ldquo1rdquo or ldquo0rdquo otherwise Since processingwindow will move only one pixel to the right for each com-putation during raster scan current pixel count is computedby adding the previous pixel count and rightmost columnpixels in the current window while subtracting the leftmost

column pixels in the previous window Final binary outputpixel is produced by thresholding the current pixel count with24 (half of window size)As dilation is also a 7times 7window-based processing it uses

similar line buffering technique asmedian filtering Howeveronly simple logical OR operation is performed on all windowpixels Due to its simplicity dilation PE can also be computedin one clock cycle resulting in the stream of binary imagewith detected region of moving targets

5 Experimental Results

51 Verification of Proposed SoC Theproposedmoving targetdetection SoC is verified in offline detection mode using thedatabase in [41] Test videos are 640 times 480 pixels in sizeand are greyscaled prior to the verification process The testvideos are transferred to the system for computation via aUSB mass storage device After performing the detection inSoC the image results are displayed on VGA and also storedon USB drive for verification Figure 11 shows the movingtarget detection result from the proposed SoC using differentsample videos The detected regions (red) are overlaid on theinput frame In most cases the proposed SoC is able to detectthe moving target in consecutive framesHowever there are several limitations in this work Block

matching may not give a goodmotion estimation result if theextracted blocks do not have texture (the pixels intensity aresimilar) Moreover the detected region of moving target mayappear in cavity or multiple split of smaller regions as onlysimple frame differencing is applied in the proposed systemAdditional postprocessing to produce better detected blob bymerging split regions is out of the scope in this workAs the stochastic RANSAC algorithm is terminated after

a constant time step for each frame image registration errormay occur which produces incorrect ego-motion estimationThis could be mitigated by accelerating RANSAC algorithmto ensure more iterations using dedicated hardware or highperformance general purpose processor

5.2. Performance Evaluation of Detection Algorithm. The performance evaluation of the implemented detection algorithm uses the Mathematical Performance Metric in [42], which involves the following parameters:

(i) True positive, TP: a detected moving object.
(ii) False positive, FP: detected regions that do not correspond to any moving object.
(iii) False negative, FN: a nondetected moving object.
(iv) Detection rate, DR: the ratio of TP to the combination of TP and FN, as formulated in

\[ \mathrm{DR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}. \tag{4} \]

(v) False alarm rate, FAR: the ratio between FP and all positive detections, as defined in

\[ \mathrm{FAR} = \frac{\mathrm{FP}}{\mathrm{TP} + \mathrm{FP}}. \tag{5} \]
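As a small illustration (helper names invented here), the two metrics in (4) and (5) reduce to a few lines of C once TP, FP, and FN have been accumulated over the test frames:

```c
/* Detection rate (4) and false alarm rate (5) from accumulated counts. */
typedef struct { int tp, fp, fn; } eval_counts_t;

static double detection_rate(const eval_counts_t *e)
{
    return (double)e->tp / (double)(e->tp + e->fn);   /* DR = TP/(TP+FN)  */
}

static double false_alarm_rate(const eval_counts_t *e)
{
    return (double)e->fp / (double)(e->tp + e->fp);   /* FAR = FP/(TP+FP) */
}
```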


[Figure 11: Detected regions from the proposed moving target detection SoC on different sample videos in [41]. (a)–(d) frames 255, 275, 295, and 315 of video V3V100003 004; (e)–(h) frames 1000, 1020, 1040, and 1060 of video V3V100004 003; (i)–(l) frames 600, 620, 640, and 660 of video V4V100007 017.]

[Figure 12: Evaluation of performance metrics TP, FP, and FN based on ground truth boxes (blue) and detected regions (red).]

To obtain the performance metrics, ground truth regions are manually labelled in several frames of the test videos: a bounding box is drawn across each moving object to indicate its ground truth region in every frame, as depicted in Figure 12. A simple postprocessing step filters out detected regions smaller than 15 pixels in width or 15 pixels in height prior to the evaluation. A detected moving object (TP) has detected regions within its bounded ground truth area, while a nondetected moving object (FN) has no detected region overlapping its ground truth area. A detected region that does not overlap with any ground truth region is counted as a false positive (FP).
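The following C sketch captures these evaluation rules under simplifying assumptions (axis-aligned boxes, the 15-pixel size filter, and invented names; a sketch of the procedure, not the authors' evaluation script):

```c
#include <stdbool.h>

typedef struct { int x0, y0, x1, y1; } box_t;   /* axis-aligned box */

static bool too_small(box_t b)                  /* 15-pixel size filter */
{
    return (b.x1 - b.x0) < 15 || (b.y1 - b.y0) < 15;
}

static bool overlaps(box_t a, box_t b)
{
    return a.x0 <= b.x1 && b.x0 <= a.x1 && a.y0 <= b.y1 && b.y0 <= a.y1;
}

static void evaluate_frame(const box_t *det, int n_det,
                           const box_t *gt, int n_gt,
                           int *tp, int *fp, int *fn)
{
    /* TP/FN: each ground truth box hit by any surviving detection is a TP. */
    for (int j = 0; j < n_gt; j++) {
        bool hit = false;
        for (int i = 0; i < n_det && !hit; i++)
            hit = !too_small(det[i]) && overlaps(det[i], gt[j]);
        if (hit) (*tp)++; else (*fn)++;
    }
    /* FP: surviving detections overlapping no ground truth box. */
    for (int i = 0; i < n_det; i++) {
        if (too_small(det[i])) continue;
        bool matched = false;
        for (int j = 0; j < n_gt && !matched; j++)
            matched = overlaps(det[i], gt[j]);
        if (!matched) (*fp)++;
    }
}
```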

The detection performance is evaluated over different parameter configurations. The DR and FAR for 1000 test frames using different numbers of blocks (density of ego-motion estimation) $m \times n$ in area-based registration and different frame differencing thresholds th_fd are depicted in Table 5 and Figure 13.

The experimental results show that DR is almost constant across the different densities of ego-motion estimation but decreases with th_fd. Although a higher density in the proposed work imposes lower displacement limitations $d_m$ and $d_n$, as discussed in Section 3.1, most of the point-to-point displacements do not exceed these limitations due to the slow UAV movement in most frames of the test dataset. On the contrary, a higher value of th_fd may filter out the moving object if the intensities of the object pixels and the background pixels are almost similar.

FAR decreases with the density of ego-motion estimation due to the higher quality of the image registration process, but it increases if most frames exceed the displacement limitations $d_m$ and $d_n$. However, false registration due to the displacement limitation results in a huge blob of foreground but does not greatly increase FAR. Although higher values of th_fd decrease the false detection rate, they also produce smaller foreground areas for all detected moving objects, as pixels with intensity almost similar to the background are thresholded out.

5.3. Speed Comparison with Full Software Implementation. The computation speed of the proposed moving target detection SoC is compared with software computation on different platforms, including a modern CPU (Intel Core i5) in a desktop computer and an embedded processor (ARM).


[Figure 13: DR and FAR for different densities of ego-motion estimation $m \times n$ and frame differencing thresholds th_fd = 15, 20, and 25. (a) Detection rate DR; (b) false alarm rate FAR. Both metrics are plotted against the density of ego-motion estimation.]

Table 5: Performance evaluation in terms of DR and FAR for 1000 frames using different densities of ego-motion estimation $m \times n$ and frame differencing thresholds th_fd.

m × n   th_fd   DR      FAR
12      15      0.958   0.643
12      20      0.954   0.331
12      25      0.949   0.194
24      15      0.957   0.568
24      20      0.950   0.324
24      25      0.945   0.101
35      15      0.958   0.548
35      20      0.952   0.215
35      25      0.947   0.090
48      15      0.959   0.539
48      20      0.952   0.253
48      25      0.944   0.079
70      15      0.958   0.509
70      20      0.951   0.188
70      25      0.946   0.075
88      15      0.960   0.489
88      20      0.951   0.219
88      25      0.947   0.074
108     15      0.958   0.483
108     20      0.952   0.168
108     25      0.946   0.058
140     15      0.958   0.499
140     20      0.951   0.187
140     25      0.946   0.059
165     15      0.958   0.474
165     20      0.953   0.214
165     25      0.947   0.068
192     15      0.959   0.478
192     20      0.952   0.169
192     25      0.946   0.092

Table 6: Computation speed comparison of the proposed system with different software implementations using area-based and feature-based registrations.

Platform              Frequency   Registration technique   Frame rate (fps)   Hardware speed-up
Proposed SoC          100 MHz     Area-based               30                 1
Intel Core i5-4210U   1.70 GHz    Area-based               4.26               7.04
                                  Feature-based            13.11              2.29
ARM1176JZF            700 MHz     Area-based               0.20               150
                                  Feature-based            0.56               53.57

Table 6 illustrates the comparison of computation frame rate and hardware speed-up between the proposed system and the software implementations using test videos in [41].

As feature-based image registration has faster computation in software implementation compared to area-based registration, the speed performance of the feature-based method is also included for comparison. In the feature-based implementation, features are first detected in each frame. The detected features from the current frame are cross-correlated with features from the previous frame, while the RANSAC algorithm is used to estimate the ego-motion between frames. After compensating the ego-motion, segmentation of the moving object uses the same processes as the proposed system. To further optimize the software implementation in terms of speed, a fast feature detection algorithm [30] is utilized. As the number of features affects the computation time of the feature matching step, only the 100 strongest features in each frame are selected for processing. However, the performance evaluation does not consider multithreaded software execution.


Table 7: Resource usage of the proposed moving target detection SoC (FPGA device: Altera Cyclone IV).

Logic units                     Used      Utilization (%)
Total combinational functions   15,161    13
Total registers                 10,803    9
Total memory bits               521,054   13
Embedded multipliers            27        5

Based on the experimental results, the speed performance of the proposed moving target detection SoC surpasses optimized software computation by 2.29 times and 53.57 times compared with the implementations on a modern CPU and an embedded CPU, respectively. The software computation (RANSAC) in the HW/SW codesign of the proposed system creates a speed bottleneck, thus limiting the maximum throughput to 30 fps. The processing frame rate of the proposed system can be further improved by using fully dedicated hardware.
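For reference, the hardware speed-up figures in Table 6 are simply the ratio of the SoC frame rate to the corresponding software frame rate; for the optimized (feature-based) software implementations,

\[
\text{speed-up} = \frac{f_{\text{SoC}}}{f_{\text{software}}}, \qquad
\frac{30\,\text{fps}}{13.11\,\text{fps}} \approx 2.29, \qquad
\frac{30\,\text{fps}}{0.56\,\text{fps}} \approx 53.57.
\]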

5.4. Resource Utilization. The overall hardware resource utilization of the complete system is illustrated in Table 7. This prototype of a real-time moving object detection system utilizes less than 20 percent of the total resources in the Altera Cyclone IV FPGA device. As the proposed system uses off-chip memory components for frame buffering, FPGA on-chip memory is utilized only for line buffering in streaming processes (e.g., block matching and median filtering) and for storing intermediate results (e.g., point pairs after block matching). Thus, the low resource usage of the proposed system provides abundant hardware space for other processes, such as target tracking or classification, to be developed in the future.

6. Conclusions

Moving target detection is a crucial step in most computer vision problems, especially for UAV applications. On-chip detection without the need for real-time video transmission to the ground provides immense benefit to diverse applications such as military surveillance and resource exploration. In order to perform this complex embedded video processing on-chip, an FPGA-based system is desirable due to the potential parallelism of the algorithm.

This paper proposed a moving target detection system using FPGA to enable an autonomous UAV that performs the computer vision algorithm on the flying platform. The proposed system is prototyped using an Altera Cyclone IV FPGA device on a Terasic DE2-115 development board mounted with a TRDB-D5M camera. The system is developed as a HW/SW codesign using dedicated hardware with a Nios II software processor (booted with embedded Linux) running at a 100 MHz clock rate. As stream-oriented hardware with pipeline processing is utilized, the proposed system achieves real-time capability with a processing speed of 30 frames per second on 640 × 480 live video. Experimental results show that the proposed SoC performs 2.29 times and 53.57 times faster than optimized software computation on a modern desktop computer (Intel Core i5) and an embedded processor (ARM), respectively. In addition, the proposed moving target detection uses less than 20 percent of the total resources in the FPGA device, allowing other hardware accelerators to be implemented in the future.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

The authors would like to express their gratitude to Universiti Teknologi Malaysia (UTM) and the Ministry of Science, Technology and Innovation (MOSTI), Malaysia, for supporting this research work under Research Grants 01-01-06-SF1197 and 01-01-06-SF1229.

References

[1] A. Ahmed, M. Nagai, C. Tianen, and R. Shibasaki, "UAV based monitoring system and object detection technique development for a disaster area," International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. 37, pp. 373–377, 2008.

[2] B. Coifman, M. McCord, R. Mishalani, M. Iswalt, and Y. Ji, "Roadway traffic monitoring from an unmanned aerial vehicle," IEE Proceedings - Intelligent Transport Systems, vol. 153, no. 1, pp. 11–20, 2006.

[3] K. Kanistras, G. Martins, M. J. Rutherford, and K. P. Valavanis, "Survey of unmanned aerial vehicles (UAVs) for traffic monitoring," in Handbook of Unmanned Aerial Vehicles, pp. 2643–2666, Springer, 2015.

[4] K. Nordberg, P. Doherty, G. Farneback et al., "Vision for a UAV helicopter," in Proceedings of the International Conference on Intelligent Robots and Systems (IROS '02), Workshop on Aerial Robotics, pp. 29–34, Lausanne, Switzerland, October 2002.

[5] D. Zamalieva and A. Yilmaz, "Background subtraction for the moving camera: a geometric approach," Computer Vision and Image Understanding, vol. 127, pp. 73–85, 2014.

[6] M. Genovese and E. Napoli, "ASIC and FPGA implementation of the Gaussian mixture model algorithm for real-time segmentation of high definition video," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 22, no. 3, pp. 537–547, 2014.

[7] F. Kristensen, H. Hedberg, H. Jiang, P. Nilsson, and V. Owall, "An embedded real-time surveillance system: implementation and evaluation," Journal of Signal Processing Systems, vol. 52, no. 1, pp. 75–94, 2008.

[8] H. Jiang, H. Ardo, and V. Owall, "A hardware architecture for real-time video segmentation utilizing memory reduction techniques," IEEE Transactions on Circuits and Systems for Video Technology, vol. 19, no. 2, pp. 226–236, 2009.

[9] M. Genovese and E. Napoli, "FPGA-based architecture for real time segmentation and denoising of HD video," Journal of Real-Time Image Processing, vol. 8, no. 4, pp. 389–401, 2013.

[10] A. Lopez-Bravo, J. Diaz-Carmona, A. Ramirez-Agundis, A. Padilla-Medina, and J. Prado-Olivarez, "FPGA-based video system for real time moving object detection," in Proceedings of the 23rd International Conference on Electronics, Communications and Computing (CONIELECOMP '13), pp. 92–97, IEEE, Cholula, Mexico, March 2013.

[11] T. Kryjak, M. Komorkiewicz, and M. Gorgon, "Real-time moving object detection for video surveillance system in FPGA," in Proceedings of the Conference on Design and Architectures for Signal and Image Processing (DASIP '11), pp. 1–8, IEEE, Tampere, Finland, November 2011.

[12] A. Mittal and D. Huttenlocher, "Scene modeling for wide area surveillance and image synthesis," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 160–167, IEEE, June 2000.

[13] A. Price, J. Pyke, D. Ashiri, and T. Cornall, "Real time object detection for an unmanned aerial vehicle using an FPGA based vision system," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '06), pp. 2854–2859, IEEE, Orlando, Fla, USA, May 2006.

[14] G. J. García, C. A. Jara, J. Pomares, A. Alabdo, L. M. Poggi, and F. Torres, "A survey on FPGA-based sensor systems: towards intelligent and reconfigurable low-power sensors for computer vision, control and signal processing," Sensors, vol. 14, no. 4, pp. 6247–6278, 2014.

[15] S. Ali and M. Shah, "COCOA: tracking in aerial imagery," in Airborne Intelligence, Surveillance, Reconnaissance (ISR) Systems and Applications III, vol. 6209 of Proceedings of SPIE, Orlando, Fla, USA, April 2006.

[16] J. Xiao, C. Yang, F. Han, and H. Cheng, "Vehicle and person tracking in aerial videos," in Multimodal Technologies for Perception of Humans, pp. 203–214, Springer, 2008.

[17] W. Yu, X. Yu, P. Zhang, and J. Zhou, "A new framework of moving target detection and tracking for UAV video application," in Proceedings of the International Archives of the Photogrammetry, Remote Sensing and Spatial Information Science, vol. 37, Beijing, China, 2008.

[18] V. Reilly, H. Idrees, and M. Shah, "Detection and tracking of large number of targets in wide area surveillance," in Computer Vision – ECCV 2010: 11th European Conference on Computer Vision, Heraklion, Crete, Greece, September 5–11, 2010, Proceedings, Part III, vol. 6313 of Lecture Notes in Computer Science, pp. 186–199, Springer, Berlin, Germany, 2010.

[19] J. Wang, Y. Zhang, J. Lu, and W. Xu, "A framework for moving target detection, recognition and tracking in UAV videos," in Affective Computing and Intelligent Interaction, vol. 137 of Advances in Intelligent and Soft Computing, pp. 69–76, Springer, Berlin, Germany, 2012.

[20] S. A. Cheraghi and U. U. Sheikh, "Moving object detection using image registration for a moving camera platform," in Proceedings of the IEEE International Conference on Control System, Computing and Engineering (ICCSCE '12), pp. 355–359, IEEE, Penang, Malaysia, November 2012.

[21] Y. Zhang, X. Tong, T. Yang, and W. Ma, "Multi-model estimation based moving object detection for aerial video," Sensors, vol. 15, no. 4, pp. 8214–8231, 2015.

[22] Q. Yu and G. Medioni, "A GPU-based implementation of motion detection from a moving platform," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPR '08), pp. 1–6, Anchorage, Alaska, USA, June 2008.

[23] A. Laika, J. Paul, C. Claus, W. Stechele, A. E. S. Auf, and E. Maehle, "FPGA-based real-time moving object detection for walking robots," in Proceedings of the 8th IEEE International Workshop on Safety, Security and Rescue Robotics (SSRR '10), pp. 1–8, IEEE, Bremen, Germany, July 2010.

[24] B. Tippetts, S. Fowers, K. Lillywhite, D.-J. Lee, and J. Archibald, "FPGA implementation of a feature detection and tracking algorithm for real-time applications," in Advances in Visual Computing, pp. 682–691, Springer, 2007.

[25] K. May and N. Krouglicof, "Moving target detection for sense and avoid using regional phase correlation," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '13), pp. 4767–4772, IEEE, Karlsruhe, Germany, May 2013.

[26] M. E. Angelopoulou and C.-S. Bouganis, "Vision-based ego-motion estimation on FPGA for unmanned aerial vehicle navigation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 24, no. 6, pp. 1070–1083, 2014.

[27] I. M. El-Emary and M. M. A. El-Kareem, "On the application of genetic algorithms in finger prints registration," World Applied Sciences Journal, vol. 5, no. 3, pp. 276–281, 2008.

[28] A. A. Goshtasby, 2-D and 3-D Image Registration for Medical, Remote Sensing, and Industrial Applications, John Wiley & Sons, New York, NY, USA, 2005.

[29] C. Harris and M. Stephens, "A combined corner and edge detector," in Proceedings of the 4th Alvey Vision Conference, vol. 15, pp. 147–151, 1988.

[30] E. Rosten and T. Drummond, "Machine learning for high-speed corner detection," in Computer Vision – ECCV 2006, pp. 430–443, Springer, 2006.

[31] H. Bay, T. Tuytelaars, and L. Van Gool, "SURF: speeded up robust features," in Computer Vision – ECCV 2006, A. Leonardis, H. Bischof, and A. Pinz, Eds., vol. 3951 of Lecture Notes in Computer Science, pp. 404–417, Springer, 2006.

[32] G. R. Rodríguez-Canosa, S. Thomas, J. del Cerro, A. Barrientos, and B. MacDonald, "A real-time method to detect and track moving objects (DATMO) from unmanned aerial vehicles (UAVs) using a single camera," Remote Sensing, vol. 4, no. 4, pp. 1090–1111, 2012.

[33] B. Liu and A. Zaccarin, "New fast algorithms for the estimation of block motion vectors," IEEE Transactions on Circuits and Systems for Video Technology, vol. 3, no. 2, pp. 148–157, 1993.

[34] R. Li, B. Zeng, and M. L. Liou, "New three-step search algorithm for block motion estimation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 4, no. 4, pp. 438–442, 1994.

[35] L.-M. Po and W.-C. Ma, "A novel four-step search algorithm for fast block motion estimation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 6, no. 3, pp. 313–317, 1996.

[36] S. Zhu and K.-K. Ma, "A new diamond search algorithm for fast block-matching motion estimation," IEEE Transactions on Image Processing, vol. 9, no. 2, pp. 287–290, 2000.

[37] L. De Vos and M. Stegherr, "Parameterizable VLSI architectures for the full-search block-matching algorithm," IEEE Transactions on Circuits and Systems, vol. 36, no. 10, pp. 1309–1316, 1989.

[38] M. A. Fischler and R. C. Bolles, "Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography," Communications of the ACM, vol. 24, no. 6, pp. 381–395, 1981.

[39] J. W. Tang, N. Shaikh-Husin, and U. U. Sheikh, "FPGA implementation of RANSAC algorithm for real-time image geometry estimation," in Proceedings of the 11th IEEE Student Conference on Research and Development (SCOReD '13), pp. 290–294, IEEE, Putrajaya, Malaysia, December 2013.

[40] O. Chum and J. Matas, "Randomized RANSAC with $T_{d,d}$ test," in Proceedings of the British Machine Vision Conference, vol. 2, pp. 448–457, September 2002.

[41] DARPA, SDMS Public Web Site, 2003, https://www.sdms.afrl.af.mil.

[42] A. F. M. S. Saif, A. S. Prabuwono, and Z. R. Mahayuddin, "Motion analysis for moving object detection from UAV aerial images: a review," in Proceedings of the International Conference on Informatics, Electronics and Vision (ICIEV '14), pp. 1–6, IEEE, Dhaka, Bangladesh, May 2014.



fetched alternatively from their corresponding memory loca-tions by frame reader constructing two synchronized streamsof 119865119894minus2and 1198651015840

119894minus3frames Frame differencing PE performs

pixel-to-pixel absolute subtraction and thresholding on thestreams The frame differencing PE is able to compute inone cycle per pixel A configurable threshold value thfd isused after the subtraction yielding a stream of binary imagewithout buffering the whole frameAfter frame differencing the binary image is streamed

into 7 times 7 median filtering Seven lines of the image arebuffered in the line buffer providing 7 times 7 pixels window forthe median PE to perform the median computation Mediancomputation can be performed in one clock cycle for eachprocessing window due to short propagation delay as onlybinary pixels are involved Figure 10 shows the hardware logicdesign of median PEMedian filtering can be computed by counting the num-

ber of asserted (binary 1) pixels in the window If morethan half the pixels in the window (24 pixels) are assertedthe resultant pixel is ldquo1rdquo or ldquo0rdquo otherwise Since processingwindow will move only one pixel to the right for each com-putation during raster scan current pixel count is computedby adding the previous pixel count and rightmost columnpixels in the current window while subtracting the leftmost

column pixels in the previous window Final binary outputpixel is produced by thresholding the current pixel count with24 (half of window size)As dilation is also a 7times 7window-based processing it uses

similar line buffering technique asmedian filtering Howeveronly simple logical OR operation is performed on all windowpixels Due to its simplicity dilation PE can also be computedin one clock cycle resulting in the stream of binary imagewith detected region of moving targets

5 Experimental Results

51 Verification of Proposed SoC Theproposedmoving targetdetection SoC is verified in offline detection mode using thedatabase in [41] Test videos are 640 times 480 pixels in sizeand are greyscaled prior to the verification process The testvideos are transferred to the system for computation via aUSB mass storage device After performing the detection inSoC the image results are displayed on VGA and also storedon USB drive for verification Figure 11 shows the movingtarget detection result from the proposed SoC using differentsample videos The detected regions (red) are overlaid on theinput frame In most cases the proposed SoC is able to detectthe moving target in consecutive framesHowever there are several limitations in this work Block

matching may not give a goodmotion estimation result if theextracted blocks do not have texture (the pixels intensity aresimilar) Moreover the detected region of moving target mayappear in cavity or multiple split of smaller regions as onlysimple frame differencing is applied in the proposed systemAdditional postprocessing to produce better detected blob bymerging split regions is out of the scope in this workAs the stochastic RANSAC algorithm is terminated after

a constant time step for each frame image registration errormay occur which produces incorrect ego-motion estimationThis could be mitigated by accelerating RANSAC algorithmto ensure more iterations using dedicated hardware or highperformance general purpose processor

5.2. Performance Evaluation of Detection Algorithm. The performance evaluation of the implemented detection algorithm uses the Mathematical Performance Metric in [42], which involves the following parameters:

(i) True positive, TP: the detected moving objects.

(ii) False positive, FP: detected regions that do not correspond to any moving object.

(iii) False negative, FN: the nondetected moving objects.

(iv) Detection rate, DR: the ratio of TP to the combination of TP and FN, as formulated in

\[ \mathrm{DR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}. \tag{4} \]

(v) False alarm rate, FAR: the ratio of FP to all positive detections, as defined in

\[ \mathrm{FAR} = \frac{\mathrm{FP}}{\mathrm{TP} + \mathrm{FP}}. \tag{5} \]
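As a quick numeric check of (4) and (5), a two-line C sketch (the counts in the comment are illustrative, not results from the paper):

/* Detection rate and false alarm rate from raw counts, per (4) and (5).
 * Example: TP = 100, FN = 5, FP = 25 gives DR ≈ 0.952 and FAR = 0.2. */
static double detection_rate(int tp, int fn)   { return (double)tp / (tp + fn); }
static double false_alarm_rate(int tp, int fp) { return (double)fp / (tp + fp); }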


Figure 11: Detected regions from the proposed moving target detection SoC on different sample videos in [41]: (a)–(d) frames 255, 275, 295, and 315 of video V3V100003 004; (e)–(h) frames 1000, 1020, 1040, and 1060 of video V3V100004 003; (i)–(l) frames 600, 620, 640, and 660 of video V4V100007 017.

Figure 12: Evaluation of performance metrics TP, FP, and FN based on ground truth boxes (blue) and detected regions (red).

To obtain the performance metrics, ground truth regions are manually labelled in several frames of the test videos. A bounding box is drawn across each moving object to indicate the ground truth region in every frame, as depicted in Figure 12. A simple postprocessing is performed on the detected regions by filtering out any detected region smaller than 15 pixels in width or 15 pixels in height prior to the evaluation (a sketch of this filter is given at the end of this subsection). A detected moving object (TP) has detected regions in its bounded ground truth area, while a nondetected moving object (FN) has no detected region overlapping with its ground truth area. A detected region that does not overlap with any ground truth region is considered a false positive (FP).

The detection performance is evaluated over different parameter configurations. The DR and FAR for 1000 test frames using different numbers of blocks (density of ego-motion estimation) m × n in area-based registration and frame differencing thresholds th_fd are depicted in Table 5 and Figure 13.

The experimental results show that DR is almost similar across different densities of ego-motion estimation but decreases with th_fd. Although higher density in the proposed work has lower displacement limitations d_m and d_n, as discussed in Section 3.1, most of the point-to-point displacements do not exceed the limitation, due to slow UAV movement in most frames of the test dataset. On the contrary, a higher value of th_fd may filter out the moving object if the intensities of the object pixels and the background pixels are similar.

FAR decreases with the density of ego-motion estimation due to the higher quality of the image registration process, but increases if most frames exceed the displacement limitations d_m and d_n. However, false registration due to the displacement limitation results in a huge blob of foreground but does not greatly increase FAR. Although higher values of th_fd decrease the false detection rate, they also produce smaller foreground areas for all detected moving objects, as pixels with intensity similar to the background are thresholded out.
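The 15-pixel size filter mentioned above can be expressed compactly; the Region structure is an illustrative stand-in for the detector's blob representation, not the authors' data format.

#include <stdint.h>

typedef struct { int x, y, w, h; } Region;   /* detected blob bounding box */

/* Postprocessing used before evaluation: drop detected regions narrower
 * than min_w pixels or shorter than min_h pixels (15 and 15 here);
 * compacts the array in place and returns the kept count. */
static int filter_small_regions(Region *r, int n, int min_w, int min_h)
{
    int kept = 0;
    for (int i = 0; i < n; i++)
        if (r[i].w >= min_w && r[i].h >= min_h)
            r[kept++] = r[i];
    return kept;
}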

5.3. Speed Comparison with Full Software Implementation. The computation speed of the proposed moving target detection SoC is compared with software computation on different platforms, including a modern CPU (Intel Core i5) in a desktop computer and an embedded processor (ARM). Table 6 illustrates the comparison of computation frame rate and hardware speed-up between the proposed system and the software implementations, using the test videos in [41].


Figure 13: DR and FAR for different densities of ego-motion estimation m × n and frame differencing threshold th_fd (th_fd = 15, 20, 25): (a) DR; (b) FAR.

Table 5: Performance evaluation in terms of DR and FAR for 1000 frames using different densities of ego-motion estimation m × n and frame differencing threshold th_fd.

m × n   th_fd   DR      FAR
12      15      0.958   0.643
12      20      0.954   0.331
12      25      0.949   0.194
24      15      0.957   0.568
24      20      0.950   0.324
24      25      0.945   0.101
35      15      0.958   0.548
35      20      0.952   0.215
35      25      0.947   0.090
48      15      0.959   0.539
48      20      0.952   0.253
48      25      0.944   0.079
70      15      0.958   0.509
70      20      0.951   0.188
70      25      0.946   0.075
88      15      0.960   0.489
88      20      0.951   0.219
88      25      0.947   0.074
108     15      0.958   0.483
108     20      0.952   0.168
108     25      0.946   0.058
140     15      0.958   0.499
140     20      0.951   0.187
140     25      0.946   0.059
165     15      0.958   0.474
165     20      0.953   0.214
165     25      0.947   0.068
192     15      0.959   0.478
192     20      0.952   0.169
192     25      0.946   0.092

Table 6: Computation speed comparison of the proposed system with different software implementations using area-based and feature-based registrations.

Platform              Frequency   Registration technique   Frame rate (fps)   Hardware speed-up
Proposed SoC          100 MHz     Area-based               30                 1
Intel Core i5-4210U   1.70 GHz    Area-based               4.26               7.04
                                  Feature-based            13.11              2.29
ARM1176JZF            700 MHz     Area-based               0.20               150
                                  Feature-based            0.56               53.57

As feature-based image registration has faster computation in software implementation compared to area-based registration, the speed performance of the feature-based method is also included for comparison. In the feature-based implementation, features are first detected in each frame. The detected features from the current frame are cross-correlated with features from the previous frame, while the RANSAC algorithm is used to estimate the ego-motion between frames. After compensating the ego-motion, segmentation of the moving object uses the same processes as the proposed system. To further optimize the software implementation in terms of speed performance, a fast feature detection algorithm [30] is utilized. As the number of features affects the computation time of the feature matching step, only the 100 strongest features in each frame are selected for processing. However, the performance evaluation does not consider multithreaded software execution.


Table 7: Resource usage of the proposed moving target detection SoC.

Logic units                     Utilization   (%)
Total combinational functions   15,161        13
Total registers                 10,803        9
Total memory bits               521,054       13
Embedded multipliers            27            5
FPGA device: Altera Cyclone IV

Based on the experimental results, the speed performance of the proposed moving target detection SoC surpasses optimized software computation by 2.29 times and 53.57 times compared with implementations on a modern CPU and an embedded CPU, respectively (i.e., against the faster, feature-based software implementations: 30/13.11 ≈ 2.29 and 30/0.56 ≈ 53.57). The software computation (RANSAC) in the HW/SW codesign of the proposed system creates a speed bottleneck, thus limiting the maximum throughput to 30 fps. The processing frame rate of the proposed system can be further improved by using fully dedicated hardware.

5.4. Resource Utilization. The overall hardware resource utilization of the complete system is illustrated in Table 7. This prototype of a real-time moving object detection system utilizes less than 20 percent of the total resources in the Altera Cyclone IV FPGA device. As the proposed system uses off-chip memory components for frame buffering, FPGA on-chip memory is utilized only for line buffering in streaming processes (e.g., block matching and median filtering) and for storing intermediate results (e.g., point pairs after block matching). Thus, the low resource usage of the proposed system provides abundant hardware space for other processes, such as target tracking or classification, to be developed in the future.

6. Conclusions

Moving target detection is a crucial step in most computer vision problems, especially for UAV applications. On-chip detection without the need for real-time video transmission to the ground will provide immense benefit to diverse applications such as military surveillance and resource exploration. In order to perform this complex embedded video processing on-chip, an FPGA-based system is desirable due to the potential parallelism of the algorithm.

This paper proposed a moving target detection system using FPGA to enable an autonomous UAV which is able to perform the computer vision algorithm on the flying platform. The proposed system is prototyped using an Altera Cyclone IV FPGA device on a Terasic DE2-115 development board mounted with a TRDB-D5M camera. The system is developed as a HW/SW codesign using dedicated hardware with a Nios II software processor (booted with embedded Linux) running at 100 MHz clock rate. As stream-oriented hardware with pipeline processing is utilized, the proposed system achieves real-time capability with 30 frames per second processing speed on 640 × 480 live video. Experimental results show that the proposed SoC performs 2.29 times and 53.57 times faster than optimized software computation on a modern desktop computer (Intel Core i5) and an embedded processor (ARM), respectively. In addition, the proposed moving target detection uses less than 20 percent of the total resources in the FPGA device, allowing other hardware accelerators to be implemented in the future.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

The authors would like to express their gratitude to Universiti Teknologi Malaysia (UTM) and the Ministry of Science, Technology and Innovation (MOSTI), Malaysia, for supporting this research work under research Grants 01-01-06-SF1197 and 01-01-06-SF1229.

References

[1] A. Ahmed, M. Nagai, C. Tianen, and R. Shibasaki, "UAV based monitoring system and object detection technique development for a disaster area," International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. 37, pp. 373–377, 2008.

[2] B. Coifman, M. McCord, R. Mishalani, M. Iswalt, and Y. Ji, "Roadway traffic monitoring from an unmanned aerial vehicle," IEE Proceedings - Intelligent Transport Systems, vol. 153, no. 1, pp. 11–20, 2006.

[3] K. Kanistras, G. Martins, M. J. Rutherford, and K. P. Valavanis, "Survey of unmanned aerial vehicles (UAVs) for traffic monitoring," in Handbook of Unmanned Aerial Vehicles, pp. 2643–2666, Springer, 2015.

[4] K. Nordberg, P. Doherty, G. Farneback et al., "Vision for a UAV helicopter," in Proceedings of the International Conference on Intelligent Robots and Systems (IROS '02), Workshop on Aerial Robotics, pp. 29–34, Lausanne, Switzerland, October 2002.

[5] D. Zamalieva and A. Yilmaz, "Background subtraction for the moving camera: a geometric approach," Computer Vision and Image Understanding, vol. 127, pp. 73–85, 2014.

[6] M. Genovese and E. Napoli, "ASIC and FPGA implementation of the Gaussian mixture model algorithm for real-time segmentation of high definition video," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 22, no. 3, pp. 537–547, 2014.

[7] F. Kristensen, H. Hedberg, H. Jiang, P. Nilsson, and V. Owall, "An embedded real-time surveillance system: implementation and evaluation," Journal of Signal Processing Systems, vol. 52, no. 1, pp. 75–94, 2008.

[8] H. Jiang, H. Ardo, and V. Owall, "A hardware architecture for real-time video segmentation utilizing memory reduction techniques," IEEE Transactions on Circuits and Systems for Video Technology, vol. 19, no. 2, pp. 226–236, 2009.

[9] M. Genovese and E. Napoli, "FPGA-based architecture for real time segmentation and denoising of HD video," Journal of Real-Time Image Processing, vol. 8, no. 4, pp. 389–401, 2013.

[10] A. Lopez-Bravo, J. Diaz-Carmona, A. Ramirez-Agundis, A. Padilla-Medina, and J. Prado-Olivarez, "FPGA-based video system for real time moving object detection," in Proceedings of the 23rd International Conference on Electronics, Communications and Computing (CONIELECOMP '13), pp. 92–97, IEEE, Cholula, Mexico, March 2013.

[11] T. Kryjak, M. Komorkiewicz, and M. Gorgon, "Real-time moving object detection for video surveillance system in FPGA," in Proceedings of the Conference on Design and Architectures for Signal and Image Processing (DASIP '11), pp. 1–8, IEEE, Tampere, Finland, November 2011.

[12] A. Mittal and D. Huttenlocher, "Scene modeling for wide area surveillance and image synthesis," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 160–167, IEEE, June 2000.

[13] A. Price, J. Pyke, D. Ashiri, and T. Cornall, "Real time object detection for an unmanned aerial vehicle using an FPGA based vision system," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '06), pp. 2854–2859, IEEE, Orlando, Fla, USA, May 2006.

[14] G. J. Garcia, C. A. Jara, J. Pomares, A. Alabdo, L. M. Poggi, and F. Torres, "A survey on FPGA-based sensor systems: towards intelligent and reconfigurable low-power sensors for computer vision, control and signal processing," Sensors, vol. 14, no. 4, pp. 6247–6278, 2014.

[15] S. Ali and M. Shah, "Cocoa: tracking in aerial imagery," in Airborne Intelligence, Surveillance, Reconnaissance (ISR) Systems and Applications III, vol. 6209 of Proceedings of SPIE, Orlando, Fla, USA, April 2006.

[16] J. Xiao, C. Yang, F. Han, and H. Cheng, "Vehicle and person tracking in aerial videos," in Multimodal Technologies for Perception of Humans, pp. 203–214, Springer, 2008.

[17] W. Yu, X. Yu, P. Zhang, and J. Zhou, "A new framework of moving target detection and tracking for UAV video application," in Proceedings of the International Archives of the Photogrammetry, Remote Sensing and Spatial Information Science, vol. 37, Beijing, China, 2008.

[18] V. Reilly, H. Idrees, and M. Shah, "Detection and tracking of large number of targets in wide area surveillance," in Computer Vision—ECCV 2010: 11th European Conference on Computer Vision, Heraklion, Crete, Greece, September 5–11, 2010, Proceedings, Part III, vol. 6313 of Lecture Notes in Computer Science, pp. 186–199, Springer, Berlin, Germany, 2010.

[19] J. Wang, Y. Zhang, J. Lu, and W. Xu, "A framework for moving target detection, recognition and tracking in UAV videos," in Affective Computing and Intelligent Interaction, vol. 137 of Advances in Intelligent and Soft Computing, pp. 69–76, Springer, Berlin, Germany, 2012.

[20] S. A. Cheraghi and U. U. Sheikh, "Moving object detection using image registration for a moving camera platform," in Proceedings of the IEEE International Conference on Control System, Computing and Engineering (ICCSCE '12), pp. 355–359, IEEE, Penang, Malaysia, November 2012.

[21] Y. Zhang, X. Tong, T. Yang, and W. Ma, "Multi-model estimation based moving object detection for aerial video," Sensors, vol. 15, no. 4, pp. 8214–8231, 2015.

[22] Q. Yu and G. Medioni, "A GPU-based implementation of motion detection from a moving platform," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPR '08), pp. 1–6, Anchorage, Alaska, USA, June 2008.

[23] A. Laika, J. Paul, C. Claus, W. Stechele, A. E. S. Auf, and E. Maehle, "FPGA-based real-time moving object detection for walking robots," in Proceedings of the 8th IEEE International Workshop on Safety, Security and Rescue Robotics (SSRR '10), pp. 1–8, IEEE, Bremen, Germany, July 2010.

[24] B. Tippetts, S. Fowers, K. Lillywhite, D.-J. Lee, and J. Archibald, "FPGA implementation of a feature detection and tracking algorithm for real-time applications," in Advances in Visual Computing, pp. 682–691, Springer, 2007.

[25] K. May and N. Krouglicof, "Moving target detection for sense and avoid using regional phase correlation," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '13), pp. 4767–4772, IEEE, Karlsruhe, Germany, May 2013.

[26] M. E. Angelopoulou and C.-S. Bouganis, "Vision-based ego-motion estimation on FPGA for unmanned aerial vehicle navigation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 24, no. 6, pp. 1070–1083, 2014.

[27] I. M. El-Emary and M. M. A. El-Kareem, "On the application of genetic algorithms in finger prints registration," World Applied Sciences Journal, vol. 5, no. 3, pp. 276–281, 2008.

[28] A. A. Goshtasby, 2-D and 3-D Image Registration for Medical, Remote Sensing, and Industrial Applications, John Wiley & Sons, New York, NY, USA, 2005.

[29] C. Harris and M. Stephens, "A combined corner and edge detector," in Proceedings of the 4th Alvey Vision Conference, vol. 15, pp. 147–151, 1988.

[30] E. Rosten and T. Drummond, "Machine learning for high-speed corner detection," in Computer Vision—ECCV 2006, pp. 430–443, Springer, 2006.

[31] H. Bay, T. Tuytelaars, and L. Van Gool, "SURF: speeded up robust features," in Computer Vision—ECCV 2006, A. Leonardis, H. Bischof, and A. Pinz, Eds., vol. 3951 of Lecture Notes in Computer Science, pp. 404–417, Springer, 2006.

[32] G. R. Rodriguez-Canosa, S. Thomas, J. del Cerro, A. Barrientos, and B. MacDonald, "A real-time method to detect and track moving objects (DATMO) from unmanned aerial vehicles (UAVs) using a single camera," Remote Sensing, vol. 4, no. 4, pp. 1090–1111, 2012.

[33] B. Liu and A. Zaccarin, "New fast algorithms for the estimation of block motion vectors," IEEE Transactions on Circuits and Systems for Video Technology, vol. 3, no. 2, pp. 148–157, 1993.

[34] R. Li, B. Zeng, and M. L. Liou, "New three-step search algorithm for block motion estimation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 4, no. 4, pp. 438–442, 1994.

[35] L.-M. Po and W.-C. Ma, "A novel four-step search algorithm for fast block motion estimation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 6, no. 3, pp. 313–317, 1996.

[36] S. Zhu and K.-K. Ma, "A new diamond search algorithm for fast block-matching motion estimation," IEEE Transactions on Image Processing, vol. 9, no. 2, pp. 287–290, 2000.

[37] L. De Vos and M. Stegherr, "Parameterizable VLSI architectures for the full-search block-matching algorithm," IEEE Transactions on Circuits and Systems, vol. 36, no. 10, pp. 1309–1316, 1989.

[38] M. A. Fischler and R. C. Bolles, "Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography," Communications of the ACM, vol. 24, no. 6, pp. 381–395, 1981.

[39] J. W. Tang, N. Shaikh-Husin, and U. U. Sheikh, "FPGA implementation of RANSAC algorithm for real-time image geometry estimation," in Proceedings of the 11th IEEE Student Conference on Research and Development (SCOReD '13), pp. 290–294, IEEE, Putrajaya, Malaysia, December 2013.

[40] O. Chum and J. Matas, "Randomized RANSAC with T(d,d) test," in Proceedings of the British Machine Vision Conference, vol. 2, pp. 448–457, September 2002.

[41] DARPA, SDMS Public Web Site, 2003, https://www.sdms.afrl.af.mil.

[42] A. F. M. S. Saif, A. S. Prabuwono, and Z. R. Mahayuddin, "Motion analysis for moving object detection from UAV aerial images: a review," in Proceedings of the International Conference on Informatics, Electronics and Vision (ICIEV '14), pp. 1–6, IEEE, Dhaka, Bangladesh, May 2014.


4 International Journal of Reconfigurable Computing

hardware accelerators except RANSAC which is partiallyaccelerated in hardware

31 Block Matching Block matching involves two stepsextraction and matching where two consecutive framesare required Extraction process will store several blocksor patches of image pixels from one frame as templatewhile matching process will find their most similar blocksin the second frame By considering the center points ofblocks as reference this algorithm will yield numerous pairsof corresponding points which indicate the point-to-pointmotion (movement of the pixels) between two consecutiveframesThe paired points from these two frames will be usedin RANSAC to estimate the ego-motionBlock extraction is the process of storing numerous

blocks of 9 times 9 pixels from a predefined location from a videoframeThese blocks will be used as templates in the matchingprocess The positions of the template blocks are distributedevenly over the imageThere is nomathematical computationin the extraction process as it involves only direct copying ofimage patches from video stream into temporary memoryMatching process plays the role of finding the most sim-

ilar blocks from current frame for every extracted templateblock from the previous frameThis is done by correlating thetemplate blocks with next frame to find their correspondingposition based on similarity measure Due to simplicityof hardware implementation Sum of Absolute Difference(SAD) is chosen as the matching criterion for the correlationprocess SAD will generate a similarity error rating of pixel-to-pixel correlation between each template block (from pre-vious frame) and matching block (from current frame) SADwill yield zero result if both blocks are pixel-by-pixel identicalBlockmatching is computation intensive as each template

block has to search for its most similar pair by performingSAD with each block within its search region Several searchtechniques had been proposed in the literatures to reducethe computation by minimizing the search region suchas Three-Step Search Technique [33 34] Four-Step SearchTechnique [35] and Diamond Search [36] However most ofthese techniques are targeted for general purpose processorwhich reads image in irregular way and are not suitable forstreaming hardware architecture This work uses traditionalfull search technique [37] as it is efficient to be performedin stream-oriented hardware due to its regular accessing ofimageThe number of required matching computations is pro-

portional to the number of blocks (density) and their corre-sponding search areas Higher density of blockmatching pro-vides more points for ego-motion estimation to reduce imageregistration error but with higher hardware cost require-ment (number of hardware computation units) To reducehardware cost this work employs only a low density block(area-based) matching and does not estimate frame-to-framemotion of every pixelTo further optimize hardware resources in stream-

oriented architecture best-fit and nonoverlapping searchareas are utilized to ensure only one SAD computation isperformed for each incoming pixel For a number of rowblocks 119898 and a number of column blocks 119899 search areas

are evenly distributed for each block with 119904119898times 119904119899pixels

formulated in

119904119898= lfloor

119882

119898

rfloor

119904119899= lfloor

119867

119899

rfloor

(1)

where 119882 and 119867 represent image width and image heightrespectivelyThe template block positions (blue) and their correspond-

ing search areas (green) are illustrated in Figure 2 In eachclock cycle only one template block is matched with oneblock from its corresponding search area As each templateblock will only search in its dedicated search area withoutintruding other regions the whole block matching processshares only one SAD computation unit for processing thewhole image allowing119898 and 119899 to be context-switched in run-timeThe proposed approach is able to perform different

densities of area-based registration using the same hardwarecost However higher density reduces the search areas of eachblock thus limiting the flow displacement (travel distance ofeach point) The displacement limitations in horizontal 119889

119898

and vertical 119889119899are given as 119889

119898= plusmn1198822119898 and 119889

119898= plusmn1198672119899

respectively As the position and movement of UAV (heightvelocity etc) as well as frame rate of captured aerial videoaffect the point-to-point displacement between two succes-sive frames the proposed technique will produce wrongimage registration result if the point-to-point displacementbetween frames exceeds 119889

119898in horizontal orand 119889

119899in

vertical

32 RANSAC After the block matching stage a set of pointpairs (point-to-point motion) from two successive frames areidentified Based on these point pairs ego-motion estimationcan be performed As outliers (inconsistent motions) usuallyappear in these point pairs RANSAC algorithm is appliedto remove outliers from the data RANSAC is an iterativealgorithm to find the affine model that best describes thetransformation of the two subsequent frames Unlike the con-ventional RANSAC [38] this work uses an upper bound timeto terminate RANSAC computation (similar to [39]) regard-less of the number of iterations due to the real-time constraintas illustrated in Algorithm 1At each iteration RANSAC algorithm chooses three

distinct point pairs randomly as samples Hypothesis modelof affine transformation is then generated from the selectedsamples based on

[[[

[

1199091015840

11199091015840

21199091015840

3

1199101015840

11199101015840

21199101015840

3

1 1 1

]]]

]

=[[

[

ℎ0ℎ1ℎ2

ℎ3ℎ4ℎ5

0 0 1

]]

]

[[

[

119909111990921199093

119910111991021199103

1 1 1

]]

]

(2)

where ℎ119894denote the parameters of the affine model to be

estimated 119909119894and 119910

119894are the coordinates of chosen sample

points and 1199091015840119894and 1199101015840

119894represent their corresponding point

pairs

International Journal of Reconfigurable Computing 5

(a) 6 times 4 blocks (b) 8 times 6 blocks

Figure 2 Positions of template blocks (blue) and search areas (green) in video frame for different densities (119898 times 119899) of block matching withsame hardware cost

while time taken lt upper bound time do(1) Randomly select 3 distinct point pairs as samples(2) Generate hypothesis model (affine parameters) basedon the chosen samples(3) Apply 119879

119889119889test on the hypothesis model

(4) Calculate the fitness score of the model(5) Update and store best scored parameters

end while

Algorithm 1 RANSAC algorithm

119879119889119889test proposed in [40] is applied in the algorithm to

speed up RANSAC computation by skipping the followingsteps (step (4) and (5)) if the hypothesis model is far fromthe truth Fitness of the hypothesis is then evaluated andscored by fitting its parameters to all point pairs The besthypothesis model is constantly updated in each iteration andemerges as the final result when the RANSAC is terminatedupon reaching an upper bound time As RANSAC has theleast computation among overall moving target detectionalgorithms it is implemented as software program with onlythe fitness scoring step (step (4)) being hardware acceleratedFitness scoring is the calculation of the fitness for a hypothesismodel towards all input data (point pairs from block match-ing) as described in Algorithm 2Each data is considered as an inlier if its fitting error is

smaller than a predefined distance threshold thdist or viceversa Inlier fitness score is its fitting error while outlier scoreis fixed to thdist as a constant penalty The total fitness scoreis calculated by accumulating all individual scores for eachdata where a perfect fit will have zero fitness score As fitnessscoring is an iterative process for all data the number ofcomputations increases with size of data As RANSAC is astochastic algorithm it may not produce the best-fit affinemodel when given limited iteration

33 Object Segmentation After estimating ego-motion thecamera movement between two successive frames is tobe compensated prior to object foreground detection The

fitness score = 0for all data

119894do

asub119909 = abs(1199092119894minus (1199091119894sdot 1198670+ 1199101119894sdot 1198671+ 1198672))

asub119910 = abs(1199102119894minus (1199091119894sdot 1198673+ 1199101119894sdot 1198674+ 1198675))

score = min((asub1199092 + asub1199102) th2dist)fitnessscore = fitnessscore + score

end forWhereEach data

119894contains a point pair (119909

1119894 1199092119894 1199101119894 and 119910

2119894)

1198670 1198671 1198672 1198673 1198674 1198675are affine parameters of hypothesis

modelth2dist is the predefined distance threshold

Algorithm 2 Fitness scoring in RANSAC algorithm

previous frame is transformed and mosaic with currentframe using the estimated affine parameters from RANSACalgorithm Reverse mapping technique is applied by calcu-lating the corresponding location in the source image basedon the destination pixel location The equation of affinetransformation is shown in

[[[

[

1199091015840

119894

1199101015840

119894

1

]]]

]

=[[

[

ℎ0ℎ1ℎ2

ℎ3ℎ4ℎ5

0 0 1

]]

]

[[

[

119909119894

119910119894

1

]]

]

(3)

where 119909119894and 119910

119894are the pixel coordinates of destination

image 1199091015840119894and 1199101015840

119894denote the corresponding pixel coordinates

in source image and ℎ119894are best-fit affine parameters from

RANSACAs the transformation may produce fractional result

nearest neighbour interpolation is utilized due to its efficiencyin hardware design The ego-motion compensation is per-formed pixel-by-pixel in raster scan generating a stream ofthe transformed previous frame to the next processFrame differencing is executed on the current frame and

the transformed (ego-motion compensated) previous frameby pixel-to-pixel absolute subtraction of both frames Thepixels in the resultant image are threshold with constant

6 International Journal of Reconfigurable Computing

Table 2 Pipeline scheduling for processing subsequent frames

Processes Processing frame at frame period 119905119894

1199050

1199051

1199052

1199053

1199054

sdot sdot sdot 119905119894

Motion estimation(i) Block matching 119865

01198651larr 1198650

1198652larr 1198651

1198653larr 1198652

1198654larr 1198653

sdot sdot sdot 119865119894larr 119865119894minus1

(ii) RANSAC mdash mdash 1198651larr 1198650

1198652larr 1198651

1198653larr 1198652

sdot sdot sdot 119865119894minus1larr 119865119894minus2

Object segmentation(i) Affine transformation

mdash mdash mdash 1198651larr 1198650

1198652larr 1198651

sdot sdot sdot 119865119894minus2larr 119865119894minus3

(ii) Frame differencing(iii) Median filtering(iv) Morphological119865119894larr 119865119895is detection of moving object from 119895th frame to 119894th frame

value of thfd to produce binary image Lower value of thfdmay induce more false alarm in detection while higher valuecauses the miss detection Both subtraction and thresholdingprocesses can be done as soon as two pixels for the samecoordinate from these frames are obtained to yield one binarypixel for the next process Lastly 7 times 7 binary median filterand dilation processes are performed on the binary imageto remove noise and improve the detected region of movingtarget

34 Pipeline Scheduling In order to establish a real-timemoving target detection system for streaming video properpipeline scheduling is utilized to fully maximize the overallsystem throughputThe algorithm is split into several subpro-cesses with each hardware accelerator working on differentframes independently transferring the intermediate resultfrom one process to another until the end of the detectioncycle Hence the system will always produce output everytime after a fixed latency The overall process is divided intofour stages of pipeline as shown in Table 2Due to data dependencies of the streaming algorithm all

processesmust be done sequentially to produce one detectionresult Block matching requires two successive video framesfor computation The first frame is streamed in for blockextraction process and stored into frame buffer Blockmatch-ing is performed after the next frame is obtained with theextracted block of previous frame RANSAC can only beginits computation after block matching has finished processingon the entire frame Lastly two original frames (119865

119894minus2and

119865119894minus3) are read from frame buffer for object segmentation to

produce the final result Object segmentation computationcan be performed in stream without further frame bufferingThe overall pipeline processing of the streaming system hasfour framesrsquo latency Hence at least four frames (119865

119894minus3to 119865119894)

must be stored in frame buffer at all time for a completemoving target detection process

4 Proposed Moving Target Detection SoC

Themoving target detection SoC is developed andprototypedin Terasic DE2-115 board with Altera Cyclone IV FPGAdevice The system consists of hardwaresoftware codesignof the algorithm of where the hardware computation is

executed in dedicated accelerator coded in Verilog HardwareDescription Language (HDL) while software program isperformed using a soft-core Nios II processor with SDRAMas software memoryThe system architecture of the proposedmoving target detection SoC is illustrated in Figure 3Camera interface handles the image acquisition tasks to

provide the raw image for processing while VGA interfacemanages video displaying task Apart from being a softwarememory part of SDRAM is also reserved as video displaybuffer Thus Direct Memory Access (DMA) technique isapplied to read and write the displaying frame in SDRAM toensure the high throughput image transferAs multiple frames are required at the same time to

detect moving target frame buffer is required to temporarilystore the frames for processing Hence SRAM is utilizedas frame buffer due to its low latency access Since mostcomputations are performed in the dedicated hardware NiosII handles only RANSAC process (except fitness scoring stepas described in Section 32) and auxiliary firmware controlsUSB controller is included in the SoC to enable data transferwith USB mass storage device for verification and debuggingpurposes In addition embedded operating system (Nios2-linux) is booted in the system to provide file system anddrivers supportThe real-time video is streamed directly into the mov-

ing target detector for processing Both Nios II and hard-ware accelerator modules compute the result as a hard-waresoftware codesign system and transfer the output frameto SDRAM via DMA VGA interface constantly reads anddisplays the output frame in SDRAM All operations are ableto be performed in real-time attaining a 30 fps moving targetdetection system

41Moving Target DetectionHardware Accelerator Thehard-ware architecture of the moving target detector is shown inFigure 4 It is composed of motion estimation core objectsegmentation core frame grabber and other interfaces Theoverall moving target detection is performed according to thefollowing sequences

(1) Frame grabber receives the input video stream andstores four most recent frames (119865

119894minus3to 119865119894) into frame

buffer through its interface At the same time frame

International Journal of Reconfigurable Computing 7

Moving targetdetector

NIOS II CPU

SDRAMcontroller

Camerainterface

FPGA

Syste

m b

us

Embedded Linux

(i) Firmware control

(ii) Software processing

SDRAM

VGAinterface

VGAdisplay

SRAM

USBcontroller

USB mass storage device

Camera

Software processor Hardware accelerator Controllers and interfaces

Memory components IO components

Figure 3 System architecture of moving target detection

Framebuffer

interface

Motionestimation

core

Framegrabber

Objectsegmentation

core

Video streaminput

Businterface(slave)

Software processor

Frame buffer(SRAM)

Businterface(master)

Videoresult

DMA to SDRAM

Affineparameters

RANSACcomputation

Fiminus2

Fi

and Fiminus3

Figure 4 Hardware architecture of moving target detector

8 International Journal of Reconfigurable Computing

grabber also provides the current frame (119865119894) tomotion

estimation core(2) Motion estimation core performs blockmatching andRANSAC computation Since RANSAC is computedin both hardware and software software processor isconstantly accessing this core via system bus interfaceto calculate the affine parameters

(3) After RANSAC the affine parameters are transferredfrom software to object segmentation core Two pre-vious frames (119865

119894minus2and 119865

119894minus3) are read from the frame

buffer by object segmentation core for processing(4) Several processes involving affine transformationframe differencing median filter and dilation arethen performed on both frames resulting in thedetected moving target

(5) Lastly the bus interface (master) provides DMAaccess for object segmentation core to transfer theend result into SDRAMfor displaying and verificationpurposes

As the frame buffer (SRAM) is a single port 16-bitmemory frame grabber concatenates two neighbouring 8-bitgreyscale pixels to store in one memory location Since framegrabber and object segmentation core share the frame bufferto write and read frames respectively frame buffer interfaceprovides priority arbitration and gives frame grabber thehighest priority granting everywrite request However framebuffer may be busy for a couple of clock cycles due to readoperation of SRAM by other modules a small FIFO withdepth of 4 is utilized in frame grabber to temporarily bufferthe incoming image pixels

42 Motion Estimation Hardware Accelerator Motion esti-mation core consists of block matching and RANSAC hard-ware accelerators Since RANSAC requires the entire dataof point pairs provided by block matching to begin itscomputation additional buffers are needed to temporarilystore the corresponding point pairs for every two subsequentframes The hardware architecture for motion estimationprocess is shown in Figure 5To enable high throughput data (point pairs) sharing

for both block matching and RANSAC double bufferingtechnique is applied by using two buffers (Buffer 1 and Buffer2) as data storage For any instance one buffer is writtenby block matching while the other is used for computationby RANSAC Buffer controller swaps the roles of these twobuffers for each incoming new frame therefore ensuringboth processes to be pipelined by reading and writing oneach buffer subsequently Buffer swapping is initiated at eachcompletion of block matching modules while RANSAC isperformed during the time gap between each swap and isterminated before the next swap

421 Block Matching Hardware Accelerator Figure 7 showsthe architecture of the proposed block matching hardwareaccelerator performing template blocks extraction fromone frame and matching of these template blocks in theircorresponding search areas from next frame The overall

Buffer controllerBuffer 1

Block matchingVideo stream

input

RANSAC acceleratorTo softwareprocessor

Buffer 2

Figure 5 Hardware architecture of motion estimation core

process can be completed in stream to yield the point-to-point motion (point pairs) of two subsequent frames withoutbuffering an entire frameAs 9 times 9 block size is utilized in block matching a 9-

tap line buffer is designed in such a way that 9 times 9 pixels ofmoving window can be obtained in every clock cycleThese 9times 9 pixels are shared for both block extraction and matchingprocesses and are read one by one in pipeline from the linebuffer at each valid cycle resulting in a total of 81 cycles toobtain a complete windowThe block extractor keeps track of the coordinate of

current pixel in video stream as a reference for extractionprocess Template blocks from incoming frames are extractedand stored temporarily into block memory As each block isextracted line-by-line in raster scan blockmemory is dividedinto nine-rowmemories as illustrated in Figure 6(a)with eachof which being used to store one pixel row in template blocksWhen video stream reaches the block position each pixelrow is loaded into each rowmemory from the correspondingtap of the line buffer Block coordinates are also stored in aseparate FIFO to keep track of its positionSince only one SAD processor is used for matching 119898 times119899 blocks as mentioned in Section 31 the template blockhas to be swapped according to the corresponding searcharea during raster scan Hence row memory is constructedwith two FIFOs upper and lower FIFO as illustrated inFigure 6(b) to enable block swapping during matchingprocess Template blocks are stored into upper FIFO duringextraction process During matching process each line ofraster scan enters eight different search areas to match eightdifferent template blocks respectively Hence one row oftemplate blocks is cached in lower FIFO and is repeatedlyused until the end of their search areas (reaching next row ofsearch areas) Upon reaching each new row of search areastemplate blocks in lower FIFO are replaced with new row oftemplate blocks from upper FIFO At the last line of rasterscan the lower FIFO is flushed to prevent overflow

International Journal of Reconfigurable Computing 9

CV ControlVector(CV)

Tap 0 Tap 1

Row 1pixels

Template blockscoordinate

Template blockscoordinate

Template blocks pixels

Row 2pixels

Tap 8

Row 8pixels

CV

Rowmemory

Rowmemory

Coordinate memory (FIFO)

Row memory

middot middot middot

middot middot middot

middot middot middot

(a) Block memory consisting of nine-row memories

wr_upper

rd_upper

wr_lower

rd_lower

sel1

sel2

CV_in CV_out

Tap

Row pixels

UpperFIFO

LowerFIFO

1 0

0

Controlregisters

1 0

(b) Row memory contains an upper FIFO and lower FIFO

Figure 6 Block memory architecture for storing template blocks

Blockextractor

Bestscore

tracker

Line buffer

Block memory

SAD processor

Video stream input

Control vector

Matching score

Point pairs

Template blocks Blocks coordinate

9 times 9 window pixels

Figure 7 Stream-oriented hardware architecture of blockmatching

In order to efficiently extract and match all blocksdifferent Control Vector (CV) as illustrated in Table 3 is sentto perform different reading and writing operations in blockmemory based on the current position in raster scan Bothreads andwrites are independent of each other and are able tobe executed at the same time Pixels are processed one by onein 81 cycles to complete a window Both writing and reading

Table 3 Control Vector (CV) for different read andwrite operationsof block memory

Position of raster scan WriteupperReadupper

Writelower

Readlower sel1 sel2

Entering templateblock position 1 x x x x x

Entering first searcharea row x 1 1 0 1 1

Entering next searcharea row x 1 1 1 1 1

Reentering samesearch area row x 0 1 1 0 0

Leaving last searcharea row x 0 0 1 0 0

processes require 9 cycles for each row memory passing CVfrom the first row memory to the next row memory untilthe end to complete a 81-pixel write or read operation of atemplate blockSAD processor performs the correlation of the template

blocks from previous frame with all possible blocks fromcurrent frame according to the search area Extracted blockpixels are read from block memory while window pixels insearch areas are provided from the taps of the line bufferThetotal number of required PEs is the total number of pixelsin a window The process is pipelined such that each pixelis computed in each PE as soon as it is obtained from theline buffer Matching score of each window can be obtainedin every cycle after a fixed latencyLastly the best score tracker constantly stores and updates

the best matching score for each template block within itscorresponding search area The matching score is compared

10 International Journal of Reconfigurable Computing

x2 H2y1 H1

H0y2 H5 x1x1 H3

y1 H4

minus times times minus

minus minus

times times

+

+

+

abs abs

sqr sqr

min

acc

Pipelineregister

Pipelineregister

Pipelineregister

Fitness score

th2dist

Figure 8 Hardware datapath of fitness scoring in RANSAC accel-erator

among the same search area and the coordinates of the best-scored blocks are preserved At the end of each search areathe coordinates of the best pairs (template blocks and theirbest-scored blocks) are sent to RANSAC module for nextprocessing Hence the proposed block matching hardware isable to produce point-to-point motion (point pairs) of everytwo successive frames in streaming video at line rate

422 RANSAC Hardware Accelerator RANSAC hardwaredesign in [39] is utilized in this work which acceleratesonly fitness scoring step As described in Algorithm 2fitness scoring is an iterative process which performs similarcomputation to all data samples based on hypothesis modelHence this data intensive process is executed in pipelineddatapath as illustrated in Figure 8 A control unit is utilizedto read input data provided by block matching from bufferand stream these inputs to the datapath unit at every clockcycleThe datapath unit utilizes three stages of pipeline with

the aim of isolating multiplication processes thus allowingfaster clock rate The first stage pipeline registers are locatedright after the first multiplication while the other two stagesof pipeline registers enclose the squaring processes Theindividual score is accumulated in the last stage producingtotal final fitness score The accumulator is reset on each newset of hypothesis Thus the total number of cycles required

Table 4 Fixed point precision of fitness scoring inputs

Parameter Number of bits Number rangeInteger Fraction

1199091 1199101 1199092 1199102

11 0 [minus1024 1024)1198670119867111986731198674

4 12 [minus8 8)11986721198675

11 5 [minus1024 1024)

Detected movingtarget

Affineparameters

fromsoftware

AffinePE

Framereader

Address

Framedifferencing PE

Binary image stream

MedianPE

Line buffer

DilationPE

Line buffer

from framebuffer

Fiminus2 and Fiminus3

Fiminus2F998400iminus3

Figure 9 Hardware architecture for object segmentation

for fitness score computation is the number of overall dataplus the four-cycle latencyAlthough fitness scoring could require floating point

computations the datapath unit uses suitable fixed pointprecision for each stage SinceNios II is a 32-bit processor theaffineparameters in hypothesismodel (119867

0to1198676) are properly

scaled to different precision of 16-bit fixed points as describedin Table 4 so that two affine parameters can be assigned in asingle 32-bit write instruction As this system is targeted for640 times 480 pixelsrsquo video all input coordinates (119909

1 1199101 1199092 and

1199102) are scaled to 11 bits

43 Object Segmentation Hardware Architecture As objectsegmentation can be performed in one raster scan a stream-oriented architecture is proposed as illustrated in Figure 9 Allsubprocesses are executed in pipeline on the streaming videowithout additional frame buffering Object segmentationprocess is initiated by software processor after providing theaffine parameters from RANSAC to affine PE Two frames(119865119894minus2and 119865

119894minus3as described in Table 2) from frame buffer

(SRAM) are required to segment the moving targetBased on the affine parameters from RANSAC affine PE

uses reverse mapping technique to find each pixel location inprevious frame (119865

119894minus3) using (3) and generates their addresses

in frame buffer (SRAM) Frame readers fetch the previ-ous frame (119865

119894minus3) pixel-by-pixel according to the generated

addresses from frame buffer thus constructing a stream oftransformed frame which is denoted as 1198651015840

119894minus3

International Journal of Reconfigurable Computing 11

Leftmostcolumn pixels

Rightmostcolumn pixels

Median output stream

Addertree

Addertree

24

pixelsTo line buffer

Binary image stream

7 times 7 window

minus

+

gtgt

Figure 10 Hardware architecture of median PE

By synchronizing the streams of both frames framedifferencing can be executed in pipeline as soon as one pixelfrom each frame is obtained Hence one pixel in currentframe (119865

119894minus2) and one pixel in transformed frame (1198651015840

119894minus3) are

fetched alternatively from their corresponding memory loca-tions by frame reader constructing two synchronized streamsof 119865119894minus2and 1198651015840

119894minus3frames Frame differencing PE performs

pixel-to-pixel absolute subtraction and thresholding on thestreams The frame differencing PE is able to compute inone cycle per pixel A configurable threshold value thfd isused after the subtraction yielding a stream of binary imagewithout buffering the whole frameAfter frame differencing the binary image is streamed

into 7 times 7 median filtering Seven lines of the image arebuffered in the line buffer providing 7 times 7 pixels window forthe median PE to perform the median computation Mediancomputation can be performed in one clock cycle for eachprocessing window due to short propagation delay as onlybinary pixels are involved Figure 10 shows the hardware logicdesign of median PEMedian filtering can be computed by counting the num-

ber of asserted (binary 1) pixels in the window If morethan half the pixels in the window (24 pixels) are assertedthe resultant pixel is ldquo1rdquo or ldquo0rdquo otherwise Since processingwindow will move only one pixel to the right for each com-putation during raster scan current pixel count is computedby adding the previous pixel count and rightmost columnpixels in the current window while subtracting the leftmost

column pixels in the previous window Final binary outputpixel is produced by thresholding the current pixel count with24 (half of window size)As dilation is also a 7times 7window-based processing it uses

similar line buffering technique asmedian filtering Howeveronly simple logical OR operation is performed on all windowpixels Due to its simplicity dilation PE can also be computedin one clock cycle resulting in the stream of binary imagewith detected region of moving targets

5 Experimental Results

51 Verification of Proposed SoC Theproposedmoving targetdetection SoC is verified in offline detection mode using thedatabase in [41] Test videos are 640 times 480 pixels in sizeand are greyscaled prior to the verification process The testvideos are transferred to the system for computation via aUSB mass storage device After performing the detection inSoC the image results are displayed on VGA and also storedon USB drive for verification Figure 11 shows the movingtarget detection result from the proposed SoC using differentsample videos The detected regions (red) are overlaid on theinput frame In most cases the proposed SoC is able to detectthe moving target in consecutive framesHowever there are several limitations in this work Block

matching may not give a goodmotion estimation result if theextracted blocks do not have texture (the pixels intensity aresimilar) Moreover the detected region of moving target mayappear in cavity or multiple split of smaller regions as onlysimple frame differencing is applied in the proposed systemAdditional postprocessing to produce better detected blob bymerging split regions is out of the scope in this workAs the stochastic RANSAC algorithm is terminated after

a constant time step for each frame image registration errormay occur which produces incorrect ego-motion estimationThis could be mitigated by accelerating RANSAC algorithmto ensure more iterations using dedicated hardware or highperformance general purpose processor

52 Performance Evaluation of Detection Algorithm Theper-formance evaluation of the implemented detection algorithmuses the Mathematical Performance Metric in [42] thatinvolves several parameters as follows

(i) True positive TP the detected moving object(ii) False positive FP detected regions that do not corre-spond to any moving object

(iii) False negative FN the nondetected moving object(iv) Detection rate DR the ratio of TP with the combina-

tion of TP and FN as formulated in

DR = TPTP + FN

(4)

(v) False alarm rate FAR the ratio between FP in allpositive detection as defined in

FAR = FPTP + FP

(5)

12 International Journal of Reconfigurable Computing

(a) Frame 255 (b) Frame 275 (c) Frame 295 (d) Frame 315

(e) Frame 1000 (f) Frame 1020 (g) Frame 1040 (h) Frame 1060

(i) Frame 600 (j) Frame 620 (k) Frame 640 (l) Frame 660

Figure 11 Detected regions from the proposed moving target detection SoC on different sample videos in [41] Video numbers (a)ndash(d)V3V100003 004 video numbers (e)ndash(h) V3V100004 003 and video numbers (i)ndash(l) V4V100007 017

FP

FNTP

Figure 12 Evaluation of performancemetrics TP FP and FN basedon ground truth boxes (blue) and the detected region (red)

To obtain the performance metrics ground truth regionsare manually labelled in several frames of test videos Abounding box is drawn across each moving object to indicatethe ground truth region of every frame as depicted in Fig-ure 12 A simple postprocessing is performed on the detectedregion by filtering out the detected region smaller than 15pixelsrsquo width or 15 pixelsrsquo height prior to the evaluationA detected moving object (TP) has detected regions in itsbounded ground truth area while a nondetected movingobject (FN) has no detected region overlapping with itsground truth area Detected region that does not overlappwith any ground truth region is considered as false positive(FP)The detection performance is evaluated on different

parameters configuration The DR and FAR for 1000 testframes using different number of blocks (density in ego-motion estimation) 119898 times 119899 in area-based registration and

frame differencing threshold thfd are depicted in Table 5 andFigure 13The experiment results show that DR is almost similar

for different density of ego-motion estimation but decreaseswith thfd Although higher density in the proposed work haslower displacement limitation 119889

119898and 119889

119899as discussed in

Section 31 most of the point-to-point displacements do notexceed the limitation due to slowUAVmovement in themostframes of the test dataset On the contrary higher value of thfdmay filter out the moving object if the differences in intensityof the object pixels and background pixels are almost similarFAR decreases with density in ego-motion estimation

due to the higher quality in image registration process butincreases if most frames exceed the displacement limitation119889119898and 119889

119899 However false registration due to displacement

limitation results in a huge blob of foreground but does notgreatly increase FAR Although higher values of thfd decreasethe false detection rate they also produce smaller foregroundarea for all detected moving objects as pixels almost similarintensity with background will be thresholded

53 Speed Comparison with Full Software ImplementationThe computation speed of the proposed moving target detec-tion SoC is compared with software computation in differentplatforms including modern CPU (Intel Core i5) in desktopcomputer and embedded processor (ARM) Table 6 illustratesthe comparison of computation frame rate and hardware

International Journal of Reconfigurable Computing 13

1 2 3 4 5 6 7 8 9 100944

0946

0948

095

0952

0954

0956

0958

096

0962

0964

Det

ectio

n ra

teD

R

Density of ego-motion m times n

thfd = 15

thfd = 20

thfd = 25

(a) DR

1 2 3 4 5 6 7 8 9 100

01

02

03

04

05

06

07

False

alar

m ra

te F

AR

Density of ego-motion m times n

thfd = 15

thfd = 20

thfd = 25

(b) FAR

Figure 13 DR and FAR for different density in ego-motion estimation119898 times 119899 and frame differencing threshold thfd

Table 5: Performance evaluation in terms of DR and FAR for 1000 frames using different density in ego-motion estimation m × n and frame differencing threshold thfd.

m × n   thfd   DR      FAR
12      15     0.958   0.643
12      20     0.954   0.331
12      25     0.949   0.194
24      15     0.957   0.568
24      20     0.950   0.324
24      25     0.945   0.101
35      15     0.958   0.548
35      20     0.952   0.215
35      25     0.947   0.090
48      15     0.959   0.539
48      20     0.952   0.253
48      25     0.944   0.079
70      15     0.958   0.509
70      20     0.951   0.188
70      25     0.946   0.075
88      15     0.960   0.489
88      20     0.951   0.219
88      25     0.947   0.074
108     15     0.958   0.483
108     20     0.952   0.168
108     25     0.946   0.058
140     15     0.958   0.499
140     20     0.951   0.187
140     25     0.946   0.059
165     15     0.958   0.474
165     20     0.953   0.214
165     25     0.947   0.068
192     15     0.959   0.478
192     20     0.952   0.169
192     25     0.946   0.092

Table 6: Computation speed comparison of the proposed system with different software implementations using area-based and feature-based registrations.

Platform              Frequency   Registration technique   Frame rate (fps)   Hardware speed-up
Proposed SoC          100 MHz     Area-based               30                 1
Intel Core i5-4210U   1.70 GHz    Area-based               4.26               7.04
                                  Feature-based            13.11              2.29
ARM1176JZF            700 MHz     Area-based               0.20               150
                                  Feature-based            0.56               53.57

As feature-based image registration is faster than area-based registration in a software implementation, the speed performance of a feature-based method is also included for comparison. In the feature-based implementation, features are first detected in each frame. The detected features from the current frame are cross-correlated with features from the previous frame, while the RANSAC algorithm is used to estimate the ego-motion between frames. After compensating the ego-motion, segmentation of the moving object uses the same processes as the proposed system. To further optimize the software implementation for speed, a fast feature detection algorithm [30] is utilized. As the number of features affects the computation time of the feature matching step, only the 100 strongest features in each frame are selected for processing. The performance evaluation, however, does not consider multithreaded software execution.
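For illustration, a condensed sketch of such a feature-based software baseline is shown below, assuming Python with OpenCV and NumPy. Pyramidal Lucas-Kanade tracking stands in here for the cross-correlation feature matching described above, and apart from the FAST detector [30] and RANSAC [38], all function and parameter choices are assumptions of this sketch rather than the benchmarked code.

import cv2
import numpy as np

def register_feature_based(prev_gray, curr_gray, max_features=100):
    # FAST corner detection [30]; keep only the strongest responses.
    fast = cv2.FastFeatureDetector_create()
    kps = sorted(fast.detect(prev_gray, None),
                 key=lambda k: -k.response)[:max_features]
    pts_prev = np.float32([k.pt for k in kps]).reshape(-1, 1, 2)
    # Track the corners into the current frame (a stand-in for the
    # cross-correlation matching used in the benchmark).
    pts_curr, status, _err = cv2.calcOpticalFlowPyrLK(
        prev_gray, curr_gray, pts_prev, None)
    good = status.ravel() == 1
    # RANSAC [38] estimates the frame-to-frame affine ego-motion.
    M, _inliers = cv2.estimateAffine2D(pts_prev[good], pts_curr[good],
                                       method=cv2.RANSAC)
    return M  # 2x3 affine matrix

def segment_moving(prev_gray, curr_gray, M, th_fd=20):
    # Same segmentation chain as the proposed system: ego-motion
    # compensation, frame differencing, 7x7 median filter, 7x7 dilation.
    h, w = curr_gray.shape
    warped = cv2.warpAffine(prev_gray, M, (w, h))
    diff = cv2.absdiff(curr_gray, warped)
    _, binary = cv2.threshold(diff, th_fd, 255, cv2.THRESH_BINARY)
    binary = cv2.medianBlur(binary, 7)
    return cv2.dilate(binary, np.ones((7, 7), np.uint8))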


Table 7: Resource usage of the proposed moving target detection SoC.

Logic units                     Utilization   (%)
Total combinational functions   15,161        13
Total registers                 10,803        9
Total memory bits               521,054       13
Embedded multipliers            27            5
FPGA device: Altera Cyclone IV

Based on the experimental results, the speed performance of the proposed moving target detection SoC surpasses the optimized software computation by 2.29 times and 53.57 times compared with the implementations on a modern CPU and an embedded CPU, respectively. The software computation (RANSAC) in the HW/SW codesign of the proposed system creates a speed bottleneck, thus limiting the maximum throughput to 30 fps. The processing frame rate of the proposed system could be further improved by using fully dedicated hardware.
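The speed-up column in Table 6 is simply the ratio of the 30 fps SoC throughput to each measured software frame rate, as the short check below illustrates.

# Hardware speed-up = SoC frame rate / software frame rate (Table 6).
soc_fps = 30.0
software_fps = {
    ("Intel Core i5-4210U", "area-based"): 4.26,
    ("Intel Core i5-4210U", "feature-based"): 13.11,
    ("ARM1176JZF", "area-based"): 0.20,
    ("ARM1176JZF", "feature-based"): 0.56,
}
for (platform, technique), fps in software_fps.items():
    print(f"{platform:20s} {technique:13s} {soc_fps / fps:7.2f}x")
# Prints 7.04x, 2.29x, 150.00x, and 53.57x, respectively.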

5.4. Resource Utilization. The overall hardware resource utilization of the complete system is illustrated in Table 7. This prototype of a real-time moving object detection system utilizes less than 20 percent of the total resources in the Altera Cyclone IV FPGA device. As the proposed system uses off-chip memory components for frame buffering, FPGA on-chip memory is utilized only for line buffering in the streaming processes (e.g., block matching and median filtering) and for storing intermediate results (e.g., point pairs after block matching). Thus, the low resource usage of the proposed system provides abundant hardware space for other processes, such as target tracking or classification, to be developed in the future.

6. Conclusions

Moving target detection is a crucial step in many computer vision problems, especially for UAV applications. On-chip detection, without the need for real-time video transmission to the ground, provides immense benefit to diverse applications such as military surveillance and resource exploration. To perform this complex embedded video processing on-chip, an FPGA-based system is desirable due to the potential parallelism of the algorithm.

This paper proposed a moving target detection system using an FPGA to enable an autonomous UAV that performs the computer vision algorithm on the flying platform. The proposed system is prototyped using an Altera Cyclone IV FPGA device on a Terasic DE2-115 development board mounted with a TRDB-D5M camera. The system is developed as a HW/SW codesign using dedicated hardware with a Nios II software processor (booted with embedded Linux) running at a 100 MHz clock rate. As stream-oriented hardware with pipelined processing is utilized, the proposed system achieves real-time capability with a processing speed of 30 frames per second on 640 × 480 live video. Experimental results show that the proposed SoC performs 2.29 times and 53.57 times faster than optimized software computation on a modern desktop computer (Intel Core i5) and an embedded processor (ARM), respectively. In addition, the proposed moving target detection uses less than 20 percent of the total resources in the FPGA device, allowing other hardware accelerators to be implemented in the future.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

The authors would like to express their gratitude to Universiti Teknologi Malaysia (UTM) and the Ministry of Science, Technology and Innovation (MOSTI), Malaysia, for supporting this research work under Research Grants 01-01-06-SF1197 and 01-01-06-SF1229.

References

[1] A. Ahmed, M. Nagai, C. Tianen, and R. Shibasaki, "UAV based monitoring system and object detection technique development for a disaster area," International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. 37, pp. 373–377, 2008.

[2] B. Coifman, M. McCord, R. Mishalani, M. Iswalt, and Y. Ji, "Roadway traffic monitoring from an unmanned aerial vehicle," IEE Proceedings-Intelligent Transport Systems, vol. 153, no. 1, pp. 11–20, 2006.

[3] K. Kanistras, G. Martins, M. J. Rutherford, and K. P. Valavanis, "Survey of unmanned aerial vehicles (UAVs) for traffic monitoring," in Handbook of Unmanned Aerial Vehicles, pp. 2643–2666, Springer, 2015.

[4] K. Nordberg, P. Doherty, G. Farneback et al., "Vision for a UAV helicopter," in Proceedings of the International Conference on Intelligent Robots and Systems (IROS '02), Workshop on Aerial Robotics, pp. 29–34, Lausanne, Switzerland, October 2002.

[5] D. Zamalieva and A. Yilmaz, "Background subtraction for the moving camera: a geometric approach," Computer Vision and Image Understanding, vol. 127, pp. 73–85, 2014.

[6] M. Genovese and E. Napoli, "ASIC and FPGA implementation of the Gaussian mixture model algorithm for real-time segmentation of high definition video," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 22, no. 3, pp. 537–547, 2014.

[7] F. Kristensen, H. Hedberg, H. Jiang, P. Nilsson, and V. Owall, "An embedded real-time surveillance system: implementation and evaluation," Journal of Signal Processing Systems, vol. 52, no. 1, pp. 75–94, 2008.

[8] H. Jiang, H. Ardo, and V. Owall, "A hardware architecture for real-time video segmentation utilizing memory reduction techniques," IEEE Transactions on Circuits and Systems for Video Technology, vol. 19, no. 2, pp. 226–236, 2009.

[9] M. Genovese and E. Napoli, "FPGA-based architecture for real time segmentation and denoising of HD video," Journal of Real-Time Image Processing, vol. 8, no. 4, pp. 389–401, 2013.

[10] A. Lopez-Bravo, J. Diaz-Carmona, A. Ramirez-Agundis, A. Padilla-Medina, and J. Prado-Olivarez, "FPGA-based video system for real time moving object detection," in Proceedings of the 23rd International Conference on Electronics, Communications and Computing (CONIELECOMP '13), pp. 92–97, IEEE, Cholula, Mexico, March 2013.

[11] T. Kryjak, M. Komorkiewicz, and M. Gorgon, "Real-time moving object detection for video surveillance system in FPGA," in Proceedings of the Conference on Design and Architectures for Signal and Image Processing (DASIP '11), pp. 1–8, IEEE, Tampere, Finland, November 2011.

[12] A. Mittal and D. Huttenlocher, "Scene modeling for wide area surveillance and image synthesis," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 160–167, IEEE, June 2000.

[13] A. Price, J. Pyke, D. Ashiri, and T. Cornall, "Real time object detection for an unmanned aerial vehicle using an FPGA based vision system," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '06), pp. 2854–2859, IEEE, Orlando, Fla, USA, May 2006.

[14] G. J. Garcia, C. A. Jara, J. Pomares, A. Alabdo, L. M. Poggi, and F. Torres, "A survey on FPGA-based sensor systems: towards intelligent and reconfigurable low-power sensors for computer vision, control and signal processing," Sensors, vol. 14, no. 4, pp. 6247–6278, 2014.

[15] S. Ali and M. Shah, "Cocoa: tracking in aerial imagery," in Airborne Intelligence, Surveillance, Reconnaissance (ISR) Systems and Applications III, vol. 6209 of Proceedings of SPIE, Orlando, Fla, USA, April 2006.

[16] J. Xiao, C. Yang, F. Han, and H. Cheng, "Vehicle and person tracking in aerial videos," in Multimodal Technologies for Perception of Humans, pp. 203–214, Springer, 2008.

[17] W. Yu, X. Yu, P. Zhang, and J. Zhou, "A new framework of moving target detection and tracking for UAV video application," in Proceedings of the International Archives of the Photogrammetry, Remote Sensing and Spatial Information Science, vol. 37, Beijing, China, 2008.

[18] V. Reilly, H. Idrees, and M. Shah, "Detection and tracking of large number of targets in wide area surveillance," in Computer Vision – ECCV 2010: 11th European Conference on Computer Vision, Heraklion, Crete, Greece, September 5–11, 2010, Proceedings, Part III, vol. 6313 of Lecture Notes in Computer Science, pp. 186–199, Springer, Berlin, Germany, 2010.

[19] J. Wang, Y. Zhang, J. Lu, and W. Xu, "A framework for moving target detection, recognition and tracking in UAV videos," in Affective Computing and Intelligent Interaction, vol. 137 of Advances in Intelligent and Soft Computing, pp. 69–76, Springer, Berlin, Germany, 2012.

[20] S. A. Cheraghi and U. U. Sheikh, "Moving object detection using image registration for a moving camera platform," in Proceedings of the IEEE International Conference on Control System, Computing and Engineering (ICCSCE '12), pp. 355–359, IEEE, Penang, Malaysia, November 2012.

[21] Y. Zhang, X. Tong, T. Yang, and W. Ma, "Multi-model estimation based moving object detection for aerial video," Sensors, vol. 15, no. 4, pp. 8214–8231, 2015.

[22] Q. Yu and G. Medioni, "A GPU-based implementation of motion detection from a moving platform," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPR '08), pp. 1–6, Anchorage, Alaska, USA, June 2008.

[23] A. Laika, J. Paul, C. Claus, W. Stechele, A. E. S. Auf, and E. Maehle, "FPGA-based real-time moving object detection for walking robots," in Proceedings of the 8th IEEE International Workshop on Safety, Security and Rescue Robotics (SSRR '10), pp. 1–8, IEEE, Bremen, Germany, July 2010.

[24] B. Tippetts, S. Fowers, K. Lillywhite, D.-J. Lee, and J. Archibald, "FPGA implementation of a feature detection and tracking algorithm for real-time applications," in Advances in Visual Computing, pp. 682–691, Springer, 2007.

[25] K. May and N. Krouglicof, "Moving target detection for sense and avoid using regional phase correlation," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '13), pp. 4767–4772, IEEE, Karlsruhe, Germany, May 2013.

[26] M. E. Angelopoulou and C.-S. Bouganis, "Vision-based ego-motion estimation on FPGA for unmanned aerial vehicle navigation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 24, no. 6, pp. 1070–1083, 2014.

[27] I. M. El-Emary and M. M. A. El-Kareem, "On the application of genetic algorithms in finger prints registration," World Applied Sciences Journal, vol. 5, no. 3, pp. 276–281, 2008.

[28] A. A. Goshtasby, 2-D and 3-D Image Registration for Medical, Remote Sensing, and Industrial Applications, John Wiley & Sons, New York, NY, USA, 2005.

[29] C. Harris and M. Stephens, "A combined corner and edge detector," in Proceedings of the 4th Alvey Vision Conference, vol. 15, pp. 147–151, 1988.

[30] E. Rosten and T. Drummond, "Machine learning for high-speed corner detection," in Computer Vision – ECCV 2006, pp. 430–443, Springer, 2006.

[31] H. Bay, T. Tuytelaars, and L. Van Gool, "SURF: speeded up robust features," in Computer Vision – ECCV 2006, A. Leonardis, H. Bischof, and A. Pinz, Eds., vol. 3951 of Lecture Notes in Computer Science, pp. 404–417, Springer, 2006.

[32] G. R. Rodriguez-Canosa, S. Thomas, J. del Cerro, A. Barrientos, and B. MacDonald, "A real-time method to detect and track moving objects (DATMO) from unmanned aerial vehicles (UAVs) using a single camera," Remote Sensing, vol. 4, no. 4, pp. 1090–1111, 2012.

[33] B. Liu and A. Zaccarin, "New fast algorithms for the estimation of block motion vectors," IEEE Transactions on Circuits and Systems for Video Technology, vol. 3, no. 2, pp. 148–157, 1993.

[34] R. Li, B. Zeng, and M. L. Liou, "New three-step search algorithm for block motion estimation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 4, no. 4, pp. 438–442, 1994.

[35] L.-M. Po and W.-C. Ma, "A novel four-step search algorithm for fast block motion estimation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 6, no. 3, pp. 313–317, 1996.

[36] S. Zhu and K.-K. Ma, "A new diamond search algorithm for fast block-matching motion estimation," IEEE Transactions on Image Processing, vol. 9, no. 2, pp. 287–290, 2000.

[37] L. De Vos and M. Stegherr, "Parameterizable VLSI architectures for the full-search block-matching algorithm," IEEE Transactions on Circuits and Systems, vol. 36, no. 10, pp. 1309–1316, 1989.

[38] M. A. Fischler and R. C. Bolles, "Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography," Communications of the ACM, vol. 24, no. 6, pp. 381–395, 1981.

[39] J. W. Tang, N. Shaikh-Husin, and U. U. Sheikh, "FPGA implementation of RANSAC algorithm for real-time image geometry estimation," in Proceedings of the 11th IEEE Student Conference on Research and Development (SCOReD '13), pp. 290–294, IEEE, Putrajaya, Malaysia, December 2013.

[40] O. Chum and J. Matas, "Randomized RANSAC with T(d,d) test," in Proceedings of the British Machine Vision Conference, vol. 2, pp. 448–457, September 2002.

[41] DARPA, SDMS Public Web Site, 2003, https://www.sdms.afrl.af.mil.

[42] A. F. M. S. Saif, A. S. Prabuwono, and Z. R. Mahayuddin, "Motion analysis for moving object detection from UAV aerial images: a review," in Proceedings of the International Conference on Informatics, Electronics and Vision (ICIEV '14), pp. 1–6, IEEE, Dhaka, Bangladesh, May 2014.



[40] O Chum and J Matas ldquoRandomized ransac with T119889119889testrdquo in

Proceedings of the British Machine Vision Conference vol 2 pp448ndash457 September 2002

[41] DARPA SDMS PublicWeb Site 2003 httpswwwsdmsafrlafmil

[42] A F M S Saif A S Prabuwono and Z R MahayuddinldquoMotion analysis for moving object detection from UAV aerialimages a reviewrdquo in Proceedings of the International Conferenceon Informatics Electronics and Vision (ICIEV rsquo14) pp 1ndash6 IEEEDhaka Bangladesh May 2014

International Journal of

AerospaceEngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Active and Passive Electronic Components

Control Scienceand Engineering

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

RotatingMachinery

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation httpwwwhindawicom

Journal ofEngineeringVolume 2014

Submit your manuscripts athttpwwwhindawicom

VLSI Design

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Shock and Vibration

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawi Publishing Corporation httpwwwhindawicom

Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

SensorsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Navigation and Observation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

DistributedSensor Networks

International Journal of


Table 2: Pipeline scheduling for processing subsequent frames.

Processes                        Processing frame at frame period t_i
                                 t_0    t_1       t_2       t_3       t_4       ...   t_i
Motion estimation
  (i) Block matching             F_0    F_1←F_0   F_2←F_1   F_3←F_2   F_4←F_3   ...   F_i←F_(i-1)
  (ii) RANSAC                    --     --        F_1←F_0   F_2←F_1   F_3←F_2   ...   F_(i-1)←F_(i-2)
Object segmentation
  (i) Affine transformation      --     --        --        F_1←F_0   F_2←F_1   ...   F_(i-2)←F_(i-3)
  (ii) Frame differencing
  (iii) Median filtering
  (iv) Morphological dilation

F_i←F_j denotes detection of the moving object from the jth frame to the ith frame.

value of thfd to produce a binary image. A lower value of thfd may induce more false alarms in detection, while a higher value causes missed detections. Both subtraction and thresholding can be done as soon as two pixels of the same coordinate from these frames are obtained, yielding one binary pixel for the next process. Lastly, 7 × 7 binary median filtering and dilation are performed on the binary image to remove noise and improve the detected region of the moving target.

3.4. Pipeline Scheduling. In order to establish a real-time moving target detection system for streaming video, proper pipeline scheduling is utilized to fully maximize the overall system throughput. The algorithm is split into several subprocesses, with each hardware accelerator working on different frames independently, transferring the intermediate result from one process to another until the end of the detection cycle. Hence, the system produces an output every frame period after a fixed initial latency. The overall process is divided into four stages of pipeline, as shown in Table 2.

Due to data dependencies of the streaming algorithm, all processes must be done sequentially to produce one detection result. Block matching requires two successive video frames for computation. The first frame is streamed in for the block extraction process and stored into the frame buffer. Block matching is performed after the next frame is obtained, using the extracted blocks of the previous frame. RANSAC can only begin its computation after block matching has finished processing the entire frame. Lastly, two original frames (F_(i-2) and F_(i-3)) are read from the frame buffer for object segmentation to produce the final result. Object segmentation can be performed in stream without further frame buffering. The overall pipeline processing of the streaming system has a latency of four frames. Hence, at least four frames (F_(i-3) to F_i) must be stored in the frame buffer at all times for a complete moving target detection process.
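To make the schedule in Table 2 concrete, a minimal C model of the stage-to-frame assignment is sketched below; the function names are illustrative placeholders rather than the authors' API, and the real stages run concurrently in hardware rather than sequentially as in this loop.

```c
/* Behavioural model of the Table 2 schedule: at frame period t each
   pipeline stage works on a different frame pair, so one detection
   result is produced per period after the initial latency. */
extern void block_match(int cur, int prev); /* motion estimation, step 1 */
extern void ransac(int cur, int prev);      /* motion estimation, step 2 */
extern void segment(int cur, int prev);     /* affine transform, frame
                                               differencing, median,
                                               morphological dilation */

void pipeline_schedule(int num_frames)
{
    for (int t = 1; t < num_frames; ++t) {
        block_match(t, t - 1);              /* F_t <- F_(t-1)     */
        if (t >= 2) ransac(t - 1, t - 2);   /* F_(t-1) <- F_(t-2) */
        if (t >= 3) segment(t - 2, t - 3);  /* F_(t-2) <- F_(t-3) */
    }
}
```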

4. Proposed Moving Target Detection SoC

The moving target detection SoC is developed and prototyped on the Terasic DE2-115 board with an Altera Cyclone IV FPGA device. The system consists of a hardware/software codesign of the algorithm, where the hardware computation is executed in dedicated accelerators coded in Verilog Hardware Description Language (HDL), while the software program runs on a soft-core Nios II processor with SDRAM as software memory. The system architecture of the proposed moving target detection SoC is illustrated in Figure 3.

The camera interface handles the image acquisition tasks to provide the raw image for processing, while the VGA interface manages the video displaying task. Apart from serving as software memory, part of the SDRAM is also reserved as a video display buffer. Thus, the Direct Memory Access (DMA) technique is applied to read and write the displayed frame in SDRAM to ensure high throughput image transfer.

As multiple frames are required at the same time to detect moving targets, a frame buffer is needed to temporarily store the frames for processing. Hence, SRAM is utilized as the frame buffer due to its low latency access. Since most computations are performed in dedicated hardware, Nios II handles only the RANSAC process (except the fitness scoring step, as described in Section 3.2) and auxiliary firmware control. A USB controller is included in the SoC to enable data transfer with a USB mass storage device for verification and debugging purposes. In addition, an embedded operating system (Nios2-linux) is booted in the system to provide file system and driver support.

The real-time video is streamed directly into the moving target detector for processing. Both Nios II and the hardware accelerator modules compute the result as a hardware/software codesign system and transfer the output frame to SDRAM via DMA. The VGA interface constantly reads and displays the output frame in SDRAM. All operations are performed in real-time, attaining a 30 fps moving target detection system.

4.1. Moving Target Detection Hardware Accelerator. The hardware architecture of the moving target detector is shown in Figure 4. It is composed of a motion estimation core, an object segmentation core, a frame grabber, and other interfaces. The overall moving target detection is performed according to the following sequence:

(1) The frame grabber receives the input video stream and stores the four most recent frames (F_(i-3) to F_i) into the frame buffer through its interface. At the same time, the frame grabber also provides the current frame (F_i) to the motion estimation core.

[Figure 3: System architecture of moving target detection.]

[Figure 4: Hardware architecture of moving target detector.]

(2) The motion estimation core performs block matching and RANSAC computation. Since RANSAC is computed in both hardware and software, the software processor constantly accesses this core via the system bus interface to calculate the affine parameters.

(3) After RANSAC, the affine parameters are transferred from software to the object segmentation core. Two previous frames (F_(i-2) and F_(i-3)) are read from the frame buffer by the object segmentation core for processing.

(4) Several processes involving affine transformation, frame differencing, median filtering, and dilation are then performed on both frames, resulting in the detected moving target.

(5) Lastly, the bus interface (master) provides DMA access for the object segmentation core to transfer the end result into SDRAM for displaying and verification purposes.

As the frame buffer (SRAM) is a single-port 16-bit memory, the frame grabber concatenates two neighbouring 8-bit greyscale pixels to store in one memory location. Since the frame grabber and the object segmentation core share the frame buffer to write and read frames, respectively, the frame buffer interface provides priority arbitration and gives the frame grabber the highest priority, granting its every write request. However, as the frame buffer may be busy for a couple of clock cycles due to read operations of the SRAM by other modules, a small FIFO with a depth of 4 is utilized in the frame grabber to temporarily buffer the incoming image pixels.
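A minimal sketch of this pixel packing, assuming the even pixel occupies the low byte (the paper does not specify the byte order):

```c
#include <stdint.h>

/* Two neighbouring 8-bit greyscale pixels share one 16-bit SRAM word,
   halving the number of memory accesses per image line. */
static inline uint16_t pack_pixels(uint8_t p0, uint8_t p1)
{
    return (uint16_t)p0 | ((uint16_t)p1 << 8);
}

static inline void unpack_pixels(uint16_t word, uint8_t *p0, uint8_t *p1)
{
    *p0 = (uint8_t)(word & 0xFF);
    *p1 = (uint8_t)(word >> 8);
}
```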

4.2. Motion Estimation Hardware Accelerator. The motion estimation core consists of block matching and RANSAC hardware accelerators. Since RANSAC requires the entire set of point pairs provided by block matching to begin its computation, additional buffers are needed to temporarily store the corresponding point pairs for every two subsequent frames. The hardware architecture for the motion estimation process is shown in Figure 5.

To enable high throughput data (point pair) sharing between block matching and RANSAC, a double buffering technique is applied by using two buffers (Buffer 1 and Buffer 2) as data storage. At any instance, one buffer is written by block matching while the other is used for computation by RANSAC. The buffer controller swaps the roles of these two buffers for each incoming new frame, thereby ensuring both processes are pipelined by reading and writing on each buffer alternately. Buffer swapping is initiated at each completion of the block matching module, while RANSAC is performed during the time gap between each swap and is terminated before the next swap.
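The ping-pong arrangement can be summarised by the following sketch; PairBuffer, MAX_PAIRS, and the controller hook are illustrative, as the actual buffer depth is not stated here:

```c
#include <stdint.h>

#define MAX_PAIRS 192  /* assumed capacity: one entry per template block */

typedef struct { int16_t x1, y1, x2, y2; } PointPair;
typedef struct { PointPair data[MAX_PAIRS]; int count; } PairBuffer;

static PairBuffer buf[2];
static int wr = 0;  /* index of the buffer block matching writes to */

/* Called by the buffer controller at each completion of block matching:
   the freshly filled buffer becomes RANSAC's input, while the other
   buffer is recycled for the next frame's point pairs. */
void swap_buffers(void)
{
    wr ^= 1;
    buf[wr].count = 0;  /* writer restarts; RANSAC reads buf[wr ^ 1] */
}
```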

4.2.1. Block Matching Hardware Accelerator. Figure 7 shows the architecture of the proposed block matching hardware accelerator, performing template block extraction from one frame and matching of these template blocks in their corresponding search areas from the next frame. The overall process can be completed in stream to yield the point-to-point motion (point pairs) of two subsequent frames without buffering an entire frame.

[Figure 5: Hardware architecture of motion estimation core.]

As a 9 × 9 block size is utilized in block matching, a 9-tap line buffer is designed in such a way that a 9 × 9 pixel moving window can be obtained in every clock cycle. These 9 × 9 pixels are shared by both the block extraction and matching processes and are read one by one in pipeline from the line buffer at each valid cycle, resulting in a total of 81 cycles to obtain a complete window.

The block extractor keeps track of the coordinate of the current pixel in the video stream as a reference for the extraction process. Template blocks from incoming frames are extracted and stored temporarily into block memory. As each block is extracted line-by-line in raster scan, the block memory is divided into nine row memories, as illustrated in Figure 6(a), each of which stores one pixel row of the template blocks. When the video stream reaches the block position, each pixel row is loaded into each row memory from the corresponding tap of the line buffer. Block coordinates are also stored in a separate FIFO to keep track of their positions.

Since only one SAD processor is used for matching m × n blocks, as mentioned in Section 3.1, the template block has to be swapped according to the corresponding search area during raster scan. Hence, each row memory is constructed with two FIFOs, an upper and a lower FIFO, as illustrated in Figure 6(b), to enable block swapping during the matching process. Template blocks are stored into the upper FIFO during the extraction process. During the matching process, each line of raster scan enters eight different search areas to match eight different template blocks, respectively. Hence, one row of template blocks is cached in the lower FIFO and repeatedly used until the end of their search areas (reaching the next row of search areas). Upon reaching each new row of search areas, the template blocks in the lower FIFO are replaced with a new row of template blocks from the upper FIFO. At the last line of the raster scan, the lower FIFO is flushed to prevent overflow.

[Figure 6: Block memory architecture for storing template blocks. (a) Block memory consisting of nine row memories; (b) row memory contains an upper FIFO and a lower FIFO.]

[Figure 7: Stream-oriented hardware architecture of block matching.]

In order to efficiently extract and match all blocks, a different Control Vector (CV), as illustrated in Table 3, is sent to perform different reading and writing operations in block memory based on the current position in the raster scan. Reads and writes are independent of each other and can be executed at the same time. Pixels are processed one by one in 81 cycles to complete a window. Both writing and reading processes require 9 cycles for each row memory, passing the CV from one row memory to the next until the end, to complete an 81-pixel write or read operation of a template block.

Table 3: Control Vector (CV) for different read and write operations of block memory.

Position of raster scan            Write upper  Read upper  Write lower  Read lower  sel1  sel2
Entering template block position   1            x           x            x           x     x
Entering first search area row     x            1           1            0           1     1
Entering next search area row      x            1           1            1           1     1
Reentering same search area row    x            0           1            1           0     0
Leaving last search area row       x            0           0            1           0     0

The SAD processor performs the correlation of the template blocks from the previous frame with all possible blocks from the current frame according to the search area. Extracted block pixels are read from block memory, while window pixels in the search areas are provided from the taps of the line buffer. The total number of required PEs equals the total number of pixels in a window. The process is pipelined such that each pixel is computed in a PE as soon as it is obtained from the line buffer. The matching score of each window can be obtained in every cycle after a fixed latency.
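A behavioural C equivalent of the 81-PE SAD computation is sketched below; the hardware evaluates one candidate window per cycle in parallel PEs, so this sequential loop is illustrative only:

```c
#include <stdint.h>

#define B 9  /* block size used in this work */

/* Sum of absolute differences between a 9x9 template block and one
   candidate window from its search area. */
uint32_t sad9x9(const uint8_t tmpl[B][B], const uint8_t win[B][B])
{
    uint32_t sad = 0;
    for (int r = 0; r < B; ++r) {
        for (int c = 0; c < B; ++c) {
            int d = (int)tmpl[r][c] - (int)win[r][c];
            sad += (uint32_t)(d < 0 ? -d : d);  /* absolute difference */
        }
    }
    return sad;  /* the best score tracker keeps the minimum per search area */
}
```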

[Figure 8: Hardware datapath of fitness scoring in RANSAC accelerator.]

Lastly, the best score tracker constantly stores and updates the best matching score for each template block within its corresponding search area. The matching scores are compared within the same search area, and the coordinates of the best-scored blocks are preserved. At the end of each search area, the coordinates of the best pairs (template blocks and their best-scored blocks) are sent to the RANSAC module for the next processing. Hence, the proposed block matching hardware is able to produce point-to-point motion (point pairs) of every two successive frames in streaming video at line rate.

4.2.2. RANSAC Hardware Accelerator. The RANSAC hardware design in [39] is utilized in this work, which accelerates only the fitness scoring step. As described in Algorithm 2, fitness scoring is an iterative process which performs similar computation on all data samples based on a hypothesis model. Hence, this data intensive process is executed in a pipelined datapath, as illustrated in Figure 8. A control unit is utilized to read the input data provided by block matching from the buffer and stream these inputs to the datapath unit at every clock cycle.

The datapath unit utilizes three stages of pipeline with the aim of isolating the multiplication processes, thus allowing a faster clock rate. The first stage pipeline registers are located right after the first multiplication, while the other two stages of pipeline registers enclose the squaring processes. The individual score is accumulated in the last stage, producing the final total fitness score. The accumulator is reset on each new set of hypotheses. Thus, the total number of cycles required for fitness score computation is the number of overall data plus the four-cycle latency.
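Assuming, per the Figure 8 datapath, that each point pair contributes its squared affine residual saturated at th2dist (an MSAC-style score), the accelerated loop corresponds to the following sketch; the hardware performs the same arithmetic in fixed point (Table 4):

```c
/* Fitness score of an affine hypothesis H over n point pairs: residual
   of mapping (x1,y1) to (x2,y2), squared, saturated at th2dist, and
   accumulated (one pair per clock cycle in hardware). */
float fitness_score(const float *x1, const float *y1,
                    const float *x2, const float *y2,
                    int n, const float H[6], float th2dist)
{
    float score = 0.0f;
    for (int i = 0; i < n; ++i) {
        float ex = x2[i] - (H[0] * x1[i] + H[1] * y1[i] + H[2]);
        float ey = y2[i] - (H[3] * x1[i] + H[4] * y1[i] + H[5]);
        float d2 = ex * ex + ey * ey;           /* squared residual   */
        score += (d2 < th2dist) ? d2 : th2dist; /* min with threshold */
    }
    return score;
}
```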

Table 4: Fixed point precision of fitness scoring inputs.

Parameter         Number of bits        Number range
                  Integer   Fraction
x1, y1, x2, y2    11        0           [-1024, 1024)
H0, H1, H3, H4    4         12          [-8, 8)
H2, H5            11        5           [-1024, 1024)

[Figure 9: Hardware architecture for object segmentation.]

Although fitness scoring could require floating point computations, the datapath unit uses suitable fixed point precision for each stage. Since Nios II is a 32-bit processor, the affine parameters in the hypothesis model (H0 to H5) are properly scaled to different 16-bit fixed point precisions, as described in Table 4, so that two affine parameters can be assigned in a single 32-bit write instruction. As this system is targeted for 640 × 480 pixels' video, all input coordinates (x1, y1, x2, and y2) are scaled to 11 bits.
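A sketch of the scaling and paired 32-bit writes is shown below; the Q-format helpers, the pairing of parameters per word, and the register offsets are illustrative assumptions, not taken from the paper:

```c
#include <stdint.h>

/* Q4.12 for H0, H1, H3, H4 and Q11.5 for H2, H5, per Table 4. */
static inline uint16_t q4_12(float v) { return (uint16_t)(int16_t)(v * 4096.0f); }
static inline uint16_t q11_5(float v) { return (uint16_t)(int16_t)(v * 32.0f);  }

/* Pack two 16-bit fixed-point parameters into each 32-bit Nios II write. */
void write_affine(volatile uint32_t *reg, const float H[6])
{
    reg[0] = ((uint32_t)q4_12(H[1]) << 16) | q4_12(H[0]);
    reg[1] = ((uint32_t)q11_5(H[2]) << 16) | q4_12(H[3]);
    reg[2] = ((uint32_t)q4_12(H[4]) << 16) | q11_5(H[5]);
}
```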

4.3. Object Segmentation Hardware Architecture. As object segmentation can be performed in one raster scan, a stream-oriented architecture is proposed, as illustrated in Figure 9. All subprocesses are executed in pipeline on the streaming video without additional frame buffering. The object segmentation process is initiated by the software processor after it provides the affine parameters from RANSAC to the affine PE. Two frames (F_(i-2) and F_(i-3), as described in Table 2) from the frame buffer (SRAM) are required to segment the moving target.

Based on the affine parameters from RANSAC, the affine PE uses a reverse mapping technique to find each pixel location in the previous frame (F_(i-3)) using (3) and generates their addresses in the frame buffer (SRAM). The frame reader fetches the previous frame (F_(i-3)) pixel-by-pixel according to the generated addresses from the frame buffer, thus constructing a stream of the transformed frame, which is denoted as F'_(i-3).
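Assuming the affine model x' = H0*x + H1*y + H2, y' = H3*x + H4*y + H5 from (3) and a simple row-major frame layout (the actual address format is not detailed here), the reverse mapping amounts to:

```c
#include <stdint.h>

/* For each output pixel (x, y), compute its source location in F_(i-3)
   under the affine hypothesis and the corresponding linear address. */
uint32_t reverse_map_address(int x, int y, const float H[6], int width)
{
    int xs = (int)(H[0] * x + H[1] * y + H[2]); /* source column */
    int ys = (int)(H[3] * x + H[4] * y + H[5]); /* source row    */
    return (uint32_t)(ys * width + xs);         /* row-major address */
}
```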

[Figure 10: Hardware architecture of median PE.]

By synchronizing the streams of both frames, frame differencing can be executed in pipeline as soon as one pixel from each frame is obtained. Hence, one pixel in the current frame (F_(i-2)) and one pixel in the transformed frame (F'_(i-3)) are fetched alternately from their corresponding memory locations by the frame reader, constructing two synchronized streams of the F_(i-2) and F'_(i-3) frames. The frame differencing PE performs pixel-to-pixel absolute subtraction and thresholding on the streams, computing one pixel per cycle. A configurable threshold value thfd is applied after the subtraction, yielding a stream of binary image without buffering the whole frame.
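The per-pixel behaviour of the frame differencing PE reduces to the following sketch:

```c
#include <stdint.h>

/* Absolute subtraction and thresholding of two synchronized pixel
   streams; one binary foreground pixel per clock cycle. */
static inline uint8_t frame_diff_pe(uint8_t cur, uint8_t warped, uint8_t th_fd)
{
    int d = (int)cur - (int)warped;
    if (d < 0) d = -d;           /* absolute difference */
    return (d > th_fd) ? 1 : 0;  /* binary segmentation */
}
```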

After frame differencing, the binary image is streamed into 7 × 7 median filtering. Seven lines of the image are buffered in the line buffer, providing a 7 × 7 pixel window for the median PE to perform the median computation. Median computation can be performed in one clock cycle for each processing window due to the short propagation delay, as only binary pixels are involved. Figure 10 shows the hardware logic design of the median PE.

Median filtering can be computed by counting the number of asserted (binary 1) pixels in the window. If more than half the pixels in the window (24 pixels) are asserted, the resultant pixel is "1", or "0" otherwise. Since the processing window moves only one pixel to the right for each computation during raster scan, the current pixel count is computed by adding the rightmost column pixels of the current window to the previous pixel count while subtracting the leftmost column pixels of the previous window. The final binary output pixel is produced by thresholding the current pixel count with 24 (half of the window size).
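A sketch of this running-count update, with prev_count being the count of the window one pixel to the left:

```c
#include <stdint.h>

/* One step of the sliding 7x7 majority filter: add the incoming
   rightmost column, subtract the departed leftmost column, and
   threshold the count against half the window size. */
int median_pe_step(int prev_count,
                   const uint8_t right_col[7],
                   const uint8_t left_col_prev[7],
                   uint8_t *out)
{
    int count = prev_count;
    for (int r = 0; r < 7; ++r)
        count += (int)right_col[r] - (int)left_col_prev[r];
    *out = (count > 24) ? 1 : 0;  /* majority of the 49 binary pixels */
    return count;                 /* becomes prev_count for the next step */
}
```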

As dilation is also a 7 × 7 window-based process, it uses a similar line buffering technique as median filtering. However, only a simple logical OR operation is performed on all window pixels. Due to its simplicity, the dilation PE can also compute in one clock cycle, resulting in a stream of binary image with the detected regions of moving targets.

5. Experimental Results

5.1. Verification of Proposed SoC. The proposed moving target detection SoC is verified in offline detection mode using the database in [41]. Test videos are 640 × 480 pixels in size and are greyscaled prior to the verification process. The test videos are transferred to the system for computation via a USB mass storage device. After performing the detection in the SoC, the image results are displayed on VGA and also stored on the USB drive for verification. Figure 11 shows the moving target detection results from the proposed SoC using different sample videos. The detected regions (red) are overlaid on the input frame. In most cases, the proposed SoC is able to detect the moving target in consecutive frames.

However, there are several limitations in this work. Block matching may not give a good motion estimation result if the extracted blocks lack texture (the pixel intensities are similar). Moreover, the detected region of a moving target may appear as a cavity or as multiple splits of smaller regions, as only simple frame differencing is applied in the proposed system. Additional postprocessing to produce better detected blobs by merging split regions is out of the scope of this work.

As the stochastic RANSAC algorithm is terminated after a constant time step for each frame, image registration errors may occur, producing incorrect ego-motion estimation. This could be mitigated by accelerating the RANSAC algorithm to allow more iterations, using dedicated hardware or a high performance general purpose processor.

5.2. Performance Evaluation of Detection Algorithm. The performance evaluation of the implemented detection algorithm uses the mathematical performance metrics in [42], which involve the following parameters:

(i) True positive (TP): a detected moving object.
(ii) False positive (FP): detected regions that do not correspond to any moving object.
(iii) False negative (FN): a nondetected moving object.
(iv) Detection rate (DR): the ratio of TP to the combination of TP and FN, as formulated in

DR = TP / (TP + FN). (4)

(v) False alarm rate (FAR): the ratio of FP to all positive detections, as defined in

FAR = FP / (TP + FP). (5)
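As a quick worked example of (4) and (5), TP = 95, FN = 5, and FP = 40 give DR = 95/100 = 0.95 and FAR = 40/135 ≈ 0.30; a direct implementation is trivial:

```c
/* Detection rate and false alarm rate per (4) and (5). */
double detection_rate(int tp, int fn)   { return (double)tp / (double)(tp + fn); }
double false_alarm_rate(int tp, int fp) { return (double)fp / (double)(tp + fp); }
```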

[Figure 11: Detected regions from the proposed moving target detection SoC on different sample videos in [41]. (a)-(d) frames 255, 275, 295, and 315 of video V3V100003 004; (e)-(h) frames 1000, 1020, 1040, and 1060 of video V3V100004 003; (i)-(l) frames 600, 620, 640, and 660 of video V4V100007 017.]

[Figure 12: Evaluation of performance metrics TP, FP, and FN based on ground truth boxes (blue) and the detected regions (red).]

To obtain the performance metrics, ground truth regions are manually labelled in several frames of the test videos. A bounding box is drawn across each moving object to indicate the ground truth region in every frame, as depicted in Figure 12. A simple postprocessing step filters out detected regions smaller than 15 pixels in width or 15 pixels in height prior to the evaluation. A detected moving object (TP) has detected regions within its bounded ground truth area, while a nondetected moving object (FN) has no detected region overlapping with its ground truth area. A detected region that does not overlap with any ground truth region is considered a false positive (FP).

The detection performance is evaluated over different parameter configurations. The DR and FAR for 1000 test frames using different numbers of blocks (density of ego-motion estimation) m × n in area-based registration and different frame differencing thresholds thfd are depicted in Table 5 and Figure 13.

The experimental results show that DR is almost constant across different densities of ego-motion estimation but decreases with thfd. Although higher density in the proposed work imposes a lower displacement limitation d_m and d_n, as discussed in Section 3.1, most of the point-to-point displacements do not exceed the limitation due to slow UAV movement in most frames of the test dataset. On the contrary, a higher value of thfd may filter out the moving object if the intensities of the object pixels and the background pixels are almost similar.

FAR decreases with density of ego-motion estimation due to the higher quality of the image registration process, but increases if most frames exceed the displacement limitation d_m and d_n. However, false registration due to the displacement limitation results in a huge blob of foreground but does not greatly increase FAR. Although higher values of thfd decrease the false detection rate, they also produce a smaller foreground area for all detected moving objects, as pixels with intensity almost similar to the background are thresholded out.

5.3. Speed Comparison with Full Software Implementation. The computation speed of the proposed moving target detection SoC is compared with software computation on different platforms, including a modern CPU (Intel Core i5) in a desktop computer and an embedded processor (ARM). Table 6 illustrates the comparison of computation frame rate and hardware speed-up between the proposed system and other software implementations using the test videos in [41].

[Figure 13: DR and FAR for different density of ego-motion estimation m × n and frame differencing threshold thfd. (a) DR; (b) FAR.]

Table 5: Performance evaluation in terms of DR and FAR for 1000 frames using different density of ego-motion estimation m × n and frame differencing threshold thfd.

m × n   thfd   DR      FAR
12      15     0.958   0.643
12      20     0.954   0.331
12      25     0.949   0.194
24      15     0.957   0.568
24      20     0.950   0.324
24      25     0.945   0.101
35      15     0.958   0.548
35      20     0.952   0.215
35      25     0.947   0.090
48      15     0.959   0.539
48      20     0.952   0.253
48      25     0.944   0.079
70      15     0.958   0.509
70      20     0.951   0.188
70      25     0.946   0.075
88      15     0.960   0.489
88      20     0.951   0.219
88      25     0.947   0.074
108     15     0.958   0.483
108     20     0.952   0.168
108     25     0.946   0.058
140     15     0.958   0.499
140     20     0.951   0.187
140     25     0.946   0.059
165     15     0.958   0.474
165     20     0.953   0.214
165     25     0.947   0.068
192     15     0.959   0.478
192     20     0.952   0.169
192     25     0.946   0.092

Table 6: Computation speed comparison of the proposed system with different software implementations using area-based and feature-based registrations.

Platform              Frequency   Registration technique   Frame rate (fps)   Hardware speed-up
Proposed SoC          100 MHz     Area-based               30                 1
Intel Core i5-4210U   1.70 GHz    Area-based               4.26               7.04
                                  Feature-based            13.11              2.29
ARM1176JZF            700 MHz     Area-based               0.20               150
                                  Feature-based            0.56               53.57

As feature-based image registration has faster computation in software implementation compared to area-based registration, the speed performance of the feature-based method is also included for comparison. In the feature-based implementation, features are first detected in each frame. The detected features from the current frame are cross-correlated with features from the previous frame, while the RANSAC algorithm is used to estimate the ego-motion between frames. After compensating the ego-motion, segmentation of the moving object uses the same processes as the proposed system. To further optimize the software implementation in terms of speed, a fast feature detection algorithm [30] is utilized. As the number of features affects the computation time of the feature matching step, only the 100 strongest features in each frame are selected for processing. However, the performance evaluation does not consider multithreaded software execution.


Table 7: Resource usage of the proposed moving target detection SoC.

Logic units                      Used      Utilization (%)
Total combinational functions    15,161    13
Total registers                  10,803    9
Total memory bits                521,054   13
Embedded multipliers             27        5
FPGA device: Altera Cyclone IV

Based on the experimental results, the speed performance of the proposed moving target detection SoC surpasses optimized software computation by 2.29 times and 53.57 times compared with implementations on a modern CPU and an embedded CPU, respectively. The software computation (RANSAC) in the HW/SW codesign of the proposed system creates a speed bottleneck, thus limiting the maximum throughput to 30 fps. The processing frame rate of the proposed system can be further improved by using fully dedicated hardware.

5.4. Resource Utilization. The overall hardware resource utilization of the complete system is illustrated in Table 7. This prototype of a real-time moving object detection system utilizes less than 20 percent of the total resources in the Altera Cyclone IV FPGA device. As the proposed system uses off-chip memory components for frame buffering, FPGA on-chip memory is utilized only for line buffering in streaming processes (e.g., block matching and median filtering) and for storing intermediate results (e.g., point pairs after block matching). Thus, the low resource usage of the proposed system provides abundant hardware space for other processes, such as target tracking or classification, to be developed in the future.

6. Conclusions

Moving target detection is a crucial step in most computer vision problems, especially for UAV applications. On-chip detection without the need for real-time video transmission to the ground provides immense benefit to diverse applications such as military surveillance and resource exploration. In order to perform this complex embedded video processing on-chip, an FPGA-based system is desirable due to the potential parallelism of the algorithm.

This paper proposed a moving target detection system using an FPGA to enable an autonomous UAV that is able to perform the computer vision algorithm on the flying platform. The proposed system is prototyped using an Altera Cyclone IV FPGA device on a Terasic DE2-115 development board mounted with a TRDB-D5M camera. This system is developed as a HW/SW codesign using dedicated hardware with a Nios II software processor (booted with embedded Linux) running at a 100 MHz clock rate. As stream-oriented hardware with pipeline processing is utilized, the proposed system achieves real-time capability with 30 frames per second processing speed on 640 × 480 live video. Experimental results show that the proposed SoC performs 2.29 times and 53.57 times faster than optimized software computation on a modern desktop computer (Intel Core i5) and an embedded processor (ARM), respectively. In addition, the proposed moving target detection uses less than 20 percent of the total resources in the FPGA device, allowing other hardware accelerators to be implemented in the future.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

The authors would like to express their gratitude to Universiti Teknologi Malaysia (UTM) and the Ministry of Science, Technology and Innovation (MOSTI), Malaysia, for supporting this research work under research grants 01-01-06-SF1197 and 01-01-06-SF1229.

References

[1] A. Ahmed, M. Nagai, C. Tianen, and R. Shibasaki, "UAV based monitoring system and object detection technique development for a disaster area," International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. 37, pp. 373–377, 2008.

[2] B. Coifman, M. McCord, R. Mishalani, M. Iswalt, and Y. Ji, "Roadway traffic monitoring from an unmanned aerial vehicle," IEE Proceedings - Intelligent Transport Systems, vol. 153, no. 1, pp. 11–20, 2006.

[3] K. Kanistras, G. Martins, M. J. Rutherford, and K. P. Valavanis, "Survey of unmanned aerial vehicles (UAVs) for traffic monitoring," in Handbook of Unmanned Aerial Vehicles, pp. 2643–2666, Springer, 2015.

[4] K. Nordberg, P. Doherty, G. Farneback et al., "Vision for a UAV helicopter," in Proceedings of the International Conference on Intelligent Robots and Systems (IROS '02), Workshop on Aerial Robotics, pp. 29–34, Lausanne, Switzerland, October 2002.

[5] D. Zamalieva and A. Yilmaz, "Background subtraction for the moving camera: a geometric approach," Computer Vision and Image Understanding, vol. 127, pp. 73–85, 2014.

[6] M. Genovese and E. Napoli, "ASIC and FPGA implementation of the Gaussian mixture model algorithm for real-time segmentation of high definition video," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 22, no. 3, pp. 537–547, 2014.

[7] F. Kristensen, H. Hedberg, H. Jiang, P. Nilsson, and V. Owall, "An embedded real-time surveillance system: implementation and evaluation," Journal of Signal Processing Systems, vol. 52, no. 1, pp. 75–94, 2008.

[8] H. Jiang, H. Ardo, and V. Owall, "A hardware architecture for real-time video segmentation utilizing memory reduction techniques," IEEE Transactions on Circuits and Systems for Video Technology, vol. 19, no. 2, pp. 226–236, 2009.

[9] M. Genovese and E. Napoli, "FPGA-based architecture for real time segmentation and denoising of HD video," Journal of Real-Time Image Processing, vol. 8, no. 4, pp. 389–401, 2013.

[10] A. Lopez-Bravo, J. Diaz-Carmona, A. Ramirez-Agundis, A. Padilla-Medina, and J. Prado-Olivarez, "FPGA-based video system for real time moving object detection," in Proceedings of the 23rd International Conference on Electronics, Communications and Computing (CONIELECOMP '13), pp. 92–97, IEEE, Cholula, Mexico, March 2013.

[11] T. Kryjak, M. Komorkiewicz, and M. Gorgon, "Real-time moving object detection for video surveillance system in FPGA," in Proceedings of the Conference on Design and Architectures for Signal and Image Processing (DASIP '11), pp. 1–8, IEEE, Tampere, Finland, November 2011.

[12] A. Mittal and D. Huttenlocher, "Scene modeling for wide area surveillance and image synthesis," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 160–167, IEEE, June 2000.

[13] A. Price, J. Pyke, D. Ashiri, and T. Cornall, "Real time object detection for an unmanned aerial vehicle using an FPGA based vision system," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '06), pp. 2854–2859, IEEE, Orlando, Fla, USA, May 2006.

[14] G. J. García, C. A. Jara, J. Pomares, A. Alabdo, L. M. Poggi, and F. Torres, "A survey on FPGA-based sensor systems: towards intelligent and reconfigurable low-power sensors for computer vision, control and signal processing," Sensors, vol. 14, no. 4, pp. 6247–6278, 2014.

[15] S. Ali and M. Shah, "Cocoa: tracking in aerial imagery," in Airborne Intelligence, Surveillance, Reconnaissance (ISR) Systems and Applications III, vol. 6209 of Proceedings of SPIE, Orlando, Fla, USA, April 2006.

[16] J. Xiao, C. Yang, F. Han, and H. Cheng, "Vehicle and person tracking in aerial videos," in Multimodal Technologies for Perception of Humans, pp. 203–214, Springer, 2008.

[17] W. Yu, X. Yu, P. Zhang, and J. Zhou, "A new framework of moving target detection and tracking for UAV video application," in Proceedings of the International Archives of the Photogrammetry, Remote Sensing and Spatial Information Science, vol. 37, Beijing, China, 2008.

[18] V. Reilly, H. Idrees, and M. Shah, "Detection and tracking of large number of targets in wide area surveillance," in Computer Vision—ECCV 2010: 11th European Conference on Computer Vision, Heraklion, Crete, Greece, September 5–11, 2010, Proceedings, Part III, vol. 6313 of Lecture Notes in Computer Science, pp. 186–199, Springer, Berlin, Germany, 2010.

[19] J. Wang, Y. Zhang, J. Lu, and W. Xu, "A framework for moving target detection, recognition and tracking in UAV videos," in Affective Computing and Intelligent Interaction, vol. 137 of Advances in Intelligent and Soft Computing, pp. 69–76, Springer, Berlin, Germany, 2012.

[20] S. A. Cheraghi and U. U. Sheikh, "Moving object detection using image registration for a moving camera platform," in Proceedings of the IEEE International Conference on Control System, Computing and Engineering (ICCSCE '12), pp. 355–359, IEEE, Penang, Malaysia, November 2012.

[21] Y. Zhang, X. Tong, T. Yang, and W. Ma, "Multi-model estimation based moving object detection for aerial video," Sensors, vol. 15, no. 4, pp. 8214–8231, 2015.

[22] Q. Yu and G. Medioni, "A GPU-based implementation of motion detection from a moving platform," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPR '08), pp. 1–6, Anchorage, Alaska, USA, June 2008.

[23] A. Laika, J. Paul, C. Claus, W. Stechele, A. E. S. Auf, and E. Maehle, "FPGA-based real-time moving object detection for walking robots," in Proceedings of the 8th IEEE International Workshop on Safety, Security and Rescue Robotics (SSRR '10), pp. 1–8, IEEE, Bremen, Germany, July 2010.

[24] B. Tippetts, S. Fowers, K. Lillywhite, D.-J. Lee, and J. Archibald, "FPGA implementation of a feature detection and tracking algorithm for real-time applications," in Advances in Visual Computing, pp. 682–691, Springer, 2007.

[25] K. May and N. Krouglicof, "Moving target detection for sense and avoid using regional phase correlation," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '13), pp. 4767–4772, IEEE, Karlsruhe, Germany, May 2013.

[26] M. E. Angelopoulou and C.-S. Bouganis, "Vision-based ego-motion estimation on FPGA for unmanned aerial vehicle navigation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 24, no. 6, pp. 1070–1083, 2014.

[27] I. M. El-Emary and M. M. A. El-Kareem, "On the application of genetic algorithms in finger prints registration," World Applied Sciences Journal, vol. 5, no. 3, pp. 276–281, 2008.

[28] A. A. Goshtasby, 2-D and 3-D Image Registration for Medical, Remote Sensing, and Industrial Applications, John Wiley & Sons, New York, NY, USA, 2005.

[29] C. Harris and M. Stephens, "A combined corner and edge detector," in Proceedings of the 4th Alvey Vision Conference, vol. 15, pp. 147–151, 1988.

[30] E. Rosten and T. Drummond, "Machine learning for high-speed corner detection," in Computer Vision—ECCV 2006, pp. 430–443, Springer, 2006.

[31] H. Bay, T. Tuytelaars, and L. Van Gool, "SURF: speeded up robust features," in Computer Vision—ECCV 2006, A. Leonardis, H. Bischof, and A. Pinz, Eds., vol. 3951 of Lecture Notes in Computer Science, pp. 404–417, Springer, 2006.

[32] G. R. Rodríguez-Canosa, S. Thomas, J. del Cerro, A. Barrientos, and B. MacDonald, "A real-time method to detect and track moving objects (DATMO) from unmanned aerial vehicles (UAVs) using a single camera," Remote Sensing, vol. 4, no. 4, pp. 1090–1111, 2012.

[33] B. Liu and A. Zaccarin, "New fast algorithms for the estimation of block motion vectors," IEEE Transactions on Circuits and Systems for Video Technology, vol. 3, no. 2, pp. 148–157, 1993.

[34] R. Li, B. Zeng, and M. L. Liou, "New three-step search algorithm for block motion estimation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 4, no. 4, pp. 438–442, 1994.

[35] L.-M. Po and W.-C. Ma, "A novel four-step search algorithm for fast block motion estimation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 6, no. 3, pp. 313–317, 1996.

[36] S. Zhu and K.-K. Ma, "A new diamond search algorithm for fast block-matching motion estimation," IEEE Transactions on Image Processing, vol. 9, no. 2, pp. 287–290, 2000.

[37] L. De Vos and M. Stegherr, "Parameterizable VLSI architectures for the full-search block-matching algorithm," IEEE Transactions on Circuits and Systems, vol. 36, no. 10, pp. 1309–1316, 1989.

[38] M. A. Fischler and R. C. Bolles, "Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography," Communications of the ACM, vol. 24, no. 6, pp. 381–395, 1981.

[39] J. W. Tang, N. Shaikh-Husin, and U. U. Sheikh, "FPGA implementation of RANSAC algorithm for real-time image geometry estimation," in Proceedings of the 11th IEEE Student Conference on Research and Development (SCOReD '13), pp. 290–294, IEEE, Putrajaya, Malaysia, December 2013.

[40] O. Chum and J. Matas, "Randomized RANSAC with T_{d,d} test," in Proceedings of the British Machine Vision Conference, vol. 2, pp. 448–457, September 2002.

[41] DARPA, SDMS Public Web Site, 2003, https://www.sdms.afrl.af.mil.

[42] A. F. M. S. Saif, A. S. Prabuwono, and Z. R. Mahayuddin, "Motion analysis for moving object detection from UAV aerial images: a review," in Proceedings of the International Conference on Informatics, Electronics and Vision (ICIEV '14), pp. 1–6, IEEE, Dhaka, Bangladesh, May 2014.


International Journal of Reconfigurable Computing 7

Moving targetdetector

NIOS II CPU

SDRAMcontroller

Camerainterface

FPGA

Syste

m b

us

Embedded Linux

(i) Firmware control

(ii) Software processing

SDRAM

VGAinterface

VGAdisplay

SRAM

USBcontroller

USB mass storage device

Camera

Software processor Hardware accelerator Controllers and interfaces

Memory components IO components

Figure 3 System architecture of moving target detection

Framebuffer

interface

Motionestimation

core

Framegrabber

Objectsegmentation

core

Video streaminput

Businterface(slave)

Software processor

Frame buffer(SRAM)

Businterface(master)

Videoresult

DMA to SDRAM

Affineparameters

RANSACcomputation

Fiminus2

Fi

and Fiminus3

Figure 4 Hardware architecture of moving target detector

8 International Journal of Reconfigurable Computing

grabber also provides the current frame (119865119894) tomotion

estimation core(2) Motion estimation core performs blockmatching andRANSAC computation Since RANSAC is computedin both hardware and software software processor isconstantly accessing this core via system bus interfaceto calculate the affine parameters

(3) After RANSAC the affine parameters are transferredfrom software to object segmentation core Two pre-vious frames (119865

119894minus2and 119865

119894minus3) are read from the frame

buffer by object segmentation core for processing(4) Several processes involving affine transformationframe differencing median filter and dilation arethen performed on both frames resulting in thedetected moving target

(5) Lastly the bus interface (master) provides DMAaccess for object segmentation core to transfer theend result into SDRAMfor displaying and verificationpurposes

As the frame buffer (SRAM) is a single port 16-bitmemory frame grabber concatenates two neighbouring 8-bitgreyscale pixels to store in one memory location Since framegrabber and object segmentation core share the frame bufferto write and read frames respectively frame buffer interfaceprovides priority arbitration and gives frame grabber thehighest priority granting everywrite request However framebuffer may be busy for a couple of clock cycles due to readoperation of SRAM by other modules a small FIFO withdepth of 4 is utilized in frame grabber to temporarily bufferthe incoming image pixels

42 Motion Estimation Hardware Accelerator Motion esti-mation core consists of block matching and RANSAC hard-ware accelerators Since RANSAC requires the entire dataof point pairs provided by block matching to begin itscomputation additional buffers are needed to temporarilystore the corresponding point pairs for every two subsequentframes The hardware architecture for motion estimationprocess is shown in Figure 5To enable high throughput data (point pairs) sharing

for both block matching and RANSAC double bufferingtechnique is applied by using two buffers (Buffer 1 and Buffer2) as data storage For any instance one buffer is writtenby block matching while the other is used for computationby RANSAC Buffer controller swaps the roles of these twobuffers for each incoming new frame therefore ensuringboth processes to be pipelined by reading and writing oneach buffer subsequently Buffer swapping is initiated at eachcompletion of block matching modules while RANSAC isperformed during the time gap between each swap and isterminated before the next swap

421 Block Matching Hardware Accelerator Figure 7 showsthe architecture of the proposed block matching hardwareaccelerator performing template blocks extraction fromone frame and matching of these template blocks in theircorresponding search areas from next frame The overall

Buffer controllerBuffer 1

Block matchingVideo stream

input

RANSAC acceleratorTo softwareprocessor

Buffer 2

Figure 5 Hardware architecture of motion estimation core

process can be completed in stream to yield the point-to-point motion (point pairs) of two subsequent frames withoutbuffering an entire frameAs 9 times 9 block size is utilized in block matching a 9-

tap line buffer is designed in such a way that 9 times 9 pixels ofmoving window can be obtained in every clock cycleThese 9times 9 pixels are shared for both block extraction and matchingprocesses and are read one by one in pipeline from the linebuffer at each valid cycle resulting in a total of 81 cycles toobtain a complete windowThe block extractor keeps track of the coordinate of

current pixel in video stream as a reference for extractionprocess Template blocks from incoming frames are extractedand stored temporarily into block memory As each block isextracted line-by-line in raster scan blockmemory is dividedinto nine-rowmemories as illustrated in Figure 6(a)with eachof which being used to store one pixel row in template blocksWhen video stream reaches the block position each pixelrow is loaded into each rowmemory from the correspondingtap of the line buffer Block coordinates are also stored in aseparate FIFO to keep track of its positionSince only one SAD processor is used for matching 119898 times119899 blocks as mentioned in Section 31 the template blockhas to be swapped according to the corresponding searcharea during raster scan Hence row memory is constructedwith two FIFOs upper and lower FIFO as illustrated inFigure 6(b) to enable block swapping during matchingprocess Template blocks are stored into upper FIFO duringextraction process During matching process each line ofraster scan enters eight different search areas to match eightdifferent template blocks respectively Hence one row oftemplate blocks is cached in lower FIFO and is repeatedlyused until the end of their search areas (reaching next row ofsearch areas) Upon reaching each new row of search areastemplate blocks in lower FIFO are replaced with new row oftemplate blocks from upper FIFO At the last line of rasterscan the lower FIFO is flushed to prevent overflow

International Journal of Reconfigurable Computing 9

CV ControlVector(CV)

Tap 0 Tap 1

Row 1pixels

Template blockscoordinate

Template blockscoordinate

Template blocks pixels

Row 2pixels

Tap 8

Row 8pixels

CV

Rowmemory

Rowmemory

Coordinate memory (FIFO)

Row memory

middot middot middot

middot middot middot

middot middot middot

(a) Block memory consisting of nine-row memories

wr_upper

rd_upper

wr_lower

rd_lower

sel1

sel2

CV_in CV_out

Tap

Row pixels

UpperFIFO

LowerFIFO

1 0

0

Controlregisters

1 0

(b) Row memory contains an upper FIFO and lower FIFO

Figure 6 Block memory architecture for storing template blocks

Blockextractor

Bestscore

tracker

Line buffer

Block memory

SAD processor

Video stream input

Control vector

Matching score

Point pairs

Template blocks Blocks coordinate

9 times 9 window pixels

Figure 7 Stream-oriented hardware architecture of blockmatching

In order to efficiently extract and match all blocksdifferent Control Vector (CV) as illustrated in Table 3 is sentto perform different reading and writing operations in blockmemory based on the current position in raster scan Bothreads andwrites are independent of each other and are able tobe executed at the same time Pixels are processed one by onein 81 cycles to complete a window Both writing and reading

Table 3 Control Vector (CV) for different read andwrite operationsof block memory

Position of raster scan WriteupperReadupper

Writelower

Readlower sel1 sel2

Entering templateblock position 1 x x x x x

Entering first searcharea row x 1 1 0 1 1

Entering next searcharea row x 1 1 1 1 1

Reentering samesearch area row x 0 1 1 0 0

Leaving last searcharea row x 0 0 1 0 0

Both writing and reading processes require 9 cycles for each row memory, passing the CV from the first row memory to the next until the last, completing an 81-pixel write or read operation of a template block.

The SAD processor performs the correlation of the template blocks from the previous frame with all candidate blocks from the current frame according to the search area. Extracted block pixels are read from block memory, while window pixels in the search areas are provided from the taps of the line buffer. The total number of required PEs equals the total number of pixels in a window. The process is pipelined such that each pixel is computed in its PE as soon as it is obtained from the line buffer, so the matching score of each window can be obtained in every cycle after a fixed latency.

Lastly, the best score tracker constantly stores and updates the best matching score for each template block within its corresponding search area. The matching scores are compared within the same search area and the coordinates of the best-scored blocks are preserved. At the end of each search area, the coordinates of the best pairs (template blocks and their best-scored blocks) are sent to the RANSAC module for the next processing stage. Hence, the proposed block matching hardware produces the point-to-point motion (point pairs) of every two successive frames in streaming video at line rate.
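As a behavioural reference for the SAD processor and the best score tracker, the following sketch computes, for one template block, what the pipelined PEs and the tracker jointly produce. The function name and the 25 × 25 search span are assumptions for illustration (the paper fixes only the 9 × 9 block size), and the caller is assumed to keep the search area inside the frame.

```python
import numpy as np

def sad_match(prev_frame, curr_frame, bx, by, block=9, search=25):
    """Exhaustively match one 9x9 template block inside its search area.

    Mirrors the hardware behaviour: 81 absolute differences (one per PE)
    are summed into a matching score for every candidate window, and the
    best score/coordinate pair is tracked.
    """
    template = prev_frame[by:by + block, bx:bx + block].astype(np.int32)
    half = (search - block) // 2          # margin of the search area
    best_score, best_xy = None, None
    for wy in range(by - half, by + half + 1):
        for wx in range(bx - half, bx + half + 1):
            window = curr_frame[wy:wy + block, wx:wx + block].astype(np.int32)
            score = int(np.abs(window - template).sum())
            if best_score is None or score < best_score:   # best score tracker
                best_score, best_xy = score, (wx, wy)
    return best_xy, best_score            # one point pair for RANSAC
```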

[Figure 8: Hardware datapath of fitness scoring in the RANSAC accelerator.]

4.2.2. RANSAC Hardware Accelerator. The RANSAC hardware design in [39] is utilized in this work; it accelerates only the fitness scoring step. As described in Algorithm 2, fitness scoring is an iterative process which performs a similar computation on every data sample based on the hypothesis model. Hence, this data-intensive process is executed in a pipelined datapath, as illustrated in Figure 8. A control unit reads the input data provided by block matching from the buffer and streams these inputs to the datapath unit at every clock cycle.

The datapath unit utilizes three pipeline stages with the aim of isolating the multiplication processes, thus allowing a faster clock rate. The first-stage pipeline registers are located right after the first multiplications, while the other two stages of pipeline registers enclose the squaring processes. The individual scores are accumulated in the last stage, producing the final total fitness score. The accumulator is reset on each new set of hypotheses. Thus, the total number of cycles required for the fitness score computation is the number of data samples plus the four-cycle latency.

Table 4: Fixed-point precision of fitness scoring inputs.

Parameter         Integer bits  Fraction bits  Number range
x1, y1, x2, y2    11            0              [−1024, 1024)
H0, H1, H3, H4    4             12             [−8, 8)
H2, H5            11            5              [−1024, 1024)

[Figure 9: Hardware architecture for object segmentation.]

Although fitness scoring could require floating-point computations, the datapath unit uses a suitable fixed-point precision for each stage. Since the Nios II is a 32-bit processor, the affine parameters of the hypothesis model (H0 to H5) are scaled to the 16-bit fixed-point formats described in Table 4, so that two affine parameters can be assigned in a single 32-bit write instruction. As this system targets 640 × 480 pixel video, all input coordinates (x1, y1, x2, and y2) are scaled to 11 bits.
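A software model of this datapath is sketched below, assuming the truncated-quadratic (MSAC-style) accumulation suggested by the min stage and the th2dist input in Figure 8; floats are used here for clarity where the hardware uses the fixed-point formats of Table 4. Under this convention a lower score indicates a better hypothesis.

```python
def fitness_score(point_pairs, H, th_dist):
    """Accumulate the truncated squared residual of an affine hypothesis.

    For each pair (x1, y1) -> (x2, y2), the residual against
    H = (H0..H5) is squared, truncated at th_dist**2, and accumulated,
    one pair per 'clock cycle', as in the pipelined datapath.
    """
    th2 = th_dist * th_dist
    score = 0.0
    for (x1, y1), (x2, y2) in point_pairs:
        ex = x2 - (H[0] * x1 + H[1] * y1 + H[2])   # multiply/subtract stage
        ey = y2 - (H[3] * x1 + H[4] * y1 + H[5])
        d2 = ex * ex + ey * ey                     # squaring stages
        score += min(d2, th2)                      # min with th2dist, then acc
    return score
```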

4.3. Object Segmentation Hardware Architecture. As object segmentation can be performed in one raster scan, a stream-oriented architecture is proposed, as illustrated in Figure 9. All subprocesses are executed in a pipeline on the streaming video without additional frame buffering. The object segmentation process is initiated by the software processor after it provides the affine parameters from RANSAC to the affine PE. Two frames (F_{i−2} and F_{i−3}, as described in Table 2) are read from the frame buffer (SRAM) to segment the moving target.

Based on the affine parameters from RANSAC, the affine PE uses a reverse mapping technique to find each pixel location in the previous frame (F_{i−3}) using (3) and generates their addresses in the frame buffer (SRAM). The frame reader fetches the previous frame (F_{i−3}) pixel-by-pixel according to the generated addresses, thus constructing a stream of the transformed frame, denoted as F′_{i−3}.
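A minimal model of the affine PE plus frame reader follows, assuming the standard 6-parameter affine form for (3) and nearest-neighbour fetching; the function name and the border handling are illustrative.

```python
import numpy as np

def reverse_map_frame(frame_prev, H, width=640, height=480):
    """Construct the transformed frame F'_{i-3} by reverse mapping.

    For every output pixel (x, y), the source location in F_{i-3} is
    computed from the affine parameters (the generated SRAM address)
    and the pixel is fetched from there.
    """
    warped = np.zeros((height, width), dtype=frame_prev.dtype)
    for y in range(height):
        for x in range(width):
            sx = int(round(H[0] * x + H[1] * y + H[2]))  # source column
            sy = int(round(H[3] * x + H[4] * y + H[5]))  # source row
            if 0 <= sx < width and 0 <= sy < height:     # skip out-of-frame reads
                warped[y, x] = frame_prev[sy, sx]
    return warped
```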


[Figure 10: Hardware architecture of median PE.]

By synchronizing the streams of both frames, frame differencing can be executed in the pipeline as soon as one pixel from each frame is obtained. Hence, one pixel of the current frame (F_{i−2}) and one pixel of the transformed frame (F′_{i−3}) are fetched alternately from their corresponding memory locations by the frame reader, constructing two synchronized streams of the F_{i−2} and F′_{i−3} frames. The frame differencing PE performs pixel-to-pixel absolute subtraction and thresholding on the streams, computing one output pixel per cycle. A configurable threshold value th_fd is applied after the subtraction, yielding a stream of binary image without buffering the whole frame.
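The frame differencing PE thus reduces to one absolute subtraction and one comparison per pixel; a minimal sketch, with th_fd as the configurable threshold (the default value here is arbitrary):

```python
import numpy as np

def frame_difference(curr, warped_prev, th_fd=20):
    """Binary motion mask from two synchronized frame streams.

    Pixel-to-pixel absolute subtraction of F_{i-2} and F'_{i-3},
    thresholded with th_fd; the hardware emits one pixel per cycle.
    """
    diff = np.abs(curr.astype(np.int16) - warped_prev.astype(np.int16))
    return (diff > th_fd).astype(np.uint8)   # 1 marks a candidate moving pixel
```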

After frame differencing, the binary image is streamed into 7 × 7 median filtering. Seven lines of the image are buffered in the line buffer, providing a 7 × 7 pixel window for the median PE. The median computation can be performed in one clock cycle per processing window due to the short propagation delay, as only binary pixels are involved. Figure 10 shows the hardware logic design of the median PE.

Median filtering on a binary image can be computed by counting the number of asserted (binary '1') pixels in the window: if more than half of the pixels in the window (24 pixels) are asserted, the resultant pixel is '1', and '0' otherwise. Since the processing window moves only one pixel to the right for each computation during the raster scan, the current pixel count is computed by adding the rightmost column pixels of the current window to the previous count while subtracting the leftmost column pixels of the previous window. The final binary output pixel is produced by thresholding the current pixel count with 24 (half of the window size).
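This running-count trick avoids any sorting network; the sketch below reproduces the median PE in software (the 3-pixel border is simply skipped, a detail the paper does not specify):

```python
import numpy as np

def binary_median_7x7(img):
    """Sliding 7x7 binary median via an incremental column count.

    The count of asserted pixels is updated by adding the incoming
    rightmost column and subtracting the departing leftmost column,
    then thresholded at 24 (majority of the 49 window pixels).
    """
    h, w = img.shape
    out = np.zeros_like(img)
    for y in range(3, h - 3):
        count = int(img[y - 3:y + 4, 0:7].sum())         # first full window
        out[y, 3] = 1 if count > 24 else 0
        for x in range(4, w - 3):
            count += int(img[y - 3:y + 4, x + 3].sum())  # new rightmost column
            count -= int(img[y - 3:y + 4, x - 4].sum())  # old leftmost column
            out[y, x] = 1 if count > 24 else 0
    return out
```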

As dilation is also a 7 × 7 window-based process, it uses a similar line buffering technique as median filtering. However, only a simple logical OR operation is performed on all window pixels. Due to its simplicity, the dilation PE also computes in one clock cycle, resulting in a stream of binary image with the detected regions of moving targets.

5. Experimental Results

5.1. Verification of Proposed SoC. The proposed moving target detection SoC is verified in offline detection mode using the database in [41]. The test videos are 640 × 480 pixels in size and are greyscaled prior to the verification process. The test videos are transferred to the system for computation via a USB mass storage device. After performing the detection in the SoC, the image results are displayed on VGA and also stored on the USB drive for verification. Figure 11 shows the moving target detection results from the proposed SoC on different sample videos. The detected regions (red) are overlaid on the input frame. In most cases, the proposed SoC is able to detect the moving target in consecutive frames.

However, there are several limitations in this work. Block matching may not give a good motion estimation result if the extracted blocks lack texture (their pixel intensities are similar). Moreover, the detected region of a moving target may contain cavities or be split into multiple smaller regions, as only simple frame differencing is applied in the proposed system. Additional postprocessing to produce better detected blobs by merging split regions is beyond the scope of this work.

As the stochastic RANSAC algorithm is terminated after a constant time step for each frame, image registration errors may occur, producing incorrect ego-motion estimates. This could be mitigated by accelerating the RANSAC algorithm to allow more iterations, using dedicated hardware or a high-performance general-purpose processor.

5.2. Performance Evaluation of Detection Algorithm. The performance evaluation of the implemented detection algorithm uses the mathematical performance metrics in [42], which involve the following parameters:

(i) True positive (TP): a detected moving object.

(ii) False positive (FP): a detected region that does not correspond to any moving object.

(iii) False negative (FN): a nondetected moving object.

(iv) Detection rate (DR): the ratio of TP to the combination of TP and FN, as formulated in

    DR = TP / (TP + FN).    (4)

(v) False alarm rate (FAR): the ratio of FP to all positive detections, as defined in

    FAR = FP / (TP + FP).    (5)
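Both metrics reduce to simple ratios over the evaluation counts; a small helper with purely illustrative numbers (not taken from the paper's experiments):

```python
def detection_metrics(tp, fp, fn):
    """Equations (4) and (5): detection rate and false alarm rate."""
    dr = tp / (tp + fn) if (tp + fn) else 0.0
    far = fp / (tp + fp) if (tp + fp) else 0.0
    return dr, far

dr, far = detection_metrics(tp=950, fp=210, fn=50)
print(f"DR = {dr:.3f}, FAR = {far:.3f}")   # DR = 0.950, FAR = 0.181
```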


[Figure 11: Detected regions from the proposed moving target detection SoC on different sample videos in [41]. (a)–(d) Frames 255, 275, 295, and 315 of video V3V100003 004; (e)–(h) frames 1000, 1020, 1040, and 1060 of video V3V100004 003; (i)–(l) frames 600, 620, 640, and 660 of video V4V100007 017.]

[Figure 12: Evaluation of performance metrics TP, FP, and FN based on ground truth boxes (blue) and the detected regions (red).]

To obtain the performance metrics, ground truth regions are manually labelled in several frames of the test videos. A bounding box is drawn across each moving object to indicate its ground truth region in every frame, as depicted in Figure 12. Prior to evaluation, a simple postprocessing step filters out detected regions smaller than 15 pixels in width or 15 pixels in height. A detected moving object (TP) has detected regions within its bounded ground truth area, while a nondetected moving object (FN) has no detected region overlapping its ground truth area. A detected region that does not overlap any ground truth region is considered a false positive (FP).

The detection performance is evaluated over different parameter configurations. The DR and FAR for 1000 test frames using different numbers of blocks (the density of ego-motion estimation) m × n in area-based registration and different frame differencing thresholds th_fd are depicted in Table 5 and Figure 13.

The experimental results show that DR is almost similar across different densities of ego-motion estimation but decreases with th_fd. Although a higher density in the proposed work imposes lower displacement limits d_m and d_n, as discussed in Section 3.1, most point-to-point displacements do not exceed these limits due to the slow UAV movement in most frames of the test dataset. On the contrary, a higher value of th_fd may filter out a moving object if the intensities of the object pixels and the background pixels are almost similar.

FAR decreases with the density of ego-motion estimation due to the higher quality of the image registration process, but it increases if most frames exceed the displacement limits d_m and d_n. However, a false registration caused by the displacement limits results in a huge blob of foreground and does not greatly increase FAR. Although higher values of th_fd decrease the false detection rate, they also produce smaller foreground areas for all detected moving objects, as pixels with intensity almost similar to the background are thresholded out.

5.3. Speed Comparison with Full Software Implementation. The computation speed of the proposed moving target detection SoC is compared with software computation on different platforms, including a modern CPU (Intel Core i5) in a desktop computer and an embedded processor (ARM). Table 6 illustrates the comparison of computation frame rates and the hardware speed-up between the proposed system and the software implementations using the test videos in [41].


[Figure 13: DR and FAR for different densities of ego-motion estimation m × n and frame differencing thresholds th_fd. (a) DR; (b) FAR.]

Table 5: Performance evaluation in terms of DR and FAR for 1000 frames using different densities of ego-motion estimation m × n and frame differencing thresholds th_fd.

m × n   th_fd   DR      FAR
12      15      0.958   0.643
12      20      0.954   0.331
12      25      0.949   0.194
24      15      0.957   0.568
24      20      0.950   0.324
24      25      0.945   0.101
35      15      0.958   0.548
35      20      0.952   0.215
35      25      0.947   0.090
48      15      0.959   0.539
48      20      0.952   0.253
48      25      0.944   0.079
70      15      0.958   0.509
70      20      0.951   0.188
70      25      0.946   0.075
88      15      0.960   0.489
88      20      0.951   0.219
88      25      0.947   0.074
108     15      0.958   0.483
108     20      0.952   0.168
108     25      0.946   0.058
140     15      0.958   0.499
140     20      0.951   0.187
140     25      0.946   0.059
165     15      0.958   0.474
165     20      0.953   0.214
165     25      0.947   0.068
192     15      0.959   0.478
192     20      0.952   0.169
192     25      0.946   0.092

Table 6: Computation speed comparison of the proposed system with different software implementations using area-based and feature-based registrations.

Platform             Frequency  Registration technique  Frame rate (fps)  Hardware speed-up
Proposed SoC         100 MHz    Area-based              30                1
Intel Core i5-4210U  1.70 GHz   Area-based              4.26              7.04
                                Feature-based           13.11             2.29
ARM1176JZF           700 MHz    Area-based              0.20              150
                                Feature-based           0.56              53.57

As feature-based image registration computes faster in software than area-based registration, the speed performance of the feature-based method is also included for comparison. In the feature-based implementation, features are first detected in each frame. The detected features from the current frame are cross-correlated with features from the previous frame, while the RANSAC algorithm is used to estimate the ego-motion between frames. After compensating the ego-motion, the segmentation of moving objects uses the same processes as the proposed system. To further optimize the software implementation in terms of speed, a fast feature detection algorithm [30] is utilized. As the number of features affects the computation time of the feature matching step, only the 100 strongest features in each frame are selected for processing. However, the performance evaluation does not consider multithreaded software execution.


Table 7: Resource usage of the proposed moving target detection SoC.

Logic units                    Used     Utilization (%)
Total combinational functions  15,161   13
Total registers                10,803   9
Total memory bits              521,054  13
Embedded multipliers           27       5
FPGA device: Altera Cyclone IV

Based on the experimental results, the speed performance of the proposed moving target detection SoC surpasses the optimized software computation by 2.29 times and 53.57 times compared with the implementations on a modern CPU and an embedded CPU, respectively. The software computation (RANSAC) in the HW/SW codesign of the proposed system creates a speed bottleneck, limiting the maximum throughput to 30 fps. The processing frame rate of the proposed system could be further improved by using fully dedicated hardware.

5.4. Resource Utilization. The overall hardware resource utilization of the complete system is illustrated in Table 7. This prototype of a real-time moving object detection system utilizes less than 20 percent of the total resources of the Altera Cyclone IV FPGA device. As the proposed system uses off-chip memory components for frame buffering, FPGA on-chip memory is used only for line buffering in streaming processes (e.g., block matching and median filtering) and for storing intermediate results (e.g., point pairs after block matching). Thus, the low resource usage of the proposed system leaves abundant hardware space for other processes, such as target tracking or classification, to be developed in the future.

6. Conclusions

Moving target detection is a crucial step in most computer vision problems, especially for UAV applications. On-chip detection without the need for real-time video transmission to the ground will provide immense benefit to diverse applications such as military surveillance and resource exploration. To perform this complex embedded video processing on-chip, an FPGA-based system is desirable due to the potential parallelism of the algorithm.

This paper proposed a moving target detection system using an FPGA to enable an autonomous UAV that performs the computer vision algorithm on the flying platform. The proposed system is prototyped using an Altera Cyclone IV FPGA device on a Terasic DE2-115 development board mounted with a TRDB-D5M camera. The system is developed as a HW/SW codesign using dedicated hardware with a Nios II software processor (booted with embedded Linux) running at a 100 MHz clock rate. As stream-oriented hardware with pipelined processing is utilized, the proposed system achieves real-time capability with a processing speed of 30 frames per second on 640 × 480 live video. Experimental results show that the proposed SoC performs 2.29 times and 53.57 times faster than optimized software computation on a modern desktop computer (Intel Core i5) and an embedded processor (ARM), respectively. In addition, the proposed moving target detection uses less than 20 percent of the total resources of the FPGA device, allowing other hardware accelerators to be implemented in the future.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

The authors would like to express their gratitude to Universiti Teknologi Malaysia (UTM) and the Ministry of Science, Technology and Innovation (MOSTI), Malaysia, for supporting this research work under Research Grants 01-01-06-SF1197 and 01-01-06-SF1229.

References

[1] A. Ahmed, M. Nagai, C. Tianen, and R. Shibasaki, "UAV based monitoring system and object detection technique development for a disaster area," International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. 37, pp. 373–377, 2008.
[2] B. Coifman, M. McCord, R. Mishalani, M. Iswalt, and Y. Ji, "Roadway traffic monitoring from an unmanned aerial vehicle," IEE Proceedings - Intelligent Transport Systems, vol. 153, no. 1, pp. 11–20, 2006.
[3] K. Kanistras, G. Martins, M. J. Rutherford, and K. P. Valavanis, "Survey of unmanned aerial vehicles (UAVs) for traffic monitoring," in Handbook of Unmanned Aerial Vehicles, pp. 2643–2666, Springer, 2015.
[4] K. Nordberg, P. Doherty, G. Farneback et al., "Vision for a UAV helicopter," in Proceedings of the International Conference on Intelligent Robots and Systems (IROS '02), Workshop on Aerial Robotics, pp. 29–34, Lausanne, Switzerland, October 2002.
[5] D. Zamalieva and A. Yilmaz, "Background subtraction for the moving camera: a geometric approach," Computer Vision and Image Understanding, vol. 127, pp. 73–85, 2014.
[6] M. Genovese and E. Napoli, "ASIC and FPGA implementation of the Gaussian mixture model algorithm for real-time segmentation of high definition video," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 22, no. 3, pp. 537–547, 2014.
[7] F. Kristensen, H. Hedberg, H. Jiang, P. Nilsson, and V. Owall, "An embedded real-time surveillance system: implementation and evaluation," Journal of Signal Processing Systems, vol. 52, no. 1, pp. 75–94, 2008.
[8] H. Jiang, H. Ardo, and V. Owall, "A hardware architecture for real-time video segmentation utilizing memory reduction techniques," IEEE Transactions on Circuits and Systems for Video Technology, vol. 19, no. 2, pp. 226–236, 2009.
[9] M. Genovese and E. Napoli, "FPGA-based architecture for real time segmentation and denoising of HD video," Journal of Real-Time Image Processing, vol. 8, no. 4, pp. 389–401, 2013.
[10] A. Lopez-Bravo, J. Diaz-Carmona, A. Ramirez-Agundis, A. Padilla-Medina, and J. Prado-Olivarez, "FPGA-based video system for real time moving object detection," in Proceedings of the 23rd International Conference on Electronics, Communications and Computing (CONIELECOMP '13), pp. 92–97, IEEE, Cholula, Mexico, March 2013.
[11] T. Kryjak, M. Komorkiewicz, and M. Gorgon, "Real-time moving object detection for video surveillance system in FPGA," in Proceedings of the Conference on Design and Architectures for Signal and Image Processing (DASIP '11), pp. 1–8, IEEE, Tampere, Finland, November 2011.
[12] A. Mittal and D. Huttenlocher, "Scene modeling for wide area surveillance and image synthesis," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 160–167, IEEE, June 2000.
[13] A. Price, J. Pyke, D. Ashiri, and T. Cornall, "Real time object detection for an unmanned aerial vehicle using an FPGA based vision system," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '06), pp. 2854–2859, IEEE, Orlando, Fla, USA, May 2006.
[14] G. J. Garcia, C. A. Jara, J. Pomares, A. Alabdo, L. M. Poggi, and F. Torres, "A survey on FPGA-based sensor systems: towards intelligent and reconfigurable low-power sensors for computer vision, control and signal processing," Sensors, vol. 14, no. 4, pp. 6247–6278, 2014.
[15] S. Ali and M. Shah, "Cocoa: tracking in aerial imagery," in Airborne Intelligence, Surveillance, Reconnaissance (ISR) Systems and Applications III, vol. 6209 of Proceedings of SPIE, Orlando, Fla, USA, April 2006.
[16] J. Xiao, C. Yang, F. Han, and H. Cheng, "Vehicle and person tracking in aerial videos," in Multimodal Technologies for Perception of Humans, pp. 203–214, Springer, 2008.
[17] W. Yu, X. Yu, P. Zhang, and J. Zhou, "A new framework of moving target detection and tracking for UAV video application," in Proceedings of the International Archives of the Photogrammetry, Remote Sensing and Spatial Information Science, vol. 37, Beijing, China, 2008.
[18] V. Reilly, H. Idrees, and M. Shah, "Detection and tracking of large number of targets in wide area surveillance," in Computer Vision—ECCV 2010: 11th European Conference on Computer Vision, Heraklion, Crete, Greece, September 5–11, 2010, Proceedings, Part III, vol. 6313 of Lecture Notes in Computer Science, pp. 186–199, Springer, Berlin, Germany, 2010.
[19] J. Wang, Y. Zhang, J. Lu, and W. Xu, "A framework for moving target detection, recognition and tracking in UAV videos," in Affective Computing and Intelligent Interaction, vol. 137 of Advances in Intelligent and Soft Computing, pp. 69–76, Springer, Berlin, Germany, 2012.
[20] S. A. Cheraghi and U. U. Sheikh, "Moving object detection using image registration for a moving camera platform," in Proceedings of the IEEE International Conference on Control System, Computing and Engineering (ICCSCE '12), pp. 355–359, IEEE, Penang, Malaysia, November 2012.
[21] Y. Zhang, X. Tong, T. Yang, and W. Ma, "Multi-model estimation based moving object detection for aerial video," Sensors, vol. 15, no. 4, pp. 8214–8231, 2015.
[22] Q. Yu and G. Medioni, "A GPU-based implementation of motion detection from a moving platform," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPR '08), pp. 1–6, Anchorage, Alaska, USA, June 2008.
[23] A. Laika, J. Paul, C. Claus, W. Stechele, A. E. S. Auf, and E. Maehle, "FPGA-based real-time moving object detection for walking robots," in Proceedings of the 8th IEEE International Workshop on Safety, Security and Rescue Robotics (SSRR '10), pp. 1–8, IEEE, Bremen, Germany, July 2010.
[24] B. Tippetts, S. Fowers, K. Lillywhite, D.-J. Lee, and J. Archibald, "FPGA implementation of a feature detection and tracking algorithm for real-time applications," in Advances in Visual Computing, pp. 682–691, Springer, 2007.
[25] K. May and N. Krouglicof, "Moving target detection for sense and avoid using regional phase correlation," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '13), pp. 4767–4772, IEEE, Karlsruhe, Germany, May 2013.
[26] M. E. Angelopoulou and C.-S. Bouganis, "Vision-based ego-motion estimation on FPGA for unmanned aerial vehicle navigation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 24, no. 6, pp. 1070–1083, 2014.
[27] I. M. El-Emary and M. M. A. El-Kareem, "On the application of genetic algorithms in finger prints registration," World Applied Sciences Journal, vol. 5, no. 3, pp. 276–281, 2008.
[28] A. A. Goshtasby, 2-D and 3-D Image Registration for Medical, Remote Sensing, and Industrial Applications, John Wiley & Sons, New York, NY, USA, 2005.
[29] C. Harris and M. Stephens, "A combined corner and edge detector," in Proceedings of the 4th Alvey Vision Conference, vol. 15, pp. 147–151, 1988.
[30] E. Rosten and T. Drummond, "Machine learning for high-speed corner detection," in Computer Vision—ECCV 2006, pp. 430–443, Springer, 2006.
[31] H. Bay, T. Tuytelaars, and L. Van Gool, "SURF: speeded up robust features," in Computer Vision—ECCV 2006, A. Leonardis, H. Bischof, and A. Pinz, Eds., vol. 3951 of Lecture Notes in Computer Science, pp. 404–417, Springer, 2006.
[32] G. R. Rodriguez-Canosa, S. Thomas, J. del Cerro, A. Barrientos, and B. MacDonald, "A real-time method to detect and track moving objects (DATMO) from unmanned aerial vehicles (UAVs) using a single camera," Remote Sensing, vol. 4, no. 4, pp. 1090–1111, 2012.
[33] B. Liu and A. Zaccarin, "New fast algorithms for the estimation of block motion vectors," IEEE Transactions on Circuits and Systems for Video Technology, vol. 3, no. 2, pp. 148–157, 1993.
[34] R. Li, B. Zeng, and M. L. Liou, "New three-step search algorithm for block motion estimation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 4, no. 4, pp. 438–442, 1994.
[35] L.-M. Po and W.-C. Ma, "A novel four-step search algorithm for fast block motion estimation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 6, no. 3, pp. 313–317, 1996.
[36] S. Zhu and K.-K. Ma, "A new diamond search algorithm for fast block-matching motion estimation," IEEE Transactions on Image Processing, vol. 9, no. 2, pp. 287–290, 2000.
[37] L. De Vos and M. Stegherr, "Parameterizable VLSI architectures for the full-search block-matching algorithm," IEEE Transactions on Circuits and Systems, vol. 36, no. 10, pp. 1309–1316, 1989.
[38] M. A. Fischler and R. C. Bolles, "Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography," Communications of the ACM, vol. 24, no. 6, pp. 381–395, 1981.
[39] J. W. Tang, N. Shaikh-Husin, and U. U. Sheikh, "FPGA implementation of RANSAC algorithm for real-time image geometry estimation," in Proceedings of the 11th IEEE Student Conference on Research and Development (SCOReD '13), pp. 290–294, IEEE, Putrajaya, Malaysia, December 2013.
[40] O. Chum and J. Matas, "Randomized RANSAC with T_{d,d} test," in Proceedings of the British Machine Vision Conference, vol. 2, pp. 448–457, September 2002.
[41] DARPA, SDMS Public Web Site, 2003, https://www.sdms.afrl.af.mil.
[42] A. F. M. S. Saif, A. S. Prabuwono, and Z. R. Mahayuddin, "Motion analysis for moving object detection from UAV aerial images: a review," in Proceedings of the International Conference on Informatics, Electronics and Vision (ICIEV '14), pp. 1–6, IEEE, Dhaka, Bangladesh, May 2014.



current pixel in video stream as a reference for extractionprocess Template blocks from incoming frames are extractedand stored temporarily into block memory As each block isextracted line-by-line in raster scan blockmemory is dividedinto nine-rowmemories as illustrated in Figure 6(a)with eachof which being used to store one pixel row in template blocksWhen video stream reaches the block position each pixelrow is loaded into each rowmemory from the correspondingtap of the line buffer Block coordinates are also stored in aseparate FIFO to keep track of its positionSince only one SAD processor is used for matching 119898 times119899 blocks as mentioned in Section 31 the template blockhas to be swapped according to the corresponding searcharea during raster scan Hence row memory is constructedwith two FIFOs upper and lower FIFO as illustrated inFigure 6(b) to enable block swapping during matchingprocess Template blocks are stored into upper FIFO duringextraction process During matching process each line ofraster scan enters eight different search areas to match eightdifferent template blocks respectively Hence one row oftemplate blocks is cached in lower FIFO and is repeatedlyused until the end of their search areas (reaching next row ofsearch areas) Upon reaching each new row of search areastemplate blocks in lower FIFO are replaced with new row oftemplate blocks from upper FIFO At the last line of rasterscan the lower FIFO is flushed to prevent overflow

International Journal of Reconfigurable Computing 9

CV ControlVector(CV)

Tap 0 Tap 1

Row 1pixels

Template blockscoordinate

Template blockscoordinate

Template blocks pixels

Row 2pixels

Tap 8

Row 8pixels

CV

Rowmemory

Rowmemory

Coordinate memory (FIFO)

Row memory

middot middot middot

middot middot middot

middot middot middot

(a) Block memory consisting of nine-row memories

wr_upper

rd_upper

wr_lower

rd_lower

sel1

sel2

CV_in CV_out

Tap

Row pixels

UpperFIFO

LowerFIFO

1 0

0

Controlregisters

1 0

(b) Row memory contains an upper FIFO and lower FIFO

Figure 6 Block memory architecture for storing template blocks

Blockextractor

Bestscore

tracker

Line buffer

Block memory

SAD processor

Video stream input

Control vector

Matching score

Point pairs

Template blocks Blocks coordinate

9 times 9 window pixels

Figure 7 Stream-oriented hardware architecture of blockmatching

In order to efficiently extract and match all blocksdifferent Control Vector (CV) as illustrated in Table 3 is sentto perform different reading and writing operations in blockmemory based on the current position in raster scan Bothreads andwrites are independent of each other and are able tobe executed at the same time Pixels are processed one by onein 81 cycles to complete a window Both writing and reading

Table 3 Control Vector (CV) for different read andwrite operationsof block memory

Position of raster scan WriteupperReadupper

Writelower

Readlower sel1 sel2

Entering templateblock position 1 x x x x x

Entering first searcharea row x 1 1 0 1 1

Entering next searcharea row x 1 1 1 1 1

Reentering samesearch area row x 0 1 1 0 0

Leaving last searcharea row x 0 0 1 0 0

processes require 9 cycles for each row memory passing CVfrom the first row memory to the next row memory untilthe end to complete a 81-pixel write or read operation of atemplate blockSAD processor performs the correlation of the template

blocks from previous frame with all possible blocks fromcurrent frame according to the search area Extracted blockpixels are read from block memory while window pixels insearch areas are provided from the taps of the line bufferThetotal number of required PEs is the total number of pixelsin a window The process is pipelined such that each pixelis computed in each PE as soon as it is obtained from theline buffer Matching score of each window can be obtainedin every cycle after a fixed latencyLastly the best score tracker constantly stores and updates

the best matching score for each template block within itscorresponding search area The matching score is compared

10 International Journal of Reconfigurable Computing

x2 H2y1 H1

H0y2 H5 x1x1 H3

y1 H4

minus times times minus

minus minus

times times

+

+

+

abs abs

sqr sqr

min

acc

Pipelineregister

Pipelineregister

Pipelineregister

Fitness score

th2dist

Figure 8 Hardware datapath of fitness scoring in RANSAC accel-erator

among the same search area and the coordinates of the best-scored blocks are preserved At the end of each search areathe coordinates of the best pairs (template blocks and theirbest-scored blocks) are sent to RANSAC module for nextprocessing Hence the proposed block matching hardware isable to produce point-to-point motion (point pairs) of everytwo successive frames in streaming video at line rate

422 RANSAC Hardware Accelerator RANSAC hardwaredesign in [39] is utilized in this work which acceleratesonly fitness scoring step As described in Algorithm 2fitness scoring is an iterative process which performs similarcomputation to all data samples based on hypothesis modelHence this data intensive process is executed in pipelineddatapath as illustrated in Figure 8 A control unit is utilizedto read input data provided by block matching from bufferand stream these inputs to the datapath unit at every clockcycleThe datapath unit utilizes three stages of pipeline with

the aim of isolating multiplication processes thus allowingfaster clock rate The first stage pipeline registers are locatedright after the first multiplication while the other two stagesof pipeline registers enclose the squaring processes Theindividual score is accumulated in the last stage producingtotal final fitness score The accumulator is reset on each newset of hypothesis Thus the total number of cycles required

Table 4 Fixed point precision of fitness scoring inputs

Parameter Number of bits Number rangeInteger Fraction

1199091 1199101 1199092 1199102

11 0 [minus1024 1024)1198670119867111986731198674

4 12 [minus8 8)11986721198675

11 5 [minus1024 1024)

Detected movingtarget

Affineparameters

fromsoftware

AffinePE

Framereader

Address

Framedifferencing PE

Binary image stream

MedianPE

Line buffer

DilationPE

Line buffer

from framebuffer

Fiminus2 and Fiminus3

Fiminus2F998400iminus3

Figure 9 Hardware architecture for object segmentation

for fitness score computation is the number of overall dataplus the four-cycle latencyAlthough fitness scoring could require floating point

computations the datapath unit uses suitable fixed pointprecision for each stage SinceNios II is a 32-bit processor theaffineparameters in hypothesismodel (119867

0to1198676) are properly

scaled to different precision of 16-bit fixed points as describedin Table 4 so that two affine parameters can be assigned in asingle 32-bit write instruction As this system is targeted for640 times 480 pixelsrsquo video all input coordinates (119909

1 1199101 1199092 and

1199102) are scaled to 11 bits

43 Object Segmentation Hardware Architecture As objectsegmentation can be performed in one raster scan a stream-oriented architecture is proposed as illustrated in Figure 9 Allsubprocesses are executed in pipeline on the streaming videowithout additional frame buffering Object segmentationprocess is initiated by software processor after providing theaffine parameters from RANSAC to affine PE Two frames(119865119894minus2and 119865

119894minus3as described in Table 2) from frame buffer

(SRAM) are required to segment the moving targetBased on the affine parameters from RANSAC affine PE

uses reverse mapping technique to find each pixel location inprevious frame (119865

119894minus3) using (3) and generates their addresses

in frame buffer (SRAM) Frame readers fetch the previ-ous frame (119865

119894minus3) pixel-by-pixel according to the generated

addresses from frame buffer thus constructing a stream oftransformed frame which is denoted as 1198651015840

119894minus3

International Journal of Reconfigurable Computing 11

Leftmostcolumn pixels

Rightmostcolumn pixels

Median output stream

Addertree

Addertree

24

pixelsTo line buffer

Binary image stream

7 times 7 window

minus

+

gtgt

Figure 10 Hardware architecture of median PE

By synchronizing the streams of both frames framedifferencing can be executed in pipeline as soon as one pixelfrom each frame is obtained Hence one pixel in currentframe (119865

119894minus2) and one pixel in transformed frame (1198651015840

119894minus3) are

fetched alternatively from their corresponding memory loca-tions by frame reader constructing two synchronized streamsof 119865119894minus2and 1198651015840

119894minus3frames Frame differencing PE performs

pixel-to-pixel absolute subtraction and thresholding on thestreams The frame differencing PE is able to compute inone cycle per pixel A configurable threshold value thfd isused after the subtraction yielding a stream of binary imagewithout buffering the whole frameAfter frame differencing the binary image is streamed

into 7 times 7 median filtering Seven lines of the image arebuffered in the line buffer providing 7 times 7 pixels window forthe median PE to perform the median computation Mediancomputation can be performed in one clock cycle for eachprocessing window due to short propagation delay as onlybinary pixels are involved Figure 10 shows the hardware logicdesign of median PEMedian filtering can be computed by counting the num-

ber of asserted (binary 1) pixels in the window If morethan half the pixels in the window (24 pixels) are assertedthe resultant pixel is ldquo1rdquo or ldquo0rdquo otherwise Since processingwindow will move only one pixel to the right for each com-putation during raster scan current pixel count is computedby adding the previous pixel count and rightmost columnpixels in the current window while subtracting the leftmost

column pixels in the previous window Final binary outputpixel is produced by thresholding the current pixel count with24 (half of window size)As dilation is also a 7times 7window-based processing it uses

similar line buffering technique asmedian filtering Howeveronly simple logical OR operation is performed on all windowpixels Due to its simplicity dilation PE can also be computedin one clock cycle resulting in the stream of binary imagewith detected region of moving targets

5 Experimental Results

51 Verification of Proposed SoC Theproposedmoving targetdetection SoC is verified in offline detection mode using thedatabase in [41] Test videos are 640 times 480 pixels in sizeand are greyscaled prior to the verification process The testvideos are transferred to the system for computation via aUSB mass storage device After performing the detection inSoC the image results are displayed on VGA and also storedon USB drive for verification Figure 11 shows the movingtarget detection result from the proposed SoC using differentsample videos The detected regions (red) are overlaid on theinput frame In most cases the proposed SoC is able to detectthe moving target in consecutive framesHowever there are several limitations in this work Block

matching may not give a goodmotion estimation result if theextracted blocks do not have texture (the pixels intensity aresimilar) Moreover the detected region of moving target mayappear in cavity or multiple split of smaller regions as onlysimple frame differencing is applied in the proposed systemAdditional postprocessing to produce better detected blob bymerging split regions is out of the scope in this workAs the stochastic RANSAC algorithm is terminated after

a constant time step for each frame image registration errormay occur which produces incorrect ego-motion estimationThis could be mitigated by accelerating RANSAC algorithmto ensure more iterations using dedicated hardware or highperformance general purpose processor

52 Performance Evaluation of Detection Algorithm Theper-formance evaluation of the implemented detection algorithmuses the Mathematical Performance Metric in [42] thatinvolves several parameters as follows

(i) True positive TP the detected moving object(ii) False positive FP detected regions that do not corre-spond to any moving object

(iii) False negative FN the nondetected moving object(iv) Detection rate DR the ratio of TP with the combina-

tion of TP and FN as formulated in

DR = TPTP + FN

(4)

(v) False alarm rate FAR the ratio between FP in allpositive detection as defined in

FAR = FPTP + FP

(5)

12 International Journal of Reconfigurable Computing

(a) Frame 255 (b) Frame 275 (c) Frame 295 (d) Frame 315

(e) Frame 1000 (f) Frame 1020 (g) Frame 1040 (h) Frame 1060

(i) Frame 600 (j) Frame 620 (k) Frame 640 (l) Frame 660

Figure 11 Detected regions from the proposed moving target detection SoC on different sample videos in [41] Video numbers (a)ndash(d)V3V100003 004 video numbers (e)ndash(h) V3V100004 003 and video numbers (i)ndash(l) V4V100007 017

FP

FNTP

Figure 12 Evaluation of performancemetrics TP FP and FN basedon ground truth boxes (blue) and the detected region (red)

To obtain the performance metrics ground truth regionsare manually labelled in several frames of test videos Abounding box is drawn across each moving object to indicatethe ground truth region of every frame as depicted in Fig-ure 12 A simple postprocessing is performed on the detectedregion by filtering out the detected region smaller than 15pixelsrsquo width or 15 pixelsrsquo height prior to the evaluationA detected moving object (TP) has detected regions in itsbounded ground truth area while a nondetected movingobject (FN) has no detected region overlapping with itsground truth area Detected region that does not overlappwith any ground truth region is considered as false positive(FP)The detection performance is evaluated on different

parameters configuration The DR and FAR for 1000 testframes using different number of blocks (density in ego-motion estimation) 119898 times 119899 in area-based registration and

frame differencing threshold thfd are depicted in Table 5 andFigure 13The experiment results show that DR is almost similar

for different density of ego-motion estimation but decreaseswith thfd Although higher density in the proposed work haslower displacement limitation 119889

119898and 119889

119899as discussed in

Section 31 most of the point-to-point displacements do notexceed the limitation due to slowUAVmovement in themostframes of the test dataset On the contrary higher value of thfdmay filter out the moving object if the differences in intensityof the object pixels and background pixels are almost similarFAR decreases with density in ego-motion estimation

due to the higher quality in image registration process butincreases if most frames exceed the displacement limitation119889119898and 119889

119899 However false registration due to displacement

limitation results in a huge blob of foreground but does notgreatly increase FAR Although higher values of thfd decreasethe false detection rate they also produce smaller foregroundarea for all detected moving objects as pixels almost similarintensity with background will be thresholded

53 Speed Comparison with Full Software ImplementationThe computation speed of the proposed moving target detec-tion SoC is compared with software computation in differentplatforms including modern CPU (Intel Core i5) in desktopcomputer and embedded processor (ARM) Table 6 illustratesthe comparison of computation frame rate and hardware

International Journal of Reconfigurable Computing 13

1 2 3 4 5 6 7 8 9 100944

0946

0948

095

0952

0954

0956

0958

096

0962

0964

Det

ectio

n ra

teD

R

Density of ego-motion m times n

thfd = 15

thfd = 20

thfd = 25

(a) DR

1 2 3 4 5 6 7 8 9 100

01

02

03

04

05

06

07

False

alar

m ra

te F

AR

Density of ego-motion m times n

thfd = 15

thfd = 20

thfd = 25

(b) FAR

Figure 13 DR and FAR for different density in ego-motion estimation119898 times 119899 and frame differencing threshold thfd

Table 5 Performance evaluation in terms of DR and FAR for 1000frames using different density in ego-motion estimation119898times 119899 andframe differencing threshold thfd

119898 times 119899 thfd DR FAR12 15 0958 064312 20 0954 033112 25 0949 019424 15 0957 056824 20 0950 032424 25 0945 010135 15 0958 054835 20 0952 021535 25 0947 009048 15 0959 053948 20 0952 025348 25 0944 007970 15 0958 050970 20 0951 018870 25 0946 007588 15 0960 048988 20 0951 021988 25 0947 0074108 15 0958 0483108 20 0952 0168108 25 0946 0058140 15 0958 0499140 20 0951 0187140 25 0946 0059165 15 0958 0474165 20 0953 0214165 25 0947 0068192 15 0959 0478192 20 0952 0169192 25 0946 0092

Table 6 Computation speed comparison of the proposed sys-tem with different software implementation using area-based andfeature-based registrations

Platform Frequency Registrationtechnique

Framerate

Hardwarespeed-up

ProposedSoC 100MHz Area-based 30 1

Intel Corei5-4210U 170GHz Area-based 426 704

Feature-based 1311 229

ARM1176JZF 700MHz Area-based 020 150Feature-based 056 5357

speed-up between the proposed system and other softwareimplementations using test videos in [41]As feature-based image registration has faster computa-

tion in software implementation comparing to area-basedregistration speed performance of feature-based method isalso included for comparison In feature-based implementa-tion features are first detected in each frame The detectedfeatures from current frame are cross-correlatedwith featureswith previous framewhile RANSAC algorithm is used to esti-mate the ego-motion between frames After compensatingthe ego-motion segmentation ofmoving object uses the sameprocesses with the proposed system To further optimizethe software implementation in terms of speed performancea fast feature detection algorithm [30] is utilized As thenumber of features will affect the computation time in featurematching step only 100 strongest features in each frame areselected for processingHowever the performance evaluationdoes not consider multithreaded software execution

14 International Journal of Reconfigurable Computing

Table 7 Resources usage of the proposed moving target detectionSoC

Logic units Utilization ()Total combinational function 15161 13Total registers 10803 9Total memory bits 521054 13Embedded multiplier 27 5FPGA device Altera Cyclone IV

Based on experimental result the speed performanceof the proposed moving target detection SoC surpassesoptimized software computation by 229 times and 5357times compared with implementations in modern CPUand embedded CPU respectively The software computation(RANSAC) in HWSW codesign of the proposed system cre-ates speed bottleneck thus limiting the maximum through-put to 30 fps The processing frame rate of the proposed sys-tem can be further improved by using fully dedicated hard-ware

54 Resource Utilization The overall hardware resourcesutilization of the complete system is illustrated in Table 7This prototype of real-time moving object detection systemutilizes only less than 20 percent of total resources in AlteraCyclone IV FPGA device As the proposed system uses off-chip memory components for frame buffering FPGA on-chip memory is utilized only for line buffering in streamingprocess (eg block matching and median filtering) and stor-ing intermediate results (eg point pairs after block match-ing) Thus the low resource usage of the proposed systemprovides abundant hardware space for other processes such astarget tracking or classification to be developed in future

6 Conclusions

Moving target detection is a crucial step in most computervision problem especially for UAV applications On-chipdetection without the need of real-time video transmission toground will provide immense benefit to diverse applicationssuch as military surveillance and resource exploration Inorder to perform this complex embedded video processingon-chip FPGA-based system is desirable due to the potentialparallelism of the algorithmThis paper proposed a moving target detection system

using FPGA to enable autonomous UAVwhich is able to per-form the computer vision algorithm on the flying platformThe proposed system is prototyped using Altera CycloneIV FPGA device on Terasic DE2-115 development boardmounted with a TRDB-D5M camera This system is devel-oped as a HWSW codesign using dedicated hardware withNios II software processor (booted with embedded Linux)running at 100MHz clock rate As stream-oriented hardwarewith pipeline processing is utilized the proposed systemachieves real-time capability with 30 frames per secondprocessing speed on 640times 480 live video Experimental resultshows that the proposed SoC performs 229 times and 5357times faster than optimized software computation onmodern

desktop computer (Intel Core i5) and embedded processor(ARM) In addition the proposed moving target detectionuses only less than 20 percent of total resources in theFPGA device allowing other hardware accelerators to beimplemented in future


Figure 6: Block memory architecture for storing template blocks. (a) The block memory consists of nine row memories plus a coordinate memory (FIFO), with the Control Vector (CV), template block pixels, and template block coordinates passed from tap to tap; (b) each row memory contains an upper FIFO and a lower FIFO selected by control registers (wr_upper, rd_upper, wr_lower, rd_lower, sel1, sel2).

Figure 7: Stream-oriented hardware architecture of block matching. The video stream input feeds a line buffer; a block extractor writes template blocks and their coordinates into the block memory under the Control Vector, while the SAD processor correlates 9 × 9 window pixels against the template blocks and a best score tracker turns the matching scores into point pairs.

In order to efficiently extract and match all blocks, a different Control Vector (CV), as illustrated in Table 3, is sent to perform different reading and writing operations in the block memory based on the current position in the raster scan. Reads and writes are independent of each other and can be executed at the same time. Pixels are processed one by one, taking 81 cycles to complete a window. Both writing and reading processes require 9 cycles per row memory, passing the CV from the first row memory to the next until the end, to complete an 81-pixel write or read operation of a template block.

Table 3: Control Vector (CV) for different read and write operations of block memory.

Position of raster scan          | Write upper | Read upper | Write lower | Read lower | sel1 | sel2
Entering template block position |      1      |     x      |      x      |     x      |  x   |  x
Entering first search area row   |      x      |     1      |      1      |     0      |  1   |  1
Entering next search area row    |      x      |     1      |      1      |     1      |  1   |  1
Reentering same search area row  |      x      |     0      |      1      |     1      |  0   |  0
Leaving last search area row     |      x      |     0      |      0      |     1      |  0   |  0

The SAD processor performs the correlation of the template blocks from the previous frame with all candidate blocks from the current frame according to the search area. Extracted block pixels are read from the block memory, while window pixels in the search areas are provided by the taps of the line buffer. The total number of required PEs equals the number of pixels in a window. The process is pipelined such that each pixel is computed in a PE as soon as it is obtained from the line buffer, so the matching score of each window is obtained in every cycle after a fixed latency.

Lastly, the best score tracker constantly stores and updates the best matching score for each template block within its corresponding search area. The matching scores within the same search area are compared, and the coordinates of the best-scored blocks are preserved. At the end of each search area, the coordinates of the best pairs (template blocks and their best-scored blocks) are sent to the RANSAC module for the next processing step. Hence, the proposed block matching hardware produces point-to-point motion (point pairs) between every two successive frames of the streaming video at line rate.
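To make the behaviour of this subsection concrete, here is a minimal behavioural C sketch of what the SAD processor and best score tracker jointly compute for one template block: an exhaustive (full-search) match over the search area, keeping the displacement with the smallest sum of absolute differences. The 9 × 9 block size follows Figure 7; the search radius r, the frame layout, and the function names are illustrative assumptions, and the sketch runs serially where the hardware evaluates one candidate window per cycle.

```c
#include <stdint.h>
#include <stdlib.h>

#define B 9 /* template block is 9 x 9 pixels (Figure 7) */

/* Sum of absolute differences between two 9 x 9 windows in
   row-major frames of width w. */
static uint32_t sad_9x9(const uint8_t *a, const uint8_t *b, int w)
{
    uint32_t sad = 0;
    for (int y = 0; y < B; y++)
        for (int x = 0; x < B; x++)
            sad += (uint32_t)abs((int)a[y * w + x] - (int)b[y * w + x]);
    return sad;
}

/* Behavioural model of SAD processor + best score tracker for one
   template block at top-left (bx, by) in 'prev', searched within a
   radius r around the same position in 'curr' (w x h frames). */
void match_block(const uint8_t *prev, const uint8_t *curr, int w, int h,
                 int bx, int by, int r, int *best_dx, int *best_dy)
{
    uint32_t best = UINT32_MAX;
    for (int dy = -r; dy <= r; dy++) {
        for (int dx = -r; dx <= r; dx++) {
            int cx = bx + dx, cy = by + dy;
            if (cx < 0 || cy < 0 || cx + B > w || cy + B > h)
                continue; /* candidate window must lie inside the frame */
            uint32_t s = sad_9x9(&prev[by * w + bx], &curr[cy * w + cx], w);
            if (s < best) { /* best score tracker keeps the minimum SAD */
                best = s;
                *best_dx = dx;
                *best_dy = dy;
            }
        }
    }
}
```

In hardware, the inner double loop of sad_9x9 is flattened into the 81 PEs fed by the line buffer taps, so the matching score of a new candidate window emerges every cycle after the pipeline latency.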

4.2.2. RANSAC Hardware Accelerator. The RANSAC hardware design in [39] is utilized in this work, which accelerates only the fitness scoring step. As described in Algorithm 2, fitness scoring is an iterative process which performs a similar computation on all data samples based on the hypothesis model. Hence, this data-intensive process is executed in the pipelined datapath illustrated in Figure 8. A control unit reads the input data provided by block matching from a buffer and streams these inputs to the datapath unit at every clock cycle.

Figure 8: Hardware datapath of fitness scoring in the RANSAC accelerator. Three pipeline register stages separate the multiplications; the residuals of each point pair under the hypothesis ($H_0$ to $H_5$) are squared, summed, clipped against the squared distance threshold (min stage), and accumulated into the fitness score.

The datapath unit uses three pipeline stages with the aim of isolating the multiplication processes, thus allowing a faster clock rate. The first stage of pipeline registers is located right after the first multiplication, while the other two stages of pipeline registers enclose the squaring processes. The individual scores are accumulated in the last stage, producing the total fitness score. The accumulator is reset on each new hypothesis set. Thus, the total number of cycles required for a fitness score computation is the number of data samples plus the four-cycle latency.

Although fitness scoring could require floating point computations, the datapath unit uses a suitable fixed point precision for each stage. Since Nios II is a 32-bit processor, the affine parameters of the hypothesis model ($H_0$ to $H_5$) are scaled to different 16-bit fixed point precisions, as described in Table 4, so that two affine parameters can be assigned in a single 32-bit write instruction. As this system targets 640 × 480 pixel video, all input coordinates ($x_1$, $y_1$, $x_2$, and $y_2$) are scaled to 11 bits.

Table 4: Fixed point precision of fitness scoring inputs.

Parameter                  | Integer bits | Fraction bits | Number range
$x_1$, $y_1$, $x_2$, $y_2$ |      11      |       0       | [-1024, 1024)
$H_0$, $H_1$, $H_3$, $H_4$ |       4      |      12       | [-8, 8)
$H_2$, $H_5$               |      11      |       5       | [-1024, 1024)
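For reference, a behavioural C model of the fitness scoring computation is sketched below. It uses floating point for clarity instead of the fixed point formats of Table 4, and it assumes, from the layout of Figure 8, that the hypothesis maps $(x_1, y_1)$ onto $(x_2, y_2)$ and that each pair contributes its squared residual clipped at the distance threshold (the min and accumulate stages); under this reading, lower scores indicate better hypotheses.

```c
#include <stddef.h>

/* One point pair produced by block matching: (x1, y1) in the previous
   frame and its best-matched position (x2, y2) in the current frame. */
typedef struct { float x1, y1, x2, y2; } PointPair;

/* Behavioural model of the fitness scoring datapath for one RANSAC
   hypothesis H[0..5], assuming the affine form
   x' = H0*x + H1*y + H2,  y' = H3*x + H4*y + H5. */
float fitness_score(const PointPair *p, size_t n,
                    const float H[6], float th2_dist)
{
    float score = 0.0f;
    for (size_t i = 0; i < n; i++) {
        float dx = p[i].x2 - (H[0] * p[i].x1 + H[1] * p[i].y1 + H[2]);
        float dy = p[i].y2 - (H[3] * p[i].x1 + H[4] * p[i].y1 + H[5]);
        float d2 = dx * dx + dy * dy;
        score += (d2 < th2_dist) ? d2 : th2_dist; /* min(d2, th2) stage */
    }
    return score;
}
```

In the accelerator, this loop body is the three-stage pipeline fed with one point pair per clock, which is where the data-samples-plus-four-cycles figure above comes from.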

4.3. Object Segmentation Hardware Architecture. As object segmentation can be performed in one raster scan, a stream-oriented architecture is proposed, as illustrated in Figure 9. All subprocesses are executed in a pipeline on the streaming video without additional frame buffering. The object segmentation process is initiated by the software processor after it provides the affine parameters from RANSAC to the affine PE. Two frames ($F_{i-2}$ and $F_{i-3}$, as described in Table 2) from the frame buffer (SRAM) are required to segment the moving target.

Figure 9: Hardware architecture for object segmentation. Affine parameters from software drive the affine PE, whose addresses let the frame reader fetch $F_{i-3}$ from the frame buffer; the frame differencing PE combines it with $F_{i-2}$ into a binary image stream that passes through the median PE and the dilation PE (each with a line buffer) to yield the detected moving target.

Based on the affine parameters from RANSAC, the affine PE uses a reverse mapping technique to find each pixel location in the previous frame ($F_{i-3}$) using (3) and generates the corresponding addresses in the frame buffer (SRAM). The frame reader fetches the previous frame ($F_{i-3}$) pixel by pixel according to the generated addresses, thus constructing a stream of the transformed frame, denoted as $F'_{i-3}$.
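The reverse mapping itself is defined by (3) earlier in the paper; as a sketch, assuming the same 6-parameter affine form used by the fitness scoring datapath and row-major frame storage, the affine PE computes

$$
\begin{pmatrix} x' \\ y' \end{pmatrix}
=
\begin{pmatrix} H_0 & H_1 \\ H_3 & H_4 \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
+
\begin{pmatrix} H_2 \\ H_5 \end{pmatrix},
\qquad
\mathrm{addr}(x, y) = \lfloor y' \rceil \cdot 640 + \lfloor x' \rceil,
$$

where $(x, y)$ is the pixel position in the output raster scan, $(x', y')$ is the fetched location in $F_{i-3}$, and $\lfloor \cdot \rceil$ denotes rounding to the nearest pixel. Whether $H$ here is the estimated model or its inverse depends on the direction in which (3) is defined.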

Figure 10: Hardware architecture of the median PE. Adder trees sum the rightmost and leftmost column pixels of the 7 × 7 window from the binary image stream; the running count is updated accordingly and thresholded against 24 to produce the median output stream.

By synchronizing the streams of both frames, frame differencing can be executed in the pipeline as soon as one pixel from each frame is obtained. One pixel of the current frame ($F_{i-2}$) and one pixel of the transformed frame ($F'_{i-3}$) are fetched alternately from their corresponding memory locations by the frame reader, constructing two synchronized streams of the $F_{i-2}$ and $F'_{i-3}$ frames. The frame differencing PE performs pixel-to-pixel absolute subtraction and thresholding on the streams, computing one pixel per cycle. A configurable threshold value $th_{fd}$ is applied after the subtraction, yielding a binary image stream without buffering the whole frame.
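A serial C sketch of the affine PE, frame reader, and frame differencing PE chain follows. The rounding, the treatment of pixels that map outside the previous frame (emitted as background here), and the direction of the affine parameters are assumptions; in the actual pipeline these stages overlap rather than run as a loop.

```c
#include <stdint.h>
#include <math.h>

#define W 640
#define H 480

/* Affine PE + frame reader + frame differencing PE, modelled serially:
   each pixel (x, y) of the current frame F(i-2) is reverse-mapped into
   the previous frame F(i-3) with the RANSAC affine parameters Hp[6]
   (assumed here to map current-frame coordinates into the previous
   frame), fetched from the frame buffer, and compared against the
   current pixel. 'th_fd' is the configurable differencing threshold. */
void frame_difference(const uint8_t *f_cur, const uint8_t *f_prev,
                      const float Hp[6], uint8_t th_fd, uint8_t *binary)
{
    for (int y = 0; y < H; y++) {
        for (int x = 0; x < W; x++) {
            /* reverse mapping, rounded to the nearest pixel */
            int xp = (int)lroundf(Hp[0] * x + Hp[1] * y + Hp[2]);
            int yp = (int)lroundf(Hp[3] * x + Hp[4] * y + Hp[5]);
            uint8_t out = 0;
            if (xp >= 0 && xp < W && yp >= 0 && yp < H) {
                int diff = (int)f_cur[y * W + x] - (int)f_prev[yp * W + xp];
                if (diff < 0)
                    diff = -diff;         /* absolute subtraction */
                out = (diff > th_fd);     /* threshold to a binary pixel */
            }
            binary[y * W + x] = out;
        }
    }
}
```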

After frame differencing, the binary image is streamed into a 7 × 7 median filter. Seven lines of the image are buffered in the line buffer, providing a 7 × 7 pixel window for the median PE to perform the median computation. The median computation can be performed in one clock cycle per processing window due to the short propagation delay, as only binary pixels are involved. Figure 10 shows the hardware logic design of the median PE.

Median filtering is computed by counting the number of asserted (binary 1) pixels in the window. If more than half the pixels in the window (24 pixels) are asserted, the resultant pixel is "1", and "0" otherwise. Since the processing window moves only one pixel to the right for each computation during the raster scan, the current pixel count is computed by adding the rightmost column pixels of the current window to the previous pixel count while subtracting the leftmost column pixels of the previous window. The final binary output pixel is produced by thresholding the current pixel count against 24 (half of the window size).

As dilation is also a 7 × 7 window-based process, it uses a similar line buffering technique as median filtering. However, only a simple logical OR operation is performed on all window pixels. Due to its simplicity, the dilation PE also computes in one clock cycle, resulting in a binary image stream with the detected regions of moving targets.
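The running-count update described above, together with the dilation PE's OR reduction, can be sketched in C as follows; the border handling (borders are skipped) and the row-band interface standing in for the line buffer are illustrative assumptions, and the hardware performs each update with adder trees in a single cycle.

```c
#include <stdint.h>

#define W 640              /* image width */
#define K 7                /* window size */
#define HALF ((K * K) / 2) /* 24: majority threshold for a 7 x 7 window */

/* Binary median of one output row using a running count:
   'rows' are K consecutive binary rows from the line buffer. */
void median_row(const uint8_t *rows[K], uint8_t *out)
{
    int count = 0;
    for (int c = 0; c < K; c++)          /* initial window: columns 0..6 */
        for (int r = 0; r < K; r++)
            count += rows[r][c];
    out[K / 2] = (uint8_t)(count > HALF);

    for (int x = K; x < W; x++) {        /* slide one pixel to the right */
        for (int r = 0; r < K; r++)
            count += rows[r][x] - rows[r][x - K]; /* add new col, drop old */
        out[x - K / 2] = (uint8_t)(count > HALF);
    }
}

/* 7 x 7 binary dilation: logical OR over the window, modelled with the
   same running count, which only needs to be nonzero. */
void dilate_row(const uint8_t *rows[K], uint8_t *out)
{
    int count = 0;
    for (int c = 0; c < K; c++)
        for (int r = 0; r < K; r++)
            count += rows[r][c];
    out[K / 2] = (uint8_t)(count > 0);

    for (int x = K; x < W; x++) {
        for (int r = 0; r < K; r++)
            count += rows[r][x] - rows[r][x - K];
        out[x - K / 2] = (uint8_t)(count > 0);
    }
}
```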

5. Experimental Results

5.1. Verification of Proposed SoC. The proposed moving target detection SoC is verified in offline detection mode using the database in [41]. Test videos are 640 × 480 pixels in size and are greyscaled prior to the verification process. The test videos are transferred to the system for computation via a USB mass storage device. After performing the detection in the SoC, the image results are displayed on VGA and also stored on the USB drive for verification. Figure 11 shows the moving target detection results from the proposed SoC on different sample videos. The detected regions (red) are overlaid on the input frame. In most cases, the proposed SoC is able to detect the moving target in consecutive frames.

However, there are several limitations in this work. Block matching may not give a good motion estimation result if the extracted blocks lack texture (the pixel intensities are similar). Moreover, the detected region of a moving target may appear with cavities or split into multiple smaller regions, as only simple frame differencing is applied in the proposed system. Additional postprocessing to produce better detected blobs by merging split regions is out of the scope of this work.

As the stochastic RANSAC algorithm is terminated after a constant time step for each frame, image registration errors may occur, producing incorrect ego-motion estimation. This could be mitigated by accelerating the RANSAC algorithm to allow more iterations, using dedicated hardware or a higher performance general purpose processor.

5.2. Performance Evaluation of Detection Algorithm. The performance evaluation of the implemented detection algorithm uses the mathematical performance metrics in [42], which involve the following parameters:

(i) True positive, TP: a detected moving object.
(ii) False positive, FP: a detected region that does not correspond to any moving object.
(iii) False negative, FN: a nondetected moving object.
(iv) Detection rate, DR: the ratio of TP to the combination of TP and FN, as formulated in

$$\mathrm{DR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}} \quad (4)$$

(v) False alarm rate, FAR: the ratio of FP to all positive detections, as defined in

$$\mathrm{FAR} = \frac{\mathrm{FP}}{\mathrm{TP} + \mathrm{FP}} \quad (5)$$
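As a quick worked example of (4) and (5): TP = 95, FN = 5, and FP = 20 give DR = 95/100 = 0.95 and FAR = 20/115 ≈ 0.17, the same order as the values reported in Table 5. A minimal helper (counts are assumed nonzero):

```c
/* Detection rate (4) and false alarm rate (5) from raw counts. */
double detection_rate(int tp, int fn)  { return (double)tp / (tp + fn); }
double false_alarm_rate(int tp, int fp) { return (double)fp / (tp + fp); }
```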

Figure 11: Detected regions from the proposed moving target detection SoC on different sample videos in [41]: (a)–(d) frames 255, 275, 295, and 315 of video V3V100003 004; (e)–(h) frames 1000, 1020, 1040, and 1060 of video V3V100004 003; (i)–(l) frames 600, 620, 640, and 660 of video V4V100007 017.

Figure 12: Evaluation of the performance metrics TP, FP, and FN based on ground truth boxes (blue) and the detected regions (red).

To obtain the performance metrics, ground truth regions are manually labelled in several frames of the test videos. A bounding box is drawn across each moving object to indicate the ground truth region of every frame, as depicted in Figure 12. A simple postprocessing step filters out detected regions smaller than 15 pixels in width or 15 pixels in height prior to the evaluation. A detected moving object (TP) has detected regions within its bounded ground truth area, while a nondetected moving object (FN) has no detected region overlapping with its ground truth area. A detected region that does not overlap with any ground truth region is considered a false positive (FP).

The detection performance is evaluated over different parameter configurations. The DR and FAR for 1000 test frames using different numbers of blocks (the density of ego-motion estimation), $m \times n$, in area-based registration and different frame differencing thresholds $th_{fd}$ are depicted in Table 5 and Figure 13.

The experimental results show that DR is almost constant across different ego-motion estimation densities but decreases with $th_{fd}$. Although a higher density in the proposed work imposes lower displacement limits $d_m$ and $d_n$, as discussed in Section 3.1, most of the point-to-point displacements do not exceed these limits due to the slow UAV movement in most frames of the test dataset. On the other hand, a higher value of $th_{fd}$ may filter out a moving object if the intensities of the object pixels and the background pixels are too similar.

FAR decreases with the density of ego-motion estimation due to the higher quality of the image registration process, but it increases if most frames exceed the displacement limits $d_m$ and $d_n$. However, false registration due to the displacement limits results in a huge blob of foreground and does not greatly increase FAR. Although higher values of $th_{fd}$ decrease the false detection rate, they also produce smaller foreground areas for all detected moving objects, as pixels with intensity close to the background are thresholded away.

5.3. Speed Comparison with Full Software Implementation. The computation speed of the proposed moving target detection SoC is compared with software computation on different platforms, including a modern CPU (Intel Core i5) in a desktop computer and an embedded processor (ARM). Table 6 illustrates the comparison of computation frame rates and the hardware

Figure 13: DR and FAR for different densities of ego-motion estimation $m \times n$ and frame differencing thresholds $th_{fd}$: (a) DR and (b) FAR, each plotted for $th_{fd}$ = 15, 20, and 25. DR stays within roughly 0.944–0.964 across all densities, while FAR falls as density and threshold increase.

Table 5: Performance evaluation in terms of DR and FAR for 1000 frames using different densities of ego-motion estimation $m \times n$ and frame differencing thresholds $th_{fd}$.

$m \times n$ | $th_{fd}$ | DR    | FAR
12  | 15 | 0.958 | 0.643
12  | 20 | 0.954 | 0.331
12  | 25 | 0.949 | 0.194
24  | 15 | 0.957 | 0.568
24  | 20 | 0.950 | 0.324
24  | 25 | 0.945 | 0.101
35  | 15 | 0.958 | 0.548
35  | 20 | 0.952 | 0.215
35  | 25 | 0.947 | 0.090
48  | 15 | 0.959 | 0.539
48  | 20 | 0.952 | 0.253
48  | 25 | 0.944 | 0.079
70  | 15 | 0.958 | 0.509
70  | 20 | 0.951 | 0.188
70  | 25 | 0.946 | 0.075
88  | 15 | 0.960 | 0.489
88  | 20 | 0.951 | 0.219
88  | 25 | 0.947 | 0.074
108 | 15 | 0.958 | 0.483
108 | 20 | 0.952 | 0.168
108 | 25 | 0.946 | 0.058
140 | 15 | 0.958 | 0.499
140 | 20 | 0.951 | 0.187
140 | 25 | 0.946 | 0.059
165 | 15 | 0.958 | 0.474
165 | 20 | 0.953 | 0.214
165 | 25 | 0.947 | 0.068
192 | 15 | 0.959 | 0.478
192 | 20 | 0.952 | 0.169
192 | 25 | 0.946 | 0.092

Table 6: Computation speed comparison of the proposed system with different software implementations using area-based and feature-based registration.

Platform            | Frequency | Registration technique | Frame rate (fps) | Hardware speed-up
Proposed SoC        | 100 MHz   | Area-based             | 30               | 1
Intel Core i5-4210U | 1.70 GHz  | Area-based             | 4.26             | 7.04
Intel Core i5-4210U | 1.70 GHz  | Feature-based          | 13.11            | 2.29
ARM1176JZF          | 700 MHz   | Area-based             | 0.20             | 150
ARM1176JZF          | 700 MHz   | Feature-based          | 0.56             | 53.57

speed-up between the proposed system and the other software implementations using the test videos in [41].

As feature-based image registration computes faster than area-based registration in software, the speed performance of a feature-based method is also included for comparison. In the feature-based implementation, features are first detected in each frame. The detected features from the current frame are cross-correlated with features from the previous frame, while the RANSAC algorithm is used to estimate the ego-motion between frames. After compensating the ego-motion, segmentation of the moving object uses the same processes as the proposed system. To further optimize the software implementation in terms of speed, a fast feature detection algorithm [30] is utilized. As the number of features affects the computation time of the feature matching step, only the 100 strongest features in each frame are selected for processing. The performance evaluation does not consider multithreaded software execution.

Table 7: Resource usage of the proposed moving target detection SoC (FPGA device: Altera Cyclone IV).

Logic units                   | Used    | Utilization (%)
Total combinational functions | 15,161  | 13
Total registers               | 10,803  | 9
Total memory bits             | 521,054 | 13
Embedded multipliers          | 27      | 5

Based on the experimental results, the speed performance of the proposed moving target detection SoC surpasses optimized software computation by 2.29 times and 53.57 times compared with the implementations on a modern CPU and an embedded CPU, respectively. The software computation (RANSAC) in the HW/SW codesign of the proposed system creates a speed bottleneck, limiting the maximum throughput to 30 fps. The processing frame rate of the proposed system could be further improved by using fully dedicated hardware.

5.4. Resource Utilization. The overall hardware resource utilization of the complete system is shown in Table 7. This prototype of a real-time moving object detection system utilizes less than 20 percent of the total resources of the Altera Cyclone IV FPGA device. As the proposed system uses off-chip memory components for frame buffering, FPGA on-chip memory is utilized only for line buffering in streaming processes (e.g., block matching and median filtering) and for storing intermediate results (e.g., point pairs after block matching). Thus, the low resource usage of the proposed system leaves abundant hardware space for other processes, such as target tracking or classification, to be developed in the future.

6. Conclusions

Moving target detection is a crucial step in most computer vision problems, especially for UAV applications. On-chip detection without the need for real-time video transmission to the ground provides immense benefit to diverse applications such as military surveillance and resource exploration. In order to perform this complex embedded video processing on-chip, an FPGA-based system is desirable due to the potential parallelism of the algorithm.

This paper proposed a moving target detection system using an FPGA to enable an autonomous UAV which is able to perform the computer vision algorithm on the flying platform. The proposed system is prototyped using an Altera Cyclone IV FPGA device on a Terasic DE2-115 development board mounted with a TRDB-D5M camera. The system is developed as a HW/SW codesign using dedicated hardware with a Nios II software processor (booted with embedded Linux) running at a 100 MHz clock rate. As stream-oriented hardware with pipelined processing is utilized, the proposed system achieves real-time capability with a 30 frames per second processing speed on 640 × 480 live video. Experimental results show that the proposed SoC performs 2.29 times and 53.57 times faster than optimized software computation on a modern desktop computer (Intel Core i5) and an embedded processor (ARM), respectively. In addition, the proposed moving target detection uses less than 20 percent of the total resources of the FPGA device, allowing other hardware accelerators to be implemented in the future.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

The authors would like to express their gratitude to Universiti Teknologi Malaysia (UTM) and the Ministry of Science, Technology and Innovation (MOSTI), Malaysia, for supporting this research work under research grants 01-01-06-SF1197 and 01-01-06-SF1229.

References

[1] A. Ahmed, M. Nagai, C. Tianen, and R. Shibasaki, "UAV based monitoring system and object detection technique development for a disaster area," International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. 37, pp. 373–377, 2008.

[2] B. Coifman, M. McCord, R. Mishalani, M. Iswalt, and Y. Ji, "Roadway traffic monitoring from an unmanned aerial vehicle," IEE Proceedings - Intelligent Transport Systems, vol. 153, no. 1, pp. 11–20, 2006.

[3] K. Kanistras, G. Martins, M. J. Rutherford, and K. P. Valavanis, "Survey of unmanned aerial vehicles (UAVs) for traffic monitoring," in Handbook of Unmanned Aerial Vehicles, pp. 2643–2666, Springer, 2015.

[4] K. Nordberg, P. Doherty, G. Farneback et al., "Vision for a UAV helicopter," in Proceedings of the International Conference on Intelligent Robots and Systems (IROS '02), Workshop on Aerial Robotics, pp. 29–34, Lausanne, Switzerland, October 2002.

[5] D. Zamalieva and A. Yilmaz, "Background subtraction for the moving camera: a geometric approach," Computer Vision and Image Understanding, vol. 127, pp. 73–85, 2014.

[6] M. Genovese and E. Napoli, "ASIC and FPGA implementation of the Gaussian mixture model algorithm for real-time segmentation of high definition video," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 22, no. 3, pp. 537–547, 2014.

[7] F. Kristensen, H. Hedberg, H. Jiang, P. Nilsson, and V. Owall, "An embedded real-time surveillance system: implementation and evaluation," Journal of Signal Processing Systems, vol. 52, no. 1, pp. 75–94, 2008.

[8] H. Jiang, H. Ardo, and V. Owall, "A hardware architecture for real-time video segmentation utilizing memory reduction techniques," IEEE Transactions on Circuits and Systems for Video Technology, vol. 19, no. 2, pp. 226–236, 2009.

[9] M. Genovese and E. Napoli, "FPGA-based architecture for real time segmentation and denoising of HD video," Journal of Real-Time Image Processing, vol. 8, no. 4, pp. 389–401, 2013.

[10] A. Lopez-Bravo, J. Diaz-Carmona, A. Ramirez-Agundis, A. Padilla-Medina, and J. Prado-Olivarez, "FPGA-based video system for real time moving object detection," in Proceedings of the 23rd International Conference on Electronics, Communications and Computing (CONIELECOMP '13), pp. 92–97, IEEE, Cholula, Mexico, March 2013.

[11] T. Kryjak, M. Komorkiewicz, and M. Gorgon, "Real-time moving object detection for video surveillance system in FPGA," in Proceedings of the Conference on Design and Architectures for Signal and Image Processing (DASIP '11), pp. 1–8, IEEE, Tampere, Finland, November 2011.

[12] A. Mittal and D. Huttenlocher, "Scene modeling for wide area surveillance and image synthesis," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 160–167, IEEE, June 2000.

[13] A. Price, J. Pyke, D. Ashiri, and T. Cornall, "Real time object detection for an unmanned aerial vehicle using an FPGA based vision system," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '06), pp. 2854–2859, IEEE, Orlando, Fla, USA, May 2006.

[14] G. J. García, C. A. Jara, J. Pomares, A. Alabdo, L. M. Poggi, and F. Torres, "A survey on FPGA-based sensor systems: towards intelligent and reconfigurable low-power sensors for computer vision, control and signal processing," Sensors, vol. 14, no. 4, pp. 6247–6278, 2014.

[15] S. Ali and M. Shah, "COCOA: tracking in aerial imagery," in Airborne Intelligence, Surveillance, Reconnaissance (ISR) Systems and Applications III, vol. 6209 of Proceedings of SPIE, Orlando, Fla, USA, April 2006.

[16] J. Xiao, C. Yang, F. Han, and H. Cheng, "Vehicle and person tracking in aerial videos," in Multimodal Technologies for Perception of Humans, pp. 203–214, Springer, 2008.

[17] W. Yu, X. Yu, P. Zhang, and J. Zhou, "A new framework of moving target detection and tracking for UAV video application," in Proceedings of the International Archives of the Photogrammetry, Remote Sensing and Spatial Information Science, vol. 37, Beijing, China, 2008.

[18] V. Reilly, H. Idrees, and M. Shah, "Detection and tracking of large number of targets in wide area surveillance," in Computer Vision – ECCV 2010: 11th European Conference on Computer Vision, Heraklion, Crete, Greece, September 5–11, 2010, Proceedings, Part III, vol. 6313 of Lecture Notes in Computer Science, pp. 186–199, Springer, Berlin, Germany, 2010.

[19] J. Wang, Y. Zhang, J. Lu, and W. Xu, "A framework for moving target detection, recognition and tracking in UAV videos," in Affective Computing and Intelligent Interaction, vol. 137 of Advances in Intelligent and Soft Computing, pp. 69–76, Springer, Berlin, Germany, 2012.

[20] S. A. Cheraghi and U. U. Sheikh, "Moving object detection using image registration for a moving camera platform," in Proceedings of the IEEE International Conference on Control System, Computing and Engineering (ICCSCE '12), pp. 355–359, IEEE, Penang, Malaysia, November 2012.

[21] Y. Zhang, X. Tong, T. Yang, and W. Ma, "Multi-model estimation based moving object detection for aerial video," Sensors, vol. 15, no. 4, pp. 8214–8231, 2015.

[22] Q. Yu and G. Medioni, "A GPU-based implementation of motion detection from a moving platform," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPR '08), pp. 1–6, Anchorage, Alaska, USA, June 2008.

[23] A. Laika, J. Paul, C. Claus, W. Stechele, A. E. S. Auf, and E. Maehle, "FPGA-based real-time moving object detection for walking robots," in Proceedings of the 8th IEEE International Workshop on Safety, Security and Rescue Robotics (SSRR '10), pp. 1–8, IEEE, Bremen, Germany, July 2010.

[24] B. Tippetts, S. Fowers, K. Lillywhite, D.-J. Lee, and J. Archibald, "FPGA implementation of a feature detection and tracking algorithm for real-time applications," in Advances in Visual Computing, pp. 682–691, Springer, 2007.

[25] K. May and N. Krouglicof, "Moving target detection for sense and avoid using regional phase correlation," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '13), pp. 4767–4772, IEEE, Karlsruhe, Germany, May 2013.

[26] M. E. Angelopoulou and C.-S. Bouganis, "Vision-based ego-motion estimation on FPGA for unmanned aerial vehicle navigation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 24, no. 6, pp. 1070–1083, 2014.

[27] I. M. El-Emary and M. M. A. El-Kareem, "On the application of genetic algorithms in finger prints registration," World Applied Sciences Journal, vol. 5, no. 3, pp. 276–281, 2008.

[28] A. A. Goshtasby, 2-D and 3-D Image Registration for Medical, Remote Sensing, and Industrial Applications, John Wiley & Sons, New York, NY, USA, 2005.

[29] C. Harris and M. Stephens, "A combined corner and edge detector," in Proceedings of the 4th Alvey Vision Conference, vol. 15, pp. 147–151, 1988.

[30] E. Rosten and T. Drummond, "Machine learning for high-speed corner detection," in Computer Vision – ECCV 2006, pp. 430–443, Springer, 2006.

[31] H. Bay, T. Tuytelaars, and L. Van Gool, "SURF: speeded up robust features," in Computer Vision – ECCV 2006, A. Leonardis, H. Bischof, and A. Pinz, Eds., vol. 3951 of Lecture Notes in Computer Science, pp. 404–417, Springer, 2006.

[32] G. R. Rodríguez-Canosa, S. Thomas, J. del Cerro, A. Barrientos, and B. MacDonald, "A real-time method to detect and track moving objects (DATMO) from unmanned aerial vehicles (UAVs) using a single camera," Remote Sensing, vol. 4, no. 4, pp. 1090–1111, 2012.

[33] B. Liu and A. Zaccarin, "New fast algorithms for the estimation of block motion vectors," IEEE Transactions on Circuits and Systems for Video Technology, vol. 3, no. 2, pp. 148–157, 1993.

[34] R. Li, B. Zeng, and M. L. Liou, "New three-step search algorithm for block motion estimation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 4, no. 4, pp. 438–442, 1994.

[35] L.-M. Po and W.-C. Ma, "A novel four-step search algorithm for fast block motion estimation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 6, no. 3, pp. 313–317, 1996.

[36] S. Zhu and K.-K. Ma, "A new diamond search algorithm for fast block-matching motion estimation," IEEE Transactions on Image Processing, vol. 9, no. 2, pp. 287–290, 2000.

[37] L. De Vos and M. Stegherr, "Parameterizable VLSI architectures for the full-search block-matching algorithm," IEEE Transactions on Circuits and Systems, vol. 36, no. 10, pp. 1309–1316, 1989.

[38] M. A. Fischler and R. C. Bolles, "Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography," Communications of the ACM, vol. 24, no. 6, pp. 381–395, 1981.

[39] J. W. Tang, N. Shaikh-Husin, and U. U. Sheikh, "FPGA implementation of RANSAC algorithm for real-time image geometry estimation," in Proceedings of the 11th IEEE Student Conference on Research and Development (SCOReD '13), pp. 290–294, IEEE, Putrajaya, Malaysia, December 2013.

[40] O. Chum and J. Matas, "Randomized RANSAC with T(d,d) test," in Proceedings of the British Machine Vision Conference, vol. 2, pp. 448–457, September 2002.

[41] DARPA, SDMS Public Web Site, 2003, https://www.sdms.afrl.af.mil.

[42] A. F. M. S. Saif, A. S. Prabuwono, and Z. R. Mahayuddin, "Motion analysis for moving object detection from UAV aerial images: a review," in Proceedings of the International Conference on Informatics, Electronics and Vision (ICIEV '14), pp. 1–6, IEEE, Dhaka, Bangladesh, May 2014.


10 International Journal of Reconfigurable Computing

x2 H2y1 H1

H0y2 H5 x1x1 H3

y1 H4

minus times times minus

minus minus

times times

+

+

+

abs abs

sqr sqr

min

acc

Pipelineregister

Pipelineregister

Pipelineregister

Fitness score

th2dist

Figure 8 Hardware datapath of fitness scoring in RANSAC accel-erator

among the same search area and the coordinates of the best-scored blocks are preserved At the end of each search areathe coordinates of the best pairs (template blocks and theirbest-scored blocks) are sent to RANSAC module for nextprocessing Hence the proposed block matching hardware isable to produce point-to-point motion (point pairs) of everytwo successive frames in streaming video at line rate

422 RANSAC Hardware Accelerator RANSAC hardwaredesign in [39] is utilized in this work which acceleratesonly fitness scoring step As described in Algorithm 2fitness scoring is an iterative process which performs similarcomputation to all data samples based on hypothesis modelHence this data intensive process is executed in pipelineddatapath as illustrated in Figure 8 A control unit is utilizedto read input data provided by block matching from bufferand stream these inputs to the datapath unit at every clockcycleThe datapath unit utilizes three stages of pipeline with

the aim of isolating multiplication processes thus allowingfaster clock rate The first stage pipeline registers are locatedright after the first multiplication while the other two stagesof pipeline registers enclose the squaring processes Theindividual score is accumulated in the last stage producingtotal final fitness score The accumulator is reset on each newset of hypothesis Thus the total number of cycles required

Table 4 Fixed point precision of fitness scoring inputs

Parameter Number of bits Number rangeInteger Fraction

1199091 1199101 1199092 1199102

11 0 [minus1024 1024)1198670119867111986731198674

4 12 [minus8 8)11986721198675

11 5 [minus1024 1024)

Detected movingtarget

Affineparameters

fromsoftware

AffinePE

Framereader

Address

Framedifferencing PE

Binary image stream

MedianPE

Line buffer

DilationPE

Line buffer

from framebuffer

Fiminus2 and Fiminus3

Fiminus2F998400iminus3

Figure 9 Hardware architecture for object segmentation

for fitness score computation is the number of overall dataplus the four-cycle latencyAlthough fitness scoring could require floating point

computations the datapath unit uses suitable fixed pointprecision for each stage SinceNios II is a 32-bit processor theaffineparameters in hypothesismodel (119867

0to1198676) are properly

scaled to different precision of 16-bit fixed points as describedin Table 4 so that two affine parameters can be assigned in asingle 32-bit write instruction As this system is targeted for640 times 480 pixelsrsquo video all input coordinates (119909

1 1199101 1199092 and

1199102) are scaled to 11 bits

43 Object Segmentation Hardware Architecture As objectsegmentation can be performed in one raster scan a stream-oriented architecture is proposed as illustrated in Figure 9 Allsubprocesses are executed in pipeline on the streaming videowithout additional frame buffering Object segmentationprocess is initiated by software processor after providing theaffine parameters from RANSAC to affine PE Two frames(119865119894minus2and 119865

119894minus3as described in Table 2) from frame buffer

(SRAM) are required to segment the moving targetBased on the affine parameters from RANSAC affine PE

uses reverse mapping technique to find each pixel location inprevious frame (119865

119894minus3) using (3) and generates their addresses

in frame buffer (SRAM) Frame readers fetch the previ-ous frame (119865

119894minus3) pixel-by-pixel according to the generated

addresses from frame buffer thus constructing a stream oftransformed frame which is denoted as 1198651015840

119894minus3

International Journal of Reconfigurable Computing 11

Leftmostcolumn pixels

Rightmostcolumn pixels

Median output stream

Addertree

Addertree

24

pixelsTo line buffer

Binary image stream

7 times 7 window

minus

+

gtgt

Figure 10 Hardware architecture of median PE

By synchronizing the streams of both frames framedifferencing can be executed in pipeline as soon as one pixelfrom each frame is obtained Hence one pixel in currentframe (119865

119894minus2) and one pixel in transformed frame (1198651015840

119894minus3) are

fetched alternatively from their corresponding memory loca-tions by frame reader constructing two synchronized streamsof 119865119894minus2and 1198651015840

119894minus3frames Frame differencing PE performs

pixel-to-pixel absolute subtraction and thresholding on thestreams The frame differencing PE is able to compute inone cycle per pixel A configurable threshold value thfd isused after the subtraction yielding a stream of binary imagewithout buffering the whole frameAfter frame differencing the binary image is streamed

into 7 times 7 median filtering Seven lines of the image arebuffered in the line buffer providing 7 times 7 pixels window forthe median PE to perform the median computation Mediancomputation can be performed in one clock cycle for eachprocessing window due to short propagation delay as onlybinary pixels are involved Figure 10 shows the hardware logicdesign of median PEMedian filtering can be computed by counting the num-

ber of asserted (binary 1) pixels in the window If morethan half the pixels in the window (24 pixels) are assertedthe resultant pixel is ldquo1rdquo or ldquo0rdquo otherwise Since processingwindow will move only one pixel to the right for each com-putation during raster scan current pixel count is computedby adding the previous pixel count and rightmost columnpixels in the current window while subtracting the leftmost

column pixels in the previous window Final binary outputpixel is produced by thresholding the current pixel count with24 (half of window size)As dilation is also a 7times 7window-based processing it uses

similar line buffering technique asmedian filtering Howeveronly simple logical OR operation is performed on all windowpixels Due to its simplicity dilation PE can also be computedin one clock cycle resulting in the stream of binary imagewith detected region of moving targets

5 Experimental Results

51 Verification of Proposed SoC Theproposedmoving targetdetection SoC is verified in offline detection mode using thedatabase in [41] Test videos are 640 times 480 pixels in sizeand are greyscaled prior to the verification process The testvideos are transferred to the system for computation via aUSB mass storage device After performing the detection inSoC the image results are displayed on VGA and also storedon USB drive for verification Figure 11 shows the movingtarget detection result from the proposed SoC using differentsample videos The detected regions (red) are overlaid on theinput frame In most cases the proposed SoC is able to detectthe moving target in consecutive framesHowever there are several limitations in this work Block

matching may not give a goodmotion estimation result if theextracted blocks do not have texture (the pixels intensity aresimilar) Moreover the detected region of moving target mayappear in cavity or multiple split of smaller regions as onlysimple frame differencing is applied in the proposed systemAdditional postprocessing to produce better detected blob bymerging split regions is out of the scope in this workAs the stochastic RANSAC algorithm is terminated after

a constant time step for each frame image registration errormay occur which produces incorrect ego-motion estimationThis could be mitigated by accelerating RANSAC algorithmto ensure more iterations using dedicated hardware or highperformance general purpose processor

52 Performance Evaluation of Detection Algorithm Theper-formance evaluation of the implemented detection algorithmuses the Mathematical Performance Metric in [42] thatinvolves several parameters as follows

(i) True positive TP the detected moving object(ii) False positive FP detected regions that do not corre-spond to any moving object

(iii) False negative FN the nondetected moving object(iv) Detection rate DR the ratio of TP with the combina-

tion of TP and FN as formulated in

DR = TPTP + FN

(4)

(v) False alarm rate FAR the ratio between FP in allpositive detection as defined in

FAR = FPTP + FP

(5)

12 International Journal of Reconfigurable Computing

(a) Frame 255 (b) Frame 275 (c) Frame 295 (d) Frame 315

(e) Frame 1000 (f) Frame 1020 (g) Frame 1040 (h) Frame 1060

(i) Frame 600 (j) Frame 620 (k) Frame 640 (l) Frame 660

Figure 11 Detected regions from the proposed moving target detection SoC on different sample videos in [41] Video numbers (a)ndash(d)V3V100003 004 video numbers (e)ndash(h) V3V100004 003 and video numbers (i)ndash(l) V4V100007 017

FP

FNTP

Figure 12 Evaluation of performancemetrics TP FP and FN basedon ground truth boxes (blue) and the detected region (red)

To obtain the performance metrics ground truth regionsare manually labelled in several frames of test videos Abounding box is drawn across each moving object to indicatethe ground truth region of every frame as depicted in Fig-ure 12 A simple postprocessing is performed on the detectedregion by filtering out the detected region smaller than 15pixelsrsquo width or 15 pixelsrsquo height prior to the evaluationA detected moving object (TP) has detected regions in itsbounded ground truth area while a nondetected movingobject (FN) has no detected region overlapping with itsground truth area Detected region that does not overlappwith any ground truth region is considered as false positive(FP)The detection performance is evaluated on different

parameters configuration The DR and FAR for 1000 testframes using different number of blocks (density in ego-motion estimation) 119898 times 119899 in area-based registration and

frame differencing threshold thfd are depicted in Table 5 andFigure 13The experiment results show that DR is almost similar

for different density of ego-motion estimation but decreaseswith thfd Although higher density in the proposed work haslower displacement limitation 119889

119898and 119889

119899as discussed in

Section 31 most of the point-to-point displacements do notexceed the limitation due to slowUAVmovement in themostframes of the test dataset On the contrary higher value of thfdmay filter out the moving object if the differences in intensityof the object pixels and background pixels are almost similarFAR decreases with density in ego-motion estimation

due to the higher quality in image registration process butincreases if most frames exceed the displacement limitation119889119898and 119889

119899 However false registration due to displacement

limitation results in a huge blob of foreground but does notgreatly increase FAR Although higher values of thfd decreasethe false detection rate they also produce smaller foregroundarea for all detected moving objects as pixels almost similarintensity with background will be thresholded

53 Speed Comparison with Full Software ImplementationThe computation speed of the proposed moving target detec-tion SoC is compared with software computation in differentplatforms including modern CPU (Intel Core i5) in desktopcomputer and embedded processor (ARM) Table 6 illustratesthe comparison of computation frame rate and hardware

International Journal of Reconfigurable Computing 13

1 2 3 4 5 6 7 8 9 100944

0946

0948

095

0952

0954

0956

0958

096

0962

0964

Det

ectio

n ra

teD

R

Density of ego-motion m times n

thfd = 15

thfd = 20

thfd = 25

(a) DR

1 2 3 4 5 6 7 8 9 100

01

02

03

04

05

06

07

False

alar

m ra

te F

AR

Density of ego-motion m times n

thfd = 15

thfd = 20

thfd = 25

(b) FAR

Figure 13 DR and FAR for different density in ego-motion estimation119898 times 119899 and frame differencing threshold thfd

Table 5 Performance evaluation in terms of DR and FAR for 1000frames using different density in ego-motion estimation119898times 119899 andframe differencing threshold thfd

119898 times 119899 thfd DR FAR12 15 0958 064312 20 0954 033112 25 0949 019424 15 0957 056824 20 0950 032424 25 0945 010135 15 0958 054835 20 0952 021535 25 0947 009048 15 0959 053948 20 0952 025348 25 0944 007970 15 0958 050970 20 0951 018870 25 0946 007588 15 0960 048988 20 0951 021988 25 0947 0074108 15 0958 0483108 20 0952 0168108 25 0946 0058140 15 0958 0499140 20 0951 0187140 25 0946 0059165 15 0958 0474165 20 0953 0214165 25 0947 0068192 15 0959 0478192 20 0952 0169192 25 0946 0092

Table 6 Computation speed comparison of the proposed sys-tem with different software implementation using area-based andfeature-based registrations

Platform Frequency Registrationtechnique

Framerate

Hardwarespeed-up

ProposedSoC 100MHz Area-based 30 1

Intel Corei5-4210U 170GHz Area-based 426 704

Feature-based 1311 229

ARM1176JZF 700MHz Area-based 020 150Feature-based 056 5357

speed-up between the proposed system and other softwareimplementations using test videos in [41]As feature-based image registration has faster computa-

tion in software implementation comparing to area-basedregistration speed performance of feature-based method isalso included for comparison In feature-based implementa-tion features are first detected in each frame The detectedfeatures from current frame are cross-correlatedwith featureswith previous framewhile RANSAC algorithm is used to esti-mate the ego-motion between frames After compensatingthe ego-motion segmentation ofmoving object uses the sameprocesses with the proposed system To further optimizethe software implementation in terms of speed performancea fast feature detection algorithm [30] is utilized As thenumber of features will affect the computation time in featurematching step only 100 strongest features in each frame areselected for processingHowever the performance evaluationdoes not consider multithreaded software execution

14 International Journal of Reconfigurable Computing

Table 7 Resources usage of the proposed moving target detectionSoC

Logic units Utilization ()Total combinational function 15161 13Total registers 10803 9Total memory bits 521054 13Embedded multiplier 27 5FPGA device Altera Cyclone IV

Based on experimental result the speed performanceof the proposed moving target detection SoC surpassesoptimized software computation by 229 times and 5357times compared with implementations in modern CPUand embedded CPU respectively The software computation(RANSAC) in HWSW codesign of the proposed system cre-ates speed bottleneck thus limiting the maximum through-put to 30 fps The processing frame rate of the proposed sys-tem can be further improved by using fully dedicated hard-ware

54 Resource Utilization The overall hardware resourcesutilization of the complete system is illustrated in Table 7This prototype of real-time moving object detection systemutilizes only less than 20 percent of total resources in AlteraCyclone IV FPGA device As the proposed system uses off-chip memory components for frame buffering FPGA on-chip memory is utilized only for line buffering in streamingprocess (eg block matching and median filtering) and stor-ing intermediate results (eg point pairs after block match-ing) Thus the low resource usage of the proposed systemprovides abundant hardware space for other processes such astarget tracking or classification to be developed in future

6 Conclusions

Moving target detection is a crucial step in most computervision problem especially for UAV applications On-chipdetection without the need of real-time video transmission toground will provide immense benefit to diverse applicationssuch as military surveillance and resource exploration Inorder to perform this complex embedded video processingon-chip FPGA-based system is desirable due to the potentialparallelism of the algorithmThis paper proposed a moving target detection system

using FPGA to enable autonomous UAVwhich is able to per-form the computer vision algorithm on the flying platformThe proposed system is prototyped using Altera CycloneIV FPGA device on Terasic DE2-115 development boardmounted with a TRDB-D5M camera This system is devel-oped as a HWSW codesign using dedicated hardware withNios II software processor (booted with embedded Linux)running at 100MHz clock rate As stream-oriented hardwarewith pipeline processing is utilized the proposed systemachieves real-time capability with 30 frames per secondprocessing speed on 640times 480 live video Experimental resultshows that the proposed SoC performs 229 times and 5357times faster than optimized software computation onmodern

desktop computer (Intel Core i5) and embedded processor(ARM) In addition the proposed moving target detectionuses only less than 20 percent of total resources in theFPGA device allowing other hardware accelerators to beimplemented in future

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

The authors would like to express their gratitude to UniversitiTeknologi Malaysia (UTM) and the Ministry of ScienceTechnology and Innovation (MOSTI) Malaysia for support-ing this research work under research Grants 01-01-06-SF1197and 01-01-06-SF1229

References

[1] A. Ahmed, M. Nagai, C. Tianen, and R. Shibasaki, "UAV based monitoring system and object detection technique development for a disaster area," International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. 37, pp. 373-377, 2008.

[2] B. Coifman, M. McCord, R. Mishalani, M. Iswalt, and Y. Ji, "Roadway traffic monitoring from an unmanned aerial vehicle," IEE Proceedings - Intelligent Transport Systems, vol. 153, no. 1, pp. 11-20, 2006.

[3] K. Kanistras, G. Martins, M. J. Rutherford, and K. P. Valavanis, "Survey of unmanned aerial vehicles (UAVs) for traffic monitoring," in Handbook of Unmanned Aerial Vehicles, pp. 2643-2666, Springer, 2015.

[4] K. Nordberg, P. Doherty, G. Farneback et al., "Vision for a UAV helicopter," in Proceedings of the International Conference on Intelligent Robots and Systems (IROS '02), Workshop on Aerial Robotics, pp. 29-34, Lausanne, Switzerland, October 2002.

[5] D. Zamalieva and A. Yilmaz, "Background subtraction for the moving camera: a geometric approach," Computer Vision and Image Understanding, vol. 127, pp. 73-85, 2014.

[6] M. Genovese and E. Napoli, "ASIC and FPGA implementation of the Gaussian mixture model algorithm for real-time segmentation of high definition video," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 22, no. 3, pp. 537-547, 2014.

[7] F. Kristensen, H. Hedberg, H. Jiang, P. Nilsson, and V. Owall, "An embedded real-time surveillance system: implementation and evaluation," Journal of Signal Processing Systems, vol. 52, no. 1, pp. 75-94, 2008.

[8] H. Jiang, H. Ardo, and V. Owall, "A hardware architecture for real-time video segmentation utilizing memory reduction techniques," IEEE Transactions on Circuits and Systems for Video Technology, vol. 19, no. 2, pp. 226-236, 2009.

[9] M. Genovese and E. Napoli, "FPGA-based architecture for real time segmentation and denoising of HD video," Journal of Real-Time Image Processing, vol. 8, no. 4, pp. 389-401, 2013.

[10] A. Lopez-Bravo, J. Diaz-Carmona, A. Ramirez-Agundis, A. Padilla-Medina, and J. Prado-Olivarez, "FPGA-based video system for real time moving object detection," in Proceedings of the 23rd International Conference on Electronics, Communications and Computing (CONIELECOMP '13), pp. 92-97, IEEE, Cholula, Mexico, March 2013.

[11] T. Kryjak, M. Komorkiewicz, and M. Gorgon, "Real-time moving object detection for video surveillance system in FPGA," in Proceedings of the Conference on Design and Architectures for Signal and Image Processing (DASIP '11), pp. 1-8, IEEE, Tampere, Finland, November 2011.

[12] A. Mittal and D. Huttenlocher, "Scene modeling for wide area surveillance and image synthesis," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 160-167, IEEE, June 2000.

[13] A. Price, J. Pyke, D. Ashiri, and T. Cornall, "Real time object detection for an unmanned aerial vehicle using an FPGA based vision system," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '06), pp. 2854-2859, IEEE, Orlando, Fla, USA, May 2006.

[14] G. J. Garcia, C. A. Jara, J. Pomares, A. Alabdo, L. M. Poggi, and F. Torres, "A survey on FPGA-based sensor systems: towards intelligent and reconfigurable low-power sensors for computer vision, control and signal processing," Sensors, vol. 14, no. 4, pp. 6247-6278, 2014.

[15] S. Ali and M. Shah, "COCOA: tracking in aerial imagery," in Airborne Intelligence, Surveillance, Reconnaissance (ISR) Systems and Applications III, vol. 6209 of Proceedings of SPIE, Orlando, Fla, USA, April 2006.

[16] J. Xiao, C. Yang, F. Han, and H. Cheng, "Vehicle and person tracking in aerial videos," in Multimodal Technologies for Perception of Humans, pp. 203-214, Springer, 2008.

[17] W. Yu, X. Yu, P. Zhang, and J. Zhou, "A new framework of moving target detection and tracking for UAV video application," in Proceedings of the International Archives of the Photogrammetry, Remote Sensing and Spatial Information Science, vol. 37, Beijing, China, 2008.

[18] V. Reilly, H. Idrees, and M. Shah, "Detection and tracking of large number of targets in wide area surveillance," in Computer Vision – ECCV 2010: 11th European Conference on Computer Vision, Heraklion, Crete, Greece, September 5-11, 2010, Proceedings, Part III, vol. 6313 of Lecture Notes in Computer Science, pp. 186-199, Springer, Berlin, Germany, 2010.

[19] J. Wang, Y. Zhang, J. Lu, and W. Xu, "A framework for moving target detection, recognition and tracking in UAV videos," in Affective Computing and Intelligent Interaction, vol. 137 of Advances in Intelligent and Soft Computing, pp. 69-76, Springer, Berlin, Germany, 2012.

[20] S. A. Cheraghi and U. U. Sheikh, "Moving object detection using image registration for a moving camera platform," in Proceedings of the IEEE International Conference on Control System, Computing and Engineering (ICCSCE '12), pp. 355-359, IEEE, Penang, Malaysia, November 2012.

[21] Y. Zhang, X. Tong, T. Yang, and W. Ma, "Multi-model estimation based moving object detection for aerial video," Sensors, vol. 15, no. 4, pp. 8214-8231, 2015.

[22] Q. Yu and G. Medioni, "A GPU-based implementation of motion detection from a moving platform," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPR '08), pp. 1-6, Anchorage, Alaska, USA, June 2008.

[23] A. Laika, J. Paul, C. Claus, W. Stechele, A. E. S. Auf, and E. Maehle, "FPGA-based real-time moving object detection for walking robots," in Proceedings of the 8th IEEE International Workshop on Safety, Security and Rescue Robotics (SSRR '10), pp. 1-8, IEEE, Bremen, Germany, July 2010.

[24] B. Tippetts, S. Fowers, K. Lillywhite, D.-J. Lee, and J. Archibald, "FPGA implementation of a feature detection and tracking algorithm for real-time applications," in Advances in Visual Computing, pp. 682-691, Springer, 2007.

[25] K. May and N. Krouglicof, "Moving target detection for sense and avoid using regional phase correlation," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '13), pp. 4767-4772, IEEE, Karlsruhe, Germany, May 2013.

[26] M. E. Angelopoulou and C.-S. Bouganis, "Vision-based ego-motion estimation on FPGA for unmanned aerial vehicle navigation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 24, no. 6, pp. 1070-1083, 2014.

[27] I. M. El-Emary and M. M. A. El-Kareem, "On the application of genetic algorithms in finger prints registration," World Applied Sciences Journal, vol. 5, no. 3, pp. 276-281, 2008.

[28] A. A. Goshtasby, 2-D and 3-D Image Registration for Medical, Remote Sensing, and Industrial Applications, John Wiley & Sons, New York, NY, USA, 2005.

[29] C. Harris and M. Stephens, "A combined corner and edge detector," in Proceedings of the 4th Alvey Vision Conference, vol. 15, pp. 147-151, 1988.

[30] E. Rosten and T. Drummond, "Machine learning for high-speed corner detection," in Computer Vision – ECCV 2006, pp. 430-443, Springer, 2006.

[31] H. Bay, T. Tuytelaars, and L. Van Gool, "SURF: speeded up robust features," in Computer Vision – ECCV 2006, A. Leonardis, H. Bischof, and A. Pinz, Eds., vol. 3951 of Lecture Notes in Computer Science, pp. 404-417, Springer, 2006.

[32] G. R. Rodriguez-Canosa, S. Thomas, J. del Cerro, A. Barrientos, and B. MacDonald, "A real-time method to detect and track moving objects (DATMO) from unmanned aerial vehicles (UAVs) using a single camera," Remote Sensing, vol. 4, no. 4, pp. 1090-1111, 2012.

[33] B. Liu and A. Zaccarin, "New fast algorithms for the estimation of block motion vectors," IEEE Transactions on Circuits and Systems for Video Technology, vol. 3, no. 2, pp. 148-157, 1993.

[34] R. Li, B. Zeng, and M. L. Liou, "New three-step search algorithm for block motion estimation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 4, no. 4, pp. 438-442, 1994.

[35] L.-M. Po and W.-C. Ma, "A novel four-step search algorithm for fast block motion estimation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 6, no. 3, pp. 313-317, 1996.

[36] S. Zhu and K.-K. Ma, "A new diamond search algorithm for fast block-matching motion estimation," IEEE Transactions on Image Processing, vol. 9, no. 2, pp. 287-290, 2000.

[37] L. De Vos and M. Stegherr, "Parameterizable VLSI architectures for the full-search block-matching algorithm," IEEE Transactions on Circuits and Systems, vol. 36, no. 10, pp. 1309-1316, 1989.

[38] M. A. Fischler and R. C. Bolles, "Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography," Communications of the ACM, vol. 24, no. 6, pp. 381-395, 1981.

[39] J. W. Tang, N. Shaikh-Husin, and U. U. Sheikh, "FPGA implementation of RANSAC algorithm for real-time image geometry estimation," in Proceedings of the 11th IEEE Student Conference on Research and Development (SCOReD '13), pp. 290-294, IEEE, Putrajaya, Malaysia, December 2013.

[40] O. Chum and J. Matas, "Randomized RANSAC with T(d,d) test," in Proceedings of the British Machine Vision Conference, vol. 2, pp. 448-457, September 2002.

[41] DARPA, SDMS Public Web Site, 2003, https://www.sdms.afrl.af.mil.

[42] A. F. M. S. Saif, A. S. Prabuwono, and Z. R. Mahayuddin, "Motion analysis for moving object detection from UAV aerial images: a review," in Proceedings of the International Conference on Informatics, Electronics and Vision (ICIEV '14), pp. 1-6, IEEE, Dhaka, Bangladesh, May 2014.



14 International Journal of Reconfigurable Computing

Table 7 Resources usage of the proposed moving target detectionSoC

Logic units Utilization ()Total combinational function 15161 13Total registers 10803 9Total memory bits 521054 13Embedded multiplier 27 5FPGA device Altera Cyclone IV

Based on experimental result the speed performanceof the proposed moving target detection SoC surpassesoptimized software computation by 229 times and 5357times compared with implementations in modern CPUand embedded CPU respectively The software computation(RANSAC) in HWSW codesign of the proposed system cre-ates speed bottleneck thus limiting the maximum through-put to 30 fps The processing frame rate of the proposed sys-tem can be further improved by using fully dedicated hard-ware

54 Resource Utilization The overall hardware resourcesutilization of the complete system is illustrated in Table 7This prototype of real-time moving object detection systemutilizes only less than 20 percent of total resources in AlteraCyclone IV FPGA device As the proposed system uses off-chip memory components for frame buffering FPGA on-chip memory is utilized only for line buffering in streamingprocess (eg block matching and median filtering) and stor-ing intermediate results (eg point pairs after block match-ing) Thus the low resource usage of the proposed systemprovides abundant hardware space for other processes such astarget tracking or classification to be developed in future

6 Conclusions

Moving target detection is a crucial step in most computervision problem especially for UAV applications On-chipdetection without the need of real-time video transmission toground will provide immense benefit to diverse applicationssuch as military surveillance and resource exploration Inorder to perform this complex embedded video processingon-chip FPGA-based system is desirable due to the potentialparallelism of the algorithmThis paper proposed a moving target detection system

using FPGA to enable autonomous UAVwhich is able to per-form the computer vision algorithm on the flying platformThe proposed system is prototyped using Altera CycloneIV FPGA device on Terasic DE2-115 development boardmounted with a TRDB-D5M camera This system is devel-oped as a HWSW codesign using dedicated hardware withNios II software processor (booted with embedded Linux)running at 100MHz clock rate As stream-oriented hardwarewith pipeline processing is utilized the proposed systemachieves real-time capability with 30 frames per secondprocessing speed on 640times 480 live video Experimental resultshows that the proposed SoC performs 229 times and 5357times faster than optimized software computation onmodern

desktop computer (Intel Core i5) and embedded processor(ARM) In addition the proposed moving target detectionuses only less than 20 percent of total resources in theFPGA device allowing other hardware accelerators to beimplemented in future

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

The authors would like to express their gratitude to UniversitiTeknologi Malaysia (UTM) and the Ministry of ScienceTechnology and Innovation (MOSTI) Malaysia for support-ing this research work under research Grants 01-01-06-SF1197and 01-01-06-SF1229

References

[1] A Ahmed M Nagai C Tianen and R Shibasaki ldquoUav basedmonitoring systemandobject detection technique developmentfor a disaster areardquo International Archives of PhotogrammetryRemote Sensing and Spatial Information Sciences vol 37 pp373ndash377 2008

[2] B Coifman M McCord R Mishalani M Iswalt and Y JildquoRoadway trafficmonitoring from an unmanned aerial vehiclerdquoIEE Proceedings-Intelligent Transport Systems vol 153 no 1 pp11ndash20 2006

[3] K Kanistras G Martins M J Rutherford and K P ValavanisldquoSurvey of unmanned aerial vehicles (uavs) for traffic monitor-ingrdquo inHandbook of Unmanned Aerial Vehicles pp 2643ndash2666Springer 2015

[4] K Nordberg P Doherty G Farneback et al ldquoVision for a UAVhelicopterrdquo in Proceedings of the International Conference onIntelligent Robots and Systems (IROS rsquo02) Workshop on AerialRobotics pp 29ndash34 Lausanne Switzerland October 2002

[5] D Zamalieva and A Yilmaz ldquoBackground subtraction for themoving camera a geometric approachrdquo Computer Vision andImage Understanding vol 127 pp 73ndash85 2014

[6] M Genovese and E Napoli ldquoASIC and FPGA implementationof the Gaussianmixturemodel algorithm for real-time segmen-tation of high definition videordquo IEEETransactions onVery LargeScale Integration (VLSI) Systems vol 22 no 3 pp 537ndash547 2014

[7] F Kristensen H Hedberg H Jiang P Nilsson and V OwallldquoAn embedded real-time surveillance system implementationand evaluationrdquo Journal of Signal Processing Systems vol 52 no1 pp 75ndash94 2008

[8] H Jiang H Ardo and V Owall ldquoA hardware architecturefor real-time video segmentation utilizing memory reductiontechniquesrdquo IEEETransactions onCircuits and Systems for VideoTechnology vol 19 no 2 pp 226ndash236 2009

[9] M Genovese and E Napoli ldquoFPGA-based architecture for realtime segmentation and denoising of HD videordquo Journal of Real-Time Image Processing vol 8 no 4 pp 389ndash401 2013

[10] A Lopez-Bravo J Diaz-Carmona A Ramirez-Agundis APadilla-Medina and J Prado-Olivarez ldquoFPGA-based videosystem for real time moving object detectionrdquo in Proceedings

International Journal of Reconfigurable Computing 15

of the 23rd International Conference on Electronics Communi-cations and Computing (CONIELECOMP rsquo13) pp 92ndash97 IEEECholula Mexico March 2013

[11] T Kryjak M Komorkiewicz and M Gorgon ldquoReal-time mov-ing object detection for video surveillance system in FPGArdquo inProceedings of the Conference on Design and Architectures forSignal and Image Processing (DASIP rsquo11) pp 1ndash8 IEEE TampereFinland November 2011

[12] A Mittal and D Huttenlocher ldquoScene modeling for wide areasurveillance and image synthesisrdquo in Proceedings of the IEEEConference on Computer Vision and Pattern Recognition vol 2pp 160ndash167 IEEE June 2000

[13] A Price J Pyke D Ashiri and T Cornall ldquoReal time objectdetection for an unmanned aerial vehicle using an FPGAbased vision systemrdquo in Proceedings of the IEEE InternationalConference on Robotics and Automation (ICRA rsquo06) pp 2854ndash2859 IEEE Orlando Fla USA May 2006

[14] G J Garcıa C A Jara J Pomares A Alabdo L M Poggi andF Torres ldquoA survey on FPGA-based sensor systems towardsintelligent and reconfigurable low-power sensors for computervision control and signal processingrdquo Sensors vol 14 no 4 pp6247ndash6278 2014

[15] S Ali and M Shah ldquoCocoa tracking in aerial imageryrdquo inAirborne Intelligence Surveillance Reconnaissance (ISR) Systemsand Applications III vol 6209 of Proceedings of SPIE OrlandoFla USA April 2006

[16] J Xiao C Yang F Han and H Cheng ldquoVehicle and persontracking in aerial videosrdquo in Multimodal Technologies for Per-ception of Humans pp 203ndash214 Springer 2008

[17] W Yu X Yu P Zhang and J Zhou ldquoA new framework of mov-ing target detection and tracking for uav video applicationrdquo inProceedings of the International Archives of the PhotogrammetryRemote Sensing and Spatial Information Science vol 37 BeijingChina 2008

[18] V Reilly H Idrees and M Shah ldquoDetection and tracking oflarge number of targets in wide area surveillancerdquo in ComputerVisionmdashECCV 2010 11th European Conference on ComputerVision Heraklion Crete Greece September 5ndash11 2010 Proceed-ings Part III vol 6313 of Lecture Notes in Computer Science pp186ndash199 Springer Berlin Germany 2010

[19] J Wang Y Zhang J Lu and W Xu ldquoA framework for movingtarget detection recognition and tracking in UAV videosrdquoin Affective Computing and Intelligent Interaction vol 137 ofAdvances in Intelligent and Soft Computing pp 69ndash76 SpringerBerlin Germany 2012

[20] S A Cheraghi andUU Sheikh ldquoMoving object detection usingimage registration for a moving camera platformrdquo in Proceed-ings of the IEEE International Conference on Control SystemComputing and Engineering (ICCSCE rsquo12) pp 355ndash359 IEEEPenang Malaysia November 2012

[21] Y Zhang X Tong T Yang and W Ma ldquoMulti-model estima-tion based moving object detection for aerial videordquo Sensorsvol 15 no 4 pp 8214ndash8231 2015

[22] Q Yu and G Medioni ldquoA GPU-based implementation ofmotion detection from a moving platformrdquo in Proceedings ofthe IEEE Computer Society Conference on Computer Vision andPattern RecognitionWorkshops (CVPR rsquo08) pp 1ndash6 AnchorageAlaska USA June 2008

[23] A Laika J Paul C Claus W Stechele A E S Auf and EMaehle ldquoFPGA-based real-time moving object detection for

walking robotsrdquo in Proceedings of the 8th IEEE InternationalWorkshop on Safety Security and Rescue Robotics (SSRR rsquo10) pp1ndash8 IEEE Bremen Germany July 2010

[24] B Tippetts S Fowers K Lillywhite D-J Lee and J ArchibaldldquoFPGA implementation of a feature detection and trackingalgorithm for real-time applicationsrdquo in Advances in VisualComputing pp 682ndash691 Springer 2007

[25] K May and N Krouglicof ldquoMoving target detection for senseand avoid using regional phase correlationrdquo in Proceedings ofthe IEEE International Conference on Robotics and Automation(ICRA rsquo13) pp 4767ndash4772 IEEE Karlsruhe Germany May2013

[26] M E Angelopoulou and C-S Bouganis ldquoVision-based ego-motion estimation on FPGA for unmanned aerial vehiclenavigationrdquo IEEE Transactions on Circuits and Systems for VideoTechnology vol 24 no 6 pp 1070ndash1083 2014

[27] I M El-Emary andMM A El-Kareem ldquoOn the application ofgenetic algorithms in finger prints registrationrdquoWorld AppliedSciences Journal vol 5 no 3 pp 276ndash281 2008

[28] A A Goshtasby 2-D and 3-D Image Registration for MedicalRemote Sensing and Industrial Applications JohnWiley amp SonsNew York NY USA 2005

[29] C Harris and M Stephens ldquoA combined corner and edgedetectorrdquo in Proceedings of the 4th Alvey Vision Conference vol15 pp 147ndash151 1988

[30] E Rosten andTDrummond ldquoMachine learning for high-speedcorner detectionrdquo in Computer VisionmdashECCV 2006 pp 430ndash443 Springer 2006

[31] H Bay T Tuytelaars and L Van Gool ldquoSURF speededup robust featuresrdquo in Computer VisionmdashECCV 2006 ALeonardis H Bischof and A Pinz Eds vol 3951 of LectureNotes in Computer Science pp 404ndash417 Springer 2006

[32] G R Rodrıguez-Canosa SThomas J del Cerro A Barrientosand B MacDonald ldquoA real-time method to detect and trackmoving objects (DATMO) from unmanned aerial vehicles(UAVs) using a single camerardquo Remote Sensing vol 4 no 4 pp1090ndash1111 2012

[33] B Liu and A Zaccarin ldquoNew fast algorithms for the estimationof block motion vectorsrdquo IEEE Transactions on Circuits andSystems for Video Technology vol 3 no 2 pp 148ndash157 1993

[34] R Li B Zeng andM L Liou ldquoNew three-step search algorithmfor blockmotion estimationrdquo IEEE Transactions on Circuits andSystems for Video Technology vol 4 no 4 pp 438ndash442 1994

[35] L-M Po andW-C Ma ldquoA novel four-step search algorithm forfast blockmotion estimationrdquo IEEETransactions onCircuits andSystems for Video Technology vol 6 no 3 pp 313ndash317 1996

[36] S Zhu and K-K Ma ldquoA new diamond search algorithm forfast block-matching motion estimationrdquo IEEE Transactions onImage Processing vol 9 no 2 pp 287ndash290 2000

[37] L De Vos andM Stegherr ldquoParameterizable VLSI architecturesfor the full-search block-matching algorithmrdquo IEEE Transac-tions on Circuits and Systems vol 36 no 10 pp 1309ndash1316 1989

[38] M A Fischler and R C Bolles ldquoRandom sample consensus aparadigm for model fitting with applications to image analysisand automated cartographyrdquo Communications of the ACM vol24 no 6 pp 381ndash395 1981

[39] J W Tang N Shaikh-Husin and U U Sheikh ldquoFPGA imple-mentation of RANSAC algorithm for real-time image geometry

16 International Journal of Reconfigurable Computing

estimationrdquo in Proceedings of the 11th IEEE Student Conferenceon Research andDevelopment (SCOReD rsquo13) pp 290ndash294 IEEEPutrajaya Malaysia December 2013

[40] O Chum and J Matas ldquoRandomized ransac with T119889119889testrdquo in

Proceedings of the British Machine Vision Conference vol 2 pp448ndash457 September 2002

[41] DARPA SDMS PublicWeb Site 2003 httpswwwsdmsafrlafmil

[42] A F M S Saif A S Prabuwono and Z R MahayuddinldquoMotion analysis for moving object detection from UAV aerialimages a reviewrdquo in Proceedings of the International Conferenceon Informatics Electronics and Vision (ICIEV rsquo14) pp 1ndash6 IEEEDhaka Bangladesh May 2014

International Journal of

AerospaceEngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Active and Passive Electronic Components

Control Scienceand Engineering

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

RotatingMachinery

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation httpwwwhindawicom

Journal ofEngineeringVolume 2014

Submit your manuscripts athttpwwwhindawicom

VLSI Design

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Shock and Vibration

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawi Publishing Corporation httpwwwhindawicom

Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

SensorsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Navigation and Observation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

DistributedSensor Networks

International Journal of

12 International Journal of Reconfigurable Computing

(a) Frame 255 (b) Frame 275 (c) Frame 295 (d) Frame 315

(e) Frame 1000 (f) Frame 1020 (g) Frame 1040 (h) Frame 1060

(i) Frame 600 (j) Frame 620 (k) Frame 640 (l) Frame 660

Figure 11 Detected regions from the proposed moving target detection SoC on different sample videos in [41] Video numbers (a)ndash(d)V3V100003 004 video numbers (e)ndash(h) V3V100004 003 and video numbers (i)ndash(l) V4V100007 017

FP

FNTP

Figure 12 Evaluation of performancemetrics TP FP and FN basedon ground truth boxes (blue) and the detected region (red)

To obtain the performance metrics ground truth regionsare manually labelled in several frames of test videos Abounding box is drawn across each moving object to indicatethe ground truth region of every frame as depicted in Fig-ure 12 A simple postprocessing is performed on the detectedregion by filtering out the detected region smaller than 15pixelsrsquo width or 15 pixelsrsquo height prior to the evaluationA detected moving object (TP) has detected regions in itsbounded ground truth area while a nondetected movingobject (FN) has no detected region overlapping with itsground truth area Detected region that does not overlappwith any ground truth region is considered as false positive(FP)The detection performance is evaluated on different

parameters configuration The DR and FAR for 1000 testframes using different number of blocks (density in ego-motion estimation) 119898 times 119899 in area-based registration and

frame differencing threshold thfd are depicted in Table 5 andFigure 13The experiment results show that DR is almost similar

for different density of ego-motion estimation but decreaseswith thfd Although higher density in the proposed work haslower displacement limitation 119889

119898and 119889

119899as discussed in

Section 31 most of the point-to-point displacements do notexceed the limitation due to slowUAVmovement in themostframes of the test dataset On the contrary higher value of thfdmay filter out the moving object if the differences in intensityof the object pixels and background pixels are almost similarFAR decreases with density in ego-motion estimation

due to the higher quality in image registration process butincreases if most frames exceed the displacement limitation119889119898and 119889

119899 However false registration due to displacement

limitation results in a huge blob of foreground but does notgreatly increase FAR Although higher values of thfd decreasethe false detection rate they also produce smaller foregroundarea for all detected moving objects as pixels almost similarintensity with background will be thresholded

53 Speed Comparison with Full Software ImplementationThe computation speed of the proposed moving target detec-tion SoC is compared with software computation in differentplatforms including modern CPU (Intel Core i5) in desktopcomputer and embedded processor (ARM) Table 6 illustratesthe comparison of computation frame rate and hardware

International Journal of Reconfigurable Computing 13

1 2 3 4 5 6 7 8 9 100944

0946

0948

095

0952

0954

0956

0958

096

0962

0964

Det

ectio

n ra

teD

R

Density of ego-motion m times n

thfd = 15

thfd = 20

thfd = 25

(a) DR

1 2 3 4 5 6 7 8 9 100

01

02

03

04

05

06

07

False

alar

m ra

te F

AR

Density of ego-motion m times n

thfd = 15

thfd = 20

thfd = 25

(b) FAR

Figure 13 DR and FAR for different density in ego-motion estimation119898 times 119899 and frame differencing threshold thfd

Table 5 Performance evaluation in terms of DR and FAR for 1000frames using different density in ego-motion estimation119898times 119899 andframe differencing threshold thfd

119898 times 119899 thfd DR FAR12 15 0958 064312 20 0954 033112 25 0949 019424 15 0957 056824 20 0950 032424 25 0945 010135 15 0958 054835 20 0952 021535 25 0947 009048 15 0959 053948 20 0952 025348 25 0944 007970 15 0958 050970 20 0951 018870 25 0946 007588 15 0960 048988 20 0951 021988 25 0947 0074108 15 0958 0483108 20 0952 0168108 25 0946 0058140 15 0958 0499140 20 0951 0187140 25 0946 0059165 15 0958 0474165 20 0953 0214165 25 0947 0068192 15 0959 0478192 20 0952 0169192 25 0946 0092

Table 6 Computation speed comparison of the proposed sys-tem with different software implementation using area-based andfeature-based registrations

Platform Frequency Registrationtechnique

Framerate

Hardwarespeed-up

ProposedSoC 100MHz Area-based 30 1

Intel Corei5-4210U 170GHz Area-based 426 704

Feature-based 1311 229

ARM1176JZF 700MHz Area-based 020 150Feature-based 056 5357

speed-up between the proposed system and other softwareimplementations using test videos in [41]As feature-based image registration has faster computa-

tion in software implementation comparing to area-basedregistration speed performance of feature-based method isalso included for comparison In feature-based implementa-tion features are first detected in each frame The detectedfeatures from current frame are cross-correlatedwith featureswith previous framewhile RANSAC algorithm is used to esti-mate the ego-motion between frames After compensatingthe ego-motion segmentation ofmoving object uses the sameprocesses with the proposed system To further optimizethe software implementation in terms of speed performancea fast feature detection algorithm [30] is utilized As thenumber of features will affect the computation time in featurematching step only 100 strongest features in each frame areselected for processingHowever the performance evaluationdoes not consider multithreaded software execution

14 International Journal of Reconfigurable Computing

Table 7 Resources usage of the proposed moving target detectionSoC

Logic units Utilization ()Total combinational function 15161 13Total registers 10803 9Total memory bits 521054 13Embedded multiplier 27 5FPGA device Altera Cyclone IV

Based on experimental result the speed performanceof the proposed moving target detection SoC surpassesoptimized software computation by 229 times and 5357times compared with implementations in modern CPUand embedded CPU respectively The software computation(RANSAC) in HWSW codesign of the proposed system cre-ates speed bottleneck thus limiting the maximum through-put to 30 fps The processing frame rate of the proposed sys-tem can be further improved by using fully dedicated hard-ware

54 Resource Utilization The overall hardware resourcesutilization of the complete system is illustrated in Table 7This prototype of real-time moving object detection systemutilizes only less than 20 percent of total resources in AlteraCyclone IV FPGA device As the proposed system uses off-chip memory components for frame buffering FPGA on-chip memory is utilized only for line buffering in streamingprocess (eg block matching and median filtering) and stor-ing intermediate results (eg point pairs after block match-ing) Thus the low resource usage of the proposed systemprovides abundant hardware space for other processes such astarget tracking or classification to be developed in future

6 Conclusions

Moving target detection is a crucial step in most computervision problem especially for UAV applications On-chipdetection without the need of real-time video transmission toground will provide immense benefit to diverse applicationssuch as military surveillance and resource exploration Inorder to perform this complex embedded video processingon-chip FPGA-based system is desirable due to the potentialparallelism of the algorithmThis paper proposed a moving target detection system

using FPGA to enable autonomous UAVwhich is able to per-form the computer vision algorithm on the flying platformThe proposed system is prototyped using Altera CycloneIV FPGA device on Terasic DE2-115 development boardmounted with a TRDB-D5M camera This system is devel-oped as a HWSW codesign using dedicated hardware withNios II software processor (booted with embedded Linux)running at 100MHz clock rate As stream-oriented hardwarewith pipeline processing is utilized the proposed systemachieves real-time capability with 30 frames per secondprocessing speed on 640times 480 live video Experimental resultshows that the proposed SoC performs 229 times and 5357times faster than optimized software computation onmodern

desktop computer (Intel Core i5) and embedded processor(ARM) In addition the proposed moving target detectionuses only less than 20 percent of total resources in theFPGA device allowing other hardware accelerators to beimplemented in future

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

The authors would like to express their gratitude to UniversitiTeknologi Malaysia (UTM) and the Ministry of ScienceTechnology and Innovation (MOSTI) Malaysia for support-ing this research work under research Grants 01-01-06-SF1197and 01-01-06-SF1229

References

[1] A Ahmed M Nagai C Tianen and R Shibasaki ldquoUav basedmonitoring systemandobject detection technique developmentfor a disaster areardquo International Archives of PhotogrammetryRemote Sensing and Spatial Information Sciences vol 37 pp373ndash377 2008

[2] B Coifman M McCord R Mishalani M Iswalt and Y JildquoRoadway trafficmonitoring from an unmanned aerial vehiclerdquoIEE Proceedings-Intelligent Transport Systems vol 153 no 1 pp11ndash20 2006

[3] K Kanistras G Martins M J Rutherford and K P ValavanisldquoSurvey of unmanned aerial vehicles (uavs) for traffic monitor-ingrdquo inHandbook of Unmanned Aerial Vehicles pp 2643ndash2666Springer 2015

[4] K Nordberg P Doherty G Farneback et al ldquoVision for a UAVhelicopterrdquo in Proceedings of the International Conference onIntelligent Robots and Systems (IROS rsquo02) Workshop on AerialRobotics pp 29ndash34 Lausanne Switzerland October 2002

[5] D Zamalieva and A Yilmaz ldquoBackground subtraction for themoving camera a geometric approachrdquo Computer Vision andImage Understanding vol 127 pp 73ndash85 2014

[6] M Genovese and E Napoli ldquoASIC and FPGA implementationof the Gaussianmixturemodel algorithm for real-time segmen-tation of high definition videordquo IEEETransactions onVery LargeScale Integration (VLSI) Systems vol 22 no 3 pp 537ndash547 2014

[7] F Kristensen H Hedberg H Jiang P Nilsson and V OwallldquoAn embedded real-time surveillance system implementationand evaluationrdquo Journal of Signal Processing Systems vol 52 no1 pp 75ndash94 2008

[8] H Jiang H Ardo and V Owall ldquoA hardware architecturefor real-time video segmentation utilizing memory reductiontechniquesrdquo IEEETransactions onCircuits and Systems for VideoTechnology vol 19 no 2 pp 226ndash236 2009

[9] M Genovese and E Napoli ldquoFPGA-based architecture for realtime segmentation and denoising of HD videordquo Journal of Real-Time Image Processing vol 8 no 4 pp 389ndash401 2013

[10] A Lopez-Bravo J Diaz-Carmona A Ramirez-Agundis APadilla-Medina and J Prado-Olivarez ldquoFPGA-based videosystem for real time moving object detectionrdquo in Proceedings

International Journal of Reconfigurable Computing 15

of the 23rd International Conference on Electronics Communi-cations and Computing (CONIELECOMP rsquo13) pp 92ndash97 IEEECholula Mexico March 2013

[11] T Kryjak M Komorkiewicz and M Gorgon ldquoReal-time mov-ing object detection for video surveillance system in FPGArdquo inProceedings of the Conference on Design and Architectures forSignal and Image Processing (DASIP rsquo11) pp 1ndash8 IEEE TampereFinland November 2011

[12] A Mittal and D Huttenlocher ldquoScene modeling for wide areasurveillance and image synthesisrdquo in Proceedings of the IEEEConference on Computer Vision and Pattern Recognition vol 2pp 160ndash167 IEEE June 2000

[13] A Price J Pyke D Ashiri and T Cornall ldquoReal time objectdetection for an unmanned aerial vehicle using an FPGAbased vision systemrdquo in Proceedings of the IEEE InternationalConference on Robotics and Automation (ICRA rsquo06) pp 2854ndash2859 IEEE Orlando Fla USA May 2006

[14] G J Garcıa C A Jara J Pomares A Alabdo L M Poggi andF Torres ldquoA survey on FPGA-based sensor systems towardsintelligent and reconfigurable low-power sensors for computervision control and signal processingrdquo Sensors vol 14 no 4 pp6247ndash6278 2014

[15] S Ali and M Shah ldquoCocoa tracking in aerial imageryrdquo inAirborne Intelligence Surveillance Reconnaissance (ISR) Systemsand Applications III vol 6209 of Proceedings of SPIE OrlandoFla USA April 2006

[16] J Xiao C Yang F Han and H Cheng ldquoVehicle and persontracking in aerial videosrdquo in Multimodal Technologies for Per-ception of Humans pp 203ndash214 Springer 2008

[17] W Yu X Yu P Zhang and J Zhou ldquoA new framework of mov-ing target detection and tracking for uav video applicationrdquo inProceedings of the International Archives of the PhotogrammetryRemote Sensing and Spatial Information Science vol 37 BeijingChina 2008

[18] V Reilly H Idrees and M Shah ldquoDetection and tracking oflarge number of targets in wide area surveillancerdquo in ComputerVisionmdashECCV 2010 11th European Conference on ComputerVision Heraklion Crete Greece September 5ndash11 2010 Proceed-ings Part III vol 6313 of Lecture Notes in Computer Science pp186ndash199 Springer Berlin Germany 2010

[19] J Wang Y Zhang J Lu and W Xu ldquoA framework for movingtarget detection recognition and tracking in UAV videosrdquoin Affective Computing and Intelligent Interaction vol 137 ofAdvances in Intelligent and Soft Computing pp 69ndash76 SpringerBerlin Germany 2012

[20] S A Cheraghi andUU Sheikh ldquoMoving object detection usingimage registration for a moving camera platformrdquo in Proceed-ings of the IEEE International Conference on Control SystemComputing and Engineering (ICCSCE rsquo12) pp 355ndash359 IEEEPenang Malaysia November 2012

[21] Y Zhang X Tong T Yang and W Ma ldquoMulti-model estima-tion based moving object detection for aerial videordquo Sensorsvol 15 no 4 pp 8214ndash8231 2015

[22] Q Yu and G Medioni ldquoA GPU-based implementation ofmotion detection from a moving platformrdquo in Proceedings ofthe IEEE Computer Society Conference on Computer Vision andPattern RecognitionWorkshops (CVPR rsquo08) pp 1ndash6 AnchorageAlaska USA June 2008

[23] A Laika J Paul C Claus W Stechele A E S Auf and EMaehle ldquoFPGA-based real-time moving object detection for

walking robotsrdquo in Proceedings of the 8th IEEE InternationalWorkshop on Safety Security and Rescue Robotics (SSRR rsquo10) pp1ndash8 IEEE Bremen Germany July 2010

[24] B Tippetts S Fowers K Lillywhite D-J Lee and J ArchibaldldquoFPGA implementation of a feature detection and trackingalgorithm for real-time applicationsrdquo in Advances in VisualComputing pp 682ndash691 Springer 2007

[25] K May and N Krouglicof ldquoMoving target detection for senseand avoid using regional phase correlationrdquo in Proceedings ofthe IEEE International Conference on Robotics and Automation(ICRA rsquo13) pp 4767ndash4772 IEEE Karlsruhe Germany May2013

[26] M E Angelopoulou and C-S Bouganis ldquoVision-based ego-motion estimation on FPGA for unmanned aerial vehiclenavigationrdquo IEEE Transactions on Circuits and Systems for VideoTechnology vol 24 no 6 pp 1070ndash1083 2014

[27] I M El-Emary andMM A El-Kareem ldquoOn the application ofgenetic algorithms in finger prints registrationrdquoWorld AppliedSciences Journal vol 5 no 3 pp 276ndash281 2008

[28] A A Goshtasby 2-D and 3-D Image Registration for MedicalRemote Sensing and Industrial Applications JohnWiley amp SonsNew York NY USA 2005

[29] C Harris and M Stephens ldquoA combined corner and edgedetectorrdquo in Proceedings of the 4th Alvey Vision Conference vol15 pp 147ndash151 1988

[30] E Rosten andTDrummond ldquoMachine learning for high-speedcorner detectionrdquo in Computer VisionmdashECCV 2006 pp 430ndash443 Springer 2006

[31] H Bay T Tuytelaars and L Van Gool ldquoSURF speededup robust featuresrdquo in Computer VisionmdashECCV 2006 ALeonardis H Bischof and A Pinz Eds vol 3951 of LectureNotes in Computer Science pp 404ndash417 Springer 2006

[32] G R Rodrıguez-Canosa SThomas J del Cerro A Barrientosand B MacDonald ldquoA real-time method to detect and trackmoving objects (DATMO) from unmanned aerial vehicles(UAVs) using a single camerardquo Remote Sensing vol 4 no 4 pp1090ndash1111 2012

[33] B Liu and A Zaccarin ldquoNew fast algorithms for the estimationof block motion vectorsrdquo IEEE Transactions on Circuits andSystems for Video Technology vol 3 no 2 pp 148ndash157 1993

[34] R Li B Zeng andM L Liou ldquoNew three-step search algorithmfor blockmotion estimationrdquo IEEE Transactions on Circuits andSystems for Video Technology vol 4 no 4 pp 438ndash442 1994

[35] L-M Po andW-C Ma ldquoA novel four-step search algorithm forfast blockmotion estimationrdquo IEEETransactions onCircuits andSystems for Video Technology vol 6 no 3 pp 313ndash317 1996

[36] S Zhu and K-K Ma ldquoA new diamond search algorithm forfast block-matching motion estimationrdquo IEEE Transactions onImage Processing vol 9 no 2 pp 287ndash290 2000

[37] L De Vos andM Stegherr ldquoParameterizable VLSI architecturesfor the full-search block-matching algorithmrdquo IEEE Transac-tions on Circuits and Systems vol 36 no 10 pp 1309ndash1316 1989

[38] M A Fischler and R C Bolles ldquoRandom sample consensus aparadigm for model fitting with applications to image analysisand automated cartographyrdquo Communications of the ACM vol24 no 6 pp 381ndash395 1981

[39] J W Tang N Shaikh-Husin and U U Sheikh ldquoFPGA imple-mentation of RANSAC algorithm for real-time image geometry

16 International Journal of Reconfigurable Computing

estimationrdquo in Proceedings of the 11th IEEE Student Conferenceon Research andDevelopment (SCOReD rsquo13) pp 290ndash294 IEEEPutrajaya Malaysia December 2013

[40] O Chum and J Matas ldquoRandomized ransac with T119889119889testrdquo in

Proceedings of the British Machine Vision Conference vol 2 pp448ndash457 September 2002

[41] DARPA SDMS PublicWeb Site 2003 httpswwwsdmsafrlafmil

[42] A F M S Saif A S Prabuwono and Z R MahayuddinldquoMotion analysis for moving object detection from UAV aerialimages a reviewrdquo in Proceedings of the International Conferenceon Informatics Electronics and Vision (ICIEV rsquo14) pp 1ndash6 IEEEDhaka Bangladesh May 2014

International Journal of

AerospaceEngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Active and Passive Electronic Components

Control Scienceand Engineering

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

RotatingMachinery

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation httpwwwhindawicom

Journal ofEngineeringVolume 2014

Submit your manuscripts athttpwwwhindawicom

VLSI Design

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Shock and Vibration

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawi Publishing Corporation httpwwwhindawicom

Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

SensorsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Navigation and Observation

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

DistributedSensor Networks

International Journal of

International Journal of Reconfigurable Computing 13

1 2 3 4 5 6 7 8 9 100944

0946

0948

095

0952

0954

0956

0958

096

0962

0964

Det

ectio

n ra

teD

R

Density of ego-motion m times n

thfd = 15

thfd = 20

thfd = 25

(a) DR

1 2 3 4 5 6 7 8 9 100

01

02

03

04

05

06

07

False

alar

m ra

te F

AR

Density of ego-motion m times n

thfd = 15

thfd = 20

thfd = 25

(b) FAR

Figure 13 DR and FAR for different density in ego-motion estimation119898 times 119899 and frame differencing threshold thfd

Table 5 Performance evaluation in terms of DR and FAR for 1000frames using different density in ego-motion estimation119898times 119899 andframe differencing threshold thfd

119898 times 119899 thfd DR FAR12 15 0958 064312 20 0954 033112 25 0949 019424 15 0957 056824 20 0950 032424 25 0945 010135 15 0958 054835 20 0952 021535 25 0947 009048 15 0959 053948 20 0952 025348 25 0944 007970 15 0958 050970 20 0951 018870 25 0946 007588 15 0960 048988 20 0951 021988 25 0947 0074108 15 0958 0483108 20 0952 0168108 25 0946 0058140 15 0958 0499140 20 0951 0187140 25 0946 0059165 15 0958 0474165 20 0953 0214165 25 0947 0068192 15 0959 0478192 20 0952 0169192 25 0946 0092

Table 6: Computation speed comparison of the proposed system with different software implementations using area-based and feature-based registrations.

Platform              Frequency   Registration technique   Frame rate (fps)   Hardware speed-up
Proposed SoC          100 MHz     Area-based               30                 1
Intel Core i5-4210U   1.70 GHz    Area-based               4.26               7.04
                                  Feature-based            13.11              2.29
ARM1176JZF            700 MHz     Area-based               0.20               150
                                  Feature-based            0.56               53.57

speed-up between the proposed system and other software implementations using test videos in [41]. As feature-based image registration has faster computation in software than area-based registration, the speed performance of the feature-based method is also included for comparison. In the feature-based implementation, features are first detected in each frame. The detected features from the current frame are cross-correlated with features from the previous frame, while the RANSAC algorithm is used to estimate the ego-motion between frames. After compensating for the ego-motion, segmentation of moving objects uses the same processes as the proposed system. To further optimize the software implementation in terms of speed, a fast feature detection algorithm [30] is utilized. As the number of features affects the computation time in the feature matching step, only the 100 strongest features in each frame are selected for processing. Note, however, that the performance evaluation does not consider multithreaded software execution.
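As a rough software counterpart to the pipeline just described, the sketch below chains FAST corner detection [30], feature matching against the previous frame, RANSAC-fitted affine ego-motion estimation, and compensation before frame differencing. It is an illustrative OpenCV reconstruction under stated assumptions, not the benchmarked code: pyramidal Lucas-Kanade tracking stands in for the cross-correlation matching, the helper names are hypothetical, and only the 100-feature cap mirrors the setup above.

```python
import cv2
import numpy as np

fast = cv2.FastFeatureDetector_create(threshold=20)  # fast corner detector [30]

def estimate_ego_motion(prev, curr, max_features=100):
    """Feature-based registration sketch: FAST + matching + RANSAC."""
    kps = fast.detect(prev, None)
    kps = sorted(kps, key=lambda k: -k.response)[:max_features]  # 100 strongest
    pts_prev = np.float32([k.pt for k in kps]).reshape(-1, 1, 2)
    # Track features into the current frame (stand-in for the
    # cross-correlation matching described in the text).
    pts_curr, status, _ = cv2.calcOpticalFlowPyrLK(prev, curr, pts_prev, None)
    good = status.ravel() == 1
    # RANSAC-fitted affine model approximates the ego-motion between frames.
    model, _ = cv2.estimateAffine2D(pts_prev[good], pts_curr[good],
                                    method=cv2.RANSAC)
    return model

def detect_moving_objects(prev, curr, th_fd=20):
    h, w = curr.shape
    model = estimate_ego_motion(prev, curr)
    registered = cv2.warpAffine(prev, model, (w, h))  # compensate ego-motion
    diff = cv2.absdiff(registered, curr)              # same segmentation steps
    _, mask = cv2.threshold(diff, th_fd, 255, cv2.THRESH_BINARY)
    return cv2.medianBlur(mask, 5)
```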


Table 7: Resource usage of the proposed moving target detection SoC.

Logic units                     Utilization   (%)
Total combinational functions   15,161        13
Total registers                 10,803        9
Total memory bits               521,054       13
Embedded multipliers            27            5
FPGA device: Altera Cyclone IV

Based on the experimental results, the speed performance of the proposed moving target detection SoC surpasses optimized software computation by 2.29 times and 53.57 times compared with implementations on a modern CPU and an embedded CPU, respectively. The software computation (RANSAC) in the HW/SW codesign of the proposed system creates a speed bottleneck, thus limiting the maximum throughput to 30 fps. The processing frame rate of the proposed system can be further improved by using fully dedicated hardware.
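For readers cross-checking Table 6, the hardware speed-up reported there is the ratio of the proposed SoC's frame rate to the software frame rate:

speed-up = f_SoC / f_SW, e.g., 30 / 13.11 ≈ 2.29 and 30 / 0.56 ≈ 53.57.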

5.4. Resource Utilization. The overall hardware resource utilization of the complete system is shown in Table 7. This prototype of the real-time moving object detection system utilizes less than 20 percent of the total resources in the Altera Cyclone IV FPGA device. As the proposed system uses off-chip memory components for frame buffering, FPGA on-chip memory is utilized only for line buffering in streaming processes (e.g., block matching and median filtering) and for storing intermediate results (e.g., point pairs after block matching). Thus, the low resource usage of the proposed system leaves abundant hardware space for other processes, such as target tracking or classification, to be developed in the future.

6. Conclusions

Moving target detection is a crucial step in most computer vision problems, especially for UAV applications. On-chip detection without the need for real-time video transmission to the ground provides immense benefit to diverse applications such as military surveillance and resource exploration. In order to perform this complex embedded video processing on-chip, an FPGA-based system is desirable due to the potential parallelism of the algorithm.

This paper proposed a moving target detection system using FPGA to enable an autonomous UAV which is able to perform the computer vision algorithm on the flying platform. The proposed system is prototyped using an Altera Cyclone IV FPGA device on a Terasic DE2-115 development board mounted with a TRDB-D5M camera. The system is developed as a HW/SW codesign using dedicated hardware with a Nios II software processor (booted with embedded Linux) running at a 100 MHz clock rate. As stream-oriented hardware with pipeline processing is utilized, the proposed system achieves real-time capability with a processing speed of 30 frames per second on 640 × 480 live video. Experimental results show that the proposed SoC performs 2.29 times and 53.57 times faster than optimized software computation on a modern desktop computer (Intel Core i5) and an embedded processor (ARM), respectively. In addition, the proposed moving target detection uses less than 20 percent of the total resources in the FPGA device, allowing other hardware accelerators to be implemented in the future.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

The authors would like to express their gratitude to Universiti Teknologi Malaysia (UTM) and the Ministry of Science, Technology and Innovation (MOSTI), Malaysia, for supporting this research work under Research Grants 01-01-06-SF1197 and 01-01-06-SF1229.

References

[1] A. Ahmed, M. Nagai, C. Tianen, and R. Shibasaki, "UAV based monitoring system and object detection technique development for a disaster area," International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. 37, pp. 373–377, 2008.

[2] B. Coifman, M. McCord, R. Mishalani, M. Iswalt, and Y. Ji, "Roadway traffic monitoring from an unmanned aerial vehicle," IEE Proceedings–Intelligent Transport Systems, vol. 153, no. 1, pp. 11–20, 2006.

[3] K. Kanistras, G. Martins, M. J. Rutherford, and K. P. Valavanis, "Survey of unmanned aerial vehicles (UAVs) for traffic monitoring," in Handbook of Unmanned Aerial Vehicles, pp. 2643–2666, Springer, 2015.

[4] K. Nordberg, P. Doherty, G. Farneback et al., "Vision for a UAV helicopter," in Proceedings of the International Conference on Intelligent Robots and Systems (IROS '02), Workshop on Aerial Robotics, pp. 29–34, Lausanne, Switzerland, October 2002.

[5] D. Zamalieva and A. Yilmaz, "Background subtraction for the moving camera: a geometric approach," Computer Vision and Image Understanding, vol. 127, pp. 73–85, 2014.

[6] M. Genovese and E. Napoli, "ASIC and FPGA implementation of the Gaussian mixture model algorithm for real-time segmentation of high definition video," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 22, no. 3, pp. 537–547, 2014.

[7] F. Kristensen, H. Hedberg, H. Jiang, P. Nilsson, and V. Owall, "An embedded real-time surveillance system: implementation and evaluation," Journal of Signal Processing Systems, vol. 52, no. 1, pp. 75–94, 2008.

[8] H. Jiang, H. Ardo, and V. Owall, "A hardware architecture for real-time video segmentation utilizing memory reduction techniques," IEEE Transactions on Circuits and Systems for Video Technology, vol. 19, no. 2, pp. 226–236, 2009.

[9] M. Genovese and E. Napoli, "FPGA-based architecture for real time segmentation and denoising of HD video," Journal of Real-Time Image Processing, vol. 8, no. 4, pp. 389–401, 2013.

[10] A. Lopez-Bravo, J. Diaz-Carmona, A. Ramirez-Agundis, A. Padilla-Medina, and J. Prado-Olivarez, "FPGA-based video system for real time moving object detection," in Proceedings of the 23rd International Conference on Electronics, Communications and Computing (CONIELECOMP '13), pp. 92–97, IEEE, Cholula, Mexico, March 2013.

[11] T. Kryjak, M. Komorkiewicz, and M. Gorgon, "Real-time moving object detection for video surveillance system in FPGA," in Proceedings of the Conference on Design and Architectures for Signal and Image Processing (DASIP '11), pp. 1–8, IEEE, Tampere, Finland, November 2011.

[12] A. Mittal and D. Huttenlocher, "Scene modeling for wide area surveillance and image synthesis," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 160–167, IEEE, June 2000.

[13] A. Price, J. Pyke, D. Ashiri, and T. Cornall, "Real time object detection for an unmanned aerial vehicle using an FPGA based vision system," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '06), pp. 2854–2859, IEEE, Orlando, Fla, USA, May 2006.

[14] G. J. García, C. A. Jara, J. Pomares, A. Alabdo, L. M. Poggi, and F. Torres, "A survey on FPGA-based sensor systems: towards intelligent and reconfigurable low-power sensors for computer vision, control and signal processing," Sensors, vol. 14, no. 4, pp. 6247–6278, 2014.

[15] S. Ali and M. Shah, "COCOA: tracking in aerial imagery," in Airborne Intelligence, Surveillance, Reconnaissance (ISR) Systems and Applications III, vol. 6209 of Proceedings of SPIE, Orlando, Fla, USA, April 2006.

[16] J. Xiao, C. Yang, F. Han, and H. Cheng, "Vehicle and person tracking in aerial videos," in Multimodal Technologies for Perception of Humans, pp. 203–214, Springer, 2008.

[17] W. Yu, X. Yu, P. Zhang, and J. Zhou, "A new framework of moving target detection and tracking for UAV video application," in Proceedings of the International Archives of the Photogrammetry, Remote Sensing and Spatial Information Science, vol. 37, Beijing, China, 2008.

[18] V. Reilly, H. Idrees, and M. Shah, "Detection and tracking of large number of targets in wide area surveillance," in Computer Vision–ECCV 2010: 11th European Conference on Computer Vision, Heraklion, Crete, Greece, September 5–11, 2010, Proceedings, Part III, vol. 6313 of Lecture Notes in Computer Science, pp. 186–199, Springer, Berlin, Germany, 2010.

[19] J. Wang, Y. Zhang, J. Lu, and W. Xu, "A framework for moving target detection, recognition and tracking in UAV videos," in Affective Computing and Intelligent Interaction, vol. 137 of Advances in Intelligent and Soft Computing, pp. 69–76, Springer, Berlin, Germany, 2012.

[20] S. A. Cheraghi and U. U. Sheikh, "Moving object detection using image registration for a moving camera platform," in Proceedings of the IEEE International Conference on Control System, Computing and Engineering (ICCSCE '12), pp. 355–359, IEEE, Penang, Malaysia, November 2012.

[21] Y. Zhang, X. Tong, T. Yang, and W. Ma, "Multi-model estimation based moving object detection for aerial video," Sensors, vol. 15, no. 4, pp. 8214–8231, 2015.

[22] Q. Yu and G. Medioni, "A GPU-based implementation of motion detection from a moving platform," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPR '08), pp. 1–6, Anchorage, Alaska, USA, June 2008.

[23] A. Laika, J. Paul, C. Claus, W. Stechele, A. E. S. Auf, and E. Maehle, "FPGA-based real-time moving object detection for walking robots," in Proceedings of the 8th IEEE International Workshop on Safety, Security and Rescue Robotics (SSRR '10), pp. 1–8, IEEE, Bremen, Germany, July 2010.

[24] B. Tippetts, S. Fowers, K. Lillywhite, D.-J. Lee, and J. Archibald, "FPGA implementation of a feature detection and tracking algorithm for real-time applications," in Advances in Visual Computing, pp. 682–691, Springer, 2007.

[25] K. May and N. Krouglicof, "Moving target detection for sense and avoid using regional phase correlation," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA '13), pp. 4767–4772, IEEE, Karlsruhe, Germany, May 2013.

[26] M. E. Angelopoulou and C.-S. Bouganis, "Vision-based ego-motion estimation on FPGA for unmanned aerial vehicle navigation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 24, no. 6, pp. 1070–1083, 2014.

[27] I. M. El-Emary and M. M. A. El-Kareem, "On the application of genetic algorithms in finger prints registration," World Applied Sciences Journal, vol. 5, no. 3, pp. 276–281, 2008.

[28] A. A. Goshtasby, 2-D and 3-D Image Registration for Medical, Remote Sensing, and Industrial Applications, John Wiley & Sons, New York, NY, USA, 2005.

[29] C. Harris and M. Stephens, "A combined corner and edge detector," in Proceedings of the 4th Alvey Vision Conference, vol. 15, pp. 147–151, 1988.

[30] E. Rosten and T. Drummond, "Machine learning for high-speed corner detection," in Computer Vision–ECCV 2006, pp. 430–443, Springer, 2006.

[31] H. Bay, T. Tuytelaars, and L. Van Gool, "SURF: speeded up robust features," in Computer Vision–ECCV 2006, A. Leonardis, H. Bischof, and A. Pinz, Eds., vol. 3951 of Lecture Notes in Computer Science, pp. 404–417, Springer, 2006.

[32] G. R. Rodríguez-Canosa, S. Thomas, J. del Cerro, A. Barrientos, and B. MacDonald, "A real-time method to detect and track moving objects (DATMO) from unmanned aerial vehicles (UAVs) using a single camera," Remote Sensing, vol. 4, no. 4, pp. 1090–1111, 2012.

[33] B. Liu and A. Zaccarin, "New fast algorithms for the estimation of block motion vectors," IEEE Transactions on Circuits and Systems for Video Technology, vol. 3, no. 2, pp. 148–157, 1993.

[34] R. Li, B. Zeng, and M. L. Liou, "New three-step search algorithm for block motion estimation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 4, no. 4, pp. 438–442, 1994.

[35] L.-M. Po and W.-C. Ma, "A novel four-step search algorithm for fast block motion estimation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 6, no. 3, pp. 313–317, 1996.

[36] S. Zhu and K.-K. Ma, "A new diamond search algorithm for fast block-matching motion estimation," IEEE Transactions on Image Processing, vol. 9, no. 2, pp. 287–290, 2000.

[37] L. De Vos and M. Stegherr, "Parameterizable VLSI architectures for the full-search block-matching algorithm," IEEE Transactions on Circuits and Systems, vol. 36, no. 10, pp. 1309–1316, 1989.

[38] M. A. Fischler and R. C. Bolles, "Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography," Communications of the ACM, vol. 24, no. 6, pp. 381–395, 1981.

[39] J. W. Tang, N. Shaikh-Husin, and U. U. Sheikh, "FPGA implementation of RANSAC algorithm for real-time image geometry estimation," in Proceedings of the 11th IEEE Student Conference on Research and Development (SCOReD '13), pp. 290–294, IEEE, Putrajaya, Malaysia, December 2013.

[40] O. Chum and J. Matas, "Randomized RANSAC with T_{d,d} test," in Proceedings of the British Machine Vision Conference, vol. 2, pp. 448–457, September 2002.

[41] DARPA, SDMS Public Web Site, 2003, https://www.sdms.afrl.af.mil.

[42] A. F. M. S. Saif, A. S. Prabuwono, and Z. R. Mahayuddin, "Motion analysis for moving object detection from UAV aerial images: a review," in Proceedings of the International Conference on Informatics, Electronics and Vision (ICIEV '14), pp. 1–6, IEEE, Dhaka, Bangladesh, May 2014.
