Underwater Object Tracking Benchmark and Dataset


Underwater Object Tracking Benchmark and Dataset

Landry Kezebou, Student Member, IEEE; Victor Oludare, Student Member, IEEE; Karen Panetta, Fellow, IEEE
Department of Electrical and Computer Engineering, Medford, MA, USA
[email protected], [email protected], [email protected]

Sos S. Agaian, Fellow, IEEE
Department of Computer Science, The City University of New York, New York City, NY, USA
[email protected]

Abstract—While there has been tremendous advancement in object tracking for open-air visual data, much less work has been done on underwater object tracking. This is due to the low quality of underwater visual data, which suffers distortions in contrast and sharpness caused by refraction, absorption of light, and suspended particles, all of which vary with the depth, color, and nature of the water. Although several object tracking algorithms exist with proven records of high speed, precision, and success rate, these algorithms work best for open-air tracking and degrade considerably when tracking targets in underwater environments, as this paper demonstrates. The advancement made in open-air tracking has been facilitated by the availability of multiple benchmarks and datasets. However, no such benchmark or dataset exists for underwater tracking, and this lack of data has hindered the development of dedicated underwater tracking algorithms. In this paper, we present: a) the first underwater tracking benchmark dataset, consisting of 32 videos and a total of 24,241 annotated frames, averaging 29.15 seconds and 757.53 frames per video, to help improve underwater tracking; and b) a comparative performance analysis of existing tracking algorithms in underwater environments as opposed to open air.

Index Terms—Underwater visual data, object tracking, benchmark dataset, performance evaluation, image enhancement

I. INTRODUCTION

Object tracking is one of the most important problems in computer vision, and as such it has attracted the attention of many researchers in recent years. It finds application in domains such as homeland security, port and marine security, search and rescue operations, disaster recovery, human-computer interaction, video communication and compression, augmented reality, traffic control, medical imaging, and video editing [1]. Here, the focus is on marine border security applications.

There have been several object tracking benchmarks for both single-object and multiple-object tracking. The most popular include the Multiple Object Tracking (MOT) benchmark [5], the Visual Object Tracking (VOT) benchmark [3], the Object Tracking Benchmark (OTB) [4], and the Thermal Object Tracking (TOT) benchmark [2]. Competitions such as the VOT challenge [3], the MOT challenge [5], the NUS-PRO VOT challenge [6], and the Thermal Infrared Object Tracking challenge [7] have helped push the boundaries of these benchmarks, with consistent improvements in speed, precision, and success rate. Unfortunately, all the existing tracking benchmark datasets focus on open-air target tracking; there is currently no such benchmark or research competition for underwater object tracking. The low quality of underwater visual data, due to distortions in contrast, sharpness, and color caused by absorption and refraction of light and by suspended particles, has made underwater tracking less attractive to researchers, and as such it has not received the same attention as its open-air counterpart. While there has been tremendous advancement and success in open-air tracking, existing trackers degrade considerably in performance when tested on underwater visual data, hence the need to create an underwater tracking benchmark.

978-1-7281-5092-5/19/$31.00 ©2019 IEEE


Figure 1. Example of distorted underwater images. Left to right: diver, sea turtle, fish, largemouth bass.

Our main contributions include: a) the creation of the first underwater object tracking dataset with complete annotations to facilitate benchmarking; and b) the creation of the first Underwater Object Tracking (UOT) benchmark, obtained by performing a comparative analysis of existing trackers and documenting their current performance.

The rest of this paper is structured as follows: Section II presents the background, Section III describes the dataset, Section IV presents the evaluation methodology, Section V presents the tracking benchmark results, and Section VI concludes the paper.

II. BACKGROUND

The aforementioned research competitions for single- and multiple-object tracking have spurred the development and improvement of several tracking algorithms, each suitable for a variety of applications and most efficient for particular tasks, often trading off speed against precision and success rate. Tracking algorithms are developed using different methods, including correlation-filter-based tracking [8], target-estimation-based tracking [9], and spatial and intensity information [10], [11].

Some of the popular object tracking algorithms include the Kernelized Correlation Filter (KCF) [8], the TLD tracker [12], Boosting [13], CSRT [14], MIL [15], GOTURN [16], MOSSE [17], MEDIANFLOW [18], and the LKT tracker [11], most of which are available in the OpenCV library. Other state-of-the-art trackers include ECO [19], CCOT [20], STAPLE [21], STRCF [22], BACF [23], DCF [24], and SAMF [25], whose MATLAB-based source code is available on GitHub. The above-mentioned trackers are considered state-of-the-art tracking algorithms based on their performance on readily available open-air tracking data. We refer the reader to the respective publications and GitHub repositories to learn more about how each of these trackers operates.

A few underwater tracking algorithms exist, using different techniques such as time-frequency signatures [26], weighted template matching [27], Kalman filtering [28], and color-based light attenuation [29]. However, these methods do not generalize well, and they do not perform comparably to state-of-the-art open-air object tracking algorithms. S. Bazeille et al. [29] proposed a color-based method for detecting and tracking underwater objects using light attenuation and a color scheme. The algorithm tracks targets by simply comparing pixel colors in each frame to the prior known colors of the target of interest. However, this method does not account for other types of underwater distortion, such as particles, depth, absorption, and refraction, and it will fail if multiple objects of the same color, such as fishes or turtles, are present in the frame. D. Walther et al. [28] proposed a system for tracking multiple objects in underwater environments. It uses a selective attention algorithm to reduce the complexity of multi-target tracking. Detection of new objects is done using a saliency-based bottom-up attention system, and a Kalman filter is used to track the centroid of each detected object. However, the speed and accuracy still leave much room for improvement.
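Kalman-filter-based centroid tracking of the kind used in [28] can be illustrated with a minimal constant-velocity filter. This is a generic sketch, not the authors' implementation; the class name and the noise parameters q and r are our own illustrative choices. Each image axis gets its own filter.

```python
# Minimal per-axis constant-velocity Kalman filter for a tracked centroid.
# Illustrative sketch only; q (process noise) and r (measurement noise)
# are hypothetical values, not taken from the cited work.

class Kalman1D:
    def __init__(self, pos, q=1e-2, r=1.0):
        self.x = [pos, 0.0]                 # state: [position, velocity]
        self.P = [[1.0, 0.0], [0.0, 1.0]]   # state covariance
        self.q, self.r = q, r               # process / measurement noise

    def predict(self, dt=1.0):
        # x <- F x with F = [[1, dt], [0, 1]]
        self.x = [self.x[0] + dt * self.x[1], self.x[1]]
        p00, p01 = self.P[0]
        p10, p11 = self.P[1]
        # P <- F P F^T + Q, with Q = q * I for simplicity
        self.P = [[p00 + dt * (p10 + p01) + dt * dt * p11 + self.q,
                   p01 + dt * p11],
                  [p10 + dt * p11, p11 + self.q]]
        return self.x[0]

    def update(self, z):
        # Measurement is the observed position: H = [1, 0]
        s = self.P[0][0] + self.r           # innovation covariance
        k0 = self.P[0][0] / s               # Kalman gain (position)
        k1 = self.P[1][0] / s               # Kalman gain (velocity)
        y = z - self.x[0]                   # innovation
        self.x = [self.x[0] + k0 * y, self.x[1] + k1 * y]
        p00, p01 = self.P[0]
        p10, p11 = self.P[1]
        # P <- (I - K H) P
        self.P = [[(1 - k0) * p00, (1 - k0) * p01],
                  [p10 - k1 * p00, p11 - k1 * p01]]
        return self.x[0]
```

A 2-D centroid is tracked with two such filters, one per coordinate; fed a target moving one pixel per frame, the velocity estimate converges to 1.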

D. Kim et al. [27] proposed using a weighted correlation coefficient for underwater target tracking. The approach consists of using texture- and color-based features to perform template matching of the target under various lighting conditions. Objects are detected using multiple-template-based selection, and tracking is done using a mean-shift-based object tracking method. The algorithm also resorts to Gaussian smoothing and histogram equalization to compensate for distortion. The proposed method is robust to illumination change but does not perform well under other forms of underwater distortion.

In this paper, we demonstrate the weaknesses of existing state-of-the-art tracking algorithms on underwater visual data and propose a dataset and benchmark to encourage the development of more robust underwater tracking algorithms.
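Template matching of the kind used in [27] scores candidate windows against a stored template. The weighted correlation coefficient of [27] is more elaborate, but plain normalized cross-correlation conveys the core idea; the function names and toy data below are ours.

```python
import math

def ncc(patch, template):
    """Normalized cross-correlation between two equal-sized grayscale
    patches (lists of lists). Returns a score in [-1, 1]."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    da = math.sqrt(sum((x - ma) ** 2 for x in a))
    db = math.sqrt(sum((y - mb) ** 2 for y in b))
    return num / (da * db) if da and db else 0.0

def best_match(frame, template):
    """Slide the template over the frame; return the top-left (x, y)
    of the highest-scoring window."""
    th, tw = len(template), len(template[0])
    best, best_pos = -2.0, (0, 0)
    for y in range(len(frame) - th + 1):
        for x in range(len(frame[0]) - tw + 1):
            patch = [row[x:x + tw] for row in frame[y:y + th]]
            score = ncc(patch, template)
            if score > best:
                best, best_pos = score, (x, y)
    return best_pos
```

On a toy frame containing an exact copy of the template, the search returns the copy's location with a perfect score of 1.0.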

III. DATASET

As mentioned earlier, there currently exist numerous object tracking datasets and benchmarks. However, most of these datasets focus on open-air or surface tracking.



Figure 2. Sample tracking videos from our UOT32 dataset

Figure 2 shows sample frames from videos in the dataset, with the ground-truth bounding box for the target in the corresponding frame.

Our dataset consists of 32 videos, for a total of 933 seconds and 24,241 annotated frames; each video averages 29.15 seconds and 757.53 frames. The video data are sourced from YouTube and consist of a combination of naturally distorted underwater visual data as well as artificial underwater data from Subnautica [30]. Figure 3 shows the distribution of the total number of frames per video in the dataset.
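The per-video averages follow directly from the totals reported above; a quick sanity check:

```python
total_videos = 32
total_frames = 24241
total_seconds = 933

avg_frames = total_frames / total_videos    # 757.53125 -> 757.53 per video
avg_seconds = total_seconds / total_videos  # 29.15625 -> about 29.16 s
                                            # (reported as 29.15 in the text)
print(round(avg_frames, 2), round(avg_seconds, 2))
```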

Figure 3. Distribution of Annotated frames

IV. EVALUATION METHODOLOGY

For an adequate and fair evaluation of existing state-of-the-art object tracking algorithms on our dataset, we run up to 15 popular trackers whose original source code is publicly available. Most of the trackers used here for evaluation are listed among the top-performing tracking algorithms of the OTB and VOT object tracking challenges.

We adopt two common evaluation metrics to quantitatively analyze the performance of the selected tracking algorithms on our UOT32 dataset. As in the OTB and VOT benchmarks, we perform a One-Pass Evaluation (OPE) and measure the precision as well as the success rate of each tracker over the entire 32-video dataset. Precision is measured as the Euclidean distance, in pixels, between the center of the ground-truth bounding box C_grd-bbox and the center of the tracker bounding box C_tkr-bbox. A threshold of up to 50 pixels is used for ranking tracker performance and plotting the corresponding precision [31].

P = ||C_grd-bbox - C_tkr-bbox|| = sqrt((x_grd - x_tkr)^2 + (y_grd - y_tkr)^2)
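Precision at a given threshold is then the fraction of frames whose center error falls within that threshold. A minimal sketch (function names are ours; the 50-pixel threshold follows the benchmark):

```python
import math

def center_distance(c_gt, c_tr):
    """Euclidean distance in pixels between ground-truth and tracker
    bounding-box centers, each given as (x, y)."""
    return math.hypot(c_gt[0] - c_tr[0], c_gt[1] - c_tr[1])

def precision(gt_centers, tr_centers, threshold=50.0):
    """Fraction of frames whose center error is within `threshold` pixels."""
    hits = sum(1 for g, t in zip(gt_centers, tr_centers)
               if center_distance(g, t) <= threshold)
    return hits / len(gt_centers)
```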

The success rate, on the other hand, measures the amount of overlap, or Intersection over Union (IoU), between the pixels of the ground-truth and tracker bounding boxes. The performance on a sequence is measured by counting the total number of frames in which the bounding-box overlap is greater than a given threshold value. The tracking algorithms are ranked using the Area Under the Curve (AUC).

S = |R_tr ∩ R_gt| / |R_tr ∪ R_gt|

where R_tr and R_gt represent the tracker and ground-truth bounding boxes for a given frame, respectively.
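The overlap score and the resulting per-sequence success rate can be sketched as follows, assuming axis-aligned boxes given as (x, y, w, h); function names are ours:

```python
def iou(box_tr, box_gt):
    """Intersection over Union of two boxes given as (x, y, w, h)."""
    x1, y1, w1, h1 = box_tr
    x2, y2, w2, h2 = box_gt
    ix = max(0, min(x1 + w1, x2 + w2) - max(x1, x2))  # overlap width
    iy = max(0, min(y1 + h1, y2 + h2) - max(y1, y2))  # overlap height
    inter = ix * iy
    union = w1 * h1 + w2 * h2 - inter
    return inter / union if union else 0.0

def success_rate(tr_boxes, gt_boxes, threshold=0.5):
    """Fraction of frames whose overlap exceeds `threshold`."""
    hits = sum(1 for t, g in zip(tr_boxes, gt_boxes)
               if iou(t, g) > threshold)
    return hits / len(gt_boxes)
```

Sweeping the threshold from 0 to 1 and averaging the resulting success rates yields the AUC used for ranking.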



V. TRACKING BENCHMARK

The evaluation is conducted on an Alienware Area-51m laptop with the following configuration: 9th-generation Intel Core i9-9900 (8 cores, 16 MB cache, up to 5.0 GHz with Turbo Boost); NVIDIA GeForce RTX 2080 11 GB GPU; 64 GB (2x32 GB) DDR4 RAM at 2666 MHz. For the One-Pass Evaluation, each tracker is tested on all 32 videos and 24,241 frames.

Overall Performance

The plot in Figure 4 shows the average frames per second (FPS) of each tracker across all videos in the dataset. To evaluate the average FPS performance of a particular tracker, we first measure the tracker's average FPS on each individual sequence.

The overall FPS performance of each tracker is then computed as the mean of its average FPS over all sequences. In a similar fashion, we compute the overall precision as well as the overall success rate of each tracker on the whole dataset, all in One-Pass Evaluation fashion.
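This mean-of-per-sequence-means aggregation weights every video equally, regardless of its length. A small sketch with illustrative numbers (not measurements from the paper):

```python
def overall(per_frame_values_by_sequence):
    """Mean of per-sequence means: each video contributes equally,
    regardless of its frame count."""
    per_seq = [sum(v) / len(v) for v in per_frame_values_by_sequence]
    return sum(per_seq) / len(per_seq)

# Two sequences of different lengths: the short one still
# contributes half of the overall score.
fps_a = [30.0, 30.0, 30.0, 30.0]   # per-frame FPS, sequence A
fps_b = [10.0, 20.0]               # per-frame FPS, sequence B
print(overall([fps_a, fps_b]))     # 22.5, not the frame-weighted 25.0
```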

Figure 4. Average FPS Stats

As can be seen in the plots in Figures 4, 5, and 6, the C-COT tracker has the highest precision and success rate, at 0.472 and 0.397 respectively, but at an extremely slow overall tracking speed of 0.161 FPS, which is far below 1 FPS and of limited practical use. MEDIANFLOW, on the flip side, has the highest tracking speed at 434.821 FPS, but with a lower precision and success rate of 0.162 and 0.197 respectively. CSRT, BOOSTING, KCF, and MOSSE all perform at

Figure 5. Average Precision Stats

Figure 6. Average Success Rate Stats

real-time speeds of 64, 43, 296, and 200 FPS respectively, however at reasonably low precisions of 0.339, 0.345, 0.162, and 0.156 and success rates of 0.318, 0.350, 0.204, and 0.196 respectively. Figure 7 depicts the One-Pass Evaluation precision and success rate plots of the trackers.

Even with a pixel threshold of up to 50 px, the precision of these state-of-the-art trackers does not come close to their performance on open-air datasets and benchmarks. Table I shows how some top-performing trackers from the OTB50 and OTB100 benchmarks perform on our Underwater Object Tracking (UOT) dataset. This table substantiates our initial claim that the performance of existing state-of-the-art algorithms degrades considerably when tracking objects in underwater environments, as a result of the numerous aforementioned distortions. The results in Table I reveal that the CCOT, ECO, and STRCF trackers still rank as the top-performing trackers on our UOT dataset, but with much degraded performance.



Figure 7. OPE Precision and Success Rate

Tracker       OTB50 prec.  OTB50 succ.  OTB100 prec.  OTB100 succ.  UOT32 prec.  UOT32 succ.
ECO [19]      0.874        0.643        0.910         0.691         0.422        0.397
CCOT [20]     0.843        0.614        0.898         0.671         0.472        0.397
STAPLE [21]   0.681        0.509        0.784         0.581         0.090        0.129
SAMF [25]     0.650        0.469        0.751         0.553         0.057        0.113
KCF [8]       0.611        0.403        0.696         0.477         0.162        0.204
BACF [23]     -            -            -             0.642         0.085        0.145
DCF [24]      -            -            0.733         0.598         0.073        0.136
STRCF [22]    -            -            -             -             0.405        0.373

Table I. Performance comparison on the underwater dataset ("-" marks entries not reported).

VI. CONCLUSION

In this paper, we investigated the performance of current state-of-the-art tracking algorithms in underwater environments, in which visual data is distorted by refraction, reflection, particles, light, depth, and more, as opposed to the open-air object tracking that has been the principal focus of prior object tracking benchmarks. We introduced a rich and diverse underwater dataset with 32 videos and a total of 24,241 annotated frames, drawn from various distorted underwater environments. The results of our analysis, illustrated in the plots and table in the results section above, clearly show that top-performing object tracking algorithms evaluated in open-air environments degrade considerably in performance when evaluated in underwater environments, due to the inherent distortions present in underwater visual data. We hope that the UOT32 dataset presented in this paper will serve as a benchmark for developing new tracking algorithms that are better suited and more robust for underwater target tracking.



REFERENCES

[1] A. Yilmaz, O. Javed, and M. Shah, "Object tracking: A survey," ACM Computing Surveys (CSUR), vol. 38, no. 4, Article 13, April 2006.

[2] A. Berg, J. Ahlberg, and M. Felsberg, "A thermal object tracking benchmark," in 2015 12th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), Aug. 2015, pp. 1-6.

[3] Matej Kristan, Jiri Matas, Ales Leonardis, Michael Felsberg, Luka Cehovin, Gustavo Fernandez, Tomas Vojir, Gustav Hager, Georg Nebehay, and Roman Pflugfelder, "The visual object tracking VOT2016 challenge results," in ECCV 2016 Workshops, Springer, October 2016.

[4] Yi Wu, Jongwoo Lim, and Ming-Hsuan Yang, "Online object tracking: A benchmark," in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2013.

[5] Anton Milan, Laura Leal-Taixe, Ian D. Reid, Stefan Roth, and Konrad Schindler, "MOT16: A benchmark for multi-object tracking," CoRR, vol. abs/1603.00831, 2016.

[6] A. Li, M. Lin, Y. Wu, M. Yang, and S. Yan, "NUS-PRO: A new visual tracking challenge," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 2, pp. 335-349, Feb. 2016.

[7] Michael Felsberg, Amanda Berg, Gustav Hager, Jorgen Ahlberg, Matej Kristan, Jiri Matas, Ales Leonardis, Luka Cehovin, Gustavo Fernandez, Tomas Vojir, Georg Nebehay, and Roman Pflugfelder, "The thermal infrared visual object tracking VOT-TIR2015 challenge results," in The IEEE International Conference on Computer Vision (ICCV) Workshops, December 2015.

[8] J. F. Henriques, R. Caseiro, P. Martins, and J. Batista, "High-speed tracking with kernelized correlation filters," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 3, pp. 583-596, March 2015.

[9] Shiuh-Ku Weng, Chung-Ming Kuo, and Shu-Kang Tu, "Video object tracking using adaptive Kalman filter," Journal of Visual Communication and Image Representation, vol. 17, pp. 1190-1208, December 2006.

[10] Mengdan Zhang, Junliang Xing, Jin Gao, Xinchu Shi, Qiang Wang, and Weiming Hu, "Joint scale-spatial correlation tracking with adaptive rotation estimation," in The IEEE International Conference on Computer Vision (ICCV) Workshops, December 2015.

[11] V. Buddubariki, S. G. Tulluri, and S. Mukherjee, "Multiple object tracking by improved KLT tracker over SURF features," in 2015 Fifth National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG), Dec. 2015, pp. 1-4.

[12] Z. Kalal, K. Mikolajczyk, and J. Matas, "Tracking-learning-detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 7, pp. 1409-1422, July 2012.

[13] H. Grabner, M. Grabner, and H. Bischof, "Real-time tracking via on-line boosting," in Proc. BMVC, 2006, pp. 6.1-6.10, doi:10.5244/C.20.6.

[14] Alan Lukezic, Tomas Vojir, Luka Cehovin Zajc, Jiri Matas, and Matej Kristan, "Discriminative correlation filter tracker with channel and spatial reliability," International Journal of Computer Vision, vol. 126, no. 7, pp. 671-688, July 2018.

[15] B. Babenko, Ming-Hsuan Yang, and S. Belongie, "Visual tracking with online multiple instance learning," in CVPR, 2009.

[16] David Held, Sebastian Thrun, and Silvio Savarese, "Learning to track at 100 FPS with deep regression networks," in Computer Vision - ECCV 2016, Bastian Leibe, Jiri Matas, Nicu Sebe, and Max Welling, Eds., Cham, 2016, pp. 749-765, Springer International Publishing.

[17] D. S. Bolme, J. R. Beveridge, B. A. Draper, and Y. M. Lui, "Visual object tracking using adaptive correlation filters," in 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, June 2010, pp. 2544-2550.

[18] Z. Kalal, K. Mikolajczyk, and J. Matas, "Forward-backward error: Automatic detection of tracking failures," in 2010 20th International Conference on Pattern Recognition, Aug. 2010, pp. 2756-2759.

[19] Martin Danelljan, Goutam Bhat, Fahad Shahbaz Khan, and Michael Felsberg, "ECO: Efficient convolution operators for tracking," in CVPR, 2017.

[20] Martin Danelljan, Andreas Robinson, Fahad Shahbaz Khan, and Michael Felsberg, "Beyond correlation filters: Learning continuous convolution operators for visual tracking," in ECCV, 2016.

[21] Luca Bertinetto, Jack Valmadre, Stuart Golodetz, Ondrej Miksik, and Philip H. S. Torr, "Staple: Complementary learners for real-time tracking," in CVPR, 2016.

[22] Feng Li, Cheng Tian, Wangmeng Zuo, Lei Zhang, and Ming-Hsuan Yang, "Learning spatial-temporal regularized correlation filters for visual tracking," in CVPR, 2018.

[23] Hamed Kiani Galoogahi, Ashton Fagg, and Simon Lucey, "Learning background-aware correlation filters for visual tracking," in ICCV, 2017.

[24] Susanna Gladh, Martin Danelljan, Fahad Shahbaz Khan, and Michael Felsberg, "Deep motion features for visual tracking," in ICPR, 2016.

[25] Yang Li and Jianke Zhu, "A scale adaptive kernel correlation filter tracker with feature integration," in ECCV Workshops, 2014.

[26] D. Angela, C. Ion, I. Cornel, B. Diana, and P. Teodor, "Underwater object tracking using time frequency signatures of acoustic signals," in OCEANS 2014 - TAIPEI, April 2014, pp. 1-5.

[27] D. Kim, D. Lee, H. Myung, and H. Choi, "Object detection and tracking for autonomous underwater robots using weighted template matching," in 2012 Oceans - Yeosu, May 2012, pp. 1-5.

[28] D. Walther, D. R. Edgington, and C. Koch, "Detection and tracking of objects in underwater video," in Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004), June 2004, vol. 1.

[29] Stephane Bazeille, Isabelle Quidu, and Luc Jaulin, "Color-based underwater object recognition using water light attenuation," Intelligent Service Robotics, vol. 5, no. 2, pp. 109-118, April 2012.

[30] Subnautica, https://unknownworlds.com/subnautica/.

[31] Yi Wu, Jongwoo Lim, and Ming-Hsuan Yang, "Object tracking benchmark," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 9, pp. 1834-1848, 2015.
