
Modifiable drone thermal imaging analysis framework for mob detection during open-air events

Brecht Verhoeve

Master's dissertation submitted in order to obtain the academic degree of Master of Science in Computer Science Engineering

Supervisors: Prof. dr. Bruno Volckaert, Prof. dr. ir. Filip De Turck
Counsellors: Pieter-Jan Maenhaut, Jerico Moeyersons

Department of Information Technology
Chair: Prof. dr. ir. Bart Dhoedt
Faculty of Engineering and Architecture
Academic year 2017-2018


Permission for usage

"The author gives permission to make this master dissertation available for consultation and to copy parts of this master dissertation for personal use. In the case of any other use, the copyright terms have to be respected, in particular with regard to the obligation to state expressly the source when quoting results from this master dissertation."

Brecht Verhoeve

Ghent, June 2018


Preface

This master dissertation is submitted as a completion of the academic degree of Master of Science in Computer Science Engineering at Ghent University. The dissertation investigates the upcoming combination of drones and thermal cameras, their use cases and supporting technologies. The dissertation led me through various fields such as software architecture, microservices, software containerization, GPUs and neural networks. I wrote the dissertation focusing on the business and technological aspects that could lead to increasing industry adoption of these technologies.

I would like to thank my supervisors and counsellors for their continuous support this year. You were always there for a quick meeting, during which the atmosphere was always positive and jokes were always around the corner, but with a focus on results. Prof. Volckaert, for the quickest replies on emails I have yet witnessed to this day and for guiding me through the complex journey of this dissertation. Jerico Moeyersons, for the office hop-ins and help during that annoying CUDA installation. Pieter-Jan Maenhaut, for his questions and reviews during meetings, which provided me with new insights and things to write about. Nils Tijtgat, for the support in the early days of the thesis; I've read your tutorial on YOLO more than I would like to admit. And finally, Prof. De Turck, for the opportunity of working on this topic.

I am grateful for the company I had this year when working on the dissertation. Ozan Catal, Joran Claeys, Stefan Wauters, Dries Bosman, Pieter De Cleer, Igor Lima de Paula, Laura Van Messem, Lars De Brabandere, Stijn Cuyvers, Stijn Poelman: thank you for the fun times, spontaneous beers and support this year.

Special thanks go out to the people of the VTK and FK student associations. You provided me with unforgettable experiences, friendships, teachings and memories. With a special mention to Stéphanie, Anna and Nick from Career & Development, everyone from Delta, and finally Stijn Adams and Sander De Wilde for their continuous support throughout the years.

Finally, I want to thank my parents and Marjolein Hondekyn for their advice and massive support. Without you, I wouldn't have been able to wrestle myself through the tough periods and finish the dissertation.

Brecht Verhoeve

Ghent, June 2018


Modifiable drone thermal imaging analysis framework for mob detection during open-air events

Brecht Verhoeve

Supervisors: Prof. dr. Bruno Volckaert, Prof. dr. ir. Filip De Turck

Counsellors: Pieter-Jan Maenhaut, Jerico Moeyersons

Master's dissertation submitted in order to obtain the academic degree of

Master of Science in Computer Science Engineering

Department of Information Technology

Chair: Prof. dr. ir. Bart Dhoedt

Faculty of Engineering and Architecture

Ghent University

Academic year 2017-2018

Abstract

Drones and thermal cameras are used in combination for many applications, such as search and rescue, fire fighting, etc. Due to vendor-specific hardware and software, applications are hard to develop and maintain. Therefore, a modifiable drone thermal imaging analysis framework is proposed that enables users to more easily develop such image processing applications. It implements a microservice plugin architecture. Users can build image processing applications with the framework by building media streams using plugins that are either thermal cameras or image analysis software modules. The framework is evaluated by building a proof of concept implementation, which is tested on the initial requirements. It achieves the modifiability and interoperability requirements at the cost of performance and security. The framework is applied for detecting large crowds of people (mobs) during open-air events. A new dataset containing thermal images of such mobs is presented, on which a YOLOv3 neural network is trained. The trained model is able to detect mobs on new thermal images in real-time, achieving frame rates of 55 frames per second when deployed on a modern GPU.

Keywords: Drone thermal imaging, Video streaming, Framework, Microservices, Object detection, Plugin

Modifiable Drone Thermal Imaging Analysis Framework for Mob Detection during Open-Air Events

Brecht Verhoeve

Supervisors: prof. dr. Bruno Volckaert, prof. dr. ir. Filip De Turck, Pieter-Jan Maenhaut, Jerico Moeyersons

Abstract: Drones and thermal cameras are used in combination for many applications, such as search and rescue, fire fighting, etc. Due to vendor-specific hardware and software, applications are hard to develop and maintain. Therefore, a modifiable drone thermal imaging analysis framework is proposed that enables users to more easily develop such image processing applications. It implements a microservice plugin architecture. Users can build image processing applications with the framework by building media streams using plugins that are either thermal cameras or image analysis software modules. The framework is evaluated by building a proof of concept implementation, which is tested on the initial requirements. It achieves the modifiability and interoperability requirements at the cost of performance and security. The framework is applied for detecting large crowds of people (mobs) during open-air events. A new dataset containing thermal images of such mobs is presented, on which a YOLOv3 neural network is trained. The trained model is able to detect mobs on new thermal images in real-time, achieving frame rates of 55 frames per second when deployed on a modern GPU.

Keywords: Drone thermal imaging, Video streaming, Framework, Microservices, Object detection, Plugin

I INTRODUCTION

THROUGHOUT history, having an overview of the environment from high viewpoints held many benefits. The advent of drones and advanced cameras provides low-cost aerial imaging that creates numerous opportunities for new applications. Traditional visual cameras for the visible light spectrum offer high quality images, but are limited to daytime or artificially lighted scenes. Thermal cameras measure the thermal radiation of objects in a scene and thus can operate in utter darkness, revealing information not visible to the normal eye [1]. The combination of drones and thermal cameras is used in many different applications, such as geography [2, 3], agriculture [4], search and rescue [5], wildlife monitoring [6], disaster response [7], maintenance [8], etc.

Several vendors offer thermal camera products, some specifically designed for drone platforms. These cameras often use different image formats, color schemes and interfaces [9-11]. This leads to issues if applications want to change the camera that is used, or when the camera is no longer supported by the vendor, because different software needs to be built to interact with the new camera, which often is a non-negligible cost. This leads to a problem called vendor lock-in, which makes customers dependent on a certain vendor, as they cannot switch product without incurring substantial costs, a problem already very tangible for cloud-based applications [12]. Applications across various fields often have slightly different functional and non-functional requirements. For this dissertation, several Belgian fire fighting departments were asked for requirements for a thermal drone platform application. It quickly became clear that they had various problems that needed to be solved, such as finding hot explosives, measuring temperatures in containers, identifying hot entrances, detecting invisible methane fires, finding missing persons, etc. Some use cases need to be evaluated in real-time (during fires), others need to be extremely accurate. A hypothetical application should be able to quickly implement new detection and analysis features to meet all these requirements. Because the current solutions are not modifiable enough, current applications built with aerial thermal imaging technology remain in the niche use case for which they were initially developed [13]. Applications could benefit from a backbone framework to aid in this modifiability/interoperability issue, aiding in developing end-to-end solutions connecting thermal cameras to various analysis/detection modules.

This dissertation explores the requirements for such a framework and its potential software architecture. To test the viability of the architecture, a proof of concept prototype is implemented and evaluated against the initial requirements. To verify if it aids in developing detection applications, the specific use case of detecting large crowds of people, so-called mobs, during open-air events is investigated. Monitoring crowds during open-air events is important, as mobs can create potentially dangerous situations through bottlenecks, blocking escape routes, etc. Through monitoring and detecting these mobs, these situations can be avoided before they become problematic [14, 15].

The remainder of this paper is organized as follows. Section II presents similar projects on the topic of modifiable imaging analysis frameworks and thermal object detection. Section III presents the requirements of the framework and the software architecture designed from these requirements. Section IV presents the implementation of the framework prototype. The mob detection experiment is described in Section V. The tests and results to evaluate the framework and the mob detection experiment are presented in Section VI. Finally, Section VII draws conclusions from this research and indicates where future efforts in this field should go.

II RELATED WORK

The Irish start-up DroneSAR [16] developed a search-and-rescue (SAR) drone platform allowing users to stream live images and video from a drone as it conducts a search for missing persons. The platform works with any camera, visual and thermal, but focuses on drones from vendor DJI, DroneSAR's industry partner. Amazon introduced the Amazon Kinesis Video Streams platform in January 2018 as a new service for the Amazon Web Services (AWS) cloud platform. It allows users to stream live video from devices to the AWS cloud and build applications for real-time video processing [17]. The VIPER project by EAVISE, KU Leuven researched how thermal and visual video images could be used for real-time detection of persons, using object detection algorithms such as deep learning [18]. The framework presented in this work combines elements from all three of these examples.

III REQUIREMENTS AND SOFTWARE ARCHITECTURE

A Functional requirements

Three general actors are identified for the framework: an end-user that wants to build an image processing application for a specific use case, camera developers integrating new cameras into the framework, and analysis/detection module developers integrating new analysis/detection algorithms into the framework so that end-users can use them to build their applications. An end-user should be able to construct an image processing application using the integrated cameras and analysis modules. He should be able to adapt this application with the framework for new use cases. Camera and analysis module developers should be able to add and distribute new integrations for new cameras and analysis modules to the framework. This allows the end-users to focus on the use case, not the technical details of the hardware platforms or algorithms, and to have a wider selection of hardware and algorithms.

B Non-functional requirements

Interoperability, modifiability and performance are identified as the architecturally significant requirements. Interoperability specifies that the framework should be able to interact with various cameras and analysis software modules via interfaces. The number of systems the framework can successfully interact with adds to the business value of the framework, as end-users can use more devices via the framework to build applications. The framework needs to be extendable with new thermal cameras and analysis modules. Applications built with the framework should be modifiable to integrate new hardware and software. The available hardware on-site for use cases such as forest fire monitoring is not always powerful enough to support heavy image analysis software. The framework should therefore be able to deploy in a distributed fashion, to allow more computationally expensive operations to be executed on more powerful remote devices. Some use cases require real-time streaming of video and manipulation of these video streams, which should be supported for the framework to be relevant.

C Software architecture

An architectural pattern analysis based on the requirements presented in Section III-B was conducted, from which a combination of the microservices and microkernel patterns was selected as the base pattern for the software architecture. The microkernel pattern enables the framework to be extended via a plugin system. The microservices pattern enables interoperability through the use of standard interfaces and a communication scheme. It also allows the framework to be deployed in a distributed fashion [19-21]. The software architecture is documented in static views, sequence diagrams and deployment views. Figure 1 presents an overview of the architecture.

Fig. 1: Component-connector overview of the framework. The clear components are the core components of the framework that each user needs to install to use the framework. The colored components are used for the distribution of plugins.

End-users interact with the framework via the Client Interface, a graphical or textual interface. Commands are forwarded to the Stream module, which manages the layout and state of the components in the media pipeline (stream) that the end-user builds for his use case. To activate and place the components in a certain layout, the Stream module connects with the Producer and Consumer components that control the plugins that process the media. Producer Plugins are devices that produce media, such as thermal cameras. Consumer Plugins process and consume media, such as analysis software and displays. Once a stream is established, the plugins forward media to each other in the layout specified by the Stream module. New support for cameras and analysis modules can be added as plugins to the Producer/Consumer Distribution components, which distribute this software so that end-users can download and install the plugins. Each module in the architecture is a microservice, allowing for a distributed deployment and enabling the modifiability requirements through interchangeable modules. Cameras and analysis modules are realized as plugins for the Producer/Consumer modules, implemented as a microkernel. This allows the framework to easily build new streams for different use cases and interchange components when needed.

C.1 Plugin model

Figure 2 depicts the model of a general framework plugin. The plugin defines three interfaces: a source media endpoint to receive media from different sources, a listener endpoint to forward the processed media to other listening plugins, and an API for framework control. The framework uses the API to change which sources and listeners a plugin has, and its state. By linking plugins together through the sources and listeners resources, the framework can build a media processing stream. Producer Plugins have no sources, since they produce media. The states are used to stop and start the media processing of the plugins in the stream.

Fig. 2: Schematic overview of a plugin

The REST paradigm is selected to build this API, with state, sources and listeners resources that need to be minimally implemented. A plugin has the following states: INACTIVE, PLAY, STOP and PAUSE. Figure 3 depicts the state transition diagram. A plugin implements the visible states STOP, PAUSE and PLAY, describing whether the media process of the plugin is stopped, paused or processing respectively. The INACTIVE state is only visible to the framework, as it indicates that there is no active process that runs the plugin. This is the initial state of a plugin in the framework. When a plugin is added to a stream, the plugin microservice is started, transitions to the STOP state and waits for commands.

Fig. 3: State transition diagram of a plugin
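As an illustration, a minimal plugin API along these lines could look as follows in Flask (the framework later used for the prototype, Section IV). The payload shapes are assumptions for this sketch, not the prototype's exact interface:

from flask import Flask, jsonify, request

app = Flask(__name__)

# In-memory plugin resources: the visible states are STOP, PAUSE and PLAY;
# INACTIVE exists only from the framework's point of view.
plugin = {"state": "STOP", "sources": [], "listeners": []}
VALID_STATES = {"STOP", "PAUSE", "PLAY"}

@app.route("/state", methods=["GET", "PUT"])
def state():
    if request.method == "PUT":
        new_state = request.get_json(force=True).get("state")
        if new_state not in VALID_STATES:
            return jsonify(error="invalid state"), 400
        plugin["state"] = new_state  # start/stop/pause media processing here
    return jsonify(state=plugin["state"])

@app.route("/sources", methods=["GET", "PUT"])
def sources():
    if request.method == "PUT":
        # A Producer plugin would reject this, as it has no sources.
        plugin["sources"] = request.get_json(force=True).get("sources", [])
    return jsonify(sources=plugin["sources"])

@app.route("/listeners", methods=["GET", "PUT"])
def listeners():
    if request.method == "PUT":
        plugin["listeners"] = request.get_json(force=True).get("listeners", [])
    return jsonify(listeners=plugin["listeners"])

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)

Linking two plugins then amounts to PUTting the address of one plugin into the listeners resource of another, after which the framework sets both states to PLAY.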

C.2 Network topology and communication protocol

The microservices of the framework and the plugins need a communication protocol to exchange commands and video. For sending the commands, the HTTP/TCP protocol is used: a synchronous protocol that blocks on the response of a request. This ensures that commands are always received and acknowledged, increasing the reliability of the communication at the cost of performance [22]. The asynchronous RTP/UDP protocol is selected to transfer media between the plugins, ensuring low latency video transfer between plugins and enabling real-time video streams. The recommended codec for transmitting video media is MJPEG, which transmits video frames as separately encoded JPEG images [23]. Other codecs such as MPEG-4 encode only certain frames as keyframes, and the other frames as B-frames that encode differences from the keyframes [24]. This implies that when receiving images from a stream, a keyframe must first be received before the video can be decoded. Using MJPEG, every frame is a keyframe, so plugins receiving frames can directly perform analysis on each frame and do not have to wait for additional frames. An example of this network topology for a stream with 2 Producer plugins and 3 Consumer plugins is depicted in Figure 4.

Fig. 4: Network topology. The full lines represent HTTP/TCP communications, the dashed lines RTP/UDP communications.
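The prototype (Section IV) transmits media with GStreamer. A sketch of what a pair of MJPEG-over-RTP pipelines could look like using GStreamer's Python bindings is given below; the element choices, host and port are illustrative, not the prototype's exact pipelines:

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst, GLib

Gst.init(None)

# Producer side: encode each frame as an independent JPEG (MJPEG),
# packetize it as RTP and send it over UDP.
sender = Gst.parse_launch(
    "videotestsrc ! videoconvert ! jpegenc ! rtpjpegpay "
    "! udpsink host=127.0.0.1 port=5004"
)

# Consumer side: depayload RTP/JPEG, decode and display. Every frame is
# self-contained, so analysis could start on any received frame.
receiver = Gst.parse_launch(
    'udpsrc port=5004 caps="application/x-rtp,encoding-name=JPEG,payload=26" '
    "! rtpjpegdepay ! jpegdec ! videoconvert ! autovideosink"
)

sender.set_state(Gst.State.PLAYING)
receiver.set_state(Gst.State.PLAYING)
GLib.MainLoop().run()  # keep both pipelines running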

IV PROTOTYPE IMPLEMENTATION

The goal of the prototype implementation is to provide a proof of the concept architecture and meet the requirements presented in Section III. The core framework components are implemented; the distribution components are left out of scope, as they focus primarily on the distribution of supported plugins. The core microservices as well as the plugins are implemented using the Docker software containerization framework [25]. Containers virtualize at the operating system level and allow for portable, lightweight software environments for processes, with a minor performance overhead. Using this technology, the core modules and plugins can be deployed in a local and distributed fashion and be set up and shut down easily [26]. The microservice containers communicate via the protocols presented in Section III-C.2. The REST APIs are built with the Flask framework [27], a lightweight Python web development framework ideal for prototyping. The Producer/Consumer components need access to the Docker daemon running on the Docker host in order to spin up and shut down the Docker containers containing the plugin processes. This is achieved by mounting the Docker client socket in the Producer/Consumer containers. This gives the container root access to the host, a significant security threat [28, 29]. Two sample plugins were implemented: Filecam, a plugin that produces video read in from a file, and Display, a plugin that forwards media to the display of the local device. The plugins transmit media using the video streaming framework GStreamer [30].
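For illustration, starting and removing a plugin container through the mounted socket could be done with the Python Docker SDK roughly as follows; the image and network names are hypothetical:

import docker

# The Producer/Consumer container has the host's Docker socket mounted at
# /var/run/docker.sock, so docker.from_env() talks to the host daemon:
# convenient, but effectively root access to the host.
client = docker.from_env()

def start_plugin(image: str, name: str):
    """Spin up a plugin microservice as a sibling container on the host."""
    return client.containers.run(
        image,                    # e.g. a hypothetical "framework/filecam" image
        name=name,
        detach=True,              # plugin keeps running in the background
        network="framework-net",  # assumed shared bridge network
    )

plugin = start_plugin("framework/filecam:latest", "filecam-1")

# The STOP -> INACTIVE transition removes the container, which is why
# deactivating a plugin is the slowest framework operation (Section VI).
plugin.stop()
plugin.remove()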

V MOB DETECTION

A Dataset

Several publicly available datasets for thermal images exist [31-34]. None of these include large crowds of people, so a new dataset, called the Last Post dataset, was created. It consists of thermal video captured at the Last Post ceremony in Ypres, Belgium [35]. The videos were captured using the FLIR One Pro thermal camera for Android [36], using the Iron color scheme. Two main scenes are present in the dataset, depicted in Figure 5. Mobs are present in the thermal images but not in the visual images, due to the images being made on separate days. The images used for the experiment were manually annotated, outliers were removed, and the dataset was randomly split into a training and validation set.

Fig. 5: Last Post dataset main scenes. (a) Thermal view of the square. (b) Visual view of the square. (c) Thermal view of the bridge. (d) Visual view of the bridge.
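The random split mentioned above is straightforward; a minimal sketch, assuming an 80/20 ratio and a fixed seed (the dissertation does not state these here):

import random

def split_dataset(annotated_frames, train_fraction=0.8, seed=42):
    """Randomly split annotated frames into training and validation sets.

    Caveat: frames extracted from the same video are temporally correlated,
    so a random per-frame split yields an optimistic validation score, as
    noted in Section VI-B.
    """
    frames = list(annotated_frames)
    random.Random(seed).shuffle(frames)
    cut = int(len(frames) * train_fraction)
    return frames[:cut], frames[cut:]

train_set, val_set = split_dataset(f"frame_{i:05d}.jpg" for i in range(1000))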

B Model

Detecting and classifying objects of interest in images is known as the object detection problem in machine learning [37]. Several object detection algorithms and frameworks have been implemented in the past years. A distinction is made between traditional models [31, 38-40], deep learning two-stage networks [41-46] and deep learning dense networks [47-49]. The traditional and two-stage methods make predictions relatively slowly (order of seconds on GPU) when compared to the dense networks (order of milliseconds on GPU) [47]. Since the goal is to use the framework in real-time use cases, the latter is preferred. The YOLOv3 model is selected, as it achieves state-of-the-art prediction performance, can make real-time predictions and is available via the open source neural network framework darknet [50, 51]. The model is pre-trained on the ImageNet dataset [52]. The model is trained on an NVIDIA GeForce GTX 980 GPU and optimizes the SSE loss using batch gradient descent [50]. To select the best weights, the average Intersection over Union (IoU) and mean Average Precision (mAP) [53] are calculated on predictions on the validation set. The weights that achieve the highest mAP are selected as the final weights.
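For reference, the IoU between a predicted and a ground-truth bounding box reduces to a few lines; this sketch assumes boxes in (x1, y1, x2, y2) corner format:

def iou(box_a, box_b):
    """Intersection over Union of two boxes in (x1, y1, x2, y2) format."""
    # Coordinates of the intersection rectangle.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])

    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# A prediction overlapping a ground-truth mob box:
print(iou((10, 10, 50, 50), (30, 30, 70, 70)))  # ~0.143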

VI RESULTS

A Framework

To evaluate the framework, acceptance tests for the requirements from Section III were conducted. Common framework operations, such as manipulating and building a stream, have an average execution time of 0.84 seconds with a standard deviation of 0.37 seconds. Less common operations, such as deactivating a plugin, starting up the framework and shutting down the framework, have average execution times of 3.58, 8.40 and 24.02 seconds respectively, with standard deviations of 4.67, 0.50 and 0.48 seconds respectively. Deactivating plugins (STOP to INACTIVE transitions) takes a long time, as the container running the process needs to be removed. Real-time streaming could not be tested directly, due to the GStreamer framework having no readily available testing endpoints. However, when streaming and displaying a video with the framework, human users could not differentiate between the streamed video and the same video played using a native media player, making it plausible that the framework streams in real-time. Great care must be taken when building plugins, as their processing speed has a direct impact on the real-time streaming performance. Interoperability is achieved with the REST APIs and plugin model presented in Section III-C. The interoperability is tested by having the framework exchange information with a mock plugin implementing the specified interface and counting the number of correct exchanges. The average successful exchange ratio is 99.998%. The framework can install and detect new plugins at runtime, achieving runtime modifiability at plugin level. Different deployment schemes were not tested for the prototype.

B Mob detection

The weights generated at the 15700th training epoch achieved the highest mAP value, 90.52%, on the validation set. For comparison, other models achieve an average mAP of 74.8% on benchmark datasets [54]. The reason the model achieves such high values is that the validation set is temporally correlated with the training set, as both sets are extracted from videos in which frames have a temporal correlation. Performance when predicting on new datasets will be worse. Figure 6 depicts some predictions of the model. When predicting on a video, the model generated predictions at an average frame rate of 55 frames per second on a GPU.

Fig. 6: Model predictions on the validation set

VII CONCLUSION AND FUTURE WORK

In this dissertation, a modifiable drone thermal imaging analysis framework is proposed to allow end-users to build flexible video processing pipelines using different thermal cameras and analysis modules. The framework implements a microservice container plugin architecture. Users can build image processing applications with the framework by building media streams using plugins that are either thermal cameras or image analysis software modules. The framework is evaluated by building a proof of concept implementation, which is tested on the initial requirements. The proposed framework achieves the modifiability and interoperability requirements at the cost of performance and security. The framework is applied for detecting large crowds of people (mobs) during open-air events. A new dataset containing thermal images of such mobs is presented, on which a YOLOv3 neural network is trained. The trained model is able to detect mobs on new thermal images in real-time, achieving frame rates of 55 frames per second when deployed on a modern GPU. Some extensions to this research are: deploying a detection model using the framework, testing the other deployment configurations, testing the framework with end-users in practice, and building new object detection models specifically for thermal images.

REFERENCES

[1] R. Gade and T. B. Moeslund, "Thermal cameras and applications: a survey," Machine Vision and Applications, vol. 25, pp. 245-262, 2014.
[2] M. C. Harvey, J. V. Rowland, and K. M. Luketina, "Drone with thermal infrared camera provides high resolution georeferenced imagery of the Waikite geothermal area, New Zealand," 2016.
[3] S. Amici, M. Turci, S. Giammanco, L. Spampinato, and F. Giulietti, "UAV thermal infrared remote sensing of an Italian mud volcano," vol. 2, pp. 358-364, 2013.
[4] J. Bendig, A. Bolten, and G. Bareth, "Introducing a low-cost mini-UAV for thermal- and multispectral-imaging," 2012.
[5] A. J. Rivera, A. D. Villalobos, J. C. Monje, J. A. Marinas, and C. M. Oppus, "Post-disaster rescue facility: Human detection and geolocation using aerial drones," IEEE Region 10 Annual International Conference Proceedings/TENCON, pp. 384-386, 2017.
[6] P. Christiansen, K. A. Steen, R. N. Jørgensen, and H. Karstoft, "Automated detection and recognition of wildlife using thermal cameras," Sensors (Basel, Switzerland), vol. 14, pp. 13778-93, Jul. 2014.
[7] S. Chowdhury, A. Emelogu, M. Marufuzzaman, S. G. Nurre, and L. Bian, "Drones for disaster response and relief operations: A continuous approximation model," 2017.
[8] Workswell, "Pipeline inspection with thermal diagnostics," 2016.
[9] DJI, "Zenmuse H3 - 2D."
[10] Workswell, "Applications of WIRIS - Thermal vision system for drones."
[11] Therm-App, "Therm-App - Android-apps op Google Play," 2018.
[12] B. Satzger, W. Hummer, C. Inzinger, P. Leitner, and S. Dustdar, "Winds of change: From vendor lock-in to the meta cloud," IEEE Internet Computing, vol. 17, no. 1, pp. 69-73, 2013.
[13] J. Divya, "Drone technology and usage: Current uses and future drone technology," 2017.
[14] B. Steffen and A. Seyfried, "Methods for measuring pedestrian density, flow, speed and direction with minimal scatter," Physica A: Statistical Mechanics and its Applications, vol. 389, pp. 1902-1910, May 2010.
[15] M. Wirz, T. Franke, D. Roggen, E. Mitleton-Kelly, P. Lukowicz, and G. Tröster, "Inferring crowd conditions from pedestrians' location traces for real-time crowd monitoring during city-scale mass gatherings," Proceedings of the Workshop on Enabling Technologies: Infrastructure for Collaborative Enterprises, WETICE, pp. 367-372, 2012.
[16] L.-L. Slattery, "DroneSAR wants to turn drones into search-and-rescue heroes," 2017.
[17] Amazon Web Services Inc., "What Is Amazon Kinesis Video Streams?," 2018.
[18] T. Goedemé, "Projectresultaten VLAIO TETRA-project," tech. rep., KU Leuven, Louvain, 2017.
[19] L. Bass, P. Clements, and R. Kazman, Software Architecture in Practice. Addison-Wesley Professional, 3rd ed., 2012.
[20] M. Richards, Software Architecture Patterns. O'Reilly Media, first ed., 2015.
[21] C. Richardson, "Microservice architecture pattern," 2017.
[22] C. De La Torre, C. Maddock, J. Hampton, P. Kulikov, and M. Jones, "Communication in a microservice architecture," 2017.
[23] On-Net Surveillance Systems Inc., "MJPEG vs. MPEG4: Understanding the differences, advantages and disadvantages of each compression technique," 2006.
[24] D. Bull, Communicating Pictures: A Course in Image and Video Coding. Elsevier Science, 2014.
[25] Docker Inc., "Docker - Build, Ship, and Run Any App, Anywhere," 2018.
[26] D. Merkel, "Docker: Lightweight Linux containers for consistent development and deployment," 2014.
[27] A. Ronacher, "Welcome to Flask: Flask documentation (0.12)," 2017.
[28] Lvh, "Don't expose the Docker socket (not even to a container)," 2015.
[29] R. Yasrab, "Mitigating Docker security issues," tech. rep., University of Science and Technology of China, Hefei.
[30] GStreamer, "GStreamer: open source multimedia framework," 2018.
[31] J. W. Davis and M. A. Keck, "A two-stage template approach to person detection in thermal imagery," Proc. Workshop on Applications of Computer Vision, 2005.
[32] S. Hwang, J. Park, N. Kim, Y. Choi, and I. S. Kweon, "Multispectral pedestrian detection: Benchmark dataset and baseline," CVPR, 2015.
[33] Z. Wu, N. Fuller, D. Theriault, and M. Betke, "A thermal infrared video benchmark for visual analysis," IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2014.
[34] S. Z. Li, R. Chu, S. Liao, and L. Zhang, "Illumination invariant face recognition using near-infrared images," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 4, pp. 627-639, 2007.
[35] Last Post Association, "Mission," 2018.
[36] FLIR, "FLIR One Pro."
[37] E. Alpaydin, Introduction to Machine Learning. MIT Press, 3rd ed., 2014.
[38] F. Xu, X. Liu, and K. Fujimura, "Pedestrian detection and tracking with night vision," IEEE Transactions on Intelligent Transportation Systems, vol. 6, no. 1, pp. 63-71, 2005.
[39] H. Nanda and L. Davis, "Probabilistic template based pedestrian detection in infrared videos," IEEE Intelligent Vehicles Symposium, Proceedings, vol. 1, pp. 15-20, 2003.
[40] R. Appel, S. Belongie, P. Perona, and P. Dollár, "Fast feature pyramids for object detection," PAMI, vol. 36, no. 8, pp. 1-14, 2014.
[41] J. R. R. Uijlings, K. E. A. Van De Sande, T. Gevers, and A. W. M. Smeulders, "Selective search for object recognition," tech. rep., 2012.
[42] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Region-based convolutional networks for accurate object detection and segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 1, pp. 142-158, 2014.
[43] R. Girshick, "Fast R-CNN," in Proceedings of the IEEE International Conference on Computer Vision, pp. 1440-1448, 2015.
[44] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards real-time object detection with region proposal networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137-1149, 2016.
[45] K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask R-CNN," arXiv, 2018.
[46] J. Dai, Y. Li, K. He, and J. Sun, "R-FCN: Object detection via region-based fully convolutional networks," tech. rep., 2016.
[47] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified, real-time object detection," 2015.
[48] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, "SSD: Single Shot MultiBox Detector," arXiv, 2016.
[49] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, "Focal loss for dense object detection," arXiv, 2018.
[50] J. Redmon and A. Farhadi, "YOLOv3: An incremental improvement," arXiv, 2018.
[51] J. Redmon, "Darknet: Open source neural networks in C." http://pjreddie.com/darknet/, 2013-2016.
[52] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A large-scale hierarchical image database," in CVPR09, 2009.
[53] M. Everingham, S. M. A. Eslami, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, "The Pascal Visual Object Classes Challenge: A retrospective," International Journal of Computer Vision, vol. 111, no. 1, pp. 98-136, 2014.
[54] A. Ouaknine, "Review of deep learning algorithms for object detection," 2018.


Contents

1 Introduction
1.1 Drones
1.2 Concepts
1.2.1 Thermal Cameras
1.2.2 Aerial thermal imaging
1.3 Problem statement
1.3.1 Industry adoption
1.3.2 Crowd monitoring
1.3.3 Goal
1.3.4 Related work
1.4 Outline

2 System Design
2.1 Requirements analysis
2.1.1 Functional requirements
2.1.2 Non-functional requirements
2.2 Patterns and tactics
2.2.1 Layers
2.2.2 Event-driven architecture
2.2.3 Microkernel
2.2.4 Microservices
2.2.5 Comparison of patterns
2.3 Software architecture
2.3.1 Static view
2.3.2 Dynamic views
2.3.3 Deployment views

3 State of the art and technology choice
3.1 Thermal camera options
3.1.1 Parameters
3.1.2 Comparative analysis
3.2 Microservices frameworks
3.2.1 Flask
3.2.2 Falcon
3.2.3 Nameko
3.2.4 Vert.x
3.2.5 Spring Boot
3.3 Deployment framework
3.3.1 Containers
3.3.2 LXC
3.3.3 Docker
3.3.4 rkt
3.4 Object detection algorithms and frameworks
3.4.1 Traditional approaches
3.4.2 Deep learning
3.4.3 Frameworks
3.5 Technology choice
3.5.1 Thermal camera
3.5.2 Microservices framework
3.5.3 Deployment framework
3.5.4 Object detection

4 Proof of Concept implementation
4.1 Goals and scope of prototype
4.2 Overview of prototype
4.2.1 General overview
4.2.2 Client interface
4.2.3 Stream
4.2.4 Producer and Consumer
4.2.5 Implemented plugins
4.3 Limitations and issues
4.3.1 Single client
4.3.2 Timeouts
4.3.3 Exception handling and testing
4.3.4 Docker security issues
4.3.5 Docker bridge network
4.3.6 Single stream
4.3.7 Number of containers per plugin

5 Mob detection experiment
5.1 Last Post thermal dataset
5.1.1 Last Post ceremony
5.1.2 Dataset description
5.2 Object detection experiment
5.2.1 Preprocessing
5.2.2 Training

6 Results and evaluation
6.1 Framework results
6.1.1 Performance evaluation
6.1.2 Interoperability evaluation
6.1.3 Modifiability evaluation
6.2 Mob detection experiment results
6.2.1 Training results
6.2.2 Metrics
6.2.3 Validation results

7 Conclusion and future work
7.1 Conclusion
7.2 Future work
7.2.1 Security
7.2.2 Implementing a detection plugin
7.2.3 Different deployment configurations
7.2.4 Multiple streams with different layouts
7.2.5 Implementing the plugin distribution service (Remote Producer/Consumer)
7.2.6 Using high performance microservices backbone frameworks
7.2.7 New object detection models and datasets specifically for thermal images

A Firefighting department email conversations
A.1 General email sent to Firefighting departments
A.2 Conversation with Firefighting department of Antwerp, Belgium
A.3 Conversation with Firefighting department of Ostend, Belgium
A.4 Conversation with Firefighting department of Courtrai, Belgium
A.5 Conversation with Firefighting department of Ghent, Belgium

B Thermal camera specifications

C Last Post thermal dataset summary
C.1 24th of March 2018
C.2 2nd of April 2018
C.3 3rd of April 2018
C.4 4th of April 2018
C.5 5th of April 2018
C.6 9th of April 2018
C.7 10th of April 2018
C.8 11th of April 2018
C.9 12th of April 2018


List of Figures

2.1 Use case diagram
2.2 Overview of the framework software architecture
2.3 Framework network topology
2.4 Client Interface detailed view
2.5 Stream detailed view
2.6 Stream model
2.7 Plugin model
2.8 Plugin state transition diagram
2.9 Component-connector diagrams of the Producer and Consumer module
2.10 Producer and Consumer Distribution component-connector diagrams
2.11 Add plugin sequence diagram
2.12 Link plugins sequence diagram
2.13 Deployment diagrams

3.1 Thermal image and MSX image of a dog
3.3 Rethink IT: Most used tools and frameworks for microservices results [54]
3.4 Containers compared to virtual machines [66]

4.1 filecam GStreamer pipeline
4.2 local plugin GStreamer pipeline

5.1 Last Post ceremony panorama
5.2 Last Post filming locations
5.3 Main scenes in the Last Post dataset
5.4 Outliers

6.1 Average training loss per epoch
6.2 Validation metrics per epoch
6.3 Predictions of the model on images in the validation set

7.1 GStreamer pipeline for a plugin with a detection model


List of Tables

2.1 Performance utility tree
2.2 Interoperability utility tree
2.3 Modifiability utility tree
2.4 Usability utility tree
2.5 Security utility tree
2.6 Availability utility tree
2.7 Architecture pattern comparison

6.1 Acceptance tests results summary
6.2 Performance test statistics summary, measured in seconds
6.3 Resource usage of the framework in several conditions
6.4 Total size of framework components
6.5 Interoperability tests results (S: Source, L: Listener)

B.1 Compared cameras, their producing companies and their average retail price
B.2 Physical specifications
B.3 Image quality (IR: InfraRed, SD: Standard, FOV: Field of View)
B.4 Thermal precision
B.5 Interfaces
B.6 Energy consumption
B.7 Help and support
B.8 Auxiliary features


List of Listings

1 Minimal Flask application
2 Vert.x example
3 Spring Boot example
4 docker-compose.yml snippet of the prototype
5 Mounting the Docker socket on the container
6 Starting a plugin container
7 Dynamic linking of the decodebin and jpegenc


List of Abbreviations

ACF Aggregated Channel Features

AMQP Advanced Message Queuing Protocol

API Application Programming Interface

AS Availability Scenario

ASR Architecturally Significant Requirement

CLI Command Line Interface

CNN Convolutional Neural Networks

CRUD Create Read Update Destroy

DNS Domain Name System

FR Functional Requirement

GPU Graphical Processing Unit

H High

HTTP Hyper Text Transfer Protocol

ICF Integral Channel Features

IoU Intersection over Union

IS Interoperability Scenario

IT Interoperability Tactic

JVM Java Virtual Machine

L Low


LXC Linux Containers

M Medium

mAP mean Average Precision

MJPEG Motion-JPEG

MS Modifiability Scenario

MSX Multi Spectral Dynamic Imaging

MT Modifiability Tactic

NFR Non-Functional Requirement

ONNX Open Neural Network Exchange Format

OS Operating System

PS Performance Scenario

PT Performance Tactic

QAR Quality Attribute Requirement

REST Representational State Transfer

RNN Recurrent Neural Network

RPN Region Proposal Network

RTP Real-time Transport Protocol

SS Security Scenario

SSE Sum of Squared Errors

SVM Support Vector Machine

TCP Transmission Control Protocol

UDP User Datagram Protocol

UI User Interface

US Usability Scenario

YOLO You Only Look Once


Chapter 1

Introduction

Throughout history, having an overview of the environment from high viewpoints held many benefits. Early civilizations used hills to monitor their surroundings and population, and to spot possible attackers. The discovery of flight meant that environments could now be viewed from a bird's-eye view, offering even more visibility and revealing much more of the world below. Recently, a much smaller type of aircraft was developed: the drone. Ranging from large plane-like to almost insect-like devices, and having a wide variety of uses, drones are quickly taking over the sky. Drones would not be as effective without proper cameras providing a detailed view of the world below. With digital video cameras offering superb quality for steadily decreasing costs, almost every scene can be captured in great detail. However, these cameras are limited to the visible light spectrum, which prevents drones from operating in all circumstances, such as night flights. Thermal cameras measure the emitted heat of a scene and can reveal information not visible to the eye, such as hidden persons or animals, pipelines, malfunctioning equipment, etc. The combination of these two technologies certainly holds many exciting opportunities for the future.

1.1 Drones

Drones are flying robots that can fly remotely or autonomously and don't carry a human operator. They can carry a variety of payloads: video cameras, delivery parcels, fluid containers, sensors, lights, but also lethal explosives [1].

Drones are classified in different categories based on varying parameters, such as physical characteristics (diameter, weight), aerial movement techniques, application domains, etc. Based on diameter, drones are classified as smart dust (1 mm to 0.25 cm), pico air vehicles (0.25 cm - 2.5 cm), nano air vehicles (2.5 cm - 15 cm), micro air vehicles (15 cm - 1 m), micro unmanned aerial vehicles (1 m - 2 m) and unmanned aerial vehicles (2 m and larger). Often depending on their diameter, the weight of these devices ranges from less than a gram up to more than 2000 kg. Drones have different flight techniques, such as propulsion engines with wings, rotors in various amounts, flapping wings and even balloons. They are used for all kinds of purposes, ranging from search and rescue missions, environmental protection and delivery to reconnaissance, etc. Hassanalian et al. provide an excellent overview of most types of drones [2].

Due to the increasing interest in commercial drone platforms [3], a variety of payloads were developed specifically tailored for these aerial robots, such as gimbals to mount action video cameras [4], gimbals for delivering packets [5] and thermal imaging platforms [6].

1.2 Concepts

1.2.1 Thermal Cameras

Thermal cameras are passive sensors that capture the infrared radiation emitted by all objects with a temperature above absolute zero. In contrast to visible light cameras, thermal cameras do not depend on an external energy source for visibility, or on the colors of objects or scenes. This makes captured images independent of the illumination, colors, etc. Furthermore, images can be captured in the absence of visible light [7]. Originally, thermal camera technology was developed for night vision purposes for the military, and the devices were very expensive. Later, the technology was commercialized and has developed quickly over the last few decades, resulting in better and cheaper cameras [7]. This led to access for a broader public, and the technology is now introduced to a wide range of different applications, such as building inspection, gas detection, industrial appliances, medical science, agriculture, fire detection, surveillance, etc. [7]. Thermal cameras are now being mounted on drones to give an aerial thermal overview.

1.2.2 Aerial thermal imaging

Aerial thermal imaging is defined as the creation of thermal images using a flying device. This dissertation focuses on the usage of drones for aerial thermal imaging. There are many applications for aerial thermal imaging. Some examples are geography [8, 9], agriculture [10, 11], search and rescue operations [12], wildlife monitoring [13], forest monitoring [14, 15], disaster response [16], equipment and building maintenance [17-20], etc. In the past few years, several industry players have developed thermal cameras specifically aimed at these drone applications. Examples are FLIR [6], Workswell [21] and TEAX Technology [22].

1.3 Problem statement

1.3.1 Industry adoption

The implementation of thermal cameras on drone platforms faces some issues for wide adoption by the industry. Several vendors offer thermal camera products, some specifically designed for drone platforms, that often implement different image formats, color schemes and interfaces (e.g. [23-25]). This leads to issues if users want to modify their applications by changing the camera that is used, because the application must implement new software to interact with the camera, or when the camera is no longer supported by the vendor, leaving the application with outdated hardware and software. This leads to a problem called vendor lock-in, which makes customers dependent on a certain vendor, as they cannot switch product without incurring substantial costs, a problem already very tangible for cloud-based applications today [26].

Applications across various fields often have different functional and non-functional requirements. Some applications have hard real-time deadlines (such as firefighting, search and rescue, security, etc.) that must be respected, other applications require highly accurate predictions (e.g. person detection, agriculture, etc.). A single application domain can even have many different use cases.

Several firefighting departments in Belgium were contacted to get an overview of their needs for a drone thermal imaging application. It quickly became clear they had various detection problems, such as finding missing persons, locating hot explosives, measuring temperatures in silos, detecting invisible methane fires, etc. Equipment also wears down more quickly due to usage in harsh environments, such as fires in close proximity. A drone thermal application for them needs to be able to exchange functionality and hardware easily and meet high performance constraints to deliver value for them. The email conversations can be read in Appendix A.

Other drone thermal imaging applications are mostly only used in the niche domain for which they were developed, because they aren't designed for flexibility [27]. These proprietary applications have some disadvantages: the development and support potentially have a large cost, vendor lock-in can occur when products are no longer supported, security issues could arise, and customization is difficult [28, 29]. Applications could benefit from a backbone framework to aid in this modifiability/interoperability issue, aiding in developing end-to-end solutions connecting thermal cameras to various analysis/detection modules for various use cases.

1.3.2 Crowd monitoring

Festivals and other open air events are popular gatherings that attract many people. For every event organizer it is important to ensure safety and avoid incidents. Large groups of people, so-called mobs, can create potentially dangerous situations through bottlenecks, blocking escape routes, etc. Therefore, having the ability to monitor crowds and predict their behavior is very important to avoid such scenarios. Data can be obtained by evaluating video footage from past comparable events or by real-time video monitoring of current events [30]. By analyzing this footage, potentially dangerous situations can be avoided by acting on the mob formation, and safety regulations can be improved to help plan future events. Vision-based approaches face several limitations: mounted cameras cannot capture elements outside of their field of view, cannot see in some conditions (for example during night time), and it is difficult to infer information from the raw footage [31].

Thermal cameras could help for crowd monitoring, because they can operate in any condition. Having precise and detailed object recognition for the images produced by these cameras is crucial to extract information correctly. In this context, classifying images is not satisfactory: localization of the objects contained within the images is needed. This problem is known as object detection [32]. There are several challenges for object detection in thermal images: the image quality is very low when compared to visible light images, there is a lack of color and texture information, temperature measures are relative measures, etc. This makes extracting discriminative information from these images difficult [33]. Most effort towards object detection on thermal images has gone towards human detection. Most of the proposed algorithms focus on feature extraction using the Aggregated Channel Features technique and boosting algorithms for learning [33-35]. Novel approaches make use of so-called deep learning with neural networks, which achieve very good results given enough data [36].


1.3.3 Goal

The goal of the dissertation is to explore the requirements of the possible backbone framework suggested in Section 1.3.1 and its potential software architecture. The architecture is evaluated by building a proof of concept implementation of the framework and evaluating it against the proposed requirements. To verify its use in developing drone thermal imaging applications, the specific mob-detection use case is investigated.

1.3.4 Related work

The Irish start-up DroneSAR [37] developed a search-and-rescue (SAR) drone platform allowing users to stream live images and video from a drone as it conducts a search for missing persons. The platform works with any camera, visual and thermal, but focuses on drones from vendor DJI, DroneSAR's industry partner. Amazon introduced the Amazon Kinesis Video Streams platform in January 2018 as a new service for the Amazon Web Services (AWS) cloud platform. It allows users to stream live video from devices to the AWS cloud and build applications for real-time video processing [38]. The VIPER project by EAVISE, KU Leuven researched how thermal and visual video images could be used for real-time detection of persons, using object detection algorithms such as deep learning [36]. The framework presented in this work combines elements from all three of these examples.

1.4 Outline

The remainder of this dissertation is organized as follows. Chapter 2 presents the requirements for the framework and the software architecture. Chapter 3 explores several state of the art technologies that can serve as backbone technologies for the framework. To test the viability of the software architecture, a prototype is implemented; Chapter 4 presents the different aspects of this prototype. Chapter 5 describes the experiment that is conducted to research the detection of mobs in thermal images. The results of both the framework and the detection experiment are presented and evaluated in Chapter 6. Finally, the conclusion and future research efforts are presented in Chapter 7.


Chapter 2

System Design

Finding out what users actually expect from a software system, and what makes it valuable for them, is of key importance for the success of that system. This chapter first explores the functional and non-functional requirements of the hypothetical framework suggested in Chapter 1, to find out what makes building the framework worthwhile. Well known architectural patterns enable certain software requirements very well and can be used for building the software architecture of the framework. The framework software architecture combines some of these patterns and is presented in several documents.

2.1 Requirements analysis

Requirements are the stated life-cycle customer needs and objectives for the system, and they relate to how well the system will work in its intended environment. They are those aspects of the framework that will provide value to the users.

2.1.1 Functional requirements

Functional requirements (FR) describe the necessary task, action or activity that must be accomplished by the system, often captured in use cases and/or user stories [39, 40]. Use cases provide a summary of the features described in the user stories. Several external people and/or systems, defined as actors, interact with the framework to achieve a certain goal [40]. Three actors are identified for the framework: an end-user that uses the framework in order to build an image processing application for a specific use case, such as the ones described in Section 1.2.2; a camera developer who creates support software for a specific thermal camera for the framework, so that the end-user can buy and use their product; and an analysis software developer who creates analysis software for a specific use case (tracking objects, detecting objects, etc.), so that the end-user can use their software to build his¹ application. The camera and analysis software developers are generalized to an actor called plugin developer, who develops plugins to extend the functionality of the framework. These plugins are the building blocks with which the end-user can build image processing applications.

¹ To avoid unnecessary gender-specific pronoun clutter, the male pronoun is used by default.

The general user scenario for the framework proceeds as follows. An end-user wants to build an image processing application, e.g. to detect fires in a landscape using a drone. He has a thermal camera for this and has read about hot-spot detection in video. The user looks for a plugin for the framework that can read video from his thermal camera, and for a plugin that does the hot-spot detection. If the user finds these plugins, he can add them to the framework and use them for the application he is building. He connects both plugins with the framework in a specific order to finish his application. For this simple example, the thermal camera plugin is connected to the hot-spot detection plugin, so that video coming from the thermal camera is transmitted to the detection plugin to find the fires in the landscape. The plugins in the application, and the specific order in which they are connected, are defined as a stream. This stream should be easily modifiable if additional or other functionalities are required. Suppose that the thermal camera delivers very high quality video and the detector cannot work with this, as it can only operate on low quality images. The end-user then searches for a plugin that scales the high quality video down to an accepted quality for the detector. This plugin is placed in between the thermal camera and the detector, and the application can work again. By continuously adding plugins to the framework, the number of possible applications that can be built with the framework increases, making the framework usable for more aerial thermal imaging use cases. A sketch of such a stream assembly is given below.
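As a sketch of how this fire-detection stream could be assembled programmatically, assuming a hypothetical REST endpoint layout for the Client Interface (the paths and plugin names are illustrative only):

import requests

API = "http://localhost:5000"  # assumed Client Interface address

# Add the plugins to the stream (they must be installed in the framework first).
for plugin in ("thermal-camera", "downscaler", "hotspot-detector"):
    requests.post(f"{API}/stream/plugins", json={"name": plugin})

# Link them in order: camera -> downscaler -> detector.
requests.post(f"{API}/stream/links",
              json={"source": "thermal-camera", "listener": "downscaler"})
requests.post(f"{API}/stream/links",
              json={"source": "downscaler", "listener": "hotspot-detector"})

# Start the media processing.
requests.put(f"{API}/stream/state", json={"state": "PLAY"})

Swapping the detector for another analysis module would then only change the plugin names in these calls, which is exactly the modifiability the framework aims for.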

Instead of developing the application from scratch, users can use the already implemented plugins to build the applications in an ad hoc fashion. Because of this, the development time for such applications can be reduced, and users can switch hardware and/or algorithms easily. The FRs are summarized in a use case diagram that connects each actor with their respective requirements and the relationships among them [40], depicted in Figure 2.1. Trivial functionalities, such as launching and shutting down the framework, are omitted. The red use cases represent use cases to extend the functionality of the framework, the blue use cases represent use cases for building streams, and the white use cases modify the media processing of the stream. Some use cases depend on others: the blue and white use cases work with plugins of the framework, so their prerequisite use case is "Add plugin", as a plugin must be a part of the framework for a user to use it; the "(Un)Link plugins" and "Stop/Pause/Play stream" use cases depend on "Add plugins to stream", as a stream must contain plugins before they can be manipulated.

2.1.2 Non-functional requirements

A non-functional requirement (NFR) specifies how the framework is supposed to be, or in what manner it should execute its functionality [41]. These qualifications typically cover business and system quality requirements. A distinction is made between quality attribute requirements (QAR) and constraints. QARs are qualifications of the FRs or of the overall product, e.g. how fast a certain function must be executed or how resilient it must be to erroneous input. They are closely related to business requirements, which are specifications that, once delivered, provide value to the actors [40]. The QARs are captured in a utility tree [40] that has a root node representing the system. This root node is elaborated by listing the major QARs that the system is required to exhibit. Each QAR is subdivided into more specific QARs. To make the specific requirements unambiguous and testable, a scenario is written for the system or a specific function, and the scenarios are evaluated against business value and architectural impact [40]. A QAR can have High (H), Medium (M) or Low (L) business value and architectural impact respectively. The business value is defined as the value for the end user if the QAR is enabled: High designates a must-have requirement, Medium is for a requirement which is important but would not lead to project failure, and Low describes a nice-to-have QAR, but not something that is worth much effort. Architectural impact defines how much the architecture must be designed towards the QAR to enable it: High means that meeting this QAR will profoundly affect the architecture, Medium means that meeting this QAR will somewhat affect the architecture, and Low means that meeting this QAR will have little effect on the architecture. The following QARs are discussed: performance, interoperability, modifiability, usability, security and availability.

Figure 2.1: Use case diagram

Performance

Performance indicates the framework's ability to meet timing requirements. It characterizes the events that can occur and the framework's time-based response to those events. Latency is defined as the time between the arrival of the stimulus and the system's response to it [40]. The system latency is the latency between the initialization of an action sequence and the first change to the system noticeable by the user. Streaming latency is defined as the time between the arrival of a video frame and the arrival of the next video frame. The jitter of the response is the allowable variation in latency. Scalability is the number of users that can use the framework at the same time. The utility tree is presented in Table 2.1. The software industry has not defined a quantified 'good latency' for end-users, but a 4-second latency rule is often used as a rule of thumb [42]. The average response time for general framework commands should then be less than 2 seconds, with a standard deviation of 1 second, ensuring most execution times respect the 4-second bound. As stated in Chapter 1, some use cases, such as fire fighting, require real-time video streaming. The notion of low-latency, real-time video loosely defines that video should be streamed almost simultaneously: if a camera is filming and a human user does not notice a latency between the video of the camera and the real world, the video stream is considered real-time. Real-time is thus a human time perception, and for visual inputs this bound is as low as 13 milliseconds. Anything above 13 milliseconds becomes noticeable; anything above 100 milliseconds hinders human performance [43, 44]. However, the framework focuses on the use of thermal cameras, most of which don't produce frames faster than 8 frames per second, or 125 milliseconds per frame (see Section 3.1). More expensive cameras can shoot at 25 frames per second, corresponding to a latency of 40 milliseconds, and this bound is selected for the streaming latency, with a standard deviation of 20 milliseconds, remaining below the frame rate of less expensive cameras. The number of users that can use the framework at the same time is assumed to be low, as current aerial thermal imaging applications are operated by only one user or a few. The assumption is that a maximum of five users can use the framework at the same time. All of these requirements are quantified as relatively 'good' values; these bounds should be evaluated for user satisfaction by having users use a prototype of the framework in practice.

Attribute refinement | Id | Quality attribute scenario
Latency | PS-1 | The average execution time of all framework commands does not exceed 2 seconds (H, M)
Latency | PS-2 | A playing stream should have an upper limit of 40 ms streaming latency (H, H)
Jitter | PS-3 | The average standard deviation of the execution time of all framework commands should not exceed 1 second under normal operation (H, M)
Jitter | PS-4 | The average standard deviation in streaming latency should not exceed 20 ms under normal operation (H, H)
Scalability | PS-5 | The system should be usable by five users at the same time (M, M)

Table 2.1: Performance utility tree

Interoperability

Interoperability is the degree to which two or more independently developed systems can usefully exchange meaningful information via interfaces in a particular context [40]. The framework will interoperate with cameras and analysis modules via the framework plugins. Henceforth, the camera and analysis plugins will be referred to as Producer plugins and Consumer plugins: a Producer plugin is a plugin that represents a camera that produces video, and a Consumer plugin a plugin that represents a module that processes, or consumes, video. The framework interacts with the Producer and Consumer plugins by exchanging requests to link them together, control their media processing, etc. The more correct exchanges there are between the two, the better the user can use the plugin for building applications with the framework. This QAR is quantified by the ratio of requests made by the framework that were correctly interpreted by the plugin (successful requests) to the total number of requests during a runtime of the framework [40]. Intuitively, one argues that the framework must achieve perfect interoperability, with a perfect exchange success rate of 100%. Reality, however, tends not to agree with perfection, and it can never be guaranteed that all exchanges will be correct. Therefore, it is better to aim for a good interoperability measure and prepare for failed exchanges, instead of naively assuming the framework will be perfect. An exchange success rate of 99.99% is selected; the motivation for this bound is as follows. A plugin is assumed to be always correct up to the first mistake, after which the plugin is faulty and the fault needs to be identified and prevented from occurring again. An exchange success rate of 99.99% means that if 10,000 plugins are installed and used by the framework, only one will fail during uptime.
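Stated as a quick calculation, with exchange success probability p, the expected number of exchanges up to and including the first failure is

1 / (1 - p) = 1 / (1 - 0.9999) = 10,000 exchanges.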

For one plugin, during framework uptime, the mean time between failures is then 10,000 exchanges. It is suspected that this amount of exchanges is very high for normal framework use. Because the possibility of faulty exchanges is acknowledged, the framework will need to implement a fallback mechanism to compensate. The utility tree is presented in Table 2.2.

Attribute refinement | Id | Quality attribute scenario
Syntactic interoperability | IS-1 | The framework exchanges requests with a Producer plugin (known at runtime) with a success ratio of 99.99%. Incorrect requests are undone by the framework and logged (H, H)
Syntactic interoperability | IS-2 | The framework exchanges requests with a Consumer plugin (known at runtime) with a success ratio of 99.99%. Incorrect requests are undone by the framework and logged (H, H)

Table 2.2: Interoperability utility tree

Modifiability

Modifiability is the cost and risk of changing functionality of the system [40]. One of the most important values of the framework is modifiability of the supported thermal cameras and analysis modules. The framework needs to be extendable with new functionalities, by enabling developers to add their support software in the form of a plugin. End-users should be able to modify the components that they use for their image processing applications easily and quickly, to allow for interchangeable hardware and software and to quickly set up new applications. Modifiability is defined in two environments: runtime, defined as the periods during which the system is up and running, and downtime, defined as the periods during which the system is not active. The utility tree is presented in Table 2.3.

To enable users to choose the extensions they need, the framework will need a distribution service that contains all plugins available for the framework, from which a user can select and install plugins for their local version of the framework. Adding new plugins to the distribution service should not affect versions of the framework installed by the user. When a user adds a plugin from the distribution to his version of the framework, the framework should reload at most once before making the plugin useable, for user comfort. Deployability is defined as the different device configurations that specify how the framework can be deployed. If the framework can be deployed in different fashions, this can increase the value for the end-user. Suppose a fire fighting use case in which a forest fire is monitored on site: computationally powerful devices might not be available on site, so moving some plugins processing media to a remote server or cloud could still allow usage of the framework. Perhaps the device processing the media is already remote, for example a drone on security patrol; in this case, access via a remote device such as a smartphone is desirable. This leads to the deployment configurations described in the utility tree.


Attribute refinement | Id | Quality attribute scenario
Run time modifiability | MS-1 | Support for a new Producer plugin should be added to the distribution service within one day, without the framework having to restart (H, H)
Run time modifiability | MS-2 | Support for a new Consumer plugin should be added to the distribution service within one day, without the framework having to restart (H, H)
Run time modifiability | MS-3 | End-users should be able to extend their framework with new functionalities by installing new Consumer and Producer Plugins (H, H)
Run time modifiability | MS-4 | End-users should be able to modify the plugins used to build their stream (H, H)
Down time modifiability | MS-5 | New Producer plugins can be installed to the local framework at runtime; the framework may only reload once before the plugin is useable (H, H)
Down time modifiability | MS-6 | New Consumer plugins can be installed to the local framework at runtime; the framework may only reload once before the plugin is useable (H, H)
Deployability | MS-7 | The system should be deployable on a combination of a smartphone and a cloud/remote server environment (H, H)
Deployability | MS-8 | The system should be deployable on a personal computer or laptop (H, H)
Deployability | MS-9 | The system should be deployable on a smartphone, laptop and cloud environment (H, H)

Table 2.3: Modifiability utility tree

Usability

Usability indicates how easy it is for the user to accomplish a desired task and the kind of user support the system provides. Learnability indicates how easy it is for a user to gain knowledge on how to use the framework. Errors are the number of errors a user can make when trying to execute certain functions [40]. The utility tree is presented in Table 2.4.

Security

Security is a measure of the system's ability to protect data and information from unauthorized access, while still providing access to users and systems that are authorized. An action taken against the system with the intent to cause it harm is called an attack. Security has three main characteristics. Confidentiality is the property that data or services are protected from unauthorized access. Integrity is the property that data or services are protected from unauthorized manipulation. Availability is the property of the system maintaining its functionality during an attack. Authentication verifies the identities of the parties of an interaction, checks if they are truly who they claim to be, and grants or revokes access [40]. Security is important for the framework if it is deployed on multiple devices that use a public network to communicate. The utility tree is presented in Table 2.5.

Availability

Availability in a general context (not only security) refers to how available the software is to carry out its functionality. Downtime is a measure of the time that the system is unavailable to carry out its functions. The utility tree is presented in Table 2.6. Availability is specified for the part of the framework that distributes the plugins.

Attribute refinement | Id | Quality attribute scenario
Downtime | AS-1 | The system should be up 99.5% per year. This means the system has an allowed scheduled downtime of 43 hours and 30 minutes per year for maintenance (M, L)
Downtime | AS-2 | The maximal duration of the interval during which the system is unavailable is 3 hours (M, L)
Network | AS-3 | If there is no active network connection, the local device can be used for operation of the framework (H, H)

Table 2.6: Availability utility tree


Attribute refinement | Id | Quality attribute scenario
Learnability | US-1 | A user should be able to learn how to build an image processing application in at most one hour (H, L)
Learnability | US-2 | An experienced developer should be able to start developing a Consumer plugin for the system within one day (H, L)
Learnability | US-3 | An experienced developer should be able to start developing a Producer plugin for the system within one day (H, L)
Errors | US-4 | A user should not make more than 3 errors to build an image processing application (H, L)

Table 2.4: Usability utility tree

Attribute refinement | Id | Quality attribute scenario
Confidentiality | SS-1 | Streams created by a user can only be accessed by that user and not by any other entity (H, L)
Integrity | SS-2 | Streams can't be manipulated without authorization by the user that made the streams (H, L)
Availability | SS-3 | During an attack, the core functionality is still available to the user (H, M)
Authentication | SS-4 | Users should authenticate with the system to perform functions (H, L)
Authentication | SS-5 | Developers should authenticate their plugins before adding them to the framework (H, L)

Table 2.5: Security utility tree

Architecturally significant requirements

Architecturally Significant Requirements (ASRs) are the requirements that are the most important to realize according to business value, and that have the most impact on the architecture. From the utility trees and the measures of the quality attribute scenarios, the QARs modifiability, interoperability and performance are identified as ASRs.

2.2 Patterns and tactics

An architectural pattern is a package of design decisions that is found repeatedly in practice, that has known properties that permit reuse, and that describes a class of architectures. Architectural tactics are simpler than patterns, typically using just a single structure or computational mechanism; they are meant to address a single architectural force. Tactics are the 'building blocks' of design, and an architectural pattern typically comprises one or more tactics [40]. Based on the ASRs, several tactics are listed in Table 2.7 that are used for the base pattern selection. The explored patterns are layers, event-driven architecture, microkernel and microservices.

2.2.1 Layers

The layered pattern divides the software into units called layers, which each perform a specific role within the application. Each layer is allowed to use the layer directly beneath it via its interface. Changes in one layer are isolated if the interfaces don't change, enabling MT-1, MT-2 and MT-5 [40]. While changes can be isolated to the layers, they remain difficult due to the monolithic nature of most implementations of this pattern. Layers contribute to a performance penalty, due to the 'architecture sinkhole' phenomenon, in which requests simply propagate through layers for the sake of layers [45].

2.2.2 Event-driven architecture

This pattern consists of several event publishers that create events, and event subscribers that process these events. The publishers and subscribers are decoupled by using an event channel, to which the publishers publish events and which forwards them to the event subscribers. The subscribers should have a single purpose and execute asynchronously. Since the publishers and subscribers are single-purpose and are completely decoupled from other components via the event channel, changes are isolated to one or a few components, enabling MT-1, MT-2, MT-4, MT-5 and MT-7. If the event channel adds a discovery mechanism, IT-1 can also be enabled. Overall, the pattern is relatively easy to deploy due to the decoupled nature of the components. Performance in general can be very high through the asynchronous nature of the architecture, enabling PT-6 and PT-7. If the event channel is tweaked to contain extra functionality, PT-1, PT-3, PT-8, PT-9, PT-10 and PT-11 can be enabled as well. If the components have a limited event response, then PT-2 and PT-5 can also be enabled. Development can be somewhat complicated due to the asynchronous nature of the pattern [40, 45].

2.2.3 Microkernel

The microkernel pattern allows the addition of application features as plugins to the core application, providing extensibility as well as feature separation and isolation. The pattern consists of two components: a core system, called the kernel, and


plugins. The business logic is divided between independent plugins and the kernel. The kernel contains only the minimal functionality required to make the system operational. The plugins are standalone, independent components that contain specialized processing, additional features and custom code, meant to enhance or extend the core system to produce additional business capabilities. In many implementations, plugins are independently developed third-party modules. Changes can largely be isolated and implemented quickly through the loosely coupled plugins; all MTs can be enabled. Depending on how the pattern is implemented, the plugins can be dynamically added to the kernel at runtime. Via a resource discovery service in the kernel, the ITs can be enabled. In general, most applications built using the microkernel pattern perform well, because applications can be customized and streamlined to only include the features that are needed [45].
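As a minimal illustration of the kernel/plugin split, the Python sketch below shows one way a kernel could discover and load plugins at runtime; the plugins package layout and the Plugin entry-point class are assumptions made for this sketch, not part of the pattern description above.

import importlib
import pkgutil

class Kernel:
    # Minimal core system: only keeps a registry of loaded plugins.
    def __init__(self):
        self.plugins = {}

    def load_plugins(self, package_name="plugins"):
        # Discover every module in the (hypothetical) 'plugins' package and
        # register the Plugin class each module is assumed to expose.
        package = importlib.import_module(package_name)
        for _, name, _ in pkgutil.iter_modules(package.__path__):
            module = importlib.import_module(f"{package_name}.{name}")
            self.plugins[name] = module.Plugin()

    def run(self, name, *args):
        # Delegate a request to a plugin; the kernel itself stays minimal.
        return self.plugins[name].process(*args)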

2.2.4 Microservices

Microservices is an architectural pattern that structures an application as a collection of loosely coupled services that implement business capabilities. Each component of the pattern is a separate unit that can be deployed on one device or on multiple devices. The components can vary in granularity, from a single module to a large portion of the application. The components contain one or more modules that represent either a single-purpose function or an independent portion of a business application [45, 46]. Due to the separately deployed units, changes are isolated to individual components, enabling all MTs. Via service discovery mechanisms, the ITs can also be enabled. The microservices pattern supports distributed deployment of the software across multiple devices by design. This pattern is not known to produce high-performance applications, due to the distributed nature of the pattern, which relies on communication via a network [45, 46].

2.2.5 Comparison of patterns

Table 2.7 summarizes the analysis of the patterns. A score is given based on how well the pattern enables the tactic: Low means that the pattern does not naturally enable the tactic; Medium indicates the pattern can be implemented with the tactic, but does not include it itself; High means the tactic is enabled in the pattern; Excellent means that the tactic plays a key role in the pattern.

The microkernel pattern and the microservices pattern both enable most tactics. The microkernel pattern implements extendability of the framework by design, using plugins, which is the main idea for the framework, and is thus an excellent base pattern. Interoperability and deployability of these plugins can be ensured by the microservices pattern, as it designs the microservices to have well-defined interfaces for interoperability, and allows for the framework to be deployed in a distributed fashion. The architecture presented below is a combination of both the microkernel pattern and the microservices pattern.

Tactic | Layers | Event-driven | Microkernel | Microservices
MT-1 Split module | Medium | High | High | Excellent
MT-2 Increase semantic coherence | Medium | High | High | Excellent
MT-3 Encapsulate | Medium | High | High | Excellent
MT-4 Use an intermediary | Medium | High | High | Excellent
MT-5 Restrict dependencies | High | High | High | Excellent
MT-6 Anticipate expected changes | Low | High | Excellent | Excellent
MT-7 Abstract common services | Low | High | Excellent | Excellent
MT-8 Defer binding / Runtime registration | Low | Low | Medium | High
IT-1 Discover services | Low | Low | High | High
IT-2 Orchestrate interface | Low | Low | High | High
IT-3 Tailor interface | Low | Low | High | High
PT-1 Manage sampling rate | Low | High | High | Medium
PT-2 Limit event response | Low | High | High | Medium
PT-3 Prioritize events | Low | High | High | Medium
PT-4 Reduce overhead | Low | High | High | Low
PT-5 Bound execution time | Low | High | High | Medium
PT-6 Increase resource efficiency | Low | High | High | High
PT-7 Introduce concurrency | Low | High | Low | High
PT-8 Maintain copies of computation | Low | High | Low | High
PT-9 Load balancing | Low | High | Low | High
PT-10 Maintain multiple copies of data | Low | High | Low | High
PT-11 Bound queue sizes | Low | High | Low | Medium
PT-12 Schedule resources | Low | High | Low | Medium

Table 2.7: Comparison of how well the discussed patterns enable the tactics needed for the ASRs


2.3 Software architecture

The software architecture is documented in three document categories: static views, dynamic views and deployment views. The static views comprise the different components of the system and their relationships among each other. The dynamic views describe the runtime behavior of the system. Finally, the deployment views provide different configurations of how the system can be deployed on different devices [47].

2.3.1 Static view

Figure 2.2 presents an overview of the architecture using a component-connector UML diagram. Components are the boxes that represent different software entities that exist at runtime. The components have interfaces through which they interact with other components. These are indicated using the 'lollipop' notation, with the 'ball' representing the interface that a component provides, and a socket indicating that another component is using this interface. The type of data exchanged is noted next to the interface. Multiple boxes indicate that multiple components of the same kind can exist at runtime [48].

Figure 2.2: Overview component-connector diagram of the architecture

The architecture consists of the following core components: Client Interface, Producer, Stream, Consumer, Producer Distribution, Consumer Distribution, Producer Plugin and Consumer Plugin. The clear components in Figure 2.2 form the core framework, which each user needs to install to use the framework. The colored components form a distribution service for framework plugins to extend the functionality; they are not installed with the core framework, but run as remote instances with which the user can interact to extend his version of the core framework with new plugins. A user can use the framework via the Client Interface, building streams that are maintained in the Stream component. The Stream component makes requests to the Producer and Consumer components to activate and control the selected plugins to build the stream. Additional plugins can be added to the framework and are distributed via the Producer and Consumer Distribution components. The architecture implements a hybrid combination of the microservices and microkernel patterns. Each presented component is a microservice that implements its own interface to interact with other components. The Producer and Consumer components act as kernels in the microkernel pattern, with the Producer and Consumer plugins acting as plugins in the microkernel pattern. These patterns enable the tactics needed to meet the requirements presented in Section 2.1.

Communication protocol

To allow the microservices to communicate, a communication protocol must be designed. Communication protocols can roughly be classified into two categories: synchronous and asynchronous. Synchronous protocols block on requests, which means that the client waits for a response from the server and can only continue executing when a response is received. This makes a synchronous protocol inherently more reliable, but also slower. An example synchronous protocol is the HyperText Transfer Protocol (HTTP). Asynchronous protocols just send messages and do not block on the response. This makes the protocol less reliable, but also faster [49].

There are two types of traffic exchanged between microservices. First, there are the command requests that are exchanged between microservices to edit resources or change state. Second, there are the video frames that are exchanged between Producer and Consumer Plugins. Both types of traffic have different requirements. The commands must be communicated reliably and need to be executed once and only once. The reliability is more important than latency, so a synchronous protocol is preferred. Microservices traditionally implement the synchronous HTTP protocol with Representational State Transfer Application Programming Interfaces (REST APIs) that specify the application endpoints as textual resources [45]. This common protocol is used for the exchanged command requests in the framework.

The video frames need to be sent with low latency at a high frequency, but reliability is less important; an asynchronous protocol is preferred. For video streaming, the Real-time Transport Protocol (RTP), running on top of the User Datagram Protocol (UDP), is selected, as it enables real-time transfer of data between processes [50]. RTP defines a standardized packet format to transmit video and audio over a network. It sequences each packet with a sequence number and a timestamp. This allows the application to detect missing packets and latencies in the network. The UDP protocol is a low-latency, asynchronous transport protocol, as it doesn't guarantee packet delivery.

The recommended codec for transmitting video media is Motion JPEG (MJPEG), which encodes video frames as separately encoded JPEG images. This makes analysis and processing in subsequent plugins easier, as only the received frame is needed to perform the analysis or processing. Other video compression formats, such as MPEG-4, use key-frames and object-oriented differential compression. If a key-frame is received via the stream, the frame can be used as is; if a reference frame is received, the receiver needs to wait for the corresponding key-frame before it can construct the full video frame for analysis. This introduces extra complexity and lower quality detection, which is a clear trade-off against the quality and simplicity that MJPEG offers [51, 52].
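To make the transport concrete, the Python sketch below sends MJPEG frames over a bare-bones RTP/UDP socket: each frame is JPEG-encoded and prefixed with the 12-byte RTP header carrying the sequence number and timestamp mentioned above. It is a simplified illustration, not the framework's implementation: a production sender would follow the full RTP/JPEG payload format (RFC 2435) and fragment frames that exceed the UDP datagram limit, and the host, port and camera source here are placeholder assumptions.

import socket
import struct
import time

import cv2  # OpenCV, used here to capture and JPEG-encode frames

RTP_VERSION = 2
PAYLOAD_TYPE_JPEG = 26  # static RTP payload type for JPEG

def rtp_header(seq, timestamp, ssrc):
    # 12-byte fixed RTP header: V=2, P=0, X=0, CC=0, M=0, PT=26.
    return struct.pack("!BBHII", RTP_VERSION << 6, PAYLOAD_TYPE_JPEG,
                       seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc)

def stream(host="127.0.0.1", port=5000, fps=25):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    camera = cv2.VideoCapture(0)  # placeholder frame source
    seq, ssrc = 0, 0x12345678
    while True:
        ok, frame = camera.read()
        if not ok:
            break
        _, jpeg = cv2.imencode(".jpg", frame)
        timestamp = int(time.time() * 90000)  # 90 kHz clock, usual for video
        sock.sendto(rtp_header(seq, timestamp, ssrc) + jpeg.tobytes(),
                    (host, port))
        seq += 1
        time.sleep(1.0 / fps)

if __name__ == "__main__":
    stream()

On the receiving side, gaps in the sequence numbers reveal lost frames, which is exactly the loss-detection property the text attributes to RTP.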

Applying these protocols to the architecture results in the network topology depicted in Figure 2.3. The full lines represent communication via HTTP on top of the Transmission Control Protocol (TCP). The dashed lines represent the RTP protocol on top of the UDP protocol. The boxes represent the different microservice components of the framework.

Figure 2.3: Framework network topology. Each box is a microservice component of the framework. The full lines indicate communication over the HTTP/TCP protocol; the dashed lines indicate communication over the RTP/UDP protocol.

Client Interface

The Client Interface is the interface through which end-users interact with the framework. Figure 2.4 presents the detailed component-connector diagram. The Client Interface consists of a User Interface (UI) component and an API Gateway component. Devices can make requests to the Client Interface via the Client Requests interface provided by the API Gateway. The UI provides the UI Operation interface that is used by end-users to control the framework; this can be either a visual or a textual interface. The UI actions are translated to client requests that are forwarded to the API Gateway using the Client Requests interface. The API Gateway translates the client requests and forwards them to the other core framework components.

Figure 2.4: Client Interface detailed view

Stream

The Stream component maintains the logical representation of the streams built by the end-user for his image processing application. Figure 2.5 presents the detailed component-connector diagram.

Figure 2.5: Stream detailed view

It consists of an API, a Stream Manager and several Stream Model components. The API provides the Stream Commands interface used by the Client Interface to interact with the framework; it translates incoming requests to commands for the Stream Manager, which can then execute these commands. Commands include creating a new stream, modifying the stream layout, modifying the stream state, etc. The Stream Manager creates and manages multiple streams, each represented by a Stream Model. So the end-user builds Stream Models to create image processing applications. The Stream Model represents the logical model of these image processing application streams. As stated before, a stream consists of several plugins processing media, placed in some order, that are linked by the framework. Figure 2.6 illustrates this concept.

Figure 2.6: Logical model of a stream. The arrows represent the flow of media through the stream.

Logically, the Stream Model is represented as a tree with multiple roots and multiple leaves. The framework builds streams by initializing the needed plugins and connecting them in order. In the example Stream Model, plugins receive media from multiple source plugins and forward media to multiple targets. The Stream Model has a global state that represents the cumulative state of all plugins. To transition the global state from A to B, all plugins need to transition from A to B. This is done by first making the transition on the leaves of the Stream Model, after which the transition propagates to the root nodes. This ensures that no media is lost, because the first transitioned plugins can't process anything, as there is no media being put into the tree yet.
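A minimal sketch of this leaves-first propagation is given below, assuming each node in the stream tree knows the sources feeding it; the node structure and names are illustrative, not taken from the implementation.

class StreamNode:
    # One plugin in the stream tree; 'sources' are the nodes feeding it media.
    def __init__(self, name, sources=()):
        self.name = name
        self.sources = list(sources)
        self.state = "STOP"

def transition(leaves, new_state):
    # Walk from the leaves towards the roots: a node transitions before the
    # nodes that feed it, so no source produces media that nothing
    # downstream is ready to process.
    visited = set()

    def visit(node):
        if node.name in visited:
            return
        visited.add(node.name)
        node.state = new_state
        for source in node.sources:
            visit(source)

    for leaf in leaves:
        visit(leaf)

For the fire-detection example, this would amount to: camera = StreamNode("camera"); detector = StreamNode("detector", sources=[camera]); transition([detector], "PLAY").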


Producer and Consumer plugins

A Plugin represents an independent media processing element, either of the Producer type (such as a thermal camera) or the Consumer type (such as an object detection software module). Plugins are deployed as standalone microservices providing a REST API that the framework uses to control the plugin. Figure 2.7 represents a general plugin model. A plugin receives media from other plugins, called the sources, processes this media, and forwards it to other plugins, called the listeners. A Producer plugin only has listeners; a Consumer plugin has both sources and listeners. Merging the media from multiple sources and forwarding the processed media to multiple listeners is the responsibility of the plugin.

Figure 2.7: Plugin model

The plugin REST API should at least provide: a state resource, representing the state of how the plugin is processing media; a sources resource, representing the sources from which the plugin receives media to process; and a listeners resource, representing the listeners to which the plugin transmits the processed media. Only Consumers have both the sources and the listeners resource, as Producer Plugins produce their own media and hence can only have listeners.

To indicate if and how the plugin is actively processing media, a finite state machine is implemented. The state transition diagram is presented in Figure 2.8. A plugin can be in four possible states: INACTIVE, STOP, PLAY and PAUSE. When a plugin is in the INACTIVE state, no active microservice is running the plugin. This is the initial state for all plugins of the framework. This state is only visible to the framework, as in this state the plugin is not instantiated. When a plugin is in the STOP state, the framework has instantiated a microservice running the plugin. The plugin is listening for commands on its API, but is not processing any media. This state is visible to the plugin. In the PLAY state, a plugin is processing media received from its source(s), transmits processed media to its listener(s) and is listening for commands. When in the PAUSE state, media processing is paused, but media buffers are kept. This is to decrease the latency when the plugin transitions back to the PLAY state, since the plugin can continue processing from the point where it was paused. The difference with the STOP state is that, when transitioning to the STOP state, the plugin clears its media buffers.

The plugin starts in the INACTIVE state. When a microservice running the plugin is instantiated by the framework, the plugin initializes itself in the STOP state. From the STOP state, the plugin can transition to the PLAY state to process media. This transition is only successful if sources and listeners are registered with the plugin. From the PLAY state, a transition to both



the STOP state and the PAUSE state can be made, which stops the processing of media and respectively drops or keeps the media buffers. The plugin cannot make multiple state transitions per command. When a transition is made to INACTIVE, the framework first transitions the plugin to the STOP state, after which the INACTIVE state can be reached.

Figure 2.8: The state transition diagram for a plugin
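A minimal Python sketch of this state machine, encoding only the transitions spelled out above, could look as follows; the full diagram in Figure 2.8 may contain additional edges not listed here, so the transition table is an assumption where the text is silent.

# Legal single-step transitions as described in the text.
TRANSITIONS = {
    "INACTIVE": {"STOP"},
    "STOP": {"PLAY", "INACTIVE"},
    "PLAY": {"PAUSE", "STOP"},
    "PAUSE": {"PLAY"},
}

class PluginStateMachine:
    def __init__(self):
        self.state = "INACTIVE"  # initial state for every plugin
        self.buffers = []        # stand-in for the plugin's media buffers

    def transition(self, target):
        # Only one state transition per command, and only along legal edges.
        if target not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {target}")
        if target == "STOP":
            self.buffers.clear()  # STOP drops the buffers; PAUSE keeps them
        self.state = target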

A source/listener has the following fields: hostname, the hostname of the microservice running the plugin, and port, the port on which the source/listener is reachable.

On the sources and listeners resources, the HTTP GET and POST methods must be provided: GET retrieves the sources/listeners and their details, POST adds a new source/listener to the plugin. Both resources additionally need to provide an individual endpoint per source/listener, on which GET, PUT and DELETE must be provided. This allows individual manipulation of a source/listener: GET retrieves the details, PUT updates the fields, and DELETE removes the source/listener from the plugin.
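Given that the proof of concept uses Flask (see Section 3.2.1), the listeners resource of a plugin could be sketched as below; the in-memory storage and the returned JSON shapes beyond the hostname and port fields are assumptions for illustration, and the sources resource would be analogous.

from flask import Flask, abort, jsonify, request

app = Flask(__name__)
listeners = {}  # id -> {'hostname': ..., 'port': ...}; in-memory for the sketch
next_id = 0

@app.route('/listeners', methods=['GET'])
def get_listeners():
    return jsonify(listeners)

@app.route('/listeners', methods=['POST'])
def add_listener():
    global next_id
    body = request.get_json()
    listeners[str(next_id)] = {'hostname': body['hostname'], 'port': body['port']}
    next_id += 1
    return jsonify(id=next_id - 1), 201

@app.route('/listeners/<lid>', methods=['GET', 'PUT', 'DELETE'])
def one_listener(lid):
    # Individual manipulation of a single listener, as described above.
    if lid not in listeners:
        abort(404)
    if request.method == 'GET':
        return jsonify(listeners[lid])
    if request.method == 'PUT':
        listeners[lid].update(request.get_json())
        return jsonify(listeners[lid])
    del listeners[lid]  # DELETE
    return '', 204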

Producer and Consumer

The Producer and Consumer components are responsible for interacting with and managing the Producer/Consumer plugins used in the streams. Figure 2.9 presents the component-connector diagram of the Producer and Consumer components. Both components have a similar architecture, but are separate components. This is because their plugin models differ and they are suspected to often be deployed on different devices, having specific hardware requirements: Producer Plugins could be deployed on the thermal camera itself, having a very specific operating system, whereas a Consumer plugin might need access to specific processors to speed up its execution.

The Producer and Consumer consist of the following components: API, Kernel, Plugin Model and Plugin Manager. The API translates requests coming from the Stream component to commands for the Kernel. The Kernel implements the core functionalities, such as activating (deploying) and deactivating plugins, managing their state and manipulating their resources. It creates a Plugin Model for each Plugin that the framework has installed. This model represents a plugin logically on framework level and keeps track of the Plugin resources. The Plugin Manager manages the plugins that were added to the framework, stored in the Plugin Directory. It manages the plugin installations, adding updates or installing additional plugins that can be retrieved from the Producer and Consumer Distribution components.


(a) Producer component-connector diagram. (b) Consumer component-connector diagram.

Figure 2.9: Component-connector diagrams of the Producer and Consumer modules

Producer and Consumer Distribution

The Producer and Consumer Distribution components are responsible for managing and maintaining the plugins for the framework. They act as online software repositories from which local versions of the framework can retrieve new plugins. The component-connector diagrams are presented in Figure 2.10. The Distribution components consist of the following subcomponents: API, Plugin Manager and Plugin Tester. Plugin Developers can make requests to the API, which translates these requests to Create, Read, Update, Destroy (CRUD) commands for the Plugin Manager. The Plugin Manager executes these commands on the Plugins that are kept in the Plugin Repository. The quality of the framework depends on the quality of the plugins that it offers. Therefore, plugins should be thoroughly tested before being added to the framework, to guarantee quality. The Plugin Tester component is responsible for this testing. Tests should include testing if the plugin implements the Plugin Model correctly, if the plugin meets the performance requirements, etc. When a plugin passes these tests, it is added to the Plugin Repository, so that end-users can install the plugin and use it for their applications.


(a) Producer Distribution. (b) Consumer Distribution.

Figure 2.10: Producer and Consumer Distribution component-connector diagrams

2.3.2 Dynamic views

Dynamic views depict the behavior of the system and complement the static views. They are documented using sequence diagrams, which show an explicit sequence of messages between architecture elements that describes a use case [40]. Two key use cases are presented here: adding a plugin to the stream, and linking plugins to build the stream.

Add plugin to stream

Figure 2.11 presents the sequence diagram for adding a Producer plugin to the framework. The framework is assumed to be running, the user has created a stream S, and the Producer Plugin A is correctly installed. The end-user executes the command to add A to stream S on the Client Interface, which passes the command to the Stream component. The Stream component requests the creation of a microservice instance of A, which is created by the Producer Kernel. When the Producer Plugin is instantiated, the Producer Kernel creates a Plugin Model of A and adds it to its references, so that the instance of A can be reached for future commands. Afterwards, the Stream Manager is informed of the success, upon which the Stream Manager can add A to the Stream Model, ready to be linked with other plugins. The user is notified of this success and can continue building. If A could not be instantiated (due to not being installed, not being installed correctly, etc.), A is marked as 'broken' and the user is notified that the action could not be completed. When the plugin is marked as 'broken', it can no longer be used and needs to be reinstalled. The sequence diagram for adding a Consumer Plugin is similar, but replaces the Producer components with the Consumer components.

Figure 2.11: Add a Producer Plugin to a stream

Link plugins

Figure 2.12 presents the sequence diagram for linking two plugins in a stream. In the sequence diagram, two Consumer Plugins A and B are linked; this can be extended to a Producer Plugin linking with a Consumer Plugin. The framework is assumed to be running, the user has created a stream S, and the plugins A and B have been instantiated and added to the stream. The end-user executes the command to link A and B in stream S on the Client Interface, which passes the command to the Stream component, which in turn checks if the link is valid for the Stream Model S. Linking can only be done if the stream is in the STOP state and if the plugins are already in the stream. If the link is valid, the Stream Manager can begin linking the plugins. To link


the plugins in the order A-B, A is added as a source for B and B is added as a listener for A. These subsequences are found in their corresponding frames in the diagram and are very similar. The Stream Manager makes the request to add the source/listener to the Kernel, which finds the corresponding plugin and makes the request on the corresponding Plugin Model. If the Plugin succeeded, the Plugin Model is updated and the Stream Manager is notified of this success. If both plugins have successfully set the source and listener, the Stream Model layout is updated with the link. Should the source/listener request fail for one of the plugins, the change is rolled back and the end-user is notified.
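In terms of the plugin REST API described earlier, the linking subsequences boil down to two HTTP calls made by the framework, sketched below with the Python requests library; the addresses and the id field in the POST response are placeholder assumptions matching the earlier Flask sketch.

import requests

def link(a, b):
    # Link plugin A to plugin B (media flows A -> B). a and b are dicts
    # such as {'hostname': '10.0.0.5', 'port': 5000} (placeholders).
    r1 = requests.post(f"http://{b['hostname']}:{b['port']}/sources", json=a)
    r2 = requests.post(f"http://{a['hostname']}:{a['port']}/listeners", json=b)
    if r1.ok and r2.ok:
        return
    # Roll back whichever half succeeded, mirroring the sequence diagram.
    if r1.ok:
        requests.delete(
            f"http://{b['hostname']}:{b['port']}/sources/{r1.json()['id']}")
    if r2.ok:
        requests.delete(
            f"http://{a['hostname']}:{a['port']}/listeners/{r2.json()['id']}")
    raise RuntimeError("linking failed and was rolled back")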

2.3.3 Deployment views

The different deployment configurations are illustrated via deployment diagrams using the UML 2.5 Deployment specification [48]. 'Host' specifies the device on which components are deployed. 'Microservice' indicates the isolated environment in which components are running. These isolated environments on the host are realized as software containers, which enable portability of the components to other deployment configurations. This concept is further discussed in Section 3.3. The Producer and Consumer Distribution components were left out of the diagrams, as they always run on a different host than the core framework. Two deployment configurations are presented: the local configuration, which deploys the components on a single device, and the distributed configuration, which deploys each component on a separate device. These configurations are presented in Figure 2.13.


Figure 2.12: Link two plugins in a stream. The 'format request' and 'translate request' actions in the API components have been omitted to reduce clutter in the diagram, but are executed by the API components.


Local configuration deployment

The local configuration deploys the framework on a single local device. The configuration is depicted in Figure 2.13a. Because the framework is deployed as a whole, it can operate offline. This configuration is useful for image processing applications that can't rely on a stable network connection; examples are remote locations or densely built-up areas. The components are still deployed as separate microservices, due to the architecture of the framework. This has an impact on the performance of the framework, because for every interaction between components either the HTTP message protocol or the RTP protocol is used, which introduces extra overhead compared to direct invocation of commands.

Distributed configuration deployment

The distributed configuration deploys the framework on multiple devices. The components are distributed over these devices, made possible by the microservice isolation and communication protocols. This configuration is depicted in Figure 2.13b. Obviously, in this configuration each component of the framework must have a reliable network connection to communicate with the other framework components. This configuration could be used, for example, for a security application. The end-user has the Stream module running on a master node that controls several cameras. The end-user can configure his image processing application through the Client Interface running on his device, which communicates with the Stream module running on the master node. The master node can control each camera by communicating with the Producer component. If, for example, the security application requires autonomous detection of trespassing people, a computationally intensive task, the Consumer Plugins could need dedicated hardware that is only available on another device. The Consumer component can then be deployed on that dedicated device, and the Stream component can again communicate with it over the network. The success of this configuration depends on the availability and the capacity of the network. If the network fails, commands and media can't come through and the framework can no longer execute. Due to the distributed nature, performance will also be worse when compared to the local configuration, because each request between the components travels over a network that can experience delays.


(a) Local configuration deployment diagram. (b) Distributed configuration deployment diagram.

Figure 2.13: Deployment diagrams


Chapter 3

State of the art and technology choice

To build and test a proof of concept implementation of the architecture presented in Chapter 2, several state of the art technologies can be used as support for the framework. These are presented in Sections 3.1, 3.2, 3.3 and 3.4. For each category, a choice is made that will serve as the basis for the implementation of the proof of concept, discussed in Section 3.5. Readers already familiar with the presented technologies can safely skip ahead to Section 3.5.

3.1 Thermal camera options

This section aims to provide an overview of some currently commercially available thermal cameras. The overview is not a complete overview of all products offered by all vendors. This data was gathered in September 2017, so some products may since have been discontinued and new products may already have been launched. Several parameters are collected for each product. Section 3.1.1 discusses why these parameters are important to assess the quality of a thermal camera. Section 3.1.2 aims to aggregate these parameters and presents insights into the data. The full list of specifications can be found in Appendix B.

3.1.1 Parameters

The following parameters were considered for the comparison: price, physical specifications, image quality, thermal precision, interfaces, energy consumption, help and support, user experience and auxiliary features.

Price

Thermal cameras are relatively expensive when compared to visible light cameras. For example, a 20 megapixel (MP) visible light camera can cost as little as 100 euro, while thermal cameras, having a much lower image resolution, can cost as much as 15,000 euro. Prices for thermal cameras cover a very wide range, and budgets are limited in practice.


Physical specifications

Two specifications are considered: the weight of the camera and the dimensions of the camera. Drones have a limited carrying weight, due to maximal carrying capacities and a faster draining of battery life when carrying heavier loads. Lighter and smaller cameras are preferred for usage with drones; these often offer lower image quality and fewer features than the heavier cameras.

Image quality

Image quality specifies how much information an image can possibly hold. It consists of four parameters: resolution, capture frequency or frame rate, field of view, and radiometric information. Image resolution is the amount of detail an image holds. Higher resolution cameras can capture more details in a scene, resulting in a sharper image that holds more information. Due to more details, smaller objects can also be seen, allowing scenes to be viewed from larger distances. Drones capture images from relatively large distances, so good resolutions are required for the images to be useful. Image resolution is measured in pixel density, presented as the product of the number of pixels in the width and the height of the image. The highest resolution found for the compared cameras is 640 x 512 pixels. Some cameras offer a visual camera next to the thermal camera. This allows an overlay of the visual image and the thermal image, so-called Multi Spectral Dynamic Imaging (MSX). This creates artificially sharper images, because edges are more visible in the visual image and can therefore be seen more clearly. Figure 3.1 depicts a thermal-only image and an MSX image of a dog; it can be seen that the MSX image is sharper. MSX is a more low-cost solution to produce sharper images compared to increasing the thermal resolution, as visible light cameras are less expensive [7].

(a) Thermal. (b) MSX.

Figure 3.1: Thermal image and MSX image of a dog

The capture frequency or frame rate dictates how many frames the camera can capture per second. Higher frequency cameras are able to track dynamic scenes better. The field of view is the angle through which the camera is sensitive to thermal radiation, and determines the extent of the world that can be seen by the camera. Bigger fields of view can capture more of the environment in one image. Most cameras allow various lenses to be mounted onto the camera, which allows for greater flexibility in choosing the field of view. Radiometric image information is thermal information, embedded with the infrared image, that can be analyzed after recording. Radiometric information characterizes the distribution of the thermal radiation's power in space and specifies the temperature per pixel exactly. Regular thermal images use a relative scaling of temperatures that are mapped onto a colorspace, with some color being the hottest color in the image and another color the coldest. For example, in Figure 3.1a the Iron color scheme is used, which maps the cold regions of the image on blue color variants and the warmer regions on red and yellow variants. Radiometric information can give a very detailed description of the radiation pattern of a scene.

Thermal precision

Thermal precision specifies the temperature range, the sensitivity and the accuracy of the temperature measurements. The temperature range indicates the minimum and maximum temperatures a camera can detect. A larger temperature range comes with a trade-off in sensitivity and accuracy. Cameras often offer different modi of operation and operate using different intervals, according to the accuracy needed in a scene. Sensitivity indicates the ability of the camera to record finer distinctions in temperature. Accuracy is the margin of error for temperature readings on the thermal camera. An accuracy of 5 degrees Celsius for small temperature ranges and 20 degrees Celsius for large temperature ranges is commonly found. The increase in error margin is a trade-off for the larger temperature interval. Objects emit infrared waves in various forms (due to black-body radiation [7]). To accurately compare the temperatures, cameras often implement emissivity corrections that normalize the measurements.

Interfaces

Cameras can communicate with other devices via several interfaces during use. Cameras mounted on a drone cannot be accessed during flight and need these interfaces to transfer data. USB and HDMI are the most commonly found interfaces to connect the camera with an on-board processing unit, gimbal or battery. MAVLink [53] is a very lightweight, header-only message marshalling library for micro air vehicles and drones. When a camera provides this interface, it allows for a very efficient communication scheme to control the camera remotely. Other interfaces include Bluetooth and Wi-Fi.

Energy consumption

A device mounted on a drone has a limited energy source at its disposal. The less energy the camera consumes, the longer the drone can operate. This can even lead to lighter batteries being used in-flight, reducing the carried weight and therefore also the energy consumption. Typically, the energy consumption of a camera is much lower than the energy consumption of the drone itself, so this is a minor specification. Input voltage and power consumption are specified.


Help and support

How the camera is supported by the company has a big impact on the ease of developing applications for the camera platform. User manuals, phone or email support and FAQs are very helpful. Should the camera malfunction, a product warranty is necessary to recover the broken product.

User experience

The user experience is another important factor, as there is a difference between the technical specifications and the actual experience of the user. The user experience is measured as a number of good and a number of bad reviews. Reviews are scored from zero to five stars, with zero being a very bad experience and five being a very good experience. A good review is scored three stars or more, a bad review less than three stars.

Auxiliary features

Some cameras offer even more features than the ones mentioned above. These can be a connection with the Global Positioning System (GPS) to indicate where images were captured, a software application to interact with the camera, analysis functionality, tracking, etc.

3.1.2 Comparative analysis

It can be seen that FLIR is the market leader in thermal solutions for drones. They offer the largest product line, and products from other companies often utilize one of their camera cores. Figure 3.2a plots the retail price against the thermal resolution. Cameras with high and low resolutions are found across all price ranges; clearly, other features determine the price of a thermal camera. A feature function is defined that maps the features of a thermal camera onto an integer (a sketch in code follows the list below). The function increments the integer if:

• The camera has MSX support
• The camera has a standard data format (not just an analog or digital signal)
• The camera offers radiometric information
• The image resolution is at least 640 x 512 pixels, the highest resolution found for these products
• The sensitivity is smaller than 100 mK
• The camera offers emissivity correction
• The camera offers a USB interface
• The camera offers a MAVLink interface


• The camera offers an HDMI interface
• The camera offers a Bluetooth connection
• The camera offers a Wi-Fi connection
• The camera offers GPS tagging
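A direct Python translation of this scoring could look as follows; the field names of the camera record are invented for this sketch.

def feature_points(cam):
    # cam is a dict with (hypothetical) fields describing one camera;
    # booleans sum as 0 or 1, mirroring the increments in the list above.
    points = 0
    points += cam['msx']
    points += cam['standard_data_format']
    points += cam['radiometric']
    # 640 x 512 is the highest resolution found among the compared products.
    points += cam['resolution'][0] * cam['resolution'][1] >= 640 * 512
    points += cam['sensitivity_mk'] < 100
    points += cam['emissivity_correction']
    for interface in ('usb', 'mavlink', 'hdmi', 'bluetooth', 'wifi'):
        points += interface in cam['interfaces']
    points += cam['gps_tagging']
    return points

print(feature_points({
    'msx': True, 'standard_data_format': True, 'radiometric': False,
    'resolution': (640, 512), 'sensitivity_mk': 50,
    'emissivity_correction': True, 'interfaces': {'usb', 'mavlink'},
    'gps_tagging': False,
}))  # -> 7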

Figure 3.2b plots these feature points against the retail price. This gives a more log-like relationship: the features of a camera determine the price much more than the image quality alone. For a price of less than 5,000 euro, thermal cameras are found that implement most basic features. The price then increases rather quickly for fewer added features; these are features like radiometry that require additional hardware, which greatly increases the price of the camera.

Figure 3.2: (a) Camera resolution compared to retail price. (b) Camera feature points compared to price.

3.2 Microservices frameworks

The architecture presented in Section 2.3 relies heavily on the microservices pattern. Therefore, this section aims to present several microservices frameworks to support this architecture. Figure 3.3 depicts the results of the Rethink IT survey, querying the most used frameworks for microservices by developers [54]. The most popular frameworks, Java EE and Spring Boot, are written in Java. The Java EE framework is more of a one-stop-shop framework, offering much more functionality than just a backbone microservices framework, and is therefore not considered. Spring Boot is clearly a very popular and mature framework, more streamlined for microservices. Vert.x is a more upcoming framework, renowned for its performance, making it worthwhile to explore. Python is an upcoming language for web development and, because it is excellent for prototyping, several frameworks for this language are explored as well. The frameworks presented here are Vert.x version 3.5.1, Spring Boot version 2.0, Flask version 0.12, Falcon version 1.4.1 and Nameko version 2.9.0.

Figure 3.3: Rethink IT survey results: most used tools and frameworks for microservices [54]

3.2.1 Flask

Flask is a micro web development framework for Python. The term 'micro' means that Flask aims to keep its core simple but extensible. Flask is an unopinionated framework, as it only provides a glue layer to build a REST API around the application. However, it provides a large list of extensions if extra functionality is required [55]. Starting a microservice is very simple, as illustrated in Listing 1. Flask uses the concept of Python decorators [56] to bind Python functions to a REST API. In Listing 1, for example, the function service_status() is linked to the '/' resource: the route() decorator on the app object wraps service_status() and registers it as the handler for that resource, so that when a user issues an HTTP GET request on '/', the registered service_status() function gets called. This allows for an easy construction of the REST API: just decorate all the functions of the microservice with the correct Flask decorator.

from flask import Flask

app = Flask(__name__)

@app.route('/')
def service_status():
    return 'service_status'

if __name__ == '__main__':
    app.run()

Listing 1: Minimal Flask application

Because Flask is a microframework, its memory footprint is small, with the binary file being only 535 KB large. It is in use by several large companies, such as Netflix and Reddit [57]. In a production environment the default Flask web server is not sufficient, as it only serves one user at a time; for prototyping, however, it is an excellent framework [55].


3.2.2 Falcon

Falcon is a bare-metal Python web framework that differentiates itself through performance when compared to other frameworks. It targets microservices, being even more lightweight and faster than frameworks like Flask: in a benchmark test it achieves 27 times the speed of Flask [58]. The framework seems less mature and has not been adopted by many companies [59]. It is not considered for the prototype of the system, but could be used in production as it achieves better performance.
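As an illustration, a minimal Falcon service analogous to the Flask example in Listing 1 could look as follows (a sketch based on the Falcon 1.x API; the resource class name is chosen for illustration):

import falcon

class ServiceStatusResource(object):
    # Falcon maps HTTP verbs onto on_* responder methods
    def on_get(self, req, resp):
        resp.body = 'service status'
        resp.status = falcon.HTTP_200

# the WSGI application object
app = falcon.API()
app.add_route('/', ServiceStatusResource())

Unlike Flask, Falcon does not bundle a development web server; the app object is served by an external WSGI server such as gunicorn.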

3.2.3 Nameko

Nameko is a framework specifically built for building microservices in Python. Next to offering a REST API, it also offers asynchronous events over the Advanced Message Queuing Protocol (AMQP). It is only meant to be used for microservices, not for web applications that serve content. It is a relatively young framework and is not backed by any major companies as of yet. It is, however, backed by the developer of the Flask framework [60].
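For comparison, a minimal Nameko service could look as follows (a sketch; the service and method names are chosen for illustration). The @http entrypoint exposes a REST endpoint, while @rpc exposes the same functionality over AMQP, which requires a running RabbitMQ broker:

from nameko.rpc import rpc
from nameko.web.handlers import http

class StatusService(object):
    name = 'status_service'

    @rpc
    def status(self):
        return 'service status'

    @http('GET', '/status')
    def status_http(self, request):
        return 'service status'

The service is started with the nameko run command.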

3.2.4 Vert.x

Vert.x is a toolkit for building reactive applications on the Java Virtual Machine (JVM). This framework follows the reactive systems principles. These principles are used to achieve responsiveness and to build systems that respond to requests in a timely fashion, even with failures or under load. To build such a system, reactive systems embrace a message-driven approach: all the components interact using messages sent and received asynchronously. Reactive microservices built with Vert.x have the following characteristics: autonomy, asynchrony, resilience and elasticity. Vert.x is a toolkit and can be used as any other library, which makes it very flexible. It provides a large set of features: metrics, different programming languages, different protocols, templating, data access, cluster management, etc.

Vert.x embraces the asynchronous development model, which can be seen in Listing 2.

import io.vertx.core.AbstractVerticle;

public class Server extends AbstractVerticle {
  public void start() {
    vertx.createHttpServer().requestHandler(req -> {
      req.response()
        .putHeader("content-type", "text/plain")
        .end("Hello from Vert.x");
    }).listen(8080);
  }
}

Listing 2: Vert.x example


The event which occurs is the HTTP request. On arrival of the event, the Handler is called and executed. The Handler is chained to a listen request and does not block the calling thread. The Handler is only notified when an event is ready to be processed or when the result of an asynchronous operation has been computed [61].

3.2.5 Spring Boot

Spring Boot is an opinionated Java framework for building microservices, based on the Spring dependency injection framework. It allows developers to create microservices with reduced boilerplate and configuration. For simple applications it provides a syntax similar to Flask in Python, using decorators for routing; an example is given in Listing 3. The framework handles most of the routing and request handling, but restricts the developer in application structure. The framework is not lightweight and performs worse than Vert.x [62].

@RestController
@RequestMapping("/api")
public class HelloRestController {

  @RequestMapping(method = RequestMethod.GET, value = "/hola",
                  produces = "text/plain")
  public String hello() {
    return "Hello Spring Boot";
  }
}

Listing 3: Spring Boot example

3.3 Deployment framework

To allow for the modifiability and interoperability requirements discussed in Section 2.1.2 and the different deployment configurations in Section 2.3.3, Linux containers (LXC) are used. A container is a lightweight operating system running inside the host system, executing instructions native to the CPU and eliminating the need for the instruction-level emulation that virtual machines use. Containers provide an identical, isolated runtime environment for processes without the overhead of virtualization. This makes them perfect for highly portable software, as only the container needs to be moved and it can directly be executed on any system supporting containers [63]. First, the concept of containers is introduced in Section 3.3.1. Second, several container frameworks are presented in Sections 3.3.2, 3.3.3 and 3.3.4.

3.3.1 Containers

Containers sandbox processes from each other and are often described as the lightweight equivalent of virtual machines. The difference between a virtual machine and a container is the level of virtualization: virtual machines virtualize at the hardware


level, whereas containers do this at the operating system (OS) level. The achieved effect is similar, but there are significant differences. Containers make available protected portions of the OS and share its resources. Two containers running on one OS have their own OS abstraction layer and don't know they are running on the same host. This yields a significant difference in resource utilization. Virtual machines provide access to hardware only, so it is necessary to install an OS; as a result, there are multiple OSs running, which gobble up resources. Containers piggyback on the running OS of the host environment; they merely execute in spaces that are isolated from each other and from certain parts of the OS. This allows for efficient resource utilization and for cheap creation and destruction of containers. Consequently, starting and stopping a container is equivalent to starting and stopping an application [64, 65]. This comparison is illustrated in Figure 3.4.

Containers offer several advantages over running a process directly on the system. Due to the OS virtualization of the containers, software is always deployed on the same operating system defined in the container. This allows for a 'write once, run everywhere' scenario, which makes the system portable to a range of devices. Containers communicate with each other using protocols such as HTTP. This allows the processes in containers to be written in any programming language, using any external library that is needed. For the system this means that if the Producer and Consumer Plugins are packaged as containers, they can effectively be made in any available technology, greatly enhancing the extensibility of the system.

3.3.2 LXC

Linux containers are the basis on top of which other container frameworks are built. LXC provides a normal OS environment similar to a VM; the containers in this framework behave almost identically to a VM and can run multiple processes. LXC can be used directly, but offers only low-level functionalities and can be difficult to set up [67].

3.3.3 Docker

Docker started as an open-source project at dotCloud in early 2013, as an extension of the technology the company had developed to run its cloud applications on thousands of servers [64]. Now, Docker is a standalone, mature company providing a software container platform for the deployment of applications [66]. Docker provides two main services: a simple toolset and API for managing Linux containers, and a cloud platform which provides easy access to recipes for software containers created by other developers [68]. Docker is the container technology with the most public traction and, at the time of writing, is becoming the container standard due to its functionalities and very responsive community. It offers functionality to easily build and run containers, but also to manage them in large clusters. Two design decisions limit Docker: each container can only run one process at a time, and the Docker client is a REST client communicating with a daemon that manages the containers and the API Engine; should this client fail, dangling containers can arise [69].
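To illustrate how a process is packaged as a container, a minimal Dockerfile for a Flask-based microservice could look as follows (a sketch; the file and directory names are hypothetical):

FROM python:3.6
WORKDIR /usr/service
COPY . /usr/service
RUN pip install flask
EXPOSE 8080
CMD ["python", "app.py"]

Running docker build -t service . creates the image, and docker run -p 8080:8080 service starts a container from it.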

3.3.4 rkt

CoreOS's rkt is an emerging container technology providing an API engine, similar to the Docker API Engine, that can run LXC containers as well as Docker containers. rkt focusses on security and standardization and is specifically designed to run in cloud environments. Unlike Docker, rkt does not use a daemon process with a REST client: the command line tool executes all the operations, which makes the framework more reliable. rkt is not yet as mature as Docker. It is portable to multiple Linux environments, but not yet to macOS and Windows [70].

Figure 3.4: Containers compared to virtual machines [66]. (a) Container stack; (b) virtual machine stack.

3.4 Object detection algorithms and frameworks

As stated in Section 1.3.2, object detection is the computer vision task of detecting which objects are present in an image and where they are located. Several approaches to this problem have been proposed, some of which focus on thermal images. This section aims to give a small overview of different existing techniques. For the technical details of the algorithms, the reader is referred to the respective articles on the algorithms.

3.4.1 Traditional approaches

Traditional approaches include hot-spot detection techniques and Adaptive Boosting (AdaBoost) with various feature extraction techniques, such as Aggregated Channel Features (ACF) and Integral Channel Features (ICF). These methods rely on clever feature engineering solutions that use domain knowledge or statistical insights to transform the raw dataset into a specific set of features in order to find patterns [32].

Hot-spot detection

Hot-spot techniques work on the assumption that people have an overall higher body temperature than most of the background in the thermal image. These techniques first select candidate objects: the hot-spots in the image. The hot-spots define the region on which a classifier is run and are thus the localization step in the object detection problem. Afterwards, a classifier is trained on these candidates. Xu et al. used a Support Vector Machine (SVM) classifier to classify whether the hot-spot


represented a pedestrian [71]. Nanda et al. used a Bayes classifier to classify the hot-spots [72]. These methods are generally not applicable, because people often are not the only hot-spots in thermal images.

AdaBoost

AdaBoost is a machine learning algorithm that utilizes the output of so-called weak learning algorithms (weak learners) and combines their outputs into a weighted sum that forms the output of the boosted classifier. AdaBoost modifies the weak learners in favor of data points misclassified by previous classifiers [73]. Viola and Jones built a detection algorithm that uses two consecutive frames of a video sequence and trains the AdaBoost classifier on both motion and appearance information [74]. Davis et al. use a two-stage template approach that initially performs a fast screening procedure with a generalized template, using a contour saliency map to locate potential person locations. Any window located in the first phase is then forwarded to the AdaBoost algorithm to validate the presence of the person. Dollár et al. extracted features using different ICF and ACF [35]. ICF and ACF compute features by calculating several aggregations over the different channels of an image, such as gradients and color histograms. Goedemé et al. expanded these detectors with extra thermal channels to achieve results comparable to those of Dollár et al., but for thermal images [36].

3.4.2 Deep learning

Over the past few decades, there has been a shift in proposed solution methods towards deep learning. Deep learning for object detection uses Convolutional Neural Networks (CNNs). CNNs are a specialized kind of neural network for processing data that has a known grid-like topology, such as images. CNNs generally consist of three steps: a convolution step that creates a feature map of a region of an image, a pooling step that summarizes the output of the convolution step, and finally a fully-connected network that learns from the features extracted in the previous steps [75]. The key difference is that these algorithms do the feature extraction in the convolutional layers and do not need feature engineering like the algorithms presented in Section 3.4.1. This requires quite a bit of computing power compared to the traditional methods. Since deep learning made the shift to computing on Graphical Processing Units (GPUs), computations became feasible and these models proved to achieve very good performance on various machine learning problems. Two model types are described: two-stage networks (R-CNN, R-FCN) that extract image regions first and make separate predictions on each region, and dense networks (YOLO, SSD, NASNet, RetinaNet) that operate on the image as a whole.

Region-based Convolutional Network (R-CNN)

R-CNN uses a selective search method to find objects, an alternative to exhaustive search in an image. It initializes small regions in an image and merges them hierarchically. The detected regions are merged according to color spaces and other similarity metrics [76]. R-CNN combines this selective search with a CNN per region to find out which objects are in these regions [77].


Fast(er) Region-based Convolutional Network (Fast(er) R-CNN)

Fast R-CNN was developed to reduce the time consumption related to the high number of models necessary to analyze the region proposals from the selective search method in R-CNN. Instead of using a CNN for each region, a single CNN with multiple convolutional layers is used [78]. Faster R-CNN drops the region proposals detected with the selective search method (which is computationally expensive) and introduces the Region Proposal Network (RPN) to directly generate region proposals. This accelerates training and testing and improves performance [79]. Mask R-CNN is an extension of the Faster R-CNN model that adds a branch parallel to the bounding box detection to predict object masks, that is, the segmentation of an object by pixel in the image [80].

Region-based Fully Convolutional Network (R-FCN)

R-FCN takes a more efficient approach to region detection. Instead of applying a per-region subnetwork multiple times, R-FCN uses a fully convolutional network with computations shared across the entire image. This allows it to be compatible with multiple backbone networks, such as Residual Networks [81].

You Only Look Once (YOLO)

The previously discussed methods need to run the same computations on different parts of an image multiple times before generating a prediction, which makes them relatively slow. The YOLO model [82] was developed with the requirement to make predictions as fast as possible, trading off accuracy for speed to move towards real-time object detection. YOLO directly predicts bounding boxes and class probabilities with a single CNN in a single evaluation, instead of first detecting object regions and predicting classes afterwards. This has some benefits over the other methods. YOLO is very fast compared to other methods, capable of processing images in real time at up to 155 frames per second for some variants. It also learns contextual information, because it trains on entire images instead of regions, and it generalizes better to other image types. All these benefits come at the cost of accuracy: YOLO struggles to precisely localize some objects, especially small ones. The following versions of YOLO focus on delivering more accuracy; the algorithm is currently in its third version [83].

Single-Shot Detector (SSD)

The SSD [84] is similar to YOLO and predicts all the bounding boxes and class probabilities in one single evaluation (single shot) using one CNN. The model takes an image as input, which passes through multiple convolutional layers. Compared to YOLO, SSD achieves higher accuracies by adding convolutional layers and including separate filters for different aspect ratio detections.

Neural Architecture Search Net (NASNet)

NASNet takes a different approach and does not design the network architecture to perform the object detection beforehand, but instead trains a Recurrent Neural Network (RNN) to generate the model descriptions of the CNN that performs the object


detection. The RNN is trained using reinforcement learning. The NASNets built for object detection perform as well as most networks, but are slower to train [85].

RetinaNet

RetinaNet is the latest state-of-the-art object detector. It is a simple dense detector, similar to YOLO and SSD, but matches the accuracy of two-stage detectors like the R-CNN variants. RetinaNet proposes that the foreground-background class imbalance encountered when training dense detectors leads to lower accuracy compared to two-stage detectors. RetinaNet uses a new method called Focal Loss, which focuses training on a sparse set of examples to counter this class imbalance, resulting in very good performance and very fast detection [86].

3.4.3 Frameworks

While the previous sections focused on different algorithms, actually implementing these algorithms is not straightforward. That is why, over the past years, several deep learning frameworks have emerged that try to provide easier access to this technology. Some frameworks provide APIs for some of the object detection algorithms presented above. This section gives a small overview of some frameworks. Most frameworks differ quite a bit from each other, which makes porting a model from one framework to another rather difficult. The Open Neural Network Exchange Format (ONNX) initiative hopes to propose a standard for interchangeable models, which should make switching between frameworks easier in the future [87]. Note that there are other frameworks available, but those do not yet support object detection functions out of the box.

TensorFlow

Perhaps the most well-known framework, TensorFlow is an open source machine learning library for neural networks with a Python interface. It was developed by Google for internal use and released to the public in 2015 [88]. Recently, an Object Detection API has been built for TensorFlow, which implements models pre-trained on benchmark datasets, such as SSD, Faster R-CNN, R-FCN and Mask R-CNN [89]. TensorFlow offers a lot of flexibility in its use and can be applied to many machine learning problems.

Darknet

Darknet is an open source neural network framework written in C and CUDA. It is maintained by Joseph Redmon, the person behind the YOLO algorithm [90]. Darknet does not offer the flexibility that other frameworks offer, but is easy to install and use when compared to others. Out of the box, Darknet offers an interface for YOLO. The open source community offers some ports of this framework to other popular frameworks such as TensorFlow.
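As an illustration, running a pre-trained YOLO model on an image requires a single command once Darknet is compiled (paths as laid out in the Darknet repository; the weights file is downloaded separately):

./darknet detect cfg/yolov3.cfg yolov3.weights data/dog.jpg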


CNTK

The Microsoft Cognitive Toolkit (CNTK) is an open source toolkit for distributed deep learning. It offers a Python, C# or C++ interface. It is one of the first frameworks to support ONNX. CNTK offers an API for Fast R-CNN and Faster R-CNN [91].


3.5 Technology choice

This section presents the choices made for each technology described in the previous sections.

3.5.1 Thermal camera

The FLIR One Pro and Therm-App were selected as thermal cameras for the proof of concept. Both offer relatively high quality images, 160 x 120 pixels and 320 x 240 pixels respectively, relative to their price: 469 and 937.31 euro respectively. These prices are at the low end of the product ranges offered. Both cameras are designed for use on a smartphone, which makes them ideal for prototyping, since these devices are widely available and setting up the camera via the apps from the respective companies is easy. Both cameras provide MPEG-4/H.264 encoded video output, easily understood by most playback software. Both cameras can be found in the lower left of Figure 3.2b.

For deployment in production-ready applications with drones, these cameras are not the best choice. They aren't specifically designed to be used on a drone and don't offer the best image quality possible. In those applications, platforms like the FLIR Vue Duo, Zenmuse or Workswell Wiris are better candidates, due to their superior image quality, MAVLink interfaces, compatibility with commercially available gimbals to mount them on drones, and other features.

3.5.2 Microservices framework

Flask is selected as the microservices framework, for the following reasons. Flask is a mature web framework with major companies backing it; this means the APIs stay consistent and the framework is stable in use. Compared to some other frameworks like Spring Boot, Flask is unopinionated, which allows for maximum flexibility during development. Flask also has a very small memory footprint, which makes it easier to deploy on less powerful on-board devices like drones. Flask is also easy to use and quick to set up, ideal for developing a proof of concept. A final argument is the familiarity of the author with Flask.

3.5.3 Deployment framework

Docker is selected as the deployment framework. Docker is the most mature and well supported container framework at the time of writing and will likely remain important in the future. It offers the most features and is specifically designed for the microservices pattern [68].

3.5.4 Object detection

One of the requirements specified in Section 2.1 is real-time streaming. Real-time object detection is only achieved by a few of the models presented in Section 3.4: candidates are YOLO, SSD and RetinaNet. As no framework provides an out-of-the-box implementation of the RetinaNet algorithm at the time of writing, this algorithm is not selected. SSD is implemented in the TensorFlow Object Detection API. However, at the time of writing, this API has not been found stable: when trying out the API, fallbacks to older versions of the software were needed to be able to test the models. This was due to the Object Detection API


using older versions of the TensorFlow framework. Therefore, YOLO, implemented in the Darknet framework, is selected. Darknet offers a stable distribution, and YOLO achieves good results and has proven to be a very fast detector, capable of real-time object detection.


Chapter 4

Proof of Concept implementation

To prove the concept of the architecture discussed in the previous chapters, a prototype is implemented. First, the goals and the scope of the prototype are presented in Section 4.1. Next, the components of the prototype are presented in Section 4.2. Finally, the known limitations and issues of the prototype are presented in Section 4.3.

4.1 Goals and scope of prototype

The goals of the prototype are to prove the QARs defined in Section 2.1. The prototype focusses on the ASRs: performance, interoperability and modifiability. The usability, security and availability requirements are left out of the scope of the prototype, because they are not ASRs and require significant resources (focus groups, longtime deployment, etc.) to test.

The components implemented in the prototype are the Client Interface, Stream, Consumer and Producer, because they represent the core functionality of the framework: building image processing application streams using plugins. The Producer and Consumer Distribution components enable third-party plugin developers to add their functionality to the framework. These are distribution functionalities, which are out of scope for the prototype. The prototype only supports one video stream. All functions presented in Figure 2.1 are implemented, with the exception of 'Install plugin', 'Uninstall plugin', 'Add plugin', 'View plugin', 'Remove plugin' and 'Update plugin', as they are only supported via the Producer and Consumer Distribution components. The prototype is deployed on a local device; distributed deployment configurations require small changes in the implementation (see Section 4.3).

4.2 Overview of prototype

4.2.1 General overview

The prototype consists of four main components: a cli, streamer, producer and consumer. The cli process is the Client Interface, implemented as a textual Command Line user Interface (CLI), which allows a user to interact with the prototype through textual commands in a shell. This process is deployed on the local machine. The streamer, producer


and consumer processes are deployed as microservices in their own Docker containers. The prototype is initialized through the cli, which spins up the Docker containers of the other processes. This is achieved with the tool docker-compose: Compose is a tool for defining and running multi-container Docker applications. The compose YAML file defines the configurations for the microservices; Compose uses these configurations to start and stop the application with a single command [92]. A snippet of the compose file for the application is given in Listing 4. Containers are specified as services. The example service configuration given is that of the producer. First, the name of the container is specified, which overwrites the default name, as the container name is used as hostname for the container in Docker [93]. The build configuration specifies where the container build recipe is situated. The port mapping allows processes on the localhost to access processes in the container; for the producer service this is only used for debugging. The volumes configuration specifies folders from the host to be mounted into the container. This configuration mounts in the source code and resources; it also provides access to the Docker socket to allow interaction with the Docker host (see Section 4.2.4).

services:
  producer:
    container_name: producer
    build:
      context: ./producer
      dockerfile: Dockerfile
    ports:
      - 8080:80
    volumes:
      - ./producer:/usr/producer
      - /var/run/docker.sock:/var/run/docker.sock

Listing 4: docker-compose.yml snippet of the prototype

All containers are connected to a Docker bridge network [93] for communication. A bridge network uses a software bridge to allow connected containers to communicate, while providing isolation from containers which are not connected to that bridge network. The bridge network applies to containers running on the same Docker host; the network is thus confined to the local Docker host and is not distributed over different devices. The bridge network has some advantages:

• The bridge provides better isolation and interoperability between containers. Containers automatically expose all ports to each other and none to the outside world.

• The bridge provides automatic Domain Name System (DNS) resolution between containers. This means that containers resolve the IP address of each other by container name or alias.

• Containers can be attached to and detached from the networks on the fly.

• Environment variables are shared, which can be used to provide equal environment configurations for every container on the bridge.


4.2.2 Client interface

The Client Interface is implemented by the cli component. The cli is built in Python with the Click package by Armin Ronacher [94]. Click is a CLI creation kit which aims to make the implementation of CLIs easier. It resembles the Flask framework, as it also leverages Python decorators [56] for most of its functionality. The source code of the cli is located in the mosquito.py file. Commands can be executed by calling python mosquito.py, or by calling mosquito if the source code is installed into the Python environment. The following commands are implemented:

• mosquito: Displays a help page listing command groups.

• mosquito on: Starts the application.

• mosquito off: Shuts down the application.

• mosquito plugins: Groups all commands to manage plugins. Plugins can only be listed, not installed or uninstalled, as the Remote Producer and Remote Consumer are not implemented.

• mosquito plugins ls: Lists all locally installed plugins.

• mosquito stream: Groups all commands to manipulate the current stream.

• mosquito stream add: Adds a producer or consumer to the stream.

• mosquito stream delete: Deletes a producer or consumer from the stream.

• mosquito stream elements: Lists all producers and consumers that were added to the stream.

• mosquito stream link: Links two stream plugins.

• mosquito stream pause: Pauses the stream.

• mosquito stream play: Plays the stream, meaning the stream is processing media.

• mosquito stream print: Prints the stream layout (which plugins are linked).

• mosquito stream stop: Stops the stream.

• mosquito stream view: Views the stream on the local device.

A typical use of the application is the following (a session sketch is given below). First, the application is started using mosquito on. Then, plugins are added to the stream using mosquito stream add [ELEMENT_TYPE] [ELEMENT], which instantiates the corresponding plugins in the Producer and Consumer components. The plugins are linked in order using mosquito stream link [ELEMENT_1] [ELEMENT_2]. The stream is then set to play using mosquito stream play. When the last plugin is linked to the special local plugin, the user can view the output of that plugin using mosquito stream view, which opens up a window in which the stream is displayed.
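Such a session could look as follows (a sketch using the filecam and local plugins described in Section 4.2.5):

$ mosquito on
$ mosquito stream add producer filecam
$ mosquito stream add consumer local
$ mosquito stream link filecam local
$ mosquito stream play
$ mosquito stream view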


As specified in the software architecture (see Section 2.3), the Client Interface can use the Stream Commands interface of the Stream component. As specified in Section 2.3.1, this interface is a REST API, so the client can use this interface through the HTTP protocol. This is done with the Python Requests library [95].

4.2.3 Stream

The Stream component is responsible for the logical representation of the stream (see Section 2.3.1), implemented as the streamer component. The component consists of three objects: api, which contains the REST API; the StreamManager; and the Stream object, representing the Stream Model in the framework. Requests to the other microservices are sent using the Python Requests library. The prototype implementation only supports one stream with a chain-like model. This means that, unlike the stream depicted in Figure 2.6, a plugin can't have multiple sources or multiple listeners. The Stream object manages the logical representation of the stream and manipulates the references to the plugins by forwarding commands to the producer and consumer components respectively. It contains two data structures: outline, which is the logical structure of the stream, and elements, which contains all the plugins present in the stream. In the prototype, the Stream component provides the following functionalities on its API endpoints (an example request sequence is sketched after the list):

• /plugins (GET): Fetches all the plugins from the producer and consumer components and returns their information.

• /elements (GET, POST, DELETE): Resource to add and delete plugins from the elements bin.

• /stream/links (POST): Resource to create links between elements.

• /stream/state (GET, PUT): Resource to update the state.

• /shutdown (POST): Shuts down the framework.
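For illustration, a client could drive these endpoints with the Requests library as follows (a sketch; the address and payload fields are assumptions, not the exact schema of the prototype):

import requests

BASE = 'http://localhost:8080'  # assumed address of the streamer component

# fetch the installed plugins
plugins = requests.get(BASE + '/plugins').json()

# add a producer plugin to the stream
requests.post(BASE + '/elements', json={'type': 'producer', 'name': 'filecam'})

# set the stream state to playing
requests.put(BASE + '/stream/state', json={'state': 'PLAY'})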

Since the streamer component is the only component of the framework that interacts with outside users, it has the responsibility to gracefully shut down the framework. This is needed to solve the problem of dangling plugin containers, which run plugins that have not been stopped and removed after closing the application. Since only plugins that are contained in a stream have a running container associated with them, the stream can notify the Producer and Consumer components to stop and remove those containers.

4.2.4 Producer and Consumer

The Producer and Consumer components cover similar responsibilities in managing installed plugins. They are implemented in the producer and consumer components. Both components consist of the following objects: api, which contains the REST API; the Kernel, which implements the core functionalities; the PluginManager, which finds plugins installed on the device and checks if their installation is valid; and the Plugin, which is the logical representation of a plugin as described in Section 2.3.1. Commands to control the plugins are made using the Python Requests library.


For the component to be able to start, stop and interact with the plugin containers, it needs access to the Docker host and the Docker client running on that host. But because the component is running in its own container, it is isolated from the Docker host and can't interact with the Docker client by default. The workaround for this problem is to expose the socket on which the Docker client is running on the Docker host to the container. This is done by mounting the Docker socket of the host on the Docker socket in the container. In Docker Compose, the mounting is achieved as shown in Listing 5.

volumes:
  - /var/run/docker.sock:/var/run/docker.sock

Listing 5: Mounting the Docker socket on the container

This has some implications for security (see Section 4.3). To interact with the now exposed Docker client, the component uses the docker-py library [96]. Listing 6 shows how a connection is made to the Docker client and a plugin container is started. The container is started from the plugin image, on the network of the framework, and is given the plugin name as the container name. Docker thus creates a DNS entry with the plugin name, which makes the container addressable by its name. This implementation limits the framework to one running container per plugin at any time.

import docker

client = docker.from_env()
container = client.containers.run(
    image=plugin_name,
    detach=True,
    name=plugin_name,
    network='mosquito_default'
)

Listing 6: Starting a plugin container

When both components are initialized, the Kernel and PluginManager are created. The PluginManager searches for a plugin_directory, which contains information on which plugins are installed on the device. Each installed plugin should have a valid image on the device; these are contained in the images directory of the Docker daemon. If the image or information file cannot be found on the device, the plugin is marked as broken and can't be used by the framework. To describe the API, the consumer API is used; the producer API is analogous, but replaces consumer with producer and doesn't have the sources endpoints. The Producer and Consumer components provide the following functionalities on the API endpoints:

• /consumers (GET): Retrieves a list of the installed consumers on the device on which the component is running.

• /consumers/<hostname> (GET, DELETE): Retrieves the information of a consumer specified by the hostname value, which is the name of the consumer.

• /consumers/<hostname>/state (GET, PUT): Retrieves or updates, respectively, the state of a consumer specified by the hostname value.

• /consumers/<hostname>/sources (GET, POST): Retrieves the sources or adds a new source, respectively, to the consumer specified by the hostname value.

• /consumers/<hostname>/sources/<source_hostname> (GET, PUT, DELETE): Retrieves, updates or removes, respectively, the source specified by source_hostname of a consumer specified by hostname.

• /consumers/<hostname>/listeners: All listeners resources are analogous to the sources resources.

4.2.5 Implemented plugins

Three plugins are implemented and tested: filecam (called 'Mycam' in the code), a producer that reads in a video file and transmits it in MJPEG encoding using the RTP protocol; testsrc, a producer which generates test video and transmits it in MJPEG encoding using the RTP protocol; and local, a consumer which captures incoming RTP MJPEG video frames and displays them on the local display. The filecam and local plugins are discussed, since the testsrc is similar to the filecam.

The plugins are implemented in Python and use the GStreamer library with the Python bindings [97] for media streaming and the Flask framework to implement the API. These libraries don't have to be used by future plugins, which can just implement a REST API and provide a media stream as specified in their descriptions.

Filecam plugin

The filecam image is based on the Ubuntu 17.10 image. It is chosen over lighter Linux distributions because it offers more functionality out of the box for prototyping. Other dependencies are Python 3.6, GStreamer 1.12, the GStreamer plugins-base, plugins-good, plugins-bad, plugins-ugly, libav, doc and tools packages, and python-gst.

The API of the plugin offers the following functionalities:

• /state (GET, PUT): Retrieves and updates, respectively, the state of the plugin.

• /listeners (GET, POST): Retrieves and adds, respectively, a listener on the plugin.

• /listeners/<hostname> (GET, PUT, DELETE): Retrieves, updates and deletes, respectively, a listener on the plugin.

The implemented GStreamer pipeline is depicted in Figure 4.1. The pipeline consists of the following GStreamer elements (an equivalent command-line pipeline is sketched after the list):

1. filesrc: GStreamer element that reads data from a file in the local file system. This file can have any extension and is not limited to video or audio files [98]. The location property is set to the location of the file in the plugin container.

2. decodebin: GStreamer bin that automatically constructs a decoding pipeline using available decoders and demuxers via auto-plugging [99]. Note that for some media containers and codecs the appropriate decoders must be installed. For example, to decode the MPEG streams contained in MP4 files, an H.264 decoder is needed, which can be found in the 'libav' GStreamer plugins library.

3. jpegenc: GStreamer element that encodes raw video into JPEG images [100]. This implements the MJPEG video stream, as all video frames are encoded as JPEG images.

4. rtpjpegpay: GStreamer element that payload-encodes JPEG images into RTP packets according to RFC 2435 [101].

5. udpsink: GStreamer element that sends UDP packets to the network. When combined with an RTP payload plugin, it implements RTP streaming [102]. The host and port properties are set to the hostname and port property of the listener of the plugin.

Figure 4.1: filecam GStreamer pipeline
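For reference, an equivalent pipeline can be prototyped on the command line with the gst-launch-1.0 tool (the file location, host and port are placeholders; a videoconvert element is added here so that the output of decodebin is guaranteed to be linkable to jpegenc):

gst-launch-1.0 filesrc location=video.mp4 ! decodebin ! videoconvert ! jpegenc ! rtpjpegpay ! udpsink host=127.0.0.1 port=5000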

This pipeline is implemented using the Python GStreamer bindings. The process consists of creating each GStreamer element, adding them to the GStreamer pipeline, and linking the elements in order of appearance in the pipeline. The decodebin and jpegenc elements can't be linked when created, because there is no default sink pad available on the decodebin. Because the decodebin needs to decide how to decode media, it needs the pipeline to be processing media. If no media is flowing, the decodebin can't know what decoder it needs to decode the media and what media it can offer to the sink element. Therefore, the process of dynamic linking is used [103]. All elements which can be linked when the pipeline is not in the PLAYING state are linked. A handler is registered on the 'pad-added' signal, emitted when a new pad is added on the decodebin, indicating that it can forward media downstream. When media is flowing through the pipeline, the decodebin creates new pads when it can generate output data and emits the 'pad-added' signal. A callback is performed on the handler, which links the decodebin with the jpegenc. Listing 7 illustrates this concept.

# callback handler
def on_pad(source, pad, sink):
    # get the sink pad from the sink element
    sink_pad = sink.get_static_pad('sink')
    # get the pad type
    pad_caps = pad.get_current_caps()
    pad_type = pad_caps.get_structure(0).get_name()
    # only if the pad is raw video the link is made
    if pad_type == 'video/x-raw':
        # perform the dynamic link
        pad.link(sink_pad)
    # other pad types are ignored

filesrc = Gst.ElementFactory.make('filesrc')
decodebin = Gst.ElementFactory.make('decodebin')
jpegenc = Gst.ElementFactory.make('jpegenc')
# ... (create other elements and add elements to pipeline)

# only filesrc and decodebin can be linked statically
filesrc.link(decodebin)
# register the on_pad handler on the 'pad-added' signal
handler_id = decodebin.connect('pad-added', on_pad, jpegenc)
# set the pipeline to PLAYING; the callback will be called to perform the dynamic link
pipeline.set_state(Gst.State.PLAYING)

Listing 7: Dynamic linking of the decodebin and jpegenc

Local plugin

The local plugin captures an incoming media stream and displays it on the local display. This plugin is special with respect to other plugins, in that it is not deployed in a Docker container. It runs natively via the cli on the host, to allow access to the local display. This version is built for macOS High Sierra (version 10.13.4) and uses GStreamer 1.12 and the GStreamer plugins-base, plugins-good, plugins-bad, plugins-ugly, libav, doc and tools packages to receive an incoming stream. When a plugin links to the local plugin, the Stream component does not instruct the Consumer component to start the plugin, but instead links the plugin to the local host. For macOS, the address of the host is host.docker.internal. The GStreamer pipeline used by the plugin is depicted in Figure 4.2.

Figure 4.2: local plugin GStreamer pipeline

The pipeline consists of the following elements (a command-line equivalent is sketched after the list):

1. udpsrc: GStreamer element that reads UDP packets from the network [104]. The port property is set to the port to which the source is transmitting media.

2. rtpjpegdepay: GStreamer element that retrieves JPEG images from the received RTP packets [105]. This element can't process the media received from the udpsrc directly, because it can't know what type of data it will be receiving. Between the pads, a 'capabilities filter' is placed, which informs the elements of the type of data that will be flowing through. In this case the capabilities are application/x-rtp, which tells that there will be RTP packets coming through; encoding-name=JPEG, which tells that the payload of the RTP packets consists of JPEG images; and payload=26, which also tells that the encoding is JPEG according to RFC 3551 [50, 106].

3. jpegdec: GStreamer element that decodes JPEG images [107].

4. autovideosink: GStreamer element that automatically detects an appropriate video sink and forwards the video to it [108].
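The receiving side can likewise be prototyped with gst-launch-1.0 (the port is a placeholder; the caps string is the capabilities filter described above):

gst-launch-1.0 udpsrc port=5000 caps="application/x-rtp, encoding-name=JPEG, payload=26" ! rtpjpegdepay ! jpegdec ! autovideosink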

4.3 Limitations and issues

The implementation presented is a prototype and a slimmed-down version of the architecture presented in Section 2.3. The following limitations and issues remain.

4.3.1 Single client

The current implementation deploys the Flask framework (on which each microservice relies) on the built-in Flask web server (Werkzeug), which is provided for development convenience. It is only built for use by a single user and by default can only handle one request at any given moment, which implies that the framework can also only be used by a single user [109].
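A production deployment would replace Werkzeug with a WSGI server that can handle concurrent requests. For example, assuming the Flask application object is named app in a module app.py, gunicorn could serve it with several worker processes:

gunicorn --workers 4 --bind 0.0.0.0:8080 app:app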

4.3.2 Timeouts

The framework does not perform checks on request timeouts when passing commands to components and plugins. This can be a problem when the framework is deployed on several devices and the request latency is much higher. In case of timeouts, the framework will keep waiting for a response, which leads to a crash.
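A straightforward mitigation would be to pass a timeout to every request and handle the resulting exception, as in the following sketch (function and variable names are hypothetical):

import requests

def send_command(url, payload):
    try:
        # fail fast if no response arrives within 2 seconds
        return requests.put(url, json=payload, timeout=2)
    except requests.exceptions.Timeout:
        # the component is unreachable: report the error instead of blocking forever
        return None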

4.3.3 Exception handling and testing

The framework is only tested for the so-called 'happy path': the default scenario featuring no exceptional or error conditions. Some alternate paths are handled, but most still need to be tested. An example scenario would be one of the plugin containers in a stream failing and stopping: the framework is not able to detect this and will assume that the container is still running.

4.3.4 Docker security issues

The Docker client communicates with the daemon process dockerd using a socket. This socket is a UNIX domain socket called /var/run/docker.sock. The daemon is highly privileged, having root access to the host system, so any process that can write to this socket effectively has root access. To allow the components of the framework to manipulate the plugin containers, they need access to this socket. Therefore, the socket is mounted in the container, which gives the container write access to the socket. This implies that the container now has root access on the host when writing to this socket. Because the container gets root access to the host, an attacker can walk the file tree of the host and extract sensitive information or run unwanted software. This type of attack is known as a 'Docker Breakout' or 'Container Escape' attack [110, 111].

4.3.5 Docker bridge network

The current implementation deploys the framework on a Docker bridge network, which can only be used if the framework runs on a single device; the current implementation can thus only be deployed on a single device. To deploy the framework on multiple devices, the framework must use a Docker overlay network [112].

4.3.6 Single stream

The implementation supports one stream, which must be a chain. Multiple streams in tree form, with merging media from multiple sources and broadcasting to multiple listeners, are not supported.

4.3.7 Number of containers per plugin

The framework uses the name of the plugin as identifier for the containers. The name is also the hostname on which the container can be reached. Therefore, there can only be one active container associated with a plugin at runtime.


Chapter 5

Mob detection experiment

To try out an actual drone thermal imaging application, the mob detection experiment is carried out. The goal of this experiment is to use existing object detection algorithms on a dataset of thermal images to try and detect large crowds of people, hereinafter referred to as a mob.

Several public datasets of thermal images exist. Most datasets focus on the detection of people in scenes [113-117], some on face recognition [118, 119], others on vehicle recognition [120]. Most of these datasets are freely available through the OTCBVS Benchmark Dataset Collection [121]. No datasets containing large amounts of people were found, so the Last Post thermal dataset was created for the detection of mobs and other analysis tasks. This dataset is presented in Section 5.1.

To detect mobs in the images of the dataset, a deep learning approach using neural networks is explored. The selection and training of the model is described in Section 5.2.

5.1 Last Post thermal dataset

The Last Post dataset consists of videos of the Last Post ceremony, taking place each night at 8:00 PM (Brussels timezone) under the Menin Gate in Ypres, Belgium. Section 5.1.1 gives some insight into this unique ceremony. The full dataset is described in Section 5.1.2.

5.1.1 Last Post ceremony

The Last Post ceremony is a nightly ceremony taking place under the Menin Gate in Ypres at 8:00 PM sharp. The ceremony is held in remembrance of the fallen soldiers of World War I (1914-1918). The Last Post Association [122] states its mission as follows:

"True to its statutes, the Last Post Association wishes to honor and remember the soldiers of the British Empire who gave their lives during the Great War of 1914-1918. The Last Post ceremony seeks to express, day after day, the lasting debt of gratitude which we all owe to the men who fought and fell for the restoration of peace and the independence of Belgium."


Figure 5.1 gives an impression of the size of the ceremony. Because of the sheer number of people that gather under the gate each day, the Last Post is a unique open-air event that allowed for repeatable conditions to capture footage; the event was therefore a perfect opportunity to create the dataset.

Figure 5.1: Last Post ceremony panorama

5.1.2 Dataset description

Due to legislation in Belgium, drones cannot be flown in public areas without certification and a permit from the authorities. The creation of real aerial thermal images with a drone was thus not feasible. Therefore, an elevated position on the walls next to the Menin Gate (in order to simulate aerial images) was used to capture footage of the adjacent square on one side and the bridge on the other side. Figure 5.2 shows the locations where the video footage was captured.

Figure 5.2: Locations where the video footage was captured. The black stars represent the captured scenes; the red stars represent the locations from where the scenes were filmed.

The data was recorded with the FLIR One Generation 3 Pro camera for Android devices, hereafter referred to as "Camera" [123]. Since thermal images don't hold color information, a color scheme is used to represent the relative differences in temperature: the 'Iron' color scheme, which maps colder sections of a scene on blue colors and warmer sections on red and yellow colors.


The videos are encoded using the H.264 MPEG-4 codec. Decoded, the color information is captured in 4:2:0 YUV format. The frame rate of the videos varies from 7 Hz to 8 Hz, depending on the speed of the objects in the scene. Sound is present in the videos, encoded with the MPEG AAC codec. For a full list of sequences, the reader is referred to Appendix C.

The two locations that make up the main scenes in the dataset are presented in Figure 5.3. The thermal images and visual images of each scene are depicted next to each other. The thermal and visual images were not captured at the same time, so the mobs present in the thermal images can't be seen in the visual images. In both scenes, buildings are present that are quite warm compared to their surroundings, as can be seen in the thermal images. In Figure 5.3a it even becomes difficult to recognize the mob when people are standing close to the building. This is less the case for Figure 5.3c where, due to the water present in the image, the mob has higher contrast thanks to the larger difference in emitted heat. Towards the far right of the image, the mob seemingly disappears into the background. The effect of two objects having a similar heat signature and no clear transition between them in thermal images is defined as thermal camouflage, a technique often used by animals and military units [124]. This effect is even visible when looking at the mobs present in both images: because people are standing so close together, it becomes difficult to recognize individual persons in the crowd.

Figure 5.3: Main scenes in the Last Post dataset. (a) Thermal view of the square in location A; (b) visual view of the square in location A; (c) thermal view of the bridge in location B; (d) visual view of the bridge in location B.


5.2 Object detection experiment

5.2.1 Preprocessing

The Last Post dataset was not used in its entirety for training the model, because there were not enough resources to manually annotate every image. Therefore, a smaller subset was used to serve as a baseline model.

The following videos were used: 2018-04-10 195029.mp4, 2018-04-10 200122.mp4, 2018-04-04 202859.mp4, 2018-04-10 202558.mp4 and 2018-04-04 200052.mp4, captured on the fourth and tenth of April 2018. These videos were used because of their contents: they contain images from locations A and B respectively, in which the mob behaves more dynamically compared to other videos. This was due to a marching band being present on the fourth of April and a marching army unit on the tenth of April. See Appendix C for a summary of the contents of these videos. From these videos, images were extracted at a capture rate of 1 Hz. Each image was manually labelled using the Microsoft Visual Object Tagging Tool [125]. The tool allows exporting the training images to various formats, such as Pascal VOC for TensorFlow, YOLO and Microsoft CNTK.
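Frame extraction at a fixed rate can be performed with a tool such as ffmpeg; for example (file names are placeholders):

ffmpeg -i input.mp4 -vf fps=1 frames/img%04d.jpg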

Within the data, several visual outliers are present. An outlier is an observation point that is distant from other observations; it is created due to variability in capturing the videos or indicates experimental errors [126]. The errors detected here are of the latter form and are depicted in Figure 5.4. The first type of outlier is a system fault in the Camera: due to an error in the processing of the video, the Camera would sometimes not register any input. This causes the Camera to produce completely black images, as depicted in Figure 5.4a. The Camera software maps temperatures onto colors in the image. The variations of the colors are relative to the temperature interval, ranging from the minimum to the maximum temperature detected by the Camera. If the minimum and/or maximum detected temperature changes, the Camera needs to adapt its color mapping. This causes the Camera to fade to bright colors for a short period of time (1 to 2 seconds); the resulting image is depicted in Figure 5.4b. Because the resulting image is too bright and objects are hard to detect, it is considered an outlier. Due to instabilities when capturing footage of sequences with fast motion, some images are very blurry. This makes it hard, even for a person, to decide what is visible in the frame; therefore it is considered an outlier. This is depicted in Figure 5.4c. Sometimes people would pass in front of the Camera, which resulted in brightly colored areas in the videos that were not part of the scene and therefore form another type of outlier, depicted in Figure 5.4d. Because the presented outliers are experimental errors and do not belong in the scenes, they were removed from the dataset.

Figure 5.4: Outliers. (a) System fault in the Camera: no input was detected; (b) the Camera updates to a new temperature interval; (c) due to moving the Camera too fast, the image becomes too blurry; (d) very warm object due to people passing in front of the Camera.

5.2.2 Training

The model used for training is YOLOv3, implemented in the Darknet neural network framework [83]. The model is trained using convolutional weights that are pre-trained on the ImageNet database [127]; the concept of reusing weights from a model previously trained on a large dataset is known as transfer learning. When choosing a pre-trained model, it is very important that the problem statement of the pre-trained model lies close enough to the current problem statement. For the model pre-trained on ImageNet this was to identify objects in images, which lies close to the detection of mobs in thermal images. Because the type of images (thermal versus visual) is fundamentally different, the model could suffer in performance. Goedemé et al. [36] solved a similar problem with thermal images and achieved good results, which gives an indication that detection should be feasible with the pre-trained model. Also, because the dataset is relatively small, training the model from scratch could actually hurt performance [128]. Training was carried out on an NVIDIA GeForce GTX 980 GPU, which allows training to be done much faster (a representative training invocation is sketched below). To evaluate training progress, the Sum of Squared Errors (SSE) loss function is calculated, defined as $\sum_{i=1}^{n}(x_{ij} - x_j)^2$, where $n$ is the number of samples in the batch used in a single training epoch and $j$ is the dimension ($x$ or $y$), as defined in [83]. The result of this training is discussed in Chapter 6.
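For reference, a representative Darknet training invocation could look as follows (a sketch; the data and model configuration files are hypothetical, while darknet53.conv.74 contains the ImageNet pre-trained convolutional weights):

./darknet detector train cfg/mob.data cfg/yolov3-mob.cfg darknet53.conv.74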


Chapter 6

Results and evaluation

The goal of this chapter is to present the results of the framework and the detection experiment. The results of the framework tests are presented in Section 6.1. The results of the object detection experiment are presented in Section 6.2.

6.1 Framework results

To evaluate the framework, acceptance tests are conducted that test whether the framework meets the QARs defined in Section 2.1. As stated in Section 4.1, only the ASRs are tested. A summary of which requirements are met by the framework is given in Table 6.1. 'Passed' means that the framework has met the requirement, 'not passed' that the framework hasn't met the requirement, and 'plausible' means that the framework might have met the requirement, but not enough data could be gathered to be certain.

6.1.1 Performance evaluation

To evaluate performance, the acceptance tests for the requirements are conducted, the impact of the framework on the processing resources is recorded, and the total size of the framework is measured.

Acceptance tests

To test the performance of the framework, the execution times of each command executed with the CLI (see Section 4.2.2) are measured. Each command is executed 200 times, except for the on, off and link commands: they are measured manually 10 times. Because these commands launch system threads and their finish signal could not be captured, they had to be measured by hand. Commands were executed on a 2.6 GHz Intel Core i5-2540 processor running macOS High Sierra version 10.13.4. The summarized statistics of the tests are given in Table 6.2.

The average execution times for the Play, Stop, Pause, Add, Elements, Print, View and Link commands do not exceed the 2-second bound specified in PS-1, while the average execution times of the Delete, On and Off commands do exceed this bound; this performance requirement is not met by the framework. The same result is found for PS-3. Especially the Delete and Off commands exceed the requirements by quite a bit. The Delete command shuts down a plugin and removes the Docker container


Requirement ID | Status
PS-1 | Not passed
PS-2 | Plausible
PS-3 | Not passed
PS-4 | Plausible
PS-5 | Not passed
IS-1 | Passed
IS-2 | Passed
MS-1 | Passed
MS-2 | Passed
MS-3 | Passed
MS-4 | Passed
MS-5 | Plausible
MS-6 | Passed
MS-7 | Plausible

Table 6.1: Acceptance test results summary

from the host. This action is costly in time. The Off command removes all the plugins and all the microservices of the framework, and thus suffers from the same costly action. This could be ameliorated by having the framework stop the containers instead of removing them, which requires fewer resources: stopping only ends the process running in the container but does not delete the container from the system.

PS-2 and PS-4 could not be measured due to the GStreamer pipeline of the prototype not allowing frames to be tracked. However, since real-time is a human time perception, real-time streaming is plausible if a person can't distinguish the streamed videos from videos played with a native video player [43, 44]. The videos were shown side by side to ten users, who could not distinguish between both videos, indicating presumably real-time streaming. Since the hard requirements cannot be measured, the requirements are not met, but they are plausible. Real-time streaming performance also heavily depends on the plugins used and the hardware on which they are deployed. If a plugin can't process its media fast enough, due to a lack of processing power or a slow implementation, it will slow down the whole stream.

The scalability requirement PS-5 could not be met due to the Flask Werkzeug server only being able to process one request at a time (see Section 4.3).

Only two performance requirements are plausibly met by the prototype. However, this is mostly due to some actions being very slow, such as shutting down the framework or removing a plugin. As these are actions that should occur less frequently when a user is using the framework, they are less important for the perceived quality. Frequent actions, such as adding, linking and changing the state of the stream, do perform rather well and contribute more to the perceived quality. Overall, the performance of the framework is not stellar, but not bad either. This can partially be explained by the choice of supporting frameworks,


Statistic | Play | Stop | Pause | Add | Delete | Elements | Print | View | On | Off | Link

Mean | 0.690 | 0.804 | 0.634 | 1.363 | 8.402 | 0.562 | 0.564 | 1.22 | 3.58 | 24.023 | 0.849
Std deviation | 0.050 | 0.059 | 0.088 | 1.037 | 4.669 | 0.070 | 0.0747 | 0.260 | 0.498 | 0.481 | 0.170
Minimum | 0.629 | 0.708 | 0.549 | 0.516 | 0.505 | 0.517 | 0.517 | 0.757 | 3.015 | 23.707 | 0.637
25% percentile | 0.665 | 0.775 | 0.594 | 1.049 | 1.154 | 0.534 | 0.536 | 0.998 | 3.143 | 23.750 | 0.798
Median | 0.678 | 0.800 | 0.623 | 1.11 | 11.132 | 0.550 | 0.552 | 1.214 | 3.500 | 23.886 | 0.853
75% percentile | 0.700 | 0.820 | 0.653 | 1.233 | 11.189 | 0.562 | 0.560 | 1.433 | 3.850 | 24.034 | 0.877
Maximum | 1.016 | 1.279 | 1.631 | 6.25 | 11.846 | 1.227 | 1.149 | 1.691 | 4.562 | 25.326 | 1.261

Table 6.2: Performance test statistics summary, measured in seconds

such as Flask, that are not built for performance. Other, more performance-oriented frameworks, such as Vert.x, could improve performance.

Resource usage

The resources used by the modules of the framework are measured using the Docker statistics tool [129]. A summary of the resources used is given in Table 6.3. When the framework is idle, resource usage is negligible. When a plugin is active, there is a slight increase in resource usage; this increase depends on the runtime size of the plugin, which is unknown to the framework. The increase peaks when the plugin is processing media: CPU usage is 40% on one core, which implies that only two plugins can be active simultaneously on one CPU core before reaching the ceiling of the processing power. In a production environment of the framework, plugins need to be tested thoroughly so that these metrics are known beforehand. These metrics imply that the length of streams should be kept short, to avoid having many plugins active simultaneously.
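Such a measurement can also be automated with the Docker SDK for Python [96]; the following is a minimal sketch. The statistics field names follow the Docker Engine API, and the exact keys may differ per platform and Docker version.

import docker

client = docker.from_env()

# One snapshot per running container, comparable to `docker stats` [129].
for container in client.containers.list():
    s = container.stats(stream=False)  # single sample instead of a live stream
    cpu_delta = (s["cpu_stats"]["cpu_usage"]["total_usage"]
                 - s["precpu_stats"]["cpu_usage"]["total_usage"])
    sys_delta = (s["cpu_stats"]["system_cpu_usage"]
                 - s["precpu_stats"]["system_cpu_usage"])
    cpu_pct = 100.0 * cpu_delta / sys_delta * s["cpu_stats"]["online_cpus"]
    mem_mib = s["memory_stats"]["usage"] / (1024 * 1024)
    print(f"{container.name}: {cpu_pct:.2f}% CPU, {mem_mib:.2f} MiB")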

Size of framework

The total size of all the Docker images of the components of the framework is given in Table 6.4. Most images are quite large: the framework core components have an average size of 724 MB, and the plugins have sizes ranging from 1 GB to 3 GB. This size can be explained by the base images and the additionally installed software in the images. For development flexibility, the base images used are Linux Ubuntu images, which are typically larger than other Linux distributions. For the plugins, the full GStreamer library with all plugins was installed, which is more than 2 GB in size. The sizes of the components can be reduced in a production environment by choosing slimmer Linux distributions as base images and only installing the minimal set of libraries needed to get a working plugin.

6.1.2 Interoperability evaluation

The systems with which the framework exchanges data are the plugins. These plugins must follow the plugin model presented in Section 2.3.1: implement the presented resources using a REST API, the state machine, and the protocols. If these specifications


Condition | Container | CPU usage [%] | Memory usage [MiB]

Idle | streamer | 1.00 | 42.09
 | consumer | 0.03 | 24.40
 | producer | 0.01 | 24.14
1 plugin active, not processing media | streamer | 1.56 | 42.48
 | consumer | 0.02 | 24.42
 | producer | 0.02 | 24.23
 | mycam plugin | 0.75 | 45.97
1 plugin active, processing media | streamer | 1.56 | 42.51
 | consumer | 0.02 | 24.42
 | producer | 0.02 | 24.24
 | mycam plugin | 40.03 | 99.24

Table 6.3: Resource usage of the framework in several conditions

Image | Size [MB]

streamer | 718
consumer | 729
producer | 729
testsrc | 1250
mycam | 3020

Table 6.4: Total size of framework components

are followed by a plugin, the framework should have no issues exchanging information with the plugin. To test this, a new mock plugin is implemented. For each resource of the plugin, the framework is given random mock input data to exchange with the plugin. When the exchange is complete, the values in the plugin are requested and compared with the given input; if the input matches the value in the plugin, the exchange was successful. These tests were executed 50,000 times. The results are summarized in Table 6.5. Play, pause and stop are the requests that change the state of the plugin. The source/listener add, update and delete commands manipulate the sources and listeners of the plugin. Overall, there were almost no errors made when exchanging information; only when updating a source and deleting a listener was there one incorrect exchange. The ratios achieved are always 100% correct exchanges, except for updating a source and deleting a listener, which are at 99.998%. IS-1 and IS-2 specify that commands exchanged with the plugins need to be correct 99.99% of the uptime, so this requirement is clearly met.
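A single iteration of such an exchange can be sketched as follows. The endpoint paths and the source fields are illustrative assumptions; the prototype's actual REST resources may differ.

import uuid
import requests

PLUGIN = "http://localhost:5000"  # hypothetical address of the mock plugin

def exchange_source():
    # Push random mock data to the plugin, read it back, and compare.
    source = {"id": str(uuid.uuid4()), "host": "10.0.0.1", "port": 6000}
    requests.post(f"{PLUGIN}/sources", json=source).raise_for_status()
    stored = requests.get(f"{PLUGIN}/sources/{source['id']}").json()
    return stored == source

correct = sum(exchange_source() for _ in range(50_000))
print(f"{correct / 50_000:.3%} correct exchanges")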

Plugins also interact with each other by transmitting media according to the stream layout. This interoperability


Value | Play | Pause | Stop | Add S | Update S | Delete S | Add L | Update L | Delete L

Correct | 50000 | 50000 | 50000 | 50000 | 50000 | 49999 | 50000 | 50000 | 49999
Incorrect | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1
Ratio (%) | 100 | 100 | 100 | 100 | 100 | 99.998 | 100 | 100 | 99.998

Table 6.5: Interoperability test results (S: Source, L: Listener)

is not directly controlled by the framework, as plugins can be developed by third parties. To solve this, a plugin needs to provide its specifications to the framework before being integrated as a plugin. This allows the framework to decide whether or not two plugins will be able to interact with each other in a stream. For example, if plugin A supports MJPEG streams transmitted via RTP/UDP, it will be able to interact with a plugin B implementing the same protocols; if plugin B implements another protocol, it will not be able to interact with plugin A. If this is specified, the framework can notify a user that two plugins are not compatible. These scenarios should be avoided, which is done by specifying standard protocols for plugins; a sketch of such a compatibility check is given below.
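A minimal sketch of such a specification-based check in Python; the specification fields are assumptions for illustration, not the prototype's actual schema.

# Hypothetical plugin specifications: which encoding/transport combinations
# a plugin can emit or accept.
plugin_a = {"name": "mycam", "emits": {("MJPEG", "RTP/UDP")}}
plugin_b = {"name": "display", "accepts": {("MJPEG", "RTP/UDP")}}

def compatible(producer, consumer):
    # Two plugins can be linked when they share at least one
    # encoding/transport combination.
    return bool(producer["emits"] & consumer["accepts"])

print(compatible(plugin_a, plugin_b))  # True: both speak MJPEG over RTP/UDP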

6.1.3 Modifiability evaluation

Plugins are installed for the prototype by building and adding their image to the image directory of the Docker host. The framework does not need a restart to install these images; therefore, requirements MS-1 and MS-2 are met. End users can extend their version of the framework with new plugins by building the respective plugin images, meeting MS-3. Streams can be modified by linking different plugins by design, meeting MS-4. The framework can detect newly installed plugins when starting up if the image is installed in the image directory of the Docker host; therefore, requirements MS-5 and MS-6 are met. The current prototype is only deployable on a local device, as discussed in Section 4.1, meeting requirement MS-7. The other requirements can be met by deploying the framework using the Docker overlay network, as discussed in Section 4.3, without having to implement changes to the code base. The requirements MS-8 and MS-9 are not met, but are plausible by using a different Docker deployment.

In general, the framework was designed to be modifiable for different video analysis tasks. The hybrid microkernel/microservices architecture enables this modifiability. The microkernel plugin architecture allows a user to modify a video analysis stream during framework use. The microservices architecture allows for a modifiable deployment configuration.

6.2 Mob detection experiment results

To evaluate the detection experiment, the trained model is tested on the validation set that contains random images from the total annotated dataset presented in Section 5.1.2. First, the results of the training of the model are presented in Section 6.2.1. Second, the metrics that were used to evaluate the model are presented in Section 6.2.2. Finally, the results of the validation are presented in Section 6.2.3.


6.2.1 Training results

To monitor training, the average loss per training epoch was measured; the resulting training evolutions are depicted in Figure 6.1. Darknet does not shuffle training data automatically and creates training batches in the order of the training data provided. Since YOLO uses gradient descent for optimization, this can lead to YOLO getting stuck in local minima of the cost surface [130]. This effect is seen in Figure 6.1a around epoch 4500: every image in the training set has been loaded at least once at this point, the model was training on images from location B, and now images from location A are loaded (see Section 5.1.2). This leads to a peak in average loss, as YOLO was optimizing for images from location B and probably converging to a local minimum for that type of images. Therefore, in a second run, the data was shuffled, allowing the model to escape local minima more easily; a minimal shuffling sketch is given below. Figure 6.1b shows the difference in training loss: the curve is much more irregular thanks to the shuffling of the data. Once again, the average loss decreases more around epoch 4500, when every image in the training set has been loaded at least once. The average loss stagnates at values in the interval [0.04, 0.07]. To avoid overfitting the model on the training data, which would worsen generalization performance, early stopping is applied. Early stopping is a generalization technique that stops the training of a neural network before the network starts overfitting [131]. The stopping criterion used is progress, defined as the decrease of training error in successive training epochs [131], or the slope of the loss curve depicted in Figure 6.1. This slope approaches 0 from epoch 13000 onward, so this epoch is selected as the early stopping point. Because the generalization error is not a smooth curve and consists of many local minima, it is a good idea to also validate model weights in the neighborhood of the early stopping point, as these could potentially yield better performance on the validation set [131].
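A minimal sketch of the shuffling step, assuming a Darknet-style train.txt that lists one image path per line:

import random

with open("train.txt") as f:
    paths = f.read().splitlines()

random.seed(42)          # fixed seed so the shuffled list is reproducible
random.shuffle(paths)    # mix images from locations A and B across batches

with open("train_shuffled.txt", "w") as f:
    f.write("\n".join(paths) + "\n")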

6.2.2 Metrics

The model predicts bounding boxes for objects in the images of the validation sets. The bounding box provided by the annotated dataset is defined as the ground truth bounding box $B_{gt}$; the bounding box provided by the model is defined as the predicted bounding box $B_p$. To evaluate the performance of the model and select the best weights, several metrics are used. The standard metrics used to evaluate object detection problems are the Intersection over Union (IoU) and the mean Average Precision (mAP). The IoU is a metric used in common object detection challenges such as the Pascal VOC challenge [132]. If the function $A(B_x)$ gives the area of a bounding box $B_x$, the IoU is defined as

$$\mathrm{IoU} = \frac{A(B_p \cap B_{gt})}{A(B_p \cup B_{gt})} \qquad (6.1)$$

The mAP for a set of detections, another metric used in the Pascal VOC challenge, is defined as the mean over classes of the interpolated Average Precision (AP) for each class. A detection is considered a true positive if the IoU for the detection is greater than 0.5. The interpolated AP is given by the area under the precision-recall curve for the detections [132-134].
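For axis-aligned boxes, the IoU of Equation (6.1) reduces to a few lines of Python; the (x1, y1, x2, y2) corner format is an assumption for illustration.

def iou(box_p, box_gt):
    # Boxes as (x1, y1, x2, y2), with (x1, y1) the top-left corner.
    ix1, iy1 = max(box_p[0], box_gt[0]), max(box_p[1], box_gt[1])
    ix2, iy2 = min(box_p[2], box_gt[2]), min(box_p[3], box_gt[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_p = (box_p[2] - box_p[0]) * (box_p[3] - box_p[1])
    area_gt = (box_gt[2] - box_gt[0]) * (box_gt[3] - box_gt[1])
    return inter / (area_p + area_gt - inter)

# A detection with iou(...) > 0.5 counts as a true positive for the mAP.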

The model is also tested on several videos not included in the training and validation sets, to visually evaluate detection and to measure the number of frames per second that can be processed by the model.


(a) Average training loss when the data is not shuffled (vertical: average loss; horizontal: training epochs)

(b) Average training loss when the data is shuffled (vertical: average loss; horizontal: training epochs)

Figure 6.1: Average training loss per epoch

6.2.3 Validation results

Every 100 epochs, YOLO creates a snapshot of the weights the model is using at that epoch [83]. This makes it possible to validate each set of weights on the validation set and show the evolution of the validation performance. Figure 6.2 shows these evolutions for the average IoU and mAP metrics. The mAP gradually grows from epoch 4500 onwards and stagnates around epoch 11500. This shows that the model is not learning anymore and is at risk of overfitting. The mAP stagnates in the interval of [88%, 91%]. The average IoU shows a similar trend, but varies more, because predictions on the same images are rarely exactly the same.

The best mAP value, 90.52%, is achieved at epoch 15700. The weights from this epoch are used for further testing and validation. The mAP at the 0.5 IoU threshold of YOLOv3 on the COCO benchmark dataset [135] is 74.8%; compared with this, the mAP achieved on the Last Post dataset is very high. The reason for this difference is that the validation


(a) mAP (%) per epoch (vertical: mAP (%); horizontal: training epochs)

(b) IoU (%) per epoch (vertical: IoU (%); horizontal: training epochs)

Figure 6.2: Validation metrics per epoch

set is highly correlated with the training set. Because the training and validation sets were extracted from videos, all images from one video are correlated in time with each other. Images from the validation set are thus correlated with images in the training set, and the model is optimized on these types of images, explaining the high mAP. This indicates that the model is somewhat overfitting on the training data. This was confirmed when testing the model on unseen videos: although the model could detect a mob most of the time, it produced more visual errors. Because this data was not annotated, no metrics could be extracted. Figure 6.3 depicts some predictions of the model on images from the validation set. Visually, the predicted bounding boxes resemble the ground truth bounding boxes quite accurately.

To test the speed of the predictions of the model, the total time to predict images in the validation set was measured. For the NVIDIA GeForce GTX 980 GPU, the average prediction time for one image is 14.673 milliseconds, with a standard deviation of 0.517 milliseconds. This indicates that the upper limit of the frame rate when making predictions on a video is approximately 68 frames per second on the GPU. For comparison, predictions with the model were also made on a CPU, a 2.6 GHz Intel Core i5-2540 processor with AVX instruction speedup. The average prediction time on the CPU is 5.849 seconds, with a standard deviation of 0.438 seconds, resulting in an upper limit for the frame rate on the CPU of 0.171 frames per second. Clearly, real-time object detection with this model is only possible on a GPU. When generating predictions on a test video, the average frame rate of the video was 55 frames per second.


(a) Prediction of a large mob at location B. (b) Prediction of the mob at location A.

(c) Prediction of a small mob at location B. (d) Prediction of the mob at location B.

Figure 6.3: Predictions of the model on images in the validation set


Chapter 7

Conclusion and future work

7.1 Conclusion

Aerial thermal imaging with drones is a promising technology that can deliver many applications for various use cases across different domains, such as agriculture, firefighting, search and rescue, etc. Most applications built with this technology are built with a specific use case in mind, using a thermal camera and analysis software specific to this use case, and they therefore struggle to exchange hardware and algorithms for new use cases. Therefore, the goal of this dissertation was to design, build and test a possible backbone framework that allows building these applications in a modifiable way. The specific use case of mob detection in thermal images was investigated as a sample use case for the framework.

Chapter 2 explored the requirements of such a framework. The ASRs to achieve the goal of the framework are performance, interoperability and modifiability. Performance is needed because some use cases (like firefighting) require real-time video analysis. Interoperability enables the framework to interact with different thermal cameras and different processing/analysis modules. Modifiability enables the framework to interchange the thermal cameras and analyzers in its process, to build applications for different use cases. A hybrid combination of the microkernel pattern and the microservices pattern is used to meet these requirements, as the microkernel pattern enables interchanging the cameras and analyzers via a plugin system, and the microservices pattern enables different deployment configurations for the framework. To build and test the framework, several technologies were needed: backbone technologies for the software architecture, a thermal camera, and an object detection algorithm for the mob detection use case.

Chapter 3 explored the state of the art of these technologies and presented the selected technologies. Thermal cameras come in all shapes and sizes and have different features according to their retail price. Contrary to intuition, the image quality is not the defining factor of the retail price, but rather the amount of extra features, such as radiometry, communication interfaces, etc. The FLIR One Pro and Therm-App were selected for this dissertation, since they offer good quality images and features for their price, and their use via smartphone platforms makes these cameras excellent for prototyping. Microservices frameworks also vary widely, depending largely on the use case of the application using the framework: some are aimed at quick prototyping, others focus on performance, etc. Flask was selected as the microservices framework, as it is easy to use and designed for prototyping with microservices; this does come with a performance trade-off. To deploy the microservices in a plugin fashion,


the concept of containers is applied. Containers virtualize on the OS level, allowing the microservices to be moved around on the host and distributed over different hosts. The current field has several frameworks implementing this technology, with Docker being the most well-known and mature framework, and it was selected for that reason. The field of object detection offers a variety of solutions to the object detection problem, with varying accuracies, and some can even create predictions in real time. The YOLOv3 algorithm, implemented in the darknet framework, was selected, as it generalizes well to other datasets (such as thermal images), makes relatively accurate predictions, and is able to make predictions in real time when deployed on a device with GPU processing capabilities.

Chapter 4 presented the implemented prototype of the framework using these technologies. Two sample plugins were implemented: the filecam plugin that serves a video read from a file, and the display plugin that displays this video on the local device. The framework is limited to one video processing stream for one user at a time and is deployed on a local device. It also has a security risk, as the framework has to expose the Docker daemon socket to allow the framework to manipulate the containers running the plugins. This gives the containers that run the core framework processes root access to the host system, which can be abused by potential attackers.

The mob detection experiment was presented in Chapter 5. A new thermal image dataset, called the Last Post dataset, was collected for this experiment. The dataset features videos of the Last Post ceremony, filmed over the course of two weeks. What makes this dataset special is that, unlike publicly available datasets, it delivers footage of the movement of large crowds, filmed from a high vantage point to simulate footage captured from a drone platform. This dataset is used to train a pre-trained YOLOv3 model via transfer learning. The dataset is manually labeled and preprocessed by removing the outliers present. Training is done on an NVIDIA GTX 980 GPU and is evaluated using the SSE loss metric.

Chapter 6 presented the tests conducted on the framework and the detection model, and their corresponding results. The performance requirements for the frequently used commands are met by the framework. Other commands, such as removing plugins and starting up or shutting down the framework, do not meet the performance requirements, since Docker requires significant time to start, stop and remove containers. The real-time streaming requirements could not be proven, because the time between transmitting a frame and receiving a frame could not be measured directly. However, the processed videos were shown to human users, who could not distinguish between the processed video and the video played back on a local system, which makes it plausible that the framework achieves this requirement. Real-time streaming performance heavily depends on the plugin and the hardware on which it is deployed. When plugins in the framework are processing media, CPU usage increases significantly, even when only one plugin is active. This implies that the length of media processing streams should be kept as short as possible to achieve good performance. The framework is relatively big, with some plugins even having a size of 2 GB. This is mostly due to the base images and installed libraries of the plugins and core components. Because each component and plugin has its own container, libraries can't be shared, so they are redundantly installed, leading to large component sizes. This could be alleviated by using slimmer images and only installing the minimal set of libraries needed. The interoperability requirements are all met by the framework; this is proven by a test exchanging mock information between the framework and the plugins. The modifiability requirements regarding the plugins are met by the framework. The modifiability requirements regarding the deployment schemes are not met by the framework, but can be achieved by deploying the framework using a Docker overlay network instead of the Docker bridge network. To evaluate the trained model, the model made predictions


on a separate validation set. The model achieves an mAP of 90.52%, which is much higher than what current state-of-the-art models are achieving on benchmark datasets. This shows that the model is capable of learning the thermal features, but is also overfitting on the data due to the temporal correlation between the training and validation sets. The model can predict in real time, achieving an average frame rate of 55 frames per second when making predictions on a GPU.

7.2 Future work

This dissertation proposed a framework and implemented a prototype of it, which realizes only a part of the total framework. Object detection using deep learning, in general and applied to thermal images specifically, is still a young field. Several extensions to this research are possible.

7.2.1 Security

The framework prototype did not implement any security measures. Because communications rely on an external network in distributed configurations, these measures should be implemented to reduce the risk of attacks. To allow the components to manipulate Docker containers, the Docker host socket was exposed. As stated before, this is a serious security risk, as the container gets root access to the host. Workarounds for this problem could be to implement a Docker-in-Docker environment [136] or to deploy the containers in a VM.

7.2.2 Implementing a detection plugin

Due to the scope and time limit of the dissertation, a working prototype plugin containing a trained model for detecting objects in a video stream could not be made. A possible GStreamer pipeline for such a plugin is depicted in Figure 7.1 and sketched in code below the figure. This plugin is a Consumer and receives video via the udpsrc element. Frames are decoded, and the raw video is presented to the appsink GStreamer plugin, which allows the video to be dumped into an application: here, the detection model that generates predictions on the frame. The predicted frame is then forwarded to an appsrc GStreamer plugin, which puts the predicted frame in a new pipeline to transmit it to further framework plugins. It should be tested whether the detection model can run in a Docker container, since it needs GPU support to be able to predict in real time. A solution could be to use nvidia-docker, which leverages NVIDIA GPU support in Docker containers [137].

Figure 7.1: GStreamer pipeline for a plugin with a detection model
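A minimal sketch of the two halves of this pipeline, built with the GStreamer Python bindings; the port numbers and RTP caps are illustrative assumptions, and the detection model itself is omitted.

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# Receiving half: RTP/MJPEG over UDP, decoded and handed to an appsink.
receive = Gst.parse_launch(
    'udpsrc port=5000 caps="application/x-rtp, encoding-name=JPEG, payload=26" '
    "! rtpjpegdepay ! jpegdec ! videoconvert ! appsink name=sink"
)

# Transmitting half: predicted frames pushed into an appsrc and sent onwards.
transmit = Gst.parse_launch(
    "appsrc name=src ! videoconvert ! jpegenc ! rtpjpegpay "
    "! udpsink host=127.0.0.1 port=5001"
)

# The detection model would sit between the two: pull a sample from `sink`,
# draw the predictions on the frame, and push the result into `src`.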


7.2.3 Different deployment configurations

The prototype of the framework only implemented one of the deployment configurations presented in Section 2.3.3. Other configurations can be explored by changing the Docker bridge network to a Docker overlay network.
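With the Docker SDK for Python [96], creating such an overlay network is a one-liner. This sketch assumes the hosts already run in Docker swarm mode; the network name is illustrative.

import docker

client = docker.from_env()

# An attachable overlay network spans all hosts in the swarm, so plugin
# containers can be distributed over several devices [112].
client.networks.create("framework-net", driver="overlay", attachable=True)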

7.2.4 Multiple streams with different layouts

The prototype only implemented one stream, with a chain-like layout. Future effort could implement support for multiple streams that run concurrently. The layout can be changed by implementing plugins that can forward media to multiple listeners or merge media coming from different sources, which is the concept of sensor fusion.

7.2.5 Implementing the plugin distribution service (Remote Producer/Consumer)

Chapter 2 presented the Remote Producer and Consumer, which distribute the plugins available for the framework. This was deemed out of scope for the prototype, but could be implemented in future versions.

7.2.6 Using high-performance microservices backbone frameworks

The current implementation uses the Flask framework, which is excellent for prototyping but not ideal for high performance. Other frameworks, such as Vert.x, focus on high performance through asynchronous messaging, which could improve the performance of the framework.

7.2.7 New object detection models and datasets specifically for thermal images

Current effort in object detection models goes towards challenges on benchmark datasets of visual images, such as ImageNet and Pascal VOC. There are some thermal datasets publicly available for some detection purposes, but these are very small compared to the visual image datasets. Future research could create new benchmark datasets, similar to the visual image datasets, specifically for thermal images.

Currently, publicly available pre-trained neural network models are designed for and trained on visual image datasets. Future research could go towards designing an architecture specifically for thermal images and training a model on such a benchmark dataset.

Thermal images use several colormaps to map the relative temperatures in a scene onto colors representing warm and cold regions. Well-known examples are the Iron scheme (used in this dissertation), White-hot and Black-hot. Some companies implement threshold colors that highlight very hot or very cold spots in an image (for examples, see [138, 139], etc.). Future research could investigate how models trained on images using different color schemes differ in their predictions and performance.

Thermal images could also benefit from radiometric information, which adds substantial information by adding a temperature dimension to each pixel in the image instead of the relative coloring. This information could lead to more accurate predictions.


Bibliography

[1] S G Gupta M M Ghonge and P Jawandhiya ldquoReview of Unmanned Aircraft Systemrdquo International Journal of Advanced

Research in Computer Engineering amp Technology vol 2 no 4 pp 2278ndash1323 2013 ISSN 2278 ndash 1323

[2] M. Hassanalian and A. Abdelkefi, "Classifications, applications, and design challenges of drones: A review", Progress in Aerospace Sciences, 2017. DOI: 10.1016/j.paerosci.2017.04.003.

[3] M Joel The Booming Business of Drones 2013 [Online] Available httpshbrorg201301the-booming-business-of-

drones (visited on 01302018)

[4] DJI Zenmuse H3 - 2D [Online] Available httpswwwdjicomzenmuse-h3-2d (visited on 01302018)

[5] Gimbal Guard Drop amp Delivery Device for DJI Mavic Pro [Online] Available httpwwwgimbal-guardcom7B5C_

7Dpprd134610820141productdrop-7B5C7D26-delivery-device-for-dji-mavic-pro (visited on 01302018)

[6] FLIR Systems Aerial Thermal Imaging Kits [Online] Available httpwwwflircomsuasaerial-thermal-imaging-kits

(visited on 01302018)

[7] R Gade and T B Moeslund ldquoThermal cameras and applications a surveyrdquo Machine Vision and Applications vol 25

pp 245ndash262 2014 DOI 101007s00138-013-0570-5 [Online] Available httpslinkspringercomcontentpdf10

10077B5C7D2Fs00138-013-0570-5pdf

[8] M. C. Harvey, J. V. Rowland and K. M. Luketina, "Drone with thermal infrared camera provides high resolution georeferenced imagery of the Waikite geothermal area, New Zealand", 2016. DOI: 10.1016/j.jvolgeores.2016.06.014.

[9] S. Amici, M. Turci, S. Giammanco, L. Spampinato and F. Giulietti, "UAV Thermal Infrared Remote Sensing of an Italian Mud Volcano", vol. 2, pp. 358-364, 2013. DOI: 10.4236/ars.2013.24038.

[10] J. Bendig, A. Bolten and G. Bareth, "Introducing a low-cost mini-UAV for thermal- and multispectral-imaging", 2012. [Online]. Available: https://www.int-arch-photogramm-remote-sens-spatial-inf-sci.net/XXXIX-B1/345/2012/isprsarchives-XXXIX-B1-345-2012.pdf


[11] Workswell ldquoUsing the UAV Thermography for Cultivation and Phenotyping of Cerealsrdquo Tech Rep 2016 [Online] Avail-

able httpswwwdrone-thermal-cameracomwp-contentuploadsCultivation-and-Phenotyping-1pdf

[12] A J Rivera A D Villalobos J C Monje J A Marintildeas and C M Oppus ldquoPost-disaster rescue facility Human detection and

geolocation using aerial dronesrdquo IEEE Region 10 Annual International Conference ProceedingsTENCON pp 384ndash386

2017 ISSN 21593450 DOI 101109TENCON20167848026

[13] P Christiansen K A Steen R N Joslashrgensen and H Karstoft ldquoAutomated detection and recognition of wildlife using

thermal camerasrdquo Sensors (Basel Switzerland) vol 14 no 8 pp 13 778ndash93 Jul 2014 ISSN 1424-8220 DOI 103390

s140813778 [Online] Available httpwwwncbinlmnihgovpubmed2519610520httpwwwpubmedcentralnih

govarticlerenderfcgiartid=PMC4179058

[14] J. Zhang, J. Hu, J. Lian, Z. Fan, X. Ouyang and W. Ye, "Seeing the forest from drones: Testing the potential of lightweight drones as a tool for long-term forest monitoring", Biological Conservation, vol. 198, pp. 60-69, 2016.

[15] D. Ventura, M. Bruno, G. Jona Lasinio, A. Belluscio and G. Ardizzone, "A low-cost drone based application for identifying and mapping of coastal fish nursery grounds", Estuarine, Coastal and Shelf Science, vol. 171, pp. 85-98, Mar. 2016. ISSN: 02727714. DOI: 10.1016/j.ecss.2016.01.030.

[16] S Chowdhury A Emelogu M Marufuzzaman S G Nurre and L Bian ldquoDrones for disaster response and relief operations

A continuous approximation modelrdquo 2017 DOI 101016jijpe201703024 [Online] Available wwwelseviercomlocate

ijpe

[17] Workswell ldquoPipeline inspection with thermal diagnosticsrdquo 2016 [Online] Available https www drone - thermal -

cameracomwp-contentuploadspipelinepdf

[18] Workswell ldquoThermo diagnosis of photovoltaic power plantsrdquo 2016 [Online] Available httpswwwdrone-thermal-

cameracomwp-contentuploadsWorkswell-WIRIS7B5C_7Dphotovoltaicpdf

[19] Workswell ldquoThermodiagnostics of flat roofsrdquo 2016 [Online] Available httpswwwdrone-thermal-cameracomwp-

contentuploadsroofpdf

[20] Workswell ldquoThermodiagnostics in the power engineering sectorrdquo Tech Rep 2016 [Online] Available https www

drone-thermal-cameracomwp-contentuploadshighvoltagepdf

[21] Workswell Workswell WIRIS - Product - Thermal camera for drones 2016 [Online] Available https www drone -

thermal-cameracomwiris (visited on 01302018)

[22] TEAX Technology ThermalCapture - Thermal Imaging Technology | Capture raw radiometric thermal data with drones

[Online] Available httpthermalcapturecom (visited on 01302018)


[23] DJI Zenmuse XT - unlock the possibilities of sight - DJI 2018 [Online] Available https wwwdji comzenmuse-xt

(visited on 01302018)

[24] Workswell SOFTWARE - Workswell WIRIS - Thermal camera for drones 2016 [Online] Available httpswwwdrone-

thermal-cameracomsoftware (visited on 01312018)

[25] Therm-App, "Therm-App - Apps on Google Play", 2018. [Online]. Available: https://play.google.com/store/apps/details?id=com.thermapp (visited on 01/31/2018).

[26] B Satzger W Hummer C Inzinger P Leitner and S Dustdar ldquoWinds of change From vendor lock-in to the meta cloudrdquo

IEEE Internet Computing vol 17 no 1 pp 69ndash73 2013 ISSN 10897801 DOI 101109MIC201319

[27] J Divya Drone Technology and Usage Current Uses and Future Drone Technology 2017 [Online] Available httpuk

businessinsidercomdrone-technology-uses-2017-7r=US7B5Camp7DIR=T (visited on 01312018)

[28] A Boulanger ldquoOpen-source versus proprietary software Is one more reliable and secure than the otherrdquo IBM Systems

Journal vol 44 no 2 pp 239ndash248 2005 ISSN 0018-8670 DOI 101147sj4420239 [Online] Available httpieeexplore

ieeeorgdocument5386727

[29] M Kazmeyer Disadvantages of Proprietary Software [Online] Available httpsmallbusinesschroncomdisadvantages-

proprietary-software-65430html (visited on 01312018)

[30] B. Steffen and A. Seyfried, "Methods for measuring pedestrian density, flow, speed and direction with minimal scatter", Physica A: Statistical Mechanics and its Applications, vol. 389, no. 9, pp. 1902-1910, May 2010. ISSN: 0378-4371. DOI: 10.1016/J.PHYSA.2009.12.015. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0378437109010115

[31] M Wirz T Franke D Roggen E Mitleton-Kelly P Lukowicz and G Troumlster ldquoInferring crowd conditions from pedestriansrsquo

location traces for real-time crowd monitoring during city-scale mass gatheringsrdquo Proceedings of the Workshop on

Enabling Technologies Infrastructure for Collaborative Enterprises WETICE pp 367ndash372 2012 ISSN 15244547 DOI 10

1109WETICE201226

[32] E Alpaydin Introduction to machine learning 3rd ed MIT Press 2014 p 591 ISBN 026201243X [Online] Available

httpsdlacmorgcitationcfmid=1734076

[33] J W Davis and V Sharma ldquoRobust background-subtraction for person detection in Thermal Imageryrdquo IEEE Computer

Society Conference on Computer Vision and Pattern Recognition Workshops vol 2004-Janua no January 2004 ISSN

21607516 DOI 101109CVPR2004431

[34] W Wang J Zhang and C Shen ldquoImproved Human Detection And Classification in Thermal Imagesrdquo pp 2313ndash2316 2010

[35] R Appel S Belongie P Perona and P Doll ldquoFast Feature Pyramids for Object Detectionrdquo Pami vol 36 no 8 pp 1ndash14

2014 ISSN 01628828 DOI 10 1109 TPAMI 2014 2300479 [Online] Available https vision cornell edu se3 wp -

contentuploads201409DollarPAMI14pyramids7B5C_7D0pdf

[36] T Goedeme ldquoProjectresultaten VLAIO TETRA-projectrdquo KU Leuven Louvain Tech Rep 2017


[37] L-L Slattery DroneSAR wants to turn drones into search-and-rescue heroes 2017 [Online] Available https www

siliconrepubliccomstart-upsdronesar-search-and-rescue-drone-software (visited on 05262018)

[38] A W S Inc What Is Amazon Kinesis Video Streams 2018 [Online] Available https docs aws amazon com

kinesisvideostreamslatestdgwhat-is-kinesis-videohtml (visited on 05262018)

[39] U Government ldquoSystems Engineering Fundamentalsrdquo Defence Acquisition University Press no January p 223 2001

ISSN 1872-7565 DOI 101016jcmpb201005002 [Online] Available httpwwwdticmildocscitationsADA387507

[40] L Bass P Clements and R Kazman Software Architecture in Practice 3rd Addison-Wesley Professional 2012 ISBN

0321815734 9780321815736

[41] J Greene and M Stellman Applied Software Project Management 2006 p 324 ISBN 978-0596009489 [Online] Avail-

able httpwwworeillycomcatalogappliedprojectmgmt

[42] S Barber Acceptable application response times vs industry standard 2018 [Online] Available httpssearchsoftwarequality

techtargetcomtipAcceptable-application-response-times-vs-industry-standard (visited on 05282018)

[43] T Burger How Fast Is Realtime Human Perception and Technology | PubNub 2015 [Online] Available httpswww

pubnubcombloghow-fast-is-realtime-human-perception-and-technology (visited on 05282018)

[44] S-t Modeling P Glennie and N Thrift ldquoTime perception modelsrdquo Neuron pp 15 696ndash15 699 1992

[45] M Richards Software Architecture Patterns First edit Heather Scherer Ed OrsquoReilly Media 2015 [Online] Available

httpwwworeillycomprogrammingfreefilessoftware-architecture-patternspdf

[46] C Richardson Microservice Architecture pattern 2017 [Online] Available httpmicroservicesiopatternsmicroservices

html (visited on 12022017)

[47] P Clements F Bachmann L Bass D Garlan J Ivers R Little P Merson R Nord and J Staffor Documenting Software

Architectures Second Boston Pearson Education Inc 2011 ISBN 0-321-55268-7

[48] Object Management Group ldquoUnified Modeling Language v251rdquo no December 2017 [Online] Available http www

omgorgspecUML251

[49] C De La Torre C Maddock J Hampton P Kulikov and M Jones Communication in a microservice architecture 2017

[Online] Available https docs microsoft com en - us dotnet standard microservices - architecture architect -

microservice-container-applicationscommunication-in-microservice-architecture (visited on 04272018)

[50] H Schulzrinne and S Casner ldquoRTP Profile for Audio and Video Conferences with Minimal Controlrdquo 2003 [Online] Avail-

able httpstoolsietforghtmlrfc3551

[51] D Bull Communicating Pictures A Course in Image and Video Coding Elsevier Science 2014 ISBN 9780080993744

[Online] Available httpsbooksgooglebebooksid=PDZOAwAAQBAJ

[52] On-Net Surveillance Systems Inc ldquoMJPEG vs MPEG4 Understanding the differences advantages and disadvantages of

each compression techniquerdquo 2006 [Online] Available wwwonssicom


[53] MAVLink, "Introduction - MAVLink Developer Guide", 2013. [Online]. Available: https://mavlink.io/en (visited on 09/14/2017).

[54] hartmut Schlosser Microservices trends 2017 Strategies tools and frameworks - JAXenter 2017 [Online] Available

httpsjaxentercommicroservices-trends-2017-survey-133265html (visited on 03242018)

[55] A Ronacher Welcome to Flask mdash Flask Documentation (012) 2017 [Online] Available httpflaskpocooorgdocs012

(visited on 03242018)

[56] F Reyes PythonDecorators 2017 [Online] Available https wiki python org moin PythonDecorators (visited on

04272018)

[57] Stackshare Companies that use Flask and Flask Integrations 2018 [Online] Available https stackshare io flask

(visited on 03242018)

[58] Falcon Falcon - Bare-metal web API framework for Python [Online] Available httpsfalconframeworkorg7B5C

7DsectionAbout (visited on 03242018)

[59] Stackshare Companies that use Falcon and Falcon Integrations 2018 [Online] Available httpsstackshareiofalcon

(visited on 03242018)

[60] A Ronacher Nameko for Microservices 2015 [Online] Available httplucumrpocooorg201548microservices-with-

nameko (visited on 03242018)

[61] C Escoffier Building Reactive Microservices in Java 2017 ISBN 9781491986264

[62] C Posta Microservices for Java Developers ISBN 9781491963081

[63] R Dua A R Raja and D Kakadia ldquoVirtualization vs Containerization to support PaaSrdquo in IEEE International Conference

on Cloud Engineering 2014 ISBN 9781479937660 DOI 101109IC2E201441

[64] D. Merkel, "Docker: Lightweight Linux Containers for Consistent Development and Deployment", Linux Journal, vol. 2014, no. 239, 2014. (visited on 03/19/2018).

[65] Docker Inc Docker for the Virtualization Admin 2016 p 12

[66] Docker Inc What is a Container 2018 [Online] Available https www docker com what - container (visited on

03242018)

[67] M Helsley LXC Linux container tools 2009 [Online] Available httpswwwibmcomdeveloperworkslinuxlibraryl-

lxc-containers (visited on 05212018)

[68] J. Fink, "Docker: a Software as a Service, Operating System-Level Virtualization Framework", The Code4Lib Journal, 2014. [Online]. Available: http://journal.code4lib.org/articles/9669 (visited on 03/19/2018).


[69] C Wang What is Docker Linux containers explained 2017 [Online] Available https www infoworld comarticle

3204171linuxwhat-is-docker-linux-containers-explainedhtml (visited on 05212018)

[70] CoreOS Rkt a security-minded standards-based container engine [Online] Available httpscoreoscomrkt (visited

on 03242018)

[71] F X F Xu X L X Liu and K Fujimura ldquoPedestrian detection and tracking with night visionrdquo IEEE Transactions on

Intelligent Transportation Systems vol 6 no 1 pp 63ndash71 2005 ISSN 1524-9050 DOI 101109TITS2004838222

[72] H Nanda and L Davis ldquoProbabilistic template based pedestrian detection in infrared videosrdquo IEEE Intelligent Vehicles

Symposium Proceedings vol 1 pp 15ndash20 2003 DOI 101109IVS20021187921

[73] R E Schapire ldquoExplaining adaboostrdquo Empirical Inference Festschrift in Honor of Vladimir N Vapnik pp 37ndash52 2013

DOI 101007978-3-642-41136-6_5

[74] P Viola O M Way M J Jones and D Snow ldquoDetecting pedestrian using patterns of motion and appearancerdquo Interna-

tional Journal of Computer Vision vol 63 no 2 pp 153ndash161 2005 DOI 101109ICCV20031238422

[75] I Goodfellow Y Bengio and A Courville Deep Learning MIT Press 2016 httpwwwdeeplearningbookorg

[76] J R R Uijlings K E A Van De Sande T Gevers and A W M Smeulders ldquoSelective Search for Object Recognitionrdquo Tech

Rep 2012 DOI 101007s11263-013-0620-5 arXiv 14094842 [Online] Available httpwwwcscornelleducourses

cs76702014spslidesVisionSeminar14pdf

[77] R Girshick J Donahue T Darrell and J Malik ldquoRegion-Based Convolutional Networks for Accurate Object Detection and

Segmentationrdquo IEEE Transactions on Pattern Analysis and Machine Intelligence vol 38 no 1 pp 142ndash158 2014 ISSN

01628828 DOI 101109TPAMI20152437384 arXiv 13112524

[78] R Girshick ldquoFast R-CNNrdquo in Proceedings of the IEEE International Conference on Computer Vision vol 2015 Inter 2015

pp 1440ndash1448 ISBN 9781467383912 DOI 101109ICCV2015169 arXiv 150408083

[79] S Ren K He R Girshick and J Sun ldquoFaster R-CNN Towards Real-Time Object Detection with Region Proposal Networksrdquo

IEEE Transactions on Pattern Analysis and Machine Intelligence vol 39 no 6 pp 1137ndash1149 2016 ISSN 01628828 DOI

101109TPAMI20162577031 arXiv 150601497

[80] K He Gkioxari P Dollaacuter and R Girshick ldquoMask R-CNNrdquo arXiv 2018 arXiv arXiv170306870v3

[81] J Dai Y Li K He and J Sun ldquoR-FCN Object Detection via Region-based Fully Convolutional Networksrdquo Tech Rep 2016

DOI 101109ICASSP20177952132 arXiv 160506409 [Online] Available httparxivorgabs160506409

[82] J Redmon S Divvala R Girshick and A Farhadi ldquoYou Only Look Once Unified Real-Time Object Detectionrdquo 2015 ISSN

01689002 DOI 101109CVPR201691 arXiv 150602640 [Online] Available httparxivorgabs150602640

[83] J Redmon and A Farhadi ldquoYOLOv3 An Incremental Improvementrdquo axXiv 2018 [Online] Available httpspjreddiecom

mediafilespapersYOLOv3pdf

[84] W Liu D Anguelov D Erhan C Szegedy S Reed C-y Fu and A C Berg ldquoSSD Single Shot MultiBox Detectorrdquo arXiv 2016

arXiv arXiv151202325v5


[85] B Zoph and Q V Le ldquoNeural Architecture Search with Reinforcement Learningrdquo in ICLR 2017 pp 1ndash16 arXiv arXiv

161101578v2

[86] T-y Lin P Goyal R Girshick K He and P Dollaacuter ldquoFocal Loss for Dense Object Detectionrdquo arXiv 2018 arXiv arXiv

170802002v2

[87] Facebook Inc ONNX - About 2017 [Online] Available httpsonnxaiabout (visited on 05212018)

[88] TensorFlow TensorFlow 2018 [Online] Available httpswwwtensorfloworg (visited on 05212018)

[89] J Huang V Rathod C Sun M Zhu A Korattikara A Fathi I Fischer Z Wojna Y Song S Guadarrama and K Murphy

ldquoSpeedaccuracy trade-offs for modern convolutional object detectorsrdquo arXiv 2017 arXiv arXiv161110012v3

[90] J Redmon Darknet Open source neural networks in c httppjreddiecomdarknet 2013ndash2016

[91] Microsoft The Microsoft Cognitive Toolkit | Microsoft Docs 2018 [Online] Available https docs microsoft comen-

uscognitive-toolkitindex (visited on 05212018)

[92] Docker Inc Overview of Docker Compose | Docker Documentation 2018 [Online] Available httpsdocsdockercom

composeoverview (visited on 04272018)

[93] Docker Inc Use bridge networks | Docker Documentation 2018 [Online] Available httpsdocsdockercomnetwork

bridge (visited on 04272018)

[94] A Ronacher Click Documentation (50) 2017 [Online] Available httpclickpocooorg5 (visited on 04272018)

[95] A K Reitz Requests HTTP for Humans mdash Requests 2184 documentation 2018 [Online] Available httpdocspython-

requestsorgenmaster (visited on 05092018)

[96] Docker Inc Docker SDK for PythonmdashDocker SDK for Python 20 documentation 2018 [Online] Available httpsdocker-

pyreadthedocsioenstable (visited on 05122018)

[97] GStreamer GStreamer open source multimedia framework 2018 [Online] Available httpsgstreamerfreedesktop

org (visited on 05132018)

[98] E Walthinsen filesrc GStreamer Core Plugins 10 Plugins Reference Manual [Online] Available https gstreamer

freedesktop org data doc gstreamer head gstreamer -plugins html gstreamer -plugins - filesrc html (visited on

05132018)

[99] E Hervey decodebin GStreamer Base Plugins 10 Plugins Reference Manual [Online] Available https gstreamer

freedesktoporgdatadocgstreamerheadgst-plugins-base-pluginshtmlgst-plugins-base-plugins-decodebinhtml

(visited on 05132018)

[100] W Taymans jpegenc GStreamer Good Plugins 10 Plugins Reference Manual [Online] Available https gstreamer

freedesktoporgdatadocgstreamerheadgst-plugins-goodhtmlgst-plugins-good-plugins-jpegenchtml (visited on

05132018)


[101] A Communications rtpjpegpay GStreamer Good Plugins 10 Plugins Reference Manual [Online] Available https

gstreamer freedesktop org data doc gstreamer head gst - plugins - good html gst - plugins - good - plugins -

rtpjpegpayhtml (visited on 05132018)

[102] W Taymans udpsink GStreamer Good Plugins 10 Plugins Reference Manual [Online] Available https gstreamer

freedesktoporgdatadocgstreamerheadgst-plugins-goodhtmlgst-plugins-good-plugins-udpsinkhtml (visited on

05132018)

[103] GStreamer Basic tutorial 3 Dynamic pipelines [Online] Available httpsgstreamerfreedesktoporgdocumentation

tutorialsbasicdynamic-pipelineshtml (visited on 05132018)

[104] W Taymans udpsrc GStreamer Good Plugins 10 Plugins Reference Manual [Online] Available https gstreamer

freedesktoporgdatadocgstreamerheadgst-plugins-goodhtmlgst-plugins-good-plugins-udpsrchtml (visited on

05142018)

[105] W Taymans rtpjpegdepay GStreamer Good Plugins 10 Plugins Reference Manual [Online] Available httpsgstreamer

freedesktoporgdatadocgstreamerheadgst-plugins-goodhtmlgst-plugins-good-plugins-rtpjpegdepayhtml

(visited on 05142018)

[106] A Loonstra ldquoVideostreaming with Gstreamerrdquo [Online] Available httpmediatechnologyleideneduimagesuploads

docswt20147B5C_7Dgstreamerpdf

[107] W Taymans jpegdec GStreamer Good Plugins 10 Plugins Reference Manual [Online] Available https gstreamer

freedesktoporgdatadocgstreamerheadgst-plugins-goodhtmlgst-plugins-good-plugins-jpegdechtml (visited on

05142018)

[108] J Schmidt autovideosink GStreamer Good Plugins 10 Plugins Reference Manual [Online] Available httpsgstreamer

freedesktoporgdatadocgstreamerheadgst-plugins-goodhtmlgst-plugins-good-plugins-autovideosinkhtml

(visited on 05142018)

[109] A Ronacher Deployment Options mdash Flask 0124 documentation 2018 [Online] Available httpflaskpocooorgdocs

012deploying (visited on 05142018)

[110] R Yasrab ldquoMitigating Docker Security Issuesrdquo University of Science and Technology of China Hefei Tech Rep [Online]

Available httpsarxivorgpdf180405039pdf

[111] Lvh Donrsquot expose the Docker socket (not even to a container) 2015 [Online] Available httpswwwlvhiopostsdont-

expose-the-docker-socket-not-even-to-a-containerhtml (visited on 05152018)

[112] Docker Inc Use overlay networks | Docker Documentation 2018 [Online] Available httpsdocsdockercomnetwork

overlay7B5C7Dcustomize-the-docker7B5C_7Dgwbridge-interface (visited on 05152018)

[113] J W Davis and M A Keck ldquoA Two-Stage Template Approach to Person Detection in Thermal Imageryrdquo Proc Workshop

on Applications of Computer Vision 2005 [Online] Available httpvcipl-okstateorgpbvsbenchpaperswacv05pdf


[114] J W Davis and V Sharma ldquoBackground-subtraction using contour-based fusion of thermal and visible imageryrdquo Com-

puter Vision and Image Understanding vol 106 no No 2-3 pp 162ndash182 2007 DOI 101016jcviu200606010 [Online]

Available httpswebcseohio-stateedu7B~7Ddavis1719Publicationscviu07pdf

[115] S Hwang J Park N Kim Y Choi and I S Kweon ldquoMultispectral Pedestrian Detection Benchmark Dataset and Baselinerdquo

CVPR 2015 [Online] Available httpssitesgooglecomsitepedestrianbenchmark

[116] Z Wu N Fuller D Theriault and M Betke ldquoA Thermal Infrared Video Benchmark for Visual Analysisrdquo IEEE Conference

on Computer Vision and Pattern Recognition Workshops 2014 DOI 101109CVPRW201439 [Online] Available http

citeseerxistpsueduviewdocdownloaddoi=101173522167B5Camp7Drep=rep17B5Camp7Dtype=pdf

[117] R Miezianko Terravic research infrared database

[118] R Miezieanko Terravic research infrared database

[119] S Z Li R Chu S Liao and L Zhang ldquoIllumination Invariant Face Recognition Using Near-Infrared Imagesrdquo IEEE Trans-

actions on Pattern Analysis and Machine Intelligence vol 29 no 4 pp 627ndash639 2007 DOI 101109TPAMI20071014

[Online] Available httpvcipl-okstateorgpbvsbenchpapersNIRpdf

[120] A Akula R Ghosh S Kumar and H K Sardana ldquoMoving target detection in thermal infrared imagery using spatiotem-

poral informationrdquo J Opt Soc Am A vol 30 no 8 pp 1492ndash1501 Aug 2013 DOI 101364JOSAA30001492 [Online]

Available httpjosaaosaorgabstractcfmURI=josaa-30-8-1492

[121] R I Hammoud IEEE OTCBVS WS Series Bench [Online] Available http vcipl - okstate org pbvs bench (visited on

05182018)

[122] Last Post Association Mission 2018 [Online] Available httpwwwlastpostbeenthe-last-postmission (visited on

05182018)

[123] FLIR Systems, Inc., "FLIR One Pro", 2017. [Online]. Available: https://www.flir.com/globalassets/imported-assets/document/17-1746-oem-flir_one_pro_datasheet_final_v1_web.pdf

[124] R J Ramana Introduction to Camouflage andDeception Defence Scientific Information ampDocumentation Centre pp 99ndash

164

[125] A Bornstein and I Richter Microsoft visual object tagging tool [Online] Available httpsgithubcomMicrosoftVoTT

(visited on 05202018)

[126] F E Grubbs ldquoProcedures for Detecting Outlying Observations in Samplesrdquo Technometrics vol 11 no 1 pp 1ndash21 Feb 1969

DOI 10108000401706196910490657 [Online] Available httpwwwtandfonlinecomdoiabs10108000401706

196910490657

[127] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li and L. Fei-Fei, "ImageNet: A Large-Scale Hierarchical Image Database", in CVPR09, 2009. [Online]. Available: http://www.image-net.org/papers/imagenet_cvpr09.pdf


[128] D Gupta Transfer learning amp The art of using Pre-trained Models in Deep Learning 2017 [Online] Available https

wwwanalyticsvidhyacomblog201706transfer-learning-the-art-of-fine-tuning-a-pre-trained-model (visited on

05202018)

[129] Docker Inc docker stats | Docker Documentation 2018 [Online] Available httpsdocsdockercomenginereference

commandlinestats (visited on 05242018)

[130] M Gori and A Tesi ldquoOn the Problem of Local Minima in Recurrent Neural Networksrdquo IEEE Transactions on Pattern

Analysis and Machine Intelligence vol 14 no 1 pp 76ndash86 1992 DOI 10110934107014

[131] L Prechelt ldquoEarly stopping - but whenrdquo in Neural Networks Tricks of the Trade G B Orr and K-R Muumlller Eds Berlin

Heidelberg Springer Berlin Heidelberg 1998 pp 55ndash69 ISBN 978-3-540-49430-0 DOI 1010073-540-49430-8_3

[Online] Available httpsdoiorg1010073-540-49430-8_3

[132] M Everingham L Van Gool C K Williams J Winn and A Zisserman ldquoThe Pascal visual object classes (VOC) challengerdquo

International Journal of Computer Vision vol 88 no 2 pp 303ndash338 2010 ISSN 09205691 DOI 101007s11263-009-

0275-4

[133] M Everingham S M A Eslami L Van Gool C K I Williams J Winn and A Zisserman ldquoThe Pascal Visual Object Classes

Challenge A Retrospectiverdquo International Journal of Computer Vision vol 111 no 1 pp 98ndash136 2014 ISSN 15731405

DOI 101007s11263-014-0733-5

[134] P Henderson and V Ferrari ldquoEnd-to-end training of object class detectors for mean average precisionrdquo Lecture Notes

in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

vol 10115 LNCS pp 198ndash213 2017 ISSN 16113349 DOI 101007978-3-319-54193-8_13 arXiv 160703476

[135] T Y Lin M Maire S Belongie J Hays P Perona D Ramanan P Dollaacuter and C L Zitnick ldquoMicrosoft COCO Common objects

in contextrdquo Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture

Notes in Bioinformatics) vol 8693 LNCS no PART 5 pp 740ndash755 2014 ISSN 16113349 DOI 101007978-3-319-10602-

1_48 arXiv 14050312

[136] Docker Inc Librarydocker 2018 [Online] Available https hub docker com 7B 5C _ 7D docker (visited on

06012018)

[137] Nvidia Nvidia-docker [Online] Available httpsgithubcomNVIDIAnvidia-docker (visited on 05252018)

[138] FLIR ldquoFLIR Onerdquo [Online] Available http www flir comuploadedFiles Store Products FLIR-ONE3rd-GenFLIR-

ONEFLIR-ONE-Gen-3-Datasheetpdf

[139] FLIR ldquoFLIR Bosonrdquo p 2 2016


Appendix A

Firefighting department email conversations

This appendix contains the email conversations with different firefighting departments in Belgium, as part of an exploration of the functional requirements of an aerial thermal imaging solution. Note that all conversations were translated from Dutch to English.

A.1 General email sent to firefighting departments

This email was sent to the departments mentioned later in this appendix. The responses in the following sections are replies to this email.

Subject: Firefighting department - Thesis thermal drones

Dear Sir/Madam,

My name is Brecht Verhoeve. I am a student Master of Science in Computer Science Engineering at Ghent University. I am contacting your department with reference to the research of my master's dissertation. I am currently researching the applications of thermal cameras in combination with commercial drones. They can create an aerial overview of scenes and objects that often can't be spotted with visual detectors, like hidden persons, fires or hot explosives. The eventual goal is to let a computer indicate these objects of interest autonomously on the thermal images of the drone. These images could aid firefighters with their work.

For this research I have some questions for you.

Functionality

I have listed some functionalities which I believe could be interesting for a firefighter:

• Detection of persons in buildings (find potential victims)

• Detection of hidden fires in buildings (to identify danger zones)

• Detection of fires on vast terrains (forests, industrial terrains)


• Indication of hot explosives

I have two questions:

• Do you agree that these are the most important functions?

• Are there any other functions that you deem important?

Quality of the application: Next to the functionality, the quality of the application is also important. For me the most important aspects are:

• Accuracy: The software must be accurate. There is no room for errors when detecting.

• Speed: The software must operate quickly. An overview must be created quickly to not waste time in case of an emergency.

• Usability: The software must be easy to use.

Once again I have two questions:

• Do you agree with these qualities?

• Are there any other important qualities that you deem important?

I would like to thank you in advance for your time

Best regards

Brecht Verhoeve

A2 Conversation with Firefighting department of Antwerp Belgium

The answers were given inline. For clarity, these are explicitly given.

Subject: Re: Firefighting department Antwerp - Thesis thermal drones

Answers can be found in your email

Best regards

Functionality: Detection of hidden fires in buildings and environments. Are there any other functions that you deem important? Capture the evolution of a fire with the thermal camera. Visualise incidents during night-time. Capture invisible fires, such as hydrogen or methane fires.


A3 Conversation with Firefighting department of Ostend Belgium

The answers were given inline. For clarity, these are explicitly given.

Subject: Re: Firefighting department Ostend - Thesis thermal drones

Dear Brecht

You can find the answers after the questions in your email

Best Regards

Functionality: Are there any other functions that you deem important? These are the most important for us at the moment.

Quality of the application: Are there any other important qualities that you deem important? The application must work autonomously.

A4 Conversation with Firefighting department of Courtrai Belgium

Subject: Re: Firefighting department Courtrai - Thesis thermal drones

Dear Brecht

Below you will find our answers (next to the already mentioned items).

Functionality

• The detection of persons in a landscape. For example, missing persons: after a traffic accident there are searches in the dark for victims that were catapulted from a vehicle. Today this is done via a thermal camera on the ground, but with a drone this could hold potential benefits. Another example is searching for missing persons in nature reserves. The police sometimes ask for the assistance of firefighters to search the area.

Quality of the application

• The images need to be processed in real time, not after the drone has landed.

• The drones must be deployable for multiple purposes.

The interpretation of the images can be important in the future for automatic flight control of drones. Currently there is a European project, "3D Safeguard", in which KU Leuven is participating. They are already quite advanced in interpreting the images from a drone to spot persons through smoke. With this information the drone can be redirected. The application can thus use the interpretations of the images to control the drone in flight.

Best regards

A5 Conversation with Firefighting department of Ghent Belgium

Subject: Re: Firefighting department Ghent - Thesis thermal drones


Hi Brecht

I don't know if you've received the previous email, but it contained the answers to your questions.

Best regards

Subject: Re: Firefighting department Ghent - Thesis thermal drones

With respect to the functionality I would like to add:

• Measuring the temperature of containers and silos.

I agree with the quality of the application. It could be handy to be able to view the application from one or more devices. Everything should have a clear overview. If possible, information and controls should be available on one screen.

I will follow up

Best regards


Appendix B

Thermal camera specifications

This appendix gives all the specifications for the compared thermal cameras. First, the different cameras, their producing companies and average retail prices are listed in Table B1. Second, their respective physical specifications are presented in Table B2. Third, the image qualities are presented in Table B3. Fourth, the thermal precisions are presented in Table B4. Fifth, the available interfaces to interact with each camera are presented in Table B5. Sixth, the energy consumption of each camera is presented in Table B6. Seventh, how support is offered when developing for these platforms is presented in Table B7. Finally, auxiliary features are presented in Table B8.


Product Company Price (Euro)

Wiris 2nd Gen 640 Workswell 999500

Wiris 2nd Gen 336 Workswell 699500

Duo Pro R 640 FLIR 640900

Duo Pro R 336 FLIR 438484

Duo FLIR 94999

Duo R FLIR 123999

Vue 640 FLIR 268900

Vue 336 FLIR 125993

Vue Pro 640 FLIR 403218

Vue Pro 336 FLIR 230261

Vue Pro R 640 FLIR 518456

Vue Pro R 336 FLIR 345599

Zenmuse XT 640 DJI x FLIR 1181000

Zenmuse XT 336 DJI x FLIR 697000

Zenmuse XT 336 R DJI x FLIR 939000

Zenmuse XT 640 R DJI x FLIR 1423000

One FLIR 23799

One Pro FLIR 46900

Tau 2 640 FLIR 674636

Tau 2 336 FLIR 493389

Tau 2 324 FLIR 2640

Lepton 3 160 x 120 FLIR 25995

Lepton 3 80 x 60 FLIR 14338

Boson 640 FLIR 122209

Boson 320 FLIR 93842

Quark 2 640 FLIR 33165

Quark 2 336 FLIR 33165

DroneThermal v3 Flytron 34115

Compact Seek Thermal 27500

CompactXR Seek Thermal 28646

Compact Pro Seek Thermal 59900

Therm-App Opgal 93731

Therm-App TH Opgal 295000

Therm-App 25 Hz Opgal 199000

Table B1 Compared cameras their producing companies and their average retail price


Product Weight (g) Dimensions (mm)

Wiris 2nd Gen 640 390 135 x 77 x 69

Wiris 2nd Gen 336 390 135 x 77 x 69

Duo Pro R 640 325 85 x 813 x 685

Duo Pro R 336 325 85 x 813 x 685

Duo 84 41 x 59 x 30

Duo R 84 41 x 59 x 30

Vue 640 114 574 x 4445 x 4445

Vue 336 114 574 x 4445 x 4445

Vue Pro 640 9214 574 x 4445 x 4445

Vue Pro 336 9214 574 x 4445 x 4445

Vue Pro R 640 9214 574 x 4445 x 4445

Vue Pro R 336 9214 574 x 4445 x 4445

Zenmuse XT 640 270 103 x 74 x 102

Zenmuse XT 336 270 103 x 74 x 102

Zenmuse XT 336 R 270 103 x 74 x 102

Zenmuse XT 640 R 270 103 x 74 x 102

One 345 67 x 34 x 14

One Pro 365 68 x 34 x 14

Tau 2 640 72 444 x 444 x 444

Tau 2 336 72 444 x 444 x 444

Tau 2 324 72 444 x 444 x 444

Lepton 3 160 x 120 09 118 x 127 x 72

Lepton 3 80 x 60 09 118 x 127 x 72

Boson 640 75 21 x 21 x 11

Boson 320 75 21 x 21 x 11

Quark 2 640 8 22 x 22 x 12

Quark 2 336 8 22 x 22 x 12

DroneThermal v3 3 20 x 20 x 15

Compact 1417 254 x 444 x 203

CompactXR 1417 254 x 444 x 254

Compact Pro 1417 254 x 444 x 254

Therm-App 138 55 x 65 x 40

Therm-App TH 123 55 x 65 x 40

Therm-App 25 Hz 138 55 x 65 x 40

Table B2 Physical specifications


Product IR Resolution (pixels) SD resolution (megapixels) Frequency (Hz) FOV Radiometry

Wiris 2nd Gen 640 640 x 512 192 not specified Various yes

Wiris 2nd Gen 336 336 x 256 192 not specified Various yes

Duo Pro R 640 640 x 512 12 30 Various lens yes

Duo Pro R 336 336 x 256 12 30 Various lens yes

Duo 160 x 120 2 75 and 83 57deg x 44deg no

Duo R 160 x 120 2 75 57deg x 44deg yes

Vue 640 640 x 512 0 75 Various lens no

Vue 336 336 x 256 0 75 Various lens no

Vue Pro 640 640 x 512 0 75 Various lens no

Vue Pro 336 336 x 256 0 75 Various lens no

Vue Pro R 640 640 x 512 0 75 Various lens yes

Vue Pro R 336 336 x 256 0 75 Various lens yes

Zenmuse XT 640 640 x 512 0 75 Various lens no

Zenmuse XT 336 336 x 256 0 75 Various lens no

Zenmuse XT 336 R 336 x 256 0 75 Various lens yes

Zenmuse XT 640 R 336 x 256 0 75 Various lens yes

One 80 x 60 15 87 50 deg x 38 deg yes

One Pro 160 x 120 15 87 55 deg x 43 deg yes

Tau 2 640 640 x 512 0 75 Various lens yes

Tau 2 336 336 x 256 0 75 Various lens yes

Tau 2 324 324 x 256 0 76 Various lens yes

Lepton 3 160 x 120 160 x 120 0 88 56 deg available

Lepton 3 80 x 60 80 x 60 0 88 56 deg no

Boson 640 640 x 512 0 90 Various lens no

Boson 320 320 x 256 0 90 Various lens no

Quark 2 640 640 x 512 0 9 Various lens no

Quark 2 336 336 x 256 0 9 Various lens no

DroneThermal v3 80 x 60 0 86 25 deg no

Compact 206 x 156 0 9 36 deg no

CompactXR 205 x 156 0 9 20 deg no

Compact Pro 320 x 240 0 15 32 deg no

Therm-App 384 x 288 0 87 Various lens no

Therm-App TH 384 x 288 0 87 Various lens yes

Therm-App 25 Hz 384 x 288 0 25 Various lens no

Table B3 Image quality

IR: InfraRed; SD: Standard; FOV: Field of View.


Product Sensitivity mK Temperature range (degrees Celsius) Accuracy (Celsius)

Wiris 2nd Gen 640 50 -25 to +150 -40 to + 550 2

Wiris 2nd Gen 336 50 -25 to +150 -40 to + 550 2

Duo Pro R 640 50 -25 to + 135 -40 to + 550 5 20

Duo Pro R 336 50 -25 to + 135 -40 to + 550 5 20

Duo not specified -40 to + 550 5

Duo R not specified -40 to + 550 5

Vue 640 not specified -58 to + 113 not specified

Vue 336 not specified -58 to + 113 not specified

Vue Pro 640 not specified -58 to + 113 not specified

Vue Pro 336 not specified -58 to + 113 not specified

Vue Pro R 640 not specified -58 to + 113 not specified

Vue Pro R 336 not specified -58 to + 113 not specified

Zenmuse XT 640 50 -40 to 550 not specified

Zenmuse XT 336 50 -40 to 550 not specified

Zenmuse XT 336 R 50 -40 to 550 not specified

Zenmuse XT 640 R 50 -40 to 550 not specified

One 150 -20 to 120 3

One Pro 150 -20 to 400 3

Tau 2 640 50 -40 to 550 not specified

Tau 2 336 50 -40 to 550 not specified

Tau 2 324 50 -40 to 550 not specified

Lepton 3 160 x 120 50 0 to 450 5

Lepton 3 80 x 60 50 0 to 450 5

Boson 640 40 0 to 500 not specified

Boson 320 40 0 to 500 not specified

Quark 2 640 50 -40 to 160 not specified

Quark 2 336 50 -40 to 160 not specified

DroneThermal v3 50 0 to 120 not specified

Compact not specified -40 to 330 not specified

CompactXR not specified -40 to 330 not specified

Compact Pro 70 -40 to +330 not specified

Therm-App 70 5 to + 90 3

Therm-App TH 70 0 to 200 2

Therm-App 25 Hz 70 5 to + 90 3

Table B4 Thermal precision


Product USB MAVLink HDMI

Wiris 2nd Gen 640 Flash disk yes yes

Wiris 2nd Gen 336 Flash disk yes yes

Duo Pro R 640 Mini-USB yes micro-HDMI

Duo Pro R 336 Mini-USB yes micro-HDMI

Duo Mini-USB yes micro-HDMI

Duo R Mini-USB yes micro-HDMI

Vue 640 Mini-USB No No

Vue 336 Mini-USB no no

Vue Pro 640 Mini-USB yes Optional

Vue Pro 336 Mini-USB yes Optional

Vue Pro R 640 Mini-USB yes Optional

Vue Pro R 336 Mini-USB yes Optional

Zenmuse XT 640 Via DJI drone Via DJI drone Via DJI drone

Zenmuse XT 336 Via DJI drone Via DJI drone Via DJI drone

Zenmuse XT 336 R Via DJI drone Via DJI drone Via DJI drone

Zenmuse XT 640 R Via DJI drone Via DJI drone Via DJI drone

Tau 2 640 No no no

Tau 2 336 No no no

Tau 2 324 No no no

Lepton 3 160 x 120 No no no

Lepton 3 80 x 60 No no no

Boson 640 Yes no no

Boson 320 Yes no no

Quark 2 640 no no no

Quark 2 336 no no no

DroneThermal v3 no no no

Compact Smartphone storage no no

CompactXR Smartphone storage no no

Compact Pro Smartphone storage no no

Therm-App Smartphone storage no no

Therm-App TH Smartphone storage no no

Therm-App 25 Hz Smartphone storage no no

Table B5 Interfaces


Product Power consumption (Watt) Input Voltage

Wiris 2nd Gen 640 4 6 - 36

Wiris 2nd Gen 336 4 6 - 36

Duo Pro R 640 10 50 - 260

Duo Pro R 336 10 50 - 260

Duo 22 50 - 260

Duo R 22 50 - 260

Vue 640 12 48 - 60

Vue 336 12 48 - 60

Vue Pro 640 21 48 - 60

Vue Pro 336 21 48 - 60

Vue Pro R 640 21 48 - 60

Vue Pro R 336 21 48 - 60

Zenmuse XT 640 Via DJI drone Via DJI drone

Zenmuse XT 336 Via DJI drone Via DJI drone

Zenmuse XT 336 R Via DJI drone Via DJI drone

Zenmuse XT 640 R Via DJI drone Via DJI drone

One approx 1h battery lifetime Battery

One Pro approx 1h battery lifetime Battery

Tau 2 640 13 40 - 60

Tau 2 336 13 40 - 61

Tau 2 324 13 40 - 62

Lepton 3 160 x 120 065 31

Lepton 3 80 x 60 065 31

Boson 640 05 33

Boson 320 05 33

Quark 2 640 12 33

Quark 2 336 12 33

DroneThermal v3 015 33 - 5

Compact Via smartphone Smartphone

CompactXR Via smartphone Smartphone

Compact Pro Via smartphone Smartphone

Therm-App 05 5

Therm-App TH 05 5

Therm-App 25 Hz 05 5

Table B6 Energy consumption


Product Warranty (years) User Manual Phone support Email support FAQs

Wiris 2nd Gen 640 Not specified Yes Yes Yes Yes

Wiris 2nd Gen 336 Not specified Yes Yes Yes Yes

Duo Pro R 640 1 Yes Yes Yes Yes

Duo Pro R 336 1 Yes Yes Yes Yes

Duo 1 yes Yes Yes Yes

Duo R 1 yes yes yes yes

Vue 640 1 yes yes yes yes

Vue 336 1 yes yes yes yes

Vue Pro 640 1 yes yes yes yes

Vue Pro 336 1 yes yes yes yes

Vue Pro R 640 1 yes yes yes yes

Vue Pro R 336 1 yes yes yes yes

Zenmuse XT 640 05 yes yes yes yes

Zenmuse XT 336 05 yes yes yes yes

Zenmuse XT 336 R 05 yes yes yes yes

Zenmuse XT 640 R 05 yes yes yes yes

One 1 yes yes yes yes

One Pro 1 yes yes yes yes

Tau 2 640 1 yes yes yes yes

Tau 2 336 1 yes yes yes yes

Tau 2 324 1 yes yes yes yes

Lepton 3 160 x 120 1 yes yes yes yes

Lepton 3 80 x 60 1 yes yes yes yes

Boson 640 1 yes yes yes yes

Boson 320 1 yes yes yes yes

Quark 2 640 1 yes yes yes yes

Quark 2 336 1 yes yes yes yes

DroneThermal v3 not specified no no no no

Compact 1 yes yes yes yes

CompactXR 1 yes yes yes yes

Compact Pro 1 yes yes yes yes

Therm-App 1 yes yes yes yes

Therm-App TH 1 yes yes yes yes

Therm-App 25 Hz 1 yes yes yes yes

Table B7 Help and support


Product Bluetooth Wi-Fi GPS Mobile app Storage

Wiris 2nd Gen 640 no on request Yes no yes

Wiris 2nd Gen 336 no on request yes no yes

Duo Pro R 640 yes no yes yes yes

Duo Pro R 336 yes no yes yes yes

Duo no no no no yes

Duo R no no no no yes

Vue 640 No no no no no

Vue 336 no no no no no

Vue Pro 640 yes no no yes yes

Vue Pro 336 yes no no yes yes

Vue Pro R 640 yes no no yes yes

Vue Pro R 336 yes no no yes yes

Zenmuse XT 640 Via DJI drone Via DJI drone Via DJI drone yes yes

Zenmuse XT 336 Via DJI drone Via DJI drone Via DJI drone yes yes

Zenmuse XT 336 R Via DJI drone Via DJI drone Via DJI drone yes yes

Zenmuse XT 640 R Via DJI drone Via DJI drone Via DJI drone yes yes

One no no no yes yes

One Pro no no no yes yes

Tau 2 640 no no no no yes

Tau 2 336 no no no no yes

Tau 2 324 no no no no yes

Lepton 3 160 x 120 no no no no no

Lepton 3 80 x 60 no no no no no

Boson 640 no no no no no

Boson 320 no no no no no

Quark 2 640 no no no no no

Quark 2 336 no no no no no

DroneThermal v3 no no no no no

Compact no no no yes yes

CompactXR no no no yes yes

Compact Pro no no no yes yes

Therm-App no no no yes yes

Therm-App TH no no no yes yes

Therm-App 25 Hz no no no yes yes

Table B8 Auxiliary features


Appendix C

Last Post thermal dataset summary

The goal of this appendix is to provide a summary of the layout of the Last Post thermal dataset. The data was captured on the following days: the 24th of March 2018 and the 2nd, 3rd, 4th, 5th, 9th, 10th, 11th and 12th of April 2018. For each date a short summary of the contents is given below, consisting of a description of the conditions that day and a listing of the video files and their contents.

C1 24th of March 2018

Conditions:

• Hours: 19:40 - 20:20
• Outside temperature range: 5 degrees Celsius - 12 degrees Celsius
• Clear
• Humidity: 76%
• Wind: 24 kilometers per hour
• Precipitation: 0 centimeters
• Visibility: 14 kilometers

Videos:

• flir_20180324T195255.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people. A large crowd gathers on the right of the video.


• flir_20180324T195836.mp4: This video gives an overview of the inside of the Meningate ceremony. Many people can be seen watching the ceremony.

• flir_20180324T200421.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people. A large crowd is visible on the right side.

• flir_20180324T201448.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people. A large crowd is visible on the right side. The crowd is leaving the area after the ceremony.

• flir_20180324T202328.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people. A large crowd is visible on the right side. The crowd is leaving the area after the ceremony.

C2 2nd of April 2018

Conditions:

• Hours: 19:40 - 20:20
• Outside temperature range: 9 degrees Celsius - 15 degrees Celsius
• Light rain
• Humidity: 74%
• Wind: 18 kilometers per hour
• Precipitation: 0.4 centimeters
• Visibility: 8.1 kilometers

Videos:

• 2018-04-02194733.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people, sometimes with umbrellas, passing through.

• 2018-04-02194952.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance.


• 2018-04-02195518.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance.

• 2018-04-02201322.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance. Crowds can be seen, as well as people holding umbrellas.

C3 3rd of April 2018

Conditions:

• Hours: 20:00 - 20:30
• Outside temperature range: 8 degrees Celsius - 16 degrees Celsius
• Heavy rain
• Humidity: 79%
• Wind: 25 kilometers per hour
• Precipitation: 0.5 centimeters
• Visibility: 10.1 kilometers

Videos:

• 2018-04-03 201227.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. People are holding umbrellas due to heavy rain on that day. Due to the rain and wind it was difficult to steady the camera, which can be seen in the shaky video.

• 2018-04-03 201727.mp4: In the beginning of the clip the camera is moving towards the other side of the Meningate. From 00:20 onwards the clip is useful. The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. The video shows people leaving from the Meningate towards the buses at the other side of the bridge. Most people are holding umbrellas due to the heavy rain that day. The Meningate is in the bottom left of the picture. Several buildings can be seen in the distance. In the bottom right the water of the Kasteelgracht can be seen. Sometimes in the left of the picture the wall of the Meningate can be seen.

• 2018-04-03 202311.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. People are holding umbrellas due to heavy rain on that day. Due to the rain and wind it was difficult to steady the camera, which can be seen in the shaky video.


C4 4th of April 2018

Conditions:

• Hours: 19:45 - 20:30
• Outside temperature range: 10 degrees Celsius - 14 degrees Celsius
• Cloudy
• Humidity: 87%
• Wind: 18 kilometers per hour
• Precipitation: 0 centimeters
• Visibility: 12.9 kilometers

Videos:

• 2018-04-04 200052.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. In the video a dense crowd of people slowly starts to form in the bottom right of the picture. Most people are coming from the left of the video to join the crowd.

• 2018-04-04 200728.mp4: This video shows the inside of the Meningate and the ceremony of the Last Post. Some people are up close in front. The large crowd can be seen through the hall.

• 2018-04-04 200913.mp4: This video shows the inside of the Meningate and the ceremony of the Last Post. The video switches between MSX mode, visual camera and thermal camera to show the differences.

• 2018-04-04 202859.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. At the start of the video a crowd is seen in the bottom right. At the 01:00 mark the ceremony has ended and people are exiting the gate and coming onto the crossing. They form two rows to make place for the marching band exiting the Meningate. It can be seen marching through the crowd at the 02:50 mark.

C5 5th of April 2018

Conditions:

• Hours: 19:45 - 20:30
• Outside temperature range: 11 degrees Celsius - 15 degrees Celsius


• Sunny
• Humidity: 77%
• Wind: 11 kilometers per hour
• Precipitation: 0 centimeters
• Visibility: 12.9 kilometers

Videos:

• 2018-04-05 200217.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. In the video a dense crowd of people slowly starts to form in the bottom right of the picture. Most people are coming from the left of the video to join the crowd. The video shows the 15 minutes before the start of the ceremony.

• 2018-04-04 201838.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. The video shows the first ten minutes after the end of the ceremony. The crowd, which can be seen on the left, leaves towards the square.

C6 9th of April 2018

Conditions:

• Hours: 19:45 - 20:30
• Outside temperature range: 9 degrees Celsius - 10 degrees Celsius
• Light rain
• Humidity: 99%
• Wind: 14 kilometers per hour
• Precipitation: 0 centimeters
• Visibility: 8.1 kilometers

Videos:

• 2018-04-09 200007.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. People are coming from the left towards the Meningate in the right. Not a lot of people are seen due to rain that day.


• 2018-04-09-202302.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. People are leaving from the right of the Meningate towards the square.

C7 10th of April 2018

Conditions:

• Hours: 19:45 - 20:30
• Outside temperature range: 14 degrees Celsius - 17 degrees Celsius
• Partly Cloudy
• Humidity: 52%
• Wind: 13 kilometers per hour
• Precipitation: 0 centimeters
• Visibility: 12.9 kilometers

Videos:

• 2018-04-10 195029.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. On the video an army unit can be seen on the left. It can be seen that they are standing in a very structured way.

• 2018-04-10 195131.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. On the video an army unit can be seen on the left. It can be seen that they are standing in a very structured way.

• 2018-04-10 195748.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. On the video an army unit can be seen on the left. It can be seen that they are standing in a very structured way. Some people are moving around the crowd.

• 2018-04-10 200122.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. There is a big crowd that can be seen on the right. There are some schools there, so some people are wearing backpacks. It is quite warm and the cafe on the other side of the street has opened up its terrace.

• 2018-04-10 201427.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. On the video an army unit can be seen on the left. It can be seen


that they are standing in a very structured way. Some people are moving around the crowd. The image is not rotated well; a well-rotated image is found in 2018-04-10 201427_rotated.mp4.

• 2018-04-10 201515.mp4: This video shows the inside of the Meningate and the ceremony. A traditional 'Haka' from New Zealand soldiers can be heard in the video; the soldiers are difficult to spot due to thermal blurring, because many people are standing in one place.

• 2018-04-10 202558.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. On the video an army unit can be seen on the left. It can be seen that they are standing in a very structured way. Some people are moving around the crowd. At the 02:00 mark the army unit marches to the end of the bridge. Very dense crowds can be seen afterwards. At 08:25 the army unit marches in a straight line towards the Meningate.

C8 11th of April 2018

Conditions:

• Hours: 19:45 - 20:30
• Outside temperature range: 12 degrees Celsius - 16 degrees Celsius
• Sunny
• Humidity: 63%
• Wind: 14 kilometers per hour
• Precipitation: 0 centimeters
• Visibility: 12.9 kilometers

Videos:

• 2018-04-11 200140.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen.

• 2018-04-11 200601.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. A small crowd can be seen on the left of the video.

• 2018-04-11 201554.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. People start leaving the ceremony from the 01:20 mark.


C9 12th of April 2018

Conditions:

• Hours: 19:45 - 20:30
• Outside temperature range: 11 degrees Celsius - 14 degrees Celsius
• Rain
• Humidity: 94%
• Wind: 8 kilometers per hour
• Precipitation: 0.1 centimeters
• Visibility: 3.2 kilometers

Videos:

• 2018-04-12 195219.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. Sometimes the hedge of the wall where the film was made is visible due to the shaky camera. Not many people are seen due to the rain.

• 2018-04-12 201526.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. Sometimes the hedge of the wall where the film was made is visible due to the shaky camera. Not many people are seen due to the rain. People are leaving towards the right.

Contents

• Introduction
  • Drones
  • Concepts
    • Thermal Cameras
    • Aerial thermal imaging
  • Problem statement
    • Industry adoption
    • Crowd monitoring
  • Goal
  • Related work
  • Outline
• System Design
  • Requirements analysis
    • Functional requirements
    • Non-functional requirements
  • Patterns and tactics
    • Layers
    • Event-driven architecture
    • Microkernel
    • Microservices
    • Comparison of patterns
  • Software architecture
    • Static view
    • Dynamic views
    • Deployment views
• State of the art and technology choice
  • Thermal camera options
    • Parameters
    • Comparative analysis
  • Microservices frameworks
    • Flask
    • Falcon
    • Nameko
    • Vert.x
    • Spring Boot
  • Deployment framework
    • Containers
    • LXC
    • Docker
    • rkt
  • Object detection algorithms and frameworks
    • Traditional approaches
    • Deep learning
    • Frameworks
  • Technology choice
    • Thermal camera
    • Microservices framework
    • Deployment framework
    • Object detection
• Proof of Concept implementation
  • Goals and scope of prototype
  • Overview of prototype
    • General overview
    • Client interface
    • Stream
    • Producer and Consumer
    • Implemented plugins
  • Limitations and issues
    • Single client
    • Timeouts
    • Exception handling and testing
    • Docker security issues
    • Docker bridge network
    • Single stream
    • Number of containers per plugin
• Mob detection experiment
  • Last Post thermal dataset
    • Last Post ceremony
    • Dataset description
  • Object detection experiment
    • Preprocessing
    • Training
• Results and evaluation
  • Framework results
    • Performance evaluation
    • Interoperability evaluation
    • Modifiability evaluation
  • Mob detection experiment results
    • Training results
    • Metrics
    • Validation results
• Conclusion and future work
  • Conclusion
  • Future work
    • Security
    • Implementing a detection plugin
    • Different deployment configurations
    • Multiple streams with different layouts
    • Implementing the plugin distribution service (Remote Producer/Consumer)
    • Using high performance microservices backbone frameworks
    • New object detection models and datasets specifically for thermal images
• Appendix A: Firefighting department email conversations
  • General email sent to Firefighting departments
  • Conversation with Firefighting department of Antwerp, Belgium
  • Conversation with Firefighting department of Ostend, Belgium
  • Conversation with Firefighting department of Courtrai, Belgium
  • Conversation with Firefighting department of Ghent, Belgium
• Appendix B: Thermal camera specifications
• Appendix C: Last Post thermal dataset summary
  • 24th of March 2018
  • 2nd of April 2018
  • 3rd of April 2018
  • 4th of April 2018
  • 5th of April 2018
  • 9th of April 2018
  • 10th of April 2018
  • 11th of April 2018
  • 12th of April 2018


Modifiable drone thermal imaging analysis framework for mob detection during

open-air events

Brecht Verhoeve

Supervisors Prof dr Bruno Volckaert Prof dr ir Filip De Turck

Counsellors Pieter-Jan Maenhaut Jerico Moeyersons

Master's dissertation submitted in order to obtain the academic degree of

Master of Science in Computer Science Engineering

Department of Information Technology

Chair Prof dr ir Bart Dhoedt

Faculty of Engineering and Architecture

Ghent University

Academic year 2017-2018

Abstract

Drones and thermal cameras are used in combination for many applications such as search and rescue, fire fighting, etc. Due to vendor-specific hardware and software, applications are hard to develop and maintain. Therefore, a modifiable drone thermal imaging analysis framework is proposed that enables users to more easily develop such image processing applications. It implements a microservice plugin architecture. Users can build image processing applications with the framework by building media streams using plugins that are either thermal cameras or image analysis software modules. The framework is evaluated by building a proof of concept implementation, which is tested against the initial requirements. It achieves the modifiability and interoperability requirements at the cost of performance and security. The framework is applied for detecting large crowds of people (mobs) during open-air events. A new dataset containing thermal images of such mobs is presented, on which a YOLOv3 neural network is trained. The trained model is able to detect mobs on new thermal images in real-time, achieving frame rates of 55 frames per second when deployed on a modern GPU.

Keywords: Drone thermal imaging, Video streaming, Framework, Microservices, Object detection, Plugin

Modifiable Drone Thermal Imaging Analysis Framework for Mob Detection during Open-Air Events

Brecht Verhoeve

Supervisors: prof. dr. Bruno Volckaert, prof. dr. ir. Filip De Turck, Pieter-Jan Maenhaut, Jerico Moeyersons

Abstract: Drones and thermal cameras are used in combination for many applications such as search and rescue, fire fighting, etc. Due to vendor-specific hardware and software, applications are hard to develop and maintain. Therefore, a modifiable drone thermal imaging analysis framework is proposed that enables users to more easily develop such image processing applications. It implements a microservice plugin architecture. Users can build image processing applications with the framework by building media streams using plugins that are either thermal cameras or image analysis software modules. The framework is evaluated by building a proof of concept implementation, which is tested against the initial requirements. It achieves the modifiability and interoperability requirements at the cost of performance and security. The framework is applied for detecting large crowds of people (mobs) during open-air events. A new dataset containing thermal images of such mobs is presented, on which a YOLOv3 neural network is trained. The trained model is able to detect mobs on new thermal images in real-time, achieving frame rates of 55 frames per second when deployed on a modern GPU.

Keywords: Drone thermal imaging, Video streaming, Framework, Microservices, Object detection, Plugin

I INTRODUCTION

Throughout history, having an overview of the environment from high viewpoints held many benefits. The advent of drones and advanced cameras provides low-cost aerial imaging that creates numerous opportunities for new applications. Traditional visual cameras for the visible light spectrum offer high quality images, but are limited to daytime or artificially lighted scenes. Thermal cameras measure the thermal radiation of objects in a scene and thus can operate in utter darkness, revealing information not visible to the normal eye [1]. The combination of drones and thermal cameras is used in many different applications such as geography [2, 3], agriculture [4], search and rescue [5], wildlife monitoring [6], disaster response [7], maintenance [8], etc.

Several vendors offer thermal camera products, some specifically designed for drone platforms. These cameras often use different image formats, color schemes and interfaces [1, 9-11]. This leads to issues if applications want to change the camera that is used, or when the camera is no longer supported by the vendor, because different software needs to be built to interact with the new camera, which often is a non-negligible cost. This leads to a problem called vendor lock-in that makes customers dependent on a certain vendor, as they cannot switch product without making substantial costs, a problem already very tangible for cloud-based applications [12]. Applications across various fields often have slightly different functional and non-functional requirements. For this dissertation several Belgian fire fighting departments were asked for requirements for a thermal drone platform application. It quickly became clear that they had various problems that needed to be solved, such as finding hot explosives, measuring temperatures in containers, identifying hot entrances, detecting invisible methane fires, finding missing persons, etc. Some use cases need to be evaluated in real-time (during fires), others need to be extremely accurate. A hypothetical application should be able to quickly implement new detection and analysis features to meet all these requirements. Due to the current solutions not being modifiable enough, current applications built with aerial thermal imaging technology remain in the niche use case for which they were initially developed [13]. Applications could benefit from a backbone framework to aid in this modifiability/interoperability issue, aiding in developing end-to-end solutions connecting thermal cameras to various analysis/detection modules.

This dissertation explores the requirements for such a framework and its potential software architecture. To test the viability of the architecture, a proof of concept prototype is implemented and evaluated against the initial requirements. To verify if it aids in developing detection applications, the specific use case of detecting large crowds of people, so-called mobs, during open-air events is investigated. Monitoring crowds during open-air events is important, as mobs can create potentially dangerous situations through bottlenecks, blocking escape routes, etc. Through monitoring and detecting these mobs, these situations can be avoided before they become problematic [14, 15].

The remainder of this paper is organized as follows. Section II presents similar projects on the topic of modifiable imaging analysis frameworks and thermal object detection. Section III presents the requirements of the framework and the software architecture designed from these requirements. Section IV presents the implementation of the framework prototype. The mob detection experiment is described in Section V. The tests and results to evaluate the framework and the mob detection experiment are presented in Section VI. Finally, Section VII draws conclusions from this research and indicates where future efforts in this field should go.

II RELATED WORK

The Irish start-up DroneSAR [16] developed a search-and-rescue (SAR) drone platform allowing users to stream live images and video from a drone as it conducts a search for missing persons. The platform works with any camera, visual and thermal, but focuses on drones from the vendor DJI. DroneSAR's industry partner Amazon introduced the Amazon Kinesis Video Streams platform in January 2018 as a new service for the Amazon Web Services (AWS) cloud platform. It allows users to stream live video from devices to the AWS cloud and build applications for real-time video processing [17]. The VIPER project by EAVISE, KU Leuven researched how thermal and visual video images could be used for real-time detection of persons, using object detection algorithms such as deep learning [18]. The framework presented in this work combines elements from all three of these examples.

III REQUIREMENTS AND SOFTWARE ARCHITECTURE

A Functional requirements

Three general actors are identified for the framework: an end-user that wants to build an image processing application for a specific use case, camera developers integrating new cameras into the framework, and analysis/detection module developers integrating new analysis/detection algorithms into the framework so that end-users can use them to build their applications. An end-user should be able to construct an image processing application using the integrated cameras and analysis modules. He should be able to adapt this application with the framework for new use cases. Camera and analysis module developers should be able to add and distribute new integrations for new cameras and analysis modules to the framework. This allows the end-users to focus on the use case, not the technical details of the hardware platforms or algorithms, and to have a wider selection of hardware and algorithms.

B Non-functional requirements

Interoperability, modifiability and performance are identified as the architecturally significant requirements. Interoperability specifies that the framework should be able to interact with various cameras and analysis software modules via interfaces. The number of systems the framework can successfully interact with adds to the business value of the framework, as end-users can use more devices via the framework to build applications. The framework needs to be extendable with new thermal cameras and analysis modules, and applications built with the framework should be modifiable to integrate new hardware and software. The available hardware on-site for use cases such as forest fire monitoring is not always powerful enough to support heavy image analysis software; the framework should therefore be able to deploy in a distributed fashion, allowing more computationally expensive operations to be executed on more powerful remote devices. Some use cases require real-time streaming of video and manipulation of these video streams, which should be supported for the framework to be relevant.

C Software architecture

An architectural pattern analysis based on the requirements presented in Section III-B was conducted, from which a combination of the microservices and microkernel patterns was selected as the base pattern for the software architecture. The microkernel pattern enables the framework to be extended via a plugin system. The microservices pattern enables interoperability through the use of standard interfaces and a communication scheme. It also allows the framework to be deployed in a distributed fashion [19-21]. The software architecture is documented in static views, sequence diagrams and deployment views. Figure 1 presents an overview of the architecture.

Fig. 1. Component-connector overview of the framework. The clear components are the core components of the framework that each user needs to install to use the framework. The colored components are used for the distribution of plugins.

End-users interact with the framework via the Client Interface, a graphical or textual interface. Commands are forwarded to the Stream module that manages the layout and state of the components in the media pipeline (stream) that the end-user builds for his use case. To activate and place the components in a certain layout, the Stream module connects with the Producer and Consumer components that control the plugins that process the media. Producer Plugins are devices that produce media, such as thermal cameras. Consumer Plugins process and consume media, such as analysis software and displays. Once a stream is established, the plugins forward media to each other in the layout specified by the Stream module. New support for cameras and analysis modules can be added as plugins to the Producer/Consumer Distribution components that distribute this software, so that end-users can download and install the plugins. Each module in the architecture is a microservice, allowing for a distributed deployment and enabling the modifiability requirements through interchangeable modules. Cameras and analysis modules are realized as plugins for the Producer/Consumer modules, implemented as a microkernel. This allows the framework to easily build new streams for different use cases and interchange components when needed.
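To make this control flow concrete, the sketch below shows how a client could drive such a stream through REST calls. This is an illustration only: the endpoint paths, payload fields and port are assumptions made for the example, not the framework's documented API.

```python
import requests

STREAM_API = "http://localhost:5000"  # assumed address of the Stream microservice

# Hypothetical calls: register the plugins taking part in the stream.
filecam = requests.post(f"{STREAM_API}/plugins", json={"name": "filecam"}).json()
display = requests.post(f"{STREAM_API}/plugins", json={"name": "display"}).json()

# Link producer to consumer: the Display plugin takes the Filecam plugin as source.
requests.put(f"{STREAM_API}/plugins/{display['id']}/sources",
             json={"sources": [filecam["id"]]})

# Start the media flow by moving both plugins to the PLAY state.
for plugin in (filecam, display):
    requests.put(f"{STREAM_API}/plugins/{plugin['id']}/state",
                 json={"state": "PLAY"})
```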

C1 Plugin model

Figure 2 depicts the model of a general framework plugin. The plugin defines three interfaces: a source media endpoint to receive media from different sources, a listener endpoint to forward the processed media to other listening plugins, and an API for framework control. The framework uses the API to change which sources and listeners a plugin has, and its state. By linking plugins together, by setting the sources and listeners resources, the framework can build a media processing stream. Producer Plugins have no sources, since they produce media. The states are used to stop and start the media processing of the plugins in the stream. The REST paradigm is selected to build this API, with state, sources and listeners resources that need to be minimally implemented. A plugin has the following states: INACTIVE, PLAY, STOP and PAUSE. Figure 3 depicts the state transition diagram. A plugin implements the visible states STOP, PAUSE and PLAY, describing if the media process of the plugin is stopped, paused or processing, respectively. The INACTIVE state is only visible to the framework, as it indicates that there is no active process that runs the plugin. This is the initial state of a plugin in the framework. When a plugin is added to a stream, the plugin microservice is started, transitions to the STOP state and waits for commands.

Fig. 2. Schematic overview of a plugin.

Fig. 3. State transition diagram of a plugin.
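Since the prototype builds its REST APIs with Flask, a plugin's control API could be sketched as below. The routes mirror the state, sources and listeners resources described above; the exact paths, payload shapes and allowed state transitions are illustrative assumptions, not the framework's definitive interface.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# In-memory control state; a real plugin would drive its media process from here.
plugin = {"state": "STOP", "sources": [], "listeners": []}

# Assumed transitions between the visible states; INACTIVE is handled by the
# framework itself, which starts and removes the plugin's container.
TRANSITIONS = {"STOP": {"PLAY"}, "PLAY": {"PAUSE", "STOP"}, "PAUSE": {"PLAY", "STOP"}}

@app.route("/state", methods=["GET", "PUT"])
def state():
    if request.method == "PUT":
        new = request.get_json()["state"]
        if new not in TRANSITIONS[plugin["state"]]:
            return jsonify(error="illegal transition"), 409
        plugin["state"] = new
    return jsonify(state=plugin["state"])

@app.route("/sources", methods=["GET", "PUT"])
def sources():
    if request.method == "PUT":
        plugin["sources"] = request.get_json()["sources"]
    return jsonify(sources=plugin["sources"])

@app.route("/listeners", methods=["GET", "PUT"])
def listeners():
    if request.method == "PUT":
        plugin["listeners"] = request.get_json()["listeners"]
    return jsonify(listeners=plugin["listeners"])

if __name__ == "__main__":
    app.run(port=5001)
```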

C2 Network topology and communication protocol

The microservices of the framework and the plugins need a communication protocol to exchange commands and video. For sending the commands the HTTP/TCP protocol is used, a synchronous protocol that blocks on the response of a request. This ensures that commands are always received and acknowledged, increasing the reliability of the communication at the cost of performance [22]. The asynchronous RTP/UDP protocol is selected to transfer media between the plugins, ensuring low-latency video transfer to enable real-time video streams. The recommended codec for transmitting video media is MJPEG, which transmits video frames as separately encoded JPEG images [23]. Other codecs such as MPEG-4 encode only certain frames as keyframes and the other frames as B-frames that encode differences from the keyframe [24]. This implies that, when receiving images from a stream, a keyframe must first be received before the video can be decoded. Using MJPEG, plugins receiving frames can directly perform analysis on each keyframe and do not have to wait for additional frames. An example of this network topology for a stream with 2 Producer plugins and 3 Consumer plugins is depicted in Figure 4.

Fig. 4. Network topology. The full lines represent HTTP/TCP communications, the dashed lines RTP/UDP communications.
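As an illustration of this media path, the following sketch uses GStreamer (the streaming framework used by the prototype, see Section IV) to send MJPEG over RTP/UDP and receive it again. The pipelines are assembled from standard GStreamer elements; the file name, port and exact pipeline layout are assumptions made for the example.

```python
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst, GLib

Gst.init(None)

# Producer side: decode a file, re-encode each frame as JPEG and payload it as RTP.
send = Gst.parse_launch(
    "filesrc location=thermal.mp4 ! decodebin ! videoconvert "
    "! jpegenc ! rtpjpegpay ! udpsink host=127.0.0.1 port=5004"
)

# Consumer side: receive RTP/JPEG, decode and show every frame on the display.
recv = Gst.parse_launch(
    'udpsrc port=5004 caps="application/x-rtp,encoding-name=JPEG,payload=26" '
    "! rtpjpegdepay ! jpegdec ! videoconvert ! autovideosink"
)

send.set_state(Gst.State.PLAYING)
recv.set_state(Gst.State.PLAYING)
GLib.MainLoop().run()
```

Because every transmitted frame is an independently decodable JPEG, the receiving side can analyse frames immediately, which is exactly the property motivating the MJPEG choice above.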

IV PROTOTYPE IMPLEMENTATION

The goal of the prototype implementation is to provide a proof of the concept architecture and meet the requirements presented in Section III. The core framework components are implemented; the distribution components are left out of scope, as they focus primarily on the distribution of supported plugins. The core microservices as well as the plugins are implemented using the Docker software containerization framework [25]. Containers virtualize on the operating system and allow for portable, lightweight software environments for processes, with a minor performance overhead. Using this technology, the core modules and plugins can be deployed in a local and distributed fashion and be set up and shut down easily [26]. The microservice containers communicate via the protocols presented in Section III-C2. The REST APIs are built with the Flask framework [27], a lightweight Python web development framework ideal for prototyping. The Producer/Consumer components need access to the Docker daemon running on the Docker host in order to spin up and shut down Docker containers containing the plugin processes. This is achieved by mounting the Docker client socket in the Producer/Consumer containers. This gives the container root access to the host, a significant security threat [28, 29]. Two sample plugins were implemented: Filecam, a plugin that produces video read in from a file, and Display, a plugin that forwards media to the display of the local device. The plugins transmit media using the video streaming framework GStreamer [30].
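The container control described above can be sketched with the Docker SDK for Python. The image name is a placeholder; the essential part is the socket mount, which is also exactly what creates the security risk noted above.

```python
import docker

# Connect to the Docker daemon via the client socket, which the
# Producer/Consumer containers have mounted at /var/run/docker.sock.
client = docker.from_env()

# Spin up a (hypothetical) plugin container. Passing the host socket through to
# the new container, as done here, grants it effective root access to the host.
container = client.containers.run(
    "framework/filecam-plugin:latest",  # placeholder image name
    detach=True,
    volumes={"/var/run/docker.sock": {"bind": "/var/run/docker.sock", "mode": "rw"}},
)

# STOP -> INACTIVE: removing the plugin means stopping and removing its container.
container.stop()
container.remove()
```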

V MOB DETECTION

A Dataset

Several publicly available datasets with thermal images exist [31-34]. None of these include large crowds of people, so a new dataset called the Last Post dataset was created. It consists of thermal video captured at the Last Post ceremony in Ypres, Belgium [35]. The videos were captured using the Flir One Pro thermal camera for Android [36], using the Iron color scheme. Two main scenes are present in the dataset, depicted in Figure 5. Mobs are present in the thermal images but not in the visual images, due to the images being made on separate days. The images used for the experiment were manually annotated, outliers were removed and the dataset was randomly split in a training and validation set.

Fig. 5. Last Post dataset main scenes: (a) thermal view of the square, (b) visual view of the square, (c) thermal view of the bridge, (d) visual view of the bridge.

B Model

Detecting and classifying objects of interest in images is known as the object detection problem in machine learning [37]. Several object detection algorithms and frameworks have been implemented in the past years. A distinction is made between traditional models [31, 38-40], deep learning two-stage networks [41-46] and deep learning dense networks [47-49]. The traditional and two-stage methods make predictions relatively slowly (order of seconds on GPU) when compared to the dense networks (order of milliseconds on GPU) [47]. Since the goal is to use the framework in real-time use cases, the latter is preferred. The YOLOv3 model is selected, as it achieves state of the art prediction performance, can make real-time predictions and is available via the open source neural network framework darknet [50, 51]. The model is pre-trained on the ImageNet dataset [52]. The model is trained on an NVIDIA GeForce GTX 980 GPU and optimizes the SSE loss using batch gradient descent [50]. To select the best weights, the average Intersection over Union (IoU) and mean Average Precision (mAP) [53] are calculated on predictions on the validation set. The weights that achieve the highest mAP are selected as the final weights.
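For reference, the IoU of a predicted and a ground-truth box can be computed as in the short sketch below (boxes as (x1, y1, x2, y2) corner coordinates; this follows the standard definition and is not darknet's internal implementation):

def iou(box_a, box_b):
    # Intersection rectangle of two axis-aligned boxes
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # ~0.143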

VI RESULTS

A Framework

To evaluate the framework, acceptance tests for the requirements from Section III were conducted. Common framework operations such as manipulating and building a stream have an average execution time of 0.84 seconds with a standard deviation of 0.37 seconds. Less common operations such as deactivating a plugin, starting up the framework and shutting down the framework have an average execution time of 3.58, 8.40 and 24.02 seconds respectively, with standard deviations of 4.67, 0.50 and 0.48 respectively. Deactivating plugins (STOP to INACTIVE transitions) takes a long time, as the container running the process needs to be removed. Real-time streaming could not be tested due to the GStreamer framework having no readily available testing endpoints. However, when streaming and displaying a video with the framework, human users could not differentiate between a streaming video and a video played using a native media player, making it plausible that the framework streams in real-time. Great care must be taken when building plugins, as their processing speed has a direct impact on the real-time streaming performance. Interoperability is achieved with the REST APIs and plugin model presented in Section III-C. The interoperability is tested by having the framework exchange information with a mock plugin implementing the specified interface and counting the number of correct exchanges. The average successful exchange ratio is 99.998%. The framework can install and detect new plugins at runtime, achieving runtime modifiability at plugin level. Different deployment schemes were not tested for the prototype.
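Such a test can be sketched as follows, assuming a hypothetical mock plugin that serves the specified REST interface on localhost; endpoint names are illustrative:

import requests

BASE = "http://localhost:5000"  # mock plugin implementing the plugin interface

def exchange_success_ratio(n=10000):
    # Send n state commands and count the exchanges the plugin handled correctly.
    ok = 0
    for i in range(n):
        target = "PLAY" if i % 2 == 0 else "STOP"
        try:
            r = requests.put(f"{BASE}/state/{target}", timeout=2)
            ok += r.status_code == 200
        except requests.RequestException:
            pass  # a failed exchange counts against the success ratio
    return ok / n

print(f"successful exchange ratio: {exchange_success_ratio():.3%}")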

B Mob detection

The weights generated at the 15700th training epoch achieved the highest mAP value, 90.52%, on the validation set. For comparison, performances of other models on benchmark datasets achieve an average mAP of 74.8% [54]. The reason the model achieves such high values is that the validation set is temporally correlated with the training set, as both sets are extracted from videos in which frames have a temporal correlation. Performance when predicting on new datasets will be worse. Figure 6 depicts some predictions of the model. When predicting on a video, the model generated predictions at an average frame rate of 55 frames per second on a GPU.

Fig. 6. Model predictions on the validation set

VII CONCLUSION AND FUTURE WORK

In this dissertation a modifiable drone thermal imaging analysis framework is proposed to allow end-users to build flexible video processing pipelines using different thermal cameras and analysis modules. The framework implements a microservice container plugin architecture. Users can build image processing applications with the framework by building media streams using plugins that are either thermal cameras or image analysis software modules. The framework is evaluated by building a proof of concept implementation, which is tested on the initial requirements. The proposed framework achieves the modifiability and interoperability requirements at the cost of performance and security. The framework is applied for detecting large crowds of people (mobs) during open-air events. A new dataset containing thermal images of such mobs is presented, on which a YOLOv3 neural network is trained. The trained model is able to detect mobs on new thermal images in real-time, achieving frame rates of 55 frames per second when deployed on a modern GPU. Some extensions to this research are deploying a detection model using the framework, testing the other deployment configurations, testing the framework with end-users in practice and building new object detection models specifically for thermal images.

REFERENCES

[1] R. Gade and T. B. Moeslund, "Thermal cameras and applications: a survey," Machine Vision and Applications, vol. 25, pp. 245-262, 2014.
[2] M. C. Harvey, J. V. Rowland, and K. M. Luketina, "Drone with thermal infrared camera provides high resolution georeferenced imagery of the Waikite geothermal area, New Zealand," 2016.
[3] S. Amici, M. Turci, S. Giammanco, L. Spampinato, and F. Giulietti, "UAV thermal infrared remote sensing of an Italian mud volcano," vol. 2, pp. 358-364, 2013.
[4] J. Bendig, A. Bolten, and G. Bareth, "Introducing a low-cost mini-UAV for thermal- and multispectral-imaging," 2012.
[5] A. J. Rivera, A. D. Villalobos, J. C. Monje, J. A. Marinas, and C. M. Oppus, "Post-disaster rescue facility: Human detection and geolocation using aerial drones," IEEE Region 10 Annual International Conference Proceedings/TENCON, pp. 384-386, 2017.
[6] P. Christiansen, K. A. Steen, R. N. Jørgensen, and H. Karstoft, "Automated detection and recognition of wildlife using thermal cameras," Sensors (Basel, Switzerland), vol. 14, pp. 13778-13793, Jul. 2014.
[7] S. Chowdhury, A. Emelogu, M. Marufuzzaman, S. G. Nurre, and L. Bian, "Drones for disaster response and relief operations: A continuous approximation model," 2017.
[8] Workswell, "Pipeline inspection with thermal diagnostics," 2016.
[9] DJI, "Zenmuse H3-2D."
[10] Workswell, "Applications of WIRIS - Thermal vision system for drones."
[11] Therm-App, "Therm-App - Android apps on Google Play," 2018.
[12] B. Satzger, W. Hummer, C. Inzinger, P. Leitner, and S. Dustdar, "Winds of change: From vendor lock-in to the meta cloud," IEEE Internet Computing, vol. 17, no. 1, pp. 69-73, 2013.
[13] J. Divya, "Drone technology and usage: Current uses and future drone technology," 2017.
[14] B. Steffen and A. Seyfried, "Methods for measuring pedestrian density, flow, speed and direction with minimal scatter," Physica A: Statistical Mechanics and its Applications, vol. 389, pp. 1902-1910, May 2010.
[15] M. Wirz, T. Franke, D. Roggen, E. Mitleton-Kelly, P. Lukowicz, and G. Tröster, "Inferring crowd conditions from pedestrians' location traces for real-time crowd monitoring during city-scale mass gatherings," Proceedings of the Workshop on Enabling Technologies: Infrastructure for Collaborative Enterprises, WETICE, pp. 367-372, 2012.
[16] L.-L. Slattery, "DroneSAR wants to turn drones into search-and-rescue heroes," 2017.
[17] Amazon Web Services, Inc., "What is Amazon Kinesis Video Streams?," 2018.
[18] T. Goedemé, "Projectresultaten VLAIO TETRA-project," tech. rep., KU Leuven, Louvain, 2017.
[19] L. Bass, P. Clements, and R. Kazman, Software Architecture in Practice. Addison-Wesley Professional, 3rd ed., 2012.
[20] M. Richards, Software Architecture Patterns. O'Reilly Media, 1st ed., 2015.
[21] C. Richardson, "Microservice architecture pattern," 2017.
[22] C. de la Torre, C. Maddock, J. Hampton, P. Kulikov, and M. Jones, "Communication in a microservice architecture," 2017.
[23] On-Net Surveillance Systems, Inc., "MJPEG vs. MPEG4: Understanding the differences, advantages and disadvantages of each compression technique," 2006.
[24] D. Bull, Communicating Pictures: A Course in Image and Video Coding. Elsevier Science, 2014.
[25] Docker Inc., "Docker - Build, ship and run any app, anywhere," 2018.
[26] D. Merkel, "Docker: Lightweight Linux containers for consistent development and deployment," 2014.
[27] A. Ronacher, "Welcome to Flask - Flask documentation (0.12)," 2017.
[28] lvh, "Don't expose the Docker socket (not even to a container)," 2015.
[29] R. Yasrab, "Mitigating Docker security issues," tech. rep., University of Science and Technology of China, Hefei.
[30] GStreamer, "GStreamer: open source multimedia framework," 2018.
[31] J. W. Davis and M. A. Keck, "A two-stage template approach to person detection in thermal imagery," Proc. Workshop on Applications of Computer Vision, 2005.
[32] S. Hwang, J. Park, N. Kim, Y. Choi, and I. S. Kweon, "Multispectral pedestrian detection: Benchmark dataset and baseline," CVPR, 2015.
[33] Z. Wu, N. Fuller, D. Theriault, and M. Betke, "A thermal infrared video benchmark for visual analysis," IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2014.
[34] S. Z. Li, R. Chu, S. Liao, and L. Zhang, "Illumination invariant face recognition using near-infrared images," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 4, pp. 627-639, 2007.
[35] Last Post Association, "Mission," 2018.
[36] FLIR, "FLIR One Pro."
[37] E. Alpaydin, Introduction to Machine Learning. MIT Press, 3rd ed., 2014.
[38] F. Xu, X. Liu, and K. Fujimura, "Pedestrian detection and tracking with night vision," IEEE Transactions on Intelligent Transportation Systems, vol. 6, no. 1, pp. 63-71, 2005.
[39] H. Nanda and L. Davis, "Probabilistic template based pedestrian detection in infrared videos," IEEE Intelligent Vehicles Symposium Proceedings, vol. 1, pp. 15-20, 2003.
[40] R. Appel, S. Belongie, P. Perona, and P. Dollár, "Fast feature pyramids for object detection," PAMI, vol. 36, no. 8, pp. 1-14, 2014.
[41] J. R. R. Uijlings, K. E. A. van de Sande, T. Gevers, and A. W. M. Smeulders, "Selective search for object recognition," tech. rep., 2012.
[42] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Region-based convolutional networks for accurate object detection and segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 1, pp. 142-158, 2014.
[43] R. Girshick, "Fast R-CNN," in Proceedings of the IEEE International Conference on Computer Vision, pp. 1440-1448, 2015.
[44] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards real-time object detection with region proposal networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137-1149, 2016.
[45] K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask R-CNN," arXiv, 2018.
[46] J. Dai, Y. Li, K. He, and J. Sun, "R-FCN: Object detection via region-based fully convolutional networks," tech. rep., 2016.
[47] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified, real-time object detection," 2015.
[48] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, "SSD: Single shot multibox detector," arXiv, 2016.
[49] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, "Focal loss for dense object detection," arXiv, 2018.
[50] J. Redmon and A. Farhadi, "YOLOv3: An incremental improvement," arXiv, 2018.
[51] J. Redmon, "Darknet: Open source neural networks in C," http://pjreddie.com/darknet/, 2013-2016.
[52] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A large-scale hierarchical image database," in CVPR09, 2009.
[53] M. Everingham, S. M. A. Eslami, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, "The Pascal Visual Object Classes challenge: A retrospective," International Journal of Computer Vision, vol. 111, no. 1, pp. 98-136, 2014.
[54] A. Ouaknine, "Review of deep learning algorithms for object detection," 2018.


Contents

1 Introduction
1.1 Drones
1.2 Concepts
1.2.1 Thermal Cameras
1.2.2 Aerial thermal imaging
1.3 Problem statement
1.3.1 Industry adoption
1.3.2 Crowd monitoring
1.3.3 Goal
1.3.4 Related work
1.4 Outline

2 System Design
2.1 Requirements analysis
2.1.1 Functional requirements
2.1.2 Non-functional requirements
2.2 Patterns and tactics
2.2.1 Layers
2.2.2 Event-driven architecture
2.2.3 Microkernel
2.2.4 Microservices
2.2.5 Comparison of patterns
2.3 Software architecture
2.3.1 Static view
2.3.2 Dynamic views
2.3.3 Deployment views

3 State of the art and technology choice
3.1 Thermal camera options
3.1.1 Parameters
3.1.2 Comparative analysis
3.2 Microservices frameworks
3.2.1 Flask
3.2.2 Falcon
3.2.3 Nameko
3.2.4 Vert.x
3.2.5 Spring Boot
3.3 Deployment framework
3.3.1 Containers
3.3.2 LXC
3.3.3 Docker
3.3.4 rkt
3.4 Object detection algorithms and frameworks
3.4.1 Traditional approaches
3.4.2 Deep learning
3.4.3 Frameworks
3.5 Technology choice
3.5.1 Thermal camera
3.5.2 Microservices framework
3.5.3 Deployment framework
3.5.4 Object detection

4 Proof of Concept implementation
4.1 Goals and scope of prototype
4.2 Overview of prototype
4.2.1 General overview
4.2.2 Client interface
4.2.3 Stream
4.2.4 Producer and Consumer
4.2.5 Implemented plugins
4.3 Limitations and issues
4.3.1 Single client
4.3.2 Timeouts
4.3.3 Exception handling and testing
4.3.4 Docker security issues
4.3.5 Docker bridge network
4.3.6 Single stream
4.3.7 Number of containers per plugin

5 Mob detection experiment
5.1 Last Post thermal dataset
5.1.1 Last Post ceremony
5.1.2 Dataset description
5.2 Object detection experiment
5.2.1 Preprocessing
5.2.2 Training

6 Results and evaluation
6.1 Framework results
6.1.1 Performance evaluation
6.1.2 Interoperability evaluation
6.1.3 Modifiability evaluation
6.2 Mob detection experiment results
6.2.1 Training results
6.2.2 Metrics
6.2.3 Validation results

7 Conclusion and future work
7.1 Conclusion
7.2 Future work
7.2.1 Security
7.2.2 Implementing a detection plugin
7.2.3 Different deployment configurations
7.2.4 Multiple streams with different layouts
7.2.5 Implementing the plugin distribution service (Remote Producer/Consumer)
7.2.6 Using high performance microservices backbone frameworks
7.2.7 New object detection models and datasets specifically for thermal images

A Firefighting department email conversations
A.1 General email sent to Firefighting departments
A.2 Conversation with Firefighting department of Antwerp, Belgium
A.3 Conversation with Firefighting department of Ostend, Belgium
A.4 Conversation with Firefighting department of Courtrai, Belgium
A.5 Conversation with Firefighting department of Ghent, Belgium

B Thermal camera specifications

C Last Post thermal dataset summary
C.1 24th of March 2018
C.2 2nd of April 2018
C.3 3rd of April 2018
C.4 4th of April 2018
C.5 5th of April 2018
C.6 9th of April 2018
C.7 10th of April 2018
C.8 11th of April 2018
C.9 12th of April 2018

List of Figures

2.1 Use case diagram
2.2 Overview of the framework software architecture
2.3 Framework network topology
2.4 Client Interface detailed view
2.5 Stream detailed view
2.6 Stream model
2.7 Plugin model
2.8 Plugin state transition diagram
2.9 Component-connector diagrams of the Producer and Consumer module
2.10 Producer and Consumer Distribution component-connector diagrams
2.11 Add plugin sequence diagram
2.12 Link plugins sequence diagram
2.13 Deployment diagrams
3.1 Thermal image and MSX image of a dog
3.3 Rethink IT: Most used tools and frameworks for microservices, results [54]
3.4 Containers compared to virtual machines [66]
4.1 Filecam GStreamer pipeline
4.2 Local plugin GStreamer pipeline
5.1 Last Post ceremony panorama
5.2 Last Post filming locations
5.3 Main scenes in the Last Post dataset
5.4 Outliers
6.1 Average training loss per epoch
6.2 Validation metrics per epoch
6.3 Predictions of the model on images in the validation set
7.1 GStreamer pipeline for a plugin with a detection model

List of Tables

2.1 Performance utility tree
2.2 Interoperability utility tree
2.3 Modifiability utility tree
2.4 Usability utility tree
2.5 Security utility tree
2.6 Availability utility tree
2.7 Architecture pattern comparison
6.1 Acceptance tests results summary
6.2 Performance test statistics summary, measured in seconds
6.3 Resource usage of the framework in several conditions
6.4 Total size of framework components
6.5 Interoperability tests results (S: Source, L: Listener)
B.1 Compared cameras, their producing companies and their average retail price
B.2 Physical specifications
B.3 Image quality (IR: InfraRed, SD: Standard, FOV: Field of View)
B.4 Thermal precision
B.5 Interfaces
B.6 Energy consumption
B.7 Help and support
B.8 Auxiliary features

List of Listings

1 Minimal Flask application
2 Vert.x example
3 Spring Boot example
4 docker-compose.yml snippet of the prototype
5 Mounting the Docker socket on the container
6 Starting a plugin container
7 Dynamic linking of the decodebin and jpegenc

List of Abbreviations

ACF Aggregated Channel Features

AMQP Advanced Message Queuing Protocol

API Application Programming Interface

AS Availability Scenario

ASR Architecturally Significant Requirement

CLI Command Line Interface

CNN Convolutional Neural Networks

CRUD Create Read Update Destroy

DNS Domain Name System

FR Functional Requirement

GPU Graphical Processing Unit

H High

HTTP Hyper Text Transfer Protocol

ICF Integral Channel Features

IoU Intersection of Union

IS Interoperability Scenario

IT Interoperability Tactic

JVM Java Virtual Machine

L Low


LXC Linux Containers

M Medium

mAP mean Average Precision

MJPEG Motion-JPEG

MS Modifiability Scenario

MSX Multi Spectral Dynamic Imaging

MT Modifiability Tactic

NFR Non-Functional Requirement

ONNX Open Neural Network Exchange Format

OS Operating System

PS Performance Scenario

PT Performance Tactic

QAR Quality Attribute Requirement

REST Representational State Transfer

RNN Recurrent Neural Network

RPN Region Proposal Network

RTP Real-time Transport Protocol

SS Security Scenario

SSE Sum of Squared Errors

SVM Support Vector Machine

TCP Transmission Control Protocol

UDP User Datagram Protocol

UI User Interface

US Usability Scenario

YOLO You Only Look Once


Chapter 1

Introduction

Throughout history, having an overview of the environment from high viewpoints held many benefits. Early civilizations used hills to monitor their surroundings and population, and to spot possible attackers. The discovery of flight meant that environments could now be viewed from a bird's-eye view, offering even more visibility and revealing much more of the world below. Recently, a much smaller type of aircraft was developed: the drone. Ranging from large plane-like to almost insect-like devices, and having a wide variety of uses, drones are quickly taking over the sky. Drones would not be as effective without proper cameras providing a detailed view on the world below. With digital video cameras offering superb quality for steadily decreasing costs, almost every scene can be captured in great detail. However, these cameras are limited to the visible light spectrum, which hinders drones from operating in all circumstances, such as nightly flights. Thermal cameras measure the emitted heat of a scene and can reveal information not visible to the eye, such as hidden persons or animals, pipelines, malfunctioning equipment, etc. The combination of these two technologies certainly holds many exciting opportunities for the future.

1.1 Drones

Drones are flying robots that can fly remotely or autonomously and don't carry a human operator. They can carry a variety of payloads: video cameras, delivery parcels, fluid containers, sensors, lights, but also lethal explosives [1].

Drones are classified in different categories based on varying parameters, such as physical characteristics (diameter, weight), aerial movement techniques, application domains, etc. Based on diameter, drones are classified as smart dust (1 mm to 0.25 cm), pico air vehicles (0.25 cm - 2.5 cm), nano air vehicles (2.5 cm - 15 cm), micro air vehicles (15 cm - 1 m), micro unmanned aerial vehicles (1 m - 2 m) and unmanned aerial vehicles (2 m and larger). Often depending on their diameter, the weight of these devices ranges from less than a gram up to more than 2000 kg. Drones have different flight techniques, such as propulsion engines with wings, rotors in various amounts, flapping wings and even balloons. They are used for all kinds of purposes, ranging from search and rescue missions, environmental protection and delivery to recon, etc. Hassanalian et al. provide an excellent overview of most types of drones [2].

Due to the increasing interest in commercial drone platforms [3], a variety of payloads were developed specifically tailored for these aerial robots, such as gimbals to mount action video cameras [4], gimbals for delivering packets [5] and thermal imaging platforms [6].

1.2 Concepts

1.2.1 Thermal Cameras

Thermal cameras are passive sensors that capture the infrared radiation emitted by all objects with a temperature above absolute zero. In contrast to visible light cameras, thermal cameras do not depend on an external energy source for visibility, nor on the colors of objects or scenes. This makes captured images independent of the illumination, colors, etc. Furthermore, images can be captured in the absence of visible light [7]. Originally, thermal camera technology was developed for night vision purposes for the military, and the devices were very expensive. Later the technology was commercialized and has developed quickly over the last few decades, resulting in better and cheaper cameras [7]. This led to access for a broader public, and the technology is now introduced to a wide range of different applications, such as building inspection, gas detection, industrial appliances, medicinal science, agriculture, fire detection, surveillance, etc. [7]. Thermal cameras are now being mounted on drones to give an aerial thermal overview.

1.2.2 Aerial thermal imaging

Aerial thermal imaging is defined as the creation of thermal images using a flying device. This dissertation focuses on the usage of drones for aerial thermal imaging. There are many applications for aerial thermal imaging. Some examples are geography [8, 9], agriculture [10, 11], search and rescue operations [12], wildlife monitoring [13], forest monitoring [14, 15], disaster response [16], equipment and building maintenance [17-20], etc. In the past few years several industry players have developed thermal cameras specifically aimed at these drone applications. Examples are FLIR [6], Workswell [21] and TEAX Technology [22].

1.3 Problem statement

1.3.1 Industry adoption

The implementation of thermal cameras on drone platforms faces some issues for wide adoption by the industry. Several vendors offer thermal camera products, some specifically designed for drone platforms, that often implement different image formats, color schemes and interfaces (e.g. [23-25]). This leads to issues if users want to modify their applications by changing the camera that is used, because the application must implement new software to interact with the camera, or when the camera is no longer supported by the vendor, leaving the application with outdated hardware and software. This leads to a problem called vendor lock-in that makes customers dependent on a certain vendor, as they cannot switch products without incurring substantial costs, a problem already very tangible for cloud-based applications today [26].

Applications across various fields often have different functional and non-functional requirements. Some applications have hard real-time deadlines (such as firefighting, search and rescue, security, etc.) that must be respected; other applications require highly accurate predictions (e.g. person detection, agriculture, etc.). A single application domain can even have many different use cases.

Several firefighting departments in Belgium were contacted to get an overview of their needs for a drone thermal imaging application. It quickly became clear they had various detection problems, such as finding missing persons, locating hot explosives, measuring temperatures in silos, detecting invisible methane fires, etc. Equipment also wears down more quickly due to usage in harsh environments, such as fires in close proximity. A drone thermal application for them needs to be able to exchange functionality and hardware easily and must meet high performance constraints to deliver value for them. The email conversations can be read in Appendix A.

Other drone thermal imaging applications are mostly only used in the niche domain for which they were developed, because they aren't designed for flexibility [27]. These proprietary applications have some disadvantages: the development and support potentially have a large cost, vendor lock-in can occur when products are no longer supported, security issues could arise and customization is difficult [28, 29]. Applications could benefit from a backbone framework to aid in this modifiability/interoperability issue, aiding in developing end-to-end solutions connecting thermal cameras to various analysis/detection modules for various use cases.

1.3.2 Crowd monitoring

Festivals and other open air events are popular gatherings that attract many people. For every event organizer it is important to ensure safety and avoid incidents. Large groups of people, so-called mobs, can create potentially dangerous situations through bottlenecks, blocking escape routes, etc. Therefore, having the ability to monitor crowds and predict their behavior is very important to avoid such scenarios. Data can be obtained by evaluating video footage from past comparable events or by real-time video monitoring of current events [30]. By analyzing this footage, potentially dangerous situations can be avoided by acting on the mob formation, and safety regulations can be improved to help planning future events. Vision-based approaches face several limitations: mounted cameras cannot capture elements outside of their field of view, can't see in some conditions (for example during night time), and it is difficult to infer information from the raw footage [31].

Thermal cameras could help for crowd monitoring, because they can operate in any condition. Having precise and detailed object recognition for the images produced by these cameras is crucial to extract information correctly. In this context, classifying images is not satisfactory: localization of the objects contained within the images is needed. This problem is known as object detection [32]. There are several challenges for object detection in thermal images: the image quality is very low when compared to visible light images, there is a lack of color and texture information, temperature measures are relative measures, etc. This makes extracting discriminative information from these images difficult [33]. Most efforts towards object detection on thermal images have gone towards human detection. Most of the proposed algorithms focus on feature extraction using the Aggregated Channel Features technique and boosting algorithms for learning [33-35]. Novel approaches make use of so-called deep learning with neural networks, which achieve very good results given enough data [36].

1.3.3 Goal

The goal of the dissertation is to explore the requirements of the possible backbone framework suggested in Section 1.3.1 and its potential software architecture. The architecture is evaluated by building a proof of concept implementation of the framework and evaluating it against the proposed requirements. To verify its use in developing drone thermal imaging applications, the specific mob-detection use case is investigated.

1.3.4 Related work

The Irish start-up DroneSAR [37] developed a search-and-rescue (SAR) drone platform allowing users to stream live images and video from a drone as it conducts a search for missing persons. The platform works with any camera, visual and thermal, but focuses on drones from vendor DJI, DroneSAR's industry partner. Amazon introduced the Amazon Kinesis Video Streams platform in January 2018 as a new service for the Amazon Web Services (AWS) cloud platform. It allows users to stream live video from devices to the AWS cloud and build applications for real-time video processing [38]. The VIPER project by EAVISE, KU Leuven researched how thermal and visual video images could be used for real-time detection of persons, using object detection algorithms such as deep learning [36]. The framework presented in this work combines elements from all three of these examples.

1.4 Outline

The remainder of this dissertation is organized as follows. Chapter 2 presents the requirements for the framework and the software architecture. Chapter 3 explores several state of the art technologies that can serve as backbone technologies for the framework. To test the viability of the software architecture, a prototype is implemented; Chapter 4 presents the different aspects of this prototype. Chapter 5 describes the experiment that is conducted to research the detection of mobs in thermal images. The results of both the framework and the detection experiment are presented and evaluated in Chapter 6. Finally, the conclusion and future research efforts are presented in Chapter 7.


Chapter 2

System Design

Finding out what users actually expect from a software system and what makes it valuable for them is of key importance for the success of that system. This chapter first explores the functional and non-functional requirements of the hypothetical framework suggested in Chapter 1, to find out what makes building the framework worthwhile. Well known architectural patterns enable certain software requirements very well and can be used for building the software architecture of the framework. The framework software architecture combines some of these patterns and is presented in several documents.

2.1 Requirements analysis

Requirements are the stated life-cycle customer needs and objectives for the system, and they relate to how well the system will work in its intended environment. They are those aspects of the framework that will provide value to the users.

2.1.1 Functional requirements

Functional requirements (FR) describe the necessary task, action or activity that must be accomplished by the system, often captured in use cases and/or user stories [39, 40]. Use cases provide a summary of the features described in the user stories. Several external people and/or systems, defined as actors, interact with the framework to achieve a certain goal [40]. Three actors are identified for the framework: an end-user that uses the framework in order to build an image processing application for a specific use case, such as the ones described in Section 1.2.2; a camera developer who creates support software for a specific thermal camera for the framework, so that the end-user can buy and use their product; and an analysis software developer who creates analysis software for a specific use case (tracking objects, detecting objects, etc.), so that the end-user can use their software to build his¹ application. The camera and analysis software developers are generalized to an actor called plugin developer, who develops plugins to extend the functionality of the framework. These plugins are the building blocks with which the end-user can build image processing applications.

¹ To avoid unnecessary gender-specific pronoun clutter, the male pronoun is used by default.

The general user scenario for the framework proceeds as follows. An end-user wants to build an image processing application, e.g. to detect fires in a landscape using a drone. He has a thermal camera for this and has read about hot-spot detection in video. The user looks for a plugin for the framework that can read video from his thermal camera, and for a plugin that does the hot-spot detection. If the user finds these plugins, he can add them to the framework and use them for the application he is building. He connects both plugins with the framework in a specific order to finish his application. For this simple example, the thermal camera plugin is connected to the hot-spot detection plugin, so that video coming from the thermal camera is transmitted to the detection plugin to find the fires in the landscape. The plugins in the application and the specific order in which they are connected are defined as a stream. This stream should be easily modifiable if additional or other functionalities are required. Suppose that the thermal camera delivers very high quality video and the detector cannot work with this, as it can only operate on low quality images. The end-user searches for a plugin that scales the high quality video down to an accepted quality for the detector. This plugin is placed in between the thermal camera and the detector, and the application can work again. By continuously adding plugins to the framework, the number of possible applications that can be built with the framework increases, making the framework useable for more aerial thermal imaging use cases.

Instead of developing the application from scratch, users can use the already implemented plugins to build the applications in an ad hoc fashion. Because of this, the development time for such applications can be reduced, and users can switch hardware and/or algorithms easily. The FRs are summarized in a use case diagram that connects each actor with their respective requirements and the relationships among them [40], depicted in Figure 2.1. Trivial functionalities, such as launching and shutting down the framework, are omitted. The red use cases represent use cases to extend the functionality of the framework, the blue use cases represent use cases for building streams, and the white use cases modify the media processing of the stream. Some use cases depend on others: the blue and white use cases work with plugins of the framework, so their prerequisite use case is "Add plugin", as a plugin must be a part of the framework for a user to use it; the "(Un)Link plugins" and "Stop/Pause/Play stream" use cases depend on "Add plugins to stream", as a stream must contain plugins before they can be manipulated.

Figure 2.1: Use case diagram
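To make the scenario concrete, the sketch below shows how such a stream could be composed through a REST-style client interface. All endpoint and plugin names here are hypothetical; the actual interface of the prototype is presented in Chapter 4:

import requests

API = "http://localhost:8080"  # hypothetical Client Interface address

# Add the plugins needed for the fire detection example to the stream.
for plugin in ("thermal-camera", "downscaler", "hotspot-detector"):
    requests.post(f"{API}/stream/plugins", json={"name": plugin})

# Link them in processing order: camera -> downscaler -> detector.
requests.post(f"{API}/stream/links", json={"source": "thermal-camera", "listener": "downscaler"})
requests.post(f"{API}/stream/links", json={"source": "downscaler", "listener": "hotspot-detector"})

# Start the media processing of the whole stream.
requests.put(f"{API}/stream/state/PLAY")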

2.1.2 Non-functional requirements

A non-functional requirement (NFR) specifies how the framework is supposed to be, or in what manner it should execute its functionality [41]. These qualifications typically cover business and system quality requirements. A distinction is made between quality attribute requirements (QAR) and constraints. QARs are qualifications of the FRs or of the overall product, e.g. how fast a certain function must be executed or how resilient it must be to erroneous input. They are closely related to business requirements, which are specifications that, once delivered, provide value to the actors [40]. The QARs are captured in a utility tree [40] that has a root node representing the system. This root node is elaborated by listing the major QARs that the system is required to exhibit. Each QAR is subdivided into more specific QARs. To make the specific requirements unambiguous and testable, a scenario for the system or a specific function is written, and the scenarios are evaluated against business value and architectural impact [40]. A QAR's business value and architectural impact are each rated High (H), Medium (M) or Low (L). The business value is defined as the value for the end user if the QAR is enabled: High designates a must-have requirement, Medium is for a requirement which is important but would not lead to project failure, and Low describes a nice to have QAR, but not something that is worth much effort. Architectural impact defines how much the architecture must be designed towards the QAR to enable it: High means that meeting this QAR will profoundly affect the architecture, Medium means that meeting this QAR will somewhat affect the architecture, and Low means that meeting this QAR will have little effect on the architecture. The following QARs are discussed: performance, interoperability, modifiability, usability, security and availability.

Performance

Performance indicates the framework's ability to meet timing requirements. It characterizes the events that can occur and the framework's time-based response to those events. Latency is defined as the time between the arrival of the stimulus and the system's response to it [40]. The system latency is the latency between the initialization of an action sequence and the first change to the system noticeable by the user. Streaming latency is defined as the time between the arrival of a video frame and the arrival of the next video frame. The jitter of the response is the allowable variation in latency. Scalability is the number of users that can use the framework at the same time. The utility tree is presented in Table 2.1. The software industry has not defined a quantified 'good latency' for end-users, but a 4 second latency rule is often used as a rule-of-thumb [42]. The average response time for general framework commands should then be less than 2 seconds, with a standard deviation of 1 second, ensuring most execution times respect the 4 second bound. As stated in Chapter 1, some use cases, such as fire fighting, require real-time video streaming. The notion of low latency real-time video loosely defines that video should be streamed almost simultaneously: if a camera is filming and a human user does not notice a latency between the video of the camera and the real world, the video stream is considered real-time. Real-time is thus a matter of human time perception, and for visual inputs this bound is as low as 13 milliseconds. Anything above 13 milliseconds becomes noticeable, anything above 100 milliseconds hinders human performance [43, 44]. However, the framework focusses on the use of thermal cameras, most of which don't produce frames faster than 8 frames per second, or 125 milliseconds per frame (see Section 3.1). More expensive cameras can shoot at 25 frames per second, corresponding to a latency of 40 milliseconds, and this bound is selected for the streaming latency, with a standard deviation of 20 milliseconds, remaining below the frame rate of less expensive cameras. The number of users that can use the framework at the same time is assumed to be low, as current aerial thermal image applications are operated by only one user or a few. The assumption is that a maximum of five users can use the framework at the same time. All of these requirements are quantified as relatively 'good' values. These bounds should be evaluated for user satisfaction by having users use a prototype of the framework in practice.

Attribute refinement | Id | Quality attribute scenario
Latency | PS-1 | The average execution time of all framework commands does not exceed 2 seconds. (H, M)
 | PS-2 | A playing stream should have an upper limit of 40 ms streaming latency. (H, H)
Jitter | PS-3 | The average standard deviation of the execution time of all framework commands should not exceed 1 second under normal operation. (H, M)
 | PS-4 | The average standard deviation in streaming latency should not exceed 20 ms under normal operation. (H, H)
Scalability | PS-5 | The system should be usable by five users at the same time. (M, M)

Table 2.1: Performance utility tree

Interoperability

Interoperability is the degree to which two or more independently developed systems can usefully exchange meaningful information via interfaces in a particular context [40]. The framework will interoperate with cameras and analysis modules via the framework plugins. Henceforth, the camera and analysis plugins will be referred to as a Producer plugin and a Consumer plugin. A Producer plugin is thus a plugin that represents a camera that produces video, and a Consumer plugin is a plugin that represents a module that processes or consumes video. The framework will thus interact with the Producer and Consumer plugins, with which it exchanges requests to link them together, control their media process, etc. The more correct exchanges there are between the two, the better the user can use the plugin for building applications with the framework. This QAR is quantified by the ratio of requests made by the framework that were correctly interpreted by the plugin (successful requests) to the total number of requests during a runtime of the framework [40]. Intuitively, one argues that the framework must achieve perfect interoperability, with a perfect exchange success rate of 100%. Reality however tends to not agree with perfection, and it can never be guaranteed that exchanges will always be correct. Therefore it is better to aim for a good interoperability measure and prepare for failed exchanges, instead of naively assuming the framework will be perfect. An exchange success rate of 99.99% is selected; the motivation for this bound is as follows. A plugin is assumed to be always correct up to the first mistake, after which the plugin is faulty and the fault needs to be identified and ensured that it won't occur again. An exchange success rate of 99.99% means that if 10,000 plugins are installed and used by the framework, only one will fail during uptime. For one plugin during framework uptime, the mean time between failures is then 10,000 exchanges. It is suspected that this amount of exchanges is very high for normal framework use. Because the possibility of faulty exchanges is acknowledged, the framework will need to implement a fallback mechanism to compensate. The utility tree is presented in Table 2.2.

Attribute refinement | Id | Quality attribute scenario
Syntactic interoperability | IS-1 | The framework exchanges requests with a Producer plugin (known at runtime) with a success ratio of 99.99%. Incorrect requests are undone by the framework and logged. (H, H)
 | IS-2 | The framework exchanges requests with a Consumer plugin (known at runtime) with a success ratio of 99.99%. Incorrect requests are undone by the framework and logged. (H, H)

Table 2.2: Interoperability utility tree

Modifiability

Modifiability concerns the cost and risk of changing functionality of the system [40]. One of the most important values of the framework is modifiability of the supported thermal cameras and analysis modules. The framework needs to be extendable with new functionalities by enabling developers to add their support software in the form of a plugin. End-users should be able to modify the components that they use for their image processing applications easily and quickly, to allow for interchangeable hardware and software and to quickly set up new applications. Modifiability is defined in two environments: runtime, defined as periods during which the system is up and running, and downtime, defined as the time periods during which the system is not active. The utility tree is presented in Table 2.3.

To enable users to choose the extensions they need, the framework will need a distribution service that contains all plugins available for the framework, from which a user can select and install plugins for their local version of the framework. Adding new plugins to the distribution service should not affect versions of the framework installed by users. When a user adds a plugin from the distribution to his version of the framework, the framework should only reload once before making the plugin useable, for user comfort. Deployability is defined as the different device configurations that specify how the framework can be deployed. If the framework can be deployed in different fashions, this can increase the value for the end-user. Suppose a fire fighting use case in which a forest fire is monitored on site. Computationally powerful devices might not be available on site, so moving some plugins processing media to a remote server or cloud could still allow usage of the framework. Perhaps the device processing the media is already remote, for example a drone on security patrol; in this case access via a remote device such as a smartphone is desirable. This leads to the deployment configurations described in the utility tree.

Attribute refinement | Id | Quality attribute scenario
Run time modifiability | MS-1 | Support for a new Producer plugin should be added to the distribution service within one day, without the framework having to restart. (H, H)
 | MS-2 | Support for a new Consumer plugin should be added to the distribution service within one day, without the framework having to restart. (H, H)
 | MS-3 | End-users should be able to extend their framework with new functionalities by installing new Consumer and Producer plugins. (H, H)
 | MS-4 | End-users should be able to modify the plugins used to build their stream. (H, H)
Down time modifiability | MS-5 | New Producer plugins can be installed to the local framework at runtime; the framework can only reload once before the plugin is useable. (H, H)
 | MS-6 | New Consumer plugins can be installed to the local framework at runtime; the framework can only reload once before the plugin is useable. (H, H)
Deployability | MS-7 | The system should be deployable on a combination of a smartphone and cloud/remote server environment. (H, H)
 | MS-8 | The system should be deployable on a personal computer or laptop. (H, H)
 | MS-9 | The system should be deployable on a smartphone, laptop and cloud environment. (H, H)

Table 2.3: Modifiability utility tree

Usability

Usability indicates how easy it is for the user to accomplish a desired task and the kind of user support the system provides. Learnability indicates how easy it is for a user to gain knowledge on how to use the framework. Errors concern the amount of errors a user can make when trying to execute certain functions [40]. The utility tree is presented in Table 2.4.

Attribute refinement | Id | Quality attribute scenario
Learnability | US-1 | A user should be able to learn how to build an image processing application in at most one hour. (H, L)
 | US-2 | An experienced developer should be able to start developing a Consumer plugin for the system within one day. (H, L)
 | US-3 | An experienced developer should be able to start developing a Producer plugin for the system within one day. (H, L)
Errors | US-4 | A user should not make more than 3 errors to build an image processing application. (H, L)

Table 2.4: Usability utility tree

Security

Security is a measure of the system's ability to protect data and information from unauthorized access, while still providing access to users and systems that are authorized. An action taken against the system to cause it harm is called an attack. Security has three main characteristics. Confidentiality is the property that data or services are protected from unauthorized access. Integrity is the property that data or services are protected from unauthorized manipulation. Availability is the property of the system maintaining its functionality during an attack. Authentication verifies the identities of the parties in an interaction, checks if they are truly who they claim to be, and gives or revokes access [40]. Security is important for the framework if it is deployed on multiple devices that use a public network to communicate. The utility tree is presented in Table 2.5.

Attribute refinement | Id | Quality attribute scenario
Confidentiality | SS-1 | Streams created by a user can only be accessed by that user and not by any other entity. (H, L)
Integrity | SS-2 | Streams can't be manipulated without authorization by the user that made the streams. (H, L)
Availability | SS-3 | During an attack, the core functionality is still available to the user. (H, M)
Authentication | SS-4 | Users should authenticate with the system to perform functions. (H, L)
 | SS-5 | Developers should authenticate their plugins before adding them to the framework. (H, L)

Table 2.5: Security utility tree

Availability

Availability in a general context (not only security) refers to how available the software is to carry out its functionality. Downtime is a measure of the time that the system is unavailable to carry out its functions. The utility tree is presented in Table 2.6. Availability is specified for the part of the framework that distributes the plugins.

Attribute refinement | Id | Quality attribute scenario
Downtime | AS-1 | The system should be up 99.5% per year. This means the system has an allowed scheduled downtime of 43 hours and 30 minutes per year for maintenance. (M, L)
 | AS-2 | The maximal duration of the interval during which the system is unavailable is 3 hours. (M, L)
Network | AS-3 | If there is no active network connection, the local device can be used for operation of the framework. (H, H)

Table 2.6: Availability utility tree

Architecturally significant requirements

Architecturally Significant Requirements (ASR) are the requirements that are the most important to realize according to business value, and that have the most impact on the architecture. From the utility trees and the measures of the quality attribute scenarios, the QARs modifiability, interoperability and performance are identified as ASRs.

2.2 Patterns and tactics

An architectural pattern is a package of design decisions that is found repeatedly in practice, that has known properties that permit reuse, and that describes a class of architectures. Architectural tactics are simpler than patterns, typically using just a single structure or computational mechanism. They are meant to address a single architectural force. Tactics are the "building blocks" of design, and an architectural pattern typically comprises one or more tactics [40]. Based on the ASRs, several tactics are listed in Table 2.7 that are used for the base pattern selection. The explored patterns are layers, event-driven architecture, microkernel and microservices.

2.2.1 Layers

The layered pattern divides the software into units called layers, each of which performs a specific role within the application. Each layer is allowed to use the layer directly beneath it via its interface. Changes in one layer are isolated if the interfaces don't change, enabling MT-1, MT-2 and MT-5 [40]. While changes can be isolated by the isolated layers, they remain difficult due to the monolithic nature of most implementations of this pattern. Layers contribute to a performance penalty due to the "architecture sinkhole phenomenon", in which requests simply propagate through layers for the sake of layers [45].

2.2.2 Event-driven architecture

This pattern consists of several event publishers that create events, and event subscribers that process these events. The publishers and subscribers are decoupled by using an event channel, to which the publishers publish events and which forwards them to the event subscribers. The subscribers should have a single purpose and execute asynchronously. Since the publishers and subscribers are single-purpose and are completely decoupled from other components via the event channel, changes are isolated to one or a few components, enabling MT-1, MT-2, MT-4, MT-5 and MT-7. If the event channel adds a discovery mechanism, IT-1 can also be enabled. Overall, the pattern is relatively easy to deploy due to the decoupled nature of the components. Performance in general can be very high through the asynchronous nature of the architecture, enabling PT-6 and PT-7. If the event channel is tweaked to contain extra functionality, PT-1, PT-3, PT-8, PT-9, PT-10 and PT-11 can be enabled as well. If the components have a limited event response, then PT-2 and PT-5 can also be enabled. Development can be somewhat complicated due to the asynchronous nature of the pattern [40, 45].

2.2.3 Microkernel

The microkernel pattern allows the addition of application features as plugins to the core application, providing extensibility as well as feature separation and isolation. The pattern consists of two components: a core system called the kernel, and plugins. The business logic is divided between independent plugins and the kernel. The kernel contains only the minimal functionality required to make the system operational. The plugins are standalone, independent components that contain specialized processing, additional features and custom code. This code is meant to enhance or extend the core system to produce additional business capabilities. In many implementations, plugins are independently developed third-party modules. Changes can largely be isolated and implemented quickly through loosely coupled plugins; all MTs can be enabled. Depending on how the pattern is implemented, the plugins can be dynamically added to the kernel at runtime. Via a resource discovery service in the kernel, the ITs can be enabled. In general, most applications built using the microkernel pattern perform well, because applications can be customized and streamlined to only include the features that are needed [45].
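As a minimal, generic illustration of the pattern (not the framework's actual kernel), a kernel can keep a registry of plugins that are discovered at runtime and delegate all specialized work to them:

class Kernel:
    # Minimal core: it only knows how to register plugins and dispatch work.
    def __init__(self):
        self.plugins = {}

    def register(self, name, plugin):
        # Plugins can be added at runtime without changing the kernel itself.
        self.plugins[name] = plugin

    def run(self, name, frame):
        return self.plugins[name].process(frame)

class HotspotDetector:
    # Independently developed plugin containing its own business logic.
    def process(self, frame):
        return [p for p in frame if p > 200]  # toy 'detection' on pixel values

kernel = Kernel()
kernel.register("hotspot", HotspotDetector())
print(kernel.run("hotspot", [10, 230, 250, 40]))  # -> [230, 250]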

2.2.4 Microservices

Microservices is an architectural pattern that structures an application as a collection of loosely coupled services that implement business capabilities. Each component of the pattern is deployed as a separate unit that can be deployed on one device or multiple devices. The components can vary in granularity, from a single module to a large portion of the application. The components contain one or more modules that represent either a single-purpose function or an independent portion of a business application [45, 46]. Due to the separately deployed units, changes are isolated to individual components, enabling all MTs. Via service discovery mechanisms, the ITs can also be enabled. The microservices pattern supports distributed deployment of the software across multiple devices by design. This pattern is not known to produce high-performance applications, due to the distributed nature of the pattern, which relies on communication via a network [45, 46].

2.2.5 Comparison of patterns

Table 2.7 summarizes the analysis of the patterns. A score is given based on how well the pattern enables the tactic. Low means that the pattern does not naturally enable the tactic. Medium indicates the pattern can be implemented with the tactic, but does not include it itself. High means the tactic is enabled in the pattern. Excellent means that the tactic plays a key role in the pattern.

The microkernel pattern and microservices pattern both enable most tactics. The microkernel pattern implements extendability of the framework by design using plugins, which is the main idea for the framework, and thus is an excellent base pattern. Interoperability and deployability of these plugins can be ensured by the microservices pattern, as it designs the microservices to have well defined interfaces for interoperability and allows for the framework to be deployed in a distributed fashion. The architecture presented below is a combination of both the microkernel pattern and the microservices pattern.

Tactic | Layers | Event-driven | Microkernel | Microservices
MT-1 Split module | Medium | High | High | Excellent
MT-2 Increase semantic coherence | Medium | High | High | Excellent
MT-3 Encapsulate | Medium | High | High | Excellent
MT-4 Use an intermediary | Medium | High | High | Excellent
MT-5 Restrict dependencies | High | High | High | Excellent
MT-6 Anticipate expected changes | Low | High | Excellent | Excellent
MT-7 Abstract common services | Low | High | Excellent | Excellent
MT-8 Defer binding / Runtime registration | Low | Low | Medium | High
IT-1 Discover services | Low | Low | High | High
IT-2 Orchestrate interface | Low | Low | High | High
IT-3 Tailor interface | Low | Low | High | High
PT-1 Manage sampling rate | Low | High | High | Medium
PT-2 Limit event response | Low | High | High | Medium
PT-3 Prioritize events | Low | High | High | Medium
PT-4 Reduce overhead | Low | High | High | Low
PT-5 Bound execution time | Low | High | High | Medium
PT-6 Increase resource efficiency | Low | High | High | High
PT-7 Introduce concurrency | Low | High | Low | High
PT-8 Maintain copies of computation | Low | High | Low | High
PT-9 Load balancing | Low | High | Low | High
PT-10 Maintain multiple copies of data | Low | High | Low | High
PT-11 Bound queue sizes | Low | High | Low | Medium
PT-12 Schedule resources | Low | High | Low | Medium

Table 2.7: Comparison of how well the discussed patterns enable the tactics needed for the ASRs


23 Software architecture

The software architecture is documented in three view categories: static views, dynamic views and deployment views. The static views comprise the different components of the system and their relationships among each other. The dynamic views describe the runtime behavior of the system. Finally, the deployment views present different configurations of how the system can be deployed on different devices [47].

231 Static view

Figure 22 presents an overview of the architecture using a component-connector UML diagram. Components are the boxes that represent different software entities that exist at runtime. The components have interfaces through which they interact with other components. These are indicated using the 'lollipop' notation, with the 'ball' representing the interface that a component provides and a socket indicating that another component is using this interface. The type of data exchanged is noted next to the interface. Multiple boxes indicate that multiple components of the same kind can exist at runtime [48].

The architecture consists of the following core components: Client Interface, Producer, Stream, Consumer, Producer Distribution, Consumer Distribution, Producer Plugin and Consumer Plugin. The clear components in Figure 22 form the core framework, which each user needs to install to use the framework. The colored components form a distribution service for framework plugins to extend the functionality; they are not installed with the core framework, but run as remote instances with which the user can interact to extend his version of the core framework with new plugins. A user can use the framework via the Client Interface, building streams that are maintained in the Stream component. The Stream component makes requests to the Producer and Consumer components to activate and control the selected plugins to build the stream. Additional plugins can be added to the framework and are distributed via the Producer and Consumer Distribution components. The architecture implements a hybrid combination of the microservices and microkernel patterns. Each presented component is a microservice that implements its own interface to interact with other components. The Producer and Consumer components act as kernels in the microkernel pattern, while the Producer and Consumer Plugins act as the plugins in the microkernel pattern. These patterns enable the tactics needed to meet the requirements presented in Section 21.

Communication protocol

To allow the microservices to communicate, a communication protocol must be designed. Communication protocols can roughly be classified in two categories: synchronous and asynchronous. Synchronous protocols block on requests, which means that the client waits for a response from the server and can only continue executing when a response is received. This makes a synchronous protocol inherently more reliable, but also slower. An example synchronous protocol is the Hyper Text Transfer Protocol (HTTP). Asynchronous protocols just send messages and do not block waiting for a response. This makes the protocol less reliable, but also faster [49].

Figure 22 Overview component-connector diagram of the architecture

There are two types of traffic exchanged between microservices. First, there are the command requests that are exchanged between microservices to edit resources or change state. Second, there are the video frames that are exchanged between Producer and Consumer Plugins. Both types of traffic have different requirements. The commands must be communicated reliably and need to be executed exactly once. Reliability is more important than latency, so a synchronous protocol is preferred. Microservices traditionally implement the synchronous HTTP protocol with a Representational State Transfer Application Programming Interface (REST API) that specifies the application endpoints as textual resources [45]. This common protocol is used for the exchanged command requests in the framework.
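As an illustration, such a command could be a plain HTTP request against a plugin's REST API; the hostname, resource path and payload below are illustrative assumptions, not the prototype's actual endpoints.

PUT /state HTTP/1.1
Host: consumer-plugin:5000
Content-Type: application/json

{"state": "PLAY"}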

The video frames need to be sent with low latency at a high frequency, but reliability is less important, so an asynchronous protocol is preferred. For video streaming, the Real-time Transport Protocol (RTP) running on top of the User Datagram Protocol (UDP) is selected, as it enables real-time transfer of data between processes [50]. RTP defines a standardized packet format to transmit video and audio over a network. It sequences each packet with a sequence number and a timestamp. This allows the application to detect missing packets and latencies in the network. The UDP protocol is a low-latency asynchronous transport protocol, as it doesn't guarantee packet delivery.
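To make the packet format concrete, the following minimal Python sketch builds a simplified RTP packet around an already-encoded video frame. It only fills the fixed 12-byte header defined in RFC 3550; the payload type (26, JPEG per RFC 3551) and SSRC value are illustrative assumptions.

import struct

def build_rtp_packet(payload, seq, timestamp, payload_type=26, ssrc=0x12345678):
    # Fixed RTP header (RFC 3550): version 2, no padding, no extension,
    # no CSRCs, marker bit cleared, then sequence number, timestamp, SSRC.
    first_byte = 2 << 6
    second_byte = payload_type & 0x7F
    header = struct.pack('!BBHII', first_byte, second_byte,
                         seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc)
    return header + payload

# A receiver can use the sequence numbers to detect missing packets
# and the timestamps to measure latencies in the network.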

The recommended codec for transmitting video media is Motion-JPEG (MJPEG), which encodes video frames as separately encoded JPEG images. This makes analysis and processing in subsequent plugins easier, as only the received frame is needed to perform the analysis or processing. Other video compression formats, such as MPEG-4, use key-frames and differential compression. If a key-frame is received via the stream, the frame can be used as is; if a reference frame is received, the receiver needs to wait for the corresponding key-frame before it can construct the full video frame for analysis. This introduces extra complexity and lower quality detections, a clear trade-off against the quality and simplicity that MJPEG offers [51, 52].

Applying these protocols to the architecture results in the network topology depicted in Figure 23. The full lines represent communication via HTTP on top of the Transmission Control Protocol (TCP). The dashed lines represent the RTP protocol on top of the UDP protocol. The boxes represent the different microservice components of the framework.


Figure 23 Framework network topology. Each box is a microservice component of the framework. The full lines indicate communication over the HTTP/TCP protocol, the dashed lines indicate communication over the RTP/UDP protocol.

Client Interface

The Client Interface is the interface through which end-users can interact with the framework. Figure 24 presents the detailed component-connector diagram. The Client Interface consists of a User Interface (UI) component and an API Gateway component. Devices can make requests to the Client Interface via the Client Requests interface provided by the API Gateway. The UI provides the UI Operation interface that is used by end-users to control the framework; this can be either a visual or a textual interface. The UI actions are translated to client requests that are forwarded to the API Gateway using the Client Requests interface. The API Gateway translates the client requests and forwards them to the other core framework components.

Figure 24 Client Interface detailed view

Stream

The Stream component maintains the logical representation of the streams built by the end-user for his image processing application. Figure 25 presents the detailed component-connector diagram.


Figure 25 Stream detailed view

It consists of an API, a Stream Manager and several Stream Model components. The API provides the Stream Commands interface used by the Client Interface to interact with the framework; it translates incoming requests to commands that the Stream Manager can then execute. Commands include creating a new stream, modifying the stream layout, modifying the stream state, etc. The Stream Manager creates and manages multiple streams, represented by the Stream Models. The end-user thus builds Stream Models to create image processing applications. The Stream Model represents the logical model of these image processing application streams. As stated before, a stream consists of several plugins processing media, placed in some order and linked by the framework. Figure 26 illustrates this concept.

Figure 26 Logical model of a stream. The arrows represent the flow of media through the stream.

Logically, the Stream Model is represented as a tree with multiple roots and multiple leaves. The framework builds streams by initializing the needed plugins and connecting them in order. In the example Stream Model, plugins receive media from multiple source plugins and forward media to multiple targets. The Stream Model has a global state that represents the cumulative state of all plugins. To transition the global state from A to B, all plugins need to transition from A to B. This is done by first making the transition on the leaves of the Stream Model, after which the transition propagates to the root nodes. This ensures that no media is lost, because the plugins that transition first can't process anything, as there is no media being put into the tree yet.
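A minimal Python sketch of this leaf-first propagation is given below. It assumes each plugin object exposes its downstream listeners and a set_state() call, which is an illustrative simplification of the actual components.

def transition_stream(roots, target_state):
    # Post-order traversal: transition the leaves first, then propagate
    # the transition back up towards the root plugins.
    visited = set()

    def visit(plugin):
        if plugin in visited:
            return
        visited.add(plugin)
        for listener in plugin.listeners:
            visit(listener)
        plugin.set_state(target_state)

    for root in roots:
        visit(root)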


Producer and Consumer plugins

A Plugin represents an independent media processing element, either of the Producer type (such as a thermal camera) or the Consumer type (such as an object detection software module). Plugins are deployed as standalone microservices providing a REST API interface that the framework uses to control the plugin. Figure 27 represents a general plugin model. A plugin receives media from other plugins, called the sources, processes this media and forwards it to other plugins, called the listeners. A Producer Plugin only has listeners; a Consumer Plugin has both sources and listeners. Merging the media from multiple sources and forwarding the processed media to multiple listeners is the responsibility of the plugin.

Figure 27 Plugin model

The plugin REST API should at least provide a state resource representing the state of how the plugin is processing media, a sources resource that represents the sources from which the plugin receives media to process, and a listeners resource that represents the listeners to which the plugin transmits the processed media. Only Consumers have both the sources and listeners resources, as Producer Plugins produce their own media and hence can only have listeners.

To indicate if and how the plugin is actively processing media, a finite state machine is implemented. The state transition diagram is presented in Figure 28. A plugin can be in four possible states: INACTIVE, STOP, PLAY and PAUSE. When a plugin is in the INACTIVE state, no active microservice is running the plugin. This is the initial state for all plugins of the framework. This state is only visible to the framework, as in this state the plugin is not instantiated. When a plugin is in the STOP state, the framework has instantiated a microservice running the plugin. The plugin is listening for commands on its API, but is not processing any media. This state is visible to the plugin. In the PLAY state, a plugin processes media received from its source(s), transmits processed media to its listener(s) and listens for commands. When in the PAUSE state, media processing is paused but media buffers are kept. This decreases the latency when the plugin transitions back to the PLAY state, since the plugin can continue processing from the point where it was paused. The difference with the STOP state is that when transitioning to the STOP state, the plugin clears its media buffers.

Figure 28 The state transition diagram for a plugin

The plugin starts in the INACTIVE state. When a microservice running the plugin is instantiated by the framework, the plugin initializes itself in the STOP state. From the STOP state, the plugin can transition to the PLAY state to process media. This transition is only successful if sources and listeners are registered with the plugin. From the PLAY state, a transition to both the STOP state and the PAUSE state can be made, which stops the processing of media and respectively drops or keeps the media buffers. The plugin cannot make multiple state transitions per command. When a transition is made to INACTIVE, the framework first transitions the plugin to the STOP state, after which the INACTIVE state can be reached.
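The allowed single-step transitions can be captured in a small table. The Python sketch below renders the state machine as described above; where the text leaves steps implicit (the transitions out of PAUSE), the sketch follows the description of Figure 28 as an assumption.

from enum import Enum

class PluginState(Enum):
    INACTIVE = 'INACTIVE'
    STOP = 'STOP'
    PLAY = 'PLAY'
    PAUSE = 'PAUSE'

# Single-step transitions; multi-step changes such as PLAY -> INACTIVE
# must be executed by the framework as PLAY -> STOP -> INACTIVE.
TRANSITIONS = {
    PluginState.INACTIVE: {PluginState.STOP},
    PluginState.STOP: {PluginState.PLAY, PluginState.INACTIVE},
    PluginState.PLAY: {PluginState.STOP, PluginState.PAUSE},
    PluginState.PAUSE: {PluginState.PLAY, PluginState.STOP},
}

def transition(current, target):
    # Reject anything that is not a legal single-step transition.
    if target not in TRANSITIONS[current]:
        raise ValueError('illegal transition %s -> %s' % (current, target))
    return target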

A source/listener has the following fields: hostname, the hostname of the microservice running the plugin, and port, the port on which the source/listener is reachable.

On the sources and listeners resources, an HTTP GET and POST method must be provided: GET retrieves the sources/listeners and their details, POST adds a new source/listener to the plugin. Both resources additionally need to provide an individual endpoint per source/listener on which GET, PUT and DELETE must be provided, for individual manipulation of the source/listener: GET retrieves the details, PUT updates the fields of a source/listener and DELETE removes a source/listener from the plugin.
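Since the prototype microservices are built with Flask (see Section 352), such a plugin API could be sketched as follows; the exact resource paths and payload shapes are illustrative assumptions rather than the final plugin interface.

from flask import Flask, jsonify, request

app = Flask(__name__)
state = {'state': 'STOP'}
listeners = []  # each entry: {'hostname': ..., 'port': ...}

@app.route('/state', methods=['GET', 'PUT'])
def plugin_state():
    if request.method == 'PUT':
        state['state'] = request.json['state']
    return jsonify(state)

@app.route('/listeners', methods=['GET', 'POST'])
def listeners_collection():
    if request.method == 'POST':
        listeners.append(request.json)
        return jsonify(request.json), 201
    return jsonify(listeners)

@app.route('/listeners/<int:index>', methods=['GET', 'PUT', 'DELETE'])
def listener_item(index):
    if request.method == 'PUT':
        listeners[index].update(request.json)
    elif request.method == 'DELETE':
        return jsonify(listeners.pop(index))
    return jsonify(listeners[index])

if __name__ == '__main__':
    app.run()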

Producer and Consumer

The Producer and Consumer components are responsible for interacting with and managing the Producer/Consumer Plugins used in the streams. Figure 29 presents the component-connector diagram of the Producer and Consumer components. Both components have a similar architecture, but are separate components. This is because their plugin models differ and they are expected to often be deployed on different devices with specific hardware requirements: Producer Plugins could be deployed on the thermal camera itself, having a very specific operating system, whereas a Consumer Plugin might need access to specific processors to speed up its execution.

The Producer and Consumer consist of the following components: API, Kernel, Plugin Model and Plugin Manager. The API translates requests coming from the Stream component to commands for the Kernel. The Kernel implements the core functionalities, such as activating (deploying) and deactivating plugins, managing their state and manipulating their resources. It creates a Plugin Model for each Plugin that the framework has installed. This model represents a plugin logically on the framework level and keeps track of the Plugin resources. The Plugin Manager manages the plugins that were added to the framework, stored in the Plugin Directory. It manages the plugin installations, adding updates or installing additional plugins that can be retrieved from the Producer and Consumer Distribution components.


(a) Producer component-connector diagram

(b) Consumer component-connector diagram

Figure 29 Component-connector diagrams of the Producer and Consumer modules

Producer and Consumer Distribution

The Producer and Consumer Distribution components are responsible for managing and maintaining the plugins for the framework. They act as online software repositories from which local versions of the framework can retrieve new plugins. The component-connector diagrams are presented in Figure 210. The Distribution components consist of the following subcomponents: API, Plugin Manager and Plugin Tester. Plugin developers can make requests to the API, which translates these requests to Create, Read, Update, Destroy (CRUD) commands for the Plugin Manager. The Plugin Manager executes these commands on the Plugins that are kept in the Plugin Repository. The quality of the framework depends on the quality of the plugins that it offers. Therefore, plugins should be thoroughly tested before being added to the framework to guarantee quality. The Plugin Tester component is responsible for this testing. Tests should include checking whether the plugin implements the Plugin Model correctly, whether the plugin meets the performance requirements, etc. When a plugin passes these tests, it is added to the Plugin Repository, so that end-users can install the plugin and use it for their applications.


(a) Producer Distribution (b) Consumer Distribution

Figure 210 Producer and Consumer Distribution component-connector diagrams

232 Dynamic views

Dynamic views depict the behavior of the system and complement the static views. They are documented using sequence diagrams that show an explicit sequence of messages between architecture elements that describes a use case [40]. Two key use cases are presented here: adding a plugin to the stream, and linking plugins to build the stream.

Add plugin to stream

Figure 211 presents the sequence diagram for adding a Producer Plugin to the framework. The framework is assumed to be running, the user has created a stream S and the Producer Plugin A is correctly installed. The end-user executes the command to add A to stream S on the Client Interface, which passes the command to the Stream component. The Stream component requests the creation of a microservice instance of A, which is created by the Producer Kernel. When the Producer Plugin is instantiated, the Producer Kernel creates a Plugin Model of A and adds it to its references, so that the instance of A can be reached for future commands. Afterwards, the Stream Manager is informed of the success, upon which the Stream Manager can add A to the Stream Model, ready to be linked with other plugins. The user is notified of this success and can continue building. If A could not be instantiated (due to not being installed, not being installed correctly, etc.), A is marked as 'broken' and the user is notified that the action could not be completed. When the plugin is marked as 'broken' it can no longer be used and needs to be reinstalled. The sequence diagram for adding a Consumer Plugin is similar, but replaces the Producer components with the Consumer components.

Figure 211 Add a Producer Plugin to stream

Link plugins

Figure 212 presents the sequence diagram for linking two plugins in a stream. In the sequence diagram two Consumer Plugins A and B are linked; this can be extended to a Producer Plugin linking with a Consumer Plugin. The framework is assumed to be running, the user has created a stream S, and the plugins A and B have been instantiated and added to the stream. The end-user executes the command to link A and B in stream S on the Client Interface, which passes the command to the Stream component, which in turn checks if the link is valid for the Stream Model S. Linking can only be done if the stream is in the STOP state and if the plugins are already in the stream. If the link is valid, the Stream Manager can begin linking the plugins. To link the plugins in the order A-B, A is added as a source for B and B is added as a listener for A. These subsequences are found in their corresponding frames in the diagram and are very similar. The Stream Manager makes the request to add the source/listener to the Kernel, which finds the corresponding plugin and makes the request on the corresponding Plugin Model. If the Plugin succeeded, the Plugin Model is updated and the Stream Manager is notified of this success. If both plugins have successfully set the source and listener, the Stream Model layout is updated with the link. Should the source/listener request fail for one of the plugins, the change is rolled back and the end-user is notified.
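The rollback behavior can be summarized in Python-like pseudocode; add_source, add_listener and remove_source are hypothetical helpers that would wrap the corresponding REST calls on the plugins.

def link(stream, a, b):
    # Links can only be made between plugins already part of a stopped stream.
    if stream.state != 'STOP' or a not in stream.plugins or b not in stream.plugins:
        raise ValueError('invalid link')
    if not add_source(b, a):              # POST a to b's sources resource
        raise RuntimeError('could not add source')
    if not add_listener(a, b):            # POST b to a's listeners resource
        remove_source(b, a)               # roll back the first change
        raise RuntimeError('could not add listener, change rolled back')
    stream.layout.add_link(a, b)          # update the Stream Model layout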

233 Deployment views

The different deployment configurations are illustrated via deployment diagrams using the UML 2.5 Deployment specification [48]. 'Host' specifies the device on which components are deployed. The 'microservice' indicates the isolated environment in which components are running. These isolated environments on the host are realized as software containers, which enable portability of the components to other deployment configurations; this concept is further discussed in Section 33. The Producer and Consumer Distribution components were left out of the diagrams, as they are always distributed on a different host than the core framework. Two deployment configurations are presented: the local configuration, which deploys the components on a single device, and the distributed configuration, which deploys each component on a separate device. These configurations are presented in Figure 213.


Figure 212 Link two plugins in a stream. The 'format request' and 'translate request' actions in the API components have been omitted to reduce clutter in the diagram, but are executed by the API components.


Local configuration deployment

The local configuration deploys the framework on a single local device. The configuration is depicted in Figure 213a. Because the framework is deployed as a whole, it can operate offline. This configuration is useful for image processing applications that can't rely on a stable network connection, for example in remote locations or densely built-up areas. The components are still deployed as separate microservices, due to the architecture of the framework. This has an impact on the performance of the framework, because every interaction between components uses either the HTTP message protocol or the RTP protocol, which introduces extra overhead compared to direct invocation of commands.

Distributed configuration deployment

The distributed configuration deploys the framework on multiple devices. The components are distributed over these devices, made possible by the microservice isolation and communication protocols. This configuration is depicted in Figure 213b. Obviously, in this configuration each component of the framework must have a reliable network connection to communicate with the other framework components. This configuration could be used, for example, for a security application. The end-user has the Stream module running on a master node that controls several cameras. The end-user can configure his image processing application through the Client Interface running on his device, which communicates with the Stream module running on the master node. The master node can control each camera by communicating with the Producer component. If, for example, the security application requires autonomous detection of trespassing people, a computationally intensive task, the Consumer Plugins could need dedicated hardware that is only available on another device. The Consumer component can then be deployed on that dedicated device, and the Stream component can again communicate with it over the network. The success of this configuration depends on the availability and the capacity of the network. If the network fails, commands and media can't come through and the framework can no longer execute. Due to the distributed nature, performance will also be worse when compared to the local configuration, because each request between the components travels over a network that can experience delays.


(a) Local configuration deployment diagram (b) Distributed configuration deployment diagram

Figure 213 Deployment diagrams


Chapter 3

State of the art and technology choice

To build and test a proof of concept implementation of the architecture presented in Chapter 2, several state of the art technologies can be used as support for the framework. These are presented in Sections 31, 32, 33 and 34. For each category a choice is made that will serve as the basis for the implementation of the proof of concept, discussed in Section 35. Readers already familiar with the presented technologies can safely skip ahead to Section 35.

31 Thermal camera options

This section aims to provide an overview of some currently commercially available thermal cameras. It is not a complete overview of all products offered by all vendors. The data was gathered in September 2017, so some products may have been discontinued and new products may already have been launched. Several parameters are collected for each product. Section 311 discusses why these parameters are important to assess the quality of a thermal camera. Section 312 aggregates these parameters and presents insights into the data. The full list of specifications can be found in Appendix B.

311 Parameters

The following parameters were considered for the comparison: price, physical specifications, image quality, thermal precision, interfaces, energy consumption, help and support, user experience and auxiliary features.

Price

Thermal cameras are relatively expensive when compared to visible light cameras. For example, a 20 megapixel (MP) visible light camera can cost as little as 100 euro, while thermal cameras, having a much lower image resolution, can cost as much as 15000 euro. Prices for thermal cameras cover a very wide range, and budgets are limited in practice.


Physical specifications

Two specifications are considered: the weight of the camera and the dimensions of the camera. Drones have a limited carry weight due to maximal carrying capacities, and battery life drains faster when carrying heavier loads. Lighter and smaller cameras are therefore preferred for usage with drones, although these often offer lower image quality and fewer features than the heavier cameras.

Image quality

Image quality specifies how much information an image can possibly hold. It consists of four parameters: resolution, capture frequency or frame rate, field of view, and radiometric information. Image resolution is the amount of detail an image holds. Higher resolution cameras can capture more details in a scene, resulting in a sharper image that holds more information. Due to more details, smaller objects can also be seen, allowing scenes to be viewed from larger distances. Drones capture images from relatively large distances, so good resolutions are required for the images to be useful. Image resolution is measured in pixel density, presented as the product of the amount of pixels in the width and height of the image. The highest resolution found for the compared cameras is 640 x 512 pixels. Some cameras offer a visual camera next to the thermal camera. This allows an overlay of the visual image and the thermal image, so-called Multi Spectral Dynamic Imaging (MSX). This creates artificially sharper images, because edges can be seen more clearly, as they are more visible in the visual image. Figure 31 depicts a thermal-only image and an MSX image of a dog; it can be seen that the MSX image is sharper. MSX is a more low-cost solution to produce sharper images compared to increasing the thermal resolution, as visible light cameras are less expensive [7].

(a) Thermal (b) MSX

Figure 31 Thermal image and MSX image of a dog

The capture frequency or frame rate dictates how many frames the camera can capture per second. Higher frequency cameras are able to track dynamic scenes better. The field of view is the angle through which the camera is sensitive to thermal radiation and determines the extent of the world that can be seen by the camera. Bigger fields of view can capture more of the environment in one image. Most cameras allow various lenses to be mounted onto the camera, which allows for greater flexibility in choosing the field of view. Radiometric image information is thermal information embedded with the infrared image that can be analyzed after recording. Radiometric information characterizes the distribution of the thermal radiation's power in space and specifies the temperature per pixel exactly. Regular thermal images use a relative scaling of temperatures that are mapped onto a colorspace, with some color being the hottest color in the image and another color the coldest. For example, in Figure 31a the Iron color scheme is used, which maps the cold regions of the image on blue color variants and warmer regions on red and yellow variants. Radiometric information can give a very detailed description of the radiation pattern of a scene.

Thermal precision

Thermal precision specifies the temperature range, and the sensitivity and accuracy of the temperature measurements. The temperature range indicates the minimum and maximum temperatures a camera can detect. A larger temperature range comes with a trade-off in sensitivity and accuracy. Cameras often offer different modi of operation and operate using different intervals, according to the accuracy needed in a scene. Sensitivity indicates the ability of the camera to record finer distinctions in temperature. Accuracy is the margin of error for temperature readings on the thermal camera. An accuracy of 5 degrees Celsius for small temperature ranges and 20 degrees Celsius for large temperature ranges is commonly found; the increase in error margin is a trade-off for the larger temperature interval. Objects emit infrared waves in various forms (due to black-body radiation [7]). To accurately compare temperatures, cameras often implement emissivity corrections that normalize the measurements.

Interfaces

Cameras can communicate with other devices via several interfaces during use. Cameras mounted on a drone cannot be accessed during flight and need these interfaces to transfer data. USB and HDMI are the most commonly found interfaces to connect the camera with an on-board processing unit, gimbal or battery. MAVLink [53] is a very lightweight, header-only message marshalling library for micro air vehicles such as drones. When a camera provides this interface, it allows for a very efficient communication scheme to control the camera remotely. Other interfaces include Bluetooth and Wi-Fi.

Energy consumption

A device mounted on a drone has a limited energy source at its disposal. The less energy the camera consumes, the longer the drone can operate. This can even lead to lighter batteries being used in flight, reducing the carried weight and therefore also the energy consumption. Typically, the energy consumption of a camera is much lower than the energy consumption of the drone itself, so this is a minor specification. Input voltage and power consumption are specified.


Help and support

How the camera is supported by the company has a big impact on the ease of developing applications for the camera platform. User manuals, phone or email support and FAQs are very helpful. Should the camera be malfunctioning, a product warranty is necessary to recover the broken product.

User experience

The user experience is another important factor, as there can be a difference between the technical specifications and the actual experience of the user. The user experience is measured in a number of good and a number of bad reviews. Reviews are scored from zero to five stars, with zero being a very bad experience and five being a very good experience. A good review is scored three or more stars, a bad review less than three stars.

Auxiliary features

Some cameras offer even more features than the ones mentioned above. These can be a connection with the Global Positioning System (GPS) to indicate where images were captured, a software application to interact with the camera, analysis functionality, tracking, etc.

312 Comparative analysis

It can be seen that FLIR is the market leader in thermal solutions for drones. They offer the largest product line, and products from other companies often utilize one of their camera cores. Figure 32a plots the retail price against the thermal resolution. Cameras with high and low resolutions are found across all price ranges; clearly, other features determine the price of a thermal camera. A feature function is defined that maps the features of a thermal camera onto an integer (a small sketch of this scoring function follows the list). The function increments the integer if:

• The camera has MSX support
• The camera has a standard data format (not just an analog or digital signal)
• The camera offers radiometric information
• The image resolution is 640 x 512 pixels, the highest resolution found for these products
• The sensitivity is smaller than 100 mK
• The camera offers emissivity correction
• The camera offers a USB interface
• The camera offers a MAVLink interface
• The camera offers an HDMI interface
• The camera offers a Bluetooth connection
• The camera offers a Wi-Fi connection
• The camera offers GPS tagging
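A minimal Python sketch of this feature function is shown below; the dictionary field names are illustrative assumptions, not the actual dataset columns of Appendix B.

def feature_points(camera):
    # camera is a dict of specifications; each truthy check adds one point.
    checks = [
        camera.get('msx', False),
        camera.get('standard_data_format', False),
        camera.get('radiometric', False),
        camera.get('resolution', (0, 0)) >= (640, 512),
        camera.get('sensitivity_mk', float('inf')) < 100,
        camera.get('emissivity_correction', False),
        camera.get('usb', False),
        camera.get('mavlink', False),
        camera.get('hdmi', False),
        camera.get('bluetooth', False),
        camera.get('wifi', False),
        camera.get('gps', False),
    ]
    return sum(bool(check) for check in checks)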

Figure 32b plots these feature points versus the retail price. This gives a more log-like relationship: the features of a camera determine the price much more than just the image quality. For a price less than 5000 euro, thermal cameras are found that implement most basic features. Beyond that, the price increases rather fast for fewer added features. These are features like radiometry that require additional hardware, which greatly increases the price of the camera.

(a) Camera resolution compared to retail price (b) Camera feature points compared to price

32 Microservices frameworks

The architecture presented in Section 23 relies heavily on the microservices pattern. Therefore, this section presents several microservices frameworks to support this architecture. Figure 33 depicts the results of the Rethink IT survey, querying the most used frameworks for microservices among developers [54]. The most popular frameworks, Java EE and Spring Boot, are written in Java. The Java EE framework is more of a one-stop-shop framework, offering much more functionality than just a backbone microservices framework, and is therefore not considered. Spring Boot is clearly a very popular and mature framework, more streamlined for microservices. Vert.x is a more upcoming framework renowned for its performance, making it worthwhile to explore. Python is an upcoming language for web development and, because it is excellent for prototyping, several frameworks for this language are explored as well. The frameworks presented here are Vert.x version 3.5.1, Spring Boot version 2.0, Flask version 0.12, Falcon version 1.4.1 and Nameko version 2.9.0.

321 Flask

Flask is a micro web development framework for Python. The term "micro" means that Flask aims to keep its core simple but extensible. Flask is an unopinionated framework, as it only provides a glue layer to build a REST API around the application.


Figure 33 Rethink IT survey results: most used tools and frameworks for microservices [54]

However, it provides a large list of extensions if extra functionality is required [55]. Starting a microservice is very simple, as illustrated in Listing 1. Flask uses the concept of Python decorators [56] to bind Python functions to a REST API. In Listing 1, for example, the function service_status() is linked to the '/' resource. Because route() is a decorator for the service_status() function, service_status() is wrapped and passed to the route() function, so that when a user issues an HTTP GET request on this resource, the service_status() function that was passed gets called. This allows for an easy construction of the REST API: just decorate all the functions of the microservice with the correct Flask decorator.

from flask import Flask

app = Flask(__name__)

@app.route('/')
def service_status():
    return 'service_status'

if __name__ == '__main__':
    app.run()

Listing 1 Minimal Flask application

Because Flask is a microframework, its memory footprint is small, with the binary file being only 535 KB large. It is in use by several large companies such as Netflix and Reddit [57]. In a production environment the default Flask web server is not sufficient, as it only serves one user at a time. However, for prototyping it is an excellent framework [55].


322 Falcon

Falcon is a bare-metal Python web framework that differentiates itself through performance when compared to other frameworks. It targets microservices, being even more lightweight and faster than frameworks like Flask; in a benchmark test it achieves 27 times the speed of Flask [58]. The framework seems less mature and has not been adopted by many companies [59]. It is not considered for the prototype of the system, but could be used in production, as it achieves better performance.

323 Nameko

Nameko is a framework specifically built for developing microservices in Python. Next to offering a REST API, it also offers asynchronous events over the Advanced Message Queuing Protocol (AMQP). It is only meant to be used for microservices, not for web applications that serve content. It is a relatively young framework and is not backed by any major companies as of yet. It is, however, backed by the developer of the Flask framework [60].

324 Vert.x

Vert.x is a toolkit for building reactive applications on the Java Virtual Machine (JVM). This framework follows the reactive systems principles. These principles are used to achieve responsiveness and to build systems that respond to requests in a timely fashion, even with failures or under load. To build such a system, reactive systems embrace a message-driven approach: all components interact using messages sent and received asynchronously. Reactive microservices built with Vert.x have the following characteristics: autonomy, asynchronicity, resilience and elasticity. Vert.x is a toolkit and can be used as any other library, which makes it very flexible. It provides a large set of features: metrics, different programming languages, different protocols, templating, data access, cluster management, etc.

Vert.x embraces the asynchronous development model, as can be seen in Listing 2.

import io.vertx.core.AbstractVerticle;

public class Server extends AbstractVerticle {
    public void start() {
        vertx.createHttpServer().requestHandler(req -> {
            req.response()
                .putHeader("content-type", "text/plain")
                .end("Hello from Vert.x");
        }).listen(8080);
    }
}

Listing 2 Vert.x example


The event that occurs here is the HTTP request. On arrival of the event, the Handler is called and executed. The Handler is chained to a listen request and does not block the calling thread. The Handler is only notified when an event is ready to be processed or when the result of an asynchronous operation has been computed [61].

325 Spring Boot

Spring Boot is an opinionated Java framework for building microservices, based on the Spring dependency injection framework. It allows developers to create microservices with reduced boilerplate and configuration. For simple applications it provides a syntax similar to Flask in Python, using annotations for routing. An example is given in Listing 3. The framework handles most of the routing and request handling, but restricts the developer in application structure. The framework is not lightweight and performs less well than Vert.x [62].

@RestController
@RequestMapping("/api")
public class HelloRestController {

    @RequestMapping(method = RequestMethod.GET, value = "/hola",
                    produces = "text/plain")
    public String hello() {
        return "Hello Spring Boot";
    }
}

Listing 3 Spring Boot example

33 Deployment framework

To meet the modifiability and interoperability requirements discussed in Section 212 and to support the different deployment configurations in Section 233, Linux containers (LXC) are used. A container is a lightweight operating system running inside the host system, executing instructions native to the core CPU and eliminating the need for the instruction-level emulation that Virtual Machines use. Containers provide an identical, isolated runtime environment for processes without the overhead of virtualization. This makes them perfect for highly portable software, as only the container needs to be moved and it can directly be executed on any system supporting containers [63]. First, the concept of containers is introduced in Section 331. Second, several container frameworks are presented in Sections 332, 333 and 334.

331 Containers

Containers sandbox processes from each other and are often described as the lightweight equivalent of virtual machines. The difference between a virtual machine and a container is the level of virtualization. Virtual machines virtualize at the hardware level, whereas containers do this at the operating system (OS) level. The achieved effect is similar, but there are significant differences. Containers make available protected portions of the OS and share its resources. Two containers running on one OS have their own OS abstraction layer and don't know they are running on the same host. This provides a significant difference in resource utilization. Virtual machines provide access to hardware only, so it is necessary to install an OS. As a result, there are multiple OSs running, which gobble up resources. Containers piggyback on the running OS of the host environment. They merely execute in spaces that are isolated from each other and from certain parts of the OS. This allows for efficient resource utilization and for cheap creation and destruction of containers. Consequently, starting and stopping a container is equivalent to starting and stopping an application [64, 65]. This comparison is illustrated in Figure 34.

Containers offer several advantages over running a process directly on the system. Due to the OS virtualization of the containers, software is always deployed on the same operating system, defined in the container. This allows for a 'write once, run everywhere' scenario, which allows for portability of the system to a range of devices. Containers communicate with each other using protocols such as HTTP. This allows the processes in containers to be written in any programming language, using any external library that is needed. For the system this means that if the Producer and Consumer Plugins are packaged as containers, they can effectively be built with any available technology, greatly enhancing the extensibility of the system.
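As an illustration, a Python-based plugin could be containerized with a Dockerfile along the following lines; the file names, dependency list and port are illustrative assumptions.

# Sketch of a plugin container image (hypothetical file layout).
FROM python:3.6
WORKDIR /app
COPY . /app
RUN pip install flask
EXPOSE 5000
CMD ["python", "plugin.py"]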

332 LXC

Linux containers are the basis on top of which other container frameworks are built. LXC provides a normal OS environment similar to a VM, and the containers in this framework behave almost identically to a VM: they can run multiple processes. LXC can be used directly, but offers only low-level functionality and can be difficult to set up [67].

333 Docker

Docker started as an open-source project at dotCloud in early 2013. It was an extension of the technology the company had developed to run its cloud applications on thousands of servers [64]. Now, Docker is a standalone, mature company providing a software container platform for the deployment of applications [66]. Docker provides two main services: a simple toolset and API for managing Linux containers, and a cloud platform that provides easy access to recipes for software containers created by other developers [68]. Docker is the container technology with the most public traction and is becoming the container standard at the time of writing, due to its functionality and very responsive community. It offers functionality to easily build and run containers, but also to manage them in large clusters. A design decision that limits Docker is that each container can only run one process at a time. Docker consists of a daemon that manages the containers and the API Engine, a REST client; should this client fail, dangling containers can arise [69].

334 rkt

CoreOS' rkt is an emerging container technology providing an API engine similar to the Docker API Engine that can run LXC containers as well as Docker containers. rkt focusses on security and standardization, and is specifically designed to run in cloud environments. Unlike Docker, rkt does not use a daemon process with a REST client; the command line tool executes all the operations, which makes the framework more reliable. rkt is not as mature as Docker yet. It is portable to multiple Linux environments, but is not yet portable to macOS and Windows [70].

(a) Container stack (b) Virtual machine stack

Figure 34 Containers compared to virtual machines [66]

34 Object detection algorithms and frameworks

As stated in Section 132, object detection is the computer vision task of detecting which objects are present in an image and where they are located. Several approaches to this problem have been proposed, some of which focus on thermal images. This section aims to give a small overview of different existing techniques. For the technical details of the algorithms, the reader is referred to the respective articles.

341 Traditional approaches

Traditional approaches include hot-spot detection techniques and Adaptive Boosting (AdaBoost) with various feature extraction techniques, such as Aggregated Channel Features (ACF) and Integral Channel Features (ICF). These methods rely on clever feature engineering solutions that use domain knowledge or statistical insights to transform the raw dataset into a specific set of features in order to find patterns [32].

Hot-spot detection

Hot-spot techniques work on the assumption that people have an overall higher body temperature than most of the background in the thermal image. These techniques first select candidate objects: the hot-spots in the image. The hot-spots define the regions on which a classifier is run, and thus form the localization step of the object detection problem. Afterwards, a classifier is trained on these candidates. Xu et al. used a Support Vector Machine (SVM) classifier to classify whether the hot-spot represented a pedestrian [71]. Nanda et al. used a Bayes classifier to classify the hot-spots [72]. These methods are generally not applicable, because people often are not the only hot-spots in thermal images.

AdaBoost

AdaBoost is a machine learning algorithm that utilizes the output of so-called weak learning algorithms (weak learners) and combines their outputs into a weighted sum that forms the output of the boosted classifier. AdaBoost modifies the weak learners in favor of data points misclassified by previous classifiers [73]. Viola and Jones built a detection algorithm that uses two consecutive frames of a video sequence and trains the AdaBoost classifier on both motion and appearance information [74]. Davis et al. use a two-stage template approach that initially performs a fast screening procedure with a generalized template, using a contour saliency map to locate potential person locations; any window located in the first phase is then forwarded to the AdaBoost algorithm to validate the presence of the person. Dollár et al. extracted features using different ICF and ACF [35]. ICF and ACF compute features by calculating several aggregations over the different channels of an image, such as gradients, histograms and colors. Goedemé et al. expanded these detectors with extra thermal channels to achieve results comparable to those of Dollár et al., but for thermal images [36].

342 Deep learning

Over the past few decades there has been a shift in proposed solution methods towards deep learning. Deep learning for object detection uses Convolutional Neural Networks (CNNs). CNNs are a specialized kind of neural network for processing data that has a known grid-like topology, such as images. CNNs generally consist of three steps: a convolution step that creates a feature map of a region of an image, a pooling step that summarizes the output of the convolution step, and finally a fully-connected network that learns from the features extracted in the previous steps [75]. The key difference is that these algorithms perform the feature extraction in the convolutional layers and do not need feature engineering like the algorithms presented in Section 341. This requires quite a bit of computing power when compared to the traditional methods. Since deep learning made the shift to computing on Graphical Processing Units (GPUs), computations became feasible and these models proved to achieve very good performance on various machine learning problems. Two model types are described: two-stage networks (R-CNN, R-FCN) that extract image regions first and make separate predictions on each region, and dense networks (YOLO, SSD, NASNet, RetinaNet) that operate on the image as a whole.

Region-based Convolutional Network (R-CNN)

R-CNN uses a selective search method to find objects, an alternative to exhaustive search in an image. It initializes small regions in an image and merges them hierarchically; the detected regions are merged according to color spaces and other similarity metrics [76]. R-CNN combines this selective search with a CNN per region to find out which objects are present in these regions [77].


Fast(er) Region-based Convolutional Network (Fast(er) R-CNN)

Fast R-CNN was developed to reduce the time consumption related to the high number of models necessary to analyze the region proposals from the selective search method in R-CNN. Instead of using a CNN for each region, a single CNN with multiple convolutional layers is used [78]. Faster R-CNN drops the region proposals detected with the selective search method (which is computationally expensive) and introduces the Region Proposal Network (RPN) to directly generate region proposals. This accelerates training and testing and improves performance [79]. Mask R-CNN is an extension of the Faster R-CNN model that adds a branch parallel to the bounding box detection to predict object masks, that is, the per-pixel segmentation of an object in the image [80].

Region-based Fully Convolutional Network (R-FCN)

R-FCN tries a more efficient approach to region detection. Instead of applying a per-region subnetwork multiple times, R-FCN uses a fully convolutional network with computations shared across the entire image. This allows it to be compatible with multiple backbone networks, such as Residual Networks [81].

You Only Look Once (YOLO)

The previously discussed methods need to run the same computations on different parts of an image multiple times before generating a prediction, which makes them relatively slow. The YOLO model [82] was developed with the requirement to make predictions as fast as possible, trading off accuracy for speed to move towards real-time object detection. YOLO directly predicts bounding boxes and class probabilities with a single CNN in a single evaluation, instead of first detecting object regions and predicting classes afterwards. This has some benefits over the other methods. YOLO is very fast when compared to other methods, capable of processing images in real time at up to 155 frames per second for some variants. It also learns contextual information, because it trains on entire images instead of regions. YOLO also generalizes better to other image types. All these benefits come at the cost of accuracy: YOLO struggles to precisely localize some objects, especially small objects. The following versions of YOLO focus on delivering more accuracy. The algorithm is currently in its third version [83].

Single-Shot Detector (SSD)

The SSD [84] is similar to YOLO and predicts all the bounding boxes and the class probabilities in one single evaluation (single shot), using one CNN. The model takes an image as input, which passes through multiple convolutional layers. When compared to YOLO, SSD achieves higher accuracies by adding convolutional layers and including separate filters for different aspect ratio detections.

Neural Architecture Search Net (NASNet)

NASNet takes a different approach and does not design the network architecture that performs the object detection beforehand, but instead trains a Recurrent Neural Network (RNN) to generate the model descriptions of the CNN that performs the object detection. The RNN is trained using reinforcement learning. The NASNets built for object detection perform as well as most networks, but are slower to train [85].

RetinaNet

RetinaNet is the latest state-of-the-art object detector. It is a simple dense detector, similar to YOLO and SSD, but it matches the accuracy of two-stage detectors like the R-CNN variants. The authors of RetinaNet propose that the foreground-background class imbalance encountered when training the dense detectors leads to less accuracy when compared to the two-stage detectors. RetinaNet uses a new loss function called Focal Loss that focuses training on a sparse set of hard examples to counter this class imbalance, which results in very good performance and very fast detection [86].

343 Frameworks

While the previous sections focused on different algorithms, actually implementing these algorithms is not straightforward. That's why over the past years several deep learning frameworks have emerged that try to provide easier access to this technology. Some frameworks provide APIs for some of the object detection algorithms presented above. This section gives a small overview of some frameworks. Most frameworks differ quite a bit from each other, which makes porting a model from one framework to another rather difficult. The Open Neural Network Exchange Format (ONNX) initiative hopes to propose a standard for interchangeable models, which should make switching among frameworks easier in the future [87]. Note that there are other frameworks available, but those do not yet support object detection functions out of the box.

TensorFlow

Perhaps the most well-known framework, TensorFlow is an open source machine learning library for neural networks with a Python interface. It was developed by Google for internal use and released to the public in 2015 [88]. Recently, an Object Detection API has been built for TensorFlow, which implements models pre-trained on benchmark datasets, such as SSD, Faster R-CNN, R-FCN and Mask R-CNN [89]. TensorFlow offers a lot of flexibility in its use and can be used for many machine learning problems.

Darknet

Darknet is an open source neural network framework written in C and CUDA. It is maintained by Joseph Redmon, the person behind the YOLO algorithm [90]. Darknet does not offer the flexibility that other frameworks offer, but is easy to install and use when compared to others. Out of the box, Darknet offers an interface for YOLO. The open source community offers some ports of this framework to other popular frameworks such as TensorFlow.


CNTK

The Microsoft Cognitive Toolkit (CNTK) is an open source toolkit for distributed deep learning. It offers a Python, C# or C++ interface. It is one of the first frameworks to support ONNX. CNTK offers an API for Fast R-CNN and Faster R-CNN [91].


35 Technology choice

This section presents the choices made for each technology described in the previous sections.

351 Thermal camera

The FLIR One Pro and Therm-App were selected as thermal cameras for the proof of concept. Both offer relatively high quality images, 160 x 120 pixels and 320 x 240 pixels respectively. This is of course relative to their price: 469 and 937.31 euro respectively. These prices are at the low end of the product ranges offered. Both cameras are designed to be used with a smartphone, which makes them ideal for prototyping, since these devices are widely available and setting up the camera via the apps from the respective companies is easy. Both cameras provide MPEG-4/H.264 encoded video output, easily understood by most playback software. Both cameras can be found in the lower left of Figure 32b.

For deployment in production-ready applications with drones, these cameras are not the best choice. They aren't specifically designed to be used on a drone and don't offer the best image quality possible. In those applications, platforms like the FLIR Vue, Duo, Zenmuse or Workswell Wiris are better candidates, due to their superior image quality, MAVLink interfaces, compatibility with commercially available gimbals to mount them on drones, and other features.

352 Microservices framework

Flask is selected as the microservices framework. The arguments for Flask are as follows. Flask is a mature web framework with major companies backing it; this means the APIs stay consistent and the framework is stable in use. When compared to some other frameworks like Spring Boot, Flask is unopinionated, which allows for maximum flexibility during development. Flask also has a very small memory footprint, which makes it easier to deploy on less powerful on-board devices like drones. Flask is also easy to use and quick to set up, ideal for developing a proof of concept. A final argument is the familiarity of the author with Flask.

353 Deployment framework

Docker is selected as the deployment framework. Docker is the most mature and well supported container framework at the time of writing, and will likely remain important in the future. It offers the most features and is specifically designed for the microservices pattern [68].

3.5.4 Object detection

One of the requirements specified in Section 2.1 is real-time streaming. Real-time object detection is only achieved by a few of the models presented in Section 3.4; candidates are YOLO, SSD and RetinaNet. As there is no framework that provides an implementation of the RetinaNet algorithm out of the box at the time of writing, this algorithm is not selected. SSD is implemented in the TensorFlow Object Detection API. However, at the time of writing, this API has not been found stable: when trying out the API, fallbacks to older versions of the software were needed to be able to test the models. This was due to the Object Detection API


using older versions of the TensorFlow framework. Therefore, YOLO, implemented in the Darknet framework, is selected. Darknet offers a stable distribution, YOLO achieves good results, and it has proven to be a very fast detector, capable of real-time object detection.


Chapter 4

Proof of Concept implementation

To prove the concept of the architecture discussed in the previous chapters, a prototype is implemented. First, the goals and the scope of the prototype are presented in Section 4.1. Next, the components of the prototype are presented in Section 4.2. Finally, the known limitations and issues of the prototype are presented in Section 4.3.

4.1 Goals and scope of prototype

The goals of the prototype are to prove the QARs defined in Section 2.1. The prototype focuses on the ASRs: performance, interoperability and modifiability. The usability, security and availability requirements are left out of the scope of the prototype, because they are not ASRs and require significant resources (focus groups, long-term deployment, etc.) to test.

The components that are implemented in the prototype are the Client Interface, Stream, Consumer and Producer, because they represent the core functionality of the framework: to build image processing application streams using plugins. The Producer and Consumer Distribution components enable third party plugin developers to add their functionality to the framework. These are distribution functionalities, which are out of scope for the prototype. The prototype will only support one video stream. All functions presented in Figure 2.1 are implemented, with the exception of 'Install plugin', 'Uninstall plugin', 'Add plugin', 'View plugin', 'Remove plugin' and 'Update plugin', as they are only supported via the Producer and Consumer Distribution components. The prototype is deployed on a local device. Distributed deployment configurations require small changes in the implementation (see Section 4.3).

4.2 Overview of prototype

4.2.1 General overview

The prototype consists of four main components: a cli, streamer, producer and consumer. The cli process is the Client Interface, implemented as a textual Command Line user Interface (CLI), which allows a user to interact with the prototype through textual commands in a shell. This process is deployed on the local machine. The streamer, producer


and consumer processes are deployed as microservices in their own Docker containers. The prototype is initialized through the cli, which spins up the Docker containers of the other processes. This is achieved with the tool docker-compose: Compose is a tool for defining and running multi-container Docker applications. The compose YAML file defines the configurations for the microservices; Compose uses these configurations to start and stop the application with a single command [92]. A snippet of the compose file for the application is given in Listing 4. Containers are specified as services. The example service configuration given is that of the producer. First, the name of the container is specified, which overwrites the default name, as the container name is used as hostname for the container in Docker [93]. The build configuration specifies where the container build recipe is situated. The port mapping allows processes on the localhost to access processes in the container; for the producer service this is only used for debugging. The volumes configuration specifies folders from the host to be mounted in the container. This configuration mounts in the source code and resources. It also provides access to the Docker socket, to allow interaction with the Docker host (see Section 4.2.4).

services:
  producer:
    container_name: producer
    build:
      context: ./producer
      dockerfile: Dockerfile
    ports:
      - "8080:8080"
    volumes:
      - ./producer:/usr/producer
      - /var/run/docker.sock:/var/run/docker.sock

Listing 4: docker-compose.yml snippet of the prototype

All containers are connected to a Docker bridge network [93] for communication. A bridge network uses a software bridge to allow connected containers to communicate, while providing isolation from containers which are not connected to that bridge network. The bridge network applies to containers running on the same Docker host; the network is thus confined to the local Docker host and is not distributed over different devices. The bridge network has some advantages:

• The bridge provides better isolation and interoperability between containers. Containers automatically expose all ports to each other and none to the outside world.

• The bridge provides automatic Domain Name System (DNS) resolution between containers. This means that containers resolve the IP address of each other by container name or alias.

• Containers can be attached to and detached from the networks on the fly.

• Environment variables are shared, which can be used to provide equal environment configurations for every container on the bridge.


4.2.2 Client interface

The Client Interface is implemented by the cli component. The cli is built in Python with the Click package by Armin Ronacher [94]. Click is a CLI creation kit which aims to make the implementation of CLIs easier. It resembles the Flask framework, as it also leverages Python decorators [56] for most of its functionality. The source code of the cli is located in the mosquito.py file. Commands can be executed by calling python mosquito.py, or by calling mosquito if the source code is installed into the Python environment. The following commands are implemented:

• mosquito: Displays a help page listing command groups.

• mosquito on: Starts the application.

• mosquito off: Shuts down the application.

• mosquito plugins: Groups all commands to manage plugins. Plugins can only be listed, not installed or uninstalled, as the Remote Producer and Remote Consumer are not implemented.

• mosquito plugins ls: Lists all locally installed plugins.

• mosquito stream: Groups all commands to manipulate the current stream.

• mosquito stream add: Adds a producer or consumer to the stream.

• mosquito stream delete: Deletes a producer or consumer from the stream.

• mosquito stream elements: Lists all producers and consumers that were added to the stream.

• mosquito stream link: Links two stream plugins.

• mosquito stream pause: Pauses the stream.

• mosquito stream play: Plays the stream. This means the stream is processing media.

• mosquito stream print: Prints the stream layout (which plugins are linked).

• mosquito stream stop: Stops the stream.

• mosquito stream view: Views the stream on the local device.

A typical use of the application would be the following. First, the application is started using mosquito on. Then, plugins are added to the stream using mosquito stream add [ELEMENT_TYPE] [ELEMENT], which instantiates the corresponding plugins in the Producer and Consumer components. The plugins are linked in order using mosquito stream link [ELEMENT_1] [ELEMENT_2]. The stream is then set to play using mosquito stream play. When the last plugin is linked to the special local plugin, the user can view the output from that plugin using mosquito stream view, which opens up a window in which the stream is displayed.
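As a concrete illustration, a session that plays the filecam plugin on the local display could look as follows; the concrete argument values are illustrative, with the plugin names taken from Section 4.2.5:

$ mosquito on
$ mosquito stream add producer filecam
$ mosquito stream add consumer local
$ mosquito stream link filecam local
$ mosquito stream play
$ mosquito stream view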


As specified in the software architecture (see Section 2.3), the Client Interface can use the Stream Commands interface of the Stream component. As specified in Section 2.3.1, this interface is a REST API, so the client can use this interface through the HTTP protocol. This is done with the Python Requests library [95].
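By way of illustration, a Click command that forwards the play command to the Stream Commands REST API could be sketched as follows; the endpoint URL, port and payload are assumptions for the sketch, not the exact prototype code:

import click
import requests

STREAMER_URL = 'http://localhost:8080'  # assumed address of the streamer service

@click.group()
def stream():
    """Groups all commands to manipulate the current stream."""

@stream.command()
def play():
    """Set the stream state to playing via the Stream Commands REST API."""
    response = requests.put(STREAMER_URL + '/stream/state', json={'state': 'PLAY'})
    response.raise_for_status()
    click.echo('Stream is playing')

if __name__ == '__main__':
    stream()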

4.2.3 Stream

The Stream component is responsible for the logical representation of the stream (see Section 2.3.1), implemented as the streamer component. The component consists of three objects: api, which contains the REST API; the StreamManager; and the Stream object, representing the Stream Model in the framework. Requests to the other microservices are sent using the Python Requests library. The prototype implementation only supports one stream with a chain-like model. This means that, unlike the stream depicted in Figure 2.6, a plugin can't have multiple sources or multiple listeners. The Stream object manages the logical representation of the stream and manipulates the references to the plugins, by forwarding commands to the producer and consumer components respectively. It contains two data structures: outline, which is the logical structure of the stream, and elements, which contains all the plugins present in the stream. In the prototype, the Stream component provides the following functionalities on its API endpoints (a minimal sketch of one of these endpoints follows the list):

• /plugins (GET): Fetches all the plugins from the producer and consumer components and returns their information.

• /elements (GET, POST, DELETE): Resource to add and delete plugins from the elements bin.

• /stream/links (POST): Resource to create links for elements.

• /stream/state (GET, PUT): Resource to update the state.

• /shutdown (POST): Shuts down the framework.
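A minimal Flask sketch of the /stream/state resource could look as follows; the handler body and the stream_manager object are illustrative assumptions, not the exact prototype code:

from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route('/stream/state', methods=['GET', 'PUT'])
def stream_state():
    # stream_manager is an assumed, globally available StreamManager instance
    if request.method == 'GET':
        return jsonify({'state': stream_manager.get_state()})
    # PUT: forward the new state to the plugins contained in the stream
    new_state = request.get_json()['state']
    stream_manager.set_state(new_state)
    return jsonify({'state': new_state})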

Since the streamer component is the only component of the framework that interacts with outside users, it has the responsibility to gracefully shut down the framework. This is needed to solve the problem of dangling plugin containers: containers that run plugins that have not been stopped and removed after closing the application. Since only plugins that are contained in a stream have a running container associated with them, the stream can notify the Producer and Consumer components to stop and remove those containers.

4.2.4 Producer and Consumer

The Producer and Consumer components cover similar responsibilities in managing installed plugins. They are implemented in the producer and consumer components. Both components consist of the following objects: api, which contains the REST API; the Kernel, which implements the core functionalities; the PluginManager, which finds plugins installed on the device and checks if their installation is valid; and the Plugin, which is the logical representation of a plugin as described in Section 2.3.1. Commands to control the plugins are made using the Python Requests library.


For the component to be able to start, stop and interact with the plugin containers, the component needs access to the Docker host and the Docker client running on that host. But because the component is running in its own container, it is isolated from the Docker host and can't interact with the Docker client by default. The workaround for this problem is to expose the socket on which the Docker client is running on the Docker host to the container. This is done by mounting the Docker socket of the host on the Docker socket in the container. In Docker Compose, the mounting is achieved as shown in Listing 5.

volumes:
  - /var/run/docker.sock:/var/run/docker.sock

Listing 5: Mounting the Docker socket on the container

This has some implications for security (see Section 4.3). To interact with the now exposed Docker client, the component uses the docker-py library [96]. Listing 6 shows how a connection is made to the Docker client and a plugin container is started. The container is started from the plugin image, on the network of the framework, and is given the plugin name as the container name. Docker thus creates a DNS entry with the plugin name, which makes the container addressable by its name. This implementation implies that there can only be one running container per plugin at any time in the current implementation.

import docker

client = docker.from_env()

container = client.containers.run(
    image=plugin_name,
    detach=True,
    name=plugin_name,
    network='mosquito_default'
)

Listing 6: Starting a plugin container
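The reverse operation, used when a plugin is deleted from the stream, can be sketched with the same library as follows; this mirrors Listing 6 but is not taken verbatim from the prototype:

import docker

client = docker.from_env()

# Look up the plugin container by its name (plugin_name as in Listing 6)
container = client.containers.get(plugin_name)
container.stop()    # stop the process running in the container
container.remove()  # delete the container from the system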

When both components are initialized, the Kernel and PluginManager are created. The PluginManager searches for a plugin_directory, which contains information on which plugins are installed on the device. Each installed plugin should have a valid image on the device; these are contained in the images directory of the Docker daemon. If the image or information file cannot be found on the device, the plugin is marked as broken and can't be used by the framework. To describe the API, the consumer API is used; the producer API is analogous, but replaces consumer with producer and doesn't have the sources endpoints. The Producer and Consumer components provide the following functionalities on their API endpoints (a brief usage sketch follows the list):

• /consumers (GET): Retrieves a list of the installed consumers on the device on which the component is running.

• /consumers/<hostname> (GET, DELETE): Retrieves the information of a consumer specified by the hostname value, which is the name of the consumer.

• /consumers/<hostname>/state (GET, PUT): Retrieves or respectively updates the state of a consumer specified by the hostname value.

• /consumers/<hostname>/sources (GET, POST): Retrieves the sources or respectively adds a new source to the consumer specified by the hostname value.

• /consumers/<hostname>/sources/<source_hostname> (GET, PUT, DELETE): Retrieves, updates or removes the source specified by source_hostname of a consumer specified by hostname, respectively.

• /consumers/<hostname>/listeners: All listeners resources are analogous to the sources resources.
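For example, adding a source to the local consumer could be done with a request such as the following; the hostname, port number and payload fields are illustrative assumptions:

import requests

# Register the 'filecam' producer as a source of the 'local' consumer
response = requests.post(
    'http://consumer:5000/consumers/local/sources',
    json={'hostname': 'filecam', 'port': 5000})
response.raise_for_status()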

4.2.5 Implemented plugins

Three plugins are implemented and tested: filecam (called 'Mycam' in the code), a producer that reads in a video file and transmits it in MJPEG encoding using the RTP protocol; testsrc, a producer which generates test video and transmits it in MJPEG encoding using the RTP protocol; and local, a consumer which captures incoming RTP MJPEG video frames and displays them on the local display. The filecam and local plugins are discussed, since the testsrc is similar to the filecam.

The plugins are implemented in Python and use the GStreamer library with the Python bindings [97] for media streaming, and the Flask framework to implement the API. These libraries don't have to be used by future plugins, which can just implement a REST API and provide a media stream as specified in their descriptions.

Filecam plugin

The filecam image is based on the Ubuntu 17.10 image. It is chosen over lighter Linux distributions because it offers more functionality out of the box for prototyping. Other dependencies are Python 3.6, GStreamer 1.12 (with the plugins-base, plugins-good, plugins-bad, plugins-ugly, libav, doc and tools packages) and python-gst.

The API of the plugin offers the following functionalities:

• /state (GET, PUT): Retrieve and respectively update the state of the plugin.

• /listeners (GET, POST): Retrieve and respectively add a listener on the plugin.

• /listeners/<hostname> (GET, PUT, DELETE): Retrieve, update and respectively delete a listener on the plugin.

The implemented GStreamer pipeline is depicted in Figure 4.1. The pipeline consists of the following GStreamer elements:

1. filesrc: GStreamer element that reads data from a file in the local file system. This file can have any extension and is not limited to video or audio files [98]. The location property is set to the location of the file in the plugin container.


Figure 4.1: filecam GStreamer pipeline

2. decodebin: GStreamer bin that automatically constructs a decoding pipeline using available decoders and demuxers via auto-plugging [99]. Note that for some media containers and codecs the appropriate decoders must be installed. For example, to decode the MPEG streams contained in MP4 files, an H.264 decoder is needed, which can be found in the 'libav' GStreamer plugins library.

3. jpegenc: GStreamer element that encodes raw video into JPEG images [100]. This implements the MJPEG video stream, as all video frames are encoded as JPEG images.

4. rtpjpegpay: GStreamer element that payload-encodes JPEG images into RTP packets according to RFC 2435 [101].

5. udpsink: GStreamer element that sends UDP packets to the network. When combined with an RTP payload plugin, it implements RTP streaming [102]. The host and port properties are set to the hostname and port of the listener of the plugin.

This pipeline is implemented using the Python GStreamer bindings. The process consists of creating each GStreamer element, adding them to the GStreamer pipeline, and linking the elements in order of appearance in the pipeline. The decodebin and jpegenc elements can't be linked when created, because there is no default sink pad available on the decodebin. Because the decodebin needs to decide how to decode media, it needs the pipeline to be processing media. If no media is flowing, the decodebin can't know what decoder it needs to decode the media and what media it can offer to the sink element. Therefore, the process of dynamic linking is used [103]. All elements which can be linked when the pipeline is not in the PLAYING state are linked. A handler is registered on the 'pad-added' signal, emitted when a new pad is added on the decodebin, indicating that it can forward media downstream. When media is flowing through the pipeline, the decodebin creates new pads when it can generate output data, and emits the 'pad-added' signal. A callback is performed on the handler, which links the decodebin with the jpegenc. Listing 7 illustrates this concept.

# callback handler
def on_pad(source, pad, sink):
    # get the sink pad from the sink element
    sink_pad = sink.get_static_pad('sink')
    # get the pad type
    pad_caps = pad.get_current_caps()
    pad_type = pad_caps.get_structure(0).get_name()
    # Only if the pad is raw video is the link made
    if pad_type == 'video/x-raw':
        # Perform the dynamic link
        pad.link(sink_pad)
    # Other pad types are ignored

filesrc = Gst.ElementFactory.make('filesrc')
decodebin = Gst.ElementFactory.make('decodebin')
jpegenc = Gst.ElementFactory.make('jpegenc')
# ... (create other elements and add elements to pipeline)

# Only filesrc and decodebin can be linked statically
filesrc.link(decodebin)
# Register the on_pad handler on the pad-added signal
handler_id = decodebin.connect('pad-added', on_pad, jpegenc)
# Set pipeline to PLAYING; the callback will be called to perform the dynamic link
pipeline.set_state(Gst.State.PLAYING)

Listing 7: Dynamic linking of the decodebin and jpegenc

Local plugin

The local plugin captures an incoming media stream and displays it on the local display. This plugin is special with respect to other plugins in that it is not deployed in a Docker container: it runs natively via the cli on the host, to allow access to the local display. This version is built for macOS High Sierra (version 10.13.4) and uses GStreamer 1.12 (with the plugins-base, plugins-good, plugins-bad, plugins-ugly, libav, doc and tools packages) to receive an incoming stream. When a plugin links to the local plugin, the Stream component does not instruct the Consumer component to start the plugin, but instead links the plugin to the local host. For macOS, the address of the host is host.docker.internal. The GStreamer pipeline used by the plugin is depicted in Figure 4.2.

Figure 4.2: local plugin GStreamer pipeline

The pipeline consists of the following elements (a sketch of the pipeline follows the list):

1. udpsrc: GStreamer element that reads UDP packets from the network [104]. The port property is set to the port to which the source is transmitting media.

2. rtpjpegdepay: GStreamer element that retrieves JPEG images from the received RTP packets [105]. This element can't process the media received from the udpsrc directly, because it can't know what type of data it will be receiving. Between the pads, a 'capabilities filter' is placed, which informs the elements of the type of data that will be flowing through. In this case the capabilities are application/x-rtp, which tells that there will be RTP packets coming through; encoding-name=JPEG, which tells that the payload of the RTP packets are JPEG images; and payload=26, which also tells that the encoding is JPEG according to RFC 3551 [50, 106].

3. jpegdec: GStreamer element that decodes JPEG images [107].

4. autovideosink: GStreamer element that automatically detects an appropriate video sink and forwards the video to it [108].
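Expressed with Gst.parse_launch, this receiving pipeline, including the capabilities filter, could be sketched as follows; the port number is an example value:

import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst

Gst.init(None)

# udpsrc -> capsfilter -> rtpjpegdepay -> jpegdec -> autovideosink
pipeline = Gst.parse_launch(
    'udpsrc port=5000 '
    'caps="application/x-rtp, encoding-name=(string)JPEG, payload=(int)26" '
    '! rtpjpegdepay ! jpegdec ! autovideosink')
pipeline.set_state(Gst.State.PLAYING)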

4.3 Limitations and issues

The implementation presented is a prototype and a slimmed-down version of the architecture presented in Section 2.3. The following limitations and issues remain.

4.3.1 Single client

The current implementation deploys the Flask framework (on which each microservice relies) on the built-in Flask web server (Werkzeug), which is provided for development convenience. It is only built for use by a single user and by default can only handle one request at any given moment, which implies that the framework can also only be used by a single user [109].
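A partial mitigation, not applied in the prototype, would be to enable Werkzeug's threaded mode, so that each request is handled in its own thread; the import path below is illustrative:

from api import app  # illustrative import of the microservice's Flask app

# By default the development server handles one request at a time;
# threaded=True makes Werkzeug spawn a new thread per incoming request
app.run(host='0.0.0.0', port=8080, threaded=True)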

4.3.2 Timeouts

The framework does not perform checks on request timeouts when passing commands to components and plugins. This can be a problem when the framework is deployed on several devices and the request latency is much higher. In case of timeouts, the framework will keep waiting for a response, which leads to a crash.

4.3.3 Exception handling and testing

The framework is only tested for the so-called 'happy path': the default scenario featuring no exceptional or error conditions. Some alternate paths are handled, but most still need to be tested. An example scenario would be if one of the plugin containers in a stream fails and stops: the framework is not able to detect this and will assume that the container is still running.

4.3.4 Docker security issues

The Docker client communicates with the daemon process dockerd using a socket. This socket is a UNIX domain socket called /var/run/docker.sock. The daemon is highly privileged, having root access to the host system; any process that can write to this socket effectively has root access. To allow the components of the framework to manipulate the plugin containers, they need access to this socket. Therefore, the socket is mounted in the container, which gives the container write access to the socket. This implies that the container now has root access on the host when writing to this socket. Because the container gets root access to the host, an attacker can walk the file tree of the host and extract sensitive information or run unwanted software. This type of attack is known as a 'Docker Breakout' or 'Container Escape' attack [110, 111].

4.3.5 Docker bridge network

The current implementation deploys the framework on a Docker bridge network, which can only be used if the framework is deployed on a single device; the current implementation can thus only be deployed on a single device. To deploy the framework on multiple devices, the framework must be deployed using a Docker overlay network [112].

4.3.6 Single stream

The implementation supports one stream, which must be a chain. Multiple streams in tree form, merging media from multiple sources and broadcasting to multiple listeners, are not supported.

4.3.7 Number of containers per plugin

The framework uses the name of the plugin as the identifier for its containers. The name is also the hostname on which the container can be reached. Therefore, there can only be one active container associated with a plugin at runtime.


Chapter 5

Mob detection experiment

To try out an actual drone thermal imaging application, the mob detection experiment is carried out. The goal of this experiment is to use existing object detection algorithms on a dataset of thermal images to try and detect large crowds of people, hereinafter referred to as a mob.

Several public datasets of thermal images exist. Most datasets focus on the detection of people in scenes [113–117], some on face recognition [118, 119], others on vehicle recognition [120]. Most of these datasets are freely available through the OTCBVS Benchmark Dataset Collection [121]. No datasets containing large amounts of people were found, so the Last Post thermal dataset was created for the detection of mobs and other analysis tasks. This dataset is presented in Section 5.1.

To detect mobs in the images of the dataset, a deep learning approach using neural networks is explored. The selection and training of the model is described in Section 5.2.

5.1 Last Post thermal dataset

The Last Post dataset consists of videos of the Last Post ceremony, taking place each night at 8:00 PM (Brussels timezone) under the Menin Gate in Ypres, Belgium. Section 5.1.1 gives some insight into this unique ceremony. The full dataset is described in Section 5.1.2.

5.1.1 Last Post ceremony

The Last Post ceremony is a nightly ceremony taking place under the Menin Gate in Ypres at 8:00 PM sharp. The ceremony is held in remembrance of the fallen soldiers of World War I (1914-1918). The Last Post Association [122] states its mission as follows:

"True to its statutes, the Last Post Association wishes to honor and remember the soldiers of the British Empire who gave their lives during the Great War of 1914-1918. The Last Post ceremony seeks to express, day after day, the lasting debt of gratitude which we all owe to the men who fought and fell for the restoration of peace and the independence of Belgium."


Figure 5.1 gives an impression of the size of the ceremony. Because of the sheer number of people that gather under the gate each day, the Last Post is a unique open-air event that allowed for repeatable conditions to capture footage; the event was therefore a perfect opportunity to create the dataset.

Figure 5.1: Last Post ceremony panorama

5.1.2 Dataset description

Due to legislation in Belgium, drones cannot be flown in public areas without certification and a permit from the authorities. The creation of real aerial thermal images with a drone was thus not feasible. Therefore, an elevated position on the walls next to the Menin Gate (in order to simulate aerial images) was used to capture the footage of the adjacent square on one side and the bridge on the other side. Figure 5.2 shows the locations where the video footage was captured.

Figure 5.2: Locations where the video footage was captured. The black stars represent the captured scenes, the red stars represent the locations from where the scenes were filmed.

The data was recorded with the FLIR One Generation 3 Pro camera for Android devices, hereafter referred to as "Camera" [123]. Since thermal images don't hold color information, a color scheme is used to represent the relative differences in temperature. The 'Iron' color scheme is used, which maps colder sections of a scene to blue colors and warmer sections to red and yellow colors.


The videos are encoded using the H.264/MPEG-4 codec. Decoded, the color information is captured in 4:2:0 YUV format. The frame rate of the videos varies from 7 Hz to 8 Hz, depending on the speed of the objects in the scene. There is sound present in the videos, encoded with the MPEG AAC codec. For a full list of sequences, the reader is referred to Appendix C.

The two locations that make up the main scenes in the dataset are presented in Figure 5.3. The thermal images and visual images of each scene are depicted next to each other. The thermal and visual images were not captured at the same time, so the mobs that are present in the thermal images can't be seen in the visual images. In both scenes, buildings are present that are quite warm compared to the surroundings, as can be seen in the thermal images. In Figure 5.3a it even becomes difficult to recognize the mob when they are standing close to the building. This is less the case for Figure 5.3c, where, due to the water present in the image, the mob has higher contrast because of the larger difference in emitted heat. Towards the far right of the image, the mob seemingly disappears into the background. The effect of two objects having a similar heat signature and having no clear transition in thermal images is defined as thermal camouflage, a technique that is often used by animals and military units [124]. This effect is even visible when looking at the mobs present in both images: because people are standing so close together, it becomes difficult to recognize individual persons in the crowd.

(a) Thermal view of the square in location A. (b) Visual view of the square in location A.
(c) Thermal view of the bridge in location B. (d) Visual view of the bridge in location B.

Figure 5.3: Main scenes in the Last Post dataset


5.2 Object detection experiment

5.2.1 Preprocessing

The Last Post dataset was not used in its entirety for training the model, because there were not enough resources to manually annotate every image. Therefore, a smaller dataset was used, serving to train a baseline model.

The following videos were used: 2018-04-10 195029.mp4, 2018-04-10 200122.mp4, 2018-04-04 202859.mp4, 2018-04-10 202558.mp4 and 2018-04-04 200052.mp4, captured on the fourth and tenth of April 2018. These videos were used because of their contents: they contain images from locations A and B respectively, in which the mob behaves more dynamically compared to other videos. This was due to a marching band present on the fourth of April and a marching army unit on the tenth of April. See Appendix C for a summary of the contents of these videos. From these videos, images were extracted at a capture rate of 1 Hz. Each image was manually labelled using the Microsoft Visual Object Tagging Tool [125]. The tool allows exporting the training images to various formats, such as Pascal VOC, for TensorFlow, YOLO and Microsoft CNTK.

Within the data, several visual outliers are present. An outlier is an observation point that is distant from other observations; it is created due to variability in capturing the videos or indicates experimental errors [126]. The errors detected here are of the latter form and are depicted in Figure 5.4. The first type of outlier consists of system faults in the Camera: due to an error in the processing of the video, the Camera would sometimes not register any input. This causes the Camera to produce completely black images, as depicted in Figure 5.4a. The Camera software maps temperatures onto colors in the image. The variations of the colors are relative to the temperature interval, ranging from the minimum to the maximum temperature detected by the Camera. If the minimum and/or maximum detected temperature changes, the Camera needs to adapt its color mapping. This causes the Camera to fade to bright colors for a short period of time (1 to 2 seconds); the resulting image is depicted in Figure 5.4b. Because the resulting image is too bright and objects are hard to detect, it is considered an outlier. Due to instabilities when capturing the footage, in sequences with fast motion some images are very blurry. This makes it hard, even for a person, to decide what is visible in the frame; therefore it is considered an outlier. This is depicted in Figure 5.4c. Sometimes people would pass in front of the Camera, which resulted in brightly colored areas in the videos that were not part of the scene and therefore form another type of outlier, depicted in Figure 5.4d. Because the presented outliers are experimental errors and do not belong in the scenes, they were removed from the dataset.

5.2.2 Training

The model that is used for training is YOLOv3, implemented using the Darknet neural network framework [83]. The model is trained using convolutional weights that are pre-trained on the ImageNet database [127]. The concept of using weights from a model previously trained on large datasets is known as transfer learning. When choosing a pre-trained model, it is very important that the problem statement of the pre-trained model is close enough to the current problem statement. For the pre-trained model on ImageNet, this was to identify objects in images, which lies close to the detection of mobs in thermal images. Because the type of images (thermal versus visual) is fundamentally different, the model could suffer in performance. Goedeme et al. [36] solved a similar problem with thermal images and achieved good results, which gives an indication that detection should be feasible with the pre-trained model. Also, because the dataset is relatively small, training the model from scratch could actually hurt performance [128]. Training was carried out on an NVIDIA GeForce GTX 980 GPU, which allows training to be done much faster. To evaluate training progress, the Sum of Squared Error (SSE) loss function is calculated, defined as

\[ \sum_{i=1}^{n} (x_{ij} - x_{j})^{2} \]

where $n$ is the number of samples in a batch used in a single training epoch and $j$ is the dimension (x or y), as defined in [83]. The result of this training is discussed in Chapter 6.

(a) System fault in the Camera: no input was detected. (b) The Camera updates to a new temperature interval.
(c) Due to moving the Camera too fast, the image becomes too blurry. (d) Very warm object due to people passing in front of the Camera.

Figure 5.4: Outliers


Chapter 6

Results and evaluation

The goal of this Chapter is to present the results of the framework and the detection experiment. The results of the framework tests are presented in Section 6.1. The results of the object detection experiment are presented in Section 6.2.

6.1 Framework results

To evaluate the framework, acceptance tests are conducted that test if the framework meets the QARs defined in Section 2.1. As stated in Section 4.1, only the ASRs will be tested. A summary of which requirements are met by the framework is given in Table 6.1. Passed means that the framework has met the requirement, not passed that the framework hasn't met the requirement, and plausible means that the framework might have met the requirement, but not enough data could be gathered to be certain.

6.1.1 Performance evaluation

To evaluate performance, the acceptance tests for the requirements are conducted, the impact of the framework on the processing resources is recorded, and the total size of the framework is measured.

Acceptance tests

To test the performance of the framework, the execution times of each command executed with the CLI (see Section 4.2.2) are measured. Each command is executed 200 times, except for the on, off and link commands; they are measured manually 10 times. Because these commands launch system threads whose finish signal could not be captured, they had to be measured by hand. Commands were executed on a 2.6 GHz Intel Core i5-2540 processor running macOS High Sierra version 10.13.4. The summarized statistics of the tests are given in Table 6.2.

The average execution times for the Play, Stop, Pause, Add, Elements, Print, View and Link commands do not exceed the 2 second bound specified in PS-1, while the average execution times of the Delete, On and Off commands do exceed this bound. This performance requirement is not met by the framework. The same result is found for PS-3. Especially the Delete and Off commands exceed the requirements by quite a bit. The Delete command shuts down a plugin and removes the Docker container from the host; this action is costly in time. The Off command removes all the plugins and all the microservices of the framework, and thus suffers from the same costly action. This could be ameliorated by having the framework not remove the containers but stop them instead, which requires fewer resources, as it only stops the process running in the container but does not delete the container from the system.

Requirement id   Status
PS-1             Not passed
PS-2             Plausible
PS-3             Not passed
PS-4             Plausible
PS-5             Not passed
IS-1             Passed
IS-2             Passed
MS-1             Passed
MS-2             Passed
MS-3             Passed
MS-4             Passed
MS-5             Plausible
MS-6             Passed
MS-7             Plausible

Table 6.1: Acceptance test results summary

PS-2 and PS-4 could not be measured, due to the GStreamer pipeline of the prototype not allowing frames to be tracked. However, since real-time is a human time perception, real-time streaming is plausible if a person can't distinguish the streamed videos from videos played with a native video player [43, 44]. The videos were shown side by side to ten users, who could not distinguish between both videos, indicating presumable real-time streaming. Since the hard requirements cannot be measured, the requirements are not met, but they are plausible. Real-time streaming performance also heavily depends on the used plugins and the hardware on which they are deployed: if a plugin can't process its media fast enough, due to lack of processing power or a slow implementation, it will slow down the whole stream.

The scalability requirement PS-5 could not be met, due to the Flask Werkzeug server only being able to process one request at a time (see Section 4.3).

Only two performance requirements are met by the prototype. However, this is mostly due to some actions being very slow, such as shutting down the framework or removing a plugin. As these are actions that should occur less frequently when a user is using the framework, they are less important for the perceived quality. Frequent actions, such as adding, linking and changing the state of the stream, do perform rather well and contribute more to the perceived quality. Overall, the performance of the framework is not stellar, but not bad either. This can partially be explained by the choice of supporting frameworks, such as Flask, that are not built for performance. Other, more performance-oriented frameworks, such as Vert.x, could ameliorate performance.

Statistic        Play   Stop   Pause  Add    Delete  Elements  Print   View   On     Off     Link
Mean             0.690  0.804  0.634  1.363  8.402   0.562     0.564   1.22   3.58   24.023  0.849
Std deviation    0.050  0.059  0.088  1.037  4.669   0.070     0.0747  0.260  0.498  0.481   0.170
Minimum          0.629  0.708  0.549  0.516  0.505   0.517     0.517   0.757  3.015  23.707  0.637
25% percentile   0.665  0.775  0.594  1.049  1.154   0.534     0.536   0.998  3.143  23.750  0.798
Median           0.678  0.800  0.623  1.11   11.132  0.550     0.552   1.214  3.500  23.886  0.853
75% percentile   0.700  0.820  0.653  1.233  11.189  0.562     0.560   1.433  3.850  24.034  0.877
Maximum          1.016  1.279  1.631  6.25   11.846  1.227     1.149   1.691  4.562  25.326  1.261

Table 6.2: Performance test statistics summary, measured in seconds

Resource usage

The resources used by the modules of the framework are measured using the Docker statistics tool [129]. A summary of the resources used is given in Table 6.3. When the framework is idle, resource usage is negligible. When a plugin is active, there is a slight increase in resource usage; this increase depends on the runtime size of the plugin, unknown to the framework. The increase peaks when the plugin is processing media: CPU usage is 40% on one core, which implies that on one CPU core only two plugins can be active simultaneously before reaching the ceiling of the processing power. In a production environment of the framework, plugins need to be tested thoroughly, so that these metrics are known beforehand. These metrics imply that the length of streams should be kept short, to avoid having many plugins active simultaneously.

Size of framework

The total sizes of the Docker images of the components of the framework are given in Table 6.4. Most images are quite large: the framework core components have an average size of 724 MB, and the plugins have sizes ranging from 1 GB to 3 GB. This size can be explained by the base images and additionally installed software in the images. For development flexibility, the base images used are Linux Ubuntu images, which are typically larger than other Linux distributions. For the plugins, the full GStreamer library with all plugins was installed, which is more than 2 GB large. The sizes of the components can be reduced in a production environment by choosing slimmer Linux distributions as base images and only installing the minimally needed libraries to get a working plugin.

Condition                              Container     CPU usage [%]  Memory usage [MiB]
Idle                                   streamer      1.00           42.09
                                       consumer      0.03           24.4
                                       producer      0.01           24.14
1 plugin active, not processing media  streamer      1.56           42.48
                                       consumer      0.02           24.42
                                       producer      0.02           24.23
                                       mycam plugin  0.75           45.97
1 plugin active, processing media      streamer      1.56           42.51
                                       consumer      0.02           24.42
                                       producer      0.02           24.24
                                       mycam plugin  40.03          99.24

Table 6.3: Resource usage of the framework in several conditions

Image     Size [MB]
streamer  718
consumer  729
producer  729
testsrc   1250
mycam     3020

Table 6.4: Total size of framework components

6.1.2 Interoperability evaluation

The systems with which the framework exchanges data are the plugins. These plugins must follow the plugin model presented in Section 2.3.1: implement the presented resources using a REST API, the state machine and the protocols. If these specifications are followed by a plugin, the framework should have no issues exchanging information with the plugin. To test this, a new mock plugin is implemented. For each resource of the plugin, the framework is given random mock input data to exchange with the plugin. When the exchange is complete, the values in the plugin are requested and compared with the given input; if the input matches the value in the plugin, the exchange was successful. These tests were executed 50,000 times. The results are summarized in Table 6.5. Play, pause and stop are the requests to change the state of the plugin. The source/listener add, update and delete commands manipulate the sources and listeners of the plugin. Overall, there were almost no errors made when exchanging information; only when updating a source and deleting a listener was there one incorrect exchange. The ratios achieved are always 100% correct exchanges, except for updating a source and deleting a listener, which are at 99.998%. IS-1 and IS-2 specify that commands exchanged with the plugins need to be correct 99.99% of the uptime, so this requirement is clearly met.

Plugins also interact with each other, by transmitting media to each other according to the stream layout. This interoperability is not directly controlled by the framework, as plugins can be developed by third parties. To solve this, a plugin needs to provide its specifications to the framework before being integrated as a plugin. This allows the framework to decide whether or not two plugins will be able to interact with each other in a stream. For example, if plugin A supports MJPEG streams transmitted via RTP/UDP, it will be able to interact with a plugin B implementing the same protocols; if plugin B implements another protocol, it will not be able to interact with plugin A. If this is specified, the framework can notify a user that two plugins are not compatible. These scenarios should be avoided, which is done by specifying standard protocols for plugins.

Value      Play   Pause  Stop   Add S  Update S  Delete S  Add L  Update L  Delete L
Correct    50000  50000  50000  50000  50000     49999     50000  50000     49999
Incorrect  0      0      0      0      0         1         0      0         1
Ratio (%)  100    100    100    100    100       99.998    100    100       99.998

Table 6.5: Interoperability test results (S: Source, L: Listener)

6.1.3 Modifiability evaluation

Plugins are installed for the prototype by building and adding their image to the image directory of the Docker host. The framework does not need a restart to install these images; therefore, requirements MS-1 and MS-2 are met. End-users can extend their version of the framework with new plugins, by installing them through building the respective plugin images, meeting MS-3. Streams can be modified by linking different plugins by design, meeting MS-4. The framework can detect newly installed plugins when starting up, if the image is installed to the image directory of the Docker host; therefore, requirements MS-5 and MS-6 are met. The current prototype is only deployable on a local device, as discussed in Section 4.1, meeting requirement MS-7. The other requirements can be met by deploying the framework using a Docker overlay network, as discussed in Section 4.3, without having to implement changes to the code base. The requirements MS-8 and MS-9 are not met, but are plausible when using a different Docker deployment.

In general, the framework was designed to be modifiable for different video analysis tasks. The hybrid microkernel/microservices architecture enables this modifiability: the microkernel plugin architecture allows a user to modify a video analysis stream during framework use, and the microservices architecture allows for a modifiable deployment configuration.

6.2 Mob detection experiment results

To evaluate the detection experiment, the trained model is tested on the validation set, which contains random images from the total annotated dataset presented in Section 5.1.2. First, the results of the training of the model are presented in Section 6.2.1. Second, the metrics that were used to evaluate the model are presented in Section 6.2.2. Finally, the results of the validation are presented in Section 6.2.3.


6.2.1 Training results

To monitor training, the average loss per training epoch was measured; the resulting training evolutions are depicted in Figure 6.1. Darknet does not shuffle training data automatically and creates training batches in order of the training data provided. Since YOLO uses gradient descent for optimization, this can lead to YOLO getting stuck in local minima of the cost surface [130]. This effect is seen in Figure 6.1a around epoch 4500: every image in the training set has been loaded at least once at this point, the model was training on images from location B, and now images from location A are loaded (see Section 5.1.2). This leads to a peak in average loss, as YOLO was optimizing on images from location B and probably converging to a local minimum for that type of images. Therefore, in a second run, the data was shuffled, allowing the model to get out of local minima more easily. Figure 6.1b shows the difference in training loss: the curve is much more irregular, thanks to the shuffling of the data. Once again, the average loss decreases more around epoch 4500, when every image in the training set has been loaded at least once. The average loss stagnates at values in the interval [0.04, 0.07]. To avoid overfitting the model on the training data, which would lead to worse generalization performance, early stopping is applied. Early stopping is a generalization technique that stops the training of a neural network early, before the network starts overfitting [131]. The stopping criterion used is progress, defined as the decrease of training error in successive training epochs [131], or the slope of the loss curve depicted in Figure 6.1. This slope approaches 0 from epoch 13000 onward, so this epoch is selected as the early stopping point. Because the generalization error is not a smooth curve and consists of many local minima, it is a good idea to validate model weights in the neighborhood of the early stopping point as well, as these could potentially yield better performance on the validation set [131].

6.2.2 Metrics

The model predicts bounding boxes for objects in the images of the validation set. The bounding box provided by the annotated dataset is defined as the ground truth bounding box $B_{gt}$; the bounding box provided by the model is defined as the predicted bounding box $B_p$. To evaluate the performance of the model and select the best weights, several metrics are used. The standard metrics used to evaluate object detection problems are the Intersection over Union (IoU) and the mean Average Precision (mAP). The IoU is a metric used in common object detection challenges, such as the Pascal VOC challenge [132]. If the function $A(B_x)$ gives the area for a bounding box $B_x$, the IoU is defined as

\[ \mathrm{IoU} = \frac{A(B_p \cap B_{gt})}{A(B_p \cup B_{gt})} \qquad (6.1) \]
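For axis-aligned boxes given as (x_min, y_min, x_max, y_max) tuples, the IoU can be computed as in the following sketch:

def iou(box_a, box_b):
    # Boxes are (x_min, y_min, x_max, y_max) tuples
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    # Area of the intersection (zero if the boxes do not overlap)
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / float(area_a + area_b - inter)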

The mAP for a set of detections, another metric used in the Pascal VOC challenge, is defined as the mean over the classes of the interpolated AP for each class. A detection is considered a true positive if the IoU for the detection is greater than 0.5. The interpolated AP is given by the area under the precision-recall curve for the detections [132–134].
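The interpolated AP itself can be computed from the cumulative precision and recall values of the detections sorted by confidence, as in the following numpy sketch; this follows the all-point interpolation used in later Pascal VOC editions, offered here as an illustration rather than the exact evaluation code used:

import numpy as np

def average_precision(recalls, precisions):
    # Append sentinel values and make the precision envelope monotonically decreasing
    r = np.concatenate(([0.0], recalls, [1.0]))
    p = np.concatenate(([0.0], precisions, [0.0]))
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum the areas of the rectangles under the precision-recall curve
    idx = np.where(r[1:] != r[:-1])[0]
    return np.sum((r[idx + 1] - r[idx]) * p[idx + 1])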

The model is also tested on several videos not included in the training and validation sets, to visually evaluate detection and measure the number of frames per second that can be processed by the model.


(a) Average training loss when data is not shuffled. Vertical: average loss, horizontal: time (in training epochs).
(b) Average training loss when data is shuffled. Vertical: average loss, horizontal: time (in training epochs).

Figure 6.1: Average training loss per epoch

6.2.3 Validation results

YOLO creates a snapshot of the weights the model is using at a certain epoch every 100 epochs [83]. This makes it possible to validate each set of weights on the validation set and show the evolution of the validation performance. Figure 6.2 shows these evolutions for the average IoU and mAP metrics. The mAP gradually grows from epoch 4500 onwards and stagnates around epoch 11500. This shows that the model is not learning anymore and is at risk of overfitting. The mAP stagnates in the interval of [88%, 91%]. The average IoU shows a similar trend, but varies more, because predictions on the same images rarely are exactly the same.

validation The mAP for the 05 IoU threshold of YOLOv3 on the COCO benchmark dataset [135] is 748 comparing this to the

achieved mAP for the Last Post dataset the Last Post mAP is very high The reason for this difference is that the validation

62 Mob detection experiment results 65

(a) mAP () per epoch Vertical mAP () horizontal time (in training epochs)

(b) IoU () per epoch Vertical IoU () horizontal time (in training epochs)

Figure 62 Validation metrics per epoch

set has a high correlation with the validation set Due to the training set and validation set being extracted from videos all

images from one video are correlated in time to each other Images from the validation set are thus correlated to images in

the training set and the model is optimized on these types of images explaining the high mAP This indicates that the model is

somewhat overfitting on the training data This was confirmed when testing the model on unseen videos Although the model

could detect a mob most of the time it produced more visual errors Because this data was not annotated no metrics could be

extracted Figure 63 depicts some predictions of the model on images from the validation set The predicted bounding boxes

resemble the ground truth bounding boxes quite accurately visually

To test the speed of the predictions of the model, the total time to predict the images in the validation set was measured. For the NVIDIA GeForce GTX 980 GPU, the average prediction time for one image is 14.673 milliseconds, with a standard deviation of 0.517 milliseconds. This indicates that the upper limit of the frame rate when making predictions on a video is approximately 68 frames per second on the GPU. For comparison, predictions with the model were also made on a CPU, a 2.6 GHz Intel Core i5-2540 processor with AVX instruction speedup. The average prediction time on the CPU is 5.849 seconds, with a standard deviation of 0.438 seconds, resulting in an upper limit for the frame rate on the CPU of 0.171 frames per second. Clearly, real-time object detection with this model is only possible on a GPU. When generating predictions on a test video, the average frame rate of the video was 55 frames per second.


(a) Prediction of a large mob at location B. (b) Prediction of the mob at location A.
(c) Prediction of a small mob at location B. (d) Prediction of the mob at location B.

Figure 6.3: Predictions of the model on images in the validation set


Chapter 7

Conclusion and future work

7.1 Conclusion

Aerial thermal imaging with drones is a promising technology that can deliver many promising applications for various use cases across many different domains, such as agriculture, fire fighting, search and rescue, etc. Most applications built with this technology are built with a specific use case in mind, using a thermal camera and analysis software specifically for this use case, and therefore struggle to exchange hardware and algorithms for new use cases. Therefore, the goal of this dissertation was to design, build and test a possible backbone framework that allows building these applications in a modifiable way. The specific use case of mob detection in thermal images was investigated as a sample use case for the framework.

Chapter 2 explored the requirements of such a framework. The ASRs to achieve the goal of the framework are performance, interoperability and modifiability. Performance is needed because some use cases (like fire fighting) require real-time video analysis. Interoperability enables the framework to interact with different thermal cameras and different processing/analysis modules. Modifiability enables the framework to interchange the thermal cameras and analyzers in its process, to build applications for different use cases. A hybrid combination of the microkernel pattern and the microservices pattern is used to meet these requirements, as the microkernel pattern enables interchanging the cameras and analyzers via a plugin system, and the microservices pattern enables different deployment configurations for the framework. To build and test the framework, several technologies were needed: backbone technologies for the software architecture, a thermal camera, and an object detection algorithm for the mob detection use case.

Chapter 3 explored the state of the art of these technologies and presented the selected technologies. Thermal cameras come in all shapes and sizes and have different features according to their retail price. Contrary to intuition, the image quality is not the defining factor of the retail price, but rather the amount of extra features, such as radiometry, communication interfaces, etc. The FLIR One Pro and Therm-App were selected for this dissertation, since they offer good quality images and features for their price, and their use via smartphone platforms makes these cameras excellent for prototyping. Microservices frameworks also show a lot of variety, depending on the use case of the application using the framework: some are aimed at quick prototyping, others focus on performance, etc. Flask was selected as the microservices framework, as it is easy to use and designed for prototyping with microservices; this does come with a performance trade-off. To deploy the microservices in a plugin fashion,


the concept of containers is applied. Containers virtualize on the OS level, allowing the microservices to be moved around on the host and distributed over different hosts. The current field has some frameworks implementing this technology, with Docker being the most well-known and mature framework; it was selected for that reason. The field of object detection has a variety of solutions for the object detection problem, with varying accuracies, and some can even create predictions in real-time. The YOLOv3 algorithm, implemented in the Darknet framework, was selected, as it generalizes well onto other datasets (such as thermal images), makes relatively accurate predictions, and is able to make predictions in real-time when deployed on a device with GPU processing capabilities.

Chapter 4 presented the implemented prototype of the framework using these technologies. Two sample plugins were implemented: the filecam plugin, which serves a video read in from a file, and the display plugin, which displays this video on the local device. The framework is limited to one video processing stream for one user at a time and is deployed on a local device. It also has a security risk, as the framework has to expose the Docker daemon socket to allow the framework to manipulate the containers running the plugins. This gives the containers that run the core framework processes root access to the host system, which can be abused by potential attackers.

The mob detection experiment is presented in Chapter 5. A new thermal image dataset, called the Last Post dataset, was collected for this experiment. The dataset features videos of the Last Post ceremony filmed over the course of two weeks. What makes this dataset special is that, unlike publicly available datasets, it delivers footage of the movement of large crowds filmed from a high vantage point to simulate footage captured from a drone platform. This dataset is used to train a pre-trained YOLOv3 model via transfer learning. The dataset is manually labeled and preprocessed by removing the outliers present. Training is done on an NVIDIA GTX 980 GPU and is evaluated using the MSE loss metric.
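For reference, the MSE loss is assumed here in its usual form, averaging the squared differences between the n ground-truth values and the corresponding predictions:

    \mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2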

Chapter 6 presented the tests conducted on the framework and the detection model, and their corresponding results. The performance requirements for the frequently used commands are met by the framework. Other commands, such as removing plugins and starting up and shutting down the framework, do not meet the performance requirements, since Docker requires significant time to start, stop and remove containers. The real-time streaming requirements could not be proven, because the time between transmitting a frame and receiving a frame could not be measured directly. However, the processed videos were shown to human users, who could not distinguish between the processed video and the video played back on a local system, which makes it plausible that the framework achieved this requirement. Real-time streaming performance depends heavily on the plugin and the hardware on which it is deployed. When plugins in the framework are processing media, CPU usage increases significantly, even when only one plugin is active. This implies that the length of media processing streams should be kept as short as possible to achieve good performance. The framework is relatively big, with some plugins even having a size of 2 GB. This is mostly due to the base images and installed libraries of the plugins and core components. Because each component and plugin has its own container, libraries can't be shared, so they are redundantly installed, leading to large component sizes. This could be alleviated by using slimmer base images and only installing the minimal libraries needed. The interoperability requirements are all met by the framework; this is proven by a test exchanging mock information between the framework and plugins. The modifiability requirements regarding the plugins are met by the framework. The modifiability requirements regarding the deployment schemes are not met by the framework, but can be achieved by deploying the framework using a Docker overlay network instead of the Docker bridge network. To evaluate the trained model, the model made predictions


on a separate validation set. The model achieves an mAP of 90.52%, which is much higher than what current state-of-the-art models are achieving on benchmark datasets. This shows that the model is capable of learning the thermal features, but is also overfitting on the data due to temporal correlation between the training and validation sets. The model can predict in real-time, achieving an average frame rate of 55 frames per second when making predictions on a GPU.

7.2 Future work

This dissertation proposed a framework and implemented a prototype of it, which realizes only a part of the total framework. Object detection using deep learning, in general and applied to thermal images specifically, is still a young field. Several extensions to this research are possible.

7.2.1 Security

The framework prototype did not implement any security measures. Because communications rely on an external network in distributed configurations, these measures should be implemented to reduce the risk of attacks. To allow the components to manipulate Docker containers, the Docker host socket was exposed. As stated before, this is a serious security risk, as the container gets root access to the host. Workarounds for this problem could be to implement a Docker-in-Docker environment [136] or to deploy the containers in a VM.
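A minimal sketch of the Docker-in-Docker workaround, using the Docker SDK for Python [96] and the official docker image [136] (the container name and port are illustrative assumptions), could look like this:

    import docker

    host = docker.from_env()

    # Start a nested Docker daemon. Framework components then manipulate this
    # daemon instead of the host's /var/run/docker.sock. Note that dind itself
    # requires the privileged flag, which shifts rather than removes the risk.
    host.containers.run(
        "docker:dind",
        name="framework-dockerd",
        privileged=True,
        detach=True,
    )

    # A core component would then connect to the nested daemon over TCP, e.g.:
    # nested = docker.DockerClient(base_url="tcp://framework-dockerd:2375")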

7.2.2 Implementing a detection plugin

Due to the scope and time limit of the dissertation, a working prototype plugin containing a trained model for detecting objects in a video stream could not be made. A possible GStreamer pipeline for such a plugin is depicted in Figure 7.1, and a sketch of it is given below. This plugin is a Consumer and receives video via a udpsrc element. Frames are decoded, and the raw video is presented to the appsink GStreamer plugin, which allows the video to be dumped into an application: here, the detection model that can generate predictions on the frame. The predicted frame is then forwarded to an appsrc GStreamer plugin that puts the predicted frame in a new pipeline to transmit it to further framework plugins. It should be tested whether the detection model can run in a Docker container, since it needs GPU support to be able to predict in real-time. A solution could be to use nvidia-docker, which leverages NVIDIA GPU support in Docker containers [137].

Figure 7.1: GStreamer pipeline for a plugin with a detection model
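A minimal sketch of this pipeline with the GStreamer Python bindings follows; the ports, caps and the model.predict call are illustrative assumptions, and a full implementation would also have to set matching caps on the appsrc element:

    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import Gst

    Gst.init(None)

    # Receiving side: depayload and decode RTP/JPEG, hand raw frames to appsink.
    receive = Gst.parse_launch(
        'udpsrc port=5000 caps="application/x-rtp, encoding-name=JPEG, payload=26" '
        "! rtpjpegdepay ! jpegdec ! videoconvert "
        "! appsink name=sink emit-signals=true"
    )

    # Transmitting side: appsrc feeds predicted frames into a new RTP/JPEG pipeline.
    transmit = Gst.parse_launch(
        "appsrc name=src ! videoconvert ! jpegenc ! rtpjpegpay "
        "! udpsink host=127.0.0.1 port=5001"
    )

    def on_new_sample(sink):
        sample = sink.emit("pull-sample")
        frame = sample.get_buffer()
        # frame = model.predict(frame)  # hypothetical detection model call
        transmit.get_by_name("src").emit("push-buffer", frame)
        return Gst.FlowReturn.OK

    receive.get_by_name("sink").connect("new-sample", on_new_sample)
    transmit.set_state(Gst.State.PLAYING)
    receive.set_state(Gst.State.PLAYING)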


7.2.3 Different deployment configurations

The prototype of the framework implemented only one of the deployment configurations presented in Section 2.3.3. Other configurations can be explored by changing the Docker bridge network to a Docker overlay network.
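As a rough sketch with the Docker SDK for Python [96] (the network and image names are illustrative, and overlay networks require Docker running in swarm mode [112]):

    import docker

    client = docker.from_env()

    # Create an attachable overlay network spanning the swarm's hosts,
    # replacing the single-host bridge network used by the prototype.
    client.networks.create("framework-net", driver="overlay", attachable=True)

    # Plugins attached to this network can now reach each other across hosts.
    client.containers.run("framework/filecam", network="framework-net", detach=True)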

7.2.4 Multiple streams with different layouts

The prototype implemented only one stream with a chain-like layout. Future effort could implement support for multiple streams that run concurrently. The layout can be changed by implementing a plugin that can forward media to multiple sources, or merge media coming from different sources, which is the concept of sensor fusion.
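For instance, a forwarding plugin could duplicate one incoming stream to two downstream consumers with GStreamer's tee element; in the sketch below the host names and ports are illustrative assumptions:

    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import Gst

    Gst.init(None)

    # One RTP/JPEG input is duplicated by tee into two outgoing branches.
    pipeline = Gst.parse_launch(
        'udpsrc port=5000 caps="application/x-rtp, encoding-name=JPEG, payload=26" '
        "! rtpjpegdepay ! tee name=t "
        "t. ! queue ! rtpjpegpay ! udpsink host=consumer-a port=5001 "
        "t. ! queue ! rtpjpegpay ! udpsink host=consumer-b port=5002"
    )
    pipeline.set_state(Gst.State.PLAYING)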

7.2.5 Implementing the plugin distribution service (Remote Producer/Consumer)

Chapter 2 presented the Remote Producer and Consumer that distribute the plugins available for the framework. This was deemed out of scope for the prototype, but could be implemented in future versions.

7.2.6 Using high-performance microservices backbone frameworks

The current implementation uses the Flask framework, which is excellent for prototyping but not ideal for high performance. Other frameworks, such as Vert.x, focus on high performance through asynchronous messaging, which could improve the performance of the framework.

7.2.7 New object detection models and datasets specifically for thermal images

Current effort in object detection models goes towards challenges on benchmark datasets of visual images, such as ImageNet and Pascal VOC. There are some thermal datasets publicly available for some detection purposes, but these are very small compared to the visual image datasets. Future research could create new benchmark datasets, similar to the visual image datasets, specifically for thermal images.

Currently, publicly available pre-trained neural network models are designed for and trained on visual image datasets. Future research could go towards designing an architecture specifically for thermal images and training a model on such a benchmark dataset.

Thermal images use several colormaps to map the relative temperatures in a scene onto colors presenting warm and cold regions. Well-known examples are the Iron scheme (used in this dissertation), White-hot and Black-hot. Some companies implement threshold colors that highlight very hot spots or very cold spots in an image (for examples, see [138, 139]). Future research could investigate how models trained on images using different color schemes differ in their predictions and performances.

Thermal images could potentially benefit from radiometric information, which adds a temperature dimension to each pixel in the image instead of the relative coloring. This information could lead to more accurate predictions.


Bibliography

[1] S. G. Gupta, M. M. Ghonge and P. Jawandhiya, "Review of Unmanned Aircraft System", International Journal of Advanced Research in Computer Engineering & Technology, vol. 2, no. 4, pp. 2278-1323, 2013, ISSN: 2278-1323.
[2] M. Hassanalian and A. Abdelkefi, "Classifications, applications and design challenges of drones: A review", 2017. DOI: 10.1016/j.paerosci.2017.04.003. [Online]. Available: http://ac.els-cdn.com/S0376042116301348/1-s2.0-S0376042116301348-main.pdf
[3] M. Joel, The Booming Business of Drones, 2013. [Online]. Available: https://hbr.org/2013/01/the-booming-business-of-drones (visited on 01/30/2018).
[4] DJI, Zenmuse H3-2D. [Online]. Available: https://www.dji.com/zenmuse-h3-2d (visited on 01/30/2018).
[5] Gimbal Guard, Drop & Delivery Device for DJI Mavic Pro. [Online]. Available: http://www.gimbal-guard.com/_p/prd13/4610820141/product/drop-%26-delivery-device-for-dji-mavic-pro (visited on 01/30/2018).
[6] FLIR Systems, Aerial Thermal Imaging Kits. [Online]. Available: http://www.flir.com/suas/aerial-thermal-imaging-kits (visited on 01/30/2018).
[7] R. Gade and T. B. Moeslund, "Thermal cameras and applications: a survey", Machine Vision and Applications, vol. 25, pp. 245-262, 2014. DOI: 10.1007/s00138-013-0570-5. [Online]. Available: https://link.springer.com/content/pdf/10.1007%2Fs00138-013-0570-5.pdf
[8] M. C. Harvey, J. V. Rowland and K. M. Luketina, "Drone with thermal infrared camera provides high resolution georeferenced imagery of the Waikite geothermal area, New Zealand", 2016. DOI: 10.1016/j.jvolgeores.2016.06.014. [Online]. Available: https://ac.els-cdn.com/S0377027316301421/1-s2.0-S0377027316301421-main.pdf
[9] S. Amici, M. Turci, S. Giammanco, L. Spampinato and F. Giulietti, "UAV Thermal Infrared Remote Sensing of an Italian Mud Volcano", vol. 2, pp. 358-364, 2013. DOI: 10.4236/ars.2013.24038. [Online]. Available: http://www.scirp.org/journal/ars; http://dx.doi.org/10.4236/ars.2013.24038
[10] J. Bendig, A. Bolten and G. Bareth, "INTRODUCING A LOW-COST MINI-UAV FOR THERMAL- AND MULTISPECTRAL-IMAGING", 2012. [Online]. Available: https://www.int-arch-photogramm-remote-sens-spatial-inf-sci.net/XXXIX-B1/345/2012/isprsarchives-XXXIX-B1-345-2012.pdf

[11] Workswell, "Using the UAV Thermography for Cultivation and Phenotyping of Cereals", Tech. Rep., 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/Cultivation-and-Phenotyping-1.pdf
[12] A. J. Rivera, A. D. Villalobos, J. C. Monje, J. A. Mariñas and C. M. Oppus, "Post-disaster rescue facility: Human detection and geolocation using aerial drones", IEEE Region 10 Annual International Conference, Proceedings/TENCON, pp. 384-386, 2017, ISSN: 21593450. DOI: 10.1109/TENCON.2016.7848026.
[13] P. Christiansen, K. A. Steen, R. N. Jørgensen and H. Karstoft, "Automated detection and recognition of wildlife using thermal cameras", Sensors (Basel, Switzerland), vol. 14, no. 8, pp. 13778-93, Jul. 2014, ISSN: 1424-8220. DOI: 10.3390/s140813778. [Online]. Available: http://www.ncbi.nlm.nih.gov/pubmed/25196105; http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=PMC4179058
[14] J. Zhang, J. Hu, J. Lian, Z. Fan, X. Ouyang and W. Ye, "Seeing the forest from drones: Testing the potential of lightweight drones as a tool for long-term forest monitoring", Biological Conservation, vol. 198, pp. 60-69, 2016. [Online]. Available: http://ac.els-cdn.com/S0006320716301100/1-s2.0-S0006320716301100-main.pdf
[15] D. Ventura, M. Bruno, G. Jona Lasinio, A. Belluscio and G. Ardizzone, "A low-cost drone based application for identifying and mapping of coastal fish nursery grounds", Estuarine, Coastal and Shelf Science, vol. 171, pp. 85-98, Mar. 2016, ISSN: 02727714. DOI: 10.1016/j.ecss.2016.01.030. [Online]. Available: http://linkinghub.elsevier.com/retrieve/pii/S0272771416300300
[16] S. Chowdhury, A. Emelogu, M. Marufuzzaman, S. G. Nurre and L. Bian, "Drones for disaster response and relief operations: A continuous approximation model", 2017. DOI: 10.1016/j.ijpe.2017.03.024. [Online]. Available: www.elsevier.com/locate/ijpe
[17] Workswell, "Pipeline inspection with thermal diagnostics", 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/pipeline.pdf
[18] Workswell, "Thermo diagnosis of photovoltaic power plants", 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/Workswell-WIRIS_photovoltaic.pdf
[19] Workswell, "Thermodiagnostics of flat roofs", 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/roof.pdf
[20] Workswell, "Thermodiagnostics in the power engineering sector", Tech. Rep., 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/highvoltage.pdf
[21] Workswell, Workswell WIRIS - Product - Thermal camera for drones, 2016. [Online]. Available: https://www.drone-thermal-camera.com/wiris (visited on 01/30/2018).
[22] TEAX Technology, ThermalCapture - Thermal Imaging Technology | Capture raw radiometric thermal data with drones. [Online]. Available: http://thermalcapture.com (visited on 01/30/2018).

[23] DJI, Zenmuse XT - unlock the possibilities of sight - DJI, 2018. [Online]. Available: https://www.dji.com/zenmuse-xt (visited on 01/30/2018).
[24] Workswell, SOFTWARE - Workswell WIRIS - Thermal camera for drones, 2016. [Online]. Available: https://www.drone-thermal-camera.com/software (visited on 01/31/2018).
[25] Therm-App, Therm-App™ - Android apps on Google Play, 2018. [Online]. Available: https://play.google.com/store/apps/details?id=com.thermapp (visited on 01/31/2018).
[26] B. Satzger, W. Hummer, C. Inzinger, P. Leitner and S. Dustdar, "Winds of change: From vendor lock-in to the meta cloud", IEEE Internet Computing, vol. 17, no. 1, pp. 69-73, 2013, ISSN: 10897801. DOI: 10.1109/MIC.2013.19.
[27] J. Divya, Drone Technology and Usage: Current Uses and Future Drone Technology, 2017. [Online]. Available: http://uk.businessinsider.com/drone-technology-uses-2017-7?r=US&IR=T (visited on 01/31/2018).
[28] A. Boulanger, "Open-source versus proprietary software: Is one more reliable and secure than the other?", IBM Systems Journal, vol. 44, no. 2, pp. 239-248, 2005, ISSN: 0018-8670. DOI: 10.1147/sj.442.0239. [Online]. Available: http://ieeexplore.ieee.org/document/5386727
[29] M. Kazmeyer, Disadvantages of Proprietary Software. [Online]. Available: http://smallbusiness.chron.com/disadvantages-proprietary-software-65430.html (visited on 01/31/2018).
[30] B. Steffen and A. Seyfried, "Methods for measuring pedestrian density, flow, speed and direction with minimal scatter", Physica A: Statistical Mechanics and its Applications, vol. 389, no. 9, pp. 1902-1910, May 2010, ISSN: 0378-4371. DOI: 10.1016/J.PHYSA.2009.12.015. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0378437109010115?via%3Dihub
[31] M. Wirz, T. Franke, D. Roggen, E. Mitleton-Kelly, P. Lukowicz and G. Tröster, "Inferring crowd conditions from pedestrians' location traces for real-time crowd monitoring during city-scale mass gatherings", Proceedings of the Workshop on Enabling Technologies: Infrastructure for Collaborative Enterprises, WETICE, pp. 367-372, 2012, ISSN: 15244547. DOI: 10.1109/WETICE.2012.26.
[32] E. Alpaydin, Introduction to machine learning, 3rd ed. MIT Press, 2014, p. 591, ISBN: 026201243X. [Online]. Available: https://dl.acm.org/citation.cfm?id=1734076
[33] J. W. Davis and V. Sharma, "Robust background-subtraction for person detection in Thermal Imagery", IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, vol. 2004-January, no. January, 2004, ISSN: 21607516. DOI: 10.1109/CVPR.2004.431.
[34] W. Wang, J. Zhang and C. Shen, "Improved Human Detection And Classification in Thermal Images", pp. 2313-2316, 2010.
[35] R. Appel, S. Belongie, P. Perona and P. Dollár, "Fast Feature Pyramids for Object Detection", PAMI, vol. 36, no. 8, pp. 1-14, 2014, ISSN: 01628828. DOI: 10.1109/TPAMI.2014.2300479. [Online]. Available: https://vision.cornell.edu/se3/wp-content/uploads/2014/09/DollarPAMI14pyramids_0.pdf
[36] T. Goedemé, "Projectresultaten VLAIO TETRA-project", KU Leuven, Louvain, Tech. Rep., 2017.

[37] L.-L. Slattery, DroneSAR wants to turn drones into search-and-rescue heroes, 2017. [Online]. Available: https://www.siliconrepublic.com/start-ups/dronesar-search-and-rescue-drone-software (visited on 05/26/2018).
[38] A. W. S. Inc., What Is Amazon Kinesis Video Streams?, 2018. [Online]. Available: https://docs.aws.amazon.com/kinesisvideostreams/latest/dg/what-is-kinesis-video.html (visited on 05/26/2018).
[39] U.S. Government, "Systems Engineering Fundamentals", Defence Acquisition University Press, no. January, p. 223, 2001, ISSN: 1872-7565. DOI: 10.1016/j.cmpb.2010.05.002. [Online]. Available: http://www.dtic.mil/docs/citations/ADA387507
[40] L. Bass, P. Clements and R. Kazman, Software Architecture in Practice, 3rd ed. Addison-Wesley Professional, 2012, ISBN: 0321815734, 9780321815736.
[41] J. Greene and M. Stellman, Applied Software Project Management, 2006, p. 324, ISBN: 978-0596009489. [Online]. Available: http://www.oreilly.com/catalog/appliedprojectmgmt
[42] S. Barber, Acceptable application response times vs. industry standard, 2018. [Online]. Available: https://searchsoftwarequality.techtarget.com/tip/Acceptable-application-response-times-vs-industry-standard (visited on 05/28/2018).
[43] T. Burger, How Fast Is Realtime? Human Perception and Technology | PubNub, 2015. [Online]. Available: https://www.pubnub.com/blog/how-fast-is-realtime-human-perception-and-technology (visited on 05/28/2018).
[44] S.-t. Modeling, P. Glennie and N. Thrift, "Time perception models", Neuron, pp. 15696-15699, 1992.
[45] M. Richards, Software Architecture Patterns, First edition, Heather Scherer, Ed., O'Reilly Media, 2015. [Online]. Available: http://www.oreilly.com/programming/free/files/software-architecture-patterns.pdf
[46] C. Richardson, Microservice Architecture pattern, 2017. [Online]. Available: http://microservices.io/patterns/microservices.html (visited on 12/02/2017).
[47] P. Clements, F. Bachmann, L. Bass, D. Garlan, J. Ivers, R. Little, P. Merson, R. Nord and J. Stafford, Documenting Software Architectures, Second edition. Boston: Pearson Education, Inc., 2011, ISBN: 0-321-55268-7.
[48] Object Management Group, "Unified Modeling Language v2.5.1", no. December, 2017. [Online]. Available: http://www.omg.org/spec/UML/2.5.1
[49] C. De La Torre, C. Maddock, J. Hampton, P. Kulikov and M. Jones, Communication in a microservice architecture, 2017. [Online]. Available: https://docs.microsoft.com/en-us/dotnet/standard/microservices-architecture/architect-microservice-container-applications/communication-in-microservice-architecture (visited on 04/27/2018).
[50] H. Schulzrinne and S. Casner, "RTP Profile for Audio and Video Conferences with Minimal Control", 2003. [Online]. Available: https://tools.ietf.org/html/rfc3551
[51] D. Bull, Communicating Pictures: A Course in Image and Video Coding. Elsevier Science, 2014, ISBN: 9780080993744. [Online]. Available: https://books.google.be/books?id=PDZOAwAAQBAJ
[52] On-Net Surveillance Systems Inc., "MJPEG vs. MPEG4: Understanding the differences, advantages and disadvantages of each compression technique", 2006. [Online]. Available: www.onssi.com

[53] M. M. A. V. Protocol, Introduction - MAVLink Developer Guide, 2013. [Online]. Available: https://mavlink.io/en (visited on 09/14/2017).
[54] H. Schlosser, Microservices trends 2017: Strategies, tools and frameworks - JAXenter, 2017. [Online]. Available: https://jaxenter.com/microservices-trends-2017-survey-133265.html (visited on 03/24/2018).
[55] A. Ronacher, Welcome to Flask — Flask Documentation (0.12), 2017. [Online]. Available: http://flask.pocoo.org/docs/0.12 (visited on 03/24/2018).
[56] F. Reyes, PythonDecorators, 2017. [Online]. Available: https://wiki.python.org/moin/PythonDecorators (visited on 04/27/2018).
[57] Stackshare, Companies that use Flask and Flask Integrations, 2018. [Online]. Available: https://stackshare.io/flask (visited on 03/24/2018).
[58] Falcon, Falcon - Bare-metal web API framework for Python. [Online]. Available: https://falconframework.org/#sectionAbout (visited on 03/24/2018).
[59] Stackshare, Companies that use Falcon and Falcon Integrations, 2018. [Online]. Available: https://stackshare.io/falcon (visited on 03/24/2018).
[60] A. Ronacher, Nameko for Microservices, 2015. [Online]. Available: http://lucumr.pocoo.org/2015/4/8/microservices-with-nameko (visited on 03/24/2018).
[61] C. Escoffier, Building Reactive Microservices in Java, 2017, ISBN: 9781491986264.
[62] C. Posta, Microservices for Java Developers, ISBN: 9781491963081.
[63] R. Dua, A. R. Raja and D. Kakadia, "Virtualization vs Containerization to support PaaS", in IEEE International Conference on Cloud Engineering, 2014, ISBN: 9781479937660. DOI: 10.1109/IC2E.2014.41.
[64] D. Merkel, "Docker: Lightweight Linux Containers for Consistent Development and Deployment", 2014. [Online]. Available: http://delivery.acm.org/10.1145/2610000/2600241/11600.html (visited on 03/19/2018).
[65] Docker Inc., Docker for the Virtualization Admin, 2016, p. 12.
[66] Docker Inc., What is a Container?, 2018. [Online]. Available: https://www.docker.com/what-container (visited on 03/24/2018).
[67] M. Helsley, LXC: Linux container tools, 2009. [Online]. Available: https://www.ibm.com/developerworks/linux/library/l-lxc-containers (visited on 05/21/2018).
[68] J. Fink, Docker: a Software as a Service, Operating System-Level Virtualization Framework, 2014. [Online]. Available: http://journal.code4lib.org/articles/9669 (visited on 03/19/2018).

[69] C. Wang, What is Docker? Linux containers explained, 2017. [Online]. Available: https://www.infoworld.com/article/3204171/linux/what-is-docker-linux-containers-explained.html (visited on 05/21/2018).
[70] CoreOS, Rkt: a security-minded, standards-based container engine. [Online]. Available: https://coreos.com/rkt (visited on 03/24/2018).
[71] F. X. F. Xu, X. L. X. Liu and K. Fujimura, "Pedestrian detection and tracking with night vision", IEEE Transactions on Intelligent Transportation Systems, vol. 6, no. 1, pp. 63-71, 2005, ISSN: 1524-9050. DOI: 10.1109/TITS.2004.838222.
[72] H. Nanda and L. Davis, "Probabilistic template based pedestrian detection in infrared videos", IEEE Intelligent Vehicles Symposium, Proceedings, vol. 1, pp. 15-20, 2003. DOI: 10.1109/IVS.2002.1187921.
[73] R. E. Schapire, "Explaining AdaBoost", Empirical Inference: Festschrift in Honor of Vladimir N. Vapnik, pp. 37-52, 2013. DOI: 10.1007/978-3-642-41136-6_5.
[74] P. Viola, O. M. Way, M. J. Jones and D. Snow, "Detecting pedestrians using patterns of motion and appearance", International Journal of Computer Vision, vol. 63, no. 2, pp. 153-161, 2005. DOI: 10.1109/ICCV.2003.1238422.
[75] I. Goodfellow, Y. Bengio and A. Courville, Deep Learning. MIT Press, 2016, http://www.deeplearningbook.org.
[76] J. R. R. Uijlings, K. E. A. Van De Sande, T. Gevers and A. W. M. Smeulders, "Selective Search for Object Recognition", Tech. Rep., 2012. DOI: 10.1007/s11263-013-0620-5. arXiv: 1409.4842. [Online]. Available: http://www.cs.cornell.edu/courses/cs7670/2014sp/slides/VisionSeminar14.pdf
[77] R. Girshick, J. Donahue, T. Darrell and J. Malik, "Region-Based Convolutional Networks for Accurate Object Detection and Segmentation", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 1, pp. 142-158, 2014, ISSN: 01628828. DOI: 10.1109/TPAMI.2015.2437384. arXiv: 1311.2524.
[78] R. Girshick, "Fast R-CNN", in Proceedings of the IEEE International Conference on Computer Vision, vol. 2015 Inter, 2015, pp. 1440-1448, ISBN: 9781467383912. DOI: 10.1109/ICCV.2015.169. arXiv: 1504.08083.
[79] S. Ren, K. He, R. Girshick and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137-1149, 2016, ISSN: 01628828. DOI: 10.1109/TPAMI.2016.2577031. arXiv: 1506.01497.
[80] K. He, G. Gkioxari, P. Dollár and R. Girshick, "Mask R-CNN", arXiv, 2018. arXiv: 1703.06870v3.
[81] J. Dai, Y. Li, K. He and J. Sun, "R-FCN: Object Detection via Region-based Fully Convolutional Networks", Tech. Rep., 2016. DOI: 10.1109/ICASSP.2017.7952132. arXiv: 1605.06409. [Online]. Available: http://arxiv.org/abs/1605.06409
[82] J. Redmon, S. Divvala, R. Girshick and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection", 2015, ISSN: 01689002. DOI: 10.1109/CVPR.2016.91. arXiv: 1506.02640. [Online]. Available: http://arxiv.org/abs/1506.02640
[83] J. Redmon and A. Farhadi, "YOLOv3: An Incremental Improvement", arXiv, 2018. [Online]. Available: https://pjreddie.com/media/files/papers/YOLOv3.pdf
[84] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu and A. C. Berg, "SSD: Single Shot MultiBox Detector", arXiv, 2016. arXiv: 1512.02325v5.

[85] B. Zoph and Q. V. Le, "Neural Architecture Search with Reinforcement Learning", in ICLR, 2017, pp. 1-16. arXiv: 1611.01578v2.
[86] T.-Y. Lin, P. Goyal, R. Girshick, K. He and P. Dollár, "Focal Loss for Dense Object Detection", arXiv, 2018. arXiv: 1708.02002v2.
[87] Facebook Inc., ONNX - About, 2017. [Online]. Available: https://onnx.ai/about (visited on 05/21/2018).
[88] TensorFlow, TensorFlow, 2018. [Online]. Available: https://www.tensorflow.org (visited on 05/21/2018).
[89] J. Huang, V. Rathod, C. Sun, M. Zhu, A. Korattikara, A. Fathi, I. Fischer, Z. Wojna, Y. Song, S. Guadarrama and K. Murphy, "Speed/accuracy trade-offs for modern convolutional object detectors", arXiv, 2017. arXiv: 1611.10012v3.
[90] J. Redmon, Darknet: Open source neural networks in C, http://pjreddie.com/darknet, 2013-2016.
[91] Microsoft, The Microsoft Cognitive Toolkit | Microsoft Docs, 2018. [Online]. Available: https://docs.microsoft.com/en-us/cognitive-toolkit/index (visited on 05/21/2018).
[92] Docker Inc., Overview of Docker Compose | Docker Documentation, 2018. [Online]. Available: https://docs.docker.com/compose/overview (visited on 04/27/2018).
[93] Docker Inc., Use bridge networks | Docker Documentation, 2018. [Online]. Available: https://docs.docker.com/network/bridge (visited on 04/27/2018).
[94] A. Ronacher, Click Documentation (5.0), 2017. [Online]. Available: http://click.pocoo.org/5 (visited on 04/27/2018).
[95] A. K. Reitz, Requests: HTTP for Humans — Requests 2.18.4 documentation, 2018. [Online]. Available: http://docs.python-requests.org/en/master (visited on 05/09/2018).
[96] Docker Inc., Docker SDK for Python — Docker SDK for Python 2.0 documentation, 2018. [Online]. Available: https://docker-py.readthedocs.io/en/stable (visited on 05/12/2018).
[97] GStreamer, GStreamer: open source multimedia framework, 2018. [Online]. Available: https://gstreamer.freedesktop.org (visited on 05/13/2018).
[98] E. Walthinsen, filesrc - GStreamer Core Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gstreamer-plugins/html/gstreamer-plugins-filesrc.html (visited on 05/13/2018).
[99] E. Hervey, decodebin - GStreamer Base Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-base-plugins/html/gst-plugins-base-plugins-decodebin.html (visited on 05/13/2018).
[100] W. Taymans, jpegenc - GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-jpegenc.html (visited on 05/13/2018).

[101] A. Communications, rtpjpegpay - GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-rtpjpegpay.html (visited on 05/13/2018).
[102] W. Taymans, udpsink - GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-udpsink.html (visited on 05/13/2018).
[103] GStreamer, Basic tutorial 3: Dynamic pipelines. [Online]. Available: https://gstreamer.freedesktop.org/documentation/tutorials/basic/dynamic-pipelines.html (visited on 05/13/2018).
[104] W. Taymans, udpsrc - GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-udpsrc.html (visited on 05/14/2018).
[105] W. Taymans, rtpjpegdepay - GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-rtpjpegdepay.html (visited on 05/14/2018).
[106] A. Loonstra, "Videostreaming with GStreamer". [Online]. Available: http://mediatechnology.leiden.edu/images/uploads/docs/wt2014_gstreamer.pdf
[107] W. Taymans, jpegdec - GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-jpegdec.html (visited on 05/14/2018).
[108] J. Schmidt, autovideosink - GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-autovideosink.html (visited on 05/14/2018).
[109] A. Ronacher, Deployment Options — Flask 0.12.4 documentation, 2018. [Online]. Available: http://flask.pocoo.org/docs/0.12/deploying (visited on 05/14/2018).
[110] R. Yasrab, "Mitigating Docker Security Issues", University of Science and Technology of China, Hefei, Tech. Rep. [Online]. Available: https://arxiv.org/pdf/1804.05039.pdf
[111] Lvh, Don't expose the Docker socket (not even to a container), 2015. [Online]. Available: https://www.lvh.io/posts/dont-expose-the-docker-socket-not-even-to-a-container.html (visited on 05/15/2018).
[112] Docker Inc., Use overlay networks | Docker Documentation, 2018. [Online]. Available: https://docs.docker.com/network/overlay/#customize-the-docker_gwbridge-interface (visited on 05/15/2018).
[113] J. W. Davis and M. A. Keck, "A Two-Stage Template Approach to Person Detection in Thermal Imagery", Proc. Workshop on Applications of Computer Vision, 2005. [Online]. Available: http://vcipl-okstate.org/pbvs/bench/papers/wacv05.pdf

[114] J. W. Davis and V. Sharma, "Background-subtraction using contour-based fusion of thermal and visible imagery", Computer Vision and Image Understanding, vol. 106, no. 2-3, pp. 162-182, 2007. DOI: 10.1016/j.cviu.2006.06.010. [Online]. Available: https://web.cse.ohio-state.edu/~davis.1719/Publications/cviu07.pdf
[115] S. Hwang, J. Park, N. Kim, Y. Choi and I. S. Kweon, "Multispectral Pedestrian Detection: Benchmark Dataset and Baseline", CVPR, 2015. [Online]. Available: https://sites.google.com/site/pedestrianbenchmark
[116] Z. Wu, N. Fuller, D. Theriault and M. Betke, "A Thermal Infrared Video Benchmark for Visual Analysis", IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2014. DOI: 10.1109/CVPRW.2014.39. [Online]. Available: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.735.2216&rep=rep1&type=pdf
[117] R. Miezianko, Terravic research infrared database.
[118] R. Miezianko, Terravic research infrared database.
[119] S. Z. Li, R. Chu, S. Liao and L. Zhang, "Illumination Invariant Face Recognition Using Near-Infrared Images", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 4, pp. 627-639, 2007. DOI: 10.1109/TPAMI.2007.1014. [Online]. Available: http://vcipl-okstate.org/pbvs/bench/papers/NIR.pdf
[120] A. Akula, R. Ghosh, S. Kumar and H. K. Sardana, "Moving target detection in thermal infrared imagery using spatiotemporal information", J. Opt. Soc. Am. A, vol. 30, no. 8, pp. 1492-1501, Aug. 2013. DOI: 10.1364/JOSAA.30.001492. [Online]. Available: http://josaa.osa.org/abstract.cfm?URI=josaa-30-8-1492
[121] R. I. Hammoud, IEEE OTCBVS WS Series Bench. [Online]. Available: http://vcipl-okstate.org/pbvs/bench (visited on 05/18/2018).
[122] Last Post Association, Mission, 2018. [Online]. Available: http://www.lastpost.be/en/the-last-post/mission (visited on 05/18/2018).
[123] FLIR Systems, Inc., FLIR One Pro, 2017. [Online]. Available: https://www.flir.com/globalassets/imported-assets/document/17-1746-oem-flir_one_pro_datasheet_final_v1_web.pdf
[124] R. J. Ramana, Introduction to Camouflage and Deception. Defence Scientific Information & Documentation Centre, pp. 99-164.
[125] A. Bornstein and I. Richter, Microsoft visual object tagging tool. [Online]. Available: https://github.com/Microsoft/VoTT (visited on 05/20/2018).
[126] F. E. Grubbs, "Procedures for Detecting Outlying Observations in Samples", Technometrics, vol. 11, no. 1, pp. 1-21, Feb. 1969. DOI: 10.1080/00401706.1969.10490657. [Online]. Available: http://www.tandfonline.com/doi/abs/10.1080/00401706.1969.10490657
[127] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li and L. Fei-Fei, "ImageNet: A Large-Scale Hierarchical Image Database", in CVPR09, 2009. [Online]. Available: http://www.image-net.org/papers/imagenet_cvpr09.pdf

[128] D. Gupta, Transfer learning & The art of using Pre-trained Models in Deep Learning, 2017. [Online]. Available: https://www.analyticsvidhya.com/blog/2017/06/transfer-learning-the-art-of-fine-tuning-a-pre-trained-model (visited on 05/20/2018).
[129] Docker Inc., docker stats | Docker Documentation, 2018. [Online]. Available: https://docs.docker.com/engine/reference/commandline/stats (visited on 05/24/2018).
[130] M. Gori and A. Tesi, "On the Problem of Local Minima in Recurrent Neural Networks", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 14, no. 1, pp. 76-86, 1992. DOI: 10.1109/34.107014.
[131] L. Prechelt, "Early stopping - but when?", in Neural Networks: Tricks of the Trade, G. B. Orr and K.-R. Müller, Eds., Berlin, Heidelberg: Springer Berlin Heidelberg, 1998, pp. 55-69, ISBN: 978-3-540-49430-0. DOI: 10.1007/3-540-49430-8_3. [Online]. Available: https://doi.org/10.1007/3-540-49430-8_3
[132] M. Everingham, L. Van Gool, C. K. Williams, J. Winn and A. Zisserman, "The Pascal visual object classes (VOC) challenge", International Journal of Computer Vision, vol. 88, no. 2, pp. 303-338, 2010, ISSN: 09205691. DOI: 10.1007/s11263-009-0275-4.
[133] M. Everingham, S. M. A. Eslami, L. Van Gool, C. K. I. Williams, J. Winn and A. Zisserman, "The Pascal Visual Object Classes Challenge: A Retrospective", International Journal of Computer Vision, vol. 111, no. 1, pp. 98-136, 2014, ISSN: 15731405. DOI: 10.1007/s11263-014-0733-5.
[134] P. Henderson and V. Ferrari, "End-to-end training of object class detectors for mean average precision", Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 10115 LNCS, pp. 198-213, 2017, ISSN: 16113349. DOI: 10.1007/978-3-319-54193-8_13. arXiv: 1607.03476.
[135] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár and C. L. Zitnick, "Microsoft COCO: Common objects in context", Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 8693 LNCS, no. PART 5, pp. 740-755, 2014, ISSN: 16113349. DOI: 10.1007/978-3-319-10602-1_48. arXiv: 1405.0312.
[136] Docker Inc., Library/docker, 2018. [Online]. Available: https://hub.docker.com/_/docker (visited on 06/01/2018).
[137] Nvidia, nvidia-docker. [Online]. Available: https://github.com/NVIDIA/nvidia-docker (visited on 05/25/2018).
[138] FLIR, "FLIR One". [Online]. Available: http://www.flir.com/uploadedFiles/Store/Products/FLIR-ONE/3rd-Gen/FLIR-ONE/FLIR-ONE-Gen-3-Datasheet.pdf
[139] FLIR, "FLIR Boson", p. 2, 2016.


Appendix A

Firefighting department email conversations

This appendix contains the email conversations with different firefighting departments in Belgium, as part of an exploration of the functional requirements of an aerial thermal imaging solution. Note that all conversations were translated from Dutch to English.

A.1 General email sent to Firefighting departments

This email was sent to the departments mentioned later in this appendix. The responses in the following sections are responses to this email.

Subject: Firefighting department - Thesis thermal drones

Dear Sir/Madam

My name is Brecht Verhoeve. I am a student Master of Science in Computer Science Engineering at Ghent University. I am contacting your department with reference to the research of my master's dissertation. I am currently researching the applications of thermal cameras in combination with commercial drones. They can create an aerial overview of scenes and objects that often can't be spotted with visual detectors, like hidden persons, fires or hot explosives. The eventual goal is to let a computer indicate these objects of interest autonomously on the thermal images of the drone. These images could aid a firefighter with their work.

For this research I have some questions for you.

Functionality

I have enlisted some functionalities which I believe could be interesting for a firefighter:

• Detection of persons in buildings (find potential victims)
• Detection of hidden fires in buildings (to identify danger zones)
• Detection of fires on vast terrains (forests, industrial terrains)
• Indication of hot explosives

I have two questions:

• Do you agree that these are the most important functions?
• Are there any other functions that you deem important?

Quality of the application

Next to the functionality, the quality of the application is also important. For me the most important aspects are:

• Accuracy: the software must be accurate. There is no room for errors when detecting.
• Speed: the software must operate quickly. An overview must be created quickly to not waste time in case of an emergency.
• Usability: the software must be easy to use.

Once again I have two questions:

• Do you agree with these qualities?
• Are there any other important qualities that you deem important?

I would like to thank you in advance for your time.

Best regards

Brecht Verhoeve

A.2 Conversation with Firefighting department of Antwerp, Belgium

The answers were given inline. For clarity, these are given explicitly.

Subject: Re: Firefighting department Antwerp - Thesis thermal drones

Answers can be found in your email.

Best regards

Functionality: Detection of hidden fires in buildings and environments. Are there any other functions that you deem important? Capture the evolution of a fire with the thermal camera. Visualise incidents during night-time. Capture invisible fires, such as hydrogen or methane fires.

A.3 Conversation with Firefighting department of Ostend, Belgium

The answers were given inline. For clarity, these are given explicitly.

Subject: Re: Firefighting department Ostend - Thesis thermal drones

Dear Brecht

You can find the answers after the questions in your email.

Best Regards

Functionality: Are there any other functions that you deem important? These are the most important for us at the moment.

Quality of the application: Are there any other important qualities that you deem important? The application must work autonomously.

A.4 Conversation with Firefighting department of Courtrai, Belgium

Subject: Re: Firefighting department Courtrai - Thesis thermal drones

Dear Brecht

Beneath you will find our answers (next to the already mentioned items).

Functionality

• The detection of persons in a landscape. For example, missing persons after a traffic accident: there are searches in the dark for victims that were catapulted from a vehicle. Today this is done via a thermal camera on the ground, but with a drone this could hold potential benefits. Another example is searching for missing persons in nature reserves. The police sometimes ask for the assistance of firefighters to search the area.

Quality of the application

• The images need to be processed in real-time, not after the drone has landed.

The drones must be deployable for multiple purposes.

The interpretation of the images in the future can be important for automatic flight control of drones. Currently there is a European project, "3D Safeguard", in which KU Leuven is participating. They are already quite advanced in interpreting the images from a drone to spot persons through smoke. With this information the drone can be redirected. The application can thus use the interpretations of the images to control the drone in flight.

Best regards

A.5 Conversation with Firefighting department of Ghent, Belgium

Subject: Re: Firefighting department Ghent - Thesis thermal drones

Hi Brecht

I don't know if you've received the previous email, but there you received answers to your questions.

Best regards

Subject: Re: Firefighting department Ghent - Thesis thermal drones

With respect to the functionality I would like to add:

• Measuring the temperature of containers and silos

I agree with the quality of the application. It could be handy to be able to view the application from one or more devices. Everything should have a clear overview. If possible, information and controls should be available on one screen.

I will follow up.

Best regards


Appendix B

Thermal camera specifications

This appendix gives all the specifications for the compared thermal cameras. First, the different cameras, their producing companies and average retail prices are listed in Table B.1. Second, their respective physical specifications are presented in Table B.2. Third, the image qualities are presented in Table B.3. Fourth, the thermal precisions are presented in Table B.4. Fifth, the available interfaces to interact with each camera are presented in Table B.5. Sixth, the energy consumption of each camera is presented in Table B.6. Seventh, how support is offered when developing for these platforms is presented in Table B.7. Finally, auxiliary features are presented in Table B.8.


Product Company Price (Euro)

Wiris 2nd Gen 640 Workswell 999500

Wiris 2nd Gen 336 Workswell 699500

Duo Pro R 640 FLIR 640900

Duo Pro R 336 FLIR 438484

Duo FLIR 94999

Duo R FLIR 123999

Vue 640 FLIR 268900

Vue 336 FLIR 125993

Vue Pro 640 FLIR 403218

Vue Pro 336 FLIR 230261

Vue Pro R 640 FLIR 518456

Vue Pro R 336 FLIR 345599

Zenmuse XT 640 DJI x FLIR 1181000

Zenmuse XT 336 DJI x FLIR 697000

Zenmuse XT 336 R DJI x FLIR 939000

Zenmuse XT 640 R DJI x FLIR 1423000

One FLIR 23799

One Pro FLIR 46900

Tau 2 640 FLIR 674636

Tau 2 336 FLIR 493389

Tau 2 324 FLIR 2640

Lepton 3 160 x 120 FLIR 25995

Lepton 3 80 x 60 FLIR 14338

Boson 640 FLIR 122209

Boson 320 FLIR 93842

Quark 2 640 FLIR 33165

Quark 2 336 FLIR 33165

DroneThermal v3 Flytron 34115

Compact Seek Thermal 27500

CompactXR Seek Thermal 28646

Compact Pro Seek Thermal 59900

Therm-App Opgal 93731

Therm-App TH Opgal 295000

Therm-App 25 Hz Opgal 199000

Table B.1: Compared cameras, their producing companies and their average retail price


Product Weight (g) Dimensions (mm)

Wiris 2nd Gen 640 390 135 x 77 x 69

Wiris 2nd Gen 336 390 135 x 77 x 69

Duo Pro R 640 325 85 x 813 x 685

Duo Pro R 336 325 85 x 813 x 685

Duo 84 41 x 59 x 30

Duo R 84 41 x 59 x 30

Vue 640 114 574 x 4445 x 4445

Vue 336 114 574 x 4445 x 4445

Vue Pro 640 9214 574 x 4445 x 4445

Vue Pro 336 9214 574 x 4445 x 4445

Vue Pro R 640 9214 574 x 4445 x 4445

Vue Pro R 336 9214 574 x 4445 x 4445

Zenmuse XT 640 270 103 x 74 x 102

Zenmuse XT 336 270 103 x 74 x 102

Zenmuse XT 336 R 270 103 x 74 x 102

Zenmuse XT 640 R 270 103 x 74 x 102

One 345 67 x 34 x 14

One Pro 365 68 x 34 x 14

Tau 2 640 72 444 x 444 x 444

Tau 2 336 72 444 x 444 x 444

Tau 2 324 72 444 x 444 x 444

Lepton 3 160 x 120 09 118 x 127 x 72

Lepton 3 80 x 60 09 118 x 127 x 72

Boson 640 75 21 x 21 x 11

Boson 320 75 21 x 21 x 11

Quark 2 640 8 22 x 22 x 12

Quark 2 336 8 22 x 22 x 12

DroneThermal v3 3 20 x 20 x 15

Compact 1417 254 x 444 x 203

CompactXR 1417 254 x 444 x 254

Compact Pro 1417 254 x 444 x 254

Therm-App 138 55 x 65 x 40

Therm-App TH 123 55 x 65 x 40

Therm-App 25 Hz 138 55 x 65 x 40

Table B.2: Physical specifications


Product IR Resolution (pixels) SD resolution (megapixels) Frequency (Hz) FOV Radiometry

Wiris 2nd Gen 640 640 x 512 192 not specified Various yes

Wiris 2nd Gen 336 336 x 256 192 not specified Various yes

Duo Pro R 640 640 x 512 12 30 Various lens yes

Duo Pro R 336 336 x 256 12 30 Various lens yes

Duo 160 x 120 2 75 and 83 57deg x 44deg no

Duo R 160 x 120 2 75 57deg x 44deg yes

Vue 640 640 x 512 0 75 Various lens no

Vue 336 336 x 256 0 75 Various lens no

Vue Pro 640 640 x 512 0 75 Various lens no

Vue Pro 336 336 x 256 0 75 Various lens no

Vue Pro R 640 640 x 512 0 75 Various lens yes

Vue Pro R 336 336 x 256 0 75 Various lens yes

Zenmuse XT 640 640 x 512 0 75 Various lens no

Zenmuse XT 336 336 x 256 0 75 Various lens no

Zenmuse XT 336 R 336 x 256 0 75 Various lens yes

Zenmuse XT 640 R 336 x 256 0 75 Various lens yes

One 80 x 60 15 87 50 deg x 38 deg yes

One Pro 160 x 120 15 87 55 deg x 43 deg yes

Tau 2 640 640 x 512 0 75 Various lens yes

Tau 2 336 336 x 256 0 75 Various lens yes

Tau 2 324 324 x 256 0 76 Various lens yes

Lepton 3 160 x 120 160 x 120 0 88 56 deg available

Lepton 3 80 x 60 80 x 60 0 88 56 deg no

Boson 640 640 x 512 0 90 Various lens no

Boson 320 320 x 256 0 90 Various lens no

Quark 2 640 640 x 512 0 9 Various lens no

Quark 2 336 336 x 256 0 9 Various lens no

DroneThermal v3 80 x 60 0 86 25 deg no

Compact 206 x 156 0 9 36 deg no

CompactXR 205 x 156 0 9 20 deg no

Compact Pro 320 x 240 0 15 32 deg no

Therm-App 384 x 288 0 87 Various lens no

Therm-App TH 384 x 288 0 87 Various lens yes

Therm-App 25 Hz 384 x 288 0 25 Various lens no

Table B.3: Image quality (IR: InfraRed; SD: Standard; FOV: Field of View)


Product Sensitivity mK Temperature range (degrees Celsius) Accuracy (Celsius)

Wiris 2nd Gen 640 50 -25 to +150 -40 to + 550 2

Wiris 2nd Gen 336 50 -25 to +150 -40 to + 550 2

Duo Pro R 640 50 -25 to + 135 -40 to + 550 5 20

Duo Pro R 336 50 -25 to + 135 -40 to + 550 5 20

Duo not specified -40 tot + 550 5

Duo R not specified -40 to + 550 5

Vue 640 not specified -58 to + 113 not specified

Vue 336 not specified -58 to + 113 not specified

Vue Pro 640 not specified -58 to + 113 not specified

Vue Pro 336 not specified -58 to + 113 not specified

Vue Pro R 640 not specified -58 to + 113 not specified

Vue Pro R 336 not specified -58 to + 113 not specified

Zenmuse XT 640 50 -40 to 550 not specified

Zenmuse XT 336 50 -40 to 550 not specified

Zenmuse XT 336 R 50 -40 to 550 not specified

Zenmuse XT 640 R 50 -40 to 550 not specified

One 150 -20 to 120 3

One Pro 150 -20 to 400 3

Tau 2 640 50 -40 to 550 not specified

Tau 2 336 50 -40 to 550 not specified

Tau 2 324 50 -40 to 550 not specified

Lepton 3 160 x 120 50 0 to 450 5

Lepton 3 80 x 60 50 0 to 450 5

Boson 640 40 0 to 500 not specified

Boson 320 40 0 to 500 not specified

Quark 2 640 50 -40 to 160 not specified

Quark 2 336 50 -40 to 160 not specified

DroneThermal v3 50 0 to 120 not specified

Compact not specified -40 to 330 not specified

CompactXR not specified -40 to 330 not specified

Compact Pro 70 -40 to +330 not specified

Therm-App 70 5 to + 90 3

Therm-App TH 70 0 to 200 2

Therm-App 25 Hz 70 5 to + 90 3

Table B.4: Thermal precision


Product USB MAVLink HDMI

Wiris 2nd Gen 640 Flash disk yes yes

Wiris 2nd Gen 336 Flash disk yes yes

Duo Pro R 640 Mini-USB yes micro-HDMI

Duo Pro R 336 Mini-USB yes micro-HDMI

Duo Mini-USB yes micro-HDMI

Duo R Mini-USB yes micro-HDMI

Vue 640 Mini-USB No No

Vue 336 Mini-USB no no

Vue Pro 640 Mini-USB yes Optional

Vue Pro 336 Mini-USB yes Optional

Vue Pro R 640 Mini-USB yes Optional

Vue Pro R 336 Mini-USB yes Optional

Zenmuse XT 640 Via DJI drone Via DJI drone Via DJI drone

Zenmuse XT 336 Via DJI drone Via DJI drone Via DJI drone

Zenmuse XT 336 R Via DJI drone Via DJI drone Via DJI drone

Zenmuse XT 640 R Via DJI drone Via DJI drone Via DJI drone

Tau 2 640 No no no

Tau 2 336 No no no

Tau 2 324 No no no

Lepton 3 160 x 120 No no no

Lepton 3 80 x 60 No no no

Boson 640 Yes no no

Boson 320 Yes no no

Quark 2 640 no no no

Quark 2 336 no no no

DroneThermal v3 no no no

Compact Smartphone storage no no

CompactXR Smartphone storage no no

Compact Pro Smartphone storage no no

Therm-App Smartphone storage no no

Therm-App TH Smartphone storage no no

Therm-App 25 Hz Smartphone storage no no

Table B.5: Interfaces


Product Power consumption (Watt) Input Voltage

Wiris 2nd Gen 640 4 6 - 36

Wiris 2nd Gen 336 4 6 - 36

Duo Pro R 640 10 50 - 260

Duo Pro R 336 10 50 - 260

Duo 22 50 - 260

Duo R 22 50 - 260

Vue 640 12 48 - 60

Vue 336 12 48 - 60

Vue Pro 640 21 48 - 60

Vue Pro 336 21 48 - 60

Vue Pro R 640 21 48 - 60

Vue Pro R 336 21 48 - 60

Zenmuse XT 640 Via DJI drone Via DJI drone

Zenmuse XT 336 Via DJI drone Via DJI drone

Zenmuse XT 336 R Via DJI drone Via DJI drone

Zenmuse XT 640 R Via DJI drone Via DJI drone

One approx 1h battery lifetime Battery

One Pro approx 1h battery lifetime Battery

Tau 2 640 13 40 - 60

Tau 2 336 13 40 - 61

Tau 2 324 13 40 - 62

Lepton 3 160 x 120 065 31

Lepton 3 80 x 60 065 31

Boson 640 05 33

Boson 320 05 33

Quark 2 640 12 33

Quark 2 336 12 33

DroneThermal v3 015 33 - 5

Compact Via smartphone Smartphone

CompactXR Via smartphone Smartphone

Compact Pro Via smartphone Smartphone

Therm-App 05 5

Therm-App TH 05 5

Therm-App 25 Hz 05 5

Table B.6: Energy consumption


Product Warranty (years) User Manual Phone support Email support FAQs

Wiris 2nd Gen 640 Not specified Yes Yes Yes Yes

Wiris 2nd Gen 336 Not specified Yes Yes Yes Yes

Duo Pro R 640 1 Yes Yes Yes Yes

Duo Pro R 336 1 Yes Yes Yes Yes

Duo 1 yes Yes Yes Yes

Duo R 1 yes yes yes yes

Vue 640 1 yes yes yes yes

Vue 336 1 yes yes yes yes

Vue Pro 640 1 yes yes yes yes

Vue Pro 336 1 yes yes yes yes

Vue Pro R 640 1 yes yes yes yes

Vue Pro R 336 1 yes yes yes yes

Zenmuse XT 640 05 yes yes yes yes

Zenmuse XT 336 05 yes yes yes yes

Zenmuse XT 336 R 05 yes yes yes yes

Zenmuse XT 640 R 05 yes yes yes yes

One 1 yes yes yes yes

One Pro 1 yes yes yes yes

Tau 2 640 1 yes yes yes yes

Tau 2 336 1 yes yes yes yes

Tau 2 324 1 yes yes yes yes

Lepton 3 160 x 120 1 yes yes yes yes

Lepton 3 80 x 60 1 yes yes yes yes

Boson 640 1 yes yes yes yes

Boson 320 1 yes yes yes yes

Quark 2 640 1 yes yes yes yes

Quark 2 336 1 yes yes yes yes

DroneThermal v3 not specified no no no no

Compact 1 yes yes yes yes

CompactXR 1 yes yes yes yes

Compact Pro 1 yes yes yes yes

Therm-App 1 yes yes yes yes

Therm-App TH 1 yes yes yes yes

Therm-App 25 Hz 1 yes yes yes yes

Table B.7: Help and support


Product Bluetooth Wi-Fi GPS Mobile app Storage

Wiris 2nd Gen 640 no on request Yes no yes

Wiris 2nd Gen 336 no on request yes no yes

Duo Pro R 640 yes no yes yes yes

Duo Pro R 336 yes no yes yes yes

Duo no no no no yes

Duo R no no no no yes

Vue 640 No no no no no

Vue 336 no no no no no

Vue Pro 640 yes no no yes yes

Vue Pro 336 yes no no yes yes

Vue Pro R 640 yes no no yes yes

Vue Pro R 336 yes no no yes yes

Zenmuse XT 640 Via DJI drone Via DJI drone Via DJI drone yes yes

Zenmuse XT 336 Via DJI drone Via DJI drone Via DJI drone yes yes

Zenmuse XT 336 R Via DJI drone Via DJI drone Via DJI drone yes yes

Zenmuse XT 640 R Via DJI drone Via DJI drone Via DJI drone yes yes

One no no no yes yes

One Pro no no no yes yes

Tau 2 640 no no no no yes

Tau 2 336 no no no no yes

Tau 2 324 no no no no yes

Lepton 3 160 x 120 no no no no no

Lepton 3 80 x 60 no no no no no

Boson 640 no no no no no

Boson 320 no no no no no

Quark 2 640 no no no no no

Quark 2 336 no no no no no

DroneThermal v3 no no no no no

Compact no no no yes yes

CompactXR no no no yes yes

Compact Pro no no no yes yes

Therm-App no no no yes yes

Therm-App TH no no no yes yes

Therm-App 25 Hz no no no yes yes

Table B.8: Auxiliary features


Appendix C

Last Post thermal dataset summary

The goal of this appendix is to provide a summary of the layout of the Last Post thermal dataset. The data was captured on the following days: the 24th of March 2018 and the 2nd, 3rd, 4th, 5th, 9th, 10th, 11th and 12th of April 2018. For each date, a small summary of the contents is made below. The summary consists of a description of the conditions that day and a listing of the video files and their contents.

C.1 24th of March 2018

Conditions

• Hours: 19:40 - 20:20
• Outside temperature range: 5 degrees Celsius - 12 degrees Celsius
• Clear
• Humidity: 76%
• Wind: 24 kilometers per hour
• Precipitation: 0 centimeters
• Visibility: 14 kilometers

Videos

• flir_20180324T195255.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people. A large crowd gathers on the right of the video.


• flir_20180324T195836.mp4: This video gives an overview of the inside of the Meningate ceremony. Many people can be seen watching the ceremony.
• flir_20180324T200421.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people. A large crowd is visible on the right side.
• flir_20180324T201448.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people. A large crowd is visible on the right side. The crowd is leaving the area after the ceremony.
• flir_20180324T202328.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people. A large crowd is visible on the right side. The crowd is leaving the area after the ceremony.

C.2 2nd of April 2018

Conditions

• Hours: 19:40 - 20:20
• Outside temperature range: 9 degrees Celsius - 15 degrees Celsius
• Light rain
• Humidity: 74%
• Wind: 18 kilometers per hour
• Precipitation: 0.4 centimeters
• Visibility: 8.1 kilometers

Videos

• 2018-04-02 19:47:33.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people, sometimes with umbrellas, passing through.
• 2018-04-02 19:49:52.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car and several cars in the distance.


• 2018-04-02 19:55:18.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car and several cars in the distance.
• 2018-04-02 20:13:22.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car and several cars in the distance. Crowds can be seen, as well as people holding umbrellas.

C.3 3rd of April 2018

Conditions

• Hours: 20:00 - 20:30
• Outside temperature range: 8 degrees Celsius - 16 degrees Celsius
• Heavy rain
• Humidity: 79%
• Wind: 25 kilometers per hour
• Precipitation: 0.5 centimeters
• Visibility: 10.1 kilometers

Videos

• 2018-04-03 20:12:27.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. People are holding umbrellas due to heavy rain on that day. Due to rain conditions and wind it was difficult to steady the camera, which can be seen in the shaky video.
• 2018-04-03 20:17:27.mp4: In the beginning of the clip the camera is moving towards the other side of the Meningate. From 00:20 onwards the clip is useful. The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. The video shows people leaving from the Meningate towards the busses at the other side of the bridge. Most people are holding umbrellas due to heavy rain that day. The Meningate is in the bottom left of the picture. Several buildings can be seen in the distance. In the bottom right the water of the Kasteelgracht can be seen. Sometimes, in the left of the picture, the wall of the Meningate can be seen.
• 2018-04-03 20:23:11.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. People are holding umbrellas due to heavy rain on that day. Due to rain conditions and wind it was difficult to steady the camera, which can be seen in the shaky video.


C.4 4th of April 2018

Conditions

• Hours: 19:45 - 20:30
• Outside temperature range: 10 degrees Celsius - 14 degrees Celsius
• Cloudy
• Humidity: 87%
• Wind: 18 kilometers per hour
• Precipitation: 0 centimeters
• Visibility: 12.9 kilometers

Videos

• 2018-04-04 200052.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Menin Gate is in the bottom right. Several buildings can be seen. In the video a dense crowd of people slowly starts to form in the bottom right of the picture. Most people are coming from the left of the video to join the crowd.

• 2018-04-04 200728.mp4: This video shows the inside of the Menin Gate and the ceremony of the Last Post. Some people are up close in front. The large crowd can be seen through the hall.

• 2018-04-04 200913.mp4: This video shows the inside of the Menin Gate and the ceremony of the Last Post. The video switches between MSX mode, visual camera and thermal camera to show the differences.

• 2018-04-04 202859.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Menin Gate is in the bottom right. Several buildings can be seen. At the start of the video a crowd is seen in the bottom right. At the 01:00 mark the ceremony has ended and people are exiting the gate and coming onto the crossing. They form two rows to make place for the marching band exiting the Menin Gate, which can be seen marching through the crowd at the 02:50 mark.

C.5 5th of April 2018

Conditions

• Hours: 19:45 - 20:30
• Outside temperature range: 11 degrees Celsius - 15 degrees Celsius


• Sunny
• Humidity: 77%
• Wind: 11 kilometers per hour
• Precipitation: 0 centimeters
• Visibility: 12.9 kilometers

Videos

• 2018-04-05 200217.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Menin Gate is in the bottom right. Several buildings can be seen. In the video a dense crowd of people slowly starts to form in the bottom right of the picture. Most people are coming from the left of the video to join the crowd. The video shows the 15 minutes before the start of the ceremony.

• 2018-04-04 201838.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Menin Gate is in the bottom right. Several buildings can be seen. The video shows the first ten minutes after the end of the ceremony. The crowd, which can be seen on the left, leaves towards the square.

C.6 9th of April 2018

Conditions

• Hours: 19:45 - 20:30
• Outside temperature range: 9 degrees Celsius - 10 degrees Celsius
• Light rain
• Humidity: 99%
• Wind: 14 kilometers per hour
• Precipitation: 0 centimeters
• Visibility: 8.1 kilometers

Videos

• 2018-04-09 200007.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Menin Gate is in the bottom right. Several buildings can be seen. People are coming from the left towards the Menin Gate in the right. Not a lot of people are seen due to rain that day.


• 2018-04-09-202302.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Menin Gate is in the bottom right. People are leaving from the right of the Menin Gate towards the square.

C.7 10th of April 2018

Conditions

• Hours: 19:45 - 20:30
• Outside temperature range: 14 degrees Celsius - 17 degrees Celsius
• Partly cloudy
• Humidity: 52%
• Wind: 13 kilometers per hour
• Precipitation: 0 centimeters
• Visibility: 12.9 kilometers

Videos

• 2018-04-10 195029.mp4: The video gives an overview of the bridge at the east side of the Menin Gate, where the Frenchlaan goes into the Menenstraat. On the video an army unit can be seen on the left, standing in a very structured way.

• 2018-04-10 195131.mp4: The video gives an overview of the bridge at the east side of the Menin Gate, where the Frenchlaan goes into the Menenstraat. On the video an army unit can be seen on the left, standing in a very structured way.

• 2018-04-10 195748.mp4: The video gives an overview of the bridge at the east side of the Menin Gate, where the Frenchlaan goes into the Menenstraat. On the video an army unit can be seen on the left, standing in a very structured way. Some people are moving around the crowd.

• 2018-04-10 200122.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Menin Gate is in the bottom right. Several buildings can be seen. There is a big crowd that can be seen on the right. There are some schools there, so some people are wearing backpacks. It is quite warm, and the cafe on the other side of the street has opened up its terrace.

• 2018-04-10 201427.mp4: The video gives an overview of the bridge at the east side of the Menin Gate, where the Frenchlaan goes into the Menenstraat. On the video an army unit can be seen on the left, standing in a very structured way. Some people are moving around the crowd. The image is not rotated well; a well-rotated image is found in 2018-04-10 201427_rotated.mp4.

• 2018-04-10 201515.mp4: This video shows the inside of the Menin Gate and the ceremony. A traditional ‘Haka’ from New Zealand soldiers can be heard in the video; the soldiers are difficult to spot due to thermal blurring, because many people are standing in one place.

• 2018-04-10 202558.mp4: The video gives an overview of the bridge at the east side of the Menin Gate, where the Frenchlaan goes into the Menenstraat. On the video an army unit can be seen on the left, standing in a very structured way. Some people are moving around the crowd. At the 02:00 mark the army unit marches to the end of the bridge. Very dense crowds can be seen afterwards. At 08:25 the army unit marches in a straight line towards the Menin Gate.

C.8 11th of April 2018

Conditions

• Hours: 19:45 - 20:30
• Outside temperature range: 12 degrees Celsius - 16 degrees Celsius
• Sunny
• Humidity: 63%
• Wind: 14 kilometers per hour
• Precipitation: 0 centimeters
• Visibility: 12.9 kilometers

Videos

• 2018-04-11 200140.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Menin Gate is in the bottom right. Several buildings can be seen.

• 2018-04-11 200601.mp4: The video gives an overview of the bridge at the east side of the Menin Gate, where the Frenchlaan goes into the Menenstraat. A small crowd can be seen on the left of the video.

• 2018-04-11 201554.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Menin Gate is in the bottom right. Several buildings can be seen. People start leaving the ceremony from the 01:20 mark.


C.9 12th of April 2018

Conditions

• Hours: 19:45 - 20:30
• Outside temperature range: 11 degrees Celsius - 14 degrees Celsius
• Rain
• Humidity: 94%
• Wind: 8 kilometers per hour
• Precipitation: 0.1 centimeters
• Visibility: 3.2 kilometers

Videos

• 2018-04-12 195219.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Menin Gate is in the bottom right. Several buildings can be seen. Sometimes the hedge of the wall where the film was made is visible due to the shaky camera. Not many people are seen due to the rain.

• 2018-04-12 201526.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Menin Gate is in the bottom right. Several buildings can be seen. Sometimes the hedge of the wall where the film was made is visible due to the shaky camera. Not many people are seen due to the rain. People are leaving towards the right.



Modifiable drone thermal imaging analysis framework for mob detection during open-air events

Brecht Verhoeve

Supervisors: Prof. dr. Bruno Volckaert, Prof. dr. ir. Filip De Turck

Counsellors: Pieter-Jan Maenhaut, Jerico Moeyersons

Master's dissertation submitted in order to obtain the academic degree of Master of Science in Computer Science Engineering

Department of Information Technology
Chair: Prof. dr. ir. Bart Dhoedt
Faculty of Engineering and Architecture
Ghent University

Academic year 2017-2018

Abstract

Drones and thermal cameras are used in combination for many applications, such as search and rescue, fire fighting, etc. Due to vendor-specific hardware and software, applications are hard to develop and maintain. Therefore a modifiable drone thermal imaging analysis framework is proposed that enables users to more easily develop such image processing applications. It implements a microservice plugin architecture. Users can build image processing applications with the framework by building media streams using plugins that are either thermal cameras or image analysis software modules. The framework is evaluated by building a proof of concept implementation, which is tested on the initial requirements. It achieves the modifiability and interoperability requirements at the cost of performance and security. The framework is applied for detecting large crowds of people (mobs) during open-air events. A new dataset containing thermal images of such mobs is presented, on which a YOLOv3 neural network is trained. The trained model is able to detect mobs on new thermal images in real-time, achieving frame rates of 55 frames per second when deployed on a modern GPU.

Keywords: Drone thermal imaging, Video streaming, Framework, Microservices, Object detection, Plugin

Modifiable Drone Thermal Imaging Analysis Framework for Mob Detection during Open-Air Events

Brecht Verhoeve

Supervisors: prof. dr. Bruno Volckaert, prof. dr. ir. Filip De Turck, Pieter-Jan Maenhaut, Jerico Moeyersons

Abstract—Drones and thermal cameras are used in combination for many applications, such as search and rescue, fire fighting, etc. Due to vendor-specific hardware and software, applications are hard to develop and maintain. Therefore a modifiable drone thermal imaging analysis framework is proposed that enables users to more easily develop such image processing applications. It implements a microservice plugin architecture. Users can build image processing applications with the framework by building media streams using plugins that are either thermal cameras or image analysis software modules. The framework is evaluated by building a proof of concept implementation, which is tested on the initial requirements. It achieves the modifiability and interoperability requirements at the cost of performance and security. The framework is applied for detecting large crowds of people (mobs) during open-air events. A new dataset containing thermal images of such mobs is presented, on which a YOLOv3 neural network is trained. The trained model is able to detect mobs on new thermal images in real-time, achieving frame rates of 55 frames per second when deployed on a modern GPU.

Keywords—Drone thermal imaging, Video streaming, Framework, Microservices, Object detection, Plugin

I. INTRODUCTION

Throughout history, having an overview of the environment from high viewpoints held many benefits. The advent of drones and advanced cameras provides low-cost aerial imaging that creates numerous opportunities for new applications. Traditional visual cameras for the visible light spectrum offer high quality images, but are limited to daytime or artificially lighted scenes. Thermal cameras measure thermal radiation of objects in a scene and thus can operate in utter darkness, revealing information not visible to the normal eye [1]. The combination of drones and thermal cameras is used in many different applications, such as geography [2, 3], agriculture [4], search and rescue [5], wildlife monitoring [6], disaster response [7], maintenance [8], etc.

Several vendors offer thermal camera products, some specifically designed for drone platforms. These cameras often use different image formats, color schemes and interfaces [1, 9–11]. This leads to issues if applications want to change the camera that is used, or when the camera is no longer supported by the vendor, because different software needs to be built to interact with the new camera, which often is a non-negligible cost. This leads to a problem called vendor lock-in that makes customers dependent on a certain vendor, as they cannot switch product without incurring substantial costs, a problem already very tangible for cloud-based applications [12]. Applications across various fields often have slightly different functional and non-functional requirements. For this dissertation, several Belgian fire fighting departments were asked for requirements for a thermal drone platform application. It quickly became clear that they had various problems that needed to be solved, such as finding hot explosives, measuring temperatures in containers, identifying hot entrances, detecting invisible methane fires, finding missing persons, etc. Some use cases need to be evaluated in real-time (during fires), others need to be extremely accurate. A hypothetical application should be able to quickly implement new detection and analysis features to meet all these requirements. Because current solutions are not modifiable enough, applications built with aerial thermal imaging technology remain in the niche use case for which they were initially developed [13]. Applications could benefit from a backbone framework to aid in this modifiability/interoperability issue, helping to develop end-to-end solutions connecting thermal cameras to various analysis/detection modules.

This dissertation explores the requirements for such a framework and its potential software architecture. To test the viability of the architecture, a proof of concept prototype is implemented and evaluated against the initial requirements. To verify if it aids in developing detection applications, the specific use case of detecting large crowds of people, so-called mobs, during open-air events is investigated. Monitoring crowds during open-air events is important, as mobs can create potentially dangerous situations through bottlenecks, blocking escape routes, etc. Through monitoring and detecting these mobs, such situations can be avoided before they become problematic [14, 15].

The remainder of this paper is organized as follows. Section II presents similar projects on the topic of modifiable imaging analysis frameworks and thermal object detection. Section III presents the requirements of the framework and the software architecture designed from these requirements. Section IV presents the implementation of the framework prototype. The mob detection experiment is described in Section V. The tests and results to evaluate the framework and the mob detection experiment are presented in Section VI. Finally, Section VII draws conclusions from this research and indicates where future efforts in this field should go.

II. RELATED WORK

The Irish start-up DroneSAR [16] developed a search-and-rescue (SAR) drone platform allowing users to stream live images and video from a drone as it conducts a search for missing persons. The platform works with any camera, visual and thermal, but focuses on drones from vendor DJI. DroneSAR's industry partner Amazon introduced the Amazon Kinesis Video Streams platform in January 2018 as a new service for the Amazon Web Services (AWS) cloud platform. It allows users to stream live video from devices to the AWS cloud and build applications for real-time video processing [17]. The VIPER project by EAVISE, KU Leuven researched how thermal and visual video images could be used for real-time detection of persons using object detection algorithms such as deep learning [18]. The framework presented in this work combines elements from all three of these examples.

III. REQUIREMENTS AND SOFTWARE ARCHITECTURE

A. Functional requirements

Three general actors are identified for the framework: end-users who want to build an image processing application for a specific use case, camera developers integrating new cameras into the framework, and analysis/detection module developers integrating new analysis/detection algorithms into the framework so that end-users can use them to build their applications. An end-user should be able to construct an image processing application using the integrated cameras and analysis modules, and should be able to adapt this application with the framework for new use cases. Camera and analysis module developers should be able to add and distribute new integrations for new cameras and analysis modules to the framework. This allows the end-users to focus on the use case, not the technical details of the hardware platforms or algorithms, and to have a wider selection of hardware and algorithms.

B. Non-functional requirements

Interoperability, modifiability and performance are identified as the architecturally significant requirements. Interoperability specifies that the framework should be able to interact with various cameras and analysis software modules via interfaces. The number of systems the framework can successfully interact with adds to the business value of the framework, as end-users can use more devices via the framework to build applications. The framework needs to be extendable with new thermal cameras and analysis modules, and applications built with the framework should be modifiable to integrate new hardware and software. The available hardware on-site for use cases such as forest fire monitoring is not always powerful enough to support heavy image analysis software. The framework should therefore be able to deploy in a distributed fashion, allowing more computationally expensive operations to be executed on more powerful remote devices. Some use cases require real-time streaming of video and manipulation of these video streams, which should be supported for the framework to be relevant.

C. Software architecture

An architectural pattern analysis based on the requirements presented in Section III-B was conducted, from which a combination of the microservices and microkernel patterns was selected as the base pattern for the software architecture. The microkernel pattern enables the framework to be extended via a plugin system. The microservices pattern enables interoperability through the use of standard interfaces and a communication scheme, and allows the framework to be deployed in a distributed fashion [19–21]. The software architecture is documented in static views, sequence diagrams and deployment views. Figure 1 presents an overview of the architecture.

Fig. 1: Component-connector overview of the framework. The clear components are the core components of the framework that each user needs to install to use the framework. The colored components are used for the distribution of plugins.

End-users interact with the framework via the Client Interface, a graphical or textual interface. Commands are forwarded to the Stream module, which manages the layout and state of the components in the media pipeline (stream) that the end-user builds for his use case. To activate and place the components in a certain layout, the Stream module connects with the Producer and Consumer components that control the plugins processing the media. Producer plugins are devices that produce media, such as thermal cameras. Consumer plugins process and consume media, such as analysis software and displays. Once a stream is established, the plugins forward media to each other in the layout specified by the Stream module. New support for cameras and analysis modules can be added as plugins to the Producer/Consumer Distribution components, which distribute this software so that end-users can download and install the plugins. Each module in the architecture is a microservice, allowing for a distributed deployment and enabling the modifiability requirements through interchangeable modules. Cameras and analysis modules are realized as plugins for the Producer/Consumer modules, implemented as a microkernel. This allows the framework to easily build new streams for different use cases and interchange components when needed.
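To make the stream-building step concrete, the sketch below shows how a client could assemble and start a stream that links a file-based producer to a display consumer through such a control API. The host, routes and payloads are assumptions for illustration only, not the prototype's actual interface; the plugin resources they manipulate are described in the next subsection.

```python
import requests

STREAM_API = "http://localhost:5000"  # assumed address of the Stream microservice

# Ask the Stream module to add both plugins to the stream.
requests.post(f"{STREAM_API}/plugins", json={"name": "filecam"}).raise_for_status()
requests.post(f"{STREAM_API}/plugins", json={"name": "display"}).raise_for_status()

# Link them: the filecam forwards its media to the display.
requests.put(f"{STREAM_API}/plugins/filecam/listeners",
             json={"listeners": ["display"]}).raise_for_status()

# Start the media flow by transitioning both plugins to the PLAY state.
for plugin in ("filecam", "display"):
    requests.put(f"{STREAM_API}/plugins/{plugin}/state",
                 json={"state": "PLAY"}).raise_for_status()
```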

C.1 Plugin model

Figure 2 depicts the model of a general framework plugin. The plugin defines three interfaces: a source media endpoint to receive media from different sources, a listener endpoint to forward the processed media to other listening plugins, and an API for framework control. The framework uses the API to change which sources and listeners a plugin has, and its state. By linking plugins together through the sources and listeners resources, the framework can build a media processing stream. Producer plugins have no sources, since they produce media. The states are used to stop and start the media processing of the plugins in the stream. The REST paradigm is selected to build this API, with state, sources and listeners resources that need to be minimally implemented. A plugin has the following states: INACTIVE, PLAY, STOP and PAUSE. Figure 3 depicts the state transition diagram. A plugin implements the visible states STOP, PAUSE and PLAY, describing whether the media process of the plugin is stopped, paused or processing, respectively. The INACTIVE state is only visible to the framework, as it indicates that there is no active process running the plugin. This is the initial state of a plugin in the framework. When a plugin is added to a stream, the plugin microservice is started, transitions to the STOP state and waits for commands.

Fig. 2: Schematic overview of a plugin.

Fig. 3: State transition diagram of a plugin.
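A minimal sketch of this plugin control API is given below, using Flask (the framework used for the prototype's REST APIs, see Section IV). Only the mandatory state, sources and listeners resources are shown; the media pipeline itself is stubbed out, and the exact route shapes are illustrative assumptions.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# In-memory plugin state; a real plugin would also manage its media pipeline.
plugin = {"state": "STOP", "sources": [], "listeners": []}

# Legal transitions between the plugin's visible states (Figure 3).
ALLOWED = {"STOP": {"PLAY"}, "PLAY": {"PAUSE", "STOP"}, "PAUSE": {"PLAY", "STOP"}}

@app.route("/state", methods=["GET", "PUT"])
def state():
    if request.method == "PUT":
        new = request.get_json()["state"]
        if new not in ALLOWED[plugin["state"]]:
            return jsonify(error="illegal transition"), 409
        plugin["state"] = new  # start/stop/pause the media process here
    return jsonify(state=plugin["state"])

@app.route("/sources", methods=["GET", "PUT"])
def sources():
    if request.method == "PUT":
        plugin["sources"] = request.get_json()["sources"]
    return jsonify(sources=plugin["sources"])

@app.route("/listeners", methods=["GET", "PUT"])
def listeners():
    if request.method == "PUT":
        plugin["listeners"] = request.get_json()["listeners"]
    return jsonify(listeners=plugin["listeners"])

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```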

C.2 Network topology and communication protocol

The microservices of the framework and the plugins need a communication protocol to exchange commands and video. For sending the commands the HTTP/TCP protocol is used, a synchronous protocol that blocks on the response of a request. This ensures that commands are always received and acknowledged, increasing the reliability of the communication at the cost of performance [22]. The asynchronous RTP/UDP protocol is selected to transfer media between the plugins, ensuring low latency video transfer and enabling real-time video streams. The recommended codec for transmitting video media is MJPEG, which transmits video frames as separately encoded JPEG images [23]. Other codecs such as MPEG-4 encode only certain frames as keyframes and the other frames as B-frames that encode differences from the keyframe [24]. This implies that when receiving images from a stream, a keyframe must first be received before the video can be decoded. Using MJPEG, plugins receiving frames can directly perform analysis on each frame and do not have to wait for additional frames. An example of this network topology for a stream with 2 Producer plugins and 3 Consumer plugins is depicted in Figure 4.

Fig. 4: Network topology. The full lines represent HTTP/TCP communications, the dashed lines RTP/UDP communications.
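As an illustration of this media path, the following sketch builds an MJPEG-over-RTP/UDP sender and receiver with GStreamer, the streaming framework used by the prototype (Section IV). The element chain uses standard GStreamer plugins; the file name, host and port are assumptions, and the prototype's actual pipelines may differ.

```python
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst, GLib

Gst.init(None)

# Producer side: decode a file and re-encode every frame as JPEG, so each
# RTP-packetized frame can be decoded on its own (MJPEG, no keyframe wait).
sender = Gst.parse_launch(
    "filesrc location=thermal.mp4 ! decodebin ! videoconvert ! jpegenc "
    "! rtpjpegpay ! udpsink host=127.0.0.1 port=5000"
)

# Consumer side: depayload the RTP/JPEG stream and decode frame by frame;
# an analysis plugin would tap the decoded frames instead of displaying them.
receiver = Gst.parse_launch(
    'udpsrc port=5000 caps="application/x-rtp,media=video,'
    'clock-rate=90000,encoding-name=JPEG,payload=26" '
    "! rtpjpegdepay ! jpegdec ! autovideosink"
)

receiver.set_state(Gst.State.PLAYING)
sender.set_state(Gst.State.PLAYING)
GLib.MainLoop().run()  # keep streaming until interrupted
```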

IV. PROTOTYPE IMPLEMENTATION

The goal of the prototype implementation is to provide a proof of the concept architecture and meet the requirements presented in Section III. The core framework components are implemented; the distribution components are left out of scope, as they focus primarily on the distribution of supported plugins. The core microservices, as well as the plugins, are implemented using the Docker software containerization framework [25]. Containers virtualize on the operating system and allow for portable, lightweight software environments for processes, with a minor performance overhead. Using this technology, the core modules and plugins can be deployed in a local and distributed fashion and be set up and shut down easily [26]. The microservice containers communicate via the protocols presented in Section III-C.2. The REST APIs are built with the Flask framework [27], a lightweight Python web development framework ideal for prototyping. The Producer/Consumer components need access to the Docker daemon running on the Docker host in order to spin up and shut down the Docker containers containing the plugin processes. This is achieved by mounting the Docker client socket in the Producer/Consumer containers, which gives the container root access to the host, a significant security threat [28, 29]. Two sample plugins were implemented: Filecam, a plugin that produces video read in from a file, and Display, a plugin that forwards media to the display of the local device. The plugins transmit media using the video streaming framework GStreamer [30].
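The container lifecycle behind the plugin states can be sketched with the Docker SDK for Python. This is an assumed illustration of the INACTIVE-to-STOP and STOP-to-INACTIVE transitions, enabled by the mounted Docker socket described above; the image name and network are hypothetical, and the prototype's exact mechanism may differ.

```python
import docker

# Connect to the Docker daemon through the socket mounted into this
# container (the same socket mount that creates the security threat
# discussed above).
client = docker.from_env()

# Adding a plugin to a stream: start its microservice as a detached
# container (INACTIVE -> STOP); image name and network are hypothetical.
container = client.containers.run(
    image="framework/filecam-plugin:latest",
    detach=True,
    network="framework_default",
    name="filecam",
)

# Deactivating the plugin (STOP -> INACTIVE): stop and remove the container.
container.stop()
container.remove()
```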

V. MOB DETECTION

A. Dataset

Several publicly available datasets for thermal images exist [31–34]. None of these include large crowds of people, so a new dataset, called the Last Post dataset, was created. It consists of thermal video captured at the Last Post ceremony in Ypres, Belgium [35]. The videos were captured using the Flir One Pro thermal camera for Android [36], using the Iron color scheme. Two main scenes are present in the dataset, depicted in Figure 5. Mobs are present in the thermal images but not in the visual images, due to the images being made on separate days. The images used for the experiment were manually annotated, outliers were removed, and the dataset was randomly split in a training and validation set.

Fig. 5: Last Post dataset main scenes: (a) thermal view of the square; (b) visual view of the square; (c) thermal view of the bridge; (d) visual view of the bridge.

B. Model

Detecting and classifying objects of interest in images is known as the object detection problem in machine learning [37]. Several object detection algorithms and frameworks have been implemented in the past years. A distinction is made between traditional models [31, 38–40], deep learning two-stage networks [41–46] and deep learning dense networks [47–49]. The traditional and two-stage methods make predictions relatively slowly (order of seconds on GPU) when compared to the dense networks (order of milliseconds on GPU) [47]. Since the goal is to use the framework in real-time use cases, the latter is preferred. The YOLOv3 model is selected, as it achieves state of the art prediction performance, can make real-time predictions and is available via the open source neural network framework darknet [50, 51]. The model is pre-trained on the ImageNet dataset [52]. The model is trained on an NVIDIA Geforce 980 TX GPU and optimizes the SSE loss using batch gradient descent [50]. To select the best weights, the average Intersection over Union (IoU) and mean Average Precision (mAP) [53] are calculated on predictions on the validation set. The weights that achieve the highest mAP are selected as the final weights.
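For reference, the IoU between a predicted and a ground-truth box can be computed as in the sketch below, assuming boxes given as (x1, y1, x2, y2) corner coordinates; this mirrors the metric's definition rather than darknet's internal implementation.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    # Coordinates of the intersection rectangle.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])

    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])

    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Example: two 10x10 boxes overlapping in a 5x5 region.
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / (100 + 100 - 25) ≈ 0.143
```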

VI. RESULTS

A. Framework

To evaluate the framework, acceptance tests for the requirements from Section III were conducted. Common framework operations such as manipulating and building a stream have an average execution time of 0.84 seconds, with a standard deviation of 0.37 seconds. Less common operations such as deactivating a plugin, starting up the framework and shutting down the framework have average execution times of 3.58, 8.40 and 24.02 seconds respectively, with standard deviations of 4.67, 0.50 and 0.48. Deactivating plugins (STOP to INACTIVE transitions) takes a long time, as the container running the process needs to be removed. Real-time streaming could not be tested directly, because the GStreamer framework has no readily available testing endpoints. However, when streaming and displaying a video with the framework, human users could not differentiate between a streaming video and a video played using a native media player, making it plausible that the framework streams in real-time. Great care must be taken when building plugins, as their processing speed has a direct impact on the real-time streaming performance. Interoperability is achieved with the REST APIs and the plugin model presented in Section III-C. The interoperability was tested by having the framework exchange information with a mock plugin implementing the specified interface and counting the number of correct exchanges. The average successful exchange ratio is 99.998%. The framework can install and detect new plugins at runtime, achieving runtime modifiability at plugin level. Different deployment schemes were not tested for the prototype.

B. Mob detection

The weights generated at the 15700th training epoch achieved the highest mAP value, 90.52%, on the validation set. For comparison, the performance of other models on benchmark datasets achieves an average mAP of 74.8% [54]. The reason the model achieves such high values is that the validation set is temporally correlated with the training set, as both sets are extracted from videos in which frames have a temporal correlation. Performance when predicting on new datasets will be worse. Figure 6 depicts some predictions of the model. When predicting on a video, the model generated predictions at an average frame rate of 55 frames per second on a GPU.

Fig. 6: Model predictions on validation set.

VII. CONCLUSION AND FUTURE WORK

In this dissertation a modifiable drone thermal imaging analysis framework is proposed to allow end-users to build flexible video processing pipelines using different thermal cameras and analysis modules. The framework implements a microservice container plugin architecture. Users can build image processing applications with the framework by building media streams using plugins that are either thermal cameras or image analysis software modules. The framework is evaluated by building a proof of concept implementation, which is tested on the initial requirements. The proposed framework achieves the modifiability and interoperability requirements at the cost of performance and security. The framework is applied for detecting large crowds of people (mobs) during open-air events. A new dataset containing thermal images of such mobs is presented, on which a YOLOv3 neural network is trained. The trained model is able to detect mobs on new thermal images in real-time, achieving frame rates of 55 frames per second when deployed on a modern GPU. Some extensions to this research are: deploying a detection model using the framework, testing the other deployment configurations, testing the framework with end-users in practice, and building new object detection models specifically for thermal images.

REFERENCES

[1] R. Gade and T. B. Moeslund, “Thermal cameras and applications: a survey,” Machine Vision and Applications, vol. 25, pp. 245–262, 2014.
[2] M. C. Harvey, J. V. Rowland, and K. M. Luketina, “Drone with thermal infrared camera provides high resolution georeferenced imagery of the Waikite geothermal area, New Zealand,” 2016.
[3] S. Amici, M. Turci, S. Giammanco, L. Spampinato, and F. Giulietti, “UAV Thermal Infrared Remote Sensing of an Italian Mud Volcano,” vol. 2, pp. 358–364, 2013.
[4] J. Bendig, A. Bolten, and G. Bareth, “Introducing a low-cost mini-UAV for thermal- and multispectral-imaging,” 2012.
[5] A. J. Rivera, A. D. Villalobos, J. C. Monje, J. A. Marinas, and C. M. Oppus, “Post-disaster rescue facility: Human detection and geolocation using aerial drones,” IEEE Region 10 Annual International Conference, Proceedings/TENCON, pp. 384–386, 2017.
[6] P. Christiansen, K. A. Steen, R. N. Jørgensen, and H. Karstoft, “Automated detection and recognition of wildlife using thermal cameras,” Sensors (Basel, Switzerland), vol. 14, pp. 13778–93, jul 2014.
[7] S. Chowdhury, A. Emelogu, M. Marufuzzaman, S. G. Nurre, and L. Bian, “Drones for disaster response and relief operations: A continuous approximation model,” 2017.
[8] Workswell, “Pipeline inspection with thermal diagnostics,” 2016.
[9] DJI, “Zenmuse H3 - 2D.”
[10] Workswell, “Applications of WIRIS - Thermal vision system for drones.”
[11] Therm-App, “Therm-App - Android-apps op Google Play,” 2018.
[12] B. Satzger, W. Hummer, C. Inzinger, P. Leitner, and S. Dustdar, “Winds of change: From vendor lock-in to the meta cloud,” IEEE Internet Computing, vol. 17, no. 1, pp. 69–73, 2013.
[13] J. Divya, “Drone Technology and Usage: Current Uses and Future Drone Technology,” 2017.
[14] B. Steffen and A. Seyfried, “Methods for measuring pedestrian density, flow, speed and direction with minimal scatter,” Physica A: Statistical Mechanics and its Applications, vol. 389, pp. 1902–1910, may 2010.
[15] M. Wirz, T. Franke, D. Roggen, E. Mitleton-Kelly, P. Lukowicz, and G. Tröster, “Inferring crowd conditions from pedestrians' location traces for real-time crowd monitoring during city-scale mass gatherings,” Proceedings of the Workshop on Enabling Technologies: Infrastructure for Collaborative Enterprises, WETICE, pp. 367–372, 2012.
[16] L.-L. Slattery, “DroneSAR wants to turn drones into search-and-rescue heroes,” 2017.
[17] Amazon Web Services Inc., “What Is Amazon Kinesis Video Streams?,” 2018.
[18] T. Goedemé, “Projectresultaten VLAIO TETRA-project,” tech. rep., KU Leuven, Louvain, 2017.
[19] L. Bass, P. Clements, and R. Kazman, Software Architecture in Practice. Addison-Wesley Professional, 3rd ed., 2012.
[20] M. Richards, Software Architecture Patterns. O'Reilly Media, 1st ed., 2015.
[21] C. Richardson, “Microservice Architecture pattern,” 2017.
[22] C. De La Torre, C. Maddock, J. Hampton, P. Kulikov, and M. Jones, “Communication in a microservice architecture,” 2017.
[23] On-Net Surveillance Systems Inc., “MJPEG vs MPEG4: Understanding the differences, advantages and disadvantages of each compression technique,” 2006.
[24] D. Bull, Communicating Pictures: A Course in Image and Video Coding. Elsevier Science, 2014.
[25] Docker Inc., “Docker - Build, Ship, and Run Any App, Anywhere,” 2018.
[26] D. Merkel, “Docker: Lightweight Linux Containers for Consistent Development and Deployment,” 2014.
[27] A. Ronacher, “Welcome to Flask: Flask Documentation (0.12),” 2017.
[28] Lvh, “Don't expose the Docker socket (not even to a container),” 2015.
[29] R. Yasrab, “Mitigating Docker Security Issues,” tech. rep., University of Science and Technology of China, Hefei.
[30] GStreamer, “GStreamer: open source multimedia framework,” 2018.
[31] J. W. Davis and M. A. Keck, “A Two-Stage Template Approach to Person Detection in Thermal Imagery,” Proc. Workshop on Applications of Computer Vision, 2005.
[32] S. Hwang, J. Park, N. Kim, Y. Choi, and I. S. Kweon, “Multispectral Pedestrian Detection: Benchmark Dataset and Baseline,” CVPR, 2015.
[33] Z. Wu, N. Fuller, D. Theriault, and M. Betke, “A Thermal Infrared Video Benchmark for Visual Analysis,” IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2014.
[34] S. Z. Li, R. Chu, S. Liao, and L. Zhang, “Illumination Invariant Face Recognition Using Near-Infrared Images,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 4, pp. 627–639, 2007.
[35] Last Post Association, “Mission,” 2018.
[36] FLIR, “FLIR One Pro.”
[37] E. Alpaydin, Introduction to Machine Learning. MIT Press, 3rd ed., 2014.
[38] F. Xu, X. Liu, and K. Fujimura, “Pedestrian detection and tracking with night vision,” IEEE Transactions on Intelligent Transportation Systems, vol. 6, no. 1, pp. 63–71, 2005.
[39] H. Nanda and L. Davis, “Probabilistic template based pedestrian detection in infrared videos,” IEEE Intelligent Vehicles Symposium, Proceedings, vol. 1, pp. 15–20, 2003.
[40] R. Appel, S. Belongie, P. Perona, and P. Dollár, “Fast Feature Pyramids for Object Detection,” PAMI, vol. 36, no. 8, pp. 1–14, 2014.
[41] J. R. R. Uijlings, K. E. A. Van De Sande, T. Gevers, and A. W. M. Smeulders, “Selective Search for Object Recognition,” tech. rep., 2012.
[42] R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Region-Based Convolutional Networks for Accurate Object Detection and Segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 1, pp. 142–158, 2014.
[43] R. Girshick, “Fast R-CNN,” in Proceedings of the IEEE International Conference on Computer Vision, vol. 2015 Inter, pp. 1440–1448, 2015.
[44] S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137–1149, 2016.
[45] K. He, G. Gkioxari, P. Dollár, and R. Girshick, “Mask R-CNN,” arXiv, 2018.
[46] J. Dai, Y. Li, K. He, and J. Sun, “R-FCN: Object Detection via Region-based Fully Convolutional Networks,” tech. rep., 2016.
[47] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You Only Look Once: Unified, Real-Time Object Detection,” 2015.
[48] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, “SSD: Single Shot MultiBox Detector,” arXiv, 2016.
[49] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, “Focal Loss for Dense Object Detection,” arXiv, 2018.
[50] J. Redmon and A. Farhadi, “YOLOv3: An Incremental Improvement,” arXiv, 2018.
[51] J. Redmon, “Darknet: Open source neural networks in C.” http://pjreddie.com/darknet/, 2013–2016.
[52] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “ImageNet: A Large-Scale Hierarchical Image Database,” in CVPR09, 2009.
[53] M. Everingham, S. M. A. Eslami, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, “The Pascal Visual Object Classes Challenge: A Retrospective,” International Journal of Computer Vision, vol. 111, no. 1, pp. 98–136, 2014.
[54] A. Ouaknine, “Review of Deep Learning Algorithms for Object Detection,” 2018.


Contents

1 Introduction 1
1.1 Drones 1
1.2 Concepts 2
1.2.1 Thermal Cameras 2
1.2.2 Aerial thermal imaging 2
1.3 Problem statement 2
1.3.1 Industry adoption 2
1.3.2 Crowd monitoring 3
1.3.3 Goal 4
1.3.4 Related work 4
1.4 Outline 4

2 System Design 5
2.1 Requirements analysis 5
2.1.1 Functional requirements 5
2.1.2 Non-functional requirements 6
2.2 Patterns and tactics 11
2.2.1 Layers 12
2.2.2 Event-driven architecture 12
2.2.3 Microkernel 12
2.2.4 Microservices 13
2.2.5 Comparison of patterns 13
2.3 Software architecture 15
2.3.1 Static view 15
2.3.2 Dynamic views 22
2.3.3 Deployment views 23

3 State of the art and technology choice 27
3.1 Thermal camera options 27
3.1.1 Parameters 27
3.1.2 Comparative analysis 30
3.2 Microservices frameworks 31
3.2.1 Flask 31
3.2.2 Falcon 33
3.2.3 Nameko 33
3.2.4 Vert.x 33
3.2.5 Spring Boot 34
3.3 Deployment framework 34
3.3.1 Containers 34
3.3.2 LXC 35
3.3.3 Docker 35
3.3.4 rkt 35
3.4 Object detection algorithms and frameworks 36
3.4.1 Traditional approaches 36
3.4.2 Deep learning 37
3.4.3 Frameworks 39
3.5 Technology choice 41
3.5.1 Thermal camera 41
3.5.2 Microservices framework 41
3.5.3 Deployment framework 41
3.5.4 Object detection 41

4 Proof of Concept implementation 43
4.1 Goals and scope of prototype 43
4.2 Overview of prototype 43
4.2.1 General overview 43
4.2.2 Client interface 45
4.2.3 Stream 46
4.2.4 Producer and Consumer 46
4.2.5 Implemented plugins 48
4.3 Limitations and issues 51
4.3.1 Single client 51
4.3.2 Timeouts 51
4.3.3 Exception handling and testing 51
4.3.4 Docker security issues 51
4.3.5 Docker bridge network 52
4.3.6 Single stream 52
4.3.7 Number of containers per plugin 52

5 Mob detection experiment 53
5.1 Last Post thermal dataset 53
5.1.1 Last Post ceremony 53
5.1.2 Dataset description 54
5.2 Object detection experiment 56
5.2.1 Preprocessing 56
5.2.2 Training 56

6 Results and evaluation 58
6.1 Framework results 58
6.1.1 Performance evaluation 58
6.1.2 Interoperability evaluation 60
6.1.3 Modifiability evaluation 62
6.2 Mob detection experiment results 62
6.2.1 Training results 63
6.2.2 Metrics 63
6.2.3 Validation results 64

7 Conclusion and future work 67
7.1 Conclusion 67
7.2 Future work 69
7.2.1 Security 69
7.2.2 Implementing a detection plugin 69
7.2.3 Different deployment configurations 70
7.2.4 Multiple streams with different layouts 70
7.2.5 Implementing the plugin distribution service (Remote Producer/Consumer) 70
7.2.6 Using high performance microservices backbone frameworks 70
7.2.7 New object detection models and datasets specifically for thermal images 70

A Firefighting department email conversations 81
A.1 General email sent to Firefighting departments 81
A.2 Conversation with Firefighting department of Antwerp, Belgium 82
A.3 Conversation with Firefighting department of Ostend, Belgium 83
A.4 Conversation with Firefighting department of Courtrai, Belgium 83
A.5 Conversation with Firefighting department of Ghent, Belgium 83

B Thermal camera specifications 85

C Last Post thermal dataset summary 94
C.1 24th of March 2018 94
C.2 2nd of April 2018 95
C.3 3rd of April 2018 96
C.4 4th of April 2018 97
C.5 5th of April 2018 97
C.6 9th of April 2018 98
C.7 10th of April 2018 99
C.8 11th of April 2018 100
C.9 12th of April 2018 101


List of Figures

2.1 Use case diagram 7
2.2 Overview of the framework software architecture 16
2.3 Framework network topology 17
2.4 Client Interface detailed view 17
2.5 Stream detailed view 18
2.6 Stream model 18
2.7 Plugin model 19
2.8 Plugin state transition diagram 20
2.9 Component-connector diagrams of the Producer and Consumer module 21
2.10 Producer and Consumer Distribution component-connector diagrams 22
2.11 Add plugin sequence diagram 23
2.12 Link plugins sequence diagram 24
2.13 Deployment diagrams 26
3.1 Thermal image and MSX image of a dog 28
3.3 Rethink IT: Most used tools and frameworks for microservices results [54] 32
3.4 Containers compared to virtual machines [66] 36
4.1 filecam GStreamer pipeline 49
4.2 local plugin GStreamer pipeline 50
5.1 Last Post ceremony panorama 54
5.2 Last Post filming locations 54
5.3 Main scenes in the Last Post dataset 55
5.4 Outliers 57
6.1 Average training loss per epoch 64
6.2 Validation metrics per epoch 65
6.3 Predictions of the model on images in the validation set 66
7.1 GStreamer pipeline for a plugin with a detection model 69

xviii

List of Tables

2.1 Performance utility tree 8
2.2 Interoperability utility tree 9
2.3 Modifiability utility tree 10
2.4 Usability utility tree 11
2.5 Security utility tree 11
2.6 Availability utility tree 12
2.7 Architecture pattern comparison 14
6.1 Acceptance tests results summary 59
6.2 Performance test statistics summary, measured in seconds 60
6.3 Resource usage of the framework in several conditions 61
6.4 Total size of framework components 61
6.5 Interoperability tests results (S: Source, L: Listener) 62
B.1 Compared cameras, their producing companies and their average retail price 86
B.2 Physical specifications 87
B.3 Image quality (IR: InfraRed, SD: Standard, FOV: Field of View) 88
B.4 Thermal precision 89
B.5 Interfaces 90
B.6 Energy consumption 91
B.7 Help and support 92
B.8 Auxiliary features 93


List of Listings

1 Minimal Flask application 32

2 Vertx example 33

3 Spring Boot example 34

4 docker-compose.yml snippet of the prototype 44

5 Mounting the Docker socket on the container 47

6 Starting a plugin container 47

7 Dynamic linking of the decodebin and jpegenc 50


List of Abbreviations

ACF Aggregated Channel Features

AMQP Advanced Message Queuing Protocol

API Application Programming Interface

AS Availability Scenario

ASR Architecturally Significant Requirement

CLI Command Line Interface

CNN Convolutional Neural Networks

CRUD Create Read Update Destroy

DNS Domain Name System

FR Functional Requirement

GPU Graphical Processing Unit

H High

HTTP Hyper Text Transfer Protocol

ICF Integral Channel Features

IoU Intersection over Union

IS Interoperability Scenario

IT Interoperability Tactic

JVM Java Virtual Machine

L Low


LXC Linux Containers

M Medium

mAP mean Average Precision

MJPEG Motion-JPEG

MS Modifiability Scenario

MSX Multi Spectral Dynamic Imaging

MT Modifiablity Tactic

NFR Non-Functional Requirement

ONNX Open Neural Network Exchange Format

OS Operating System

PS Performance Scenario

PT Performance Tactic

QAR Quality Attribute Requirement

REST Representational State Transfer

RNN Recurrent Neural Network

RPN Region Proposal Network

RTP Real-time Transport Protocol

SS Security Scenario

SSE Sum of Squared Errors

SVM Support Vector Machine

TCP Transmission Control Protocol

UDP User Datagram Protocol

UI User Interface

US Usability Scenario

YOLO You Only Look Once


Chapter 1

Introduction

Throughout history, having an overview of the environment from high viewpoints held many benefits. Early civilizations used hills to monitor their surroundings and population and to spot possible attackers. The discovery of flight meant that environments could now be viewed from a bird's-eye view, offering even more visibility and revealing much more of the world below. Recently a much smaller type of aircraft was developed: the drone. Ranging from large plane-like to almost insect-like devices and having a wide variety of uses, drones are quickly taking over the sky. Drones would not be as effective without proper cameras providing a detailed view on the world below. With digital video cameras offering superb quality for steadily decreasing costs, almost every scene can be captured in great detail. However, these cameras are limited to the visible light spectrum, which hinders drones from operating in all circumstances, such as nightly flights. Thermal cameras measure the emitted heat of a scene and can reveal information not visible to the eye, such as hidden persons or animals, pipelines, malfunctioning equipment, etc. The combination of these two technologies certainly holds many exciting opportunities for the future.

1.1 Drones

Drones are flying robots that can fly remotely or autonomously and don't carry a human operator. They can carry a variety of payloads: video cameras, delivery parcels, fluid containers, sensors, lights, but also lethal explosives [1].

Drones are classified in different categories based on varying parameters, such as physical characteristics (diameter, weight), aerial movement techniques, application domains, etc. Based on diameter, drones are classified as smart dust (1 mm to 0.25 cm), pico air vehicles (0.25 cm - 2.5 cm), nano air vehicles (2.5 cm - 15 cm), micro air vehicles (15 cm - 1 m), micro unmanned aerial vehicles (1 m - 2 m) and unmanned aerial vehicles (2 m and larger). Often depending on their diameter, the weight of these devices ranges from less than a gram up to more than 2000 kg. Drones have different flight techniques, such as propulsion engines with wings, rotors in various amounts, flapping wings and even balloons. They are used for all kinds of purposes, ranging from search and rescue missions, environmental protection, delivery, recon, etc. Hassanalian et al. provide an excellent overview of most types of drones [2].

Due to the increasing interest in commercial drone platforms [3], a variety of payloads were developed specifically tailored for these aerial robots, such as gimbals to mount action video cameras [4], gimbals for delivering packets [5] and thermal imaging


platforms [6]

1.2 Concepts

1.2.1 Thermal Cameras

Thermal cameras are passive sensors that capture the infrared radiation emitted by all objects with a temperature above absolute zero. In contrast to visible light cameras, thermal cameras do not depend on an external energy source for the visibility and colors of objects or scenes. This makes the captured images independent of illumination, colors, etc. Furthermore, images can be captured in the absence of visible light [7]. Originally, thermal camera technology was developed for night vision purposes for the military, and the devices were very expensive. Later, the technology was commercialized and has developed quickly over the last few decades, resulting in better and cheaper cameras [7]. This gave a broader public access to the technology, and it is now used in a wide range of applications, such as building inspection, gas detection, industrial appliances, medical science, agriculture, fire detection, surveillance, etc. [7]. Thermal cameras are now being mounted on drones to give an aerial thermal overview.

1.2.2 Aerial thermal imaging

Aerial thermal imaging is defined as the creation of thermal images using a flying device. This dissertation focuses on the usage of drones for aerial thermal imaging. There are many applications for aerial thermal imaging. Some examples are geography [8, 9], agriculture [10, 11], search and rescue operations [12], wildlife monitoring [13], forest monitoring [14, 15], disaster response [16], equipment and building maintenance [17-20], etc. In the past few years, several industry players have developed thermal cameras specifically aimed at these drone applications. Examples are FLIR [6], Workswell [21] and TEAX Technology [22].

1.3 Problem statement

1.3.1 Industry adoption

The implementation of thermal cameras on drone platforms faces some issues for wide adoption by the industry. Several vendors offer thermal camera products, some specifically designed for drone platforms, which often implement different image formats, color schemes and interfaces (e.g. [23-25]). This leads to issues if users want to modify their applications by changing the camera that is used, because the application must implement new software to interact with the camera, or when the camera is no longer supported by the vendor, leaving the application with outdated hardware and software. This leads to a problem called vendor lock-in, which makes customers dependent on a certain vendor, as they cannot switch products without incurring substantial costs, a problem already very tangible for cloud-based applications today [26].

Applications across various fields often have different functional and non-functional requirements. Some applications have hard real-time deadlines (such as firefighting, search and rescue, security, etc.) that must be respected; other applications require


highly accurate predictions (e.g. person detection, agriculture, etc.). A single application domain can even have many different use cases.

Several firefighting departments in Belgium were contacted to get an overview of their needs for a drone thermal imaging application. It quickly became clear that they had various detection problems, such as finding missing persons, locating hot explosives, measuring temperatures in silos, detecting invisible methane fires, etc. Equipment also wears down more quickly due to usage in harsh environments, such as in close proximity to fires. To deliver value for them, a drone thermal application needs to be able to exchange functionality and hardware easily and meet high performance constraints. The email conversations can be read in Appendix A.

Other drone thermal imaging applications are mostly only used in the niche domain for which they were developed, because they aren't designed for flexibility [27]. These proprietary applications have some disadvantages: the development and support potentially carry a large cost, vendor lock-in can occur when products are no longer supported, security issues could arise, and customization is difficult [28, 29]. Applications could benefit from a backbone framework to aid in this modifiability/interoperability issue, aiding in developing end-to-end solutions connecting thermal cameras to various analysis/detection modules for various use cases.

1.3.2 Crowd monitoring

Festivals and other open air events are popular gatherings that attract many people. For every event organizer it is important to ensure safety and avoid incidents. Large groups of people, so-called mobs, can create potentially dangerous situations through bottlenecks, blocking escape routes, etc. Therefore, having the ability to monitor crowds and predict their behavior is very important to avoid such scenarios. Data can be obtained by evaluating video footage from past comparable events or by real-time video monitoring of current events [30]. By analyzing this footage, potentially dangerous situations can be avoided by acting on the mob formation, and safety regulations can be improved to help plan future events. Vision-based approaches face several limitations: mounted cameras cannot capture elements outside of their field of view, can't see in some conditions (for example during night time), and it is difficult to infer information from the raw footage [31].

Thermal cameras could help for crowd monitoring because they can operate in any condition. Having precise and detailed object recognition for the images produced by these cameras is crucial to extract information correctly. In this context, classifying images is not satisfactory: localization of the objects contained within the images is needed. This problem is known as object detection [32]. There are several challenges for object detection in thermal images: the image quality is very low compared to visible light images, there is a lack of color and texture information, temperature measures are relative measures, etc. This makes extracting discriminative information from these images difficult [33]. Most effort towards object detection on thermal images has gone towards human detection. Most of the proposed algorithms focus on feature extraction using the Aggregated Channel Features technique and boosting algorithms for learning [33-35]. Novel approaches make use of so-called deep learning with neural networks, which achieve very good results given enough data [36].


1.3.3 Goal

The goal of this dissertation is to explore the requirements of the possible backbone framework suggested in Section 1.3.1 and its potential software architecture. The architecture is evaluated by building a proof of concept implementation of the framework and evaluating it against the proposed requirements. To verify its use in developing drone thermal imaging applications, the specific mob-detection use case is investigated.

1.3.4 Related work

The Irish start-up DroneSAR [37] developed a search-and-rescue (SAR) drone platform allowing users to stream live images and video from a drone as it conducts a search for missing persons. The platform works with any camera, visual and thermal, but focuses on drones from the vendor DJI, DroneSAR's industry partner. Amazon introduced the Amazon Kinesis Video Streams platform in January 2018 as a new service for the Amazon Web Services (AWS) cloud platform. It allows users to stream live video from devices to the AWS cloud and build applications for real-time video processing [38]. The VIPER project by EAVISE, KU Leuven researched how thermal and visual video images could be used for real-time detection of persons using object detection algorithms such as deep learning [36]. The framework presented in this work combines elements from all three of these examples.

1.4 Outline

The remainder of this dissertation is organized as follows. Chapter 2 presents the requirements for the framework and the software architecture. Chapter 3 explores several state of the art technologies that can serve as backbone technologies for the framework. To test the viability of the software architecture, a prototype is implemented; Chapter 4 presents the different aspects of this prototype. Chapter 5 describes the experiment that is conducted to research the detection of mobs in thermal images. The results of both the framework and the detection experiment are presented and evaluated in Chapter 6. Finally, the conclusion and future research efforts are presented in Chapter 7.


Chapter 2

System Design

Finding out what users actually expect from a software system, and what makes it valuable for them, is of key importance for the success of that system. This chapter first explores the functional and non-functional requirements of the hypothetical framework suggested in Chapter 1, to find out what makes building the framework worthwhile. Well-known architectural patterns enable certain software requirements very well and can be used for building the software architecture of the framework. The framework software architecture combines some of these patterns and is presented in several documents.

2.1 Requirements analysis

Requirements are the stated life-cycle customer needs and objectives for the system, and they relate to how well the system will work in its intended environment. They are the aspects of the framework that will provide value to the users.

2.1.1 Functional requirements

Functional requirements (FR) describe the necessary task, action or activity that must be accomplished by the system, often captured in use cases and/or user stories [39, 40]. Use cases provide a summary of the features described in the user stories. Several external people and/or systems, defined as actors, interact with the framework to achieve a certain goal [40]. Three actors are identified for the framework: an end-user, who uses the framework to build an image processing application for a specific use case, such as the ones described in Section 1.2.2; a camera developer, who creates support software for a specific thermal camera for the framework, so that the end-user can buy and use their product; and an analysis software developer, who creates analysis software for a specific use case (tracking objects, detecting objects, etc.), so that the end-user can use their software to build his¹ application. The camera and analysis software developers are generalized to an actor called the plugin developer, who develops plugins to extend the functionality of the framework. These plugins are the building blocks with which the end-user can build image processing applications.

¹ To avoid unnecessary gender-specific pronoun clutter, the male pronoun is used by default.

The general user scenario for the framework proceeds as follows: an end-user wants to build an image processing application,



e.g. to detect fires in a landscape using a drone. He has a thermal camera for this and has read about hot-spot detection in video. The user looks for a plugin for the framework that can read video from his thermal camera, and for a plugin that does the hot-spot detection. If the user finds these plugins, he can add them to the framework and use them for the application he is building. He connects both plugins with the framework in a specific order to finish his application. For this simple example, the thermal camera plugin is connected to the hot-spot detection plugin, so that video coming from the thermal camera is transmitted to the detection plugin to find the fires in the landscape. The plugins in the application, and the specific order in which they are connected, is defined as a stream. This stream should be easily modifiable if additional or other functionalities are required. Suppose that the thermal camera delivers very high quality video and the detector cannot work with this, as it can only operate on low quality images. The end-user then searches for a plugin that scales the high quality video down to a quality accepted by the detector. This plugin is placed in between the thermal camera and the detector, and the application can work again. By continuously adding plugins to the framework, the number of possible applications that can be built with the framework increases, making the framework useable for more aerial thermal imaging use cases.

Instead of developing the application from scratch, users can use the already implemented plugins to build the applications in an ad hoc fashion. Because of this, the development time for such applications can be reduced, and users can switch hardware and/or algorithms easily. The FRs are summarized in a use case diagram that connects each actor with their respective requirements and the relationships among them [40], depicted in Figure 2.1. Trivial functionalities, such as launching and shutting down the framework, are omitted. The red use cases represent use cases to extend the functionality of the framework, the blue use cases represent use cases for building streams, and the white use cases modify the media processing of the stream. Some use cases depend on others: the blue and white use cases work with plugins of the framework, so their prerequisite use case is "Add plugin", as a plugin must be a part of the framework for a user to use it; the "(Un)Link plugins" and "Stop/Pause/Play stream" use cases depend on "Add plugins to stream", as a stream must contain plugins before they can be manipulated.

2.1.2 Non-functional requirements

A non-functional requirement (NFR) specifies how the framework is supposed to be, or in what manner it should execute its functionality [41]. These qualifications typically cover business and system quality requirements. A distinction is made between quality attribute requirements (QAR) and constraints. QARs are qualifications of the FRs or of the overall product, e.g. how fast a certain function must be executed or how resilient it must be to erroneous input. They are closely related to business requirements, which are specifications that, once delivered, provide value to the actors [40]. The QARs are captured in a utility tree [40] that has a root node representing the system. This root node is elaborated by listing the major QARs that the system is required to exhibit. Each QAR is subdivided into more specific QARs. To make the specific requirements unambiguous and testable, a scenario for the system or a specific function is written, and the scenarios are evaluated against business value and architectural impact [40]. A QAR can have High (H), Medium (M) or Low (L) business value and architectural impact. The business value is defined as the value for the end user if the QAR is enabled: High designates a must-have requirement, Medium is for a requirement which is important but would not lead to project failure, and Low describes a nice-to-have QAR, but not something that is worth much effort. Architectural impact defines how much the architecture must be designed towards the QAR to enable it: High means that meeting this QAR will profoundly affect the architecture, Medium means


that meeting this QAR will somewhat affect the architecture, and Low means that meeting this QAR will have little effect on the architecture. The following QARs are discussed: performance, interoperability, modifiability, usability, security and availability.

Figure 2.1: Use case diagram

Performance

Performance indicates the framework's ability to meet timing requirements. It characterizes the events that can occur and the framework's time-based response to those events. Latency is defined as the time between the arrival of the stimulus and the system's response to it [40]. The system latency is the latency between the initialization of an action sequence and the first change to the system noticeable by the user. Streaming latency is defined as the time between the arrival of a video frame and the arrival of the next video frame. The jitter of the response is the allowable variation in latency. Scalability is the number of users that can use the framework at the same time. The utility tree is presented in Table 2.1. The software industry has not defined a quantified 'good latency' for end-users, but a 4-second latency rule is often used as a rule of thumb [42]. The average response time for general framework commands should then be less than 2 seconds, with a standard deviation of 1 second, ensuring that most execution times respect the 4-second bound. As stated in Chapter 1, some use cases, such as fire fighting, require real-time video streaming. The notion of low latency real-time video loosely defines that video should be streamed almost simultaneously: if a camera is filming and a human user does not notice a latency between the video of the camera and the real world, the video stream is considered real-time. Real-time is thus a human time perception, and for visual inputs this bound is as low as 13 milliseconds: anything above 13 milliseconds becomes noticeable, anything above 100 milliseconds hinders human performance [43, 44]. However, the framework focuses on the use of thermal cameras, most of which don't produce frames faster than 8 frames per second, or 125 milliseconds per frame (see Section 3.1). More expensive cameras


can shoot at 25 frames per second, corresponding to a latency of 40 milliseconds; this bound is selected for the streaming latency, with a standard deviation of 20 milliseconds, remaining below the frame rate of less expensive cameras. The number of users that can use the framework at the same time is assumed to be low, as current aerial thermal imaging applications are operated by only one or a few users. The assumption is that a maximum of five users can use the framework at the same time. All of these requirements are quantified as relatively 'good' values. These bounds should be evaluated for user satisfaction by having users use a prototype of the framework in practice.

Attribute refinement  Id    Quality attribute scenario

Latency               PS-1  The average execution time of all framework commands does not exceed 2 seconds (H, M)
                      PS-2  A playing stream should have an upper limit of 40 ms streaming latency (H, H)

Jitter                PS-3  The average standard deviation of the execution time of all framework commands should not exceed 1 second under normal operation (H, M)
                      PS-4  The average standard deviation in streaming latency should not exceed 20 ms under normal operation (H, H)

Scalability           PS-5  The system should be usable by five users at the same time (M, M)

Table 2.1: Performance utility tree

Interoperability

Interoperability is the degree to which two or more independently developed systems can usefully exchange meaningful information via interfaces in a particular context [40]. The framework will interoperate with cameras and analysis modules via the framework plugins. Henceforth, the camera and analysis plugins will be referred to as a Producer plugin and a Consumer plugin. A Producer plugin is thus a plugin that represents a camera that produces video, and a Consumer plugin is a plugin that represents a module that processes or consumes video. The framework interacts with the Producer and Consumer plugins, exchanging requests to link them together, control their media processing, etc. The more correct exchanges there are between the two, the better the user can use the plugin for building applications with the framework. This QAR is quantified by the ratio of requests made by the framework that were correctly interpreted by the plugin (successful requests) to the total number of requests during a runtime of the framework [40]. Intuitively, one argues that the framework must achieve perfect interoperability, with a perfect exchange success rate of 100%. Reality, however, tends not to agree with perfection, and it can never be excluded that some exchanges will be incorrect. Therefore, it is better to aim for a good interoperability measure and prepare for failed exchanges, instead of naively assuming the framework will be perfect. An exchange success rate of 99.99% is selected; the motivation for this bound is as follows. A plugin is assumed to be always correct up to the first mistake, after which the plugin is faulty and the fault needs to be identified and prevented from reoccurring. An exchange success rate of 99.99% means that if 10,000 plugins are installed and used by the framework, only one will fail during uptime. For one plugin during framework uptime, the mean time between failures is then 10,000 exchanges. It is suspected that this


number of exchanges is very high for normal framework use. Because the possibility of faulty exchanges is acknowledged, the framework will need to implement a fallback mechanism to compensate. The utility tree is presented in Table 2.2.

Attribute refinement        Id    Quality attribute scenario

Syntactic interoperability  IS-1  The framework exchanges requests with a Producer plugin (known at runtime) with a success ratio of 99.99%; incorrect requests are undone by the framework and logged (H, H)
                            IS-2  The framework exchanges requests with a Consumer plugin (known at runtime) with a success ratio of 99.99%; incorrect requests are undone by the framework and logged (H, H)

Table 2.2: Interoperability utility tree

Modifiability

Modifiability is the cost and risk of changing functionality of the system [40]. One of the most important values of the framework is the modifiability of the supported thermal cameras and analysis modules. The framework needs to be extendable with new functionalities, enabling developers to add their support software in the form of a plugin. End-users should be able to modify the components that they use for their image processing applications easily and quickly, to allow for interchangeable hardware and software and to quickly set up new applications. Modifiability is considered in two environments: runtime, defined as the periods during which the system is up and running, and downtime, defined as the periods during which the system is not active. The utility tree is presented in Table 2.3.

To enable users to choose the extensions they need, the framework will need a distribution service that contains all plugins available for the framework, from which a user can select and install plugins for their local version of the framework. Adding new plugins to the distribution service should not affect versions of the framework installed by users. When a user adds a plugin from the distribution to his version of the framework, the framework should only reload once before making the plugin useable, for user comfort. Deployability is defined as the different device configurations that specify how the framework can be deployed. If the framework can be deployed in different fashions, this can increase the value for the end-user. Suppose a fire fighting use case in which a forest fire is monitored on site. Computationally powerful devices might not be available on site, so moving some plugins processing media to a remote server or cloud could still allow usage of the framework. Perhaps the device processing the media is already remote, for example a drone on security patrol; in this case, access via a remote device such as a smartphone is desirable. This leads to the deployment configurations described in the utility tree.

Usability

Usability indicates how easy it is for the user to accomplish a desired task and the kind of user support the system provides. Learnability indicates how easy it is for a user to gain knowledge on how to use the framework. Errors are the number of errors a user can make when trying to execute certain functions [40]. The utility tree is presented in Table 2.4.


Attribute refinement     Id    Quality attribute scenario

Run time modifiability   MS-1  Support for a new Producer plugin should be added to the distribution service within one day, without the framework having to restart (H, H)
                         MS-2  Support for a new Consumer plugin should be added to the distribution service within one day, without the framework having to restart (H, H)
                         MS-3  End-users should be able to extend their framework with new functionalities by installing new Consumer and Producer Plugins (H, H)
                         MS-4  End-users should be able to modify the plugins used to build their stream (H, H)

Down time modifiability  MS-5  New Producer plugins can be installed to the local framework at runtime; the framework can only reload once before the plugin is useable (H, H)
                         MS-6  New Consumer plugins can be installed to the local framework at runtime; the framework can only reload once before the plugin is useable (H, H)

Deployability            MS-7  The system should be deployable on a combination of a smartphone and a cloud/remote server environment (H, H)
                         MS-8  The system should be deployable on a personal computer or laptop (H, H)
                         MS-9  The system should be deployable on a smartphone, laptop and cloud environment (H, H)

Table 2.3: Modifiability utility tree


Security

Security is a measure of the system's ability to protect data and information from unauthorized access, while still providing access to users and systems that are authorized. An action taken against the system to cause it harm is called an attack. Security has three main characteristics. Confidentiality is the property that data or services are protected from unauthorized access. Integrity is the property that data or services are protected from unauthorized manipulation. Availability is the property of the system maintaining its functionality during an attack. Authentication verifies the identities of the parties to an interaction, checks if they are truly who they claim to be, and grants or revokes access [40]. Security is important for the framework if it is deployed on multiple devices that use a public network to communicate. The utility tree is presented in Table 2.5.

Availability

Availability in a general context (not only security) refers to how available the software is to carry out its functionality. Downtime is a measure of the time that the system is unavailable to carry out its functions. The utility tree is presented in Table 2.6. Availability is specified for the part of the framework that distributes the plugins.


Attribute refinement  Id    Quality attribute scenario

Learnability          US-1  A user should be able to learn how to build an image processing application in at most one hour (H, L)
                      US-2  An experienced developer should be able to start developing a Consumer plugin for the system within one day (H, L)
                      US-3  An experienced developer should be able to start developing a Producer plugin for the system within one day (H, L)

Errors                US-4  A user should not make more than 3 errors to build an image processing application (H, L)

Table 2.4: Usability utility tree

Attribute refinement  Id    Quality attribute scenario

Confidentiality       SS-1  Streams created by a user can only be accessed by that user and not by any other entity (H, L)

Integrity             SS-2  Streams can't be manipulated without authorization by the user that made the streams (H, L)

Availability          SS-3  During an attack, the core functionality is still available to the user (H, M)

Authentication        SS-4  Users should authenticate with the system to perform functions (H, L)
                      SS-5  Developers should authenticate their plugins before adding them to the framework (H, L)

Table 2.5: Security utility tree

Architecturally significant requirements

Architecturally Significant Requirements (ASR) are the requirements that are the most important to realize according to business value and that have the most impact on the architecture. From the utility trees and the measures of the quality attribute scenarios, the QARs modifiability, interoperability and performance are identified as ASRs.

2.2 Patterns and tactics

An architectural pattern is a package of design decisions that is found repeatedly in practice, has known properties that permit reuse, and describes a class of architectures. Architectural tactics are simpler than patterns: they typically use just a single structure or computational mechanism and are meant to address a single architectural force. Tactics are the "building blocks" of design, and an architectural pattern typically comprises one or more tactics [40]. Based on the ASRs, several tactics are listed in Table 2.7 that are used for the base pattern selection. The explored patterns are layers, event-driven architecture,


microkernel and microservices.

2.2.1 Layers

The layered pattern divides the software into units called layers, each of which performs a specific role within the application. Each layer is allowed to use the layer directly beneath it via its interface. Changes in one layer are isolated if the interfaces don't change, enabling MT-1, MT-2 and MT-5 [40]. While changes can be isolated to the relevant layers, they remain difficult due to the monolithic nature of most implementations of this pattern. Layers contribute to a performance penalty due to the "architecture sinkhole phenomenon", in which requests simply propagate through layers for the sake of layers [45].

2.2.2 Event-driven architecture

This pattern consists of several event publishers that create events and event subscribers that process these events. The publishers and subscribers are decoupled by an event channel, to which the publishers publish events and which forwards them to the event subscribers. The subscribers should have a single purpose and execute asynchronously. Since the publishers and subscribers are single-purpose and completely decoupled from other components via the event channel, changes are isolated to one or a few components, enabling MT-1, MT-2, MT-4, MT-5 and MT-7. If the event channel adds a discovery mechanism, IT-1 can also be enabled. Overall, the pattern is relatively easy to deploy due to the decoupled nature of the components. Performance in general can be very high through the asynchronous nature of the architecture, enabling PT-6 and PT-7. If the event channel is tweaked to contain extra functionality, PT-1, PT-3, PT-8, PT-9, PT-10 and PT-11 can be enabled as well. If the components have a limited event response, then PT-2 and PT-5 can also be enabled. Development can be somewhat complicated due to the asynchronous nature of the pattern [40, 45].

2.2.3 Microkernel

The microkernel pattern allows the addition of application features as plugins to the core application, providing extensibility as well as feature separation and isolation. The pattern consists of two components: a core system, called the kernel, and plugins.

Attribute refinement  Id    Quality attribute scenario

Downtime              AS-1  The system should be up 99.5% per year; this means the system has an allowed scheduled downtime of 43 hours and 30 minutes per year for maintenance (M, L)
                      AS-2  The maximal duration of the interval during which the system is unavailable is 3 hours (M, L)

Network               AS-3  If there is no active network connection, the local device can be used for operation of the framework (H, H)

Table 2.6: Availability utility tree


The business logic is divided between independent plugins and the kernel. The kernel contains only the minimal functionality required to make the system operational. The plugins are standalone, independent components that contain specialized processing, additional features and custom code, meant to enhance or extend the core system to produce additional business capabilities. In many implementations, plugins are independently developed third-party modules. Changes can largely be isolated and implemented quickly through the loosely coupled plugins; all MTs can be enabled. Depending on how the pattern is implemented, the plugins can be dynamically added to the kernel at runtime. Via a resource discovery service in the kernel, the ITs can be enabled. In general, most applications built using the microkernel pattern perform well, because applications can be customized and streamlined to only include the features that are needed [45].

2.2.4 Microservices

Microservices is an architectural pattern that structures an application as a collection of loosely coupled services that implement business capabilities. Each component of the pattern is deployed as a separate unit that can be deployed on one device or multiple devices. The components can vary in granularity, from a single module to a large portion of the application. The components contain one or more modules that represent either a single-purpose function or an independent portion of a business application [45, 46]. Due to the separately deployed units, changes are isolated to individual components, enabling all MTs. Via service discovery mechanisms, the ITs can also be enabled. The microservices pattern supports distributed deployment of the software across multiple devices by design. This pattern is not known to produce high-performance applications, due to the distributed nature of the pattern, which relies on communication via a network [45, 46].

2.2.5 Comparison of patterns

Table 2.7 summarizes the analysis of the patterns. A score is given based on how well the pattern enables each tactic: Low means that the pattern does not naturally enable the tactic; Medium indicates the pattern can be implemented with the tactic but does not include it itself; High means the tactic is enabled in the pattern; Excellent means that the tactic plays a key role in the pattern.

The microkernel pattern and microservices pattern both enable most tactics. The microkernel pattern implements extendability of the framework by design using plugins, which is the main idea for the framework, and thus is an excellent base pattern. Interoperability and deployability of these plugins can be ensured by the microservices pattern, as it designs the microservices to have well defined interfaces for interoperability and allows for the framework to be deployed in a distributed fashion. The architecture presented below is a combination of both the microkernel pattern and the microservices pattern.


Tactic Layers Event-driven Microkernel Microservices

MT-1 Split module Medium High High Excellent

MT-2 Increase semantic coherence Medium High High Excellent

MT-3 Encapsulate Medium High High Excellent

MT-4 Use an intermediary Medium High High Excellent

MT-5 Restrict dependencies High High High Excellent

MT-6 Anticipate expected changes Low High Excellent Excellent

MT-7 Abstract common services Low High Excellent Excellent

MT-8 Defer binding / Runtime registration Low Low Medium High

IT-1 Discover services Low Low High High

IT-2 Orchestrate interface Low Low High High

IT-3 Tailor interface Low Low High High

PT-1 Manage sampling rate Low High High Medium

PT-2 Limit event response Low High High Medium

PT-3 Prioritize events Low High High Medium

PT-4 Reduce overhead Low High High Low

PT-5 Bound execution time Low High High Medium

PT-6 Increase resource efficiency Low High High High

PT-7 Introduce concurrency Low High Low High

PT-8 Maintain copies of computation Low High Low High

PT-9 Load balancing Low High Low High

PT-10 Maintain multiple copies of data Low High Low High

PT-11 Bound queue sizes Low High Low Medium

PT-12 Schedule resources Low High Low Medium

Table 2.7: Comparison of how well the discussed patterns enable the tactics needed for the ASRs


2.3 Software architecture

The software architecture is documented in three document categories: static views, dynamic views and deployment views. The static views comprise the different components of the system and their relationships among each other. The dynamic views describe the runtime behavior of the system. Finally, the deployment views provide different configurations of how the system can be deployed on different devices [47].

2.3.1 Static view

Figure 2.2 presents an overview of the architecture using a component-connector UML diagram. Components are the boxes that represent different software entities that exist at runtime. The components have interfaces through which they interact with other components. These are indicated using the 'lollipop' notation, with the 'ball' representing the interface that a component provides and a socket indicating that another component is using this interface. The type of data exchanged is noted next to the interface. Multiple boxes indicate that multiple components of the same kind can exist at runtime [48].

The architecture consists of the following core components: Client Interface, Producer, Stream, Consumer, Producer Distribution, Consumer Distribution, Producer Plugin and Consumer Plugin. The clear components in Figure 2.2 form the core framework, which each user needs to install to use the framework. The colored components form a distribution service for framework plugins to extend the functionality; they are not installed with the core framework but run as remote instances with which the user can interact to extend his version of the core framework with new plugins. A user can use the framework via the Client Interface, building streams that are maintained in the Stream component. The Stream component makes requests to the Producer and Consumer components to activate and control the selected plugins to build the stream. Additional plugins can be added to the framework and are distributed via the Producer and Consumer Distribution components. The architecture implements a hybrid combination of the microservices and microkernel patterns. Each presented component is a microservice that implements its own interface to interact with other components. The Producer and Consumer components act as kernels in the microkernel pattern, with the Producer and Consumer plugins acting as plugins in the microkernel pattern. These patterns enable the tactics needed to meet the requirements presented in Section 2.1.

Communication protocol

To allow the microservices to communicate, a communication protocol must be designed. Communication protocols can roughly be classified in two categories: synchronous and asynchronous. Synchronous protocols block on requests, which means that the client waits for a response from the server and can only continue executing when a response is received. This makes a synchronous protocol inherently more reliable, but also slower. An example synchronous protocol is the Hyper Text Transfer Protocol (HTTP). Asynchronous protocols just send messages and do not block on the response. This makes the protocol less reliable, but also faster [49].

There are two types of traffic exchanged between microservices. First, there are the command requests that are exchanged between microservices to edit resources or change state. Second, there are the video frames that are exchanged between Producer and Consumer Plugins. Both types of traffic have different requirements. The commands must be communicated reliably


and need to be executed once and only once. Reliability is more important than latency, so a synchronous protocol is preferred. Microservices traditionally implement the synchronous HTTP protocol with Representational State Transfer Application Programming Interfaces (REST APIs) that specify the application endpoints as textual resources [45]. This common protocol is used for the exchanged command requests in the framework.

Figure 2.2: Overview component-connector diagram of the architecture

The video frames need to be sent with low latency at a high frequency, but reliability is less important, so an asynchronous protocol is preferred. For video streaming, the Real-time Transport Protocol (RTP), running on top of the User Datagram Protocol (UDP), is selected, as it enables real-time transfer of data between processes [50]. RTP defines a standardized packet format to transmit video and audio over a network. It sequences each packet with a sequence number and a timestamp, which allows the application to detect missing packets and latencies in the network. The UDP protocol is a low latency, asynchronous transport protocol, as it doesn't guarantee packet delivery.

The recommended codec for transmitting video media is Motion-JPEG, which encodes video frames as separately encoded JPEG images. This makes analysis and processing in subsequent plugins easier, as only the received frame is needed to perform the analysis or processing. Other video compression formats, such as MPEG-4, use key-frames and object oriented differential compression formats. If a key-frame is received via the stream, the frame can be used as is. If a reference frame is received, the receiver needs to wait for the corresponding key-frame to be received to be able to construct the full video frame for analysis. This introduces extra complexity and lower quality detection, which is a clear trade-off against the quality and simplicity that MJPEG offers [51, 52].
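As an illustration, the following Python sketch shows how a Producer could packetize individually encoded JPEG frames with an RTP-style sequence number and timestamp over UDP. It is a simplification under stated assumptions: real RTP uses a 12-byte header (RFC 3550) and fragments large JPEG frames over multiple packets (RFC 2435); the listener address and all names here are hypothetical.

    import socket
    import struct
    import time

    # Hypothetical listener (Consumer plugin) address.
    LISTENER = ("127.0.0.1", 5004)

    def send_frame(sock, seq, jpeg_bytes):
        # 16-bit sequence number and 32-bit timestamp: the two RTP fields the
        # framework relies on to detect missing packets and network latencies.
        timestamp = int(time.time() * 1000) & 0xFFFFFFFF
        header = struct.pack("!HI", seq & 0xFFFF, timestamp)
        sock.sendto(header + jpeg_bytes, LISTENER)  # UDP: no delivery guarantee

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for seq, frame in enumerate([b"\xff\xd8 ... \xff\xd9"] * 3):  # placeholder JPEG payloads
        send_frame(sock, seq, frame)
        time.sleep(0.04)  # 25 fps, i.e. one frame every 40 ms

Because each packet carries a complete, self-contained JPEG frame, a Consumer can start processing any received packet immediately, which is exactly the property that motivates the MJPEG choice above.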

Applying these protocols to the architecture results in the network topology depicted in Figure 2.3. The full lines represent communication via HTTP on top of the Transmission Control Protocol (TCP). The dashed lines represent the RTP protocol on top of the UDP protocol. The boxes represent the different microservice components of the framework.


Figure 2.3: Framework network topology. Each box is a microservice component of the framework. The full lines indicate communication over the HTTP/TCP protocol, the dashed lines indicate communication over the RTP/UDP protocol.

Client Interface

The Client Interface is the interface through which end-users can interact with the framework. Figure 2.4 presents the detailed component-connector diagram. The Client Interface consists of a User Interface (UI) component and an API Gateway component. Devices can make requests to the Client Interface via the Client Requests interface provided by the API Gateway. The UI provides the UI Operation interface that is used by end-users to control the framework; this can be either a visual or a textual interface. The UI actions are translated to client requests that are forwarded to the API Gateway using the Client Requests interface. The API Gateway translates the client requests and forwards them to the other core framework components.

Figure 2.4: Client Interface detailed view

Stream

The Stream component maintains the logical representation of the streams built by the end-user for his image processing application. Figure 2.5 presents the detailed component-connector diagram.


Figure 2.5: Stream detailed view

It consists of an API, a Stream Manager and several Stream Model components. The API provides the Stream Commands interface used by the Client Interface to interact with the framework; it translates incoming requests to commands for the Stream Manager, which can then execute these commands. Commands include creating a new stream, modifying the stream layout, modifying the stream state, etc. The Stream Manager creates and manages multiple streams, represented by the Stream Models. So the end-user builds Stream Models to create image processing applications. The Stream Model represents the logical model of these image processing application streams. As stated before, a stream consists of several plugins processing media, placed in some order, that are linked by the framework. Figure 2.6 illustrates this concept.

Figure 2.6: Logical model of a stream. The arrows represent the flow of media through the stream.

Logically, the Stream Model is represented as a tree with multiple roots and multiple leaves. The framework builds streams by initializing the needed plugins and connecting them in order. In the example Stream Model, plugins receive media from multiple source plugins and forward media to multiple targets. The Stream Model has a global state that represents the cumulative state of all plugins. To transition the global state from A to B, all plugins need to transition from A to B. This is done by first making the transition on the leaves of the Stream Model, after which the transition propagates to the root nodes. This ensures that no media is lost, because the first transitioned plugins can't process anything, as there is no media being put into the tree yet.
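A minimal sketch of this leaves-first transition order, assuming a hypothetical stream layout stored as a mapping from each plugin to its listeners; transitioning in this order guarantees every plugin changes state before the plugins that feed media into it.

    from graphlib import TopologicalSorter

    # Hypothetical stream layout: each plugin maps to its listeners,
    # i.e. media flows from key to values.
    links = {
        "thermal-camera": ["scaler"],
        "scaler": ["hotspot-detector"],
        "hotspot-detector": [],
    }

    # TopologicalSorter treats the mapped values as predecessors, so feeding it
    # the listener map directly yields the leaves (sinks) first -- exactly the
    # leaves-to-roots order the global state transition requires.
    for plugin in TopologicalSorter(links).static_order():
        print(f"transition {plugin}")  # stand-in for the REST call to the plugin
    # prints: hotspot-detector, scaler, thermal-camera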


Producer and Consumer plugins

A Plugin represents an independent media processing element, either of the Producer type (such as a thermal camera) or the Consumer type (such as an object detection software module). Plugins are deployed as standalone microservices providing a REST API interface that the framework uses to control the plugin. Figure 2.7 represents a general plugin model. A plugin receives media from other plugins, called the sources, processes this media and forwards it to other plugins, called the listeners. A Producer plugin only has listeners; a Consumer plugin has both sources and listeners. Merging the media from multiple sources and forwarding the processed media to multiple listeners is the responsibility of the plugin.

Figure 2.7: Plugin model

The plugin REST API should at least provide a state resource, representing the state of how the plugin is processing media; a sources resource, representing the sources from which the plugin receives media to process; and a listeners resource, representing the listeners to which the plugin transmits the processed media. Only Consumers have both the sources and listeners resources, as Producer Plugins produce their own media source and hence can only have listeners.

To indicate if and how the plugin is actively processing media, a finite state machine is implemented. The state transition diagram is presented in Figure 2.8. A plugin can be in four possible states: INACTIVE, STOP, PLAY and PAUSE. When a plugin is in the INACTIVE state, no active microservice is running the plugin. This is the initial state for all plugins of the framework. This state is only visible to the framework, as in this state the plugin is not instantiated. When a plugin is in the STOP state, the framework has instantiated a microservice running the plugin. The plugin is listening for commands on its API but is not processing any media. This state is visible to the plugin. In the PLAY state, a plugin is processing media received from its source(s), transmits processed media to its listener(s), and is listening for commands. When in the PAUSE state, media processing is paused, but media buffers are kept. This is to decrease the latency when the plugin transitions back to the PLAY state, since the plugin can continue processing from the point where it was paused. The difference with the STOP state is that when transitioning to the STOP state, the plugin clears its media buffers.

The plugin starts in the INACTIVE state. When a microservice running the plugin is instantiated by the framework, the plugin initializes itself in the STOP state. From the STOP state, the plugin can transition to the PLAY state to process media; this transition is only successful if sources and listeners are registered with the plugin. From the PLAY state, a transition to both


the STOP state and the PAUSE state can be made, which stops the processing of media and respectively drops or keeps the media buffers. The plugin cannot make multiple state transitions per command. When a transition is made to INACTIVE, the framework first transitions the plugin to the STOP state, after which the INACTIVE state can be reached.

Figure 2.8: The state transition diagram for a plugin

A source/listener has the following fields: hostname, the hostname of the microservice running the plugin, and port, the port on which the source/listener is reachable.

On the sources and listeners resources, the HTTP GET and POST methods must be provided: GET retrieves the sources/listeners and their details, and POST adds a new source/listener to the plugin. Both resources additionally need to provide an individual endpoint per source/listener, on which GET, PUT and DELETE must be provided for individual manipulation of the source/listener: GET retrieves the details, PUT updates the fields of a source/listener, and DELETE removes a source/listener from the plugin.

Producer and Consumer

The Producer and Consumer components are responsible for interacting with and managing the Producer/Consumer plugins used in the streams. Figure 2.9 presents the component-connector diagram of the Producer and Consumer components. Both components have a similar architecture but are separate components. This is because their plugin models differ and they are suspected to often be deployed on different devices with specific hardware requirements: Producer Plugins could be deployed on the thermal camera itself, having a very specific operating system, whereas a Consumer plugin might need access to specific processors to speed up its execution.

The Producer and Consumer consist of the following components: API, Kernel, Plugin Model and Plugin Manager. The API translates requests coming from the Stream component to commands for the Kernel. The Kernel implements the core functionalities, such as activating (deploying) and deactivating plugins, managing their state and manipulating their resources. It creates a Plugin Model for each Plugin that the framework has installed. This model represents a plugin logically on the framework level and keeps track of the Plugin resources. The Plugin Manager manages the plugins that were added to the framework, stored in the Plugin Directory. It manages the plugin installations, adding updates or installing additional plugins that can be retrieved from the Producer and Consumer Distribution components.


(a) Producer component-connector diagram

(b) Consumer component-connector diagram

Figure 2.9: Component-connector diagrams of the Producer and Consumer modules

Producer and Consumer Distribution

The Producer and Consumer Distribution components are responsible for managing and maintaining the plugins for the framework. They act as online software repositories from which local versions of the framework can retrieve new plugins. The component-connector diagrams are presented in Figure 2.10. The Distribution components consist of the following subcomponents: API, Plugin Manager and Plugin Tester. Plugin Developers can make requests to the API, which translates these requests to Create, Read, Update, Destroy (CRUD) commands for the Plugin Manager. The Plugin Manager executes these commands on the Plugins that are kept in the Plugin Repository. The quality of the framework depends on the quality of the plugins that it offers. Therefore, plugins should be thoroughly tested before being added to the framework, to guarantee quality. The Plugin Tester component is responsible for this testing. Tests should include testing whether the plugin implements the Plugin Model correctly, whether the plugin meets the performance requirements, etc. When a plugin passes these tests, it is added to the Plugin Repository, so that end-users can install the plugin and use it for their applications.


(a) Producer Distribution (b) Consumer Distribution

Figure 2.10: Producer and Consumer Distribution component-connector diagrams

2.3.2 Dynamic views

Dynamic views depict the behavior of the system and complement the static views. They are documented using sequence diagrams that show an explicit sequence of messages between architecture elements, describing a use case [40]. Two key use cases are presented here: adding a plugin to the stream and linking plugins to build the stream.

Add plugin to stream

Figure 2.11 presents the sequence diagram for adding a Producer plugin to the framework. The framework is assumed to be running, the user has created a stream S, and the Producer Plugin A is correctly installed. The end-user executes the command to add A to stream S on the Client Interface, which passes the command to the Stream component. The Stream component requests the creation of a microservice instance of A, which is created by the Producer Kernel. When the Producer Plugin is instantiated, the Producer Kernel creates a Plugin Model of A and adds it to its references, so that the instance of A can be reached for future commands. Afterwards, the Stream Manager is informed of the success, upon which the Stream Manager can add A to the Stream Model, ready to be linked with other plugins. The user is notified of this success and can continue building. If A could not be instantiated (due to not being installed, not being installed correctly, etc.), A is marked as 'broken' and the user is notified that the action could not be completed. When a plugin is marked as 'broken', it can no longer be used and needs to be reinstalled. The sequence diagram for adding a Consumer Plugin is similar, but replaces the Producer components with the Consumer components.

Link plugins

Figure 2.12 presents the sequence diagram for linking two plugins in a stream. In the sequence diagram, two Consumer Plugins A and B are linked; this can be extended to a Producer Plugin linking with a Consumer Plugin. The framework is assumed to be running, the user has created a stream S, and the plugins A and B have been instantiated and added to the stream. The end-user executes the command to link A and B in stream S on the Client Interface, which passes the command to the Stream component, which checks if the link is valid for the Stream Model S. Linking can only be done if the stream is in the STOP state and if the plugins are already in the stream. If the link is valid, the Stream Manager can begin linking the plugins. To link


Figure 2.11: Add a Producer Plugin to a stream

the plugins in the order A-B, A is added as a source for B, and B is added as a listener for A. These subsequences are found in their corresponding frames in the diagram and are very similar. The Stream Manager makes the request to add the source/listener to the Kernel, which finds the corresponding plugin and makes the request on the corresponding Plugin Model. If the Plugin succeeded, the Plugin Model is updated and the Stream Manager is notified of this success. If both plugins have successfully set the source and listener, the Stream Model layout is updated with the link. Should the source/listener request fail for one of the plugins, the change is rolled back and the end-user is notified.

2.3.3 Deployment views

The different deployment configurations are illustrated via deployment diagrams using the UML 2.5 Deployment specification [48]. 'Host' specifies the device on which components are deployed. 'Microservice' indicates the isolated environment in which components are running. These isolated environments on the host are realized as software containers that enable portability of the components to other deployment configurations; this concept is further discussed in Section 3.3. The Producer and Consumer Distribution components were left out of the diagrams, as they are always distributed on a different host than the core framework. Two deployment configurations are presented: the local configuration, which deploys the components on a single device, and the distributed configuration, which deploys each component on a separate device. These configurations are presented in Figure 2.13.


Figure 2.12: Link two plugins in a stream. The 'format request' and 'translate request' actions in the API components have been omitted to reduce clutter in the diagram, but are executed by the API components.


Local configuration deployment

The local configuration deploys the framework on a single local device. The configuration is depicted in Figure 2.13a. Because the framework is deployed as a whole, it can operate offline. This configuration is useful for image processing applications that can't rely on a stable network connection, for example in remote locations or densely built-up areas. The components are still deployed as separate microservices, due to the architecture of the framework. This has an impact on the performance of the framework, because every interaction between components uses either the HTTP message protocol or the RTP protocol, which introduces extra overhead compared to direct invocation of commands.

Distributed configuration deployment

The distributed configuration deploys the framework on multiple devices. The components are distributed over these devices, made possible by the microservice isolation and communication protocols. This configuration is depicted in Figure 2.13b. In this configuration each component of the framework must have a reliable network connection to communicate with the other framework components. This configuration could be used, for example, for a security application. The end-user has the Stream module running on a master node that controls several cameras. The end-user can configure his image processing application through the Client Interface running on his device, which communicates with the Stream module running on the master node. The master node can control each camera by communicating with the Producer component. If, for example, the security application requires autonomous detection of trespassing people, a computationally intensive task, the Consumer Plugins could need dedicated hardware that is only available on another device. The Consumer component can then be deployed on that dedicated device and the Stream component can again communicate with it over the network. The success of this configuration depends on the availability and capacity of the network. If the network fails, commands and media can't come through and the framework can no longer execute. Due to the distributed nature, performance will also be worse compared to the local configuration, because each request between the components travels over a network that can experience delays.


(a) Local configuration deployment diagram (b) Distributed configuration deployment diagram

Figure 2.13: Deployment diagrams


Chapter 3

State of the art and technology choice

To build and test a proof of concept implementation of the architecture presented in Chapter 2, several state of the art technologies can be used as support for the framework. These are presented in Sections 3.1, 3.2, 3.3 and 3.4. For each category a choice is made that will serve as the basis for the implementation of the proof of concept, discussed in Section 3.5. Readers already familiar with the presented technologies can safely skip ahead to Section 3.5.

3.1 Thermal camera options

This section aims to provide an overview of some currently commercially available thermal cameras. The overview does not cover all products offered by all vendors. The data was gathered in September 2017, so some products may since have been discontinued and new products launched. Several parameters are collected for each product. Section 3.1.1 discusses why these parameters are important to assess the quality of a thermal camera. Section 3.1.2 aggregates these parameters and presents insights into the data. The full list of specifications can be found in Appendix B.

3.1.1 Parameters

The following parameters were considered for the comparison: price, physical specifications, image quality, thermal precision, interfaces, energy consumption, help and support, user experience and auxiliary features.

Price

Thermal cameras are relatively expensive when compared to visible light cameras. For example, a 20 megapixel (MP) visible light camera can cost as little as 100 euro, while thermal cameras, having a much lower image resolution, can cost as much as 15,000 euro. Prices for thermal cameras cover a very wide range, and budgets are limited in practice.


Physical specifications

Two specifications are considered: the weight of the camera and the dimensions of the camera. Drones have a maximal carrying capacity, and battery life drains faster when carrying heavier loads. Lighter and smaller cameras are therefore preferred for usage with drones, though these often offer lower image quality and fewer features than the heavier cameras.

Image quality

Image quality specifies how much information an image can possibly hold. It consists of four parameters: resolution, capture frequency or frame rate, field of view and radiometric information. Image resolution is the amount of detail an image holds. Higher resolution cameras can capture more details in a scene, resulting in a sharper image that holds more information. Due to more details, smaller objects can also be seen, allowing scenes to be viewed from larger distances. Drones capture images from relatively large distances, so good resolutions are required for the images to be useful. Image resolution is measured in pixel density, presented as the product of the number of pixels in the width and height of the image. The highest resolution found for the compared cameras is 640 x 512 pixels. Some cameras offer a visual camera next to the thermal camera. This allows an overlay of the visual image and the thermal image, so-called Multi Spectral Dynamic Imaging (MSX). This creates artificially sharper images, because edges are more visible in the visual image. Figure 3.1 depicts a thermal-only image and an MSX image of a dog; the MSX image is clearly sharper. MSX is a lower-cost solution to produce sharper images compared to increasing the thermal resolution, as visible light cameras are less expensive [7].

(a) Thermal (b) MSX

Figure 3.1: Thermal image and MSX image of a dog

The capture frequency or frame rate dictates how many frames the camera can capture per second. Higher frequency cameras are able to track dynamic scenes better. The field of view is the angle through which the camera is sensitive to thermal radiation and determines the extent of the world that can be seen by the camera. Bigger fields of view can capture more of the environment in one image. Most cameras allow various lenses to be mounted onto the camera, which allows for greater flexibility in choosing the field of view. Radiometric image information is thermal information embedded with the infrared image that can be analyzed after recording. Radiometric information characterizes the distribution of the thermal radiation's power in space and specifies the temperature per pixel exactly. Regular thermal images use a relative scaling of temperatures that are mapped onto a colorspace, with some color being the hottest color in the image and another color the coldest. For example, in Figure 3.1a the Iron color scheme is used, which maps the cold regions of the image on blue color variants and warmer regions on red and yellow variants. Radiometric information can give a very detailed description of the radiation pattern of a scene.

Thermal precision

Thermal precision specifies the temperature range, the sensitivity and the accuracy of the temperature measurements. The temperature range indicates the minimum and maximum temperatures a camera can detect. A larger temperature range comes with a trade-off in sensitivity and accuracy. Often cameras offer different modes of operation and operate using different intervals according to the accuracy needed in a scene. Sensitivity indicates the ability of the camera to record finer distinctions in temperature. Accuracy is the margin of error for temperature readings on the thermal camera. An accuracy of 5 degrees Celsius for small temperature ranges and 20 degrees Celsius for large temperature ranges is commonly found. The increase in error margin is a trade-off for the larger temperature interval. Objects emit infrared waves in various forms (due to black-body radiation [7]). To accurately compare temperatures, cameras often implement emissivity corrections that normalize the measurements.

Interfaces

Cameras can communicate with other devices via several interfaces during use. Cameras mounted on a drone cannot be accessed during flight and need these interfaces to transfer data. USB and HDMI are the most commonly found interfaces to connect the camera with an on-board processing unit, gimbal or battery. MAVLink [53] is a very lightweight, header-only message marshalling library for micro air vehicles (drones). A camera providing this interface allows for a very efficient communication scheme to control the camera remotely. Other interfaces include Bluetooth and Wi-Fi.

Energy consumption

A device mounted on a drone has a limited energy source at its disposal. The less energy the camera consumes, the longer the drone can operate. This can even lead to lighter batteries being used in-flight, reducing the carried weight and therefore also the energy consumption. Typically the energy consumption of a camera is much lower than that of the drone itself, so this is a minor specification. Input voltage and power consumption are specified.


Help and support

How the camera is supported by the company has a big impact on the ease of developing applications for the camera platform. User manuals, phone or email support and FAQs are very helpful. Should the camera malfunction, a product warranty is necessary to recover the broken product.

User experience

The user experience is another important factor, as there is a difference between the technical specifications and the actual experience of the user. The user experience is measured in a number of good and a number of bad reviews. Reviews are scored from zero to five stars, with zero being a very bad experience and five a very good experience. A good review is scored three stars or more, a bad review less than three stars.

Auxiliary features

Some cameras offer even more features than the ones mentioned above. These can be a connection with the Global Positioning System (GPS) to indicate where images were captured, a software application to interact with the camera, analysis functionality, tracking, etc.

3.1.2 Comparative analysis

It can be seen that FLIR is the market leader in thermal solutions for drones. They offer the largest product line, and products from other companies often utilize one of their camera cores. Figure 3.2a plots the retail price against the thermal resolution. Cameras with high and low resolutions are found across all price ranges; clearly, other features determine the price of a thermal camera. A feature function is defined that maps the features of a thermal camera onto an integer (a sketch of this function is given after the list below). The function increments the integer if:

• The camera has MSX support
• The camera has a standard data format (not just an analog or digital signal)
• The camera offers radiometric information
• The image resolution is 640 x 512 pixels or larger, 640 x 512 being the highest resolution found for these products
• The sensitivity is smaller than 100 mK
• The camera offers emissivity correction
• The camera offers a USB interface
• The camera offers a MAVLink interface
• The camera offers an HDMI interface
• The camera offers a Bluetooth connection
• The camera offers a Wi-Fi connection
• The camera offers GPS tagging
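As an illustration, the feature function can be sketched in Python as a sum of boolean checks over a camera's specification record; the field names used here are hypothetical, the actual data is listed in Appendix B.

def feature_points(camera):
    # Each check mirrors one bullet above; field names are illustrative
    interfaces = camera.get('interfaces', ())
    checks = [
        camera.get('msx', False),
        camera.get('standard_data_format', False),
        camera.get('radiometric', False),
        camera.get('resolution_px', 0) >= 640 * 512,
        camera.get('sensitivity_mK', float('inf')) < 100,
        camera.get('emissivity_correction', False),
        'USB' in interfaces,
        'MAVLink' in interfaces,
        'HDMI' in interfaces,
        'Bluetooth' in interfaces,
        'Wi-Fi' in interfaces,
        camera.get('gps_tagging', False),
    ]
    return sum(bool(c) for c in checks)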

Figure 3.2b plots these feature points versus the retail price. This shows a more log-like relationship: the features of a camera determine the price much more than just the image quality. For a price of less than 5000 euro, thermal cameras are found that implement most basic features. The price then increases rather quickly for fewer added features. These are features like radiometry that require additional hardware, which greatly increases the price of the camera.

3.2 Microservices frameworks

The architecture presented in Section 2.3 relies heavily on the microservices pattern. Therefore this section presents several microservices frameworks that could support this architecture. Figure 3.3 depicts the results of the Rethink IT survey, querying the most used frameworks for microservices by developers [54]. The most popular frameworks, Java EE and Spring Boot, are written in Java. The Java EE framework is more of a one-stop-shop framework, offering much more functionality than just a backbone microservices framework, and is therefore not considered. Spring Boot is clearly a very popular and mature framework, more streamlined for microservices. Vert.x is a more recent framework renowned for its performance, making it worthwhile to explore. Python is an upcoming language for web development, and because it is excellent for prototyping, several frameworks for this language are explored as well. The frameworks presented here are Vert.x version 3.5.1, Spring Boot version 2.0, Flask version 0.12, Falcon version 1.4.1 and Nameko version 2.9.0.

3.2.1 Flask

Flask is a micro web development framework for Python. The term "micro" means that Flask aims to keep its core simple but extensible. Flask is an unopinionated framework, as it only provides a glue layer to build a REST API around the application.

Figure 3.2: (a) Camera resolution compared to retail price; (b) Camera feature points compared to price


Figure 3.3: Rethink IT 'Most used tools and frameworks for microservices' results [54]

However, it provides a large list of extensions if extra functionality is required [55]. Starting a microservice is very simple, as illustrated in Listing 1. Flask uses the concept of Python decorators [56] to bind Python functions to a REST API. In Listing 1, for example, the function service_status() is linked to the '/' resource. Because route() is a decorator for the service_status() function, service_status() is wrapped and passed to route(), so that when a user issues an HTTP GET request on this resource, the service_status() function that was passed gets called. This allows for an easy construction of the REST API: just decorate all the functions of the microservice with the correct Flask decorator.

from flask import Flask

app = Flask(__name__)

@app.route('/')
def service_status():
    return 'service status'

if __name__ == '__main__':
    app.run()

Listing 1: Minimal Flask application

Because Flask is a microframework, its memory footprint is small, with the binary file being only 535 KB. It is in use by several large companies such as Netflix and Reddit [57]. In a production environment the default Flask web server is not sufficient, as it only serves one user at a time; for prototyping, however, it is an excellent framework [55].


3.2.2 Falcon

Falcon is a bare-metal Python web framework that differentiates itself through performance when compared to other frameworks. It targets microservices, being even more lightweight and faster than frameworks like Flask; in a benchmark test it achieves 27 times the speed of Flask [58]. The framework seems less mature and has not been adopted by many companies [59]. It is not considered for the prototype of the system, but could be used in production as it achieves better performance.

3.2.3 Nameko

Nameko is a framework specifically built for building microservices in Python. Next to offering a REST API, it also offers asynchronous events over the Advanced Message Queuing Protocol (AMQP). It is only meant to be used for microservices, not for web applications that serve content. It is a relatively young framework and is not backed by any major companies as of yet. It is, however, endorsed by the developer of the Flask framework [60].

3.2.4 Vert.x

Vert.x is a toolkit for building reactive applications on the Java Virtual Machine (JVM). This framework follows the reactive systems principles. These principles are used to achieve responsiveness and build systems that respond to requests in a timely fashion, even with failures or under load. To build such a system, reactive systems embrace a message-driven approach: all components interact using messages sent and received asynchronously. Reactive microservices built with Vert.x have the following characteristics: autonomy, asynchrony, resilience and elasticity. Vert.x is a toolkit and can be used as any other library, which makes it very flexible. It provides a large set of features: metrics, different programming languages, different protocols, templating, data access, cluster management, etc.

Vert.x embraces the asynchronous development model, as can be seen in Listing 2.

import io.vertx.core.AbstractVerticle;

public class Server extends AbstractVerticle {
    public void start() {
        vertx.createHttpServer().requestHandler(req -> {
            req.response()
                .putHeader("content-type", "text/plain")
                .end("Hello from Vert.x");
        }).listen(8080);
    }
}

Listing 2: Vert.x example


The event which occurs is the HTTP request; on arrival of the event, the Handler is called and executed. The Handler is chained to a listen request and does not block the calling thread. The Handler is only notified when an event is ready to be processed or when the result of an asynchronous operation has been computed [61].

3.2.5 Spring Boot

Spring Boot is an opinionated Java framework for building microservices, based on the Spring dependency injection framework. It allows developers to create microservices with reduced boilerplate and configuration. For simple applications it provides a syntax similar to Flask in Python, using annotations for routing. An example is given in Listing 3. The framework handles most of the routing and request handling, but restricts the developer in application structure. The framework is not lightweight and performs less well than Vert.x [62].

@RestController
@RequestMapping("/api")
public class HelloRestController {

    @RequestMapping(method = RequestMethod.GET, value = "/hola",
            produces = "text/plain")
    public String hello() {
        return "Hello Spring Boot";
    }
}

Listing 3: Spring Boot example

3.3 Deployment framework

To meet the modifiability and interoperability requirements discussed in Section 2.1.2 and support the different deployment configurations in Section 2.3.3, Linux containers (LXC) are used. A container is a lightweight operating system environment running inside the host system, executing instructions native to the core CPU and eliminating the need for the instruction-level emulation that virtual machines use. Containers provide an identical, isolated runtime environment for processes without the overhead of virtualization. This makes them perfect for highly portable software, as only the container needs to be moved and it can directly be executed on any system supporting containers [63]. First, the concept of containers is introduced in Section 3.3.1. Second, several container frameworks are presented in Sections 3.3.2, 3.3.3 and 3.3.4.

3.3.1 Containers

Containers sandbox processes from each other and are often described as the lightweight equivalent of virtual machines. The difference between a virtual machine and a container is the level of virtualization. Virtual machines virtualize at the hardware level, whereas containers do this at the operating system (OS) level. The achieved effect is similar, but there are significant differences. Containers make available protected portions of the OS and share its resources. Two containers running on one OS have their own OS abstraction layer and don't know they are running on the same host. This provides a significant difference in resource utilization. Virtual machines provide access to hardware only, so it is necessary to install an OS; as a result, there are multiple OSs running, which gobble up resources. Containers piggyback on the running OS of the host environment: they merely execute in spaces that are isolated from each other and from certain parts of the OS. This allows for efficient resource utilization and for cheap creation and destruction of containers. Consequently, starting and stopping a container is equivalent to starting and stopping an application [64, 65]. This comparison is illustrated in Figure 3.4.

Containers offer several advantages over running a process directly on the system. Due to the OS virtualization of the containers, software is always deployed on the same operating system, defined in the container. This allows for a 'write once, run everywhere' scenario, which enables portability of the system to a range of devices. Containers communicate with each other using protocols such as HTTP. This allows the processes in containers to be written in any programming language, using any external library that is needed. For the system this means that if the Producer and Consumer Plugins are packaged as containers, they can effectively be built with any available technology, greatly enhancing the extensibility of the system.

3.3.2 LXC

Linux containers are the basis on top of which other container frameworks are built. LXC provides a normal OS environment similar to a VM; containers in this framework behave almost identically to a VM and can run multiple processes. LXC can be used directly, but offers only low-level functionality and can be difficult to set up [67].

3.3.3 Docker

Docker started as an open-source project at dotCloud in early 2013. It was an extension of the technology the company had developed to run its cloud applications on thousands of servers [64]. Now Docker is a standalone, mature company providing a software container platform for the deployment of applications [66]. Docker provides two main services: a simple toolset and API for managing Linux containers, and a cloud platform which provides easy access to recipes for software containers created by other developers [68]. Docker is the container technology with the most public traction and is becoming the container standard at the time of writing, due to its functionality and very responsive community. It offers functionality to easily build and run containers, but also to manage them in large clusters. Two design decisions limit Docker: each container can only run one process at a time, and Docker consists of a daemon that manages the containers and the API Engine, a REST client. Should this client fail, dangling containers can arise [69].

3.3.4 rkt

CoreOS' rkt is an emerging container technology providing an API engine similar to the Docker API Engine, which can run LXC containers as well as Docker containers. rkt focusses on security and standardization, and is specifically designed to run in cloud environments. Unlike Docker, rkt does not use a daemon process with a REST client; the command line tool executes all the operations, which makes the framework more reliable. rkt is not as mature as Docker yet. It is portable to multiple Linux environments, but is not yet portable to macOS and Windows [70].

(a) Container stack (b) Virtual machine stack

Figure 3.4: Containers compared to virtual machines [66]

3.4 Object detection algorithms and frameworks

As stated in Section 1.3.2, object detection is the computer vision task of detecting which objects are present in an image and where they are located. Several approaches to this problem have been proposed, some of which focus on thermal images. This section aims to give a small overview of different existing techniques. For the technical details of the algorithms, the reader is referred to the respective articles.

3.4.1 Traditional approaches

Traditional approaches include hot-spot detection techniques and Adaptive Boosting (AdaBoost) with various feature extraction techniques, such as Aggregated Channel Features (ACF) and Integral Channel Features (ICF). These methods rely on clever feature engineering solutions that use domain knowledge or statistical insights to transform the raw dataset into a specific set of features in order to find patterns [32].

Hot-spot detection

Hot-spot techniques work on the assumption that people have an overall higher body temperature than most of the background in the thermal image. These techniques first select candidate objects: the hot-spots in the image. The hot-spots define the region on which a classifier is run and are thus the localization step in the object detection problem. Afterwards, a classifier is trained on these candidates. Xu et al. used a Support Vector Machine (SVM) classifier to classify whether a hot-spot represented a pedestrian [71]. Nanda et al. used a Bayes classifier to classify the hot-spots [72]. These methods are generally not applicable, because people often are not the only hot-spots in thermal images.

AdaBoost

AdaBoost is a machine learning algorithm that utilizes the output of so-called weak learning algorithms (weak learners) and combines their outputs into a weighted sum that forms the output of the boosted classifier. AdaBoost modifies the weak learners in favor of data points misclassified by previous classifiers [73]. Viola and Jones built a detection algorithm that uses two consecutive frames of a video sequence and trains the AdaBoost classifier on both motion and appearance information [74]. Davis et al. use a two-stage template approach that initially performs a fast screening procedure with a generalized template based on a contour saliency map to locate potential person locations; any window located in the first phase is then forwarded to the AdaBoost algorithm to validate the presence of the person. Dollár et al. extracted features using different ICF and ACF [35]. ICF and ACF compute features by calculating several aggregations over the different channels of an image, such as gradients, color histograms and colors. Goedemé et al. expanded these detectors with extra thermal channels to achieve results comparable to those of Dollár et al., but for thermal images [36].
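For reference, the boosted classifier produced by AdaBoost takes the standard form of a weighted vote over the $T$ weak learners $h_t$, with weights $\alpha_t$ learned during training:

$$H(x) = \operatorname{sign}\Big(\sum_{t=1}^{T} \alpha_t h_t(x)\Big)$$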

3.4.2 Deep learning

Over the past few decades there has been a shift in proposed solution methods towards deep learning. Deep learning for object detection uses Convolutional Neural Networks (CNNs). CNNs are a specialized kind of neural network for processing data that has a known grid-like topology, such as images. CNNs generally consist of three steps: a convolution step that creates a feature map of a region of an image, a pooling step that summarizes the output of the convolution step, and finally a fully-connected network that learns from the features extracted in the previous steps [75]. The key difference is that these algorithms do the feature extraction in the convolutional layers and do not need feature engineering like the algorithms presented in Section 3.4.1. This requires quite a bit of computing power when compared to the traditional methods. Since deep learning made the shift to computing on Graphical Processing Units (GPUs), computations became feasible and these models proved to achieve very good performance on various machine learning problems. Two model types are described: two-stage networks (R-CNN, R-FCN), which extract image regions first and make separate predictions on each region, and dense networks (YOLO, SSD, NASNet, RetinaNet), which operate on the image as a whole.

Region-based Convolutional Network (R-CNN)

R-CNN uses a selective search method to find objects, an alternative to exhaustive search in an image. It initializes small regions in an image and merges them hierarchically; the detected regions are merged according to color spaces and other similarity metrics [76]. R-CNN combines this selective search with a CNN per region to find out which objects are in these regions [77].


Fast(er) Region-based Convolutional Network (Fast(er) R-CNN)

Fast R-CNN was developed to reduce the time consumption related to the high number of models necessary to analyze the region proposals from the selective search method in R-CNN. Instead of using a CNN for each region, a single CNN with multiple convolutional layers is used [78]. Faster R-CNN drops the region proposals detected with the selective search method (which is computationally expensive) and introduces the Region Proposal Network (RPN) to directly generate region proposals. This accelerates training and testing and improves performance [79]. Mask R-CNN is an extension of the Faster R-CNN model that adds a parallel branch to the bounding box detection to predict object masks, that is, the per-pixel segmentation of an object in the image [80].

Region-based Fully Convolutional Network (R-FCN)

R-FCN tries a more efficient approach to region detection. Instead of applying a per-region subnetwork multiple times, R-FCN uses a fully convolutional network with computations shared across the entire image. This allows it to be compatible with multiple backbone networks, such as Residual Networks [81].

You Only Look Once (YOLO)

The previously discussed methods need to run the same computations on different parts of an image multiple times before generating a prediction, which makes them relatively slow. The YOLO model [82] was developed with the requirement to make predictions as fast as possible, trading off accuracy for speed to move towards real-time object detection. YOLO directly predicts bounding boxes and class probabilities with a single CNN in a single evaluation, instead of first detecting object regions and predicting classes afterwards. This has some benefits over the other methods. YOLO is very fast compared to other methods, capable of processing images in real time, up to 155 frames per second for some variants. It also learns contextual information, because it trains on entire images instead of regions, and it generalizes better to other image types. All these benefits come at the cost of accuracy: YOLO struggles to precisely localize some objects, especially small ones. The subsequent versions of YOLO focus on delivering more accuracy. The algorithm is currently in its third version [83].

Single-Shot Detector (SSD)

SSD [84] is similar to YOLO and predicts all the bounding boxes and class probabilities in one single evaluation (single shot) using one CNN. The model takes an image as input, which passes through multiple convolutional layers. Compared to YOLO, SSD achieves higher accuracy by adding convolutional layers and including separate filters for detections with different aspect ratios.

Neural Architecture Search Net (NASNet)

NASNet takes a different approach and does not design the network architecture to perform the object detection beforehand, but instead trains a Recurrent Neural Network (RNN) to generate the model description of the CNN that performs the object detection. The RNN is trained using reinforcement learning. The NASNets built for object detection perform as well as most networks, but are slower to train [85].

RetinaNet

RetinaNet is the latest state-of-the-art object detector. It is a simple dense detector similar to YOLO and SSD, but matches the accuracy of two-stage detectors like the R-CNN variants. RetinaNet proposes that the foreground-background class imbalance encountered when training dense detectors leads to lower accuracy compared to two-stage detectors. RetinaNet uses a new method called Focal Loss, which focuses training on a sparse set of hard examples to counter this class imbalance, resulting in very good performance and very fast detection [86].
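For reference, the focal loss of [86] adds a modulating factor to the standard cross-entropy loss, down-weighting the contribution of well-classified examples; with $p_t$ the predicted probability of the ground-truth class and $\gamma \geq 0$ a tunable focusing parameter (the paper additionally adds a balancing weight $\alpha_t$):

$$FL(p_t) = -(1 - p_t)^{\gamma} \log(p_t)$$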

3.4.3 Frameworks

While the previous sections focused on different algorithms, actually implementing these algorithms is not straightforward. That's why over the past years several deep learning frameworks have emerged that try to provide easier access to this technology. Some frameworks provide APIs for some of the object detection algorithms presented above. This section gives a small overview of some frameworks. Most frameworks differ quite a bit from each other, which makes porting a model from one framework to another rather difficult. The Open Neural Network Exchange Format (ONNX) initiative hopes to propose a standard for interchangeable models, which should make switching among frameworks easier in the future [87]. Note that there are other frameworks available, but those do not yet support object detection functions out of the box.

TensorFlow

Perhaps the most well-known framework, TensorFlow is an open source machine learning library for neural networks with a Python interface. It was developed by Google for internal use and released to the public in 2015 [88]. Recently an Object Detection API has been built for TensorFlow, which implements models pre-trained on benchmark datasets, such as SSD, Faster R-CNN, R-FCN and Mask R-CNN [89]. TensorFlow offers a lot of flexibility in its use and can be applied to many machine learning problems.

Darknet

Darknet is an open source neural network framework written in C and CUDA. It is maintained by Joseph Redmon, the person behind the YOLO algorithm [90]. Darknet does not offer the flexibility that other frameworks offer, but is easy to install and use compared to others. Out of the box, Darknet offers an interface for YOLO. The open source community offers some ports of this framework to other popular frameworks, such as TensorFlow.


CNTK

The Microsoft Cognitive Toolkit (CNTK) is an open source toolkit for distributed deep learning. It offers a Python, C# or C++ interface. It's one of the first frameworks to support ONNX. CNTK offers an API for Fast R-CNN and Faster R-CNN [91].


3.5 Technology choice

This section presents the choices made for each technology category described in the previous sections.

3.5.1 Thermal camera

The FLIR One Pro and Therm-App were selected as thermal cameras for the proof of concept. Both offer relatively high quality images, 160 x 120 pixels and 320 x 240 pixels respectively. This is of course relative to their price, 469 and 937.31 euro respectively; these prices are at the low end of the product ranges offered. Both cameras are designed for use on a smartphone, which makes them ideal for prototyping, since these devices are widely available and setting up the camera via the apps from the respective companies is easy. Both cameras provide MPEG-4/h264 encoded video output, easily understood by most playback software. Both cameras can be found in the lower left of Figure 3.2b.

For deployment in production-ready applications with drones, these cameras are not the best choice. They aren't specifically designed to be used on a drone and don't offer the best image quality possible. In those applications, platforms like the FLIR Vue, Duo, Zenmuse or Workswell Wiris are better candidates due to their superior image quality, MAVLink interfaces, compatibility with commercially available gimbals to mount them on drones, and other features.

3.5.2 Microservices framework

Flask is selected as the microservices framework. The arguments for Flask are as follows. Flask is a mature web framework with major companies backing it; this means the APIs stay consistent and the framework is stable in use. Compared to some other frameworks like Spring Boot, Flask is unopinionated, which allows for maximum flexibility during development. Flask also has a very small memory footprint, which makes it easier to deploy on less powerful on-board devices like drones. Flask is also easy to use and quick to set up, ideal for developing a proof of concept. A final argument is the familiarity of the author with Flask.

3.5.3 Deployment framework

Docker is selected as the deployment framework. Docker is the most mature and well supported container framework at the time of writing, and will likely remain important in the future. It offers the most features and is specifically designed for the microservices pattern [68].

3.5.4 Object detection

One of the requirements specified in Section 2.1 is real-time streaming. Real-time object detection is only achieved by a few of the models presented in Section 3.4; candidates are YOLO, SSD and RetinaNet. As there is no framework that provides an out-of-the-box implementation of the RetinaNet algorithm at the time of writing, this algorithm is not selected. SSD is implemented in the TensorFlow Object Detection API. However, at the time of writing this API has not been found stable: when trying out the API, fallbacks to older versions of the software were needed to be able to test the models, because the Object Detection API uses older versions of the TensorFlow framework. Therefore YOLO, implemented in the Darknet framework, is selected. Darknet offers a stable distribution, and YOLO achieves good results and has proven to be a very fast detector, capable of real-time object detection.


Chapter 4

Proof of Concept implementation

To prove the concept of the architecture discussed in the previous chapters, a prototype is implemented. First, the goals and the scope of the prototype are presented in Section 4.1. Next, the components of the prototype are presented in Section 4.2. Finally, the known limitations and issues of the prototype are presented in Section 4.3.

4.1 Goals and scope of prototype

The goals of the prototype are to prove the QARs defined in Section 2.1. The prototype focusses on the ASRs: performance, interoperability and modifiability. The usability, security and availability requirements are left out of the scope of the prototype, because they are not ASRs and require significant resources (focus groups, long-term deployment, etc.) to test.

The components that are implemented in the prototype are the Client Interface, Stream, Consumer and Producer, because they represent the core functionality of the framework: building image processing application streams using plugins. The Producer and Consumer Distribution components enable third party plugin developers to add their functionality to the framework. These are distribution functionalities, which are out of scope for the prototype. The prototype only supports one video stream. All functions presented in Figure 2.1 are implemented, with the exception of 'Install plugin', 'Uninstall plugin', 'Add plugin', 'View plugin', 'Remove plugin' and 'Update plugin', as they are only supported via the Producer and Consumer Distribution components. The prototype is deployed on a local device. Distributed deployment configurations require small changes in the implementation (see Section 4.3).

4.2 Overview of prototype

4.2.1 General overview

The prototype consists of four main components: a cli, streamer, producer and consumer. The cli process is the Client Interface, implemented as a textual Command Line user Interface (CLI), which allows a user to interact with the prototype through textual commands in a shell. This process is deployed on the local machine. The streamer, producer and consumer processes are deployed as microservices in their own Docker containers. The prototype is initialized through the cli, which spins up the Docker containers of the other processes. This is achieved with the tool docker-compose. Compose is a tool for defining and running multi-container Docker applications. The compose YAML file defines the configurations for the microservices; Compose uses these configurations to start and stop the application with a single command [92]. A snippet of the compose file for the application is given in Listing 4. Containers are specified as services. The example service configuration given is that of the producer. First, the name of the container is specified, which overwrites the default name; the container name is used as hostname for the container in Docker [93]. The build configuration specifies where the container build recipe is situated. The port mapping allows processes from the localhost to access processes in the container; for the producer service this is only used for debugging. The volumes configuration specifies folders from the host to be mounted into the container. This configuration mounts in the source code and resources. It also provides access to the Docker socket to allow interaction with the Docker host (see Section 4.2.4).

services:
  producer:
    container_name: producer
    build:
      context: ./producer
      dockerfile: Dockerfile
    ports:
      - "8080:80"
    volumes:
      - ./producer:/usr/producer
      - /var/run/docker.sock:/var/run/docker.sock

Listing 4: docker-compose.yml snippet of the prototype

All containers are connected to a Docker bridge network [93] for communication. A bridge network uses a software bridge to allow connected containers to communicate, while providing isolation from containers which are not connected to that bridge network. The bridge network applies to containers running on the same Docker host; the network is thus confined to the local Docker host and is not distributed over different devices. The bridge network has some advantages (an example of name-based addressing follows the list):

• The bridge provides better isolation and interoperability between containers. Containers automatically expose all ports to each other, and none to the outside world.
• The bridge provides automatic Domain Name System (DNS) resolution between containers. This means that containers resolve the IP address of each other by container name or alias.
• Containers can be attached to and detached from the networks on the fly.
• Environment variables are shared, which can be used to provide equal environment configurations for every container on the bridge.
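As an illustration of this name-based addressing, any component on the bridge network can reach the producer container by its container name; the path below is illustrative, not the prototype's actual API.

import requests

# 'producer' resolves to the producer container's IP on the bridge network
response = requests.get('http://producer/state')
print(response.json())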


4.2.2 Client interface

The Client Interface is implemented by the cli component. The cli is built in Python with the Click package by Armin Ronacher [94]. Click is a CLI creation kit which aims to make the implementation of CLIs easier. It resembles the Flask framework, as it also leverages Python decorators [56] for most of its functionality (a sketch of such a command definition is given after the list below). The source code of the cli is located in the mosquito.py file. Commands can be executed by calling python mosquito.py, or by calling mosquito if the source code is installed into the Python environment. The following commands are implemented:

• mosquito: Displays a help page listing command groups.
• mosquito on: Starts the application.
• mosquito off: Shuts down the application.
• mosquito plugins: Groups all commands to manage plugins. Plugins can only be listed, not installed or uninstalled, as the Remote Producer and Remote Consumer are not implemented.
• mosquito plugins ls: Lists all locally installed plugins.
• mosquito stream: Groups all commands to manipulate the current stream.
• mosquito stream add: Adds a producer or consumer to the stream.
• mosquito stream delete: Deletes a producer or consumer from the stream.
• mosquito stream elements: Lists all producers and consumers that were added to the stream.
• mosquito stream link: Links two stream plugins.
• mosquito stream pause: Pauses the stream.
• mosquito stream play: Plays the stream. This means the stream is processing media.
• mosquito stream print: Prints the stream layout (which plugins are linked).
• mosquito stream stop: Stops the stream.
• mosquito stream view: View the stream on the local device.

A typical use of the application would be the following. First, the application is started using mosquito on. Then plugins are added to the stream using mosquito stream add [ELEMENT_TYPE] [ELEMENT], which instantiates the corresponding plugins in the Producer and Consumer components. The plugins are linked in order using mosquito stream link [ELEMENT_1] [ELEMENT_2]. The stream is then set to play using mosquito stream play. When the last plugin is linked to the special local plugin, the user can view the output from that plugin using mosquito stream view, which opens up a window in which the stream is displayed.
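To illustrate how Click structures such commands with decorators, the following sketch shows how the stream group and its add command could be defined; the function bodies are hypothetical simplifications, not the prototype's actual source, and the top-level mosquito group is omitted.

import click

@click.group()
def stream():
    """Groups all commands to manipulate the current stream."""

@stream.command()
@click.argument('element_type')
@click.argument('element')
def add(element_type, element):
    """Adds a producer or consumer to the stream."""
    # A real implementation would forward this command to the Stream REST API
    click.echo('Adding %s %s to the stream' % (element_type, element))

if __name__ == '__main__':
    stream()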


As specified in the software architecture (see Section 2.3), the Client Interface uses the Stream Commands interface of the Stream component. As specified in Section 2.3.1, this interface is a REST API, so the client can use it through the HTTP protocol. This is done with the Python Requests library [95].

4.2.3 Stream

The Stream component is responsible for the logical representation of the stream (see Section 2.3.1), implemented as the streamer component. The component consists of three objects: api, which contains the REST API; StreamManager; and the Stream object, representing the Stream Model in the framework. Requests to the other microservices are sent using the Python Requests library. The prototype implementation only supports one stream, with a chain-like model. This means that, unlike the stream depicted in Figure 2.6, a plugin can't have multiple sources or multiple listeners. The Stream object manages the logical representation of the stream and manipulates the references to the plugins by forwarding commands to the producer and consumer components respectively. It contains two data structures: outline, which is the logical structure of the stream, and elements, which contains all the plugins present in the stream. In the prototype the Stream component provides the following functionalities on its API endpoints:

• /plugins: GET. Fetches all the plugins from the producer and consumer components and returns their information.
• /elements: GET, POST, DELETE. Resource to add and delete plugins from the elements bin.
• /stream/links: POST. Resource to create links between elements.
• /stream/state: GET, PUT. Resource to update the state.
• /shutdown: POST. Shuts down the framework.
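As a rough sketch of how such an endpoint could look in the streamer's Flask api object (the handler body is illustrative; the prototype's actual state handling and validation may differ):

from flask import Flask, jsonify, request

app = Flask(__name__)
stream_state = {'state': 'STOP'}  # illustrative in-memory state

@app.route('/stream/state', methods=['GET', 'PUT'])
def state():
    if request.method == 'PUT':
        # A full implementation validates the transition (e.g. linking is
        # only allowed in STOP) and forwards the command to the plugins
        stream_state['state'] = request.get_json()['state']
    return jsonify(stream_state)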

Since the streamer component is the only component of the framework that interacts with outside users, it has the responsibility to gracefully shut down the framework. This is needed to solve the problem of dangling plugin containers: containers that run plugins that have not been stopped and removed after closing the application. Since only plugins that are contained in a stream have a running container associated with them, the stream can notify the Producer and Consumer components to stop and remove those containers.

4.2.4 Producer and Consumer

The Producer and Consumer components cover similar responsibilities in managing installed plugins. They are implemented in the producer and consumer components. Both components consist of the following objects: api, which contains the REST API; the Kernel, which implements the core functionalities; the PluginManager, which finds plugins installed on the device and checks if their installation is valid; and the Plugin, which is the logical representation of a plugin as described in Section 2.3.1. Commands to control the plugins are made using the Python Requests library.


For the component to be able to start, stop and interact with the plugin containers, the component needs access to the Docker host and the Docker client running on that host. But because the component is running in its own container, it is isolated from the Docker host and can't interact with the Docker client by default. The workaround for this problem is to expose the socket on which the Docker client is running on the Docker host to the container. This is done by mounting the Docker socket of the host on the Docker socket in the container. In Docker Compose, the mounting is achieved as shown in Listing 5.

volumes:
  - /var/run/docker.sock:/var/run/docker.sock

Listing 5: Mounting the Docker socket on the container

This has some implications on security (see Section 4.3). To interact with the now exposed Docker client, the component uses the docker-py library [96]. Listing 6 shows how a connection is made to the Docker client and a plugin container is started. The container is started from the plugin image, on the network of the framework, and is given the plugin name as the container name. Docker thus creates a DNS entry with the plugin name, which makes the container addressable by its name. Due to this implementation, there can only be one container of a plugin running at all times in the current implementation.

import docker

# Connect to the Docker client exposed through the mounted socket
client = docker.from_env()

container = client.containers.run(
    image=plugin_name,
    detach=True,
    name=plugin_name,
    network='mosquito_default'
)

Listing 6: Starting a plugin container
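The graceful shutdown described in Section 4.2.3 can use the same library to stop and remove a plugin container by name, avoiding dangling containers; a minimal sketch:

import docker

client = docker.from_env()

# Look up the plugin container by its (unique) name, then stop and remove it
container = client.containers.get(plugin_name)
container.stop()
container.remove()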

When both components are initialized, the Kernel and PluginManager are created. The PluginManager searches for a plugin_directory, which contains information on which plugins are installed on the device. Each installed plugin should have a valid image on the device; images are contained in the images directory of the Docker daemon. If the image or information file cannot be found on the device, the plugin is marked as broken and can't be used by the framework. To describe the API, the consumer API is used; the producer API is analogous, but replaces consumers with producers and doesn't have the sources endpoints. The Producer and Consumer components provide the following functionalities on the API endpoints:

• /consumers: GET. Retrieves a list of the installed consumers on the device on which the component is running.
• /consumers/<hostname>: GET, DELETE. Retrieves the information of, or respectively deletes, the consumer specified by the hostname value, which is the name of the consumer.
• /consumers/<hostname>/state: GET, PUT. Retrieves or respectively updates the state of the consumer specified by the hostname value.
• /consumers/<hostname>/sources: GET, POST. Retrieves the sources of, or respectively adds a new source to, the consumer specified by the hostname value.
• /consumers/<hostname>/sources/<source_hostname>: GET, PUT, DELETE. Retrieves, updates or removes the source specified by source_hostname of the consumer specified by hostname, respectively.
• /consumers/<hostname>/listeners: All listeners resources are analogous to the sources resources.
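For example, the Stream component could add plugin A as a source of consumer plugin B by POSTing to these endpoints with the Requests library; the payload fields below are hypothetical:

import requests

# Ask the Consumer component to add plugin A as a source of plugin B
response = requests.post(
    'http://consumer/consumers/B/sources',
    json={'hostname': 'A', 'port': 5000},
)
response.raise_for_status()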

4.2.5 Implemented plugins

Three plugins are implemented and tested: filecam (called 'Mycam' in the code), a producer that reads in a video file and transmits it in MJPEG encoding using the RTP protocol; testsrc, a producer which generates test video and transmits it in MJPEG encoding using the RTP protocol; and local, a consumer which captures incoming RTP MJPEG video frames and displays them on the local display. The filecam and local plugins are discussed, since the testsrc is similar to the filecam.

The plugins are implemented in Python and use the GStreamer library with the Python bindings [97] for media streaming and the Flask framework to implement the API. These libraries don't have to be used by future plugins, which can just implement a REST API and provide a media stream as specified in their descriptions.

Filecam plugin

The filecam image is based on the Ubuntu 17.10 image. It is chosen over lighter Linux distributions because it offers more functionality out of the box for prototyping. Other dependencies are Python 3.6, GStreamer 1.12, the GStreamer plugins-base, plugins-good, plugins-bad, plugins-ugly, libav, doc and tools packages, and python-gst.

The API of the plugin offers the following functionalities:

• /state: GET, PUT. Retrieve and respectively update the state of the plugin.
• /listeners: GET, POST. Retrieve and respectively add a listener on the plugin.
• /listeners/<hostname>: GET, PUT, DELETE. Retrieve, update and respectively delete a listener on the plugin.

The implemented GStreamer pipeline is depicted in Figure 4.1. The pipeline consists of the following GStreamer elements:

Figure 4.1: filecam GStreamer pipeline

1. filesrc: GStreamer element that reads data from a file in the local file system. This file can have any extension and is not limited to video or audio files [98]. The location property is set to the location of the file in the plugin container.
2. decodebin: GStreamer bin that automatically constructs a decoding pipeline using available decoders and demuxers via auto-plugging [99]. Note that for some media containers and codecs the appropriate decoders must be installed. For example, to decode the MPEG streams contained in MP4 files, an h264 decoder is needed, which can be found in the 'libav' GStreamer plugins library.
3. jpegenc: GStreamer element that encodes raw video into JPEG images [100]. This implements the MJPEG video stream, as all video frames are encoded as JPEG images.
4. rtpjpegpay: GStreamer element that payload-encodes JPEG images into RTP packets according to RFC 2435 [101].
5. udpsink: GStreamer element that sends UDP packets to the network. When combined with an RTP payload plugin, it implements RTP streaming [102]. The host and port properties are set to the hostname and port of the listener of the plugin.

This pipeline is implemented using the Python GStreamer bindings. The process consists of creating each GStreamer element, adding them to the GStreamer pipeline and linking the elements in order of appearance in the pipeline. The decodebin and jpegenc elements can't be linked when created, because there is no default sink pad available on the decodebin. Because the decodebin needs to decide how to decode media, it needs the pipeline to be processing media: if no media is flowing, the decodebin can't know what decoder it needs to decode the media and what media it can offer to the sink element. Therefore the process of dynamic linking is used [103]. All elements which can be linked when the pipeline is not in the PLAYING state are linked. A handler is registered on the 'pad-added' signal, emitted when a new pad is added on the decodebin, indicating that it can forward media downstream. When media is flowing through the pipeline, the decodebin creates new pads when it can generate output data and emits the 'pad-added' signal. A callback is performed on the handler, which links the decodebin with the jpegenc. Listing 7 illustrates this concept.

# callback handler
def on_pad(source, pad, sink):
    # get the sink pad from the sink element
    sink_pad = sink.get_static_pad('sink')
    # get the pad type
    pad_caps = pad.get_current_caps()
    pad_type = pad_caps.get_structure(0).get_name()
    # only if the pad offers raw video is the link made
    if pad_type == 'video/x-raw':
        # perform the dynamic link
        pad.link(sink_pad)
    # other pad types are ignored

filesrc = Gst.ElementFactory.make('filesrc')
decodebin = Gst.ElementFactory.make('decodebin')
jpegenc = Gst.ElementFactory.make('jpegenc')
# ... (create other elements and add elements to pipeline)

# only filesrc and decodebin can be linked statically
filesrc.link(decodebin)

# register the on_pad handler on the 'pad-added' signal
handler_id = decodebin.connect('pad-added', on_pad, jpegenc)

# set pipeline to PLAYING; the callback will be called to perform the dynamic link
pipeline.set_state(Gst.State.PLAYING)

Listing 7: Dynamic linking of the decodebin and jpegenc

Local plugin

The local plugin captures an incoming media stream and displays it on the local display. This plugin is special with respect to other plugins in that it is not deployed in a Docker container: it runs natively via the cli on the host to allow access to the local display. This version is built for macOS High Sierra (version 10.13.4) and uses GStreamer 1.12 with the GStreamer plugins-base, plugins-good, plugins-bad, plugins-ugly, libav, doc and tools packages to receive an incoming stream. When a plugin links to the local plugin, the Stream component does not instruct the Consumer component to start the plugin, but instead links the plugin to the local host. For macOS the address of the host is host.docker.internal. The GStreamer pipeline used by the plugin is depicted in Figure 4.2.

Figure 4.2: local plugin GStreamer pipeline

The pipeline consists of the following elements

1 updsrc GStreamer element that reads UDP packets from the network [104] The port property is set to the port to

which the source is transmitting media

2 rtpjpegdepay GStreamer element that retrieves JPEG images from the received RTP packets [105] This element

canrsquot process the media received from the udpsrc directly because it canrsquot know what type of data it will be receiv-

43 Limitations and issues 51

ing Between the pads a rsquocapabilities filterrsquo is placed which informs the elements on the type of data that will be

flowing through In this case the capabilities are applicationx-rtp which tells that there will be rtp pack-

ets coming through encoding-name=JPEG which tells that the payload of the RTP packets are JPEG images and

payload=26 which also tells that the encoding is JPEG according to RFC3551 [50 106]

3. jpegdec: GStreamer element that decodes JPEG images [107].

4. autovideosink: GStreamer element that automatically detects an appropriate video sink and forwards the video to it [108]; a code sketch of this pipeline follows below.
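As an illustration, this pipeline can be reproduced in a few lines of PyGObject code. The sketch below uses Gst.parse_launch and an assumed port number of 5000; it is not necessarily the plugin's actual implementation.

import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst, GLib

Gst.init(None)
# Build the local plugin pipeline from its textual description;
# port 5000 is an assumption for this sketch
pipeline = Gst.parse_launch(
    'udpsrc port=5000 caps="application/x-rtp,encoding-name=JPEG,payload=26" '
    '! rtpjpegdepay ! jpegdec ! autovideosink')
pipeline.set_state(Gst.State.PLAYING)
# Keep the pipeline running until the process is interrupted
GLib.MainLoop().run()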

4.3 Limitations and issues

The implementation presented is a prototype and slimmed-down version of the architecture presented in Section 2.3. The following limitations and issues remain.

4.3.1 Single client

The current implementation deploys the Flask framework (on which each microservice relies) on the built-in Flask web server (Werkzeug), which is provided for development convenience. It is only built for use by a single user and by default can only handle one request at any given moment, which implies that the framework can also only be used by a single user [109].
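As a minimal sketch of how this limitation could be mitigated, Werkzeug's threaded mode can be enabled (or a production WSGI server such as Gunicorn could be used instead); the app object and port below are illustrative assumptions.

from flask import Flask

app = Flask(__name__)  # hypothetical microservice application

if __name__ == '__main__':
    # threaded=True makes Werkzeug spawn a thread per request,
    # partially lifting the one-request-at-a-time limitation
    app.run(host='0.0.0.0', port=5000, threaded=True)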

4.3.2 Timeouts

The framework does not perform checks on request timeouts when passing commands to components and plugins. This can be a problem when the framework is deployed on several devices and the request latency is much higher. If a component fails to respond, the framework will keep waiting for a response indefinitely, which leads to a crash.
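A minimal sketch of such a timeout check, assuming the components are reached over HTTP with the requests library; the URL and the 2-second budget are illustrative assumptions.

import requests

try:
    # Fail fast instead of waiting indefinitely for a slow component
    response = requests.post(
        'http://consumer:5000/plugins/mycam/play',  # hypothetical endpoint
        timeout=2.0)
    response.raise_for_status()
except requests.exceptions.Timeout:
    # The component did not answer in time; report instead of hanging
    print('Component timed out, command aborted')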

4.3.3 Exception handling and testing

The framework is only tested for the so-called 'happy path': the default scenario featuring no exceptional or error conditions. Some alternate paths are handled, but most still need to be tested. An example scenario would be if one of the plugin containers in a stream fails and stops: the framework is not able to detect this and will assume that the container is still running.

4.3.4 Docker security issues

The Docker client communicates with a daemon process, dockerd, through a socket. This socket is a UNIX domain socket called /var/run/docker.sock. The daemon is highly privileged, having root access to the host system. Any process that can write to this socket effectively has root access. To allow the components of the framework to manipulate the plugin containers, they need access to this socket. Therefore the socket is mounted in the container, which gives the container write access to the socket. This implies that that container now has root access on the host when writing to this socket. Because the container gets root access to the host, an attacker can walk the file tree of the host and extract sensitive information or run unwanted software. This type of attack is known as a 'Docker Breakout' or 'Container Escape' attack [110, 111].
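To illustrate the risk, the sketch below shows how such a socket mount is typically performed with the docker Python client; the image name mycam is an assumption. Any process inside the started container can afterwards issue arbitrary commands to the host's daemon.

import docker

# Connect to the daemon through the UNIX domain socket
client = docker.DockerClient(base_url='unix://var/run/docker.sock')

# Start a hypothetical plugin container with the socket mounted inside;
# everything in this container can now act with the daemon's root privileges
client.containers.run(
    'mycam', detach=True,
    volumes={'/var/run/docker.sock': {'bind': '/var/run/docker.sock',
                                      'mode': 'rw'}})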

4.3.5 Docker bridge network

The current implementation deploys the framework on a Docker bridge network, which can only be used if the framework is deployed on a single device; the framework can thus not yet be distributed. To deploy the framework on multiple devices, it must be deployed using a Docker overlay network [112].
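A sketch of the required change, assuming the docker Python client and a hypothetical network name; note that an overlay network additionally requires the participating hosts to be joined in a Docker Swarm.

import docker

client = docker.from_env()
# Hypothetical overlay network spanning all swarm nodes; requires
# 'docker swarm init' to have been run on the manager host first
client.networks.create('framework-net', driver='overlay', attachable=True)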

4.3.6 Single stream

The implementation supports one stream, which must be a chain. Multiple streams in tree form, merging media from multiple sources or broadcasting to multiple listeners, are not supported.

4.3.7 Number of containers per plugin

The framework uses the name of the plugin as the identifier for its containers. The name is also the hostname on which the container can be reached. Therefore, there can only be one active container associated with a plugin at runtime.


Chapter 5

Mob detection experiment

To try out an actual drone thermal imaging application, the mob detection experiment was carried out. The goal of this experiment is to use existing object detection algorithms on a dataset of thermal images to try to detect large crowds of people, hereinafter referred to as a mob.

Several public datasets of thermal images exist. Most datasets focus on the detection of people in scenes [113-117], some on face recognition [118, 119], others on vehicle recognition [120]. Most of these datasets are freely available through the OTCBVS Benchmark Dataset Collection [121]. No datasets containing large amounts of people were found, so the Last Post thermal dataset was created for the detection of mobs and other analysis tasks. This dataset is presented in Section 5.1.

To detect mobs in the images of the dataset, a deep learning approach using neural networks is explored. The selection and training of the model is described in Section 5.2.

5.1 Last Post thermal dataset

The Last Post dataset consists of videos of the Last Post ceremony, taking place each night at 8:00 PM (Brussels timezone) under the Menin Gate in Ypres, Belgium. Section 5.1.1 gives some insight into this unique ceremony. The full dataset is described in Section 5.1.2.

5.1.1 Last Post ceremony

The Last Post ceremony is a nightly ceremony taking place under the Menin Gate in Ypres at 8:00 PM sharp. The ceremony is held in remembrance of the soldiers who fell during World War I (1914-1918). The Last Post association [122] states its mission as follows:

True to its statutes the Last Post Association wishes to honor and remember the soldiers of the British Empire

who gave their lives during the Great War of 1914-1918 The Last Post ceremony seeks to express day after day

the lasting debt of gratitude which we all owe to the men who fought and fell for the restoration of peace and

the independence of Belgium


Figure 5.1 gives an impression of the size of the ceremony. Because of the sheer number of people that gather under the gate each day, the Last Post is a unique open-air event that allowed for repeatable conditions to capture footage; the event was therefore a perfect opportunity to create the dataset.

Figure 5.1: Last Post ceremony panorama

5.1.2 Dataset description

Due to legislation in Belgium, drones cannot be flown in public areas without certification and a permit from the authorities. The creation of real aerial thermal images with a drone was thus not feasible. Therefore an elevated position on the walls next to the Menin Gate was used (in order to simulate aerial images) to capture footage of the adjacent square on one side and the bridge on the other side. Figure 5.2 shows the locations where the video footage was captured.

Figure 5.2: Locations where the video footage was captured. The black stars represent the captured scenes; the red stars represent the locations from where the scenes were filmed.

The data was recorded with the FLIR One Generation 3 Pro camera for Android devices, hereafter referred to as "Camera" [123]. Since thermal images don't hold color information, a color scheme is used to represent the relative differences in temperature. The 'Iron' color scheme is used, which maps colder sections of a scene onto blue colors and warmer sections onto red and yellow colors.


The videos are encoded using the H.264/MPEG-4 codec. Decoded, the color information is captured in 4:2:0 YUV format. The frame rate of the videos varies from 7 Hz to 8 Hz, depending on the speed of the objects in the scene. There is sound present in the videos, encoded with the MPEG AAC codec. For a full list of sequences, the reader is referred to Appendix C.

The two locations that make up the main scenes in the dataset are presented in Figure 5.3. The thermal and visual images of each scene are depicted next to each other. The thermal and visual images were not captured at the same time, so the mobs that are present in the thermal images can't be seen in the visual images. In both scenes, buildings are present that are quite warm compared to the surroundings, as can be seen in the thermal images. In Figure 5.3a it even becomes difficult to recognize the mob when they are standing close to the building. This is less the case for Figure 5.3c, where, thanks to the water present in the image, the mob has higher contrast due to the larger difference in emitted heat. Towards the far right of the image, the mob seemingly disappears into the background. The effect of two objects having a similar heat signature with no clear transition between them in thermal images is defined as thermal camouflage, a technique that is often used by animals and military units [124]. This effect is even visible when looking at the mobs present in both images: because people are standing so close together, it becomes difficult to recognize individual persons in the crowd.

(a) Thermal view of the square in location A (b) Visual view of the square in location A

(c) Thermal view of the bridge in location B (d) Visual view of the bridge in location B

Figure 5.3: Main scenes in the Last Post dataset


5.2 Object detection experiment

5.2.1 Preprocessing

The Last Post dataset was not used in its entirety for training the model, because there were not enough resources to manually annotate every image. Therefore a smaller subset was used to train a baseline model.

The following videos were used: 2018-04-10 195029.mp4, 2018-04-10 200122.mp4, 2018-04-04 202859.mp4, 2018-04-10 202558.mp4 and 2018-04-04 200052.mp4, captured on the fourth and tenth of April 2018. These videos were used because of their contents: they contain images from locations A and B in which the mob behaves more dynamically compared to other videos. This was due to a marching band being present on the fourth of April and a marching army unit on the tenth of April. See Appendix C for a summary of the contents of these videos. From these videos, images were extracted at a capture rate of 1 Hz. Each image was manually labelled using the Microsoft Visual Object Tagging Tool [125]. The tool allows exporting the training images to various formats, such as Pascal VOC, for Tensorflow, YOLO and Microsoft CNTK.
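A minimal sketch of this 1 Hz extraction step, assuming OpenCV; the input file is one of the videos listed above and the output naming is an assumption.

import cv2

video = cv2.VideoCapture('2018-04-10 195029.mp4')
fps = video.get(cv2.CAP_PROP_FPS)  # roughly 7-8 Hz for these videos
frame_index = saved = 0
while True:
    ok, frame = video.read()
    if not ok:
        break
    # Keep one frame per second of video
    if frame_index % int(round(fps)) == 0:
        cv2.imwrite('frame_%05d.jpg' % saved, frame)
        saved += 1
    frame_index += 1
video.release()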

Within the data, several visual outliers are present. An outlier is an observation point that is distant from other observations. Outliers are created due to variability in capturing the videos or indicate experimental errors [126]. The errors detected here are of the latter form and are depicted in Figure 5.4. The first type of outlier is a system fault in the Camera: due to an error in the processing of the video, the Camera would sometimes not register any input. This causes the Camera to produce completely black images, as depicted in Figure 5.4a. The Camera software maps temperatures onto colors in the image. The variations of the colors are relative to the temperature interval, ranging from the minimum to the maximum temperature detected by the Camera. If the minimum and/or maximum detected temperature changes, the Camera needs to adapt its color mapping. This causes the Camera to fade to bright colors for a short period of time (1 to 2 seconds); the resulting image is depicted in Figure 5.4b. Because the resulting image is too bright and objects are hard to detect, it is considered an outlier. Due to instabilities when capturing footage of sequences with fast motion, some images are very blurry. This makes it hard, even for a person, to decide what is visible in the frame; therefore it is considered an outlier. This is depicted in Figure 5.4c. Sometimes people would pass in front of the Camera, which resulted in brightly colored areas in the videos that were not part of the scene; these are another type of outlier, depicted in Figure 5.4d. Because the presented outliers are experimental errors and do not belong in the scenes, they were removed from the dataset.

5.2.2 Training

The model used for training is YOLOv3, implemented in the darknet neural network framework [83]. The model is trained using convolutional weights that are pre-trained on the ImageNet database [127]. The concept of using weights from a model previously trained on large datasets is known as transfer learning. When choosing a pre-trained model, it is very important that the problem statement of the pre-trained model is close enough to the current problem statement. For the model pre-trained on ImageNet, this was to identify objects in images, which lies close to the detection of mobs in thermal images. Because the type of images (thermal versus visual) is fundamentally different, the model could suffer in performance. Goedeme et al. [36] solved a similar problem with thermal images and achieved good results, which gives an indication that


(a) System fault in the Camera: no input was detected. (b) The Camera updates to a new temperature interval. (c) Due to moving the Camera too fast, the image becomes too blurry. (d) Very warm object due to people passing in front of the Camera.

Figure 5.4: Outliers

detection should be feasible with the pre-trained model. Also, because the dataset is relatively small, training the model from scratch could actually hurt performance [128]. Training was carried out on an NVIDIA GeForce GTX 980 GPU, which allows training to be done much faster. To evaluate training progress, the Sum of Squared Error (SSE) loss function is calculated, defined as $\sum_{i=1}^{n}(x_{ij} - x_j)^2$, where $n$ is the number of samples in a batch used in a single training epoch and $j$ is the dimension ($x$ or $y$), as defined in [83]. The result of this training is discussed in Chapter 6.
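As a small worked illustration of this loss, the sketch below computes the SSE per dimension for a hypothetical batch of three predicted and ground truth box centres; all values are made up.

import numpy as np

# Hypothetical predicted and ground truth box centres for a batch of
# n = 3 samples; the two columns are the x and y dimensions (j)
pred = np.array([[0.52, 0.48], [0.30, 0.71], [0.65, 0.40]])
truth = np.array([[0.50, 0.50], [0.28, 0.70], [0.70, 0.38]])

# SSE per dimension j: sum over the batch of the squared differences
sse = ((pred - truth) ** 2).sum(axis=0)
print(sse)  # one value for x, one for y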


Chapter 6

Results and evaluation

The goal of this chapter is to present the results of the framework tests and the detection experiment. The results of the framework tests are presented in Section 6.1. The results of the object detection experiment are presented in Section 6.2.

6.1 Framework results

To evaluate the framework, acceptance tests are conducted that test whether the framework meets the QARs defined in Section 2.1. As stated in Section 4.1, only the ASRs will be tested. A summary of which requirements are met by the framework is given in Table 6.1. Passed means that the framework has met the requirement, not passed that the framework hasn't met the requirement, and plausible means that the framework might have met the requirement but not enough data could be gathered to be certain.

6.1.1 Performance evaluation

To evaluate performance, the acceptance tests for the requirements are conducted, the impact of the framework on the processing resources is recorded, and the total size of the framework is measured.

Acceptance tests

To test the performance of the framework, the execution times of each command executed with the CLI (see Section 4.2.2) are measured. Each command is executed 200 times, except for the on, off and link commands: these are measured manually, 10 times. Because these commands launch system threads whose finish signal could not be captured, they had to be measured by hand. Commands were executed on a 2.6 GHz Intel Core i5-2540 processor running macOS High Sierra version 10.13.4. The summarized statistics of the tests are given in Table 6.2.

The average execution times for the Play, Stop, Pause, Add, Elements, Print, View and Link commands do not exceed the 2-second bound specified in PS-1, while the average execution times of the Delete, On and Off commands do exceed this bound. This performance requirement is not met by the framework. The same result is found for PS-3. Especially the Delete and Off commands exceed the requirements by quite a bit. The Delete command shuts down a plugin and removes the Docker container


Requirement id Status

PS-1 Not Passed

PS-2 Plausible

PS-3 Not Passed

PS-4 Plausible

PS-5 Not Passed

IS-1 Passed

IS-2 Passed

MS-1 Passed

MS-2 Passed

MS-3 Passed

MS-4 Passed

MS-5 Plausible

MS-6 Passed

MS-7 Plausible

Table 6.1: Acceptance test results summary

from the host; this action is costly in time. The Off command removes all the plugins and all the microservices of the framework and thus suffers from the same costly action. This could be ameliorated by having the framework stop the containers instead of removing them, which requires fewer resources, as stopping only halts the process running in the container but does not delete the container from the system.
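A sketch of this alternative using the docker Python client; the container name mycam is an assumption.

import docker

client = docker.from_env()
container = client.containers.get('mycam')

# Stopping only terminates the process inside the container...
container.stop()
# ...whereas removal also deletes the container from the host, which is
# the costly operation behind the slow Delete and Off commands
# container.remove()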

PS-2 and PS-4 could not be measured, due to the GStreamer pipeline of the prototype not allowing frames to be tracked. However, since real-time is tied to human time perception, if a person can't distinguish the streamed video from a video played with a native video player, real-time streaming is plausible [43, 44]. The videos were shown side by side to ten users, who could not distinguish between both videos, indicating presumably real-time streaming. Since the hard requirements cannot be measured, the requirements are not met, but are plausible. Real-time streaming performance also heavily depends on the plugins used and the hardware on which they are deployed. If a plugin can't process its media fast enough, due to a lack of processing power or a slow implementation, it will slow down the whole stream.

The scalability requirement PS-5 could not be met, due to the Flask Werkzeug server only being able to process one request at a time (see Section 4.3).

Only two performance requirements are plausibly met by the prototype. However, this is mostly due to some actions being very slow, such as shutting down the framework or removing a plugin. As these are actions that should occur less frequently when a user is using the framework, they are less important for the perceived quality. Frequent actions, such as adding, linking and changing the state of the stream, do perform rather well and contribute more to the perceived quality. Overall, the performance of the framework is not stellar, but not bad either. This can partially be explained by the choice of supporting frameworks,


Statistic Play Stop Pause Add Delete Elements Print View On Off Link
Mean 0.690 0.804 0.634 1.363 8.402 0.562 0.564 1.22 3.58 24.023 0.849
Std deviation 0.050 0.059 0.088 1.037 4.669 0.070 0.0747 0.260 0.498 0.481 0.170
Minimum 0.629 0.708 0.549 0.516 0.505 0.517 0.517 0.757 3.015 23.707 0.637
25th percentile 0.665 0.775 0.594 1.049 1.154 0.534 0.536 0.998 3.143 23.750 0.798
Median 0.678 0.800 0.623 1.11 11.132 0.550 0.552 1.214 3.500 23.886 0.853
75th percentile 0.700 0.820 0.653 1.233 11.189 0.562 0.560 1.433 3.850 24.034 0.877
Maximum 1.016 1.279 1.631 6.25 11.846 1.227 1.149 1.691 4.562 25.326 1.261

Table 6.2: Performance test statistics summary, measured in seconds

such as Flask, that are not built for performance. Other, more performance-oriented frameworks, such as Vert.x, could ameliorate performance.

Resource usage

The resources used by the modules of the framework are measured using the Docker statistics tool [129]. A summary of the resources used is given in Table 6.3. When the framework is idle, resource usage is negligible. When a plugin is active, there is a slight increase in resource usage; this increase depends on the runtime size of the plugin, which is unknown to the framework. The increase peaks when the plugin is processing media: CPU usage is then 40% on one core, which implies that only two such plugins can be active simultaneously on one CPU core before reaching the ceiling of the processing power. In a production environment of the framework, plugins need to be tested thoroughly so that these metrics are known beforehand. These metrics imply that the length of streams should be kept short, to avoid having many plugins active simultaneously.

Size of framework

The total sizes of the Docker images of the components of the framework are given in Table 6.4. Most images are quite large: the framework core components have an average size of 724 MB, and the plugins have sizes ranging from 1 GB to 3 GB. This size can be explained by the base images and the additionally installed software in the images. For development flexibility, the base images used are Linux Ubuntu images, which are typically larger than other Linux distributions. For the plugins, the full GStreamer library with all plugins was installed, which is more than 2 GB in size. The sizes of the components can be reduced in a production environment by choosing slimmer Linux distributions as base images and only installing the minimal libraries needed to get a working plugin.

6.1.2 Interoperability evaluation

The systems with which the framework exchanges data are the plugins. These plugins must follow the plugin model presented in Section 2.3.1 and implement the presented resources using a REST API, the state machine and the protocols. If these specifications


Condition Container CPU usage [%] Memory usage [MiB]
Idle streamer 1.00 42.09
consumer 0.03 24.4
producer 0.01 24.14
1 plugin active, not processing media streamer 1.56 42.48
consumer 0.02 24.42
producer 0.02 24.23
mycam plugin 0.75 45.97
1 plugin active, processing media streamer 1.56 42.51
consumer 0.02 24.42
producer 0.02 24.24
mycam plugin 40.03 99.24

Table 6.3: Resource usage of the framework in several conditions

Image Size [MB]

streamer 718

consumer 729

producer 729

testsrc 1250

mycam 3020

Table 6.4: Total size of framework components

are followed by a plugin, the framework should have no issues exchanging information with the plugin. To test this, a new mock plugin is implemented. For each resource of the plugin, the framework is given random mock input data to exchange with the plugin. When the exchange is complete, the values in the plugin are requested and compared with the given input. If the input matches the value in the plugin, the exchange was successful. These tests were executed 50,000 times. The results are summarized in Table 6.5. Play, pause and stop are the requests to change the state of the plugin. The source/listener add, update and delete commands manipulate the sources and listeners of the plugin. Overall, there were almost no errors made when exchanging information; only when deleting a source and deleting a listener was there one incorrect exchange. The ratios achieved are always 100% correct exchanges, except for deleting a source and deleting a listener, which are at 99.998%. IS-1 and IS-2 specify that commands exchanged with the plugins need to be correct 99.99% of the uptime, so this requirement is clearly met.

Plugins also interact with each other by transmitting media according to the stream layout. This interoperability


Value Play Pause Stop Add S Update S Delete S Add L Update L Delete L
Correct 50000 50000 50000 50000 50000 49999 50000 50000 49999
Incorrect 0 0 0 0 0 1 0 0 1
Ratio (%) 100 100 100 100 100 99.998 100 100 99.998

Table 6.5: Interoperability test results (S: Source, L: Listener)

is not directly controlled by the framework, as plugins can be developed by third parties. To solve this, a plugin needs to provide its specifications to the framework before being integrated as a plugin. This allows the framework to decide whether or not two plugins will be able to interact with each other in a stream. For example, if plugin A supports MJPEG streams transmitted via RTP/UDP, it will be able to interact with a plugin B implementing the same protocols. If plugin B implements another protocol, it will not be able to interact with plugin A. If this is specified, the framework can notify a user that two plugins are not compatible. These scenarios should be avoided, which is done by specifying standard protocols for plugins.

6.1.3 Modifiability evaluation

Plugins are installed for the prototype by building and adding their image to the image directory of the Docker host. The framework does not need a restart to install these images; therefore requirements MS-1 and MS-2 are met. End-users can extend their version of the framework with new plugins by building the respective plugin images, meeting MS-3. Streams can be modified by linking different plugins by design, meeting MS-4. The framework can detect newly installed plugins when starting up, if the image is installed in the image directory of the Docker host. Therefore requirements MS-5 and MS-6 are met. The current prototype is only deployable on a local device, as discussed in Section 4.1, meeting requirement MS-7. The other requirements can be met by deploying the framework using a Docker overlay network, as discussed in Section 4.3, without having to implement changes to the code base. The requirements MS-8 and MS-9 are not met, but are plausible by using a different Docker deployment.

In general, the framework was designed to be modifiable for different video analysis tasks. The hybrid microkernel/microservices architecture enables this modifiability: the microkernel (plugin) architecture allows a user to modify a video analysis stream during framework use, and the microservices architecture allows for a modifiable deployment configuration.

6.2 Mob detection experiment results

To evaluate the detection experiment, the trained model is tested on the validation set, which contains random images from the total annotated dataset presented in Section 5.1.2. First, the results of the training of the model are presented in Section 6.2.1. Second, the metrics that were used to evaluate the model are presented in Section 6.2.2. Finally, the results of the validation are presented in Section 6.2.3.


6.2.1 Training results

To monitor training, the average loss per training epoch was measured; the resulting training evolutions are depicted in Figure 6.1. Darknet does not shuffle training data automatically and creates training batches in the order of the training data provided. Since YOLO uses gradient descent for optimization, this can lead to YOLO getting stuck in local minima of the cost surface [130]. This effect is seen in Figure 6.1a around epoch 4500: every image in the training set has been loaded at least once at this point; the model was training on images from location B and now images from location A are loaded (see Section 5.1.2). This leads to a peak in average loss, as YOLO was optimizing on images from location B and probably converging to a local minimum for that type of images. Therefore, in a second run, the data was shuffled, allowing the model to escape local minima more easily. Figure 6.1b shows the difference in training loss: the curve is much more irregular thanks to the shuffling of the data. Once again, the average loss decreases more around epoch 4500, when every image in the training set has been loaded at least once. The average loss stagnates at values in the interval [0.04, 0.07]. To avoid overfitting the model on the training data, which would worsen generalization performance, early stopping is applied. Early stopping is a generalization technique that stops the training of a neural network early, before the network starts overfitting [131]. The stopping criterion used is progress, defined as the decrease of training error in successive training epochs [131], or the slope of the loss curve depicted in Figure 6.1. This slope approaches 0 from epoch 13000 onward, so this epoch is selected as the early stopping point. Because the generalization error is not a smooth curve and consists of many local minima, it is a good idea to also validate model weights in the neighborhood of the early stopping point, as these could potentially yield better performance on the validation set [131].

6.2.2 Metrics

The model predicts bounding boxes for objects in the images of the validation sets. The bounding box provided by the annotated dataset is defined as the ground truth bounding box $B_{gt}$; the bounding box provided by the model is defined as the predicted bounding box $B_p$. To evaluate the performance of the model and select the best weights, several metrics are used. The standard metrics used to evaluate object detection problems are the Intersection over Union (IoU) and the mean Average Precision (mAP). The IoU is a metric used in common object detection challenges, such as the Pascal VOC challenge [132]. If the function $A(B_x)$ gives the area of a bounding box $B_x$, the IoU is defined as

$$\mathrm{IoU} = \frac{A(B_p \cap B_{gt})}{A(B_p \cup B_{gt})} \qquad (6.1)$$

The mAP for a set of detections, another metric used in the Pascal VOC challenge, is defined as the mean over the classes of the interpolated AP for each class. A detection is considered a true positive if the IoU for the detection is greater than 0.5. The interpolated AP is given by the area under the precision-recall curve for the detections [132-134].
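A minimal sketch of the IoU computation from Equation 6.1 for axis-aligned boxes, assuming boxes are given as (x1, y1, x2, y2) corner coordinates.

def iou(box_p, box_gt):
    # Corners of the intersection rectangle
    x1 = max(box_p[0], box_gt[0])
    y1 = max(box_p[1], box_gt[1])
    x2 = min(box_p[2], box_gt[2])
    y2 = min(box_p[3], box_gt[3])
    # Intersection area (zero if the boxes do not overlap)
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_p = (box_p[2] - box_p[0]) * (box_p[3] - box_p[1])
    area_gt = (box_gt[2] - box_gt[0]) * (box_gt[3] - box_gt[1])
    # IoU = intersection area / union area, as in Equation 6.1
    return inter / float(area_p + area_gt - inter)

# A detection counts as a true positive if iou(...) is greater than 0.5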

The model is also tested on several videos not included in the training and validation sets, to visually evaluate detection and to measure the number of frames per second that can be processed by the model.


(a) Average training loss when the data is not shuffled. Vertical: average loss; horizontal: time (in training epochs). (b) Average training loss when the data is shuffled. Vertical: average loss; horizontal: time (in training epochs).

Figure 6.1: Average training loss per epoch

6.2.3 Validation results

Every 100 epochs, YOLO creates a snapshot of the weights the model is using at that epoch [83]. This makes it possible to validate each set of weights on the validation set and show the evolution of the validation performance. Figure 6.2 shows these evolutions for the average IoU and mAP metrics. The mAP gradually grows from epoch 4500 onwards and stagnates around epoch 11500. This shows that the model is not learning anymore and is at risk of overfitting. The mAP stagnates in the interval [88%, 91%]. The average IoU shows a similar trend, but varies more, because predictions on the same images are rarely exactly the same.

The best mAP value, 90.52%, is achieved at epoch 15700. The weights from this epoch are used for further testing and validation. The mAP at the 0.5 IoU threshold of YOLOv3 on the COCO benchmark dataset [135] is 74.8%; compared to this, the mAP achieved for the Last Post dataset is very high. The reason for this difference is that the validation


(a) mAP (%) per epoch. Vertical: mAP (%); horizontal: time (in training epochs). (b) IoU (%) per epoch. Vertical: IoU (%); horizontal: time (in training epochs).

Figure 6.2: Validation metrics per epoch

set has a high correlation with the training set. Because the training set and validation set were extracted from videos, all images from one video are temporally correlated with each other. Images from the validation set are thus correlated with images in the training set, and the model is optimized on these types of images, explaining the high mAP. This indicates that the model is somewhat overfitting on the training data. This was confirmed when testing the model on unseen videos: although the model could detect a mob most of the time, it produced more visual errors. Because this data was not annotated, no metrics could be extracted. Figure 6.3 depicts some predictions of the model on images from the validation set. Visually, the predicted bounding boxes resemble the ground truth bounding boxes quite accurately.

To test the speed of the predictions of the model, the total time to predict images in the validation set was measured. For the NVIDIA GeForce GTX 980 GPU, the average prediction time for one image is 146.73 milliseconds, with a standard deviation of 0.517 milliseconds. This indicates that the upper limit of the frame rate when making predictions on a video is approximately 1/0.14673 s ≈ 6.8 frames per second on the GPU. For comparison, predictions with the model were also made on a CPU, a 2.6 GHz Intel Core i5-2540 processor with AVX instruction speedup. The average prediction time on the CPU is 5.849 seconds, with a standard deviation of 0.438 seconds, resulting in an upper limit for the frame rate on the CPU of 0.171 frames per second. Clearly, real-time object detection with this model is only possible on a GPU. When generating predictions on a test video, the average frame rate of the video was 5.5 frames per second.


(a) Prediction of a large mob at location B (b) Prediction of the mob at location A

(c) Prediction of a small mob at location B (d) Prediction of the mob at location B

Figure 6.3: Predictions of the model on images in the validation set


Chapter 7

Conclusion and future work

7.1 Conclusion

Aerial thermal imaging with drones is a promising technology that can deliver many applications for various use cases across many different domains, such as agriculture, fire fighting, search and rescue, etc. Most applications built with this technology are built with a specific use case in mind, using a thermal camera and analysis software specific to this use case, and they therefore struggle to exchange hardware and algorithms for new use cases. The goal of this dissertation was therefore to design, build and test a possible backbone framework that allows building these applications in a modifiable way. The specific use case of mob detection in thermal images was investigated as a sample use case for the framework.

Chapter 2 explored the requirements of such a framework. The ASRs to achieve the goal of the framework are performance, interoperability and modifiability. Performance is needed because some use cases (like fire fighting) require real-time video analysis. Interoperability enables the framework to interact with different thermal cameras and different processing/analysis modules. Modifiability enables the framework to interchange the thermal cameras and analyzers in its process, to build applications for different use cases. A hybrid combination of the microkernel pattern and the microservices pattern is used to meet these requirements: the microkernel pattern enables interchanging the cameras and analyzers via a plugin system, and the microservices pattern enables different deployment configurations for the framework. To build and test the framework, several technologies were needed: backbone technologies for the software architecture, a thermal camera, and an object detection algorithm for the mob detection use case.

Chapter 3 explored the state of the art of these technologies and presented the selected technologies. Thermal cameras come in all shapes and sizes and have different features according to their retail price. Contrary to intuition, the image quality is not the defining factor of the retail price, but rather the amount of extra features, such as radiometry, communication interfaces, etc. The FLIR One Pro and Therm-App were selected for this dissertation, since they offer good quality images and features for their price, and their use via smartphone platforms makes these cameras excellent for prototyping. Microservices frameworks also vary widely, depending on the use case of the application using the framework: some are aimed at quick prototyping, others focus on performance, etc. Flask was selected as the microservices framework, as it is easy to use and designed for prototyping with microservices; this does come with a performance trade-off. To deploy the microservices in a plugin fashion,


the concept of containers is applied. Containers virtualize on the OS level, allowing the microservices to be moved around on the host and distributed over different hosts. The current field has some frameworks implementing this technology, with Docker being the most well-known and mature framework; it was selected for that reason. The field of object detection has a variety of solutions for the object detection problem, with varying accuracies, and some can even create predictions in real-time. The YOLOv3 algorithm, implemented in the darknet framework, was selected, as it generalizes well to other datasets (such as thermal images), makes relatively accurate predictions and is able to make predictions in real-time when deployed on a device with GPU processing capabilities.

Chapter 4 presents the implemented prototype of the framework using these technologies. Two sample plugins were implemented: the filecam, which serves a video read from a file, and the display plugin, which displays this video on the local device. The framework is limited to one video processing stream for one user at a time and is deployed on a local device. It also has a security risk, as the framework has to expose the Docker daemon socket to allow the framework to manipulate the containers running the plugins. This gives the containers that run the core framework processes root access to the host system, which can be abused by potential attackers.

The mob detection experiment is presented in Chapter 5. A new thermal image dataset, called the Last Post dataset, was collected for this experiment. The dataset features videos of the Last Post ceremony, filmed over the course of two weeks. What makes this dataset special is that, unlike publicly available datasets, it delivers footage of the movement of large crowds, filmed from a high vantage point to simulate footage captured from a drone platform. This dataset is used to train a pre-trained YOLOv3 model via transfer learning. The dataset was manually labeled and preprocessed by removing the outliers present. Training was done on an NVIDIA GTX 980 GPU and evaluated using the SSE loss metric.

Chapter 6 presented the tests conducted on the framework and the detection model, and their corresponding results. The performance requirements for the frequently used commands are met by the framework. Other commands, such as removing plugins and starting up or shutting down the framework, do not meet the performance requirements, since Docker requires significant time to start, stop and remove containers. The real-time streaming requirements could not be proven, because the time between transmitting a frame and receiving a frame could not be measured directly. However, the processed videos were shown to human users, who could not distinguish between the processed video and the video played back on a local system, which makes it plausible that the framework achieves this requirement. Real-time streaming performance heavily depends on the plugin and the hardware on which it is deployed. When plugins in the framework are processing media, CPU usage increases significantly, even when only one plugin is active. This implies that the length of media processing streams should be kept as short as possible to achieve good performance. The framework is relatively big, with some plugins even having a size of 2 GB. This is mostly due to the base images and installed libraries of the plugins and core components. Because each component and plugin has its own container, libraries can't be shared, so they are redundantly installed, leading to large component sizes. This could be alleviated by using slimmer base images and only installing the minimal libraries needed. The interoperability requirements are all met by the framework; this is proven by a test exchanging mock information between the framework and plugins. The modifiability requirements regarding the plugins are met by the framework. The modifiability requirements regarding the deployment schemes are not met by the framework, but can be achieved by deploying the framework using a Docker overlay network instead of the Docker bridge network. To evaluate the trained model, the model made predictions


on a separate validation set. The model achieves an mAP of 90.52%, which is much higher than what current state-of-the-art models achieve on benchmark datasets. This shows that the model is capable of learning the thermal features, but is also overfitting on the data, due to the temporal correlation between the training and validation sets. The model can predict in real-time, achieving an average frame rate of 5.5 frames per second when making predictions on a GPU.

7.2 Future work

This dissertation proposed a framework and implemented a prototype that realizes only part of the total framework. Object detection using deep learning, in general and applied to thermal images, is still a young field. Several extensions to this research are possible.

7.2.1 Security

The framework prototype did not implement any security measures. Because communications rely on an external network in distributed configurations, these measures should be implemented to reduce the risk of attacks. To allow the components to manipulate Docker containers, the Docker host socket was exposed. As stated before, this is a serious security risk, as the container gets root access to the host. Workarounds for this problem could be to implement a Docker-in-Docker environment [136] or to deploy the containers in a VM.

7.2.2 Implementing a detection plugin

Due to the scope and time limit of the dissertation, a working prototype plugin containing a trained model for detecting objects in a video stream could not be made. A possible GStreamer pipeline for such a plugin is depicted in Figure 7.1. This plugin is a Consumer and receives video via the udpsrc element. Frames are decoded and the raw video is presented to the appsink GStreamer element, which allows the video to be handed to an application: here, the detection model that generates predictions on the frame. The predicted frame is then forwarded to an appsrc GStreamer element, which puts the predicted frame on a new pipeline to transmit it to further framework plugins. It should be tested whether the detection model can run in a Docker container, since it needs GPU support to be able to predict in real-time. A solution could be to use nvidia-docker, which leverages NVIDIA GPU support in Docker containers [137].

Figure 7.1: GStreamer pipeline for a plugin with a detection model
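A minimal sketch of the receiving half of such a plugin, assuming PyGObject; the port, caps and the callback body (where the detection model would run) are illustrative assumptions.

import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst

Gst.init(None)
pipeline = Gst.parse_launch(
    'udpsrc port=5000 caps="application/x-rtp,encoding-name=JPEG,payload=26" '
    '! rtpjpegdepay ! jpegdec ! videoconvert '
    '! appsink name=sink emit-signals=true')
sink = pipeline.get_by_name('sink')

def on_sample(appsink):
    # Pull the decoded frame out of the pipeline
    sample = appsink.emit('pull-sample')
    buf = sample.get_buffer()
    # Hypothetical: run the detection model on the raw frame here and
    # push the annotated frame into the appsrc of a second pipeline
    return Gst.FlowReturn.OK

sink.connect('new-sample', on_sample)
pipeline.set_state(Gst.State.PLAYING)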


7.2.3 Different deployment configurations

The prototype of the framework only implemented one of the deployment configurations presented in Section 2.3.3. Other configurations can be explored by changing the Docker bridge network to a Docker overlay network.

7.2.4 Multiple streams with different layouts

The prototype only implemented one stream with a chain-like layout. Future effort could implement support for multiple streams that run concurrently. The layout can be changed by implementing a plugin that can forward media to multiple listeners or merge media coming from different sources, which is the concept of sensor fusion.

7.2.5 Implementing the plugin distribution service (Remote Producer/Consumer)

Chapter 2 presented the Remote Producer and Consumer, which distribute the plugins available to the framework. This was deemed out of scope for the prototype, but could be implemented in future versions.

7.2.6 Using high-performance microservices backbone frameworks

The current implementation uses the Flask framework, which is excellent for prototyping but not ideal for high performance. Other frameworks, such as Vert.x, focus on high performance through asynchronous messaging, which could improve the performance of the framework.

7.2.7 New object detection models and datasets specifically for thermal images

Current effort in object detection models goes towards challenges on benchmark datasets of visual images, such as ImageNet and Pascal VOC. There are some thermal datasets publicly available for certain detection purposes, but these are very small compared to the visual image datasets. Future research could create new benchmark datasets, similar to the visual image datasets, specifically for thermal images.

Currently, publicly available pre-trained neural network models are designed for and trained on visual image datasets. Future research could go towards designing an architecture specifically for thermal images and training a model on a thermal benchmark dataset.

Thermal images use several color maps to map the relative temperatures in a scene onto colors representing warm and cold regions. Well-known examples are the Iron scheme (used in this dissertation), White-hot and Black-hot. Some companies implement threshold colors that highlight very hot or very cold spots in an image (for examples, see [138, 139]). Future research could investigate how models trained on images using different color schemes differ in their predictions and performance. Thermal images could also benefit from radiometric information, which adds a temperature dimension to each pixel in the image instead of the relative coloring; this information could lead to more accurate predictions.


Bibliography

[1] S G Gupta M M Ghonge and P Jawandhiya ldquoReview of Unmanned Aircraft Systemrdquo International Journal of Advanced

Research in Computer Engineering amp Technology vol 2 no 4 pp 2278ndash1323 2013 ISSN 2278 ndash 1323

[2] M Hassanalian and A Abdelkefi Classifications applications and design challenges of drones A review 2017 DOI

10 1016 j paerosci 2017 04 003 [Online] Available http ac els - cdn com S0376042116301348 1 - s2 0 -

S0376042116301348-mainpdf7B5C_7Dtid=256c9506-8f3c-11e7-a898-00000aab0f017B5Camp7Dacdnat=

15042875957B5C_7D

[3] M Joel The Booming Business of Drones 2013 [Online] Available https://hbr.org/2013/01/the-booming-business-of-drones (visited on 01/30/2018)

[4] DJI Zenmuse H3 - 2D [Online] Available https://www.dji.com/zenmuse-h3-2d (visited on 01/30/2018)

[5] Gimbal Guard Drop amp Delivery Device for DJI Mavic Pro [Online] Available httpwwwgimbal-guardcom7B5C_

7Dpprd134610820141productdrop-7B5C7D26-delivery-device-for-dji-mavic-pro (visited on 01302018)

[6] FLIR Systems Aerial Thermal Imaging Kits [Online] Available http://www.flir.com/suas/aerial-thermal-imaging-kits (visited on 01/30/2018)

[7] R Gade and T B Moeslund ldquoThermal cameras and applications a surveyrdquo Machine Vision and Applications vol 25

pp 245ndash262 2014 DOI 101007s00138-013-0570-5 [Online] Available httpslinkspringercomcontentpdf10

10077B5C7D2Fs00138-013-0570-5pdf

[8] M C Harvey J V Rowland and K M Luketina ldquoDrone with thermal infrared camera provides high resolution georefer-

enced imagery of theWaikite geothermal area New Zealandrdquo 2016 DOI 101016jjvolgeores201606014 [Online] Avail-

able httpsacels-cdncomS03770273163014211-s20-S0377027316301421-mainpdf7B5C_7Dtid=78077cee-

05f3-11e8-84ec-00000aab0f6c7B5Camp7Dacdnat=15173405687B5C_7D

[9] S Amici M Turci S Giammanco L Spampinato and F Giulietti ldquoUAV Thermal Infrared Remote Sensing of an Italian Mud

Volcanordquo vol 2 pp 358ndash364 2013 DOI 104236ars201324038 [Online] Available httpwwwscirporgjournalars

20httpdxdoiorg104236ars201324038

[10] J Bendig A Bolten and G Bareth ldquoINTRODUCING A LOW-COST MINI-UAV FOR THERMAL-AND MULTISPECTRAL-IMAGINGrdquo

2012 [Online] Available httpswwwint-arch-photogramm-remote-sens-spatial-inf-scinetXXXIX-B13452012

isprsarchives-XXXIX-B1-345-2012pdf


[11] Workswell ldquoUsing the UAV Thermography for Cultivation and Phenotyping of Cerealsrdquo Tech Rep 2016 [Online] Avail-

able httpswwwdrone-thermal-cameracomwp-contentuploadsCultivation-and-Phenotyping-1pdf

[12] A J Rivera A D Villalobos J C Monje J A Marintildeas and C M Oppus ldquoPost-disaster rescue facility Human detection and

geolocation using aerial dronesrdquo IEEE Region 10 Annual International Conference ProceedingsTENCON pp 384ndash386

2017 ISSN 21593450 DOI 101109TENCON20167848026

[13] P Christiansen K A Steen R N Joslashrgensen and H Karstoft ldquoAutomated detection and recognition of wildlife using

thermal camerasrdquo Sensors (Basel Switzerland) vol 14 no 8 pp 13 778ndash93 Jul 2014 ISSN 1424-8220 DOI 103390

s140813778 [Online] Available httpwwwncbinlmnihgovpubmed2519610520httpwwwpubmedcentralnih

govarticlerenderfcgiartid=PMC4179058

[14] J Zhang J Hu J Lian Z Fan X Ouyang and W Ye ldquoSeeing the forest from drones Testing the potential of lightweight

drones as a tool for long-term forest monitoringrdquo Biological Conversation vol 198 pp 60ndash69 2016 [Online] Available

httpacels-cdncomS00063207163011001-s20-S0006320716301100-mainpdf7B5C_7Dtid=7166e916-8f3c-

11e7-9090-00000aacb35e7B5Camp7Dacdnat=15042877237B5C_7D

[15] D Ventura M Bruno G Jona Lasinio A Belluscio and G Ardizzone ldquoA low-cost drone based application for identifying

and mapping of coastal fish nursery groundsrdquo Estuarine Coastal and Shelf Science vol 171 pp 85ndash98 Mar 2016 ISSN

02727714 DOI 101016j ecss 201601 030 [Online] Available http ac els-cdncomS02727714163003001-s20-

S0272771416300300-mainpdf7B5C_7Dtid=7f4cdb08-8f3c-11e7-a03a-00000aab0f6b7B5Camp7Dacdnat=

15042877467B5C_7D20httplinkinghubelseviercomretrievepiiS0272771416300300

[16] S Chowdhury A Emelogu M Marufuzzaman S G Nurre and L Bian ldquoDrones for disaster response and relief operations

A continuous approximation modelrdquo 2017 DOI 101016jijpe201703024 [Online] Available wwwelseviercomlocate

ijpe

[17] Workswell ldquoPipeline inspection with thermal diagnosticsrdquo 2016 [Online] Available https www drone - thermal -

cameracomwp-contentuploadspipelinepdf

[18] Workswell ldquoThermo diagnosis of photovoltaic power plantsrdquo 2016 [Online] Available httpswwwdrone-thermal-

cameracomwp-contentuploadsWorkswell-WIRIS7B5C_7Dphotovoltaicpdf

[19] Workswell ldquoThermodiagnostics of flat roofsrdquo 2016 [Online] Available httpswwwdrone-thermal-cameracomwp-

contentuploadsroofpdf

[20] Workswell ldquoThermodiagnostics in the power engineering sectorrdquo Tech Rep 2016 [Online] Available https www

drone-thermal-cameracomwp-contentuploadshighvoltagepdf

[21] Workswell Workswell WIRIS - Product - Thermal camera for drones 2016 [Online] Available https www drone -

thermal-cameracomwiris (visited on 01302018)

[22] TEAX Technology ThermalCapture - Thermal Imaging Technology | Capture raw radiometric thermal data with drones

[Online] Available httpthermalcapturecom (visited on 01302018)


[23] DJI Zenmuse XT - unlock the possibilities of sight - DJI 2018 [Online] Available https://www.dji.com/zenmuse-xt (visited on 01/30/2018)

[24] Workswell SOFTWARE - Workswell WIRIS - Thermal camera for drones 2016 [Online] Available httpswwwdrone-

thermal-cameracomsoftware (visited on 01312018)

[25] Therm-App Therm-Apptrade - Android-apps op Google Play 2018 [Online] Available httpsplaygooglecomstoreapps

detailsid=comthermapp (visited on 01312018)

[26] B Satzger W Hummer C Inzinger P Leitner and S Dustdar ldquoWinds of change From vendor lock-in to the meta cloudrdquo

IEEE Internet Computing vol 17 no 1 pp 69ndash73 2013 ISSN 10897801 DOI 101109MIC201319

[27] J Divya Drone Technology and Usage Current Uses and Future Drone Technology 2017 [Online] Available httpuk

businessinsidercomdrone-technology-uses-2017-7r=US7B5Camp7DIR=T (visited on 01312018)

[28] A Boulanger ldquoOpen-source versus proprietary software Is one more reliable and secure than the otherrdquo IBM Systems

Journal vol 44 no 2 pp 239ndash248 2005 ISSN 0018-8670 DOI 101147sj4420239 [Online] Available httpieeexplore

ieeeorgdocument5386727

[29] M Kazmeyer Disadvantages of Proprietary Software [Online] Available httpsmallbusinesschroncomdisadvantages-

proprietary-software-65430html (visited on 01312018)

[30] B Steffen and A Seyfried ldquoMethods for measuring pedestrian density flow speed and direction with minimal scatterrdquo

Physica A Statistical Mechanics and its Applications vol 389 no 9 pp 1902ndash1910 May 2010 ISSN 0378-4371 DOI 10

1016JPHYSA200912015 [Online] Available httpswwwsciencedirectcomsciencearticlepiiS0378437109010115

via7B5C7D3Dihub

[31] M Wirz T Franke D Roggen E Mitleton-Kelly P Lukowicz and G Troumlster ldquoInferring crowd conditions from pedestriansrsquo

location traces for real-time crowd monitoring during city-scale mass gatheringsrdquo Proceedings of the Workshop on

Enabling Technologies Infrastructure for Collaborative Enterprises WETICE pp 367ndash372 2012 ISSN 15244547 DOI 10

1109WETICE201226

[32] E Alpaydin Introduction to machine learning 3rd ed MIT Press 2014 p 591 ISBN 026201243X [Online] Available

httpsdlacmorgcitationcfmid=1734076

[33] J W Davis and V Sharma ldquoRobust background-subtraction for person detection in Thermal Imageryrdquo IEEE Computer

Society Conference on Computer Vision and Pattern Recognition Workshops vol 2004-Janua no January 2004 ISSN

21607516 DOI 101109CVPR2004431

[34] W Wang J Zhang and C Shen ldquoImproved Human Detection And Classification in Thermal Imagesrdquo pp 2313ndash2316 2010

[35] R Appel S Belongie P Perona and P Doll ldquoFast Feature Pyramids for Object Detectionrdquo Pami vol 36 no 8 pp 1ndash14

2014 ISSN 01628828 DOI 10 1109 TPAMI 2014 2300479 [Online] Available https vision cornell edu se3 wp -

contentuploads201409DollarPAMI14pyramids7B5C_7D0pdf

[36] T Goedeme ldquoProjectresultaten VLAIO TETRA-projectrdquo KU Leuven Louvain Tech Rep 2017


[37] L-L Slattery DroneSAR wants to turn drones into search-and-rescue heroes 2017 [Online] Available https www

siliconrepubliccomstart-upsdronesar-search-and-rescue-drone-software (visited on 05262018)

[38] A W S Inc What Is Amazon Kinesis Video Streams 2018 [Online] Available https docs aws amazon com

kinesisvideostreamslatestdgwhat-is-kinesis-videohtml (visited on 05262018)

[39] U Government ldquoSystems Engineering Fundamentalsrdquo Defence Acquisition University Press no January p 223 2001

ISSN 1872-7565 DOI 101016jcmpb201005002 [Online] Available httpwwwdticmildocscitationsADA387507

[40] L Bass P Clements and R Kazman Software Architecture in Practice 3rd Addison-Wesley Professional 2012 ISBN

0321815734 9780321815736

[41] J Greene and M Stellman Applied Software Project Management 2006 p 324 ISBN 978-0596009489 [Online] Avail-

able httpwwworeillycomcatalogappliedprojectmgmt

[42] S Barber Acceptable application response times vs industry standard 2018 [Online] Available httpssearchsoftwarequality

techtargetcomtipAcceptable-application-response-times-vs-industry-standard (visited on 05282018)

[43] T Burger How Fast Is Realtime Human Perception and Technology | PubNub 2015 [Online] Available httpswww

pubnubcombloghow-fast-is-realtime-human-perception-and-technology (visited on 05282018)

[44] S-t Modeling P Glennie and N Thrift ldquoTime perception modelsrdquo Neuron pp 15 696ndash15 699 1992

[45] M Richards Software Architecture Patterns First edit Heather Scherer Ed OrsquoReilly Media 2015 [Online] Available

httpwwworeillycomprogrammingfreefilessoftware-architecture-patternspdf

[46] C Richardson Microservice Architecture pattern 2017 [Online] Available httpmicroservicesiopatternsmicroservices

html (visited on 12022017)

[47] P Clements F Bachmann L Bass D Garlan J Ivers R Little P Merson R Nord and J Staffor Documenting Software

Architectures Second Boston Pearson Education Inc 2011 ISBN 0-321-55268-7

[48] Object Management Group ldquoUnified Modeling Language v251rdquo no December 2017 [Online] Available http www

omgorgspecUML251

[49] C De La Torre C Maddock J Hampton P Kulikov and M Jones Communication in a microservice architecture 2017

[Online] Available https docs microsoft com en - us dotnet standard microservices - architecture architect -

microservice-container-applicationscommunication-in-microservice-architecture (visited on 04272018)

[50] H Schulzrinne and S Casner ldquoRTP Profile for Audio and Video Conferences with Minimal Controlrdquo 2003 [Online] Avail-

able httpstoolsietforghtmlrfc3551

[51] D Bull Communicating Pictures A Course in Image and Video Coding Elsevier Science 2014 ISBN 9780080993744

[Online] Available httpsbooksgooglebebooksid=PDZOAwAAQBAJ

[52] On-Net Surveillance Systems Inc ldquoMJPEG vs MPEG4 Understanding the differences advantages and disadvantages of

each compression techniquerdquo 2006 [Online] Available wwwonssicom


[53] M M A V Protocol Introduction MAVLink Developer Guide 2013 [Online] Available httpsmavlinkioen (visited on

09142017)

[54] hartmut Schlosser Microservices trends 2017 Strategies tools and frameworks - JAXenter 2017 [Online] Available

httpsjaxentercommicroservices-trends-2017-survey-133265html (visited on 03242018)

[55] A Ronacher Welcome to Flask — Flask Documentation (0.12) 2017 [Online] Available http://flask.pocoo.org/docs/0.12/ (visited on 03/24/2018)

[56] F Reyes PythonDecorators 2017 [Online] Available https wiki python org moin PythonDecorators (visited on

04272018)

[57] Stackshare Companies that use Flask and Flask Integrations 2018 [Online] Available https://stackshare.io/flask (visited on 03/24/2018)

[58] Falcon Falcon - Bare-metal web API framework for Python [Online] Available httpsfalconframeworkorg7B5C

7DsectionAbout (visited on 03242018)

[59] Stackshare Companies that use Falcon and Falcon Integrations 2018 [Online] Available https://stackshare.io/falcon (visited on 03/24/2018)

[60] A Ronacher Nameko for Microservices 2015 [Online] Available httplucumrpocooorg201548microservices-with-

nameko (visited on 03242018)

[61] C Escoffier Building Reactive Microservices in Java 2017 ISBN 9781491986264

[62] C Posta Microservices for Java Developers ISBN 9781491963081

[63] R Dua A R Raja and D Kakadia ldquoVirtualization vs Containerization to support PaaSrdquo in IEEE International Conference

on Cloud Engineering 2014 ISBN 9781479937660 DOI 101109IC2E201441

[64] D Merkel Docker Lightweight Linux Containers for Consistent Development and Deployment 2014 [Online] Available

http delivery acmorg1011452610000260024111600htmlip=1571935 1787B5Camp7Did=26002417B

5Camp7Dacc=ACTIVE20SERVICE7B5Camp7Dkey=D7FC43CABE88BEAA F15FE2ACB4878E3D 4D4702B0C3E38B35

4D4702B0C3E38B357B5Camp7D7B5C_7D7B5C_7Dacm7B5C_7D7B5C_7D=15214915967B5C_

7D (visited on 03192018)

[65] Docker Inc Docker for the Virtualization Admin 2016 p 12

[66] Docker Inc What is a Container 2018 [Online] Available https www docker com what - container (visited on

03242018)

[67] M Helsley LXC Linux container tools 2009 [Online] Available httpswwwibmcomdeveloperworkslinuxlibraryl-

lxc-containers (visited on 05212018)

[68] J Fink Docker a Software as a Service Operating System-Level Virtualization Framework 2014 [Online] Available

http journal code4lib org articles 9669 utm7B5C _7Dsource = feedburner 7B5C amp7Dutm7B5C _

7Dmedium=feed7B5Camp7Dutm7B5C_7Dcampaign=Feed7B5C7D3A+c4lj+ (visited on 03192018)


[69] C. Wang, What is Docker? Linux containers explained, 2017. [Online]. Available: https://www.infoworld.com/article/3204171/linux/what-is-docker-linux-containers-explained.html (visited on 05/21/2018).

[70] CoreOS, Rkt: a security-minded, standards-based container engine. [Online]. Available: https://coreos.com/rkt (visited on 03/24/2018).

[71] F. Xu, X. Liu, and K. Fujimura, "Pedestrian detection and tracking with night vision," IEEE Transactions on Intelligent Transportation Systems, vol. 6, no. 1, pp. 63–71, 2005. ISSN: 1524-9050. DOI: 10.1109/TITS.2004.838222.

[72] H. Nanda and L. Davis, "Probabilistic template based pedestrian detection in infrared videos," IEEE Intelligent Vehicles Symposium, Proceedings, vol. 1, pp. 15–20, 2003. DOI: 10.1109/IVS.2002.1187921.

[73] R. E. Schapire, "Explaining AdaBoost," Empirical Inference: Festschrift in Honor of Vladimir N. Vapnik, pp. 37–52, 2013. DOI: 10.1007/978-3-642-41136-6_5.

[74] P. Viola, M. J. Jones, and D. Snow, "Detecting pedestrians using patterns of motion and appearance," International Journal of Computer Vision, vol. 63, no. 2, pp. 153–161, 2005. DOI: 10.1109/ICCV.2003.1238422.

[75] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016. [Online]. Available: http://www.deeplearningbook.org

[76] J. R. R. Uijlings, K. E. A. Van De Sande, T. Gevers, and A. W. M. Smeulders, "Selective Search for Object Recognition," Tech. Rep., 2012. DOI: 10.1007/s11263-013-0620-5. arXiv: 1409.4842. [Online]. Available: http://www.cs.cornell.edu/courses/cs7670/2014sp/slides/VisionSeminar14.pdf

[77] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Region-Based Convolutional Networks for Accurate Object Detection and Segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 1, pp. 142–158, 2014. ISSN: 01628828. DOI: 10.1109/TPAMI.2015.2437384. arXiv: 1311.2524.

[78] R. Girshick, "Fast R-CNN," in Proceedings of the IEEE International Conference on Computer Vision, vol. 2015 Inter, 2015, pp. 1440–1448. ISBN: 9781467383912. DOI: 10.1109/ICCV.2015.169. arXiv: 1504.08083.

[79] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137–1149, 2016. ISSN: 01628828. DOI: 10.1109/TPAMI.2016.2577031. arXiv: 1506.01497.

[80] K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask R-CNN," arXiv, 2018. arXiv: 1703.06870v3.

[81] J. Dai, Y. Li, K. He, and J. Sun, "R-FCN: Object Detection via Region-based Fully Convolutional Networks," Tech. Rep., 2016. DOI: 10.1109/ICASSP.2017.7952132. arXiv: 1605.06409. [Online]. Available: http://arxiv.org/abs/1605.06409

[82] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," 2015. ISSN: 01689002. DOI: 10.1109/CVPR.2016.91. arXiv: 1506.02640. [Online]. Available: http://arxiv.org/abs/1506.02640

[83] J. Redmon and A. Farhadi, "YOLOv3: An Incremental Improvement," arXiv, 2018. [Online]. Available: https://pjreddie.com/media/files/papers/YOLOv3.pdf

[84] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, "SSD: Single Shot MultiBox Detector," arXiv, 2016. arXiv: 1512.02325v5.


[85] B. Zoph and Q. V. Le, "Neural Architecture Search with Reinforcement Learning," in ICLR, 2017, pp. 1–16. arXiv: 1611.01578v2.

[86] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, "Focal Loss for Dense Object Detection," arXiv, 2018. arXiv: 1708.02002v2.

[87] Facebook Inc., ONNX - About, 2017. [Online]. Available: https://onnx.ai/about (visited on 05/21/2018).

[88] TensorFlow, TensorFlow, 2018. [Online]. Available: https://www.tensorflow.org (visited on 05/21/2018).

[89] J. Huang, V. Rathod, C. Sun, M. Zhu, A. Korattikara, A. Fathi, I. Fischer, Z. Wojna, Y. Song, S. Guadarrama, and K. Murphy, "Speed/accuracy trade-offs for modern convolutional object detectors," arXiv, 2017. arXiv: 1611.10012v3.

[90] J. Redmon, Darknet: Open source neural networks in C, http://pjreddie.com/darknet, 2013–2016.

[91] Microsoft, The Microsoft Cognitive Toolkit | Microsoft Docs, 2018. [Online]. Available: https://docs.microsoft.com/en-us/cognitive-toolkit/index (visited on 05/21/2018).

[92] Docker Inc., Overview of Docker Compose | Docker Documentation, 2018. [Online]. Available: https://docs.docker.com/compose/overview (visited on 04/27/2018).

[93] Docker Inc., Use bridge networks | Docker Documentation, 2018. [Online]. Available: https://docs.docker.com/network/bridge (visited on 04/27/2018).

[94] A. Ronacher, Click Documentation (5.0), 2017. [Online]. Available: http://click.pocoo.org/5 (visited on 04/27/2018).

[95] K. Reitz, Requests: HTTP for Humans — Requests 2.18.4 documentation, 2018. [Online]. Available: http://docs.python-requests.org/en/master (visited on 05/09/2018).

[96] Docker Inc., Docker SDK for Python — Docker SDK for Python 2.0 documentation, 2018. [Online]. Available: https://docker-py.readthedocs.io/en/stable (visited on 05/12/2018).

[97] GStreamer, GStreamer: open source multimedia framework, 2018. [Online]. Available: https://gstreamer.freedesktop.org (visited on 05/13/2018).

[98] E. Walthinsen, filesrc: GStreamer Core Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gstreamer-plugins/html/gstreamer-plugins-filesrc.html (visited on 05/13/2018).

[99] E. Hervey, decodebin: GStreamer Base Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-base-plugins/html/gst-plugins-base-plugins-decodebin.html (visited on 05/13/2018).

[100] W. Taymans, jpegenc: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-jpegenc.html (visited on 05/13/2018).


[101] Axis Communications, rtpjpegpay: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-rtpjpegpay.html (visited on 05/13/2018).

[102] W. Taymans, udpsink: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-udpsink.html (visited on 05/13/2018).

[103] GStreamer, Basic tutorial 3: Dynamic pipelines. [Online]. Available: https://gstreamer.freedesktop.org/documentation/tutorials/basic/dynamic-pipelines.html (visited on 05/13/2018).

[104] W. Taymans, udpsrc: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-udpsrc.html (visited on 05/14/2018).

[105] W. Taymans, rtpjpegdepay: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-rtpjpegdepay.html (visited on 05/14/2018).

[106] A. Loonstra, "Videostreaming with Gstreamer." [Online]. Available: http://mediatechnology.leiden.edu/images/uploads/docs/wt2014_gstreamer.pdf

[107] W. Taymans, jpegdec: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-jpegdec.html (visited on 05/14/2018).

[108] J. Schmidt, autovideosink: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-autovideosink.html (visited on 05/14/2018).

[109] A. Ronacher, Deployment Options — Flask 0.12.4 documentation, 2018. [Online]. Available: http://flask.pocoo.org/docs/0.12/deploying (visited on 05/14/2018).

[110] R. Yasrab, "Mitigating Docker Security Issues," University of Science and Technology of China, Hefei, Tech. Rep. [Online]. Available: https://arxiv.org/pdf/1804.05039.pdf

[111] Lvh, Don't expose the Docker socket (not even to a container), 2015. [Online]. Available: https://www.lvh.io/posts/dont-expose-the-docker-socket-not-even-to-a-container.html (visited on 05/15/2018).

[112] Docker Inc., Use overlay networks | Docker Documentation, 2018. [Online]. Available: https://docs.docker.com/network/overlay/#customize-the-docker_gwbridge-interface (visited on 05/15/2018).

[113] J. W. Davis and M. A. Keck, "A Two-Stage Template Approach to Person Detection in Thermal Imagery," Proc. Workshop on Applications of Computer Vision, 2005. [Online]. Available: http://vcipl-okstate.org/pbvs/bench/papers/wacv05.pdf


[114] J. W. Davis and V. Sharma, "Background-subtraction using contour-based fusion of thermal and visible imagery," Computer Vision and Image Understanding, vol. 106, no. 2-3, pp. 162–182, 2007. DOI: 10.1016/j.cviu.2006.06.010. [Online]. Available: https://web.cse.ohio-state.edu/~davis.1719/Publications/cviu07.pdf

[115] S. Hwang, J. Park, N. Kim, Y. Choi, and I. S. Kweon, "Multispectral Pedestrian Detection: Benchmark Dataset and Baseline," CVPR, 2015. [Online]. Available: https://sites.google.com/site/pedestrianbenchmark

[116] Z. Wu, N. Fuller, D. Theriault, and M. Betke, "A Thermal Infrared Video Benchmark for Visual Analysis," IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2014. DOI: 10.1109/CVPRW.2014.39. [Online]. Available: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.735.2216&rep=rep1&type=pdf

[117] R. Miezianko, Terravic research infrared database.

[118] R. Miezianko, Terravic research infrared database.

[119] S. Z. Li, R. Chu, S. Liao, and L. Zhang, "Illumination Invariant Face Recognition Using Near-Infrared Images," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 4, pp. 627–639, 2007. DOI: 10.1109/TPAMI.2007.1014. [Online]. Available: http://vcipl-okstate.org/pbvs/bench/papers/NIR.pdf

[120] A. Akula, R. Ghosh, S. Kumar, and H. K. Sardana, "Moving target detection in thermal infrared imagery using spatiotemporal information," J. Opt. Soc. Am. A, vol. 30, no. 8, pp. 1492–1501, Aug. 2013. DOI: 10.1364/JOSAA.30.001492. [Online]. Available: http://josaa.osa.org/abstract.cfm?URI=josaa-30-8-1492

[121] R. I. Hammoud, IEEE OTCBVS WS Series Bench. [Online]. Available: http://vcipl-okstate.org/pbvs/bench (visited on 05/18/2018).

[122] Last Post Association, Mission, 2018. [Online]. Available: http://www.lastpost.be/en/the-last-post/mission (visited on 05/18/2018).

[123] FLIR Systems, Inc., FLIR One Pro, 2017. [Online]. Available: https://www.flir.com/globalassets/imported-assets/document/17-1746-oem-flir_one_pro_datasheet_final_v1_web.pdf

[124] R. J. Ramana, Introduction to Camouflage and Deception. Defence Scientific Information & Documentation Centre, pp. 99–164.

[125] A. Bornstein and I. Richter, Microsoft Visual Object Tagging Tool. [Online]. Available: https://github.com/Microsoft/VoTT (visited on 05/20/2018).

[126] F. E. Grubbs, "Procedures for Detecting Outlying Observations in Samples," Technometrics, vol. 11, no. 1, pp. 1–21, Feb. 1969. DOI: 10.1080/00401706.1969.10490657. [Online]. Available: http://www.tandfonline.com/doi/abs/10.1080/00401706.1969.10490657

[127] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A Large-Scale Hierarchical Image Database," in CVPR09, 2009. [Online]. Available: http://www.image-net.org/papers/imagenet_cvpr09.pdf


[128] D. Gupta, Transfer learning & The art of using Pre-trained Models in Deep Learning, 2017. [Online]. Available: https://www.analyticsvidhya.com/blog/2017/06/transfer-learning-the-art-of-fine-tuning-a-pre-trained-model (visited on 05/20/2018).

[129] Docker Inc., docker stats | Docker Documentation, 2018. [Online]. Available: https://docs.docker.com/engine/reference/commandline/stats (visited on 05/24/2018).

[130] M. Gori and A. Tesi, "On the problem of local minima in backpropagation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 14, no. 1, pp. 76–86, 1992. DOI: 10.1109/34.107014.

[131] L. Prechelt, "Early stopping - but when?" in Neural Networks: Tricks of the Trade, G. B. Orr and K.-R. Müller, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 1998, pp. 55–69. ISBN: 978-3-540-49430-0. DOI: 10.1007/3-540-49430-8_3. [Online]. Available: https://doi.org/10.1007/3-540-49430-8_3

[132] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, "The Pascal visual object classes (VOC) challenge," International Journal of Computer Vision, vol. 88, no. 2, pp. 303–338, 2010. ISSN: 09205691. DOI: 10.1007/s11263-009-0275-4.

[133] M. Everingham, S. M. A. Eslami, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, "The Pascal Visual Object Classes Challenge: A Retrospective," International Journal of Computer Vision, vol. 111, no. 1, pp. 98–136, 2014. ISSN: 15731405. DOI: 10.1007/s11263-014-0733-5.

[134] P. Henderson and V. Ferrari, "End-to-end training of object class detectors for mean average precision," Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 10115 LNCS, pp. 198–213, 2017. ISSN: 16113349. DOI: 10.1007/978-3-319-54193-8_13. arXiv: 1607.03476.

[135] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, "Microsoft COCO: Common objects in context," Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 8693 LNCS, no. PART 5, pp. 740–755, 2014. ISSN: 16113349. DOI: 10.1007/978-3-319-10602-1_48. arXiv: 1405.0312.

[136] Docker Inc., Library/docker, 2018. [Online]. Available: https://hub.docker.com/_/docker (visited on 06/01/2018).

[137] Nvidia, Nvidia-docker. [Online]. Available: https://github.com/NVIDIA/nvidia-docker (visited on 05/25/2018).

[138] FLIR, "FLIR One." [Online]. Available: http://www.flir.com/uploadedFiles/Store/Products/FLIR-ONE/3rd-Gen/FLIR-ONE/FLIR-ONE-Gen-3-Datasheet.pdf

[139] FLIR, "FLIR Boson," p. 2, 2016.


Appendix A

Firefighting department email conversations

This appendix contains the email conversations with different firefighting departments in Belgium, as part of an exploration of the functional requirements of an aerial thermal imaging solution. Note that all conversations were translated from Dutch to English.

A1 General email sent to Firefighting departments

This email was sent to the departments mentioned later in this appendix. The following sections contain their replies to this email.

Subject: Firefighting department - Thesis thermal drones

Dear Sir/Madam

My name is Brecht Verhoeve. I am a student Master of Science in Computer Science Engineering at Ghent University. I am contacting your department with reference to the research of my master's dissertation. I am currently researching the applications of thermal cameras in combination with commercial drones. They can create an aerial overview of scenes and objects that often can't be spotted with visual detectors, such as hidden persons, fires, or hot explosives. The eventual goal is to let a computer indicate these objects of interest autonomously on the thermal images of the drone. These images could aid a firefighter with their work.

For this research I have some questions for you.

Functionality

I have listed some functionalities which I believe could be interesting for a firefighter:

• Detection of persons in buildings (find potential victims)

• Detection of hidden fires in buildings (to identify danger zones)

• Detection of fires on vast terrains (forests, industrial terrains)

• Indication of hot explosives

I have two questions:

• Do you agree that these are the most important functions?

• Are there any other functions that you deem important?

Quality of the application

Next to the functionality, the quality of the application is also important. For me the most important aspects are:

• Accuracy: The software must be accurate. There is no room for errors when detecting.

• Speed: The software must operate quickly. An overview must be created quickly to not waste time in case of an emergency.

• Usability: The software must be easy to use.

Once again I have two questions:

• Do you agree with these qualities?

• Are there any other important qualities that you deem important?

I would like to thank you in advance for your time.

Best regards

Brecht Verhoeve

A2 Conversation with Firefighting department of Antwerp Belgium

The answers were given inline. For clarity, they are given explicitly here.

Subject: Re: Firefighting department Antwerp - Thesis thermal drones

Answers can be found in your email.

Best regards

Functionality: Detection of hidden fires in buildings and environments.

Are there any other functions that you deem important? Capture the evolution of a fire with the thermal camera. Visualise incidents during night-time. Capture invisible fires such as hydrogen or methane fires.


A3 Conversation with Firefighting department of Ostend Belgium

The answers were given inline. For clarity, they are given explicitly here.

Subject: Re: Firefighting department Ostend - Thesis thermal drones

Dear Brecht

You can find the answers after the questions in your email.

Best Regards

Functionality: Are there any other functions that you deem important? These are the most important for us at the moment.

Quality of the application: Are there any other important qualities that you deem important? The application must work autonomously.

A4 Conversation with Firefighting department of Courtrai Belgium

Subject: Re: Firefighting department Courtrai - Thesis thermal drones

Dear Brecht

Beneath you will find our answers (next to the already mentioned items).

Functionality

• The detection of persons in a landscape. For example, for missing persons after a traffic accident, searches are conducted in the dark for victims that were catapulted from a vehicle. Today this is done via a thermal camera on the ground, but with a drone this could hold potential benefits. Another example is searching for missing persons in nature reserves. The police sometimes ask for the assistance of firefighters to search the area.

Quality of the application

• The images need to be processed in real time, not after the drone has landed.

The drones must be deployable for multiple purposes.

The interpretation of the images can in the future be important for automatic flight control of drones. Currently there is a European project "3D Safeguard" in which KU Leuven is participating. They are already quite advanced in interpreting the images from a drone to spot persons through smoke. With this information the drone can be redirected. The application can thus use the interpretations of the images to control the drone in flight.

Best regards

A5 Conversation with Firefighting department of Ghent Belgium

Subject: Re: Firefighting department Ghent - Thesis thermal drones

Hi Brecht

I don't know if you've received the previous email, but there you received answers to your questions.

Best regards

Subject: Re: Firefighting department Ghent - Thesis thermal drones

With respect to the functionality I would like to add:

• Measuring the temperature of containers, silos

I agree with the quality of the application. It could be handy to be able to view the application from one or more devices. Everything should have a clear overview. If possible, information and controls should be available on one screen.

I will follow up.

Best regards


Appendix B

Thermal camera specifications

This appendix gives all the specifications for the compared thermal cameras. First, the different cameras, their producing companies, and average retail prices are listed in Table B1. Second, their respective physical specifications are presented in Table B2. Third, the image qualities are presented in Table B3. Fourth, the thermal precisions are presented in Table B4. Fifth, the available interfaces to interact with each camera are presented in Table B5. Sixth, the energy consumption of each camera is presented in Table B6. Seventh, how support is offered when developing for these platforms is presented in Table B7. Finally, auxiliary features are presented in Table B8.
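Read together, the tables can be used to shortlist candidate cameras against a set of requirements. As an illustration only, the minimal Python sketch below encodes a few rows of Tables B1 and B3 as records and selects the cameras that offer radiometry; the data structure and field names are hypothetical, while the values are copied from the tables.

    # Illustrative sketch: a few rows of Tables B1 and B3 as records.
    # The structure is a hypothetical assumption; the values come from
    # the tables in this appendix.
    cameras = [
        {"product": "Wiris 2nd Gen 640", "company": "Workswell", "radiometry": True},
        {"product": "Duo Pro R 640", "company": "FLIR", "radiometry": True},
        {"product": "Vue 640", "company": "FLIR", "radiometry": False},
        {"product": "Boson 640", "company": "FLIR", "radiometry": False},
    ]

    # Shortlist the cameras that offer radiometric measurements.
    radiometric = [c["product"] for c in cameras if c["radiometry"]]
    print(radiometric)  # ['Wiris 2nd Gen 640', 'Duo Pro R 640']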


Product | Company | Price (Euro)
Wiris 2nd Gen 640 | Workswell | 999500
Wiris 2nd Gen 336 | Workswell | 699500
Duo Pro R 640 | FLIR | 640900
Duo Pro R 336 | FLIR | 438484
Duo | FLIR | 94999
Duo R | FLIR | 123999
Vue 640 | FLIR | 268900
Vue 336 | FLIR | 125993
Vue Pro 640 | FLIR | 403218
Vue Pro 336 | FLIR | 230261
Vue Pro R 640 | FLIR | 518456
Vue Pro R 336 | FLIR | 345599
Zenmuse XT 640 | DJI x FLIR | 1181000
Zenmuse XT 336 | DJI x FLIR | 697000
Zenmuse XT 336 R | DJI x FLIR | 939000
Zenmuse XT 640 R | DJI x FLIR | 1423000
One | FLIR | 23799
One Pro | FLIR | 46900
Tau 2 640 | FLIR | 674636
Tau 2 336 | FLIR | 493389
Tau 2 324 | FLIR | 2640
Lepton 3 160 x 120 | FLIR | 25995
Lepton 3 80 x 60 | FLIR | 14338
Boson 640 | FLIR | 122209
Boson 320 | FLIR | 93842
Quark 2 640 | FLIR | 33165
Quark 2 336 | FLIR | 33165
DroneThermal v3 | Flytron | 34115
Compact | Seek Thermal | 27500
CompactXR | Seek Thermal | 28646
Compact Pro | Seek Thermal | 59900
Therm-App | Opgal | 93731
Therm-App TH | Opgal | 295000
Therm-App 25 Hz | Opgal | 199000

Table B1: Compared cameras, their producing companies, and their average retail price.


Product | Weight (g) | Dimensions (mm)
Wiris 2nd Gen 640 | 390 | 135 x 77 x 69
Wiris 2nd Gen 336 | 390 | 135 x 77 x 69
Duo Pro R 640 | 325 | 85 x 813 x 685
Duo Pro R 336 | 325 | 85 x 813 x 685
Duo | 84 | 41 x 59 x 30
Duo R | 84 | 41 x 59 x 30
Vue 640 | 114 | 574 x 4445 x 4445
Vue 336 | 114 | 574 x 4445 x 4445
Vue Pro 640 | 9214 | 574 x 4445 x 4445
Vue Pro 336 | 9214 | 574 x 4445 x 4445
Vue Pro R 640 | 9214 | 574 x 4445 x 4445
Vue Pro R 336 | 9214 | 574 x 4445 x 4445
Zenmuse XT 640 | 270 | 103 x 74 x 102
Zenmuse XT 336 | 270 | 103 x 74 x 102
Zenmuse XT 336 R | 270 | 103 x 74 x 102
Zenmuse XT 640 R | 270 | 103 x 74 x 102
One | 345 | 67 x 34 x 14
One Pro | 365 | 68 x 34 x 14
Tau 2 640 | 72 | 444 x 444 x 444
Tau 2 336 | 72 | 444 x 444 x 444
Tau 2 324 | 72 | 444 x 444 x 444
Lepton 3 160 x 120 | 09 | 118 x 127 x 72
Lepton 3 80 x 60 | 09 | 118 x 127 x 72
Boson 640 | 75 | 21 x 21 x 11
Boson 320 | 75 | 21 x 21 x 11
Quark 2 640 | 8 | 22 x 22 x 12
Quark 2 336 | 8 | 22 x 22 x 12
DroneThermal v3 | 3 | 20 x 20 x 15
Compact | 1417 | 254 x 444 x 203
CompactXR | 1417 | 254 x 444 x 254
Compact Pro | 1417 | 254 x 444 x 254
Therm-App | 138 | 55 x 65 x 40
Therm-App TH | 123 | 55 x 65 x 40
Therm-App 25 Hz | 138 | 55 x 65 x 40

Table B2: Physical specifications.


Product | IR resolution (pixels) | SD resolution (megapixels) | Frequency (Hz) | FOV | Radiometry
Wiris 2nd Gen 640 | 640 x 512 | 192 | not specified | Various | yes
Wiris 2nd Gen 336 | 336 x 256 | 192 | not specified | Various | yes
Duo Pro R 640 | 640 x 512 | 12 | 30 | Various lens | yes
Duo Pro R 336 | 336 x 256 | 12 | 30 | Various lens | yes
Duo | 160 x 120 | 2 | 75 and 83 | 57° x 44° | no
Duo R | 160 x 120 | 2 | 75 | 57° x 44° | yes
Vue 640 | 640 x 512 | 0 | 75 | Various lens | no
Vue 336 | 336 x 256 | 0 | 75 | Various lens | no
Vue Pro 640 | 640 x 512 | 0 | 75 | Various lens | no
Vue Pro 336 | 336 x 256 | 0 | 75 | Various lens | no
Vue Pro R 640 | 640 x 512 | 0 | 75 | Various lens | yes
Vue Pro R 336 | 336 x 256 | 0 | 75 | Various lens | yes
Zenmuse XT 640 | 640 x 512 | 0 | 75 | Various lens | no
Zenmuse XT 336 | 336 x 256 | 0 | 75 | Various lens | no
Zenmuse XT 336 R | 336 x 256 | 0 | 75 | Various lens | yes
Zenmuse XT 640 R | 336 x 256 | 0 | 75 | Various lens | yes
One | 80 x 60 | 15 | 87 | 50° x 38° | yes
One Pro | 160 x 120 | 15 | 87 | 55° x 43° | yes
Tau 2 640 | 640 x 512 | 0 | 75 | Various lens | yes
Tau 2 336 | 336 x 256 | 0 | 75 | Various lens | yes
Tau 2 324 | 324 x 256 | 0 | 76 | Various lens | yes
Lepton 3 160 x 120 | 160 x 120 | 0 | 88 | 56° | available
Lepton 3 80 x 60 | 80 x 60 | 0 | 88 | 56° | no
Boson 640 | 640 x 512 | 0 | 90 | Various lens | no
Boson 320 | 320 x 256 | 0 | 90 | Various lens | no
Quark 2 640 | 640 x 512 | 0 | 9 | Various lens | no
Quark 2 336 | 336 x 256 | 0 | 9 | Various lens | no
DroneThermal v3 | 80 x 60 | 0 | 86 | 25° | no
Compact | 206 x 156 | 0 | 9 | 36° | no
CompactXR | 205 x 156 | 0 | 9 | 20° | no
Compact Pro | 320 x 240 | 0 | 15 | 32° | no
Therm-App | 384 x 288 | 0 | 87 | Various lens | no
Therm-App TH | 384 x 288 | 0 | 87 | Various lens | yes
Therm-App 25 Hz | 384 x 288 | 0 | 25 | Various lens | no

Table B3: Image quality.

IR = InfraRed, SD = Standard, FOV = Field of View.


Product | Sensitivity (mK) | Temperature range (degrees Celsius) | Accuracy (Celsius)
Wiris 2nd Gen 640 | 50 | -25 to +150 / -40 to +550 | 2
Wiris 2nd Gen 336 | 50 | -25 to +150 / -40 to +550 | 2
Duo Pro R 640 | 50 | -25 to +135 / -40 to +550 | 5 / 20
Duo Pro R 336 | 50 | -25 to +135 / -40 to +550 | 5 / 20
Duo | not specified | -40 to +550 | 5
Duo R | not specified | -40 to +550 | 5
Vue 640 | not specified | -58 to +113 | not specified
Vue 336 | not specified | -58 to +113 | not specified
Vue Pro 640 | not specified | -58 to +113 | not specified
Vue Pro 336 | not specified | -58 to +113 | not specified
Vue Pro R 640 | not specified | -58 to +113 | not specified
Vue Pro R 336 | not specified | -58 to +113 | not specified
Zenmuse XT 640 | 50 | -40 to 550 | not specified
Zenmuse XT 336 | 50 | -40 to 550 | not specified
Zenmuse XT 336 R | 50 | -40 to 550 | not specified
Zenmuse XT 640 R | 50 | -40 to 550 | not specified
One | 150 | -20 to 120 | 3
One Pro | 150 | -20 to 400 | 3
Tau 2 640 | 50 | -40 to 550 | not specified
Tau 2 336 | 50 | -40 to 550 | not specified
Tau 2 324 | 50 | -40 to 550 | not specified
Lepton 3 160 x 120 | 50 | 0 to 450 | 5
Lepton 3 80 x 60 | 50 | 0 to 450 | 5
Boson 640 | 40 | 0 to 500 | not specified
Boson 320 | 40 | 0 to 500 | not specified
Quark 2 640 | 50 | -40 to 160 | not specified
Quark 2 336 | 50 | -40 to 160 | not specified
DroneThermal v3 | 50 | 0 to 120 | not specified
Compact | not specified | -40 to 330 | not specified
CompactXR | not specified | -40 to 330 | not specified
Compact Pro | 70 | -40 to +330 | not specified
Therm-App | 70 | 5 to +90 | 3
Therm-App TH | 70 | 0 to 200 | 2
Therm-App 25 Hz | 70 | 5 to +90 | 3

Table B4: Thermal precision.


Product | USB | MAVLink | HDMI
Wiris 2nd Gen 640 | Flash disk | yes | yes
Wiris 2nd Gen 336 | Flash disk | yes | yes
Duo Pro R 640 | Mini-USB | yes | micro-HDMI
Duo Pro R 336 | Mini-USB | yes | micro-HDMI
Duo | Mini-USB | yes | micro-HDMI
Duo R | Mini-USB | yes | micro-HDMI
Vue 640 | Mini-USB | no | no
Vue 336 | Mini-USB | no | no
Vue Pro 640 | Mini-USB | yes | Optional
Vue Pro 336 | Mini-USB | yes | Optional
Vue Pro R 640 | Mini-USB | yes | Optional
Vue Pro R 336 | Mini-USB | yes | Optional
Zenmuse XT 640 | Via DJI drone | Via DJI drone | Via DJI drone
Zenmuse XT 336 | Via DJI drone | Via DJI drone | Via DJI drone
Zenmuse XT 336 R | Via DJI drone | Via DJI drone | Via DJI drone
Zenmuse XT 640 R | Via DJI drone | Via DJI drone | Via DJI drone
Tau 2 640 | no | no | no
Tau 2 336 | no | no | no
Tau 2 324 | no | no | no
Lepton 3 160 x 120 | no | no | no
Lepton 3 80 x 60 | no | no | no
Boson 640 | yes | no | no
Boson 320 | yes | no | no
Quark 2 640 | no | no | no
Quark 2 336 | no | no | no
DroneThermal v3 | no | no | no
Compact | Smartphone storage | no | no
CompactXR | Smartphone storage | no | no
Compact Pro | Smartphone storage | no | no
Therm-App | Smartphone storage | no | no
Therm-App TH | Smartphone storage | no | no
Therm-App 25 Hz | Smartphone storage | no | no

Table B5: Interfaces.


Product | Power consumption (Watt) | Input Voltage
Wiris 2nd Gen 640 | 4 | 6 - 36
Wiris 2nd Gen 336 | 4 | 6 - 36
Duo Pro R 640 | 10 | 50 - 260
Duo Pro R 336 | 10 | 50 - 260
Duo | 22 | 50 - 260
Duo R | 22 | 50 - 260
Vue 640 | 12 | 48 - 60
Vue 336 | 12 | 48 - 60
Vue Pro 640 | 21 | 48 - 60
Vue Pro 336 | 21 | 48 - 60
Vue Pro R 640 | 21 | 48 - 60
Vue Pro R 336 | 21 | 48 - 60
Zenmuse XT 640 | Via DJI drone | Via DJI drone
Zenmuse XT 336 | Via DJI drone | Via DJI drone
Zenmuse XT 336 R | Via DJI drone | Via DJI drone
Zenmuse XT 640 R | Via DJI drone | Via DJI drone
One | approx. 1 h battery lifetime | Battery
One Pro | approx. 1 h battery lifetime | Battery
Tau 2 640 | 13 | 40 - 60
Tau 2 336 | 13 | 40 - 61
Tau 2 324 | 13 | 40 - 62
Lepton 3 160 x 120 | 065 | 31
Lepton 3 80 x 60 | 065 | 31
Boson 640 | 05 | 33
Boson 320 | 05 | 33
Quark 2 640 | 12 | 33
Quark 2 336 | 12 | 33
DroneThermal v3 | 015 | 33 - 5
Compact | Via smartphone | Smartphone
CompactXR | Via smartphone | Smartphone
Compact Pro | Via smartphone | Smartphone
Therm-App | 05 | 5
Therm-App TH | 05 | 5
Therm-App 25 Hz | 05 | 5

Table B6: Energy consumption.


Product | Warranty (years) | User manual | Phone support | Email support | FAQs
Wiris 2nd Gen 640 | Not specified | yes | yes | yes | yes
Wiris 2nd Gen 336 | Not specified | yes | yes | yes | yes
Duo Pro R 640 | 1 | yes | yes | yes | yes
Duo Pro R 336 | 1 | yes | yes | yes | yes
Duo | 1 | yes | yes | yes | yes
Duo R | 1 | yes | yes | yes | yes
Vue 640 | 1 | yes | yes | yes | yes
Vue 336 | 1 | yes | yes | yes | yes
Vue Pro 640 | 1 | yes | yes | yes | yes
Vue Pro 336 | 1 | yes | yes | yes | yes
Vue Pro R 640 | 1 | yes | yes | yes | yes
Vue Pro R 336 | 1 | yes | yes | yes | yes
Zenmuse XT 640 | 05 | yes | yes | yes | yes
Zenmuse XT 336 | 05 | yes | yes | yes | yes
Zenmuse XT 336 R | 05 | yes | yes | yes | yes
Zenmuse XT 640 R | 05 | yes | yes | yes | yes
One | 1 | yes | yes | yes | yes
One Pro | 1 | yes | yes | yes | yes
Tau 2 640 | 1 | yes | yes | yes | yes
Tau 2 336 | 1 | yes | yes | yes | yes
Tau 2 324 | 1 | yes | yes | yes | yes
Lepton 3 160 x 120 | 1 | yes | yes | yes | yes
Lepton 3 80 x 60 | 1 | yes | yes | yes | yes
Boson 640 | 1 | yes | yes | yes | yes
Boson 320 | 1 | yes | yes | yes | yes
Quark 2 640 | 1 | yes | yes | yes | yes
Quark 2 336 | 1 | yes | yes | yes | yes
DroneThermal v3 | not specified | no | no | no | no
Compact | 1 | yes | yes | yes | yes
CompactXR | 1 | yes | yes | yes | yes
Compact Pro | 1 | yes | yes | yes | yes
Therm-App | 1 | yes | yes | yes | yes
Therm-App TH | 1 | yes | yes | yes | yes
Therm-App 25 Hz | 1 | yes | yes | yes | yes

Table B7: Help and support.


Product | Bluetooth | Wi-Fi | GPS | Mobile app | Storage
Wiris 2nd Gen 640 | no | on request | yes | no | yes
Wiris 2nd Gen 336 | no | on request | yes | no | yes
Duo Pro R 640 | yes | no | yes | yes | yes
Duo Pro R 336 | yes | no | yes | yes | yes
Duo | no | no | no | no | yes
Duo R | no | no | no | no | yes
Vue 640 | no | no | no | no | no
Vue 336 | no | no | no | no | no
Vue Pro 640 | yes | no | no | yes | yes
Vue Pro 336 | yes | no | no | yes | yes
Vue Pro R 640 | yes | no | no | yes | yes
Vue Pro R 336 | yes | no | no | yes | yes
Zenmuse XT 640 | Via DJI drone | Via DJI drone | Via DJI drone | yes | yes
Zenmuse XT 336 | Via DJI drone | Via DJI drone | Via DJI drone | yes | yes
Zenmuse XT 336 R | Via DJI drone | Via DJI drone | Via DJI drone | yes | yes
Zenmuse XT 640 R | Via DJI drone | Via DJI drone | Via DJI drone | yes | yes
One | no | no | no | yes | yes
One Pro | no | no | no | yes | yes
Tau 2 640 | no | no | no | no | yes
Tau 2 336 | no | no | no | no | yes
Tau 2 324 | no | no | no | no | yes
Lepton 3 160 x 120 | no | no | no | no | no
Lepton 3 80 x 60 | no | no | no | no | no
Boson 640 | no | no | no | no | no
Boson 320 | no | no | no | no | no
Quark 2 640 | no | no | no | no | no
Quark 2 336 | no | no | no | no | no
DroneThermal v3 | no | no | no | no | no
Compact | no | no | no | yes | yes
CompactXR | no | no | no | yes | yes
Compact Pro | no | no | no | yes | yes
Therm-App | no | no | no | yes | yes
Therm-App TH | no | no | no | yes | yes
Therm-App 25 Hz | no | no | no | yes | yes

Table B8: Auxiliary features.


Appendix C

Last Post thermal dataset summary

The goal of this appendix is to provide a summary of the layout of the Last Post thermal dataset. The data was captured on the following days: the 24th of March 2018, the 2nd of April 2018, the 3rd of April 2018, the 4th of April 2018, the 5th of April 2018, the 9th of April 2018, the 10th of April 2018, the 11th of April 2018, and the 12th of April 2018. For each date a small summary of the contents is made below. The summary consists of a description of the conditions that day and a listing of the video files and their contents.
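Every recording day below is described with the same fields, so each summary can be read as a record with a fixed schema. A minimal Python sketch of such a record is given here; the dictionary layout and field names are hypothetical illustrations, while the example values are taken from the summary of the 24th of March 2018.

    # Hypothetical metadata record for one recording day of the dataset.
    # Only the values are taken from this appendix; the structure itself
    # is an illustrative assumption, not part of the dataset.
    day_summary = {
        "date": "2018-03-24",
        "hours": "19:40 - 20:20",
        "temperature_range_celsius": (5, 12),
        "weather": "Clear",
        "humidity_percent": 76,
        "wind_km_per_hour": 24,
        "precipitation_cm": 0,
        "visibility_km": 14,
        "videos": [
            "flir_20180324T195255.mp4",
            "flir_20180324T195836.mp4",
            "flir_20180324T200421.mp4",
            "flir_20180324T201448.mp4",
            "flir_20180324T202328.mp4",
        ],
    }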

C1 24th of March 2018

Conditions

• Hours: 19:40 - 20:20

• Outside temperature range: 5 degrees Celsius - 12 degrees Celsius

• Clear

• Humidity: 76%

• Wind: 24 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 14 kilometers

Videos

• flir_20180324T195255.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance, and small groups of people. A large crowd gathers on the right of the video.


• flir_20180324T195836.mp4: This video gives an overview of the inside of the Meningate ceremony. Many people can be seen watching the ceremony.

• flir_20180324T200421.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance, and small groups of people. A large crowd is visible on the right side.

• flir_20180324T201448.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance, and small groups of people. A large crowd is visible on the right side. The crowd is leaving the area after the ceremony.

• flir_20180324T202328.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance, and small groups of people. A large crowd is visible on the right side. The crowd is leaving the area after the ceremony.

C2 2nd of April 2018

Conditions

• Hours: 19:40 - 20:20

• Outside temperature range: 9 degrees Celsius - 15 degrees Celsius

• Light rain

• Humidity: 74%

• Wind: 18 kilometers per hour

• Precipitation: 0.4 centimeters

• Visibility: 8.1 kilometers

Videos

• 2018-04-02194733.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance, and small groups of people, sometimes with umbrellas, passing through.

• 2018-04-02194952.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance.


• 2018-04-02195518.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance.

• 2018-04-02201322.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance. Crowds can be seen, as well as people holding umbrellas.

C3 3rd of April 2018

Conditions

• Hours: 20:00 - 20:30

• Outside temperature range: 8 degrees Celsius - 16 degrees Celsius

• Heavy rain

• Humidity: 79%

• Wind: 25 kilometers per hour

• Precipitation: 0.5 centimeters

• Visibility: 10.1 kilometers

Videos

• 2018-04-03 201227.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. People are holding umbrellas due to heavy rain on that day. Due to rain conditions and wind it was difficult to steady the camera, which can be seen in the shaky video.

• 2018-04-03 201727.mp4: In the beginning of the clip the camera is moving towards the other side of the Meningate. From 00:20 onwards the clip is useful. The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. The video shows people leaving from the Meningate towards the busses at the other side of the bridge. Most people are holding umbrellas due to heavy rain that day. The Meningate is in the bottom left of the picture. Several buildings can be seen in the distance. In the bottom right the water of the Kasteelgracht can be seen. Sometimes, in the left of the picture, the wall of the Meningate can be seen.

• 2018-04-03 202311.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. People are holding umbrellas due to heavy rain on that day. Due to rain conditions and wind it was difficult to steady the camera, which can be seen in the shaky video.


C4 4th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 10 degrees Celsius - 14 degrees Celsius

• Cloudy

• Humidity: 87%

• Wind: 18 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 12.9 kilometers

Videos

• 2018-04-04 200052.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. In the video a dense crowd of people slowly starts to form in the bottom right of the picture. Most people are coming from the left of the video to join the crowd.

• 2018-04-04 200728.mp4: This video shows the inside of the Meningate and the ceremony of the last post. Some people are up close in front. The large crowd can be seen through the hall.

• 2018-04-04 200913.mp4: This video shows the inside of the Meningate and the ceremony of the last post. The video switches between MSX mode, visual camera, and thermal camera to show the differences.

• 2018-04-04 202859.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. At the start of the video a crowd is seen in the bottom right. At the 01:00 mark the ceremony has ended and people are exiting the gate and coming onto the crossing. They form two rows to make place for the marching band exiting the Meningate. It can be seen marching through the crowd at the 02:50 mark.

C5 5th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 11 degrees Celsius - 15 degrees Celsius

• Sunny

• Humidity: 77%

• Wind: 11 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 12.9 kilometers

Videos

• 2018-04-05 200217.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. In the video a dense crowd of people slowly starts to form in the bottom right of the picture. Most people are coming from the left of the video to join the crowd. The video shows the 15 minutes before the start of the ceremony.

• 2018-04-04 201838.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. The video shows the first ten minutes after the end of the ceremony. The crowd, which can be seen on the left, leaves towards the square.

C6 9th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 9 degrees Celsius - 10 degrees Celsius

• Light rain

• Humidity: 99%

• Wind: 14 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 8.1 kilometers

Videos

• 2018-04-09 200007.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. People are coming from the left towards the Meningate in the right. Not a lot of people are seen due to rain that day.

• 2018-04-09-202302.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. People are leaving from the right of the Meningate towards the square.

C7 10th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 14 degrees Celsius - 17 degrees Celsius

• Partly cloudy

• Humidity: 52%

• Wind: 13 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 12.9 kilometers

Videos

• 2018-04-10 195029.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. In the video an army unit can be seen on the left. It can be seen that they are standing in a very structured way.

• 2018-04-10 195131.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. In the video an army unit can be seen on the left. It can be seen that they are standing in a very structured way.

• 2018-04-10 195748.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. In the video an army unit can be seen on the left. It can be seen that they are standing in a very structured way. Some people are moving around the crowd.

• 2018-04-10 200122.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. There is a big crowd that can be seen on the right. There are some schools there, so some people are wearing backpacks. It is quite warm and the cafe on the other side of the street has opened up its terrace.

• 2018-04-10 201427.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. In the video an army unit can be seen on the left. It can be seen that they are standing in a very structured way. Some people are moving around the crowd. The image is not rotated well; a well-rotated image is found in 2018-04-10 201427_rotated.mp4.

• 2018-04-10 201515.mp4: This video shows the inside of the Meningate and the ceremony. A traditional 'Haka' from New Zealand soldiers can be heard in the video; the soldiers are difficult to spot due to thermal blurring, because many people are standing in one place.

• 2018-04-10 202558.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. In the video an army unit can be seen on the left. It can be seen that they are standing in a very structured way. Some people are moving around the crowd. At the 02:00 mark the army unit marches to the end of the bridge. Very dense crowds can be seen afterwards. At 08:25 the army unit marches in a straight line towards the Meningate.

C8 11th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 12 degrees Celsius - 16 degrees Celsius

• Sunny

• Humidity: 63%

• Wind: 14 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 12.9 kilometers

Videos

• 2018-04-11 200140.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen.

• 2018-04-11 200601.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. A small crowd can be seen on the left of the video.

• 2018-04-11 201554.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. People start leaving the ceremony from the 01:20 mark.


C9 12th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 11 degrees Celsius - 14 degrees Celsius

• Rain

• Humidity: 94%

• Wind: 8 kilometers per hour

• Precipitation: 0.1 centimeters

• Visibility: 3.2 kilometers

Videos

• 2018-04-12 195219.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. Sometimes the hedge of the wall where the film was made is visible due to the shaky camera. Not many people are seen due to the rain.

• 2018-04-12 201526.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. Sometimes the hedge of the wall where the film was made is visible due to the shaky camera. Not many people are seen due to the rain. People are leaving towards the right.

• Introduction
  • Drones
  • Concepts
    • Thermal Cameras
    • Aerial thermal imaging
  • Problem statement
    • Industry adoption
    • Crowd monitoring
    • Goal
    • Related work
  • Outline
• System Design
  • Requirements analysis
    • Functional requirements
    • Non-functional requirements
  • Patterns and tactics
    • Layers
    • Event-driven architecture
    • Microkernel
    • Microservices
    • Comparison of patterns
  • Software architecture
    • Static view
    • Dynamic views
    • Deployment views
• State of the art and technology choice
  • Thermal camera options
    • Parameters
    • Comparative analysis
  • Microservices frameworks
    • Flask
    • Falcon
    • Nameko
    • Vert.x
    • Spring Boot
  • Deployment framework
    • Containers
    • LXC
    • Docker
    • rkt
  • Object detection algorithms and frameworks
    • Traditional approaches
    • Deep learning
    • Frameworks
  • Technology choice
    • Thermal camera
    • Microservices framework
    • Deployment framework
    • Object detection
• Proof of Concept implementation
  • Goals and scope of prototype
  • Overview of prototype
    • General overview
    • Client interface
    • Stream
    • Producer and Consumer
    • Implemented plugins
  • Limitations and issues
    • Single client
    • Timeouts
    • Exception handling and testing
    • Docker security issues
    • Docker bridge network
    • Single stream
    • Number of containers per plugin
• Mob detection experiment
  • Last Post thermal dataset
    • Last Post ceremony
    • Dataset description
  • Object detection experiment
    • Preprocessing
    • Training
• Results and evaluation
  • Framework results
    • Performance evaluation
    • Interoperability evaluation
    • Modifiability evaluation
  • Mob detection experiment results
    • Training results
    • Metrics
    • Validation results
• Conclusion and future work
  • Conclusion
  • Future work
    • Security
    • Implementing a detection plugin
    • Different deployment configurations
    • Multiple streams with different layouts
    • Implementing the plugin distribution service (Remote Producer/Consumer)
    • Using high performance microservices backbone frameworks
    • New object detection models and datasets specifically for thermal images
• Firefighting department email conversations
  • General email sent to Firefighting departments
  • Conversation with Firefighting department of Antwerp, Belgium
  • Conversation with Firefighting department of Ostend, Belgium
  • Conversation with Firefighting department of Courtrai, Belgium
  • Conversation with Firefighting department of Ghent, Belgium
• Thermal camera specifications
• Last Post thermal dataset summary
  • 24th of March 2018
  • 2nd of April 2018
  • 3rd of April 2018
  • 4th of April 2018
  • 5th of April 2018
  • 9th of April 2018
  • 10th of April 2018
  • 11th of April 2018
  • 12th of April 2018


Modifiable drone thermal imaging analysis framework for mob detection during

open-air events

Brecht Verhoeve

Supervisors: Prof. dr. Bruno Volckaert, Prof. dr. ir. Filip De Turck

Counsellors: Pieter-Jan Maenhaut, Jerico Moeyersons

Master's dissertation submitted in order to obtain the academic degree of

Master of Science in Computer Science Engineering

Department of Information Technology

Chair: Prof. dr. ir. Bart Dhoedt

Faculty of Engineering and Architecture

Ghent University

Academic year 2017-2018

Abstract

Drones and thermal cameras are used in combination for many applications, such as search and rescue, fire fighting, etc. Due to vendor-specific hardware and software, applications are hard to develop and maintain. Therefore, a modifiable drone thermal imaging analysis framework is proposed that enables users to more easily develop such image processing applications. It implements a microservice plugin architecture. Users can build image processing applications with the framework by building media streams using plugins that are either thermal cameras or image analysis software modules. The framework is evaluated by building a proof of concept implementation, which is tested on the initial requirements. It achieves the modifiability and interoperability requirements at the cost of performance and security. The framework is applied for detecting large crowds of people (mobs) during open-air events. A new dataset containing thermal images of such mobs is presented, on which a YOLOv3 neural network is trained. The trained model is able to detect mobs on new thermal images in real-time, achieving frame rates of 55 frames per second when deployed on a modern GPU.

Keywords: Drone thermal imaging, Video streaming, Framework, Microservices, Object detection, Plugin

Modifiable Drone Thermal Imaging Analysis Framework for Mob Detection during Open-Air Events

Brecht Verhoeve

Supervisors: prof. dr. Bruno Volckaert, prof. dr. ir. Filip De Turck, Pieter-Jan Maenhaut, Jerico Moeyersons

Abstract—Drones and thermal cameras are used in combination for many applications, such as search and rescue, fire fighting, etc. Due to vendor-specific hardware and software, applications are hard to develop and maintain. Therefore, a modifiable drone thermal imaging analysis framework is proposed that enables users to more easily develop such image processing applications. It implements a microservice plugin architecture. Users can build image processing applications with the framework by building media streams using plugins that are either thermal cameras or image analysis software modules. The framework is evaluated by building a proof of concept implementation, which is tested on the initial requirements. It achieves the modifiability and interoperability requirements at the cost of performance and security. The framework is applied for detecting large crowds of people (mobs) during open-air events. A new dataset containing thermal images of such mobs is presented, on which a YOLOv3 neural network is trained. The trained model is able to detect mobs on new thermal images in real-time, achieving frame rates of 55 frames per second when deployed on a modern GPU.

Keywords—Drone thermal imaging, Video streaming, Framework, Microservices, Object detection, Plugin

I INTRODUCTION

THROUGHOUT history, having an overview of the environment from high viewpoints held many benefits. The advent of drones and advanced cameras provides low-cost aerial imaging that creates numerous opportunities for new applications. Traditional visual cameras for the visible light spectrum offer high quality images, but are limited to daytime or artificially lighted scenes. Thermal cameras measure thermal radiation of objects in a scene and thus can operate in utter darkness, revealing information not visible to the normal eye [1]. The combination of drones and thermal cameras is used in many different applications, such as geography [2, 3], agriculture [4], search and rescue [5], wildlife monitoring [6], disaster response [7], maintenance [8], etc.

Several vendors offer thermal camera products, some specifically designed for drone platforms. These cameras often use different image formats, color schemes and interfaces [9–11]. This leads to issues if applications want to change the camera that is used, or when the camera is no longer supported by the vendor, because different software needs to be built to interact with the new camera, which often is a non-negligible cost. This leads to a problem called vendor lock-in, which makes customers dependent on a certain vendor, as they cannot switch product without making substantial costs; a problem already very tangible for cloud-based applications [12]. Applications across various fields often have slightly different functional and non-functional requirements. For this dissertation, several Belgian fire fighting departments were asked for requirements for a thermal drone platform application. It quickly became clear that they had various problems that needed to be solved, such as finding hot explosives, measuring temperatures in containers, identifying hot entrances, detecting invisible methane fires, finding missing persons, etc. Some use cases need to be evaluated in real-time (during fires), others need to be extremely accurate. A hypothetical application should be able to quickly implement new detection and analysis features to meet all these requirements. Due to the current solutions not being modifiable enough, current applications built with aerial thermal imaging technology remain in the niche use case for which they were initially developed [13]. Applications could benefit from a backbone framework to aid in this modifiability/interoperability issue, aiding in developing end-to-end solutions connecting thermal cameras to various analysis/detection modules.

This dissertation explores the requirements for such a framework and its potential software architecture. To test the viability of the architecture, a proof of concept prototype is implemented and evaluated against the initial requirements. To verify if it aids in developing detection applications, the specific use case of detecting large crowds of people, so-called mobs, during open-air events is investigated. Monitoring crowds during open-air events is important, as mobs can create potentially dangerous situations through bottlenecks, blocking escape routes, etc. Through monitoring and detecting these mobs, these situations can be avoided before they become problematic [14, 15].

The remainder of this paper is organized as follows. Section II presents similar projects on the topic of modifiable imaging analysis frameworks and thermal object detection. Section III presents the requirements of the framework and the software architecture designed from these requirements. Section IV presents the implementation of the framework prototype. The mob detection experiment is described in Section V. The tests and results to evaluate the framework and the mob detection experiment are presented in Section VI. Finally, Section VII draws conclusions from this research and indicates where future efforts in this field should go.

II RELATED WORK

The Irish start-up DroneSAR [16] developed a search-and-rescue (SAR) drone platform allowing users to stream live images and video from a drone as it conducts a search for missing persons. The platform works with any camera, visual and thermal, but focuses on drones from vendor DJI. DroneSAR's industry partner Amazon introduced the Amazon Kinesis Video Streams platform in January 2018 as a new service for the Amazon Web Services (AWS) cloud platform. It allows users to stream live video from devices to the AWS cloud and build applications for real-time video processing [17]. The VIPER project by EAVISE, KU Leuven researched how thermal and visual video images could be used for real-time detection of persons, using object detection algorithms such as deep learning [18]. The framework presented in this work combines elements from all three of these examples.

III REQUIREMENTS AND SOFTWARE ARCHITECTURE

A Functional requirements

Three general actors are identified for the framework: an end-user that wants to build an image processing application for a specific use case, camera developers integrating new cameras into the framework, and analysis/detection module developers integrating new analysis/detection algorithms into the framework so that end-users can use them to build their applications. An end-user should be able to construct an image processing application using the integrated cameras and analysis modules. He should be able to adapt this application with the framework for new use cases. Camera and analysis module developers should be able to add and distribute new integrations for new cameras and analysis modules to the framework. This allows the end-users to focus on the use case, not the technical details of the hardware platforms or algorithms, and have a wider selection of hardware and algorithms.

B Non-functional requirements

Interoperability, modifiability and performance are identified as the architecturally significant requirements. Interoperability specifies that the framework should be able to interact with various cameras and analysis software modules via interfaces. The number of systems the framework can successfully interact with adds to the business value of the framework, as end-users can use more devices via the framework to build applications. The framework needs to be extendable with new thermal cameras and analysis modules. Applications built with the framework should be modifiable to integrate new hardware and software. The available hardware on-site for use cases such as forest fire monitoring is not always powerful enough to support heavy image analysis software. The framework should be able to deploy in a distributed fashion, to allow more computationally expensive operations to be executed on more powerful remote devices. Some use cases require real-time streaming of video and manipulation of these video streams, which should be supported for the framework to be relevant.

C Software architecture

An architectural pattern analysis based on the requirements presented in Section III-B was conducted, from which a combination of the microservices and microkernel pattern was selected as the base pattern for the software architecture. The microkernel pattern enables the framework to be extended via a plugin system. The microservices pattern enables interoperability through the use of standard interfaces and a communication scheme. It also allows for the framework to be deployed in a distributed fashion [19–21]. The software architecture is documented in static views, sequence diagrams and deployment views. Figure 1 presents an overview of the architecture.

Fig. 1. Component-connector overview of the framework. The clear components are the core components of the framework that each user needs to install to use the framework. The colored components are used for the distribution of plugins.

End-users interact with the framework via the Client Interface, a graphical or textual interface. Commands are forwarded to the Stream module, which manages the layout and state of the components in the media pipeline (stream) that the end-user builds for his use case. To activate and place the components in a certain layout, the Stream module connects with the Producer and Consumer components that control the plugins that process the media. Producer Plugins are devices that produce media, such as thermal cameras. Consumer Plugins process and consume media, such as analysis software and displays. Once a stream is established, the plugins forward media to each other in the layout specified by the Stream module. New support for cameras and analysis modules can be added as plugins to the Producer/Consumer Distribution components, which distribute this software so that end-users can download and install the plugins. Each module in the architecture is a microservice, allowing for a distributed deployment and enabling the modifiability requirements through interchangeable modules. Cameras and analysis modules are realized as plugins for the Producer/Consumer modules, implemented as a microkernel. This allows the framework to easily build new streams for different use cases and interchange components when needed.
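To make this flow concrete, the sketch below shows how a Stream module could drive two plugins over their control APIs to form a camera-to-detector pipeline. The endpoint names, ports and payloads are illustrative assumptions, not the prototype's actual API.

```python
import requests

# Hypothetical plugin endpoints; in the framework every plugin is a
# microservice exposing a small REST control API (see Section III-C.1).
CAMERA = "http://localhost:5001"    # Producer plugin (thermal camera)
DETECTOR = "http://localhost:5002"  # Consumer plugin (analysis module)

# Link the plugins: the camera forwards its media to the detector.
resp = requests.put(f"{CAMERA}/listeners", json={"listeners": [DETECTOR]})
resp.raise_for_status()

# Start the stream by setting both plugins to the PLAY state,
# the consumer first so no frames are dropped.
for plugin in (DETECTOR, CAMERA):
    requests.put(f"{plugin}/state", json={"state": "PLAY"}).raise_for_status()
```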

C.1 Plugin model

Figure 2 depicts the model of a general framework plugin. The plugin defines three interfaces: a source media endpoint to receive media from different sources, a listener endpoint to forward the processed media to other listening plugins, and an API for framework control. The framework uses the API to change which sources and listeners a plugin has, and its state. By linking plugins together through the sources and listeners resources, the framework can build a media processing stream. Producer Plugins have no sources, since they produce media. The states are used to stop and start the media processing of the plugins in the stream. The REST paradigm is selected to build this API, with state, sources and listeners resources that need to be minimally implemented. A plugin has the following states: INACTIVE, PLAY, STOP and PAUSE. Figure 3 depicts the state transition diagram. A plugin implements the visible states STOP, PAUSE and PLAY, describing whether the media process of the plugin is stopped, paused or processing respectively. The INACTIVE state is only visible to the framework, as it indicates that there is no active process that runs the plugin. This is the initial state of a plugin in the framework. When a plugin is added to a stream, the plugin microservice is started, transitions to the STOP state and waits for commands.

Fig. 2. Schematic overview of a plugin.

Fig. 3. State transition diagram of a plugin.
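As an illustration, a minimal plugin control API with the state and listeners resources could look as follows. This is a sketch built with Flask (the prototype's REST framework, see Section IV); the prototype's exact routes and payloads may differ.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# In-memory plugin resources; every plugin minimally implements these.
plugin = {"state": "STOP", "sources": [], "listeners": []}

# Transitions visible to the plugin itself (INACTIVE is framework-only).
VALID = {"STOP": {"PLAY"}, "PLAY": {"PAUSE", "STOP"}, "PAUSE": {"PLAY", "STOP"}}

@app.route("/state", methods=["GET"])
def get_state():
    return jsonify(state=plugin["state"])

@app.route("/state", methods=["PUT"])
def set_state():
    new_state = request.get_json()["state"]
    if new_state not in VALID[plugin["state"]]:
        return jsonify(error="illegal transition"), 409
    plugin["state"] = new_state  # start/pause/stop the media process here
    return jsonify(state=new_state)

@app.route("/listeners", methods=["PUT"])
def set_listeners():
    # The framework links plugins into a stream by rewriting this resource.
    plugin["listeners"] = request.get_json()["listeners"]
    return jsonify(listeners=plugin["listeners"])

if __name__ == "__main__":
    app.run(port=5000)
```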

C.2 Network topology and communication protocol

The microservices of the framework and the plugins need a communication protocol to exchange commands and video. For sending the commands, the HTTP/TCP protocol is used: a synchronous protocol that blocks on the response of a request. This ensures that commands are always received and acknowledged, increasing the reliability of the communication at the cost of performance [22]. The asynchronous RTP/UDP protocol is selected to transfer media between the plugins, to ensure low-latency video transfer between plugins and enable real-time video streams. The recommended codec for transmitting video media is MJPEG, which transmits video frames as separately encoded JPEG images [23]. Other codecs such as MPEG-4 encode only certain frames as keyframes, and the other frames as B-frames that encode differences from the keyframe [24]. This implies that when receiving images from a stream, a keyframe must first be received before the video can be decoded. Using MJPEG, plugins receiving frames can directly perform analysis on each keyframe and do not have to wait for additional frames. An example of this network topology for a stream with 2 Producer plugins and 3 Consumer plugins is depicted in Figure 4.

Fig. 4. Network topology. The full lines represent HTTP/TCP communications, the dashed lines RTP/UDP communications.
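An MJPEG-over-RTP link of this kind can be sketched with standard GStreamer elements, as below. The pipelines are illustrative (a test source stands in for a thermal camera) and are not taken from the prototype.

```python
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst, GLib

Gst.init(None)

# Sender (e.g. inside a Producer plugin): JPEG-encode each frame and
# packetize it as RTP/JPEG over UDP.
sender = Gst.parse_launch(
    "videotestsrc ! jpegenc ! rtpjpegpay ! udpsink host=127.0.0.1 port=5000"
)

# Receiver (e.g. inside a Consumer plugin): depacketize, decode, display.
receiver = Gst.parse_launch(
    'udpsrc port=5000 caps="application/x-rtp,media=video,'
    'encoding-name=JPEG,payload=26" ! rtpjpegdepay ! jpegdec ! autovideosink'
)

for pipeline in (receiver, sender):
    pipeline.set_state(Gst.State.PLAYING)

GLib.MainLoop().run()  # stream until interrupted
```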

IV PROTOTYPE IMPLEMENTATION

The goal of the prototype implementation is to provide a proof of the concept architecture and meet the requirements presented in Section III. The core framework components are implemented; the distribution components are left out of scope, as they focus primarily on the distribution of supported plugins. The core microservices as well as the plugins are implemented using the Docker software containerization framework [25]. Containers virtualize on the operating system and allow for portable, lightweight software environments for processes, with a minor performance overhead. Using this technology, the core modules and plugins can be deployed in a local and distributed fashion, and be set up and shut down easily [26]. The microservice containers communicate via the protocols presented in Section III-C.2. The REST APIs are built with the Flask framework [27], a lightweight Python web development framework ideal for prototyping. The Producer/Consumer components need access to the Docker daemon running on the Docker host in order to spin up and shut down Docker containers containing the plugin processes. This is achieved by mounting the Docker client socket in the Producer/Consumer containers. This gives the container root access to the host, a significant security threat [28, 29]. Two sample plugins were implemented: Filecam, a plugin that produces video read in from a file, and Display, a plugin that forwards media to the display of the local device. The plugins transmit media using the video streaming framework GStreamer [30].
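The socket mount mentioned above looks roughly as follows in a Compose file. This is a hedged sketch: the service and image names are assumptions, not the prototype's actual docker-compose.yml.

```yaml
version: "3"
services:
  producer:
    image: framework/producer   # assumed image name
    volumes:
      # Gives the container control over the host's Docker daemon,
      # i.e. root-equivalent access: the security threat noted above.
      - /var/run/docker.sock:/var/run/docker.sock
  consumer:
    image: framework/consumer   # assumed image name
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
```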

V MOB DETECTION

A Dataset

Several publicly available datasets for thermal images exist [31–34]. None of these include large crowds of people, so a new dataset called the Last Post dataset was created. It consists of thermal video captured at the Last Post ceremony in Ypres, Belgium [35]. The videos were captured using the FLIR One Pro thermal camera for Android [36], using the Iron color scheme. Two main scenes are present in the dataset, depicted in Figure 5. Mobs are present in the thermal images but not in the visual images, due to the images being made on separate days. The images used for the experiment were manually annotated, outliers were removed, and the dataset was randomly split in a training and validation set.

(a) Thermal view of the square. (b) Visual view of the square. (c) Thermal view of the bridge. (d) Visual view of the bridge.

Fig. 5. Last Post dataset main scenes.

B Model

Detecting and classifying objects of interest in images is known as the object detection problem in machine learning [37]. Several object detection algorithms and frameworks have been implemented in the past years. A distinction is made between traditional models [31, 38–40], deep learning two-stage networks [41–46] and deep learning dense networks [47–49]. The traditional and two-stage methods make predictions relatively slowly (order of seconds on GPU) when compared to the dense networks (order of milliseconds on GPU) [47]. Since the goal is to use the framework in real-time use cases, the latter is preferred. The YOLOv3 model is selected, as it achieves state-of-the-art prediction performance, can make real-time predictions, and is available via the open source neural network framework darknet [50, 51]. The model is pre-trained on the ImageNet dataset [52]. The model is trained on an NVIDIA GeForce GTX 980 GPU and optimizes the SSE loss using batch gradient descent [50]. To select the best weights, the average Intersection over Union (IoU) and mean Average Precision (mAP) [53] are calculated on predictions on the validation set. The weights that achieve the highest mAP are selected as the final weights.
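The IoU criterion used for weight selection is straightforward to compute for axis-aligned boxes; a minimal sketch, assuming boxes given as (x1, y1, x2, y2) corner tuples:

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Two partially overlapping boxes: intersection 25, union 175.
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # ~0.143
```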

VI RESULTS

A Framework

To evaluate the framework, acceptance tests for the requirements from Section III were conducted. Common framework operations, such as manipulating and building a stream, have an average execution time of 0.84 seconds with a standard deviation of 0.37 seconds. Less common operations, such as deactivating a plugin, starting up the framework and shutting down the framework, have an average execution time of 3.58, 8.40 and 24.02 seconds respectively, with standard deviations of 4.67, 0.50 and 0.48 seconds respectively. Deactivating plugins (STOP to INACTIVE transitions) takes a long time, as the container running the process needs to be removed. Real-time streaming could not be tested, due to the GStreamer framework having no readily available testing endpoints. However, when streaming and displaying a video with the framework, human users could not differentiate between a streaming video and a video played using a native media player, making it plausible that the framework streams in real-time. Great care must be taken when building plugins, as their processing speed has a direct impact on the real-time streaming performance. Interoperability is achieved with the REST APIs and plugin model presented in Section III-C. The interoperability is tested by having the framework exchange information with a mock plugin implementing the specified interface and counting the number of correct exchanges. The average successful exchange ratio is 99.998%. The framework can install and detect new plugins at runtime, achieving runtime modifiability at plugin level. Different deployment schemes were not tested for the prototype.

B Mob detection

The weights generated at the 15700th training epoch achieved the highest mAP value, 90.52%, on the validation set. For comparison, performance of other models on benchmark datasets achieves an average mAP of 74.8% [54]. The reason the model is achieving such high values is that the validation set is temporally correlated with the training set, as both sets are extracted from videos in which frames have a temporal correlation. Performance when predicting on new datasets will be worse. Figure 6 depicts some predictions of the model. When predicting on a video, the model generated predictions at an average frame rate of 55 frames per second on a GPU.

Fig. 6. Model predictions on the validation set.

VII CONCLUSION AND FUTURE WORK

In this dissertation, a modifiable drone thermal imaging analysis framework is proposed to allow end-users to build flexible video processing pipelines using different thermal cameras and analysis modules. The framework implements a microservice container plugin architecture. Users can build image processing applications with the framework by building media streams using plugins that are either thermal cameras or image analysis software modules. The framework is evaluated by building a proof of concept implementation, which is tested on the initial requirements. The proposed framework achieves the modifiability and interoperability requirements at the cost of performance and security. The framework is applied for detecting large crowds of people (mobs) during open-air events. A new dataset containing thermal images of such mobs is presented, on which a YOLOv3 neural network is trained. The trained model is able to detect mobs on new thermal images in real-time, achieving frame rates of 55 frames per second when deployed on a modern GPU. Some extensions to this research are: deploying a detection model using the framework, testing the other deployment configurations, testing the framework with end-users in practice, and building new object detection models specifically for thermal images.

REFERENCES

[1] R. Gade and T. B. Moeslund, “Thermal cameras and applications: a survey,” Machine Vision and Applications, vol. 25, pp. 245–262, 2014.
[2] M. C. Harvey, J. V. Rowland and K. M. Luketina, “Drone with thermal infrared camera provides high resolution georeferenced imagery of the Waikite geothermal area, New Zealand,” 2016.
[3] S. Amici, M. Turci, S. Giammanco, L. Spampinato and F. Giulietti, “UAV Thermal Infrared Remote Sensing of an Italian Mud Volcano,” vol. 2, pp. 358–364, 2013.
[4] J. Bendig, A. Bolten and G. Bareth, “Introducing a low-cost mini-UAV for thermal- and multispectral-imaging,” 2012.
[5] A. J. Rivera, A. D. Villalobos, J. C. Monje, J. A. Marinas and C. M. Oppus, “Post-disaster rescue facility: Human detection and geolocation using aerial drones,” IEEE Region 10 Annual International Conference Proceedings/TENCON, pp. 384–386, 2017.
[6] P. Christiansen, K. A. Steen, R. N. Jørgensen and H. Karstoft, “Automated detection and recognition of wildlife using thermal cameras,” Sensors (Basel, Switzerland), vol. 14, pp. 13778–93, Jul. 2014.
[7] S. Chowdhury, A. Emelogu, M. Marufuzzaman, S. G. Nurre and L. Bian, “Drones for disaster response and relief operations: A continuous approximation model,” 2017.
[8] Workswell, “Pipeline inspection with thermal diagnostics,” 2016.
[9] DJI, “Zenmuse H3 - 2D.”
[10] Workswell, “Applications of WIRIS - Thermal vision system for drones.”
[11] Therm-App, “Therm-App - Android-apps op Google Play,” 2018.
[12] B. Satzger, W. Hummer, C. Inzinger, P. Leitner and S. Dustdar, “Winds of change: From vendor lock-in to the meta cloud,” IEEE Internet Computing, vol. 17, no. 1, pp. 69–73, 2013.
[13] J. Divya, “Drone Technology and Usage: Current Uses and Future Drone Technology,” 2017.
[14] B. Steffen and A. Seyfried, “Methods for measuring pedestrian density, flow, speed and direction with minimal scatter,” Physica A: Statistical Mechanics and its Applications, vol. 389, pp. 1902–1910, May 2010.
[15] M. Wirz, T. Franke, D. Roggen, E. Mitleton-Kelly, P. Lukowicz and G. Tröster, “Inferring crowd conditions from pedestrians' location traces for real-time crowd monitoring during city-scale mass gatherings,” Proceedings of the Workshop on Enabling Technologies: Infrastructure for Collaborative Enterprises, WETICE, pp. 367–372, 2012.
[16] L.-L. Slattery, “DroneSAR wants to turn drones into search-and-rescue heroes,” 2017.
[17] Amazon Web Services Inc., “What Is Amazon Kinesis Video Streams?,” 2018.
[18] T. Goedemé, “Projectresultaten VLAIO TETRA-project,” tech. rep., KU Leuven, Louvain, 2017.
[19] L. Bass, P. Clements and R. Kazman, Software Architecture in Practice. Addison-Wesley Professional, 3rd ed., 2012.
[20] M. Richards, Software Architecture Patterns. O'Reilly Media, first ed., 2015.
[21] C. Richardson, “Microservice Architecture pattern,” 2017.
[22] C. De La Torre, C. Maddock, J. Hampton, P. Kulikov and M. Jones, “Communication in a microservice architecture,” 2017.
[23] On-Net Surveillance Systems Inc., “MJPEG vs MPEG4: Understanding the differences, advantages and disadvantages of each compression technique,” 2006.
[24] D. Bull, Communicating Pictures: A Course in Image and Video Coding. Elsevier Science, 2014.
[25] Docker Inc., “Docker - Build, Ship and Run Any App, Anywhere,” 2018.
[26] D. Merkel, “Docker: Lightweight Linux Containers for Consistent Development and Deployment,” 2014.
[27] A. Ronacher, “Welcome to Flask: Flask Documentation (0.12),” 2017.
[28] Lvh, “Don't expose the Docker socket (not even to a container),” 2015.
[29] R. Yasrab, “Mitigating Docker Security Issues,” tech. rep., University of Science and Technology of China, Hefei.
[30] GStreamer, “GStreamer: open source multimedia framework,” 2018.
[31] J. W. Davis and M. A. Keck, “A Two-Stage Template Approach to Person Detection in Thermal Imagery,” Proc. Workshop on Applications of Computer Vision, 2005.
[32] S. Hwang, J. Park, N. Kim, Y. Choi and I. S. Kweon, “Multispectral Pedestrian Detection: Benchmark Dataset and Baseline,” CVPR, 2015.
[33] Z. Wu, N. Fuller, D. Theriault and M. Betke, “A Thermal Infrared Video Benchmark for Visual Analysis,” IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2014.
[34] S. Z. Li, R. Chu, S. Liao and L. Zhang, “Illumination Invariant Face Recognition Using Near-Infrared Images,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 4, pp. 627–639, 2007.
[35] Last Post Association, “Mission,” 2018.
[36] FLIR, “FLIR One Pro.”
[37] E. Alpaydin, Introduction to Machine Learning. MIT Press, 3rd ed., 2014.
[38] F. Xu, X. Liu and K. Fujimura, “Pedestrian detection and tracking with night vision,” IEEE Transactions on Intelligent Transportation Systems, vol. 6, no. 1, pp. 63–71, 2005.
[39] H. Nanda and L. Davis, “Probabilistic template based pedestrian detection in infrared videos,” IEEE Intelligent Vehicles Symposium Proceedings, vol. 1, pp. 15–20, 2003.
[40] R. Appel, S. Belongie, P. Perona and P. Dollár, “Fast Feature Pyramids for Object Detection,” PAMI, vol. 36, no. 8, pp. 1–14, 2014.
[41] J. R. R. Uijlings, K. E. A. Van De Sande, T. Gevers and A. W. M. Smeulders, “Selective Search for Object Recognition,” tech. rep., 2012.
[42] R. Girshick, J. Donahue, T. Darrell and J. Malik, “Region-Based Convolutional Networks for Accurate Object Detection and Segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 1, pp. 142–158, 2014.
[43] R. Girshick, “Fast R-CNN,” in Proceedings of the IEEE International Conference on Computer Vision, pp. 1440–1448, 2015.
[44] S. Ren, K. He, R. Girshick and J. Sun, “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137–1149, 2016.
[45] K. He, G. Gkioxari, P. Dollár and R. Girshick, “Mask R-CNN,” arXiv, 2018.
[46] J. Dai, Y. Li, K. He and J. Sun, “R-FCN: Object Detection via Region-based Fully Convolutional Networks,” tech. rep., 2016.
[47] J. Redmon, S. Divvala, R. Girshick and A. Farhadi, “You Only Look Once: Unified, Real-Time Object Detection,” 2015.
[48] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu and A. C. Berg, “SSD: Single Shot MultiBox Detector,” arXiv, 2016.
[49] T.-Y. Lin, P. Goyal, R. Girshick, K. He and P. Dollár, “Focal Loss for Dense Object Detection,” arXiv, 2018.
[50] J. Redmon and A. Farhadi, “YOLOv3: An Incremental Improvement,” arXiv, 2018.
[51] J. Redmon, “Darknet: Open source neural networks in C,” http://pjreddie.com/darknet, 2013–2016.
[52] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li and L. Fei-Fei, “ImageNet: A Large-Scale Hierarchical Image Database,” in CVPR09, 2009.
[53] M. Everingham, S. M. A. Eslami, L. Van Gool, C. K. I. Williams, J. Winn and A. Zisserman, “The Pascal Visual Object Classes Challenge: A Retrospective,” International Journal of Computer Vision, vol. 111, no. 1, pp. 98–136, 2014.
[54] A. Ouaknine, “Review of Deep Learning Algorithms for Object Detection,” 2018.


Contents

1 Introduction 1
1.1 Drones 1
1.2 Concepts 2
1.2.1 Thermal Cameras 2
1.2.2 Aerial thermal imaging 2
1.3 Problem statement 2
1.3.1 Industry adoption 2
1.3.2 Crowd monitoring 3
1.3.3 Goal 4
1.3.4 Related work 4
1.4 Outline 4
2 System Design 5
2.1 Requirements analysis 5
2.1.1 Functional requirements 5
2.1.2 Non-functional requirements 6
2.2 Patterns and tactics 11
2.2.1 Layers 12
2.2.2 Event-driven architecture 12
2.2.3 Microkernel 12
2.2.4 Microservices 13
2.2.5 Comparison of patterns 13
2.3 Software architecture 15
2.3.1 Static view 15
2.3.2 Dynamic views 22
2.3.3 Deployment views 23
3 State of the art and technology choice 27
3.1 Thermal camera options 27
3.1.1 Parameters 27
3.1.2 Comparative analysis 30
3.2 Microservices frameworks 31
3.2.1 Flask 31
3.2.2 Falcon 33
3.2.3 Nameko 33
3.2.4 Vert.x 33
3.2.5 Spring Boot 34
3.3 Deployment framework 34
3.3.1 Containers 34
3.3.2 LXC 35
3.3.3 Docker 35
3.3.4 rkt 35
3.4 Object detection algorithms and frameworks 36
3.4.1 Traditional approaches 36
3.4.2 Deep learning 37
3.4.3 Frameworks 39
3.5 Technology choice 41
3.5.1 Thermal camera 41
3.5.2 Microservices framework 41
3.5.3 Deployment framework 41
3.5.4 Object detection 41
4 Proof of Concept implementation 43
4.1 Goals and scope of prototype 43
4.2 Overview of prototype 43
4.2.1 General overview 43
4.2.2 Client interface 45
4.2.3 Stream 46
4.2.4 Producer and Consumer 46
4.2.5 Implemented plugins 48
4.3 Limitations and issues 51
4.3.1 Single client 51
4.3.2 Timeouts 51
4.3.3 Exception handling and testing 51
4.3.4 Docker security issues 51
4.3.5 Docker bridge network 52
4.3.6 Single stream 52
4.3.7 Number of containers per plugin 52
5 Mob detection experiment 53
5.1 Last Post thermal dataset 53
5.1.1 Last Post ceremony 53
5.1.2 Dataset description 54
5.2 Object detection experiment 56
5.2.1 Preprocessing 56
5.2.2 Training 56
6 Results and evaluation 58
6.1 Framework results 58
6.1.1 Performance evaluation 58
6.1.2 Interoperability evaluation 60
6.1.3 Modifiability evaluation 62
6.2 Mob detection experiment results 62
6.2.1 Training results 63
6.2.2 Metrics 63
6.2.3 Validation results 64
7 Conclusion and future work 67
7.1 Conclusion 67
7.2 Future work 69
7.2.1 Security 69
7.2.2 Implementing a detection plugin 69
7.2.3 Different deployment configurations 70
7.2.4 Multiple streams with different layouts 70
7.2.5 Implementing the plugin distribution service (Remote Producer/Consumer) 70
7.2.6 Using high performance microservices backbone frameworks 70
7.2.7 New object detection models and datasets specifically for thermal images 70
A Firefighting department email conversations 81
A.1 General email sent to Firefighting departments 81
A.2 Conversation with Firefighting department of Antwerp, Belgium 82
A.3 Conversation with Firefighting department of Ostend, Belgium 83
A.4 Conversation with Firefighting department of Courtrai, Belgium 83
A.5 Conversation with Firefighting department of Ghent, Belgium 83
B Thermal camera specifications 85
C Last Post thermal dataset summary 94
C.1 24th of March 2018 94
C.2 2nd of April 2018 95
C.3 3rd of April 2018 96
C.4 4th of April 2018 97
C.5 5th of April 2018 97
C.6 9th of April 2018 98
C.7 10th of April 2018 99
C.8 11th of April 2018 100
C.9 12th of April 2018 101


List of Figures

2.1 Use case diagram 7
2.2 Overview of the framework software architecture 16
2.3 Framework network topology 17
2.4 Client Interface detailed view 17
2.5 Stream detailed view 18
2.6 Stream model 18
2.7 Plugin model 19
2.8 Plugin state transition diagram 20
2.9 Component-connector diagrams of the Producer and Consumer module 21
2.10 Producer and Consumer Distribution component-connector diagrams 22
2.11 Add plugin sequence diagram 23
2.12 Link plugins sequence diagram 24
2.13 Deployment diagrams 26
3.1 Thermal image and MSX image of a dog 28
3.3 Rethink IT: Most used tools and frameworks for microservices, results [54] 32
3.4 Containers compared to virtual machines [66] 36
4.1 filecam GStreamer pipeline 49
4.2 local plugin GStreamer pipeline 50
5.1 Last Post ceremony panorama 54
5.2 Last Post filming locations 54
5.3 Main scenes in the Last Post dataset 55
5.4 Outliers 57
6.1 Average training loss per epoch 64
6.2 Validation metrics per epoch 65
6.3 Predictions of the model on images in the validation set 66
7.1 GStreamer pipeline for a plugin with a detection model 69


List of Tables

2.1 Performance utility tree 8
2.2 Interoperability utility tree 9
2.3 Modifiability utility tree 10
2.4 Usability utility tree 11
2.5 Security utility tree 11
2.6 Availability utility tree 12
2.7 Architecture pattern comparison 14
6.1 Acceptance tests results summary 59
6.2 Performance test statistics summary, measured in seconds 60
6.3 Resource usage of the framework in several conditions 61
6.4 Total size of framework components 61
6.5 Interoperability tests results (S: Source, L: Listener) 62
B.1 Compared cameras, their producing companies and their average retail price 86
B.2 Physical specifications 87
B.3 Image quality (IR: InfraRed, SD: Standard, FOV: Field of View) 88
B.4 Thermal precision 89
B.5 Interfaces 90
B.6 Energy consumption 91
B.7 Help and support 92
B.8 Auxiliary features 93


List of Listings

1 Minimal Flask application 32
2 Vert.x example 33
3 Spring Boot example 34
4 docker-compose.yml snippet of the prototype 44
5 Mounting the Docker socket on the container 47
6 Starting a plugin container 47
7 Dynamic linking of the decodebin and jpegenc 50


List of Abbreviations

ACF Aggregated Channel Features

AMQP Advanced Message Queuing Protocol

API Application Programming Interface

AS Availability Scenario

ASR Architecturally Significant Requirement

CLI Command Line Interface

CNN Convolutional Neural Networks

CRUD Create Read Update Destroy

DNS Domain Name System

FR Functional Requirement

GPU Graphical Processing Unit

H High

HTTP Hyper Text Transfer Protocol

ICF Integral Channel Features

IoU Intersection over Union

IS Interoperability Scenario

IT Interoperability Tactic

JVM Java Virtual Machine

L Low


LXC Linux Containers

M Medium

mAP mean Average Precision

MJPEG Motion-JPEG

MS Modifiability Scenario

MSX Multi Spectral Dynamic Imaging

MT Modifiablity Tactic

NFR Non-Functional Requirement

ONNX Open Neural Network Exchange Format

OS Operating System

PS Performance Scenario

PT Performance Tactic

QAR Quality Attribute Requirement

REST Representational State Transfer

RNN Recurrent Neural Network

RPN Region Proposal Network

RTP Real-time Transport Protocol

SS Security Scenario

SSE Sum of Squared Errors

SVM Support Vector Machine

TCP Transmission Control Protocol

UDP User Datagram Protocol

UI User Interface

US Usability Scenario

YOLO You Only Look Once


Chapter 1

Introduction

Throughout history, having an overview of the environment from high viewpoints held many benefits. Early civilizations used hills to monitor their surroundings, population and spot possible attackers. The discovery of flight meant that environments could now be viewed from a bird's-eye view, offering even more visibility and revealing much more of the world below. Recently, a much smaller type of aircraft was developed: the drone. Ranging from large plane-like to almost insect-like devices and having a wide variety of uses, drones are quickly taking over the sky. Drones would not be as effective without proper cameras providing a detailed view on the world below. With digital video cameras offering superb quality for steadily decreasing costs, almost every scene can be captured in great detail. However, these cameras are limited to the visible light spectrum, which hinders drones to operate in all circumstances, like nightly flights. Thermal cameras measure the emitted heat of a scene and can reveal information not visible to the eye, such as hidden persons or animals, pipelines, malfunctioning equipment, etc. The combination of these two technologies certainly holds many exciting opportunities for the future.

1.1 Drones

Drones are flying robots that can fly remotely or autonomously and don't carry a human operator. They can carry a variety of payloads: video cameras, delivery parcels, fluid containers, sensors, lights, but also lethal explosives [1].

Drones are classified in different categories based on varying parameters, such as physical characteristics (diameter, weight), aerial movement techniques, application domains, etc. Based on diameter, drones are classified as smart dust (1 mm to 0.25 cm), pico air vehicles (0.25 cm - 2.5 cm), nano air vehicles (2.5 cm - 15 cm), micro air vehicles (15 cm - 1 m), micro unmanned aerial vehicles (1 m - 2 m) and unmanned aerial vehicles (2 m and larger). Often depending on their diameter, the weight of these devices ranges from less than a gram up to more than 2000 kg. Drones have different flight techniques, such as propulsion engines with wings, rotors in various amounts, flapping wings and even balloons. They are used for all kinds of purposes, ranging from search and rescue missions, environmental protection, delivery, recon, etc. Hassanalian et al. provide an excellent overview of most types of drones [2].

Due to the increasing interest in commercial drone platforms [3], a variety of payloads were developed specifically tailored for these aerial robots, such as gimbals to mount action video cameras [4], gimbals for delivering packets [5] and thermal imaging platforms [6].

1.2 Concepts

1.2.1 Thermal Cameras

Thermal cameras are passive sensors that capture the infrared radiation emitted by all objects with a temperature above absolute zero (0 K). In contrast to visible light cameras, thermal cameras do not depend on an external energy source for the visibility and colors of objects or scenes. This makes captured images independent of the illumination, colors, etc. Furthermore, images can be captured in the absence of visible light [7]. Originally, thermal camera technology was developed for night vision purposes for the military, and the devices were very expensive. Later, the technology was commercialized and has developed quickly over the last few decades, resulting in better and cheaper cameras [7]. This led to access for a broader public, and the technology is now introduced to a wide range of different applications, such as building inspection, gas detection, industrial appliances, medicinal science, agriculture, fire detection, surveillance, etc. [7]. Thermal cameras are now being mounted on drones to give an aerial thermal overview.

1.2.2 Aerial thermal imaging

Aerial thermal imaging is defined as the creation of thermal images using a flying device. This dissertation focuses on the usage of drones for aerial thermal imaging. There are many applications for aerial thermal imaging. Some examples are geography [8, 9], agriculture [10, 11], search and rescue operations [12], wildlife monitoring [13], forest monitoring [14, 15], disaster response [16], equipment and building maintenance [17–20], etc. In the past few years, several industry players have developed thermal cameras specifically aimed at these drone applications. Examples are FLIR [6], Workswell [21] and TEAX Technology [22].

1.3 Problem statement

1.3.1 Industry adoption

The implementation of thermal cameras on drone platforms faces some issues for wide adoption by the industry. Several vendors offer thermal camera products, some specifically designed for drone platforms, that often implement different image formats, color schemes and interfaces (e.g. [23–25]). This leads to issues if users want to modify their applications by changing the camera that is used, because the application must implement new software to interact with the camera, or when the camera is no longer supported by the vendor, leaving the application with outdated hardware and software. This leads to a problem called vendor lock-in, which makes customers dependent on a certain vendor, as they cannot switch product without making substantial costs; a problem already very tangible for cloud-based applications today [26].

Applications across various fields often have different functional and non-functional requirements. Some applications have hard real-time deadlines (such as firefighting, search and rescue, security, etc.) that must be respected, other applications require highly accurate predictions (e.g. person detection, agriculture, etc.). A single application domain can even have many different use cases.

Several firefighting departments in Belgium were contacted to get an overview of their needs for a drone thermal imaging application. It quickly became clear they had various detection problems, such as finding missing persons, locating hot explosives, measuring temperatures in silos, detecting invisible methane fires, etc. Equipment also wears down more quickly due to usage in harsh environments, such as fires in close proximity. A drone thermal application for them needs to be able to exchange functionality and hardware easily, and have high performance constraints to deliver value for them. The email conversations can be read in Appendix A.

Other drone thermal imaging applications are mostly only used in the niche domain for which they were developed, because they aren't designed for flexibility [27]. These proprietary applications have some disadvantages: the development and support potentially has a large cost, vendor lock-in can occur when products are no longer supported, security issues could arise and customization is difficult [28, 29]. Applications could benefit from a backbone framework to aid in this modifiability/interoperability issue, aiding in developing end-to-end solutions connecting thermal cameras to various analysis/detection modules for various use cases.

1.3.2 Crowd monitoring

Festivals and other open air events are popular gatherings that attract many people. For every event organizer it is important to ensure safety and avoid incidents. Large groups of people, so-called mobs, can create potentially dangerous situations through bottlenecks, blocking escape routes, etc. Therefore, having the ability to monitor crowds and predict their behavior is very important to avoid such scenarios. Data can be obtained by evaluating video footage from past comparable events or real time video monitoring of current events [30]. By analyzing this footage, potentially dangerous situations can be avoided by acting on the mob formation, and safety regulations can be improved to help planning future events. Vision-based approaches face several limitations: mounted cameras cannot capture elements outside of their field of view, can't see in some conditions (for example during night time), and it is difficult to infer information from the raw footage [31].

Thermal cameras could help for crowd monitoring, because they can operate in any condition. Having precise and detailed object recognition for the images produced by these cameras is crucial to extract information correctly. In this context, classifying images is not satisfactory: localization of the objects contained within the images is needed. This problem is known as object detection [32]. There are several challenges for object detection in thermal images: the image quality is very low when compared to visible light images, there is a lack of color and texture information, temperature measures are relative measures, etc. This makes extracting discriminative information from these images difficult [33]. Most efforts towards object detection on thermal images have gone towards human detection. Most of the proposed algorithms focus on feature extraction using the Aggregated Channel Features technique and boosting algorithms for learning [33–35]. Novel approaches make use of so-called deep learning with neural networks, which achieve very good results given enough data [36].


1.3.3 Goal

The goal of the dissertation is to explore the requirements of the possible backbone framework suggested in Section 1.3.1 and its potential software architecture. The architecture is evaluated by building a proof of concept implementation of the framework and evaluating it against the proposed requirements. To verify its use in developing drone thermal imaging applications, the specific mob-detection use case is investigated.

1.3.4 Related work

The Irish start-up DroneSAR [37] developed a search-and-rescue (SAR) drone platform allowing users to stream live images and video from a drone as it conducts a search for missing persons. The platform works with any camera, visual and thermal, but focuses on drones from vendor DJI. DroneSAR's industry partner Amazon introduced the Amazon Kinesis Video Streams platform in January 2018 as a new service for the Amazon Web Services (AWS) cloud platform. It allows users to stream live video from devices to the AWS cloud and build applications for real-time video processing [38]. The VIPER project by EAVISE, KU Leuven researched how thermal and visual video images could be used for real-time detection of persons, using object detection algorithms such as deep learning [36]. The framework presented in this work combines elements from all three of these examples.

1.4 Outline

The remainder of this dissertation is organized as follows. Chapter 2 presents the requirements for the framework and the software architecture. Chapter 3 explores several state of the art technologies that can serve as backbone technologies for the framework. To test the viability of the software architecture, a prototype is implemented; Chapter 4 presents the different aspects of this prototype. Chapter 5 describes the experiment that is conducted to research the detection of mobs in thermal images. The results of both the framework and the detection experiment are presented and evaluated in Chapter 6. Finally, the conclusion and future research efforts are presented in Chapter 7.


Chapter 2

System Design

Finding out what users actually expect from a software system, and what makes it valuable for them, is of key importance for the success of that system. This chapter first explores the functional and non-functional requirements of the hypothetical framework suggested in Chapter 1, to find out what makes building the framework worthwhile. Well known architectural patterns enable certain software requirements very well and can be used for building the software architecture of the framework. The framework software architecture combines some of these patterns and is presented in several documents.

2.1 Requirements analysis

Requirements are the stated life-cycle customer needs and objectives for the system, and they relate to how well the system will work in its intended environment. They are those aspects of the framework that will provide value to the users.

2.1.1 Functional requirements

Functional requirements (FR) describe the necessary task, action or activity that must be accomplished by the system, often captured in use cases and/or user stories [39, 40]. Use cases provide a summary of the features described in the user stories. Several external people and/or systems, defined as actors, interact with the framework to achieve a certain goal [40]. Three actors are identified for the framework: an end-user that uses the framework in order to build an image processing application for a specific use case, such as the ones described in Section 1.2.2; a camera developer who creates support software for a specific thermal camera for the framework, so that the end-user can buy and use their product; and an analysis software developer that creates analysis software for a specific use case (tracking objects, detecting objects, etc.), so that the end-user can use their software to build his¹ application. The camera and analysis software developers are generalized to an actor called plugin developer, who develops plugins to extend the functionality of the framework. These plugins are the building blocks with which the end-user can build image processing applications.

¹ To avoid unnecessary gender specific pronoun clutter, the male pronoun is used by default.

The general user scenario for the framework proceeds as follows. An end-user wants to build an image processing application,

e.g. to detect fires in a landscape using a drone. He has a thermal camera for this and has read about hot-spot detection in video. The user looks for a plugin for the framework that can read video from his thermal camera, and for a plugin that does the hot-spot detection. If the user finds these plugins, he can add them to the framework and use them for the application he is building. He connects both plugins with the framework in a specific order to finish his application. For this simple example, the thermal camera plugin is connected to the hot-spot detection plugin, so that video coming from the thermal camera is transmitted to the detection plugin to find the fires in the landscape. The plugins in the application, and the specific order in which they are connected, is defined as a stream. This stream should be easily modifiable if additional or other functionalities are required. Suppose that the thermal camera delivers very high quality video and the detector cannot work with this, as it can only operate on low quality images. The end-user searches for a plugin that scales the high quality video down to an accepted quality for the detector. This plugin is placed in between the thermal camera and the detector, and the application can work again. By continuously adding plugins to the framework, the number of possible applications that can be built with the framework increases, making the framework useable for more aerial thermal imaging use cases.

Instead of developing the application from scratch, users can use the already implemented plugins to build the applications in an ad hoc fashion. Because of this, the development time for such applications can be reduced, and users can switch hardware and/or algorithms easily. The FRs are summarized in a use case diagram that connects each actor with their respective requirements and the relationship among them [40], depicted in Figure 2.1. Trivial functionalities, such as launching and shutting down the framework, are omitted. The red use cases represent use cases to extend the functionality of the framework, the blue use cases represent use cases for building streams, and the white use cases modify the media processing of the stream. Some use cases depend on others: the blue and white use cases work with plugins of the framework, so their prerequisite use case is "Add plugin", as a plugin must be a part of the framework for a user to use it; the "(Un)Link plugins" and "Stop/Pause/Play stream" use cases depend on "Add plugins to stream", as a stream must contain plugins before they can be manipulated.

2.1.2 Non-functional requirements

A non-functional requirement (NFR) specifies how the framework is supposed to be, or in what manner it should execute its functionality [41]. These qualifications typically cover business and system quality requirements. A distinction is made between quality attribute requirements (QAR) and constraints. QARs are qualifications of the FRs or of the overall product, e.g. how fast a certain function must be executed, or how resilient it must be to erroneous input. They are closely related to business requirements, which are specifications that, once delivered, provide value to the actors [40]. The QARs are captured in a utility tree [40] that has a root node representing the system. This root node is elaborated by listing the major QARs that the system is required to exhibit. Each QAR is subdivided into more specific QARs. To make the specific requirements unambiguous and testable, a scenario for the system or a specific function is written, and the scenarios are evaluated against business value and architectural impact [40]. A QAR can have High (H), Medium (M) or Low (L) business value and architectural impact respectively. The business value is defined as the value for the end user if the QAR is enabled: High designates a must-have requirement, Medium is for a requirement which is important but would not lead to project failure, and Low describes a nice to have QAR, but not something that is worth much effort. Architectural impact defines how much the architecture must be designed towards the QAR to enable it: High means that meeting this QAR will profoundly affect the architecture, Medium means that meeting this QAR will somewhat affect the architecture, and Low means that meeting this QAR will have little effect on the architecture. The following QARs are discussed: performance, interoperability, modifiability, usability, security and availability.

Figure 2.1: Use case diagram

Performance

Performance indicates the framework's ability to meet timing requirements. It characterizes the events that can occur and the framework's time-based response to those events. Latency is defined as the time between the arrival of the stimulus and the system's response to it [40]. The system latency is the latency between the initialization of an action sequence and the first change to the system noticeable by the user. Streaming latency is defined as the time between the arrival of a video frame and the arrival of the next video frame. The jitter of the response is the allowable variation in latency. Scalability is the number of users that can use the framework at the same time. The utility tree is presented in Table 2.1. The software industry has not defined a quantified 'good latency' for end-users, but a 4 second latency rule is often used as a rule-of-thumb [42]. The average response time for general framework commands should then be less than 2 seconds, with a standard deviation of 1 second, ensuring most execution times respect the 4 second bound. As stated in Chapter 1, some use cases require real-time video streaming, such as fire fighting. The notion of low latency real-time video loosely defines that video should be streamed almost simultaneously: if a camera is filming and a human user does not notice a latency between the video of the camera and the real world, the video stream is considered real-time. Real-time is thus a human time perception, and for visual inputs this bound is as low as 13 milliseconds. Anything above 13 milliseconds becomes noticeable, anything above 100 milliseconds hinders human performance [43, 44]. However, the framework focusses on the use of thermal cameras, most of which don't produce frames faster than 8 frames per second, or 125 milliseconds per frame (see Section 3.1). More expensive cameras can shoot at 25 frames per second, corresponding to a latency of 40 milliseconds, and this bound is selected for the streaming latency, with a standard deviation of 20 milliseconds, remaining below the frame rate of less expensive cameras. The number of users that can use the framework at the same time is assumed to be low, as current aerial thermal image applications are operated by only one user or a few. The assumption is that a maximum of five users can use the framework at the same time. All of these requirements are quantified as relatively 'good' values. These bounds should be evaluated for user satisfaction by having users use a prototype of the framework in practice.

satisfaction by having users use a prototype of the framework in practice

Attribute refinement / Id / Quality attribute scenario:

Latency, PS-1: The average execution time of all framework commands does not exceed 2 seconds. (H, M)
Latency, PS-2: A playing stream should have an upper limit of 40 ms streaming latency. (H, H)
Jitter, PS-3: The average standard deviation of the execution time of all framework commands should not exceed 1 second under normal operation. (H, M)
Jitter, PS-4: The average standard deviation in streaming latency should not exceed 20 ms under normal operation. (H, H)
Scalability, PS-5: The system should be usable by five users at the same time. (M, M)

Table 2.1: Performance utility tree

Interoperability

Interoperability is the degree to which two or more independently developed systems can usefully exchange meaningful information via interfaces in a particular context [40]. The framework will interoperate with cameras and analysis modules via the framework plugins. Henceforth, the camera and analysis plugins will be referred to as a Producer plugin and a Consumer plugin. A Producer plugin is thus a plugin that represents a camera that produces video, and a Consumer plugin a plugin that represents a module that processes or consumes video. The framework will thus interact with the Producer and Consumer plugins, with which the framework exchanges requests to link them together, control their media process, etc. The more correct exchanges there are between the two, the better the user can use the plugin for building applications with the framework. This QAR is quantified by the ratio of requests made by the framework that were correctly interpreted by the plugin (successful requests) to the total number of requests during a runtime of the framework [40]. Intuitively, one argues that the framework must achieve perfect interoperability, with a perfect exchange success rate of 100%. Reality however tends to not agree with perfection, and it can never be guaranteed that exchanges will always be correct. Therefore, it is better to aim for a good interoperability measure and prepare for failed exchanges, instead of naively assuming the framework will be perfect. An exchange success rate of 99.99% is selected; the motivation for this bound is as follows. A plugin is assumed to be always correct up to the first mistake, after which the plugin is faulty and the fault needs to be identified and ensured that it won't occur again. An exchange success rate of 99.99% means that if 10,000 plugins are installed and used by the framework, only one will fail during uptime. For one plugin during framework uptime, the mean time between failures is then 10,000 exchanges. It is suspected that this amount of exchanges is very high for normal framework use. Because the possibility of faulty exchanges is acknowledged, the framework will need to implement a fallback mechanism to compensate. The utility tree is presented in Table 2.2.

Attribute refinement | Id | Quality attribute scenario
Syntactic interoperability | IS-1 | The framework exchanges requests with a Producer plugin (known at runtime) with a success ratio of 99.99%. Incorrect requests are undone by the framework and logged. (H, H)
Syntactic interoperability | IS-2 | The framework exchanges requests with a Consumer plugin (known at runtime) with a success ratio of 99.99%. Incorrect requests are undone by the framework and logged. (H, H)

Table 2.2: Interoperability utility tree

Modifiability

Modifiability is the cost and risk of changing functionality of the system [40]. One of the most important values of the framework is the modifiability of the supported thermal cameras and analysis modules. The framework needs to be extensible with new functionalities by enabling developers to add their support software in the form of a plugin. End-users should be able to modify the components that they use for their image processing applications easily and quickly, to allow for interchangeable hardware and software and to quickly set up new applications. Modifiability is defined in two environments: runtime, the periods during which the system is up and running, and downtime, the periods during which the system is not active. The utility tree is presented in Table 2.3.

To enable users to choose the extensions they need, the framework requires a distribution service that contains all plugins available for the framework, from which a user can select and install plugins for their local version of the framework. Adding new plugins to the distribution service should not affect versions of the framework installed by the user. When a user adds a plugin from the distribution service to his version of the framework, the framework should reload at most once before making the plugin usable, for user comfort. Deployability is defined as the different device configurations that specify how the framework can be deployed. If the framework can be deployed in different fashions, this can increase the value for the end-user. Consider a fire fighting use case in which a forest fire is monitored on site. Computationally powerful devices might not be available on site, so moving some media-processing plugins to a remote server or cloud could still allow usage of the framework. Perhaps the device processing the media is already remote, for example a drone on security patrol; in this case, access via a remote device such as a smartphone is desirable. This leads to the deployment configurations described in the utility tree.

Attribute refinement | Id | Quality attribute scenario
Run time modifiability | MS-1 | Support for a new Producer plugin should be added to the distribution service within one day, without the framework having to restart. (H, H)
Run time modifiability | MS-2 | Support for a new Consumer plugin should be added to the distribution service within one day, without the framework having to restart. (H, H)
Run time modifiability | MS-3 | End-users should be able to extend their framework with new functionalities by installing new Consumer and Producer plugins. (H, H)
Run time modifiability | MS-4 | End-users should be able to modify the plugins used to build their stream. (H, H)
Down time modifiability | MS-5 | New Producer plugins can be installed to the local framework at runtime; the framework may reload at most once before the plugin is usable. (H, H)
Down time modifiability | MS-6 | New Consumer plugins can be installed to the local framework at runtime; the framework may reload at most once before the plugin is usable. (H, H)
Deployability | MS-7 | The system should be deployable on a combination of a smartphone and a cloud/remote server environment. (H, H)
Deployability | MS-8 | The system should be deployable on a personal computer or laptop. (H, H)
Deployability | MS-9 | The system should be deployable on a smartphone, laptop and cloud environment. (H, H)

Table 2.3: Modifiability utility tree

Usability

Usability indicates how easy it is for the user to accomplish a desired task and the kind of user support the system provides. Learnability indicates how easy it is for a user to gain knowledge on how to use the framework. Errors are the number of errors a user can make when trying to execute certain functions [40]. The utility tree is presented in Table 2.4.

Security

Security is a measure of the system's ability to protect data and information from unauthorized access while still providing access to users and systems that are authorized. An action taken against the system with the intention to cause it harm is called an attack. Security has three main characteristics. Confidentiality is the property that data or services are protected from unauthorized access. Integrity is the property that data or services are protected from unauthorized manipulation. Availability is the property of the system maintaining its functionality during an attack. Authentication verifies the identities of the parties of an interaction, checks if they are truly who they claim to be, and gives or revokes access [40]. Security is important for the framework if it is deployed on multiple devices that use a public network to communicate. The utility tree is presented in Table 2.5.

Availability

Availability in a general context (not only security) refers to how available the software is to carry out its functionality. Downtime is a measure of the time that the system is unavailable to carry out its functions. Availability is specified for the part of the framework that distributes the plugins. The utility tree is presented in Table 2.6.

Attribute refinement | Id | Quality attribute scenario
Downtime | AS-1 | The system should be up 99.5% per year. This means the system has an allowed scheduled downtime of 43 hours and 30 minutes per year for maintenance. (M, L)
Downtime | AS-2 | The maximal duration of the interval during which the system is unavailable is 3 hours. (M, L)
Network | AS-3 | If there is no active network connection, the local device can be used for operation of the framework. (H, H)

Table 2.6: Availability utility tree


Attribute refinement | Id | Quality attribute scenario
Learnability | US-1 | A user should be able to learn how to build an image processing application in at most one hour. (H, L)
Learnability | US-2 | An experienced developer should be able to start developing a Consumer plugin for the system within one day. (H, L)
Learnability | US-3 | An experienced developer should be able to start developing a Producer plugin for the system within one day. (H, L)
Errors | US-4 | A user should not make more than 3 errors to build an image processing application. (H, L)

Table 2.4: Usability utility tree

Attribute refinement | Id | Quality attribute scenario
Confidentiality | SS-1 | Streams created by a user can only be accessed by that user and not by any other entity. (H, L)
Integrity | SS-2 | Streams can't be manipulated without authorization by the user that made the streams. (H, L)
Availability | SS-3 | During an attack, the core functionality is still available to the user. (H, M)
Authentication | SS-4 | Users should authenticate with the system to perform functions. (H, L)
Authentication | SS-5 | Developers should authenticate their plugins before adding them to the framework. (H, L)

Table 2.5: Security utility tree

Architecturally significant requirements

Architecturally Significant Requirements (ASRs) are the requirements that are the most important to realize according to business value and that have the most impact on the architecture. From the utility trees and the measures of the quality attribute scenarios, the QARs modifiability, interoperability and performance are identified as ASRs.

2.2 Patterns and tactics

An architectural pattern is a package of design decisions that is found repeatedly in practice, has known properties that permit reuse, and describes a class of architectures. Architectural tactics are simpler than patterns and typically use just a single structure or computational mechanism; they are meant to address a single architectural force. Tactics are the "building blocks" of design, and an architectural pattern typically comprises one or more tactics [40]. Based on the ASRs, several tactics are listed in Table 2.7 that are used for the base pattern selection. The explored patterns are layers, event-driven architecture, microkernel and microservices.

2.2.1 Layers

The layered pattern divides the software into units called layers that each perform a specific role within the application. Each layer is allowed to use the layer directly beneath it via its interface. Changes in one layer are isolated if the interfaces don't change, enabling MT-1, MT-2 and MT-5 [40]. While changes can be isolated in this way, they remain difficult due to the monolithic nature of most implementations of this pattern. Layers also incur a performance penalty due to the "architecture sinkhole" phenomenon, in which requests simply propagate through layers for the sake of layers [45].

2.2.2 Event-driven architecture

This pattern consists of several event publishers that create events and event subscribers that process these events. The publishers and subscribers are decoupled by an event channel, to which the publishers publish events and which forwards those events to the subscribers. The subscribers should have a single purpose and execute asynchronously. Since the publishers and subscribers are single-purpose and completely decoupled from other components via the event channel, changes are isolated to one or a few components, enabling MT-1, MT-2, MT-4, MT-5 and MT-7. If the event channel adds a discovery mechanism, IT-1 can also be enabled. Overall, the pattern is relatively easy to deploy due to the decoupled nature of the components. Performance in general can be very high through the asynchronous nature of the architecture, enabling PT-6 and PT-7. If the event channel is tweaked to contain extra functionality, PT-1, PT-3, PT-8, PT-9, PT-10 and PT-11 can be enabled as well. If the components have a limited event response, then PT-2 and PT-5 can also be enabled. Development can be somewhat complicated due to the asynchronous nature of the pattern [40, 45].

2.2.3 Microkernel

The microkernel pattern allows the addition of application features as plugins to the core application, providing extensibility as well as feature separation and isolation. The pattern consists of two components: a core system, called the kernel, and plugins. The business logic is divided between independent plugins and the kernel. The kernel contains only the minimal functionality required to make the system operational. The plugins are standalone, independent components that contain specialized processing, additional features and custom code, meant to enhance or extend the core system to produce additional business capabilities. In many implementations, plugins are independently developed third-party modules. Changes can largely be isolated and implemented quickly through the loosely coupled plugins; all MTs can be enabled. Depending on how the pattern is implemented, the plugins can be dynamically added to the kernel at runtime. Via a resource discovery service in the kernel, the ITs can be enabled. In general, most applications built using the microkernel pattern perform well, because applications can be customized and streamlined to include only the features that are needed [45].

2.2.4 Microservices

Microservices is an architectural pattern that structures an application as a collection of loosely coupled services that implement business capabilities. Each component of the pattern is deployed as a separate unit that can run on one device or be spread over multiple devices. The components can vary in granularity from a single module to a large portion of the application, and contain one or more modules that represent either a single-purpose function or an independent portion of a business application [45, 46]. Due to the separately deployed units, changes are isolated to individual components, enabling all MTs. Via service discovery mechanisms, the ITs can also be enabled. The microservices pattern supports distributed deployment of the software across multiple devices by design. The pattern is not known to produce high-performance applications, due to its distributed nature, which relies on communication via a network [45, 46].

2.2.5 Comparison of patterns

Table 2.7 summarizes the analysis of the patterns. A score is given based on how well the pattern enables each tactic. Low means that the pattern does not naturally enable the tactic. Medium indicates the pattern can be implemented with the tactic but does not include it itself. High means the tactic is enabled in the pattern. Excellent means that the tactic plays a key role in the pattern.

The microkernel pattern and the microservices pattern both enable most tactics. The microkernel pattern implements extensibility of the framework by design using plugins, which is the main idea behind the framework, and is thus an excellent base pattern. Interoperability and deployability of these plugins can be ensured by the microservices pattern, as it designs the microservices to have well-defined interfaces for interoperability and allows the framework to be deployed in a distributed fashion. The architecture presented below is a combination of both the microkernel pattern and the microservices pattern.


Tactic | Layers | Event-driven | Microkernel | Microservices
MT-1 Split module | Medium | High | High | Excellent
MT-2 Increase semantic coherence | Medium | High | High | Excellent
MT-3 Encapsulate | Medium | High | High | Excellent
MT-4 Use an intermediary | Medium | High | High | Excellent
MT-5 Restrict dependencies | High | High | High | Excellent
MT-6 Anticipate expected changes | Low | High | Excellent | Excellent
MT-7 Abstract common services | Low | High | Excellent | Excellent
MT-8 Defer binding / Runtime registration | Low | Low | Medium | High
IT-1 Discover services | Low | Low | High | High
IT-2 Orchestrate interface | Low | Low | High | High
IT-3 Tailor interface | Low | Low | High | High
PT-1 Manage sampling rate | Low | High | High | Medium
PT-2 Limit event response | Low | High | High | Medium
PT-3 Prioritize events | Low | High | High | Medium
PT-4 Reduce overhead | Low | High | High | Low
PT-5 Bound execution time | Low | High | High | Medium
PT-6 Increase resource efficiency | Low | High | High | High
PT-7 Introduce concurrency | Low | High | Low | High
PT-8 Maintain copies of computation | Low | High | Low | High
PT-9 Load balancing | Low | High | Low | High
PT-10 Maintain multiple copies of data | Low | High | Low | High
PT-11 Bound queue sizes | Low | High | Low | Medium
PT-12 Schedule resources | Low | High | Low | Medium

Table 2.7: Comparison of how well the discussed patterns enable the tactics needed for the ASRs


2.3 Software architecture

The software architecture is documented in three view categories: static views, dynamic views and deployment views. The static views comprise the different components of the system and their relationships among each other. The dynamic views describe the runtime behavior of the system. Finally, the deployment views provide different configurations in which the system can be deployed on different devices [47].

2.3.1 Static view

Figure 2.2 presents an overview of the architecture using a component-connector UML diagram. Components are the boxes that represent the different software entities that exist at runtime. The components have interfaces through which they interact with other components. These are indicated using the 'lollipop' notation, with the 'ball' representing the interface that a component provides and a socket indicating that another component uses this interface. The type of data exchanged is noted next to the interface. Multiple boxes indicate that multiple components of the same kind can exist at runtime [48].

The architecture consists of the following core components: Client Interface, Producer, Stream, Consumer, Producer Distribution, Consumer Distribution, Producer Plugin and Consumer Plugin. The clear components in Figure 2.2 form the core framework, which each user needs to install to use the framework. The colored components form a distribution service for framework plugins that extend the functionality; they are not installed with the core framework but run as remote instances with which the user can interact to extend his version of the core framework with new plugins. A user operates the framework via the Client Interface, building streams that are maintained in the Stream component. The Stream component makes requests to the Producer and Consumer components to activate and control the selected plugins to build the stream. Additional plugins can be added to the framework and are distributed via the Producer and Consumer Distribution components. The architecture implements a hybrid combination of the microservices and microkernel patterns. Each presented component is a microservice that implements its own interface to interact with other components. The Producer and Consumer components act as kernels in the microkernel pattern, with the Producer and Consumer Plugins acting as plugins in the microkernel pattern. These patterns enable the tactics needed to meet the requirements presented in Section 2.1.

Communication protocol

To allow the microservices to communicate, a communication protocol must be designed. Communication protocols can roughly be classified in two categories: synchronous and asynchronous. Synchronous protocols block on requests, meaning that the client waits for a response from the server and can only continue executing when a response is received. This makes a synchronous protocol inherently more reliable, but also slower; an example synchronous protocol is the Hyper Text Transfer Protocol (HTTP). Asynchronous protocols just send messages and do not block on the response, which makes them less reliable but faster [49].

There are two types of traffic exchanged between microservices First there are the command requests that are exchanged

between microservices to edit resources or change state Second there are the video frames that are exchanged between Pro-

ducer and Consumer Plugins Both types of traffic have different requirements The commands must be communicated reliably

23 Software architecture 16

Figure 22 Overview component-connector diagram of the architecture

and need to executed once and only once The reliability is more important than latency so a synchronous protocol is pre-

ferred Microservices traditionally implement the synchronous HTTP protocol with Representational State Transfer Application

Programming Interfaces (REST API) that specifies the application endpoints as textual resources [45] This common protocol

is used for the exchanged command requests in the framework

The video frames need to be sent with low latency at a high frequency, but reliability is less important, so an asynchronous protocol is preferred. For video streaming, the Real-time Transport Protocol (RTP) running on top of the User Datagram Protocol (UDP) is selected, as it enables real-time transfer of data between processes [50]. RTP defines a standardized packet format to transmit video and audio over a network. It sequences each packet with a sequence number and a timestamp, which allows the application to detect missing packets and latencies in the network. UDP is a low-latency, asynchronous transport protocol, as it does not guarantee packet delivery.

The recommended codec for transmitting video media is Motion JPEG (MJPEG), which encodes video frames as separately encoded JPEG images. This makes analysis and processing in subsequent plugins easier, as only the received frame is needed to perform the analysis or processing. Other video compression formats, such as MPEG-4, use key frames and object-oriented differential compression: if a key frame is received via the stream, the frame can be used as is, but if a reference frame is received, the receiver needs to wait for the corresponding key frame before it can construct the full video frame for analysis. This introduces extra complexity and lower quality detection, a clear trade-off against the quality and simplicity that MJPEG offers [51, 52].
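To make the RTP framing concrete, the following minimal sketch packs a single JPEG-encoded frame into an RTP packet following the fixed header layout of RFC 3550 and sends it over UDP. It is an illustration under stated assumptions, not the framework's implementation: a production sender would additionally fragment frames that exceed the network MTU and add the RTP/JPEG payload header of RFC 2435 (libraries such as GStreamer handle this automatically). The addresses and file name are example values.

import socket
import struct
import time

def rtp_packet(payload, seq, timestamp, ssrc=0x1234):
    # RFC 3550 fixed header: version 2, no padding/extension/CSRC -> 0x80.
    # Second byte: marker bit (end of frame) + static payload type 26 (JPEG).
    header = struct.pack('!BBHII', 0x80, 0x80 | 26,
                         seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc)
    return header + payload

with open('frame.jpg', 'rb') as f:  # one MJPEG-encoded video frame
    frame = f.read()

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
packet = rtp_packet(frame, seq=0, timestamp=int(time.time() * 90000))
sock.sendto(packet, ('127.0.0.1', 5004))  # a listener plugin's RTP port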

Applying these protocols to the architecture results in the network topology depicted in Figure 2.3. The full lines represent communication via HTTP on top of the Transmission Control Protocol (TCP). The dashed lines represent the RTP protocol on top of the UDP protocol. The boxes represent the different microservice components of the framework.


Figure 2.3: Framework network topology. Each box is a microservice component of the framework. The full lines indicate communication over the HTTP/TCP protocol; the dashed lines indicate communication over the RTP/UDP protocol.

Client Interface

The Client Interface is the interface through which end-users interact with the framework. Figure 2.4 presents the detailed component-connector diagram. The Client Interface consists of a User Interface (UI) component and an API Gateway component. Devices can make requests to the Client Interface via the Client Requests interface provided by the API Gateway. The UI provides the UI Operation interface that is used by end-users to control the framework; this can be either a visual or a textual interface. The UI actions are translated to client requests that are forwarded to the API Gateway using the Client Requests interface. The API Gateway translates the client requests and forwards them to the other core framework components.

Figure 2.4: Client Interface detailed view

Stream

The Stream component maintains the logical representation of the streams built by the end-user for his image processing application. Figure 2.5 presents the detailed component-connector diagram.


Figure 2.5: Stream detailed view

It consists of an API, a Stream Manager and several Stream Model components. The API provides the Stream Commands interface used by the Client Interface to interact with the framework; it translates incoming requests to commands that the Stream Manager can then execute. Commands include creating a new stream, modifying the stream layout, modifying the stream state, etc. The Stream Manager creates and manages multiple streams, each represented by a Stream Model. The end-user thus builds Stream Models to create image processing applications. The Stream Model represents the logical model of these image processing application streams. As stated before, a stream consists of several media-processing plugins, placed in some order and linked by the framework. Figure 2.6 illustrates this concept.

Figure 2.6: Logical model of a stream. The arrows represent the flow of media through the stream.

Logically, the Stream Model is represented as a tree with multiple roots and multiple leaves. The framework builds streams by initializing the needed plugins and connecting them in order. In the example Stream Model, plugins receive media from multiple source plugins and forward media to multiple targets. The Stream Model has a global state that represents the cumulative state of all plugins. To transition the global state from A to B, all plugins need to transition from A to B. This is done by first making the transition on the leaves of the Stream Model, after which the transition propagates towards the root nodes. This ensures that no media is lost, because the plugins that transition first cannot process anything while no media is being put into the tree.
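A minimal sketch of this leaves-first propagation, assuming a hypothetical Stream Model that stores, per plugin, a proxy object with a transition() method and the names of its listeners (names and structure are illustrative, not the framework's actual API):

def transition_stream(plugins, listeners, target_state):
    # plugins: name -> plugin proxy exposing transition(state)
    # listeners: name -> list of downstream plugin names
    done = set()

    def visit(name):
        if name in done:
            return
        # Transition all downstream plugins (towards the leaves) first,
        # so no plugin produces media that nobody can yet consume.
        for downstream in listeners.get(name, []):
            visit(downstream)
        plugins[name].transition(target_state)
        done.add(name)

    for name in plugins:
        visit(name)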


Producer and Consumer plugins

A Plugin represents an independent media processing element, either of the Producer type (such as a thermal camera) or the Consumer type (such as an object detection software module). Plugins are deployed as standalone microservices providing a REST API that the framework uses to control the plugin. Figure 2.7 represents the general plugin model. A plugin receives media from other plugins, called its sources, processes this media and forwards it to other plugins, called its listeners. A Producer plugin only has listeners; a Consumer plugin has both sources and listeners. Merging the media from multiple sources and forwarding the processed media to multiple listeners is the responsibility of the plugin.

Figure 2.7: Plugin model

The plugin REST API should at least provide a state resource, representing how the plugin is processing media; a sources resource, representing the sources from which the plugin receives media to process; and a listeners resource, representing the listeners to which the plugin transmits the processed media. Only Consumers have both the sources and listeners resources, as Producer Plugins produce their own media and hence can only have listeners.

To indicate if and how the plugin is actively processing media, a finite state machine is implemented. The state transition diagram is presented in Figure 2.8. A plugin can be in four possible states: INACTIVE, STOP, PLAY and PAUSE. When a plugin is in the INACTIVE state, no active microservice is running the plugin. This is the initial state for all plugins of the framework. This state is only visible to the framework, as in this state the plugin is not instantiated. When a plugin is in the STOP state, the framework has instantiated a microservice running the plugin. The plugin is listening for commands on its API but is not processing any media. This state is visible to the plugin. In the PLAY state, a plugin processes media received from its source(s), transmits processed media to its listener(s), and listens for commands. When in the PAUSE state, media processing is paused but media buffers are kept. This decreases the latency when the plugin transitions back to the PLAY state, since the plugin can continue processing from the point where it was paused. The difference with the STOP state is that, when transitioning to the STOP state, the plugin clears its media buffers.

Figure 2.8: The state transition diagram for a plugin

The plugin starts in the INACTIVE state. When a microservice running the plugin is instantiated by the framework, the plugin initializes itself in the STOP state. From the STOP state, the plugin can transition to the PLAY state to process media; this transition is only successful if sources and listeners are registered with the plugin. From the PLAY state, a transition to both the STOP state and the PAUSE state can be made, which stops the processing of media and respectively drops or keeps the media buffers. The plugin cannot make multiple state transitions per command. When a transition is made to INACTIVE, the framework first transitions the plugin to the STOP state, after which the INACTIVE state can be reached.
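The transition rules can be condensed into a small sketch mirroring Figure 2.8. The class is illustrative, not the framework's actual implementation; the transition table is read from the description above.

ALLOWED = {
    'INACTIVE': {'STOP'},
    'STOP':     {'PLAY', 'INACTIVE'},
    'PLAY':     {'PAUSE', 'STOP'},
    'PAUSE':    {'PLAY', 'STOP'},
}

class PluginState:
    def __init__(self):
        self.state = 'INACTIVE'   # initial state for every plugin

    def transition(self, target):
        # Only a single state transition per command is allowed.
        if target not in ALLOWED[self.state]:
            raise ValueError('illegal transition %s -> %s' % (self.state, target))
        if target == 'STOP':
            self.clear_buffers()  # STOP drops media buffers; PAUSE keeps them
        self.state = target

    def clear_buffers(self):
        pass                      # placeholder for dropping buffered media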

A source/listener has the following fields: hostname, the hostname of the microservice running the plugin, and port, the port on which the source/listener is reachable.

On the sources and listeners resources, HTTP GET and POST methods must be provided: GET retrieves the sources/listeners and their details, POST adds a new source/listener to the plugin. Both resources additionally need to provide an individual endpoint per source/listener, on which GET, PUT and DELETE must be provided for individual manipulation: GET retrieves the details, PUT updates the fields of a source/listener, and DELETE removes a source/listener from the plugin.
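As a sketch, the listeners resource of such a plugin could be implemented as follows with Flask (introduced in Section 3.2.1); the sources and state resources would be analogous. The URL scheme and in-memory storage are assumptions for illustration, not a prescribed interface.

from flask import Flask, jsonify, request

app = Flask(__name__)
listeners = {}  # id -> {'hostname': ..., 'port': ...}

@app.route('/listeners', methods=['GET'])
def list_listeners():
    return jsonify(listeners)

@app.route('/listeners', methods=['POST'])
def add_listener():
    body = request.get_json()
    listener_id = str(len(listeners) + 1)
    listeners[listener_id] = {'hostname': body['hostname'], 'port': body['port']}
    return jsonify({'id': listener_id}), 201

@app.route('/listeners/<listener_id>', methods=['GET', 'PUT', 'DELETE'])
def one_listener(listener_id):
    if request.method == 'GET':
        return jsonify(listeners[listener_id])
    if request.method == 'PUT':
        listeners[listener_id].update(request.get_json())
        return jsonify(listeners[listener_id])
    del listeners[listener_id]    # DELETE
    return '', 204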

Producer and Consumer

The Producer and Consumer components are responsible for interacting with and managing the Producer/Consumer plugins used in the streams. Figure 2.9 presents the component-connector diagrams of the Producer and Consumer components. Both components have a similar architecture but are kept separate, because their plugin models differ and because they are expected to often be deployed on different devices with specific hardware requirements. Producer Plugins could be deployed on the thermal camera itself, with a very specific operating system, whereas a Consumer plugin might need access to specific processors to speed up its execution.

The Producer and Consumer consist of the following components: API, Kernel, Plugin Model and Plugin Manager. The API translates requests coming from the Stream component to commands for the Kernel. The Kernel implements the core functionalities, such as activating (deploying) and deactivating plugins, managing their state and manipulating their resources. It creates a Plugin Model for each Plugin that the framework has installed; this model represents a plugin logically on the framework level and keeps track of the Plugin resources. The Plugin Manager manages the plugins that were added to the framework, stored in the Plugin Directory. It manages the plugin installations, adding updates or installing additional plugins that can be retrieved from the Producer and Consumer Distribution components.


(a) Producer component-connector diagram (b) Consumer component-connector diagram

Figure 2.9: Component-connector diagrams of the Producer and Consumer modules

Producer and Consumer Distribution

The Producer and Consumer Distribution components are responsible for managing and maintaining the plugins for the framework. They act as online software repositories from which local versions of the framework can retrieve new plugins. The component-connector diagrams are presented in Figure 2.10. The Distribution components consist of the following subcomponents: API, Plugin Manager and Plugin Tester. Plugin developers make requests to the API, which translates these requests to Create, Read, Update, Destroy (CRUD) commands for the Plugin Manager. The Plugin Manager executes these commands on the Plugins that are kept in the Plugin Repository. The quality of the framework depends on the quality of the plugins that it offers; plugins should therefore be thoroughly tested before being added to the framework. The Plugin Tester component is responsible for this testing. Tests should include verifying that the plugin implements the Plugin Model correctly, that the plugin meets the performance requirements, etc. When a plugin passes these tests, it is added to the Plugin Repository, so that end-users can install the plugin and use it for their applications.


(a) Producer Distribution (b) Consumer Distribution

Figure 2.10: Producer and Consumer Distribution component-connector diagrams

2.3.2 Dynamic views

Dynamic views depict the behavior of the system and complement the static views. They are documented using sequence diagrams that show an explicit sequence of messages between architecture elements, describing a use case [40]. Two key use cases are presented here: adding a plugin to a stream and linking plugins to build a stream.

Add plugin to stream

Figure 2.11 presents the sequence diagram for adding a Producer plugin to a stream. The framework is assumed to be running, the user has created a stream S, and the Producer Plugin A is correctly installed. The end-user executes the command to add A to stream S on the Client Interface, which passes the command to the Stream component. The Stream component requests the creation of a microservice instance of A, which is created by the Producer Kernel. When the Producer Plugin is instantiated, the Producer Kernel creates a Plugin Model of A and adds it to its references, so that the instance of A can be reached for future commands. Afterwards, the Stream Manager is informed of the success, upon which the Stream Manager can add A to the Stream Model, ready to be linked with other plugins. The user is notified of this success and can continue building. If A could not be instantiated (due to not being installed, not being installed correctly, etc.), A is marked as 'broken' and the user is notified that the action could not be completed. When a plugin is marked as 'broken', it can no longer be used and needs to be reinstalled. The sequence diagram for adding a Consumer Plugin is similar, but replaces the Producer components with the Consumer components.

Figure 2.11: Add a Producer Plugin to a stream

Link plugins

Figure 2.12 presents the sequence diagram for linking two plugins in a stream. In the sequence diagram, two Consumer Plugins A and B are linked; this extends directly to linking a Producer Plugin with a Consumer Plugin. The framework is assumed to be running, the user has created a stream S, and the plugins A and B have been instantiated and added to the stream. The end-user executes the command to link A and B in stream S on the Client Interface, which passes the command to the Stream component, which in turn checks if the link is valid for the Stream Model S. Linking can only be done if the stream is in the STOP state and if the plugins are already in the stream. If the link is valid, the Stream Manager can begin linking the plugins. To link the plugins in the order A-B, A is added as a source for B and B is added as a listener for A. These subsequences are found in their corresponding frames in the diagram and are very similar. The Stream Manager makes the request to add the source/listener to the Kernel, which finds the corresponding plugin and makes the request on the corresponding Plugin Model. If the Plugin succeeded, the Plugin Model is updated and the Stream Manager is notified of this success. If both plugins have successfully set the source and listener, the Stream Model layout is updated with the link. Should the source/listener request fail for one of the plugins, the change is rolled back and the end-user is notified.
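Condensed to a sketch, the linking logic with its rollback could look as follows; the add_source/add_listener/remove_source calls on the plugin proxies are hypothetical stand-ins for the REST requests described above.

def link(stream, a, b):
    # Link plugins in the order a -> b: a becomes a source of b,
    # b becomes a listener of a. Valid only on a stopped stream.
    if stream.state != 'STOP' or a not in stream.plugins or b not in stream.plugins:
        raise ValueError('link is not valid for this stream')
    if not stream.plugins[b].add_source(a):
        raise RuntimeError('could not add source')
    if not stream.plugins[a].add_listener(b):
        stream.plugins[b].remove_source(a)  # roll back the first change
        raise RuntimeError('could not add listener')
    stream.layout.append((a, b))            # update the Stream Model layout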

2.3.3 Deployment views

The different deployment configurations are illustrated via deployment diagrams using the UML 2.5 specification [48]. 'Host' specifies the device on which components are deployed; 'microservice' indicates the isolated environment in which components run. These isolated environments on the host are realized as software containers, which enable portability of the components to other deployment configurations; this concept is further discussed in Section 3.3. The Producer and Consumer Distribution components were left out of the diagrams, as they are always deployed on a different host than the core framework. Two deployment configurations are presented: the local configuration, which deploys all components on a single device, and the distributed configuration, which deploys the components on separate devices. These configurations are presented in Figure 2.13.


Figure 2.12: Link two plugins in a stream. The 'format request' and 'translate request' actions in the API components have been omitted to reduce clutter in the diagram, but are executed by the API components.


Local configuration deployment

The local configuration deploys the framework on a single local device, as depicted in Figure 2.13a. Because the framework is deployed as a whole, it can operate offline. This configuration is useful for image processing applications that can't rely on a stable network connection, for example in remote locations or densely built-up areas. The components are still deployed as separate microservices due to the architecture of the framework. This has an impact on the performance of the framework, because every interaction between components uses either the HTTP protocol or the RTP protocol, which introduces extra overhead compared to direct invocation of commands.

Distributed configuration deployment

The distributed configuration deploys the framework on multiple devices. The components are distributed over these devices, made possible by the microservice isolation and communication protocols. This configuration is depicted in Figure 2.13b. Obviously, in this configuration each component of the framework must have a reliable network connection to communicate with the other framework components. This configuration could be used, for example, for a security application. The end-user has the Stream module running on a master node that controls several cameras, and configures his image processing application through the Client Interface running on his own device, which communicates with the Stream module on the master node. The master node can control each camera by communicating with the Producer component. If, for example, the security application requires autonomous detection of trespassing people, a computationally intensive task, the Consumer Plugins could need dedicated hardware that is only available on another device. The Consumer component can then be deployed on that dedicated device, and the Stream component can again communicate with it over the network. The success of this configuration depends on the availability and capacity of the network. If the network fails, commands and media can't come through and the framework can no longer execute. Due to the distributed nature, performance will also be worse compared to the local configuration, because each request between the components travels over a network that can experience delays.


(a) Local configuration deployment diagram (b) Distributed configuration deployment diagram

Figure 2.13: Deployment diagrams


Chapter 3

State of the art and technology choice

To build and test a proof of concept implementation of the architecture presented in Chapter 2, several state of the art technologies can be used to support the framework. These are presented in Sections 3.1, 3.2, 3.3 and 3.4. For each category, a choice is made that serves as the basis for the implementation of the proof of concept, discussed in Section 3.5. Readers already familiar with the presented technologies can safely skip ahead to Section 3.5.

3.1 Thermal camera options

This section provides an overview of some currently commercially available thermal cameras. The overview is not a complete overview of all products offered by all vendors. The data was gathered in September 2017, so some products may have been discontinued and new products may have been launched since. Several parameters were collected for each product. Section 3.1.1 discusses why these parameters are important to assess the quality of a thermal camera; Section 3.1.2 aggregates these parameters and presents insights into the data. The full list of specifications can be found in Appendix B.

3.1.1 Parameters

The following parameters were considered for the comparison: price, physical specifications, image quality, thermal precision, interfaces, energy consumption, help and support, user experience, and auxiliary features.

Price

Thermal cameras are relatively expensive compared to visible light cameras. For example, a 20 megapixel (MP) visible light camera can cost as little as 100 euro, while thermal cameras, having a much lower image resolution, can cost as much as 15,000 euro. Prices for thermal cameras cover a very wide range, and budgets are limited in practice.


Physical specifications

Two specifications are considered: the weight and the dimensions of the camera. Drones have a limited carrying weight, due to maximal carrying capacities and faster draining of battery life when carrying heavier loads. Lighter and smaller cameras are therefore preferred for usage with drones, although these often offer lower image quality and fewer features than the heavier cameras.

Image quality

Image quality specifies how much information an image can possibly hold. It consists of four parameters: resolution, capture frequency or frame rate, field of view, and radiometric information. Image resolution is the amount of detail an image holds. Higher resolution cameras can capture more details in a scene, resulting in a sharper image that holds more information. Due to more detail, smaller objects can also be seen, allowing scenes to be viewed from larger distances. Drones capture images from relatively large distances, so good resolutions are required for the images to be useful. Image resolution is measured in pixel density, presented as the product of the number of pixels in the width and the height of the image. The highest resolution found for the compared cameras is 640 x 512 pixels. Some cameras offer a visual camera next to the thermal camera. This allows an overlay of the visual image and the thermal image, so-called Multi Spectral Dynamic Imaging (MSX). This creates artificially sharper images, because edges are more visible in the visual image. Figure 3.1 depicts a thermal-only image and an MSX image of a dog; it can be seen that the MSX image is sharper. MSX is a more low-cost solution to produce sharper images compared to increasing the thermal resolution, as visible light cameras are less expensive [7].

(a) Thermal (b) MSX

Figure 3.1: Thermal image and MSX image of a dog

The capture frequency or frame rate dictates how many frames the camera can capture per second. Higher frequency cameras are able to track dynamic scenes better. The field of view is the angle through which the camera is sensitive to thermal radiation and determines the extent of the world that can be seen by the camera. A bigger field of view can capture more of the environment in one image. Most cameras allow various lenses to be mounted, which allows for greater flexibility in choosing the field of view. Radiometric image information is thermal information embedded with the infrared image that can be analyzed after recording. Radiometric information characterizes the distribution of the thermal radiation's power in space and specifies the temperature per pixel exactly. Regular thermal images use a relative scaling of temperatures that are mapped onto a colorspace, with some color being the hottest color in the image and another color the coldest. For example, in Figure 3.1a the Iron color scheme is used, which maps the cold regions of the image on blue color variants and warmer regions on red and yellow variants. Radiometric information can give a very detailed description of the radiation pattern of a scene.

Thermal precision

Thermal precision specifies the temperature range, the sensitivity and the accuracy of the temperature measurements. The temperature range indicates the minimum and maximum temperatures a camera can detect. A larger temperature range comes with a trade-off in sensitivity and accuracy. Cameras often offer different modes of operation and operate using different intervals according to the accuracy needed in a scene. Sensitivity indicates the ability of the camera to record finer distinctions in temperature. Accuracy is the margin of error for temperature readings on the thermal camera. An accuracy of 5 degrees Celsius for small temperature ranges and 20 degrees Celsius for large temperature ranges is commonly found; the increase in error margin is a trade-off for the larger temperature interval. Objects emit infrared waves in various forms (due to black-body radiation [7]). To accurately compare temperatures, cameras often implement emissivity corrections that normalize the measurements.

Interfaces

Cameras can communicate with other devices via several interfaces during use. Cameras mounted on a drone cannot be accessed during flight and need these interfaces to transfer data. USB and HDMI are the most commonly found interfaces to connect the camera with an on-board processing unit, gimbal or battery. MAVLink [53] is a very lightweight, header-only message marshalling library for micro air vehicles such as drones. When a camera provides this interface, this allows for a very efficient communication scheme to control the camera remotely. Other interfaces include Bluetooth and Wi-Fi.

Energy consumption

A device mounted on a drone has a limited energy source at its disposal. The less energy the camera consumes, the longer the drone can operate. This can even lead to lighter batteries being used in flight, reducing the carried weight and therefore also the energy consumption. Typically, the energy consumption of a camera is much lower than that of the drone itself, so this is a minor specification. Input voltage and power consumption are specified.


Help and support

How the camera is supported by the company has a big impact on the ease of developing applications for the camera platform. User manuals, phone or email support and FAQs are very helpful. Should the camera malfunction, a product warranty is necessary to replace the broken product.

User experience

The user experience is another important factor, as there is a difference between the technical specifications and the actual experience of the user. The user experience is measured as a number of good and a number of bad reviews. Reviews are scored from zero to five stars, with zero being a very bad experience and five a very good experience. A good review is scored three stars or more, a bad review less than three stars.

Auxiliary features

Some cameras offer even more features than the ones mentioned above. These can be a connection with the Global Positioning System (GPS) to indicate where images were captured, a software application to interact with the camera, analysis functionality, tracking, etc.

3.1.2 Comparative analysis

It can be seen that FLIR is the market leader in thermal solutions for drones: they offer the largest product line, and products from other companies often utilize one of their camera cores. Figure 3.2a plots the retail price against the thermal resolution. Cameras with high and low resolutions are found across all price ranges; clearly, other features determine the price of a thermal camera. A feature function is defined that maps the features of a thermal camera onto an integer (a sketch of this scoring function follows the list). The function increments the integer if:

• The camera has MSX support
• The camera has a standard data format (not just an analog or digital signal)
• The camera offers radiometric information
• The image resolution is at least 640 x 512 pixels, the highest resolution found for these products
• The sensitivity is smaller than 100 mK
• The camera offers emissivity correction
• The camera offers a USB interface
• The camera offers a MAVLink interface
• The camera offers an HDMI interface
• The camera offers a Bluetooth connection
• The camera offers a Wi-Fi connection
• The camera offers GPS tagging
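A minimal sketch of this scoring function, assuming each camera is described by a dict with illustrative field names:

def feature_points(cam):
    # One point per feature from the list above.
    checks = [
        cam.get('msx', False),
        cam.get('standard_data_format', False),
        cam.get('radiometric', False),
        cam.get('resolution_px', 0) >= 640 * 512,
        cam.get('sensitivity_mk', float('inf')) < 100,
        cam.get('emissivity_correction', False),
        cam.get('usb', False),
        cam.get('mavlink', False),
        cam.get('hdmi', False),
        cam.get('bluetooth', False),
        cam.get('wifi', False),
        cam.get('gps_tagging', False),
    ]
    return sum(1 for c in checks if c)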

Figure 3.2b plots these feature points against the retail price. This reveals a more log-like relationship: the features of a camera determine the price much more than the image quality alone. For less than 5,000 euro, thermal cameras can be found that implement most of the basic features. Beyond that point, the price increases rather quickly for few added features; features like radiometry require additional hardware that greatly increases the price of the camera.

(a) Camera resolution compared to retail price (b) Camera feature points compared to price

Figure 3.2: Camera specifications compared to retail price

3.2 Microservices frameworks

The architecture presented in Section 2.3 relies heavily on the microservices pattern. This section therefore presents several microservices frameworks that can support the architecture. Figure 3.3 depicts the results of the Rethink IT survey, querying the frameworks most used for microservices by developers [54]. The most popular frameworks, Java EE and Spring Boot, are written in Java. The Java EE framework is a one-stop-shop framework offering much more functionality than just a backbone microservices framework, and is therefore not considered. Spring Boot is clearly a very popular and mature framework, more streamlined for microservices. Vert.x is an upcoming framework renowned for its performance, making it worthwhile to explore. Python is an upcoming language for web development, and because it is excellent for prototyping, several frameworks for this language are explored as well. The frameworks presented here are Vert.x version 3.5.1, Spring Boot version 2.0, Flask version 0.12, Falcon version 1.4.1 and Nameko version 2.9.0.

3.2.1 Flask

Flask is a micro web development framework for Python. The term "micro" means that Flask aims to keep its core simple but extensible. Flask is an unopinionated framework, as it only provides a glue layer to build a REST API around the application.



Figure 3.3: Rethink IT survey results: most used tools and frameworks for microservices [54]

However, it provides a large list of extensions if extra functionality is required [55]. Starting a microservice is very simple, as illustrated in Listing 1. Flask uses the concept of Python decorators [56] to bind Python functions to a REST API. In Listing 1, for example, the function service_status() is linked to the '/' resource. Because route() decorates the service_status() function, service_status() is wrapped and registered with the app object, so that when a user issues an HTTP GET request on this resource, Flask calls the wrapped service_status() function. This allows for an easy construction of the REST API: just decorate all the functions of the microservice with the correct Flask decorator.

from flask import Flask

app = Flask(__name__)

@app.route('/')
def service_status():
    return 'service_status'

if __name__ == '__main__':
    app.run()

Listing 1: Minimal Flask application

Because Flask is a microframework, its memory footprint is small, the binary file being only 535 KB. It is in use by several large companies, such as Netflix and Reddit [57]. In a production environment the default Flask web server is not sufficient, as it serves only one request at a time; for prototyping, however, it is an excellent framework [55].


3.2.2 Falcon

Falcon is a bare-metal Python web framework that differentiates itself through performance compared to other frameworks. It targets microservices, being even more lightweight and faster than frameworks like Flask; in a benchmark test it achieves 27 times the speed of Flask [58]. The framework seems less mature and has not been adopted by many companies [59]. It is not considered for the prototype of the system, but could be used in production, as it achieves better performance.

3.2.3 Nameko

Nameko is a framework specifically built for creating microservices in Python. Next to offering a REST API, it also offers asynchronous events over the Advanced Message Queuing Protocol (AMQP). It is only meant to be used for microservices, not for web applications that serve content. It is a relatively young framework and is not backed by any major companies as of yet; it is, however, endorsed by the developer of the Flask framework [60].

3.2.4 Vert.x

Vert.x is a toolkit for building reactive applications on the Java Virtual Machine (JVM). The framework follows the reactive systems principles, which are used to achieve responsiveness and to build systems that respond to requests in a timely fashion, even with failures or under load. To build such a system, reactive systems embrace a message-driven approach: all components interact using messages that are sent and received asynchronously. Reactive microservices built with Vert.x are autonomous, asynchronous, resilient and elastic. Vert.x is a toolkit and can be used like any other library, which makes it very flexible. It provides a large set of features: metrics, different programming languages, different protocols, templating, data access, cluster management, etc.

Vert.x embraces the asynchronous development model, as can be seen in Listing 2.

import io.vertx.core.AbstractVerticle;

public class Server extends AbstractVerticle {
    public void start() {
        vertx.createHttpServer().requestHandler(req ->
            req.response()
               .putHeader("content-type", "text/plain")
               .end("Hello from Vert.x")
        ).listen(8080);
    }
}

Listing 2: Vert.x example


The event that occurs here is the HTTP request. On arrival of the event, the Handler is called and executed. The Handler is chained to a listen request and does not block the calling thread. The Handler is only notified when an event is ready to be processed or when the result of an asynchronous operation has been computed [61].

3.2.5 Spring Boot

Spring Boot is an opinionated Java framework for building microservices, based on the Spring dependency injection framework. It allows developers to create microservices with reduced boilerplate and configuration. For simple applications it provides a syntax similar to Flask in Python, using annotations for routing; an example is given in Listing 3. The framework handles most of the routing and request handling, but restricts the developer in application structure. The framework is not lightweight and performs less well than Vert.x [62].

@RestController
@RequestMapping("/api")
public class HelloRestController {

    @RequestMapping(method = RequestMethod.GET, value = "/hola",
                    produces = "text/plain")
    public String hello() {
        return "Hello Spring Boot";
    }
}

Listing 3: Spring Boot example

3.3 Deployment framework

To meet the modifiability and interoperability requirements discussed in Section 2.1.2 and to support the different deployment configurations of Section 2.3.3, Linux containers (LXC) are used. A container is a lightweight operating system environment running inside the host system, executing instructions native to the core CPU and eliminating the need for the instruction-level emulation that virtual machines use. Containers provide an identical, isolated runtime environment for processes without the overhead of virtualization. This makes them perfect for highly portable software, as only the container needs to be moved and it can directly be executed on any system supporting containers [63]. First, the concept of containers is introduced in Section 3.3.1. Then, several container frameworks are presented in Sections 3.3.2, 3.3.3 and 3.3.4.

3.3.1 Containers

Containers sandbox processes from each other and are often described as the lightweight equivalent of virtual machines. The difference between a virtual machine and a container is the level of virtualization. Virtual machines virtualize at the hardware level, whereas containers do this at the operating system (OS) level. The achieved effect is similar, but there are significant differences. Containers make available protected portions of the OS and share its resources: two containers running on one OS each have their own OS abstraction layer and do not know they are running on the same host. This results in a significant difference in resource utilization. Virtual machines provide access to hardware only, so an OS must be installed per machine; as a result, there are multiple OSs running, which gobble up resources. Containers piggyback on the running OS of the host environment and merely execute in spaces that are isolated from each other and from certain parts of the OS. This allows for efficient resource utilization and for cheap creation and destruction of containers. Consequently, starting and stopping a container is equivalent to starting and stopping an application [64, 65]. This comparison is illustrated in Figure 3.4.

(a) Container stack (b) Virtual machine stack

Figure 3.4: Containers compared to virtual machines [66]

Containers offer several advantages over running a process directly on the host system. Due to the OS virtualization of the containers, software is always deployed on the same operating system, defined in the container. This allows for a 'write once, run everywhere' scenario, which enables portability of the system to a range of devices. Containers communicate with each other using protocols such as HTTP, which allows the processes in containers to be written in any programming language, using any external library that is needed. For the system, this means that if the Producer and Consumer Plugins are packaged as containers, they can effectively be made with any available technology, greatly enhancing the extensibility of the system.

3.3.2 LXC

Linux containers are the basis on top of which other container frameworks are built. LXC provides a normal OS environment, similar to a VM. The containers in this framework behave almost identically to a VM and can run multiple processes. LXC can be used directly, but offers only low-level functionalities and can be difficult to set up [67].

3.3.3 Docker

Docker started as an open-source project at dotCloud in early 2013. It was an extension of the technology the company had developed to run its cloud applications on thousands of servers [64]. Now Docker is a standalone, mature company providing a software container platform for the deployment of applications [66]. Docker provides two main services: a simple toolset and API for managing Linux containers, and a cloud platform which provides easy access to recipes for software containers created by other developers [68]. Docker is the container technology with the most public traction and is becoming the container standard at the time of writing, due to its functionalities and very responsive community. It offers functionality to easily build and run containers, but also to manage them in large clusters. A design decision that limits Docker is that each container can only run one process at a time. Another limitation is the Docker client: Docker consists of a daemon that manages the containers and the API Engine, a REST client. Should this client fail, dangling containers can arise [69].

3.3.4 rkt

CoreOS' rkt is an emerging container technology providing an API engine similar to the Docker API Engine that can run LXC containers as well as Docker containers. rkt focuses on security and standardization and is specifically designed to run in cloud environments. Unlike Docker, rkt does not use a daemon process with a REST client: the command-line tool executes all the operations, which makes the framework more reliable. rkt is not as mature as Docker yet. It is portable to multiple Linux environments, but not yet to macOS and Windows [70].

Figure 3.4: Containers compared to virtual machines [66]. (a) Container stack; (b) Virtual machine stack.

3.4 Object detection algorithms and frameworks

As stated in Section 1.3.2, object detection is the computer vision task of detecting which objects are present in an image and where they are located. Several approaches to this problem have been proposed, some of which focus on thermal images. This section aims to give a small overview of different existing techniques. For the technical details of the algorithms, the reader is referred to the respective articles.

3.4.1 Traditional approaches

Traditional approaches include hot-spot detection techniques and Adaptive Boosting (AdaBoost) with various feature extraction techniques, such as Aggregated Channel Features (ACF) and Integral Channel Features (ICF). These methods rely on clever feature engineering solutions that use domain knowledge or statistical insights to transform the raw dataset into a specific set of features in order to find patterns [32].

Hot-spot detection

Hot-spot techniques work on the assumption that people have an overall higher body temperature than most of the background in the thermal image. These techniques first select candidate objects: the hot-spots in the image. The hot-spots define the region on which a classifier is run and are thus the localization step in the object detection problem. Afterwards, a classifier is trained on these candidates. Xu et al. used a Support Vector Machine (SVM) classifier to classify whether the hot-spot represented a pedestrian [71]. Nanda et al. used a Bayes classifier to classify the hot-spots [72]. These methods are generally not applicable, because people often are not the only hot-spots in thermal images.

AdaBoost

AdaBoost is a machine learning algorithm that utilizes the output of so-called weak learning algorithms (weak learners) and combines their outputs into a weighted sum that forms the output of the boosted classifier. AdaBoost modifies the weak learners in favor of data points misclassified by previous classifiers [73]. Viola and Jones built a detection algorithm that uses two consecutive frames of a video sequence and trains the AdaBoost classifier on both motion and appearance information [74]. Davis et al. use a two-stage template approach that initially performs a fast screening procedure with a generalized template based on a contour saliency map to locate potential person locations. Any window located in the first phase is then forwarded to the AdaBoost algorithm to validate the presence of the person. Dollár et al. extracted features using different ICF and ACF [35]. ICF and ACF compute features by calculating several aggregations over the different channels of an image, such as gradient, color histogram and colors. Goedeme et al. expanded these detectors with extra thermal channels to achieve results comparable to those of Dollár et al., but for thermal images [36].

3.4.2 Deep learning

Over the past few decades there has been a shift in proposed solution methods towards deep learning. Deep learning for object detection uses Convolutional Neural Networks (CNNs). CNNs are a specialized kind of neural network for processing data that has a known grid-like topology, such as images. CNNs generally consist of three steps: a convolution step that creates a feature map of a region of an image, a pooling step that summarizes the output of the convolution step, and finally a fully-connected network that learns from the features extracted in the previous steps [75]. The key difference is that these algorithms do the feature extraction in the convolutional layers and do not need feature engineering like the algorithms presented in Section 3.4.1. This requires quite a bit of computing power when compared to the traditional methods. Since deep learning made the shift to computing on Graphical Processing Units (GPUs), computations became feasible and these models proved to achieve very good performance on various machine learning problems. Two model types are described: two-stage networks (R-CNN, R-FCN) that extract image regions first and make separate predictions on each region, and dense networks (YOLO, SSD, NASNet, RetinaNet) that operate on the image as a whole.

Region-based Convolutional Network (R-CNN)

R-CNN uses a selective search method to find objects, an alternative to exhaustive search in an image. It initializes small regions in an image and merges them hierarchically. The detected regions are merged according to color spaces and other similarity metrics [76]. R-CNN combines this selective search with a CNN per region to find out which objects are present in these regions [77].


Fast(er) Region-based Convolutional Network (Fast(er) R-CNN)

Fast R-CNN was developed to reduce the time consumption related to the high number of models necessary to analyze the region proposals from the selective search method in R-CNN. Instead of using a CNN for each region, a single CNN with multiple convolutional layers is used [78]. Faster R-CNN drops the region proposals detected with the selective search method (which is computationally expensive) and introduces the Region Proposal Network (RPN) to directly generate region proposals. This accelerates training and testing and improves performance [79]. Mask R-CNN is an extension of the Faster R-CNN model that adds a branch parallel to the bounding box detection to predict object masks, that is, the per-pixel segmentation of an object in the image [80].

Region-based Fully Convolutional Network (R-FCN)

R-FCN takes a more efficient approach to region detection. Instead of applying a per-region subnetwork multiple times, R-FCN uses a fully convolutional network with computations shared across the entire image. This allows it to be compatible with multiple backbone networks, such as Residual Networks [81].

You Only Look Once (YOLO)

The previously discussed methods need to run the same computations on different parts of an image multiple times before generating a prediction. This makes those methods relatively slow. The YOLO model [82] was developed with the requirement to make predictions as fast as possible, trading off accuracy for speed to move towards real-time object detection. YOLO directly predicts bounding boxes and class probabilities with a single CNN in a single evaluation, instead of first detecting object regions and predicting classes afterwards. This has some benefits over the other methods. YOLO is very fast when compared to other methods, capable of processing images in real time, up to 155 frames per second for some variants. It also learns contextual information, because it trains on entire images instead of regions. YOLO also generalizes better to other image types. All these benefits come at the cost of accuracy: YOLO struggles to precisely localize some objects, especially small ones. Subsequent versions of YOLO focus on delivering more accuracy. The algorithm is currently in its third version [83].

Single-Shot Detector (SSD)

SSD [84] is similar to YOLO and predicts all the bounding boxes and the class probabilities in one single evaluation (single shot), using one CNN. The model takes an image as input, which passes through multiple convolutional layers. When compared to YOLO, SSD achieves higher accuracies by adding convolutional layers and including separate filters for detections with different aspect ratios.

Neural Architecture Search Net (NASNet)

NASNet takes a different approach and does not design the network architecture for the object detection beforehand, but instead trains a Recurrent Neural Network (RNN) to generate the model descriptions of the CNN that performs the object detection. The RNN is trained using reinforcement learning. The NASNets built for object detection perform as well as most networks, but are slower to train [85].

RetinaNet

RetinaNet is the latest state-of-the-art object detector. It is a simple dense detector, similar to YOLO and SSD, but matches the accuracy of two-stage detectors like the R-CNN variants. RetinaNet proposes that the foreground-background class imbalance encountered when training the dense detectors leads to lower accuracy when compared to the two-stage detectors. RetinaNet uses a new method called Focal Loss that focuses training on a sparse set of examples to counter this class imbalance, which results in very good performance and very fast detection [86].

3.4.3 Frameworks

While the previous sections focused on different algorithms, actually implementing these algorithms is not straightforward. That is why, over the past years, several deep learning frameworks have emerged that try to provide easier access to this technology. Some frameworks provide APIs for some of the object detection algorithms presented above. This section gives a small overview of such frameworks. Most frameworks differ quite a bit from each other, which makes porting a model from one framework to another rather difficult. The Open Neural Network Exchange Format (ONNX) initiative hopes to propose a standard for interchangeable models, which should make switching between frameworks easier in the future [87]. Note that other frameworks are available, but those do not yet support object detection functions out of the box.

TensorFlow

Perhaps the most well-known framework, TensorFlow is an open-source machine learning library for neural networks with a Python interface. It was developed by Google for internal use and released to the public in 2015 [88]. Recently, an Object Detection API has been built for TensorFlow, which implements models pre-trained on benchmark datasets, such as SSD, Faster R-CNN, R-FCN and Mask R-CNN [89]. TensorFlow offers a lot of flexibility in its use and can be applied to many machine learning problems.

Darknet

Darknet is an open-source neural network framework written in C and CUDA. It is maintained by Joseph Redmon, the person behind the YOLO algorithm [90]. Darknet does not offer the flexibility that other frameworks offer, but is easy to install and use when compared to others. Out of the box, Darknet offers an interface for YOLO. The open-source community offers some ports of this framework to other popular frameworks, such as TensorFlow.


CNTK

The Microsoft Cognitive Toolkit (CNTK) is an open-source toolkit for distributed deep learning. It offers a Python, C# or C++ interface. It is one of the first frameworks to support ONNX. CNTK offers an API for Fast R-CNN and Faster R-CNN [91].


3.5 Technology choice

This section presents the choices made for each technology described in the previous sections.

3.5.1 Thermal camera

The FLIR One Pro and Therm-App were selected as thermal cameras for the proof of concept. Both offer relatively high-quality images: 160 x 120 pixels and 320 x 240 pixels respectively. This is of course relative to their price: 469 and 937.31 euro respectively. These prices are at the low end of the product ranges offered. Both cameras are designed for use on a smartphone, which makes them ideal for prototyping, since these devices are widely available and setting up the camera via the apps from the respective companies is easy. Both cameras provide MPEG-4/H.264 encoded video output, easily understood by most playback software. Both cameras can be found in the lower left of Figure 3.2b.

For deployment in production-ready applications with drones, these cameras are not the best choice. They aren't specifically designed to be used on a drone and don't offer the best image quality possible. In those applications, platforms like the FLIR Vue, Duo, Zenmuse or Workswell Wiris are better candidates due to their superior image quality, MAVLink interfaces, compatibility with commercially available gimbals to mount them on drones, and other features.

3.5.2 Microservices framework

Flask is selected as the microservices framework, for the following reasons. Flask is a mature web framework with major companies backing it. This means the APIs stay consistent and the framework is stable in use. When compared to some other frameworks, like Spring Boot, Flask is unopinionated, which allows for maximum flexibility during development. Flask also has a very small memory footprint, which makes it easier to deploy on less powerful on-board devices like drones. Flask is also easy to use and quick to set up, ideal for developing a proof of concept. A final argument is the familiarity of the author with Flask.
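For comparison with the Spring Boot example in Listing 3, a minimal Flask endpoint could look as follows. This is a sketch for illustration, not code from the prototype; the route and response text are chosen to mirror Listing 3.

from flask import Flask

app = Flask(__name__)

# Decorator-based routing, analogous to @RequestMapping in Spring Boot
@app.route("/api/hola")
def hello():
    # Flask returns the string as a plain-text response
    return "Hello Flask"

if __name__ == "__main__":
    app.run()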

3.5.3 Deployment framework

Docker is selected as the deployment framework. Docker is the most mature and well-supported container framework at the time of writing and will likely remain important in the future. It offers the most features and is specifically designed for the microservices pattern [68].

3.5.4 Object detection

One of the requirements specified in Section 2.1 is real-time streaming. Real-time object detection is only achieved by a few models presented in Section 3.4. Candidates are YOLO, SSD and RetinaNet. As there is no framework that provides an implementation of the RetinaNet algorithm out of the box at the time of writing, this algorithm is not selected. SSD is implemented in the TensorFlow Object Detection API. However, at the time of writing this API has not been found stable: when trying out the API, fallbacks to older versions of the software were needed to be able to test the models. This was due to the Object Detection API using older versions of the TensorFlow framework. Therefore, YOLO implemented in the Darknet framework is selected. Darknet offers a stable distribution, and YOLO achieves good results and has proven to be a very fast detector, capable of real-time object detection.


Chapter 4

Proof of Concept implementation

To prove the concept of the architecture discussed in the previous chapters, a prototype is implemented. First, the goals and the scope of the prototype are presented in Section 4.1. Next, the components of the prototype are presented in Section 4.2. Finally, the known limitations and issues of the prototype are presented in Section 4.3.

4.1 Goals and scope of prototype

The goals of the prototype are to prove the QARs defined in Section 2.1. The prototype focuses on the ASRs: performance, interoperability and modifiability. The usability, security and availability requirements are left out of the scope of the prototype, because they are not ASRs and require significant resources (focus groups, long-term deployment, etc.) to test.

The components that are implemented in the prototype are the Client Interface, Stream, Consumer and Producer, because they represent the core functionality of the framework: building image processing application streams using plugins. The Producer and Consumer Distribution components enable third-party plugin developers to add their functionality to the framework. These are distribution functionalities, which are out of scope for the prototype. The prototype only supports one video stream. All functions presented in Figure 2.1 are implemented, with the exception of 'Install plugin', 'Uninstall plugin', 'Add plugin', 'View plugin', 'Remove plugin' and 'Update plugin', as they are only supported via the Producer and Consumer Distribution components. The prototype is deployed on a local device. Distributed deployment configurations require small changes in the implementation (see Section 4.3).

4.2 Overview of prototype

4.2.1 General overview

The prototype consists of four main components: a cli, streamer, producer and consumer. The cli process is the Client Interface, implemented as a textual Command Line Interface (CLI), which allows a user to interact with the prototype through textual commands in a shell. This process is deployed on the local machine. The streamer, producer and consumer processes are deployed as microservices in their own Docker containers. The prototype is initialized through the cli, which spins up the Docker containers of the other processes. This is achieved with the tool docker-compose. Compose is a tool for defining and running multi-container Docker applications. The compose YAML file defines the configurations for the microservices. Compose uses these configurations to start and stop the application with a single command [92]. A snippet of the compose file for the application is given in Listing 4. Containers are specified as services. The example service configuration given is that of the producer. First, the name of the container is specified, which overwrites the default name, as the container name is used as hostname for the container in Docker [93]. The build configuration specifies where the container build recipe is situated. The port mapping allows processes on the localhost to access processes in the container; for the producer service this is only used for debugging. The volumes configuration specifies folders from the host to be mounted in the container. This configuration mounts in the source code and resources. It also provides access to the Docker socket to allow interaction with the Docker host (see Section 4.2.4).

services:
  producer:
    container_name: producer
    build:
      context: ./producer
      dockerfile: Dockerfile
    ports:
      - "8080:80"
    volumes:
      - ./producer:/usr/producer
      - /var/run/docker.sock:/var/run/docker.sock

Listing 4: docker-compose.yml snippet of the prototype

All containers are connected to a Docker bridge network [93] for communication. A bridge network uses a software bridge to allow connected containers to communicate, while providing isolation from containers which are not connected to that bridge network. The bridge network applies to containers running on the same Docker host. The network is thus confined to the local Docker host and is not distributed over different devices. The bridge network has some advantages:

• The bridge provides better isolation and interoperability between containers. Containers automatically expose all ports to each other and none to the outside world.

• The bridge provides automatic Domain Name System (DNS) resolution between containers. This means that containers resolve the IP address of each other by container name or alias.

• Containers can be attached to and detached from the networks on the fly.

• Environment variables are shared, which can be used to provide equal environment configurations for every container on the bridge.
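As an illustration of this setup, the bridge network can be created through the Docker SDK for Python. This is a sketch, not prototype code; only the network name mosquito_default is taken from the implementation (see Listing 6).

import docker

# Create the bridge network that the framework containers attach to
client = docker.from_env()
network = client.networks.create("mosquito_default", driver="bridge")

# Any container started with network="mosquito_default" can now be reached
# by the other containers on the bridge via its container name (automatic DNS)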


4.2.2 Client interface

The Client Interface is implemented by the cli component. The cli is built in Python with the Click package by Armin Ronacher [94]. Click is a CLI creation kit which aims to make the implementation of CLIs easier. It resembles the Flask framework, as it also leverages Python decorators [56] for most of its functionality. The source code of the cli is located in the mosquito.py file. Commands can be executed by calling python mosquito.py or by calling mosquito if the source code is installed into the Python environment. The following commands are implemented:

• mosquito: Displays a help page listing command groups.

• mosquito on: Starts the application.

• mosquito off: Shuts down the application.

• mosquito plugins: Groups all commands to manage plugins. Plugins can only be listed, not installed or uninstalled, as the Remote Producer and Remote Consumer are not implemented.

• mosquito plugins ls: Lists all locally installed plugins.

• mosquito stream: Groups all commands to manipulate the current stream.

• mosquito stream add: Adds a producer or consumer to the stream.

• mosquito stream delete: Deletes a producer or consumer from the stream.

• mosquito stream elements: Lists all producers and consumers that were added to the stream.

• mosquito stream link: Links two stream plugins.

• mosquito stream pause: Pauses the stream.

• mosquito stream play: Plays the stream. This means the stream is processing media.

• mosquito stream print: Prints the stream layout (which plugins are linked).

• mosquito stream stop: Stops the stream.

• mosquito stream view: Views the stream on the local device.

A typical use of the application would be the following. First, the application is started using mosquito on. Then plugins are added to the stream using mosquito stream add [ELEMENT_TYPE] [ELEMENT]. This will instantiate the corresponding plugins in the Producer and Consumer component. The plugins are linked in order using mosquito stream link [ELEMENT_1] [ELEMENT_2]. The stream is then set to play using mosquito stream play. When the last plugin is linked to the special local plugin, the user can view the output from that plugin using mosquito stream view, which opens up a window in which the stream is displayed.
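The command hierarchy above maps naturally onto Click's command groups. The following sketch shows the idea for a small subset of the commands; the bodies are placeholders, not the prototype implementation.

import click

@click.group()
def mosquito():
    # Entry point: calling `mosquito` without arguments shows the help page
    pass

@mosquito.group()
def stream():
    # Groups all commands that manipulate the current stream
    pass

@stream.command()
def play():
    # Would instruct the streamer microservice to start processing media
    click.echo("stream set to playing")

if __name__ == "__main__":
    mosquito()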


As specified in the software architecture (see Section 2.3), the Client Interface can use the Stream Commands interface of the Stream component. As specified in Section 2.3.1, this interface is a REST API, so the client can use this interface through the HTTP protocol. This is done with the Python Requests library [95].
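A command issued by the cli could then look like the following sketch. The host, port and JSON payload are assumptions for illustration; only the /stream/state resource itself is part of the prototype API (see Section 4.2.3).

import requests

# Hypothetical example: ask the streamer to set the stream state to 'play'
response = requests.put(
    "http://localhost:5000/stream/state",
    json={"state": "play"},
)
response.raise_for_status()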

4.2.3 Stream

The Stream component is responsible for the logical representation of the stream (see Section 2.3.1), implemented as the streamer component. The component consists of three objects: api, which contains the REST API, the StreamManager, and the Stream object representing the Stream Model in the framework. Requests to the other microservices are sent using the Python Requests library. The prototype implementation only supports one stream with a chain-like model. This means that, unlike the stream depicted in Figure 2.6, a plugin can't have multiple sources or multiple listeners. The Stream object manages the logical representation of the stream and manipulates the references to the plugins by forwarding commands to the producer and consumer components respectively. It contains two data structures: outline, which is the logical structure of the stream, and elements, which contains all the plugins present in the stream. In the prototype, the Stream component provides the following functionalities on its API endpoints:

• /plugins (GET): Fetches all the plugins from the producer and consumer components and returns their information.

• /elements (GET, POST, DELETE): Resource to add and delete plugins from the elements bin.

• /stream/links (POST): Resource to create links between elements.

• /stream/state (GET, PUT): Resource to update the state.

• /shutdown (POST): Shuts down the framework.

Since the streamer component is the only component of the framework that interacts with outside users, it has the responsibility to gracefully shut down the framework. This is needed to solve the problem of dangling plugin containers, which run plugins that have not been stopped and removed after closing the application. Since only plugins that are contained in a stream have a running container associated with them, the stream can notify the Producer and Consumer components to stop and remove those containers.

4.2.4 Producer and Consumer

The Producer and Consumer components cover similar responsibilities in managing installed plugins. They are implemented in the producer and consumer components. Both components consist of the following objects: api, which contains the REST API, the Kernel, which implements the core functionalities, the PluginManager, which finds plugins installed on the device and checks if their installation is valid, and the Plugin, which is the logical representation of a plugin as described in Section 2.3.1. Commands to control the plugins are made using the Python Requests library.


For the components to be able to start, stop and interact with the plugin containers, they need access to the Docker host and the Docker client running on that host. But because each component is running in its own container, it is isolated from the Docker host and can't interact with the Docker client by default. The workaround for this problem is to expose the socket on which the Docker client is running on the Docker host to the container. This is done by mounting the Docker socket of the host on the Docker socket in the container. In Docker Compose, the mounting is achieved as shown in Listing 5.

volumes:
  - /var/run/docker.sock:/var/run/docker.sock

Listing 5: Mounting the Docker socket on the container

This has some implications on security (see Section 4.3). To interact with the now exposed Docker client, the component uses the docker-py library [96]. Listing 6 shows how a connection is made to the Docker client and a plugin container is started. The container is started from the plugin image, on the network of the framework, and is given the plugin name as the container name. Docker thus creates a DNS entry with the plugin name, which makes the container addressable by its name. Due to this implementation, there can only be one container per plugin running at any time in the current implementation.

import docker

client = docker.from_env()
container = client.containers.run(
    image=plugin_name,
    detach=True,
    name=plugin_name,
    network="mosquito_default"
)

Listing 6: Starting a plugin container

When both components are initialized, the Kernel and PluginManager are created. The PluginManager searches for a plugin_directory, which contains information on which plugins are installed on the device. Each installed plugin should have a valid image on the device; these are contained in the images directory of the Docker daemon. If the image or information file cannot be found on the device, the plugin is marked as broken and can't be used by the framework. To describe the API, the consumer API is used. The producer API is analogous, but replaces consumer with producer and doesn't have the sources endpoints. The Producer and Consumer components provide the following functionalities on the API endpoints:

• /consumers (GET): Retrieves a list of the installed consumers on the device on which the component is running.

• /consumers/<hostname> (GET, DELETE): Retrieves the information of a consumer specified by the hostname value, which is the name of the consumer.

• /consumers/<hostname>/state (GET, PUT): Retrieves or updates, respectively, the state of a consumer specified by the hostname value.

• /consumers/<hostname>/sources (GET, POST): Retrieves the sources or adds a new source, respectively, to the consumer specified by the hostname value.

• /consumers/<hostname>/sources/<source_hostname> (GET, PUT, DELETE): Retrieves, updates or removes, respectively, the source specified by source_hostname of a consumer specified by hostname.

• /consumers/<hostname>/listeners: All listeners resources are analogous to the sources resources.
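A short sketch of how another component could exchange information with this API using the Requests library is given below. The hostname consumer, the port and the payload fields are illustrative assumptions, not values prescribed by the prototype.

import requests

base = "http://consumer:5000"

# Read the state of the consumer plugin running under the name 'local'
state = requests.get(base + "/consumers/local/state").json()

# Register a new source on that plugin (hypothetical payload)
requests.post(base + "/consumers/local/sources",
              json={"hostname": "mycam", "port": 5000})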

4.2.5 Implemented plugins

Three plugins are implemented and tested: filecam (called 'Mycam' in the code), a producer that reads in a video file and transmits it in MJPEG encoding using the RTP protocol; testsrc, a producer which generates test video and transmits it in MJPEG encoding using the RTP protocol; and local, a consumer which captures incoming RTP MJPEG video frames and displays them on the local display. The filecam and local plugins are discussed, since the testsrc is similar to the filecam.

The plugins are implemented in Python and use the GStreamer library with the Python bindings [97] for media streaming and the Flask framework to implement the API. These libraries don't have to be used by future plugins, which can just implement a REST API and provide a media stream as specified in their descriptions.

Filecam plugin

The filecam image is based on the Ubuntu 17.10 image. It is chosen over lighter Linux distributions because it offers more functionalities out of the box for prototyping. Other dependencies are Python 3.6, GStreamer 1.12, the GStreamer plugins-base, plugins-good, plugins-bad, plugins-ugly, libav, doc and tools packages, and python-gst.

The API of the plugin offers the following functionalities:

• /state (GET, PUT): Retrieves and updates, respectively, the state of the plugin.

• /listeners (GET, POST): Retrieves and adds, respectively, a listener on the plugin.

• /listeners/<hostname> (GET, PUT, DELETE): Retrieves, updates and deletes, respectively, a listener on the plugin.

Figure 4.1: filecam GStreamer pipeline

The implemented GStreamer pipeline is depicted in Figure 4.1. The pipeline consists of the following GStreamer elements:

1. filesrc: GStreamer element that reads data from a file in the local file system. This file can have any extension and is not limited to video or audio files [98]. The location property is set to the location of the file in the plugin container.

2. decodebin: GStreamer bin that automatically constructs a decoding pipeline using available decoders and demuxers via auto-plugging [99]. Note that for some media containers and codecs the appropriate decoders must be installed. For example, to decode the MPEG streams contained in MP4 files, an H.264 decoder is needed, which can be found in the 'libav' GStreamer plugins library.

3. jpegenc: GStreamer element that encodes raw video into JPEG images [100]. This implements the MJPEG video stream, as all video frames are encoded as JPEG images.

4. rtpjpegpay: GStreamer element that payload-encodes JPEG images into RTP packets according to RFC 2435 [101].

5. udpsink: GStreamer element that sends UDP packets to the network. When combined with an RTP payload plugin, it implements RTP streaming [102]. The host and port properties are set to the hostname and port of the listener of the plugin.

This pipeline is implemented using the Python GStreamer bindings. The process consists of creating each GStreamer element, adding them to the GStreamer pipeline, and linking the elements in order of appearance in the pipeline. The decodebin and jpegenc elements can't be linked when created, because there is no default sink pad available on the decodebin. Because the decodebin needs to decide how to decode the media, it needs the pipeline to be processing media. If no media is flowing, the decodebin can't know what decoder it needs to decode the media and what media it can offer to the sink element. Therefore, the process of dynamic linking is used [103]. All elements which can be linked when the pipeline is not in the PLAYING state are linked. A handler is registered on the 'pad-added' signal, emitted when a new pad is added on the decodebin, indicating that it can forward media downstream. When media is flowing through the pipeline, the decodebin creates new pads when it can generate output data and emits the 'pad-added' signal. A callback is performed on the handler, which links the decodebin with the jpegenc. Listing 7 illustrates this concept.

# callback handler
def on_pad(source, pad, sink):
    # get the sink pad from the sink element
    sink_pad = sink.get_static_pad("sink")
    # get the pad type
    pad_caps = pad.get_current_caps()
    pad_type = pad_caps.get_structure(0).get_name()
    # Only if the pad is raw video the link is made
    if pad_type == "video/x-raw":
        # Perform the dynamic link
        pad.link(sink_pad)
    # Other pad types are ignored

filesrc = Gst.ElementFactory.make("filesrc")
decodebin = Gst.ElementFactory.make("decodebin")
jpegenc = Gst.ElementFactory.make("jpegenc")
# ... (create other elements and add elements to pipeline)

# Only filesrc and decodebin can be linked statically
filesrc.link(decodebin)

# Register the on_pad handler on the 'pad-added' signal
handler_id = decodebin.connect("pad-added", on_pad, jpegenc)

# Set pipeline to PLAYING; the callback will be called to perform the dynamic link
pipeline.set_state(Gst.State.PLAYING)

Listing 7: Dynamic linking of the decodebin and jpegenc

Local plugin

The local plugin captures an incoming media stream and displays it on the local display. This plugin is special with respect to other plugins in that it is not deployed in a Docker container. It runs natively, via the cli, on the host, to allow access to the local display. This version is built for macOS High Sierra (version 10.13.4) and uses GStreamer 1.12 with the GStreamer plugins-base, plugins-good, plugins-bad, plugins-ugly, libav, doc and tools packages to receive an incoming stream. When a plugin links to the local plugin, the Stream component does not instruct the Consumer component to start the plugin, but instead links the plugin to the local host. For macOS, the address of the host is host.docker.internal. The GStreamer pipeline used by the plugin is depicted in Figure 4.2.

Figure 4.2: local plugin GStreamer pipeline

The pipeline consists of the following elements:

1. udpsrc: GStreamer element that reads UDP packets from the network [104]. The port property is set to the port to which the source is transmitting media.

2. rtpjpegdepay: GStreamer element that retrieves JPEG images from the received RTP packets [105]. This element can't process the media received from the udpsrc directly, because it can't know what type of data it will be receiving. Between the pads, a 'capabilities filter' is placed, which informs the elements of the type of data that will be flowing through. In this case the capabilities are application/x-rtp, which tells that there will be RTP packets coming through, encoding-name=JPEG, which tells that the payload of the RTP packets consists of JPEG images, and payload=26, which also tells that the encoding is JPEG, according to RFC 3551 [50, 106].

3. jpegdec: GStreamer element that decodes JPEG images [107].

4. autovideosink: GStreamer element that automatically detects an appropriate video sink and forwards the video to it [108].
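For illustration, this receiving pipeline, including the capabilities filter described in item 2, could be built in Python with Gst.parse_launch. This is a sketch assuming GStreamer 1.12 with the Python bindings; the port number is an arbitrary choice.

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# udpsrc -> capsfilter -> rtpjpegdepay -> jpegdec -> autovideosink
pipeline = Gst.parse_launch(
    'udpsrc port=5000 caps="application/x-rtp, encoding-name=JPEG, payload=26"'
    " ! rtpjpegdepay ! jpegdec ! autovideosink"
)
pipeline.set_state(Gst.State.PLAYING)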

4.3 Limitations and issues

The implementation presented is a prototype and a slimmed-down version of the architecture presented in Section 2.3. The following limitations and issues remain.

4.3.1 Single client

The current implementation deploys the Flask framework (on which each microservice relies) on the built-in Flask web server (Werkzeug), which is provided for development convenience. It is only built for use by a single user and by default can only handle one request at any given moment, which implies that the framework can also only be used by a single user [109].
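A partial mitigation within Flask itself is to enable Werkzeug's threaded mode, shown in the sketch below; a production deployment would instead serve the app through a dedicated WSGI server. The host and port are illustrative.

from flask import Flask

app = Flask(__name__)

if __name__ == "__main__":
    # Development-only mitigation: handle each request in its own thread
    app.run(host="0.0.0.0", port=5000, threaded=True)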

4.3.2 Timeouts

The framework does not enforce timeouts on requests when passing commands to components and plugins. This can be a problem when the framework is deployed on several devices and the request latency is much higher. If a component never answers, the framework will keep waiting for a response, which leads to a crash.
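With the Requests library, such a guard could look like the following sketch; the two-second bound, the URL and the payload are illustrative assumptions.

import requests

url = "http://consumer:5000/consumers/local/state"  # hypothetical endpoint
payload = {"state": "play"}

try:
    response = requests.put(url, json=payload, timeout=2)
except requests.exceptions.Timeout:
    # Surface the failure instead of blocking forever
    print("plugin did not respond within 2 seconds")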

4.3.3 Exception handling and testing

The framework is only tested for the so-called 'happy path': the default scenario featuring no exceptional or error conditions. Some alternate paths are handled, but most still need to be tested. An example scenario would be one of the plugin containers in a stream failing and stopping. The framework is not able to detect this and will assume that the container is still running.

4.3.4 Docker security issues

The Docker client communicates with the daemon process dockerd through a socket. This socket is a UNIX domain socket called /var/run/docker.sock. The daemon is highly privileged, having root access to the host system. Any process that can write to this socket effectively has root access. To allow the components of the framework to manipulate the plugin containers, they need access to this socket. Therefore, the socket is mounted in the container, which gives the container write access to the socket. This implies that the container now has root access on the host when writing to this socket. Because the container gets root access to the host, an attacker can walk the file tree of the host and extract sensitive information or run unwanted software. This type of attack is known as a 'Docker Breakout' or 'Container Escape' attack [110, 111].

4.3.5 Docker bridge network

The current implementation deploys the framework on a Docker bridge network, which can only be used if the framework is deployed on a single device. The current implementation can thus only be deployed on a single device. To deploy the framework on multiple devices, the framework must be deployed using a Docker overlay network [112].

4.3.6 Single stream

The implementation supports one stream, which must be a chain. Multiple streams in tree form, merging media from multiple sources and broadcasting to multiple listeners, are not supported.

4.3.7 Number of containers per plugin

The framework uses the name of the plugin as identifier for the containers. The name is also the hostname on which the container can be reached. Therefore, there can only be one active container associated with a plugin at runtime.


Chapter 5

Mob detection experiment

To try out an actual drone thermal imaging application, the mob detection experiment is carried out. The goal of this experiment is to use existing object detection algorithms on a dataset of thermal images to try and detect large crowds of people, hereinafter referred to as a mob.

Several public datasets of thermal images exist. Most datasets focus on the detection of people in scenes [113-117], some on face recognition [118, 119], others on vehicle recognition [120]. Most of these datasets are freely available through the OTCBVS Benchmark Dataset Collection [121]. No datasets containing large amounts of people were found, so the Last Post thermal dataset was created for the detection of mobs and other analysis tasks. This dataset is presented in Section 5.1.

To detect mobs in the images of the dataset, a deep learning approach using neural networks is explored. The selection and training of the model is described in Section 5.2.

5.1 Last Post thermal dataset

The Last Post dataset consists of videos of the Last Post ceremony, taking place each night at 8:00 PM (Brussels timezone) under the Menin Gate in Ypres, Belgium. Section 5.1.1 gives some insight into this unique ceremony. The full dataset is described in Section 5.1.2.

5.1.1 Last Post ceremony

The Last Post ceremony is a nightly ceremony taking place under the Menin Gate in Ypres at 8:00 PM sharp. The ceremony is held in remembrance of the fallen soldiers of World War I (1914-1918). The Last Post Association [122] states its mission as follows:

"True to its statutes, the Last Post Association wishes to honor and remember the soldiers of the British Empire who gave their lives during the Great War of 1914-1918. The Last Post ceremony seeks to express, day after day, the lasting debt of gratitude which we all owe to the men who fought and fell for the restoration of peace and the independence of Belgium."


Figure 5.1 gives an impression of the size of the ceremony. Because of the sheer number of people that gather under the gate each day, the Last Post is a unique open-air event that allowed for repeatable conditions to capture footage; the event was therefore a perfect opportunity to create the dataset.

Figure 5.1: Last Post ceremony panorama

5.1.2 Dataset description

Due to legislation in Belgium, drones cannot be flown in public areas without a certification and permit from the authorities. The creation of real aerial thermal images with a drone was thus not feasible. Therefore, an elevated position on the walls next to the Menin Gate (in order to simulate aerial images) was used to capture footage of the adjacent square on one side and the bridge on the other side. Figure 5.2 shows the locations where the video footage was captured.

Figure 5.2: Locations where the video footage was captured. The black stars represent the captured scenes; the red stars represent the locations from where the scenes were filmed.

The data was recorded with the FLIR One Generation 3 Pro camera for Android devices, hereafter referred to as "Camera" [123]. Since thermal images don't hold color information, a color scheme is used to represent the relative differences in temperature. The 'Iron' color scheme is used, which maps colder sections of a scene on blue colors and warmer sections on red and yellow colors.


The videos are encoded using the H.264 MPEG-4 codec. Decoded, the color information is captured in 4:2:0 YUV format. The frame rate of the videos varies from 7 Hz to 8 Hz, depending on the speed of the objects in the scene. There is sound present in the videos, which is encoded with the MPEG AAC codec. For a full list of sequences, the reader is referred to Appendix C.

The two locations that make up the main scenes in the dataset are presented in Figure 5.3. The thermal images and visual images of each scene are depicted next to each other. The thermal and visual images were not captured at the same time, so the mobs that are present in the thermal images can't be seen in the visual images. In both scenes, buildings are present that are quite warm compared to the surroundings, as can be seen in the thermal images. In Figure 5.3a it even becomes difficult to recognize the mob when they are standing close to the building. This is less the case for Figure 5.3c, where, due to the water present in the image, the mob has higher contrast due to the larger difference in emitted heat. Towards the far right of the image, the mob seemingly disappears into the background. The effect of two objects having a similar heat signature and having no clear transition in thermal images is defined as thermal camouflage, a technique that is often used by animals and military units [124]. This effect is even visible when looking at the mobs present in both images: because people are standing so close together, it becomes difficult to recognize individual persons in the crowd.

Figure 5.3: Main scenes in the Last Post dataset. (a) Thermal view of the square in location A; (b) visual view of the square in location A; (c) thermal view of the bridge in location B; (d) visual view of the bridge in location B.


5.2 Object detection experiment

5.2.1 Preprocessing

The Last Post dataset was not used in its entirety for training the model, because there were not enough resources to manually annotate every image. Therefore, a smaller dataset was used to train a baseline model.

The following videos were used: 2018-04-10 195029.mp4, 2018-04-10 200122.mp4, 2018-04-04 202859.mp4, 2018-04-10 202558.mp4 and 2018-04-04 200052.mp4, captured on the fourth and tenth of April 2018. These videos were used because of their contents: they contain images from locations A and B respectively, in which the mob behaves more dynamically compared to other videos. This was due to a marching band present on the fourth of April and a marching army unit on the tenth of April. See Appendix C for a summary of the contents of these videos. From these videos, images were extracted at a capture rate of 1 Hz. Each image was manually labelled using the Microsoft Visual Object Tagging Tool [125]. The tool allows the training images to be exported to various formats, such as Pascal VOC for TensorFlow, YOLO and Microsoft CNTK.
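The 1 Hz frame extraction could be implemented as in the following sketch, here with OpenCV; the file name is one of the videos above, but the output layout and helper logic are illustrative assumptions.

import os
import cv2

os.makedirs("frames", exist_ok=True)
video = cv2.VideoCapture("2018-04-10 195029.mp4")
fps = video.get(cv2.CAP_PROP_FPS)  # 7-8 Hz for these recordings
frame_id = saved = 0
while True:
    ok, frame = video.read()
    if not ok:
        break
    # Keep roughly one frame per second of video
    if frame_id % int(round(fps)) == 0:
        cv2.imwrite("frames/frame_%05d.jpg" % saved, frame)
        saved += 1
    frame_id += 1
video.release()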

Within the data, several visual outliers are present. An outlier is an observation point that is distant from other observations. It is created due to variability in capturing the videos or indicates an experimental error [126]. The errors detected here are of the latter form and are depicted in Figure 5.4. The first type of outlier is a system fault in the Camera: due to an error in the processing of the video, the Camera would sometimes not register any input. This causes the Camera to produce completely black images, as depicted in Figure 5.4a. The Camera software maps temperatures onto colors in the image. The variations of the colors are relative to the temperature interval, ranging from the minimum to the maximum temperature detected by the Camera. If the minimum and/or maximum detected temperature changes, the Camera needs to adapt its color mapping. This causes the Camera to fade to bright colors for a short period of time (1 to 2 seconds). The resulting image is depicted in Figure 5.4b. Because the resulting image is too bright and objects are hard to detect, it is considered an outlier. Due to instabilities when capturing the footage, some images in sequences with fast motion are very blurry. This makes it hard, even for a person, to decide what is visible in the frame; therefore it is considered an outlier. This is depicted in Figure 5.4c. Sometimes people would pass in front of the Camera, which resulted in brightly colored areas in the videos that were not part of the scene and therefore are another type of outlier, depicted in Figure 5.4d. Because the presented outliers are experimental errors and do not belong in the scenes, they were removed from the dataset.

Figure 5.4: Outliers. (a) System fault in the Camera: no input was detected; (b) the Camera updates to a new temperature interval; (c) due to moving the Camera too fast, the image becomes too blurry; (d) very warm object due to people passing in front of the Camera.

5.2.2 Training

The model that is used for training is YOLOv3, implemented using the Darknet neural network framework [83]. The model is trained using convolutional weights that are pre-trained on the ImageNet database [127]. The concept of reusing weights from a model previously trained on large datasets is known as transfer learning. When choosing a pre-trained model, it is very important that the problem statement of the pre-trained model is close enough to the current problem statement. For the model pre-trained on ImageNet this was to identify objects in images, which lies close to the detection of mobs in thermal images. Because the type of images (thermal versus visual) is fundamentally different, the model could suffer in performance. Goedeme et al. [36] solved a similar problem with thermal images and achieved good results, which gives an indication that detection should be feasible with the pre-trained model. Also, because the dataset is relatively small, training the model from scratch could actually hurt performance [128]. Training was carried out on an NVIDIA GeForce GTX 980 GPU, which allows training to be done much faster. To evaluate training progress, the Sum of Squared Error (SSE) loss function is calculated, defined as $\sum_{i=1}^{n}(x_{ij} - x_j)^2$, where $n$ is the number of samples in a batch used in a single training epoch and $j$ is the dimension ($x$ or $y$), as defined in [83]. The result of this training is discussed in Chapter 6.
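As a small worked example of this loss, the sketch below computes the SSE per dimension over one batch of predicted box centers; the numbers are made up for illustration and are not values from the experiment.

import numpy as np

# Three predicted (x, y) centers in one batch, and the target center
x = np.array([[0.42, 0.51],
              [0.40, 0.49],
              [0.45, 0.52]])
x_target = np.array([0.43, 0.50])

# Sum over the samples i for each dimension j (x and y)
sse_per_dim = np.sum((x - x_target) ** 2, axis=0)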


Chapter 6

Results and evaluation

The goal of this chapter is to present the results of the framework and the detection experiment. The results of the framework tests are presented in Section 6.1. The results of the object detection experiment are presented in Section 6.2.

6.1 Framework results

To evaluate the framework, acceptance tests are conducted that test whether the framework meets the QARs defined in Section 2.1. As stated in Section 4.1, only the ASRs are tested. A summary of which requirements are met by the framework is given in Table 6.1. Passed means that the framework has met the requirement, not passed that the framework hasn't met the requirement, and plausible means that the framework might have met the requirement, but not enough data could be gathered to be certain.

6.1.1 Performance evaluation

To evaluate performance, the acceptance tests for the requirements are conducted, the impact of the framework on the processing resources is recorded, and the total size of the framework is measured.

Acceptance tests

To test the performance of the framework, the execution times of each command executed with the CLI (see Section 4.2.2) are measured. Each command is executed 200 times, except for the on, off and link commands: they are measured manually 10 times. Because these commands launch system threads and their finish signal could not be captured, they had to be measured by hand. Commands were executed on a 2.6 GHz Intel Core i5-2540 processor running macOS High Sierra version 10.13.4. The summarized statistics of the tests are given in Table 6.2.

The average execution times for the Play, Stop, Pause, Add, Elements, Print, View and Link commands do not exceed the 2-second bound specified in PS-1, while the average execution times of the Delete, On and Off commands do exceed this bound. This performance requirement is not met by the framework. The same result is found for PS-3. Especially the Delete and Off commands exceed the requirements by quite a bit.


Requirement id   Status
PS-1             Not Passed
PS-2             Plausible
PS-3             Not Passed
PS-4             Plausible
PS-5             Not Passed
IS-1             Passed
IS-2             Passed
MS-1             Passed
MS-2             Passed
MS-3             Passed
MS-4             Passed
MS-5             Plausible
MS-6             Passed
MS-7             Plausible

Table 6.1: Acceptance test results summary

The Delete command shuts down a plugin and removes the Docker container from the host. This action is costly in time. The Off command removes all the plugins and all the microservices of the framework and thus suffers from the same costly action. This could be ameliorated by having the framework not remove the containers but stop them instead, which requires fewer resources, as it only stops the process running in the container but does not delete the container from the system.

PS-2 and PS-4 could not be measured, due to the GStreamer pipeline of the prototype not allowing frames to be tracked. However, since real time is a human time perception, real-time streaming is plausible if a person can't distinguish the streamed videos from videos played with a native video player [43, 44]. The videos were shown side by side to ten users, who could not distinguish between both videos, indicating presumable real-time streaming. Since the hard requirements cannot be measured, the requirements are not met, but they are plausible. Real-time streaming performance also heavily depends on the plugins used and the hardware on which they are deployed. If a plugin can't process its media fast enough, due to lack of processing power or a slow implementation, it will slow down the whole stream.

The scalability requirement PS-5 could not be met, due to the Flask Werkzeug server only being able to process one request at a time (see Section 4.3).

Only two performance requirements are met by the prototype. However, this is mostly due to some actions being very slow, such as shutting down the framework or removing a plugin. As these are actions that should occur less frequently when a user is using the framework, they are less important for the perceived quality. Frequent actions, such as adding, linking and changing the state of the stream, do perform rather well and contribute more to the perceived quality. Overall, the performance of the framework is not stellar, but not bad either. This can partially be explained by the choice of supporting frameworks.


Statistic        Play   Stop   Pause  Add    Delete  Elements  Print   View   On     Off     Link
Mean             0.690  0.804  0.634  1.363  8.402   0.562     0.564   1.22   3.58   24.023  0.849
Std deviation    0.050  0.059  0.088  1.037  4.669   0.070     0.0747  0.260  0.498  0.481   0.170
Minimum          0.629  0.708  0.549  0.516  0.505   0.517     0.517   0.757  3.015  23.707  0.637
25% percentile   0.665  0.775  0.594  1.049  1.154   0.534     0.536   0.998  3.143  23.750  0.798
Median           0.678  0.800  0.623  1.11   11.132  0.550     0.552   1.214  3.500  23.886  0.853
75% percentile   0.700  0.820  0.653  1.233  11.189  0.562     0.560   1.433  3.850  24.034  0.877
Maximum          1.016  1.279  1.631  6.25   11.846  1.227     1.149   1.691  4.562  25.326  1.261

Table 6.2: Performance test statistics summary, measured in seconds

Frameworks such as Flask are not built for performance; other, more performance-oriented frameworks such as Vert.x could improve performance.

Resource usage

The resources used by the modules of the framework are measured using the Docker statistics tool [129]. A summary of the resources used is given in Table 6.3. When the framework is idle, resource usage is negligible. When a plugin is active, there is a slight increase in resource usage. This increase depends on the runtime size of the plugin, which is unknown to the framework. The increase peaks when the plugin is processing media: CPU usage is then 40% of one core, which implies that only two such plugins can be active simultaneously on one CPU core before reaching the ceiling of the processing power. In a production environment of the framework, plugins need to be tested thoroughly, so that these metrics are known beforehand. These metrics imply that the length of streams should be kept short, to avoid having many plugins active simultaneously.

Size of framework

The total sizes of the Docker images of the components of the framework are given in Table 6.4. Most images are quite large: the framework core components have an average size of 724 MB and the plugins have sizes ranging from 1 GB to 3 GB. This size can be explained by the base images and the additionally installed software in the images. For development flexibility, the base images used are Linux Ubuntu images, which are typically larger than other Linux distributions. For the plugins, the full GStreamer library with all plugins was installed, which is more than 2 GB in size. The sizes of the components can be reduced in a production environment by choosing slimmer Linux distributions as base images and only installing the minimally needed libraries to get a working plugin.

6.1.2 Interoperability evaluation

The systems with which the framework exchanges data are the plugins. These plugins must follow the plugin model presented in Section 2.3.1: implement the presented resources using a REST API, the state machine and the protocols.


Condition                               Container     CPU usage [%]  Memory usage [MiB]
Idle                                    streamer      1.00           42.09
                                        consumer      0.03           24.4
                                        producer      0.01           24.14
1 plugin active, not processing media   streamer      1.56           42.48
                                        consumer      0.02           24.42
                                        producer      0.02           24.23
                                        mycam plugin  0.75           45.97
1 plugin active, processing media       streamer      1.56           42.51
                                        consumer      0.02           24.42
                                        producer      0.02           24.24
                                        mycam plugin  40.03          99.24

Table 6.3: Resource usage of the framework in several conditions

Image     Size [MB]
streamer  718
consumer  729
producer  729
testsrc   1250
mycam     3020

Table 6.4: Total size of framework components

If these specifications are followed by a plugin, the framework should have no issues exchanging information with it. To test this, a new mock plugin was implemented. For each resource of the plugin, the framework is given random mock input data to exchange with the plugin. When the exchange is complete, the values in the plugin are requested and compared with the given input; if the input matches the value stored in the plugin, the exchange was successful. These tests were executed 50000 times. The results are summarized in Table 6.5. Play, pause and stop are the requests that change the state of the plugin; the source/listener add, update and delete commands manipulate the sources and listeners of the plugin. Overall, almost no errors were made when exchanging information: only the delete source and delete listener commands each produced a single incorrect exchange. The ratios achieved are thus 100% correct exchanges, except for these two commands, which achieved 99.998%. IS-1 and IS-2 specify that commands exchanged with the plugins need to be correct 99.99% of the uptime, so this requirement is clearly met.
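A minimal sketch of one such exchange test, assuming the plugin exposes its source resource as a REST endpoint over HTTP (the port and endpoint paths shown here are hypothetical), using the Requests library [95]:

```python
import random
import requests

BASE = "http://localhost:5000"  # hypothetical plugin address

def exchange_sources(n_runs=50_000):
    """Write a random source to the plugin, read it back, and count matches."""
    correct = 0
    for i in range(n_runs):
        source = {"id": i, "uri": f"udp://10.0.0.{random.randint(1, 254)}:5000"}
        requests.post(f"{BASE}/sources", json=source)        # send mock input
        stored = requests.get(f"{BASE}/sources/{i}").json()  # read it back
        correct += (stored == source)                        # compare
    return correct / n_runs

if __name__ == "__main__":
    print(f"ratio of correct exchanges: {exchange_sources():.5%}")
```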

Plugins also interact with each other by transmitting media to each other according to the stream layout.

Value      Play   Pause  Stop   Add S  Update S  Delete S  Add L  Update L  Delete L

Correct    50000  50000  50000  50000  50000     49999     50000  50000     49999
Incorrect  0      0      0      0      0         1         0      0         1
Ratio (%)  100    100    100    100    100       99.998    100    100       99.998

Table 6.5: Interoperability test results (S: Source, L: Listener).

This interoperability is not directly controlled by the framework, as plugins can be developed by third parties. To solve this, a plugin needs to provide its specifications to the framework before being integrated. This allows the framework to decide whether or not two plugins will be able to interact with each other in a stream. For example, if plugin A supports MJPEG streams transmitted via RTP/UDP, it will be able to interact with a plugin B implementing the same protocols; if plugin B implements another protocol, it will not be able to interact with plugin A. If this is specified, the framework can notify a user that two plugins are not compatible. These scenarios should be avoided, which is done by specifying standard protocols for plugins.
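To illustrate the idea, a sketch of such a compatibility check is given below; the specification format (sets of format/transport pairs) is hypothetical and only serves to show the decision the framework has to make:

```python
# Hypothetical plugin specifications: each plugin advertises the media
# formats and transports it can emit ("out") and accept ("in").
PLUGIN_A = {"out": {("MJPEG", "RTP/UDP")}}
PLUGIN_B = {"in": {("MJPEG", "RTP/UDP"), ("H264", "RTP/UDP")}}

def compatible(upstream: dict, downstream: dict) -> bool:
    """Two plugins can be linked if they share at least one
    (format, transport) combination on the connecting edge."""
    return bool(upstream.get("out", set()) & downstream.get("in", set()))

if __name__ == "__main__":
    # The framework can run this check before linking two plugins in a
    # stream, and warn the user when the intersection is empty.
    print(compatible(PLUGIN_A, PLUGIN_B))  # True
```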

6.1.3 Modifiability evaluation

Plugins are installed for the prototype by building their image and adding it to the image directory of the Docker host. The framework does not need a restart to install these images; therefore, requirements MS-1 and MS-2 are met. End-users can extend their version of the framework with new plugins by building the respective plugin images, meeting MS-3. Streams can be modified by linking different plugins by design, meeting MS-4. The framework detects newly installed plugins at startup if the image is installed in the image directory of the Docker host; therefore, requirements MS-5 and MS-6 are met. The current prototype is only deployable on a local device, as discussed in Section 4.1, meeting requirement MS-7. The remaining requirements can be met by deploying the framework using the Docker overlay network, as discussed in Section 4.3, without having to implement changes to the code base: requirements MS-8 and MS-9 are not met by the prototype, but become plausible with a different Docker deployment.

In general, the framework was designed to be modifiable for different video analysis tasks. The hybrid microkernel/microservices architecture enables this modifiability: the microkernel (plugin) architecture allows a user to modify a video analysis stream during framework use, and the microservices architecture allows for a modifiable deployment configuration.

6.2 Mob detection experiment results

To evaluate the detection experiment, the trained model is tested on a validation set containing random images from the total annotated dataset presented in Section 5.1.2. First, the results of training the model are presented in Section 6.2.1. Second, the metrics used to evaluate the model are presented in Section 6.2.2. Finally, the results of the validation are presented in Section 6.2.3.


6.2.1 Training results

To monitor training, the average loss per training epoch was measured; the resulting training evolutions are depicted in Figure 6.1. Darknet does not shuffle training data automatically and creates training batches in the order of the training data provided. Since YOLO uses gradient descent for optimization, this can lead to YOLO getting stuck in local minima of the cost surface [130]. This effect is seen in Figure 6.1a: around epoch 4500, every image in the training set has been loaded at least once; at this point the model had been training on images from location B, and images from location A are loaded next (see Section 5.1.2). This leads to a peak in average loss, as YOLO was optimizing on images from location B and probably converging to a local minimum for that type of images. Therefore, in a second run the data was shuffled, allowing the model to escape local minima more easily. Figure 6.1b shows the difference in training loss: the curve is much more irregular thanks to the shuffling of the data. Once again the average loss decreases more around epoch 4500, when every image in the training set has been loaded at least once. The average loss stagnates at values in the interval [0.04, 0.07]. To avoid overfitting the model on the training data, which would lead to worse generalization performance, early stopping is applied. Early stopping is a generalization technique that stops the training of a neural network early, before the network starts overfitting [131]. The stopping criterion used is progress, defined as the decrease of training error in successive training epochs [131], i.e. the slope of the loss curve depicted in Figure 6.1. This slope approaches 0 from epoch 13000 onward, so this epoch is selected as the early stopping point. Because the generalization error is not a smooth curve and consists of many local minima, it is a good idea to also validate model weights in the neighborhood of the early stopping point, as these could potentially yield better performance on the validation set [131].
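A minimal sketch of this progress-based stopping criterion, assuming the average loss per epoch is available as a list (the window size and threshold are illustrative values, not those of the experiment):

```python
def stopping_epoch(losses, window=500, threshold=1e-5):
    """Return the first epoch at which the loss-curve slope over the
    preceding `window` epochs drops below `threshold` (i.e. no progress)."""
    for epoch in range(window, len(losses)):
        slope = (losses[epoch] - losses[epoch - window]) / window
        if abs(slope) < threshold:
            return epoch
    return len(losses) - 1  # never stagnated: stop at the last epoch

# Example: a loss curve that decays and then flattens out.
losses = [1.0 / (1 + 0.01 * e) for e in range(20_000)]
print(stopping_epoch(losses))
```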

6.2.2 Metrics

The model predicts bounding boxes for objects in the images of the validation set. The bounding box provided by the annotated dataset is defined as the ground truth bounding box B_gt; the bounding box provided by the model is defined as the predicted bounding box B_p. To evaluate the performance of the model and select the best weights, several metrics are used. The standard metrics used to evaluate object detection problems are the Intersection over Union (IoU) and the mean Average Precision (mAP). The IoU is a metric used in common object detection challenges, such as the Pascal VOC challenge [132]. If the function A(B_x) gives the area of a bounding box B_x, the IoU is defined as

\mathrm{IoU} = \frac{A(B_p \cap B_{gt})}{A(B_p \cup B_{gt})} \qquad (6.1)

The mAP for a set of detections, another metric used in the Pascal VOC challenge, is defined as the mean over classes of the interpolated average precision (AP) for each class. A detection is considered a true positive if the IoU for the detection is greater than 0.5. The interpolated AP is given by the area under the precision-recall curve for the detections [132-134].
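As an illustration of Equation (6.1), the following sketch computes the IoU of two axis-aligned boxes; the (x_min, y_min, x_max, y_max) box format is an assumption for the example, not the annotation format of the dataset:

```python
def iou(box_p, box_gt):
    """Intersection over Union of two (x_min, y_min, x_max, y_max) boxes."""
    # Corners of the intersection rectangle.
    x1, y1 = max(box_p[0], box_gt[0]), max(box_p[1], box_gt[1])
    x2, y2 = min(box_p[2], box_gt[2]), min(box_p[3], box_gt[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_p = (box_p[2] - box_p[0]) * (box_p[3] - box_p[1])
    area_gt = (box_gt[2] - box_gt[0]) * (box_gt[3] - box_gt[1])
    union = area_p + area_gt - inter
    return inter / union if union > 0 else 0.0

# IoU = 80 / 120 ~= 0.67, above the 0.5 true-positive threshold.
print(iou((0, 0, 10, 10), (2, 0, 12, 10)))
```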

The model is also tested on several videos not included in the training and validation sets, to visually evaluate detection and to measure the number of frames per second that can be processed by the model.


Figure 6.1: Average training loss per epoch (vertical axis: average loss; horizontal axis: training epochs). (a) training data not shuffled; (b) training data shuffled.

6.2.3 Validation results

YOLO creates a snapshot of the weights the model is using every 100 epochs [83]. This makes it possible to validate each set of weights on the validation set and show the evolution of the validation performance. Figure 6.2 shows these evolutions for the average IoU and mAP metrics. The mAP gradually grows from epoch 4500 onwards and stagnates around epoch 11500. This shows that the model is no longer learning and is at risk of overfitting. The mAP stagnates in the interval [88%, 91%]. The average IoU shows a similar trend but varies more, because predictions on the same images are rarely exactly the same.

The best mAP value, 90.52%, is achieved at epoch 15700. The weights from this epoch are used for further testing and validation. The mAP at the 0.5 IoU threshold of YOLOv3 on the COCO benchmark dataset [135] is 74.8%; compared to this, the mAP achieved on the Last Post dataset is very high. The reason for this difference is that the validation


Figure 6.2: Validation metrics per training epoch. (a) mAP (%) per epoch; (b) IoU (%) per epoch.

set has a high correlation with the training set. Because the training and validation sets were extracted from videos, all images from one video are correlated in time with each other. Images from the validation set are thus correlated with images in the training set, and the model is optimized on exactly these types of images, explaining the high mAP. This indicates that the model is somewhat overfitting on the training data, which was confirmed when testing the model on unseen videos: although the model could detect a mob most of the time, it produced more visual errors. Because this data was not annotated, no metrics could be extracted. Figure 6.3 depicts some predictions of the model on images from the validation set; visually, the predicted bounding boxes resemble the ground truth bounding boxes quite accurately.

To test the prediction speed of the model, the total time to predict all images in the validation set was measured. On the NVIDIA GeForce GTX 980 GPU, the average prediction time for one image is 14.673 milliseconds, with a standard deviation of 0.517 milliseconds. This puts the upper limit of the frame rate when making predictions on a video at approximately 68 frames per second on this GPU. For comparison, predictions with the model were also made on a CPU, a 2.6 GHz Intel Core i5-2540 processor with AVX instruction speedup. The average prediction time on the CPU is 5.849 seconds, with a standard deviation of 0.438 seconds, resulting in an upper limit for the frame rate on the CPU of 0.171 frames per second. Clearly, real-time object detection with this model is only possible on a GPU. When generating predictions on a test video, the average frame rate of the video was 55 frames per second.
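The frame rate upper bounds follow directly from the measured average prediction times:

f_{\max} = \frac{1}{\bar{t}} \quad\Rightarrow\quad f_{\mathrm{GPU}} = \frac{1}{0.014673\ \mathrm{s}} \approx 68\ \mathrm{fps}, \qquad f_{\mathrm{CPU}} = \frac{1}{5.849\ \mathrm{s}} \approx 0.171\ \mathrm{fps}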


Figure 6.3: Predictions of the model on images in the validation set. (a) a large mob at location B; (b) the mob at location A; (c) a small mob at location B; (d) the mob at location B.


Chapter 7

Conclusion and future work

7.1 Conclusion

Aerial thermal imaging with drones is a promising technology that can enable many applications across different domains, such as agriculture, fire fighting, and search and rescue. Most applications built with this technology are built with a specific use case in mind, using a thermal camera and analysis software chosen specifically for that use case, and therefore struggle to exchange hardware and algorithms for new use cases. The goal of this dissertation was therefore to design, build and test a possible backbone framework that allows building these applications in a modifiable way. The specific use case of mob detection in thermal images was investigated as a sample use case for the framework.

Chapter 2 explored the requirements of such a framework. The ASRs to achieve the goal of the framework are performance, interoperability and modifiability. Performance is needed because some use cases (like fire fighting) require real-time video analysis. Interoperability enables the framework to interact with different thermal cameras and different processing/analysis modules. Modifiability enables the framework to interchange the thermal cameras and analyzers in its process, to build applications for different use cases. A hybrid combination of the microkernel pattern and the microservices pattern is used to meet these requirements: the microkernel pattern enables interchanging the cameras and analyzers via a plugin system, and the microservices pattern enables different deployment configurations for the framework. To build and test the framework, several technologies were needed: backbone technologies for the software architecture, a thermal camera, and an object detection algorithm for the mob detection use case.

Chapter 3 explored the state of the art of these technologies and presented the selected ones. Thermal cameras come in all shapes and sizes and have different features according to their retail price. Contrary to intuition, the image quality is not the defining factor of the retail price, but rather the number of extra features, such as radiometry and communication interfaces. The FLIR One Pro and Therm-App were selected for this dissertation, since they offer good quality images and features for their price, and their use via smartphone platforms makes these cameras excellent for prototyping. Microservices frameworks also vary widely, depending on the use case of the application using the framework: some are aimed at quick prototyping, others focus on performance, etc. Flask was selected as the microservices framework, as it is easy to use and designed for prototyping with microservices; this does come with a performance trade-off. To deploy the microservices in a plugin fashion,


the concept of containers is applied. Containers virtualize on the OS level, allowing the microservices to be moved around on the host and distributed over different hosts. The current field has several frameworks implementing this technology, with Docker being the most well-known and mature framework; it was selected for that reason. The field of object detection has a variety of solutions for the object detection problem, with varying accuracies, and some can even create predictions in real-time. The YOLOv3 algorithm, implemented in the Darknet framework, was selected as it generalizes well to other datasets (such as thermal images), makes relatively accurate predictions, and is able to predict in real-time when deployed on a device with GPU processing capabilities.

Chapter 4 presented the implemented prototype of the framework using these technologies. Two sample plugins were implemented: the filecam plugin, which serves a video read in from a file, and the display plugin, which displays this video on the local device. The framework is limited to one video processing stream for one user at a time and is deployed on a local device. It also has a security risk: the framework has to expose the Docker daemon socket to allow the framework to manipulate the containers running the plugins. This gives the containers that run the core framework processes root access to the host system, which can be abused by potential attackers.

The mob detection experiment is presented in Chapter 5. A new thermal image dataset, called the Last Post dataset, was collected for this experiment. The dataset features videos of the Last Post ceremony, filmed over the course of two weeks. What makes this dataset special is that, unlike publicly available datasets, it delivers footage of the movement of large crowds, filmed from a high vantage point to simulate footage captured from a drone platform. This dataset is used to train a pre-trained YOLOv3 model via transfer learning. The dataset is manually labeled and preprocessed by removing the outliers present. Training is done on an NVIDIA GTX 980 GPU and is evaluated using the MSE loss metric.

Chapter 6 presented the tests conducted on the framework and the detection model, and their corresponding results. The performance requirements for the frequently used commands are met by the framework. Other commands, such as removing plugins and starting up and shutting down the framework, do not meet the performance requirements, since Docker requires significant time to start, stop and remove containers. The real-time streaming requirements could not be proven, because the time between transmitting a frame and receiving a frame could not be measured directly. However, the processed videos were shown to human users, who could not distinguish between the processed video and the same video played back on a local system, which makes it plausible that the framework achieves this requirement. Real-time streaming performance heavily depends on the plugin and the hardware on which it is deployed. When plugins in the framework are processing media, CPU usage increases significantly, even when only one plugin is active. This implies that the length of media processing streams should be kept as short as possible to achieve good performance. The framework is relatively big, with some plugins reaching sizes of 3 GB. This is mostly due to the base images and installed libraries of the plugins and core components. Because each component and plugin has its own container, libraries cannot be shared, so they are redundantly installed, leading to large component sizes. This could be alleviated by using slimmer base images and installing only the minimally needed libraries. The interoperability requirements are all met by the framework; this is proven by a test exchanging mock information between the framework and the plugins. The modifiability requirements regarding the plugins are met by the framework. The modifiability requirements regarding the deployment schemes are not met, but can be achieved by deploying the framework using a Docker overlay network instead of the Docker bridge network. To evaluate the trained model, the model made predictions


on a separate validation set. The model achieves an mAP of 90.52%, which is much higher than what current state-of-the-art models achieve on benchmark datasets. This shows that the model is capable of learning the thermal features, but is also overfitting on the data due to the temporal correlation between the training and validation sets. The model can predict in real-time, achieving an average frame rate of 55 frames per second when making predictions on a GPU.

7.2 Future work

This dissertation proposed a framework and implemented a prototype that realizes only part of the total framework. Object detection using deep learning, in general and specifically on thermal images, is still a young field. Several extensions to this research are possible.

7.2.1 Security

The framework prototype did not implement any security measures. Because communications in distributed configurations rely on an external network, these measures should be implemented to reduce the risk of attacks. To allow the components to manipulate Docker containers, the Docker host socket was exposed. As stated before, this is a serious security risk, as the container gets root access to the host. Workarounds for this problem could be to implement a Docker-in-Docker environment [136] or to deploy the containers in a VM.

7.2.2 Implementing a detection plugin

Due to the scope and time limit of the dissertation, a working prototype plugin containing a trained model for detecting objects in a video stream could not be made. A possible GStreamer pipeline for such a plugin is depicted in Figure 7.1. This plugin is a Consumer and receives video via the udpsrc element. Frames are decoded, and the raw video is presented to the appsink GStreamer element, which allows the video to be handed over to an application: here, the detection model that generates predictions on the frame. The predicted frame is then forwarded to an appsrc GStreamer element, which puts the predicted frame into a new pipeline that transmits it to further framework plugins. It should be tested whether the detection model can run in a Docker container, since it needs GPU support to be able to predict in real-time. A solution could be to use nvidia-docker, which leverages NVIDIA GPU support in Docker containers [137].

Figure 7.1: GStreamer pipeline for a plugin with a detection model.
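A minimal sketch of this pipeline in Python is given below, assuming MJPEG over RTP/UDP as used elsewhere in the framework; the ports are hypothetical, the detect function is a stub standing in for the trained model, and caps/format negotiation details are omitted for brevity:

```python
import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst, GLib

Gst.init(None)

# Receiving pipeline: RTP/MJPEG in, raw frames handed over via appsink.
recv = Gst.parse_launch(
    'udpsrc port=5000 caps="application/x-rtp, encoding-name=(string)JPEG, '
    'payload=(int)26" ! rtpjpegdepay ! jpegdec ! videoconvert '
    '! appsink name=sink emit-signals=true')

# Transmitting pipeline: predicted frames re-enter GStreamer via appsrc.
send = Gst.parse_launch(
    'appsrc name=src is-live=true format=time ! videoconvert ! jpegenc '
    '! rtpjpegpay ! udpsink host=127.0.0.1 port=5001')
src = send.get_by_name('src')

def detect(buf):
    # Hypothetical stub: run the trained model on the frame and return a
    # buffer with the predicted bounding boxes drawn onto it.
    return buf

def on_sample(sink):
    sample = sink.emit('pull-sample')                     # raw frame in
    src.emit('push-buffer', detect(sample.get_buffer()))  # predicted frame out
    return Gst.FlowReturn.OK

recv.get_by_name('sink').connect('new-sample', on_sample)
send.set_state(Gst.State.PLAYING)
recv.set_state(Gst.State.PLAYING)
GLib.MainLoop().run()
```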


7.2.3 Different deployment configurations

The prototype of the framework implemented only one of the deployment configurations presented in Section 2.3.3. Other configurations can be explored by changing the Docker bridge network to a Docker overlay network.
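Using the Docker SDK for Python [96], such a change could look like the following sketch; the network and container names are hypothetical, and the hosts must first be joined into a Docker swarm:

```python
import docker

client = docker.from_env()

# An attachable overlay network spans all hosts in the swarm, so framework
# containers on different devices can reach each other by container name.
client.networks.create("framework-net", driver="overlay", attachable=True)

# Plugins can then be started on this network instead of the default bridge.
client.containers.run("mycam", detach=True, name="mycam",
                      network="framework-net")
```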

7.2.4 Multiple streams with different layouts

The prototype implemented only one stream, with a chain-like layout. Future effort could implement support for multiple streams that run concurrently. The layout can be changed by implementing plugins that can forward media to multiple destinations, or that merge media coming from different sources, which is the concept of sensor fusion.

7.2.5 Implementing the plugin distribution service (Remote Producer/Consumer)

Chapter 2 presented the Remote Producer and Consumer, which distribute the plugins available to the framework. This was deemed out of scope for the prototype, but could be implemented in future versions.

7.2.6 Using high-performance microservices backbone frameworks

The current implementation uses the Flask framework, which is excellent for prototyping but not ideal for high performance. Other frameworks, such as Vert.x, focus on high performance through asynchronous messaging, which could improve the performance of the framework.

7.2.7 New object detection models and datasets specifically for thermal images

Current effort in object detection models goes towards challenges on benchmark datasets of visual images, such as ImageNet and Pascal VOC. There are some thermal datasets publicly available for specific detection purposes, but these are very small compared to the visual image datasets. Future research could create new benchmark datasets, similar to the visual image datasets, specifically for thermal images.

Currently, publicly available pre-trained neural network models are designed for and trained on the visual image datasets. Future research could go towards designing an architecture specifically for thermal images and training a model on such a benchmark dataset.

Thermal images use several colormaps to map the relative temperatures in a scene onto colors presenting warm and cold regions. Well-known examples are the Iron scheme (used in this dissertation), White-hot and Black-hot. Some companies implement threshold colors that highlight very hot or very cold spots in an image (for examples, see [138, 139]). Future research could investigate how models trained on images using different color schemes differ in their predictions and performance.

Thermal images could also benefit from radiometric information, which adds a temperature dimension to each pixel in the image instead of the relative coloring. This information could lead to more accurate predictions.


Bibliography

[1] S. G. Gupta, M. M. Ghonge, and P. Jawandhiya, "Review of Unmanned Aircraft System," International Journal of Advanced Research in Computer Engineering & Technology, vol. 2, no. 4, 2013. ISSN: 2278-1323.
[2] M. Hassanalian and A. Abdelkefi, "Classifications, applications, and design challenges of drones: A review," 2017. DOI: 10.1016/j.paerosci.2017.04.003. [Online]. Available: http://ac.els-cdn.com/S0376042116301348/1-s2.0-S0376042116301348-main.pdf
[3] M. Joel, The Booming Business of Drones, 2013. [Online]. Available: https://hbr.org/2013/01/the-booming-business-of-drones (visited on 01/30/2018).
[4] DJI, Zenmuse H3 - 2D. [Online]. Available: https://www.dji.com/zenmuse-h3-2d (visited on 01/30/2018).
[5] Gimbal Guard, Drop & Delivery Device for DJI Mavic Pro. [Online]. Available: http://www.gimbal-guard.com/_pprd134610820141productdrop-%26-delivery-device-for-dji-mavic-pro (visited on 01/30/2018).
[6] FLIR Systems, Aerial Thermal Imaging Kits. [Online]. Available: http://www.flir.com/suas/aerial-thermal-imaging-kits (visited on 01/30/2018).
[7] R. Gade and T. B. Moeslund, "Thermal cameras and applications: a survey," Machine Vision and Applications, vol. 25, pp. 245-262, 2014. DOI: 10.1007/s00138-013-0570-5. [Online]. Available: https://link.springer.com/content/pdf/10.1007%2Fs00138-013-0570-5.pdf
[8] M. C. Harvey, J. V. Rowland, and K. M. Luketina, "Drone with thermal infrared camera provides high resolution georeferenced imagery of the Waikite geothermal area, New Zealand," 2016. DOI: 10.1016/j.jvolgeores.2016.06.014. [Online]. Available: https://ac.els-cdn.com/S0377027316301421/1-s2.0-S0377027316301421-main.pdf
[9] S. Amici, M. Turci, S. Giammanco, L. Spampinato, and F. Giulietti, "UAV Thermal Infrared Remote Sensing of an Italian Mud Volcano," vol. 2, pp. 358-364, 2013. DOI: 10.4236/ars.2013.24038. [Online]. Available: http://www.scirp.org/journal/ars; http://dx.doi.org/10.4236/ars.2013.24038
[10] J. Bendig, A. Bolten, and G. Bareth, "Introducing a low-cost mini-UAV for thermal- and multispectral-imaging," 2012. [Online]. Available: https://www.int-arch-photogramm-remote-sens-spatial-inf-sci.net/XXXIX-B1/345/2012/isprsarchives-XXXIX-B1-345-2012.pdf


[11] Workswell, "Using the UAV Thermography for Cultivation and Phenotyping of Cereals," Tech. Rep., 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/Cultivation-and-Phenotyping-1.pdf
[12] A. J. Rivera, A. D. Villalobos, J. C. Monje, J. A. Mariñas, and C. M. Oppus, "Post-disaster rescue facility: Human detection and geolocation using aerial drones," IEEE Region 10 Annual International Conference, Proceedings/TENCON, pp. 384-386, 2017. ISSN: 21593450. DOI: 10.1109/TENCON.2016.7848026.
[13] P. Christiansen, K. A. Steen, R. N. Jørgensen, and H. Karstoft, "Automated detection and recognition of wildlife using thermal cameras," Sensors (Basel, Switzerland), vol. 14, no. 8, pp. 13778-93, Jul. 2014. ISSN: 1424-8220. DOI: 10.3390/s140813778. [Online]. Available: http://www.ncbi.nlm.nih.gov/pubmed/25196105; http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=PMC4179058
[14] J. Zhang, J. Hu, J. Lian, Z. Fan, X. Ouyang, and W. Ye, "Seeing the forest from drones: Testing the potential of lightweight drones as a tool for long-term forest monitoring," Biological Conservation, vol. 198, pp. 60-69, 2016. [Online]. Available: http://ac.els-cdn.com/S0006320716301100/1-s2.0-S0006320716301100-main.pdf
[15] D. Ventura, M. Bruno, G. Jona Lasinio, A. Belluscio, and G. Ardizzone, "A low-cost drone based application for identifying and mapping of coastal fish nursery grounds," Estuarine, Coastal and Shelf Science, vol. 171, pp. 85-98, Mar. 2016. ISSN: 02727714. DOI: 10.1016/j.ecss.2016.01.030. [Online]. Available: http://linkinghub.elsevier.com/retrieve/pii/S0272771416300300
[16] S. Chowdhury, A. Emelogu, M. Marufuzzaman, S. G. Nurre, and L. Bian, "Drones for disaster response and relief operations: A continuous approximation model," 2017. DOI: 10.1016/j.ijpe.2017.03.024. [Online]. Available: www.elsevier.com/locate/ijpe
[17] Workswell, "Pipeline inspection with thermal diagnostics," 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/pipeline.pdf
[18] Workswell, "Thermo diagnosis of photovoltaic power plants," 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/Workswell-WIRIS_photovoltaic.pdf
[19] Workswell, "Thermodiagnostics of flat roofs," 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/roof.pdf
[20] Workswell, "Thermodiagnostics in the power engineering sector," Tech. Rep., 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/highvoltage.pdf
[21] Workswell, Workswell WIRIS - Product - Thermal camera for drones, 2016. [Online]. Available: https://www.drone-thermal-camera.com/wiris (visited on 01/30/2018).
[22] TEAX Technology, ThermalCapture - Thermal Imaging Technology | Capture raw radiometric thermal data with drones. [Online]. Available: http://thermalcapture.com (visited on 01/30/2018).


[23] DJI, Zenmuse XT - unlock the possibilities of sight - DJI, 2018. [Online]. Available: https://www.dji.com/zenmuse-xt (visited on 01/30/2018).
[24] Workswell, SOFTWARE - Workswell WIRIS - Thermal camera for drones, 2016. [Online]. Available: https://www.drone-thermal-camera.com/software (visited on 01/31/2018).
[25] Therm-App, Therm-App™ - Android-apps op Google Play, 2018. [Online]. Available: https://play.google.com/store/apps/details?id=com.thermapp (visited on 01/31/2018).
[26] B. Satzger, W. Hummer, C. Inzinger, P. Leitner, and S. Dustdar, "Winds of change: From vendor lock-in to the meta cloud," IEEE Internet Computing, vol. 17, no. 1, pp. 69-73, 2013. ISSN: 10897801. DOI: 10.1109/MIC.2013.19.
[27] J. Divya, Drone Technology and Usage: Current Uses and Future Drone Technology, 2017. [Online]. Available: http://uk.businessinsider.com/drone-technology-uses-2017-7?r=US&IR=T (visited on 01/31/2018).
[28] A. Boulanger, "Open-source versus proprietary software: Is one more reliable and secure than the other?" IBM Systems Journal, vol. 44, no. 2, pp. 239-248, 2005. ISSN: 0018-8670. DOI: 10.1147/sj.442.0239. [Online]. Available: http://ieeexplore.ieee.org/document/5386727
[29] M. Kazmeyer, Disadvantages of Proprietary Software. [Online]. Available: http://smallbusiness.chron.com/disadvantages-proprietary-software-65430.html (visited on 01/31/2018).
[30] B. Steffen and A. Seyfried, "Methods for measuring pedestrian density, flow, speed and direction with minimal scatter," Physica A: Statistical Mechanics and its Applications, vol. 389, no. 9, pp. 1902-1910, May 2010. ISSN: 0378-4371. DOI: 10.1016/j.physa.2009.12.015. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0378437109010115
[31] M. Wirz, T. Franke, D. Roggen, E. Mitleton-Kelly, P. Lukowicz, and G. Tröster, "Inferring crowd conditions from pedestrians' location traces for real-time crowd monitoring during city-scale mass gatherings," Proceedings of the Workshop on Enabling Technologies: Infrastructure for Collaborative Enterprises, WETICE, pp. 367-372, 2012. ISSN: 15244547. DOI: 10.1109/WETICE.2012.26.
[32] E. Alpaydin, Introduction to Machine Learning, 3rd ed. MIT Press, 2014, p. 591. ISBN: 026201243X. [Online]. Available: https://dl.acm.org/citation.cfm?id=1734076
[33] J. W. Davis and V. Sharma, "Robust background-subtraction for person detection in Thermal Imagery," IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, vol. 2004-Janua, no. January, 2004. ISSN: 21607516. DOI: 10.1109/CVPR.2004.431.
[34] W. Wang, J. Zhang, and C. Shen, "Improved Human Detection And Classification in Thermal Images," pp. 2313-2316, 2010.
[35] R. Appel, S. Belongie, P. Perona, and P. Dollár, "Fast Feature Pyramids for Object Detection," PAMI, vol. 36, no. 8, pp. 1-14, 2014. ISSN: 01628828. DOI: 10.1109/TPAMI.2014.2300479. [Online]. Available: https://vision.cornell.edu/se3/wp-content/uploads/2014/09/DollarPAMI14pyramids_0.pdf
[36] T. Goedeme, "Projectresultaten VLAIO TETRA-project," KU Leuven, Louvain, Tech. Rep., 2017.


[37] L.-L. Slattery, DroneSAR wants to turn drones into search-and-rescue heroes, 2017. [Online]. Available: https://www.siliconrepublic.com/start-ups/dronesar-search-and-rescue-drone-software (visited on 05/26/2018).
[38] AWS Inc., What Is Amazon Kinesis Video Streams?, 2018. [Online]. Available: https://docs.aws.amazon.com/kinesisvideostreams/latest/dg/what-is-kinesis-video.html (visited on 05/26/2018).
[39] US Government, "Systems Engineering Fundamentals," Defence Acquisition University Press, no. January, p. 223, 2001. ISSN: 1872-7565. DOI: 10.1016/j.cmpb.2010.05.002. [Online]. Available: http://www.dtic.mil/docs/citations/ADA387507
[40] L. Bass, P. Clements, and R. Kazman, Software Architecture in Practice, 3rd ed. Addison-Wesley Professional, 2012. ISBN: 0321815734, 9780321815736.
[41] J. Greene and M. Stellman, Applied Software Project Management, 2006, p. 324. ISBN: 978-0596009489. [Online]. Available: http://www.oreilly.com/catalog/appliedprojectmgmt
[42] S. Barber, Acceptable application response times vs. industry standard, 2018. [Online]. Available: https://searchsoftwarequality.techtarget.com/tip/Acceptable-application-response-times-vs-industry-standard (visited on 05/28/2018).
[43] T. Burger, How Fast Is Realtime? Human Perception and Technology | PubNub, 2015. [Online]. Available: https://www.pubnub.com/blog/how-fast-is-realtime-human-perception-and-technology (visited on 05/28/2018).
[44] S.-t. Modeling, P. Glennie, and N. Thrift, "Time perception models," Neuron, pp. 15696-15699, 1992.
[45] M. Richards, Software Architecture Patterns, first edition, H. Scherer, Ed. O'Reilly Media, 2015. [Online]. Available: http://www.oreilly.com/programming/free/files/software-architecture-patterns.pdf
[46] C. Richardson, Microservice Architecture pattern, 2017. [Online]. Available: http://microservices.io/patterns/microservices.html (visited on 12/02/2017).
[47] P. Clements, F. Bachmann, L. Bass, D. Garlan, J. Ivers, R. Little, P. Merson, R. Nord, and J. Stafford, Documenting Software Architectures, 2nd ed. Boston: Pearson Education, Inc., 2011. ISBN: 0-321-55268-7.
[48] Object Management Group, "Unified Modeling Language v2.5.1," no. December, 2017. [Online]. Available: http://www.omg.org/spec/UML/2.5.1
[49] C. De La Torre, C. Maddock, J. Hampton, P. Kulikov, and M. Jones, Communication in a microservice architecture, 2017. [Online]. Available: https://docs.microsoft.com/en-us/dotnet/standard/microservices-architecture/architect-microservice-container-applications/communication-in-microservice-architecture (visited on 04/27/2018).
[50] H. Schulzrinne and S. Casner, "RTP Profile for Audio and Video Conferences with Minimal Control," 2003. [Online]. Available: https://tools.ietf.org/html/rfc3551
[51] D. Bull, Communicating Pictures: A Course in Image and Video Coding. Elsevier Science, 2014. ISBN: 9780080993744. [Online]. Available: https://books.google.be/books?id=PDZOAwAAQBAJ
[52] On-Net Surveillance Systems Inc., "MJPEG vs MPEG4: Understanding the differences, advantages and disadvantages of each compression technique," 2006. [Online]. Available: www.onssi.com


[53] Micro Air Vehicle Protocol (MAVLink), Introduction · MAVLink Developer Guide, 2013. [Online]. Available: https://mavlink.io/en (visited on 09/14/2017).
[54] H. Schlosser, Microservices trends 2017: Strategies, tools and frameworks - JAXenter, 2017. [Online]. Available: https://jaxenter.com/microservices-trends-2017-survey-133265.html (visited on 03/24/2018).
[55] A. Ronacher, Welcome to Flask — Flask Documentation (0.12), 2017. [Online]. Available: http://flask.pocoo.org/docs/0.12 (visited on 03/24/2018).
[56] F. Reyes, PythonDecorators, 2017. [Online]. Available: https://wiki.python.org/moin/PythonDecorators (visited on 04/27/2018).
[57] StackShare, Companies that use Flask and Flask Integrations, 2018. [Online]. Available: https://stackshare.io/flask (visited on 03/24/2018).
[58] Falcon, Falcon - Bare-metal web API framework for Python. [Online]. Available: https://falconframework.org/#sectionAbout (visited on 03/24/2018).
[59] StackShare, Companies that use Falcon and Falcon Integrations, 2018. [Online]. Available: https://stackshare.io/falcon (visited on 03/24/2018).
[60] A. Ronacher, Nameko for Microservices, 2015. [Online]. Available: http://lucumr.pocoo.org/2015/4/8/microservices-with-nameko (visited on 03/24/2018).
[61] C. Escoffier, Building Reactive Microservices in Java, 2017. ISBN: 9781491986264.
[62] C. Posta, Microservices for Java Developers. ISBN: 9781491963081.
[63] R. Dua, A. R. Raja, and D. Kakadia, "Virtualization vs Containerization to support PaaS," in IEEE International Conference on Cloud Engineering, 2014. ISBN: 9781479937660. DOI: 10.1109/IC2E.2014.41.
[64] D. Merkel, "Docker: Lightweight Linux Containers for Consistent Development and Deployment," 2014. [Online]. Available: http://delivery.acm.org/10.1145/2610000/2600241/11600.html (visited on 03/19/2018).
[65] Docker Inc., Docker for the Virtualization Admin, 2016, p. 12.
[66] Docker Inc., What is a Container?, 2018. [Online]. Available: https://www.docker.com/what-container (visited on 03/24/2018).
[67] M. Helsley, LXC: Linux container tools, 2009. [Online]. Available: https://www.ibm.com/developerworks/linux/library/l-lxc-containers (visited on 05/21/2018).
[68] J. Fink, Docker: a Software as a Service, Operating System-Level Virtualization Framework, 2014. [Online]. Available: http://journal.code4lib.org/articles/9669 (visited on 03/19/2018).


[69] C. Wang, What is Docker? Linux containers explained, 2017. [Online]. Available: https://www.infoworld.com/article/3204171/linux/what-is-docker-linux-containers-explained.html (visited on 05/21/2018).
[70] CoreOS, rkt: a security-minded, standards-based container engine. [Online]. Available: https://coreos.com/rkt (visited on 03/24/2018).
[71] F. Xu, X. Liu, and K. Fujimura, "Pedestrian detection and tracking with night vision," IEEE Transactions on Intelligent Transportation Systems, vol. 6, no. 1, pp. 63-71, 2005. ISSN: 1524-9050. DOI: 10.1109/TITS.2004.838222.
[72] H. Nanda and L. Davis, "Probabilistic template based pedestrian detection in infrared videos," IEEE Intelligent Vehicles Symposium, Proceedings, vol. 1, pp. 15-20, 2003. DOI: 10.1109/IVS.2002.1187921.
[73] R. E. Schapire, "Explaining AdaBoost," Empirical Inference: Festschrift in Honor of Vladimir N. Vapnik, pp. 37-52, 2013. DOI: 10.1007/978-3-642-41136-6_5.
[74] P. Viola, O. M. Way, M. J. Jones, and D. Snow, "Detecting pedestrians using patterns of motion and appearance," International Journal of Computer Vision, vol. 63, no. 2, pp. 153-161, 2005. DOI: 10.1109/ICCV.2003.1238422.
[75] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016. http://www.deeplearningbook.org
[76] J. R. R. Uijlings, K. E. A. Van De Sande, T. Gevers, and A. W. M. Smeulders, "Selective Search for Object Recognition," Tech. Rep., 2012. DOI: 10.1007/s11263-013-0620-5. arXiv: 1409.4842. [Online]. Available: http://www.cs.cornell.edu/courses/cs7670/2014sp/slides/VisionSeminar14.pdf
[77] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Region-Based Convolutional Networks for Accurate Object Detection and Segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 1, pp. 142-158, 2014. ISSN: 01628828. DOI: 10.1109/TPAMI.2015.2437384. arXiv: 1311.2524.
[78] R. Girshick, "Fast R-CNN," in Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 1440-1448. ISBN: 9781467383912. DOI: 10.1109/ICCV.2015.169. arXiv: 1504.08083.
[79] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137-1149, 2016. ISSN: 01628828. DOI: 10.1109/TPAMI.2016.2577031. arXiv: 1506.01497.
[80] K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask R-CNN," arXiv, 2018. arXiv: 1703.06870v3.
[81] J. Dai, Y. Li, K. He, and J. Sun, "R-FCN: Object Detection via Region-based Fully Convolutional Networks," Tech. Rep., 2016. DOI: 10.1109/ICASSP.2017.7952132. arXiv: 1605.06409. [Online]. Available: http://arxiv.org/abs/1605.06409
[82] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," 2015. ISSN: 01689002. DOI: 10.1109/CVPR.2016.91. arXiv: 1506.02640. [Online]. Available: http://arxiv.org/abs/1506.02640
[83] J. Redmon and A. Farhadi, "YOLOv3: An Incremental Improvement," arXiv, 2018. [Online]. Available: https://pjreddie.com/media/files/papers/YOLOv3.pdf
[84] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, "SSD: Single Shot MultiBox Detector," arXiv, 2016. arXiv: 1512.02325v5.


[85] B. Zoph and Q. V. Le, "Neural Architecture Search with Reinforcement Learning," in ICLR, 2017, pp. 1-16. arXiv: 1611.01578v2.
[86] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, "Focal Loss for Dense Object Detection," arXiv, 2018. arXiv: 1708.02002v2.
[87] Facebook Inc., ONNX - About, 2017. [Online]. Available: https://onnx.ai/about (visited on 05/21/2018).
[88] TensorFlow, TensorFlow, 2018. [Online]. Available: https://www.tensorflow.org (visited on 05/21/2018).
[89] J. Huang, V. Rathod, C. Sun, M. Zhu, A. Korattikara, A. Fathi, I. Fischer, Z. Wojna, Y. Song, S. Guadarrama, and K. Murphy, "Speed/accuracy trade-offs for modern convolutional object detectors," arXiv, 2017. arXiv: 1611.10012v3.
[90] J. Redmon, Darknet: Open source neural networks in C, http://pjreddie.com/darknet, 2013-2016.
[91] Microsoft, The Microsoft Cognitive Toolkit | Microsoft Docs, 2018. [Online]. Available: https://docs.microsoft.com/en-us/cognitive-toolkit/index (visited on 05/21/2018).
[92] Docker Inc., Overview of Docker Compose | Docker Documentation, 2018. [Online]. Available: https://docs.docker.com/compose/overview (visited on 04/27/2018).
[93] Docker Inc., Use bridge networks | Docker Documentation, 2018. [Online]. Available: https://docs.docker.com/network/bridge (visited on 04/27/2018).
[94] A. Ronacher, Click Documentation (5.0), 2017. [Online]. Available: http://click.pocoo.org/5 (visited on 04/27/2018).
[95] A. K. Reitz, Requests: HTTP for Humans — Requests 2.18.4 documentation, 2018. [Online]. Available: http://docs.python-requests.org/en/master (visited on 05/09/2018).
[96] Docker Inc., Docker SDK for Python — Docker SDK for Python 2.0 documentation, 2018. [Online]. Available: https://docker-py.readthedocs.io/en/stable (visited on 05/12/2018).
[97] GStreamer, GStreamer: open source multimedia framework, 2018. [Online]. Available: https://gstreamer.freedesktop.org (visited on 05/13/2018).
[98] E. Walthinsen, filesrc: GStreamer Core Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gstreamer-plugins/html/gstreamer-plugins-filesrc.html (visited on 05/13/2018).
[99] E. Hervey, decodebin: GStreamer Base Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-base-plugins/html/gst-plugins-base-plugins-decodebin.html (visited on 05/13/2018).
[100] W. Taymans, jpegenc: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-jpegenc.html (visited on 05/13/2018).


[101] A. Communications, rtpjpegpay: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-rtpjpegpay.html (visited on 05/13/2018).
[102] W. Taymans, udpsink: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-udpsink.html (visited on 05/13/2018).
[103] GStreamer, Basic tutorial 3: Dynamic pipelines. [Online]. Available: https://gstreamer.freedesktop.org/documentation/tutorials/basic/dynamic-pipelines.html (visited on 05/13/2018).
[104] W. Taymans, udpsrc: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-udpsrc.html (visited on 05/14/2018).
[105] W. Taymans, rtpjpegdepay: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-rtpjpegdepay.html (visited on 05/14/2018).
[106] A. Loonstra, "Videostreaming with Gstreamer." [Online]. Available: http://mediatechnology.leiden.edu/images/uploads/docs/wt2014_gstreamer.pdf
[107] W. Taymans, jpegdec: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-jpegdec.html (visited on 05/14/2018).
[108] J. Schmidt, autovideosink: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-autovideosink.html (visited on 05/14/2018).
[109] A. Ronacher, Deployment Options — Flask 0.12.4 documentation, 2018. [Online]. Available: http://flask.pocoo.org/docs/0.12/deploying (visited on 05/14/2018).
[110] R. Yasrab, "Mitigating Docker Security Issues," University of Science and Technology of China, Hefei, Tech. Rep. [Online]. Available: https://arxiv.org/pdf/1804.05039.pdf
[111] lvh, Don't expose the Docker socket (not even to a container), 2015. [Online]. Available: https://www.lvh.io/posts/dont-expose-the-docker-socket-not-even-to-a-container.html (visited on 05/15/2018).
[112] Docker Inc., Use overlay networks | Docker Documentation, 2018. [Online]. Available: https://docs.docker.com/network/overlay/#customize-the-docker_gwbridge-interface (visited on 05/15/2018).
[113] J. W. Davis and M. A. Keck, "A Two-Stage Template Approach to Person Detection in Thermal Imagery," Proc. Workshop on Applications of Computer Vision, 2005. [Online]. Available: http://vcipl-okstate.org/pbvs/bench/papers/wacv05.pdf


[114] J. W. Davis and V. Sharma, "Background-subtraction using contour-based fusion of thermal and visible imagery," Computer Vision and Image Understanding, vol. 106, no. 2-3, pp. 162-182, 2007. DOI: 10.1016/j.cviu.2006.06.010. [Online]. Available: https://web.cse.ohio-state.edu/~davis.1719/Publications/cviu07.pdf
[115] S. Hwang, J. Park, N. Kim, Y. Choi, and I. S. Kweon, "Multispectral Pedestrian Detection: Benchmark Dataset and Baseline," CVPR, 2015. [Online]. Available: https://sites.google.com/site/pedestrianbenchmark
[116] Z. Wu, N. Fuller, D. Theriault, and M. Betke, "A Thermal Infrared Video Benchmark for Visual Analysis," IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2014. DOI: 10.1109/CVPRW.2014.39. [Online]. Available: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.735.2216&rep=rep1&type=pdf
[117] R. Miezianko, Terravic research infrared database.
[118] R. Miezieanko, Terravic research infrared database.
[119] S. Z. Li, R. Chu, S. Liao, and L. Zhang, "Illumination Invariant Face Recognition Using Near-Infrared Images," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 4, pp. 627-639, 2007. DOI: 10.1109/TPAMI.2007.1014. [Online]. Available: http://vcipl-okstate.org/pbvs/bench/papers/NIR.pdf
[120] A. Akula, R. Ghosh, S. Kumar, and H. K. Sardana, "Moving target detection in thermal infrared imagery using spatiotemporal information," J. Opt. Soc. Am. A, vol. 30, no. 8, pp. 1492-1501, Aug. 2013. DOI: 10.1364/JOSAA.30.001492. [Online]. Available: http://josaa.osa.org/abstract.cfm?URI=josaa-30-8-1492
[121] R. I. Hammoud, IEEE OTCBVS WS Series Bench. [Online]. Available: http://vcipl-okstate.org/pbvs/bench (visited on 05/18/2018).
[122] Last Post Association, Mission, 2018. [Online]. Available: http://www.lastpost.be/en/the-last-post/mission (visited on 05/18/2018).
[123] FLIR Systems Inc., FLIR One Pro, 2017. [Online]. Available: https://www.flir.com/globalassets/imported-assets/document/17-1746-oem-flir_one_pro_datasheet_final_v1_web.pdf
[124] R. J. Ramana, Introduction to Camouflage and Deception. Defence Scientific Information & Documentation Centre, pp. 99-164.
[125] A. Bornstein and I. Richter, Microsoft Visual Object Tagging Tool. [Online]. Available: https://github.com/Microsoft/VoTT (visited on 05/20/2018).
[126] F. E. Grubbs, "Procedures for Detecting Outlying Observations in Samples," Technometrics, vol. 11, no. 1, pp. 1-21, Feb. 1969. DOI: 10.1080/00401706.1969.10490657. [Online]. Available: http://www.tandfonline.com/doi/abs/10.1080/00401706.1969.10490657
[127] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A Large-Scale Hierarchical Image Database," in CVPR09, 2009. [Online]. Available: http://www.image-net.org/papers/imagenet_cvpr09.pdf


[128] D. Gupta, Transfer learning & The art of using Pre-trained Models in Deep Learning, 2017. [Online]. Available: https://www.analyticsvidhya.com/blog/2017/06/transfer-learning-the-art-of-fine-tuning-a-pre-trained-model (visited on 05/20/2018).
[129] Docker Inc., docker stats | Docker Documentation, 2018. [Online]. Available: https://docs.docker.com/engine/reference/commandline/stats (visited on 05/24/2018).
[130] M. Gori and A. Tesi, "On the Problem of Local Minima in Recurrent Neural Networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 14, no. 1, pp. 76-86, 1992. DOI: 10.1109/34.107014.
[131] L. Prechelt, "Early stopping - but when?" in Neural Networks: Tricks of the Trade, G. B. Orr and K.-R. Müller, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 1998, pp. 55-69. ISBN: 978-3-540-49430-0. DOI: 10.1007/3-540-49430-8_3. [Online]. Available: https://doi.org/10.1007/3-540-49430-8_3
[132] M. Everingham, L. Van Gool, C. K. Williams, J. Winn, and A. Zisserman, "The Pascal visual object classes (VOC) challenge," International Journal of Computer Vision, vol. 88, no. 2, pp. 303-338, 2010. ISSN: 09205691. DOI: 10.1007/s11263-009-0275-4.
[133] M. Everingham, S. M. A. Eslami, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, "The Pascal Visual Object Classes Challenge: A Retrospective," International Journal of Computer Vision, vol. 111, no. 1, pp. 98-136, 2014. ISSN: 15731405. DOI: 10.1007/s11263-014-0733-5.
[134] P. Henderson and V. Ferrari, "End-to-end training of object class detectors for mean average precision," Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 10115 LNCS, pp. 198-213, 2017. ISSN: 16113349. DOI: 10.1007/978-3-319-54193-8_13. arXiv: 1607.03476.
[135] T. Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, "Microsoft COCO: Common objects in context," Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 8693 LNCS, no. PART 5, pp. 740-755, 2014. ISSN: 16113349. DOI: 10.1007/978-3-319-10602-1_48. arXiv: 1405.0312.
[136] Docker Inc., Library/docker, 2018. [Online]. Available: https://hub.docker.com/_/docker (visited on 06/01/2018).
[137] NVIDIA, nvidia-docker. [Online]. Available: https://github.com/NVIDIA/nvidia-docker (visited on 05/25/2018).
[138] FLIR, "FLIR One." [Online]. Available: http://www.flir.com/uploadedFiles/Store/Products/FLIR-ONE/3rd-Gen/FLIR-ONE/FLIR-ONE-Gen-3-Datasheet.pdf
[139] FLIR, "FLIR Boson," p. 2, 2016.


Appendix A

Firefighting department email conversations

This appendix contains the email conversations with different firefighting departments in Belgium, as part of an exploration of the functional requirements of an aerial thermal imaging solution. Note that all conversations were translated from Dutch to English.

A.1 General email sent to firefighting departments

This email was sent to the departments mentioned later in this appendix. The responses in the following sections are responses to this email.

Subject: Firefighting department - Thesis thermal drones

Dear Sir/Madam

My name is Brecht Verhoeve. I am a student Master of Science in Computer Science Engineering at Ghent University. I am contacting your department with reference to the research of my master's dissertation. I am currently researching the applications of thermal cameras in combination with commercial drones. They can create an aerial overview of scenes and objects that often can't be spotted with visual detectors, like hidden persons, fires or hot explosives. The eventual goal is to let a computer indicate these objects of interest autonomously on the thermal images of the drone. These images could aid firefighters with their work.

For this research I have some questions for you.

Functionality

I have listed some functionalities which I believe could be interesting for a firefighter:

• Detection of persons in buildings (find potential victims)
• Detection of hidden fires in buildings (to identify danger zones)
• Detection of fires on vast terrains (forests, industrial terrains)
• Indication of hot explosives

I have two questions:

• Do you agree that these are the most important functions?
• Are there any other functions that you deem important?

Quality of the application

Next to the functionality, the quality of the application is also important. For me the most important aspects are:

• Accuracy: the software must be accurate. There is no room for errors when detecting.
• Speed: the software must operate quickly. An overview must be created quickly, to not waste time in case of an emergency.
• Usability: the software must be easy to use.

Once again I have two questions:

• Do you agree with these qualities?
• Are there any other important qualities that you deem important?

I would like to thank you in advance for your time.

Best regards

Brecht Verhoeve

A.2 Conversation with the firefighting department of Antwerp, Belgium

The answers were given inline; for clarity, they are repeated here explicitly.

Subject: Re: Firefighting department Antwerp - Thesis thermal drones

Answers can be found in your email.

Best regards

Functionality: detection of hidden fires in buildings and environments. Are there any other functions that you deem important? Capture the evolution of a fire with the thermal camera. Visualise incidents during night-time. Capture invisible fires, such as hydrogen or methane fires.


A.3 Conversation with the firefighting department of Ostend, Belgium

The answers were given inline; for clarity, they are repeated here explicitly.

Subject: Re: Firefighting department Ostend - Thesis thermal drones

Dear Brecht

You can find the answers after the questions in your email.

Best Regards

Functionality: Are there any other functions that you deem important? These are the most important for us at the moment. Quality of the application: Are there any other important qualities that you deem important? The application must work autonomously.

A.4 Conversation with the firefighting department of Courtrai, Belgium

Subject: Re: Firefighting department Courtrai - Thesis thermal drones

Dear Brecht

Below you will find our answers (next to the already mentioned items).

Functionality

• The detection of persons in a landscape. For example, after a traffic accident there are searches in the dark for missing victims that were catapulted from a vehicle. Today this is done via a thermal camera on the ground, but with a drone this could hold potential benefits. Another example is searching for missing persons in nature reserves. The police sometimes ask for the assistance of firefighters to search the area.

Quality of the application

• The images need to be processed in real time, not after the drone has landed.

The drones must be deployable for multiple purposes.

The interpretation of the images can in the future be important for automatic flight control of drones. Currently there is a European project "3D Safeguard" in which KU Leuven is participating. They are already quite advanced in interpreting the images from a drone to spot persons through smoke. With this information the drone can be redirected. The application could thus use the interpretations of the images to control the drone in flight.

Best regards

A.5 Conversation with the firefighting department of Ghent, Belgium

Subject: Re: Firefighting department Ghent - Thesis thermal drones

Hi Brecht

I don't know if you've received the previous email, but there you received answers to your questions.

Best regards

Subject: Re: Firefighting department Ghent - Thesis thermal drones

With respect to the functionality I would like to add:

• Measuring the temperature of containers, silos.

I agree with the quality of the application. It could be handy to be able to view the application from one or more devices. Everything should have a clear overview; if possible, information and controls should be available on one screen.

I will follow up.

Best regards


Appendix B

Thermal camera specifications

This appendix gives all the specifications for the compared thermal cameras. First, the different cameras, their producing companies and average retail prices are listed in Table B.1. Second, their respective physical specifications are presented in Table B.2. Third, the image qualities are presented in Table B.3. Fourth, the thermal precisions are presented in Table B.4. Fifth, the available interfaces to interact with each camera are presented in Table B.5. Sixth, the energy consumption of each camera is presented in Table B.6. Seventh, how support is offered when developing for these platforms is presented in Table B.7. Finally, auxiliary features are presented in Table B.8.


Product Company Price (Euro)

Wiris 2nd Gen 640 Workswell 9995.00

Wiris 2nd Gen 336 Workswell 6995.00

Duo Pro R 640 FLIR 6409.00

Duo Pro R 336 FLIR 4384.84

Duo FLIR 949.99

Duo R FLIR 1239.99

Vue 640 FLIR 2689.00

Vue 336 FLIR 1259.93

Vue Pro 640 FLIR 4032.18

Vue Pro 336 FLIR 2302.61

Vue Pro R 640 FLIR 5184.56

Vue Pro R 336 FLIR 3455.99

Zenmuse XT 640 DJI x FLIR 11810.00

Zenmuse XT 336 DJI x FLIR 6970.00

Zenmuse XT 336 R DJI x FLIR 9390.00

Zenmuse XT 640 R DJI x FLIR 14230.00

One FLIR 237.99

One Pro FLIR 469.00

Tau 2 640 FLIR 6746.36

Tau 2 336 FLIR 4933.89

Tau 2 324 FLIR 2640.00

Lepton 3 160 x 120 FLIR 259.95

Lepton 3 80 x 60 FLIR 143.38

Boson 640 FLIR 1222.09

Boson 320 FLIR 938.42

Quark 2 640 FLIR 331.65

Quark 2 336 FLIR 331.65

DroneThermal v3 Flytron 341.15

Compact Seek Thermal 275.00

CompactXR Seek Thermal 286.46

Compact Pro Seek Thermal 599.00

Therm-App Opgal 937.31

Therm-App TH Opgal 2950.00

Therm-App 25 Hz Opgal 1990.00

Table B1: Compared cameras, their producing companies and their average retail price.


Product Weight (g) Dimensions (mm)

Wiris 2nd Gen 640 390 135 x 77 x 69

Wiris 2nd Gen 336 390 135 x 77 x 69

Duo Pro R 640 325 85 x 81.3 x 68.5

Duo Pro R 336 325 85 x 81.3 x 68.5

Duo 84 41 x 59 x 30

Duo R 84 41 x 59 x 30

Vue 640 114 57.4 x 44.45 x 44.45

Vue 336 114 57.4 x 44.45 x 44.45

Vue Pro 640 92.14 57.4 x 44.45 x 44.45

Vue Pro 336 92.14 57.4 x 44.45 x 44.45

Vue Pro R 640 92.14 57.4 x 44.45 x 44.45

Vue Pro R 336 92.14 57.4 x 44.45 x 44.45

Zenmuse XT 640 270 103 x 74 x 102

Zenmuse XT 336 270 103 x 74 x 102

Zenmuse XT 336 R 270 103 x 74 x 102

Zenmuse XT 640 R 270 103 x 74 x 102

One 34.5 67 x 34 x 14

One Pro 36.5 68 x 34 x 14

Tau 2 640 72 44.4 x 44.4 x 44.4

Tau 2 336 72 44.4 x 44.4 x 44.4

Tau 2 324 72 44.4 x 44.4 x 44.4

Lepton 3 160 x 120 0.9 11.8 x 12.7 x 7.2

Lepton 3 80 x 60 0.9 11.8 x 12.7 x 7.2

Boson 640 7.5 21 x 21 x 11

Boson 320 7.5 21 x 21 x 11

Quark 2 640 8 22 x 22 x 12

Quark 2 336 8 22 x 22 x 12

DroneThermal v3 3 20 x 20 x 15

Compact 14.17 25.4 x 44.4 x 20.3

CompactXR 14.17 25.4 x 44.4 x 25.4

Compact Pro 14.17 25.4 x 44.4 x 25.4

Therm-App 138 55 x 65 x 40

Therm-App TH 123 55 x 65 x 40

Therm-App 25 Hz 138 55 x 65 x 40

Table B2: Physical specifications.


Product IR Resolution (pixels) SD resolution (megapixels) Frequency (Hz) FOV Radiometry

Wiris 2nd Gen 640 640 x 512 1.92 not specified Various yes

Wiris 2nd Gen 336 336 x 256 1.92 not specified Various yes

Duo Pro R 640 640 x 512 12 30 Various lens yes

Duo Pro R 336 336 x 256 12 30 Various lens yes

Duo 160 x 120 2 7.5 and 8.3 57° x 44° no

Duo R 160 x 120 2 7.5 57° x 44° yes

Vue 640 640 x 512 0 7.5 Various lens no

Vue 336 336 x 256 0 7.5 Various lens no

Vue Pro 640 640 x 512 0 7.5 Various lens no

Vue Pro 336 336 x 256 0 7.5 Various lens no

Vue Pro R 640 640 x 512 0 7.5 Various lens yes

Vue Pro R 336 336 x 256 0 7.5 Various lens yes

Zenmuse XT 640 640 x 512 0 7.5 Various lens no

Zenmuse XT 336 336 x 256 0 7.5 Various lens no

Zenmuse XT 336 R 336 x 256 0 7.5 Various lens yes

Zenmuse XT 640 R 640 x 512 0 7.5 Various lens yes

One 80 x 60 1.5 8.7 50° x 38° yes

One Pro 160 x 120 1.5 8.7 55° x 43° yes

Tau 2 640 640 x 512 0 7.5 Various lens yes

Tau 2 336 336 x 256 0 7.5 Various lens yes

Tau 2 324 324 x 256 0 7.6 Various lens yes

Lepton 3 160 x 120 160 x 120 0 8.8 56° available

Lepton 3 80 x 60 80 x 60 0 8.8 56° no

Boson 640 640 x 512 0 9.0 Various lens no

Boson 320 320 x 256 0 9.0 Various lens no

Quark 2 640 640 x 512 0 9 Various lens no

Quark 2 336 336 x 256 0 9 Various lens no

DroneThermal v3 80 x 60 0 8.6 25° no

Compact 206 x 156 0 9 36° no

CompactXR 205 x 156 0 9 20° no

Compact Pro 320 x 240 0 15 32° no

Therm-App 384 x 288 0 8.7 Various lens no

Therm-App TH 384 x 288 0 8.7 Various lens yes

Therm-App 25 Hz 384 x 288 0 25 Various lens no

Table B3: Image quality.

IR: InfraRed; SD: Standard; FOV: Field of View.


Product Sensitivity (mK) Temperature range (degrees Celsius) Accuracy (degrees Celsius)

Wiris 2nd Gen 640 50 -25 to +150 / -40 to +550 2

Wiris 2nd Gen 336 50 -25 to +150 / -40 to +550 2

Duo Pro R 640 50 -25 to +135 / -40 to +550 5 / 20

Duo Pro R 336 50 -25 to +135 / -40 to +550 5 / 20

Duo not specified -40 to +550 5

Duo R not specified -40 to +550 5

Vue 640 not specified -58 to +113 not specified

Vue 336 not specified -58 to +113 not specified

Vue Pro 640 not specified -58 to +113 not specified

Vue Pro 336 not specified -58 to +113 not specified

Vue Pro R 640 not specified -58 to +113 not specified

Vue Pro R 336 not specified -58 to +113 not specified

Zenmuse XT 640 50 -40 to +550 not specified

Zenmuse XT 336 50 -40 to +550 not specified

Zenmuse XT 336 R 50 -40 to +550 not specified

Zenmuse XT 640 R 50 -40 to +550 not specified

One 150 -20 to +120 3

One Pro 150 -20 to +400 3

Tau 2 640 50 -40 to +550 not specified

Tau 2 336 50 -40 to +550 not specified

Tau 2 324 50 -40 to +550 not specified

Lepton 3 160 x 120 50 0 to +450 5

Lepton 3 80 x 60 50 0 to +450 5

Boson 640 40 0 to +500 not specified

Boson 320 40 0 to +500 not specified

Quark 2 640 50 -40 to +160 not specified

Quark 2 336 50 -40 to +160 not specified

DroneThermal v3 50 0 to +120 not specified

Compact not specified -40 to +330 not specified

CompactXR not specified -40 to +330 not specified

Compact Pro 70 -40 to +330 not specified

Therm-App 70 +5 to +90 3

Therm-App TH 70 0 to +200 2

Therm-App 25 Hz 70 +5 to +90 3

Table B4: Thermal precision. Where two temperature ranges are listed, they correspond to the camera's two gain modes.


Product USB MAVLink HDMI

Wiris 2nd Gen 640 Flash disk yes yes

Wiris 2nd Gen 336 Flash disk yes yes

Duo Pro R 640 Mini-USB yes micro-HDMI

Duo Pro R 336 Mini-USB yes micro-HDMI

Duo Mini-USB yes micro-HDMI

Duo R Mini-USB yes micro-HDMI

Vue 640 Mini-USB No No

Vue 336 Mini-USB no no

Vue Pro 640 Mini-USB yes Optional

Vue Pro 336 Mini-USB yes Optional

Vue Pro R 640 Mini-USB yes Optional

Vue Pro R 336 Mini-USB yes Optional

Zenmuse XT 640 Via DJI drone Via DJI drone Via DJI drone

Zenmuse XT 336 Via DJI drone Via DJI drone Via DJI drone

Zenmuse XT 336 R Via DJI drone Via DJI drone Via DJI drone

Zenmuse XT 640 R Via DJI drone Via DJI drone Via DJI drone

Tau 2 640 No no no

Tau 2 336 No no no

Tau 2 324 No no no

Lepton 3 160 x 120 No no no

Lepton 3 80 x 60 No no no

Boson 640 Yes no no

Boson 320 Yes no no

Quark 2 640 no no no

Quark 2 336 no no no

DroneThermal v3 no no no

Compact Smartphone storage no no

CompactXR Smartphone storage no no

Compact Pro Smartphone storage no no

Therm-App Smartphone storage no no

Therm-App TH Smartphone storage no no

Therm-App 25 Hz Smartphone storage no no

Table B5 Interfaces


Product Power consumption (Watt) Input voltage (Volt)

Wiris 2nd Gen 640 4 6 - 36

Wiris 2nd Gen 336 4 6 - 36

Duo Pro R 640 10 5.0 - 26.0

Duo Pro R 336 10 5.0 - 26.0

Duo 2.2 5.0 - 26.0

Duo R 2.2 5.0 - 26.0

Vue 640 1.2 4.8 - 6.0

Vue 336 1.2 4.8 - 6.0

Vue Pro 640 2.1 4.8 - 6.0

Vue Pro 336 2.1 4.8 - 6.0

Vue Pro R 640 2.1 4.8 - 6.0

Vue Pro R 336 2.1 4.8 - 6.0

Zenmuse XT 640 Via DJI drone Via DJI drone

Zenmuse XT 336 Via DJI drone Via DJI drone

Zenmuse XT 336 R Via DJI drone Via DJI drone

Zenmuse XT 640 R Via DJI drone Via DJI drone

One approx. 1 h battery lifetime Battery

One Pro approx. 1 h battery lifetime Battery

Tau 2 640 1.3 4.0 - 6.0

Tau 2 336 1.3 4.0 - 6.1

Tau 2 324 1.3 4.0 - 6.2

Lepton 3 160 x 120 0.65 3.1

Lepton 3 80 x 60 0.65 3.1

Boson 640 0.5 3.3

Boson 320 0.5 3.3

Quark 2 640 1.2 3.3

Quark 2 336 1.2 3.3

DroneThermal v3 0.15 3.3 - 5

Compact Via smartphone Smartphone

CompactXR Via smartphone Smartphone

Compact Pro Via smartphone Smartphone

Therm-App 0.5 5

Therm-App TH 0.5 5

Therm-App 25 Hz 0.5 5

Table B6: Energy consumption.


Product Warranty (years) User Manual Phone support Email support FAQs

Wiris 2nd Gen 640 Not specified Yes Yes Yes Yes

Wiris 2nd Gen 336 Not specified Yes Yes Yes Yes

Duo Pro R 640 1 Yes Yes Yes Yes

Duo Pro R 336 1 Yes Yes Yes Yes

Duo 1 yes Yes Yes Yes

Duo R 1 yes yes yes yes

Vue 640 1 yes yes yes yes

Vue 336 1 yes yes yes yes

Vue Pro 640 1 yes yes yes yes

Vue Pro 336 1 yes yes yes yes

Vue Pro R 640 1 yes yes yes yes

Vue Pro R 336 1 yes yes yes yes

Zenmuse XT 640 05 yes yes yes yes

Zenmuse XT 336 05 yes yes yes yes

Zenmuse XT 336 R 05 yes yes yes yes

Zenmuse XT 640 R 05 yes yes yes yes

One 1 yes yes yes yes

One Pro 1 yes yes yes yes

Tau 2 640 1 yes yes yes yes

Tau 2 336 1 yes yes yes yes

Tau 2 324 1 yes yes yes yes

Lepton 3 160 x 120 1 yes yes yes yes

Lepton 3 80 x 60 1 yes yes yes yes

Boson 640 1 yes yes yes yes

Boson 320 1 yes yes yes yes

Quark 2 640 1 yes yes yes yes

Quark 2 336 1 yes yes yes yes

DroneThermal v3 not specified no no no no

Compact 1 yes yes yes yes

CompactXR 1 yes yes yes yes

Compact Pro 1 yes yes yes yes

Therm-App 1 yes yes yes yes

Therm-App TH 1 yes yes yes yes

Therm-App 25 Hz 1 yes yes yes yes

Table B7 Help and support


Product Bluetooth Wi-Fi GPS Mobile app Storage

Wiris 2nd Gen 640 no on request Yes no yes

Wiris 2nd Gen 336 no on request yes no yes

Duo Pro R 640 yes no yes yes yes

Duo Pro R 336 yes no yes yes yes

Duo no no no no yes

Duo R no no no no yes

Vue 640 No no no no no

Vue 336 no no no no no

Vue Pro 640 yes no no yes yes

Vue Pro 336 yes no no yes yes

Vue Pro R 640 yes no no yes yes

Vue Pro R 336 yes no no yes yes

Zenmuse XT 640 Via DJI drone Via DJI drone Via DJI drone yes yes

Zenmuse XT 336 Via DJI drone Via DJI drone Via DJI drone yes yes

Zenmuse XT 336 R Via DJI drone Via DJI drone Via DJI drone yes yes

Zenmuse XT 640 R Via DJI drone Via DJI drone Via DJI drone yes yes

One no no no yes yes

One Pro no no no yes yes

Tau 2 640 no no no no yes

Tau 2 336 no no no no yes

Tau 2 324 no no no no yes

Lepton 3 160 x 120 no no no no no

Lepton 3 80 x 60 no no no no no

Boson 640 no no no no no

Boson 320 no no no no no

Quark 2 640 no no no no no

Quark 2 336 no no no no no

DroneThermal v3 no no no no no

Compact no no no yes yes

CompactXR no no no yes yes

Compact Pro no no no yes yes

Therm-App no no no yes yes

Therm-App TH no no no yes yes

Therm-App 25 Hz no no no yes yes

Table B8 Auxiliary features


Appendix C

Last Post thermal dataset summary

The goal of this appendix is to provide a summary of the layout of the Last Post thermal dataset. The data was captured on the following days: the 24th of March 2018 and the 2nd, 3rd, 4th, 5th, 9th, 10th, 11th and 12th of April 2018. For each date a small summary of the contents is made below. The summary consists of a description of the conditions that day and a listing of the video files and their contents.

C1 24th of March 2018

Conditions

• Hours: 19:40 - 20:20

• Outside temperature range: 5 degrees Celsius - 12 degrees Celsius

• Clear

• Humidity: 76%

• Wind: 24 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 14 kilometers

Videos

• flir_20180324T195255.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people. A large crowd gathers on the right of the video.

C2 2nd of April 2018 95

• flir_20180324T195836.mp4: This video gives an overview of the inside of the Meningate ceremony. Many people can be seen watching the ceremony.

• flir_20180324T200421.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people. A large crowd is visible on the right side.

• flir_20180324T201448.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people. A large crowd is visible on the right side. The crowd is leaving the area after the ceremony.

• flir_20180324T202328.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people. A large crowd is visible on the right side. The crowd is leaving the area after the ceremony.

C2 2nd of April 2018

Conditions

• Hours: 19:40 - 20:20

• Outside temperature range: 9 degrees Celsius - 15 degrees Celsius

• Light rain

• Humidity: 74%

• Wind: 18 kilometers per hour

• Precipitation: 0.4 centimeters

• Visibility: 8.1 kilometers

Videos

• 2018-04-02 194733.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people, sometimes with umbrellas, passing through.

• 2018-04-02 194952.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance.

• 2018-04-02 195518.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance.

• 2018-04-02 201322.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance. Crowds can be seen, as well as people holding umbrellas.

C3 3rd of April 2018

Conditions

• Hours: 20:00 - 20:30

• Outside temperature range: 8 degrees Celsius - 16 degrees Celsius

• Heavy rain

• Humidity: 79%

• Wind: 25 kilometers per hour

• Precipitation: 0.5 centimeters

• Visibility: 10.1 kilometers

Videos

• 2018-04-03 201227.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. People are holding umbrellas due to heavy rain on that day. Due to the rain and wind it was difficult to steady the camera, which can be seen in the shaky video.

• 2018-04-03 201727.mp4: In the beginning of the clip the camera is moving towards the other side of the Meningate. From 00:20 onwards the clip is useful. The video gives an overview of the bridge at the east side of the Meningate, where the Frenchlaan goes into the Menenstraat. The video shows people leaving from the Meningate towards the busses at the other side of the bridge. Most people are holding umbrellas due to heavy rain that day. The Meningate is in the bottom left of the picture. Several buildings can be seen in the distance. In the bottom right the water of the Kasteelgracht can be seen. Sometimes in the left of the picture the wall of the Meningate can be seen.

• 2018-04-03 202311.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. People are holding umbrellas due to heavy rain on that day. Due to the rain and wind it was difficult to steady the camera, which can be seen in the shaky video.


C4 4th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 10 degrees Celsius - 14 degrees Celsius

• Cloudy

• Humidity: 87%

• Wind: 18 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 12.9 kilometers

Videos

• 2018-04-04 200052.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. In the video a dense crowd of people slowly starts to form in the bottom right of the picture. Most people are coming from the left of the video to join the crowd.

• 2018-04-04 200728.mp4: This video shows the inside of the Meningate and the ceremony of the Last Post. Some people are up close in front. The large crowd can be seen through the hall.

• 2018-04-04 200913.mp4: This video shows the inside of the Meningate and the ceremony of the Last Post. The video switches between MSX mode, visual camera and thermal camera to show the differences.

• 2018-04-04 202859.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. At the start of the video a crowd is seen in the bottom right. At the 01:00 mark the ceremony has ended and people are exiting the gate and coming onto the crossing. They form two rows to make place for the marching band exiting the Meningate. It can be seen marching through the crowd at the 02:50 mark.

C5 5th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 11 degrees Celsius - 15 degrees Celsius

• Sunny

• Humidity: 77%

• Wind: 11 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 12.9 kilometers

Videos

• 2018-04-05 200217.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. In the video a dense crowd of people slowly starts to form in the bottom right of the picture. Most people are coming from the left of the video to join the crowd. The video shows the 15 minutes before the start of the ceremony.

• 2018-04-04 201838.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. The video shows the first ten minutes after the end of the ceremony. The crowd, which can be seen on the left, leaves towards the square.

C6 9th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 9 degrees Celsius - 10 degrees Celsius

• Light rain

• Humidity: 99%

• Wind: 14 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 8.1 kilometers

Videos

• 2018-04-09 200007.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. People are coming from the left towards the Meningate in the right. Not a lot of people are seen due to the rain that day.

• 2018-04-09-202302.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. People are leaving from the right of the Meningate towards the square.

C7 10th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 14 degrees Celsius - 17 degrees Celsius

• Partly cloudy

• Humidity: 52%

• Wind: 13 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 12.9 kilometers

Videos

• 2018-04-10 195029.mp4: The video gives an overview of the bridge at the east side of the Meningate, where the Frenchlaan goes into the Menenstraat. In the video an army unit can be seen on the left. It can be seen that they are standing in a very structured way.

• 2018-04-10 195131.mp4: The video gives an overview of the bridge at the east side of the Meningate, where the Frenchlaan goes into the Menenstraat. In the video an army unit can be seen on the left. It can be seen that they are standing in a very structured way.

• 2018-04-10 195748.mp4: The video gives an overview of the bridge at the east side of the Meningate, where the Frenchlaan goes into the Menenstraat. In the video an army unit can be seen on the left. It can be seen that they are standing in a very structured way. Some people are moving around the crowd.

• 2018-04-10 200122.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. There is a big crowd that can be seen on the right. There are some schools there, so some people are wearing backpacks. It is quite warm and the cafe on the other side of the street has opened up its terrace.

• 2018-04-10 201427.mp4: The video gives an overview of the bridge at the east side of the Meningate, where the Frenchlaan goes into the Menenstraat. In the video an army unit can be seen on the left. It can be seen that they are standing in a very structured way. Some people are moving around the crowd. The image is not rotated correctly; a correctly rotated version is found in 2018-04-10 201427_rotated.mp4.

• 2018-04-10 201515.mp4: This video shows the inside of the Meningate and the ceremony. A traditional 'haka' from New Zealand soldiers can be heard in the video; the soldiers are difficult to spot due to thermal blurring, because many people are standing in one place.

• 2018-04-10 202558.mp4: The video gives an overview of the bridge at the east side of the Meningate, where the Frenchlaan goes into the Menenstraat. In the video an army unit can be seen on the left. It can be seen that they are standing in a very structured way. Some people are moving around the crowd. At the 02:00 mark the army unit marches to the end of the bridge. Very dense crowds can be seen afterwards. At 08:25 the army unit marches in a straight line towards the Meningate.

C8 11th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 12 degrees Celsius - 16 degrees Celsius

• Sunny

• Humidity: 63%

• Wind: 14 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 12.9 kilometers

Videos

• 2018-04-11 200140.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen.

• 2018-04-11 200601.mp4: The video gives an overview of the bridge at the east side of the Meningate, where the Frenchlaan goes into the Menenstraat. A small crowd can be seen on the left of the video.

• 2018-04-11 201554.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. People start leaving the ceremony from the 01:20 mark.


C9 12th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 11 degrees Celsius - 14 degrees Celsius

• Rain

• Humidity: 94%

• Wind: 8 kilometers per hour

• Precipitation: 0.1 centimeters

• Visibility: 3.2 kilometers

Videos

• 2018-04-12 195219.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. Sometimes the hedge on the wall from where the video was filmed is visible due to the shaky camera. Not many people are seen due to the rain.

• 2018-04-12 201526.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. Sometimes the hedge on the wall from where the video was filmed is visible due to the shaky camera. Not many people are seen due to the rain. People are leaving towards the right.


Modifiable drone thermal imaging analysis framework for mob detection during

open-air events

Brecht Verhoeve

Supervisors Prof dr Bruno Volckaert Prof dr ir Filip De Turck

Counsellors Pieter-Jan Maenhaut Jerico Moeyersons

Master's dissertation submitted in order to obtain the academic degree of

Master of Science in Computer Science Engineering

Department of Information Technology

Chair Prof dr ir Bart Dhoedt

Faculty of Engineering and Architecture

Ghent University

Academic year 2017-2018

Abstract

Drones and thermal cameras are used in combination for many applications, such as search and rescue, fire fighting, etc. Due to vendor-specific hardware and software, applications are hard to develop and maintain. Therefore a modifiable drone thermal imaging analysis framework is proposed that enables users to more easily develop such image processing applications. It implements a microservice plugin architecture. Users can build image processing applications with the framework by building media streams using plugins that are either thermal cameras or image analysis software modules. The framework is evaluated by building a proof of concept implementation, which is tested on the initial requirements. It achieves the modifiability and interoperability requirements at the cost of performance and security. The framework is applied for detecting large crowds of people (mobs) during open-air events. A new dataset containing thermal images of such mobs is presented, on which a YOLOv3 neural network is trained. The trained model is able to detect mobs on new thermal images in real time, achieving frame rates of 55 frames per second when deployed on a modern GPU.

Keywords: Drone thermal imaging, Video streaming, Framework, Microservices, Object detection, Plugin

Modifiable Drone Thermal Imaging Analysis Framework for Mob Detection during Open-Air Events

Brecht Verhoeve

Supervisors: prof. dr. Bruno Volckaert, prof. dr. ir. Filip De Turck, Pieter-Jan Maenhaut, Jerico Moeyersons

Abstract— Drones and thermal cameras are used in combination for many applications, such as search and rescue, fire fighting, etc. Due to vendor-specific hardware and software, applications are hard to develop and maintain. Therefore a modifiable drone thermal imaging analysis framework is proposed that enables users to more easily develop such image processing applications. It implements a microservice plugin architecture. Users can build image processing applications with the framework by building media streams using plugins that are either thermal cameras or image analysis software modules. The framework is evaluated by building a proof of concept implementation, which is tested on the initial requirements. It achieves the modifiability and interoperability requirements at the cost of performance and security. The framework is applied for detecting large crowds of people (mobs) during open-air events. A new dataset containing thermal images of such mobs is presented, on which a YOLOv3 neural network is trained. The trained model is able to detect mobs on new thermal images in real time, achieving frame rates of 55 frames per second when deployed on a modern GPU.

Keywords— Drone thermal imaging, Video streaming, Framework, Microservices, Object detection, Plugin

I INTRODUCTION

THROUGHOUT history, having an overview of the environment from high viewpoints held many benefits. The advent of drones and advanced cameras provides low-cost aerial imaging that creates numerous opportunities for new applications. Traditional visual cameras for the visible light spectrum offer high quality images, but are limited to daytime or artificially lighted scenes. Thermal cameras measure the thermal radiation of objects in a scene and thus can operate in utter darkness, revealing information not visible to the normal eye [1]. The combination of drones and thermal cameras is used in many different applications, such as geography [2, 3], agriculture [4], search and rescue [5], wildlife monitoring [6], disaster response [7], maintenance [8], etc.

Several vendors offer thermal camera products, some specifically designed for drone platforms. These cameras often use different image formats, color schemes and interfaces [1, 9-11]. This leads to issues if applications want to change the camera that is used, or when the camera is no longer supported by the vendor, because different software needs to be built to interact with the new camera, which often is a non-negligible cost. This leads to a problem called vendor lock-in that makes customers dependent on a certain vendor, as they cannot switch product without making substantial costs, a problem already very tangible for cloud-based applications [12]. Applications across various fields often have slightly different functional and non-functional requirements. For this dissertation several Belgian fire fighting departments were asked for requirements for a thermal drone platform application. It quickly became clear that they had various problems that needed to be solved, such as finding hot explosives, measuring temperatures in containers, identifying hot entrances, detecting invisible methane fires, finding missing persons, etc. Some use cases need to be evaluated in real time (during fires), others need to be extremely accurate. A hypothetical application should be able to quickly implement new detection and analysis features to meet all these requirements. Due to the current solutions not being modifiable enough, current applications built with aerial thermal imaging technology remain in the niche use case for which they were initially developed [13]. Applications could benefit from a backbone framework to aid in this modifiability/interoperability issue, aiding in developing end-to-end solutions connecting thermal cameras to various analysis/detection modules.

This dissertation explores the requirements for such a framework and its potential software architecture. To test the viability of the architecture, a proof of concept prototype is implemented and evaluated against the initial requirements. To verify if it aids in developing detection applications, the specific use case of detecting large crowds of people, so-called mobs, during open-air events is investigated. Monitoring crowds during open-air events is important, as mobs can create potentially dangerous situations through bottlenecks, blocking escape routes, etc. Through monitoring and detecting these mobs, these situations can be avoided before they become problematic [14, 15].

The remainder of this paper is organized as follows. Section II presents similar projects on the topic of modifiable imaging analysis frameworks and thermal object detection. Section III presents the requirements of the framework and the software architecture designed from these requirements. Section IV presents the implementation of the framework prototype. The mob detection experiment is described in Section V. The tests and results to evaluate the framework and the mob detection experiment are presented in Section VI. Finally, Section VII draws conclusions from this research and indicates where future efforts in this field should go.

II RELATED WORK

The Irish start-up DroneSAR [16] developed a search and rescue (SAR) drone platform allowing users to stream live images and video from a drone as it conducts a search for missing persons. The platform works with any camera, visual and thermal, but focuses on drones from vendor DJI, DroneSAR's industry partner. Amazon introduced the Amazon Kinesis Video Streams platform in January 2018 as a new service for the Amazon Web Services (AWS) cloud platform. It allows users to stream live video from devices to the AWS cloud and build applications for real-time video processing [17]. The VIPER project by EAVISE, KU Leuven researched how thermal and visual video images could be used for real-time detection of persons using object detection algorithms such as deep learning [18]. The framework presented in this work combines elements from all three of these examples.

III REQUIREMENTS AND SOFTWARE ARCHITECTURE

A Functional requirements

Three general actors are identified for the framework: end-users that want to build an image processing application for a specific use case, camera developers integrating new cameras into the framework, and analysis/detection module developers integrating new analysis/detection algorithms into the framework so that end-users can use them to build their applications. An end-user should be able to construct an image processing application using the integrated cameras and analysis modules. He should be able to adapt this application with the framework for new use cases. Camera and analysis module developers should be able to add and distribute new integrations for new cameras and analysis modules to the framework. This allows the end-users to focus on the use case, not the technical details of the hardware platforms or algorithms, and gives them a wider selection of hardware and algorithms.

B Non-functional requirements

Interoperability, modifiability and performance are identified as the architecturally significant requirements. Interoperability specifies that the framework should be able to interact with various cameras and analysis software modules via interfaces. The number of systems the framework can successfully interact with adds to the business value of the framework, as end-users can use more devices via the framework to build applications. The framework needs to be extendable with new thermal cameras and analysis modules. Applications built with the framework should be modifiable to integrate new hardware and software. The available hardware on-site for use cases such as forest fire monitoring is not always powerful enough to support heavy image analysis software. The framework should therefore be able to deploy in a distributed fashion, to allow more computationally expensive operations to be executed on more powerful remote devices. Some use cases require real-time streaming of video and manipulation of these video streams, which should be supported for the framework to be relevant.

C Software architecture

An architectural pattern analysis based on the requirements presented in Section III-B was conducted, from which a combination of the microservices and microkernel patterns was selected as the base pattern for the software architecture. The microkernel pattern enables the framework to be extended via a plugin system. The microservices pattern enables interoperability through the use of standard interfaces and a communication scheme. It also allows for the framework to be deployed in a distributed fashion [19-21]. The software architecture is documented in static views, sequence diagrams and deployment views. Figure 1 presents an overview of the architecture.

Fig. 1. Component-connector overview of the framework. The clear components are the core components of the framework that each user needs to install to use the framework. The colored components are used for the distribution of plugins.

End-users interact with the framework via the Client Interface, a graphical or textual interface. Commands are forwarded to the Stream module that manages the layout and state of the components in the media pipeline (stream) that the end-user builds for his use case. To activate and place the components in a certain layout, the Stream module connects with the Producer and Consumer components that control the plugins that process the media. Producer Plugins are devices that produce media, such as thermal cameras. Consumer Plugins process and consume media, such as analysis software and displays. Once a stream is established, the plugins forward media to each other in the layout specified by the Stream module. New support for cameras and analysis modules can be added as plugins to the Producer/Consumer Distribution components that distribute this software, so that end-users can download and install the plugins. Each module in the architecture is a microservice, allowing for a distributed deployment and enabling the modifiability requirements through interchangeable modules. Cameras and analysis modules are realized as plugins for the Producer/Consumer modules, implemented as a microkernel. This allows the framework to easily build new streams for different use cases and interchange components when needed.
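To make this end-user flow concrete, the sketch below scripts the construction of such a stream against the framework's REST interface. It is a minimal sketch only: the endpoint paths, payloads and port are assumptions for illustration, not the prototype's exact API.

import requests  # standard HTTP client library

BASE = "http://localhost:5000"  # assumed address of the Client Interface

# Create a stream and add one Producer plugin and one Consumer plugin.
requests.post(f"{BASE}/streams", json={"name": "demo"})
requests.post(f"{BASE}/streams/demo/plugins", json={"type": "filecam", "role": "producer"})
requests.post(f"{BASE}/streams/demo/plugins", json={"type": "display", "role": "consumer"})

# Link the plugins into a pipeline layout and start the media flow.
requests.put(f"{BASE}/streams/demo/layout", json={"links": [["filecam", "display"]]})
requests.put(f"{BASE}/streams/demo/state", json={"state": "PLAY"})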

C1 Plugin model

Figure 2 depicts the model of a general framework plugin. The plugin defines three interfaces: a source media endpoint to receive media from different sources, a listener endpoint to forward the processed media to other listening plugins, and an API for framework control. The framework uses the API to change which sources and listeners a plugin has, and its state. By linking plugins together by setting the sources and listeners resources, the framework can build a media processing stream. Producer Plugins have no sources, since they produce media. The states are used to start and stop the media processing of the plugins in the stream.

Fig. 2. Schematic overview of a plugin.

The REST paradigm is selected to build this API, with state, sources and listeners resources that need to be minimally implemented. A plugin has the following states: INACTIVE, PLAY, STOP and PAUSE. Figure 3 depicts the state transition diagram. A plugin implements the visible states STOP, PAUSE and PLAY, describing whether the media process of the plugin is stopped, paused or processing respectively. The INACTIVE state is only visible to the framework, as it indicates that there is no active process that runs the plugin. This is the initial state of a plugin in the framework. When a plugin is added to a stream, the plugin microservice is started, transitions to the STOP state and waits for commands. A minimal sketch of this plugin API is given after Fig. 3.

Fig. 3. State transition diagram of a plugin.
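The plugin-side sketch below assumes Flask, as in the prototype of Section IV. The resource names follow the text, but the exact payloads and the wiring to an actual media process are simplified assumptions.

from flask import Flask, jsonify, request

app = Flask(__name__)
plugin = {"state": "STOP", "sources": [], "listeners": []}

# Transitions between the visible states of Fig. 3; INACTIVE is handled by
# the framework itself, which creates and removes the plugin process.
ALLOWED = {"STOP": {"PLAY"}, "PLAY": {"PAUSE", "STOP"}, "PAUSE": {"PLAY", "STOP"}}

@app.route("/state", methods=["GET", "PUT"])
def state():
    if request.method == "PUT":
        new = request.get_json()["state"]
        if new not in ALLOWED[plugin["state"]]:
            return jsonify(error="illegal transition"), 409
        plugin["state"] = new  # a real plugin also starts/stops its media process here
    return jsonify(state=plugin["state"])

@app.route("/sources", methods=["GET", "PUT"])
def sources():
    if request.method == "PUT":
        plugin["sources"] = request.get_json()["sources"]
    return jsonify(sources=plugin["sources"])

@app.route("/listeners", methods=["GET", "PUT"])
def listeners():
    if request.method == "PUT":
        plugin["listeners"] = request.get_json()["listeners"]
    return jsonify(listeners=plugin["listeners"])

if __name__ == "__main__":
    app.run(port=8080)  # port chosen arbitrarily for the sketch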

C2 Network topology and communication protocol

The microservices of the framework and the plugins need a communication protocol to exchange commands and video. For sending the commands the HTTP/TCP protocol is used, a synchronous protocol that blocks on the response of a request. This ensures that commands are always received and acknowledged, increasing the reliability of the communication at the cost of performance [22]. The asynchronous RTP/UDP protocol is selected to transfer media between the plugins, to ensure low latency video transfer and to enable real-time video streams. The recommended codec for transmitting video media is MJPEG, which transmits video frames as separately encoded JPEG images [23]. Other codecs such as MPEG-4 encode only certain frames as keyframes and the other frames as B-frames that encode differences from the keyframe [24]. This implies that when receiving images from a stream, a keyframe must first be received before the video can be decoded. Using MJPEG, plugins receiving frames can directly perform analysis on each frame and do not have to wait for additional frames. An example of this network topology for a stream with 2 Producer plugins and 3 Consumer plugins is depicted in Figure 4; an illustrative pair of GStreamer pipelines for this transport follows the figure.

Fig. 4. Network topology. The full lines represent HTTP/TCP communications, the dashed lines RTP/UDP communications.
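In the sketch below, the sender JPEG-encodes frames and payloads them as RTP/JPEG over UDP, and the receiver depayloads and decodes each frame independently. This is a sketch using the stock gst-launch-1.0 tool with a test video source; the prototype builds comparable pipelines programmatically, and the host and port are assumptions.

import subprocess

HOST, PORT = "127.0.0.1", 5000  # assumed listener address

# Sending side (e.g. a Producer plugin): JPEG-encode and RTP-payload frames.
send = (f"gst-launch-1.0 videotestsrc ! jpegenc ! rtpjpegpay "
        f"! udpsink host={HOST} port={PORT}")

# Receiving side (e.g. the Display plugin): every MJPEG frame is
# independently decodable, so analysis can start on the first frame.
recv = (f"gst-launch-1.0 udpsrc port={PORT} "
        f'caps="application/x-rtp,encoding-name=JPEG,payload=26" '
        f"! rtpjpegdepay ! jpegdec ! autovideosink")

subprocess.Popen(recv, shell=True)
subprocess.Popen(send, shell=True)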

IV PROTOTYPE IMPLEMENTATION

The goal of the prototype implementation is to provide a proof of the concept architecture and meet the requirements presented in Section III. The core framework components are implemented; the distribution components are left out of scope, as they focus primarily on the distribution of supported plugins. The core microservices as well as the plugins are implemented using the Docker software containerization framework [25]. Containers virtualize on the operating system level and allow for portable, lightweight software environments for processes, with a minor performance overhead. Using this technology, the core modules and plugins can be deployed in a local and distributed fashion and be set up and shut down easily [26]. The microservice containers communicate via the protocols presented in Section III-C2. The REST APIs are built with the Flask framework [27], a lightweight Python web development framework ideal for prototyping. The Producer/Consumer components need access to the Docker daemon running on the Docker host in order to spin up and shut down the Docker containers containing the plugin processes. This is achieved by mounting the Docker client socket in the Producer/Consumer containers. This gives the container root access to the host, a significant security threat [28, 29]. Two sample plugins were implemented: Filecam, a plugin that produces video read in from a file, and Display, a plugin that forwards media to the display of the local device. The plugins transmit media using the video streaming framework GStreamer [30].
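As an illustration of this mechanism, the sketch below shows how a Producer/Consumer service with the Docker socket mounted can spin up and remove a plugin container via the Docker SDK for Python. The image and network names are hypothetical.

import docker

client = docker.from_env()  # talks to the mounted /var/run/docker.sock

# Start a plugin container (INACTIVE -> STOP): the plugin microservice
# boots inside the container and waits for commands.
plugin = client.containers.run(
    image="framework/filecam:latest",  # hypothetical plugin image
    name="filecam-1",
    network="framework-net",           # assumed shared bridge network
    detach=True,
)

# Deactivate the plugin (STOP -> INACTIVE): the container is removed,
# which explains the long deactivation times reported in Section VI.
plugin.stop()
plugin.remove()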

V MOB DETECTION

A Dataset

Several publicly available datasets for thermal images exists[31ndash34] None of these include large crowds of people so anew dataset called the Last Post dataset was created It consistsof thermal video captured at the Last Post ceremony in Ypres

(a) Thermal view of the square (b) Visual view of the square (c) Thermal view of the bridge (d) Visual view of the bridge

Fig. 5. Last Post dataset main scenes.

Belgium [35]. The videos were captured using the FLIR One Pro thermal camera for Android [36], using the Iron color scheme. Two main scenes are present in the dataset, depicted in Figure 5. Mobs are present in the thermal images but not in the visual images, due to the images being made on separate days. The images used for the experiment were manually annotated, outliers were removed, and the dataset was randomly split into a training and a validation set.
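A possible way to produce such a split, assuming darknet-style train.txt/valid.txt image lists, an 80/20 ratio and a dataset path (all assumptions, as the paper does not state them), is sketched below.

import random
from pathlib import Path

frames = sorted(Path("lastpost/images").glob("*.jpg"))  # hypothetical dataset path
random.seed(42)          # fixed seed for a reproducible split
random.shuffle(frames)

cut = int(0.8 * len(frames))
Path("train.txt").write_text("\n".join(str(p) for p in frames[:cut]))
Path("valid.txt").write_text("\n".join(str(p) for p in frames[cut:]))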

B Model

Detecting and classifying objects of interest in images is known as the object detection problem in machine learning [37]. Several object detection algorithms and frameworks have been implemented in the past years. A distinction is made between traditional models [31, 38-40], deep learning two-stage networks [41-46] and deep learning dense networks [47-49]. The traditional and two-stage methods make predictions relatively slowly (order of seconds on GPU) when compared to the dense networks (order of milliseconds on GPU) [47]. Since the goal is to use the framework in real-time use cases, the latter is preferred. The YOLOv3 model is selected, as it achieves state of the art prediction performance, can make real-time predictions and is available via the open source neural network framework darknet [50, 51]. The model is pre-trained on the ImageNet dataset [52]. The model is trained on an NVIDIA GeForce GTX 980 GPU and optimizes the SSE loss using batch gradient descent [50]. To select the best weights, the average Intersection over Union (IoU) and mean Average Precision (mAP) [53] are calculated on predictions on the validation set. The weights that achieve the highest mAP are selected as the final weights.
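For reference, IoU for two axis-aligned boxes reduces to a few lines. The sketch below, with boxes as (x1, y1, x2, y2) tuples, shows the measure being computed; it is an illustration, not darknet's actual implementation.

def iou(a, b):
    # Intersection rectangle of the two boxes.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    # Union = sum of the two areas minus the intersection.
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175, approximately 0.143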

VI RESULTS

A Framework

To evaluate the framework, acceptance tests for the requirements from Section III were conducted. Common framework operations such as manipulating and building a stream have an average execution time of 0.84 seconds, with a standard deviation of 0.37 seconds. Less common operations such as deactivating a plugin, starting up the framework and shutting down the framework have an average execution time of 3.58, 8.40 and 24.02 seconds respectively, with standard deviations of 4.67, 0.50 and 0.48 respectively. Deactivating plugins (STOP to INACTIVE transitions) takes a long time, as the container running the process needs to be removed. Real-time streaming could not be tested due to the GStreamer framework having no readily available testing endpoints. However, when streaming and displaying a video with the framework, human users could not differentiate between a streaming video and a video played using a native media player, making it plausible that the framework streams in real time. Great care must be taken when building plugins, as their processing speed has a direct impact on the real-time streaming performance. Interoperability is achieved with the REST APIs and the plugin model presented in Section III-C. The interoperability is tested by having the framework exchange information with a mock plugin implementing the specified interface and counting the number of correct exchanges. The average successful exchange ratio is 99.998%. The framework can install and detect new plugins at runtime, achieving runtime modifiability at plugin level. Different deployment schemes were not tested for the prototype.

B Mob detection

The weights generated at the 15700th training epoch achieved the highest mAP value, 90.52%, on the validation set. For comparison, the performance of other models on benchmark datasets achieves an average mAP of 74.8% [54]. The reason the model achieves such high values is that the validation set is temporally correlated with the training set, as both sets are extracted from videos in which frames have a temporal correlation. Performance when predicting on new datasets will be worse. Figure 6 depicts some predictions of the model. When predicting on a video, the model generated predictions at an average frame rate of 55 frames per second on a GPU.

Fig. 6. Model predictions on the validation set.

VII CONCLUSION AND FUTURE WORK

In this dissertation a modifiable drone thermal imaging analysis framework is proposed to allow end-users to build flexible video processing pipelines using different thermal cameras and analysis modules. The framework implements a microservice container plugin architecture. Users can build image processing applications with the framework by building media streams using plugins that are either thermal cameras or image analysis software modules. The framework is evaluated by building a proof of concept implementation, which is tested on the initial requirements. The proposed framework achieves the modifiability and interoperability requirements at the cost of performance and security. The framework is applied for detecting large crowds of people (mobs) during open-air events. A new dataset containing thermal images of such mobs is presented, on which a YOLOv3 neural network is trained. The trained model is able to detect mobs on new thermal images in real time, achieving frame rates of 55 frames per second when deployed on a modern GPU. Some extensions to this research are: deploying a detection model using the framework, testing the other deployment configurations, testing the framework with end-users in practice, and building new object detection models specifically for thermal images.

REFERENCES

[1] R. Gade and T. B. Moeslund, "Thermal cameras and applications: a survey," Machine Vision and Applications, vol. 25, pp. 245-262, 2014.
[2] M. C. Harvey, J. V. Rowland and K. M. Luketina, "Drone with thermal infrared camera provides high resolution georeferenced imagery of the Waikite geothermal area, New Zealand," 2016.
[3] S. Amici, M. Turci, S. Giammanco, L. Spampinato and F. Giulietti, "UAV Thermal Infrared Remote Sensing of an Italian Mud Volcano," vol. 2, pp. 358-364, 2013.
[4] J. Bendig, A. Bolten and G. Bareth, "Introducing a low-cost mini-UAV for thermal- and multispectral-imaging," 2012.
[5] A. J. Rivera, A. D. Villalobos, J. C. Monje, J. A. Marinas and C. M. Oppus, "Post-disaster rescue facility: Human detection and geolocation using aerial drones," IEEE Region 10 Annual International Conference Proceedings/TENCON, pp. 384-386, 2017.
[6] P. Christiansen, K. A. Steen, R. N. Jørgensen and H. Karstoft, "Automated detection and recognition of wildlife using thermal cameras," Sensors (Basel, Switzerland), vol. 14, pp. 13778-93, Jul. 2014.
[7] S. Chowdhury, A. Emelogu, M. Marufuzzaman, S. G. Nurre and L. Bian, "Drones for disaster response and relief operations: A continuous approximation model," 2017.
[8] Workswell, "Pipeline inspection with thermal diagnostics," 2016.
[9] DJI, "Zenmuse H3 - 2D."
[10] Workswell, "Applications of WIRIS - Thermal vision system for drones."
[11] Therm-App, "Therm-App - Android-apps op Google Play," 2018.
[12] B. Satzger, W. Hummer, C. Inzinger, P. Leitner and S. Dustdar, "Winds of change: From vendor lock-in to the meta cloud," IEEE Internet Computing, vol. 17, no. 1, pp. 69-73, 2013.
[13] J. Divya, "Drone Technology and Usage: Current Uses and Future Drone Technology," 2017.
[14] B. Steffen and A. Seyfried, "Methods for measuring pedestrian density, flow, speed and direction with minimal scatter," Physica A: Statistical Mechanics and its Applications, vol. 389, pp. 1902-1910, May 2010.
[15] M. Wirz, T. Franke, D. Roggen, E. Mitleton-Kelly, P. Lukowicz and G. Tröster, "Inferring crowd conditions from pedestrians' location traces for real-time crowd monitoring during city-scale mass gatherings," Proceedings of the Workshop on Enabling Technologies: Infrastructure for Collaborative Enterprises, WETICE, pp. 367-372, 2012.
[16] L.-L. Slattery, "DroneSAR wants to turn drones into search-and-rescue heroes," 2017.
[17] Amazon Web Services Inc., "What Is Amazon Kinesis Video Streams?," 2018.
[18] T. Goedemé, "Projectresultaten VLAIO TETRA-project," tech. rep., KU Leuven, Louvain, 2017.
[19] L. Bass, P. Clements and R. Kazman, Software Architecture in Practice. Addison-Wesley Professional, 3rd ed., 2012.
[20] M. Richards, Software Architecture Patterns. O'Reilly Media, first ed., 2015.
[21] C. Richardson, "Microservice Architecture pattern," 2017.
[22] C. De La Torre, C. Maddock, J. Hampton, P. Kulikov and M. Jones, "Communication in a microservice architecture," 2017.
[23] On-Net Surveillance Systems Inc., "MJPEG vs MPEG4: Understanding the differences, advantages and disadvantages of each compression technique," 2006.
[24] D. Bull, Communicating Pictures: A Course in Image and Video Coding. Elsevier Science, 2014.
[25] Docker Inc., "Docker - Build, Ship and Run Any App, Anywhere," 2018.
[26] D. Merkel, "Docker: Lightweight Linux Containers for Consistent Development and Deployment," 2014.
[27] A. Ronacher, "Welcome to Flask: Flask Documentation (0.12)," 2017.
[28] Lvh, "Don't expose the Docker socket (not even to a container)," 2015.
[29] R. Yasrab, "Mitigating Docker Security Issues," tech. rep., University of Science and Technology of China, Hefei.
[30] GStreamer, "GStreamer: open source multimedia framework," 2018.
[31] J. W. Davis and M. A. Keck, "A Two-Stage Template Approach to Person Detection in Thermal Imagery," Proc. Workshop on Applications of Computer Vision, 2005.
[32] S. Hwang, J. Park, N. Kim, Y. Choi and I. S. Kweon, "Multispectral Pedestrian Detection: Benchmark Dataset and Baseline," CVPR, 2015.
[33] Z. Wu, N. Fuller, D. Theriault and M. Betke, "A Thermal Infrared Video Benchmark for Visual Analysis," IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2014.
[34] S. Z. Li, R. Chu, S. Liao and L. Zhang, "Illumination Invariant Face Recognition Using Near-Infrared Images," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 4, pp. 627-639, 2007.
[35] Last Post Association, "Mission," 2018.
[36] FLIR, "FLIR One Pro."
[37] E. Alpaydin, Introduction to Machine Learning. MIT Press, 3rd ed., 2014.
[38] F. Xu, X. Liu and K. Fujimura, "Pedestrian detection and tracking with night vision," IEEE Transactions on Intelligent Transportation Systems, vol. 6, no. 1, pp. 63-71, 2005.
[39] H. Nanda and L. Davis, "Probabilistic template based pedestrian detection in infrared videos," IEEE Intelligent Vehicles Symposium, Proceedings, vol. 1, pp. 15-20, 2003.
[40] R. Appel, S. Belongie, P. Perona and P. Dollár, "Fast Feature Pyramids for Object Detection," PAMI, vol. 36, no. 8, pp. 1-14, 2014.
[41] J. R. R. Uijlings, K. E. A. Van De Sande, T. Gevers and A. W. M. Smeulders, "Selective Search for Object Recognition," tech. rep., 2012.
[42] R. Girshick, J. Donahue, T. Darrell and J. Malik, "Region-Based Convolutional Networks for Accurate Object Detection and Segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 1, pp. 142-158, 2014.
[43] R. Girshick, "Fast R-CNN," in Proceedings of the IEEE International Conference on Computer Vision, pp. 1440-1448, 2015.
[44] S. Ren, K. He, R. Girshick and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137-1149, 2016.
[45] K. He, G. Gkioxari, P. Dollár and R. Girshick, "Mask R-CNN," arXiv, 2018.
[46] J. Dai, Y. Li, K. He and J. Sun, "R-FCN: Object Detection via Region-based Fully Convolutional Networks," tech. rep., 2016.
[47] J. Redmon, S. Divvala, R. Girshick and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," 2015.
[48] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu and A. C. Berg, "SSD: Single Shot MultiBox Detector," arXiv, 2016.
[49] T.-Y. Lin, P. Goyal, R. Girshick, K. He and P. Dollár, "Focal Loss for Dense Object Detection," arXiv, 2018.
[50] J. Redmon and A. Farhadi, "YOLOv3: An Incremental Improvement," arXiv, 2018.
[51] J. Redmon, "Darknet: Open source neural networks in C," http://pjreddie.com/darknet/, 2013-2016.
[52] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li and L. Fei-Fei, "ImageNet: A Large-Scale Hierarchical Image Database," in CVPR09, 2009.
[53] M. Everingham, S. M. A. Eslami, L. Van Gool, C. K. I. Williams, J. Winn and A. Zisserman, "The Pascal Visual Object Classes Challenge: A Retrospective," International Journal of Computer Vision, vol. 111, no. 1, pp. 98-136, 2014.
[54] A. Ouaknine, "Review of Deep Learning Algorithms for Object Detection," 2018.


Contents

1 Introduction 1
1.1 Drones 1
1.2 Concepts 2
1.2.1 Thermal Cameras 2
1.2.2 Aerial thermal imaging 2
1.3 Problem statement 2
1.3.1 Industry adoption 2
1.3.2 Crowd monitoring 3
1.3.3 Goal 4
1.3.4 Related work 4
1.4 Outline 4

2 System Design 5
2.1 Requirements analysis 5
2.1.1 Functional requirements 5
2.1.2 Non-functional requirements 6
2.2 Patterns and tactics 11
2.2.1 Layers 12
2.2.2 Event-driven architecture 12
2.2.3 Microkernel 12
2.2.4 Microservices 13
2.2.5 Comparison of patterns 13
2.3 Software architecture 15
2.3.1 Static view 15
2.3.2 Dynamic views 22
2.3.3 Deployment views 23

3 State of the art and technology choice 27
3.1 Thermal camera options 27
3.1.1 Parameters 27
3.1.2 Comparative analysis 30
3.2 Microservices frameworks 31
3.2.1 Flask 31
3.2.2 Falcon 33
3.2.3 Nameko 33
3.2.4 Vert.x 33
3.2.5 Spring Boot 34
3.3 Deployment framework 34
3.3.1 Containers 34
3.3.2 LXC 35
3.3.3 Docker 35
3.3.4 rkt 35
3.4 Object detection algorithms and frameworks 36
3.4.1 Traditional approaches 36
3.4.2 Deep learning 37
3.4.3 Frameworks 39
3.5 Technology choice 41
3.5.1 Thermal camera 41
3.5.2 Microservices framework 41
3.5.3 Deployment framework 41
3.5.4 Object detection 41

4 Proof of Concept implementation 43
4.1 Goals and scope of prototype 43
4.2 Overview of prototype 43
4.2.1 General overview 43
4.2.2 Client interface 45
4.2.3 Stream 46
4.2.4 Producer and Consumer 46
4.2.5 Implemented plugins 48
4.3 Limitations and issues 51
4.3.1 Single client 51
4.3.2 Timeouts 51
4.3.3 Exception handling and testing 51
4.3.4 Docker security issues 51
4.3.5 Docker bridge network 52
4.3.6 Single stream 52
4.3.7 Number of containers per plugin 52

5 Mob detection experiment 53
5.1 Last Post thermal dataset 53
5.1.1 Last Post ceremony 53
5.1.2 Dataset description 54
5.2 Object detection experiment 56
5.2.1 Preprocessing 56
5.2.2 Training 56

6 Results and evaluation 58
6.1 Framework results 58
6.1.1 Performance evaluation 58
6.1.2 Interoperability evaluation 60
6.1.3 Modifiability evaluation 62
6.2 Mob detection experiment results 62
6.2.1 Training results 63
6.2.2 Metrics 63
6.2.3 Validation results 64

7 Conclusion and future work 67
7.1 Conclusion 67
7.2 Future work 69
7.2.1 Security 69
7.2.2 Implementing a detection plugin 69
7.2.3 Different deployment configurations 70
7.2.4 Multiple streams with different layouts 70
7.2.5 Implementing the plugin distribution service (Remote Producer/Consumer) 70
7.2.6 Using high performance microservices backbone frameworks 70
7.2.7 New object detection models and datasets specifically for thermal images 70

A Firefighting department email conversations 81
A.1 General email sent to Firefighting departments 81
A.2 Conversation with Firefighting department of Antwerp, Belgium 82
A.3 Conversation with Firefighting department of Ostend, Belgium 83
A.4 Conversation with Firefighting department of Courtrai, Belgium 83
A.5 Conversation with Firefighting department of Ghent, Belgium 83

B Thermal camera specifications 85

C Last Post thermal dataset summary 94
C.1 24th of March 2018 94
C.2 2nd of April 2018 95
C.3 3rd of April 2018 96
C.4 4th of April 2018 97
C.5 5th of April 2018 97
C.6 9th of April 2018 98
C.7 10th of April 2018 99
C.8 11th of April 2018 100
C.9 12th of April 2018 101


List of Figures

2.1 Use case diagram 7
2.2 Overview of the framework software architecture 16
2.3 Framework network topology 17
2.4 Client Interface detailed view 17
2.5 Stream detailed view 18
2.6 Stream model 18
2.7 Plugin model 19
2.8 Plugin state transition diagram 20
2.9 Component-connector diagrams of the Producer and Consumer module 21
2.10 Producer and Consumer Distribution component-connector diagrams 22
2.11 Add plugin sequence diagram 23
2.12 Link plugins sequence diagram 24
2.13 Deployment diagrams 26
3.1 Thermal image and MSX image of a dog 28
3.3 Rethink IT: Most used tools and frameworks for microservices results [54] 32
3.4 Containers compared to virtual machines [66] 36
4.1 filecam GStreamer pipeline 49
4.2 local plugin GStreamer pipeline 50
5.1 Last Post ceremony panorama 54
5.2 Last Post filming locations 54
5.3 Main scenes in the Last Post dataset 55
5.4 Outliers 57
6.1 Average training loss per epoch 64
6.2 Validation metrics per epoch 65
6.3 Predictions of the model on images in the validation set 66
7.1 GStreamer pipeline for a plugin with a detection model 69


List of Tables

2.1 Performance utility tree 8
2.2 Interoperability utility tree 9
2.3 Modifiability utility tree 10
2.4 Usability utility tree 11
2.5 Security utility tree 11
2.6 Availability utility tree 12
2.7 Architecture pattern comparison 14
6.1 Acceptance tests results summary 59
6.2 Performance test statistics summary, measured in seconds 60
6.3 Resource usage of the framework in several conditions 61
6.4 Total size of framework components 61
6.5 Interoperability tests results (S: Source, L: Listener) 62
B.1 Compared cameras, their producing companies and their average retail price 86
B.2 Physical specifications 87
B.3 Image quality (IR: InfraRed, SD: Standard, FOV: Field of View) 88
B.4 Thermal precision 89
B.5 Interfaces 90
B.6 Energy consumption 91
B.7 Help and support 92
B.8 Auxiliary features 93


List of Listings

1 Minimal Flask application 32
2 Vert.x example 33
3 Spring Boot example 34
4 docker-compose.yml snippet of the prototype 44
5 Mounting the Docker socket on the container 47
6 Starting a plugin container 47
7 Dynamic linking of the decodebin and jpegenc 50


List of Abbreviations

ACF Aggregated Channel Features

AMQP Advanced Message Queuing Protocol

API Application Programming Interface

AS Availability Scenario

ASR Architecturally Significant Requirement

CLI Command Line Interface

CNN Convolutional Neural Networks

CRUD Create Read Update Destroy

DNS Domain Name System

FR Functional Requirement

GPU Graphical Processing Unit

H High

HTTP Hyper Text Transfer Protocol

ICF Integral Channel Features

IoU Intersection over Union

IS Interoperability Scenario

IT Interoperability Tactic

JVM Java Virtual Machine

L Low


LXC Linux Containers

M Medium

mAP mean Average Precision

MJPEG Motion-JPEG

MS Modifiability Scenario

MSX Multi Spectral Dynamic Imaging

MT Modifiability Tactic

NFR Non-Functional Requirement

ONNX Open Neural Network Exchange Format

OS Operating System

PS Performance Scenario

PT Performance Tactic

QAR Quality Attribute Requirement

REST Representational State Transfer

RNN Recurrent Neural Network

RPN Region Proposal Network

RTP Real-time Transport Protocol

SS Security Scenario

SSE Sum of Squared Errors

SVM Support Vector Machine

TCP Transmission Control Protocol

UDP User Datagram Protocol

UI User Interface

US Usability Scenario

YOLO You Only Look Once


Chapter 1

Introduction

Throughout history, having an overview of the environment from high viewpoints held many benefits. Early civilizations used hills to monitor their surroundings and population and to spot possible attackers. The discovery of flight meant that environments could now be viewed from a bird's-eye view, offering even more visibility and revealing much more of the world below. Recently, a much smaller type of aircraft was developed: the drone. Ranging from large plane-like to almost insect-like devices and having a wide variety of uses, drones are quickly taking over the sky. Drones would not be as effective without proper cameras providing a detailed view on the world below. With digital video cameras offering superb quality at steadily decreasing costs, almost every scene can be captured in great detail. However, these cameras are limited to the visible light spectrum, which hinders drones from operating in all circumstances, such as night flights. Thermal cameras measure the emitted heat of a scene and can reveal information not visible to the eye, such as hidden persons or animals, pipelines, malfunctioning equipment, etc. The combination of these two technologies certainly holds many exciting opportunities for the future.

1.1 Drones

Drones are flying robots that can fly remotely or autonomously and don't carry a human operator. They can carry a variety of payloads: video cameras, delivery parcels, fluid containers, sensors, lights, but also lethal explosives [1].

Drones are classified in different categories based on varying parameters, such as physical characteristics (diameter, weight), aerial movement techniques, application domains, etc. Based on diameter, drones are classified as smart dust (1 mm to 0.25 cm), pico air vehicles (0.25 cm - 2.5 cm), nano air vehicles (2.5 cm - 15 cm), micro air vehicles (15 cm - 1 m), micro unmanned aerial vehicles (1 m - 2 m) and unmanned aerial vehicles (2 m and larger). Often depending on their diameter, the weight of these devices ranges from less than a gram up to more than 2000 kg. Drones have different flight techniques, such as propulsion engines with wings, rotors in various amounts, flapping wings and even balloons. They are used for all kinds of purposes, ranging from search and rescue missions, environmental protection, delivery, reconnaissance, etc. Hassanalian et al. provide an excellent overview of most types of drones [2].

Due to the increasing interest in commercial drone platforms [3], a variety of payloads were developed specifically tailored for these aerial robots, such as gimbals to mount action video cameras [4], gimbals for delivering packets [5] and thermal imaging platforms [6].

1.2 Concepts

1.2.1 Thermal Cameras

Thermal cameras are passive sensors that capture the infrared radiation emitted by all objects with a temperature above absolute zero (0 K). In contrast to visible light cameras, thermal cameras do not depend on an external energy source for visibility and colors of objects or scenes. This makes captured images independent of the illumination, colors, etc. Furthermore, images can be captured in the absence of visible light [7]. Originally, thermal camera technology was developed for night vision purposes for the military, and the devices were very expensive. Later, the technology was commercialized and has developed quickly over the last few decades, resulting in better and cheaper cameras [7]. This gave a broader public access to the technology, which is now used in a wide range of different applications, such as building inspection, gas detection, industrial appliances, medicinal science, agriculture, fire detection, surveillance, etc. [7]. Thermal cameras are now being mounted on drones to give an aerial thermal overview.

1.2.2 Aerial thermal imaging

Aerial thermal imaging is defined as the creation of thermal images using a flying device. This dissertation focuses on the usage of drones for aerial thermal imaging. There are many applications for aerial thermal imaging. Some examples are geography [8, 9], agriculture [10, 11], search and rescue operations [12], wildlife monitoring [13], forest monitoring [14, 15], disaster response [16], equipment and building maintenance [17-20], etc. In the past few years, several industry players have developed thermal cameras specifically aimed at these drone applications. Examples are FLIR [6], Workswell [21] and TEAX Technology [22].

1.3 Problem statement

1.3.1 Industry adoption

The implementation of thermal cameras on drone platforms faces some issues regarding wide adoption by the industry. Several vendors offer thermal camera products, some specifically designed for drone platforms, that often implement different image formats, color schemes and interfaces (e.g. [23-25]). This leads to issues if users want to modify their applications by changing the camera that is used, because the application must implement new software to interact with the camera, or when the camera is no longer supported by the vendor, leaving the application with outdated hardware and software. This leads to a problem called vendor lock-in, which makes customers dependent on a certain vendor, as they cannot switch products without incurring substantial costs, a problem already very tangible for cloud-based applications today [26].

Applications across various fields often have different functional and non-functional requirements. Some applications have hard real-time deadlines that must be respected (such as firefighting, search and rescue, security, etc.), while other applications require highly accurate predictions (e.g. person detection, agriculture, etc.). A single application domain can even have many different use cases.

Several firefighting departments in Belgium were contacted to get an overview of their needs for a drone thermal imaging application. It quickly became clear they had various detection problems, such as finding missing persons, locating hot explosives, measuring temperatures in silos, detecting invisible methane fires, etc. Equipment also wears down more quickly due to usage in harsh environments, such as fires in close proximity. For them, a drone thermal application needs to be able to exchange functionality and hardware easily and must meet high performance constraints to deliver value. The email conversations can be read in Appendix A.

Other drone thermal imaging applications are mostly only used in the niche domain for which they were developed, because they aren't designed for flexibility [27]. These proprietary applications have some disadvantages: the development and support potentially have a large cost, vendor lock-in can occur when products are no longer supported, security issues could arise, and customization is difficult [28, 29]. Applications could benefit from a backbone framework to aid in this modifiability/interoperability issue, helping to develop end-to-end solutions connecting thermal cameras to various analysis/detection modules for various use cases.

1.3.2 Crowd monitoring

Festivals and other open-air events are popular gatherings that attract many people. For every event organizer it is important to ensure safety and avoid incidents. Large groups of people, so-called mobs, can create potentially dangerous situations through bottlenecks, blocking escape routes, etc. Therefore, having the ability to monitor crowds and predict their behavior is very important to avoid such scenarios. Data can be obtained by evaluating video footage from past comparable events or by real-time video monitoring of current events [30]. By analyzing this footage, potentially dangerous situations can be avoided by acting on the mob formation, and safety regulations can be improved to help planning future events. Vision-based approaches face several limitations: mounted cameras cannot capture elements outside of their field of view, cannot see in some conditions (for example during night time), and it is difficult to infer information from the raw footage [31].

Thermal cameras could help for crowd monitoring because they can operate in any condition. Having precise and detailed object recognition for the images produced by these cameras is crucial to extract information correctly. In this context, classifying images is not satisfactory: localization of the objects contained within the images is needed. This problem is known as object detection [32]. There are several challenges for object detection in thermal images: the image quality is very low when compared to visible light images, there is a lack of color and texture information, temperature measures are relative measures, etc. This makes extracting discriminative information from these images difficult [33]. Most effort towards object detection on thermal images has gone towards human detection. Most of the proposed algorithms focus on feature extraction using the Aggregated Channel Features technique and boosting algorithms for learning [33-35]. Novel approaches make use of so-called deep learning with neural networks, which achieve very good results given enough data [36].


1.3.3 Goal

The goal of the dissertation is to explore the requirements of the possible backbone framework suggested in Section 1.3.1 and its potential software architecture. The architecture is evaluated by building a proof of concept implementation of the framework and evaluating it against the proposed requirements. To verify its use in developing drone thermal imaging applications, the specific mob-detection use case is investigated.

1.3.4 Related work

The Irish start-up DroneSAR [37] developed a search-and-rescue (SAR) drone platform allowing users to stream live images and video from a drone as it conducts a search for missing persons. The platform works with any camera, visual and thermal, but focuses on drones from vendor DJI, DroneSAR's industry partner. Amazon introduced the Amazon Kinesis Video Streams platform in January 2018 as a new service for the Amazon Web Services (AWS) cloud platform. It allows users to stream live video from devices to the AWS cloud and build applications for real-time video processing [38]. The VIPER project by EAVISE, KU Leuven researched how thermal and visual video images could be used for real-time detection of persons using object detection algorithms such as deep learning [36]. The framework presented in this work combines elements from all three of these examples.

1.4 Outline

The remainder of this dissertation is organized as follows. Chapter 2 presents the requirements for the framework and the software architecture. Chapter 3 explores several state of the art technologies that can serve as backbone technologies for the framework. To test the viability of the software architecture, a prototype is implemented; Chapter 4 presents the different aspects of this prototype. Chapter 5 describes the experiment that is conducted to research the detection of mobs in thermal images. The results of both the framework and the detection experiment are presented and evaluated in Chapter 6. Finally, the conclusion and future research efforts are presented in Chapter 7.


Chapter 2

System Design

Finding out what users actually expect from a software system and what makes it valuable for them is of key importance for the success of that system. This chapter first explores the functional and non-functional requirements of the hypothetical framework suggested in Chapter 1, to find out what makes building the framework worthwhile. Well-known architectural patterns enable certain software requirements very well and can be used for building the software architecture of the framework. The framework software architecture combines some of these patterns and is presented in several documents.

2.1 Requirements analysis

Requirements are the stated life-cycle customer needs and objectives for the system, and they relate to how well the system will work in its intended environment. They are those aspects of the framework that will provide value to the users.

2.1.1 Functional requirements

Functional requirements (FR) describe the necessary task, action or activity that must be accomplished by the system, often captured in use cases and/or user stories [39, 40]. Use cases provide a summary of the features described in the user stories. Several external people and/or systems, defined as actors, interact with the framework to achieve a certain goal [40]. Three actors are identified for the framework: an end-user that uses the framework in order to build an image processing application for a specific use case, such as the ones described in Section 1.2.2; a camera developer who creates support software for a specific thermal camera for the framework, so that the end-user can buy and use their product; and an analysis software developer that creates analysis software for a specific use case (tracking objects, detecting objects, etc.), so that the end-user can use their software to build his¹ application. The camera and analysis software developers are generalized to an actor called plugin developer, who develops plugins to extend the functionality of the framework. These plugins are the building blocks with which the end-user can build image processing applications.

¹ To avoid unnecessary gender-specific pronoun clutter, the male pronoun is used by default.

The general user scenario for the framework proceeds as follows. An end-user wants to build an image processing application, e.g. to detect fires in a landscape using a drone. He has a thermal camera for this and has read about hot-spot detection in video. The user looks for a plugin for the framework that can read video from his thermal camera, and for a plugin that does the hot-spot detection. If the user finds these plugins, he can add them to the framework and use them for the application he is building. He connects both plugins with the framework in a specific order to finish his application. For this simple example, the thermal camera plugin is connected to the hot-spot detection plugin, so that video coming from the thermal camera is transmitted to the detection plugin to find the fires in the landscape. The plugins in the application and the specific order in which they are connected is defined as a stream. This stream should be easily modifiable if additional or other functionalities are required. Suppose that the thermal camera delivers very high quality video and the detector cannot work with this, as it can only operate on low quality images. The end-user searches for a plugin that scales the high quality video down to an accepted quality for the detector. This plugin is placed in between the thermal camera and the detector, and the application can work again. By continuously adding plugins to the framework, the number of possible applications that can be built with the framework increases, making the framework usable for more aerial thermal imaging use cases.

Instead of developing the application from scratch, users can use the already implemented plugins to build the applications in an ad hoc fashion. Because of this, the development time for such applications can be reduced, and users can switch hardware and/or algorithms easily. The FRs are summarized in a use case diagram that connects each actor with their respective requirements and the relationships among them [40], depicted in Figure 2.1. Trivial functionalities, such as launching and shutting down the framework, are omitted. The red use cases represent use cases to extend the functionality of the framework, the blue use cases represent use cases for building streams, and the white use cases modify the media processing of the stream. Some use cases depend on others: the blue and white use cases work with plugins of the framework, so their prerequisite use case is "Add plugin", as a plugin must be a part of the framework for a user to use it; the "(Un)Link plugins" and "Stop/Pause/Play stream" use cases depend on "Add plugins to stream", as a stream must contain plugins before they can be manipulated.

2.1.2 Non-functional requirements

A non-functional requirement (NFR) specifies how the framework is supposed to be, or in what manner it should execute its functionality [41]. These qualifications typically cover business and system quality requirements. A distinction is made between quality attribute requirements (QAR) and constraints. QARs are qualifications of the FRs or of the overall product, e.g. how fast a certain function must be executed or how resilient it must be to erroneous input. They are closely related to business requirements, which are specifications that, once delivered, provide value to the actors [40]. The QARs are captured in a utility tree [40] that has a root node representing the system. This root node is elaborated by listing the major QARs that the system is required to exhibit. Each QAR is subdivided into more specific QARs. To make the specific requirements unambiguous and testable, a scenario for the system or a specific function is written, and the scenarios are evaluated against their business value and architectural impact [40]. A QAR can have High (H), Medium (M) or Low (L) business value and architectural impact respectively. The business value is defined as the value for the end user if the QAR is enabled: High designates a must-have requirement, Medium is for a requirement which is important but would not lead to project failure, and Low describes a nice-to-have QAR, but not something that is worth much effort. Architectural impact defines how much the architecture must be designed towards the QAR to enable it: High means that meeting this QAR will profoundly affect the architecture, Medium means


Figure 2.1: Use case diagram

that meeting this QAR will somewhat affect the architecture, and Low means that meeting this QAR will have little effect on the architecture. The following QARs are discussed: performance, interoperability, modifiability, usability, security and availability.

Performance

Performance indicates the framework's ability to meet timing requirements. It characterizes the events that can occur and the framework's time-based response to those events. Latency is defined as the time between the arrival of the stimulus and the system's response to it [40]. The system latency is the latency between the initialization of an action sequence and the first change to the system noticeable by the user. Streaming latency is defined as the time between the arrival of a video frame and the arrival of the next video frame. The jitter of the response is the allowable variation in latency. Scalability is the number of users that can use the framework at the same time. The utility tree is presented in Table 2.1. The software industry has not defined a quantified 'good latency' for end-users, but a 4 second latency rule is often used as a rule of thumb [42]. The average response time for general framework commands should then be less than 2 seconds, with a standard deviation of 1 second, ensuring most execution times respect the 4 second bound. As stated in Chapter 1, some use cases require real-time video streaming, such as fire fighting. The notion of low latency real-time video loosely defines that video should be streamed almost simultaneously: if a camera is filming and a human user does not notice a latency between the video of the camera and the real world, the video stream is considered real-time. Real-time is thus a human time perception, and for visual inputs this bound is as low as 13 milliseconds: anything above 13 milliseconds becomes noticeable, anything above 100 milliseconds hinders human performance [43, 44]. However, the framework focuses on the use of thermal cameras, most of which don't produce frames faster than 8 frames per second, or 125 milliseconds per frame (see Section 3.1). More expensive cameras can shoot at 25 frames per second, corresponding to a latency of 40 milliseconds, and this bound is selected for the streaming latency, with a standard deviation of 20 milliseconds, remaining below the frame rate of less expensive cameras. The number of users that can use the framework at the same time is assumed to be low, as current aerial thermal imaging applications are operated by only one or a few users. The assumption is that a maximum of five users can use the framework at the same time. All of these requirements are quantified as relatively 'good' values. These bounds should be evaluated for user satisfaction by having users use a prototype of the framework in practice.
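For reference, the streaming latency bounds above follow directly from the frame rates:

    latency per frame = 1 / frame rate
    8 fps:  1/8 s  = 125 ms per frame
    25 fps: 1/25 s = 40 ms per frame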

Latency
  PS-1: The average execution time of all framework commands does not exceed 2 seconds. (H, M)
  PS-2: A playing stream should have an upper limit of 40 ms streaming latency. (H, H)
Jitter
  PS-3: The average standard deviation of the execution time of all framework commands should not exceed 1 second under normal operation. (H, M)
  PS-4: The average standard deviation in streaming latency should not exceed 20 ms under normal operation. (H, H)
Scalability
  PS-5: The system should be usable by five users at the same time. (M, M)

Table 2.1: Performance utility tree

Interoperability

Interoperability is the degree to which two or more independently developed systems can usefully exchange meaningful information via interfaces in a particular context [40]. The framework will interoperate with cameras and analysis modules via the framework plugins. Henceforth, the camera and analysis plugins will be referred to as a Producer plugin and a Consumer plugin: a Producer plugin is a plugin that represents a camera that produces video, and a Consumer plugin is a plugin that represents a module that processes or consumes video. The framework will thus interact with the Producer and Consumer plugins, with which it exchanges requests to link them together, control their media process, etc. The more correct exchanges there are between the two, the better the user can use the plugin for building applications with the framework. This QAR is quantified by the ratio of requests made by the framework that were correctly interpreted by the plugin (successful requests) to the total number of requests during a runtime of the framework [40]. Intuitively, one argues that the framework must achieve perfect interoperability, with a perfect exchange success rate of 100%. Reality, however, tends to not agree with perfection, and it can never be guaranteed that exchanges will always be correct. Therefore, it is better to aim for a good interoperability measure and prepare for failed exchanges, instead of naively assuming the framework will be perfect. An exchange success rate of 99.99% is selected; the motivation for this bound is as follows. A plugin is assumed to be always correct up to the first mistake, after which the plugin is faulty, and the fault needs to be identified and it must be ensured that it won't occur again. An exchange success rate of 99.99% means that if 10,000 plugins are installed and used by the framework, only one will fail during uptime. For one plugin during framework uptime, the mean time between failures is then 10,000 exchanges. This number of exchanges is suspected to be very high for normal framework use. Because the possibility of faulty exchanges is acknowledged, the framework will need to implement a fallback mechanism to compensate; a sketch of such a mechanism follows Table 2.2. The utility tree is presented in Table 2.2.

Syntactic interoperability
  IS-1: The framework exchanges requests with a Producer plugin (known at runtime) with a success ratio of 99.99%. Incorrect requests are undone by the framework and logged. (H, H)
  IS-2: The framework exchanges requests with a Consumer plugin (known at runtime) with a success ratio of 99.99%. Incorrect requests are undone by the framework and logged. (H, H)

Table 2.2: Interoperability utility tree
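To make the "undone and logged" behaviour of IS-1 and IS-2 concrete, the snippet below gives a minimal sketch of such a fallback mechanism in Python. The plugin URL, resource name and undo convention are illustrative assumptions, not the framework's actual API.

    import logging
    import requests

    log = logging.getLogger("framework.exchange")

    def exchange(plugin_url, resource, payload, undo_payload):
        # Send a command request to a plugin; on failure, attempt a
        # best-effort rollback and log the faulty exchange.
        try:
            resp = requests.put(f"{plugin_url}/{resource}", json=payload, timeout=2)
            resp.raise_for_status()
            return True
        except requests.RequestException as err:
            log.error("Exchange with %s failed: %s", plugin_url, err)
            try:
                requests.put(f"{plugin_url}/{resource}", json=undo_payload, timeout=2)
            except requests.RequestException:
                log.error("Rollback of %s failed as well", plugin_url)
            return False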

Modifiability

Modifiability is the cost and risk of changing functionality of the system [40]. One of the most important values of the framework is modifiability of the supported thermal cameras and analysis modules. The framework needs to be extendable with new functionalities by enabling developers to add their support software in the form of a plugin. End-users should be able to modify the components that they use for their image processing applications easily and quickly, to allow for interchangeable hardware and software and to quickly set up new applications. Modifiability is defined in two environments: runtime, the periods during which the system is up and running, and downtime, the time periods during which the system is not active. The utility tree is presented in Table 2.3.

To enable users to choose the extensions they need, the framework will need a distribution service that contains all plugins available for the framework, from which a user can select and install plugins for their local version of the framework. Adding new plugins to the distribution service should not affect versions of the framework installed by the user. When a user adds a plugin from the distribution to his version of the framework, the framework should only reload once before making the plugin usable, for user comfort. Deployability is defined as the different device configurations that specify how the framework can be deployed. If the framework can be deployed in different fashions, this can increase the value for the end-user. Suppose a fire fighting use case in which a forest fire is monitored on site. Computationally powerful devices might not be available on site, so moving some plugins processing media to a remote server or cloud could still allow usage of the framework. Perhaps the device processing the media is already remote, for example a drone on security patrol; in this case access via a remote device such as a smartphone is desirable. This leads to the deployment configurations described in the utility tree.

Run time modifiability
  MS-1: Support for a new Producer plugin should be added to the distribution service within one day, without the framework having to restart. (H, H)
  MS-2: Support for a new Consumer plugin should be added to the distribution service within one day, without the framework having to restart. (H, H)
  MS-3: End-users should be able to extend their framework with new functionalities by installing new Consumer and Producer plugins. (H, H)
  MS-4: End-users should be able to modify the plugins used to build their stream. (H, H)
Down time modifiability
  MS-5: New Producer plugins can be installed to the local framework at runtime; the framework can only reload once before the plugin is usable. (H, H)
  MS-6: New Consumer plugins can be installed to the local framework at runtime; the framework can only reload once before the plugin is usable. (H, H)
Deployability
  MS-7: The system should be deployable on a combination of a smartphone and a cloud/remote server environment. (H, H)
  MS-8: The system should be deployable on a personal computer or laptop. (H, H)
  MS-9: The system should be deployable on a smartphone, laptop and cloud environment. (H, H)

Table 2.3: Modifiability utility tree

Usability

Usability indicates how easy it is for the user to accomplish a desired task and the kind of user support the system provides. Learnability indicates how easy it is for a user to gain knowledge on how to use the framework. Errors are the number of errors a user can make when trying to execute certain functions [40]. The utility tree is presented in Table 2.4.

Learnability
  US-1: A user should be able to learn how to build an image processing application in at most one hour. (H, L)
  US-2: An experienced developer should be able to start developing a Consumer plugin for the system within one day. (H, L)
  US-3: An experienced developer should be able to start developing a Producer plugin for the system within one day. (H, L)
Errors
  US-4: A user should not make more than 3 errors to build an image processing application. (H, L)

Table 2.4: Usability utility tree

Security

Security is a measure of the system's ability to protect data and information from unauthorized access, while still providing access to users and systems that are authorized. An action taken against the system to cause it harm is called an attack. Security has three main characteristics. Confidentiality is the property that data or services are protected from unauthorized access. Integrity is the property that data or services are protected from unauthorized manipulation. Availability is the property of the system maintaining its functionality during an attack. Authentication verifies the identities of the parties of an interaction, checks if they are truly who they claim to be, and grants or revokes access [40]. Security is important for the framework if it is deployed on multiple devices that use a public network to communicate. The utility tree is presented in Table 2.5.

Confidentiality
  SS-1: Streams created by a user can only be accessed by that user and not by any other entity. (H, L)
Integrity
  SS-2: Streams cannot be manipulated without authorization by the user that made the streams. (H, L)
Availability
  SS-3: During an attack the core functionality is still available to the user. (H, M)
Authentication
  SS-4: Users should authenticate with the system to perform functions. (H, L)
  SS-5: Developers should authenticate their plugins before adding them to the framework. (H, L)

Table 2.5: Security utility tree

Availability

Availability, in a general context (not only security), refers to how available the software is to carry out its functionality. Downtime is a measure of the time that the system is unavailable to carry out its functions. Availability is specified for the part of the framework that distributes the plugins. The utility tree is presented in Table 2.6.

Down time
  AS-1: The system should be up 99.5% per year. This means the system has an allowed scheduled downtime of 43 hours and 30 minutes per year for maintenance. (M, L)
  AS-2: The maximal duration of the interval during which the system is unavailable is 3 hours. (M, L)
Network
  AS-3: If there is no active network connection, the local device can be used for operation of the framework. (H, H)

Table 2.6: Availability utility tree




Architecturally significant requirements

Architecturally Significant Requirements (ASR) are the requirements that are the most important to realize according to business value and that have the most impact on the architecture. From the utility trees and the measures of the quality attribute scenarios, the QARs modifiability, interoperability and performance are identified as ASRs.

2.2 Patterns and tactics

An architectural pattern is a package of design decisions that is found repeatedly in practice, that has known properties that permit reuse, and that describes a class of architectures. Architectural tactics are simpler than patterns: they typically use just a single structure or computational mechanism, and they are meant to address a single architectural force. Tactics are the "building blocks" of design, and an architectural pattern typically comprises one or more tactics [40]. Based on the ASRs, several tactics are listed in Table 2.7 that are used for the base pattern selection. The explored patterns are layers, event-driven architecture, microkernel and microservices.

2.2.1 Layers

The layered pattern divides the software into units called layers, each of which performs a specific role within the application. Each layer is allowed to use the layer directly beneath it via its interface. Changes in one layer are isolated if the interfaces don't change, enabling MT-1, MT-2 and MT-5 [40]. While changes can be isolated by the isolated layers, they remain difficult due to the monolithic nature of most implementations of this pattern. Layers contribute to a performance penalty due to the "architecture sinkhole phenomenon", in which requests simply propagate through layers for the sake of layers [45].

2.2.2 Event-driven architecture

This pattern consists of several event publishers that create events and event subscribers that process these events. The publishers and subscribers are decoupled by using an event channel, to which the publishers publish events and which forwards those events to the event subscribers. The subscribers should have a single purpose and execute asynchronously. Since the publishers and subscribers are single-purpose and are completely decoupled from other components via the event channel, changes are isolated to one or a few components, enabling MT-1, MT-2, MT-4, MT-5 and MT-7. If the event channel adds a discovery mechanism, IT-1 can also be enabled. Overall, the pattern is relatively easy to deploy due to the decoupled nature of the components. Performance in general can be very high through the asynchronous nature of the architecture, enabling PT-6 and PT-7. If the event channel is tweaked to contain extra functionality, PT-1, PT-3, PT-8, PT-9, PT-10 and PT-11 can be enabled as well. If the components have a limited event response, then PT-2 and PT-5 can also be enabled. Development can be somewhat complicated due to the asynchronous nature of the pattern [40, 45].

2.2.3 Microkernel

The microkernel pattern allows the addition of application features as plugins to the core application, providing extensibility as well as feature separation and isolation. The pattern consists of two components: a core system called the kernel, and


plugins. The business logic is divided between independent plugins and the kernel. The kernel contains only the minimal functionality required to make the system operational. The plugins are standalone, independent components that contain specialized processing, additional features and custom code. This code is meant to enhance or extend the core system to produce additional business capabilities. In many implementations, plugins are independently developed third-party modules. Changes can largely be isolated and implemented quickly through the loosely coupled plugins; all MTs can be enabled. Depending on how the pattern is implemented, the plugins can be dynamically added to the kernel at runtime. Via a resource discovery service in the kernel, the ITs can be enabled. In general, most applications built using the microkernel pattern perform well, because applications can be customized and streamlined to only include the features that are needed [45].

2.2.4 Microservices

Microservices is an architectural pattern that structures an application as a collection of loosely coupled services that implement business capabilities. Each component of the pattern is deployed as a separate unit that can be deployed on one device or multiple devices. The components can vary in granularity, from a single module to a large portion of the application. The components contain one or more modules that represent either a single-purpose function or an independent portion of a business application [45, 46]. Due to the separately deployed units, changes are isolated to individual components, enabling all MTs. Via service discovery mechanisms, the ITs can also be enabled. The microservices pattern supports distributed deployment of the software across multiple devices by design. This pattern is not known to produce high-performance applications, due to the distributed nature of the pattern, which relies on communication via a network [45, 46].

2.2.5 Comparison of patterns

Table 2.7 summarizes the analysis of the patterns. A score is given based on how well the pattern enables each tactic. Low means that the pattern does not naturally enable the tactic. Medium indicates the pattern can be implemented with the tactic, but does not include it itself. High means the tactic is enabled in the pattern. Excellent means that the tactic plays a key role in the pattern.

The microkernel pattern and the microservices pattern both enable most tactics. The microkernel pattern implements extendability of the framework by design using plugins, which is the main idea behind the framework, and thus is an excellent base pattern. Interoperability and deployability of these plugins can be ensured by the microservices pattern, as it designs the microservices to have well defined interfaces for interoperability and allows for the framework to be deployed in a distributed fashion. The architecture presented below is a combination of both the microkernel pattern and the microservices pattern.


Tactic | Layers | Event-driven | Microkernel | Microservices
MT-1 Split module | Medium | High | High | Excellent
MT-2 Increase semantic coherence | Medium | High | High | Excellent
MT-3 Encapsulate | Medium | High | High | Excellent
MT-4 Use an intermediary | Medium | High | High | Excellent
MT-5 Restrict dependencies | High | High | High | Excellent
MT-6 Anticipate expected changes | Low | High | Excellent | Excellent
MT-7 Abstract common services | Low | High | Excellent | Excellent
MT-8 Defer binding / Runtime registration | Low | Low | Medium | High
IT-1 Discover services | Low | Low | High | High
IT-2 Orchestrate interface | Low | Low | High | High
IT-3 Tailor interface | Low | Low | High | High
PT-1 Manage sampling rate | Low | High | High | Medium
PT-2 Limit event response | Low | High | High | Medium
PT-3 Prioritize events | Low | High | High | Medium
PT-4 Reduce overhead | Low | High | High | Low
PT-5 Bound execution time | Low | High | High | Medium
PT-6 Increase resource efficiency | Low | High | High | High
PT-7 Introduce concurrency | Low | High | Low | High
PT-8 Maintain copies of computation | Low | High | Low | High
PT-9 Load balancing | Low | High | Low | High
PT-10 Maintain multiple copies of data | Low | High | Low | High
PT-11 Bound queue sizes | Low | High | Low | Medium
PT-12 Schedule resources | Low | High | Low | Medium

Table 2.7: Comparison of how well the discussed patterns enable the tactics needed for the ASRs


2.3 Software architecture

The software architecture is documented in three categories of views: static views, dynamic views and deployment views. The static views comprise the different components of the system and their relationships among each other. The dynamic views describe the runtime behavior of the system. Finally, the deployment views provide different configurations of how the system can be deployed on different devices [47].

2.3.1 Static view

Figure 2.2 presents an overview of the architecture using a component-connector UML diagram. Components are the boxes that represent different software entities that exist at runtime. The components have interfaces through which they interact with other components. These are indicated using the 'lollipop' notation, with the 'ball' representing the interface that a component provides and a socket indicating that another component uses this interface. The type of data exchanged is noted next to the interface. Multiple boxes indicate that multiple components of the same kind can exist at runtime [48].

The architecture consists of the following core components: Client Interface, Producer, Stream, Consumer, Producer Distribution, Consumer Distribution, Producer Plugin and Consumer Plugin. The clear components in Figure 2.2 form the core framework, which each user needs to install to use the framework. The colored components form a distribution service for framework plugins to extend the functionality; they are not installed with the core framework, but run as remote instances with which the user can interact to extend his version of the core framework with new plugins. A user can use the framework via the Client Interface, building streams that are maintained in the Stream component. The Stream component makes requests to the Producer and Consumer components to activate and control the selected plugins to build the stream. Additional plugins can be added to the framework and are distributed via the Producer and Consumer Distribution components. The architecture implements a hybrid combination of the microservices and microkernel patterns. Each presented component is a microservice that implements its own interface to interact with other components. The Producer and Consumer components act as kernels in the microkernel pattern, with the Producer and Consumer Plugins acting as plugins in the microkernel pattern. These patterns enable the tactics needed to meet the requirements presented in Section 2.1.

Communication protocol

To allow the microservices to communicate, a communication protocol must be designed. Communication protocols can roughly be classified into two categories: synchronous and asynchronous. Synchronous protocols block on requests, which means that the client waits for a response of the server and can only continue executing when a response is received. This makes a synchronous protocol inherently more reliable, but also slower. An example synchronous protocol is the Hyper Text Transfer Protocol (HTTP). Asynchronous protocols just send messages and do not block on the response. This makes the protocol less reliable, but also faster [49].

Figure 2.2: Overview component-connector diagram of the architecture

There are two types of traffic exchanged between microservices. First, there are the command requests that are exchanged between microservices to edit resources or change state. Second, there are the video frames that are exchanged between Producer and Consumer Plugins. Both types of traffic have different requirements. The commands must be communicated reliably and need to be executed once and only once. Reliability is more important than latency, so a synchronous protocol is preferred. Microservices traditionally implement the synchronous HTTP protocol with Representational State Transfer Application Programming Interfaces (REST APIs) that specify the application endpoints as textual resources [45]. This common protocol is used for the exchanged command requests in the framework.
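As an illustration, the snippet below sketches how a stream command could be exposed over HTTP/REST with Flask, the microservices framework used for the prototype (cf. Listing 1). The /streams resource and its payload shapes are illustrative assumptions, not the prototype's actual API.

    from flask import Flask, jsonify, request

    app = Flask(__name__)
    streams = {}  # illustrative in-memory store of stream models

    @app.route("/streams", methods=["POST"])
    def create_stream():
        # Synchronous command request: the caller blocks until the
        # framework confirms the stream resource was created.
        stream_id = str(len(streams) + 1)
        streams[stream_id] = {"plugins": [], "state": "STOP"}
        return jsonify({"id": stream_id}), 201

    @app.route("/streams/<stream_id>/state", methods=["PUT"])
    def set_stream_state(stream_id):
        # e.g. {"state": "PLAY"}; the Stream component would propagate
        # this transition to the plugins of the stream.
        streams[stream_id]["state"] = request.json["state"]
        return jsonify(streams[stream_id])

    if __name__ == "__main__":
        app.run(port=5000)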

The video frames need to be sent with low latency at a high frequency, but reliability is less important, so an asynchronous protocol is preferred. For video streaming, the Real-time Transport Protocol (RTP) running on top of the User Datagram Protocol (UDP) is selected, as it enables real-time transfer of data between processes [50]. RTP defines a standardized packet format to transmit video and audio over a network. It sequences each packet with a sequence number and a timestamp. This allows the application to detect missing packets and latencies in the network. The UDP protocol is a low latency, asynchronous transport protocol, as it doesn't guarantee packet delivery.

The recommended codec for transmitting video media is Motion-JPEG (MJPEG), which encodes video frames as separately encoded JPEG images. This makes analysis and processing in subsequent plugins easier, as only the received frame is needed to perform the analysis or processing. Other video compression formats, such as MPEG-4, use key-frames and object-oriented differential compression formats. If a key-frame is received via the stream, the frame can be used as is; if a reference frame is received, the receiver needs to wait for the corresponding key-frame to be received, to be able to construct the full video frame for analysis. This introduces extra complexity and lower quality detection, which is a clear trade-off against the quality and simplicity which MJPEG offers [51, 52].
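As an illustration, the prototype builds its media pipelines with GStreamer [30]; the sketch below shows a minimal MJPEG-over-RTP sender using the Python GStreamer bindings. The test source, host and port are placeholders standing in for a real thermal camera plugin.

    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import Gst, GLib

    Gst.init(None)

    # Each frame is JPEG-encoded separately and payloaded as RTP,
    # then sent over UDP; videotestsrc stands in for a thermal camera.
    sender = Gst.parse_launch(
        "videotestsrc is-live=true ! videoconvert ! jpegenc "
        "! rtpjpegpay ! udpsink host=127.0.0.1 port=5004"
    )
    sender.set_state(Gst.State.PLAYING)
    GLib.MainLoop().run()  # keep streaming until interrupted

    # A matching receiver could use:
    #   udpsrc port=5004 caps="application/x-rtp,encoding-name=JPEG,payload=26"
    #   ! rtpjpegdepay ! jpegdec ! autovideosink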

Applying these protocols to the architecture results in the network topology depicted in Figure 2.3. The full lines represent communication via HTTP on top of the Transmission Control Protocol (TCP). The dashed lines represent the RTP protocol on top of the UDP protocol. The boxes represent the different microservice components of the framework.


Figure 2.3: Framework network topology. Each box is a microservice component of the framework. The full lines indicate communication over the HTTP/TCP protocol; the dashed lines indicate communication over the RTP/UDP protocol.

Client Interface

The Client Interface is the interface through which end-users can interact with the framework. Figure 2.4 presents the detailed component-connector diagram. The Client Interface consists of a User Interface (UI) component and an API Gateway component. Devices can make requests to the Client Interface via the Client Requests interface provided by the API Gateway. The UI provides the UI Operation interface that is used by end-users to control the framework; this can be either a visual or textual interface. The UI actions are translated to client requests that are forwarded to the API Gateway using the Client Requests interface. The API Gateway translates the client requests and forwards them to the other core framework components.

Figure 2.4: Client Interface detailed view

Stream

The Stream component maintains the logical representation of the streams built by the end-user for his image processing application. Figure 2.5 presents the detailed component-connector diagram.


Figure 2.5: Stream detailed view

It consists of an API, a Stream Manager and several Stream Model components. The API provides the Stream Commands interface used by the Client Interface to interact with the framework; it translates incoming requests to commands for the Stream Manager, which can then execute these commands. Commands include creating a new stream, modifying the stream layout, modifying the stream state, etc. The Stream Manager creates and manages multiple streams, represented by the Stream Models. So the end-user builds Stream Models to create image processing applications. The Stream Model represents the logical model of these image processing application streams. As stated before, a stream consists of several plugins processing media, placed in some order and linked by the framework. Figure 2.6 illustrates this concept.

Figure 2.6: Logical model of a stream. The arrows represent the flow of media through the stream.

Logically, the Stream Model is represented as a tree with multiple roots and multiple leaves. The framework builds streams by initializing the needed plugins and connecting them in order. In the example Stream Model, plugins receive media from multiple source plugins and forward media to multiple targets. The Stream Model has a global state that represents the cumulative state of all plugins. To transition the global state from A to B, all plugins need to transition from A to B. This is done by first making the transition on the leaves of the Stream Model, after which the transition propagates to the root nodes. This ensures that no media is lost: the plugins that transition first can't process anything yet, as no media is being put into the tree at that point.
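A sketch of this leaf-first propagation in Python, assuming each plugin object exposes its downstream listeners and a set_state() call (illustrative names, not the framework's actual API):

def transition(plugin, target, visited=None):
    # Transition the downstream plugins (towards the leaves) first,
    # then the plugin itself, so an upstream plugin never forwards
    # media to a plugin that has not transitioned yet.
    if visited is None:
        visited = set()
    if plugin in visited:
        return
    visited.add(plugin)
    for listener in plugin.listeners:
        transition(listener, target, visited)
    plugin.set_state(target)

# For the whole stream: call transition() on every root (source) plugin.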


Producer and Consumer plugins

A Plugin represents an independent media processing element, either of the Producer type (such as a thermal camera) or the Consumer type (such as an object detection software module). Plugins are deployed as standalone microservices providing a REST API that the framework uses to control the plugin. Figure 2.7 represents a general plugin model. A plugin receives media from other plugins, called the sources, processes this media and forwards it to other plugins, called the listeners. A Producer plugin only has listeners; a Consumer plugin has both sources and listeners. Merging the media from multiple sources and forwarding the processed media to multiple listeners is the responsibility of the plugin.

Figure 2.7: Plugin model

The plugin REST API should at least provide a state resource, representing how the plugin is processing media; a sources resource, representing the sources from which the plugin receives media to process; and a listeners resource, representing the listeners to which the plugin transmits the processed media. Only Consumers have both the sources and listeners resources, as Producer Plugins produce their own media and hence can only have listeners.

To indicate if and how the plugin is actively processing media, a finite state machine is implemented. The state transition diagram is presented in Figure 2.8. A plugin can be in four possible states: INACTIVE, STOP, PLAY and PAUSE. When a plugin is in the INACTIVE state, no active microservice is running the plugin; this is the initial state for all plugins of the framework. This state is only visible to the framework, as the plugin is not instantiated. When a plugin is in the STOP state, the framework has instantiated a microservice running the plugin. The plugin is listening for commands on its API but is not processing any media. This state is visible to the plugin. In the PLAY state, a plugin processes media received from its source(s), transmits processed media to its listener(s) and listens for commands. When in the PAUSE state, media processing is paused but media buffers are kept. This decreases the latency when the plugin transitions back to the PLAY state, since the plugin can continue processing from the point where it was paused. The difference with the STOP state is that when transitioning to the STOP state, the plugin clears its media buffers.

The plugin starts in the INACTIVE state. When a microservice running the plugin is instantiated by the framework, the plugin initializes itself in the STOP state. From the STOP state, the plugin can transition to the PLAY state to process media. This transition is only successful if sources and listeners are registered with the plugin. From the PLAY state, a transition to both the STOP state and the PAUSE state can be made, which stops the processing of media and respectively drops or keeps the media buffers. The plugin cannot make multiple state transitions per command. When a transition is made to INACTIVE, the framework first transitions the plugin to the STOP state, after which the INACTIVE state can be reached.

Figure 2.8: The state transition diagram for a plugin
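A sketch of this state machine in Python, with the transition table assumed from the description above:

ALLOWED = {
    'INACTIVE': {'STOP'},
    'STOP': {'PLAY', 'INACTIVE'},
    'PLAY': {'STOP', 'PAUSE'},
    'PAUSE': {'PLAY', 'STOP'},
}

class PluginState:
    def __init__(self):
        self.state = 'INACTIVE'  # initial state for every plugin

    def transition(self, target):
        if target not in ALLOWED[self.state]:
            raise ValueError('illegal transition %s -> %s' % (self.state, target))
        if target == 'STOP':
            self.clear_media_buffers()  # STOP drops buffers, PAUSE keeps them
        self.state = target

    def clear_media_buffers(self):
        pass  # drop any buffered frames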

A source/listener has the following fields: hostname, the hostname of the microservice running the plugin, and port, the port on which the source/listener is reachable.

On the sources and listeners resources, an HTTP GET and POST method must be provided: GET retrieves the sources/listeners and their details, POST adds a new source/listener to the plugin. Both resources additionally need to provide an individual endpoint per source/listener, on which GET, PUT and DELETE must be provided for individual manipulation of the source/listener: GET retrieves the details, PUT updates the fields of a source/listener and DELETE removes a source/listener from the plugin.
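A minimal sketch of such a plugin API in Flask (the microservices framework chosen later in Section 3.5); the in-memory data structures and error handling are simplified:

from flask import Flask, jsonify, request

app = Flask(__name__)

state = {'state': 'STOP'}
listeners = []  # each entry: {'hostname': ..., 'port': ...}

@app.route('/state', methods=['GET'])
def get_state():
    return jsonify(state)

@app.route('/state', methods=['PUT'])
def set_state():
    state['state'] = request.get_json()['state']
    return jsonify(state)

@app.route('/listeners', methods=['GET', 'POST'])
def all_listeners():
    if request.method == 'POST':
        listeners.append(request.get_json())  # {'hostname': ..., 'port': ...}
    return jsonify(listeners)

@app.route('/listeners/<int:index>', methods=['GET', 'PUT', 'DELETE'])
def one_listener(index):
    # Individual manipulation of a registered listener.
    if request.method == 'PUT':
        listeners[index].update(request.get_json())
    elif request.method == 'DELETE':
        return jsonify(listeners.pop(index))
    return jsonify(listeners[index])

A Consumer plugin would expose the same pair of resources for its sources.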

Producer and Consumer

The Producer and Consumer components are responsible for interacting with and managing the Producer/Consumer plugins used in the streams. Figure 2.9 presents the component-connector diagram of the Producer and Consumer components. Both components have a similar architecture but are kept separate, because their plugin models differ and they are expected to often be deployed on different devices with specific hardware requirements: Producer Plugins could be deployed on the thermal camera itself, having a very specific operating system, whereas a Consumer Plugin might need access to specific processors to speed up its execution.

The Producer and Consumer consist of the following components: API, Kernel, Plugin Model and Plugin Manager. The API translates requests coming from the Stream component to commands for the Kernel. The Kernel implements the core functionalities, such as activating (deploying) and deactivating plugins, managing their state and manipulating their resources. It creates a Plugin Model for each Plugin that the framework has installed; this model represents a plugin logically on framework level and keeps track of the Plugin resources. The Plugin Manager manages the plugins that were added to the framework, stored in the Plugin Directory. It manages the plugin installations, adding updates or installing additional plugins that can be retrieved from the Producer and Consumer Distribution components.

23 Software architecture 21

(a) Producer component-connector diagram

(b) Consumer component-connector diagram

Figure 2.9: Component-connector diagrams of the Producer and Consumer modules

Producer and Consumer Distribution

The Producer and Consumer Distribution components are responsible for managing and maintaining the plugins for the framework. They act as online software repositories from which local versions of the framework can retrieve new plugins. The component-connector diagrams are presented in Figure 2.10. The Distribution components consist of the following subcomponents: API, Plugin Manager and Plugin Tester. Plugin Developers can make requests to the API, which translates these requests to Create, Read, Update, Destroy (CRUD) commands for the Plugin Manager. The Plugin Manager executes these commands on the Plugins that are kept in the Plugin Repository. The quality of the framework depends on the quality of the plugins that it offers. Therefore, plugins should be thoroughly tested before being added to the framework. The Plugin Tester component is responsible for this testing. Tests should include checking whether the plugin implements the Plugin Model correctly, whether it meets the performance requirements, etc. When a plugin passes these tests, it is added to the Plugin Repository, so that end-users can install the plugin and use it for their applications.


(a) Producer Distribution (b) Consumer Distribution

Figure 2.10: Producer and Consumer Distribution component-connector diagrams

2.3.2 Dynamic views

Dynamic views depict the behavior of the system and complement the static views. They are documented using sequence diagrams, which show an explicit sequence of messages between architecture elements describing a use case [40]. Two key use cases are presented here: adding a plugin to the stream and linking plugins to build the stream.

Add plugin to stream

Figure 2.11 presents the sequence diagram for adding a Producer plugin to the framework. The framework is assumed to be running, the user has created a stream S and the Producer Plugin A is correctly installed. The end-user executes the command to add A to stream S on the Client Interface, which passes the command to the Stream component. The Stream component requests the creation of a microservice instance of A, which is created by the Producer Kernel. When the Producer Plugin is instantiated, the Producer Kernel creates a Plugin Model of A and adds it to its references, so that the instance of A can be reached for future commands. Afterwards, the Stream Manager is informed of the success, upon which the Stream Manager can add A to the Stream Model, ready to be linked with other plugins. The user is notified of this success and can continue building. If A could not be instantiated (due to not being installed, not being installed correctly, etc.), A is marked as 'broken' and the user is notified that the action could not be completed. When the plugin is marked as 'broken', it can no longer be used and needs to be reinstalled. The sequence diagram for adding a Consumer Plugin is similar, but replaces the Producer components with the Consumer components.

Link plugins

Figure 2.12 presents the sequence diagram for linking two plugins in a stream. In the sequence diagram, two Consumer Plugins A and B are linked; this can be extended to a Producer Plugin linking with a Consumer Plugin. The framework is assumed to be running, the user has created a stream S, and the plugins A and B have been instantiated and added to the stream. The end-user executes the command to link A and B in stream S on the Client Interface, which passes the command to the Stream component, which checks if the link is valid for the Stream Model S. Linking can only be done if the stream is in the STOP state and if the plugins are already in the stream. If the link is valid, the Stream Manager can begin linking the plugins.

Figure 2.11: Add a Producer Plugin to stream

To link the plugins in the order A-B, A is added as a source for B and B is added as a listener for A. These subsequences are found in their corresponding frames in the diagram and are very similar. The Stream Manager makes the request to add the source/listener to the Kernel, which finds the corresponding plugin and makes the request on the corresponding Plugin Model. If the Plugin succeeded, the Plugin Model is updated and the Stream Manager is notified of this success. If both plugins have successfully set the source and listener, the Stream Model layout is updated with the link. Should the source/listener request fail for one of the plugins, the change is rolled back and the end-user is notified.

2.3.3 Deployment views

The different deployment configurations are illustrated via deployment diagrams using the UML 2.5 Deployment specification [48]. 'Host' specifies the device on which components are deployed. The 'microservice' indicates the isolated environment in which components are running. These isolated environments on the host are realized as software containers, which enable portability of the components to other deployment configurations; this concept is further discussed in Section 3.3. The Producer and Consumer Distribution components were left out of the diagrams, as they are always distributed on a different host than the core framework. Two deployment configurations are presented: the local configuration, which deploys the components on a single device, and the distributed configuration, which deploys each component on a separate device. These configurations are presented in Figure 2.13.


Figure 2.12: Link two plugins in a stream. The 'format request' and 'translate request' actions in the API components have been omitted to reduce clutter in the diagram, but are executed by the API components.


Local configuration deployment

The local configuration deploys the framework on a single local device, as depicted in Figure 2.13a. Because the framework is deployed as a whole, it can operate offline. This configuration is useful for image processing applications that can't rely on a stable network connection, for example in remote locations or densely built-up areas. The components are still deployed as separate microservices, due to the architecture of the framework. This has an impact on the performance of the framework, because every interaction between components uses either the HTTP message protocol or the RTP protocol, which introduces extra overhead compared to direct invocation of commands.

Distributed configuration deployment

The distributed configuration deploys the framework on multiple devices. The components are distributed over these devices, made possible by the microservice isolation and communication protocols. This configuration is depicted in Figure 2.13b. Obviously, in this configuration each component of the framework must have a reliable network connection to communicate with the other framework components. This configuration could be used, for example, for a security application. The end-user has the Stream module running on a master node that controls several cameras. The end-user can configure his image processing application through the Client Interface running on his device, which communicates with the Stream module running on the master node. The master node can control each camera by communicating with the Producer component. If, for example, the security application requires autonomous detection of trespassing people, a computationally intensive task, the Consumer Plugins could need dedicated hardware that is only available on another device. The Consumer component can then be deployed on that dedicated device, and the Stream component can again communicate with it over the network. The success of this configuration depends on the availability and capacity of the network: if the network fails, commands and media can't come through and the framework can no longer execute. Due to the distributed nature, performance will also be worse compared to the local configuration, because each request between the components travels over a network that can experience delays.


(a) Local configuration deployment diagram (b) Distributed configuration deployment diagram

Figure 2.13: Deployment diagrams


Chapter 3

State of the art and technology choice

To build and test a proof of concept implementation of the architecture presented in Chapter 2, several state of the art technologies can be used to support the framework. These are presented in Sections 3.1, 3.2, 3.3 and 3.4. For each category, a choice is made that serves as the basis for the implementation of the proof of concept, discussed in Section 3.5. Readers already familiar with the presented technologies can safely skip ahead to Section 3.5.

3.1 Thermal camera options

This section aims to provide an overview of some currently commercially available thermal cameras. The overview is not a complete overview of all products offered by all vendors. The data was gathered in September 2017, so some products may have been discontinued and new products launched since. Several parameters are collected for each product. Section 3.1.1 discusses why these parameters are important to assess the quality of a thermal camera. Section 3.1.2 aggregates these parameters and presents insights into the data. The full list of specifications can be found in Appendix B.

3.1.1 Parameters

The following parameters were considered for the comparison: price, physical specifications, image quality, thermal precision, interfaces, energy consumption, help and support, user experience and auxiliary features.

Price

Thermal cameras are relatively expensive when compared to visible light cameras. For example, a 20 megapixel (MP) visible light camera can cost as little as 100 euro, while thermal cameras, having a much lower image resolution, can cost as much as 15,000 euro. Prices for thermal cameras cover a very wide range, and budgets are limited in practice.


Physical specifications

Two specifications are considered: the weight of the camera and its dimensions. Drones have a limited carrying weight due to maximal carrying capacities, and battery life drains faster when carrying heavier loads. Lighter and smaller cameras are therefore preferred for usage with drones; these often offer lower image quality and fewer features than the heavier cameras.

Image quality

Image quality specifies how much information an image can possibly hold. It is determined by the following parameters: resolution, capture frequency (frame rate), field of view and radiometric information. Image resolution is the amount of detail an image holds. Higher resolution cameras can capture more details in a scene, resulting in a sharper image that holds more information. Due to more details, smaller objects can also be seen, allowing scenes to be viewed from larger distances. Drones capture images from relatively large distances, so good resolutions are required for the images to be useful. Image resolution is measured in pixel density, presented as the product of the amount of pixels in the width and height of the image. The highest resolution found for the compared cameras is 640 x 512 pixels. Some cameras offer a visual camera next to the thermal camera. This allows an overlay of the visual image and the thermal image, so-called Multi Spectral Dynamic Imaging (MSX). This creates artificially sharper images, as edges are more visible in the visual image. Figure 3.1 depicts a thermal-only image and an MSX image of a dog; the MSX image is noticeably sharper. MSX is a more low-cost solution to produce sharper images compared to increasing the thermal resolution, as visible light cameras are less expensive [7].

(a) Thermal (b) MSX

Figure 3.1: Thermal image and MSX image of a dog

The capture frequency or frame rate dictates how many frames the camera can capture per second. Higher frequency cameras are able to track dynamic scenes better. The field of view is the angle through which the camera is sensitive to thermal radiation, and determines the extent of the world that can be seen by the camera. A bigger field of view can capture more of the environment in one image. Most cameras allow various lenses to be mounted onto the camera, which allows for greater flexibility in choosing the field of view. Radiometric image information is thermal information embedded with the infrared image that can be analyzed after recording. Radiometric information characterizes the distribution of the thermal radiation's power in space and specifies the exact temperature per pixel. Regular thermal images use a relative scaling of temperatures that are mapped onto a colorspace, with some color being the hottest color in the image and another color the coldest. For example, in Figure 3.1a the Iron color scheme is used, which maps the cold regions of the image on blue color variants and the warmer regions on red and yellow variants. Radiometric information can give a very detailed description of the radiation pattern of a scene.

Thermal precision

Thermal precision specifies the temperature range, the sensitivity and the accuracy of the temperature measurements. The temperature range indicates the minimum and maximum temperatures a camera can detect. A larger temperature range comes with a trade-off in sensitivity and accuracy. Cameras often offer different modes of operation and operate using different intervals according to the accuracy needed in a scene. Sensitivity indicates the ability of the camera to record finer distinctions in temperature. Accuracy is the margin of error for temperature readings on the thermal camera. An accuracy of 5 degrees Celsius for small temperature ranges and 20 degrees Celsius for large temperature ranges is commonly found; the increase in error margin is a trade-off for the larger temperature interval. Objects emit infrared waves in various forms (due to black-body radiation [7]). To accurately compare temperatures, cameras often implement emissivity corrections that normalize the measurements.

Interfaces

Cameras can communicate with other devices via several interfaces during use. Cameras mounted on a drone cannot be accessed during flight and need these interfaces to transfer data. USB and HDMI are the most commonly found interfaces to connect the camera with an on-board processing unit, gimbal or battery. MAVLink [53] is a very lightweight, header-only message marshalling library for micro air vehicles (drones); when a camera provides this interface, the camera can be controlled remotely via a very efficient communication scheme. Other interfaces include Bluetooth and Wi-Fi.

Energy consumption

A device mounted on a drone has a limited energy source at its disposal. The less energy the camera consumes, the longer the drone can operate. This can even lead to lighter batteries being used in-flight, reducing the carried weight and therefore also the energy consumption. Typically, the energy consumption of a camera is much lower than that of the drone itself, so this is a minor specification. Input voltage and power consumption are specified.


Help and support

How the camera is supported by the company has a big impact on the ease of developing applications for the camera platform. User manuals, phone or email support and FAQs are very helpful. Should the camera malfunction, a product warranty is necessary to recover the broken product.

User experience

The user experience is another important factor, as there can be a difference between the technical specifications and the actual experience of the user. The user experience is measured as a number of good and a number of bad reviews. Reviews are scored from zero to five stars, with zero being a very bad experience and five a very good one. A good review is scored three stars or more, a bad review less than three stars.

Auxiliary features

Some cameras offer even more features than the ones mentioned above. These can be a connection with the Global Positioning System (GPS) to indicate where images were captured, a software application to interact with the camera, analysis functionality, tracking, etc.

3.1.2 Comparative analysis

It can be seen that FLIR is the market leader in thermal solutions for drones: they offer the largest product line, and products from other companies often utilize one of their camera cores. Figure 3.2a plots the retail price against the thermal resolution. Cameras with high and low resolutions are found across all price ranges, so clearly other features determine the price of a thermal camera. A feature function is defined that maps the features of a thermal camera onto an integer. The function increments the integer if:

- The camera has MSX support.
- The camera has a standard data format (not just an analog or digital signal).
- The camera offers radiometric information.
- The image resolution is at least 640 x 512 pixels, the highest resolution found for these products.
- The sensitivity is smaller than 100 mK.
- The camera offers emissivity correction.
- The camera offers a USB interface.
- The camera offers a MAVLink interface.
- The camera offers an HDMI interface.
- The camera offers a Bluetooth connection.
- The camera offers a Wi-Fi connection.
- The camera offers GPS tagging.
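A sketch of this feature function in Python, assuming each camera is represented as a dictionary of its specifications (the field names are illustrative):

def feature_points(camera):
    width, height = camera.get('resolution', (0, 0))
    interfaces = camera.get('interfaces', ())
    checks = [
        camera.get('msx', False),
        camera.get('standard_data_format', False),
        camera.get('radiometric', False),
        width >= 640 and height >= 512,
        camera.get('sensitivity_mK', float('inf')) < 100,
        camera.get('emissivity_correction', False),
        'USB' in interfaces,
        'MAVLink' in interfaces,
        'HDMI' in interfaces,
        'Bluetooth' in interfaces,
        'Wi-Fi' in interfaces,
        camera.get('gps_tagging', False),
    ]
    # One point per satisfied feature.
    return sum(1 for check in checks if check)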

Figure 3.2b plots these feature points versus the retail price, which gives a more log-like relationship: the features of a camera determine the price much more than the image quality alone. For less than 5,000 euro, thermal cameras are found that implement most basic features. Beyond that, the price increases rather fast for relatively few added features; features like radiometry require additional hardware that greatly increases the price of the camera.

3.2 Microservices frameworks

The architecture presented in Section 2.3 relies heavily on the microservices pattern. Therefore, this Section presents several microservices frameworks that can support this architecture. Figure 3.3 depicts the results of the Rethink IT survey querying the most used frameworks for microservices by developers [54]. The most popular frameworks, Java EE and Spring Boot, are written in Java. The Java EE framework is more of a one-stop-shop framework, offering much more functionality than just a backbone microservices framework, and is therefore not considered. Spring Boot is clearly a very popular and mature framework, more streamlined for microservices. Vert.x is a more upcoming framework, renowned for its performance, making it worthwhile to explore. Python is an upcoming language for web development and, because it is excellent for prototyping, several frameworks for this language are explored as well. The frameworks presented here are Vert.x version 3.5.1, Spring Boot version 2.0, Flask version 0.12, Falcon version 1.4.1 and Nameko version 2.9.0.

3.2.1 Flask

Flask is a micro web development framework for Python. The term 'micro' means that Flask aims to keep its core simple but extensible. Flask is an unopinionated framework, as it only provides a glue layer to build a REST API around the application.

Figure 3.2: (a) Camera resolution compared to retail price; (b) camera feature points compared to price


Figure 3.3: Rethink IT 'Most used tools and frameworks for microservices' results [54]

However, it provides a large list of extensions if extra functionality is required [55]. Starting a microservice is very simple, as illustrated in Listing 1. Flask uses the concept of Python decorators [56] to bind Python functions to a REST API. In Listing 1, for example, the function service_status() is linked to the '/' resource. When a user issues an HTTP GET request on this resource, the route() function on the app object is called by Flask. Because route() decorates the service_status() function, service_status() is wrapped and passed to route(), so that when a user issues an HTTP GET request, the wrapped service_status() function gets called. This allows for an easy construction of the REST API: just decorate all the functions of the microservice with the correct Flask decorator.

from flask import Flask

app = Flask(__name__)

@app.route('/')
def service_status():
    return 'service_status'

if __name__ == '__main__':
    app.run()

Listing 1: Minimal Flask application

Because Flask is a microframework, its memory footprint is small, the binary file being only 535 KB. It is in use by several large companies such as Netflix and Reddit [57]. In a production environment the default Flask web server is not sufficient, as it only serves one user at a time; for prototyping, however, it is an excellent framework [55].


3.2.2 Falcon

Falcon is a bare-metal Python web framework that differentiates itself in performance when compared to other frameworks. It targets microservices, being even more lightweight and faster than frameworks like Flask; in a benchmark test it achieves 27 times the speed of Flask [58]. The framework seems less mature and has not been adopted by many companies [59]. It is not considered for the prototype of the system, but could be used in production as it achieves better performance.

3.2.3 Nameko

Nameko is a framework specifically built for microservices in Python. Next to offering a REST API, it also offers asynchronous events over the Advanced Message Queuing Protocol (AMQP). It is only meant to be used for microservices, not for web applications that serve content. It is a relatively young framework and is not backed by any major companies as of yet; it is, however, backed by the developer of the Flask framework [60].

3.2.4 Vert.x

Vert.x is a toolkit for building reactive applications on the Java Virtual Machine (JVM). This framework follows the reactive systems principles. These principles are used to achieve responsiveness and to build systems that respond to requests in a timely fashion, even with failures or under load. To build such a system, reactive systems embrace a message-driven approach: all components interact using messages sent and received asynchronously. Reactive microservices built with Vert.x have the following characteristics: autonomy, asynchronicity, resilience and elasticity. Vert.x is a toolkit and can be used as any other library, which makes it very flexible. It provides a large set of features: metrics, different programming languages, different protocols, templating, data access, cluster management, etc.

Vert.x embraces the asynchronous development model, as can be seen in Listing 2.

import io.vertx.core.AbstractVerticle;

public class Server extends AbstractVerticle {
    public void start() {
        vertx.createHttpServer().requestHandler(req ->
            req.response()
                .putHeader("content-type", "text/plain")
                .end("Hello from Vert.x")
        ).listen(8080);
    }
}

Listing 2: Vert.x example


The event which occurs is the HTTP request. On arrival of the event, the Handler is called and executed. The Handler is chained to a listen request and does not block the calling thread. The Handler is only notified when an event is ready to be processed or when the result of an asynchronous operation has been computed [61].

3.2.5 Spring Boot

Spring Boot is an opinionated Java framework for building microservices, based on the Spring dependency injection framework. It allows developers to create microservices with reduced boilerplate and configuration. For simple applications it provides a similar syntax to Flask in Python and uses decorators (annotations) for routing. An example is given in Listing 3. The framework handles most of the routing and request handling, but restricts the developer in application structure. The framework is not lightweight and performs less well than Vert.x [62].

@RestController
@RequestMapping("/api")
public class HelloRestController {

    @RequestMapping(method = RequestMethod.GET, value = "/hola",
                    produces = "text/plain")
    public String hello() {
        return "Hello Spring Boot";
    }
}

Listing 3: Spring Boot example

3.3 Deployment framework

To satisfy the modifiability and interoperability requirements discussed in Section 2.1.2 and to support the different deployment configurations in Section 2.3.3, Linux containers (LXC) are used. A container is a lightweight operating system running inside the host system, executing instructions native to the core CPU and eliminating the need for the instruction-level emulation that Virtual Machines use. Containers provide an identical, isolated runtime environment for processes without the overhead of virtualization. This makes them perfect for highly portable software, as only the container needs to be moved and it can directly be executed on any system supporting containers [63]. First, the concept of containers is introduced in Section 3.3.1. Second, several container frameworks are presented in Sections 3.3.2, 3.3.3 and 3.3.4.

3.3.1 Containers

Containers sandbox processes from each other and are often described as the lightweight equivalent of virtual machines. The difference between a virtual machine and a container is the level of virtualization. Virtual machines virtualize at the hardware level, whereas containers do this at the operating system (OS) level. The achieved effect is similar, but there are significant differences. Containers make available protected portions of the OS and share its resources. Two containers running on one OS have their own OS abstraction layer and don't know they are running on the same host. This provides a significant difference in resource utilization. Virtual machines provide access to hardware only, so it is necessary to install an OS; as a result, there are multiple OSs running, which gobble up resources. Containers piggyback on the running OS of the host environment. They merely execute in spaces that are isolated from each other and from certain parts of the OS. This allows for efficient resource utilization and for cheap creation and destruction of containers. Consequently, starting and stopping a container is equivalent to starting and stopping an application [64, 65]. This comparison is illustrated in Figure 3.4.

Containers offer several advantages over running a process directly on the system. Due to the OS virtualization of the containers, software is always deployed on the same operating system, defined in the container. This allows for a 'write once, run everywhere' scenario, which makes the system portable to a range of devices. Containers communicate with each other using protocols such as HTTP. This allows the processes in containers to be written in any programming language, using any external library that is needed. For the system this means that if the Producer and Consumer Plugins are packaged as containers, they can effectively be made in any available technology, greatly enhancing the extensibility of the system.

3.3.2 LXC

Linux containers are the basis on top of which other container frameworks are built. LXC provides a normal OS environment similar to a VM; the containers in this framework behave almost identically to a VM and can run multiple processes. LXC can be used directly, but offers only low-level functionalities and can be difficult to set up [67].

3.3.3 Docker

Docker started as an open-source project at dotCloud in early 2013, as an extension of the technology the company had developed to run its cloud applications on thousands of servers [64]. Now, Docker is a standalone, mature company providing a software container platform for the deployment of applications [66]. Docker provides two main services: a simple toolset and API for managing Linux containers, and a cloud platform which provides easy access to recipes for software containers created by other developers [68]. Docker is the container technology with the most public traction and is becoming the container standard at the time of writing, due to its functionalities and very responsive community. It offers functionality to easily build and run containers, but also to manage them in large clusters. Design decisions that limit Docker are that each container can only run one process at a time, and the Docker client: Docker consists of a daemon that manages the containers, and the API Engine, a REST client. Should this client fail, dangling containers can arise [69].

3.3.4 rkt

CoreOS' rkt is an emerging container technology providing an API engine similar to the Docker API Engine, which can run LXC containers as well as Docker containers. rkt focuses on security and standardization, and is specifically designed to run in cloud environments. Unlike Docker, rkt does not use a daemon process with a REST client; the command line tool executes all the operations, which makes the framework more reliable. rkt is not as mature as Docker yet. It is portable to multiple Linux environments, but not yet to macOS and Windows [70].

(a) Container stack (b) Virtual machine stack

Figure 3.4: Containers compared to virtual machines [66]

3.4 Object detection algorithms and frameworks

As stated in Section 1.3.2, object detection is the computer vision task of detecting which objects are present in an image and where they are located. Several approaches to this problem have been proposed, some of which focus on thermal images. This section aims to give a small overview of different existing techniques. For the technical details of the algorithms, the reader is referred to the respective articles.

3.4.1 Traditional approaches

Traditional approaches include hot-spot detection techniques and Adaptive Boosting (AdaBoost) with various feature extraction techniques, such as Aggregated Channel Features (ACF) and Integral Channel Features (ICF). These methods rely on clever feature engineering solutions that use domain knowledge or statistical insights to transform the raw dataset into a specific set of features in order to find patterns [32].

Hot-spot detection

Hot-spot techniques work on the assumption that people have an overall higher body temperature than most of the background in the thermal image. These techniques first select candidate objects: the hot-spots in the image. The hot-spots define the regions on which a classifier is run, and are thus the localization step in the object detection problem. Afterwards, a classifier is trained on these candidates. Xu et al. used a Support Vector Machine (SVM) classifier to classify whether a hot-spot represented a pedestrian [71]. Nanda et al. used a Bayes classifier to classify the hot-spots [72]. These methods are generally not applicable, because people often are not the only hot-spots in thermal images.

AdaBoost

AdaBoost is a machine learning algorithm that utilizes the output of so-called weak learning algorithms (weak learners) and combines their outputs into a weighted sum that forms the output of the boosted classifier. AdaBoost modifies the weak learners in favor of data points misclassified by previous classifiers [73]. Viola and Jones built a detection algorithm that uses two consecutive frames of a video sequence and trains the AdaBoost classifier on both motion and appearance information [74]. Davis et al. use a two-stage template approach that initially performs a fast screening procedure with a generalized template based on a contour saliency map to locate potential person locations. Any window located in the first phase is then forwarded to the AdaBoost algorithm to validate the presence of the person. Dollár et al. extracted features using different ICF and ACF [35]. ICF and ACF compute features by calculating several aggregations over the different channels of an image, such as gradient, color histogram and color channels. Goedeme et al. expanded these detectors with extra thermal channels to achieve results comparable to those of Dollár et al., but for thermal images [36].
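In formula form, with weak learners h_t(x) and per-learner weights α_t determined during training, the boosted classifier is:

H(x) = \operatorname{sign}\left( \sum_{t=1}^{T} \alpha_t \, h_t(x) \right)

where training points misclassified in earlier rounds receive higher weights, so later weak learners focus on them.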

3.4.2 Deep learning

Over the past few decades, there has been a shift in proposed solution methods towards deep learning. Deep learning for object detection uses Convolutional Neural Networks (CNN). CNNs are a specialized kind of neural network for processing data that has a known grid-like topology, such as images. CNNs generally consist of three steps: a convolution step that creates a feature map of a region of an image, a pooling step that summarizes the output of the convolution step, and finally a fully-connected network that learns from the features extracted in the previous steps [75]. The key difference is that these algorithms do the feature extraction in the convolutional layers and do not need feature engineering like the algorithms presented in Section 3.4.1. This requires quite a bit of computing power when compared to the traditional methods. Since deep learning made the shift to computing on Graphical Processing Units (GPUs), computations became feasible and these models proved to achieve very good performance on various machine learning problems. Two model types are described: two-stage networks (R-CNN, R-FCN) that extract image regions first and make separate predictions on each region, and dense networks (YOLO, SSD, NASNet, RetinaNet) that operate on the image as a whole.

Region-based Convolutional Network (R-CNN)

R-CNN uses a selective search method to find objects, an alternative to exhaustive search in an image. It initializes small regions in an image and merges them hierarchically; the detected regions are merged according to color spaces and other similarity metrics [76]. R-CNN combines this selective search with a CNN per region to find out which objects are present in these regions [77].


Fast(er) Region-based Convolutional Network (Fast(er) R-CNN)

Fast R-CNN was developed to reduce the time consumption related to the high number of models necessary to analyze the region proposals from the selective search method in R-CNN. Instead of using a CNN for each region, a single CNN with multiple convolutional layers is used [78]. Faster R-CNN drops the region proposals detected with the selective search method (which is computationally expensive) and introduces the Region Proposal Network (RPN) to directly generate region proposals. This accelerates training and testing and improves performance [79]. Mask R-CNN is an extension of the Faster R-CNN model that adds a parallel branch to the bounding box detection to predict object masks, that is, the segmentation of an object by pixel in the image [80].

Region-based Fully Convolutional Network (R-FCN)

R-FCN tries a more efficient approach to region detection. Instead of applying a per-region subnetwork multiple times, R-FCN uses a fully convolutional network with computations shared across the entire image. This allows it to be compatible with multiple backbone networks, such as Residual Networks [81].

You Only Look Once (YOLO)

The previously discussed methods need to run the same computations on different parts of an image multiple times before generating a prediction, which makes them relatively slow. The YOLO model [82] was developed with the requirement to make predictions as fast as possible, trading off accuracy for speed to move towards real-time object detection. YOLO directly predicts bounding boxes and class probabilities with a single CNN in a single evaluation, instead of first detecting object regions and predicting classes afterwards. This has some benefits over the other methods. YOLO is very fast when compared to other methods, capable of processing images in real-time, up to 155 frames per second for some variants. It also learns contextual information, because it trains on entire images instead of regions. YOLO also generalizes better to other image types. All these benefits come at the cost of accuracy: YOLO struggles to precisely localize some objects, especially small ones. The following versions of YOLO focus on delivering more accuracy; the algorithm is currently in its third version [83].

Single-Shot Detector (SSD)

The SSD [84] is similar to YOLO and predicts all the bounding boxes and the class probabilities in one single evaluation (single shot) using one CNN. The model takes an image as input, which passes through multiple convolutional layers. When compared to YOLO, SSD achieves higher accuracies by adding convolutional layers and including separate filters for different aspect ratio detections.

Neural Architecture Search Net (NASNet)

NASNet takes a different approach and does not design the network architecture for the object detection beforehand, but instead trains a Recurrent Neural Network (RNN) to generate the model descriptions of the CNN that performs the object detection. The RNN is trained using reinforcement learning. The NASNets built for object detection perform as well as most networks, but are slower to train [85].

RetinaNet

RetinaNet is the latest state-of-the-art object detector. It is a simple dense detector, similar to YOLO and SSD, but matches the accuracy of two-stage detectors like the R-CNN variants. RetinaNet proposes that the foreground-background class imbalance encountered when training dense detectors leads to lower accuracy compared to two-stage detectors. RetinaNet uses a new method called Focal Loss, which focuses training on a sparse set of examples to counter this class imbalance, resulting in very good performance and very fast detection [86].

3.4.3 Frameworks

While the previous Sections focused on different algorithms, actually implementing these algorithms is not straightforward. That's why, over the past years, several deep learning frameworks have emerged that try to provide easier access to this technology. Some frameworks provide APIs for some of the object detection algorithms presented above. This section gives a small overview of some frameworks. Most frameworks differ quite a bit from each other, which makes porting a model from one framework to another rather difficult. The Open Neural Network Exchange Format (ONNX) initiative hopes to propose a standard for interchangeable models, which should make switching among frameworks easier in the future [87]. Note that there are other frameworks available, but those do not yet support object detection out of the box.

TensorFlow

Perhaps the most well-known framework, TensorFlow is an open source machine learning library for neural networks with a Python interface. It was developed by Google for internal use and released to the public in 2015 [88]. Recently, an Object Detection API has been built for TensorFlow, which implements models pre-trained on benchmark datasets, such as SSD, Faster R-CNN, R-FCN and Mask R-CNN [89]. TensorFlow offers a lot of flexibility in its use and can be applied to many machine learning problems.

Darknet

Darknet is an open source neural network framework written in C and CUDA. It is maintained by Joseph Redmon, the person behind the YOLO algorithm [90]. Darknet does not offer the flexibility that other frameworks offer, but is easy to install and use when compared to others. Out of the box, Darknet offers an interface for YOLO. The open source community offers some ports of this framework to other popular frameworks such as TensorFlow.


CNTK

The Microsoft Cognitive Toolkit (CNTK) is an open source toolkit for distributed deep learning. It offers a Python, C# or C++ interface. It is one of the first frameworks to support ONNX. CNTK offers an API for Fast R-CNN and Faster R-CNN [91].


3.5 Technology choice

This Section presents the choices made for each technology described in the previous Sections.

3.5.1 Thermal camera

The FLIR One Pro and Therm-App were selected as thermal cameras for the proof of concept. Both offer relatively high quality images, 160 x 120 pixels and 320 x 240 pixels respectively. This is of course relative to their price: 469 and 937.31 euro respectively, at the low end of the product ranges offered. Both cameras are designed for use on a smartphone, which makes them ideal for prototyping, since these devices are widely available and setting up the camera via the apps from the respective companies is easy. Both cameras provide MPEG-4/H.264 encoded video output, easily understood by most playback software. Both cameras can be found in the lower left of Figure 3.2b.

For deployment in production-ready applications with drones, these cameras are not the best choice. They aren't specifically designed to be used on a drone and don't offer the best image quality possible. In those applications, platforms like the FLIR Vue, Duo, Zenmuse or Workswell Wiris are better candidates, due to their superior image quality, MAVLink interfaces, compatibility with commercially available gimbals to mount them on drones, and other features.

3.5.2 Microservices framework

Flask is selected as the microservices framework, for the following reasons. Flask is a mature web framework with major companies backing it, which means the APIs stay consistent and the framework is stable in use. Compared to frameworks like Spring Boot, Flask is unopinionated, which allows for maximum flexibility during development. Flask also has a very small memory footprint, which makes it easier to deploy on less powerful on-board devices like drones. Flask is also easy to use and quick to set up, ideal for developing a proof of concept. A final argument is the familiarity of the author with Flask.

3.5.3 Deployment framework

Docker is selected as the deployment framework. Docker is the most mature and well supported container framework at the time of writing and will likely remain important in the future. It offers the most features and is specifically designed for the microservices pattern [68].

3.5.4 Object detection

One of the requirements specified in Section 2.1 is real-time streaming. Real-time object detection is only achieved by a few of the models presented in Section 3.4: candidates are YOLO, SSD and RetinaNet. As no framework provides an out-of-the-box implementation of the RetinaNet algorithm at the time of writing, this algorithm is not selected. SSD is implemented in the TensorFlow Object Detection API. However, at the time of writing this API has not been found stable: when trying out the API, fallbacks to older versions of the software were needed to be able to test the models. This was due to the Object Detection API using older versions of the TensorFlow framework. Therefore, YOLO implemented in the Darknet framework is selected. Darknet offers a stable distribution, and YOLO achieves good results and has proven to be a very fast detector, capable of real-time object detection.
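As an aside, YOLO models trained with Darknet can also be run outside Darknet; a sketch using OpenCV's dnn module (the file names are assumptions, and a complete detector would additionally decode the bounding box coordinates and apply non-maximum suppression):

import cv2
import numpy as np

# Load a Darknet-format YOLO model (hypothetical file names).
net = cv2.dnn.readNetFromDarknet('yolov3.cfg', 'yolov3.weights')
image = cv2.imread('frame.jpg')
blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416),
                             swapRB=True, crop=False)
net.setInput(blob)
outputs = net.forward(net.getUnconnectedOutLayersNames())
for detection in np.vstack(outputs):
    class_scores = detection[5:]  # per-class confidence scores
    if class_scores.max() > 0.5:
        print('class', class_scores.argmax(), 'confidence', class_scores.max())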


Chapter 4

Proof of Concept implementation

To prove the concept of the architecture discussed in the previous chapters, a prototype is implemented. First, the goals and the scope of the prototype are presented in Section 4.1. Next, the components of the prototype are presented in Section 4.2. Finally, the known limitations and issues of the prototype are presented in Section 4.3.

4.1 Goals and scope of prototype

The goals of the prototype are to prove the QARs defined in Section 2.1. The prototype focuses on the ASRs: performance, interoperability and modifiability. The usability, security and availability requirements are left out of the scope of the prototype, because they are not ASRs and require significant resources (focus groups, long-term deployment, etc.) to test.

The components that are implemented in the prototype are the Client Interface, Stream, Consumer and Producer, as they represent the core functionality of the framework: building image processing application streams using plugins. The Producer and Consumer Distribution components enable third party plugin developers to add their functionality to the framework. These are distribution functionalities, which are out of scope for the prototype. The prototype only supports one video stream. All functions presented in Figure 2.1 are implemented, with the exception of 'Install plugin', 'Uninstall plugin', 'Add plugin', 'View plugin', 'Remove plugin' and 'Update plugin', as they are only supported via the Producer and Consumer Distribution components. The prototype is deployed on a local device; distributed deployment configurations require small changes in the implementation (see Section 4.3).

4.2 Overview of prototype

4.2.1 General overview

The prototype consists of four main components: a cli, streamer, producer and consumer. The cli process is the Client Interface, implemented as a textual Command Line user Interface (CLI), which allows a user to interact with the prototype through textual commands in a shell. This process is deployed on the local machine. The streamer, producer and consumer processes are deployed as microservices in their own Docker containers. The prototype is initialized through the cli, which spins up the Docker containers of the other processes. This is achieved with the tool docker-compose. Compose is a tool for defining and running multi-container Docker applications. The compose YAML file defines the configurations for the microservices; Compose uses these configurations to start and stop the application with a single command [92]. A snippet of the compose file for the application is given in Listing 4. Containers are specified as services. The example service configuration given is that of the producer. First, the name of the container is specified, which overwrites the default name, as the container name is used as hostname for the container in Docker [93]. The build configuration specifies where the container build recipe is situated. The port mapping allows processes on the localhost to access processes in the container; for the producer service this is only used for debugging. The volumes configuration specifies folders from the host to be mounted into the container. This configuration mounts in the source code and resources. It also provides access to the Docker socket to allow interaction with the Docker host (see Section 4.2.4).

services:
  producer:
    container_name: producer
    build:
      context: ./producer
      dockerfile: Dockerfile
    ports:
      - "8080:80"
    volumes:
      - ./producer:/usr/producer
      - /var/run/docker.sock:/var/run/docker.sock

Listing 4: docker-compose.yml snippet of the prototype

All containers are connected to a Docker bridge network [93] for communication. A bridge network uses a software bridge to allow connected containers to communicate, while providing isolation from containers that are not connected to that bridge network. The bridge network applies to containers running on the same Docker host; the network is thus confined to the local Docker host and is not distributed over different devices. The bridge network has some advantages:

- The bridge provides better isolation and interoperability between containers: containers automatically expose all ports to each other and none to the outside world.
- The bridge provides automatic Domain Name System (DNS) resolution between containers. This means that containers resolve the IP address of each other by container name or alias.
- Containers can be attached to and detached from the networks on the fly.
- Environment variables are shared, which can be used to provide equal environment configurations for every container on the bridge.


4.2.2 Client interface

The Client Interface is implemented by the cli component. The cli is built in Python with the Click package by Armin Ronacher [94]. Click is a CLI creation kit which aims to make the implementation of CLIs easier. It resembles the Flask framework, as it also leverages Python decorators [56] for most of its functionality. The source code of the cli is located in the mosquito.py file. Commands can be executed by calling python mosquito.py, or by calling mosquito if the source code is installed into the Python environment. The following commands are implemented:

- mosquito: Displays a help page listing the command groups.
- mosquito on: Starts the application.
- mosquito off: Shuts down the application.
- mosquito plugins: Groups all commands to manage plugins. Plugins can only be listed, not installed or uninstalled, as the Remote Producer and Remote Consumer are not implemented.
- mosquito plugins ls: Lists all locally installed plugins.
- mosquito stream: Groups all commands to manipulate the current stream.
- mosquito stream add: Adds a producer or consumer to the stream.
- mosquito stream delete: Deletes a producer or consumer from the stream.
- mosquito stream elements: Lists all producers and consumers that were added to the stream.
- mosquito stream link: Links two stream plugins.
- mosquito stream pause: Pauses the stream.
- mosquito stream play: Plays the stream, meaning the stream is processing media.
- mosquito stream print: Prints the stream layout (which plugins are linked).
- mosquito stream stop: Stops the stream.
- mosquito stream view: Views the stream on the local device.

A typical use of the application would be the following. First, the application is started using mosquito on. Then, plugins are added to the stream using mosquito stream add [ELEMENT_TYPE] [ELEMENT], which instantiates the corresponding plugins in the Producer and Consumer components. The plugins are linked in order using mosquito stream link [ELEMENT_1] [ELEMENT_2]. The stream is then set to play using mosquito stream play. When the last plugin is linked to the special local plugin, the user can view the output from that plugin using mosquito stream view, which opens up a window in which the stream is displayed.
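A sketch of how such a command hierarchy can be wired with Click (the command bodies here are placeholders, not the prototype's actual implementation):

import click

@click.group()
def mosquito():
    """Entry point: calling `mosquito` displays this help page."""

@mosquito.group()
def stream():
    """Groups all commands to manipulate the current stream."""

@stream.command()
@click.argument('element_type')
@click.argument('element')
def add(element_type, element):
    click.echo('adding %s %s to the stream' % (element_type, element))

@stream.command()
@click.argument('element_1')
@click.argument('element_2')
def link(element_1, element_2):
    click.echo('linking %s -> %s' % (element_1, element_2))

if __name__ == '__main__':
    mosquito()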


As specified in the software architecture (see Section 2.3), the Client Interface can use the Stream Commands interface of the Stream component. As specified in Section 2.3.1, this interface is a REST API, so the client can use this interface through the HTTP protocol. This is done with the Python Requests library [95].
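A cli command then reduces to a single HTTP request; the endpoint and payload below mirror the Stream API described in Section 4.2.3, with the host and port being illustrative assumptions:

import requests

# Ask the Stream component to start processing media.
response = requests.put(
    "http://localhost:5000/stream/state",
    json={"state": "PLAY"},
)
response.raise_for_status()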

4.2.3 Stream

The Stream component is responsible for the logical representation of the stream (see Section 2.3.1) and is implemented as the streamer component. The component consists of three objects: api, which contains the REST API, the StreamManager, and the Stream object, representing the Stream Model in the framework. Requests to the other microservices are sent using the Python Requests library. The prototype implementation only supports one stream with a chain-like model. This means that, unlike the stream depicted in Figure 2.6, a plugin can't have multiple sources or multiple listeners. The Stream object manages the logical representation of the stream and manipulates the references to the plugins by forwarding commands to the producer and consumer components respectively. It contains two data structures: outline, which is the logical structure of the stream, and elements, which contains all the plugins present in the stream. In the prototype, the Stream component provides the following functionalities on its API endpoints:

• /plugins: GET. Fetches all the plugins from the producer and consumer components and returns their information.

• /elements: GET, POST, DELETE. Resource to add and delete plugins from the elements bin.

• /stream/links: POST. Resource to create links between elements.

• /stream/state: GET, PUT. Resource to retrieve and update the state.

• /shutdown: POST. Shuts down the framework.
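To make the shape of this API concrete, the sketch below shows how the /stream/state resource could look in Flask; it is a minimal illustration with an in-memory state dictionary, not the actual streamer implementation:

from flask import Flask, jsonify, request

app = Flask(__name__)
state = {"state": "STOP"}

@app.route("/stream/state", methods=["GET", "PUT"])
def stream_state():
    # GET returns the current state; PUT updates it.
    if request.method == "PUT":
        state["state"] = request.get_json()["state"]
    return jsonify(state)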

Since the streamer component is the only component of the framework that interacts with outside users, it has the responsibility to gracefully shut down the framework. This is needed to solve the problem of dangling plugin containers: containers that keep running plugins that have not been stopped and removed after closing the application. Since only plugins that are contained in a stream have a running container associated with them, the stream can notify the Producer and Consumer components to stop and remove those containers.

4.2.4 Producer and Consumer

The Producer and Consumer components cover similar responsibilities in managing installed plugins. They are implemented in the producer and consumer components. Both components consist of the following objects: api, which contains the REST API; the Kernel, which implements the core functionalities; the PluginManager, which finds plugins installed on the device and checks if their installation is valid; and the Plugin, which is the logical representation of a plugin as described in Section 2.3.1. Commands to control the plugins are made using the Python Requests library.


For the component to be able to start, stop and interact with the plugin containers, the component needs access to the Docker host and the Docker client running on that host. But because the component is running in its own container, it is isolated from the Docker host and can't interact with the Docker client by default. The workaround for this problem is to expose the socket on which the Docker client is running on the Docker host to the container. This is done by mounting the Docker socket of the host on the Docker socket in the container. In Docker Compose, the mounting is achieved as shown in the following Listing:

volumes:
    - /var/run/docker.sock:/var/run/docker.sock

Listing 5: Mounting the Docker socket on the container

This has some implications on security (see Section 4.3). To interact with the now exposed Docker client, the component uses the docker-py library [96]. Listing 6 shows how a connection is made to the Docker client and how a plugin container is started. The container is started from the plugin image, on the network of the framework, and is given the plugin name as the container name. Docker thus creates a DNS entry with the plugin name, which makes the container addressable by its name. Due to this implementation, there can only be one container per plugin running at any time in the current implementation.

import docker

client = docker.from_env()

container = client.containers.run(
    image=plugin_name,
    detach=True,
    name=plugin_name,
    network="mosquito_default",
)

Listing 6: Starting a plugin container

When both components are initialized, the Kernel and PluginManager are created. The PluginManager searches for a plugin_directory, which contains information on which plugins are installed on the device. Each installed plugin should have a valid image on the device; images are contained in the images directory of the Docker daemon. If the image or information file cannot be found on the device, the plugin is marked as broken and can't be used by the framework. To describe the API, the consumer API is used; the producer API is analogous, but replaces consumer with producer and doesn't have the sources endpoints. The Producer and Consumer components provide the following functionalities on the API endpoints:

• /consumers: GET. Retrieves a list of the installed consumers on the device on which the component is running.

• /consumers/<hostname>: GET, DELETE. Retrieves the information of, respectively deletes, the consumer specified by the hostname value, which is the name of the consumer.

• /consumers/<hostname>/state: GET, PUT. Retrieves, respectively updates, the state of the consumer specified by the hostname value.

• /consumers/<hostname>/sources: GET, POST. Retrieves the sources of, respectively adds a new source to, the consumer specified by the hostname value.

• /consumers/<hostname>/sources/<source_hostname>: GET, PUT, DELETE. Retrieves, updates or removes the source specified by source_hostname of the consumer specified by hostname, respectively.

• /consumers/<hostname>/listeners: All listeners resources are analogous to the sources resources.
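A minimal sketch of how a PluginManager could validate installed plugins against the locally available Docker images is shown below; the JSON information files and method names are illustrative assumptions, not the prototype's actual code:

import json
import pathlib

import docker


class PluginManager:
    def __init__(self, plugin_directory="plugins"):
        self.plugin_directory = pathlib.Path(plugin_directory)
        self.client = docker.from_env()

    def find_plugins(self):
        # A plugin is valid only if its information file exists and a
        # Docker image with the plugin name is installed on the host.
        installed = {
            tag.split(":")[0]
            for image in self.client.images.list()
            for tag in image.tags
        }
        plugins = {}
        for info_file in self.plugin_directory.glob("*.json"):
            info = json.loads(info_file.read_text())
            info["broken"] = info["name"] not in installed
            plugins[info["name"]] = info
        return plugins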

4.2.5 Implemented plugins

Three plugins are implemented and tested: filecam (called 'mycam' in the code), a producer that reads in a video file and transmits it in MJPEG encoding using the RTP protocol; testsrc, a producer which generates test video and transmits it in MJPEG encoding using the RTP protocol; and local, a consumer which captures incoming RTP MJPEG video frames and displays them on the local display. The filecam and local plugins are discussed, since the testsrc is similar to the filecam.

The plugins are implemented in Python and use the GStreamer library with the Python bindings [97] for media streaming, and the Flask framework to implement the API. These libraries don't have to be used by future plugins, which can just implement a REST API and provide a media stream as specified in their descriptions.

Filecam plugin

The filecam image is based on the Ubuntu 17.10 image. It is chosen over lighter Linux distributions because it offers more functionality out of the box for prototyping. Other dependencies are Python 3.6, GStreamer 1.12, the GStreamer plugins-base, plugins-good, plugins-bad, plugins-ugly, libav, doc and tools packages, and python-gst.

The API of the plugin offers the following functionalities:

• /state: GET, PUT. Retrieves, respectively updates, the state of the plugin.

• /listeners: GET, POST. Retrieves, respectively adds, a listener on the plugin.

• /listeners/<hostname>: GET, PUT, DELETE. Retrieves, updates, respectively deletes a listener on the plugin.

The implemented GStreamer pipeline is depicted in Figure 4.1. The pipeline consists of the following GStreamer elements:

1. filesrc: GStreamer element that reads data from a file in the local file system. This file can have any extension and is not limited to video or audio files [98]. The location property is set to the location of the file in the plugin container.


Figure 4.1: filecam GStreamer pipeline

2. decodebin: GStreamer bin that automatically constructs a decoding pipeline using available decoders and demuxers via auto-plugging [99]. Note that for some media containers and codecs the appropriate decoders must be installed. For example, to decode the MPEG streams contained in MP4 files, an H.264 decoder is needed, which can be found in the 'libav' GStreamer plugins library.

3. jpegenc: GStreamer element that encodes raw video into JPEG images [100]. This implements the MJPEG video stream, as all video frames are encoded as JPEG images.

4. rtpjpegpay: GStreamer element that payload-encodes JPEG images into RTP packets according to RFC 2435 [101].

5. udpsink: GStreamer element that sends UDP packets to the network. When combined with an RTP payload plugin, it implements RTP streaming [102]. The host and port properties are set to the hostname and port property of the listener of the plugin.

This pipeline is implemented using the Python GStreamer bindings. The process consists of creating each GStreamer element, adding the elements to the GStreamer pipeline, and linking them in order of appearance in the pipeline. The decodebin and jpegenc elements can't be linked when created, because there is no default sink pad available on the decodebin. Because the decodebin needs to decide how to decode media, it needs the pipeline to be processing media. If no media is flowing, the decodebin can't know what decoder it needs to decode the media and what media it can offer to the sink element. Therefore the process of dynamic linking is used [103]. All elements which can be linked when the pipeline is not in the PLAYING state are linked. A handler is registered on the 'pad-added' signal, which is emitted when a new pad is added on the decodebin, indicating that it can forward media downstream. When media is flowing through the pipeline, the decodebin creates new pads when it can generate output data and emits the 'pad-added' signal. A callback is performed on the handler, which links the decodebin with the jpegenc. Listing 7 illustrates this concept.

# callback handler
def on_pad(source, pad, sink):
    # get the sink pad from the sink element
    sink_pad = sink.get_static_pad("sink")
    # get the pad type
    pad_caps = pad.get_current_caps()
    pad_type = pad_caps.get_structure(0).get_name()
    # Only if the pad is raw video the link is made
    if pad_type == "video/x-raw":
        # Perform the dynamic link
        pad.link(sink_pad)
    # Other pad types are ignored

filesrc = Gst.ElementFactory.make("filesrc", None)
decodebin = Gst.ElementFactory.make("decodebin", None)
jpegenc = Gst.ElementFactory.make("jpegenc", None)
# ... (create other elements and add elements to pipeline)

# Only filesrc and decodebin can be linked statically
filesrc.link(decodebin)
# Register on_pad handler on the 'pad-added' signal
handler_id = decodebin.connect("pad-added", on_pad, jpegenc)
# Set pipeline to PLAYING; the callback will be called to perform the dynamic link
pipeline.set_state(Gst.State.PLAYING)

Listing 7: Dynamic linking of the decodebin and jpegenc

Local plugin

The local plugin captures an incoming media stream and displays it on the local display. This plugin is special with respect to other plugins in that it is not deployed in a Docker container. It runs natively via the cli on the host, to allow access to the local display. This version is built for macOS High Sierra (version 10.13.4) and uses GStreamer 1.12 with the GStreamer plugins-base, plugins-good, plugins-bad, plugins-ugly, libav, doc and tools packages to receive an incoming stream. When a plugin links to the local plugin, the Stream component does not instruct the Consumer component to start the plugin, but instead links the plugin to the local host. For macOS, the address of the host is host.docker.internal. The GStreamer pipeline used by the plugin is depicted in Figure 4.2.

Figure 4.2: local plugin GStreamer pipeline

The pipeline consists of the following elements:

1. udpsrc: GStreamer element that reads UDP packets from the network [104]. The port property is set to the port to which the source is transmitting media.

2. rtpjpegdepay: GStreamer element that retrieves JPEG images from the received RTP packets [105]. This element can't process the media received from the udpsrc directly, because it can't know what type of data it will be receiving. Between the pads a 'capabilities filter' is placed, which informs the elements of the type of data that will be flowing through. In this case the capabilities are application/x-rtp, which tells that there will be RTP packets coming through; encoding-name=JPEG, which tells that the payload of the RTP packets consists of JPEG images; and payload=26, which also tells that the encoding is JPEG according to RFC 3551 [50, 106].

3. jpegdec: GStreamer element that decodes JPEG images [107].

4. autovideosink: GStreamer element that automatically detects an appropriate video sink and forwards the video to it [108].
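For reference, this receiving pipeline, including the capabilities filter, could be assembled from Python with a single parse-launch description; the port number is an arbitrary example:

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# udpsrc -> caps filter -> rtpjpegdepay -> jpegdec -> autovideosink
pipeline = Gst.parse_launch(
    'udpsrc port=5000 caps="application/x-rtp, encoding-name=JPEG, payload=26" '
    "! rtpjpegdepay ! jpegdec ! autovideosink"
)
pipeline.set_state(Gst.State.PLAYING)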

4.3 Limitations and issues

The implementation presented is a prototype and a slimmed-down version of the architecture presented in Section 2.3. The following limitations and issues remain.

4.3.1 Single client

The current implementation deploys the Flask framework (on which each microservice relies) on the built-in Flask web server (Werkzeug), which is provided for development convenience. It is only built for use by a single user and by default can only handle one request at any given moment, which implies that the framework can also only be used by a single user [109].

4.3.2 Timeouts

The framework does not perform checks on request timeouts when passing commands to components and plugins. This can be a problem when the framework is deployed on several devices and the request latency is much higher. In case of timeouts, the framework will keep waiting for a response, which leads to a crash.
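A straightforward mitigation would be to bound every inter-service request with a timeout; the endpoint and the two-second bound below are illustrative assumptions:

import requests

try:
    response = requests.put(
        "http://consumer:5000/consumers/local/state",
        json={"state": "PLAY"},
        timeout=2,  # seconds; raises instead of blocking forever
    )
except requests.exceptions.Timeout:
    # Mark the plugin as unreachable instead of crashing the framework.
    print("plugin did not respond in time")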

4.3.3 Exception handling and testing

The framework is only tested for the so-called 'happy path': the default scenario featuring no exceptional or error conditions. Some alternate paths are handled, but most still need to be tested. An example scenario would be if one of the plugin containers in a stream fails and stops: the framework is not able to detect this and will assume that the container is still running.

4.3.4 Docker security issues

The Docker client communicates with the daemon process dockerd through a socket. This socket is a UNIX domain socket called /var/run/docker.sock. The daemon is highly privileged, having root access to the host system. Any process that can write to this socket effectively has root access. To allow the components of the framework to manipulate the plugin containers, they need access to this socket. Therefore the socket is mounted in the container, which gives the container write access to the socket. This implies that that container now has root access on the host when writing to this socket. Because the container gets root access to the host, an attacker can walk the file tree of the host and extract sensitive information, or run unwanted software. This type of attack is known as a 'Docker Breakout' or 'Container Escape' attack [110, 111].

4.3.5 Docker bridge network

The current implementation deploys the framework on a Docker bridge network, which is confined to a single device; the framework can thus only be deployed on a single device. To deploy the framework on multiple devices, it must be deployed using a Docker overlay network [112].

4.3.6 Single stream

The implementation supports one stream, which must be a chain. Multiple streams, or streams in tree form that merge media from multiple sources or broadcast to multiple listeners, are not supported.

4.3.7 Number of containers per plugin

The framework uses the name of the plugin as the identifier for its containers. The name is also the hostname on which the container can be reached. Therefore there can only be one active container associated with a plugin at runtime.


Chapter 5

Mob detection experiment

To try out an actual drone thermal imaging application, a mob detection experiment was carried out. The goal of this experiment is to use existing object detection algorithms on a dataset of thermal images to try and detect large crowds of people, hereinafter referred to as a mob.

Several public datasets of thermal images exist. Most datasets focus on the detection of people in scenes [113–117], some on face recognition [118, 119], others on vehicle recognition [120]. Most of these datasets are freely available through the OTCBVS Benchmark Dataset Collection [121]. No datasets containing large amounts of people were found, so the Last Post thermal dataset was created for the detection of mobs and other analysis tasks. This dataset is presented in Section 5.1.

To detect mobs in the images of the dataset, a deep learning approach using neural networks is explored. The selection and training of the model is described in Section 5.2.

5.1 Last Post thermal dataset

The Last Post dataset consists of videos of the Last Post ceremony, taking place each night at 8:00 PM (Brussels timezone) under the Menin Gate in Ypres, Belgium. Section 5.1.1 gives some insight into this unique ceremony. The full dataset is described in Section 5.1.2.

5.1.1 Last Post ceremony

The Last Post ceremony is a nightly ceremony taking place under the Menin Gate in Ypres at 8:00 PM sharp. The ceremony is held in remembrance of the fallen soldiers of World War I (1914-1918). The Last Post Association [122] states its mission as follows:

"True to its statutes, the Last Post Association wishes to honor and remember the soldiers of the British Empire who gave their lives during the Great War of 1914-1918. The Last Post ceremony seeks to express, day after day, the lasting debt of gratitude which we all owe to the men who fought and fell for the restoration of peace and the independence of Belgium."


Figure 5.1 gives an impression of the size of the ceremony. Because of the sheer number of people that gather under the gate each day, the Last Post is a unique open-air event with repeatable conditions for capturing footage; the event was therefore a perfect opportunity to create the dataset.

Figure 5.1: Last Post ceremony panorama

5.1.2 Dataset description

Due to legislation in Belgium, drones cannot be flown in public areas without certification and a permit from the authorities. The creation of real aerial thermal images with a drone was thus not feasible. Therefore an elevated position on the walls next to the Menin Gate (in order to simulate aerial images) was used to capture the footage of the adjacent square on one side and the bridge on the other side. Figure 5.2 shows the locations where the video footage was captured.

Figure 5.2: Locations where the video footage was captured. The black stars represent the captured scenes, the red stars the locations from where the scenes were filmed.

The data was recorded with the FLIR One Generation 3 Pro camera for Android devices, hereafter referred to as "Camera" [123]. Since thermal images don't hold color information, a color scheme is used to represent the relative differences in temperature. The 'Iron' color scheme, which maps colder sections of a scene onto blue colors and warmer sections onto red and yellow colors, was used.


The videos are encoded using the H.264 MPEG-4 codec. Decoded, the color information is captured in 4:2:0 YUV format. The frame rate of the videos varies from 7 Hz to 8 Hz, depending on the speed of the objects in the scene. There is sound present in the videos, encoded with the MPEG AAC codec. For a full list of sequences, the reader is referred to Appendix C.

The two locations that make up the main scenes in the dataset are presented in Figure 5.3. The thermal images and visual images of each scene are depicted next to each other. The thermal and visual images were not captured at the same time, so the mobs that are present in the thermal images can't be seen in the visual images. In both scenes buildings are present that are quite warm compared to the surroundings, as can be seen in the thermal images. In Figure 5.3a it even becomes difficult to recognize the mob when they are standing close to the building. This is less the case for Figure 5.3c, where, due to the water present in the image, the mob has higher contrast thanks to the larger difference in emitted heat. Towards the far right of the image, the mob seemingly disappears into the background. The effect of two objects having a similar heat signature, with no clear transition between them in thermal images, is defined as thermal camouflage, a technique that is often used by animals and military units [124]. This effect is even visible when looking at the mobs present in both images: because people are standing so close together, it becomes difficult to recognize individual persons in the crowd.

(a) Thermal view of the square in location A (b) Visual view of the square in location A

(c) Thermal view of the bridge in location B (d) Visual view of the bridge in location B

Figure 5.3: Main scenes in the Last Post dataset


5.2 Object detection experiment

5.2.1 Preprocessing

The Last Post dataset was not used in its entirety for training the model, because there were not enough resources to manually annotate every image. Therefore a smaller dataset was used to train a baseline model.

The following videos were used: 2018-04-10 195029.mp4, 2018-04-10 200122.mp4, 2018-04-04 202859.mp4, 2018-04-10 202558.mp4 and 2018-04-04 200052.mp4, captured on the fourth and tenth of April 2018. These videos were used because of their contents: they contain images from locations A and B respectively, in which the mob behaves more dynamically compared to other videos. This was due to a marching band present on the fourth of April and a marching army unit on the tenth of April. See Appendix C for a summary of the contents of these videos. From these videos, images were extracted at a capture rate of 1 Hz. Each image was manually labelled using the Microsoft Visual Object Tagging Tool [125]. The tool allows exporting the training images to various formats, such as Pascal VOC for Tensorflow, YOLO and Microsoft CNTK.
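The 1 Hz extraction itself is straightforward; a hedged sketch using OpenCV (the text does not specify which tool was actually used) could look as follows:

import cv2

def extract_frames(video_path, out_pattern, capture_rate_hz=1):
    """Save one frame per 1/capture_rate_hz seconds of video."""
    capture = cv2.VideoCapture(video_path)
    fps = capture.get(cv2.CAP_PROP_FPS)  # ~7-8 Hz for the Last Post videos
    step = max(int(round(fps / capture_rate_hz)), 1)
    index, saved = 0, 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if index % step == 0:
            cv2.imwrite(out_pattern.format(saved), frame)
            saved += 1
        index += 1
    capture.release()

extract_frames("2018-04-10 195029.mp4", "frames/img_{:05d}.jpg")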

Within the data, several visual outliers are present. An outlier is an observation point that is distant from other observations; it is created due to variability in capturing the videos, or indicates experimental errors [126]. The errors detected here are of the latter form and are depicted in Figure 5.4. The first type of outlier is a system fault in the Camera: due to an error in the processing of the video, the Camera would sometimes not register any input. This causes the Camera to produce completely black images, as depicted in Figure 5.4a. The Camera software maps temperatures onto colors in the image. The variations of the colors are relative to the temperature interval, ranging from the minimum to the maximum temperature detected by the Camera. If the minimum and/or maximum detected temperature changes, the Camera needs to adapt its color mapping. This causes the Camera to fade to bright colors for a short period of time (1 to 2 seconds); the resulting image is depicted in Figure 5.4b. Because the resulting image is too bright and objects are hard to detect, it is considered an outlier. Due to instabilities when capturing the footage, some images in sequences with fast motion are very blurry. This makes it hard, even for a person, to decide what is visible in the frame; such images are therefore considered outliers. An example is depicted in Figure 5.4c. Sometimes people would pass in front of the Camera, which resulted in brightly colored areas in the videos that were not part of the scene; these are another type of outlier, depicted in Figure 5.4d. Because the presented outliers are experimental errors and do not belong in the scenes, they were removed from the dataset.

(a) System fault in the Camera: no input was detected. (b) The Camera updates to a new temperature interval.

(c) Due to moving the Camera too fast, the image becomes too blurry. (d) Very warm object due to people passing in front of the Camera.

Figure 5.4: Outliers

5.2.2 Training

The model used for training is YOLOv3, implemented in the darknet neural network framework [83]. The model is trained using convolutional weights that are pre-trained on the ImageNet database [127]. The concept of reusing weights from a model previously trained on large datasets is known as transfer learning. When choosing a pre-trained model, it is very important that the problem statement of the pre-trained model is close enough to the current problem statement. For the model pre-trained on ImageNet, this was to identify objects in images, which lies close to the detection of mobs in thermal images. Because the type of images (thermal versus visual) is fundamentally different, the model could suffer in performance. Goedeme et al. [36] solved a similar problem with thermal images and achieved good results, which gives an indication that detection should be feasible with the pre-trained model. Also, because the dataset is relatively small, training the model from scratch could actually hurt performance [128]. Training was carried out on an NVIDIA GeForce GTX 980 GPU, which allows training to be done much faster. To evaluate training progress, the Sum of Squared Errors (SSE) loss function is calculated, defined as $\sum_{i=1}^{n}(x_{ij} - \hat{x}_j)^2$, where $n$ is the number of samples in a batch used in a single training epoch and $j$ is the dimension ($x$ or $y$), as defined in [83]. The result of this training is discussed in Chapter 6.
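Read concretely, the loss sums the squared coordinate errors over a batch; a small NumPy sketch, treating x as predicted and x̂ as ground-truth box centers, is:

import numpy as np

def sse_loss(pred, truth):
    """Sum of squared errors over a batch of predicted coordinates.

    pred, truth: arrays of shape (n, 2) holding (x, y) box centers.
    """
    return float(np.sum((pred - truth) ** 2))

pred = np.array([[0.52, 0.48], [0.30, 0.71]])
truth = np.array([[0.50, 0.50], [0.32, 0.70]])
print(sse_loss(pred, truth))  # 0.0013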


Chapter 6

Results and evaluation

The goal of this chapter is to present the results of the framework tests and of the detection experiment. The results of the framework tests are presented in Section 6.1; the results of the object detection experiment are presented in Section 6.2.

6.1 Framework results

To evaluate the framework, acceptance tests are conducted that test whether the framework meets the QARs defined in Section 2.1. As stated in Section 4.1, only the ASRs are tested. A summary of which requirements are met by the framework is given in Table 6.1. Passed means that the framework has met the requirement, not passed that the framework hasn't met the requirement, and plausible means that the framework might have met the requirement, but not enough data could be gathered to be certain.

6.1.1 Performance evaluation

To evaluate performance, the acceptance tests for the requirements are conducted, the impact of the framework on the processing resources is recorded, and the total size of the framework is measured.

Acceptance tests

To test the performance of the framework, the execution times of each command executed with the CLI (see Section 4.2.2) are measured. Each command is executed 200 times, except for the on, off and link commands, which are measured manually 10 times: because these commands launch system threads whose finish signal could not be captured, they had to be measured by hand. Commands were executed on a 2.6 GHz Intel Core i5-2540 processor running macOS High Sierra version 10.13.4. The summarized statistics of the tests are given in Table 6.2.

The average execution times for the Play, Stop, Pause, Add, Elements, Print, View and Link commands do not exceed the 2-second bound specified in PS-1, while the average execution times of the Delete, On and Off commands do exceed this bound. This performance requirement is thus not met by the framework. The same result is found for PS-3. Especially the Delete and Off commands exceed the requirements by quite a bit. The Delete command shuts down a plugin and removes the Docker container from the host.


Requirement id Status

PS-1 Not Passed

PS-2 Plausible

PS-3 Not Passed

PS-4 Plausible

PS-5 Not Passed

IS-1 Passed

IS-2 Passed

MS-1 Passed

MS-2 Passed

MS-3 Passed

MS-4 Passed

MS-5 Plausible

MS-6 Passed

MS-7 Plausible

Table 6.1: Acceptance tests results summary

This action is costly in time. The Off command removes all the plugins and all the microservices of the framework, and thus suffers from the same costly action. This could be ameliorated by having the framework stop the containers instead of removing them, which requires fewer resources, as stopping only halts the process running in the container but does not delete the container from the system.

PS-2 and PS-4 could not be measured, due to the GStreamer pipeline of the prototype not allowing frames to be tracked. However, since real-time is a human time perception, if a person can't distinguish the streamed videos from videos played with a native video player, real-time streaming is plausible [43, 44]. The videos were shown side by side to ten users, who could not distinguish between both videos, indicating presumable real-time streaming. Since the hard requirements cannot be measured, the requirements are not met, but they are plausible. Real-time streaming performance also heavily depends on the plugins used and the hardware on which they are deployed. If a plugin can't process its media fast enough, due to a lack of processing power or a slow implementation, it will slow down the whole stream.

The scalability requirement PS-5 could not be met, due to the Flask Werkzeug server only being able to process one request at a time (see Section 4.3).

Only two performance requirements are met by the prototype. However, this is mostly due to some actions being very slow, such as shutting down the framework or removing a plugin. As these are actions that should occur less frequently when a user is using the framework, they are less important for the perceived quality. Frequent actions, such as adding, linking and changing the state of the stream, do perform rather well and contribute more to the perceived quality. Overall, the performance of the framework is not stellar, but not bad either. This can partially be explained by the choice of supporting frameworks, such as Flask, which are not built for performance.


Statistic Play Stop Pause Add Delete Elements Print View On Off Link

Mean 0.690 0.804 0.634 1.363 8.402 0.562 0.564 1.22 3.58 24.023 0.849

Std deviation 0.050 0.059 0.088 1.037 4.669 0.070 0.0747 0.260 0.498 0.481 0.170

Minimum 0.629 0.708 0.549 0.516 0.505 0.517 0.517 0.757 3.015 23.707 0.637

25th percentile 0.665 0.775 0.594 1.049 1.154 0.534 0.536 0.998 3.143 23.750 0.798

Median 0.678 0.800 0.623 1.11 11.132 0.550 0.552 1.214 3.500 23.886 0.853

75th percentile 0.700 0.820 0.653 1.233 11.189 0.562 0.560 1.433 3.850 24.034 0.877

Maximum 1.016 1.279 1.631 6.25 11.846 1.227 1.149 1.691 4.562 25.326 1.261

Table 6.2: Performance test statistics summary, measured in seconds

Other, higher-performance frameworks, such as Vert.x, could improve the performance of the framework.

Resource usage

The resources used by the modules of the framework are measured using the Docker statistics tool [129]. A summary of the resources used is given in Table 6.3. When the framework is idle, resource usage is negligible. When a plugin is active, there is a slight increase in resource usage; this increase depends on the runtime size of the plugin, which is unknown to the framework. The increase peaks when the plugin is processing media: CPU usage is then 40% on one core, which implies that only two plugins can be active simultaneously per CPU core before reaching the ceiling of the processing power. In a production environment of the framework, plugins would need to be tested thoroughly so that these metrics are known beforehand. These metrics imply that the length of streams should be kept short, to avoid having many plugins active simultaneously.

Size of framework

The total sizes of the Docker images of the components of the framework are given in Table 6.4. Most images are quite large: the framework core components have an average size of 724 MB, and the plugins have sizes ranging from 1 GB to 3 GB. This size can be explained by the base images and additionally installed software in the images. For development flexibility, the base images used are Linux Ubuntu images, which are typically larger than other Linux distributions. For the plugins, the full GStreamer library with all plugins was installed, which is more than 2 GB large. The sizes of the components can be reduced in a production environment by choosing slimmer Linux distributions as base images and only installing the minimal libraries needed to get a working plugin.

6.1.2 Interoperability evaluation

The systems with which the framework exchanges data are the plugins. These plugins must follow the plugin model presented in Section 2.3.1 and implement the presented resources using a REST API, the state machine and the protocols.


Condition Container CPU usage [%] Memory usage [MiB]

Idle streamer 1.00 42.09

     consumer 0.03 24.4

     producer 0.01 24.14

1 plugin active, not processing media streamer 1.56 42.48

     consumer 0.02 24.42

     producer 0.02 24.23

     mycam plugin 0.75 45.97

1 plugin active, processing media streamer 1.56 42.51

     consumer 0.02 24.42

     producer 0.02 24.24

     mycam plugin 40.03 99.24

Table 6.3: Resource usage of the framework in several conditions

Image Size [MB]

streamer 718

consumer 729

producer 729

testsrc 1250

mycam 3020

Table 6.4: Total size of framework components

If these specifications are followed by a plugin, the framework should have no issues exchanging information with the plugin. To test this, a new mock plugin is implemented. For each resource of the plugin, the framework is given random mock input data to exchange with the plugin. When the exchange is complete, the values in the plugin are requested and compared with the given input; if the input matches the value in the plugin, the exchange was successful. These tests were executed 50,000 times. The results are summarized in Table 6.5. Play, pause and stop are the requests to change the state of the plugin. The source/listener add, update and delete commands manipulate the sources and listeners of the plugin. Overall, there were almost no errors made when exchanging information; only when deleting a source and deleting a listener was there one incorrect exchange each. The achieved ratios of correct exchanges are thus always 100%, except for deleting a source and deleting a listener, which are at 99.998%. IS-1 and IS-2 specify that commands exchanged with the plugins need to be correct 99.99% of the uptime, so this requirement is clearly met.

Plugins also interact with each other by transmitting media to each other according to the stream layout.


Value Play Pause Stop Add S Update S Delete S Add L Update L Delete L

Correct 50000 50000 50000 50000 50000 49999 50000 50000 49999

Incorrect 0 0 0 0 0 1 0 0 1

Ratio (%) 100 100 100 100 100 99.998 100 100 99.998

Table 6.5: Interoperability test results (S: Source, L: Listener)

This interoperability is not directly controlled by the framework, as plugins can be developed by third parties. To solve this, a plugin needs to provide its specifications to the framework before being integrated as a plugin. This allows the framework to decide whether or not two plugins will be able to interact with each other in a stream. For example, if plugin A supports MJPEG streams transmitted via RTP/UDP, it will be able to interact with a plugin B implementing the same protocols; if plugin B implements another protocol, it will not be able to interact with plugin A. If this is specified, the framework can notify a user that two plugins are not compatible. These scenarios should be avoided, which is done by specifying standard protocols for plugins.

6.1.3 Modifiability evaluation

Plugins are installed for the prototype by building their image and adding it to the image directory of the Docker host. The framework does not need a restart to install these images; therefore requirements MS-1 and MS-2 are met. End users can extend their version of the framework with new plugins, installing them by building the respective plugin images, meeting MS-3. Streams can be modified by linking different plugins by design, meeting MS-4. The framework can detect newly installed plugins when starting up, if the image is installed to the image directory of the Docker host; therefore requirements MS-5 and MS-6 are met. The current prototype is only deployable on a local device, as discussed in Section 4.1, meeting requirement MS-7. The other requirements can be met by deploying the framework using a Docker overlay network, as discussed in Section 4.3, without having to implement changes to the code base. The requirements MS-8 and MS-9 are not met, but are plausible by using a different Docker deployment.

In general, the framework was designed to be modifiable for different video analysis tasks. The hybrid microkernel/microservices architecture enables this modifiability: the microkernel plugin architecture allows a user to modify a video analysis stream during framework use, while the microservices architecture allows for a modifiable deployment configuration.

6.2 Mob detection experiment results

To evaluate the detection experiment, the trained model is tested on a validation set that contains random images from the total annotated dataset presented in Section 5.1.2. First, the results of the training of the model are presented in Section 6.2.1. Second, the metrics used to evaluate the model are presented in Section 6.2.2. Finally, the results of the validation are presented in Section 6.2.3.


6.2.1 Training results

To monitor training, the average loss per training epoch was measured; the resulting training evolutions are depicted in Figure 6.1. Darknet does not shuffle training data automatically and creates training batches in order of the training data provided. Since YOLO uses gradient descent for optimization, this can lead to YOLO getting stuck in local minima of the cost surface [130]. This effect is seen in Figure 6.1a around epoch 4500: every image in the training set has been loaded at least once at this point; the model was training on images from location B, and now images from location A are loaded (see Section 5.1.2). This leads to a peak in average loss, as YOLO was optimizing for images from location B and probably converging to a local minimum for that type of images. Therefore, in a second run, the data was shuffled, allowing the model to escape local minima more easily. Figure 6.1b shows the difference in training loss: the curve is much more irregular, thanks to the shuffling of the data. Once again, the average loss decreases more around epoch 4500, when every image in the training set has been loaded at least once. The average loss stagnates at values in the interval [0.04, 0.07]. To avoid overfitting the model on the training data, which would lead to worse generalization performance, early stopping is applied. Early stopping is a generalization technique that stops the training of a neural network early, before the network starts overfitting [131]. The stopping criterion used is progress, defined as the decrease of training error in successive training epochs [131], or the slope of the loss curve depicted in Figure 6.1. This slope approaches 0 from epoch 13000 onward, so this epoch is selected as the early stopping point. Because the generalization error is not a smooth curve and consists of many local minima, it is a good idea to also validate model weights in the neighborhood of the early stopping point, as these could potentially yield better performance on the validation set [131].

6.2.2 Metrics

The model predicts bounding boxes for objects in the images of the validation set. The bounding box provided by the annotated dataset is defined as the ground truth bounding box B_gt; the bounding box provided by the model is defined as the predicted bounding box B_p. To evaluate the performance of the model and select the best weights, several metrics are used. The standard metrics used to evaluate object detection problems are the Intersection over Union (IoU) and the mean Average Precision (mAP). The IoU is a metric used in common object detection challenges, such as the Pascal VOC challenge [132]. If the function A(B_x) gives the area of a bounding box B_x, the IoU is defined as

IoU = A(B_p ∩ B_gt) / A(B_p ∪ B_gt)    (6.1)

The mAP for a set of detections, another metric used in the Pascal VOC challenge, is defined as the mean over all classes of the interpolated average precision (AP) for each class. A detection is considered a true positive if the IoU for the detection is greater than 0.5. The interpolated AP is given by the area under the precision-recall curve for the detections [132–134].

The model is also tested on several videos not included in the training and validation sets, to visually evaluate detection and measure the number of frames per second that can be processed by the model.
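For axis-aligned boxes given as (x1, y1, x2, y2) tuples, Equation (6.1) can be computed directly; the function below is an illustrative sketch, not code from the experiment:

def iou(box_p, box_gt):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    # Intersection rectangle
    x1 = max(box_p[0], box_gt[0])
    y1 = max(box_p[1], box_gt[1])
    x2 = min(box_p[2], box_gt[2])
    y2 = min(box_p[3], box_gt[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_p = (box_p[2] - box_p[0]) * (box_p[3] - box_p[1])
    area_gt = (box_gt[2] - box_gt[0]) * (box_gt[3] - box_gt[1])
    union = area_p + area_gt - inter
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.143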


(a) Average training loss when the data is not shuffled. Vertical axis: average loss; horizontal axis: time (in training epochs).

(b) Average training loss when the data is shuffled. Vertical axis: average loss; horizontal axis: time (in training epochs).

Figure 6.1: Average training loss per epoch

6.2.3 Validation results

YOLO creates a snapshot of the weights the model is using every 100 epochs [83]. This makes it possible to validate each set of weights on the validation set and show the evolution of the validation performance. Figure 6.2 shows these evolutions for the average IoU and mAP metrics. The mAP gradually grows from epoch 4500 onwards and stagnates around epoch 11500. This shows that the model is not learning anymore and is at risk of overfitting. The mAP stagnates in the interval of [88%, 91%]. The average IoU shows a similar trend, but varies more, because predictions on the same images are rarely exactly the same.

(a) mAP (%) per epoch. Vertical axis: mAP (%); horizontal axis: time (in training epochs).

(b) IoU (%) per epoch. Vertical axis: IoU (%); horizontal axis: time (in training epochs).

Figure 6.2: Validation metrics per epoch

The best mAP value, 90.52%, is achieved at epoch 15700. The weights from this epoch are used for further testing and validation. The mAP at the 0.5 IoU threshold of YOLOv3 on the COCO benchmark dataset [135] is 74.8%; compared to this, the mAP achieved on the Last Post dataset is very high. The reason for this difference is that the validation set has a high correlation with the training set.

Due to the training set and validation set being extracted from the same videos, all images from one video are correlated in time to each other. Images from the validation set are thus correlated to images in the training set, and the model is optimized on these types of images, explaining the high mAP. This indicates that the model is somewhat overfitting on the training data. This was confirmed when testing the model on unseen videos: although the model could detect a mob most of the time, it produced more visual errors. Because this data was not annotated, no metrics could be extracted. Figure 6.3 depicts some predictions of the model on images from the validation set. Visually, the predicted bounding boxes resemble the ground truth bounding boxes quite accurately.

To test the speed of the predictions of the model, the total time to predict all images in the validation set was measured. For the NVIDIA GeForce GTX 980 GPU, the average prediction time for one image is 14.673 milliseconds, with a standard deviation of 0.517 milliseconds. This indicates that the upper limit of the frame rate when making predictions on a video is approximately 68 frames per second on the GPU. For comparison, predictions with the model were also made on a CPU, a 2.6 GHz Intel Core i5-2540 processor with AVX instruction speedup. The average prediction time on the CPU is 5.849 seconds, with a standard deviation of 0.438 seconds, resulting in an upper limit for the frame rate on the CPU of 0.171 frames per second. Clearly, real-time object detection with this model is only possible on a GPU. When generating predictions on a test video, the average frame rate of the video was 55 frames per second.


(a) Prediction of a large mob at location B (b) Prediction of the mob at location A

(c) Prediction of a small mob at location B (d) Prediction of the mob at location B

Figure 6.3: Predictions of the model on images in the validation set


Chapter 7

Conclusion and future work

7.1 Conclusion

Aerial thermal imaging with drones is a promising technology that can deliver many applications for various use cases across many different domains, such as agriculture, fire fighting, search and rescue, etc. Most applications built with this technology are built with a specific use case in mind, using a thermal camera and analysis software specifically for this use case, and therefore struggle to exchange hardware and algorithms for new use cases. Therefore, the goal of this dissertation was to design, build and test a possible backbone framework that allows building these applications in a modifiable way. The specific use case of mob detection in thermal images was investigated as a sample use case for the framework.

Chapter 2 explored the requirements of such a framework. The ASRs to achieve the goal of the framework are performance, interoperability and modifiability. Performance is needed because some use cases (like fire fighting) require real-time video analysis. Interoperability enables the framework to interact with different thermal cameras and different processing/analysis modules. Modifiability enables the framework to interchange the thermal cameras and analyzers in its process, to build applications for different use cases. A hybrid combination of the microkernel pattern and the microservices pattern is used to meet these requirements, as the microkernel pattern enables interchanging the cameras and analyzers via a plugin system, and the microservices pattern enables different deployment configurations for the framework. To build and test the framework, several technologies were needed: backbone technologies for the software architecture, a thermal camera, and an object detection algorithm for the mob detection use case.

Chapter 3 explored the state of the art of these technologies and presented the selected ones. Thermal cameras come in all shapes and sizes and have different features according to their retail price. Contrary to intuition, the image quality is not the defining factor of the retail price, but rather the amount of extra features, such as radiometry, communication interfaces, etc. The FLIR One Pro and ThermApp were selected for this dissertation, since they offer good quality images and features for their price, and their use via smartphone platforms makes these cameras excellent for prototyping. Microservices frameworks also show a lot of variety, depending on the use case of the application using the framework: some are aimed at quick prototyping, others focus on performance, etc. Flask was selected as the microservices framework, as it is easy to use and designed for prototyping with microservices; this does come with a performance trade-off. To deploy the microservices in a plugin fashion, the concept of containers is applied. Containers virtualize on the OS level, allowing the microservices to be moved around on the host and distributed over different hosts. The current field has some frameworks implementing this technology, with Docker being the most well-known and mature framework, and it was selected for that reason. The field of object detection has a variety of solutions for the object detection problem, with varying accuracies; some can even create predictions in real-time. The YOLOv3 algorithm, implemented in the darknet framework, was selected, as it generalizes well to other datasets (such as thermal images), makes relatively accurate predictions, and is able to make predictions in real-time when deployed on a device with GPU processing capabilities.

Chapter 4 presented the implemented prototype of the framework using these technologies. Two sample plugins were implemented: the filecam plugin, which serves a video read in from a file, and the display plugin, which displays this video on the local device. The framework is limited to one video processing stream for one user at a time and is deployed on a local device. It also has a security risk, as the framework has to expose the Docker daemon socket to allow the framework to manipulate the containers running the plugins. This gives the containers that run the core framework processes root access to the host system, which can be abused by potential attackers.

The mob detection experiment was presented in Chapter 5. A new thermal image dataset, called the Last Post dataset, was collected for this experiment. The dataset features videos of the Last Post ceremony, filmed over the course of two weeks. What makes this dataset special is that, unlike publicly available datasets, it delivers footage of the movement of large crowds, filmed from a high vantage point to simulate footage captured from a drone platform. This dataset was used to train a pre-trained YOLOv3 model via transfer learning. The dataset was manually labeled and preprocessed by removing the outliers present. Training was done on an NVIDIA GTX 980 GPU and evaluated using the SSE loss metric.

Chapter 6 presented the tests conducted on the framework and the detection model, and their corresponding results. The performance requirements for the frequently used commands are met by the framework. Other commands, such as removing plugins and starting up or shutting down the framework, do not meet the performance requirements, since Docker requires significant time to start, stop and remove containers. The real-time streaming requirements could not be proven, because the time between transmitting a frame and receiving a frame could not be measured directly. However, the processed videos were shown to human users, who could not distinguish between the processed video and the video played back on a local system, which makes it plausible that the framework achieved this requirement. Real-time streaming performance heavily depends on the plugin and the hardware on which it is deployed. When plugins in the framework are processing media, CPU usage increases significantly, even when only one plugin is active. This implies that the length of media processing streams should be kept as short as possible to achieve good performance. The framework is relatively big, with some plugins even having a size of 2 GB. This is mostly due to the base images and installed libraries of the plugins and core components. Because each component and plugin has its own container, libraries can't be shared, so they are redundantly installed, leading to large component sizes. This could be alleviated by using slimmer base images and only installing the minimal libraries needed. The interoperability requirements are all met by the framework; this is proven by a test exchanging mock information between the framework and plugins. The modifiability requirements regarding the plugins are met by the framework. The modifiability requirements regarding the deployment schemes are not met by the framework, but can be achieved by deploying the framework using a Docker overlay network instead of the Docker bridge network. To evaluate the trained model, the model made predictions


on a separate validation set. The model achieves an mAP of 90.52%, which is much higher than what current state-of-the-art models achieve on benchmark datasets. This shows that the model is capable of learning the thermal features, but is also overfitting on the data, due to the temporal correlation between the training and validation sets. The model can predict in real-time, achieving an average frame rate of 55 frames per second when making predictions on a GPU.

7.2 Future work

This dissertation proposed a framework and implemented a prototype that realizes only a part of the total framework. Object detection using deep learning, in general and applied to thermal images, is still a young field. Several extensions to this research are possible.

7.2.1 Security

The framework prototype did not implement any security measures. Because communications rely on an external network in distributed configurations, these measures should be implemented to reduce the risk of attacks. To allow the components to manipulate Docker containers, the Docker host socket was exposed. As stated before, this is a serious security risk, as the container gets root access to the host. Workarounds for this problem could be to implement a Docker-in-Docker environment [136], or to deploy the containers in a VM.

7.2.2 Implementing a detection plugin

Due to the scope and time limit of the dissertation, a working prototype plugin containing a trained model for detecting objects in a video stream could not be made. A possible GStreamer pipeline for such a plugin is depicted in Figure 7.1. This plugin is a Consumer and receives video via the udpsrc. Frames are decoded and the raw video is presented to the appsink GStreamer element, which allows the video to be pulled into an application: the detection model, which generates predictions on the frame. The predicted frame is then forwarded to an appsrc GStreamer element, which puts the predicted frame into a new pipeline to transmit it to further framework plugins. It should be tested whether the detection model can run in a Docker container, since it needs GPU support to be able to predict in real-time. A solution could be to use nvidia-docker, which leverages NVIDIA GPU support in Docker containers [137].

Figure 7.1: GStreamer pipeline for a plugin with a detection model
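A hedged sketch of the receiving (appsink) side of such a plugin is given below; it only shows how raw frames could be pulled out of the pipeline, with the detection step left as a comment, and the port number is an arbitrary example:

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

pipeline = Gst.parse_launch(
    'udpsrc port=5000 caps="application/x-rtp, encoding-name=JPEG, payload=26" '
    "! rtpjpegdepay ! jpegdec ! videoconvert ! appsink name=sink emit-signals=true"
)

def on_new_sample(sink):
    sample = sink.emit("pull-sample")
    buffer = sample.get_buffer()
    ok, info = buffer.map(Gst.MapFlags.READ)
    if ok:
        # info.data holds the raw frame; hand it to the detection model
        # here and push the annotated frame into an appsrc of an
        # outgoing pipeline.
        buffer.unmap(info)
    return Gst.FlowReturn.OK

sink = pipeline.get_by_name("sink")
sink.connect("new-sample", on_new_sample)
pipeline.set_state(Gst.State.PLAYING)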


7.2.3 Different deployment configurations

The prototype of the framework only implemented one of the deployment configurations presented in Section 2.3.3. Other configurations can be explored by changing the Docker bridge network to a Docker overlay network.

7.2.4 Multiple streams with different layouts

The prototype only implemented one stream with a chain-like layout. Future efforts could implement support for multiple streams that run concurrently. The layout could be changed by implementing plugins that forward media to multiple listeners, or that merge media coming from different sources, which is the concept of sensor fusion.

7.2.5 Implementing the plugin distribution service (Remote Producer/Consumer)

Chapter 2 presented the Remote Producer and Remote Consumer, which distribute the plugins available for the framework. This was deemed out of scope for the prototype, but could be implemented in future versions.

7.2.6 Using high-performance microservices backbone frameworks

The current implementation uses the Flask framework, which is excellent for prototyping but not ideal for high performance. Other frameworks, such as Vert.x, focus on high performance through asynchronous messaging, which could improve the performance of the framework.

7.2.7 New object detection models and datasets specifically for thermal images

Current effort in object detection models goes towards challenges on benchmark datasets of visual images, such as ImageNet and Pascal VOC. There are some thermal datasets publicly available for specific detection purposes, but these are very small compared to the visual image datasets. Future research could create new benchmark datasets, similar to the visual image datasets, specifically for thermal images.

Currently, publicly available pre-trained neural network models are designed for and trained on the visual image datasets. Future research could go towards designing an architecture specifically for thermal images and training a model on such a benchmark dataset.

Thermal images use several color maps to map the relative temperatures in a scene onto colors representing warm and cold regions. Well-known examples are the Iron scheme (used in this dissertation), White-hot and Black-hot. Some companies implement threshold colors that highlight very hot or very cold spots in an image (for examples, see [138, 139], etc.). Future research could investigate how models trained on images using different color schemes differ in their predictions and performance.

Thermal images could potentially also benefit from radiometric information, which adds a temperature dimension to each pixel in the image instead of the relative coloring. This information could lead to more accurate predictions.


Bibliography

[1] S. G. Gupta, M. M. Ghonge, and P. Jawandhiya, "Review of Unmanned Aircraft System," International Journal of Advanced Research in Computer Engineering & Technology, vol. 2, no. 4, pp. 2278–1323, 2013. ISSN: 2278–1323.

[2] M. Hassanalian and A. Abdelkefi, Classifications, applications, and design challenges of drones: A review, 2017. DOI: 10.1016/j.paerosci.2017.04.003. [Online]. Available: http://ac.els-cdn.com/S0376042116301348/1-s2.0-S0376042116301348-main.pdf

[3] M. Joel, The Booming Business of Drones, 2013. [Online]. Available: https://hbr.org/2013/01/the-booming-business-of-drones (visited on 01/30/2018).

[4] DJI, Zenmuse H3-2D. [Online]. Available: https://www.dji.com/zenmuse-h3-2d (visited on 01/30/2018).

[5] Gimbal Guard, Drop & Delivery Device for DJI Mavic Pro. [Online]. Available: http://www.gimbal-guard.com/_p/prd1/3461082014/1/product/drop-%26-delivery-device-for-dji-mavic-pro (visited on 01/30/2018).

[6] FLIR Systems, Aerial Thermal Imaging Kits. [Online]. Available: http://www.flir.com/suas/aerial-thermal-imaging-kits (visited on 01/30/2018).

[7] R. Gade and T. B. Moeslund, "Thermal cameras and applications: a survey," Machine Vision and Applications, vol. 25, pp. 245–262, 2014. DOI: 10.1007/s00138-013-0570-5. [Online]. Available: https://link.springer.com/content/pdf/10.1007%2Fs00138-013-0570-5.pdf

[8] M. C. Harvey, J. V. Rowland, and K. M. Luketina, "Drone with thermal infrared camera provides high resolution georeferenced imagery of the Waikite geothermal area, New Zealand," 2016. DOI: 10.1016/j.jvolgeores.2016.06.014. [Online]. Available: https://ac.els-cdn.com/S0377027316301421/1-s2.0-S0377027316301421-main.pdf

[9] S. Amici, M. Turci, S. Giammanco, L. Spampinato, and F. Giulietti, "UAV Thermal Infrared Remote Sensing of an Italian Mud Volcano," vol. 2, pp. 358–364, 2013. DOI: 10.4236/ars.2013.24038. [Online]. Available: http://www.scirp.org/journal/ars

[10] J. Bendig, A. Bolten, and G. Bareth, "INTRODUCING A LOW-COST MINI-UAV FOR THERMAL- AND MULTISPECTRAL-IMAGING," 2012. [Online]. Available: https://www.int-arch-photogramm-remote-sens-spatial-inf-sci.net/XXXIX-B1/345/2012/isprsarchives-XXXIX-B1-345-2012.pdf


[11] Workswell, "Using the UAV Thermography for Cultivation and Phenotyping of Cereals," Tech. Rep., 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/Cultivation-and-Phenotyping-1.pdf

[12] A. J. Rivera, A. D. Villalobos, J. C. Monje, J. A. Mariñas, and C. M. Oppus, "Post-disaster rescue facility: Human detection and geolocation using aerial drones," IEEE Region 10 Annual International Conference, Proceedings/TENCON, pp. 384–386, 2017. ISSN: 21593450. DOI: 10.1109/TENCON.2016.7848026.

[13] P. Christiansen, K. A. Steen, R. N. Jørgensen, and H. Karstoft, "Automated detection and recognition of wildlife using thermal cameras," Sensors (Basel, Switzerland), vol. 14, no. 8, pp. 13778–93, Jul. 2014. ISSN: 1424-8220. DOI: 10.3390/s140813778. [Online]. Available: http://www.ncbi.nlm.nih.gov/pubmed/25196105

[14] J. Zhang, J. Hu, J. Lian, Z. Fan, X. Ouyang, and W. Ye, "Seeing the forest from drones: Testing the potential of lightweight drones as a tool for long-term forest monitoring," Biological Conservation, vol. 198, pp. 60–69, 2016. [Online]. Available: http://ac.els-cdn.com/S0006320716301100/1-s2.0-S0006320716301100-main.pdf

[15] D. Ventura, M. Bruno, G. Jona Lasinio, A. Belluscio, and G. Ardizzone, "A low-cost drone based application for identifying and mapping of coastal fish nursery grounds," Estuarine, Coastal and Shelf Science, vol. 171, pp. 85–98, Mar. 2016. ISSN: 02727714. DOI: 10.1016/j.ecss.2016.01.030. [Online]. Available: http://linkinghub.elsevier.com/retrieve/pii/S0272771416300300

[16] S. Chowdhury, A. Emelogu, M. Marufuzzaman, S. G. Nurre, and L. Bian, "Drones for disaster response and relief operations: A continuous approximation model," 2017. DOI: 10.1016/j.ijpe.2017.03.024. [Online]. Available: www.elsevier.com/locate/ijpe

[17] Workswell, "Pipeline inspection with thermal diagnostics," 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/pipeline.pdf

[18] Workswell, "Thermo diagnosis of photovoltaic power plants," 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/Workswell-WIRIS_photovoltaic.pdf

[19] Workswell, "Thermodiagnostics of flat roofs," 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/roof.pdf

[20] Workswell, "Thermodiagnostics in the power engineering sector," Tech. Rep., 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/highvoltage.pdf

[21] Workswell, Workswell WIRIS - Product - Thermal camera for drones, 2016. [Online]. Available: https://www.drone-thermal-camera.com/wiris (visited on 01/30/2018).

[22] TEAX Technology, ThermalCapture - Thermal Imaging Technology | Capture raw radiometric thermal data with drones. [Online]. Available: http://thermalcapture.com (visited on 01/30/2018).


[23] DJI, Zenmuse XT - unlock the possibilities of sight - DJI, 2018. [Online]. Available: https://www.dji.com/zenmuse-xt (visited on 01/30/2018).

[24] Workswell, SOFTWARE - Workswell WIRIS - Thermal camera for drones, 2016. [Online]. Available: https://www.drone-thermal-camera.com/software (visited on 01/31/2018).

[25] Therm-App, Therm-App™ - Android-apps op Google Play, 2018. [Online]. Available: https://play.google.com/store/apps/details?id=com.thermapp (visited on 01/31/2018).

[26] B. Satzger, W. Hummer, C. Inzinger, P. Leitner, and S. Dustdar, "Winds of change: From vendor lock-in to the meta cloud," IEEE Internet Computing, vol. 17, no. 1, pp. 69–73, 2013. ISSN: 10897801. DOI: 10.1109/MIC.2013.19.

[27] J. Divya, Drone Technology and Usage: Current Uses and Future Drone Technology, 2017. [Online]. Available: http://uk.businessinsider.com/drone-technology-uses-2017-7?r=US&IR=T (visited on 01/31/2018).

[28] A. Boulanger, "Open-source versus proprietary software: Is one more reliable and secure than the other?" IBM Systems Journal, vol. 44, no. 2, pp. 239–248, 2005. ISSN: 0018-8670. DOI: 10.1147/sj.442.0239. [Online]. Available: http://ieeexplore.ieee.org/document/5386727

[29] M. Kazmeyer, Disadvantages of Proprietary Software. [Online]. Available: http://smallbusiness.chron.com/disadvantages-proprietary-software-65430.html (visited on 01/31/2018).

[30] B. Steffen and A. Seyfried, "Methods for measuring pedestrian density, flow, speed and direction with minimal scatter," Physica A: Statistical Mechanics and its Applications, vol. 389, no. 9, pp. 1902–1910, May 2010. ISSN: 0378-4371. DOI: 10.1016/J.PHYSA.2009.12.015. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0378437109010115

[31] M. Wirz, T. Franke, D. Roggen, E. Mitleton-Kelly, P. Lukowicz, and G. Tröster, "Inferring crowd conditions from pedestrians' location traces for real-time crowd monitoring during city-scale mass gatherings," Proceedings of the Workshop on Enabling Technologies: Infrastructure for Collaborative Enterprises, WETICE, pp. 367–372, 2012. ISSN: 15244547. DOI: 10.1109/WETICE.2012.26.

[32] E. Alpaydin, Introduction to machine learning, 3rd ed. MIT Press, 2014, p. 591. ISBN: 026201243X. [Online]. Available: https://dl.acm.org/citation.cfm?id=1734076

[33] J. W. Davis and V. Sharma, "Robust background-subtraction for person detection in Thermal Imagery," IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, vol. 2004-January, no. January, 2004. ISSN: 21607516. DOI: 10.1109/CVPR.2004.431.

[34] W. Wang, J. Zhang, and C. Shen, "Improved Human Detection and Classification in Thermal Images," pp. 2313–2316, 2010.

[35] R. Appel, S. Belongie, P. Perona, and P. Dollár, "Fast Feature Pyramids for Object Detection," PAMI, vol. 36, no. 8, pp. 1–14, 2014. ISSN: 01628828. DOI: 10.1109/TPAMI.2014.2300479. [Online]. Available: https://vision.cornell.edu/se3/wp-content/uploads/2014/09/DollarPAMI14pyramids_0.pdf

[36] T. Goedemé, "Projectresultaten VLAIO TETRA-project," KU Leuven, Louvain, Tech. Rep., 2017.


[37] L.-L. Slattery, DroneSAR wants to turn drones into search-and-rescue heroes, 2017. [Online]. Available: https://www.siliconrepublic.com/start-ups/dronesar-search-and-rescue-drone-software (visited on 05/26/2018).

[38] A. W. S. Inc., What Is Amazon Kinesis Video Streams?, 2018. [Online]. Available: https://docs.aws.amazon.com/kinesisvideostreams/latest/dg/what-is-kinesis-video.html (visited on 05/26/2018).

[39] U.S. Government, "Systems Engineering Fundamentals," Defence Acquisition University Press, no. January, p. 223, 2001. ISSN: 1872-7565. DOI: 10.1016/j.cmpb.2010.05.002. [Online]. Available: http://www.dtic.mil/docs/citations/ADA387507

[40] L. Bass, P. Clements, and R. Kazman, Software Architecture in Practice, 3rd. Addison-Wesley Professional, 2012. ISBN: 0321815734, 9780321815736.

[41] J. Greene and M. Stellman, Applied Software Project Management. 2006, p. 324. ISBN: 978-0596009489. [Online]. Available: http://www.oreilly.com/catalog/appliedprojectmgmt

[42] S. Barber, Acceptable application response times vs. industry standard, 2018. [Online]. Available: https://searchsoftwarequality.techtarget.com/tip/Acceptable-application-response-times-vs-industry-standard (visited on 05/28/2018).

[43] T. Burger, How Fast Is Realtime? Human Perception and Technology | PubNub, 2015. [Online]. Available: https://www.pubnub.com/blog/how-fast-is-realtime-human-perception-and-technology (visited on 05/28/2018).

[44] S.-t. Modeling, P. Glennie, and N. Thrift, "Time perception models," Neuron, pp. 15696–15699, 1992.

[45] M. Richards, Software Architecture Patterns, First edit., Heather Scherer, Ed. O'Reilly Media, 2015. [Online]. Available: http://www.oreilly.com/programming/free/files/software-architecture-patterns.pdf

[46] C. Richardson, Microservice Architecture pattern, 2017. [Online]. Available: http://microservices.io/patterns/microservices.html (visited on 12/02/2017).

[47] P. Clements, F. Bachmann, L. Bass, D. Garlan, J. Ivers, R. Little, P. Merson, R. Nord, and J. Stafford, Documenting Software Architectures, Second. Boston: Pearson Education, Inc., 2011. ISBN: 0-321-55268-7.

[48] Object Management Group, "Unified Modeling Language v2.5.1," no. December, 2017. [Online]. Available: http://www.omg.org/spec/UML/2.5.1

[49] C. De La Torre, C. Maddock, J. Hampton, P. Kulikov, and M. Jones, Communication in a microservice architecture, 2017. [Online]. Available: https://docs.microsoft.com/en-us/dotnet/standard/microservices-architecture/architect-microservice-container-applications/communication-in-microservice-architecture (visited on 04/27/2018).

[50] H. Schulzrinne and S. Casner, "RTP Profile for Audio and Video Conferences with Minimal Control," 2003. [Online]. Available: https://tools.ietf.org/html/rfc3551

[51] D. Bull, Communicating Pictures: A Course in Image and Video Coding. Elsevier Science, 2014. ISBN: 9780080993744. [Online]. Available: https://books.google.be/books?id=PDZOAwAAQBAJ

[52] On-Net Surveillance Systems Inc., "MJPEG vs. MPEG4: Understanding the differences, advantages and disadvantages of each compression technique," 2006. [Online]. Available: www.onssi.com


[53] M. M. A. V. Protocol, Introduction · MAVLink Developer Guide, 2013. [Online]. Available: https://mavlink.io/en (visited on 09/14/2017).

[54] H. Schlosser, Microservices trends 2017: Strategies, tools and frameworks - JAXenter, 2017. [Online]. Available: https://jaxenter.com/microservices-trends-2017-survey-133265.html (visited on 03/24/2018).

[55] A. Ronacher, Welcome to Flask — Flask Documentation (0.12), 2017. [Online]. Available: http://flask.pocoo.org/docs/0.12 (visited on 03/24/2018).

[56] F. Reyes, PythonDecorators, 2017. [Online]. Available: https://wiki.python.org/moin/PythonDecorators (visited on 04/27/2018).

[57] Stackshare, Companies that use Flask and Flask Integrations, 2018. [Online]. Available: https://stackshare.io/flask (visited on 03/24/2018).

[58] Falcon, Falcon - Bare-metal web API framework for Python. [Online]. Available: https://falconframework.org/#sectionAbout (visited on 03/24/2018).

[59] Stackshare, Companies that use Falcon and Falcon Integrations, 2018. [Online]. Available: https://stackshare.io/falcon (visited on 03/24/2018).

[60] A. Ronacher, Nameko for Microservices, 2015. [Online]. Available: http://lucumr.pocoo.org/2015/4/8/microservices-with-nameko (visited on 03/24/2018).

[61] C. Escoffier, Building Reactive Microservices in Java. 2017. ISBN: 9781491986264.

[62] C. Posta, Microservices for Java Developers. ISBN: 9781491963081.

[63] R. Dua, A. R. Raja, and D. Kakadia, "Virtualization vs Containerization to support PaaS," in IEEE International Conference on Cloud Engineering, 2014. ISBN: 9781479937660. DOI: 10.1109/IC2E.2014.41.

[64] D. Merkel, Docker: Lightweight Linux Containers for Consistent Development and Deployment, 2014. [Online]. Available: http://delivery.acm.org/10.1145/2610000/2600241/11600.html (visited on 03/19/2018).

[65] Docker Inc., Docker for the Virtualization Admin. 2016, p. 12.

[66] Docker Inc., What is a Container?, 2018. [Online]. Available: https://www.docker.com/what-container (visited on 03/24/2018).

[67] M. Helsley, LXC: Linux container tools, 2009. [Online]. Available: https://www.ibm.com/developerworks/linux/library/l-lxc-containers (visited on 05/21/2018).

[68] J. Fink, Docker: a Software as a Service, Operating System-Level Virtualization Framework, 2014. [Online]. Available: http://journal.code4lib.org/articles/9669 (visited on 03/19/2018).


[69] C. Wang, What is Docker? Linux containers explained, 2017. [Online]. Available: https://www.infoworld.com/article/3204171/linux/what-is-docker-linux-containers-explained.html (visited on 05/21/2018).

[70] CoreOS, Rkt: a security-minded, standards-based container engine. [Online]. Available: https://coreos.com/rkt (visited on 03/24/2018).

[71] F. X. F. Xu, X. L. X. Liu, and K. Fujimura, "Pedestrian detection and tracking with night vision," IEEE Transactions on Intelligent Transportation Systems, vol. 6, no. 1, pp. 63–71, 2005. ISSN: 1524-9050. DOI: 10.1109/TITS.2004.838222.

[72] H. Nanda and L. Davis, "Probabilistic template based pedestrian detection in infrared videos," IEEE Intelligent Vehicles Symposium, Proceedings, vol. 1, pp. 15–20, 2003. DOI: 10.1109/IVS.2002.1187921.

[73] R. E. Schapire, "Explaining AdaBoost," Empirical Inference: Festschrift in Honor of Vladimir N. Vapnik, pp. 37–52, 2013. DOI: 10.1007/978-3-642-41136-6_5.

[74] P. Viola, O. M. Way, M. J. Jones, and D. Snow, "Detecting pedestrians using patterns of motion and appearance," International Journal of Computer Vision, vol. 63, no. 2, pp. 153–161, 2005. DOI: 10.1109/ICCV.2003.1238422.

[75] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016, http://www.deeplearningbook.org

[76] J. R. R. Uijlings, K. E. A. Van De Sande, T. Gevers, and A. W. M. Smeulders, "Selective Search for Object Recognition," Tech. Rep., 2012. DOI: 10.1007/s11263-013-0620-5. arXiv: 1409.4842. [Online]. Available: http://www.cs.cornell.edu/courses/cs7670/2014sp/slides/VisionSeminar14.pdf

[77] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Region-Based Convolutional Networks for Accurate Object Detection and Segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 1, pp. 142–158, 2014. ISSN: 01628828. DOI: 10.1109/TPAMI.2015.2437384. arXiv: 1311.2524.

[78] R. Girshick, "Fast R-CNN," in Proceedings of the IEEE International Conference on Computer Vision, vol. 2015 Inter, 2015, pp. 1440–1448. ISBN: 9781467383912. DOI: 10.1109/ICCV.2015.169. arXiv: 1504.08083.

[79] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137–1149, 2016. ISSN: 01628828. DOI: 10.1109/TPAMI.2016.2577031. arXiv: 1506.01497.

[80] K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask R-CNN," arXiv, 2018. arXiv: 1703.06870v3.

[81] J. Dai, Y. Li, K. He, and J. Sun, "R-FCN: Object Detection via Region-based Fully Convolutional Networks," Tech. Rep., 2016. DOI: 10.1109/ICASSP.2017.7952132. arXiv: 1605.06409. [Online]. Available: http://arxiv.org/abs/1605.06409

[82] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," 2015. ISSN: 01689002. DOI: 10.1109/CVPR.2016.91. arXiv: 1506.02640. [Online]. Available: http://arxiv.org/abs/1506.02640

[83] J. Redmon and A. Farhadi, "YOLOv3: An Incremental Improvement," arXiv, 2018. [Online]. Available: https://pjreddie.com/media/files/papers/YOLOv3.pdf

[84] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, "SSD: Single Shot MultiBox Detector," arXiv, 2016. arXiv: 1512.02325v5.


[85] B. Zoph and Q. V. Le, "Neural Architecture Search with Reinforcement Learning," in ICLR, 2017, pp. 1–16. arXiv: 1611.01578v2.

[86] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, "Focal Loss for Dense Object Detection," arXiv, 2018. arXiv: 1708.02002v2.

[87] Facebook Inc., ONNX - About, 2017. [Online]. Available: https://onnx.ai/about (visited on 05/21/2018).

[88] TensorFlow, TensorFlow, 2018. [Online]. Available: https://www.tensorflow.org (visited on 05/21/2018).

[89] J. Huang, V. Rathod, C. Sun, M. Zhu, A. Korattikara, A. Fathi, I. Fischer, Z. Wojna, Y. Song, S. Guadarrama, and K. Murphy, "Speed/accuracy trade-offs for modern convolutional object detectors," arXiv, 2017. arXiv: 1611.10012v3.

[90] J. Redmon, Darknet: Open source neural networks in C, http://pjreddie.com/darknet, 2013–2016.

[91] Microsoft, The Microsoft Cognitive Toolkit | Microsoft Docs, 2018. [Online]. Available: https://docs.microsoft.com/en-us/cognitive-toolkit/index (visited on 05/21/2018).

[92] Docker Inc., Overview of Docker Compose | Docker Documentation, 2018. [Online]. Available: https://docs.docker.com/compose/overview (visited on 04/27/2018).

[93] Docker Inc., Use bridge networks | Docker Documentation, 2018. [Online]. Available: https://docs.docker.com/network/bridge (visited on 04/27/2018).

[94] A. Ronacher, Click Documentation (5.0), 2017. [Online]. Available: http://click.pocoo.org/5 (visited on 04/27/2018).

[95] A. K. Reitz, Requests: HTTP for Humans — Requests 2.18.4 documentation, 2018. [Online]. Available: http://docs.python-requests.org/en/master (visited on 05/09/2018).

[96] Docker Inc., Docker SDK for Python — Docker SDK for Python 2.0 documentation, 2018. [Online]. Available: https://docker-py.readthedocs.io/en/stable (visited on 05/12/2018).

[97] GStreamer, GStreamer: open source multimedia framework, 2018. [Online]. Available: https://gstreamer.freedesktop.org (visited on 05/13/2018).

[98] E. Walthinsen, filesrc: GStreamer Core Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gstreamer-plugins/html/gstreamer-plugins-filesrc.html (visited on 05/13/2018).

[99] E. Hervey, decodebin: GStreamer Base Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-base-plugins/html/gst-plugins-base-plugins-decodebin.html (visited on 05/13/2018).

[100] W. Taymans, jpegenc: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-jpegenc.html (visited on 05/13/2018).


[101] A. Communications, rtpjpegpay: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-rtpjpegpay.html (visited on 05/13/2018).

[102] W. Taymans, udpsink: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-udpsink.html (visited on 05/13/2018).

[103] GStreamer, Basic tutorial 3: Dynamic pipelines. [Online]. Available: https://gstreamer.freedesktop.org/documentation/tutorials/basic/dynamic-pipelines.html (visited on 05/13/2018).

[104] W. Taymans, udpsrc: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-udpsrc.html (visited on 05/14/2018).

[105] W. Taymans, rtpjpegdepay: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-rtpjpegdepay.html (visited on 05/14/2018).

[106] A. Loonstra, "Videostreaming with Gstreamer." [Online]. Available: http://mediatechnology.leiden.edu/images/uploads/docs/wt2014_gstreamer.pdf

[107] W. Taymans, jpegdec: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-jpegdec.html (visited on 05/14/2018).

[108] J. Schmidt, autovideosink: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-autovideosink.html (visited on 05/14/2018).

[109] A. Ronacher, Deployment Options — Flask 0.12.4 documentation, 2018. [Online]. Available: http://flask.pocoo.org/docs/0.12/deploying (visited on 05/14/2018).

[110] R. Yasrab, "Mitigating Docker Security Issues," University of Science and Technology of China, Hefei, Tech. Rep. [Online]. Available: https://arxiv.org/pdf/1804.05039.pdf

[111] Lvh, Don't expose the Docker socket (not even to a container), 2015. [Online]. Available: https://www.lvh.io/posts/dont-expose-the-docker-socket-not-even-to-a-container.html (visited on 05/15/2018).

[112] Docker Inc., Use overlay networks | Docker Documentation, 2018. [Online]. Available: https://docs.docker.com/network/overlay/#customize-the-docker_gwbridge-interface (visited on 05/15/2018).

[113] J. W. Davis and M. A. Keck, "A Two-Stage Template Approach to Person Detection in Thermal Imagery," Proc. Workshop on Applications of Computer Vision, 2005. [Online]. Available: http://vcipl-okstate.org/pbvs/bench/papers/wacv05.pdf


[114] J. W. Davis and V. Sharma, "Background-subtraction using contour-based fusion of thermal and visible imagery," Computer Vision and Image Understanding, vol. 106, no. 2-3, pp. 162–182, 2007. DOI: 10.1016/j.cviu.2006.06.010. [Online]. Available: https://web.cse.ohio-state.edu/~davis.1719/Publications/cviu07.pdf

[115] S. Hwang, J. Park, N. Kim, Y. Choi, and I. S. Kweon, "Multispectral Pedestrian Detection: Benchmark Dataset and Baseline," CVPR, 2015. [Online]. Available: https://sites.google.com/site/pedestrianbenchmark

[116] Z. Wu, N. Fuller, D. Theriault, and M. Betke, "A Thermal Infrared Video Benchmark for Visual Analysis," IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2014. DOI: 10.1109/CVPRW.2014.39. [Online]. Available: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.735.2216&rep=rep1&type=pdf

[117] R. Miezianko, Terravic research infrared database.

[118] R. Miezianko, Terravic research infrared database.

[119] S. Z. Li, R. Chu, S. Liao, and L. Zhang, "Illumination Invariant Face Recognition Using Near-Infrared Images," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 4, pp. 627–639, 2007. DOI: 10.1109/TPAMI.2007.1014. [Online]. Available: http://vcipl-okstate.org/pbvs/bench/papers/NIR.pdf

[120] A. Akula, R. Ghosh, S. Kumar, and H. K. Sardana, "Moving target detection in thermal infrared imagery using spatiotemporal information," J. Opt. Soc. Am. A, vol. 30, no. 8, pp. 1492–1501, Aug. 2013. DOI: 10.1364/JOSAA.30.001492. [Online]. Available: http://josaa.osa.org/abstract.cfm?URI=josaa-30-8-1492

[121] R. I. Hammoud, IEEE OTCBVS WS Series Bench. [Online]. Available: http://vcipl-okstate.org/pbvs/bench (visited on 05/18/2018).

[122] Last Post Association, Mission, 2018. [Online]. Available: http://www.lastpost.be/en/the-last-post/mission (visited on 05/18/2018).

[123] FLIR Systems, Inc., FLIR One Pro, 2017. [Online]. Available: https://www.flir.com/globalassets/imported-assets/document/17-1746-oem-flir_one_pro_datasheet_final_v1_web.pdf

[124] R. J. Ramana, Introduction to Camouflage and Deception. Defence Scientific Information & Documentation Centre, pp. 99–164.

[125] A. Bornstein and I. Richter, Microsoft visual object tagging tool. [Online]. Available: https://github.com/Microsoft/VoTT (visited on 05/20/2018).

[126] F. E. Grubbs, "Procedures for Detecting Outlying Observations in Samples," Technometrics, vol. 11, no. 1, pp. 1–21, Feb. 1969. DOI: 10.1080/00401706.1969.10490657. [Online]. Available: http://www.tandfonline.com/doi/abs/10.1080/00401706.1969.10490657

[127] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A Large-Scale Hierarchical Image Database," in CVPR09, 2009. [Online]. Available: http://www.image-net.org/papers/imagenet_cvpr09.pdf


[128] D. Gupta, Transfer learning & The art of using Pre-trained Models in Deep Learning, 2017. [Online]. Available: https://www.analyticsvidhya.com/blog/2017/06/transfer-learning-the-art-of-fine-tuning-a-pre-trained-model (visited on 05/20/2018).

[129] Docker Inc., docker stats | Docker Documentation, 2018. [Online]. Available: https://docs.docker.com/engine/reference/commandline/stats (visited on 05/24/2018).

[130] M. Gori and A. Tesi, "On the Problem of Local Minima in Recurrent Neural Networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 14, no. 1, pp. 76–86, 1992. DOI: 10.1109/34.107014.

[131] L. Prechelt, "Early stopping - but when?" in Neural Networks: Tricks of the Trade, G. B. Orr and K.-R. Müller, Eds., Berlin, Heidelberg: Springer Berlin Heidelberg, 1998, pp. 55–69. ISBN: 978-3-540-49430-0. DOI: 10.1007/3-540-49430-8_3. [Online]. Available: https://doi.org/10.1007/3-540-49430-8_3

[132] M. Everingham, L. Van Gool, C. K. Williams, J. Winn, and A. Zisserman, "The Pascal visual object classes (VOC) challenge," International Journal of Computer Vision, vol. 88, no. 2, pp. 303–338, 2010. ISSN: 09205691. DOI: 10.1007/s11263-009-0275-4.

[133] M. Everingham, S. M. A. Eslami, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, "The Pascal Visual Object Classes Challenge: A Retrospective," International Journal of Computer Vision, vol. 111, no. 1, pp. 98–136, 2014. ISSN: 15731405. DOI: 10.1007/s11263-014-0733-5.

[134] P. Henderson and V. Ferrari, "End-to-end training of object class detectors for mean average precision," Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 10115 LNCS, pp. 198–213, 2017. ISSN: 16113349. DOI: 10.1007/978-3-319-54193-8_13. arXiv: 1607.03476.

[135] T. Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, "Microsoft COCO: Common objects in context," Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 8693 LNCS, no. PART 5, pp. 740–755, 2014. ISSN: 16113349. DOI: 10.1007/978-3-319-10602-1_48. arXiv: 1405.0312.

[136] Docker Inc., Library/docker, 2018. [Online]. Available: https://hub.docker.com/_/docker (visited on 06/01/2018).

[137] Nvidia, Nvidia-docker. [Online]. Available: https://github.com/NVIDIA/nvidia-docker (visited on 05/25/2018).

[138] FLIR, "FLIR One." [Online]. Available: http://www.flir.com/uploadedFiles/Store/Products/FLIR-ONE/3rd-Gen/FLIR-ONE/FLIR-ONE-Gen-3-Datasheet.pdf

[139] FLIR, "FLIR Boson," p. 2, 2016.


Appendix A

Firefighting department email conversations

This appendix contains the email conversations with different firefighting departments in Belgium, as part of an exploration of
the functional requirements of an aerial thermal imaging solution. Note that all conversations were translated from Dutch to
English.

A.1 General email sent to firefighting departments

This email was sent to the departments later mentioned in this appendix. The following sections contain the responses
to this email.

Subject: Firefighting department - Thesis thermal drones

Dear Sir/Madam,

My name is Brecht Verhoeve. I am a student Master of Science in Computer Science Engineering at Ghent University. I am contacting
your department with reference to the research of my master's dissertation. I am currently researching the applications of
thermal cameras in combination with commercial drones. They can create an aerial overview of scenes and objects that often
can't be spotted with visual detectors, like hidden persons, fires or hot explosives. The eventual goal is to let a computer indicate
these objects of interest autonomously on the thermal images of the drone. These images could aid firefighters with their
work.

For this research I have some questions for you.

Functionality

I have listed some functionalities which I believe could be interesting for a firefighter:

• Detection of persons in buildings (find potential victims)

• Detection of hidden fires in buildings (to identify danger zones)

• Detection of fires on vast terrains (forests, industrial terrains)

• Indication of hot explosives

I have two questions:

• Do you agree that these are the most important functions?

• Are there any other functions that you deem important?

Quality of the application

Next to the functionality, the quality of the application is also important. For me the most important aspects are:

• Accuracy: The software must be accurate. There is no room for errors when detecting.

• Speed: The software must operate quickly. An overview must be created quickly to not waste time in case of an emergency.

• Usability: The software must be easy to use.

Once again I have two questions:

• Do you agree with these qualities?

• Are there any other important qualities that you deem important?

I would like to thank you in advance for your time.

Best regards

Brecht Verhoeve

A.2 Conversation with the Firefighting department of Antwerp, Belgium

The answers were given inline. For clarity, these are given explicitly.

Subject: Re: Firefighting department Antwerp - Thesis thermal drones

Answers can be found in your email.

Best regards

Functionality: Detection of hidden fires in buildings and environments. Are there any other functions that you deem important?
Capture the evolution of a fire with the thermal camera. Visualise incidents during night-time. Capture invisible fires, such as
hydrogen or methane fires.


A.3 Conversation with the Firefighting department of Ostend, Belgium

The answers were given inline. For clarity, these are given explicitly.

Subject: Re: Firefighting department Ostend - Thesis thermal drones

Dear Brecht,

You can find the answers after the questions in your email.

Best Regards

Functionality: Are there any other functions that you deem important? These are the most important for us at the moment.

Quality of the application: Are there any other important qualities that you deem important? The application must work autonomously.

A.4 Conversation with the Firefighting department of Courtrai, Belgium

Subject: Re: Firefighting department Courtrai - Thesis thermal drones

Dear Brecht,

Beneath you will find our answers (next to the already mentioned items).

Functionality

• The detection of persons in a landscape. For example missing persons: after a traffic accident, there are searches in the dark for victims that were catapulted from a vehicle. Today this is done via a thermal camera on the ground, but with a drone this could hold potential benefits. Another example is searching for missing persons in nature reserves. The police sometimes ask for the assistance of firefighters to search the area.

Quality of the application

• The images need to be processed in real time, not after the drone has landed.

The drones must be deployable for multiple purposes.

The interpretation of the images can be important for automatic flight control of drones in the future. Currently there is a European project, "3D Safeguard", in which KU Leuven is participating. They are already quite advanced in interpreting the images from a drone to spot persons through smoke. With this information the drone can be redirected. The application can thus use the interpretations of the images to control the drone in flight.

Best regards

A.5 Conversation with the Firefighting department of Ghent, Belgium

Subject: Re: Firefighting department Ghent - Thesis thermal drones

Hi Brecht,

I don't know if you've received the previous email, but it contained the answers to your questions.

Best regards

Subject: Re: Firefighting department Ghent - Thesis thermal drones

With respect to the functionality I would like to add:

• Measuring the temperature of containers, silos

I agree with the quality of the application. It could be handy to be able to view the application from one or more devices.
Everything should have a clear overview. If possible, information and controls should be available on one screen.

I will follow up.

Best regards


Appendix B

Thermal camera specifications

This appendix gives all the specifications for the compared thermal cameras. First, the different cameras, their producing
companies and average retail prices are listed in Table B.1. Second, their respective physical specifications are presented in
Table B.2. Third, the image qualities are presented in Table B.3. Fourth, the thermal precisions are presented in Table B.4. Fifth,
the available interfaces to interact with each camera are presented in Table B.5. Sixth, the energy consumption of each camera
is presented in Table B.6. Seventh, how support is offered when developing for these platforms is presented in Table B.7. Finally,
auxiliary features are presented in Table B.8.


Product Company Price (Euro)

Wiris 2nd Gen 640 Workswell 9995.00

Wiris 2nd Gen 336 Workswell 6995.00

Duo Pro R 640 FLIR 6409.00

Duo Pro R 336 FLIR 4384.84

Duo FLIR 949.99

Duo R FLIR 1239.99

Vue 640 FLIR 2689.00

Vue 336 FLIR 1259.93

Vue Pro 640 FLIR 4032.18

Vue Pro 336 FLIR 2302.61

Vue Pro R 640 FLIR 5184.56

Vue Pro R 336 FLIR 3455.99

Zenmuse XT 640 DJI x FLIR 11810.00

Zenmuse XT 336 DJI x FLIR 6970.00

Zenmuse XT 336 R DJI x FLIR 9390.00

Zenmuse XT 640 R DJI x FLIR 14230.00

One FLIR 237.99

One Pro FLIR 469.00

Tau 2 640 FLIR 6746.36

Tau 2 336 FLIR 4933.89

Tau 2 324 FLIR 2640

Lepton 3 160 x 120 FLIR 259.95

Lepton 3 80 x 60 FLIR 143.38

Boson 640 FLIR 1222.09

Boson 320 FLIR 938.42

Quark 2 640 FLIR 331.65

Quark 2 336 FLIR 331.65

DroneThermal v3 Flytron 341.15

Compact Seek Thermal 275.00

CompactXR Seek Thermal 286.46

Compact Pro Seek Thermal 599.00

Therm-App Opgal 937.31

Therm-App TH Opgal 2950.00

Therm-App 25 Hz Opgal 1990.00

Table B.1: Compared cameras, their producing companies and their average retail price.


Product Weight (g) Dimensions (mm)

Wiris 2nd Gen 640 390 135 x 77 x 69

Wiris 2nd Gen 336 390 135 x 77 x 69

Duo Pro R 640 325 85 x 81.3 x 68.5

Duo Pro R 336 325 85 x 81.3 x 68.5

Duo 84 41 x 59 x 30

Duo R 84 41 x 59 x 30

Vue 640 114 57.4 x 44.45 x 44.45

Vue 336 114 57.4 x 44.45 x 44.45

Vue Pro 640 92.14 57.4 x 44.45 x 44.45

Vue Pro 336 92.14 57.4 x 44.45 x 44.45

Vue Pro R 640 92.14 57.4 x 44.45 x 44.45

Vue Pro R 336 92.14 57.4 x 44.45 x 44.45

Zenmuse XT 640 270 103 x 74 x 102

Zenmuse XT 336 270 103 x 74 x 102

Zenmuse XT 336 R 270 103 x 74 x 102

Zenmuse XT 640 R 270 103 x 74 x 102

One 34.5 67 x 34 x 14

One Pro 36.5 68 x 34 x 14

Tau 2 640 72 44.4 x 44.4 x 44.4

Tau 2 336 72 44.4 x 44.4 x 44.4

Tau 2 324 72 44.4 x 44.4 x 44.4

Lepton 3 160 x 120 0.9 11.8 x 12.7 x 7.2

Lepton 3 80 x 60 0.9 11.8 x 12.7 x 7.2

Boson 640 7.5 21 x 21 x 11

Boson 320 7.5 21 x 21 x 11

Quark 2 640 8 22 x 22 x 12

Quark 2 336 8 22 x 22 x 12

DroneThermal v3 3 20 x 20 x 15

Compact 14.17 25.4 x 44.4 x 20.3

CompactXR 14.17 25.4 x 44.4 x 25.4

Compact Pro 14.17 25.4 x 44.4 x 25.4

Therm-App 138 55 x 65 x 40

Therm-App TH 123 55 x 65 x 40

Therm-App 25 Hz 138 55 x 65 x 40

Table B.2: Physical specifications.


Product IR Resolution (pixels) SD resolution (megapixels) Frequency (Hz) FOV Radiometry

Wiris 2nd Gen 640 640 x 512 1.92 not specified Various yes

Wiris 2nd Gen 336 336 x 256 1.92 not specified Various yes

Duo Pro R 640 640 x 512 12 30 Various lens yes

Duo Pro R 336 336 x 256 12 30 Various lens yes

Duo 160 x 120 2 7.5 and 8.3 57° x 44° no

Duo R 160 x 120 2 7.5 57° x 44° yes

Vue 640 640 x 512 0 7.5 Various lens no

Vue 336 336 x 256 0 7.5 Various lens no

Vue Pro 640 640 x 512 0 7.5 Various lens no

Vue Pro 336 336 x 256 0 7.5 Various lens no

Vue Pro R 640 640 x 512 0 7.5 Various lens yes

Vue Pro R 336 336 x 256 0 7.5 Various lens yes

Zenmuse XT 640 640 x 512 0 7.5 Various lens no

Zenmuse XT 336 336 x 256 0 7.5 Various lens no

Zenmuse XT 336 R 336 x 256 0 7.5 Various lens yes

Zenmuse XT 640 R 336 x 256 0 7.5 Various lens yes

One 80 x 60 1.5 8.7 50° x 38° yes

One Pro 160 x 120 1.5 8.7 55° x 43° yes

Tau 2 640 640 x 512 0 7.5 Various lens yes

Tau 2 336 336 x 256 0 7.5 Various lens yes

Tau 2 324 324 x 256 0 7.6 Various lens yes

Lepton 3 160 x 120 160 x 120 0 8.8 56° available

Lepton 3 80 x 60 80 x 60 0 8.8 56° no

Boson 640 640 x 512 0 9.0 Various lens no

Boson 320 320 x 256 0 9.0 Various lens no

Quark 2 640 640 x 512 0 9 Various lens no

Quark 2 336 336 x 256 0 9 Various lens no

DroneThermal v3 80 x 60 0 8.6 25° no

Compact 206 x 156 0 9 36° no

CompactXR 205 x 156 0 9 20° no

Compact Pro 320 x 240 0 15 32° no

Therm-App 384 x 288 0 8.7 Various lens no

Therm-App TH 384 x 288 0 8.7 Various lens yes

Therm-App 25 Hz 384 x 288 0 25 Various lens no

Table B.3: Image quality.

IR: InfraRed; SD: Standard; FOV: Field of View.


Product Sensitivity (mK) Temperature range (degrees Celsius) Accuracy (degrees Celsius)

Wiris 2nd Gen 640 50 -25 to +150 -40 to + 550 2

Wiris 2nd Gen 336 50 -25 to +150 -40 to + 550 2

Duo Pro R 640 50 -25 to + 135 -40 to + 550 5 20

Duo Pro R 336 50 -25 to + 135 -40 to + 550 5 20

Duo not specified -40 tot + 550 5

Duo R not specified -40 to + 550 5

Vue 640 not specified -58 to + 113 not specified

Vue 336 not specified -58 to + 113 not specified

Vue Pro 640 not specified -58 to + 113 not specified

Vue Pro 336 not specified -58 to + 113 not specified

Vue Pro R 640 not specified -58 to + 113 not specified

Vue Pro R 336 not specified -58 to + 113 not specified

Zenmuse XT 640 50 -40 to 550 not specified

Zenmuse XT 336 50 -40 to 550 not specified

Zenmuse XT 336 R 50 -40 to 550 not specified

Zenmuse XT 640 R 50 -40 to 550 not specified

One 150 -20 to 120 3

One Pro 150 -20 to 400 3

Tau 2 640 50 -40 to 550 not specified

Tau 2 336 50 -40 to 550 not specified

Tau 2 324 50 -40 to 550 not specified

Lepton 3 160 x 120 50 0 to 450 5

Lepton 3 80 x 60 50 0 to 450 5

Boson 640 40 0 to 500 not specified

Boson 320 40 0 to 500 not specified

Quark 2 640 50 -40 to 160 not specified

Quark 2 336 50 -40 to 160 not specified

DroneThermal v3 50 0 to 120 not specified

Compact not specified -40 to 330 not specified

CompactXR not specified -40 to 330 not specified

Compact Pro 70 -40 to +330 not specified

Therm-App 70 5 to + 90 3

Therm-App TH 70 0 to 200 2

Therm-App 25 Hz 70 5 to + 90 3

Table B.4: Thermal precision.


Product USB MAVLink HDMI

Wiris 2nd Gen 640 Flash disk yes yes

Wiris 2nd Gen 336 Flash disk yes yes

Duo Pro R 640 Mini-USB yes micro-HDMI

Duo Pro R 336 Mini-USB yes micro-HDMI

Duo Mini-USB yes micro-HDMI

Duo R Mini-USB yes micro-HDMI

Vue 640 Mini-USB No No

Vue 336 Mini-USB no no

Vue Pro 640 Mini-USB yes Optional

Vue Pro 336 Mini-USB yes Optional

Vue Pro R 640 Mini-USB yes Optional

Vue Pro R 336 Mini-USB yes Optional

Zenmuse XT 640 Via DJI drone Via DJI drone Via DJI drone

Zenmuse XT 336 Via DJI drone Via DJI drone Via DJI drone

Zenmuse XT 336 R Via DJI drone Via DJI drone Via DJI drone

Zenmuse XT 640 R Via DJI drone Via DJI drone Via DJI drone

Tau 2 640 No no no

Tau 2 336 No no no

Tau 2 324 No no no

Lepton 3 160 x 120 No no no

Lepton 3 80 x 60 No no no

Boson 640 Yes no no

Boson 320 Yes no no

Quark 2 640 no no no

Quark 2 336 no no no

DroneThermal v3 no no no

Compact Smartphone storage no no

CompactXR Smartphone storage no no

Compact Pro Smartphone storage no no

Therm-App Smartphone storage no no

Therm-App TH Smartphone storage no no

Therm-App 25 Hz Smartphone storage no no

Table B.5: Interfaces.


Product Power consumption (Watt) Input Voltage

Wiris 2nd Gen 640 4 6 - 36

Wiris 2nd Gen 336 4 6 - 36

Duo Pro R 640 10 5.0 - 26.0

Duo Pro R 336 10 5.0 - 26.0

Duo 2.2 5.0 - 26.0

Duo R 2.2 5.0 - 26.0

Vue 640 1.2 4.8 - 6.0

Vue 336 1.2 4.8 - 6.0

Vue Pro 640 2.1 4.8 - 6.0

Vue Pro 336 2.1 4.8 - 6.0

Vue Pro R 640 2.1 4.8 - 6.0

Vue Pro R 336 2.1 4.8 - 6.0

Zenmuse XT 640 Via DJI drone Via DJI drone

Zenmuse XT 336 Via DJI drone Via DJI drone

Zenmuse XT 336 R Via DJI drone Via DJI drone

Zenmuse XT 640 R Via DJI drone Via DJI drone

One approx. 1 h battery lifetime Battery

One Pro approx. 1 h battery lifetime Battery

Tau 2 640 1.3 4.0 - 6.0

Tau 2 336 1.3 4.0 - 6.1

Tau 2 324 1.3 4.0 - 6.2

Lepton 3 160 x 120 0.65 3.1

Lepton 3 80 x 60 0.65 3.1

Boson 640 0.5 3.3

Boson 320 0.5 3.3

Quark 2 640 1.2 3.3

Quark 2 336 1.2 3.3

DroneThermal v3 0.15 3.3 - 5

Compact Via smartphone Smartphone

CompactXR Via smartphone Smartphone

Compact Pro Via smartphone Smartphone

Therm-App 0.5 5

Therm-App TH 0.5 5

Therm-App 25 Hz 0.5 5

Table B.6: Energy consumption.


Product Warranty (years) User Manual Phone support Email support FAQs

Wiris 2nd Gen 640 Not specified Yes Yes Yes Yes

Wiris 2nd Gen 336 Not specified Yes Yes Yes Yes

Duo Pro R 640 1 Yes Yes Yes Yes

Duo Pro R 336 1 Yes Yes Yes Yes

Duo 1 yes Yes Yes Yes

Duo R 1 yes yes yes yes

Vue 640 1 yes yes yes yes

Vue 336 1 yes yes yes yes

Vue Pro 640 1 yes yes yes yes

Vue Pro 336 1 yes yes yes yes

Vue Pro R 640 1 yes yes yes yes

Vue Pro R 336 1 yes yes yes yes

Zenmuse XT 640 05 yes yes yes yes

Zenmuse XT 336 05 yes yes yes yes

Zenmuse XT 336 R 05 yes yes yes yes

Zenmuse XT 640 R 05 yes yes yes yes

One 1 yes yes yes yes

One Pro 1 yes yes yes yes

Tau 2 640 1 yes yes yes yes

Tau 2 336 1 yes yes yes yes

Tau 2 324 1 yes yes yes yes

Lepton 3 160 x 120 1 yes yes yes yes

Lepton 3 80 x 60 1 yes yes yes yes

Boson 640 1 yes yes yes yes

Boson 320 1 yes yes yes yes

Quark 2 640 1 yes yes yes yes

Quark 2 336 1 yes yes yes yes

DroneThermal v3 not specified no no no no

Compact 1 yes yes yes yes

CompactXR 1 yes yes yes yes

Compact Pro 1 yes yes yes yes

Therm-App 1 yes yes yes yes

Therm-App TH 1 yes yes yes yes

Therm-App 25 Hz 1 yes yes yes yes

Table B.7: Help and support.


Product Bluetooth Wi-Fi GPS Mobile app Storage

Wiris 2nd Gen 640 no on request Yes no yes

Wiris 2nd Gen 336 no on request yes no yes

Duo Pro R 640 yes no yes yes yes

Duo Pro R 336 yes no yes yes yes

Duo no no no no yes

Duo R no no no no yes

Vue 640 No no no no no

Vue 336 no no no no no

Vue Pro 640 yes no no yes yes

Vue Pro 336 yes no no yes yes

Vue Pro R 640 yes no no yes yes

Vue Pro R 336 yes no no yes yes

Zenmuse XT 640 Via DJI drone Via DJI drone Via DJI drone yes yes

Zenmuse XT 336 Via DJI drone Via DJI drone Via DJI drone yes yes

Zenmuse XT 336 R Via DJI drone Via DJI drone Via DJI drone yes yes

Zenmuse XT 640 R Via DJI drone Via DJI drone Via DJI drone yes yes

One no no no yes yes

One Pro no no no yes yes

Tau 2 640 no no no no yes

Tau 2 336 no no no no yes

Tau 2 324 no no no no yes

Lepton 3 160 x 120 no no no no no

Lepton 3 80 x 60 no no no no no

Boson 640 no no no no no

Boson 320 no no no no no

Quark 2 640 no no no no no

Quark 2 336 no no no no no

DroneThermal v3 no no no no no

Compact no no no yes yes

CompactXR no no no yes yes

Compact Pro no no no yes yes

Therm-App no no no yes yes

Therm-App TH no no no yes yes

Therm-App 25 Hz no no no yes yes

Table B.8: Auxiliary features.


Appendix C

Last Post thermal dataset summary

The goal of this appendix is to provide a summary of the layout of the Last Post thermal dataset. The data was captured on
the following days: the 24th of March 2018 and the 2nd, 3rd, 4th, 5th, 9th, 10th, 11th and 12th of April 2018. For each date a
small summary of the contents is made below. The small summary consists of a description of the conditions that day and a
listing of the video files and their contents.

C.1 24th of March 2018

Conditions

• Hours: 19:40 - 20:20

• Outside temperature range: 5 degrees Celsius - 12 degrees Celsius

• Clear

• Humidity: 76%

• Wind: 24 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 14 kilometers

Videos

• flir_20180324T195255.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people. A large crowd gathers on the right of the video.


• flir_20180324T195836.mp4: This video gives an overview of the inside of the Meningate ceremony. Many people can be seen watching the ceremony.

• flir_20180324T200421.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people. A large crowd is visible on the right side.

• flir_20180324T201448.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people. A large crowd is visible on the right side. The crowd is leaving the area after the ceremony.

• flir_20180324T202328.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people. A large crowd is visible on the right side. The crowd is leaving the area after the ceremony.

C.2 2nd of April 2018

Conditions

• Hours: 19:40 - 20:20

• Outside temperature range: 9 degrees Celsius - 15 degrees Celsius

• Light rain

• Humidity: 74%

• Wind: 18 kilometers per hour

• Precipitation: 0.4 centimeters

• Visibility: 8.1 kilometers

Videos

• 2018-04-02 194733.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people, sometimes with umbrellas, passing through.

• 2018-04-02 194952.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance.


• 2018-04-02 195518.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance.

• 2018-04-02 201322.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance. Crowds can be seen, as well as people holding umbrellas.

C.3 3rd of April 2018

Conditions

• Hours: 20:00 - 20:30

• Outside temperature range: 8 degrees Celsius - 16 degrees Celsius

• Heavy rain

• Humidity: 79%

• Wind: 25 kilometers per hour

• Precipitation: 0.5 centimeters

• Visibility: 10.1 kilometers

Videos

• 2018-04-03 201227.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. People are holding umbrellas due to heavy rain on that day. Due to rain conditions and wind it was difficult to steady the camera, which can be seen in the shaky video.

• 2018-04-03 201727.mp4: In the beginning of the clip the camera is moving towards the other side of the Meningate. From 00:20 onwards the clip is useful. The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. The video shows people leaving from the Meningate towards the buses at the other side of the bridge. Most people are holding umbrellas due to heavy rain that day. The Meningate is in the bottom left of the picture. Several buildings can be seen in the distance. In the bottom right the water of the Kasteelgracht can be seen. Sometimes, in the left of the picture, the wall of the Meningate can be seen.

• 2018-04-03 202311.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. People are holding umbrellas due to heavy rain on that day. Due to rain conditions and wind it was difficult to steady the camera, which can be seen in the shaky video.


C.4 4th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 10 degrees Celsius - 14 degrees Celsius

• Cloudy

• Humidity: 87%

• Wind: 18 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 12.9 kilometers

Videos

• 2018-04-04 200052.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. In the video a dense crowd of people slowly starts to form in the bottom right of the picture. Most people are coming from the left of the video to join the crowd.

• 2018-04-04 200728.mp4: This video shows the inside of the Meningate and the ceremony of the Last Post. Some people are up close in front. The large crowd can be seen through the hall.

• 2018-04-04 200913.mp4: This video shows the inside of the Meningate and the ceremony of the Last Post. The video switches between MSX mode, visual camera and thermal camera to show the differences.

• 2018-04-04 202859.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. At the start of the video a crowd is seen in the bottom right. At the 01:00 mark the ceremony has ended and people are exiting the gate and coming onto the crossing. They form two rows to make way for the marching band exiting the Meningate. It can be seen marching through the crowd at the 02:50 mark.

C.5 5th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 11 degrees Celsius - 15 degrees Celsius

• Sunny

• Humidity: 77%

• Wind: 11 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 12.9 kilometers

Videos

• 2018-04-05 200217.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. In the video a dense crowd of people slowly starts to form in the bottom right of the picture. Most people are coming from the left of the video to join the crowd. The video shows the 15 minutes before the start of the ceremony.

• 2018-04-04 201838.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. The video shows the first ten minutes after the end of the ceremony. The crowd, which can be seen on the left, leaves towards the square.

C.6 9th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 9 degrees Celsius - 10 degrees Celsius

• Light rain

• Humidity: 99%

• Wind: 14 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 8.1 kilometers

Videos

• 2018-04-09 200007.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. People are coming from the left towards the Meningate in the right. Not a lot of people are seen due to rain that day.

• 2018-04-09 202302.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. People are leaving from the right of the Meningate towards the square.

C.7 10th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 14 degrees Celsius - 17 degrees Celsius

• Partly cloudy

• Humidity: 52%

• Wind: 13 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 12.9 kilometers

Videos

• 2018-04-10 195029.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. In the video an army unit can be seen on the left. It can be seen that they are standing in a very structured way.

• 2018-04-10 195131.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. In the video an army unit can be seen on the left. It can be seen that they are standing in a very structured way.

• 2018-04-10 195748.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. In the video an army unit can be seen on the left. It can be seen that they are standing in a very structured way. Some people are moving around the crowd.

• 2018-04-10 200122.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. There is a big crowd that can be seen on the right. There are some schools there, so some people are wearing backpacks. It is quite warm and the cafe on the other side of the street has opened up its terrace.

• 2018-04-10 201427.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. In the video an army unit can be seen on the left. It can be seen that they are standing in a very structured way. Some people are moving around the crowd. The image is not rotated well; a well rotated image is found in 2018-04-10 201427_rotated.mp4.

• 2018-04-10 201515.mp4: This video shows the inside of the Meningate and the ceremony. A traditional 'Haka' from New Zealand soldiers can be heard in the video; the soldiers are difficult to spot due to thermal blurring, because many people are standing in one place.

• 2018-04-10 202558.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. In the video an army unit can be seen on the left. It can be seen that they are standing in a very structured way. Some people are moving around the crowd. At the 02:00 mark the army unit marches to the end of the bridge. Very dense crowds can be seen afterwards. At 08:25 the army unit marches in a straight line towards the Meningate.

C.8 11th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 12 degrees Celsius - 16 degrees Celsius

• Sunny

• Humidity: 63%

• Wind: 14 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 12.9 kilometers

Videos

• 2018-04-11 200140.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen.

• 2018-04-11 200601.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. A small crowd can be seen on the left of the video.

• 2018-04-11 201554.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. People start leaving the ceremony from the 01:20 mark.


C.9 12th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 11 degrees Celsius - 14 degrees Celsius

• Rain

• Humidity: 94%

• Wind: 8 kilometers per hour

• Precipitation: 0.1 centimeters

• Visibility: 3.2 kilometers

Videos

• 2018-04-12 195219.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. Sometimes the hedge of the wall where the film was made is visible due to the shaky camera. Not many people are seen due to the rain.

• 2018-04-12 201526.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. Sometimes the hedge of the wall where the film was made is visible due to the shaky camera. Not many people are seen due to the rain. People are leaving towards the right.


Modifiable Drone Thermal Imaging Analysis Framework for Mob Detection during Open-Air Events

Brecht Verhoeve

Supervisors: prof. dr. Bruno Volckaert, prof. dr. ir. Filip De Turck, Pieter-Jan Maenhaut, Jerico Moeyersons

Abstract: Drones and thermal cameras are used in combination for many applications, such as search and rescue, fire fighting, etc. Due to vendor-specific hardware and software, applications are hard to develop and maintain. Therefore, a modifiable drone thermal imaging analysis framework is proposed that enables users to more easily develop such image processing applications. It implements a microservice plugin architecture. Users can build image processing applications with the framework by building media streams using plugins that are either thermal cameras or image analysis software modules. The framework is evaluated by building a proof of concept implementation, which is tested against the initial requirements. It achieves the modifiability and interoperability requirements at the cost of performance and security. The framework is applied for detecting large crowds of people (mobs) during open-air events. A new dataset containing thermal images of such mobs is presented, on which a YOLOv3 neural network is trained. The trained model is able to detect mobs on new thermal images in real-time, achieving frame rates of 55 frames per second when deployed on a modern GPU.

Keywords: Drone thermal imaging, Video streaming, Framework, Microservices, Object detection, Plugin

I. INTRODUCTION

Throughout history, having an overview of the environment from high viewpoints held many benefits. The advent of drones and advanced cameras provides low-cost aerial imaging that creates numerous opportunities for new applications. Traditional visual cameras for the visible light spectrum offer high quality images, but are limited to daytime or artificially lighted scenes. Thermal cameras measure the thermal radiation of objects in a scene and thus can operate in utter darkness, revealing information not visible to the normal eye [1]. The combination of drones and thermal cameras is used in many different applications, such as geography [2, 3], agriculture [4], search and rescue [5], wildlife monitoring [6], disaster response [7], maintenance [8], etc.

Several vendors offer thermal camera products, some specifically designed for drone platforms. These cameras often use different image formats, color schemes and interfaces [9-11]. This leads to issues if applications want to change the camera that is used, or when the camera is no longer supported by the vendor, because different software needs to be built to interact with the new camera, which often is a non-negligible cost. This leads to a problem called vendor lock-in that makes customers dependent on a certain vendor, as they cannot switch product without making substantial costs, a problem already very tangible for cloud-based applications [12]. Applications across various fields often have slightly different functional and non-functional requirements. For this dissertation, several Belgian fire fighting departments were asked for requirements for a thermal drone platform application. It quickly became clear that they had various problems that needed to be solved, such as finding hot explosives, measuring temperatures in containers, identifying hot entrances, detecting invisible methane fires, finding missing persons, etc. Some use cases need to be evaluated in real-time (during fires), others need to be extremely accurate. A hypothetical application should be able to quickly implement new detection and analysis features to meet all these requirements. Because current solutions are not modifiable enough, applications built with aerial thermal imaging technology remain in the niche use case for which they were initially developed [13]. Applications could benefit from a backbone framework to aid in this modifiability/interoperability issue, aiding in developing end-to-end solutions connecting thermal cameras to various analysis/detection modules.

This dissertation explores the requirements for such a framework and its potential software architecture. To test the viability of the architecture, a proof of concept prototype is implemented and evaluated against the initial requirements. To verify if it aids in developing detection applications, the specific use case of detecting large crowds of people, so-called mobs, during open-air events is investigated. Monitoring crowds during open-air events is important, as mobs can create potentially dangerous situations through bottlenecks, blocking escape routes, etc. Through monitoring and detecting these mobs, these situations can be avoided before they become problematic [14, 15].

The remainder of this paper is organized as follows. Section II presents similar projects on the topic of modifiable imaging analysis frameworks and thermal object detection. Section III presents the requirements of the framework and the software architecture designed from these requirements. Section IV presents the implementation of the framework prototype. The mob detection experiment is described in Section V. The tests and results to evaluate the framework and the mob detection experiment are presented in Section VI. Finally, Section VII draws conclusions from this research and indicates where future efforts in this field should go.

II. RELATED WORK

The Irish start-up DroneSAR [16] developed a search-and-rescue (SAR) drone platform, allowing users to stream live images and video from a drone as it conducts a search for missing persons. The platform works with any camera, visual and thermal, but focuses on drones from vendor DJI. DroneSAR's industry partner Amazon introduced the Amazon Kinesis Video Streams platform in January 2018 as a new service for the Amazon Web Services (AWS) cloud platform. It allows users to stream live video from devices to the AWS cloud and build applications for real-time video processing [17]. The VIPER project by EAVISE, KU Leuven researched how thermal and visual video images could be used for real-time detection of persons, using object detection algorithms such as deep learning [18]. The framework presented in this work combines elements from all three of these examples.

III. REQUIREMENTS AND SOFTWARE ARCHITECTURE

A. Functional requirements

Three general actors are identified for the framework: an end-user that wants to build an image processing application for a specific use case, camera developers integrating new cameras into the framework, and analysis/detection module developers integrating new analysis/detection algorithms into the framework, so that end-users can use them to build their applications. An end-user should be able to construct an image processing application using the integrated cameras and analysis modules. He should be able to adapt this application with the framework for new use cases. Camera and analysis module developers should be able to add and distribute new integrations for new cameras and analysis modules to the framework. This allows the end-users to focus on the use case, not the technical details of the hardware platforms or algorithms, and have a wider selection of hardware and algorithms.

B. Non-functional requirements

Interoperability, modifiability and performance are identified as the architecturally significant requirements. Interoperability specifies that the framework should be able to interact with various cameras and analysis software modules via interfaces. The number of systems the framework can successfully interact with adds to the business value of the framework, as end-users can use more devices via the framework to build applications. The framework needs to be extendable with new thermal cameras and analysis modules, and applications built with the framework should be modifiable to integrate new hardware and software. The available hardware on-site for use cases such as forest fire monitoring is not always powerful enough to support heavy image analysis software. The framework should therefore be able to deploy in a distributed fashion, to allow more computationally expensive operations to be executed on more powerful remote devices. Some use cases require real-time streaming of video and manipulation of these video streams, which should be supported for the framework to be relevant.

C. Software architecture

An architectural pattern analysis based on the requirements presented in Section III-B was conducted, from which a combination of the microservices and microkernel patterns was selected as the base pattern for the software architecture. The microkernel pattern enables the framework to be extended via a plugin system. The microservices pattern enables interoperability through the use of standard interfaces and a communication scheme; it also allows the framework to be deployed in a distributed fashion [19-21]. The software architecture is documented in static views, sequence diagrams and deployment views. Figure 1 presents an overview of the architecture.

Fig. 1. Component-connector overview of the framework. The clear components are the core components of the framework that each user needs to install to use the framework. The colored components are used for the distribution of plugins.

End-users interact with the framework via the Client Interface, a graphical or textual interface. Commands are forwarded to the Stream module that manages the layout and state of the components in the media pipeline (stream) that the end-user builds for his use case. To activate and place the components in a certain layout, the Stream module connects with the Producer and Consumer components that control the plugins that process the media. Producer Plugins are devices that produce media, such as thermal cameras. Consumer Plugins process and consume media, such as analysis software and displays. Once a stream is established, the plugins forward media to each other in the layout specified by the Stream module. New support for cameras and analysis modules can be added as plugins to the Producer/Consumer Distribution components, which distribute this software so that end-users can download and install the plugins. Each module in the architecture is a microservice, allowing for a distributed deployment and enabling the modifiability requirements through interchangeable modules. Cameras and analysis modules are realized as plugins for the Producer/Consumer modules, implemented as a microkernel. This allows the framework to easily build new streams for different use cases and interchange components when needed.

C.1 Plugin model

Figure 2 depicts the model of a general framework plugin. The plugin defines three interfaces: a source media endpoint to receive media from different sources, a listener endpoint to forward the processed media to other listening plugins, and an API for framework control. The framework uses the API to change which sources and listeners a plugin has, and its state. By linking plugins together through the sources and listeners resources, the framework can build a media processing stream. Producer Plugins have no sources, since they produce media. The states are used to stop and start the media processing of the plugins in the stream. The REST paradigm is selected to build this API, with state, sources and listeners resources that need to be minimally implemented. A plugin has the following states: INACTIVE, PLAY, STOP and PAUSE. Figure 3 depicts the state transition diagram. A plugin implements the visible states STOP, PAUSE and PLAY, describing whether the media process of the plugin is stopped, paused or processing, respectively. The INACTIVE state is only visible to the framework, as it indicates that there is no active process that runs the plugin. This is the initial state of a plugin in the framework. When a plugin is added to a stream, the plugin microservice is started, transitions to the STOP state and waits for commands.

Fig. 2. Schematic overview of a plugin.

Fig. 3. State transition diagram of a plugin.
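To make the plugin model concrete, the following listing gives a minimal sketch of such a control API in Flask (the framework later used for the prototype REST APIs). The state, sources and listeners resources follow the model above; the transition table, payload shapes and field names are illustrative assumptions, not the exact prototype interface.

# Minimal sketch of the plugin control API; payloads are assumptions.
from flask import Flask, jsonify, request

app = Flask(__name__)

plugin = {
    "state": "STOP",   # visible states: STOP, PAUSE, PLAY
    "sources": [],     # endpoints this plugin receives media from
    "listeners": [],   # endpoints this plugin forwards media to
}

# Transition table between visible states, assumed from Figure 3.
TRANSITIONS = {
    "STOP": {"PLAY", "PAUSE"},
    "PAUSE": {"PLAY", "STOP"},
    "PLAY": {"PAUSE", "STOP"},
}

@app.route("/state", methods=["GET"])
def get_state():
    return jsonify(state=plugin["state"])

@app.route("/state", methods=["PUT"])
def set_state():
    new_state = request.get_json()["state"]
    if new_state not in TRANSITIONS[plugin["state"]]:
        return jsonify(error="invalid transition"), 409
    plugin["state"] = new_state  # a real plugin starts/stops its media process here
    return jsonify(state=new_state)

@app.route("/listeners", methods=["GET", "POST"])
def listeners():
    if request.method == "POST":
        plugin["listeners"].append(request.get_json())  # e.g. {"host": ..., "port": ...}
    return jsonify(listeners=plugin["listeners"])

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)

With such an interface, the Stream module can link two plugins by POSTing the RTP endpoint of the downstream plugin to the listeners resource of the upstream plugin, and start the media flow with a PUT on each state resource.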

C.2 Network topology and communication protocol

The microservices of the framework and the plugins need a communication protocol to exchange commands and video. For sending the commands, the HTTP/TCP protocol is used: a synchronous protocol that blocks on the response of a request. This ensures that commands are always received and acknowledged, increasing the reliability of the communication at the cost of performance [22]. The asynchronous RTP/UDP protocol is selected to transfer media between the plugins, ensuring low latency video transfer and enabling real-time video streams. The recommended codec for transmitting video media is MJPEG, which transmits video frames as separately encoded JPEG images [23]. Other codecs, such as MPEG-4, encode only certain frames as keyframes and the other frames as B-frames that encode differences from the keyframe [24]. This implies that when receiving images from a stream, a keyframe must first be received before the video can be decoded. Using MJPEG, plugins receiving frames can directly perform analysis on each received frame, since every frame is independently decodable, and do not have to wait for additional frames. An example of this network topology for a stream with 2 Producer plugins and 3 Consumer plugins is depicted in Figure 4.

Fig. 4. Network topology. The full lines represent HTTP/TCP communications, the dashed lines RTP/UDP communications.
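The listing below sketches this media path as two GStreamer pipelines in Python: a sending side that packs JPEG-encoded frames into RTP packets over UDP, and a receiving side that depacketizes and decodes them. The element names come from the standard GStreamer plugin sets; the test video source, host and port are placeholders, not the prototype's configuration.

# Sketch of the plugin-to-plugin media path: MJPEG over RTP/UDP.
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst, GLib

Gst.init(None)

# Sending side (e.g. a Producer plugin): every frame becomes a
# self-contained JPEG image packed into RTP packets.
sender = Gst.parse_launch(
    "videotestsrc ! videoconvert ! jpegenc ! rtpjpegpay "
    "! udpsink host=127.0.0.1 port=5004"
)

# Receiving side (e.g. a Consumer plugin): depacketize and decode.
# Because every MJPEG frame is independently decodable, analysis can
# start on the first frame that arrives.
receiver = Gst.parse_launch(
    'udpsrc port=5004 caps="application/x-rtp, media=video, '
    'encoding-name=JPEG, payload=26" '
    "! rtpjpegdepay ! jpegdec ! videoconvert ! autovideosink"
)

receiver.set_state(Gst.State.PLAYING)
sender.set_state(Gst.State.PLAYING)
GLib.MainLoop().run()  # keep both pipelines running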

IV. PROTOTYPE IMPLEMENTATION

The goal of the prototype implementation is to provide a proof of the concept architecture and meet the requirements presented in Section III. The core framework components are implemented; the distribution components are left out of scope, as they focus primarily on the distribution of supported plugins. The core microservices as well as the plugins are implemented using the Docker software containerization framework [25]. Containers virtualize on the operating system and allow for portable, lightweight software environments for processes, with a minor performance overhead. Using this technology, the core modules and plugins can be deployed in a local and distributed fashion, and can be set up and shut down easily [26]. The microservice containers communicate via the protocols presented in Section III-C.2. The REST APIs are built with the Flask framework [27], a lightweight Python web development framework ideal for prototyping. The Producer/Consumer components need access to the Docker daemon running on the Docker host in order to spin up and shut down the Docker containers containing the plugin processes. This is achieved by mounting the Docker client socket in the Producer/Consumer containers. This gives the container root access to the host, a significant security threat [28, 29]. Two sample plugins were implemented: Filecam, a plugin that produces video read in from a file, and Display, a plugin that forwards media to the display of the local device. The plugins transmit media using the video streaming framework GStreamer [30].
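As an illustration of this mechanism, the sketch below starts and later removes a plugin container through the Docker SDK for Python, as a Producer/Consumer service with the socket mounted could do. The image and network names are hypothetical.

# Sketch: controlling plugin containers via the mounted Docker socket.
import docker

client = docker.from_env()  # talks to the mounted /var/run/docker.sock

container = client.containers.run(
    image="framework/filecam-plugin:latest",  # hypothetical plugin image
    detach=True,
    network="framework-bridge",               # shared network for HTTP/RTP traffic
    environment={"PLUGIN_PORT": "5000"},
)
print(container.name)

# Deactivating a plugin (STOP -> INACTIVE) removes its container,
# which is why that transition is comparatively slow (see Section VI).
container.stop()
container.remove()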

V. MOB DETECTION

A. Dataset

Several publicly available datasets for thermal images exist [31-34]. None of these include large crowds of people, so a new dataset, called the Last Post dataset, was created. It consists of thermal video captured at the Last Post ceremony in Ypres,

(a) Thermal view of the square (b) Visual view of the square (c) Thermal view of the bridge (d) Visual view of the bridge

Fig. 5. Last Post dataset main scenes.

Belgium [35]. The videos were captured using the Flir One Pro thermal camera for Android [36], using the Iron colorscheme. Two main scenes are present in the dataset, depicted in Figure 5. Mobs are present in the thermal images but not in the visual images, due to the images being made on separate days. The images used for the experiment were manually annotated, outliers were removed, and the dataset was randomly split in a training and validation set.
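A frame-level random split can be sketched as below; the 80/20 ratio and file names are assumptions, not the experiment's exact setup. Note that randomly splitting frames extracted from video leaves the two sets temporally correlated, a limitation revisited in Section VI.

# Sketch of a random train/validation split over annotated frames.
import random

random.seed(42)
frames = [f"frame_{i:05d}.jpg" for i in range(5000)]  # hypothetical frame names
random.shuffle(frames)

cut = int(0.8 * len(frames))          # assumed 80/20 split ratio
train, valid = frames[:cut], frames[cut:]
print(len(train), len(valid))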

B. Model

Detecting and classifying objects of interest in images is known as the object detection problem in machine learning [37]. Several object detection algorithms and frameworks have been implemented in the past years. A distinction is made between traditional models [31, 38-40], deep learning two-stage networks [41-46] and deep learning dense networks [47-49]. The traditional and two-stage methods make predictions relatively slowly (order of seconds on GPU) when compared to the dense networks (order of milliseconds on GPU) [47]. Since the goal is to use the framework in real-time use cases, the latter is preferred. The YOLOv3 model is selected, as it achieves state of the art prediction performance, can make real-time predictions, and is available via the open source neural network framework darknet [50, 51]. The model is pre-trained on the ImageNet dataset [52]. The model is trained on an NVIDIA GeForce GTX 980 GPU and optimizes the SSE loss using batch gradient descent [50]. To select the best weights, the average Intersection over Union (IoU) and mean Average Precision (mAP) [53] are calculated on predictions on the validation set. The weights that achieve the highest mAP are selected as the final weights.
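For reference, the IoU used in this selection step reduces to a few lines of code. The sketch below uses the standard corner-coordinate formulation for boxes given as (x1, y1, x2, y2); it is not taken from the darknet code base.

# Standard IoU between two axis-aligned boxes (x1, y1, x2, y2).
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])   # intersection top-left
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])   # intersection bottom-right
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

print(iou((0, 0, 100, 100), (50, 50, 150, 150)))  # ~0.14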

VI. RESULTS

A. Framework

To evaluate the framework, acceptance tests for the requirements from Section III were conducted. Common framework operations, such as manipulating and building a stream, have an average execution time of 0.84 seconds with a standard deviation of 0.37 seconds. Less common operations, such as deactivating a plugin, starting up the framework and shutting down the framework, have average execution times of 3.58, 8.40 and 24.02 seconds respectively, with standard deviations of 4.67, 0.50 and 0.48 respectively. Deactivating plugins (STOP to INACTIVE transitions) takes a long time, as the container running the process needs to be removed. Real-time streaming could not be tested directly, due to the GStreamer framework having no readily available testing endpoints. However, when streaming and displaying a video with the framework, human users could not differentiate between a streamed video and a video played using a native media player, making it plausible that the framework streams in real-time. Great care must be taken when building plugins, as their processing speed has a direct impact on the real-time streaming performance. Interoperability is achieved with the REST APIs and the plugin model presented in Section III-C. The interoperability is tested by having the framework exchange information with a mock plugin implementing the specified interface and counting the number of correct exchanges. The average successful exchange ratio is 99.998%. The framework can install and detect new plugins at runtime, achieving runtime modifiability at plugin level. Different deployment schemes were not tested for the prototype.

B. Mob detection

The weights generated at the 15700th training epoch achieved the highest mAP value, 90.52%, on the validation set. For comparison, the performance of other models on benchmark datasets achieves an average mAP of 74.8% [54]. The reason the model achieves such high values is that the validation set is temporally correlated with the training set, as both sets are extracted from videos in which frames have a temporal correlation. Performance when predicting on new datasets will be worse. Figure 6 depicts some predictions of the model. When predicting on a video, the model generated predictions at an average frame rate of 55 frames per second on a GPU.

Fig. 6. Model predictions on the validation set.

VII. CONCLUSION AND FUTURE WORK

In this dissertation, a modifiable drone thermal imaging analysis framework is proposed that allows end-users to build flexible video processing pipelines using different thermal cameras and analysis modules. The framework implements a microservice container plugin architecture. Users can build image processing applications with the framework by building media streams using plugins that are either thermal cameras or image analysis software modules. The framework is evaluated by building a proof of concept implementation, which is tested against the initial requirements. The proposed framework achieves the modifiability and interoperability requirements at the cost of performance and security. The framework is applied for detecting large crowds of people (mobs) during open-air events. A new dataset containing thermal images of such mobs is presented, on which a YOLOv3 neural network is trained. The trained model is able to detect mobs on new thermal images in real-time, achieving frame rates of 55 frames per second when deployed on a modern GPU. Some extensions to this research are: deploying a detection model using the framework, testing the other deployment configurations, testing the framework with end-users in practice, and building new object detection models specifically for thermal images.

REFERENCES

[1] R. Gade and T. B. Moeslund, "Thermal cameras and applications: a survey," Machine Vision and Applications, vol. 25, pp. 245-262, 2014.
[2] M. C. Harvey, J. V. Rowland, and K. M. Luketina, "Drone with thermal infrared camera provides high resolution georeferenced imagery of the Waikite geothermal area, New Zealand," 2016.
[3] S. Amici, M. Turci, S. Giammanco, L. Spampinato, and F. Giulietti, "UAV Thermal Infrared Remote Sensing of an Italian Mud Volcano," vol. 2, pp. 358-364, 2013.
[4] J. Bendig, A. Bolten, and G. Bareth, "Introducing a low-cost mini-UAV for thermal- and multispectral-imaging," 2012.
[5] A. J. Rivera, A. D. Villalobos, J. C. Monje, J. A. Marinas, and C. M. Oppus, "Post-disaster rescue facility: Human detection and geolocation using aerial drones," IEEE Region 10 Annual International Conference Proceedings/TENCON, pp. 384-386, 2017.
[6] P. Christiansen, K. A. Steen, R. N. Jørgensen, and H. Karstoft, "Automated detection and recognition of wildlife using thermal cameras," Sensors (Basel, Switzerland), vol. 14, pp. 13778-93, Jul. 2014.
[7] S. Chowdhury, A. Emelogu, M. Marufuzzaman, S. G. Nurre, and L. Bian, "Drones for disaster response and relief operations: A continuous approximation model," 2017.
[8] Workswell, "Pipeline inspection with thermal diagnostics," 2016.
[9] DJI, "Zenmuse H3-2D."
[10] Workswell, "Applications of WIRIS - Thermal vision system for drones."
[11] Therm-App, "Therm-App - Android-apps op Google Play," 2018.
[12] B. Satzger, W. Hummer, C. Inzinger, P. Leitner, and S. Dustdar, "Winds of change: From vendor lock-in to the meta cloud," IEEE Internet Computing, vol. 17, no. 1, pp. 69-73, 2013.
[13] J. Divya, "Drone Technology and Usage: Current Uses and Future Drone Technology," 2017.
[14] B. Steffen and A. Seyfried, "Methods for measuring pedestrian density, flow, speed and direction with minimal scatter," Physica A: Statistical Mechanics and its Applications, vol. 389, pp. 1902-1910, May 2010.
[15] M. Wirz, T. Franke, D. Roggen, E. Mitleton-Kelly, P. Lukowicz, and G. Tröster, "Inferring crowd conditions from pedestrians' location traces for real-time crowd monitoring during city-scale mass gatherings," Proceedings of the Workshop on Enabling Technologies: Infrastructure for Collaborative Enterprises, WETICE, pp. 367-372, 2012.
[16] L.-L. Slattery, "DroneSAR wants to turn drones into search-and-rescue heroes," 2017.
[17] Amazon Web Services, Inc., "What Is Amazon Kinesis Video Streams?," 2018.
[18] T. Goedemé, "Projectresultaten VLAIO TETRA-project," tech. rep., KU Leuven, Louvain, 2017.
[19] L. Bass, P. Clements, and R. Kazman, Software Architecture in Practice. Addison-Wesley Professional, 3rd ed., 2012.
[20] M. Richards, Software Architecture Patterns. O'Reilly Media, 1st ed., 2015.
[21] C. Richardson, "Microservice Architecture pattern," 2017.
[22] C. de la Torre, C. Maddock, J. Hampton, P. Kulikov, and M. Jones, "Communication in a microservice architecture," 2017.
[23] On-Net Surveillance Systems, Inc., "MJPEG vs. MPEG4: Understanding the differences, advantages and disadvantages of each compression technique," 2006.
[24] D. Bull, Communicating Pictures: A Course in Image and Video Coding. Elsevier Science, 2014.
[25] Docker Inc., "Docker - Build, Ship and Run Any App, Anywhere," 2018.
[26] D. Merkel, "Docker: Lightweight Linux Containers for Consistent Development and Deployment," 2014.
[27] A. Ronacher, "Welcome to Flask: Flask Documentation (0.12)," 2017.
[28] Lvh, "Don't expose the Docker socket (not even to a container)," 2015.
[29] R. Yasrab, "Mitigating Docker Security Issues," tech. rep., University of Science and Technology of China, Hefei.
[30] GStreamer, "GStreamer: open source multimedia framework," 2018.
[31] J. W. Davis and M. A. Keck, "A Two-Stage Template Approach to Person Detection in Thermal Imagery," Proc. Workshop on Applications of Computer Vision, 2005.
[32] S. Hwang, J. Park, N. Kim, Y. Choi, and I. S. Kweon, "Multispectral Pedestrian Detection: Benchmark Dataset and Baseline," CVPR, 2015.
[33] Z. Wu, N. Fuller, D. Theriault, and M. Betke, "A Thermal Infrared Video Benchmark for Visual Analysis," IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2014.
[34] S. Z. Li, R. Chu, S. Liao, and L. Zhang, "Illumination Invariant Face Recognition Using Near-Infrared Images," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 4, pp. 627-639, 2007.
[35] Last Post Association, "Mission," 2018.
[36] FLIR, "FLIR One Pro."
[37] E. Alpaydin, Introduction to Machine Learning. MIT Press, 3rd ed., 2014.
[38] F. Xu, X. Liu, and K. Fujimura, "Pedestrian detection and tracking with night vision," IEEE Transactions on Intelligent Transportation Systems, vol. 6, no. 1, pp. 63-71, 2005.
[39] H. Nanda and L. Davis, "Probabilistic template based pedestrian detection in infrared videos," IEEE Intelligent Vehicles Symposium, Proceedings, vol. 1, pp. 15-20, 2003.
[40] R. Appel, S. Belongie, P. Perona, and P. Dollár, "Fast Feature Pyramids for Object Detection," PAMI, vol. 36, no. 8, pp. 1-14, 2014.
[41] J. R. R. Uijlings, K. E. A. Van De Sande, T. Gevers, and A. W. M. Smeulders, "Selective Search for Object Recognition," tech. rep., 2012.
[42] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Region-Based Convolutional Networks for Accurate Object Detection and Segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 1, pp. 142-158, 2014.
[43] R. Girshick, "Fast R-CNN," in Proceedings of the IEEE International Conference on Computer Vision, pp. 1440-1448, 2015.
[44] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137-1149, 2016.
[45] K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask R-CNN," arXiv, 2018.
[46] J. Dai, Y. Li, K. He, and J. Sun, "R-FCN: Object Detection via Region-based Fully Convolutional Networks," tech. rep., 2016.
[47] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," 2015.
[48] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, "SSD: Single Shot MultiBox Detector," arXiv, 2016.
[49] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, "Focal Loss for Dense Object Detection," arXiv, 2018.
[50] J. Redmon and A. Farhadi, "YOLOv3: An Incremental Improvement," arXiv, 2018.
[51] J. Redmon, "Darknet: Open source neural networks in C," http://pjreddie.com/darknet, 2013-2016.
[52] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A Large-Scale Hierarchical Image Database," in CVPR09, 2009.
[53] M. Everingham, S. M. A. Eslami, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, "The Pascal Visual Object Classes Challenge: A Retrospective," International Journal of Computer Vision, vol. 111, no. 1, pp. 98-136, 2014.
[54] A. Ouaknine, "Review of Deep Learning Algorithms for Object Detection," 2018.


Contents

1 Introduction 1
1.1 Drones 1
1.2 Concepts 2
1.2.1 Thermal Cameras 2
1.2.2 Aerial thermal imaging 2
1.3 Problem statement 2
1.3.1 Industry adoption 2
1.3.2 Crowd monitoring 3
1.3.3 Goal 4
1.3.4 Related work 4
1.4 Outline 4
2 System Design 5
2.1 Requirements analysis 5
2.1.1 Functional requirements 5
2.1.2 Non-functional requirements 6
2.2 Patterns and tactics 11
2.2.1 Layers 12
2.2.2 Event-driven architecture 12
2.2.3 Microkernel 12
2.2.4 Microservices 13
2.2.5 Comparison of patterns 13
2.3 Software architecture 15
2.3.1 Static view 15
2.3.2 Dynamic views 22
2.3.3 Deployment views 23
3 State of the art and technology choice 27
3.1 Thermal camera options 27
3.1.1 Parameters 27
3.1.2 Comparative analysis 30
3.2 Microservices frameworks 31
3.2.1 Flask 31
3.2.2 Falcon 33
3.2.3 Nameko 33
3.2.4 Vert.x 33
3.2.5 Spring Boot 34
3.3 Deployment framework 34
3.3.1 Containers 34
3.3.2 LXC 35
3.3.3 Docker 35
3.3.4 rkt 35
3.4 Object detection algorithms and frameworks 36
3.4.1 Traditional approaches 36
3.4.2 Deep learning 37
3.4.3 Frameworks 39
3.5 Technology choice 41
3.5.1 Thermal camera 41
3.5.2 Microservices framework 41
3.5.3 Deployment framework 41
3.5.4 Object detection 41
4 Proof of Concept implementation 43
4.1 Goals and scope of prototype 43
4.2 Overview of prototype 43
4.2.1 General overview 43
4.2.2 Client interface 45
4.2.3 Stream 46
4.2.4 Producer and Consumer 46
4.2.5 Implemented plugins 48
4.3 Limitations and issues 51
4.3.1 Single client 51
4.3.2 Timeouts 51
4.3.3 Exception handling and testing 51
4.3.4 Docker security issues 51
4.3.5 Docker bridge network 52
4.3.6 Single stream 52
4.3.7 Number of containers per plugin 52
5 Mob detection experiment 53
5.1 Last Post thermal dataset 53
5.1.1 Last Post ceremony 53
5.1.2 Dataset description 54
5.2 Object detection experiment 56
5.2.1 Preprocessing 56
5.2.2 Training 56
6 Results and evaluation 58
6.1 Framework results 58
6.1.1 Performance evaluation 58
6.1.2 Interoperability evaluation 60
6.1.3 Modifiability evaluation 62
6.2 Mob detection experiment results 62
6.2.1 Training results 63
6.2.2 Metrics 63
6.2.3 Validation results 64
7 Conclusion and future work 67
7.1 Conclusion 67
7.2 Future work 69
7.2.1 Security 69
7.2.2 Implementing a detection plugin 69
7.2.3 Different deployment configurations 70
7.2.4 Multiple streams with different layouts 70
7.2.5 Implementing the plugin distribution service (Remote Producer/Consumer) 70
7.2.6 Using high performance microservices backbone frameworks 70
7.2.7 New object detection models and datasets specifically for thermal images 70
A Firefighting department email conversations 81
A.1 General email sent to Firefighting departments 81
A.2 Conversation with Firefighting department of Antwerp, Belgium 82
A.3 Conversation with Firefighting department of Ostend, Belgium 83
A.4 Conversation with Firefighting department of Courtrai, Belgium 83
A.5 Conversation with Firefighting department of Ghent, Belgium 83
B Thermal camera specifications 85
C Last Post thermal dataset summary 94
C.1 24th of March 2018 94
C.2 2nd of April 2018 95
C.3 3rd of April 2018 96
C.4 4th of April 2018 97
C.5 5th of April 2018 97
C.6 9th of April 2018 98
C.7 10th of April 2018 99
C.8 11th of April 2018 100
C.9 12th of April 2018 101


List of Figures

2.1 Use case diagram 7
2.2 Overview of the framework software architecture 16
2.3 Framework network topology 17
2.4 Client Interface detailed view 17
2.5 Stream detailed view 18
2.6 Stream model 18
2.7 Plugin model 19
2.8 Plugin state transition diagram 20
2.9 Component-connector diagrams of the Producer and Consumer module 21
2.10 Producer and Consumer Distribution component-connector diagrams 22
2.11 Add plugin sequence diagram 23
2.12 Link plugins sequence diagram 24
2.13 Deployment diagrams 26
3.1 Thermal image and MSX image of a dog 28
3.3 Rethink IT: Most used tools and frameworks for microservices, results [54] 32
3.4 Containers compared to virtual machines [66] 36
4.1 filecam GStreamer pipeline 49
4.2 local plugin GStreamer pipeline 50
5.1 Last Post ceremony panorama 54
5.2 Last Post filming locations 54
5.3 Main scenes in the Last Post dataset 55
5.4 Outliers 57
6.1 Average training loss per epoch 64
6.2 Validation metrics per epoch 65
6.3 Predictions of the model on images in the validation set 66
7.1 GStreamer pipeline for a plugin with a detection model 69


List of Tables

2.1 Performance utility tree 8
2.2 Interoperability utility tree 9
2.3 Modifiability utility tree 10
2.4 Usability utility tree 11
2.5 Security utility tree 11
2.6 Availability utility tree 12
2.7 Architecture pattern comparison 14
6.1 Acceptance tests results summary 59
6.2 Performance test statistics summary, measured in seconds 60
6.3 Resource usage of the framework in several conditions 61
6.4 Total size of framework components 61
6.5 Interoperability tests results (S: Source, L: Listener) 62
B.1 Compared cameras, their producing companies and their average retail price 86
B.2 Physical specifications 87
B.3 Image quality (IR: InfraRed, SD: Standard, FOV: Field of View) 88
B.4 Thermal precision 89
B.5 Interfaces 90
B.6 Energy consumption 91
B.7 Help and support 92
B.8 Auxiliary features 93


List of Listings

1 Minimal Flask application 32
2 Vert.x example 33
3 Spring Boot example 34
4 docker-compose.yml snippet of the prototype 44
5 Mounting the Docker socket on the container 47
6 Starting a plugin container 47
7 Dynamic linking of the decodebin and jpegenc 50


List of Abbreviations

ACF Aggregated Channel Features

AMQP Advanced Message Queuing Protocol

API Application Programming Interface

AS Availability Scenario

ASR Architecturally Significant Requirement

CLI Command Line Interface

CNN Convolutional Neural Networks

CRUD Create Read Update Destroy

DNS Domain Name System

FR Functional Requirement

GPU Graphical Processing Unit

H High

HTTP Hyper Text Transfer Protocol

ICF Integral Channel Features

IoU Intersection over Union

IS Interoperability Scenario

IT Interoperability Tactic

JVM Java Virtual Machine

L Low


LXC Linux Containers

M Medium

mAP mean Average Precision

MJPEG Motion-JPEG

MS Modifiability Scenario

MSX Multi Spectral Dynamic Imaging

MT Modifiablity Tactic

NFR Non-Functional Requirement

ONNX Open Neural Network Exchange Format

OS Operating System

PS Performance Scenario

PT Performance Tactic

QAR Quality Attribute Requirement

REST Representational State Transfer

RNN Recurrent Neural Network

RPN Region Proposal Network

RTP Real-time Transport Protocol

SS Security Scenario

SSE Sum of Squared Errors

SVM Support Vector Machine

TCP Transmission Control Protocol

UDP User Datagram Protocol

UI User Interface

US Usability Scenario

YOLO You Only Look Once


Chapter 1

Introduction

Throughout history, having an overview of the environment from high viewpoints held many benefits. Early civilizations used hills to monitor their surroundings and population, and spot possible attackers. The discovery of flight meant that environments could now be viewed from a bird's-eye view, offering even more visibility and revealing much more of the world below. Recently, a much smaller type of aircraft was developed: the drone. Ranging from large plane-like to almost insect-like devices and having a wide variety of uses, drones are quickly taking over the sky. Drones would not be as effective without proper cameras providing a detailed view on the world below. With digital video cameras offering superb quality for steadily decreasing costs, almost every scene can be captured in great detail. However, these cameras are limited to the visible light spectrum, which hinders drones from operating in all circumstances, such as nightly flights. Thermal cameras measure the emitted heat of a scene and can reveal information not visible to the eye, such as hidden persons or animals, pipelines, malfunctioning equipment, etc. The combination of these two technologies certainly holds many exciting opportunities for the future.

1.1 Drones

Drones are flying robots that can fly remotely or autonomously and don't carry a human operator. They can carry a variety of payloads: video cameras, delivery parcels, fluid containers, sensors, lights, but also lethal explosives [1].

Drones are classified in different categories based on varying parameters, such as physical characteristics (diameter, weight), aerial movement techniques, application domains, etc. Based on diameter, drones are classified as smart dust (1 mm to 0.25 cm), pico air vehicles (0.25 cm - 2.5 cm), nano air vehicles (2.5 cm - 15 cm), micro air vehicles (15 cm - 1 m), micro unmanned aerial vehicles (1 m - 2 m) and unmanned aerial vehicles (2 m and larger). Often depending on their diameter, the weight of these devices ranges from less than a gram up to more than 2000 kg. Drones have different flight techniques, such as propulsion engines with wings, rotors in various amounts, flapping wings and even balloons. They are used for all kinds of purposes, ranging from search and rescue missions, environmental protection and delivery to recon, etc. Hassanalian et al. provide an excellent overview of most types of drones [2].

Due to the increasing interest in commercial drone platforms [3], a variety of payloads were developed specifically tailored for these aerial robots, such as gimbals to mount action video cameras [4], gimbals for delivering packets [5] and thermal imaging platforms [6].

1.2 Concepts

1.2.1 Thermal Cameras

Thermal cameras are passive sensors that capture the infrared radiation emitted by all objects with a temperature above absolute zero. In contrast to visible light cameras, thermal cameras do not depend on an external energy source for the visibility and colors of objects or scenes. This makes captured images independent of the illumination, colors, etc. Furthermore, images can be captured in the absence of visible light [7]. Originally, thermal camera technology was developed for night vision purposes for the military, and the devices were very expensive. Later, the technology was commercialized and it has developed quickly over the last few decades, resulting in better and cheaper cameras [7]. This led to access for a broader public, and the technology is now introduced to a wide range of different applications, such as building inspection, gas detection, industrial appliances, medicinal science, agriculture, fire detection, surveillance, etc. [7]. Thermal cameras are now being mounted on drones to give an aerial thermal overview.

1.2.2 Aerial thermal imaging

Aerial thermal imaging is defined as the creation of thermal images using a flying device. This dissertation focuses on the usage of drones for aerial thermal imaging. There are many applications for aerial thermal imaging. Some examples are geography [8, 9], agriculture [10, 11], search and rescue operations [12], wildlife monitoring [13], forest monitoring [14, 15], disaster response [16], equipment and building maintenance [17-20], etc. In the past few years, several industry players have developed thermal cameras specifically aimed at these drone applications. Examples are FLIR [6], Workswell [21] and TEAX Technology [22].

1.3 Problem statement

1.3.1 Industry adoption

The implementation of thermal cameras on drone platforms faces some issues for wide adoption by the industry. Several vendors offer thermal camera products, some specifically designed for drone platforms, that often implement different image formats, color schemes and interfaces (e.g. [23-25]). This leads to issues if users want to modify their applications by changing the camera that is used, because the application must implement new software to interact with the camera, or when the camera is no longer supported by the vendor, leaving the application with outdated hardware and software. This leads to a problem called vendor lock-in that makes customers dependent on a certain vendor, as they cannot switch product without making substantial costs, a problem already very tangible for cloud-based applications today [26].

Applications across various fields often have different functional and non-functional requirements. Some applications have hard real-time deadlines (such as firefighting, search and rescue, security, etc.) that must be respected; other applications require highly accurate predictions (e.g. person detection, agriculture, etc.). A single application domain can even have many different use cases.

Several firefighting departments in Belgium were contacted to get an overview of their needs for a drone thermal imaging application. It quickly became clear they had various detection problems, such as finding missing persons, locating hot explosives, measuring temperatures in silos, detecting invisible methane fires, etc. Equipment also wears down more quickly due to usage in harsh environments, such as fires in close proximity. A drone thermal application for them needs to be able to exchange functionality and hardware easily, and meet high performance constraints, to deliver value for them. The email conversations can be read in Appendix A.

Other drone thermal imaging applications are mostly only used in the niche domain for which they were developed, because they aren't designed for flexibility [27]. These proprietary applications have some disadvantages: the development and support potentially has a large cost, vendor lock-in can occur when products are no longer supported, security issues could arise, and customization is difficult [28, 29]. Applications could benefit from a backbone framework to aid in this modifiability/interoperability issue, aiding in developing end-to-end solutions connecting thermal cameras to various analysis/detection modules for various use cases.

1.3.2 Crowd monitoring

Festivals and other open air events are popular gatherings that attract many people. For every event organizer it is important to ensure safety and avoid incidents. Large groups of people, so-called mobs, can create potentially dangerous situations through bottlenecks, blocking escape routes, etc. Therefore, having the ability to monitor crowds and predict their behavior is very important to avoid such scenarios. Data can be obtained by evaluating video footage from past comparable events or by real time video monitoring of current events [30]. By analyzing this footage, potentially dangerous situations can be avoided by acting on the mob formation, and safety regulations can be improved to help planning future events. Vision-based approaches face several limitations: mounted cameras cannot capture elements outside of their field of view, can't see in some conditions (for example during night time), and it is difficult to infer information from the raw footage [31].

Thermal cameras could help for crowd monitoring, because they can operate in any condition. Having precise and detailed object recognition for the images produced by these cameras is crucial to extract information correctly. In this context, classifying images is not satisfactory: localization of the objects contained within the images is needed. This problem is known as object detection [32]. There are several challenges for object detection in thermal images: the image quality is very low when compared to visible light images, there is a lack of color and texture information, temperature measures are relative measures, etc. This makes extracting discriminative information from these images difficult [33]. Most efforts towards object detection on thermal images have gone towards human detection. Most of the proposed algorithms focus on feature extraction using the Aggregated Channel Features technique and boosting algorithms for learning [33-35]. Novel approaches make use of so-called deep learning with neural networks, which achieve very good results given enough data [36].


1.3.3 Goal

The goal of the dissertation is to explore the requirements of the possible backbone framework suggested in Section 1.3.1 and its potential software architecture. The architecture is evaluated by building a proof of concept implementation of the framework and evaluating it against the proposed requirements. To verify its use in developing drone thermal imaging applications, the specific mob-detection use case is investigated.

1.3.4 Related work

The Irish start-up DroneSAR [37] developed a search-and-rescue (SAR) drone platform, allowing users to stream live images and video from a drone as it conducts a search for missing persons. The platform works with any camera, visual and thermal, but focuses on drones from vendor DJI. DroneSAR's industry partner Amazon introduced the Amazon Kinesis Video Streams platform in January 2018 as a new service for the Amazon Web Services (AWS) cloud platform. It allows users to stream live video from devices to the AWS cloud and build applications for real-time video processing [38]. The VIPER project by EAVISE, KU Leuven researched how thermal and visual video images could be used for real-time detection of persons, using object detection algorithms such as deep learning [36]. The framework presented in this work combines elements from all three of these examples.

1.4 Outline

The remainder of this dissertation is organized as follows. Chapter 2 presents the requirements for the framework and the software architecture. Chapter 3 explores several state of the art technologies that can serve as backbone technologies for the framework. To test the viability of the software architecture, a prototype is implemented; Chapter 4 presents the different aspects of this prototype. Chapter 5 describes the experiment that is conducted to research the detection of mobs in thermal images. The results of both the framework and the detection experiment are presented and evaluated in Chapter 6. Finally, the conclusion and future research efforts are presented in Chapter 7.


Chapter 2

System Design

Finding out what users actually expect from a software system, and what makes it valuable for them, is of key importance for the success of that system. This chapter first explores the functional and non-functional requirements of the hypothetical framework suggested in Chapter 1, to find out what makes building the framework worthwhile. Well known architectural patterns enable certain software requirements very well and can be used for building the software architecture of the framework. The framework software architecture combines some of these patterns and is presented in several documents.

2.1 Requirements analysis

Requirements are the stated life-cycle customer needs and objectives for the system, and they relate to how well the system will work in its intended environment. They are those aspects of the framework that will provide value to the users.

2.1.1 Functional requirements

Functional requirements (FR) describe the necessary task, action or activity that must be accomplished by the system, often captured in use cases and/or user stories [39, 40]. Use cases provide a summary of the features described in the user stories. Several external people and/or systems, defined as actors, interact with the framework to achieve a certain goal [40]. Three actors are identified for the framework: an end-user that uses the framework in order to build an image processing application for a specific use case, such as the ones described in Section 1.2.2; a camera developer, who creates support software for a specific thermal camera for the framework, so that the end-user can buy and use their product; and an analysis software developer, who creates analysis software for a specific use case (tracking objects, detecting objects, etc.), so that the end-user can use their software to build his¹ application. The camera and analysis software developers are generalized to an actor called plugin developer, who develops plugins to extend the functionality of the framework. These plugins are the building blocks with which the end-user can build image processing applications.

¹ To avoid unnecessary gender-specific pronoun clutter, the male pronoun is used by default.

The general user scenario for the framework proceeds as follows. An end-user wants to build an image processing application, e.g. to detect fires in a landscape using a drone. He has a thermal camera for this and has read about hot-spot detection in video. The user looks for a plugin for the framework that can read video from his thermal camera, and for a plugin that does the hot-spot detection. If the user finds these plugins, he can add them to the framework and use them for the application he is building. He connects both plugins with the framework in a specific order to finish his application. For this simple example, the thermal camera plugin is connected to the hot-spot detection plugin, so that video coming from the thermal camera is transmitted to the detection plugin to find the fires in the landscape. The plugins in the application, and the specific order in which they are connected, is defined as a stream. This stream should be easily modifiable if additional or other functionalities are required. Suppose that the thermal camera delivers very high quality video and the detector cannot work with this, as it can only operate on low quality images. The end-user searches for a plugin that scales the high quality video down to an accepted quality for the detector. This plugin is placed in between the thermal camera and the detector, and the application can work again. By continuously adding plugins to the framework, the number of possible applications that can be built with the framework increases, making the framework useable for more aerial thermal imaging use cases.

Instead of developing the application from scratch, users can use the already implemented plugins to build the applications in an ad hoc fashion. Because of this, the development time for such applications can be reduced, and users can switch hardware and/or algorithms easily. The FRs are summarized in a use case diagram that connects each actor with their respective requirements and shows the relationships among them [40], depicted in Figure 2.1. Trivial functionalities, such as launching and shutting down the framework, are omitted. The red use cases represent use cases to extend the functionality of the framework, the blue use cases represent use cases for building streams, and the white use cases modify the media processing of the stream. Some use cases depend on others: the blue and white use cases work with plugins of the framework, so their prerequisite use case is 'Add plugin', as a plugin must be a part of the framework for a user to use it; the '(Un)Link plugins' and 'Stop/Pause/Play stream' use cases depend on 'Add plugins to stream', as a stream must contain plugins before they can be manipulated.

2.1.2 Non-functional requirements

A non-functional requirement (NFR) specifies how the framework is supposed to be, or in what manner it should execute its functionality [41]. These qualifications typically cover business and system quality requirements. A distinction is made between quality attribute requirements (QAR) and constraints. QARs are qualifications of the FRs or of the overall product, e.g. how fast a certain function must be executed or how resilient it must be to erroneous input. They are closely related to business requirements, which are specifications that, once delivered, provide value to the actors [40]. The QARs are captured in a utility tree [40] that has a root node representing the system. This root node is elaborated by listing the major QARs that the system is required to exhibit. Each QAR is subdivided into more specific QARs. To make the specific requirements unambiguous and testable, a scenario for the system or a specific function is written, and the scenarios are evaluated against the business value and the architectural impact [40]. A QAR can have High (H), Medium (M) or Low (L) business value and architectural impact, respectively. The business value is defined as the value for the end user if the QAR is enabled: High designates a must-have requirement, Medium is for a requirement which is important but would not lead to project failure, and Low describes a nice-to-have QAR, but not something that is worth much effort. Architectural impact defines how much the architecture must be designed towards the QAR to enable it: High means that meeting this QAR will profoundly affect the architecture, Medium means


Figure 2.1: Use case diagram

that meeting this QAR will somewhat affect the architecture, and Low means that meeting this QAR will have little effect on the architecture. The following QARs are discussed: performance, interoperability, modifiability, usability, security and availability.

Performance

Performance indicates the framework's ability to meet timing requirements. It characterizes the events that can occur and the framework's time-based response to those events. Latency is defined as the time between the arrival of the stimulus and the system's response to it [40]. The system latency is the latency between the initialization of an action sequence and the first change to the system noticeable by the user. Streaming latency is defined as the time between the arrival of a video frame and the arrival of the next video frame. The jitter of the response is the allowable variation in latency. Scalability is the number of users that can use the framework at the same time. The utility tree is presented in Table 2.1. The software industry has not defined a quantified 'good latency' for end-users, but a 4 second latency rule is often used as a rule-of-thumb [42]. The average response time for general framework commands should then be less than 2 seconds, with a standard deviation of 1 second, ensuring most execution times respect the 4 second bound (the mean plus two standard deviations equals 4 seconds, which covers most executions under a roughly normal distribution). As stated in Chapter 1, some use cases require real-time video streaming, such as fire fighting. The notion of low latency real-time video loosely defines that video should be streamed almost simultaneously: if a camera is filming and a human user does not notice a latency between the video of the camera and the real world, the video stream is considered real-time. Real-time is thus a human time perception, and for visual inputs this bound is as low as 13 milliseconds. Anything above 13 milliseconds becomes noticeable; anything above 100 milliseconds hinders human performance [43, 44]. However, the framework focuses on the use of thermal cameras, most of which don't produce frames faster than 8 frames per second, or 125 milliseconds per frame (see Section 3.1). More expensive cameras


can shoot at 25 frames per second, corresponding to a latency of 40 milliseconds, and this bound is selected for the streaming latency, with a standard deviation of 20 milliseconds, remaining below the frame rate of less expensive cameras. The number of users that can use the framework at the same time is assumed to be low, as current aerial thermal image applications are operated by only one user or a few. The assumption is that a maximum of five users can use the framework at the same time. All of these requirements are quantified as relatively 'good' values. These bounds should be evaluated for user satisfaction by having users use a prototype of the framework in practice.

Attribute refinement | Id   | Quality attribute scenario
Latency              | PS-1 | The average execution time of all framework commands does not exceed 2 seconds. (H, M)
                     | PS-2 | A playing stream should have an upper limit of 40 ms streaming latency. (H, H)
Jitter               | PS-3 | The average standard deviation of the execution time of all framework commands should not exceed 1 second under normal operation. (H, M)
                     | PS-4 | The average standard deviation in streaming latency should not exceed 20 ms under normal operation. (H, H)
Scalability          | PS-5 | The system should be usable by five users at the same time. (M, M)

Table 2.1: Performance utility tree

Interoperability

Interoperability is the degree to which two or more independently developed systems can usefully exchange meaningful information via interfaces in a particular context [40]. The framework will interoperate with cameras and analysis modules via the framework plugins. Henceforth, the camera and analysis plugins will be referred to as a Producer plugin and a Consumer plugin: a Producer plugin is a plugin that represents a camera that produces video, and a Consumer plugin is a plugin that represents a module that processes or consumes video. The framework will thus interact with the Producer and Consumer plugins, with which it exchanges requests to link them together, control their media process, etc. The more correct exchanges there are between the two, the better the user can use the plugin for building applications with the framework. This QAR is quantified by the ratio of requests made by the framework that were correctly interpreted by the plugin (successful requests) to the total number of requests during a runtime of the framework [40]. Intuitively, one argues that the framework must achieve perfect interoperability, with a perfect exchange success rate of 100%. Reality, however, tends to not agree with perfection, and it can never be guaranteed that all exchanges will be correct. Therefore, it is better to aim for a good interoperability measure and prepare for failed exchanges, instead of naively assuming the framework will be perfect. An exchange success rate of 99.99% is selected; the motivation for this bound is as follows. A plugin is assumed to be correct up to the first mistake, after which the plugin is considered faulty and the fault needs to be identified and prevented from occurring again. An exchange success rate of 99.99% means that if 10,000 plugins are installed and used by the framework, only one will fail during uptime. For one plugin during framework uptime, the mean time between failures is then 10,000 exchanges. It is suspected that this


number of exchanges is very high for normal framework use. Because the possibility of faulty exchanges is acknowledged, the framework will need to implement a fallback mechanism to compensate. The utility tree is presented in Table 2.2.

Attribute refinement       | Id   | Quality attribute scenario
Syntactic interoperability | IS-1 | The framework exchanges requests with a Producer plugin (known at runtime) with a success ratio of 99.99%. Incorrect requests are undone by the framework and logged. (H, H)
                           | IS-2 | The framework exchanges requests with a Consumer plugin (known at runtime) with a success ratio of 99.99%. Incorrect requests are undone by the framework and logged. (H, H)

Table 2.2: Interoperability utility tree

Modifiability

Modifiability is the cost and risk of changing functionality of the system [40]. One of the most important values of the framework is modifiability of the supported thermal cameras and analysis modules. The framework needs to be extendable with new functionalities, by enabling developers to add their support software in the form of a plugin. End-users should be able to modify the components that they use for their image processing applications easily and quickly, to allow for interchangeable hardware and software and to quickly set up new applications. Modifiability is defined in two environments: runtime, defined as the periods during which the system is up and running, and downtime, defined as the time periods during which the system is not active. The utility tree is presented in Table 2.3.

To enable users to choose the extensions they need, the framework will need a distribution service that contains all plugins available for the framework, from which a user can select and install plugins for their local version of the framework. Adding new plugins to the distribution service should not affect versions of the framework installed by the user. When a user adds a plugin from the distribution to his version of the framework, the framework should only reload once before making the plugin usable, for user comfort. Deployability is defined as the different device configurations that specify how the framework can be deployed. If the framework can be deployed in different fashions, this can increase the value for the end-user. Suppose a fire fighting use case in which a forest fire is monitored on site. Computationally powerful devices might not be available on site, so moving some plugins processing media to a remote server or cloud could still allow usage of the framework. Perhaps the device processing the media is already remote, for example a drone on security patrol; in this case, access via a remote device such as a smartphone is desirable. This leads to the deployment configurations described in the utility tree.

Attribute refinement    | Id   | Quality attribute scenario
Runtime modifiability   | MS-1 | Support for a new Producer plugin should be added to the distribution service within one day, without the framework having to restart. (H, H)
                        | MS-2 | Support for a new Consumer plugin should be added to the distribution service within one day, without the framework having to restart. (H, H)
                        | MS-3 | End-users should be able to extend their framework with new functionalities by installing new Consumer and Producer Plugins. (H, H)
                        | MS-4 | End-users should be able to modify the plugins used to build their stream. (H, H)
Downtime modifiability  | MS-5 | New Producer plugins can be installed to the local framework at runtime; the framework can only reload once before the plugin is usable. (H, H)
                        | MS-6 | New Consumer plugins can be installed to the local framework at runtime; the framework can only reload once before the plugin is usable. (H, H)
Deployability           | MS-7 | The system should be deployable on a combination of a smartphone and a cloud/remote server environment. (H, H)
                        | MS-8 | The system should be deployable on a personal computer or laptop. (H, H)
                        | MS-9 | The system should be deployable on a smartphone, laptop and cloud environment. (H, H)

Table 2.3: Modifiability utility tree

Usability

Usability indicates how easy it is for the user to accomplish a desired task and the kind of user support the system provides. Learnability indicates how easy it is for a user to gain knowledge on how to use the framework. Errors are the number of errors a user can make when trying to execute certain functions [40]. The utility tree is presented in Table 2.4.

Security

Security is a measure of the system's ability to protect data and information from unauthorized access, while still providing access to users and systems that are authorized. An action taken against the system to cause it harm is called an attack. Security has three main characteristics. Confidentiality is the property that data or services are protected from unauthorized access. Integrity is the property that data or services are protected from unauthorized manipulation. Availability is the property of the system maintaining its functionality during an attack. Authentication verifies the identities of the parties of an interaction, checks if they are truly who they claim to be, and grants or denies access [40]. Security is important for the framework if it is deployed on multiple devices that use a public network to communicate. The utility tree is presented in Table 2.5.

Availability

Availability in a general context (not only security) refers to how available the software is to carry out its functionality. Downtime is a measure of the time that the system is unavailable to carry out its functions. The utility tree is presented in Table 2.6. Availability is specified for the part of the framework that distributes the plugins.

Attribute refinement | Id   | Quality attribute scenario
Downtime             | AS-1 | The system should be up 99.5% of the year. This means the system has an allowed scheduled downtime of 43 hours and 30 minutes per year for maintenance. (M, L)
                     | AS-2 | The maximal duration of the interval during which the system is unavailable is 3 hours. (M, L)
Network              | AS-3 | If there is no active network connection, the local device can be used for operation of the framework. (H, H)

Table 2.6: Availability utility tree


Attribute refinement | Id   | Quality attribute scenario
Learnability         | US-1 | A user should be able to learn how to build an image processing application in at most one hour. (H, L)
                     | US-2 | An experienced developer should be able to start developing a Consumer plugin for the system within one day. (H, L)
                     | US-3 | An experienced developer should be able to start developing a Producer plugin for the system within one day. (H, L)
Errors               | US-4 | A user should not make more than 3 errors to build an image processing application. (H, L)

Table 2.4: Usability utility tree

Attribute refinement | Id   | Quality attribute scenario
Confidentiality      | SS-1 | Streams created by a user can only be accessed by that user and not by any other entity. (H, L)
Integrity            | SS-2 | Streams can't be manipulated without authorization by the user that made the streams. (H, L)
Availability         | SS-3 | During an attack, the core functionality is still available to the user. (H, M)
Authentication       | SS-4 | Users should authenticate with the system to perform functions. (H, L)
                     | SS-5 | Developers should authenticate their plugins before adding them to the framework. (H, L)

Table 2.5: Security utility tree

Architecturally significant requirements

Architecturally Significant Requirements (ASR) are the requirements that are the most important to realize according to business value, and that have the most impact on the architecture. From the utility trees and the measures of the quality attribute scenarios, the QARs modifiability, interoperability and performance are identified as ASRs.

2.2 Patterns and tactics

An architectural pattern is a package of design decisions that is found repeatedly in practice, that has known properties that permit reuse, and that describes a class of architectures. Architectural tactics are simpler than patterns and typically use just a single structure or computational mechanism. They are meant to address a single architectural force. Tactics are the "building blocks" of design, and an architectural pattern typically comprises one or more tactics [40]. Based on the ASRs, several tactics are listed in Table 2.7 that are used for the base pattern selection. The explored patterns are layers, event-driven architecture, microkernel and microservices.

2.2.1 Layers

The layered pattern divides the software into units called layers, that each perform a specific role within the application. Each layer is allowed to use the layer directly beneath it via its interface. Changes in one layer are isolated if the interfaces don't change, enabling MT-1, MT-2 and MT-5 [40]. While changes can be isolated by the isolated layers, they remain difficult due to the monolithic nature of most implementations of this pattern. Layers contribute to a performance penalty due to the "architecture sinkhole phenomenon", in which requests simply propagate through layers for the sake of layers [45].

2.2.2 Event-driven architecture

This pattern consists of several event publishers that create events, and event subscribers that process these events. The publishers and subscribers are decoupled by an event channel, to which the publishers publish events that the event channel forwards to the event subscribers. The subscribers should have a single purpose and execute asynchronously. Since the publishers and subscribers are single-purpose and are completely decoupled from other components via the event channel, changes are isolated to one or some components, enabling MT-1, MT-2, MT-4, MT-5 and MT-7. If the event channel adds a discovery mechanism, IT-1 can also be enabled. Overall, the pattern is relatively easy to deploy due to the decoupled nature of the components. Performance in general can be very high through the asynchronous nature of the architecture, enabling PT-6 and PT-7. If the event channel is tweaked to contain extra functionality, PT-1, PT-3, PT-8, PT-9, PT-10 and PT-11 can be enabled as well. If the components have a limited event response, then PT-2 and PT-5 can also be enabled. Development can be somewhat complicated due to the asynchronous nature of the pattern [40, 45].

2.2.3 Microkernel

The microkernel pattern allows the addition of application features as plugins to the core application, providing extensibility as well as feature separation and isolation. The pattern consists of two components: a core system, called the kernel, and plugins. The business logic is divided between independent plugins and the kernel. The kernel contains only the minimal functionality required to make the system operational. The plugins are standalone, independent components that contain specialized processing, additional features and custom code. This code is meant to enhance or extend the core system to produce additional business capabilities. In many implementations, plugins are independently developed third-party modules. Changes can largely be isolated and implemented quickly through loosely coupled plugins; all MTs can be enabled. Depending on how the pattern is implemented, the plugins can be dynamically added to the kernel at runtime. Via a resource discovery service in the kernel, the ITs can be enabled. In general, most applications built using the microkernel pattern perform well, because applications can be customized and streamlined to only include the features that are needed [45].

2.2.4 Microservices

Microservices is an architectural pattern that structures an application as a collection of loosely coupled services that implement business capabilities. Each component of the pattern is deployed as a separate unit that can be deployed on one device or multiple devices. The components can vary in granularity, from a single module to a large portion of the application. The components contain one or more modules that represent either a single-purpose function or an independent portion of a business application [45, 46]. Due to the separately deployed units, changes are isolated to individual components, enabling all MTs. Via service discovery mechanisms, the ITs can also be enabled. The microservices pattern supports distributed deployment of the software across multiple devices by design. This pattern is not known to produce high-performance applications, due to the distributed nature of the pattern, which relies on communication via a network [45, 46].

2.2.5 Comparison of patterns

Table 2.7 summarizes the analysis of the patterns. A score is given based on how well the pattern enables the tactic. Low means that the pattern does not naturally enable the tactic. Medium indicates the pattern can be implemented with the tactic, but does not include it itself. High means the tactic is enabled in the pattern. Excellent means that the tactic plays a key role in the pattern.

The microkernel pattern and microservices pattern both enable most tactics. The microkernel pattern implements extensibility of the framework by design using plugins, which is the main idea for the framework, and is thus an excellent base pattern. Interoperability and deployability of these plugins can be ensured by the microservices pattern, as it designs the microservices to have well defined interfaces for interoperability and allows for the framework to be deployed in a distributed fashion. The architecture presented below is a combination of both the microkernel pattern and the microservices pattern.


Tactic                                    | Layers | Event-driven | Microkernel | Microservices
MT-1 Split module                         | Medium | High      | High      | Excellent
MT-2 Increase semantic coherence          | Medium | High      | High      | Excellent
MT-3 Encapsulate                          | Medium | High      | High      | Excellent
MT-4 Use an intermediary                  | Medium | High      | High      | Excellent
MT-5 Restrict dependencies                | High   | High      | High      | Excellent
MT-6 Anticipate expected changes          | Low    | High      | Excellent | Excellent
MT-7 Abstract common services             | Low    | High      | Excellent | Excellent
MT-8 Defer binding / Runtime registration | Low    | Low       | Medium    | High
IT-1 Discover services                    | Low    | Low       | High      | High
IT-2 Orchestrate interface                | Low    | Low       | High      | High
IT-3 Tailor interface                     | Low    | Low       | High      | High
PT-1 Manage sampling rate                 | Low    | High      | High      | Medium
PT-2 Limit event response                 | Low    | High      | High      | Medium
PT-3 Prioritize events                    | Low    | High      | High      | Medium
PT-4 Reduce overhead                      | Low    | High      | High      | Low
PT-5 Bound execution time                 | Low    | High      | High      | Medium
PT-6 Increase resource efficiency         | Low    | High      | High      | High
PT-7 Introduce concurrency                | Low    | High      | Low       | High
PT-8 Maintain copies of computation       | Low    | High      | Low       | High
PT-9 Load balancing                       | Low    | High      | Low       | High
PT-10 Maintain multiple copies of data    | Low    | High      | Low       | High
PT-11 Bound queue sizes                   | Low    | High      | Low       | Medium
PT-12 Schedule resources                  | Low    | High      | Low       | Medium

Table 2.7: Comparison of how well the discussed patterns enable the tactics needed for the ASRs


2.3 Software architecture

The software architecture is documented in three document categories: static views, dynamic views and deployment views. The static views comprise the different components of the system and their relationships among each other. The dynamic views describe the runtime behavior of the system. Finally, the deployment views provide different configurations of how the system can be deployed on different devices [47].

2.3.1 Static view

Figure 2.2 presents an overview of the architecture using a component-connector UML diagram. Components are the boxes that represent different software entities that exist at runtime. The components have interfaces through which they interact with other components. These are indicated using the 'lollipop' notation, with the 'ball' representing the interface that a component provides, and a socket indicating that another component is using this interface. The type of data exchanged is noted next to the interface. Multiple boxes indicate that multiple components of the same kind can exist at runtime [48].

The architecture consists of the following core components: Client Interface, Producer, Stream, Consumer, Producer Distribution, Consumer Distribution, Producer Plugin and Consumer Plugin. The clear components in Figure 2.2 form the core framework, which each user needs to install to use the framework. The colored components form a distribution service for framework plugins to extend the functionality; they are not installed with the core framework, but run as remote instances with which the user can interact to extend his version of the core framework with new plugins. A user can use the framework via the Client Interface, building streams that are maintained in the Stream component. The Stream component makes requests to the Producer and Consumer components to activate and control the selected plugins to build the stream. Additional plugins can be added to the framework and are distributed via the Producer and Consumer Distribution components. The architecture implements a hybrid combination of the microservices and microkernel patterns. Each presented component is a microservice that implements its own interface to interact with other components. The Producer and Consumer components act as kernels in the microkernel pattern, while the Producer and Consumer plugins act as plugins in the microkernel pattern. These patterns enable the tactics needed to meet the requirements presented in Section 2.1.

Communication protocol

To allow the microservices to communicate, a communication protocol must be designed. Communication protocols can roughly be classified in two categories: synchronous and asynchronous. Synchronous protocols block on requests, which means that the client waits for a response of the server and can only continue executing when a response is received. This makes a synchronous protocol inherently more reliable, but also slower. An example synchronous protocol is the Hyper Text Transfer Protocol (HTTP). Asynchronous protocols just send messages and do not block on the response. This makes the protocol less reliable, but also faster [49].
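To make this distinction concrete, the following minimal Python illustration (not part of the framework; the URL and port are placeholders) contrasts a blocking HTTP request with a fire-and-forget UDP send:

import socket
import urllib.request

# Synchronous: the caller blocks until the server responds or the timeout expires.
with urllib.request.urlopen('http://example.com/', timeout=5) as response:
    body = response.read()

# Asynchronous in the sense used here: the datagram is sent and execution
# continues immediately, with no delivery guarantee and no response.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(b'video-frame-bytes', ('127.0.0.1', 5004))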

There are two types of traffic exchanged between microservices. First, there are the command requests that are exchanged between microservices to edit resources or change state. Second, there are the video frames that are exchanged between Producer and Consumer Plugins. Both types of traffic have different requirements. The commands must be communicated reliably


Figure 2.2: Overview component-connector diagram of the architecture

and need to be executed once and only once. The reliability is more important than latency, so a synchronous protocol is preferred. Microservices traditionally implement the synchronous HTTP protocol with Representational State Transfer Application Programming Interfaces (REST APIs) that specify the application endpoints as textual resources [45]. This common protocol is used for the exchanged command requests in the framework.

The video frames need to be sent with low latency at a high frequency, but reliability is less important. An asynchronous protocol is preferred. For video streaming, the Real-time Transport Protocol (RTP) running on top of the User Datagram Protocol (UDP) is selected, as it enables real-time transfer of data between processes [50]. RTP defines a standardized packet format to transmit video and audio over a network. It sequences each packet with a sequence number and a timestamp. This allows the application to detect missing packets and latencies in the network. The UDP protocol is a low latency asynchronous transport protocol, as it doesn't guarantee packet delivery.
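As an illustration of this packet format, the sketch below builds a simplified RTP-style header in Python and sends one JPEG frame over UDP. It is a simplification under stated assumptions, not the framework's actual wire code: a full RFC 2435 implementation would also fragment frames to the network MTU and prepend a JPEG payload header.

import socket
import struct
import time

def send_frame(sock, addr, jpeg_bytes, seq):
    # 12-byte RTP header: version 2, payload type 26 (JPEG, RFC 2435),
    # 16-bit sequence number, 90 kHz timestamp and a fixed SSRC identifier.
    header = struct.pack('!BBHII',
                         0x80,                                  # V=2, no padding/extension
                         26,                                    # PT 26: JPEG
                         seq & 0xFFFF,
                         int(time.time() * 90000) & 0xFFFFFFFF,
                         0x12345678)
    # Simplification: the whole frame goes into one datagram, which only
    # works for frames below the UDP datagram size limit.
    sock.sendto(header + jpeg_bytes, addr)

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send_frame(sock, ('127.0.0.1', 5004), b'\xff\xd8 ... \xff\xd9', seq=0)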

The recommended codec for transmitting video media is Motion-JPEG (MJPEG), which encodes video frames as separately encoded JPEG images. This makes analysis and processing in subsequent plugins easier, as only the received frame is needed to perform the analysis or processing. Other video compression formats, such as MPEG-4, use key-frames and object oriented differential compression formats. If a key-frame is received via the stream, the frame can be used as is. If a reference frame is received, the receiver needs to wait for the corresponding key-frame to be received to be able to construct the full video frame for analysis. This introduces extra complexity and lower quality detection, which is a clear trade-off for the quality and simplicity which MJPEG offers [51, 52].

Applying these protocols to the architecture results in the network topology depicted in Figure 2.3. The full lines represent communication via HTTP on top of the Transmission Control Protocol (TCP). The dashed lines represent the RTP protocol on top of the UDP protocol. The boxes represent the different microservice components of the framework.


Figure 2.3: Framework network topology. Each box is a microservice component of the framework. The full lines indicate communication over the HTTP/TCP protocol, the dashed lines indicate communication over the RTP/UDP protocol.

Client Interface

The Client Interface is the interface through which end-users can interact with the framework. Figure 2.4 presents the detailed component-connector diagram. The Client Interface consists of a User Interface (UI) component and an API Gateway component. Devices can make requests to the Client Interface via the Client Requests interface provided by the API Gateway. The UI provides the UI Operation interface that is used by end-users to control the framework. This can be either a visual or a textual interface. The UI actions are translated to client requests that are forwarded to the API Gateway using the Client Requests interface. The API Gateway translates the client requests and forwards them to the other core framework components.

Figure 2.4: Client Interface detailed view

Stream

The Stream component maintains the logical representation of the streams built by the end-user for his image processing

application Figure 25 presents the detailed component-connector diagram


Figure 2.5: Stream detailed view

It consists of an API, a Stream Manager and several Stream Model components. The API provides the Stream Commands interface used by the Client Interface to interact with the framework; it translates incoming requests to commands for the Stream Manager, which can then execute these commands. Commands include creating a new stream, modifying the stream layout, modifying the stream state, etc. The Stream Manager creates and manages multiple streams, represented by the Stream Models. So the end-user builds Stream Models to create image processing applications. The Stream Model represents the logical model of these image processing application streams. As stated before, a stream consists of several plugins processing media, placed in some order, that are linked by the framework. Figure 2.6 illustrates this concept.

Figure 2.6: Logical model of a stream. The arrows represent the flow of media through the stream.

Logically, the Stream Model is represented as a tree with multiple roots and multiple leaves. The framework builds streams by initializing the needed plugins and connecting them in order. In the example Stream Model, plugins receive media from multiple source plugins and forward media to multiple targets. The Stream Model has a global state that represents the cumulative state of all plugins. To transition the global state from A to B, all plugins need to transition from A to B. This is done by first making the transition on the leaves of the Stream Model, after which the transition propagates to the root nodes. This ensures that no media is lost, because the plugins that transition first cannot process anything yet, as no media is being put into the tree.
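The following sketch illustrates this leaf-first propagation. The class and method names are assumptions for illustration, not the prototype's actual code:

class PluginModel:
    """Minimal stand-in for the framework's Plugin Model (assumed structure)."""
    def __init__(self, name):
        self.name = name
        self.listeners = []   # downstream plugins

    def set_state(self, state):
        print('%s -> %s' % (self.name, state))

def transition_stream(plugin, state, visited=None):
    # Post-order traversal: transition all downstream plugins (towards the
    # leaves) before the plugin itself, so that no media reaches a plugin
    # that is not yet in the target state.
    if visited is None:
        visited = set()
    if plugin in visited:
        return
    visited.add(plugin)
    for listener in plugin.listeners:
        transition_stream(listener, state, visited)
    plugin.set_state(state)

camera = PluginModel('thermal-camera')
scaler = PluginModel('scaler')
detector = PluginModel('hot-spot-detector')
camera.listeners.append(scaler)
scaler.listeners.append(detector)
transition_stream(camera, 'PLAY')   # prints detector, scaler, camera

For a stream with multiple roots, the same call is repeated per root with a shared visited set, so plugins reachable from several roots transition only once.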


Producer and Consumer plugins

A Plugin represents an independent media processing element, either of the Producer type (such as a thermal camera) or the Consumer type (such as an object detection software module). Plugins are deployed as standalone microservices providing a REST API interface that the framework uses to control the plugin. Figure 2.7 represents a general plugin model. A plugin receives media from other plugins, called the sources, processes this media, and forwards it to other plugins, called the listeners. A Producer plugin only has listeners; a Consumer plugin has both sources and listeners. Merging the media from multiple sources and forwarding the processed media to multiple listeners is the responsibility of the plugin.

Figure 2.7: Plugin model

The plugin REST API should at least provide a state resource, representing the state of how the plugin is processing media; a sources resource, representing the sources from which the plugin receives media to process; and a listeners resource, representing the listeners to which the plugin transmits the processed media. Only Consumers have both the sources and listeners resources, as Producer Plugins produce their own media source and hence can only have listeners.

To indicate if and how the plugin is actively processing media, a finite state machine is implemented. The state transition diagram is presented in Figure 2.8. A plugin can be in four possible states: INACTIVE, STOP, PLAY and PAUSE. When a plugin is in the INACTIVE state, no active microservice is running the plugin. This is the initial state for all plugins of the framework. This state is only visible to the framework, as in this state the plugin is not instantiated. When a plugin is in the STOP state, the framework has instantiated a microservice running the plugin. The plugin is listening for commands on its API, but is not processing any media. This state is visible to the plugin. In the PLAY state, a plugin processes media received from its source(s), transmits processed media to its listener(s), and listens for commands. When in the PAUSE state, media processing is paused, but media buffers are kept. This is to decrease the latency when the plugin transitions back to the PLAY state, since the plugin can continue processing from the point where it was paused. The difference with the STOP state is that, when transitioning to the STOP state, the plugin clears its media buffers.

The plugin starts in the INACTIVE state. When a microservice running the plugin is instantiated by the framework, the plugin initializes itself in the STOP state. From the STOP state, the plugin can transition to the PLAY state to process media. This transition is only successful if sources and listeners are registered with the plugin. From the PLAY state, a transition to both


Figure 2.8: The state transition diagram for a plugin

the STOP state and the PAUSE state can be made, which stops the processing of media and respectively drops or keeps the media buffers. The plugin cannot make multiple state transitions per command. When a transition is made to INACTIVE, the framework first transitions the plugin to the STOP state, after which the INACTIVE state can be reached.
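A minimal sketch of this state machine is given below. The transition table is inferred from the description above (Figure 2.8 remains the authoritative diagram), and the guard on the PLAY transition is an assumption about how Producer and Consumer plugins might differ:

# Transition table inferred from the text; PAUSE -> STOP is assumed.
ALLOWED = {
    'INACTIVE': {'STOP'},             # framework instantiates the microservice
    'STOP':     {'PLAY', 'INACTIVE'},
    'PLAY':     {'STOP', 'PAUSE'},
    'PAUSE':    {'PLAY', 'STOP'},
}

def transition(state, target, has_sources, has_listeners, is_producer=False):
    """Return the new state, or raise ValueError for an illegal transition."""
    if target not in ALLOWED[state]:
        raise ValueError('illegal transition: %s -> %s' % (state, target))
    # PLAY requires registered listeners, and sources too for a Consumer.
    if target == 'PLAY' and not (has_listeners and (is_producer or has_sources)):
        raise ValueError('PLAY requires registered sources/listeners')
    return target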

A source/listener has the following fields: hostname, the hostname of the microservice running the plugin, and port, the port on which the source/listener is reachable.

On the sources and listeners resources, an HTTP GET and POST method must be provided: GET retrieves the sources/listeners and their details, POST adds a new source/listener to the plugin. Both resources additionally need to provide an individual endpoint per source/listener, on which GET, PUT and DELETE must be provided. This is for individual manipulation of the source/listener: GET retrieves the details, PUT updates the fields of a source/listener, and DELETE removes a source/listener from the plugin.
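As a concrete illustration, the following minimal Flask sketch shows how a plugin could expose the state and listeners resources (the sources resource would be analogous). The endpoint shapes and JSON field handling are assumptions for illustration, not the dissertation's prototype code:

from flask import Flask, abort, jsonify, request

app = Flask(__name__)

state = 'STOP'     # STOP, PLAY or PAUSE; INACTIVE is never visible to the plugin
listeners = []     # each listener: {'hostname': ..., 'port': ...}

@app.route('/state', methods=['GET'])
def get_state():
    return jsonify({'state': state})

@app.route('/listeners', methods=['GET', 'POST'])
def listeners_collection():
    if request.method == 'POST':
        # Register a new listener to which processed media will be sent.
        listeners.append({'hostname': request.json['hostname'],
                          'port': request.json['port']})
        return jsonify(listeners[-1]), 201
    return jsonify(listeners)

@app.route('/listeners/<int:idx>', methods=['GET', 'PUT', 'DELETE'])
def listeners_item(idx):
    if idx >= len(listeners):
        abort(404)
    if request.method == 'PUT':
        listeners[idx].update(request.json)
    elif request.method == 'DELETE':
        return jsonify(listeners.pop(idx))
    return jsonify(listeners[idx])

if __name__ == '__main__':
    app.run(port=5000)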

Producer and Consumer

The Producer and Consumer components are responsible for interacting with and managing the Producer/Consumer plugins used in the streams. Figure 2.9 presents the component-connector diagrams of the Producer and Consumer components. Both components have a similar architecture, but are separate components. This is because their plugin models differ, and because they are suspected to often be deployed on different devices with specific hardware requirements: Producer Plugins could be deployed on the thermal camera itself, having a very specific operating system, whereas a Consumer plugin might need access to specific processors to speed up its execution.

The Producer and Consumer consist of the following components: API, Kernel, Plugin Model and Plugin Manager. The API translates requests coming from the Stream component to commands for the Kernel. The Kernel implements the core functionalities, such as activating (deploying) and deactivating plugins, managing their state and manipulating their resources. It creates a Plugin Model for each Plugin that the framework has installed. This model represents a plugin logically on framework level and keeps track of the Plugin resources. The Plugin Manager manages the plugins that were added to the framework, stored in the Plugin Directory. It manages the plugin installations, adding updates or installing additional plugins that can be retrieved from the Producer and Consumer Distribution components.


(a) Producer component-connector diagram. (b) Consumer component-connector diagram.

Figure 2.9: Component-connector diagrams of the Producer and Consumer modules

Producer and Consumer Distribution

The Producer and Consumer Distribution components are responsible for managing and maintaining the plugins for the framework. They act as online software repositories from which local versions of the framework can retrieve new plugins. The component-connector diagrams are presented in Figure 2.10. The Distribution components consist of the following subcomponents: API, Plugin Manager and Plugin Tester. Plugin Developers can make requests to the API, which translates these requests to Create, Read, Update, Destroy (CRUD) commands for the Plugin Manager. The Plugin Manager executes these commands on the Plugins that are kept in the Plugin Repository. The quality of the framework depends on the quality of the plugins that it offers. Therefore, plugins should be thoroughly tested before being added to the framework to guarantee quality. The Plugin Tester component is responsible for this testing. Tests should include testing if the plugin implements the Plugin Model correctly, if the plugin meets the performance requirements, etc. When a plugin passes these tests, it is added to the Plugin Repository, so that end-users can install the plugin and use it for their applications.
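As a small illustration of what such a test could look like, the hypothetical check below verifies that a candidate plugin exposes the state resource from Section 2.3.1 and reports one of the defined states; the URL and JSON shape are assumptions:

import json
import urllib.request

def test_state_resource(plugin_base_url):
    # A conforming plugin must answer GET /state with a valid plugin state.
    with urllib.request.urlopen(plugin_base_url + '/state', timeout=5) as resp:
        assert resp.getcode() == 200
        state = json.load(resp)['state']
        assert state in ('STOP', 'PLAY', 'PAUSE')

test_state_resource('http://localhost:5000')  # example URL, assumed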


(a) Producer Distribution. (b) Consumer Distribution.

Figure 2.10: Producer and Consumer Distribution component-connector diagrams

2.3.2 Dynamic views

Dynamic views depict the behavior of the system and complement the static views. They are documented using sequence diagrams that show an explicit sequence of messages between architecture elements that describes a use case [40]. Two key use cases are presented here: adding a plugin to the stream, and linking plugins to build the stream.

Add plugin to stream

Figure 2.11 presents the sequence diagram for adding a Producer plugin to the framework. The framework is assumed to be running, the user has created a stream S, and the Producer Plugin A is correctly installed. The end-user executes the command to add A to stream S on the Client Interface, which passes the command to the Stream component. The Stream component requests the creation of a microservice instance of A, which is created by the Producer Kernel. When the Producer Plugin is instantiated, the Producer Kernel creates a Plugin Model of A and adds it to its references, so that the instance of A can be reached for future commands. Afterwards, the Stream Manager is informed of the success, upon which the Stream Manager can add A to the Stream Model, ready to be linked with other plugins. The user is notified of this success and can continue building. If A could not be instantiated (due to not being installed, not being installed correctly, etc.), A is marked as 'broken' and the user is notified that the action could not be completed. When the plugin is marked as 'broken', it can no longer be used and needs to be reinstalled. The sequence diagram for adding a Consumer Plugin is similar, but replaces the Producer components with the Consumer components.

Link plugins

Figure 2.12 presents the sequence diagram for linking two plugins in a stream. In the sequence diagram, two Consumer Plugins A and B are linked; this can be extended to a Producer Plugin linking with a Consumer Plugin. The framework is assumed to be running, the user has created a stream S, and the plugins A and B have been instantiated and added to the stream. The end-user executes the command to link A and B in stream S on the Client Interface, which passes the command to the Stream component, which checks if the link is valid for the Stream Model S. Linking can only be done if the stream is in the STOP state and if the plugins are already in the stream. If the link is valid, the Stream Manager can begin linking the plugins. To link

Figure 2.11: Add a Producer Plugin to a stream

the plugins in the order A-B, A is added as a source for B and B is added as a listener for A. These subsequences are found in their corresponding frames in the diagram and are very similar. The Stream Manager makes the request to add the source/listener to the Kernel, which finds the corresponding plugin and makes the request on the corresponding Plugin Model. If the Plugin succeeded, the Plugin Model is updated and the Stream Manager is notified of this success. If both plugins have successfully set the source and listener, the Stream Model layout is updated with the link. Should the source/listener request fail for one of the plugins, the change is rolled back and the end-user is notified.
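Concretely, linking A and B could come down to two REST calls issued by the Stream Manager, sketched below with assumed hostnames, ports and JSON bodies:

POST http://plugin-b:5001/sources    {"hostname": "plugin-a", "port": 5000}
POST http://plugin-a:5000/listeners  {"hostname": "plugin-b", "port": 5001}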

2.3.3 Deployment views

The different deployment configurations are illustrated via deployment diagrams using the UML 2.5 deployment specification [48]. 'Host' specifies the device on which components are deployed. The 'microservice' indicates the isolated environment in which components are running. These isolated environments on the host are realized as software containers that enable portability of the components to other deployment configurations. This concept is further discussed in Section 3.3. The Producer and Consumer Distribution components were left out of the diagrams, as they are always distributed on a different host than the core framework. Two deployment configurations are presented: the local configuration, which deploys the components on a single device, and the distributed configuration, which deploys each component on a separate device. These configurations are presented in Figure 2.13.


Figure 2.12: Link two plugins in a stream. The 'format request' and 'translate request' actions in the API components have been omitted to reduce clutter in the diagram, but are executed by the API components.


Local configuration deployment

The local configuration deploys the framework on a single local device. The configuration is depicted in Figure 2.13a. Because the framework is deployed as a whole, it can operate offline. This configuration is useful for image processing applications that can't rely on a stable network connection. Examples are remote locations or densely built-up areas. The components are still deployed as separate microservices, due to the architecture of the framework. This has an impact on the performance of the framework, because for every interaction between components either the HTTP message protocol or the RTP protocol is used, which introduces extra overhead compared to direct invocation of commands.

Distributed configuration deployment

The distributed configuration deploys the framework on multiple devices. The components are distributed over these devices, made possible by the microservice isolation and communication protocols. This configuration is depicted in Figure 2.13b. Obviously, in this configuration each component of the framework must have a reliable network connection to communicate with the other framework components. This configuration could be used, for example, for a security application. The end-user has the Stream module running on a master node that controls several cameras. The end-user can configure his image processing application through the Client Interface running on his device, which communicates with the Stream module running on the master node. The master node can control each camera by communicating with the Producer component. If, for example, the security application requires autonomous detection of trespassing people, a computationally intensive task, the Consumer Plugins could need dedicated hardware that is only available on another device. The Consumer component can then be deployed on that dedicated device, and the Stream component can again communicate with it over the network. The success of this configuration depends on the availability and the capacity of the network. If the network fails, commands and media can't come through and the framework can no longer execute. Due to the distributed nature, performance will also be worse when compared to the local configuration, because each request between the components travels over a network that can experience delays.


(a) Local configuration deployment diagram. (b) Distributed configuration deployment diagram.

Figure 2.13: Deployment diagrams


Chapter 3

State of the art and technology choice

To build and test a proof of concept implementation of the architecture presented in Chapter 2, several state of the art technologies can be used as support for the framework. These are presented in Sections 3.1, 3.2, 3.3 and 3.4. For each category, a choice is made that will serve as the basis for the implementation of the proof of concept, discussed in Section 3.5. Readers already familiar with the presented technologies can safely skip ahead to Section 3.5.

3.1 Thermal camera options

This section aims to provide an overview of some currently commercially available thermal cameras. The overview is not a complete overview of all products offered by all vendors. This data was gathered in September 2017, so some products may have been discontinued and new products may already have been launched. Several parameters are collected for each product. Section 3.1.1 discusses why these parameters are important to assess the quality of a thermal camera. Section 3.1.2 aims to aggregate these parameters and presents insights into the data. The full list of specifications can be found in Appendix B.

3.1.1 Parameters

The following parameters were considered for the comparison: price, physical specifications, image quality, thermal precision, interfaces, energy consumption, help and support, user experience, and auxiliary features.

Price

Thermal cameras are relatively expensive when compared to visible light cameras. For example, a 20 megapixel (MP) visible light camera can cost as low as 100 euro, while thermal cameras, having a much lower image resolution, can cost as much as 15,000 euro. Prices for thermal cameras cover a very wide range, and budgets are limited in practice.


Physical specifications

Two specifications are considered: the weight of the camera and the dimensions of the camera. Drones have a limited carry weight, due to maximal carrying capacities and a faster draining of battery life when carrying heavier loads. Lighter and smaller cameras are preferred for usage with drones. These often offer lower image quality and fewer features than the heavier cameras.

Image quality

Image quality specifies how much information an image can possibly hold. It consists of the following parameters: resolution, capture frequency or frame rate, field of view, and radiometric information. Image resolution is the amount of detail an image holds. Higher resolution cameras can capture more details in a scene, resulting in a sharper image that holds more information. Due to more details, smaller objects can also be seen, allowing scenes to be viewed from larger distances. Drones capture images from relatively large distances, so good resolutions are required for the images to be useful. Image resolution is measured in pixel density, presented as the product of the number of pixels in the width and height of the image. The highest resolution found for the compared cameras is 640 x 512 pixels. Some cameras offer a visual camera next to the thermal camera. This allows an overlay of the visual image and the thermal image, so-called Multi Spectral Dynamic Imaging (MSX). This creates artificially sharper images, because edges can be seen more clearly, as they are more visible in the visual image. Figure 3.1 depicts a thermal-only image and an MSX image of a dog. It can be seen that the MSX image is sharper. MSX is a more low-cost solution to produce sharper images compared to increasing the thermal resolution, as visible light cameras are less expensive [7].

(a) Thermal. (b) MSX.

Figure 3.1: Thermal image and MSX image of a dog

The capture frequency or frame rate dictates how many frames the camera can capture per second. Higher frequency cameras are able to track dynamic scenes better. The field of view is the angle through which the camera is sensitive to thermal radiation, and


determines the extent of the world that can be seen by the camera. Bigger fields of view can capture more of the environment in one image. Most cameras allow various lenses to be mounted onto the camera, which allows for greater flexibility in choosing the field of view. Radiometric image information is thermal information embedded with the infrared image that can be analyzed after recording. Radiometric information characterizes the distribution of the thermal radiation's power in space and specifies the temperature per pixel exactly. Regular thermal images use a relative scaling of temperatures that are mapped onto a colorspace, with some color being the hottest color in the image and another color the coldest. For example, in Figure 3.1a the Iron color scheme is used, which maps the cold regions of the image on blue color variants and warmer regions on red and yellow variants. Radiometric information can give a very detailed description of the radiation pattern of a scene.

Thermal precision

Thermal precision specifies the temperature range, the sensitivity and the accuracy of the temperature measurements. The temperature range indicates the minimum and maximum temperatures a camera can detect. A larger temperature range comes with a trade-off in sensitivity and accuracy. Often, cameras offer different modi of operation and operate using different intervals according to the accuracy needed in a scene. Sensitivity indicates the ability of the camera to record finer distinctions in temperature. Accuracy is the margin of error for temperature readings on the thermal camera. An accuracy of 5 degrees Celsius for small temperature ranges and 20 degrees Celsius for large temperature ranges is commonly found. The increase in error margin is a trade-off for the larger temperature interval. Objects emit infrared waves in various forms (due to black-body radiation [7]). To accurately compare the temperatures, cameras often implement emissivity corrections that normalize the measurements.

Interfaces

Cameras can communicate with other devices via several interfaces during use. Cameras mounted on a drone cannot be accessed during flight and need these interfaces to transfer data. USB and HDMI are the most commonly found interfaces to connect the camera with an on-board processing unit, gimbal or battery. MAVLink [53] is a very lightweight, header-only message marshalling library for micro air vehicles such as drones. When a camera provides this interface, this allows for a very efficient communication scheme to control the camera remotely. Other interfaces include Bluetooth or Wi-Fi.

Energy consumption

A device mounted on a drone has a limited energy source at its disposal. The less energy the camera consumes, the longer the drone can operate. This can even lead to lighter batteries used in-flight, reducing the carried weight and therefore also the energy consumption. Typically, energy consumptions for cameras are much lower than the energy consumption of the drone itself, so this is a minor specification. Input voltage and power consumption are specified.


Help and support

How the camera is supported by the company has a big impact on the ease of developing applications for the camera platform. User manuals, phone or email support and FAQs are very helpful. Should the camera be malfunctioning, a product warranty is necessary to recover the broken product.

User experience

The user experience is another important factor, as there is a difference between the technical specifications and the actual experience of the user. The user experience is measured in a number of good and a number of bad reviews. Reviews are scored from zero to five stars, with zero being a very bad experience and five being a very good experience. A good review is scored three or more stars, a bad review less than three stars.

Auxiliary features

Some cameras offer even more features than the ones mentioned above. These can be a connection with the Global Positioning System (GPS) to indicate where images were captured, a software application to interact with the camera, analysis functionality, tracking, etc.

3.1.2 Comparative analysis

It can be seen that FLIR is the market leader in thermal solutions for drones. They offer the largest product line, and products from other companies often utilize one of their camera cores. Figure 3.2a plots the retail price compared to the thermal resolution. Cameras with high and low resolutions are found across all price ranges. Clearly, other features determine the price of a thermal camera. A feature function is defined that maps the features of a thermal camera onto an integer (a code sketch follows the list below). The function increments the integer if:

• The camera has MSX support.
• The camera has a standard data format (not just an analog or digital signal).
• The camera offers radiometric information.
• The image resolution is 640 x 512 pixels, the highest resolution found for these products.
• The sensitivity is smaller than 100 mK.
• The camera offers emissivity correction.
• The camera offers a USB interface.
• The camera offers a MAVLink interface.
• The camera offers an HDMI interface.
• The camera offers a Bluetooth connection.
• The camera offers a Wi-Fi connection.
• The camera offers GPS tagging.
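A sketch of this feature function in Python is given below; the dictionary field names and the example camera are assumptions for illustration, not values taken from the comparison data in Appendix B:

# Each criterion that holds adds one point, mirroring the list above.
def feature_points(cam):
    interfaces = cam.get('interfaces', set())
    criteria = [
        cam.get('msx', False),
        cam.get('standard_data_format', False),
        cam.get('radiometric', False),
        cam.get('resolution', (0, 0)) >= (640, 512),
        cam.get('sensitivity_mK', float('inf')) < 100,
        cam.get('emissivity_correction', False),
        'usb' in interfaces,
        'mavlink' in interfaces,
        'hdmi' in interfaces,
        'bluetooth' in interfaces,
        'wifi' in interfaces,
        cam.get('gps_tagging', False),
    ]
    return sum(bool(c) for c in criteria)

example_camera = {'msx': True, 'radiometric': True, 'resolution': (640, 512),
                  'sensitivity_mK': 50, 'interfaces': {'usb', 'mavlink', 'hdmi'},
                  'gps_tagging': True}
print(feature_points(example_camera))  # 8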

Figure 3.2b plots these feature points versus the retail price. This gives a more log-like relationship. The features of a camera determine the price much more than just the image quality. For a price of less than 5,000 euro, thermal cameras are found that implement most basic features. The price then increases rather quickly for fewer added features. These are features like radiometry that require additional hardware, which greatly increases the price of the camera.

Figure 3.2: (a) Camera resolution compared to retail price. (b) Camera feature points compared to price.

3.2 Microservices frameworks

The architecture presented in Section 2.3 relies heavily on the microservices pattern. Therefore, this section aims to present several microservices frameworks to support this architecture. Figure 3.3 depicts the results of the Rethink IT survey, querying the most used frameworks for microservices by developers [54]. The most popular frameworks, Java EE and Spring Boot, are written in Java. The Java EE framework is more of a one-stop-shop framework, offering much more functionality than just a backbone microservices framework, and is therefore not considered. Spring Boot is clearly a very popular and mature framework, more streamlined for microservices. Vert.x is a more upcoming framework, renowned for its performance, making it worthwhile to explore. Python is an upcoming language for web development, and because it is excellent for prototyping, several frameworks for this language are explored as well. The frameworks presented here are Vert.x version 3.5.1, Spring Boot version 2.0, Flask version 0.12, Falcon version 1.4.1 and Nameko version 2.9.0.

Figure 3.3: Rethink IT survey results: most used tools and frameworks for microservices [54]

321 Flask

Flask is a micro web development framework for Python. The term "micro" means that Flask aims to keep its core simple but extensible. Flask is an unopinionated framework, as it only provides a glue layer to build a REST API around the application.

Figure 32 (a) Camera resolution compared to retail price; (b) Camera feature points compared to retail price

Figure 33 Rethink IT: Most used tools and frameworks for microservices results [54]

However, it provides a large list of extensions if extra functionality is required [55]. Starting a microservice is very simple, as illustrated in Listing 1. Flask uses the concept of Python decorators [56] to bind Python functions to a REST API. In Listing 1, for example, the function service_status() is linked to the '/' resource. When a user issues an HTTP GET request on this resource, the route() function on the app object is called by Flask. Because route() is a decorator for the service_status() function, service_status() is wrapped and passed to the route() function, so that when a user issues an HTTP GET request, the service_status() function that was passed gets called. This allows for an easy construction of the REST API: just decorate all the functions of the microservice with the correct Flask decorator.

from flask import Flask

app = Flask(__name__)

@app.route('/')
def service_status():
    return 'service status'

if __name__ == '__main__':
    app.run()

Listing 1 Minimal Flask application

Because Flask is a microframework, its memory footprint is small, with the binary file being only 535 KB. It is in use by several large companies such as Netflix and Reddit [57]. In a production environment the default Flask web server is not sufficient, as it only serves one user at a time. For prototyping, however, it is an excellent framework [55].


322 Falcon

Falcon is a bare-metal Python web framework that differentiates itself in performance when compared to other frameworks. It targets microservices, being even more lightweight and faster than frameworks like Flask. In a benchmark test it achieves 27 times the speed of Flask [58]. The framework seems less mature and has not been adopted by many companies [59]. It is not considered for the prototype of the system, but could be used in production as it achieves better performance.
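As an illustration, a minimal Falcon service, assuming the falcon.API application object of the 1.4 release and the resp.media helper introduced in Falcon 1.3, could look as follows:

import falcon

class StatusResource:
    # Falcon maps HTTP verbs onto on_<verb> responder methods
    def on_get(self, req, resp):
        resp.media = {'status': 'service status'}

# falcon.API() builds a WSGI application; serve it with any WSGI server
app = falcon.API()
app.add_route('/', StatusResource())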

323 Nameko

Nameko is a framework specifically built for building microservices in Python. Next to offering a REST API, it also offers asynchronous events over the Advanced Message Queuing Protocol (AMQP). It is only meant to be used for microservices, not for web applications that serve content. It is a relatively young framework and is not backed by any major companies as of yet. It is, however, backed by the developer of the Flask framework [60].
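A minimal sketch of a Nameko service using the framework's http entrypoint; the service and method names are illustrative, not taken from the prototype:

from nameko.web.handlers import http

class StatusService:
    # Nameko identifies a service by its name attribute
    name = 'status_service'

    @http('GET', '/status')
    def status(self, request):
        return 'service status'

Such a service would be started with the nameko run command, with an AMQP broker such as RabbitMQ available for the RPC and event machinery.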

324 Vert.x

Vert.x is a toolkit for building reactive applications on the Java Virtual Machine (JVM). This framework follows the reactive systems principles. These principles are used to achieve responsiveness and build systems that respond to requests in a timely fashion, even with failures or under load. To build such a system, reactive systems embrace a message-driven approach: all the components interact using messages sent and received asynchronously. Reactive microservices built with Vert.x have the following characteristics: autonomy, asynchrony, resilience and elasticity. Vert.x is a toolkit and can be used as any other library, which makes it very flexible. It provides a large set of features: metrics, different programming languages, different protocols, templating, data access, cluster management, etc.

Vert.x embraces the asynchronous development model, which can be seen in Listing 2.

import io.vertx.core.AbstractVerticle;

public class Server extends AbstractVerticle {
    public void start() {
        vertx.createHttpServer().requestHandler(req -> {
            req.response()
                .putHeader("content-type", "text/plain")
                .end("Hello from Vert.x");
        }).listen(8080);
    }
}

Listing 2 Vert.x example


The event which occurs is the HTTP request. On arrival of the event, the Handler is called and executed. The Handler is chained to a listen request and does not block the calling thread. The Handler is only notified when an event is ready to be processed or when the result of an asynchronous operation has been computed [61].

325 Spring Boot

Spring Boot is an opinionated Java framework for building microservices, based on the Spring dependency injection framework. It allows developers to create microservices with reduced boilerplate and configuration. For simple applications it provides a similar syntax to Flask in Python and uses annotations (similar to decorators) for routing. An example is given in Listing 3. The framework handles most of the routing and request handling, but restricts the developer in application structure. The framework is not lightweight and performs less well than Vert.x [62].

@RestController
@RequestMapping("/api")
public class HelloRestController {

    @RequestMapping(method = RequestMethod.GET, value = "/hola",
                    produces = "text/plain")
    public String hello() {
        return "Hello Spring Boot";
    }
}

Listing 3 Spring Boot example

33 Deployment framework

To allow for the modifiability and interoperability requirements discussed in Section 212 and the different deployment configurations in Section 233, Linux containers (LXC) are used. A container is a lightweight operating system running inside the host system, executing instructions native to the CPU and eliminating the need for the instruction-level emulation that Virtual Machines use. Containers provide an identical, isolated runtime environment for processes without the overhead of virtualization. This makes them perfect for highly portable software, as only the container needs to be moved and it can directly be executed on any system supporting containers [63]. First, the concept of containers is introduced in Section 331. Second, several container frameworks are presented in Sections 332, 333 and 334.

331 Containers

Containers sandbox processes from each other and are often described as the lightweight equivalent of virtual machines. The difference between a virtual machine and a container is the level of virtualization. Virtual machines virtualize at the hardware level, whereas containers do this at the operating system (OS) level. The achieved effect is similar, but there are significant differences. Containers make available protected portions of the OS and share its resources. Two containers running on one OS have their own OS abstraction layer and don't know they are running on the same host. This provides a significant difference in resource utilization. Virtual machines provide access to hardware only, so it is necessary to install an OS. As a result there are multiple OSs running, which gobble up resources. Containers piggyback on the running OS of the host environment. They merely execute in spaces that are isolated from each other and from certain parts of the OS. This allows for efficient resource utilization and for cheap creation and destruction of containers. Consequently, starting and stopping a container is equivalent to starting and stopping an application [64, 65]. This comparison is illustrated in Figure 34.

Containers offer several advantages over running a process directly on the system. Due to the OS virtualization of the containers, software is always deployed on the same operating system, defined in the container. This allows for a 'write once, run everywhere' scenario, which allows for portability of the system to a range of devices. Containers communicate with each other using protocols such as HTTP. This allows the processes in containers to be written in any programming language, using any external library that is needed. For the system this means that if the Producer and Consumer Plugins are packaged as containers, they can effectively be made in any available technology, greatly enhancing the extensibility of the system.

332 LXC

Linux containers are the basis on top of which other container frameworks are built. LXC provides a normal OS environment, similar to a VM. The containers in this framework behave almost identically to a VM and can run multiple processes. LXC can be used directly, but offers only low-level functionality and can be difficult to set up [67].

333 Docker

Docker started as an open-source project at dotCloud in early 2013. It was an extension of the technology the company had developed to run its cloud applications on thousands of servers [64]. Now, Docker is a standalone, mature company providing a software container platform for the deployment of applications [66]. Docker provides two main services: a simple toolset and API for managing Linux containers, and a cloud platform which provides easy access to recipes for software containers created by other developers [68]. Docker is the container technology with the most public traction and is becoming the container standard at the time of writing, due to its functionality and very responsive community. It offers functionality to easily build and run containers, but also to manage them in large clusters. A design decision that limits Docker is that each container can only run one process at a time. Another limitation is the Docker client: Docker consists of a daemon that manages the containers and the API Engine, a REST client. Should this client fail, dangling containers can arise [69].

334 rkt

CoreOS' rkt is an emerging container technology providing an API engine, similar to the Docker API Engine, that can run LXC containers as well as Docker containers. rkt focuses on security and standardization, and is specifically designed to run in cloud environments. Unlike Docker, rkt does not use a daemon process with a REST client; the command line tool executes all the operations, which makes the framework more reliable. rkt is not as mature as Docker yet. It is portable to multiple Linux environments, but is not yet portable to macOS and Windows [70].

Figure 34 Containers compared to virtual machines [66]: (a) container stack, (b) virtual machine stack

34 Object detection algorithms and frameworks

As stated in Section 132, object detection is the computer vision task of detecting which objects are present in an image and where they are located. Several approaches to this problem have been proposed, some of which focus on thermal images. This section aims to give a small overview of the different existing techniques. For the technical details, the reader is referred to the respective articles on the algorithms.

341 Traditional approaches

Traditional approaches include hot-spot detection techniques and Adaptive Boosting (AdaBoost) with various feature extraction techniques, such as Aggregated Channel Features (ACF) and Integral Channel Features (ICF). These methods rely on clever feature engineering solutions that use domain knowledge or statistical insights to transform the raw dataset into a specific set of features in order to find patterns [32].

Hot-spot detection

Hot-spot techniques work on the assumption that people have an overall higher body temperature than most of the background in the thermal image. These techniques first select candidate objects: the hot-spots in the image. The hot-spots define the regions on which a classifier is run and are thus the localization step in the object detection problem. Afterwards, a classifier is trained on these candidates. Xu et al. used a Support Vector Machine (SVM) classifier to classify whether a hot-spot represented a pedestrian [71]. Nanda et al. used a Bayes classifier to classify the hot-spots [72]. These methods are generally not applicable because people often are not the only hot-spots in thermal images.
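As an illustration, the candidate-selection step could be sketched as follows with OpenCV; the threshold value and minimum blob area are assumed values, and the two-value return of findContours from OpenCV 4 is assumed:

import cv2

def hot_spot_candidates(thermal_gray, threshold=200, min_area=50):
    # Keep only pixels warmer than the threshold (the hot-spots)
    _, mask = cv2.threshold(thermal_gray, threshold, 255, cv2.THRESH_BINARY)
    # Group the warm pixels into connected regions
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    # Return bounding boxes of sufficiently large regions; a classifier
    # (e.g. an SVM as in Xu et al.) is then run on each box
    return [cv2.boundingRect(c) for c in contours
            if cv2.contourArea(c) >= min_area]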

AdaBoost

AdaBoost is a machine learning algorithm that utilizes the output of so-called weak learning algorithms (weak learners) and combines their outputs into a weighted sum that forms the output of the boosted classifier. AdaBoost modifies the weak learners in favor of data points misclassified by previous classifiers [73]. Viola and Jones built a detection algorithm that uses two consecutive frames of a video sequence and trains the AdaBoost classifier on both motion and appearance information [74]. Davis et al. use a two-stage template approach that initially performs a fast screening procedure with a generalized template, using a contour saliency map to locate potential person locations. Any window located in the first phase is then forwarded to the AdaBoost algorithm to validate the presence of the person. Dollár et al. extracted features using ICF and ACF [35]. ICF and ACF compute features by calculating several aggregations over the different channels of an image, such as gradient, color histogram and colors. Goedemé et al. expanded these detectors with extra thermal channels to achieve results comparable to those of Dollár et al., but for thermal images [36].
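For reference, the boosted classifier produced by AdaBoost takes the standard form (common to the variants cited above):

$H(x) = \operatorname{sign}\left(\sum_{t=1}^{T} \alpha_t h_t(x)\right)$

where $h_t$ is the $t$-th weak learner and $\alpha_t$ the weight AdaBoost assigns to it during training; each training round re-weights the data points so that examples misclassified in earlier rounds weigh more heavily in later ones.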

342 Deep learning

Over the past few decades, there has been a shift in proposed solution methods towards deep learning. Deep learning for object detection uses Convolutional Neural Networks (CNNs). CNNs are a specialized kind of neural network for processing data that has a known grid-like topology, such as images. CNNs generally consist of three steps: a convolution step that creates a feature map of a region of an image, a pooling step that summarizes the output of the convolution step and, finally, a fully-connected network that learns from the features extracted in the previous steps [75]. The key difference is that these algorithms do the feature extraction in the convolutional layers and do not need feature engineering like the algorithms presented in Section 341. This requires quite a bit of computing power when compared to the traditional methods. Since deep learning made the shift to computing on Graphical Processing Units (GPUs), computations became feasible and these models proved to achieve very good performance on various machine learning problems. Two model types are described: two-stage networks (R-CNN, R-FCN) that extract image regions first and make separate predictions on each region, and dense networks (YOLO, SSD, NASNet, RetinaNet) that operate on the image as a whole.

Region-based Convolutional Network (R-CNN)

R-CNN uses a selective search method to find objects, an alternative to an exhaustive search of an image. It initializes small regions in an image and merges them hierarchically. The detected regions are merged according to color spaces and other similarity metrics [76]. R-CNN combines this selective search with a CNN per region to find out which objects are present in these regions [77].


Fast(er) Region-based Convolutional Network (Fast(er) R-CNN)

Fast R-CNN was developed to reduce the time consumption related to the high number of models necessary to analyze the region proposals from the selective search method in R-CNN. Instead of using a CNN for each region, a single CNN with multiple convolutional layers is used [78]. Faster R-CNN drops the region proposals detected with the selective search method (which is computationally expensive) and introduces the Region Proposal Network (RPN) to directly generate region proposals. This accelerates training and testing and improves performance [79]. Mask R-CNN is an extension of the Faster R-CNN model that adds a branch, parallel to the bounding box detection, to predict object masks, that is, the pixel-wise segmentation of an object in the image [80].

Region-based Fully Convolutional Network (R-FCN)

R-FCN takes a more efficient approach to region detection. Instead of applying a per-region subnetwork multiple times, R-FCN uses a fully convolutional network with computations shared across the entire image. This allows it to be compatible with multiple backbone networks, such as Residual Networks [81].

You Only Look Once (YOLO)

The previously discussed methods need to run the same computations on different parts of an image multiple times before generating a prediction, which makes them relatively slow. The YOLO model [82] was developed with the requirement to make predictions as fast as possible, trading off accuracy for speed to move towards real-time object detection. YOLO directly predicts bounding boxes and class probabilities with a single CNN in a single evaluation, instead of first detecting object regions and predicting classes afterwards. This has some benefits over the other methods. YOLO is very fast when compared to other methods, capable of processing images in real time, up to 155 frames per second for some variants. It also learns contextual information because it trains on entire images instead of regions. YOLO also generalizes better to other image types. All these benefits come at the cost of accuracy: YOLO struggles to precisely localize some objects, especially small ones. The subsequent versions of YOLO focus on delivering more accuracy. The algorithm is currently in its third version [83].

Single-Shot Detector (SSD)

SSD [84] is similar to YOLO and predicts all the bounding boxes and class probabilities in one single evaluation (single shot) using one CNN. The model takes an image as input, which passes through multiple convolutional layers. When compared to YOLO, SSD achieves higher accuracies by adding convolutional layers and including separate filters for detections of different aspect ratios.

Neural Architecture Search Net (NASNet)

NASNet takes a different approach and does not design the network architecture for the object detection beforehand, but instead trains a Recurrent Neural Network (RNN) to generate the model description of the CNN that performs the object detection. The RNN is trained using reinforcement learning. The NASNets built for object detection perform as well as most networks, but are slower to train [85].

RetinaNet

RetinaNet is the latest state-of-the-art object detector. It is a simple dense detector, similar to YOLO and SSD, but matches the accuracy of two-stage detectors like the R-CNN variants. RetinaNet proposes that the foreground-background class imbalance encountered when training dense detectors leads to lower accuracy compared to two-stage detectors. RetinaNet uses a new loss function called Focal Loss that focuses training on a sparse set of hard examples to counter this class imbalance, which results in very good performance and very fast detection [86].
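For reference, the Focal Loss from [86] rescales the standard cross-entropy loss so that well-classified (mostly background) examples contribute little to the total loss:

$FL(p_t) = -(1 - p_t)^{\gamma} \log(p_t)$

where $p_t$ is the model's estimated probability for the ground-truth class and $\gamma \geq 0$ is a tunable focusing parameter; for $\gamma = 0$, the Focal Loss reduces to the ordinary cross-entropy loss.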

343 Frameworks

While the previous Sections focused on the different algorithms, actually implementing these algorithms is not straightforward. That is why, over the past years, several deep learning frameworks have emerged that try to provide easier access to this technology. Some frameworks provide APIs for some of the object detection algorithms presented above. This section gives a small overview of these frameworks. Most frameworks differ quite a bit from each other, which makes porting a model from one framework to another rather difficult. The Open Neural Network Exchange Format (ONNX) initiative hopes to propose a standard for interchangeable models, which should make switching among frameworks easier in the future [87]. Note that there are other frameworks available, but those do not yet support object detection functions out of the box.

TensorFlow

Perhaps the most well-known framework, TensorFlow is an open-source machine learning library for neural networks with a Python interface. It was developed by Google for internal use and released to the public in 2015 [88]. Recently, an Object Detection API has been built for TensorFlow, which implements models pre-trained on benchmark datasets, such as SSD, Faster R-CNN, R-FCN and Mask R-CNN [89]. TensorFlow offers a lot of flexibility in its use and can be applied to many machine learning problems.

Darknet

Darknet is an open-source neural network framework written in C and CUDA. It is maintained by Joseph Redmon, the person behind the YOLO algorithm [90]. Darknet does not offer the flexibility that other frameworks offer, but is easy to install and use when compared to others. Out of the box, Darknet offers an interface for YOLO. The open-source community offers some ports of this framework to other popular frameworks, such as TensorFlow.


CNTK

The Microsoft Cognitive Toolkit (CNTK) is an open-source toolkit for distributed deep learning. It offers a Python, C# or C++ interface. It is one of the first frameworks to support ONNX. CNTK offers an API for Fast R-CNN and Faster R-CNN [91].


35 Technology choice

This Section presents the choices made for each technology described in the previous Sections.

351 Thermal camera

The FLIR One Pro and Therm-App were selected as thermal cameras for the proof of concept. Both offer relatively high-quality images: 160 x 120 pixels and 320 x 240 pixels respectively. This is of course relative to their price: 469 and 937.31 euro respectively. These prices are at the low end of the product ranges offered. Both cameras are designed for use on a smartphone, which makes them ideal for prototyping, since these devices are widely available and setting up the camera via the apps from the respective companies is easy. Both cameras provide MPEG-4/h264 encoded video output, easily understood by most playback software. Both cameras can be found in the lower left of Figure 32b.

For deployment in production-ready applications with drones, these cameras are not the best choice. They aren't specifically designed to be used on a drone and don't offer the best image quality possible. In those applications, platforms like the FLIR Vue, Duo, Zenmuse or Workswell Wiris are better candidates due to their superior image quality, MAVLink interfaces, compatibility with commercially available gimbals to mount them on drones, and other features.

352 Microservices framework

Flask is selected as the microservices framework. The arguments for Flask are as follows. Flask is a mature web framework with major companies backing it, which means the APIs stay consistent and the framework is stable in use. When compared to some other frameworks like Spring Boot, Flask is unopinionated, which allows for maximum flexibility during development. Flask also has a very small memory footprint, which makes it easier to deploy on less powerful on-board devices like drones. Flask is also easy to use and quick to set up, ideal for developing a proof of concept. A final argument is the familiarity of the author with Flask.

353 Deployment framework

Docker is selected as the deployment framework. Docker is the most mature and well-supported container framework at the time of writing and will likely remain important in the future. It offers the most features and is specifically designed for the microservices pattern [68].

354 Object detection

One of the requirements specified in Section 21 is real-time streaming. Real-time object detection is only achieved by a few of the models presented in Section 34: candidates are YOLO, SSD and RetinaNet. As there is no framework that provides an out-of-the-box implementation of the RetinaNet algorithm at the time of writing, this algorithm is not selected. SSD is implemented in the TensorFlow Object Detection API. However, at the time of writing this API has not been found stable: when trying out the API, fallbacks to older versions of the software were needed to be able to test the models. This was due to the Object Detection API using older versions of the TensorFlow framework. Therefore, YOLO, implemented in the darknet framework, is selected. Darknet offers a stable distribution, and YOLO achieves good results and has proven to be a very fast detector, capable of real-time object detection.


Chapter 4

Proof of Concept implementation

To prove the concept of the architecture discussed in the previous chapters, a prototype is implemented. First, the goals and the scope of the prototype are presented in Section 41. Next, the components of the prototype are presented in Section 42. Finally, the known limitations and issues of the prototype are presented in Section 43.

41 Goals and scope of prototype

The goals of the prototype are to prove the QARs defined in Section 21. The prototype focuses on the ASRs: performance, interoperability and modifiability. The usability, security and availability requirements are left out of the scope of the prototype because they are not ASRs and require significant resources (focus groups, long-term deployment, etc.) to test.

The components that are implemented in the prototype are the Client Interface, Stream, Consumer and Producer, because they represent the core functionality of the framework: building image processing application streams using plugins. The Producer and Consumer Distribution components enable third-party plugin developers to add their functionality to the framework. These are distribution functionalities, which are out of scope for the prototype. The prototype only supports one video stream. All functions presented in Figure 21 are implemented, with the exception of 'Install plugin', 'Uninstall plugin', 'Add plugin', 'View plugin', 'Remove plugin' and 'Update plugin', as they are only supported via the Producer and Consumer Distribution components. The prototype is deployed on a local device; distributed deployment configurations require small changes in the implementation (see Section 43).

42 Overview of prototype

421 General overview

The prototype consists of four main components: a cli, streamer, producer and consumer. The cli process is the Client Interface, implemented as a textual Command Line user Interface (CLI), which allows a user to interact with the prototype through textual commands in a shell. This process is deployed on the local machine. The streamer, producer and consumer processes are deployed as microservices in their own Docker containers. The prototype is initialized through the cli, which spins up the Docker containers of the other processes. This is achieved with the tool docker-compose. Compose is a tool for defining and running multi-container Docker applications. The compose YAML file defines the configurations for the microservices; Compose uses these configurations to start and stop the application with a single command [92]. A snippet of the compose file for the application is given in Listing 4. Containers are specified as services; the example service configuration given is that of the producer. First, the name of the container is specified, which overwrites the default name, as the container name is used as hostname for the container in Docker [93]. The build configuration specifies where the container build recipe is situated. The port mapping allows processes from the localhost to access processes in the container; for the producer service this is only used for debugging. The volumes configuration specifies folders from the host to be mounted in the container. This configuration mounts in the source code and resources. It also provides access to the Docker socket to allow interaction with the Docker host (see Section 424).

services:
  producer:
    container_name: producer
    build:
      context: ./producer
      dockerfile: Dockerfile
    ports:
      - "8080:80"
    volumes:
      - ./producer:/usr/producer
      - /var/run/docker.sock:/var/run/docker.sock

Listing 4 docker-compose.yml snippet of the prototype

All containers are connected to a Docker bridge network [93] for communication. A bridge network uses a software bridge to allow connected containers to communicate, while providing isolation from containers that are not connected to that bridge network. The bridge network applies to containers running on the same Docker host; the network is thus confined to the local Docker host and is not distributed over different devices. The bridge network has some advantages:

• The bridge provides better isolation and interoperability between containers. Containers automatically expose all ports to each other and none to the outside world.
• The bridge provides automatic Domain Name System (DNS) resolution between containers. This means that containers resolve the IP address of each other by container name or alias.
• Containers can be attached to and detached from the networks on the fly.
• Environment variables are shared, which can be used to provide equal environment configurations for every container on the bridge.


422 Client interface

The Client Interface is implemented by the cli component. The cli is built in Python with the Click package by Armin Ronacher [94]. Click is a CLI creation kit which aims to make the implementation of CLIs easier. It resembles the Flask framework, as it also leverages Python decorators [56] for most of its functionality. The source code of the cli is located in the mosquito.py file. Commands can be executed by calling python mosquito.py or by calling mosquito if the source code is installed into the Python environment. The following commands are implemented:

• mosquito: Displays a help page listing command groups.
• mosquito on: Starts the application.
• mosquito off: Shuts down the application.
• mosquito plugins: Groups all commands to manage plugins. Plugins can only be listed, not installed or uninstalled, as the Remote Producer and Remote Consumer are not implemented.
• mosquito plugins ls: Lists all locally installed plugins.
• mosquito stream: Groups all commands to manipulate the current stream.
• mosquito stream add: Adds a producer or consumer to the stream.
• mosquito stream delete: Deletes a producer or consumer from the stream.
• mosquito stream elements: Lists all producers and consumers that were added to the stream.
• mosquito stream link: Links two stream plugins.
• mosquito stream pause: Pauses the stream.
• mosquito stream play: Plays the stream. This means the stream is processing media.
• mosquito stream print: Prints the stream layout (which plugins are linked).
• mosquito stream stop: Stops the stream.
• mosquito stream view: Views the stream on the local device.

A typical use of the application would be the following. First, the application is started using mosquito on. Then, plugins are added to the stream using mosquito stream add [ELEMENT_TYPE] [ELEMENT], which instantiates the corresponding plugins in the Producer and Consumer components. The plugins are linked in order using mosquito stream link [ELEMENT_1] [ELEMENT_2]. The stream is then set to play using mosquito stream play. When the last plugin is linked to the special local plugin, the user can view the output from that plugin using mosquito stream view, which opens up a window in which the stream is displayed.
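As an illustration, the stream add command could be implemented with Click roughly as follows; this is a hypothetical sketch, with the streamer URL and the JSON payload as assumed details:

import click
import requests

@click.group()
def mosquito():
    """Entry point of the mosquito CLI."""

@mosquito.group()
def stream():
    """Commands that manipulate the current stream."""

@stream.command()
@click.argument('element_type')
@click.argument('element')
def add(element_type, element):
    """Add a producer or consumer plugin to the stream."""
    # Forward the command to the streamer's REST API (assumed endpoint)
    requests.post('http://localhost:5000/elements',
                  json={'type': element_type, 'name': element})

if __name__ == '__main__':
    mosquito()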


As specified in the software architecture (see Section 23), the Client Interface can use the Stream Commands interface of the Stream component. As specified in Section 231, this interface is a REST API, so the client can use it over the HTTP protocol. This is done with the Python Requests library [95].

423 Stream

The Stream component is responsible for the logical representation of the stream (see Section 231) and is implemented as the streamer component. The component consists of three objects: api, which contains the REST API; StreamManager; and the Stream object, representing the Stream Model in the framework. Requests to the other microservices are sent using the Python Requests library. The prototype implementation only supports one stream with a chain-like model; this means that, unlike the stream depicted in Figure 26, a plugin can't have multiple sources or multiple listeners. The Stream object manages the logical representation of the stream and manipulates the references to the plugins by forwarding commands to the producer and consumer components respectively. It contains two data structures: outline, which is the logical structure of the stream, and elements, which contains all the plugins present in the stream. In the prototype, the Stream component provides the following functionalities on its API endpoints (a sketch of one endpoint follows the list):

• /plugins: GET. Fetches all the plugins from the producer and consumer components and returns their information.
• /elements: GET, POST, DELETE. Resource to add and delete plugins from the elements bin.
• /stream/links: POST. Resource to create links between elements.
• /stream/state: GET, PUT. Resource to update the state.
• /shutdown: POST. Shuts down the framework.
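A hypothetical sketch of the /stream/state resource with Flask; the Stream stand-in shown here is an assumption for the sketch, not the actual prototype code:

from flask import Flask, jsonify, request

class Stream:
    # Minimal stand-in for the prototype's Stream model (assumed shape)
    def __init__(self):
        self.state = 'STOP'

    def set_state(self, state):
        # The real prototype forwards the state change to every plugin
        self.state = state

app = Flask(__name__)
stream = Stream()

@app.route('/stream/state', methods=['GET', 'PUT'])
def stream_state():
    if request.method == 'PUT':
        # e.g. a JSON body such as {"state": "PLAY"}
        stream.set_state(request.get_json()['state'])
    return jsonify(state=stream.state)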

Since the streamer component is the only component of the framework that interacts with outside users, it has the responsibility to gracefully shut down the framework. This is needed to solve the problem of dangling plugin containers: containers that run plugins that have not been stopped and removed after closing the application. Since only plugins that are contained in a stream have a running container associated with them, the stream can notify the Producer and Consumer components to stop and remove those containers.

424 Producer and Consumer

The Producer and Consumer components cover similar responsibilities in managing installed plugins. They are implemented in the producer and consumer components. Both components consist of the following objects: api, which contains the REST API; the Kernel, which implements the core functionality; the PluginManager, which finds plugins installed on the device and checks whether their installation is valid; and the Plugin, which is the logical representation of a plugin as described in Section 231. Commands to control the plugins are made using the Python Requests library.


For the component to be able to start, stop and interact with the plugin containers, it needs access to the Docker host and the Docker client running on that host. But because the component is running in its own container, it is isolated from the Docker host and can't interact with the Docker client by default. The workaround for this problem is to expose the socket on which the Docker client is running on the Docker host to the container. This is done by mounting the Docker socket of the host on the Docker socket in the container. In Docker compose, the mounting is achieved as shown in Listing 5.

volumes:
  - /var/run/docker.sock:/var/run/docker.sock

Listing 5 Mounting the Docker socket on the container

This has some implications for security (see Section 43). To interact with the now exposed Docker client, the component uses the docker-py library [96]. Listing 6 shows how a connection is made to the Docker client and a plugin container is started. The container is started from the plugin image, on the network of the framework, and is given the plugin name as the container name. Docker thus creates a DNS entry with the plugin name, which makes the container addressable by its name. This implementation limits the framework to one running container per plugin at any time.

import docker

client = docker.from_env()
container = client.containers.run(
    image=plugin_name,
    detach=True,
    name=plugin_name,
    network='mosquito_default'
)

Listing 6 Starting a plugin container

When both components are initialized, the Kernel and PluginManager are created. The PluginManager searches for a plugin_directory, which contains information on which plugins are installed on the device. Each installed plugin should have a valid image on the device; these are contained in the images directory of the Docker daemon. If the image or information file cannot be found on the device, the plugin is marked as broken and can't be used by the framework. To describe the API, the consumer API is used; the producer API is analogous, but replaces consumer with producer and doesn't have the sources endpoints. The Producer and Consumer components provide the following functionalities on the API endpoints:

on the API endpoints

• /consumers: GET. Retrieves a list of the installed consumers on the device on which the component is running.
• /consumers/<hostname>: GET, DELETE. Retrieves the information of the consumer specified by the hostname value, which is the name of the consumer.
• /consumers/<hostname>/state: GET, PUT. Retrieves or respectively updates the state of the consumer specified by the hostname value.
• /consumers/<hostname>/sources: GET, POST. Retrieves the sources of, or respectively adds a new source to, the consumer specified by the hostname value.
• /consumers/<hostname>/sources/<source_hostname>: GET, PUT, DELETE. Retrieves, updates or removes the source specified by source_hostname of the consumer specified by hostname, respectively.
• /consumers/<hostname>/listeners: All listeners resources are analogous to the sources resources.

425 Implemented plugins

Three plugins are implemented and tested: filecam (called 'Mycam' in the code), a producer that reads in a video file and transmits it in MJPEG encoding using the RTP protocol; testsrc, a producer that generates test video and transmits it in MJPEG encoding using the RTP protocol; and local, a consumer that captures incoming RTP MJPEG video frames and displays them on the local display. The filecam and local plugins are discussed, since testsrc is similar to filecam.

The plugins are implemented in Python and use the GStreamer library with its Python bindings [97] for media streaming and the Flask framework to implement the API. These libraries don't have to be used by future plugins, which can just implement a REST API and provide a media stream as specified in their descriptions.

Filecam plugin

The filecam image is based on the Ubuntu 17.10 image. It is chosen over lighter Linux distributions because it offers more functionality out of the box for prototyping. Other dependencies are Python 3.6, GStreamer 1.12, the GStreamer plugins-base, plugins-good, plugins-bad, plugins-ugly, libav, doc and tools packages, and python-gst.
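A hypothetical sketch of the corresponding Dockerfile; the package names are assumed Ubuntu 17.10 package names and the entry script filecam.py is an assumed name:

FROM ubuntu:17.10

# GStreamer 1.12 with the plugin sets and Python bindings listed above
RUN apt-get update && apt-get install -y \
    python3.6 \
    gstreamer1.0-tools \
    gstreamer1.0-plugins-base \
    gstreamer1.0-plugins-good \
    gstreamer1.0-plugins-bad \
    gstreamer1.0-plugins-ugly \
    gstreamer1.0-libav \
    python3-gst-1.0

COPY . /usr/filecam
WORKDIR /usr/filecam
CMD ["python3", "filecam.py"]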

The API of the plugin offers the following functionalities:

• /state: GET, PUT. Retrieves and respectively updates the state of the plugin.
• /listeners: GET, POST. Retrieves and respectively adds a listener on the plugin.
• /listeners/<hostname>: GET, PUT, DELETE. Retrieves, updates and respectively deletes a listener on the plugin.

The implemented GStreamer pipeline is depicted in Figure 41.

Figure 41 filecam GStreamer pipeline

The pipeline consists of the following GStreamer elements:

1. filesrc: GStreamer element that reads data from a file in the local file system. This file can have any extension and is not limited to video or audio files [98]. The location property is set to the location of the file in the plugin container.

2. decodebin: GStreamer bin that automatically constructs a decoding pipeline using available decoders and demuxers via auto-plugging [99]. Note that for some media containers and codecs the appropriate decoders must be installed. For example, to decode the MPEG streams contained in MP4 files, an h264 decoder is needed, which can be found in the 'libav' GStreamer plugins library.

3. jpegenc: GStreamer element that encodes raw video into JPEG images [100]. This implements the MJPEG video stream, as all video frames are encoded as JPEG images.

4. rtpjpegpay: GStreamer element that payload-encodes JPEG images into RTP packets according to RFC 2435 [101].

5. udpsink: GStreamer element that sends UDP packets into the network. When combined with an RTP payload plugin, it implements RTP streaming [102]. The host and port properties are set to the hostname and port of the listener of the plugin.

This pipeline is implemented using the Python GStreamer bindings. The process consists of creating each GStreamer element, adding them to the GStreamer pipeline and linking the elements in order of appearance in the pipeline. The decodebin and jpegenc elements can't be linked when created, because there is no default sink pad available on the decodebin. Because the decodebin needs to decide how to decode the media, it needs the pipeline to be processing media. If no media is flowing, the decodebin can't know what decoder it needs to decode the media and what media it can offer to the sink element. Therefore, the process of dynamic linking is used [103]. All elements that can be linked when the pipeline is not in the PLAYING state are linked. A handler is registered on the 'pad-added' signal, which is emitted when a new pad is added on the decodebin, indicating that it can forward media downstream. When media is flowing through the pipeline, the decodebin creates new pads when it can generate output data and emits the 'pad-added' signal. A callback is performed on the handler, which links the decodebin with the jpegenc. Listing 7 illustrates this concept.

# Callback handler
def on_pad(source, pad, sink):
    # Get the sink pad from the sink element
    sink_pad = sink.get_static_pad('sink')
    # Get the pad type
    pad_caps = pad.get_current_caps()
    pad_type = pad_caps.get_structure(0).get_name()
    # Only if the pad offers raw video the link is made
    if pad_type == 'video/x-raw':
        # Perform the dynamic link
        pad.link(sink_pad)
    # Other pad types are ignored

filesrc = Gst.ElementFactory.make('filesrc')
decodebin = Gst.ElementFactory.make('decodebin')
jpegenc = Gst.ElementFactory.make('jpegenc')
# ... (create other elements and add all elements to the pipeline)

# Only filesrc and decodebin can be linked statically
filesrc.link(decodebin)

# Register the on_pad handler on the 'pad-added' signal
handler_id = decodebin.connect('pad-added', on_pad, jpegenc)

# Set the pipeline to PLAYING; the callback will perform the dynamic link
pipeline.set_state(Gst.State.PLAYING)

Listing 7 Dynamic linking of the decodebin and jpegenc

Local plugin

The local plugin captures an incoming media stream and displays it on the local display. This plugin is special with respect to the other plugins in that it is not deployed in a Docker container: it runs natively via the cli on the host, to allow access to the local display. This version is built for macOS High Sierra (version 10.13.4) and uses GStreamer 1.12 with the GStreamer plugins-base, plugins-good, plugins-bad, plugins-ugly, libav, doc and tools packages to receive an incoming stream. When a plugin links to the local plugin, the Stream component does not instruct the Consumer component to start the plugin, but instead links the plugin to the local host. For macOS, the address of the host is host.docker.internal. The GStreamer pipeline used by the plugin is depicted in Figure 42.

Figure 42 local plugin GStreamer pipeline

The pipeline consists of the following elements:

1. udpsrc: GStreamer element that reads UDP packets from the network [104]. The port property is set to the port to which the source is transmitting media.

2. rtpjpegdepay: GStreamer element that retrieves JPEG images from the received RTP packets [105]. This element can't process the media received from the udpsrc directly, because it can't know what type of data it will be receiving. Between the pads, a 'capabilities filter' is placed, which informs the elements of the type of data that will be flowing through. In this case the capabilities are application/x-rtp, which tells that there will be RTP packets coming through; encoding-name=JPEG, which tells that the payload of the RTP packets is JPEG images; and payload=26, which also tells that the encoding is JPEG, according to RFC 3551 [50, 106].

3. jpegdec: GStreamer element that decodes JPEG images [107].

4. autovideosink: GStreamer element that automatically detects an appropriate video sink and forwards the video to it [108].
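A minimal sketch of this receive pipeline using the GStreamer Python bindings and a gst-launch-style description string; the port number 5000 is an assumed value:

import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst

Gst.init(None)
# udpsrc -> capsfilter -> rtpjpegdepay -> jpegdec -> autovideosink
pipeline = Gst.parse_launch(
    'udpsrc port=5000 '
    'caps="application/x-rtp, encoding-name=JPEG, payload=26" '
    '! rtpjpegdepay ! jpegdec ! autovideosink'
)
pipeline.set_state(Gst.State.PLAYING)
# A GLib.MainLoop (or equivalent) would keep the pipeline running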

43 Limitations and issues

The implementation presented is a prototype and a slimmed-down version of the architecture presented in Section 23. The following limitations and issues remain.

431 Single client

The current implementation deploys the Flask framework (on which each microservice relies) on the built-in Flask web server (Werkzeug), which is provided for development convenience. It is only built for use by a single user and by default can only handle one request at any given moment, which implies that the framework can also only be used by a single user [109].

432 Timeouts

The framework does not perform checks on request timeouts when passing commands to components and plugins. This can be a problem when the framework is deployed on several devices and the request latency is much higher. In case of timeouts, the framework will keep waiting for a response, which leads to a crash.

433 Exception handling and testing

The framework is only tested for the so-called 'happy path': the default scenario featuring no exceptional or error conditions. Some alternate paths are handled, but most still need to be tested. An example scenario would be one of the plugin containers in a stream failing and stopping: the framework is not able to detect this and will assume that the container is still running.

434 Docker security issues

The Docker client communicates with the daemon process dockerd using a socket. This socket is a UNIX domain socket called /var/run/docker.sock. The daemon is highly privileged, having root access to the host system; any process that can write to this socket effectively has root access. To allow the components of the framework to manipulate the plugin containers, they need access to this socket. Therefore, the socket is mounted in the container, which gives the container write access to the socket. This implies that the container now has root access on the host when writing to this socket. Because the container gets root access to the host, an attacker can walk the file tree of the host and extract sensitive information or run unwanted software. This type of attack is known as a 'Docker Breakout' or 'Container Escape' attack [110, 111].

435 Docker bridge network

The current implementation deploys the framework on a Docker bridge network, which can only be used if the framework is deployed on a single device. To deploy the framework on multiple devices, the framework must be deployed using a Docker overlay network [112].

436 Single stream

The implementation supports one stream, which must be a chain. Multiple streams, or streams in tree form that merge media from multiple sources and broadcast to multiple listeners, are not supported.

437 Number of containers per plugin

The framework uses the name of the plugin as the identifier for its containers. The name is also the hostname on which the container can be reached. Therefore, there can only be one active container associated with a plugin at runtime.


Chapter 5

Mob detection experiment

To try out an actual drone thermal imaging application, the mob detection experiment is carried out. The goal of this experiment is to use existing object detection algorithms on a dataset of thermal images to try to detect large crowds of people, hereinafter referred to as a mob.

Several public datasets of thermal images exist. Most datasets focus on the detection of people in scenes [113–117], some on face recognition [118, 119], others on vehicle recognition [120]. Most of these datasets are freely available through the OTCBVS Benchmark Dataset Collection [121]. No datasets containing large numbers of people were found, so the Last Post thermal dataset was created for the detection of mobs and other analysis tasks. This dataset is presented in Section 51.

To detect mobs in the images of the dataset, a deep learning approach using neural networks is explored. The selection and training of the model are described in Section 52.

51 Last Post thermal dataset

The Last Post dataset consists of videos of the Last Post ceremony taking place each night at 8:00 PM (Brussels timezone) under the Menin Gate in Ypres, Belgium. Section 511 gives some insight into this unique ceremony. The full dataset is described in Section 512.

511 Last Post ceremony

The Last Post ceremony is a nightly ceremony taking place under the Menin Gate in Ypres at 8:00 PM sharp. The ceremony is held in remembrance of the fallen soldiers of World War I (1914–1918). The Last Post Association [122] states its mission as follows:

True to its statutes, the Last Post Association wishes to honor and remember the soldiers of the British Empire who gave their lives during the Great War of 1914–1918. The Last Post ceremony seeks to express, day after day, the lasting debt of gratitude which we all owe to the men who fought and fell for the restoration of peace and the independence of Belgium.


Figure 51 gives an impression of the size of the ceremony. Because of the sheer number of people that gather under the gate each day, the Last Post is a unique open-air event with repeatable conditions for capturing footage; the event was therefore a perfect opportunity to create the dataset.

Figure 51 Last Post ceremony panorama

512 Dataset description

Due to legislation in Belgium, drones cannot be flown in public areas without certification and a permit from the authorities. The creation of real aerial thermal images with a drone was thus not feasible. Therefore, an elevated position on the walls next to the Menin Gate (in order to simulate aerial images) was used to capture footage of the adjacent square on one side and the bridge on the other side. Figure 52 shows the locations where the video footage was captured.

Figure 52 Locations where the video footage was captured. The black stars represent the captured scenes, the red stars the locations from where the scenes were filmed.

The data was recorded with the FLIR One Generation 3 Pro camera for Android devices, hereafter referred to as "Camera" [123]. Since thermal images don't hold color information, a color scheme is used to represent the relative differences in temperature: the 'Iron' color scheme, which maps colder sections of a scene onto blue colors and warmer sections onto red and yellow colors.


The videos are encoded using the H.264 MPEG-4 codec. Decoded, the color information is captured in 4:2:0 YUV format. The frame rate of the videos varies from 7 Hz to 8 Hz, depending on the speed of the objects in the scene. Sound is present in the videos, encoded with the MPEG AAC codec. For a full list of sequences, the reader is referred to Appendix C.

The two locations that make up the main scenes in the dataset are presented in Figure 53. The thermal and visual images of each scene are depicted next to each other. The thermal and visual images were not captured at the same time, so the mobs that are present in the thermal images can't be seen in the visual images. In both scenes, buildings are present that are quite warm compared to the surroundings, as can be seen in the thermal images. In Figure 53a it even becomes difficult to recognize the mob when they are standing close to the building. This is less the case for Figure 53c, where, due to the water present in the image, the mob has higher contrast because of the larger difference in emitted heat. Towards the far right of the image, the mob seemingly disappears into the background. The effect of two objects having a similar heat signature and no clear transition between them in thermal images is defined as thermal camouflage, a technique often used by animals and military units [124]. This effect is even visible when looking at the mobs present in both images: because people are standing so close together, it becomes difficult to recognize individual persons in the crowd.

(a) Thermal view of the square in location A (b) Visual view of the square in location A

(c) Thermal view of the bridge in location B (d) Visual view of the bridge in location B

Figure 53 Main scenes in the Last Post dataset


52 Object detection experiment

521 Preprocessing

The Last Post dataset was not used in its entirety for training the model, because there were not enough resources to manually annotate every image. Therefore, a smaller subset was used to train a baseline model.

The following videos were used: 2018-04-10 195029.mp4, 2018-04-10 200122.mp4, 2018-04-04 202859.mp4, 2018-04-10 202558.mp4 and 2018-04-04 200052.mp4, captured on the fourth and tenth of April 2018. These videos were used because of their contents: they contain images from locations A and B respectively, in which the mob behaves more dynamically compared to the other videos. This was due to a marching band present on the fourth of April and a marching army unit on the tenth of April. See Appendix C for a summary of the contents of these videos. From these videos, images were extracted at a capture rate of 1 Hz. Each image was manually labelled using the Microsoft Visual Object Tagging Tool [125]. The tool allows exporting the training images to various formats, such as Pascal VOC for TensorFlow, YOLO and Microsoft CNTK.

Within the data, several visual outliers are present. An outlier is an observation point that is distant from other observations. It is created due to variability in capturing the videos or indicates an experimental error [126]. The errors detected here are of the latter form and are depicted in Figure 54. The first type of outlier is a system fault in the Camera: due to an error in the processing of the video, the Camera would sometimes not register any input. This causes the Camera to produce completely black images, as depicted in Figure 54a. The Camera software maps temperatures onto colors in the image. The variations of the colors are relative to the temperature interval, ranging from the minimum to the maximum temperature detected by the Camera. If the minimum and/or maximum detected temperature changes, the Camera needs to adapt its color mapping. This causes the Camera to fade to bright colors for a short period of time (1 to 2 seconds); the resulting image is depicted in Figure 54b. Because the resulting image is too bright and objects are hard to detect, it is considered an outlier. Due to instabilities when capturing footage of sequences with fast motion, some images are very blurry. This makes it hard, even for a person, to decide what is visible in the frame; therefore it is considered an outlier. This is depicted in Figure 54c. Sometimes people would pass in front of the Camera, which resulted in brightly colored areas in the videos that were not part of the scene and therefore form another type of outlier, depicted in Figure 54d. Because the presented outliers are experimental errors and do not belong in the scenes, they were removed from the dataset.

522 Training

The model that is used for training is YOLOv3, implemented in the darknet neural network framework [83]. The model is trained using convolutional weights that are pre-trained on the ImageNet database [127]. The concept of reusing weights from a model previously trained on large datasets is known as transfer learning. When choosing a pre-trained model, it is very important that the problem statement of the pre-trained model is close enough to the current problem statement. For the model pre-trained on ImageNet, this was to identify objects in images, which lies close to the detection of mobs in thermal images. Because the type of images (thermal versus visual) is fundamentally different, the model could suffer in performance.

Goedeme et al [36] solved a similar problem with thermal images and achieved good results which gives an indication that

52 Object detection experiment 57

(a) System fault in the Camera no input was detected (b) The Camera updates to new temperature interval

(c) Due to moving the Camera too fast the image becomes too blurry (d) Very warm object due to people passing in front of the Camera

Figure 54 Outliers

Also, because the dataset is relatively small, training the model from scratch could actually hurt performance [128]. Training was carried out on an NVIDIA GeForce GTX 980 GPU, which allows training to be done much faster. To evaluate training progress, the Sum of Squared Error (SSE) loss is calculated, defined as $\sum_{i=1}^{n}(x_{ij} - x_j)^2$, where $n$ is the number of samples in a batch used in a single training epoch and $j$ is the dimension ($x$ or $y$), as defined in [83]. The result of this training is discussed in Chapter 6.
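For reference, a darknet transfer-learning run of this kind boils down to a single command combining a dataset description, a network configuration and the pre-trained convolutional weights. The sketch below wraps a typical invocation in Python; the .data and .cfg file names are illustrative.

```python
import subprocess

# The .data file lists training/validation image paths and the class count,
# the .cfg file describes the YOLOv3 network, and darknet53.conv.74 contains
# the convolutional weights pre-trained on ImageNet.
subprocess.run(
    ["./darknet", "detector", "train",
     "data/lastpost.data", "cfg/yolov3-lastpost.cfg", "darknet53.conv.74"],
    check=True,
)
```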


Chapter 6

Results and evaluation

The goal of this chapter is to present the results of the framework and the detection experiment. The results of the framework tests are presented in Section 6.1; the results of the object detection experiment are presented in Section 6.2.

6.1 Framework results

To evaluate the framework, acceptance tests are conducted that test whether the framework meets the QARs defined in Section 2.1. As stated in Section 4.1, only the ASRs will be tested. A summary of which requirements are met by the framework is given in Table 6.1. Passed means that the framework has met the requirement, not passed that the framework hasn't met the requirement, and plausible means that the framework might have met the requirement, but not enough data could be gathered to be certain.

6.1.1 Performance evaluation

To evaluate performance, the acceptance tests for the requirements are conducted, the impact of the framework on the processing resources is recorded, and the total size of the framework is measured.

Acceptance tests

To test the performance of the framework, the execution times of each command executed with the CLI (see Section 4.2.2) are measured. Each command is executed 200 times, except for the on, off and link commands, which are measured manually, 10 times. Because these commands launch system threads whose finish signal could not be captured, they had to be measured by hand. Commands were executed on a 2.6 GHz Intel Core i5-2540 processor running macOS High Sierra version 10.13.4. The summarized statistics of the tests are given in Table 6.2.
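A measurement of this kind can be scripted; the sketch below times a CLI command over repeated runs, with a hypothetical command name standing in for the framework CLI.

```python
import statistics
import subprocess
import time

def time_command(args, runs=200):
    """Return mean and standard deviation of wall-clock execution time."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(args, check=True, capture_output=True)
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.stdev(samples)

mean, stdev = time_command(["framework-cli", "play"])  # hypothetical command
```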

The average execution times for the Play, Stop, Pause, Add, Elements, Print, View and Link commands do not exceed the 2-second bound specified in PS-1, while the average execution times of the Delete, On and Off commands do exceed this bound. This performance requirement is therefore not met by the framework. The same result is found for PS-3. The Delete and Off commands in particular exceed the requirement by a wide margin. The Delete command shuts down a plugin and removes the Docker container from the host.


Requirement id   Status
PS-1             Not Passed
PS-2             Plausible
PS-3             Not Passed
PS-4             Plausible
PS-5             Not Passed
IS-1             Passed
IS-2             Passed
MS-1             Passed
MS-2             Passed
MS-3             Passed
MS-4             Passed
MS-5             Plausible
MS-6             Passed
MS-7             Plausible

Table 6.1: Acceptance test results summary

This action is costly in time. The Off command removes all the plugins and all the microservices of the framework, and thus suffers from the same costly action. This could be ameliorated by having the framework stop the containers instead of removing them, which requires fewer resources: stopping only ends the process running in the container, without deleting the container from the system.
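With the Docker SDK for Python [96], the difference between the two approaches is a single call; the container name below is hypothetical.

```python
import docker

client = docker.from_env()
container = client.containers.get("mycam-plugin")  # hypothetical name

# Costly current behaviour: delete the container from the host entirely.
# container.remove(force=True)

# Cheaper alternative: stop the contained process but keep the container
# on disk, so a later container.start() avoids the create/remove cycle.
container.stop()
```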

PS-2 and PS-4 could not be measured, because the GStreamer pipeline of the prototype does not allow individual frames to be tracked. However, since real-time is a human time perception, real-time streaming is plausible if a person can't distinguish the streamed videos from videos played with a native video player [43, 44]. The videos were shown side by side to ten users, none of whom could distinguish between both videos, indicating presumably real-time streaming. Since the hard requirements cannot be measured, the requirements are not met, but they are plausible. Real-time streaming performance also heavily depends on the plugins used and the hardware on which they are deployed: if a plugin can't process its media fast enough, due to lack of processing power or a slow implementation, it will slow down the whole stream.

The scalability requirement PS-5 could not be met, because the Flask Werkzeug server can only process one request at a time (see Section 4.3).

Only two performance requirements are met by the prototype. However, this is mostly due to some actions, such as shutting down the framework or removing a plugin, being very slow. As these actions should occur less frequently while a user is using the framework, they matter less for the perceived quality. Frequent actions, such as adding, linking and changing the state of the stream, do perform rather well and contribute more to the perceived quality. Overall, the performance of the framework is not stellar, but not bad either. This can partially be explained by the choice of supporting frameworks, such as Flask, that are not built for performance.


Statistic       Play   Stop   Pause  Add    Delete  Elements  Print   View   On     Off     Link
Mean            0.690  0.804  0.634  1.363  8.402   0.562     0.564   1.22   3.58   24.023  0.849
Std deviation   0.050  0.059  0.088  1.037  4.669   0.070     0.0747  0.260  0.498  0.481   0.170
Minimum         0.629  0.708  0.549  0.516  0.505   0.517     0.517   0.757  3.015  23.707  0.637
25 Percentile   0.665  0.775  0.594  1.049  1.154   0.534     0.536   0.998  3.143  23.750  0.798
Median          0.678  0.800  0.623  1.11   11.132  0.550     0.552   1.214  3.500  23.886  0.853
75 Percentile   0.700  0.820  0.653  1.233  11.189  0.562     0.560   1.433  3.850  24.034  0.877
Maximum         1.016  1.279  1.631  6.25   11.846  1.227     1.149   1.691  4.562  25.326  1.261

Table 6.2: Performance test statistics summary, measured in seconds

Other, more performance-oriented frameworks, such as Vert.x, could improve performance.

Resource usage

The resources used by the modules of the framework are measured using the Docker statistics tool [129]. A summary of the resources used is given in Table 6.3. When the framework is idle, resource usage is negligible. When a plugin is active, there is a slight increase in resource usage; this increase depends on the runtime size of the plugin, which is unknown to the framework. The increase peaks when the plugin is processing media: CPU usage is then 40% of one core, which implies that only two such plugins can be active simultaneously on one CPU core before reaching the ceiling of the processing power. In a production environment of the framework, plugins need to be tested thoroughly, so that these metrics are known beforehand. These metrics imply that the length of streams should be kept short, to avoid having many plugins active simultaneously.
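The same figures can also be collected programmatically through the Docker SDK for Python [96]; a minimal sketch taking a one-shot memory snapshot per container (the exact keys in the statistics dictionary can vary by platform):

```python
import docker

client = docker.from_env()

# One-shot snapshot per running container, comparable to `docker stats --no-stream`.
for container in client.containers.list():
    stats = container.stats(stream=False)
    mem_mib = stats["memory_stats"]["usage"] / (1024 ** 2)
    print(f"{container.name}: {mem_mib:.2f} MiB")
```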

Size of framework

The total size of all the Docker images of the components of the framework is given in Table 6.4. Most images are quite large: the framework core components have an average size of 724 MB, and the plugins have sizes ranging from 1 GB to 3 GB. This size can be explained by the base images and the additionally installed software in the images. For development flexibility, the base images used are Linux Ubuntu images, which are typically larger than other Linux distributions. For the plugins, the full GStreamer library with all plugins was installed, which is more than 2 GB in size. The sizes of the components can be reduced in a production environment by choosing slimmer Linux distributions as base images and only installing the minimally needed libraries to get a working plugin, as the sketch below illustrates.
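As an illustration, a slimmed plugin image could be built from a small Debian base with only the required GStreamer packages; the Dockerfile below is a sketch with example package names, not the prototype's actual build file.

```dockerfile
# Small base image instead of a full Ubuntu image.
FROM debian:stretch-slim

# Install only the GStreamer plugins the plugin actually uses, rather than
# the complete (>2 GB) plugin library; package names are illustrative.
RUN apt-get update && apt-get install -y --no-install-recommends \
        python3 python3-gst-1.0 \
        gstreamer1.0-plugins-base gstreamer1.0-plugins-good \
    && rm -rf /var/lib/apt/lists/*

COPY plugin /opt/plugin
CMD ["python3", "/opt/plugin/main.py"]
```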

6.1.2 Interoperability evaluation

The systems with which the framework exchanges data are the plugins. These plugins must follow the plugin model presented in Section 2.3.1: implement the presented resources using a REST API, the state machine, and the protocols.


Condition                              Container     CPU usage [%]  Memory usage [MiB]
Idle                                   streamer      1.00           42.09
                                       consumer      0.03           24.4
                                       producer      0.01           24.14
1 plugin active, not processing media  streamer      1.56           42.48
                                       consumer      0.02           24.42
                                       producer      0.02           24.23
                                       mycam plugin  0.75           45.97
1 plugin active, processing media      streamer      1.56           42.51
                                       consumer      0.02           24.42
                                       producer      0.02           24.24
                                       mycam plugin  40.03          99.24

Table 6.3: Resource usage of the framework in several conditions

Image     Size [MB]
streamer  718
consumer  729
producer  729
testsrc   1250
mycam     3020

Table 6.4: Total size of framework components

If these specifications are followed by a plugin, the framework should have no issues exchanging information with it. To test this, a new mock plugin was implemented. For each resource of the plugin, the framework is given random mock input data to exchange with the plugin. When the exchange is complete, the values in the plugin are requested and compared with the given input; if the input matches the value in the plugin, the exchange was successful. These tests were executed 50000 times; the results are summarized in Table 6.5. Play, pause and stop are the requests that change the state of the plugin; the source/listener add, update and delete commands manipulate the sources and listeners of the plugin. Overall, almost no errors were made when exchanging information: only when updating a source and when deleting a listener was there one incorrect exchange. The achieved ratios are always 100% correct exchanges, except for updating a source and deleting a listener, which score 99.998%. IS-1 and IS-2 specify that commands exchanged with the plugins need to be correct 99.99% of the uptime, so this requirement is clearly met. The test procedure is sketched below.
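The exchange test can be expressed as follows, with hypothetical endpoint names for the source resource; the same pattern applies to the listener resource and the state requests.

```python
import random
import requests

PLUGIN_URL = "http://localhost:5000"  # hypothetical plugin REST endpoint

def exchange_sources(iterations=50000):
    """Push random mock source data to the plugin, read it back and
    count mismatches; payload layout is illustrative."""
    errors = 0
    for i in range(iterations):
        source = {"id": str(i), "port": random.randint(1024, 65535)}
        requests.post(f"{PLUGIN_URL}/sources", json=source).raise_for_status()
        stored = requests.get(f"{PLUGIN_URL}/sources/{i}").json()
        if stored != source:
            errors += 1
    return errors
```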

Plugins also interact with each other by transmitting media according to the stream layout.


Value      Play   Pause  Stop   Add S  Update S  Delete S  Add L  Update L  Delete L
Correct    50000  50000  50000  50000  50000     49999     50000  50000     49999
Incorrect  0      0      0      0      0         1         0      0         1
Ratio (%)  100    100    100    100    100       99.998    100    100       99.998

Table 6.5: Interoperability test results (S: Source, L: Listener)

This interoperability is not directly controlled by the framework, as plugins can be developed by third parties. To solve this, a plugin needs to provide its specifications to the framework before being integrated as a plugin. This allows the framework to decide whether or not two plugins will be able to interact with each other in a stream. For example, if plugin A supports MJPEG streams transmitted via RTP/UDP, it will be able to interact with a plugin B implementing the same protocols; if plugin B implements another protocol, it will not be able to interact with plugin A. If this is specified, the framework can notify a user that two plugins are not compatible. These scenarios should be avoided, which is done by specifying standard protocols for plugins.

6.1.3 Modifiability evaluation

Plugins are installed for the prototype by building their image and adding it to the image directory of the Docker host. The framework does not need a restart to install these images; therefore, requirements MS-1 and MS-2 are met. End-users can extend their version of the framework with new plugins by building the respective plugin images, meeting MS-3. Streams can be modified by linking different plugins by design, meeting MS-4. The framework can detect newly installed plugins when starting up, if the image is installed to the image directory of the Docker host; therefore, requirements MS-5 and MS-6 are met. The current prototype is only deployable on a local device, as discussed in Section 4.1, meeting requirement MS-7. The other requirements can be met by deploying the framework using the Docker overlay network, as discussed in Section 4.3, without having to implement changes to the code base. The requirements MS-8 and MS-9 are not met, but are plausible when using a different Docker deployment.

In general, the framework was designed to be modifiable for different video analysis tasks. The hybrid microkernel/microservices architecture enables this modifiability: the microkernel (plugin) architecture allows a user to modify a video analysis stream during framework use, and the microservices architecture allows for a modifiable deployment configuration.

6.2 Mob detection experiment results

To evaluate the detection experiment, the trained model is tested on the validation set, which contains random images from the total annotated dataset presented in Section 5.1.2. First, the results of the training of the model are presented in Section 6.2.1. Second, the metrics that were used to evaluate the model are presented in Section 6.2.2. Finally, the results of the validation are presented in Section 6.2.3.


6.2.1 Training results

To monitor training, the average loss per training epoch was measured; the resulting training evolutions are depicted in Figure 6.1. Darknet does not shuffle training data automatically and creates training batches in the order of the training data provided. Since YOLO uses gradient descent for optimization, this can lead to YOLO getting stuck in local minima of the cost surface [130]. This effect is seen in Figure 6.1a: around epoch 4500, every image in the training set has been loaded at least once; at this point the model was training on images from location B, and images from location A are loaded next (see Section 5.1.2). This leads to a peak in average loss, as YOLO was optimizing for images from location B and probably converging to a local minimum for that type of images. Therefore, in a second run the data was shuffled, allowing the model to escape local minima more easily; a minimal way to do this is sketched below. Figure 6.1b shows the difference in training loss: the curve is much more irregular thanks to the shuffling of the data. Once again the average loss decreases more around epoch 4500, when every image in the training set has been loaded at least once. The average loss stagnates at values in the interval [0.04, 0.07]. To avoid overfitting the model on the training data, which would hurt generalization performance, early stopping is applied. Early stopping is a generalization technique that stops the training of a neural network before the network starts overfitting [131]. The stopping criterion used is progress, defined as the decrease of the training error in successive training epochs [131], i.e. the slope of the loss curve depicted in Figure 6.1. This slope approaches 0 from epoch 13000 onward, so this epoch is selected as the early stopping point. Because the generalization error is not a smooth curve and consists of many local minima, it is a good idea to also validate model weights in the neighborhood of the early stopping point, as these could potentially yield better performance on the validation set [131].
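Since darknet reads images in the order given by the training list file, shuffling that list before training suffices; a minimal sketch, using the conventional train.txt file name:

```python
import random

# Shuffle the image paths darknet will read, so consecutive batches mix
# images from both recording locations.
with open("data/train.txt") as f:
    lines = f.readlines()
random.shuffle(lines)
with open("data/train.txt", "w") as f:
    f.writelines(lines)
```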

6.2.2 Metrics

The model predicts bounding boxes for objects in the images of the validation set. The bounding box provided by the annotated dataset is defined as the ground truth bounding box $B_{gt}$; the bounding box provided by the model is defined as the predicted bounding box $B_p$. To evaluate the performance of the model and select the best weights, several metrics are used. The standard metrics used to evaluate object detection problems are the Intersection over Union (IoU) and the mean Average Precision (mAP). The IoU is a metric used in common object detection challenges, such as the Pascal VOC challenge [132]. If the function $A(B_x)$ gives the area of a bounding box $B_x$, the IoU is defined as

$$\mathrm{IoU} = \frac{A(B_p \cap B_{gt})}{A(B_p \cup B_{gt})} \qquad (6.1)$$

The mAP for a set of detections, another metric used in the Pascal VOC challenge, is defined as the mean over the classes of the interpolated AP for each class. A detection is considered a true positive if the IoU for the detection is greater than 0.5. The interpolated AP is given by the area under the precision-recall curve of the detections [132–134].
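Equation (6.1) translates directly into code; a small sketch for axis-aligned boxes in (x1, y1, x2, y2) format:

```python
def iou(box_p, box_gt):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    x1 = max(box_p[0], box_gt[0])
    y1 = max(box_p[1], box_gt[1])
    x2 = min(box_p[2], box_gt[2])
    y2 = min(box_p[3], box_gt[3])
    intersection = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_p = (box_p[2] - box_p[0]) * (box_p[3] - box_p[1])
    area_gt = (box_gt[2] - box_gt[0]) * (box_gt[3] - box_gt[1])
    return intersection / (area_p + area_gt - intersection)
```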

The model is also tested on several videos not included in the training and validation sets, to visually evaluate detection and to measure the number of frames per second that can be processed by the model.


Figure 6.1: Average training loss per epoch (vertical axis: average loss; horizontal axis: training epochs). (a) Data not shuffled. (b) Data shuffled.

6.2.3 Validation results

YOLO creates a snapshot of the weights the model is using at a certain epoch every 100 epochs [83]. This makes it possible to validate each set of weights on the validation set and show the evolution of the validation performance. Figure 6.2 shows these evolutions for the average IoU and mAP metrics. The mAP gradually grows from epoch 4500 onwards and stagnates around epoch 11500; this shows that the model is no longer learning and is at risk of overfitting. The mAP stagnates in the interval [88%, 91%]. The average IoU shows a similar trend, but varies more, because predictions on the same images are rarely exactly the same.

The best mAP value, 90.52%, is achieved at epoch 15700. The weights from this epoch are used for further testing and validation. The mAP at the 0.5 IoU threshold of YOLOv3 on the COCO benchmark dataset [135] is 74.8%; compared to this, the mAP achieved on the Last Post dataset is very high. The reason for this difference is that the validation set has a high correlation with the training set.


Figure 6.2: Validation metrics per epoch (horizontal axis: training epochs). (a) mAP (%) per epoch. (b) IoU (%) per epoch.

Because the training and validation sets are extracted from videos, all images from one video are correlated in time with each other. Images from the validation set are thus correlated with images in the training set, and the model is optimized for these types of images, which explains the high mAP. This indicates that the model is somewhat overfitting on the training data. This was confirmed when testing the model on unseen videos: although the model could detect a mob most of the time, it produced more visual errors. Because this data was not annotated, no metrics could be extracted. Figure 6.3 depicts some predictions of the model on images from the validation set; visually, the predicted bounding boxes resemble the ground truth bounding boxes quite accurately.

To test the speed of the predictions of the model, the total time to predict the images in the validation set was measured. For the NVIDIA GeForce GTX 980 GPU, the average prediction time for one image is 14.673 milliseconds, with a standard deviation of 0.517 milliseconds. This indicates that the upper limit of the frame rate when making predictions on a video is approximately 68 frames per second (1/0.014673 s) on the GPU. For comparison, predictions with the model were also made on a CPU, a 2.6 GHz Intel Core i5-2540 processor with AVX instruction speedup. The average prediction time on the CPU is 5.849 seconds, with a standard deviation of 0.438 seconds, resulting in an upper limit for the frame rate on the CPU of 0.171 frames per second. Clearly, real-time object detection with this model is only possible on a GPU. When generating predictions on a test video, the average frame rate of the video was 55 frames per second.


Figure 6.3: Predictions of the model on images in the validation set. (a) A large mob at location B. (b) The mob at location A. (c) A small mob at location B. (d) The mob at location B.


Chapter 7

Conclusion and future work

7.1 Conclusion

Aerial thermal imaging with drones is a promising technology that can deliver many applications for various use cases across many different domains, such as agriculture, fire fighting, search and rescue, etc. Most applications built with this technology are built with a specific use case in mind, using a thermal camera and analysis software specifically for this use case, and therefore struggle to exchange hardware and algorithms for new use cases. The goal of this dissertation was therefore to design, build and test a possible backbone framework that allows building these applications in a modifiable way. The specific use case of mob detection in thermal images was investigated as a sample use case for the framework.

Chapter 2 explored the requirements of such a framework. The ASRs to achieve the goal of the framework are performance, interoperability and modifiability. Performance is needed because some use cases (like fire fighting) require real-time video analysis. Interoperability enables the framework to interact with different thermal cameras and different processing/analysis modules. Modifiability enables the framework to interchange the thermal cameras and analyzers in its process, to build applications for different use cases. A hybrid combination of the microkernel pattern and the microservices pattern is used to meet these requirements: the microkernel pattern enables interchanging the cameras and analyzers via a plugin system, and the microservices pattern enables different deployment configurations for the framework. To build and test the framework, several technologies were needed: backbone technologies for the software architecture, a thermal camera, and an object detection algorithm for the mob detection use case.

Chapter 3 explored the state of the art of these technologies and presented the selected ones. Thermal cameras come in all shapes and sizes and have different features according to their retail price. Contrary to intuition, the image quality is not the defining factor of the retail price, but rather the amount of extra features, such as radiometry, communication interfaces, etc. The FLIR One Pro and Therm-App were selected for this dissertation, since they offer good quality images and features for their price, and their use via smartphone platforms makes these cameras excellent for prototyping. Microservices frameworks also show a lot of variety, depending strongly on the use case of the application using the framework: some are aimed at quick prototyping, others focus on performance, etc. Flask was selected as the microservices framework, as it is easy to use and designed for prototyping with microservices; this does come with a performance trade-off. To deploy the microservices in a plugin fashion,


the concept of containers is applied. Containers virtualize on the OS level, allowing the microservices to be moved around on the host and distributed over different hosts. The current field has some frameworks implementing this technology; Docker is the most well-known and mature framework, and it was selected for that reason. The field of object detection has a variety of solutions for the object detection problem, with varying accuracies, and some can even create predictions in real-time. The YOLOv3 algorithm, implemented in the darknet framework, was selected as it generalizes well to other datasets (such as thermal images), makes relatively accurate predictions, and is able to make predictions in real-time when deployed on a device with GPU processing capabilities.

Chapter 4 presented the implemented prototype of the framework using these technologies. Two sample plugins were implemented: the filecam plugin, which serves a video read from a file, and the display plugin, which displays this video on the local device. The framework is limited to one video processing stream for one user at a time and is deployed to a local device. It also has a security risk, as the framework has to expose the Docker daemon socket to allow the framework to manipulate the containers running the plugins. This gives the containers that run the core framework processes root access to the host system, which can be abused by potential attackers.

The mob detection experiment was presented in Chapter 5. A new thermal image dataset, called the Last Post dataset, was collected for this experiment. The dataset features videos of the Last Post ceremony, filmed over the course of two weeks. What makes this dataset special is that, unlike publicly available datasets, it delivers footage of the movement of large crowds, filmed from a high vantage point to simulate footage captured from a drone platform. This dataset is used to train a pre-trained YOLOv3 model via transfer learning. The dataset is manually labeled and preprocessed by removing the outliers present. Training is done on an NVIDIA GTX 980 GPU and is evaluated using the SSE loss metric.

Chapter 6 presented the tests conducted on the framework and the detection model, and their corresponding results. The performance requirements for the frequently used commands are met by the framework. Other commands, such as removing plugins and starting up or shutting down the framework, do not meet the performance requirements, since Docker requires significant time to start, stop and remove containers. The real-time streaming requirements could not be proven, because the time between transmitting and receiving a frame could not be measured directly. However, the processed videos were shown to human users, who could not distinguish between the processed video and the video played back on a local system, which makes it plausible that the framework achieves this requirement. Real-time streaming performance heavily depends on the plugin and the hardware on which it is deployed. When plugins in the framework are processing media, CPU usage increases significantly, even when only one plugin is active. This implies that the length of media processing streams should be kept as short as possible to achieve good performance. The framework is relatively big, with some plugins even having a size of over 2 GB. This is mostly due to the base images and installed libraries of the plugins and core components. Because each component and plugin has its own container, libraries can't be shared, so they are redundantly installed, leading to large component sizes. This could be alleviated by using slimmer base images and only installing the minimally needed libraries. The interoperability requirements are all met by the framework; this is proven by a test exchanging mock information between the framework and the plugins. The modifiability requirements regarding the plugins are met by the framework. The modifiability requirements regarding the deployment schemes are not met, but can be achieved by deploying the framework using a Docker overlay network instead of the Docker bridge network. To evaluate the trained model, the model made predictions


on a separate validation set. The model achieves an mAP of 90.52%, which is much higher than what current state-of-the-art models achieve on benchmark datasets. This shows that the model is capable of learning the thermal features, but is also overfitting on the data, due to the temporal correlation between the training and validation sets. The model can predict in real-time, achieving an average frame rate of 55 frames per second when making predictions on a GPU.

7.2 Future work

This dissertation proposed a framework and implemented a prototype that realizes only part of the total framework. Object detection using deep learning, in general and applied to thermal images specifically, is still a young field. Several extensions to this research are possible.

7.2.1 Security

The framework prototype did not implement any security measures. Because communications in distributed configurations rely on an external network, these measures should be implemented to reduce the risk of attacks. To allow the components to manipulate Docker containers, the Docker host socket was exposed. As stated before, this is a serious security risk, as the container gets root access to the host. Workarounds for this problem could be to implement a Docker-in-Docker environment [136] or to deploy the containers in a VM.

7.2.2 Implementing a detection plugin

Due to the scope and time limit of the dissertation, a working prototype plugin containing a trained model for detecting objects in a video stream could not be made. A possible GStreamer pipeline for such a plugin is depicted in Figure 7.1 and sketched in code below. This plugin is a Consumer and receives video via the udpsrc element. Frames are decoded, and the raw video is presented to the appsink GStreamer plugin, which allows the video to be dumped into an application: here, the detection model, which generates predictions on the frame. The predicted frame is then forwarded to an appsrc GStreamer plugin, which puts it in a new pipeline to transmit it to further framework plugins. It should be tested whether the detection model can run in a Docker container, since it needs GPU support to be able to predict in real-time. A solution could be to use nvidia-docker, which leverages NVIDIA GPU support in Docker containers [137].

Figure 7.1: GStreamer pipeline for a plugin with a detection model
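A sketch of the two pipeline descriptions around the model, assuming the MJPEG-over-RTP/UDP conventions used by the other plugins; ports and caps are illustrative. The strings could be instantiated with GStreamer's parse-launch facility.

```python
# Receiving side: depayload and decode RTP/JPEG, hand raw frames to the
# application (the detection model) through appsink.
RECEIVE_PIPELINE = (
    "udpsrc port=5000 "
    "caps=application/x-rtp,media=video,encoding-name=JPEG,payload=26 "
    "! rtpjpegdepay ! jpegdec ! videoconvert "
    "! appsink name=frames"
)

# Transmitting side: take predicted frames back in through appsrc, encode
# and payload them, and send them on to the next plugin.
SEND_PIPELINE = (
    "appsrc name=predictions "
    "! videoconvert ! jpegenc ! rtpjpegpay "
    "! udpsink host=127.0.0.1 port=5001"
)
```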


7.2.3 Different deployment configurations

The prototype of the framework only implemented one of the deployment configurations presented in Section 2.3.3. Other configurations can be explored by changing the Docker bridge network to a Docker overlay network.

7.2.4 Multiple streams with different layouts

The prototype only implemented one stream with a chain-like layout. Future effort could implement support for multiple streams that run concurrently. The layout can be changed by implementing plugins that can forward media to multiple destinations, or merge media coming from different sources, which is the concept of sensor fusion.

7.2.5 Implementing the plugin distribution service (Remote Producer/Consumer)

Chapter 2 presented the Remote Producer and Consumer that distribute the plugins available for the framework. This was deemed out of scope for the prototype, but could be implemented in future versions.

7.2.6 Using high-performance microservices backbone frameworks

The current implementation uses the Flask framework, which is excellent for prototyping but not ideal for high performance. Other frameworks, such as Vert.x, focus on high performance through asynchronous messaging, which could improve the performance of the framework.

7.2.7 New object detection models and datasets specifically for thermal images

Current effort in object detection models goes towards challenges on benchmark datasets of visual images, such as ImageNet and Pascal VOC. There are some thermal datasets publicly available for some detection purposes, but these are very small compared to the visual image datasets. Future research could create new benchmark datasets, similar to the visual image datasets, specifically for thermal images.

Currently, publicly available pre-trained neural network models are designed for and trained on the visual image datasets. Future research could go towards designing an architecture specifically for thermal images and training a model on such a benchmark dataset.

Thermal images use several colormaps to map the relative temperatures in a scene onto colors presenting warm and cold regions. Well-known examples are the Iron scheme (used in this dissertation), White-hot and Black-hot. Some companies implement threshold colors that highlight very hot or very cold spots in an image (for examples, see [138, 139], etc.). Future research could investigate how models trained on images using different color schemes differ in their predictions and performance.

Thermal images could potentially benefit from radiometric information, which adds a temperature dimension to each pixel in the image instead of the relative coloring. This information could lead to more accurate predictions.


Bibliography

[1] S. G. Gupta, M. M. Ghonge, and P. Jawandhiya, "Review of Unmanned Aircraft System," International Journal of Advanced Research in Computer Engineering & Technology, vol. 2, no. 4, 2013. ISSN: 2278-1323.
[2] M. Hassanalian and A. Abdelkefi, "Classifications, applications, and design challenges of drones: A review," 2017. DOI: 10.1016/j.paerosci.2017.04.003.
[3] M. Joel, The Booming Business of Drones, 2013. [Online]. Available: https://hbr.org/2013/01/the-booming-business-of-drones (visited on 01/30/2018).
[4] DJI, Zenmuse H3-2D. [Online]. Available: https://www.dji.com/zenmuse-h3-2d (visited on 01/30/2018).
[5] Gimbal Guard, Drop & Delivery Device for DJI Mavic Pro. [Online]. Available: http://www.gimbal-guard.com (visited on 01/30/2018).
[6] FLIR Systems, Aerial Thermal Imaging Kits. [Online]. Available: http://www.flir.com/suas/aerial-thermal-imaging-kits (visited on 01/30/2018).
[7] R. Gade and T. B. Moeslund, "Thermal cameras and applications: a survey," Machine Vision and Applications, vol. 25, pp. 245–262, 2014. DOI: 10.1007/s00138-013-0570-5.
[8] M. C. Harvey, J. V. Rowland, and K. M. Luketina, "Drone with thermal infrared camera provides high resolution georeferenced imagery of the Waikite geothermal area, New Zealand," 2016. DOI: 10.1016/j.jvolgeores.2016.06.014.
[9] S. Amici, M. Turci, S. Giammanco, L. Spampinato, and F. Giulietti, "UAV Thermal Infrared Remote Sensing of an Italian Mud Volcano," vol. 2, pp. 358–364, 2013. DOI: 10.4236/ars.2013.24038.
[10] J. Bendig, A. Bolten, and G. Bareth, "Introducing a low-cost mini-UAV for thermal- and multispectral-imaging," 2012. [Online]. Available: https://www.int-arch-photogramm-remote-sens-spatial-inf-sci.net/XXXIX-B1/345/2012/isprsarchives-XXXIX-B1-345-2012.pdf.


[11] Workswell, "Using the UAV Thermography for Cultivation and Phenotyping of Cereals," Tech. Rep., 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/Cultivation-and-Phenotyping-1.pdf.
[12] A. J. Rivera, A. D. Villalobos, J. C. Monje, J. A. Mariñas, and C. M. Oppus, "Post-disaster rescue facility: Human detection and geolocation using aerial drones," IEEE Region 10 Annual International Conference Proceedings/TENCON, pp. 384–386, 2017. ISSN: 2159-3450. DOI: 10.1109/TENCON.2016.7848026.
[13] P. Christiansen, K. A. Steen, R. N. Jørgensen, and H. Karstoft, "Automated detection and recognition of wildlife using thermal cameras," Sensors (Basel, Switzerland), vol. 14, no. 8, pp. 13778–93, Jul. 2014. ISSN: 1424-8220. DOI: 10.3390/s140813778. [Online]. Available: http://www.ncbi.nlm.nih.gov/pubmed/25196105.
[14] J. Zhang, J. Hu, J. Lian, Z. Fan, X. Ouyang, and W. Ye, "Seeing the forest from drones: Testing the potential of lightweight drones as a tool for long-term forest monitoring," Biological Conservation, vol. 198, pp. 60–69, 2016.
[15] D. Ventura, M. Bruno, G. Jona Lasinio, A. Belluscio, and G. Ardizzone, "A low-cost drone based application for identifying and mapping of coastal fish nursery grounds," Estuarine, Coastal and Shelf Science, vol. 171, pp. 85–98, Mar. 2016. ISSN: 0272-7714. DOI: 10.1016/j.ecss.2016.01.030.
[16] S. Chowdhury, A. Emelogu, M. Marufuzzaman, S. G. Nurre, and L. Bian, "Drones for disaster response and relief operations: A continuous approximation model," 2017. DOI: 10.1016/j.ijpe.2017.03.024.
[17] Workswell, "Pipeline inspection with thermal diagnostics," 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/pipeline.pdf.
[18] Workswell, "Thermo diagnosis of photovoltaic power plants," 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/Workswell-WIRIS_photovoltaic.pdf.
[19] Workswell, "Thermodiagnostics of flat roofs," 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/roof.pdf.
[20] Workswell, "Thermodiagnostics in the power engineering sector," Tech. Rep., 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/highvoltage.pdf.
[21] Workswell, Workswell WIRIS - Product - Thermal camera for drones, 2016. [Online]. Available: https://www.drone-thermal-camera.com/wiris (visited on 01/30/2018).
[22] TEAX Technology, ThermalCapture - Thermal Imaging Technology | Capture raw radiometric thermal data with drones. [Online]. Available: http://thermalcapture.com (visited on 01/30/2018).


[23] DJI, Zenmuse XT - unlock the possibilities of sight - DJI, 2018. [Online]. Available: https://www.dji.com/zenmuse-xt (visited on 01/30/2018).
[24] Workswell, SOFTWARE - Workswell WIRIS - Thermal camera for drones, 2016. [Online]. Available: https://www.drone-thermal-camera.com/software (visited on 01/31/2018).
[25] Therm-App, Therm-App™ - Android apps on Google Play, 2018. [Online]. Available: https://play.google.com/store/apps/details?id=com.thermapp (visited on 01/31/2018).
[26] B. Satzger, W. Hummer, C. Inzinger, P. Leitner, and S. Dustdar, "Winds of change: From vendor lock-in to the meta cloud," IEEE Internet Computing, vol. 17, no. 1, pp. 69–73, 2013. ISSN: 1089-7801. DOI: 10.1109/MIC.2013.19.
[27] J. Divya, Drone Technology and Usage: Current Uses and Future Drone Technology, 2017. [Online]. Available: http://uk.businessinsider.com/drone-technology-uses-2017-7 (visited on 01/31/2018).
[28] A. Boulanger, "Open-source versus proprietary software: Is one more reliable and secure than the other?" IBM Systems Journal, vol. 44, no. 2, pp. 239–248, 2005. ISSN: 0018-8670. DOI: 10.1147/sj.442.0239. [Online]. Available: http://ieeexplore.ieee.org/document/5386727.
[29] M. Kazmeyer, Disadvantages of Proprietary Software. [Online]. Available: http://smallbusiness.chron.com/disadvantages-proprietary-software-65430.html (visited on 01/31/2018).
[30] B. Steffen and A. Seyfried, "Methods for measuring pedestrian density, flow, speed and direction with minimal scatter," Physica A: Statistical Mechanics and its Applications, vol. 389, no. 9, pp. 1902–1910, May 2010. ISSN: 0378-4371. DOI: 10.1016/J.PHYSA.2009.12.015.
[31] M. Wirz, T. Franke, D. Roggen, E. Mitleton-Kelly, P. Lukowicz, and G. Tröster, "Inferring crowd conditions from pedestrians' location traces for real-time crowd monitoring during city-scale mass gatherings," Proceedings of the Workshop on Enabling Technologies: Infrastructure for Collaborative Enterprises, WETICE, pp. 367–372, 2012. ISSN: 1524-4547. DOI: 10.1109/WETICE.2012.26.
[32] E. Alpaydin, Introduction to Machine Learning, 3rd ed. MIT Press, 2014, p. 591. ISBN: 026201243X. [Online]. Available: https://dl.acm.org/citation.cfm?id=1734076.
[33] J. W. Davis and V. Sharma, "Robust background-subtraction for person detection in thermal imagery," IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, vol. 2004-January, 2004. ISSN: 2160-7516. DOI: 10.1109/CVPR.2004.431.
[34] W. Wang, J. Zhang, and C. Shen, "Improved human detection and classification in thermal images," pp. 2313–2316, 2010.
[35] R. Appel, S. Belongie, P. Perona, and P. Dollár, "Fast feature pyramids for object detection," PAMI, vol. 36, no. 8, pp. 1–14, 2014. ISSN: 0162-8828. DOI: 10.1109/TPAMI.2014.2300479.
[36] T. Goedemé, "Projectresultaten VLAIO TETRA-project," KU Leuven, Leuven, Tech. Rep., 2017.


[37] L.-L. Slattery, DroneSAR wants to turn drones into search-and-rescue heroes, 2017. [Online]. Available: https://www.siliconrepublic.com/start-ups/dronesar-search-and-rescue-drone-software (visited on 05/26/2018).
[38] Amazon Web Services, Inc., What Is Amazon Kinesis Video Streams?, 2018. [Online]. Available: https://docs.aws.amazon.com/kinesisvideostreams/latest/dg/what-is-kinesis-video.html (visited on 05/26/2018).
[39] U.S. Government, "Systems Engineering Fundamentals," Defence Acquisition University Press, no. January, p. 223, 2001. ISSN: 1872-7565. DOI: 10.1016/j.cmpb.2010.05.002. [Online]. Available: http://www.dtic.mil/docs/citations/ADA387507.
[40] L. Bass, P. Clements, and R. Kazman, Software Architecture in Practice, 3rd ed. Addison-Wesley Professional, 2012. ISBN: 0321815734, 9780321815736.
[41] J. Greene and M. Stellman, Applied Software Project Management, 2006, p. 324. ISBN: 978-0596009489. [Online]. Available: http://www.oreilly.com/catalog/appliedprojectmgmt.
[42] S. Barber, Acceptable application response times vs. industry standard, 2018. [Online]. Available: https://searchsoftwarequality.techtarget.com/tip/Acceptable-application-response-times-vs-industry-standard (visited on 05/28/2018).
[43] T. Burger, How Fast Is Realtime? Human Perception and Technology | PubNub, 2015. [Online]. Available: https://www.pubnub.com/blog/how-fast-is-realtime-human-perception-and-technology (visited on 05/28/2018).
[44] S.-t. Modeling, P. Glennie, and N. Thrift, "Time perception models," Neuron, pp. 15696–15699, 1992.
[45] M. Richards, Software Architecture Patterns, First edition, H. Scherer, Ed. O'Reilly Media, 2015. [Online]. Available: http://www.oreilly.com/programming/free/files/software-architecture-patterns.pdf.
[46] C. Richardson, Microservice Architecture pattern, 2017. [Online]. Available: http://microservices.io/patterns/microservices.html (visited on 12/02/2017).
[47] P. Clements, F. Bachmann, L. Bass, D. Garlan, J. Ivers, R. Little, P. Merson, R. Nord, and J. Stafford, Documenting Software Architectures, Second edition. Boston: Pearson Education, Inc., 2011. ISBN: 0-321-55268-7.
[48] Object Management Group, "Unified Modeling Language v2.5.1," no. December, 2017. [Online]. Available: http://www.omg.org/spec/UML/2.5.1.
[49] C. De La Torre, C. Maddock, J. Hampton, P. Kulikov, and M. Jones, Communication in a microservice architecture, 2017. [Online]. Available: https://docs.microsoft.com/en-us/dotnet/standard/microservices-architecture/architect-microservice-container-applications/communication-in-microservice-architecture (visited on 04/27/2018).
[50] H. Schulzrinne and S. Casner, "RTP Profile for Audio and Video Conferences with Minimal Control," 2003. [Online]. Available: https://tools.ietf.org/html/rfc3551.
[51] D. Bull, Communicating Pictures: A Course in Image and Video Coding. Elsevier Science, 2014. ISBN: 9780080993744. [Online]. Available: https://books.google.be/books?id=PDZOAwAAQBAJ.
[52] On-Net Surveillance Systems Inc., "MJPEG vs MPEG4: Understanding the differences, advantages and disadvantages of each compression technique," 2006. [Online]. Available: www.onssi.com.


[53] M.M.A.V. Protocol, Introduction - MAVLink Developer Guide, 2013. [Online]. Available: https://mavlink.io/en (visited on 09/14/2017).
[54] H. Schlosser, Microservices trends 2017: Strategies, tools and frameworks - JAXenter, 2017. [Online]. Available: https://jaxenter.com/microservices-trends-2017-survey-133265.html (visited on 03/24/2018).
[55] A. Ronacher, Welcome to Flask — Flask Documentation (0.12), 2017. [Online]. Available: http://flask.pocoo.org/docs/0.12 (visited on 03/24/2018).
[56] F. Reyes, PythonDecorators, 2017. [Online]. Available: https://wiki.python.org/moin/PythonDecorators (visited on 04/27/2018).
[57] Stackshare, Companies that use Flask and Flask Integrations, 2018. [Online]. Available: https://stackshare.io/flask (visited on 03/24/2018).
[58] Falcon, Falcon - Bare-metal web API framework for Python. [Online]. Available: https://falconframework.org/#sectionAbout (visited on 03/24/2018).
[59] Stackshare, Companies that use Falcon and Falcon Integrations, 2018. [Online]. Available: https://stackshare.io/falcon (visited on 03/24/2018).
[60] A. Ronacher, Nameko for Microservices, 2015. [Online]. Available: http://lucumr.pocoo.org/2015/4/8/microservices-with-nameko (visited on 03/24/2018).
[61] C. Escoffier, Building Reactive Microservices in Java, 2017. ISBN: 9781491986264.
[62] C. Posta, Microservices for Java Developers. ISBN: 9781491963081.
[63] R. Dua, A. R. Raja, and D. Kakadia, "Virtualization vs containerization to support PaaS," in IEEE International Conference on Cloud Engineering, 2014. ISBN: 9781479937660. DOI: 10.1109/IC2E.2014.41.
[64] D. Merkel, "Docker: Lightweight Linux containers for consistent development and deployment," 2014.
[65] Docker Inc., Docker for the Virtualization Admin, 2016, p. 12.
[66] Docker Inc., What is a Container?, 2018. [Online]. Available: https://www.docker.com/what-container (visited on 03/24/2018).
[67] M. Helsley, LXC: Linux container tools, 2009. [Online]. Available: https://www.ibm.com/developerworks/linux/library/l-lxc-containers (visited on 05/21/2018).
[68] J. Fink, "Docker: a Software as a Service, Operating System-Level Virtualization Framework," 2014. [Online]. Available: http://journal.code4lib.org/articles/9669 (visited on 03/19/2018).


[69] C. Wang, What is Docker? Linux containers explained, 2017. [Online]. Available: https://www.infoworld.com/article/3204171/linux/what-is-docker-linux-containers-explained.html (visited on 05/21/2018).
[70] CoreOS, Rkt: a security-minded, standards-based container engine. [Online]. Available: https://coreos.com/rkt (visited on 03/24/2018).
[71] F. Xu, X. Liu, and K. Fujimura, "Pedestrian detection and tracking with night vision," IEEE Transactions on Intelligent Transportation Systems, vol. 6, no. 1, pp. 63–71, 2005. ISSN: 1524-9050. DOI: 10.1109/TITS.2004.838222.
[72] H. Nanda and L. Davis, "Probabilistic template based pedestrian detection in infrared videos," IEEE Intelligent Vehicles Symposium, Proceedings, vol. 1, pp. 15–20, 2003. DOI: 10.1109/IVS.2002.1187921.
[73] R. E. Schapire, "Explaining AdaBoost," Empirical Inference: Festschrift in Honor of Vladimir N. Vapnik, pp. 37–52, 2013. DOI: 10.1007/978-3-642-41136-6_5.
[74] P. Viola, M. J. Jones, and D. Snow, "Detecting pedestrians using patterns of motion and appearance," International Journal of Computer Vision, vol. 63, no. 2, pp. 153–161, 2005. DOI: 10.1109/ICCV.2003.1238422.
[75] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016, http://www.deeplearningbook.org.
[76] J. R. R. Uijlings, K. E. A. Van De Sande, T. Gevers, and A. W. M. Smeulders, "Selective Search for Object Recognition," Tech. Rep., 2012. DOI: 10.1007/s11263-013-0620-5. [Online]. Available: http://www.cs.cornell.edu/courses/cs7670/2014sp/slides/VisionSeminar14.pdf.
[77] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Region-based convolutional networks for accurate object detection and segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 1, pp. 142–158, 2014. ISSN: 0162-8828. DOI: 10.1109/TPAMI.2015.2437384. arXiv: 1311.2524.
[78] R. Girshick, "Fast R-CNN," in Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 1440–1448. ISBN: 9781467383912. DOI: 10.1109/ICCV.2015.169. arXiv: 1504.08083.
[79] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards real-time object detection with region proposal networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137–1149, 2016. ISSN: 0162-8828. DOI: 10.1109/TPAMI.2016.2577031. arXiv: 1506.01497.
[80] K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask R-CNN," arXiv, 2018. arXiv: 1703.06870v3.
[81] J. Dai, Y. Li, K. He, and J. Sun, "R-FCN: Object detection via region-based fully convolutional networks," Tech. Rep., 2016. DOI: 10.1109/ICASSP.2017.7952132. arXiv: 1605.06409. [Online]. Available: http://arxiv.org/abs/1605.06409.
[82] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified, real-time object detection," 2015. ISSN: 0168-9002. DOI: 10.1109/CVPR.2016.91. arXiv: 1506.02640. [Online]. Available: http://arxiv.org/abs/1506.02640.
[83] J. Redmon and A. Farhadi, "YOLOv3: An incremental improvement," arXiv, 2018. [Online]. Available: https://pjreddie.com/media/files/papers/YOLOv3.pdf.
[84] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, "SSD: Single Shot MultiBox Detector," arXiv, 2016. arXiv: 1512.02325v5.


[85] B. Zoph and Q. V. Le, "Neural architecture search with reinforcement learning," in ICLR, 2017, pp. 1–16. arXiv: 1611.01578v2.
[86] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, "Focal loss for dense object detection," arXiv, 2018. arXiv: 1708.02002v2.
[87] Facebook Inc., ONNX - About, 2017. [Online]. Available: https://onnx.ai/about (visited on 05/21/2018).
[88] TensorFlow, TensorFlow, 2018. [Online]. Available: https://www.tensorflow.org (visited on 05/21/2018).
[89] J. Huang, V. Rathod, C. Sun, M. Zhu, A. Korattikara, A. Fathi, I. Fischer, Z. Wojna, Y. Song, S. Guadarrama, and K. Murphy, "Speed/accuracy trade-offs for modern convolutional object detectors," arXiv, 2017. arXiv: 1611.10012v3.
[90] J. Redmon, Darknet: Open source neural networks in C, http://pjreddie.com/darknet, 2013–2016.
[91] Microsoft, The Microsoft Cognitive Toolkit | Microsoft Docs, 2018. [Online]. Available: https://docs.microsoft.com/en-us/cognitive-toolkit/index (visited on 05/21/2018).
[92] Docker Inc., Overview of Docker Compose | Docker Documentation, 2018. [Online]. Available: https://docs.docker.com/compose/overview (visited on 04/27/2018).
[93] Docker Inc., Use bridge networks | Docker Documentation, 2018. [Online]. Available: https://docs.docker.com/network/bridge (visited on 04/27/2018).
[94] A. Ronacher, Click Documentation (5.0), 2017. [Online]. Available: http://click.pocoo.org/5 (visited on 04/27/2018).
[95] A. K. Reitz, Requests: HTTP for Humans — Requests 2.18.4 documentation, 2018. [Online]. Available: http://docs.python-requests.org/en/master (visited on 05/09/2018).
[96] Docker Inc., Docker SDK for Python — Docker SDK for Python 2.0 documentation, 2018. [Online]. Available: https://docker-py.readthedocs.io/en/stable (visited on 05/12/2018).
[97] GStreamer, GStreamer: open source multimedia framework, 2018. [Online]. Available: https://gstreamer.freedesktop.org (visited on 05/13/2018).
[98] E. Walthinsen, filesrc: GStreamer Core Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gstreamer-plugins/html/gstreamer-plugins-filesrc.html (visited on 05/13/2018).
[99] E. Hervey, decodebin: GStreamer Base Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-base-plugins/html/gst-plugins-base-plugins-decodebin.html (visited on 05/13/2018).
[100] W. Taymans, jpegenc: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-jpegenc.html (visited on 05/13/2018).


[101] Axis Communications, rtpjpegpay: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-rtpjpegpay.html (visited on 05/13/2018).
[102] W. Taymans, udpsink: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-udpsink.html (visited on 05/13/2018).
[103] GStreamer, Basic tutorial 3: Dynamic pipelines. [Online]. Available: https://gstreamer.freedesktop.org/documentation/tutorials/basic/dynamic-pipelines.html (visited on 05/13/2018).
[104] W. Taymans, udpsrc: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-udpsrc.html (visited on 05/14/2018).
[105] W. Taymans, rtpjpegdepay: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-rtpjpegdepay.html (visited on 05/14/2018).
[106] A. Loonstra, "Videostreaming with GStreamer." [Online]. Available: http://mediatechnology.leiden.edu/images/uploads/docs/wt2014_gstreamer.pdf.
[107] W. Taymans, jpegdec: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-jpegdec.html (visited on 05/14/2018).
[108] J. Schmidt, autovideosink: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-autovideosink.html (visited on 05/14/2018).
[109] A. Ronacher, Deployment Options — Flask 0.12.4 documentation, 2018. [Online]. Available: http://flask.pocoo.org/docs/0.12/deploying (visited on 05/14/2018).
[110] R. Yasrab, "Mitigating Docker security issues," University of Science and Technology of China, Hefei, Tech. Rep. [Online]. Available: https://arxiv.org/pdf/1804.05039.pdf.
[111] Lvh, Don't expose the Docker socket (not even to a container), 2015. [Online]. Available: https://www.lvh.io/posts/dont-expose-the-docker-socket-not-even-to-a-container.html (visited on 05/15/2018).
[112] Docker Inc., Use overlay networks | Docker Documentation, 2018. [Online]. Available: https://docs.docker.com/network/overlay/#customize-the-docker_gwbridge-interface (visited on 05/15/2018).
[113] J. W. Davis and M. A. Keck, "A two-stage template approach to person detection in thermal imagery," Proc. Workshop on Applications of Computer Vision, 2005. [Online]. Available: http://vcipl-okstate.org/pbvs/bench/papers/wacv05.pdf.


[114] J. W. Davis and V. Sharma, "Background-subtraction using contour-based fusion of thermal and visible imagery," Computer Vision and Image Understanding, vol. 106, no. 2-3, pp. 162–182, 2007. DOI: 10.1016/j.cviu.2006.06.010.
[115] S. Hwang, J. Park, N. Kim, Y. Choi, and I. S. Kweon, "Multispectral pedestrian detection: Benchmark dataset and baseline," CVPR, 2015. [Online]. Available: https://sites.google.com/site/pedestrianbenchmark.
[116] Z. Wu, N. Fuller, D. Theriault, and M. Betke, "A thermal infrared video benchmark for visual analysis," IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2014. DOI: 10.1109/CVPRW.2014.39.
[117] R. Miezianko, Terravic research infrared database.
[118] R. Miezianko, Terravic research infrared database.
[119] S. Z. Li, R. Chu, S. Liao, and L. Zhang, "Illumination invariant face recognition using near-infrared images," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 4, pp. 627–639, 2007. DOI: 10.1109/TPAMI.2007.1014. [Online]. Available: http://vcipl-okstate.org/pbvs/bench/papers/NIR.pdf.
[120] A. Akula, R. Ghosh, S. Kumar, and H. K. Sardana, "Moving target detection in thermal infrared imagery using spatiotemporal information," J. Opt. Soc. Am. A, vol. 30, no. 8, pp. 1492–1501, Aug. 2013. DOI: 10.1364/JOSAA.30.001492. [Online]. Available: http://josaa.osa.org/abstract.cfm?URI=josaa-30-8-1492.
[121] R. I. Hammoud, IEEE OTCBVS WS Series Bench. [Online]. Available: http://vcipl-okstate.org/pbvs/bench (visited on 05/18/2018).
[122] Last Post Association, Mission, 2018. [Online]. Available: http://www.lastpost.be/en/the-last-post/mission (visited on 05/18/2018).
[123] FLIR Systems, Inc., FLIR One Pro, 2017. [Online]. Available: https://www.flir.com/globalassets/imported-assets/document/17-1746-oem-flir_one_pro_datasheet_final_v1_web.pdf.
[124] R. J. Ramana, Introduction to Camouflage and Deception. Defence Scientific Information & Documentation Centre, pp. 99–164.
[125] A. Bornstein and I. Richter, Microsoft Visual Object Tagging Tool. [Online]. Available: https://github.com/Microsoft/VoTT (visited on 05/20/2018).
[126] F. E. Grubbs, "Procedures for detecting outlying observations in samples," Technometrics, vol. 11, no. 1, pp. 1–21, Feb. 1969. DOI: 10.1080/00401706.1969.10490657.
[127] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A large-scale hierarchical image database," in CVPR09, 2009. [Online]. Available: http://www.image-net.org/papers/imagenet_cvpr09.pdf.


[128] D. Gupta, Transfer learning & the art of using pre-trained models in deep learning, 2017. [Online]. Available: https://www.analyticsvidhya.com/blog/2017/06/transfer-learning-the-art-of-fine-tuning-a-pre-trained-model (visited on 05/20/2018).
[129] Docker Inc., docker stats | Docker Documentation, 2018. [Online]. Available: https://docs.docker.com/engine/reference/commandline/stats (visited on 05/24/2018).
[130] M. Gori and A. Tesi, "On the problem of local minima in recurrent neural networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 14, no. 1, pp. 76–86, 1992. DOI: 10.1109/34.107014.
[131] L. Prechelt, "Early stopping - but when?" in Neural Networks: Tricks of the Trade, G. B. Orr and K.-R. Müller, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 1998, pp. 55–69. ISBN: 978-3-540-49430-0. DOI: 10.1007/3-540-49430-8_3.
[132] M. Everingham, L. Van Gool, C. K. Williams, J. Winn, and A. Zisserman, "The Pascal visual object classes (VOC) challenge," International Journal of Computer Vision, vol. 88, no. 2, pp. 303–338, 2010. ISSN: 0920-5691. DOI: 10.1007/s11263-009-0275-4.
[133] M. Everingham, S. M. A. Eslami, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, "The Pascal visual object classes challenge: A retrospective," International Journal of Computer Vision, vol. 111, no. 1, pp. 98–136, 2014. ISSN: 1573-1405. DOI: 10.1007/s11263-014-0733-5.
[134] P. Henderson and V. Ferrari, "End-to-end training of object class detectors for mean average precision," Lecture Notes in Computer Science, vol. 10115 LNCS, pp. 198–213, 2017. ISSN: 1611-3349. DOI: 10.1007/978-3-319-54193-8_13. arXiv: 1607.03476.
[135] T. Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, "Microsoft COCO: Common objects in context," Lecture Notes in Computer Science, vol. 8693 LNCS, no. PART 5, pp. 740–755, 2014. ISSN: 1611-3349. DOI: 10.1007/978-3-319-10602-1_48. arXiv: 1405.0312.
[136] Docker Inc., Library/docker, 2018. [Online]. Available: https://hub.docker.com/_/docker (visited on 06/01/2018).
[137] Nvidia, nvidia-docker. [Online]. Available: https://github.com/NVIDIA/nvidia-docker (visited on 05/25/2018).
[138] FLIR, "FLIR One." [Online]. Available: http://www.flir.com/uploadedFiles/Store/Products/FLIR-ONE/3rd-Gen/FLIR-ONE/FLIR-ONE-Gen-3-Datasheet.pdf.
[139] FLIR, "FLIR Boson," p. 2, 2016.


Appendix A

Firefighting department email conversations

This appendix contains the email conversations with different firefighting departments in Belgium, as part of an exploration of the functional requirements of an aerial thermal imaging solution. Note that all conversations were translated from Dutch to English.

A1 General email sent to Firefighting departments

This email was sent to the departments mentioned later in this appendix. The responses in the following sections are replies to this email.

Subject: Firefighting department - Thesis thermal drones

Dear Sir/Madam

My name is Brecht Verhoeve. I am a student Master of Science in Computer Science Engineering at Ghent University. I am contacting your department with reference to the research of my master's dissertation. I am currently researching the applications of thermal cameras in combination with commercial drones. They can create an aerial overview of scenes and objects that often can't be spotted with visual detectors, such as hidden persons, fires or hot explosives. The eventual goal is to let a computer indicate these objects of interest autonomously on the thermal images of the drone. These images could aid firefighters with their work.

For this research I have some questions for you.

Functionality

I have listed some functionalities which I believe could be interesting for a firefighter:

• Detection of persons in buildings (find potential victims)

• Detection of hidden fires in buildings (to identify danger zones)

• Detection of fires on vast terrains (forests, industrial terrains)


• Indication of hot explosives

I have two questions:

• Do you agree that these are the most important functions?

• Are there any other functions that you deem important?

Quality of the application

Next to the functionality, the quality of the application is also important. For me the most important aspects are:

• Accuracy: The software must be accurate. There is no room for errors when detecting.

• Speed: The software must operate quickly. An overview must be created quickly, to not waste time in case of an emergency.

• Usability: The software must be easy to use.

Once again, I have two questions:

• Do you agree with these qualities?

• Are there any other important qualities that you deem important?

I would like to thank you in advance for your time.

Best regards

Brecht Verhoeve

A2 Conversation with Firefighting department of Antwerp Belgium

The answers were given inline. For clarity, they are repeated explicitly here.

Subject: Re: Firefighting department Antwerp - Thesis thermal drones

Answers can be found in your email.

Best regards

Functionality: Detection of hidden fires in buildings and environments. Are there any other functions that you deem important? Capture the evolution of a fire with the thermal camera. Visualise incidents during night-time. Capture invisible fires, such as hydrogen or methane fires.


A3 Conversation with Firefighting department of Ostend Belgium

The answers were given inline. For clarity, they are repeated explicitly here.

Subject: Re: Firefighting department Ostend - Thesis thermal drones

Dear Brecht

You can find the answers after the questions in your email.

Best Regards

Functionality: Are there any other functions that you deem important? These are the most important for us at the moment.

Quality of the application: Are there any other important qualities that you deem important? The application must work autonomously.

A4 Conversation with Firefighting department of Courtrai Belgium

Subject: Re: Firefighting department Courtrai - Thesis thermal drones

Dear Brecht

Below you will find our answers (in addition to the already mentioned items).

Functionality

• The detection of persons in a landscape. For example missing persons: after a traffic accident there are searches in the dark for victims that were catapulted from a vehicle. Today this is done via a thermal camera on the ground, but with a drone this could hold potential benefits. Another example is searching for missing persons in nature reserves. The police sometimes ask for the assistance of firefighters to search the area.

Quality of the application

• The images need to be processed in real time, not after the drone has landed.

• The drones must be deployable for multiple purposes.

The interpretation of the images can be important in the future for automatic flight control of drones. Currently there is a European project, "3D Safeguard", in which KU Leuven is participating. They are already quite advanced in interpreting the images from a drone to spot persons through smoke. With this information the drone can be redirected. The application could thus use the interpretations of the images to control the drone in flight.

Best regards

A5 Conversation with Firefighting department of Ghent Belgium

Subject: Re: Firefighting department Ghent - Thesis thermal drones


Hi Brecht

I don't know if you've received the previous email, but there you received answers to your questions.

Best regards

Subject: Re: Firefighting department Ghent - Thesis thermal drones

With respect to the functionality I would like to add:

• Measuring the temperature of containers, silos

I agree with the quality of the application. It could be handy to be able to view the application from one or more devices. Everything should have a clear overview. If possible, information and controls should be available on one screen.

I will follow up.

Best regards


Appendix B

Thermal camera specifications

This appendix gives all the specifications for the compared thermal cameras. First, the different cameras, their producing companies and average retail prices are listed in Table B1. Second, their respective physical specifications are presented in Table B2. Third, the image qualities are presented in Table B3. Fourth, the thermal precisions are presented in Table B4. Fifth, the available interfaces to interact with each camera are presented in Table B5. Sixth, the energy consumption of each camera is presented in Table B6. Seventh, how support is offered when developing for these platforms is presented in Table B7. Finally, auxiliary features are presented in Table B8.


Product Company Price (Euro)

Wiris 2nd Gen 640 Workswell 999500

Wiris 2nd Gen 336 Workswell 699500

Duo Pro R 640 FLIR 640900

Duo Pro R 336 FLIR 438484

Duo FLIR 94999

Duo R FLIR 123999

Vue 640 FLIR 268900

Vue 336 FLIR 125993

Vue Pro 640 FLIR 403218

Vue Pro 336 FLIR 230261

Vue Pro R 640 FLIR 518456

Vue Pro R 336 FLIR 345599

Zenmuse XT 640 DJI x FLIR 1181000

Zenmuse XT 336 DJI x FLIR 697000

Zenmuse XT 336 R DJI x FLIR 939000

Zenmuse XT 640 R DJI x FLIR 1423000

One FLIR 23799

One Pro FLIR 46900

Tau 2 640 FLIR 674636

Tau 2 336 FLIR 493389

Tau 2 324 FLIR 2640

Lepton 3 160 x 120 FLIR 25995

Lepton 3 80 x 60 FLIR 14338

Boson 640 FLIR 122209

Boson 320 FLIR 93842

Quark 2 640 FLIR 33165

Quark 2 336 FLIR 33165

DroneThermal v3 Flytron 34115

Compact Seek Thermal 27500

CompactXR Seek Thermal 28646

Compact Pro Seek Thermal 59900

Therm-App Opgal 93731

Therm-App TH Opgal 295000

Therm-App 25 Hz Opgal 199000

Table B1 Compared cameras their producing companies and their average retail price


Product Weight (g) Dimensions (mm)

Wiris 2nd Gen 640 390 135 x 77 x 69

Wiris 2nd Gen 336 390 135 x 77 x 69

Duo Pro R 640 325 85 x 813 x 685

Duo Pro R 336 325 85 x 813 x 685

Duo 84 41 x 59 x 30

Duo R 84 41 x 59 x 30

Vue 640 114 574 x 4445 x 4445

Vue 336 114 574 x 4445 x 4445

Vue Pro 640 9214 574 x 4445 x 4445

Vue Pro 336 9214 574 x 4445 x 4445

Vue Pro R 640 9214 574 x 4445 x 4445

Vue Pro R 336 9214 574 x 4445 x 4445

Zenmuse XT 640 270 103 x 74 x 102

Zenmuse XT 336 270 103 x 74 x 102

Zenmuse XT 336 R 270 103 x 74 x 102

Zenmuse XT 640 R 270 103 x 74 x 102

One 345 67 x 34 x 14

One Pro 365 68 x 34 x 14

Tau 2 640 72 444 x 444 x 444

Tau 2 336 72 444 x 444 x 444

Tau 2 324 72 444 x 444 x 444

Lepton 3 160 x 120 09 118 x 127 x 72

Lepton 3 80 x 60 09 118 x 127 x 72

Boson 640 75 21 x 21 x 11

Boson 320 75 21 x 21 x 11

Quark 2 640 8 22 x 22 x 12

Quark 2 336 8 22 x 22 x 12

DroneThermal v3 3 20 x 20 x 15

Compact 1417 254 x 444 x 203

CompactXR 1417 254 x 444 x 254

Compact Pro 1417 254 x 444 x 254

Therm-App 138 55 x 65 x 40

Therm-App TH 123 55 x 65 x 40

Therm-App 25 Hz 138 55 x 65 x 40

Table B2 Physical specifications


Product IR Resolution (pixels) SD resolution (megapixels) Frequency (Hz) FOV Radiometry

Wiris 2nd Gen 640 640 x 512 192 not specified Various yes

Wiris 2nd Gen 336 336 x 256 192 not specified Various yes

Duo Pro R 640 640 x 512 12 30 Various lens yes

Duo Pro R 336 336 x 256 12 30 Various lens yes

Duo 160 x 120 2 75 and 83 57deg x 44deg no

Duo R 160 x 120 2 75 57deg x 44deg yes

Vue 640 640 x 512 0 75 Various lens no

Vue 336 336 x 256 0 75 Various lens no

Vue Pro 640 640 x 512 0 75 Various lens no

Vue Pro 336 336 x 256 0 75 Various lens no

Vue Pro R 640 640 x 512 0 75 Various lens yes

Vue Pro R 336 336 x 256 0 75 Various lens yes

Zenmuse XT 640 640 x 512 0 75 Various lens no

Zenmuse XT 336 336 x 256 0 75 Various lens no

Zenmuse XT 336 R 336 x 256 0 75 Various lens yes

Zenmuse XT 640 R 640 x 512 0 75 Various lens yes

One 80 x 60 15 87 50 deg x 38 deg yes

One Pro 160 x 120 15 87 55 deg x 43 deg yes

Tau 2 640 640 x 512 0 75 Various lens yes

Tau 2 336 336 x 256 0 75 Various lens yes

Tau 2 324 324 x 256 0 76 Various lens yes

Lepton 3 160 x 120 160 x 120 0 88 56 deg available

Lepton 3 80 x 60 80 x 60 0 88 56 deg no

Boson 640 640 x 512 0 90 Various lens no

Boson 320 320 x 256 0 90 Various lens no

Quark 2 640 640 x 512 0 9 Various lens no

Quark 2 336 336 x 256 0 9 Various lens no

DroneThermal v3 80 x 60 0 86 25 deg no

Compact 206 x 156 0 9 36 deg no

CompactXR 205 x 156 0 9 20 deg no

Compact Pro 320 x 240 0 15 32 deg no

Therm-App 384 x 288 0 87 Various lens no

Therm-App TH 384 x 288 0 87 Various lens yes

Therm-App 25 Hz 384 x 288 0 25 Various lens no

Table B3 Image quality

IR: InfraRed; SD: Standard; FOV: Field of View


Product Sensitivity (mK) Temperature range (degrees Celsius) Accuracy (Celsius)

Wiris 2nd Gen 640 50 -25 to +150 -40 to + 550 2

Wiris 2nd Gen 336 50 -25 to +150 -40 to + 550 2

Duo Pro R 640 50 -25 to + 135 -40 to + 550 5 20

Duo Pro R 336 50 -25 to + 135 -40 to + 550 5 20

Duo not specified -40 to + 550 5

Duo R not specified -40 to + 550 5

Vue 640 not specified -58 to + 113 not specified

Vue 336 not specified -58 to + 113 not specified

Vue Pro 640 not specified -58 to + 113 not specified

Vue Pro 336 not specified -58 to + 113 not specified

Vue Pro R 640 not specified -58 to + 113 not specified

Vue Pro R 336 not specified -58 to + 113 not specified

Zenmuse XT 640 50 -40 to 550 not specified

Zenmuse XT 336 50 -40 to 550 not specified

Zenmuse XT 336 R 50 -40 to 550 not specified

Zenmuse XT 640 R 50 -40 to 550 not specified

One 150 -20 to 120 3

One Pro 150 -20 to 400 3

Tau 2 640 50 -40 to 550 not specified

Tau 2 336 50 -40 to 550 not specified

Tau 2 324 50 -40 to 550 not specified

Lepton 3 160 x 120 50 0 to 450 5

Lepton 3 80 x 60 50 0 to 450 5

Boson 640 40 0 to 500 not specified

Boson 320 40 0 to 500 not specified

Quark 2 640 50 -40 to 160 not specified

Quark 2 336 50 -40 to 160 not specified

DroneThermal v3 50 0 to 120 not specified

Compact not specified -40 to 330 not specified

CompactXR not specified -40 to 330 not specified

Compact Pro 70 -40 to +330 not specified

Therm-App 70 5 to + 90 3

Therm-App TH 70 0 to 200 2

Therm-App 25 Hz 70 5 to + 90 3

Table B4 Thermal precision


Product USB MAVLink HDMI

Wiris 2nd Gen 640 Flash disk yes yes

Wiris 2nd Gen 336 Flash disk yes yes

Duo Pro R 640 Mini-USB yes micro-HDMI

Duo Pro R 336 Mini-USB yes micro-HDMI

Duo Mini-USB yes micro-HDMI

Duo R Mini-USB yes micro-HDMI

Vue 640 Mini-USB No No

Vue 336 Mini-USB no no

Vue Pro 640 Mini-USB yes Optional

Vue Pro 336 Mini-USB yes Optional

Vue Pro R 640 Mini-USB yes Optional

Vue Pro R 336 Mini-USB yes Optional

Zenmuse XT 640 Via DJI drone Via DJI drone Via DJI drone

Zenmuse XT 336 Via DJI drone Via DJI drone Via DJI drone

Zenmuse XT 336 R Via DJI drone Via DJI drone Via DJI drone

Zenmuse XT 640 R Via DJI drone Via DJI drone Via DJI drone

Tau 2 640 No no no

Tau 2 336 No no no

Tau 2 324 No no no

Lepton 3 160 x 120 No no no

Lepton 3 80 x 60 No no no

Boson 640 Yes no no

Boson 320 Yes no no

Quark 2 640 no no no

Quark 2 336 no no no

DroneThermal v3 no no no

Compact Smartphone storage no no

CompactXR Smartphone storage no no

Compact Pro Smartphone storage no no

Therm-App Smartphone storage no no

Therm-App TH Smartphone storage no no

Therm-App 25 Hz Smartphone storage no no

Table B5 Interfaces


Product Power consumption (Watt) Input Voltage

Wiris 2nd Gen 640 4 6 - 36

Wiris 2nd Gen 336 4 6 - 36

Duo Pro R 640 10 50 - 260

Duo Pro R 336 10 50 - 260

Duo 22 50 - 260

Duo R 22 50 - 260

Vue 640 12 48 - 60

Vue 336 12 48 - 60

Vue Pro 640 21 48 - 60

Vue Pro 336 21 48 - 60

Vue Pro R 640 21 48 - 60

Vue Pro R 336 21 48 - 60

Zenmuse XT 640 Via DJI drone Via DJI drone

Zenmuse XT 336 Via DJI drone Via DJI drone

Zenmuse XT 336 R Via DJI drone Via DJI drone

Zenmuse XT 640 R Via DJI drone Via DJI drone

One approx 1h battery lifetime Battery

One Pro approx 1h battery lifetime Battery

Tau 2 640 13 40 - 60

Tau 2 336 13 40 - 61

Tau 2 324 13 40 - 62

Lepton 3 160 x 120 065 31

Lepton 3 80 x 60 065 31

Boson 640 05 33

Boson 320 05 33

Quark 2 640 12 33

Quark 2 336 12 33

DroneThermal v3 015 33 - 5

Compact Via smartphone Smartphone

CompactXR Via smartphone Smartphone

Compact Pro Via smartphone Smartphone

Therm-App 05 5

Therm-App TH 05 5

Therm-App 25 Hz 05 5

Table B6 Energy consumption


Product Warranty (years) User Manual Phone support Email support FAQs

Wiris 2nd Gen 640 Not specified Yes Yes Yes Yes

Wiris 2nd Gen 336 Not specified Yes Yes Yes Yes

Duo Pro R 640 1 Yes Yes Yes Yes

Duo Pro R 336 1 Yes Yes Yes Yes

Duo 1 yes Yes Yes Yes

Duo R 1 yes yes yes yes

Vue 640 1 yes yes yes yes

Vue 336 1 yes yes yes yes

Vue Pro 640 1 yes yes yes yes

Vue Pro 336 1 yes yes yes yes

Vue Pro R 640 1 yes yes yes yes

Vue Pro R 336 1 yes yes yes yes

Zenmuse XT 640 05 yes yes yes yes

Zenmuse XT 336 05 yes yes yes yes

Zenmuse XT 336 R 05 yes yes yes yes

Zenmuse XT 640 R 05 yes yes yes yes

One 1 yes yes yes yes

One Pro 1 yes yes yes yes

Tau 2 640 1 yes yes yes yes

Tau 2 336 1 yes yes yes yes

Tau 2 324 1 yes yes yes yes

Lepton 3 160 x 120 1 yes yes yes yes

Lepton 3 80 x 60 1 yes yes yes yes

Boson 640 1 yes yes yes yes

Boson 320 1 yes yes yes yes

Quark 2 640 1 yes yes yes yes

Quark 2 336 1 yes yes yes yes

DroneThermal v3 not specified no no no no

Compact 1 yes yes yes yes

CompactXR 1 yes yes yes yes

Compact Pro 1 yes yes yes yes

Therm-App 1 yes yes yes yes

Therm-App TH 1 yes yes yes yes

Therm-App 25 Hz 1 yes yes yes yes

Table B7 Help and support


Product Bluetooth Wi-Fi GPS Mobile app Storage

Wiris 2nd Gen 640 no on request Yes no yes

Wiris 2nd Gen 336 no on request yes no yes

Duo Pro R 640 yes no yes yes yes

Duo Pro R 336 yes no yes yes yes

Duo no no no no yes

Duo R no no no no yes

Vue 640 No no no no no

Vue 336 no no no no no

Vue Pro 640 yes no no yes yes

Vue Pro 336 yes no no yes yes

Vue Pro R 640 yes no no yes yes

Vue Pro R 336 yes no no yes yes

Zenmuse XT 640 Via DJI drone Via DJI drone Via DJI drone yes yes

Zenmuse XT 336 Via DJI drone Via DJI drone Via DJI drone yes yes

Zenmuse XT 336 R Via DJI drone Via DJI drone Via DJI drone yes yes

Zenmuse XT 640 R Via DJI drone Via DJI drone Via DJI drone yes yes

One no no no yes yes

One Pro no no no yes yes

Tau 2 640 no no no no yes

Tau 2 336 no no no no yes

Tau 2 324 no no no no yes

Lepton 3 160 x 120 no no no no no

Lepton 3 80 x 60 no no no no no

Boson 640 no no no no no

Boson 320 no no no no no

Quark 2 640 no no no no no

Quark 2 336 no no no no no

DroneThermal v3 no no no no no

Compact no no no yes yes

CompactXR no no no yes yes

Compact Pro no no no yes yes

Therm-App no no no yes yes

Therm-App TH no no no yes yes

Therm-App 25 Hz no no no yes yes

Table B8 Auxiliary features


Appendix C

Last Post thermal dataset summary

The goal of this appendix is to provide a summary of the layout of the Last Post thermal dataset. The data was captured on the following days: the 24th of March 2018 and the 2nd, 3rd, 4th, 5th, 9th, 10th, 11th and 12th of April 2018. For each date a small summary of the contents is made below. Each summary consists of a description of the conditions that day and a listing of the video files and their contents.

C1 24th of March 2018

Conditions

• Hours: 19:40 - 20:20

• Outside temperature range: 5 degrees Celsius - 12 degrees Celsius

• Clear

• Humidity: 76%

• Wind: 24 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 14 kilometers

Videos

• flir_20180324T195255.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people. A large crowd gathers on the right of the video.

• flir_20180324T195836.mp4: This video gives an overview of the inside of the Meningate ceremony. Many people can be seen watching the ceremony.

• flir_20180324T200421.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people. A large crowd is visible on the right side.

• flir_20180324T201448.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people. A large crowd is visible on the right side. The crowd is leaving the area after the ceremony.

• flir_20180324T202328.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people. A large crowd is visible on the right side. The crowd is leaving the area after the ceremony.

C2 2nd of April 2018

Conditions

• Hours: 19:40 - 20:20

• Outside temperature range: 9 degrees Celsius - 15 degrees Celsius

• Light rain

• Humidity: 74%

• Wind: 18 kilometers per hour

• Precipitation: 0.4 centimeters

• Visibility: 8.1 kilometers

Videos

• 2018-04-02194733.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people, sometimes with umbrellas, passing through.

• 2018-04-02194952.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance.

• 2018-04-02195518.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance.

• 2018-04-02201322.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance. Crowds can be seen, as well as people holding umbrellas.

C3 3rd of April 2018

Conditions

• Hours: 20:00 - 20:30

• Outside temperature range: 8 degrees Celsius - 16 degrees Celsius

• Heavy rain

• Humidity: 79%

• Wind: 25 kilometers per hour

• Precipitation: 0.5 centimeters

• Visibility: 10.1 kilometers

Videos

• 2018-04-03 201227.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. People are holding umbrellas due to heavy rain on that day. Due to rain conditions and wind it was difficult to steady the camera, which can be seen in the shaky video.

• 2018-04-03 201727.mp4: In the beginning of the clip the camera is moving towards the other side of the Meningate. From 00:20 onwards the clip is useful. The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. The video shows people leaving from the Meningate towards the buses at the other side of the bridge. Most people are holding umbrellas due to heavy rain that day. The Meningate is in the bottom left of the picture. Several buildings can be seen in the distance. In the bottom right the water of the Kasteelgracht can be seen. Sometimes, in the left of the picture, the wall of the Meningate can be seen.

• 2018-04-03 202311.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. People are holding umbrellas due to heavy rain on that day. Due to rain conditions and wind it was difficult to steady the camera, which can be seen in the shaky video.


C4 4th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 10 degrees Celsius - 14 degrees Celsius

• Cloudy

• Humidity: 87%

• Wind: 18 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 12.9 kilometers

Videos

• 2018-04-04 200052.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. In the video a dense crowd of people slowly starts to form in the bottom right of the picture. Most people are coming from the left of the video to join the crowd.

• 2018-04-04 200728.mp4: This video shows the inside of the Meningate and the ceremony of the Last Post. Some people are up close in front. The large crowd can be seen through the hall.

• 2018-04-04 200913.mp4: This video shows the inside of the Meningate and the ceremony of the Last Post. The video switches between MSX mode, visual camera and thermal camera to show the differences.

• 2018-04-04 202859.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. At the start of the video a crowd is seen in the bottom right. At the 01:00 mark the ceremony has ended and people are exiting the gate and coming onto the crossing. They form two rows to make place for the marching band exiting the Meningate. It can be seen marching through the crowd at the 02:50 mark.

C5 5th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 11 degrees Celsius - 15 degrees Celsius

• Sunny

• Humidity: 77%

• Wind: 11 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 12.9 kilometers

Videos

• 2018-04-05 200217.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. In the video a dense crowd of people slowly starts to form in the bottom right of the picture. Most people are coming from the left of the video to join the crowd. The video shows the 15 minutes before the start of the ceremony.

• 2018-04-04 201838.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. The video shows the first ten minutes after the end of the ceremony. The crowd, which can be seen on the left, leaves towards the square.

C6 9th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 9 degrees Celsius - 10 degrees Celsius

• Light rain

• Humidity: 99%

• Wind: 14 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 8.1 kilometers

Videos

• 2018-04-09 200007.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. People are coming from the left towards the Meningate in the right. Not a lot of people are seen due to rain that day.


• 2018-04-09-202302.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. People are leaving from the right of the Meningate towards the square.

C7 10th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 14 degrees Celsius - 17 degrees Celsius

• Partly cloudy

• Humidity: 52%

• Wind: 13 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 12.9 kilometers

Videos

• 2018-04-10 195029.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. In the video an army unit can be seen on the left. It can be seen that they are standing in a very structured way.

• 2018-04-10 195131.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. In the video an army unit can be seen on the left. It can be seen that they are standing in a very structured way.

• 2018-04-10 195748.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. In the video an army unit can be seen on the left. It can be seen that they are standing in a very structured way. Some people are moving around the crowd.

• 2018-04-10 200122.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. There is a big crowd that can be seen on the right. There are some schools there, so some people are wearing backpacks. It is quite warm and the cafe on the other side of the street has opened up its terrace.

• 2018-04-10 201427.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. In the video an army unit can be seen on the left. It can be seen that they are standing in a very structured way. Some people are moving around the crowd. The image is not rotated well; a well rotated image is found in 2018-04-10 201427_rotated.mp4.

• 2018-04-10 201515.mp4: This video shows the inside of the Meningate and the ceremony. A traditional 'Haka' from New Zealand soldiers can be heard in the video; the soldiers are difficult to spot due to thermal blurring, because many people are standing in one place.

• 2018-04-10 202558.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. In the video an army unit can be seen on the left. It can be seen that they are standing in a very structured way. Some people are moving around the crowd. At the 02:00 mark the army unit marches to the end of the bridge. Very dense crowds can be seen afterwards. At 08:25 the army unit marches in a straight line towards the Meningate.

C8 11th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 12 degrees Celsius - 16 degrees Celsius

• Sunny

• Humidity: 63%

• Wind: 14 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 12.9 kilometers

Videos

• 2018-04-11 200140.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen.

• 2018-04-11 200601.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. A small crowd can be seen on the left of the video.

• 2018-04-11 201554.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. People start leaving the ceremony from the 01:20 mark.


C9 12th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 11 degrees Celsius - 14 degrees Celsius

• Rain

• Humidity: 94%

• Wind: 8 kilometers per hour

• Precipitation: 0.1 centimeters

• Visibility: 3.2 kilometers

Videos

• 2018-04-12 195219.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. Sometimes the hedge of the wall where the film was made is visible, due to the shaky camera. Not many people are seen due to the rain.

• 2018-04-12 201526.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. Sometimes the hedge of the wall where the film was made is visible, due to the shaky camera. Not many people are seen due to the rain. People are leaving towards the right.


Streams platform in January 2018 as a new service for the Amazon Web Services (AWS) cloud platform. It allows users to stream live video from devices to the AWS cloud and build applications for real-time video processing [17]. The VIPER project by EAVISE, KU Leuven researched how thermal and visual video images could be used for real-time detection of persons using object detection algorithms such as deep learning [18]. The framework presented in this work combines elements from all three of these examples.

III REQUIREMENTS AND SOFTWARE ARCHITECTURE

A Functional requirements

Three general actors are identified for the framework: end-users that want to build an image processing application for a specific use case, camera developers integrating new cameras into the framework, and analysis/detection module developers integrating new analysis/detection algorithms into the framework, so that end-users can use them to build their applications. An end-user should be able to construct an image processing application using the integrated cameras and analysis modules. He should be able to adapt this application with the framework for new use cases. Camera and analysis module developers should be able to add and distribute new integrations for new cameras and analysis modules to the framework. This allows the end-users to focus on the use case, not the technical details of the hardware platforms or algorithms, and to have a wider selection of hardware and algorithms.

B Non-functional requirements

Interoperability, modifiability and performance are identified as the architecturally significant requirements. Interoperability specifies that the framework should be able to interact with various cameras and analysis software modules via interfaces. The number of systems the framework can successfully interact with adds to the business value of the framework, as end-users can use more devices via the framework to build applications. The framework needs to be extendable with new thermal cameras and analysis modules. Applications built with the framework should be modifiable to integrate new hardware and software. The available hardware on-site for use cases such as forest fire monitoring is not always powerful enough to support heavy image analysis software. The framework should therefore be able to deploy in a distributed fashion, to allow more computationally expensive operations to be executed on more powerful remote devices. Some use cases require real-time streaming of video and manipulation of these video streams, which should be supported for the framework to be relevant.

C Software architecture

An architectural pattern analysis based on the requirements presented in Section III-B was conducted, from which a combination of the microservices and microkernel patterns was selected as the base pattern for the software architecture. The microkernel pattern enables the framework to be extended via a plugin system. The microservices pattern enables interoperability through the use of standard interfaces and a communication scheme. It also allows for the framework to be deployed in a distributed fashion [19–21]. The software architecture is documented in static views, sequence diagrams and deployment views. Figure 1 presents an overview of the architecture.

Fig. 1. Component-connector overview of the framework. The clear components are the core components of the framework that each user needs to install to use the framework. The colored components are used for the distribution of plugins.

End-users interact with the framework via the Client Interface, a graphical or textual interface. Commands are forwarded to the Stream module that manages the layout and state of the components in the media pipeline (stream) that the end-user builds for his use case. To activate and place the components in a certain layout, the Stream module connects with the Producer and Consumer components that control the plugins that process the media. Producer Plugins are devices that produce media, such as thermal cameras. Consumer Plugins process and consume media, such as analysis software and displays. Once a stream is established, the plugins forward media to each other in the layout specified by the Stream module. New support for cameras and analysis modules can be added as plugins to the Producer/Consumer Distribution components that distribute this software, so that end-users can download and install the plugins. Each module in the architecture is a microservice, allowing for a distributed deployment and enabling the modifiability requirements through interchangeable modules. Cameras and analysis modules are realized as plugins for the Producer/Consumer modules, implemented as a microkernel. This allows the framework to easily build new streams for different use cases and interchange components when needed.
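To make this flow concrete, the sketch below shows how a client could drive such a REST-based Stream module to compose and start a stream. It is a minimal illustration under assumed names: the base URL, route paths and payload fields are hypothetical, as the exact API of the prototype is not fixed at this level of detail.

import requests

STREAM_API = "http://localhost:5000"  # hypothetical address of the Stream microservice

# Create a stream for a use case (hypothetical endpoint and payload).
requests.post(f"{STREAM_API}/streams", json={"name": "crowd-monitoring"})

# Register a producer (thermal camera) and a consumer (display) plugin.
requests.post(f"{STREAM_API}/streams/crowd-monitoring/plugins",
              json={"plugin": "filecam", "role": "producer"})
requests.post(f"{STREAM_API}/streams/crowd-monitoring/plugins",
              json={"plugin": "display", "role": "consumer"})

# Link producer to consumer and start the media flow; the Stream module
# translates this into sources/listeners/state calls on the plugins.
requests.put(f"{STREAM_API}/streams/crowd-monitoring/links",
             json={"source": "filecam", "listener": "display"})
requests.put(f"{STREAM_API}/streams/crowd-monitoring/state", json={"state": "PLAY"})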

C1 Plugin model

Figure 2 depicts the model of a general framework plugin. The plugin defines three interfaces: a source media endpoint to receive media from different sources, a listener endpoint to forward the processed media to other listening plugins, and an API for framework control. The framework uses the API to change which sources and listeners a plugin has, and its state. By linking plugins together by setting the sources and listeners resources, the framework can build a media processing stream. Producer Plugins have no sources, since they produce media. The states are used to stop and start the media processing of the plugins in the stream. The REST paradigm is selected to build this API, with state, sources and listeners resources that need to be minimally implemented. A plugin has the following states: INACTIVE, PLAY, STOP and PAUSE. Figure 3 depicts the state transition diagram. A plugin implements the visible states STOP, PAUSE and PLAY, describing whether the media process of the plugin is stopped, paused or processing respectively. The INACTIVE state is only visible to the framework, as it indicates that there is no active process that runs the plugin. This is the initial state of a plugin in the framework. When a plugin is added to a stream, the plugin microservice is started, transitions to the STOP state and waits for commands.

Fig. 2. Schematic overview of a plugin

Fig. 3. State transition diagram of a plugin
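A minimal sketch of such a plugin control API in Flask (the web framework used for the prototype REST APIs, see Section IV) is given below. The state, sources and listeners resources follow the text above; the port, payload shapes and the permissive transition check are assumptions, since the legal transitions of Fig. 3 are not reproduced here.

from flask import Flask, jsonify, request

app = Flask(__name__)

# INACTIVE is only visible to the framework; the plugin process itself
# starts in STOP and exposes the three visible states.
VISIBLE_STATES = {"STOP", "PAUSE", "PLAY"}
plugin = {"state": "STOP", "sources": [], "listeners": []}

@app.route("/state", methods=["GET", "PUT"])
def state():
    if request.method == "PUT":
        new_state = request.get_json()["state"]
        if new_state not in VISIBLE_STATES:
            return jsonify(error="unknown state"), 400
        # The exact legal transitions follow Fig. 3; for brevity every
        # visible-state transition is accepted in this sketch.
        plugin["state"] = new_state
    return jsonify(state=plugin["state"])

@app.route("/sources", methods=["GET", "PUT"])
def sources():
    if request.method == "PUT":
        plugin["sources"] = request.get_json()["sources"]
    return jsonify(sources=plugin["sources"])

@app.route("/listeners", methods=["GET", "PUT"])
def listeners():
    if request.method == "PUT":
        plugin["listeners"] = request.get_json()["listeners"]
    return jsonify(listeners=plugin["listeners"])

if __name__ == "__main__":
    app.run(port=8080)  # assumed port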

C2 Network topology and communication protocol

The microservices of the framework and the plugins need a communication protocol to exchange commands and video. For sending the commands, the HTTP/TCP protocol is used: a synchronous protocol that blocks on the response of a request. This ensures that commands are always received and acknowledged, increasing the reliability of the communication at the cost of performance [22]. The asynchronous RTP/UDP protocol is selected to transfer media between the plugins, to ensure low latency video transfer and to enable real-time video streams. The recommended codec for transmitting video media is MJPEG, which transmits video frames as separately encoded JPEG images [23]. Other codecs such as MPEG-4 encode only certain frames as keyframes, and the other frames as B-frames that encode differences from the keyframe [24]. This implies that when receiving images from a stream, a keyframe must first be received before the video can be decoded. Using MJPEG, plugins receiving frames can directly perform analysis on each frame and do not have to wait for additional frames.
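The practical benefit of MJPEG can be sketched with a tiny receiving plugin that decodes every datagram payload as a standalone JPEG. This is a simplified sketch under stated assumptions: a fixed 12-byte RTP header, one complete JPEG frame per datagram (real RTP/MJPEG, RFC 2435, fragments larger frames over several packets) and a made-up port.

import socket

import cv2
import numpy as np

RTP_HEADER_LEN = 12  # fixed part of the RTP header (RFC 3550)

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", 5004))  # assumed RTP port

while True:
    packet, _ = sock.recvfrom(65536)
    jpeg = np.frombuffer(packet[RTP_HEADER_LEN:], dtype=np.uint8)
    # Every MJPEG frame is a self-contained JPEG: no previously received
    # keyframe is needed, so analysis can start on any received frame.
    frame = cv2.imdecode(jpeg, cv2.IMREAD_COLOR)
    if frame is None:
        continue  # fragmented or corrupt payload; skip it
    cv2.imshow("plugin input", frame)
    if cv2.waitKey(1) == 27:  # stop on Esc
        break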

Fig. 4. Network topology. The full lines represent HTTP/TCP communications, the dashed lines RTP/UDP communications.

IV PROTOTYPE IMPLEMENTATION

The goal of the prototype implementation is to provide a proof of the concept architecture and meet the requirements presented in Section III. The core framework components are implemented; the distribution components are left out of scope, as they focus primarily on the distribution of supported plugins. The core microservices as well as the plugins are implemented using the Docker software containerization framework [25]. Containers virtualize on the operating system and allow for portable, lightweight software environments for processes, with a minor performance overhead. Using this technology, the core modules and plugins can be deployed in a local and distributed fashion, and be set up and shut down easily [26]. The microservice containers communicate via the protocols presented in Section III-C2. The REST APIs are built with the Flask framework [27], a lightweight Python web development framework ideal for prototyping. The Producer/Consumer components need access to the Docker daemon running on the Docker host in order to spin up and shut down the Docker containers containing the plugin processes. This is achieved by mounting the Docker client socket in the Producer/Consumer containers. This gives the container root access to the host, a significant security threat [28, 29]. Two sample plugins were implemented: Filecam, a plugin that produces video read in from a file, and Display, a plugin that forwards media to the display of the local device. The plugins transmit media using the video streaming framework GStreamer [30].
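How a Producer/Consumer microservice can use the mounted Docker socket is sketched below with the Python Docker SDK (docker-py); the SDK choice, image name and network name are assumptions for illustration, not the prototype's documented mechanism.

import docker

# docker.from_env() reaches the daemon through /var/run/docker.sock,
# which is mounted into the Producer/Consumer container, e.g.:
#   docker run -v /var/run/docker.sock:/var/run/docker.sock ...
client = docker.from_env()

# Spin up a plugin container when a plugin is added to a stream
# (image and network names are hypothetical).
container = client.containers.run(
    image="framework/filecam-plugin:latest",
    detach=True,
    network="framework-bridge",
    name="filecam-1",
)

# Deactivating a plugin (the STOP to INACTIVE transition) stops and
# removes the container, which is why that operation is comparatively slow.
container.stop()
container.remove()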

V MOB DETECTION

A Dataset

Several publicly available datasets for thermal images exist [31–34]. None of these include large crowds of people, so a new dataset, called the Last Post dataset, was created. It consists of thermal video captured at the Last Post ceremony in Ypres, Belgium [35]. The videos were captured using the FLIR One Pro thermal camera for Android [36], using the Iron colorscheme. Two main scenes are present in the dataset, depicted in Figure 5. Mobs are present in the thermal images but not in the visual images, due to the images being made on separate days. The images used for the experiment were manually annotated, outliers were removed and the dataset was randomly split in a training and a validation set.

Fig. 5. Last Post dataset main scenes: (a) thermal view of the square, (b) visual view of the square, (c) thermal view of the bridge, (d) visual view of the bridge.

B Model

Detecting and classifying objects of interest in images is known as the object detection problem in machine learning [37]. Several object detection algorithms and frameworks have been implemented in the past years. A distinction is made between traditional models [31, 38–40], deep learning two-stage networks [41–46] and deep learning dense networks [47–49]. The traditional and two-stage methods make predictions relatively slowly (order of seconds on GPU) when compared to the dense networks (order of milliseconds on GPU) [47]. Since the goal is to use the framework in real-time use cases, the latter is preferred. The YOLOv3 model is selected, as it achieves state of the art prediction performance, can make real-time predictions and is available via the open source neural network framework darknet [50, 51]. The model is pre-trained on the ImageNet dataset [52]. The model is trained on an NVIDIA GeForce GTX 980 GPU and optimizes the SSE loss using batch gradient descent [50]. To select the best weights, the average Intersection over Union (IoU) and mean Average Precision (mAP) [53] are calculated on predictions on the validation set. The weights that achieve the highest mAP are selected as the final weights.
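For reference, the IoU used in this selection is the standard ratio of intersection to union of two bounding boxes. Below is a plain-Python sketch of IoU plus a hypothetical checkpoint-selection helper; evaluate_map stands in for running the darknet validation on one weights file and is not a real darknet call.

def iou(box_a, box_b):
    # Boxes are (x1, y1, x2, y2); returns intersection area / union area.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def select_best_weights(checkpoints, evaluate_map):
    # Pick the checkpoint whose validation mAP is highest; evaluate_map
    # is a hypothetical stand-in for darknet's validation run.
    return max(checkpoints, key=evaluate_map)

if __name__ == "__main__":
    # A box overlapping half of another of equal size scores IoU = 1/3.
    print(iou((0, 0, 2, 2), (1, 0, 3, 2)))  # 0.333...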

VI RESULTS

A Framework

To evaluate the framework, acceptance tests for the requirements from Section III were conducted. Common framework operations, such as manipulating and building a stream, have an average execution time of 0.84 seconds, with a standard deviation of 0.37 seconds. Less common operations, such as deactivating a plugin, starting up the framework and shutting down the framework, have average execution times of 3.58, 8.40 and 24.02 seconds respectively, with standard deviations of 4.67, 0.50 and 0.48 respectively. Deactivating plugins (STOP to INACTIVE transitions) takes a long time, as the container running the process needs to be removed. Real-time streaming could not be tested directly, due to the GStreamer framework having no readily available testing endpoints. However, when streaming and displaying a video with the framework, human users could not differentiate between a streaming video and a video played using a native media player, making it plausible that the framework streams in real-time. Great care must be taken when building plugins, as their processing speed has a direct impact on the real-time streaming performance. Interoperability is achieved with the REST APIs and plugin model presented in Section III-C. The interoperability is tested by having the framework exchange information with a mock plugin implementing the specified interface and counting the number of correct exchanges. The average successful exchange ratio is 99.998%. The framework can install and detect new plugins at runtime, achieving runtime modifiability at plugin level. Different deployment schemes were not tested for the prototype.

B Mob detection

The weights generated at the 15700th training epoch achieved the highest mAP value, 90.52%, on the validation set. For comparison, the performance of other models on benchmark datasets achieves an average mAP of 74.8% [54]. The reason the model is achieving such high values is that the validation set is temporally correlated with the training set, as both sets are extracted from videos in which frames have a temporal correlation. Performance when predicting on new datasets will be worse. Figure 6 depicts some predictions of the model. When predicting on a video, the model generated predictions at an average frame rate of 55 frames per second on a GPU.

Fig. 6. Model predictions on the validation set.

VII CONCLUSION AND FUTURE WORK

In this dissertation, a modifiable drone thermal imaging analysis framework is proposed that allows end-users to build flexible video processing pipelines using different thermal cameras and analysis modules. The framework implements a microservice container plugin architecture. Users can build image processing applications with the framework by building media streams using plugins that are either thermal cameras or image analysis software modules. The framework is evaluated by building a proof of concept implementation, which is tested against the initial requirements. The proposed framework achieves the modifiability and interoperability requirements at the cost of performance and security. The framework is applied for detecting large crowds of people (mobs) during open-air events. A new dataset containing thermal images of such mobs is presented, on which a YOLOv3 neural network is trained. The trained model is able to detect mobs in new thermal images in real-time, achieving frame rates of 55 frames per second when deployed on a modern GPU. Possible extensions to this research are deploying a detection model using the framework, testing the other deployment configurations, testing the framework with end-users in practice, and building new object detection models specifically for thermal images.

REFERENCES

[1] R. Gade and T. B. Moeslund, "Thermal cameras and applications: a survey," Machine Vision and Applications, vol. 25, pp. 245–262, 2014.

[2] M. C. Harvey, J. V. Rowland and K. M. Luketina, "Drone with thermal infrared camera provides high resolution georeferenced imagery of the Waikite geothermal area, New Zealand," 2016.

[3] S. Amici, M. Turci, S. Giammanco, L. Spampinato and F. Giulietti, "UAV Thermal Infrared Remote Sensing of an Italian Mud Volcano," vol. 2, pp. 358–364, 2013.

[4] J. Bendig, A. Bolten and G. Bareth, "Introducing a low-cost mini-UAV for thermal- and multispectral-imaging," 2012.

[5] A. J. Rivera, A. D. Villalobos, J. C. Monje, J. A. Marinas and C. M. Oppus, "Post-disaster rescue facility: Human detection and geolocation using aerial drones," IEEE Region 10 Annual International Conference Proceedings/TENCON, pp. 384–386, 2017.

[6] P. Christiansen, K. A. Steen, R. N. Jørgensen and H. Karstoft, "Automated detection and recognition of wildlife using thermal cameras," Sensors (Basel, Switzerland), vol. 14, pp. 13778–93, Jul. 2014.

[7] S. Chowdhury, A. Emelogu, M. Marufuzzaman, S. G. Nurre and L. Bian, "Drones for disaster response and relief operations: A continuous approximation model," 2017.

[8] Workswell, "Pipeline inspection with thermal diagnostics," 2016.

[9] DJI, "Zenmuse H3 - 2D."

[10] Workswell, "Applications of WIRIS - Thermal vision system for drones."

[11] Therm-App, "Therm-App - Android-apps op Google Play," 2018.

[12] B. Satzger, W. Hummer, C. Inzinger, P. Leitner and S. Dustdar, "Winds of change: From vendor lock-in to the meta cloud," IEEE Internet Computing, vol. 17, no. 1, pp. 69–73, 2013.

[13] J. Divya, "Drone Technology and Usage: Current Uses and Future Drone Technology," 2017.

[14] B. Steffen and A. Seyfried, "Methods for measuring pedestrian density, flow, speed and direction with minimal scatter," Physica A: Statistical Mechanics and its Applications, vol. 389, pp. 1902–1910, May 2010.

[15] M. Wirz, T. Franke, D. Roggen, E. Mitleton-Kelly, P. Lukowicz and G. Tröster, "Inferring crowd conditions from pedestrians' location traces for real-time crowd monitoring during city-scale mass gatherings," Proceedings of the Workshop on Enabling Technologies: Infrastructure for Collaborative Enterprises, WETICE, pp. 367–372, 2012.

[16] L.-L. Slattery, "DroneSAR wants to turn drones into search-and-rescue heroes," 2017.

[17] Amazon Web Services Inc., "What Is Amazon Kinesis Video Streams?," 2018.

[18] T. Goedemé, "Projectresultaten VLAIO TETRA-project," tech. rep., KU Leuven, Louvain, 2017.

[19] L. Bass, P. Clements and R. Kazman, Software Architecture in Practice. Addison-Wesley Professional, 3rd ed., 2012.

[20] M. Richards, Software Architecture Patterns. O'Reilly Media, first edition, 2015.

[21] C. Richardson, "Microservice Architecture pattern," 2017.

[22] C. De La Torre, C. Maddock, J. Hampton, P. Kulikov and M. Jones, "Communication in a microservice architecture," 2017.

[23] On-Net Surveillance Systems Inc., "MJPEG vs MPEG4: Understanding the differences, advantages and disadvantages of each compression technique," 2006.

[24] D. Bull, Communicating Pictures: A Course in Image and Video Coding. Elsevier Science, 2014.

[25] Docker Inc., "Docker - Build, Ship and Run Any App, Anywhere," 2018.

[26] D. Merkel, "Docker: Lightweight Linux Containers for Consistent Development and Deployment," 2014.

[27] A. Ronacher, "Welcome to Flask - Flask Documentation (0.12)," 2017.

[28] Lvh, "Don't expose the Docker socket (not even to a container)," 2015.

[29] R. Yasrab, "Mitigating Docker Security Issues," tech. rep., University of Science and Technology of China, Hefei.

[30] GStreamer, "GStreamer: open source multimedia framework," 2018.

[31] J. W. Davis and M. A. Keck, "A Two-Stage Template Approach to Person Detection in Thermal Imagery," Proc. Workshop on Applications of Computer Vision, 2005.

[32] S. Hwang, J. Park, N. Kim, Y. Choi and I. S. Kweon, "Multispectral Pedestrian Detection: Benchmark Dataset and Baseline," CVPR, 2015.

[33] Z. Wu, N. Fuller, D. Theriault and M. Betke, "A Thermal Infrared Video Benchmark for Visual Analysis," IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2014.

[34] S. Z. Li, R. Chu, S. Liao and L. Zhang, "Illumination Invariant Face Recognition Using Near-Infrared Images," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 4, pp. 627–639, 2007.

[35] Last Post Association, "Mission," 2018.

[36] FLIR, "FLIR One Pro."

[37] E. Alpaydin, Introduction to Machine Learning. MIT Press, 3rd ed., 2014.

[38] F. X. F. Xu, X. L. X. Liu and K. Fujimura, "Pedestrian detection and tracking with night vision," IEEE Transactions on Intelligent Transportation Systems, vol. 6, no. 1, pp. 63–71, 2005.

[39] H. Nanda and L. Davis, "Probabilistic template based pedestrian detection in infrared videos," IEEE Intelligent Vehicles Symposium, Proceedings, vol. 1, pp. 15–20, 2003.

[40] R. Appel, S. Belongie, P. Perona and P. Dollár, "Fast Feature Pyramids for Object Detection," PAMI, vol. 36, no. 8, pp. 1–14, 2014.

[41] J. R. R. Uijlings, K. E. A. Van De Sande, T. Gevers and A. W. M. Smeulders, "Selective Search for Object Recognition," tech. rep., 2012.
[41] J R R Uijlings K E A Van De Sande T Gevers and A W M Smeul-ders ldquoSelective Search for Object Recognitionrdquo tech rep 2012

[42] R Girshick J Donahue T Darrell and J Malik ldquoRegion-Based Convolu-tional Networks for Accurate Object Detection and Segmentationrdquo IEEETransactions on Pattern Analysis and Machine Intelligence vol 38 no 1pp 142ndash158 2014

[43] R Girshick ldquoFast R-CNNrdquo in Proceedings of the IEEE InternationalConference on Computer Vision vol 2015 Inter pp 1440ndash1448 2015

[44] S Ren K He R Girshick and J Sun ldquoFaster R-CNN Towards Real-Time Object Detection with Region Proposal Networksrdquo IEEE Trans-actions on Pattern Analysis and Machine Intelligence vol 39 no 6pp 1137ndash1149 2016

[45] K He Gkioxari P Dollar and R Girshick ldquoMask R-CNNrdquo arXiv 2018[46] J Dai Y Li K He and J Sun ldquoR-FCN Object Detection via Region-

based Fully Convolutional Networksrdquo tech rep 2016[47] J Redmon S Divvala R Girshick and A Farhadi ldquoYou Only Look

Once Unified Real-Time Object Detectionrdquo 2015[48] W Liu D Anguelov D Erhan C Szegedy S Reed C-y Fu and A C

Berg ldquoSSD Single Shot MultiBox Detectorrdquo arXiv 2016[49] T-y Lin P Goyal R Girshick K He and P Dollar ldquoFocal Loss for

Dense Object Detectionrdquo arXiv 2018[50] J Redmon and A Farhadi ldquoYOLOv3 An Incremental Improvementrdquo

axXiv 2018[51] J Redmon ldquoDarknet Open source neural networks in crdquo

httppjreddiecomdarknet 2013ndash2016[52] J Deng W Dong R Socher L-J Li K Li and L Fei-Fei ldquoImageNet

A Large-Scale Hierarchical Image Databaserdquo in CVPR09 2009[53] M Everingham S M A Eslami L Van Gool C K I Williams J Winn

and A Zisserman ldquoThe Pascal Visual Object Classes Challenge A Ret-rospectiverdquo International Journal of Computer Vision vol 111 no 1pp 98ndash136 2014

[54] A Ouaknine ldquoReview of Deep Learning Algorithms for Object Detec-tionrdquo 2018


Contents

1 Introduction
1.1 Drones
1.2 Concepts
1.2.1 Thermal Cameras
1.2.2 Aerial thermal imaging
1.3 Problem statement
1.3.1 Industry adoption
1.3.2 Crowd monitoring
1.3.3 Goal
1.3.4 Related work
1.4 Outline

2 System Design
2.1 Requirements analysis
2.1.1 Functional requirements
2.1.2 Non-functional requirements
2.2 Patterns and tactics
2.2.1 Layers
2.2.2 Event-driven architecture
2.2.3 Microkernel
2.2.4 Microservices
2.2.5 Comparison of patterns
2.3 Software architecture
2.3.1 Static view
2.3.2 Dynamic views
2.3.3 Deployment views

3 State of the art and technology choice
3.1 Thermal camera options
3.1.1 Parameters
3.1.2 Comparative analysis
3.2 Microservices frameworks
3.2.1 Flask
3.2.2 Falcon
3.2.3 Nameko
3.2.4 Vert.x
3.2.5 Spring Boot
3.3 Deployment framework
3.3.1 Containers
3.3.2 LXC
3.3.3 Docker
3.3.4 rkt
3.4 Object detection algorithms and frameworks
3.4.1 Traditional approaches
3.4.2 Deep learning
3.4.3 Frameworks
3.5 Technology choice
3.5.1 Thermal camera
3.5.2 Microservices framework
3.5.3 Deployment framework
3.5.4 Object detection

4 Proof of Concept implementation
4.1 Goals and scope of prototype
4.2 Overview of prototype
4.2.1 General overview
4.2.2 Client interface
4.2.3 Stream
4.2.4 Producer and Consumer
4.2.5 Implemented plugins
4.3 Limitations and issues
4.3.1 Single client
4.3.2 Timeouts
4.3.3 Exception handling and testing
4.3.4 Docker security issues
4.3.5 Docker bridge network
4.3.6 Single stream
4.3.7 Number of containers per plugin

5 Mob detection experiment
5.1 Last Post thermal dataset
5.1.1 Last Post ceremony
5.1.2 Dataset description
5.2 Object detection experiment
5.2.1 Preprocessing
5.2.2 Training

6 Results and evaluation
6.1 Framework results
6.1.1 Performance evaluation
6.1.2 Interoperability evaluation
6.1.3 Modifiability evaluation
6.2 Mob detection experiment results
6.2.1 Training results
6.2.2 Metrics
6.2.3 Validation results

7 Conclusion and future work
7.1 Conclusion
7.2 Future work
7.2.1 Security
7.2.2 Implementing a detection plugin
7.2.3 Different deployment configurations
7.2.4 Multiple streams with different layouts
7.2.5 Implementing the plugin distribution service (Remote Producer/Consumer)
7.2.6 Using high performance microservices backbone frameworks
7.2.7 New object detection models and datasets specifically for thermal images

A Firefighting department email conversations
A.1 General email sent to Firefighting departments
A.2 Conversation with Firefighting department of Antwerp, Belgium
A.3 Conversation with Firefighting department of Ostend, Belgium
A.4 Conversation with Firefighting department of Courtrai, Belgium
A.5 Conversation with Firefighting department of Ghent, Belgium

B Thermal camera specifications

C Last Post thermal dataset summary
C.1 24th of March 2018
C.2 2nd of April 2018
C.3 3rd of April 2018
C.4 4th of April 2018
C.5 5th of April 2018
C.6 9th of April 2018
C.7 10th of April 2018
C.8 11th of April 2018
C.9 12th of April 2018


List of Figures

2.1 Use case diagram
2.2 Overview of the framework software architecture
2.3 Framework network topology
2.4 Client Interface detailed view
2.5 Stream detailed view
2.6 Stream model
2.7 Plugin model
2.8 Plugin state transition diagram
2.9 Component-connector diagrams of the Producer and Consumer module
2.10 Producer and Consumer Distribution component-connector diagrams
2.11 Add plugin sequence diagram
2.12 Link plugins sequence diagram
2.13 Deployment diagrams
3.1 Thermal image and MSX image of a dog
3.3 Rethink IT: Most used tools and frameworks for microservices, results [54]
3.4 Containers compared to virtual machines [66]
4.1 filecam GStreamer pipeline
4.2 local plugin GStreamer pipeline
5.1 Last Post ceremony panorama
5.2 Last Post filming locations
5.3 Main scenes in the Last Post dataset
5.4 Outliers
6.1 Average training loss per epoch
6.2 Validation metrics per epoch
6.3 Predictions of the model on images in the validation set
7.1 GStreamer pipeline for a plugin with a detection model


List of Tables

2.1 Performance utility tree
2.2 Interoperability utility tree
2.3 Modifiability utility tree
2.4 Usability utility tree
2.5 Security utility tree
2.6 Availability utility tree
2.7 Architecture pattern comparison
6.1 Acceptance tests results summary
6.2 Performance test statistics summary, measured in seconds
6.3 Resource usage of the framework in several conditions
6.4 Total size of framework components
6.5 Interoperability tests results (S: Source, L: Listener)
B.1 Compared cameras, their producing companies and their average retail price
B.2 Physical specifications
B.3 Image quality (IR: InfraRed, SD: Standard, FOV: Field of View)
B.4 Thermal precision
B.5 Interfaces
B.6 Energy consumption
B.7 Help and support
B.8 Auxiliary features


List of Listings

1 Minimal Flask application
2 Vert.x example
3 Spring Boot example
4 docker-compose.yml snippet of the prototype
5 Mounting the Docker socket on the container
6 Starting a plugin container
7 Dynamic linking of the decodebin and jpegenc


List of Abbreviations

ACF Aggregated Channel Features

AMQP Advanced Message Queuing Protocol

API Application Programming Interface

AS Availability Scenario

ASR Architecturally Significant Requirement

CLI Command Line Interface

CNN Convolutional Neural Networks

CRUD Create Read Update Destroy

DNS Domain Name System

FR Functional Requirement

GPU Graphical Processing Unit

H High

HTTP Hyper Text Transfer Protocol

ICF Integral Channel Features

IoU Intersection over Union

IS Interoperability Scenario

IT Interoperability Tactic

JVM Java Virtual Machine

L Low


LXC Linux Containers

M Medium

mAP mean Average Precision

MJPEG Motion-JPEG

MS Modifiability Scenario

MSX Multi Spectral Dynamic Imaging

MT Modifiability Tactic

NFR Non-Functional Requirement

ONNX Open Neural Network Exchange Format

OS Operating System

PS Performance Scenario

PT Performance Tactic

QAR Quality Attribute Requirement

REST Representational State Transfer

RNN Recurrent Neural Network

RPN Region Proposal Network

RTP Real-time Transport Protocol

SS Security Scenario

SSE Sum of Squared Errors

SVM Support Vector Machine

TCP Transmission Control Protocol

UDP User Datagram Protocol

UI User Interface

US Usability Scenario

YOLO You Only Look Once


Chapter 1

Introduction

Throughout history, having an overview of the environment from high viewpoints held many benefits. Early civilizations used hills to monitor their surroundings and population, and to spot possible attackers. The discovery of flight meant that environments could now be viewed from a bird's-eye view, offering even more visibility and revealing much more of the world below. Recently, a much smaller type of aircraft was developed: the drone. Ranging from large plane-like to almost insect-like devices, and having a wide variety of uses, drones are quickly taking over the sky. Drones would not be as effective without proper cameras providing a detailed view of the world below. With digital video cameras offering superb quality for steadily decreasing costs, almost every scene can be captured in great detail. However, these cameras are limited to the visible light spectrum, which hinders drones from operating in all circumstances, such as night flights. Thermal cameras measure the emitted heat of a scene and can reveal information not visible to the eye, such as hidden persons or animals, pipelines, malfunctioning equipment, etc. The combination of these two technologies certainly holds many exciting opportunities for the future.

1.1 Drones

Drones are flying robots that can fly remotely or autonomously and don't carry a human operator. They can carry a variety of payloads: video cameras, delivery parcels, fluid containers, sensors, lights, but also lethal explosives [1].

Drones are classified in different categories based on varying parameters, such as their physical characteristics (diameter, weight), aerial movement techniques, application domains, etc. Based on diameter, drones are classified as smart dust (1 mm to 0.25 cm), pico air vehicles (0.25 cm - 2.5 cm), nano air vehicles (2.5 cm - 15 cm), micro air vehicles (15 cm - 1 m), micro unmanned aerial vehicles (1 m - 2 m) and unmanned aerial vehicles (2 m and larger). Often depending on their diameter, the weight of these devices ranges from less than a gram up to more than 2000 kg. Drones have different flight techniques, such as propulsion engines with wings, rotors in various amounts, flapping wings, and even balloons. They are used for all kinds of purposes, ranging from search and rescue missions to environmental protection, delivery, recon, etc. Hassanalian et al. provide an excellent overview of most types of drones [2].

Due to the increasing interest in commercial drone platforms [3], a variety of payloads were developed specifically tailored for these aerial robots, such as gimbals to mount action video cameras [4], gimbals for delivering packets [5] and thermal imaging platforms [6].

1.2 Concepts

1.2.1 Thermal Cameras

Thermal cameras are passive sensors that capture the infrared radiation emitted by all objects with a temperature above absolute zero degrees Kelvin. In contrast to visible light cameras, thermal cameras do not depend on an external energy source for the visibility and colors of objects or scenes. This makes captured images independent of the illumination, colors, etc. Furthermore, images can be captured in the absence of visible light [7]. Originally, thermal camera technology was developed for night vision purposes for the military, and the devices were very expensive. Later, the technology was commercialized and has developed quickly over the last few decades, resulting in better and cheaper cameras [7]. This led to access for a broader public, and the technology is now introduced to a wide range of different applications, such as building inspection, gas detection, industrial appliances, medicinal science, agriculture, fire detection, surveillance, etc. [7]. Thermal cameras are now being mounted on drones to give an aerial thermal overview.

1.2.2 Aerial thermal imaging

Aerial thermal imaging is defined as the creation of thermal images using a flying device. This dissertation focuses on the usage of drones for aerial thermal imaging. There are many applications for aerial thermal imaging. Some examples are geography [8, 9], agriculture [10, 11], search and rescue operations [12], wildlife monitoring [13], forest monitoring [14, 15], disaster response [16], equipment and building maintenance [17–20], etc. In the past few years, several industry players have developed thermal cameras specifically aimed at these drone applications. Examples are FLIR [6], Workswell [21] and TEAX Technology [22].

1.3 Problem statement

1.3.1 Industry adoption

The implementation of thermal cameras on drone platforms faces some issues for wide adoption by the industry. Several vendors offer thermal camera products, some specifically designed for drone platforms, that often implement different image formats, color schemes and interfaces (e.g. [23–25]). This leads to issues if users want to modify their applications by changing the camera that is used, because the application must implement new software to interact with the camera, or when the camera is no longer supported by the vendor, leaving the application with outdated hardware and software. This leads to a problem called vendor lock-in, which makes customers dependent on a certain vendor, as they cannot switch products without incurring substantial costs, a problem already very tangible for cloud-based applications today [26].

Applications across various fields often have different functional and non-functional requirements. Some applications have hard real-time deadlines (such as firefighting, search and rescue, security, etc.) that must be respected; other applications require highly accurate predictions (e.g. person detection, agriculture, etc.). A single application domain can even have many different use cases.

Several firefighting departments in Belgium were contacted to get an overview of their needs for a drone thermal imaging application. It quickly became clear they had various detection problems, such as finding missing persons, locating hot explosives, measuring temperatures in silos, detecting invisible methane fires, etc. Equipment also wears down more quickly due to usage in harsh environments, such as fires in close proximity. A drone thermal application for them needs to be able to exchange functionality and hardware easily, and it must meet high performance constraints to deliver value for them. The email conversations can be read in Appendix A.

Other drone thermal imaging applications are mostly only used in the niche domain for which they were developed, because they aren't designed for flexibility [27]. These proprietary applications have some disadvantages: the development and support potentially have a large cost, vendor lock-in can occur when products are no longer supported, security issues could arise, and customization is difficult [28, 29]. Applications could benefit from a backbone framework to aid in this modifiability/interoperability issue, aiding in developing end-to-end solutions connecting thermal cameras to various analysis/detection modules for various use cases.

1.3.2 Crowd monitoring

Festivals and other open air events are popular gatherings that attract many people. For every event organizer, it is important to ensure safety and avoid incidents. Large groups of people, so-called mobs, can create potentially dangerous situations through bottlenecks, blocked escape routes, etc. Therefore, having the ability to monitor crowds and predict their behavior is very important to avoid such scenarios. Data can be obtained by evaluating video footage from past comparable events or by real-time video monitoring of current events [30]. By analyzing this footage, potentially dangerous situations can be avoided by acting on the mob formation, and safety regulations can be improved to help plan future events. Vision-based approaches face several limitations: mounted cameras cannot capture elements outside of their field of view, can't see in some conditions (for example during night time), and it is difficult to infer information from the raw footage [31].

Thermal cameras could help for crowd monitoring because they can operate in any condition. Having precise and detailed object recognition for the images produced by these cameras is crucial to extract information correctly. In this context, classifying images is not satisfactory: localization of the objects contained within the images is needed. This problem is known as object detection [32]. There are several challenges for object detection in thermal images: the image quality is very low when compared to visible light images, there is a lack of color and texture information, temperature measures are relative measures, etc. This makes extracting discriminative information from these images difficult [33]. Most efforts towards object detection on thermal images have gone towards human detection. Most of the proposed algorithms focus on feature extraction using the Aggregated Channel Features technique and boosting algorithms for learning [33–35]. Novel approaches make use of so-called deep learning with neural networks, which achieve very good results given enough data [36].


1.3.3 Goal

The goal of the dissertation is to explore the requirements of the possible backbone framework suggested in Section 1.3.1 and its potential software architecture. The architecture is evaluated by building a proof of concept implementation of the framework and evaluating it against the proposed requirements. To verify its use in developing drone thermal imaging applications, the specific mob-detection use case is investigated.

1.3.4 Related work

The Irish start-up DroneSAR [37] developed a search-and-rescue (SAR) drone platform allowing users to stream live images and video from a drone as it conducts a search for missing persons. The platform works with any camera, visual and thermal, but focuses on drones from vendor DJI, DroneSAR's industry partner. Amazon introduced the Amazon Kinesis Video Streams platform in January 2018 as a new service for the Amazon Web Services (AWS) cloud platform. It allows users to stream live video from devices to the AWS cloud and build applications for real-time video processing [38]. The VIPER project by EAVISE, KU Leuven researched how thermal and visual video images could be used for real-time detection of persons using object detection algorithms such as deep learning [36]. The framework presented in this work combines elements from all three of these examples.

1.4 Outline

The remainder of this dissertation is organized as follows. Chapter 2 presents the requirements for the framework and the software architecture. Chapter 3 explores several state-of-the-art technologies that can serve as backbone technologies for the framework. To test the viability of the software architecture, a prototype is implemented; Chapter 4 presents the different aspects of this prototype. Chapter 5 describes the experiment that is conducted to research the detection of mobs in thermal images. The results of both the framework and the detection experiment are presented and evaluated in Chapter 6. Finally, the conclusion and future research efforts are presented in Chapter 7.


Chapter 2

System Design

Finding out what users actually expect from a software system, and what makes it valuable for them, is of key importance for the success of that system. This chapter first explores the functional and non-functional requirements of the hypothetical framework suggested in Chapter 1, to find out what makes building the framework worthwhile. Well-known architectural patterns enable certain software requirements very well and can be used for building the software architecture of the framework. The framework software architecture combines some of these patterns and is presented in several documents.

2.1 Requirements analysis

Requirements are the stated life-cycle customer needs and objectives for the system, and they relate to how well the system will work in its intended environment. They are those aspects of the framework that will provide value to the users.

2.1.1 Functional requirements

Functional requirements (FR) describe the necessary task, action or activity that must be accomplished by the system, often captured in use cases and/or user stories [39, 40]. Use cases provide a summary of the features described in the user stories. Several external people and/or systems, defined as actors, interact with the framework to achieve a certain goal [40]. Three actors are identified for the framework: an end-user who uses the framework in order to build an image processing application for a specific use case, such as the ones described in Section 1.2.2; a camera developer who creates support software for a specific thermal camera for the framework, so that the end-user can buy and use their product; and an analysis software developer who creates analysis software for a specific use case (tracking objects, detecting objects, etc.), so that the end-user can use their software to build his¹ application. The camera and analysis software developers are generalized to an actor called plugin developer, who develops plugins to extend the functionality of the framework. These plugins are the building blocks with which the end-user can build image processing applications.

¹ To avoid unnecessary gender-specific pronoun clutter, the male pronoun is used by default.

The general user scenario for the framework proceeds as follows. An end-user wants to build an image processing application, e.g. to detect fires in a landscape using a drone. He has a thermal camera for this and has read about hot-spot detection in video. The user looks for a plugin for the framework that can read video from his thermal camera, and for a plugin that does the hot-spot detection. If the user finds these plugins, he can add them to the framework and use them for the application he is building. He connects both plugins with the framework in a specific order to finish his application. For this simple example, the thermal camera plugin is connected to the hot-spot detection plugin, so that video coming from the thermal camera is transmitted to the detection plugin to find the fires in the landscape. The plugins in the application, and the specific order in which they are connected, are defined as a stream. This stream should be easily modifiable if additional or other functionalities are required. Suppose that the thermal camera delivers very high quality video and the detector cannot work with this, as it can only operate on low quality images. The end-user searches for a plugin that scales the high quality video down to an accepted quality for the detector. This plugin is placed in between the thermal camera and the detector, and the application can work again. By continuously adding plugins to the framework, the number of possible applications that can be built with the framework increases, making the framework useable for more aerial thermal imaging use cases.

Instead of developing the application from scratch, users can use the already implemented plugins to build the applications in an ad hoc fashion. Because of this, the development time for such applications can be reduced, and users can switch hardware and/or algorithms easily. The FRs are summarized in a use case diagram that connects each actor with their respective requirements and the relationships among them [40], depicted in Figure 2.1. Trivial functionalities, such as launching and shutting down the framework, are omitted. The red use cases represent use cases to extend the functionality of the framework, the blue use cases represent use cases for building streams, and the white use cases modify the media processing of the stream. Some use cases depend on others: the blue and white use cases work with plugins of the framework, so their prerequisite use case is "Add plugin", as a plugin must be a part of the framework for a user to use it; the "(Un)Link plugins" and "Stop/Pause/Play stream" use cases depend on "Add plugins to stream", as a stream must contain plugins before they can be manipulated.

Figure 2.1: Use case diagram

2.1.2 Non-functional requirements

A non-functional requirement (NFR) specifies how the framework is supposed to be, or in what manner it should execute its functionality [41]. These qualifications typically cover business and system quality requirements. A distinction is made between quality attribute requirements (QAR) and constraints. QARs are qualifications of the FRs or of the overall product, e.g. how fast a certain function must be executed, or how resilient it must be to erroneous input. They are closely related to business requirements, which are specifications that, once delivered, provide value to the actors [40]. The QARs are captured in a utility tree [40] that has a root node representing the system. This root node is elaborated by listing the major QARs that the system is required to exhibit. Each QAR is subdivided into more specific QARs. To make the specific requirements unambiguous and testable, a scenario for the system or a specific function is written, and they are evaluated against the business value and the architectural impact [40]. The QAR can have High (H), Medium (M) or Low (L) business value and architectural impact, respectively. The business value is defined as the value for the end user if the QAR is enabled. High designates a must-have requirement, Medium is for a requirement which is important but would not lead to project failure, and Low describes a nice-to-have QAR, but not something that is worth much effort. Architectural impact defines how much the architecture must be designed towards the QAR to enable it. High means that meeting this QAR will profoundly affect the architecture, Medium means that meeting this QAR will somewhat affect the architecture, and Low means that meeting this QAR will have little effect on the architecture. The following QARs are discussed: performance, interoperability, modifiability, usability, security and availability.

Performance

Performance indicates the framework's ability to meet timing requirements. It characterizes the events that can occur and the framework's time-based response to those events. Latency is defined as the time between the arrival of the stimulus and the system's response to it [40]. The system latency is the latency between the initialization of an action sequence and the first change to the system noticeable by the user. Streaming latency is defined as the time between the arrival of a video frame and the arrival of the next video frame. The jitter of the response is the allowable variation in latency. Scalability is the number of users that can use the framework at the same time. The utility tree is presented in Table 2.1. The software industry has not defined a quantified 'good latency' for end-users, but a 4-second latency rule is often used as a rule of thumb [42]. The average response time for general framework commands should then be less than 2 seconds, with a standard deviation of 1 second, ensuring most execution times respect the 4-second bound. As stated in Chapter 1, some use cases require real-time video streaming, such as fire fighting. The notion of low latency real-time video loosely defines that video should be streamed almost simultaneously: if a camera is filming and a human user does not notice a latency between the video of the camera and the real world, the video stream is considered real-time. Real-time is thus a human time perception, and for visual inputs this bound is as low as 13 milliseconds. Anything above 13 milliseconds becomes noticeable; anything above 100 milliseconds hinders human performance [43, 44]. However, the framework focuses on the use of thermal cameras, most of which don't produce frames faster than 8 frames per second, or 125 milliseconds per frame (see Section 3.1). More expensive cameras can shoot at 25 frames per second, corresponding to a latency of 40 milliseconds, and this bound is selected for the streaming latency, with a standard deviation of 20 milliseconds, remaining below the frame rate of less expensive cameras. The number of users that can use the framework at the same time is assumed to be low, as current aerial thermal image applications are currently operated by only one user or a few. The assumption is that a maximum of five users can use the framework at the same time. All of these requirements are quantified as relatively 'good' values. These bounds should be evaluated for user satisfaction by having users use a prototype of the framework in practice.

Latency
PS-1: The average execution time of all framework commands does not exceed 2 seconds. (H, M)
PS-2: A playing stream should have an upper limit of 40 ms streaming latency. (H, H)

Jitter
PS-3: The average standard deviation of the execution time of all framework commands should not exceed 1 second under normal operation. (H, M)
PS-4: The average standard deviation in streaming latency should not exceed 20 ms under normal operation. (H, H)

Scalability
PS-5: The system should be usable by five users at the same time. (M, M)

Table 2.1: Performance utility tree

Interoperability

Interoperability is the degree to which two or more independently developed systems can usefully exchange meaningful information via interfaces in a particular context [40]. The framework will interoperate with cameras and analysis modules via the framework plugins. Henceforth, the camera and analysis plugins will be referred to as a Producer plugin and a Consumer plugin. A Producer plugin is thus a plugin that represents a camera that produces video, and a Consumer plugin a plugin that represents a module that processes, or consumes, video. The framework will thus interact with the Producer and Consumer plugins, with which the framework exchanges requests to link them together, control their media processing, etc. The more correct exchanges there are between the two, the better the user can use the plugin for building applications with the framework. This QAR is quantified by the ratio of requests made by the framework that were correctly interpreted by the plugin (successful requests) to the total number of requests during a runtime of the framework [40]. Intuitively, one argues that the framework must achieve perfect interoperability, with a perfect exchange success rate of 100%. Reality, however, tends to not agree with perfection, and it can never be guaranteed that exchanges will always be correct. Therefore, it is better to aim for a good interoperability measure and prepare for failed exchanges, instead of naively assuming the framework will be perfect. An exchange success rate of 99.99% is selected; the motivation for this bound is as follows. A plugin is assumed to be always correct up to the first mistake, after which the plugin is faulty and the fault needs to be identified and it must be ensured that it won't occur again. An exchange success rate of 99.99% means that if 10,000 plugins are installed and used by the framework, only one will fail during uptime. For one plugin during framework uptime, the mean time between failures is then 10,000 exchanges.
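Stated as a worked equation: with per-exchange failure probability p, the expected number of exchanges until the first failure is the mean of a geometric distribution, 1/p. For the selected bound:

p = 1 - 0.9999 = 10^-4,  E[exchanges until first failure] = 1/p = 10,000.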

It is suspected that this number of exchanges is very high for normal framework use. Because the possibility of faulty exchanges is acknowledged, the framework will need to implement a fallback mechanism to compensate. The utility tree is presented in Table 2.2.

Syntactic interoperability
IS-1: The framework exchanges requests with a Producer plugin (known at runtime) with a success ratio of 99.99%. Incorrect requests are undone by the framework and logged. (H, H)
IS-2: The framework exchanges requests with a Consumer plugin (known at runtime) with a success ratio of 99.99%. Incorrect requests are undone by the framework and logged. (H, H)

Table 2.2: Interoperability utility tree

Modifiability

Modifiability is the cost and risk of changing functionality of the system [40]. One of the most important values of the framework is modifiability of the supported thermal cameras and analysis modules. The framework needs to be extendable with new functionalities, by enabling developers to add their support software in the form of a plugin. End-users should be able to modify the components that they use for their image processing applications easily and quickly, to allow for interchangeable hardware and software and to quickly set up new applications. Modifiability is defined in two environments: runtime, defined as periods during which the system is up and running, and downtime, defined as the time periods during which the system is not active. The utility tree is presented in Table 2.3.

To enable users to choose the extensions they need, the framework will need a distribution service that contains all plugins available for the framework, from which a user can select and install plugins for their local version of the framework. Adding new plugins to the distribution service should not affect versions of the framework installed by the user. When a user adds a plugin from the distribution to his version of the framework, the framework should only reload once before making the plugin useable, for user comfort. Deployability is defined as the different device configurations that specify how the framework can be deployed. If the framework can be deployed in different fashions, this can increase the value for the end-user. Suppose a fire fighting use case in which a forest fire is monitored on site. Computationally powerful devices might not be available on site, so moving some of the plugins processing media to a remote server or cloud could still allow usage of the framework. Perhaps the device processing the media is already remote, for example a drone on security patrol; in this case, access via a remote device such as a smartphone is desirable. This leads to the deployment configurations described in the utility tree.

Usability

Usability indicates how easy it is for the user to accomplish a desired task and the kind of user support the system provides. Learnability indicates how easy it is for a user to gain knowledge on how to use the framework. Errors are the number of errors a user can make when trying to execute certain functions [40]. The utility tree is presented in Table 2.4.

Run time modifiability
MS-1: Support for a new Producer plugin should be added to the distribution service within one day, without the framework having to restart. (H, H)
MS-2: Support for a new Consumer plugin should be added to the distribution service within one day, without the framework having to restart. (H, H)
MS-3: End-users should be able to extend their framework with new functionalities by installing new Consumer and Producer plugins. (H, H)
MS-4: End-users should be able to modify the plugins used to build their stream. (H, H)

Down time modifiability
MS-5: New Producer plugins can be installed to the local framework at runtime; the framework can only reload once before the plugin is useable. (H, H)
MS-6: New Consumer plugins can be installed to the local framework at runtime; the framework can only reload once before the plugin is useable. (H, H)

Deployability
MS-7: The system should be deployable on a combination of a smartphone and a cloud/remote server environment. (H, H)
MS-8: The system should be deployable on a personal computer or laptop. (H, H)
MS-9: The system should be deployable on a smartphone, laptop and cloud environment. (H, H)

Table 2.3: Modifiability utility tree


Security

Security is a measure of the system's ability to protect data and information from unauthorized access, while still providing access to users and systems that are authorized. An action taken against the system to cause it harm is called an attack. Security has three main characteristics. Confidentiality is the property that data or services are protected from unauthorized access. Integrity is the property that data or services are protected from unauthorized manipulation. Availability is the property of the system maintaining its functionality during an attack. Authentication verifies the identities of the parties of an interaction, checks if they are truly who they claim to be, and gives or revokes access [40]. Security is important for the framework if it is deployed on multiple devices that use a public network to communicate. The utility tree is presented in Table 2.5.

Availability

Availability in a general context (not only security) refers to how available the software is to carry out its functionality. Downtime is a measure of the time that the system is unavailable to carry out its functions. The utility tree is presented in Table 2.6. Availability is specified for the part of the framework that distributes the plugins.


Learnability
US-1: A user should be able to learn how to build an image processing application in at most one hour. (H, L)
US-2: An experienced developer should be able to start developing a Consumer plugin for the system within one day. (H, L)
US-3: An experienced developer should be able to start developing a Producer plugin for the system within one day. (H, L)

Errors
US-4: A user should not make more than 3 errors to build an image processing application. (H, L)

Table 2.4: Usability utility tree

Confidentiality
SS-1: Streams created by a user can only be accessed by that user and not by any other entity. (H, L)

Integrity
SS-2: Streams can't be manipulated without authorization by the user that made the streams. (H, L)

Availability
SS-3: During an attack, the core functionality is still available to the user. (H, M)

Authentication
SS-4: Users should authenticate with the system to perform functions. (H, L)
SS-5: Developers should authenticate their plugins before adding them to the framework. (H, L)

Table 2.5: Security utility tree

Downtime
AS-1: The system should be up 99.5% per year. This means the system has an allowed scheduled downtime of 43 hours and 30 minutes per year for maintenance. (M, L)
AS-2: The maximal duration of the interval during which the system is unavailable is 3 hours. (M, L)

Network
AS-3: If there is no active network connection, the local device can be used for operation of the framework. (H, H)

Table 2.6: Availability utility tree

Architecturally significant requirements

Architecturally Significant Requirements (ASR) are the requirements that are the most important to realize according to business value, and that have the most impact on the architecture. From the utility trees and the measures of the quality attribute scenarios, the QARs modifiability, interoperability and performance are identified as ASRs.

2.2 Patterns and tactics

An architectural pattern is a package of design decisions that is found repeatedly in practice, that has known properties that permit reuse, and that describes a class of architectures. Architectural tactics are simpler than patterns, typically using just a single structure or computational mechanism; they are meant to address a single architectural force. Tactics are the "building blocks" of design, and an architectural pattern typically comprises one or more tactics [40]. Based on the ASRs, several tactics are listed in Table 2.7 that are used for the base pattern selection. The explored patterns are layers, event-driven architecture, microkernel and microservices.

2.2.1 Layers

The layered pattern divides the software into units called layers, each of which performs a specific role within the application. Each layer is allowed to use the layer directly beneath it via its interface. Changes in one layer are isolated if the interfaces don't change, enabling MT-1, MT-2 and MT-5 [40]. While changes can be isolated by the isolated layers, they remain difficult due to the monolithic nature of most implementations of this pattern. Layers contribute to a performance penalty due to the "architecture sinkhole phenomenon", in which requests simply propagate through layers for the sake of layers [45].

2.2.2 Event-driven architecture

This pattern consists of several event publishers that create events, and event subscribers that process these events. The publishers and subscribers are decoupled by using an event channel, to which the publishers publish events and which forwards them to the event subscribers. The subscribers should have a single purpose and execute asynchronously. Since the publishers and subscribers are single-purpose and are completely decoupled from other components via the event channel, changes are isolated to one or some components, enabling MT-1, MT-2, MT-4, MT-5 and MT-7. If the event channel adds a discovery mechanism, IT-1 can also be enabled. Overall, the pattern is relatively easy to deploy due to the decoupled nature of the components. Performance in general can be very high through the asynchronous nature of the architecture, enabling PT-6 and PT-7. If the event channel is tweaked to contain extra functionality, PT-1, PT-3, PT-8, PT-9, PT-10 and PT-11 can be enabled as well. If the components have a limited event response, then PT-2 and PT-5 can also be enabled. Development can be somewhat complicated due to the asynchronous nature of the pattern [40, 45].

2.2.3 Microkernel

The microkernel pattern allows the addition of application features as plugins to the core application, providing extensibility as well as feature separation and isolation. The pattern consists of two components: a core system, called the kernel, and plugins.


The business logic is divided between independent plugins and the kernel. The kernel contains only the minimal functionality required to make the system operational. The plugins are standalone, independent components that contain specialized processing, additional features and custom code. This code is meant to enhance or extend the core system to produce additional business capabilities. In many implementations, plugins are independently developed third-party modules. Changes can largely be isolated and implemented quickly through the loosely coupled plugins; all MTs can be enabled. Depending on how the pattern is implemented, the plugins can be dynamically added to the kernel at runtime. Via a resource discovery service in the kernel, the ITs can be enabled. In general, most applications built using the microkernel pattern perform well, because applications can be customized and streamlined to only include the features that are needed [45].

2.2.4 Microservices

Microservices is an architectural pattern that structures an application as a collection of loosely coupled services that implement business capabilities. Each component of the pattern is deployed as a separate unit that can be deployed on one device or multiple devices. The components can vary in granularity, from a single module to a large portion of the application. The components contain one or more modules that represent either a single-purpose function or an independent portion of a business application [45, 46]. Due to the separately deployed units, changes are isolated to individual components, enabling all MTs. Via service discovery mechanisms, the ITs can also be enabled. The microservices pattern supports distributed deployment of the software across multiple devices by design. This pattern is not known to produce high-performance applications, due to the distributed nature of the pattern, which relies on communication via a network [45, 46].

2.2.5 Comparison of patterns

Table 2.7 summarizes the analysis of the patterns. A score is given based on how well the pattern enables the tactic. Low means that the pattern does not naturally enable the tactic. Medium indicates the pattern can be implemented with the tactic, but does not include it itself. High means the tactic is enabled in the pattern. Excellent means that the tactic plays a key role in the pattern.

The microkernel pattern and the microservices pattern both enable most tactics. The microkernel pattern implements extendability of the framework by design using plugins, which is the main idea behind the framework, and is thus an excellent base pattern. Interoperability and deployability of these plugins can be ensured by the microservices pattern, as it designs the microservices to have well defined interfaces for interoperability and allows for the framework to be deployed in a distributed fashion. The architecture presented below is a combination of both the microkernel pattern and the microservices pattern.


Tactic                                      Layers   Event-driven   Microkernel   Microservices
MT-1 Split module                           Medium   High           High          Excellent
MT-2 Increase semantic coherence            Medium   High           High          Excellent
MT-3 Encapsulate                            Medium   High           High          Excellent
MT-4 Use an intermediary                    Medium   High           High          Excellent
MT-5 Restrict dependencies                  High     High           High          Excellent
MT-6 Anticipate expected changes            Low      High           Excellent     Excellent
MT-7 Abstract common services               Low      High           Excellent     Excellent
MT-8 Defer binding / Runtime registration   Low      Low            Medium        High
IT-1 Discover services                      Low      Low            High          High
IT-2 Orchestrate interface                  Low      Low            High          High
IT-3 Tailor interface                       Low      Low            High          High
PT-1 Manage sampling rate                   Low      High           High          Medium
PT-2 Limit event response                   Low      High           High          Medium
PT-3 Prioritize events                      Low      High           High          Medium
PT-4 Reduce overhead                        Low      High           High          Low
PT-5 Bound execution time                   Low      High           High          Medium
PT-6 Increase resource efficiency           Low      High           High          High
PT-7 Introduce concurrency                  Low      High           Low           High
PT-8 Maintain copies of computation         Low      High           Low           High
PT-9 Load balancing                         Low      High           Low           High
PT-10 Maintain multiple copies of data      Low      High           Low           High
PT-11 Bound queue sizes                     Low      High           Low           Medium
PT-12 Schedule resources                    Low      High           Low           Medium

Table 2.7: Comparison of how well the discussed patterns enable the tactics needed for the ASRs


2.3 Software architecture

The software architecture is documented in three document categories: static views, dynamic views and deployment views. The static views comprise the different components of the system and their relationships among each other. The dynamic views describe the runtime behavior of the system. Finally, the deployment views provide different configurations in which the system can be deployed on different devices [47].

2.3.1 Static view

Figure 2.2 presents an overview of the architecture using a component-connector UML diagram. Components are the boxes that represent different software entities that exist at runtime. The components have interfaces through which they interact with other components. These are indicated using the 'lollipop' notation, with the 'ball' representing the interface that a component provides, and a socket indicating that another component is using this interface. The type of data exchanged is noted next to the interface. Multiple boxes indicate that multiple components of the same kind can exist at runtime [48].

The architecture consists of the following core components: Client Interface, Producer, Stream, Consumer, Producer Distribution, Consumer Distribution, Producer Plugin and Consumer Plugin. The clear components in Figure 2.2 form the core framework, which each user needs to install to use the framework. The colored components form a distribution service for framework plugins to extend the functionality; they are not installed with the core framework, but run as remote instances with which the user can interact to extend his version of the core framework with new plugins. A user can use the framework via the Client Interface, building streams that are maintained in the Stream component. The Stream component makes requests to the Producer and Consumer components to activate and control the selected plugins to build the stream. Additional plugins can be added to the framework and are distributed via the Producer and Consumer Distribution components. The architecture implements a hybrid combination of the microservices and microkernel patterns. Each presented component is a microservice that implements its own interface to interact with other components. The Producer and Consumer components act as kernels in the microkernel pattern, with the Producer and Consumer Plugins acting as plugins in the microkernel pattern. These patterns enable the tactics needed to meet the requirements presented in Section 2.1.

Figure 2.2: Overview component-connector diagram of the architecture

Communication protocol

To allow the microservices to communicate, a communication protocol must be designed. Communication protocols can roughly be classified in two categories: synchronous and asynchronous. Synchronous protocols block on requests, which means that the client waits for a response of the server and can only continue executing when a response is received. This makes a synchronous protocol inherently more reliable, but also slower. An example synchronous protocol is the Hyper Text Transfer Protocol (HTTP). Asynchronous protocols just send messages and do not block on the response. This makes the protocol less reliable, but also faster [49].

There are two types of traffic exchanged between microservices. First, there are the command requests that are exchanged between microservices to edit resources or change state. Second, there are the video frames that are exchanged between Producer and Consumer Plugins. Both types of traffic have different requirements. The commands must be communicated reliably and need to be executed once and only once. The reliability is more important than latency, so a synchronous protocol is preferred. Microservices traditionally implement the synchronous HTTP protocol with Representational State Transfer Application Programming Interfaces (REST APIs) that specify the application endpoints as textual resources [45]. This common protocol is used for the exchanged command requests in the framework.
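As an illustration of such a synchronous command exchange, the sketch below sends a state-change command to a plugin over HTTP. It is a minimal sketch only: the endpoint name /state matches the plugin resources described later in this section, but the host, port and payload format are assumptions for illustration, not the prototype's actual interface.

import requests

# Assumed address of a running plugin microservice (hypothetical).
BASE = "http://localhost:5001"

# Synchronous command: the caller blocks until the plugin confirms,
# which is why HTTP/REST is preferred for reliable command traffic.
resp = requests.put(f"{BASE}/state", json={"state": "PLAY"}, timeout=2.0)
resp.raise_for_status()
print(resp.json())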

The video frames need to be sent with low latency at a high frequency, but reliability is less important; an asynchronous protocol is preferred. For video streaming, the Real-time Transport Protocol (RTP) running on top of the User Datagram Protocol (UDP) is selected, as it enables real-time transfer of data between processes [50]. RTP defines a standardized packet format to transmit video and audio over a network. It sequences each packet with a sequence number and a timestamp. This allows the application to detect missing packets and latencies in the network. The UDP protocol is a low latency, asynchronous transport protocol, as it doesn't guarantee packet delivery.

The recommended codec for transmitting video media is Motion-JPEG (MJPEG), which encodes video frames as separately encoded JPEG images. This makes analysis and processing in subsequent plugins easier, as only the received frame is needed to perform the analysis or processing. Other video compression formats, such as MPEG-4, use key-frames and object-oriented differential compression formats. If a key-frame is received via the stream, the frame can be used as is. If a reference frame is received, the receiver needs to wait for the corresponding key-frame to be received to be able to construct the full video frame for analysis. This introduces extra complexity and lower quality detection, which is a clear trade-off for the quality and simplicity which MJPEG offers [51, 52].

Applying these protocols to the architecture results in the network topology depicted in Figure 2.3. The full lines represent communication via HTTP on top of the Transmission Control Protocol (TCP). The dashed lines represent the RTP protocol on top of the UDP protocol. The boxes represent the different microservice components of the framework.


Figure 2.3: Framework network topology. Each box is a microservice component of the framework. The full lines indicate communication over the HTTP/TCP protocol; the dashed lines indicate communication over the RTP/UDP protocol.

Client Interface

The Client Interface is the interface through which end-users can interact with the framework. Figure 2.4 presents the detailed component-connector diagram. The Client Interface consists of a User Interface (UI) component and an API Gateway component. Devices can make requests to the Client Interface via the Client Requests interface provided by the API Gateway. The UI provides the UI Operation interface that is used by end-users to control the framework; this can be either a visual or a textual interface. The UI actions are translated to client requests that are forwarded to the API Gateway using the Client Requests interface. The API Gateway translates the client requests and forwards them to the other core framework components.

Figure 2.4: Client Interface detailed view

Stream

The Stream component maintains the logical representation of the streams built by the end-user for his image processing application. Figure 2.5 presents the detailed component-connector diagram.

Figure 2.5: Stream detailed view

It consists of an API, a Stream Manager and several Stream Model components. The API provides the Stream Commands interface used by the Client Interface to interact with the framework; it translates incoming requests to commands that the Stream Manager can then execute. Commands include creating a new stream, modifying the stream layout, modifying the stream state, etc. The Stream Manager creates and manages multiple streams, each represented by a Stream Model. The end-user thus builds Stream Models to create image processing applications. The Stream Model represents the logical model of these image processing application streams. As stated before, a stream consists of several plugins processing media, placed in some order and linked by the framework. Figure 2.6 illustrates this concept.

Figure 2.6: Logical model of a stream. The arrows represent the flow of media through the stream

Logically, the Stream Model is represented as a tree with multiple roots and multiple leaves. The framework builds streams by initializing the needed plugins and connecting them in order. In the example Stream Model, plugins receive media from multiple source plugins and forward media to multiple targets. The Stream Model has a global state that represents the cumulative state of all plugins. To transition the global state from A to B, all plugins need to transition from A to B. This is done by first making the transition on the leaves of the Stream Model, after which the transition propagates to the root nodes. This ensures that no media is lost, because the plugins that transition first can't process anything as there is no media being put into the tree yet.
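A minimal sketch of this leaves-first propagation is given below. It assumes each plugin object exposes a listeners collection (its downstream plugins) and a set_state() call; these names are illustrative rather than part of the framework API:

def transition_stream(roots, target_state):
    # Transition every plugin in the stream to target_state,
    # leaves first, so upstream plugins never push media into a
    # part of the tree that cannot process it yet.
    visited = set()

    def transition(plugin):
        if plugin in visited:
            return
        visited.add(plugin)
        for listener in plugin.listeners:  # downstream plugins first
            transition(listener)
        plugin.set_state(target_state)     # then the plugin itself

    for root in roots:
        transition(root)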


Producer and Consumer plugins

A Plugin represents an independent media processing element, either of the Producer type (such as a thermal camera) or the Consumer type (such as an object detection software module). Plugins are deployed as standalone microservices providing a REST API that the framework uses to control the plugin. Figure 2.7 presents a general plugin model. A plugin receives media from other plugins, called the sources, processes this media and forwards it to other plugins, called the listeners. A Producer plugin only has listeners; a Consumer plugin has both sources and listeners. Merging the media from multiple sources and forwarding the processed media to multiple listeners is the responsibility of the plugin.

Figure 2.7: Plugin model

The plugin REST API should at least provide a state resource representing the state of how the plugin is processing media, a sources resource that represents the sources from which the plugin receives media to process, and a listeners resource that represents the listeners to which the plugin transmits the processed media. Only Consumers have both the sources and listeners resources; Producer Plugins produce their own media and hence can only have listeners.

To indicate if and how the plugin is actively processing media, a finite state machine is implemented. The state transition diagram is presented in Figure 2.8. A plugin can be in four possible states: INACTIVE, STOP, PLAY and PAUSE. When a plugin is in the INACTIVE state, no active microservice is running the plugin. This is the initial state for all plugins of the framework. This state is only visible to the framework, as in this state the plugin is not instantiated. When a plugin is in the STOP state, the framework has instantiated a microservice running the plugin. The plugin is listening for commands on its API but is not processing any media. This state is visible to the plugin. In the PLAY state a plugin is processing media received from its source(s), transmits processed media to its listener(s) and is listening for commands. When in the PAUSE state, media processing is paused but media buffers are kept. This decreases the latency when the plugin transitions back to the PLAY state, since the plugin can continue processing from the point where it was paused. The difference with the STOP state is that when transitioning to STOP, the plugin clears its media buffers.

Figure 2.8: The state transition diagram for a plugin

The plugin starts in the INACTIVE state. When a microservice running the plugin is instantiated by the framework, the plugin initializes itself in the STOP state. From the STOP state the plugin can transition to the PLAY state to process media. This transition is only successful if sources and listeners are registered with the plugin. From the PLAY state a transition to both the STOP state and the PAUSE state can be made, which stops the processing of media and respectively drops or keeps the media buffers. The plugin cannot make multiple state transitions per command. When a transition is made to INACTIVE, the framework first transitions the plugin to the STOP state, after which the INACTIVE state can be reached.
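The allowed transitions can be captured in a small table. The Python sketch below encodes the diagram of Figure 2.8; the PAUSE-to-STOP edge follows from the paragraph above, and the names are only illustrative:

# Transition table for the plugin state machine of Figure 2.8.
TRANSITIONS = {
    'INACTIVE': {'STOP'},             # instantiation by the framework
    'STOP':     {'PLAY', 'INACTIVE'},
    'PLAY':     {'STOP', 'PAUSE'},
    'PAUSE':    {'PLAY', 'STOP'},
}

def can_transition(current, target):
    # Only a single state transition is allowed per command.
    return target in TRANSITIONS.get(current, set())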

A source/listener has the following fields: hostname, the hostname of the microservice running the plugin, and port, the port on which the source/listener is reachable.

On the sources and listeners resources, the HTTP GET and POST methods must be provided: GET retrieves the sources/listeners and their details, POST adds a new source/listener to the plugin. Both resources additionally need to provide an individual endpoint per source/listener on which GET, PUT and DELETE must be provided, for individual manipulation of the source/listener: GET retrieves the details, PUT updates the fields, and DELETE removes the source/listener from the plugin.
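As an illustration, the snippet below registers one plugin as a listener of another using the Python Requests library; the hostnames, port numbers and payload layout are hypothetical and only follow the fields described above:

import requests

# Register plugin B as a listener of plugin A (names and ports
# are illustrative, not part of the framework specification).
resp = requests.post('http://plugin-a:5000/listeners',
                     json={'hostname': 'plugin-b', 'port': 5005})
resp.raise_for_status()

# Retrieve all listeners currently registered with plugin A.
print(requests.get('http://plugin-a:5000/listeners').json())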

Producer and Consumer

The Producer and Consumer components are responsible for interacting with and managing the Producer/Consumer plugins used in the streams. Figure 2.9 presents the component-connector diagram of the Producer and Consumer components. Both components have a similar architecture but are separate components. This is because their plugin models differ and because they are expected to often be deployed on different devices with specific hardware requirements: Producer Plugins could be deployed on the thermal camera itself, having a very specific operating system, whereas a Consumer plugin might need access to specific processors to speed up its execution.

The Producer and Consumer consist of the following components: API, Kernel, Plugin Model and Plugin Manager. The API translates requests coming from the Stream component to commands for the Kernel. The Kernel implements the core functionalities, such as activating (deploying) and deactivating plugins, managing their state and manipulating their resources. It creates a Plugin Model for each Plugin that the framework has installed. This model represents a plugin logically on the framework level and keeps track of the Plugin resources. The Plugin Manager manages the plugins that were added to the framework, stored in the Plugin Directory. It manages the plugin installations, adding updates or installing additional plugins that can be retrieved from the Producer and Consumer Distribution components.

Figure 2.9: Component-connector diagrams of the Producer (a) and Consumer (b) modules

Producer and Consumer Distribution

The Producer and Consumer Distribution components are responsible for managing and maintaining the plugins for the framework. They act as online software repositories from which local versions of the framework can retrieve new plugins. The component-connector diagrams are presented in Figure 2.10. The Distribution components consist of the following subcomponents: API, Plugin Manager and Plugin Tester. Plugin Developers can make requests to the API, which translates these requests to Create, Read, Update, Destroy (CRUD) commands for the Plugin Manager. The Plugin Manager executes these commands on the Plugins that are kept in the Plugin Repository. The quality of the framework depends on the quality of the plugins that it offers. Therefore plugins should be thoroughly tested before being added to the framework, to guarantee quality. The Plugin Tester component is responsible for this testing. Tests should include checking whether the plugin implements the Plugin Model correctly, whether the plugin meets the performance requirements, etc. When a plugin passes these tests it is added to the Plugin Repository, so that end-users can install the plugin and use it for their applications.

Figure 2.10: Producer (a) and Consumer (b) Distribution component-connector diagrams

2.3.2 Dynamic views

Dynamic views depict the behavior of the system and complement the static views. They are documented using sequence diagrams that show an explicit sequence of messages between architecture elements describing a use case [40]. Two key use cases are presented here: adding a plugin to the stream and linking plugins to build the stream.

Add plugin to stream

Figure 2.11 presents the sequence diagram for adding a Producer plugin to the framework. The framework is assumed to be running, the user has created a stream S and the Producer Plugin A is correctly installed. The end-user executes the command to add A to stream S on the Client Interface, which passes the command to the Stream component. The Stream component requests the creation of a microservice instance of A, which is created by the Producer Kernel. When the Producer Plugin is instantiated, the Producer Kernel creates a Plugin Model of A and adds it to its references, so that the instance of A can be reached for future commands. Afterwards the Stream Manager is informed of the success, upon which the Stream Manager can add A to the Stream Model, ready to be linked with other plugins. The user is notified of this success and can continue building. If A could not be instantiated (due to not being installed, not being installed correctly, etc.), A is marked as 'broken' and the user is notified that the action could not be completed. When a plugin is marked as 'broken' it can no longer be used and needs to be reinstalled. The sequence diagram for adding a Consumer Plugin is similar but replaces the Producer components with the Consumer components.

Link plugins

Figure 2.11: Add a Producer Plugin to the stream

Figure 2.12 presents the sequence diagram for linking two plugins in a stream. In the sequence diagram two Consumer Plugins A and B are linked; this extends naturally to a Producer Plugin linking with a Consumer Plugin. The framework is assumed to be running, the user has created a stream S, and the plugins A and B have been instantiated and added to the stream. The end-user executes the command to link A and B in stream S on the Client Interface, which passes the command to the Stream component that checks if the link is valid for the Stream Model S. Linking can only be done if the stream is in the STOP state and if the plugins are already in the stream. If the link is valid, the Stream Manager can begin linking the plugins. To link the plugins in the order A-B, A is added as a source for B and B is added as a listener for A. These subsequences are found in their corresponding frames in the diagram and are very similar. The Stream Manager makes the request to add the source/listener to the Kernel, which finds the corresponding plugin and makes the request on the corresponding Plugin Model. If the Plugin succeeded, the Plugin Model is updated and the Stream Manager is notified of this success. If both plugins have successfully set the source and listener, the Stream Model layout is updated with the link. Should the source/listener request fail for one of the plugins, the change is rolled back and the end-user is notified.

2.3.3 Deployment views

The different deployment configurations are illustrated via deployment diagrams using the UML 2.5 specification [48]. 'Host' specifies the device on which components are deployed. The 'microservice' indicates the isolated environment in which components are running. These isolated environments on the host are realized as software containers that enable portability of the components to other deployment configurations; this concept is further discussed in Section 3.3. The Producer and Consumer Distribution components were left out of the diagrams, as they are always distributed on a different host than the core framework. Two deployment configurations are presented: the local configuration, which deploys the components on a single device, and the distributed configuration, which deploys each component on a separate device. These configurations are presented in Figure 2.13.

Figure 2.12: Link two plugins in a stream. The 'format request' and 'translate request' actions in the API components have been omitted to reduce clutter in the diagram but are executed by the API components


Local configuration deployment

The local configuration deploys the framework on a single local device. The configuration is depicted in Figure 2.13a. Because the framework is deployed as a whole, it can operate offline. This configuration is useful for image processing applications that can't rely on a stable network connection, for example in remote locations or densely built-up areas. The components are still deployed as separate microservices due to the architecture of the framework. This has an impact on the performance of the framework, because every interaction between components uses either the HTTP message protocol or the RTP protocol, which introduces extra overhead compared to direct invocation of commands.

Distributed configuration deployment

The distributed configuration deploys the framework on multiple devices. The components are distributed over these devices, made possible by the microservice isolation and communication protocols. This configuration is depicted in Figure 2.13b. Obviously, in this configuration each component of the framework must have a reliable network connection to communicate with the other framework components. This configuration could be used, for example, for a security application. The end-user has the Stream module running on a master node that controls several cameras. The end-user can configure his image processing application through the Client Interface running on his device, which communicates with the Stream module running on the master node. The master node can control each camera by communicating with the Producer component. If, for example, the security application requires autonomous detection of trespassing people, a computationally intensive task, the Consumer Plugins could need dedicated hardware that is only available on another device. The Consumer component can then be deployed on that dedicated device, and the Stream component can again communicate with it over the network. The success of this configuration depends on the availability and capacity of the network. If the network fails, commands and media can't come through and the framework can no longer execute. Due to the distributed nature, performance will also be worse compared to the local configuration, because each request between the components travels over a network that can experience delays.

Figure 2.13: Deployment diagrams. (a) Local configuration; (b) distributed configuration


Chapter 3

State of the art and technology choice

To build and test a proof of concept implementation of the architecture presented in Chapter 2, several state of the art technologies can be used as support for the framework. These are presented in Sections 3.1, 3.2, 3.3 and 3.4. For each category a choice is made that will serve as the basis for the implementation of the proof of concept discussed in Section 3.5. Readers already familiar with the presented technologies can safely skip ahead to Section 3.5.

3.1 Thermal camera options

This section aims to provide an overview of some currently commercially available thermal cameras. The overview is not a complete list of all products offered by all vendors. The data was gathered in September 2017, so some products may since have been discontinued and new products launched. Several parameters are collected for each product. Section 3.1.1 discusses why these parameters are important to assess the quality of a thermal camera. Section 3.1.2 aggregates these parameters and presents insights into the data. The full list of specifications can be found in Appendix B.

3.1.1 Parameters

The following parameters were considered for the comparison: price, physical specifications, image quality, thermal precision, interfaces, energy consumption, help and support, user experience and auxiliary features.

Price

Thermal cameras are relatively expensive when compared to visible light cameras. For example, a 20 megapixel (MP) visible light camera can cost as little as 100 euro, while thermal cameras, having a much lower image resolution, can cost as much as 15000 euro. Prices for thermal cameras cover a very wide range, and budgets are limited in practice.


Physical specifications

Two specifications are considered: the weight and the dimensions of the camera. Drones have a limited carry weight due to maximal carrying capacities and a faster draining of battery life when carrying heavier loads. Lighter and smaller cameras are preferred for usage with drones, but these often offer lower image quality and fewer features than the heavier cameras.

Image quality

Image quality specifies how much information an image can possibly hold. It consists of four parameters: resolution, capture frequency (frame rate), field of view and radiometric information. Image resolution is the amount of detail an image holds. Higher resolution cameras can capture more details in a scene, resulting in a sharper image that holds more information. Due to the higher level of detail, smaller objects can also be seen, allowing scenes to be viewed from larger distances. Drones capture images from relatively large distances, so good resolutions are required for the images to be useful. Image resolution is measured in pixel density, presented as the product of the number of pixels in the width and height of the image. The highest resolution found for the compared cameras is 640 x 512 pixels. Some cameras offer a visual camera next to the thermal camera. This allows an overlay of the visual image and the thermal image, so-called Multi Spectral Dynamic Imaging (MSX). This creates artificially sharper images, because edges are more visible in the visual image. Figure 3.1 depicts a thermal-only image and an MSX image of a dog; the MSX image is clearly sharper. MSX is a more low-cost solution to produce sharper images compared to increasing the thermal resolution, as visible light cameras are less expensive [7].

Figure 3.1: Thermal image (a) and MSX image (b) of a dog

The capture frequency or frame rate dictates how many frames the camera can capture per second. Higher frequency cameras are able to track dynamic scenes better. The field of view is the angle through which the camera is sensitive to thermal radiation; it determines the extent of the world that can be seen by the camera. Bigger fields of view capture more of the environment in one image. Most cameras allow various lenses to be mounted onto the camera, which allows for greater flexibility in choosing the field of view. Radiometric image information is thermal information embedded with the infrared image that can be analyzed after recording. Radiometric information characterizes the distribution of the thermal radiation's power in space and specifies the temperature per pixel exactly. Regular thermal images use a relative scaling of temperatures that are mapped onto a colorspace, with some color being the hottest color in the image and another color the coldest. For example, in Figure 3.1a the Iron color scheme is used, which maps the cold regions of the image on blue color variants and warmer regions on red and yellow variants. Radiometric information can give a very detailed description of the radiation pattern of a scene.

Thermal precision

Thermal precision specifies the temperature range, the sensitivity and the accuracy of the temperature measurements. The temperature range indicates the minimum and maximum temperatures a camera can detect. A larger temperature range comes with a trade-off in sensitivity and accuracy. Cameras often offer different modi of operation and operate using different intervals according to the accuracy needed in a scene. Sensitivity indicates the ability of the camera to record finer distinctions in temperature. Accuracy is the margin of error for temperature readings on the thermal camera. An accuracy of 5 degrees Celsius for small temperature ranges and 20 degrees Celsius for large temperature ranges is commonly found; the increase in error margin is a trade-off for the larger temperature interval. Objects emit infrared waves in various forms (due to black-body radiation [7]). To accurately compare the temperatures, cameras often implement emissivity corrections that normalize the measurements.

Interfaces

Cameras can communicate with other devices via several interfaces during use. Cameras mounted on a drone cannot be accessed during flight and need these interfaces to transfer data. USB and HDMI are the most commonly found interfaces to connect the camera with an on-board processing unit, gimbal or battery. MAVLink [53] is a very lightweight, header-only message marshalling library for micro air vehicles (drones). When a camera provides this interface, it allows for a very efficient communication scheme to control the camera remotely. Other interfaces include Bluetooth and Wi-Fi.

Energy consumption

A device mounted on a drone has a limited energy source at its disposal. The less energy the camera consumes, the longer the drone can operate. This can even lead to lighter batteries being used in-flight, reducing the carried weight and therefore also the energy consumption. Typically the energy consumption of a camera is much lower than that of the drone itself, so this is a minor specification. Input voltage and power consumption are specified.


Help and support

How the camera is supported by the company has a big impact on the ease of developing applications for the camera platform. User manuals, phone or email support and FAQs are very helpful. Should the camera malfunction, a product warranty is necessary to recover the broken product.

User experience

The user experience is another important factor, as there can be a difference between the technical specifications and the actual experience of the user. The user experience is measured in a number of good and a number of bad reviews. Reviews are scored from zero to five stars, with zero being a very bad experience and five a very good experience. A good review is scored three stars or more, a bad review less than three stars.

Auxiliary features

Some cameras offer even more features than the ones mentioned above. These can be a connection with the Global Positioning System (GPS) to indicate where images were captured, a software application to interact with the camera, analysis functionality, tracking, etc.

3.1.2 Comparative analysis

It can be seen that FLIR is the market leader in thermal solutions for drones. They offer the largest product line, and products from other companies often utilize one of their camera cores. Figure 3.2a plots the retail price against the thermal resolution. Cameras with high and low resolutions are found across all price ranges; clearly, other features determine the price of a thermal camera. A feature function is defined that maps the features of a thermal camera onto an integer (a sketch of this scoring follows the list below). The function increments the integer if:

• The camera has MSX support
• The camera has a standard data format (not just an analog or digital signal)
• The camera offers radiometric information
• The image resolution is 640 x 512 pixels, the highest resolution found for these products
• The sensitivity is smaller than 100 mK
• The camera offers emissivity correction
• The camera offers a USB interface
• The camera offers a MAVLink interface
• The camera offers an HDMI interface
• The camera offers a Bluetooth connection
• The camera offers a Wi-Fi connection
• The camera offers GPS tagging
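The following minimal sketch expresses this feature function in Python; the camera is assumed to be a dict of booleans, one per feature above, and the key names are illustrative:

def feature_points(camera):
    # Count how many of the listed features a camera offers.
    features = [
        'msx', 'standard_data_format', 'radiometric',
        'max_resolution', 'sensitivity_below_100mK',
        'emissivity_correction', 'usb', 'mavlink', 'hdmi',
        'bluetooth', 'wifi', 'gps_tagging',
    ]
    return sum(1 for f in features if camera.get(f, False))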

Figure 3.2b plots these feature points versus the retail price. This gives a more log-like relationship: the features of a camera determine the price much more than just the image quality. For a price of less than 5000 euro, thermal cameras are found that implement most basic features. Beyond that, the price increases rather fast while relatively few features are added. These are features like radiometry that require additional hardware, which greatly increases the price of the camera.

3.2 Microservices frameworks

The architecture presented in Section 2.3 relies heavily on the microservices pattern. Therefore this section presents several microservices frameworks to support this architecture. Figure 3.3 depicts the results of the Rethink IT survey querying the most used frameworks for microservices by developers [54]. The most popular frameworks, Java EE and Spring Boot, are written in Java. The Java EE framework is more of a one-stop-shop framework offering many more functionalities than just a backbone microservices framework and is therefore not considered. Spring Boot is clearly a very popular and mature framework, more streamlined for microservices. Vert.x is an upcoming framework renowned for its performance, making it worthwhile to explore. Python is an upcoming language for web development, and because it is excellent for prototyping, several frameworks for this language are explored as well. The frameworks presented here are Vert.x version 3.5.1, Spring Boot version 2.0, Flask version 0.12, Falcon version 1.4.1 and Nameko version 2.9.0.

3.2.1 Flask

Flask is a micro web development framework for Python. The term "micro" means that Flask aims to keep its core simple but extensible. Flask is an unopinionated framework, as it only provides a glue layer to build a REST API around the application.

Figure 3.2: (a) Camera resolution compared to retail price; (b) camera feature points compared to price

Figure 3.3: Rethink IT "Most used tools and frameworks for microservices" results [54]

However, it provides a large list of extensions if extra functionality is required [55]. Starting a microservice is very simple, as illustrated in Listing 1. Flask uses the concept of Python decorators [56] to bind Python functions to a REST API. In Listing 1, for example, the function service_status() is linked to the '/' resource. When a user issues an HTTP GET request on this resource, the route() function on the app object is called by Flask. Because route() is a decorator for the service_status() function, service_status() is wrapped and passed to the route() function, so that when a user issues an HTTP GET request, the service_status() function that was passed gets called. This allows for an easy construction of the REST API: just decorate all the functions of the microservice with the correct Flask decorator.

from flask import Flask

app = Flask(__name__)

@app.route('/')
def service_status():
    return 'service status'

if __name__ == '__main__':
    app.run()

Listing 1: Minimal Flask application

Because Flask is a microframework, its memory footprint is small, with the binary file being only 535 KB large. It is in use by several large companies, such as Netflix and Reddit [57]. In a production environment the default Flask web server is not sufficient, as it only serves one user at a time. However, for prototyping it is an excellent framework [55].


3.2.2 Falcon

Falcon is a bare-metal Python web framework that differentiates itself in performance when compared to other frameworks. It targets microservices by being even more lightweight and faster than frameworks like Flask; in a benchmark test it achieves 2.7 times the speed of Flask [58]. The framework seems less mature and has not been adopted by many companies [59]. It is not considered for the prototype of the system but could be used in production, as it achieves better performance.
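For comparison with Listing 1, a minimal Falcon service is sketched below, based on the Falcon 1.x API, in which responders are methods named after HTTP verbs and falcon.API() is the WSGI entry point:

import falcon

class StatusResource:
    # Falcon maps HTTP verbs onto on_* responder methods.
    def on_get(self, req, resp):
        resp.media = {'status': 'service status'}

app = falcon.API()  # WSGI application, served by e.g. gunicorn
app.add_route('/', StatusResource())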

3.2.3 Nameko

Nameko is a framework specifically built for creating microservices in Python. Next to offering a REST API, it also offers asynchronous events over the Advanced Message Queuing Protocol (AMQP). It is only meant to be used for microservices, not for web applications that serve content. It is a relatively young framework and is not backed by any major companies as of yet; it is, however, endorsed by the developer of the Flask framework [60].
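A minimal Nameko service is sketched below; the service and method names are illustrative, and running it requires an AMQP broker such as RabbitMQ:

from nameko.rpc import rpc

class StatusService:
    # The service name is how other microservices address this
    # service over AMQP.
    name = 'status_service'

    @rpc
    def status(self):
        return 'service status'

Such a service is started with the nameko run command and can then be called from other services through Nameko's RPC proxies.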

3.2.4 Vert.x

Vert.x is a toolkit for building reactive applications on the Java Virtual Machine (JVM). This framework follows the reactive systems principles. These principles are used to achieve responsiveness and build systems that respond to requests in a timely fashion, even with failures or under load. To build such a system, reactive systems embrace a message-driven approach: all the components interact using messages sent and received asynchronously. Reactive microservices built with Vert.x have the following characteristics: autonomy, asynchronicity, resilience and elasticity. Vert.x is a toolkit and can be used as any other library, which makes it very flexible. It provides a large set of features: metrics, different programming languages, different protocols, templating, data access, cluster management, etc.

Vert.x embraces the asynchronous development model, which can be seen in Listing 2.

import io.vertx.core.AbstractVerticle;

public class Server extends AbstractVerticle {
    public void start() {
        vertx.createHttpServer().requestHandler(req -> {
            req.response()
                .putHeader("content-type", "text/plain")
                .end("Hello from Vert.x");
        }).listen(8080);
    }
}

Listing 2: Vert.x example


The event which occurs is the HTTP request. On arrival of the event, the Handler is called and executed. The Handler is chained to a listen request and does not block the calling thread. The Handler is only notified when an event is ready to be processed or when the result of an asynchronous operation has been computed [61].

3.2.5 Spring Boot

Spring Boot is an opinionated Java framework for building microservices, based on the Spring dependency injection framework. It allows developers to create microservices with reduced boilerplate and configuration. For simple applications it provides a syntax similar to Flask in Python and uses annotations (comparable to Python decorators) for routing; an example is given in Listing 3. The framework handles most of the routing and request handling but restricts the developer in application structure. The framework is not lightweight and performs less well than Vert.x [62].

@RestController
@RequestMapping("/api")
public class HelloRestController {

    @RequestMapping(method = RequestMethod.GET, value = "/hola",
                    produces = "text/plain")
    public String hello() {
        return "Hello Spring Boot";
    }
}

Listing 3: Spring Boot example

3.3 Deployment framework

To meet the modifiability and interoperability requirements discussed in Section 2.1.2 and support the different deployment configurations in Section 2.3.3, Linux containers (LXC) are used. A container is a lightweight operating system running inside the host system, executing instructions native to the core CPU and eliminating the need for the instruction-level emulation that Virtual Machines use. Containers provide an identical, isolated runtime environment for processes without the overhead of virtualization. This makes them perfect for highly portable software, as only the container needs to be moved and it can directly be executed on any system supporting containers [63]. First, the concept of containers is introduced in Section 3.3.1. Second, several container frameworks are presented in Sections 3.3.2, 3.3.3 and 3.3.4.

3.3.1 Containers

Containers sandbox processes from each other and are often described as the lightweight equivalent of virtual machines. The difference between a virtual machine and a container is the level of virtualization. Virtual machines virtualize at the hardware level, whereas containers do this at the operating system (OS) level. The achieved effect is similar, but there are significant differences. Containers make available protected portions of the OS and share its resources. Two containers running on one OS have their own OS abstraction layer and don't know they are running on the same host. This provides a significant difference in resource utilization. Virtual machines provide access to hardware only, so it is necessary to install an OS; as a result there are multiple OSs running, which gobble up resources. Containers piggyback on the running OS of the host environment: they merely execute in spaces that are isolated from each other and from certain parts of the OS. This allows for efficient resource utilization and for cheap creation and destruction of containers. Consequently, starting and stopping a container is equivalent to starting and stopping an application [64, 65]. This comparison is illustrated in Figure 3.4.

Containers offer several advantages over running a process directly on the system. Due to the OS virtualization of the containers, software is always deployed on the same operating system, the one defined in the container. This allows for a 'write once, run everywhere' scenario, which makes the system portable to a range of devices. Containers communicate with each other using protocols such as HTTP. This allows the processes in containers to be written in any programming language, using any external library that is needed. For the system this means that if the Producer and Consumer Plugins are packaged as containers, they can effectively be made in any available technology, greatly enhancing the extensibility of the system.

3.3.2 LXC

Linux containers are the basis on top of which other container frameworks are built. LXC provides a normal OS environment similar to a VM; the containers in this framework behave almost identically to a VM and can run multiple processes. LXC can be used directly, but offers only low-level functionalities and can be difficult to set up [67].

3.3.3 Docker

Docker started as an open-source project at dotCloud in early 2013. It was an extension of the technology the company had developed to run its cloud applications on thousands of servers [64]. Now Docker is a standalone, mature company providing a software container platform for the deployment of applications [66]. Docker provides two main services: a simple toolset and API for managing Linux containers, and a cloud platform which provides easy access to recipes for software containers created by other developers [68]. Docker is the container technology with the most public traction and is becoming the container standard at the time of writing, due to its functionalities and very responsive community. It offers functionality to easily build and run containers, but also to manage them in large clusters. A design decision that limits Docker is that each container can only run one process at a time. Another limitation is the Docker client: Docker consists of a daemon that manages the containers and the API Engine, plus a REST client. Should this client fail, dangling containers can arise [69].
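To make this concrete, a hypothetical recipe for packaging one of the framework's Flask-based plugins as a Docker container could look as follows; the base image tag, paths and port are illustrative assumptions:

# Hypothetical Dockerfile for a Flask-based plugin.
FROM python:3.6-slim

WORKDIR /usr/plugin
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .

# The plugin serves its REST API on Flask's default port.
EXPOSE 5000
CMD ["python", "plugin.py"]

Such an image is built with docker build and started with docker run, after which the framework can reach the plugin's REST API over the container network.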

3.3.4 rkt

CoreOS' rkt is an emerging container technology providing an API engine similar to the Docker API Engine that can run LXC containers as well as Docker containers. rkt focusses on security and standardization, and is specifically designed to run in cloud environments. Unlike Docker, rkt does not use a daemon process with a REST client: the command line tool executes all the operations, which makes the framework more reliable. rkt is not as mature as Docker yet. It is portable to multiple Linux environments, but not yet to macOS and Windows [70].

Figure 3.4: Containers (a) compared to virtual machines (b) [66]

3.4 Object detection algorithms and frameworks

As stated in Section 1.3.2, object detection is the computer vision task of detecting which objects are present in an image and where they are located. Several approaches to this problem have been proposed, some of which focus on thermal images. This section gives a small overview of different existing techniques. For the technical details, the reader is referred to the respective articles on the algorithms.

3.4.1 Traditional approaches

Traditional approaches include hot-spot detection techniques and Adaptive Boosting (AdaBoost) with various feature extraction techniques, such as Aggregated Channel Features (ACF) and Integral Channel Features (ICF). These methods rely on clever feature engineering solutions that use domain knowledge or statistical insights to transform the raw dataset into a specific set of features in order to find patterns [32].

Hot-spot detection

Hot-spot techniques work on the assumption that people have an overall higher body temperature than most of the background in a thermal image. These techniques first select candidate objects: the hot-spots in the image. The hot-spots define the region on which a classifier is run and are thus the localization step of the object detection problem. Afterwards, a classifier is trained on these candidates. Xu et al. used a Support Vector Machine (SVM) classifier to classify whether a hot-spot represented a pedestrian [71]. Nanda et al. used a Bayes classifier to classify the hot-spots [72]. These methods are generally not applicable, because people often are not the only hot-spots in thermal images.

AdaBoost

AdaBoost is a machine learning algorithm that utilizes the output of so-called weak learning algorithms (weak learners) and combines their outputs into a weighted sum that forms the output of the boosted classifier. AdaBoost modifies the weak learners in favor of data points misclassified by previous classifiers [73]. Viola and Jones built a detection algorithm that uses two consecutive frames of a video sequence and trains the AdaBoost classifier on both motion and appearance information [74]. Davis et al. use a two-stage template approach that initially performs a fast screening procedure with a generalized template, using a contour saliency map to locate potential person locations. Any window located in the first phase is then forwarded to the AdaBoost algorithm to validate the presence of the person. Dollár et al. extracted features using different ICF and ACF [35]. ICF and ACF compute features by calculating several aggregations over the different channels of an image, such as gradients, color histograms and colors. Goedemé et al. expanded these detectors with extra thermal channels to achieve results comparable to Dollár et al., but for thermal images [36].

3.4.2 Deep learning

Over the past few decades there has been a shift in proposed solution methods towards deep learning. Deep learning for object detection uses Convolutional Neural Networks (CNNs). CNNs are a specialized kind of neural network for processing data that has a known grid-like topology, such as images. CNNs generally consist of three steps: a convolution step that creates a feature map of a region of an image, a pooling step that summarizes the output of the convolution step, and finally a fully-connected network that learns from the features extracted in the previous steps [75]. The key difference is that these algorithms do the feature extraction in the convolutional layers and do not need feature engineering like the algorithms presented in Section 3.4.1. This requires quite a bit of computing power when compared to the traditional methods. Since deep learning made the shift to computing on Graphical Processing Units (GPUs), computations became feasible and these models proved to achieve very good performance on various machine learning problems. Two model types are described: two-stage networks (R-CNN, R-FCN) that extract image regions first and make separate predictions on each region, and dense networks (YOLO, SSD, NASNet, RetinaNet) that operate on the image as a whole.

Region-based Convolutional Network (R-CNN)

R-CNN uses a selective search method to find objects, an alternative to an exhaustive search in an image. It initializes small regions in an image and merges them hierarchically. The detected regions are merged according to color spaces and other similarity metrics [76]. R-CNN combines this selective search with a CNN per region to find out which objects are in these regions [77].


Fast(er) Region-based Convolutional Network (Fast(er) R-CNN)

Fast R-CNN was developed to reduce the time consumption caused by the high number of models necessary to analyze the region proposals from the selective search method in R-CNN. Instead of using a CNN for each region, a single CNN with multiple convolutional layers is used [78]. Faster R-CNN drops the region proposals detected with the selective search method (which is computationally expensive) and introduces the Region Proposal Network (RPN) to directly generate region proposals. This accelerates training and testing and improves performance [79]. Mask R-CNN is an extension of the Faster R-CNN model that adds a parallel branch to the bounding box detection to predict object masks, that is, the per-pixel segmentation of an object in the image [80].

Region-based Fully Convolutional Network (R-FCN)

R-FCN takes a more efficient approach to region detection. Instead of applying a per-region subnetwork multiple times, R-FCN uses a fully convolutional network with computations shared across the entire image. This allows it to be compatible with multiple backbone networks, such as Residual Networks [81].

You Only Look Once (YOLO)

The previously discussed methods need to run the same computations on different parts of an image multiple times before generating a prediction, which makes them relatively slow. The YOLO model [82] was developed with the requirement to make predictions as fast as possible, trading off accuracy for speed to move towards real-time object detection. YOLO directly predicts bounding boxes and class probabilities with a single CNN in a single evaluation, instead of first detecting object regions and predicting classes afterwards. This has some benefits over the other methods. YOLO is very fast when compared to other methods, capable of processing images in real time, up to 155 frames per second for some variants. It also learns contextual information, because it trains on entire images instead of regions, and it generalizes better to other image types. All these benefits come at the cost of accuracy: YOLO struggles to precisely localize some objects, especially small ones. The subsequent versions of YOLO focus on delivering more accuracy. The algorithm is currently in its third version [83].

Single-Shot Detector (SSD)

The SSD [84] is similar to YOLO and predicts all the bounding boxes and class probabilities in one single evaluation (single shot) using one CNN. The model takes an image as input, which passes through multiple convolutional layers. Compared to YOLO, SSD achieves higher accuracies by adding convolutional layers and including separate filters for detections with different aspect ratios.

Neural Architecture Search Net (NASNet)

NASNet takes a different approach and does not design the network architecture for the object detection beforehand, but instead trains a Recurrent Neural Network (RNN) to generate the model description of the CNN that performs the object detection. The RNN is trained using reinforcement learning. The NASNets built for object detection perform as well as most networks but are slower to train [85].

RetinaNet

RetinaNet is the latest state-of-the-art object detector. It is a simple dense detector similar to YOLO and SSD, but it matches the accuracy of two-stage detectors like the R-CNN variants. RetinaNet proposes that the foreground-background class imbalance encountered when training dense detectors leads to lower accuracy compared to two-stage detectors. RetinaNet uses a new method called Focal Loss that focuses training on a sparse set of examples to counter this class imbalance, which results in very good performance and very fast detection [86].

3.4.3 Frameworks

While the previous sections focused on different algorithms, actually implementing these algorithms is not straightforward. That's why over the past years several deep learning frameworks have emerged that try to provide easier access to this technology. Some frameworks provide APIs for some of the object detection algorithms presented above. This section gives a small overview of such frameworks. Most frameworks differ quite a bit from each other, which makes porting a model from one framework to another rather difficult. The Open Neural Network Exchange Format (ONNX) initiative hopes to propose a standard for interchangeable models, which should aid switching among frameworks in the future [87]. Note that there are other frameworks available, but those do not yet support object detection functions out of the box.

TensorFlow

Perhaps the most well-known framework, TensorFlow is an open source machine learning library for neural networks with a Python interface. It was developed by Google for internal use and released to the public in 2015 [88]. Recently an Object Detection API has been built for TensorFlow, which implements models pre-trained on benchmark datasets, such as SSD, Faster R-CNN, R-FCN and Mask R-CNN [89]. TensorFlow offers a lot of flexibility in its use and can be applied to many machine learning problems.

Darknet

Darknet is an open source neural network framework written in C and CUDA. It is maintained by Joseph Redmon, the person behind the YOLO algorithm [90]. Darknet does not offer the flexibility that other frameworks offer, but it is easy to install and use when compared to others. Out of the box, Darknet offers an interface for YOLO; detection on a single image is a one-line command, as shown below. The open source community offers ports of this framework to other popular frameworks such as TensorFlow.
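For example, running the pre-trained YOLOv3 detector on a sample image uses the detect command from the Darknet documentation (the cfg and weights files are distributed by the project):

./darknet detect cfg/yolov3.cfg yolov3.weights data/dog.jpg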


CNTK

The Microsoft Cognitive Toolkit (CNTK) is an open source toolkit for distributed deep learning. It offers a Python, C# or C++ interface. It's one of the first frameworks to support ONNX. CNTK offers an API for Fast R-CNN and Faster R-CNN [91].


3.5 Technology choice

This section presents the choices made for each of the technology categories described in the previous sections.

3.5.1 Thermal camera

The FLIR One Pro and Therm-App were selected as thermal cameras for the proof of concept. Both offer relatively high quality images, 160 x 120 pixels and 320 x 240 pixels respectively, relative to their price of 469 and 937.31 euro respectively. These prices are at the low end of the product ranges offered. Both cameras are designed for use on a smartphone, which makes them ideal for prototyping, since these devices are widely available and setting up the camera via the apps from the respective companies is easy. Both cameras provide MPEG-4/H.264 encoded video output, easily understood by most playback software. Both cameras can be found in the lower left of Figure 3.2b.

For deployment in production-ready applications with drones, these cameras are not the best choice. They aren't specifically designed to be used on a drone and don't offer the best image quality possible. In those applications, platforms like the FLIR Vue, the Duo, the Zenmuse or the Workswell Wiris are better candidates due to their superior image quality, MAVLink interfaces, compatibility with commercially available gimbals to mount them on drones, and other features.

3.5.2 Microservices framework

Flask is selected as the microservices framework. The arguments for Flask are as follows. Flask is a mature web framework with major companies backing it, which means the APIs stay consistent and the framework is stable in use. Compared to some other frameworks like Spring Boot, Flask is unopinionated, which allows for maximum flexibility during development. Flask also has a very small memory footprint, which makes it easier to deploy on less powerful on-board devices like drones. Flask is also easy to use and quick to set up, ideal for developing a proof of concept. A final argument is the familiarity of the author with Flask.

3.5.3 Deployment framework

Docker is selected as the deployment framework. Docker is the most mature and well-supported container framework at the time of writing and will likely remain important in the future. It offers the most features and is specifically designed for the microservices pattern [68].

3.5.4 Object detection

One of the requirements specified in Section 2.1 is real-time streaming. Real-time object detection is only achieved by a few of the models presented in Section 3.4; candidates are YOLO, SSD and RetinaNet. As there is no framework that provides an implementation of the RetinaNet algorithm out of the box at the time of writing, this algorithm is not selected. SSD is implemented in the TensorFlow Object Detection API. However, at the time of writing this API has not been found stable: when trying out the API, fallbacks to older versions of the software were needed to be able to test the models, because the Object Detection API used older versions of the TensorFlow framework. Therefore YOLO, implemented in the Darknet framework, is selected. Darknet offers a stable distribution, and YOLO achieves good results and has proven to be a very fast detector capable of real-time object detection.


Chapter 4

Proof of Concept implementation

To prove the concept of the architecture discussed in the previous chapters, a prototype is implemented. First, the goals and the scope of the prototype are presented in Section 4.1. Next, the components of the prototype are presented in Section 4.2. Finally, the known limitations and issues of the prototype are presented in Section 4.3.

4.1 Goals and scope of prototype

The goal of the prototype is to prove the QARs defined in Section 2.1. The prototype focusses on the ASRs: performance, interoperability and modifiability. The usability, security and availability requirements are left out of the scope of the prototype, because they are not ASRs and require significant resources (focus groups, long-term deployment, etc.) to test.

The components that are implemented in the prototype are the Client Interface, Stream, Consumer and Producer, because they represent the core functionality of the framework: building image processing application streams using plugins. The Producer and Consumer Distribution components enable third party plugin developers to add their functionality to the framework. These are distribution functionalities, which are out of scope for the prototype. The prototype only supports one video stream. All functions presented in Figure 2.1 are implemented, with the exception of 'Install plugin', 'Uninstall plugin', 'Add plugin', 'View plugin', 'Remove plugin' and 'Update plugin', as they are only supported via the Producer and Consumer Distribution components. The prototype is deployed on a local device. Distributed deployment configurations require small changes in the implementation (see Section 4.3).

4.2 Overview of prototype

4.2.1 General overview

The prototype consists of four main components: a cli, streamer, producer and consumer. The cli process is the Client Interface, implemented as a textual Command Line user Interface (CLI), which allows a user to interact with the prototype through textual commands in a shell. This process is deployed on the local machine. The streamer, producer and consumer processes are deployed as microservices in their own Docker containers. The prototype is initialized through the cli, which spins up the Docker containers of the other processes. This is achieved with the tool docker-compose. Compose is a tool for defining and running multi-container Docker applications. The compose YAML file defines the configurations for the microservices; Compose uses these configurations to start and stop the application with a single command [92]. A snippet of the compose file for the application is given in Listing 4. Containers are specified as services. The example service configuration given is that of the producer. First, the name of the container is specified, which overwrites the default name, as the container name is used as hostname for the container in Docker [93]. The build configuration specifies where the container build recipe is situated. The port mapping allows processes from the localhost to access processes in the container; for the producer service this is only used for debugging. The volumes configuration specifies folders from the host to be mounted into the container. This configuration mounts in the source code and resources. It also provides access to the Docker socket to allow interaction with the Docker host (see Section 4.2.4).

services:
  producer:
    container_name: producer
    build:
      context: ./producer
      dockerfile: Dockerfile
    ports:
      - "8080:80"
    volumes:
      - ./producer:/usr/producer
      - /var/run/docker.sock:/var/run/docker.sock

Listing 4: docker-compose.yml snippet of the prototype
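With this file in place, the whole stack can be managed with the standard Compose commands, which the cli essentially wraps:

docker-compose up -d   # build (if needed) and start all containers
docker-compose down    # stop and remove the containers and network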

All containers are connected to a Docker bridge network [93] for communication. A bridge network uses a software bridge to allow connected containers to communicate while providing isolation from containers that are not connected to that bridge network. The bridge network applies to containers running on the same Docker host; the network is thus confined to the local Docker host and is not distributed over different devices. The bridge network has some advantages (a request sketch follows the list):

• The bridge provides better isolation and interoperability between containers. Containers automatically expose all ports to each other and none to the outside world.
• The bridge provides automatic Domain Name System (DNS) resolution between containers. This means that containers resolve the IP address of each other by container name or alias.
• Containers can be attached to and detached from the networks on the fly.
• Environment variables are shared, which can be used to provide equal environment configurations for every container on the bridge.


4.2.2 Client interface

The Client Interface is implemented by the cli component. The cli is built in Python with the Click package by Armin Ronacher [94]. Click is a CLI creation kit which aims to make the implementation of CLIs easier. It resembles the Flask framework, as it also leverages Python decorators [56] for most of its functionality. The source code of the cli is located in the mosquito.py file. Commands can be executed by calling python mosquito.py, or by calling mosquito if the source code is installed into the Python environment. The following commands are implemented:

source code is installed into the Python environment The following commands are implemented

bull mosquito Displays a help page listing command groups

bull mosquito on Starts the application

bull mosquito off Shuts down the application

bull mosquito plugins Groups all commands to manage plugins Plugins can only be listed not installed or unin-

stalled as the Remote Producer and Remote Consumer are not implemented

bull mosquito plugins ls Lists all locally installed plugins

bull mosquito stream Groups all commands to manipulate the current stream

bull mosquito stream add Adds a producer or consumer to the stream

bull mosquito stream delete Deletes a producer or consumer from the stream

bull mosquito stream elements List all producers and consumers that were added to the stream

bull mosquito stream link Links two stream plugins

bull mosquito stream pause Pauses the stream

bull mosquito stream play Plays the stream This means the stream is processing media

bull mosquito stream print Prints the stream layout (which plugins are linked)

bull mosquito stream stop Stop the stream

bull mosquito stream view View the stream on the local device

A typical use of the application would be the following. First, the application is started using mosquito on. Then plugins are added to the stream using mosquito stream add [ELEMENT_TYPE] [ELEMENT]. This will instantiate the corresponding plugins in the Producer and Consumer component. The plugins are linked in order using mosquito stream link [ELEMENT_1] [ELEMENT_2]. The stream is then set to play using mosquito stream play. When the last plugin is linked to the special local plugin, the user can view the output from that plugin using mosquito stream view, which opens up a window in which the stream is displayed.
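A minimal sketch of how such a command hierarchy could be declared with Click is given below; the command bodies are placeholders and not the actual implementation of mosquito.py.

import click

@click.group()
def mosquito():
    """Entry point: displays the help page listing command groups."""

@mosquito.group()
def stream():
    """Groups all commands to manipulate the current stream."""

@stream.command()
@click.argument("element_type")
@click.argument("element")
def add(element_type, element):
    """Adds a producer or consumer to the stream (placeholder body)."""
    click.echo(f"Adding {element_type} {element}")

if __name__ == "__main__":
    mosquito()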


As specified in the software architecture (see Section 2.3), the Client Interface can use the Stream Commands interface of the Stream component. As specified in Section 2.3.1, this interface is a REST API, so the client can use this interface through the HTTP protocol. This is done with the Python Requests library [95].
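For example, setting the stream state from the cli could come down to a single HTTP request; the host, port, and payload shown here are illustrative assumptions rather than the prototype's exact values.

import requests

# Hypothetical example: ask the Stream component to start playing.
response = requests.put(
    "http://localhost:5000/stream/state",
    json={"state": "PLAY"},
)
response.raise_for_status()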

4.2.3 Stream

The Stream component is responsible for the logical representation of the stream (see Section 2.3.1), implemented as the streamer component. The component consists of three objects: api, which contains the REST API; the StreamManager; and the Stream object, representing the Stream Model in the framework. Requests to the other microservices are sent using the Python Requests library. The prototype implementation only supports one stream with a chain-like model. This means that, unlike the stream depicted in Figure 2.6, a plugin can't have multiple sources or multiple listeners. The Stream object manages the logical representation of the stream and manipulates the references to the plugins by forwarding commands to the producer and consumer component respectively. It contains two data structures: outline, which is the logical structure of the stream, and elements, which contains all the plugins present in the stream. In the prototype, the Stream component provides the following functionalities on its API endpoints:

• /plugins: GET: Fetches all the plugins from the producer and consumer components and returns their information.
• /elements: GET, POST, DELETE: Resource to add and delete plugins from the elements bin.
• /stream/links: POST: Resource to create links for elements.
• /stream/state: GET, PUT: Resource to update the state.
• /shutdown: POST: Shuts down the framework.
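As an illustration of how one of these endpoints could look in Flask, the sketch below implements a simplified state resource; the handler body is an assumption, not the actual streamer code.

from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/stream/state", methods=["GET", "PUT"])
def stream_state():
    # Hypothetical handler: a real implementation would delegate to the
    # StreamManager and forward the command to the plugins in the stream.
    if request.method == "PUT":
        new_state = request.get_json()["state"]
        return jsonify({"state": new_state})
    return jsonify({"state": "PLAYING"})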

Since the streamer component is the only component of the framework that interacts with outside users, it has the responsibility to gracefully shut down the framework. This is needed to solve the problem of dangling plugin containers, which run plugins that have not been stopped and removed after closing the application. Since only plugins that are contained in a stream have a running container associated, the stream can notify the Producer and Consumer components to stop and remove those containers.

4.2.4 Producer and Consumer

The Producer and Consumer component cover similar responsibilities in managing installed plugins. They are implemented in the producer and consumer components. Both components consist of the following objects: api, which contains the REST API; the Kernel, which implements the core functionalities; the PluginManager, which finds plugins installed on the device and checks if their installation is valid; and the Plugin, which is the logical representation of a plugin as described in Section 2.3.1. Commands to control the plugins are made using the Python Requests library.


For the component to be able to start, stop, and interact with the plugin containers, the component needs access to the Docker host and the Docker client running on that host. But because the component is running in its own container, it is isolated from the Docker host and can't interact with the Docker client by default. The workaround for this problem is to expose the socket on which the Docker client is running on the Docker host to the container. This is done by mounting the Docker socket of the host on the Docker socket in the container. In Docker Compose, the mounting is achieved using the following Listing:

volumes:
  - /var/run/docker.sock:/var/run/docker.sock

Listing 5: Mounting the Docker socket on the container

This has some implications on security (see Section 4.3). To interact with the now exposed Docker client, the component uses the docker-py library [96]. Listing 6 shows how a connection is made to the Docker client and a plugin container is started. The container is started from the plugin image, on the network of the framework, and is given the plugin name as the container name. Docker thus creates a DNS entry with the plugin name, which makes the container addressable by its name. This implementation limits the framework to one running container per plugin at any time.

import docker

client = docker.from_env()
container = client.containers.run(
    image=plugin_name,
    detach=True,
    name=plugin_name,
    network="mosquito_default",
)

Listing 6: Starting a plugin container

When both components are initialized, the Kernel and PluginManager are created. The PluginManager searches for a plugin_directory, which contains information on which plugins are installed on the device. Each installed plugin should have a valid image on the device; these are contained in the images directory of the Docker daemon. If the image or information file cannot be found on the device, the plugin is marked as broken and can't be used by the framework. To describe the API, the consumer API is used. The producer API is analogous, but replaces consumer with producer and doesn't have the sources endpoints. The Producer and Consumer components provide the following functionalities on the API endpoints:

on the API endpoints

bull consumers GET Retrieves a list of the installed consumers on the device on which the component is running

bull consumerslthostnamegt GET DELETE Retrieves the information of a consumer specified by the host-

name value which is the name of the consumer


• /consumers/<hostname>/state: GET, PUT: Retrieves or respectively updates the state of a consumer specified by the hostname value.
• /consumers/<hostname>/sources: GET, POST: Retrieves the sources or respectively adds a new source to the consumer specified by the hostname value.
• /consumers/<hostname>/sources/<source_hostname>: GET, PUT, DELETE: Retrieves, updates, or removes the source specified by source_hostname of a consumer specified by hostname, respectively.
• /consumers/<hostname>/listeners: All listeners resources are analogous to the sources resources.

4.2.5 Implemented plugins

Three plugins are implemented and tested: filecam (called 'Mycam' in the code), a producer that reads in a video file and transmits it in MJPEG encoding using the RTP protocol; testsrc, a producer which generates test video and transmits it in MJPEG encoding using the RTP protocol; and local, a consumer which captures incoming RTP MJPEG video frames and displays them on the local display. The filecam and local plugins are discussed, since the testsrc is similar to the filecam.

The plugins are implemented in Python, and use the GStreamer library with the Python bindings [97] for media streaming and the Flask framework to implement the API. These libraries don't have to be used by future plugins, which can just implement a REST API and provide a media stream specified in their descriptions.

Filecam plugin

The filecam image is based on the Ubuntu 17.10 image. It is chosen over lighter Linux distributions because it offers more functionality out of the box for prototyping. Other dependencies are Python 3.6, GStreamer 1.12, the GStreamer plugins-base, plugins-good, plugins-bad, plugins-ugly, libav, doc and tools packages, and python-gst.

The API of the plugin offers the following functionalities:

• /state: GET, PUT: Retrieves and respectively updates the state of the plugin.
• /listeners: GET, POST: Retrieves and respectively adds a listener on the plugin.
• /listeners/<hostname>: GET, PUT, DELETE: Retrieves, updates, and respectively deletes a listener on the plugin.

The implemented GStreamer pipeline is depicted in Figure 4.1. The pipeline consists of the following GStreamer elements:

1. filesrc: GStreamer element that reads data from a file in the local file system. This file can have any extension and is not limited to video or audio files [98]. The location property is set to the location of the file in the plugin container.


Figure 4.1: filecam GStreamer pipeline

2. decodebin: GStreamer bin that automatically constructs a decoding pipeline using available decoders and demuxers via auto-plugging [99]. Note that for some media containers and codecs the appropriate decoders must be installed. For example, to decode the MPEG streams contained in MP4 files, an H.264 decoder is needed, which can be found in the 'libav' GStreamer plugins library.

3. jpegenc: GStreamer element that encodes raw video into JPEG images [100]. This implements the MJPEG video stream, as all video frames are encoded as JPEG images.

4. rtpjpegpay: GStreamer element that payload-encodes JPEG images into RTP packets according to RFC 2435 [101].

5. udpsink: GStreamer element that sends UDP packets to the network. When combined with an RTP payload plugin, it implements RTP streaming [102]. The host and port properties are set to the hostname and port of the listener of the plugin.
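For illustration only, roughly the same pipeline could be condensed into a single parse-launch description with the Python bindings; the file location, host, and port shown are placeholder values. The prototype instead constructs and links the elements individually, as described next, to handle dynamic linking explicitly.

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# Equivalent pipeline description; location, host, and port are placeholders.
pipeline = Gst.parse_launch(
    "filesrc location=/media/video.mp4 ! decodebin ! jpegenc "
    "! rtpjpegpay ! udpsink host=consumer-plugin port=5000"
)
pipeline.set_state(Gst.State.PLAYING)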

This pipeline is implemented using the Python GStreamer bindings. The process consists of creating each GStreamer element, adding them to the GStreamer pipeline, and linking the elements in order of appearance in the pipeline. The decodebin and jpegenc elements can't be linked when created, because there is no default sink pad available on the decodebin. Because the decodebin needs to decide how to decode media, it needs media to be flowing through the pipeline. If no media is flowing, the decodebin can't know what decoder it needs to decode the media and what media it can offer to the sink element. Therefore the process of dynamic linking is used [103]. All elements which can be linked when the pipeline is not in the PLAYING state are linked. A handler is registered on the 'pad-added' signal, which is emitted when a new pad is added on the decodebin, indicating that it can forward media downstream. When media is flowing through the pipeline, the decodebin creates new pads when it can generate output data and emits the 'pad-added' signal. A callback is performed on the handler, which links the decodebin with the jpegenc. Listing 7 illustrates this concept.

# callback handler
def on_pad(source, pad, sink):
    # get the sink pad from the sink element
    sink_pad = sink.get_static_pad("sink")
    # get the pad type
    pad_caps = pad.get_current_caps()
    pad_type = pad_caps.get_structure(0).get_name()
    # Only if the pad is raw video the link is made
    if pad_type == "video/x-raw":
        # Perform the dynamic link
        pad.link(sink_pad)
    # Other pad types are ignored

filesrc = Gst.ElementFactory.make("filesrc")
decodebin = Gst.ElementFactory.make("decodebin")
jpegenc = Gst.ElementFactory.make("jpegenc")
# ... (create other elements and add elements to pipeline)

# Only filesrc and decodebin can be linked statically
filesrc.link(decodebin)
# Register the on_pad handler on the 'pad-added' signal
handler_id = decodebin.connect("pad-added", on_pad, jpegenc)
# Set pipeline to PLAYING; the callback will be called to perform the dynamic link
pipeline.set_state(Gst.State.PLAYING)

Listing 7: Dynamic linking of the decodebin and jpegenc

Local plugin

The local plugin captures an incoming media stream and displays it on the local display. This plugin is special with respect to other plugins in that it is not deployed in a Docker container. It runs natively via the cli on the host to allow access to the local display. This version is built for macOS High Sierra (version 10.13.4) and uses GStreamer 1.12 with the GStreamer plugins-base, plugins-good, plugins-bad, plugins-ugly, libav, doc and tools packages to receive an incoming stream. When a plugin links to the local plugin, the Stream component does not instruct the Consumer component to start the plugin, but instead links the plugin to the local host. For macOS, the address of the host is host.docker.internal. The GStreamer pipeline used by the plugin is depicted in Figure 4.2.

Figure 4.2: local plugin GStreamer pipeline

The pipeline consists of the following elements:

1. udpsrc: GStreamer element that reads UDP packets from the network [104]. The port property is set to the port to which the source is transmitting media.

2. rtpjpegdepay: GStreamer element that retrieves JPEG images from the received RTP packets [105]. This element can't process the media received from the udpsrc directly, because it can't know what type of data it will be receiving. Between the pads a 'capabilities filter' is placed, which informs the elements about the type of data that will be flowing through. In this case the capabilities are application/x-rtp, which tells that there will be RTP packets coming through; encoding-name=JPEG, which tells that the payload of the RTP packets are JPEG images; and payload=26, which also tells that the encoding is JPEG according to RFC 3551 [50, 106].

3. jpegdec: GStreamer element that decodes JPEG images [107].

4. autovideosink: GStreamer element that automatically detects an appropriate videosink and forwards the video to it [108].
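A sketch of this receiving pipeline with the capabilities filter, again using the Python bindings, is shown below; the port and the exact caps fields (media and clock-rate are common additions for RTP/JPEG) are assumptions for illustration.

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# The caps string informs rtpjpegdepay about the incoming RTP/JPEG data.
pipeline = Gst.parse_launch(
    'udpsrc port=5000 caps="application/x-rtp, media=video, '
    'clock-rate=90000, encoding-name=JPEG, payload=26" '
    "! rtpjpegdepay ! jpegdec ! autovideosink"
)
pipeline.set_state(Gst.State.PLAYING)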

4.3 Limitations and issues

The implementation presented is a prototype and slimmed-down version of the architecture presented in Section 2.3. The following limitations and issues remain.

4.3.1 Single client

The current implementation deploys the Flask framework (on which each microservice relies) on the built-in Flask web server (Werkzeug), which is provided for development convenience. It is only built for use by a single user and by default can only handle one request at any given moment, which implies that the framework can also only be used by a single user [109].
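A stopgap that at least enables concurrent request handling in Werkzeug is to start the development server with threading enabled, as sketched below; a production deployment would swap the server out entirely, for instance for a dedicated WSGI server such as Gunicorn.

from flask import Flask

app = Flask(__name__)

if __name__ == "__main__":
    # threaded=True lets Werkzeug spawn a thread per request; this eases the
    # single-request limitation but is still not a production-grade server.
    app.run(host="0.0.0.0", port=8080, threaded=True)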

4.3.2 Timeouts

The framework does not perform checks on request timeouts when passing commands to components and plugins. This can be a problem when the framework is deployed on several devices and the request latency is much higher. In case of timeouts, the framework will keep waiting for a response, which leads to a crash.
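A possible mitigation is to pass a timeout to every inter-component request and handle the resulting exception; the endpoint and timeout value below are illustrative assumptions.

import requests

try:
    # Fail fast instead of blocking forever on a slow or dead component.
    response = requests.put(
        "http://producer:8080/producers/mycam/state",
        json={"state": "PLAY"},
        timeout=2.0,  # seconds; illustrative value
    )
except requests.exceptions.Timeout:
    # The framework could retry, mark the plugin as unreachable, or roll back.
    print("Component did not respond in time")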

4.3.3 Exception handling and testing

The framework is only tested for the so-called 'happy path': the default scenario featuring no exceptional or error conditions. Some alternate paths are handled, but most still need to be tested. An example scenario would be if one of the plugin containers in a stream fails and stops: the framework is not able to detect this and will assume that the container is still running.

4.3.4 Docker security issues

The Docker client communicates with the daemon process dockerd over a socket. This socket is a UNIX domain socket called /var/run/docker.sock. The daemon is highly privileged, having root access to the host system. Any process that can write to this socket effectively has root access. To allow the components of the framework to manipulate the plugin containers, they need access to this socket. Therefore the socket is mounted in the container, which gives the container write access to the socket. This implies that the container now has root access on the host when writing to this socket. Because the container gets root access to the host, an attacker can walk the file tree of the host and extract sensitive information or run unwanted software. This type of attack is known as a 'Docker Breakout' or 'Container Escape' attack [110, 111].

4.3.5 Docker bridge network

The current implementation deploys the framework on a Docker bridge network, which can only be used if the framework is deployed on a single device. The current implementation can thus only be deployed on a single device. To deploy the framework on multiple devices, the framework must be deployed using a Docker overlay network [112].

4.3.6 Single stream

The implementation supports one stream, which must be a chain. Multiple streams in tree form, with merging media from multiple sources and broadcasting to multiple listeners, are not supported.

4.3.7 Number of containers per plugin

The framework uses the name of the plugin as identifier for the containers. The name is also the hostname on which the container can be reached. Therefore there can only be one active container associated with a plugin at runtime.


Chapter 5

Mob detection experiment

To try out an actual drone thermal imaging application, the mob detection experiment is carried out. The goal of this experiment is to use existing object detection algorithms on a dataset of thermal images to try and detect large crowds of people, hereinafter referred to as a mob.

Several public datasets of thermal images exist. Most datasets focus on the detection of people in scenes [113–117], some on face recognition [118, 119], others on vehicle recognition [120]. Most of these datasets are freely available through the OTCBVS Benchmark Dataset Collection [121]. No datasets containing large amounts of people were found, so the Last Post thermal dataset was created for the detection of mobs and other analysis tasks. This dataset is presented in Section 5.1.

To detect mobs in the images of the datasets, a deep learning approach using neural networks is explored. The selection and training of the model is described in Section 5.2.

5.1 Last Post thermal dataset

The Last Post dataset consists of videos of the Last Post ceremony, taking place each night at 8:00 PM (Brussels timezone) under the Menin Gate in Ypres, Belgium. Section 5.1.1 gives some insight into this unique ceremony. The full dataset is described in Section 5.1.2.

5.1.1 Last Post ceremony

The Last Post ceremony is a nightly ceremony taking place under the Menin Gate in Ypres at 8:00 PM sharp. The ceremony is held in remembrance of the fallen soldiers of World War I (1914-1918). The Last Post association [122] states its mission as follows:

"True to its statutes, the Last Post Association wishes to honor and remember the soldiers of the British Empire who gave their lives during the Great War of 1914-1918. The Last Post ceremony seeks to express, day after day, the lasting debt of gratitude which we all owe to the men who fought and fell for the restoration of peace and the independence of Belgium."


Figure 5.1 gives an impression of the size of the ceremony. Because of the sheer number of people that gather under the gate each day, the Last Post is a unique open-air event that allowed for repeatable conditions to capture footage; therefore the event was a perfect opportunity to create the dataset.

Figure 5.1: Last Post ceremony panorama

5.1.2 Dataset description

Due to legislation in Belgium, drones cannot be flown in public areas without a certification and permit by authorities. The creation of real aerial thermal images with a drone was thus not feasible. Therefore an elevated position (in order to simulate aerial images) on the walls next to the Menin Gate was used to capture the footage of the adjacent square on one side and the bridge on the other side. Figure 5.2 shows the locations where the video footage was captured.

Figure 5.2: Locations where the video footage was captured. The black stars represent the captured scenes; the red stars represent the locations from where the scene was filmed.

The data was recorded with the FLIR One Generation 3 Pro camera for Android devices, hereafter referred to as "Camera" [123]. Since thermal images don't hold color information, a color scheme is used to represent the relative differences in temperature. The 'Iron' color scheme is used, which maps colder sections of a scene on blue colors and warmer sections on red and yellow colors.


The videos are encoded using the H.264 MPEG-4 codec. Decoded, the color information is captured in 4:2:0 YUV format. The frame rate of the videos varies from 7 Hz to 8 Hz, depending on the speed of the objects in the scene. There is sound present in the videos, which is encoded with the MPEG AAC codec. For a full list of sequences, the reader is referred to Appendix C.

The two locations that make up the main scenes in the dataset are presented in Figure 5.3. The thermal images and visual images of each scene are depicted next to each other. The thermal and visual images were not captured at the same time, so the mobs that are present in the thermal images can't be seen in the visual images. In both scenes buildings are present that are quite warm when compared to the surroundings, as can be seen in the thermal images. In Figure 5.3a it even becomes difficult to recognize the mob when they are standing close to the building. This is less the case for Figure 5.3c, where, due to the water present in the image, the mob has higher contrast due to the larger difference in emitted heat. Towards the far right of the image the mob seemingly disappears into the background. The effect of two objects having a similar heat signature and having no clear transition in thermal images is defined as thermal camouflage, a technique that is often used by animals and military units [124]. This effect is even visible when looking at the mobs present in both images: because people are standing so close together, it becomes difficult to recognize individual persons in the crowd.

(a) Thermal view of the square in location A. (b) Visual view of the square in location A.
(c) Thermal view of the bridge in location B. (d) Visual view of the bridge in location B.

Figure 5.3: Main scenes in the Last Post dataset


5.2 Object detection experiment

5.2.1 Preprocessing

The Last Post dataset was not used entirely for training the model, because there were not enough resources to manually annotate every image. Therefore a smaller dataset was used, serving as a baseline model.

The following videos were used: 2018-04-10 195029.mp4, 2018-04-10 200122.mp4, 2018-04-04 202859.mp4, 2018-04-10 202558.mp4, and 2018-04-04 200052.mp4, captured on the fourth and tenth of April 2018. These videos were used because of their contents: they contain images from location A and B respectively, in which the mob behaves more dynamically compared to other videos. This was due to a marching band present on the fourth of April and a marching army unit on the tenth of April. See Appendix C for a summary of the contents of these videos. From these videos, images were extracted at a capture rate of 1 Hz. Each image was manually labelled using the Microsoft Visual Object Tagging Tool [125]. The tool allows exporting the training images to various formats, such as Pascal VOC, for Tensorflow, YOLO, and Microsoft CNTK.
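The exact extraction tooling is not specified here, so the following is an illustrative sketch of how frames can be pulled out of a video at roughly 1 Hz with OpenCV; the file and output names are placeholders.

import cv2

video = cv2.VideoCapture("2018-04-10 195029.mp4")
fps = video.get(cv2.CAP_PROP_FPS)  # 7-8 Hz for the Last Post videos
frame_index, saved = 0, 0

while True:
    ok, frame = video.read()
    if not ok:
        break
    # Keep roughly one frame per second of video.
    if frame_index % int(round(fps)) == 0:
        cv2.imwrite(f"frames/frame_{saved:05d}.jpg", frame)
        saved += 1
    frame_index += 1

video.release()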

Within the data, several visual outliers are present. An outlier is an observation point that is distant from other observations. It is created due to variability in capturing the videos, or indicates experimental errors [126]. The errors detected here are the latter form and are depicted in Figure 5.4. The first type of outliers are system faults in the Camera. Due to an error in the processing of the video, the Camera would sometimes not register any input. This causes the Camera to produce completely black images, which is depicted in Figure 5.4a. The Camera software maps temperatures onto colors in the image. The variations of the colors are relative to the temperature interval, ranging from the minimum to the maximum temperature detected by the Camera. If the minimum and/or maximum detected temperature change, the Camera needs to adapt its color mapping. This causes the Camera to fade to bright colors for a short period of time (1 to 2 seconds). The resulting image is depicted in Figure 5.4b. Because the resulting image is too bright and objects are hard to detect, it is considered an outlier. Due to instabilities when capturing the footage of sequences with fast motion, some images are very blurry. This makes it hard, even for a person, to decide what is visible in the frame; therefore it is considered an outlier. This is depicted in Figure 5.4c. Sometimes people would pass in front of the Camera, which resulted in brightly colored areas in the videos that were not part of the scene and therefore are another type of outlier, depicted in Figure 5.4d. Because the presented outliers are experimental errors and do not belong in the scenes, they were removed from the dataset.

5.2.2 Training

The model that is used for training is YOLOv3, implemented using the darknet neural network framework [83]. The model is trained using convolutional weights that are pre-trained on the ImageNet database [127]. The concept of using weights from a pre-trained model previously trained on large datasets is known as transfer learning. It is very important that, when choosing a pre-trained model, the problem statement of the pre-trained model is close enough to the current problem statement. For the pre-trained model on ImageNet this was to identify objects in images, which lies close to the detection of mobs in thermal images. Because the type of images (thermal versus visual) is fundamentally different, the model could suffer in performance. Goedeme et al. [36] solved a similar problem with thermal images and achieved good results, which gives an indication that


(a) System fault in the Camera: no input was detected. (b) The Camera updates to a new temperature interval.
(c) Due to moving the Camera too fast, the image becomes too blurry. (d) Very warm object due to people passing in front of the Camera.

Figure 5.4: Outliers

detection should be feasible with the pre-trained model. Also, because the dataset is relatively small, training the model from scratch could actually hurt performance [128]. Training was carried out on the NVIDIA GeForce GTX 980 GPU, which allows training to be done much faster. To evaluate training progress, the Sum of Squared Errors (SSE) loss function is calculated, defined as

SSE = \sum_{i=1}^{n} (x_{ij} - \hat{x}_j)^2

where n is the number of samples in a batch used in a single training epoch and j is the dimension (x or y), as defined in [83]. The result of this training is discussed in Chapter 6.


Chapter 6

Results and evaluation

The goal of this chapter is to present the results of the framework and the detection experiment. The results of the framework tests are presented in Section 6.1. The results of the object detection experiment are presented in Section 6.2.

6.1 Framework results

To evaluate the framework, acceptance tests are conducted that test whether the framework meets the QARs defined in Section 2.1. As stated in Section 4.1, only the ASRs will be tested. A summary of which requirements are met by the framework is given in Table 6.1. Passed means that the framework has met the requirement, not passed that the framework hasn't met the requirement, and plausible means that the framework might have met the requirement, but not enough data could be gathered to be certain.

6.1.1 Performance evaluation

To evaluate performance, the acceptance tests for the requirements are conducted, the impact of the framework on the processing resources is recorded, and the total size of the framework is measured.

Acceptance tests

To test the performance of the framework, the execution times of each command executed with the CLI (see Section 4.2.2) are measured. Each command is executed 200 times, except for the on, off, and link commands: they are measured manually 10 times. Because these commands launched system threads and their finish signal could not be captured, they had to be measured by hand. Commands were executed on a 2.6 GHz Intel Core i5-2540 processor running macOS High Sierra version 10.13.4. The summarized statistics of the tests are given in Table 6.2.

The average execution times for the Play, Stop, Pause, Add, Elements, Print, View, and Link commands do not exceed the 2-second bound specified in PS-1, while the average execution times of the Delete, On, and Off commands do exceed this bound. This performance requirement is not met by the framework. The same result is found for PS-3. Especially the Delete and Off commands exceed the requirements by quite a bit. The Delete command shuts down a plugin and removes the Docker container


Requirement id Status

PS-1 Not Passed

PS-2 Plausible

PS-3 Not Passed

PS-4 Plausible

PS-5 Not Passed

IS-1 Passed

IS-2 Passed

MS-1 Passed

MS-2 Passed

MS-3 Passed

MS-4 Passed

MS-5 Plausible

MS-6 Passed

MS-7 Plausible

Table 6.1: Acceptance test results summary

from the host. This action is costly in time. The Off command removes all the plugins and all the microservices of the framework, and thus suffers from the same costly action. This could be ameliorated by having the framework not remove the containers but stop them instead, which requires fewer resources, as it only stops the process running in the container but does not delete the container from the system.

PS-2 and PS-4 could not be measured due to the GStreamer pipeline of the prototype not allowing frames to be tracked. However, since real-time is a human time perception, if a person can't distinguish the streamed videos from videos played with a native video player, real-time streaming is plausible [43, 44]. The videos were shown side by side to ten users, who could not distinguish between both videos, indicating presumable real-time streaming. Since the hard requirements cannot be measured, the requirements are not met, but are plausible. Real-time streaming performance also heavily depends on the used plugins and the hardware on which they are deployed. If a plugin can't process its media fast enough, due to lack of processing power or a slow implementation, it will slow down the whole stream.

The scalability requirement PS-5 could not be met, due to the Flask Werkzeug server only being able to process one request at a time (see Section 4.3).

Only two performance requirements are met by the prototype. However, this is mostly due to some actions being very slow, such as shutting down the framework or removing a plugin. As these are actions that should occur less frequently when a user is using the framework, these actions are less important for the perceived quality. Frequent actions, such as adding, linking, and changing the state of the stream, do perform rather well and contribute more to the perceived quality. Overall, the performance of the framework is not stellar, but not bad either. This can partially be explained by the choice of supporting frameworks,


Statistic        Play    Stop    Pause   Add     Delete   Elements  Print   View    On      Off      Link
Mean             0.690   0.804   0.634   1.363   8.402    0.562     0.564   1.22    3.58    24.023   0.849
Std deviation    0.050   0.059   0.088   1.037   4.669    0.070     0.0747  0.260   0.498   0.481    0.170
Minimum          0.629   0.708   0.549   0.516   0.505    0.517     0.517   0.757   3.015   23.707   0.637
25% percentile   0.665   0.775   0.594   1.049   1.154    0.534     0.536   0.998   3.143   23.750   0.798
Median           0.678   0.800   0.623   1.11    11.132   0.550     0.552   1.214   3.500   23.886   0.853
75% percentile   0.700   0.820   0.653   1.233   11.189   0.562     0.560   1.433   3.850   24.034   0.877
Maximum          1.016   1.279   1.631   6.25    11.846   1.227     1.149   1.691   4.562   25.326   1.261

Table 6.2: Performance test statistics summary, measured in seconds

such as Flask, that are not built for performance. Other, more high-performance frameworks, such as Vert.x, could ameliorate performance.

Resource usage

The resources used by the modules of the framework are measured using the Docker statistics tool [129]. A summary of the resources used is given in Table 6.3. When the framework is idle, resource usage is negligible. When a plugin is active, there is a slight increase in resources. This increase in resources depends on the runtime size of the plugin, which is unknown to the framework. The increase peaks when the plugin is processing media: CPU usage is 40% on one core, which implies that on one CPU core only two plugins can be active simultaneously before reaching the ceiling of the processing power. In a production environment of the framework, plugins need to be tested thoroughly, so that these metrics are known beforehand. These metrics imply that the length of streams should be kept short to avoid having many plugins active simultaneously.
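Such measurements can be reproduced programmatically with docker-py, as sketched below; the container names match the prototype, but the exact layout of the statistics dictionary may vary with the Docker version, so the field access is an assumption.

import docker

client = docker.from_env()
for name in ("streamer", "producer", "consumer"):
    # stats(stream=False) returns a single snapshot instead of a stream.
    stats = client.containers.get(name).stats(stream=False)
    usage_bytes = stats["memory_stats"]["usage"]
    print(name, usage_bytes / (1024 * 1024), "MiB")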

Size of framework

The total size of all the Docker images of the components of the framework is given in Table 6.4. Most images are quite large: the framework core components have an average size of 724 MB, and the plugins have sizes ranging from 1 GB to 3 GB. This size can be explained by the base images and additionally installed software in the images. For development flexibility, the base images used are Linux Ubuntu images, which are typically larger than other Linux distributions. For the plugins, the full GStreamer library with all plugins was installed, which is more than 2 GB in size. The sizes of the components can be reduced in a production environment by choosing slimmer Linux distributions as base images and only installing the minimally needed libraries to get a working plugin.

6.1.2 Interoperability evaluation

The systems with which the framework exchanges data are the plugins. These plugins must follow the plugin model presented in Section 2.3.1: implement the presented resources using a REST API, the state machine, and the protocols. If these specifications


Condition                               Container      CPU usage [%]   Memory usage [MiB]
Idle                                    streamer       1.00            42.09
                                        consumer       0.03            24.4
                                        producer       0.01            24.14
1 plugin active, not processing media   streamer       1.56            42.48
                                        consumer       0.02            24.42
                                        producer       0.02            24.23
                                        mycam plugin   0.75            45.97
1 plugin active, processing media       streamer       1.56            42.51
                                        consumer       0.02            24.42
                                        producer       0.02            24.24
                                        mycam plugin   40.03           99.24

Table 6.3: Resource usage of the framework in several conditions

Image Size [MB]

streamer 718

consumer 729

producer 729

testsrc 1250

mycam 3020

Table 6.4: Total size of framework components

are followed by a plugin, the framework should have no issues exchanging information with the plugin. To test this, a new mock plugin is implemented. For each resource of the plugin, the framework is given random mock input data to exchange with the plugin. When the exchange is complete, the values in the plugin are requested and compared with the given input. If the input matches the value in the plugin, the exchange was successful. These tests were executed 50000 times. The results are summarized in Table 6.5. Play, pause, and stop are the requests to change the state of the plugin. The source/listener add, update, and delete commands manipulate the sources and listeners of the plugin. Overall, there were almost no errors made when exchanging information; only when updating a source and deleting a listener was there one incorrect exchange. The ratios achieved are always 100% correct exchanges, except for updating a source and deleting a listener, which are 99.998%. IS-1 and IS-2 specify that commands exchanged with the plugins need to be correct 99.99% of the uptime, so this requirement is clearly met.

Plugins also interact with each other by transmitting media to each other according to the stream layout. This interoperability


Value       Play    Pause   Stop    Add S   Update S   Delete S   Add L   Update L   Delete L
Correct     50000   50000   50000   50000   50000      49999      50000   50000      49999
Incorrect   0       0       0       0       0          1          0       0          1
Ratio (%)   100     100     100     100     100        99.998     100     100        99.998

Table 6.5: Interoperability test results (S: Source, L: Listener)

is not directly controlled by the framework, as plugins can be developed by third parties. To solve this, a plugin needs to provide its specifications to the framework before being integrated as a plugin. This allows the framework to decide whether or not two plugins will be able to interact with each other in a stream. For example, if plugin A supports MJPEG streams transmitted via RTP/UDP, it will be able to interact with plugin B implementing the same protocols. If plugin B implements another protocol, it will not be able to interact with plugin A. If this is specified, the framework can notify a user that two plugins are not compatible. These scenarios should be avoided, which is done by specifying standard protocols for plugins.

6.1.3 Modifiability evaluation

Plugins are installed for the prototype by building and adding their image to the image directory of the Docker host. The framework does not need a restart to install these images; therefore requirements MS-1 and MS-2 are met. End-users can extend their version of the framework with new plugins, installing them by building the respective plugin images, meeting MS-3. Streams can be modified by linking different plugins by design, meeting MS-4. The framework can detect newly installed plugins when starting up, if the image is installed to the image directory of the Docker host. Therefore requirements MS-5 and MS-6 are met. The current prototype is only deployable on a local device, as discussed in Section 4.1, meeting requirement MS-7. The other requirements can be met by deploying the framework using the Docker overlay network, as discussed in Section 4.3, without having to implement changes to the code base. The requirements MS-8 and MS-9 are not met, but are plausible by using a different Docker deployment.

In general, the framework was designed to be modifiable for different video analysis tasks. The hybrid microkernel/microservices architecture enables this modifiability. The microkernel plugin architecture allows a user to modify a video analysis stream during framework use. The microservices architecture allows for a modifiable deployment configuration.

6.2 Mob detection experiment results

To evaluate the detection experiment, the trained model is tested on the validation set, which contains random images from the total annotated dataset presented in Section 5.1.2. First, the results of the training of the model are presented in Section 6.2.1. Second, the metrics that were used to evaluate the model are presented in Section 6.2.2. Finally, the results of the validation are presented in Section 6.2.3.


6.2.1 Training results

To monitor training, the average loss per training epoch was measured; the resulting training evolutions are depicted in Figure 6.1. Darknet does not shuffle training data automatically and creates training batches in order of the training data provided. Since YOLO uses gradient descent for optimization, this can lead to YOLO getting stuck in local minima of the cost surface [130]. This effect is seen in Figure 6.1a: around epoch 4500, every image in the training set has been loaded at least once; at this point the model was training on images from location B, and now images from location A are loaded (see Section 5.1.2). This leads to a peak in average loss, as YOLO was optimizing for images from location B and probably converging to a local minimum for that type of images. Therefore, in a second run, the data was shuffled, allowing the model to get out of local minima more easily. Figure 6.1b shows the difference in training loss: the curve is much more irregular, thanks to the shuffling of the data. Once again, the average loss decreases more around epoch 4500, when every image in the training set has been loaded at least once. The average loss stagnates at values in the interval [0.04, 0.07]. To avoid overfitting the model on the training data, and thus achieving worse generalization performance, early stopping is applied. Early stopping is a generalization technique to stop the training of a neural network early, before the network starts overfitting [131]. The stopping criterion used is progress, defined as the decrease of training error in successive training epochs [131], or the slope of the loss curve depicted in Figure 6.1. This slope approaches 0 from epoch 13000 onward, so this epoch is selected as early stopping point. Because the generalization error is not a smooth curve and consists of many local minima, it is a good idea to validate model weights in the neighborhood of the early stopping point as well, as these could potentially yield better performance on the validation set [131].

6.2.2 Metrics

The model predicts bounding boxes for objects in the images of the validation sets. The bounding box provided by the annotated dataset is defined as the ground truth bounding box B_gt. The bounding box provided by the model is defined as the predicted bounding box B_p. To evaluate the performance of the model and select the best weights, several metrics are used. The standard metrics used to evaluate object detection problems are the Intersection over Union (IoU) and the mean Average Precision (mAP). The IoU is a metric used in common object detection challenges, such as the Pascal VOC challenge [132]. If the function A(B_x) gives the area for a bounding box B_x, the IoU is defined as

\mathrm{IoU} = \frac{A(B_p \cap B_{gt})}{A(B_p \cup B_{gt})}    (6.1)
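In code, the IoU of two axis-aligned boxes given as (x1, y1, x2, y2) can be computed as in the generic sketch below; this illustrates the metric and is not the evaluation script used for the experiment.

def iou(box_p, box_gt):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    # Coordinates of the intersection rectangle.
    x1 = max(box_p[0], box_gt[0])
    y1 = max(box_p[1], box_gt[1])
    x2 = min(box_p[2], box_gt[2])
    y2 = min(box_p[3], box_gt[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_p = (box_p[2] - box_p[0]) * (box_p[3] - box_p[1])
    area_gt = (box_gt[2] - box_gt[0]) * (box_gt[3] - box_gt[1])
    union = area_p + area_gt - inter
    return inter / union if union > 0 else 0.0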

The mAP for a set of detections, another metric used in the Pascal VOC challenge, is defined as the mean over classes of the interpolated AP for each class. A detection is considered a true positive if the IoU for the detection is greater than 0.5. The interpolated AP is given by the area under the precision-recall curve for the detections [132–134].

The model is also tested on several videos not included in the train and validation set, to visually evaluate detection and measure the number of frames per second that can be processed by the model.


(a) Average training loss when the data is not shuffled. Vertical: average loss; horizontal: time (in training epochs).
(b) Average training loss when the data is shuffled. Vertical: average loss; horizontal: time (in training epochs).

Figure 6.1: Average training loss per epoch

6.2.3 Validation results

YOLO creates a snapshot of the weights the model is using at a certain epoch every 100 epochs [83]. This makes it possible to validate each set of weights on the validation set and show the evolution of the validation performance. Figure 6.2 shows these evolutions for the average IoU and mAP metrics. The mAP gradually grows from epoch 4500 onwards and stagnates around epoch 11500. This shows that the model is not learning anymore and is at risk of overfitting. The mAP stagnates in the interval of [88%, 91%]. The average IoU shows a similar trend, but varies more, because predictions on the same images rarely are exactly the same.

The best mAP value is achieved at epoch 15700, being 90.52%. The weights from this epoch are used for further testing and validation. The mAP for the 0.5 IoU threshold of YOLOv3 on the COCO benchmark dataset [135] is 74.8%; compared to this, the mAP achieved for the Last Post dataset is very high. The reason for this difference is that the validation


(a) mAP (%) per epoch. Vertical: mAP (%); horizontal: time (in training epochs).
(b) IoU (%) per epoch. Vertical: IoU (%); horizontal: time (in training epochs).

Figure 6.2: Validation metrics per epoch

set has a high correlation with the training set. Due to the training set and validation set being extracted from videos, all images from one video are correlated in time to each other. Images from the validation set are thus correlated to images in the training set, and the model is optimized on these types of images, explaining the high mAP. This indicates that the model is somewhat overfitting on the training data. This was confirmed when testing the model on unseen videos: although the model could detect a mob most of the time, it produced more visual errors. Because this data was not annotated, no metrics could be extracted. Figure 6.3 depicts some predictions of the model on images from the validation set. Visually, the predicted bounding boxes resemble the ground truth bounding boxes quite accurately.

To test the speed of the predictions of the model, the total time to predict images in the validation set was measured. For the NVIDIA GeForce GTX 980 GPU, the average prediction time for one image is 14.673 milliseconds, with a standard deviation of 0.517 milliseconds. This indicates that the upper limit of the frame rate when making predictions on a video is approximately 68 frames per second on the GPU. For comparison, predictions with the model were also made on a CPU: a 2.6 GHz Intel Core i5-2540 processor with AVX instructions speedup. The average prediction time on the CPU is 5.849 seconds, with a standard deviation of 0.438 seconds, resulting in an upper limit for the frame rate on the CPU of 0.171 frames per second. Clearly, real-time object detection with this model is only possible on a GPU. When generating predictions on a test video, the average frame rate of the video was 55 frames per second.


(a) Prediction of a large mob at location B. (b) Prediction of the mob at location A.
(c) Prediction of a small mob at location B. (d) Prediction of the mob at location B.

Figure 6.3: Predictions of the model on images in the validation set


Chapter 7

Conclusion and future work

7.1 Conclusion

Aerial thermal imaging with drones is a promising technology that can deliver many useful applications for various use cases across many different domains, such as agriculture, fire fighting, search and rescue, etc. Most applications built with this technology are built with a specific use case in mind, using a thermal camera and analysis software specifically for this use case, and therefore struggle to exchange hardware and algorithms for new use cases. Therefore, the goal of this dissertation was to design, build, and test a possible backbone framework that allows building these applications in a modifiable way. The specific use case of mob detection in thermal images was investigated as a sample use case for the framework.

Chapter 2 explored the requirements of such a framework. The ASRs to achieve the goal of the framework are performance, interoperability, and modifiability. Performance is needed because some use cases (like fire fighting) require real-time video analysis. Interoperability enables the framework to interact with different thermal cameras and different processing/analysis modules. Modifiability enables the framework to interchange the thermal cameras and analyzers in its process, to build applications for different use cases. A hybrid combination of the microkernel pattern and the microservices pattern is used to meet these requirements, as the microkernel pattern enabled interchanging the cameras and analyzers via a plugin system, and the microservices pattern enabled different deployment configurations for the framework. To build and test the framework, several technologies were needed: backbone technologies for the software architecture, a thermal camera, and an object detection algorithm for the mob detection use case.

Chapter 3 explored the state of the art of these technologies and presents the selected technologies. Thermal cameras come in all shapes and sizes and have different features according to their retail price. Contrary to intuition, the image quality is not the defining factor of the retail price, but the amount of extra features, such as radiometry, communication interfaces, etc. The FLIR One Pro and Therm-App were selected for this dissertation, since they offer good quality images and features for their price, and their use via smartphone platforms makes these cameras excellent for prototyping. Microservices frameworks also show a lot of variety, depending a lot on the use case of the application using the framework. Some are aimed at quick prototyping, others focus on performance, etc. Flask was selected as the microservices framework, as it is easy to use and designed for prototyping with microservices. This does come with a performance trade-off. To deploy the microservices in a plugin fashion,


the concept of containers is applied. Containers virtualize on the OS level, allowing the microservices to be moved around on the host and distributed over different hosts. The current field has some frameworks implementing this technology, with Docker being the most well-known and mature framework, and it was selected for that reason. The field of object detection has a variety of solutions for the object detection problem, with varying accuracies, and some can even create predictions in real-time. The YOLOv3 algorithm, implemented in the darknet framework, was selected, as it generalizes well onto other datasets (such as thermal images), makes relatively accurate predictions, and is able to make predictions in real-time when deployed on a device with GPU processing capabilities.

Chapter 4 presents the implemented prototype of the framework using these technologies. Two sample plugins were implemented: the filecam, which serves a video read in from a file, and the local plugin, which displays this video on the local device. The framework is limited to one video processing stream for one user at a time and is deployed to a local device. It also has a security risk, as the framework has to expose the Docker daemon socket to allow the framework to manipulate the containers running the plugins. This gives the containers that run the core framework processes root access to the host system, which can be abused by potential attackers.

The mob detection experiment is presented in Chapter 5. A new thermal image dataset, called the Last Post dataset, was collected for this experiment. The dataset features videos of the Last Post ceremony, filmed over the course of two weeks. What makes this dataset special is that, unlike publicly available datasets, it delivers footage of the movement of large crowds, filmed from a high vantage point to simulate footage captured from a drone platform. This dataset is used to train a pre-trained YOLOv3 model via transfer learning. The dataset is manually labeled and preprocessed by removing the outliers present. Training is done on an NVIDIA GTX 980 GPU and is evaluated using the SSE loss metric.

Chapter 6 presented the tests conducted on the framework and the detection model, and their corresponding results. The performance requirements for the frequently used commands are met by the framework. Other commands, such as removing plugins and starting up or shutting down the framework, do not meet the performance requirements, since Docker requires significant time to start, stop, and remove containers. The real-time streaming requirements could not be proven, because the time between transmitting a frame and receiving a frame could not be measured directly. However, the processed videos were shown to human users, who could not distinguish between the processed video and the video played back on a local system, which makes it plausible that the framework achieved this requirement. Real-time streaming performance heavily depends on the plugin and the hardware on which it is deployed. When plugins in the framework are processing media, CPU usage increases significantly, even when only one plugin is active. This implies that the length of media processing streams should be kept as short as possible to achieve good performance. The framework is relatively big, with some plugins even having a size of 2 GB. This is mostly due to the base images and installed libraries of the plugins and core components. Due to each component and plugin having its own container, libraries can't be shared, so they are redundantly installed, leading to large component sizes. This could be alleviated by using slimmer images and only installing the minimal libraries needed. The interoperability requirements are all met by the framework. This is proven by a test exchanging mock information between the framework and plugins. The modifiability requirements regarding the plugins are met by the framework. The modifiability requirements regarding the deployment schemes are not met by the framework, but can be achieved by deploying the framework using a Docker overlay network instead of the Docker bridge network. To evaluate the trained model, the model made predictions


on a separate validation set. The model achieves an mAP of 90.52%, which is much higher than what current state-of-the-art models are achieving on benchmark datasets. This shows that the model is capable of learning the thermal features, but is also overfitting on the data, due to temporal correlation between the training and validation sets. The model can predict in real-time, achieving an average frame rate of 55 frames per second when making predictions on a GPU.

7.2 Future work

This dissertation proposed a framework and implemented a prototype that realizes only a part of the total framework. Object detection using deep learning, in general and specialized for thermal images, is still a young field. Several extensions to this research are possible.

7.2.1 Security

The framework prototype did not implement any security measures. Because in distributed configurations communications rely on an external network, these measures should be implemented to reduce the risks of attacks. To allow the components to manipulate Docker containers, the Docker host socket was exposed. As stated before, this is a serious security risk, as the container gets root access to the host. Workarounds for this problem could be to implement a Docker-in-Docker environment [136], or to deploy the containers in a VM.

7.2.2 Implementing a detection plugin

Due to the scope and time limit of the dissertation, a working prototype plugin containing a trained model for detecting objects in a video stream could not be made. A possible GStreamer pipeline for such a plugin is depicted in Figure 7.1. This plugin is a Consumer and receives video via the udpsrc. Frames are decoded and the raw video is presented to an appsink, a GStreamer plugin that allows the video to be dumped into an application: here, the detection model that can generate predictions on the frame. The predicted frame is then forwarded to an appsrc GStreamer plugin, which puts the predicted frame in a new pipeline to transmit it to further framework plugins. It should be tested whether the detection model can run in a Docker container, since it needs GPU support to be able to predict in real-time. A solution could be to use nvidia-docker, which leverages NVIDIA GPU support in Docker containers [137].

Figure 7.1: GStreamer pipeline for a plugin with a detection model
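A rough sketch of the receiving half of such a pipeline, pulling frames out of an appsink so a model can run on them, is given below; the port, caps fields, and the predict() call are placeholders, not a tested implementation.

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

pipeline = Gst.parse_launch(
    'udpsrc port=5000 caps="application/x-rtp, media=video, '
    'clock-rate=90000, encoding-name=JPEG, payload=26" '
    "! rtpjpegdepay ! jpegdec ! videoconvert "
    "! appsink name=sink emit-signals=true"
)
appsink = pipeline.get_by_name("sink")

def on_new_sample(sink):
    sample = sink.emit("pull-sample")
    buffer = sample.get_buffer()
    # predict() is a placeholder for the detection model inference call;
    # the predicted frame would then be pushed into an appsrc pipeline.
    # predictions = predict(buffer)
    return Gst.FlowReturn.OK

appsink.connect("new-sample", on_new_sample)
pipeline.set_state(Gst.State.PLAYING)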


7.2.3 Different deployment configurations

The prototype of the framework only implemented one of the deployment configurations presented in Section 2.3.3. Other configurations can be explored by changing the Docker bridge network to a Docker overlay network.

7.2.4 Multiple streams with different layouts

The prototype only implemented one stream with a chain-like layout. Future effort could implement support for multiple streams that run concurrently. The layout can be changed by implementing plugins that can forward media to multiple listeners, or merge media coming from different sources, which is the concept of sensor fusion.

7.2.5 Implementing the plugin distribution service (Remote Producer/Consumer)

Chapter 2 presented the Remote Producer and Consumer, which distribute the plugins available for the framework. This was deemed out of scope for the prototype, but could be implemented in future versions.

7.2.6 Using high-performance microservices backbone frameworks

The current implementation uses the Flask framework, excellent for prototyping but not ideal for high performance. Other frameworks, such as Vert.x, focus on high performance through asynchronous messaging, which could improve the performance of the framework.

7.2.7 New object detection models and datasets specifically for thermal images

Current effort in object detection models goes towards challenges on benchmark datasets of visual images such as ImageNet

and Pascal VOC There are some thermal datasets publicly available for some detection purposes but these are very small

compared to the visual image datasets Future research could create new benchmark datasets similar to the visual image

datasets specifically for thermal images

Currently publicly available pre-trained neural network models are designed for and trained on the visual image datasets

Future research could go towards designing an architecture specifically for thermal images and training amodel on a benchmark

dataset

Thermal images use several colormaps tomap the relative temperatures in a scene on colors presenting warm and cold regions

Well-known examples are the Iron scheme (used in this dissertation) White-hot and Black-hot Some companies implement

threshold colors that highlight very hot spots or very cold spots in an image (for examples see [138 139] etc) Future research

could investigate how models trained on images using different color schemes differ in their predictions and performances

Thermal images could also benefit from radiometric information, which adds a temperature dimension to each pixel instead of only a relative coloring. This extra information could lead to more accurate predictions.
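As an illustration of the difference, the sketch below applies a simple threshold highlight to a radiometric frame; the NumPy array of per-pixel temperatures is assumed, since real radiometric formats differ per camera.

    # Sketch of threshold highlighting on a radiometric frame: "temps" is an
    # assumed NumPy array of per-pixel temperatures in degrees Celsius.
    import numpy as np

    temps = np.random.uniform(5.0, 40.0, size=(512, 640))  # placeholder frame

    # Grayscale mapping of the relative temperatures (White-hot scheme).
    lo, hi = temps.min(), temps.max()
    gray = ((temps - lo) / (hi - lo) * 255).astype(np.uint8)
    frame = np.stack([gray] * 3, axis=-1)                   # H x W x RGB image

    # Threshold color: paint pixels above 37 degrees Celsius red, similar to
    # the hot-spot highlighting offered by some commercial cameras.
    frame[temps > 37.0] = (255, 0, 0)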


Bibliography

[1] S. G. Gupta, M. M. Ghonge, and P. Jawandhiya, "Review of Unmanned Aircraft System," International Journal of Advanced Research in Computer Engineering & Technology, vol. 2, no. 4, pp. 2278–1323, 2013. ISSN: 2278-1323.

[2] M. Hassanalian and A. Abdelkefi, "Classifications, applications, and design challenges of drones: A review," 2017. DOI: 10.1016/j.paerosci.2017.04.003. [Online]. Available: http://ac.els-cdn.com/S0376042116301348/1-s2.0-S0376042116301348-main.pdf

[3] M. Joel, The Booming Business of Drones, 2013. [Online]. Available: https://hbr.org/2013/01/the-booming-business-of-drones (visited on 01/30/2018).

[4] DJI, Zenmuse H3-2D. [Online]. Available: https://www.dji.com/zenmuse-h3-2d (visited on 01/30/2018).

[5] Gimbal Guard, Drop & Delivery Device for DJI Mavic Pro. [Online]. Available: http://www.gimbal-guard.com (visited on 01/30/2018).

[6] FLIR Systems, Aerial Thermal Imaging Kits. [Online]. Available: http://www.flir.com/suas/aerial-thermal-imaging-kits (visited on 01/30/2018).

[7] R. Gade and T. B. Moeslund, "Thermal cameras and applications: a survey," Machine Vision and Applications, vol. 25, pp. 245–262, 2014. DOI: 10.1007/s00138-013-0570-5. [Online]. Available: https://link.springer.com/content/pdf/10.1007%2Fs00138-013-0570-5.pdf

[8] M. C. Harvey, J. V. Rowland, and K. M. Luketina, "Drone with thermal infrared camera provides high resolution georeferenced imagery of the Waikite geothermal area, New Zealand," 2016. DOI: 10.1016/j.jvolgeores.2016.06.014. [Online]. Available: https://ac.els-cdn.com/S0377027316301421/1-s2.0-S0377027316301421-main.pdf

[9] S. Amici, M. Turci, S. Giammanco, L. Spampinato, and F. Giulietti, "UAV Thermal Infrared Remote Sensing of an Italian Mud Volcano," vol. 2, pp. 358–364, 2013. DOI: 10.4236/ars.2013.24038. [Online]. Available: http://www.scirp.org/journal/ars

[10] J. Bendig, A. Bolten, and G. Bareth, "INTRODUCING A LOW-COST MINI-UAV FOR THERMAL- AND MULTISPECTRAL-IMAGING," 2012. [Online]. Available: https://www.int-arch-photogramm-remote-sens-spatial-inf-sci.net/XXXIX-B1/345/2012/isprsarchives-XXXIX-B1-345-2012.pdf


[11] Workswell, "Using the UAV Thermography for Cultivation and Phenotyping of Cereals," Tech. Rep., 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/Cultivation-and-Phenotyping-1.pdf

[12] A. J. Rivera, A. D. Villalobos, J. C. Monje, J. A. Mariñas, and C. M. Oppus, "Post-disaster rescue facility: Human detection and geolocation using aerial drones," IEEE Region 10 Annual International Conference, Proceedings/TENCON, pp. 384–386, 2017. ISSN: 2159-3450. DOI: 10.1109/TENCON.2016.7848026.

[13] P. Christiansen, K. A. Steen, R. N. Jørgensen, and H. Karstoft, "Automated detection and recognition of wildlife using thermal cameras," Sensors (Basel, Switzerland), vol. 14, no. 8, pp. 13778–93, Jul. 2014. ISSN: 1424-8220. DOI: 10.3390/s140813778. [Online]. Available: http://www.ncbi.nlm.nih.gov/pubmed/25196105

[14] J. Zhang, J. Hu, J. Lian, Z. Fan, X. Ouyang, and W. Ye, "Seeing the forest from drones: Testing the potential of lightweight drones as a tool for long-term forest monitoring," Biological Conservation, vol. 198, pp. 60–69, 2016. [Online]. Available: http://ac.els-cdn.com/S0006320716301100/1-s2.0-S0006320716301100-main.pdf

[15] D. Ventura, M. Bruno, G. Jona Lasinio, A. Belluscio, and G. Ardizzone, "A low-cost drone based application for identifying and mapping of coastal fish nursery grounds," Estuarine, Coastal and Shelf Science, vol. 171, pp. 85–98, Mar. 2016. ISSN: 0272-7714. DOI: 10.1016/j.ecss.2016.01.030. [Online]. Available: http://linkinghub.elsevier.com/retrieve/pii/S0272771416300300

[16] S. Chowdhury, A. Emelogu, M. Marufuzzaman, S. G. Nurre, and L. Bian, "Drones for disaster response and relief operations: A continuous approximation model," 2017. DOI: 10.1016/j.ijpe.2017.03.024. [Online]. Available: www.elsevier.com/locate/ijpe

[17] Workswell, "Pipeline inspection with thermal diagnostics," 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/pipeline.pdf

[18] Workswell, "Thermo diagnosis of photovoltaic power plants," 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/Workswell-WIRIS_photovoltaic.pdf

[19] Workswell, "Thermodiagnostics of flat roofs," 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/roof.pdf

[20] Workswell, "Thermodiagnostics in the power engineering sector," Tech. Rep., 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/highvoltage.pdf

[21] Workswell, Workswell WIRIS - Product - Thermal camera for drones, 2016. [Online]. Available: https://www.drone-thermal-camera.com/wiris (visited on 01/30/2018).

[22] TEAX Technology, ThermalCapture - Thermal Imaging Technology | Capture raw radiometric thermal data with drones. [Online]. Available: http://thermalcapture.com (visited on 01/30/2018).


[23] DJI, Zenmuse XT - unlock the possibilities of sight - DJI, 2018. [Online]. Available: https://www.dji.com/zenmuse-xt (visited on 01/30/2018).

[24] Workswell, SOFTWARE - Workswell WIRIS - Thermal camera for drones, 2016. [Online]. Available: https://www.drone-thermal-camera.com/software (visited on 01/31/2018).

[25] Therm-App, Therm-App™ - Android apps on Google Play, 2018. [Online]. Available: https://play.google.com/store/apps/details?id=com.thermapp (visited on 01/31/2018).

[26] B. Satzger, W. Hummer, C. Inzinger, P. Leitner, and S. Dustdar, "Winds of change: From vendor lock-in to the meta cloud," IEEE Internet Computing, vol. 17, no. 1, pp. 69–73, 2013. ISSN: 1089-7801. DOI: 10.1109/MIC.2013.19.

[27] J. Divya, Drone Technology and Usage: Current Uses and Future Drone Technology, 2017. [Online]. Available: http://uk.businessinsider.com/drone-technology-uses-2017-7?r=US&IR=T (visited on 01/31/2018).

[28] A. Boulanger, "Open-source versus proprietary software: Is one more reliable and secure than the other?" IBM Systems Journal, vol. 44, no. 2, pp. 239–248, 2005. ISSN: 0018-8670. DOI: 10.1147/sj.442.0239. [Online]. Available: http://ieeexplore.ieee.org/document/5386727

[29] M. Kazmeyer, Disadvantages of Proprietary Software. [Online]. Available: http://smallbusiness.chron.com/disadvantages-proprietary-software-65430.html (visited on 01/31/2018).

[30] B. Steffen and A. Seyfried, "Methods for measuring pedestrian density, flow, speed and direction with minimal scatter," Physica A: Statistical Mechanics and its Applications, vol. 389, no. 9, pp. 1902–1910, May 2010. ISSN: 0378-4371. DOI: 10.1016/j.physa.2009.12.015. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0378437109010115

[31] M. Wirz, T. Franke, D. Roggen, E. Mitleton-Kelly, P. Lukowicz, and G. Tröster, "Inferring crowd conditions from pedestrians' location traces for real-time crowd monitoring during city-scale mass gatherings," Proceedings of the Workshop on Enabling Technologies: Infrastructure for Collaborative Enterprises, WETICE, pp. 367–372, 2012. ISSN: 1524-4547. DOI: 10.1109/WETICE.2012.26.

[32] E. Alpaydin, Introduction to Machine Learning, 3rd ed. MIT Press, 2014, p. 591. ISBN: 026201243X. [Online]. Available: https://dl.acm.org/citation.cfm?id=1734076

[33] J. W. Davis and V. Sharma, "Robust background-subtraction for person detection in thermal imagery," IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, vol. 2004-January, 2004. ISSN: 2160-7516. DOI: 10.1109/CVPR.2004.431.

[34] W. Wang, J. Zhang, and C. Shen, "Improved Human Detection and Classification in Thermal Images," pp. 2313–2316, 2010.

[35] R. Appel, S. Belongie, P. Perona, and P. Dollár, "Fast Feature Pyramids for Object Detection," PAMI, vol. 36, no. 8, pp. 1–14, 2014. ISSN: 0162-8828. DOI: 10.1109/TPAMI.2014.2300479. [Online]. Available: https://vision.cornell.edu/se3/wp-content/uploads/2014/09/DollarPAMI14pyramids_0.pdf

[36] T. Goedeme, "Projectresultaten VLAIO TETRA-project," KU Leuven, Louvain, Tech. Rep., 2017.


[37] L.-L. Slattery, DroneSAR wants to turn drones into search-and-rescue heroes, 2017. [Online]. Available: https://www.siliconrepublic.com/start-ups/dronesar-search-and-rescue-drone-software (visited on 05/26/2018).

[38] AWS Inc., What Is Amazon Kinesis Video Streams?, 2018. [Online]. Available: https://docs.aws.amazon.com/kinesisvideostreams/latest/dg/what-is-kinesis-video.html (visited on 05/26/2018).

[39] U.S. Government, "Systems Engineering Fundamentals," Defence Acquisition University Press, no. January, p. 223, 2001. ISSN: 1872-7565. DOI: 10.1016/j.cmpb.2010.05.002. [Online]. Available: http://www.dtic.mil/docs/citations/ADA387507

[40] L. Bass, P. Clements, and R. Kazman, Software Architecture in Practice, 3rd ed. Addison-Wesley Professional, 2012. ISBN: 0321815734, 9780321815736.

[41] J. Greene and M. Stellman, Applied Software Project Management, 2006, p. 324. ISBN: 978-0596009489. [Online]. Available: http://www.oreilly.com/catalog/appliedprojectmgmt

[42] S. Barber, Acceptable application response times vs. industry standard, 2018. [Online]. Available: https://searchsoftwarequality.techtarget.com/tip/Acceptable-application-response-times-vs-industry-standard (visited on 05/28/2018).

[43] T. Burger, How Fast Is Realtime? Human Perception and Technology | PubNub, 2015. [Online]. Available: https://www.pubnub.com/blog/how-fast-is-realtime-human-perception-and-technology (visited on 05/28/2018).

[44] P. Glennie and N. Thrift, "Time perception models," Neuron, pp. 15696–15699, 1992.

[45] M. Richards, Software Architecture Patterns, 1st ed., H. Scherer, Ed. O'Reilly Media, 2015. [Online]. Available: http://www.oreilly.com/programming/free/files/software-architecture-patterns.pdf

[46] C. Richardson, Microservice Architecture pattern, 2017. [Online]. Available: http://microservices.io/patterns/microservices.html (visited on 12/02/2017).

[47] P. Clements, F. Bachmann, L. Bass, D. Garlan, J. Ivers, R. Little, P. Merson, R. Nord, and J. Stafford, Documenting Software Architectures, 2nd ed. Boston: Pearson Education, Inc., 2011. ISBN: 0-321-55268-7.

[48] Object Management Group, "Unified Modeling Language v2.5.1," no. December, 2017. [Online]. Available: http://www.omg.org/spec/UML/2.5.1

[49] C. De La Torre, C. Maddock, J. Hampton, P. Kulikov, and M. Jones, Communication in a microservice architecture, 2017. [Online]. Available: https://docs.microsoft.com/en-us/dotnet/standard/microservices-architecture/architect-microservice-container-applications/communication-in-microservice-architecture (visited on 04/27/2018).

[50] H. Schulzrinne and S. Casner, "RTP Profile for Audio and Video Conferences with Minimal Control," 2003. [Online]. Available: https://tools.ietf.org/html/rfc3551

[51] D. Bull, Communicating Pictures: A Course in Image and Video Coding. Elsevier Science, 2014. ISBN: 9780080993744. [Online]. Available: https://books.google.be/books?id=PDZOAwAAQBAJ

[52] On-Net Surveillance Systems Inc., "MJPEG vs. MPEG4: Understanding the differences, advantages and disadvantages of each compression technique," 2006. [Online]. Available: www.onssi.com


[53] MAVLink, Introduction · MAVLink Developer Guide, 2013. [Online]. Available: https://mavlink.io/en (visited on 09/14/2017).

[54] H. Schlosser, Microservices trends 2017: Strategies, tools and frameworks - JAXenter, 2017. [Online]. Available: https://jaxenter.com/microservices-trends-2017-survey-133265.html (visited on 03/24/2018).

[55] A. Ronacher, Welcome to Flask - Flask Documentation (0.12), 2017. [Online]. Available: http://flask.pocoo.org/docs/0.12 (visited on 03/24/2018).

[56] F. Reyes, PythonDecorators, 2017. [Online]. Available: https://wiki.python.org/moin/PythonDecorators (visited on 04/27/2018).

[57] StackShare, Companies that use Flask and Flask Integrations, 2018. [Online]. Available: https://stackshare.io/flask (visited on 03/24/2018).

[58] Falcon, Falcon - Bare-metal web API framework for Python. [Online]. Available: https://falconframework.org/#sectionAbout (visited on 03/24/2018).

[59] StackShare, Companies that use Falcon and Falcon Integrations, 2018. [Online]. Available: https://stackshare.io/falcon (visited on 03/24/2018).

[60] A. Ronacher, Nameko for Microservices, 2015. [Online]. Available: http://lucumr.pocoo.org/2015/4/8/microservices-with-nameko (visited on 03/24/2018).

[61] C. Escoffier, Building Reactive Microservices in Java, 2017. ISBN: 9781491986264.

[62] C. Posta, Microservices for Java Developers. ISBN: 9781491963081.

[63] R. Dua, A. R. Raja, and D. Kakadia, "Virtualization vs containerization to support PaaS," in IEEE International Conference on Cloud Engineering, 2014. ISBN: 9781479937660. DOI: 10.1109/IC2E.2014.41.

[64] D. Merkel, Docker: Lightweight Linux Containers for Consistent Development and Deployment, 2014. [Online]. Available: http://delivery.acm.org/10.1145/2610000/2600241/11600.html (visited on 03/19/2018).

[65] Docker Inc., Docker for the Virtualization Admin, 2016, p. 12.

[66] Docker Inc., What is a Container?, 2018. [Online]. Available: https://www.docker.com/what-container (visited on 03/24/2018).

[67] M. Helsley, LXC: Linux container tools, 2009. [Online]. Available: https://www.ibm.com/developerworks/linux/library/l-lxc-containers (visited on 05/21/2018).

[68] J. Fink, Docker: a Software as a Service, Operating System-Level Virtualization Framework, 2014. [Online]. Available: http://journal.code4lib.org/articles/9669 (visited on 03/19/2018).


[69] C. Wang, What is Docker? Linux containers explained, 2017. [Online]. Available: https://www.infoworld.com/article/3204171/linux/what-is-docker-linux-containers-explained.html (visited on 05/21/2018).

[70] CoreOS, rkt, a security-minded, standards-based container engine. [Online]. Available: https://coreos.com/rkt (visited on 03/24/2018).

[71] F. Xu, X. Liu, and K. Fujimura, "Pedestrian detection and tracking with night vision," IEEE Transactions on Intelligent Transportation Systems, vol. 6, no. 1, pp. 63–71, 2005. ISSN: 1524-9050. DOI: 10.1109/TITS.2004.838222.

[72] H. Nanda and L. Davis, "Probabilistic template based pedestrian detection in infrared videos," IEEE Intelligent Vehicles Symposium, Proceedings, vol. 1, pp. 15–20, 2003. DOI: 10.1109/IVS.2002.1187921.

[73] R. E. Schapire, "Explaining AdaBoost," Empirical Inference: Festschrift in Honor of Vladimir N. Vapnik, pp. 37–52, 2013. DOI: 10.1007/978-3-642-41136-6_5.

[74] P. Viola, M. J. Jones, and D. Snow, "Detecting pedestrians using patterns of motion and appearance," International Journal of Computer Vision, vol. 63, no. 2, pp. 153–161, 2005. DOI: 10.1109/ICCV.2003.1238422.

[75] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016. [Online]. Available: http://www.deeplearningbook.org

[76] J. R. R. Uijlings, K. E. A. Van De Sande, T. Gevers, and A. W. M. Smeulders, "Selective Search for Object Recognition," Tech. Rep., 2012. DOI: 10.1007/s11263-013-0620-5. arXiv: 1409.4842. [Online]. Available: http://www.cs.cornell.edu/courses/cs7670/2014sp/slides/VisionSeminar14.pdf

[77] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Region-Based Convolutional Networks for Accurate Object Detection and Segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 1, pp. 142–158, 2014. ISSN: 0162-8828. DOI: 10.1109/TPAMI.2015.2437384. arXiv: 1311.2524.

[78] R. Girshick, "Fast R-CNN," in Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 1440–1448. ISBN: 9781467383912. DOI: 10.1109/ICCV.2015.169. arXiv: 1504.08083.

[79] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137–1149, 2016. ISSN: 0162-8828. DOI: 10.1109/TPAMI.2016.2577031. arXiv: 1506.01497.

[80] K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask R-CNN," arXiv, 2018. arXiv: 1703.06870v3.

[81] J. Dai, Y. Li, K. He, and J. Sun, "R-FCN: Object Detection via Region-based Fully Convolutional Networks," Tech. Rep., 2016. DOI: 10.1109/ICASSP.2017.7952132. arXiv: 1605.06409. [Online]. Available: http://arxiv.org/abs/1605.06409

[82] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," 2015. ISSN: 0168-9002. DOI: 10.1109/CVPR.2016.91. arXiv: 1506.02640. [Online]. Available: http://arxiv.org/abs/1506.02640

[83] J. Redmon and A. Farhadi, "YOLOv3: An Incremental Improvement," arXiv, 2018. [Online]. Available: https://pjreddie.com/media/files/papers/YOLOv3.pdf

[84] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, "SSD: Single Shot MultiBox Detector," arXiv, 2016. arXiv: 1512.02325v5.


[85] B. Zoph and Q. V. Le, "Neural Architecture Search with Reinforcement Learning," in ICLR, 2017, pp. 1–16. arXiv: 1611.01578v2.

[86] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, "Focal Loss for Dense Object Detection," arXiv, 2018. arXiv: 1708.02002v2.

[87] Facebook Inc., ONNX - About, 2017. [Online]. Available: https://onnx.ai/about (visited on 05/21/2018).

[88] TensorFlow, TensorFlow, 2018. [Online]. Available: https://www.tensorflow.org (visited on 05/21/2018).

[89] J. Huang, V. Rathod, C. Sun, M. Zhu, A. Korattikara, A. Fathi, I. Fischer, Z. Wojna, Y. Song, S. Guadarrama, and K. Murphy, "Speed/accuracy trade-offs for modern convolutional object detectors," arXiv, 2017. arXiv: 1611.10012v3.

[90] J. Redmon, Darknet: Open source neural networks in C, http://pjreddie.com/darknet, 2013–2016.

[91] Microsoft, The Microsoft Cognitive Toolkit | Microsoft Docs, 2018. [Online]. Available: https://docs.microsoft.com/en-us/cognitive-toolkit/index (visited on 05/21/2018).

[92] Docker Inc., Overview of Docker Compose | Docker Documentation, 2018. [Online]. Available: https://docs.docker.com/compose/overview (visited on 04/27/2018).

[93] Docker Inc., Use bridge networks | Docker Documentation, 2018. [Online]. Available: https://docs.docker.com/network/bridge (visited on 04/27/2018).

[94] A. Ronacher, Click Documentation (5.0), 2017. [Online]. Available: http://click.pocoo.org/5 (visited on 04/27/2018).

[95] K. Reitz, Requests: HTTP for Humans - Requests 2.18.4 documentation, 2018. [Online]. Available: http://docs.python-requests.org/en/master (visited on 05/09/2018).

[96] Docker Inc., Docker SDK for Python - Docker SDK for Python 2.0 documentation, 2018. [Online]. Available: https://docker-py.readthedocs.io/en/stable (visited on 05/12/2018).

[97] GStreamer, GStreamer: open source multimedia framework, 2018. [Online]. Available: https://gstreamer.freedesktop.org (visited on 05/13/2018).

[98] E. Walthinsen, filesrc: GStreamer Core Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gstreamer-plugins/html/gstreamer-plugins-filesrc.html (visited on 05/13/2018).

[99] E. Hervey, decodebin: GStreamer Base Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-base-plugins/html/gst-plugins-base-plugins-decodebin.html (visited on 05/13/2018).

[100] W. Taymans, jpegenc: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-jpegenc.html (visited on 05/13/2018).


[101] Axis Communications, rtpjpegpay: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-rtpjpegpay.html (visited on 05/13/2018).

[102] W. Taymans, udpsink: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-udpsink.html (visited on 05/13/2018).

[103] GStreamer, Basic tutorial 3: Dynamic pipelines. [Online]. Available: https://gstreamer.freedesktop.org/documentation/tutorials/basic/dynamic-pipelines.html (visited on 05/13/2018).

[104] W. Taymans, udpsrc: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-udpsrc.html (visited on 05/14/2018).

[105] W. Taymans, rtpjpegdepay: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-rtpjpegdepay.html (visited on 05/14/2018).

[106] A. Loonstra, "Videostreaming with GStreamer." [Online]. Available: http://mediatechnology.leiden.edu/images/uploads/docs/wt2014_gstreamer.pdf

[107] W. Taymans, jpegdec: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-jpegdec.html (visited on 05/14/2018).

[108] J. Schmidt, autovideosink: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-autovideosink.html (visited on 05/14/2018).

[109] A. Ronacher, Deployment Options - Flask 0.12.4 documentation, 2018. [Online]. Available: http://flask.pocoo.org/docs/0.12/deploying (visited on 05/14/2018).

[110] R. Yasrab, "Mitigating Docker Security Issues," University of Science and Technology of China, Hefei, Tech. Rep. [Online]. Available: https://arxiv.org/pdf/1804.05039.pdf

[111] lvh, Don't expose the Docker socket (not even to a container), 2015. [Online]. Available: https://www.lvh.io/posts/dont-expose-the-docker-socket-not-even-to-a-container.html (visited on 05/15/2018).

[112] Docker Inc., Use overlay networks | Docker Documentation, 2018. [Online]. Available: https://docs.docker.com/network/overlay/#customize-the-docker_gwbridge-interface (visited on 05/15/2018).

[113] J. W. Davis and M. A. Keck, "A Two-Stage Template Approach to Person Detection in Thermal Imagery," Proc. Workshop on Applications of Computer Vision, 2005. [Online]. Available: http://vcipl-okstate.org/pbvs/bench/papers/wacv05.pdf


[114] J. W. Davis and V. Sharma, "Background-subtraction using contour-based fusion of thermal and visible imagery," Computer Vision and Image Understanding, vol. 106, no. 2-3, pp. 162–182, 2007. DOI: 10.1016/j.cviu.2006.06.010. [Online]. Available: https://web.cse.ohio-state.edu/~davis.1719/Publications/cviu07.pdf

[115] S. Hwang, J. Park, N. Kim, Y. Choi, and I. S. Kweon, "Multispectral Pedestrian Detection: Benchmark Dataset and Baseline," CVPR, 2015. [Online]. Available: https://sites.google.com/site/pedestrianbenchmark

[116] Z. Wu, N. Fuller, D. Theriault, and M. Betke, "A Thermal Infrared Video Benchmark for Visual Analysis," IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2014. DOI: 10.1109/CVPRW.2014.39. [Online]. Available: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.735.2216&rep=rep1&type=pdf

[117] R. Miezianko, Terravic research infrared database.

[118] R. Miezianko, Terravic research infrared database.

[119] S. Z. Li, R. Chu, S. Liao, and L. Zhang, "Illumination Invariant Face Recognition Using Near-Infrared Images," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 4, pp. 627–639, 2007. DOI: 10.1109/TPAMI.2007.1014. [Online]. Available: http://vcipl-okstate.org/pbvs/bench/papers/NIR.pdf

[120] A. Akula, R. Ghosh, S. Kumar, and H. K. Sardana, "Moving target detection in thermal infrared imagery using spatiotemporal information," J. Opt. Soc. Am. A, vol. 30, no. 8, pp. 1492–1501, Aug. 2013. DOI: 10.1364/JOSAA.30.001492. [Online]. Available: http://josaa.osa.org/abstract.cfm?URI=josaa-30-8-1492

[121] R. I. Hammoud, IEEE OTCBVS WS Series Bench. [Online]. Available: http://vcipl-okstate.org/pbvs/bench (visited on 05/18/2018).

[122] Last Post Association, Mission, 2018. [Online]. Available: http://www.lastpost.be/en/the-last-post/mission (visited on 05/18/2018).

[123] FLIR Systems, Inc., FLIR One Pro, 2017. [Online]. Available: https://www.flir.com/globalassets/imported-assets/document/17-1746-oem-flir_one_pro_datasheet_final_v1_web.pdf

[124] R. J. Ramana, Introduction to Camouflage and Deception. Defence Scientific Information & Documentation Centre, pp. 99–164.

[125] A. Bornstein and I. Richter, Microsoft visual object tagging tool. [Online]. Available: https://github.com/Microsoft/VoTT (visited on 05/20/2018).

[126] F. E. Grubbs, "Procedures for Detecting Outlying Observations in Samples," Technometrics, vol. 11, no. 1, pp. 1–21, Feb. 1969. DOI: 10.1080/00401706.1969.10490657. [Online]. Available: http://www.tandfonline.com/doi/abs/10.1080/00401706.1969.10490657

[127] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A Large-Scale Hierarchical Image Database," in CVPR09, 2009. [Online]. Available: http://www.image-net.org/papers/imagenet_cvpr09.pdf


[128] D. Gupta, Transfer learning & The art of using Pre-trained Models in Deep Learning, 2017. [Online]. Available: https://www.analyticsvidhya.com/blog/2017/06/transfer-learning-the-art-of-fine-tuning-a-pre-trained-model (visited on 05/20/2018).

[129] Docker Inc., docker stats | Docker Documentation, 2018. [Online]. Available: https://docs.docker.com/engine/reference/commandline/stats (visited on 05/24/2018).

[130] M. Gori and A. Tesi, "On the Problem of Local Minima in Recurrent Neural Networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 14, no. 1, pp. 76–86, 1992. DOI: 10.1109/34.107014.

[131] L. Prechelt, "Early stopping - but when?" in Neural Networks: Tricks of the Trade, G. B. Orr and K.-R. Müller, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 1998, pp. 55–69. ISBN: 978-3-540-49430-0. DOI: 10.1007/3-540-49430-8_3. [Online]. Available: https://doi.org/10.1007/3-540-49430-8_3

[132] M. Everingham, L. Van Gool, C. K. Williams, J. Winn, and A. Zisserman, "The Pascal visual object classes (VOC) challenge," International Journal of Computer Vision, vol. 88, no. 2, pp. 303–338, 2010. ISSN: 0920-5691. DOI: 10.1007/s11263-009-0275-4.

[133] M. Everingham, S. M. A. Eslami, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, "The Pascal Visual Object Classes Challenge: A Retrospective," International Journal of Computer Vision, vol. 111, no. 1, pp. 98–136, 2014. ISSN: 1573-1405. DOI: 10.1007/s11263-014-0733-5.

[134] P. Henderson and V. Ferrari, "End-to-end training of object class detectors for mean average precision," Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 10115 LNCS, pp. 198–213, 2017. ISSN: 1611-3349. DOI: 10.1007/978-3-319-54193-8_13. arXiv: 1607.03476.

[135] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, "Microsoft COCO: Common objects in context," Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 8693 LNCS, no. PART 5, pp. 740–755, 2014. ISSN: 1611-3349. DOI: 10.1007/978-3-319-10602-1_48. arXiv: 1405.0312.

[136] Docker Inc., library/docker, 2018. [Online]. Available: https://hub.docker.com/_/docker (visited on 06/01/2018).

[137] Nvidia, nvidia-docker. [Online]. Available: https://github.com/NVIDIA/nvidia-docker (visited on 05/25/2018).

[138] FLIR, "FLIR One." [Online]. Available: http://www.flir.com/uploadedFiles/Store/Products/FLIR-ONE/3rd-Gen/FLIR-ONE/FLIR-ONE-Gen-3-Datasheet.pdf

[139] FLIR, "FLIR Boson," p. 2, 2016.


Appendix A

Firefighting department email conversations

This appendix contains the email conversations with different firefighting departments in Belgium, as part of an exploration of the functional requirements of an aerial thermal imaging solution. Note that all conversations were translated from Dutch to English.

A.1 General email sent to Firefighting departments

This email was sent to the departments mentioned later in this appendix. The responses in the following sections are responses to this email.

Subject: Firefighting department - Thesis thermal drones

Dear Sir/Madam

My name is Brecht Verhoeve. I am a student Master of Science in Computer Science Engineering at Ghent University. I am contacting your department with reference to the research of my master's dissertation. I am currently researching the applications of thermal cameras in combination with commercial drones. They can create an aerial overview of scenes and objects that often can't be spotted with visual detectors, like hidden persons, fires or hot explosives. The eventual goal is to let a computer indicate these objects of interest autonomously on the thermal images of the drone. These images could aid firefighters with their work.

For this research I have some questions for you.

Functionality

I have listed some functionalities which I believe could be interesting for a firefighter:

• Detection of persons in buildings (find potential victims)

• Detection of hidden fires in buildings (to identify danger zones)

• Detection of fires on vast terrains (forests, industrial terrains)

• Indication of hot explosives

I have two questions:

• Do you agree that these are the most important functions?

• Are there any other functions that you deem important?

Quality of the application

Next to the functionality, the quality of the application is also important. For me the most important aspects are:

• Accuracy: The software must be accurate. There is no room for errors when detecting.

• Speed: The software must operate quickly. An overview must be created quickly to not waste time in case of an emergency.

• Usability: The software must be easy to use.

Once again I have two questions:

• Do you agree with these qualities?

• Are there any other important qualities that you deem important?

I would like to thank you in advance for your time.

Best regards

Brecht Verhoeve

A.2 Conversation with Firefighting department of Antwerp, Belgium

The answers were given inline. For clarity, they are repeated explicitly here.

Subject: Re: Firefighting department Antwerp - Thesis thermal drones

Answers can be found in your email.

Best regards

Functionality: Detection of hidden fires in buildings and environments. Are there any other functions that you deem important? Capture the evolution of a fire with the thermal camera. Visualise incidents during night-time. Capture invisible fires, such as hydrogen or methane fires.


A.3 Conversation with Firefighting department of Ostend, Belgium

The answers were given inline. For clarity, they are repeated explicitly here.

Subject: Re: Firefighting department Ostend - Thesis thermal drones

Dear Brecht

You can find the answers after the questions in your email.

Best Regards

Functionality: Are there any other functions that you deem important? These are the most important for us at the moment.

Quality of the application: Are there any other important qualities that you deem important? The application must work autonomously.

A.4 Conversation with Firefighting department of Courtrai, Belgium

Subject: Re: Firefighting department Courtrai - Thesis thermal drones

Dear Brecht

Below you will find our answers (next to the already mentioned items).

Functionality:

• The detection of persons in a landscape. For example, for missing persons after a traffic accident, there are searches in the dark for victims that were catapulted from a vehicle. Today this is done via a thermal camera on the ground, but with a drone this could hold potential benefits. Another example is searching for missing persons in nature reserves. The police sometimes ask for the assistance of firefighters to search the area.

Quality of the application:

• The images need to be processed in real time, not after the drone has landed.

The drones must be deployable for multiple purposes.

The interpretation of the images can in the future be important for automatic flight control of drones. Currently there is a European project, "3D Safeguard", in which KU Leuven is participating. They are already quite advanced in interpreting the images from a drone to spot persons through smoke. With this information, the drone can be redirected. The application could thus use the interpretations of the images to control the drone in flight.

Best regards

A.5 Conversation with Firefighting department of Ghent, Belgium

Subject: Re: Firefighting department Ghent - Thesis thermal drones


Hi Brecht

I don't know if you've received the previous email, but it contained the answers to your questions.

Best regards

Subject: Re: Firefighting department Ghent - Thesis thermal drones

With respect to the functionality, I would like to add:

• Measuring the temperature of containers/silos

I agree with the qualities of the application. It could be handy to be able to view the application from one or more devices. Everything should have a clear overview. If possible, information and controls should be available on one screen.

I will follow up.

Best regards


Appendix B

Thermal camera specifications

This appendix gives all the specifications for the compared thermal cameras. First, the different cameras, their producing companies and average retail prices are listed in Table B.1. Second, their respective physical specifications are presented in Table B.2. Third, the image qualities are presented in Table B.3. Fourth, the thermal precisions are presented in Table B.4. Fifth, the available interfaces to interact with each camera are presented in Table B.5. Sixth, the energy consumption of each camera is presented in Table B.6. Seventh, how support is offered when developing for these platforms is presented in Table B.7. Finally, auxiliary features are presented in Table B.8.


Product Company Price (Euro)

Wiris 2nd Gen 640 Workswell 9,995.00

Wiris 2nd Gen 336 Workswell 6,995.00

Duo Pro R 640 FLIR 6,409.00

Duo Pro R 336 FLIR 4,384.84

Duo FLIR 949.99

Duo R FLIR 1,239.99

Vue 640 FLIR 2,689.00

Vue 336 FLIR 1,259.93

Vue Pro 640 FLIR 4,032.18

Vue Pro 336 FLIR 2,302.61

Vue Pro R 640 FLIR 5,184.56

Vue Pro R 336 FLIR 3,455.99

Zenmuse XT 640 DJI x FLIR 11,810.00

Zenmuse XT 336 DJI x FLIR 6,970.00

Zenmuse XT 336 R DJI x FLIR 9,390.00

Zenmuse XT 640 R DJI x FLIR 14,230.00

One FLIR 237.99

One Pro FLIR 469.00

Tau 2 640 FLIR 6,746.36

Tau 2 336 FLIR 4,933.89

Tau 2 324 FLIR 2640

Lepton 3 160 x 120 FLIR 259.95

Lepton 3 80 x 60 FLIR 143.38

Boson 640 FLIR 1,222.09

Boson 320 FLIR 938.42

Quark 2 640 FLIR 331.65

Quark 2 336 FLIR 331.65

DroneThermal v3 Flytron 341.15

Compact Seek Thermal 275.00

CompactXR Seek Thermal 286.46

Compact Pro Seek Thermal 599.00

Therm-App Opgal 937.31

Therm-App TH Opgal 2,950.00

Therm-App 25 Hz Opgal 1,990.00

Table B.1: Compared cameras, their producing companies and their average retail price


Product Weight (g) Dimensions (mm)

Wiris 2nd Gen 640 390 135 x 77 x 69

Wiris 2nd Gen 336 390 135 x 77 x 69

Duo Pro R 640 325 85 x 81.3 x 68.5

Duo Pro R 336 325 85 x 81.3 x 68.5

Duo 84 41 x 59 x 30

Duo R 84 41 x 59 x 30

Vue 640 114 57.4 x 44.45 x 44.45

Vue 336 114 57.4 x 44.45 x 44.45

Vue Pro 640 92.14 57.4 x 44.45 x 44.45

Vue Pro 336 92.14 57.4 x 44.45 x 44.45

Vue Pro R 640 92.14 57.4 x 44.45 x 44.45

Vue Pro R 336 92.14 57.4 x 44.45 x 44.45

Zenmuse XT 640 270 103 x 74 x 102

Zenmuse XT 336 270 103 x 74 x 102

Zenmuse XT 336 R 270 103 x 74 x 102

Zenmuse XT 640 R 270 103 x 74 x 102

One 34.5 67 x 34 x 14

One Pro 36.5 68 x 34 x 14

Tau 2 640 72 44.4 x 44.4 x 44.4

Tau 2 336 72 44.4 x 44.4 x 44.4

Tau 2 324 72 44.4 x 44.4 x 44.4

Lepton 3 160 x 120 0.9 11.8 x 12.7 x 7.2

Lepton 3 80 x 60 0.9 11.8 x 12.7 x 7.2

Boson 640 7.5 21 x 21 x 11

Boson 320 7.5 21 x 21 x 11

Quark 2 640 8 22 x 22 x 12

Quark 2 336 8 22 x 22 x 12

DroneThermal v3 3 20 x 20 x 15

Compact 14.17 25.4 x 44.4 x 20.3

CompactXR 14.17 25.4 x 44.4 x 25.4

Compact Pro 14.17 25.4 x 44.4 x 25.4

Therm-App 138 55 x 65 x 40

Therm-App TH 123 55 x 65 x 40

Therm-App 25 Hz 138 55 x 65 x 40

Table B.2: Physical specifications


Product IR Resolution (pixels) SD resolution (megapixels) Frequency (Hz) FOV Radiometry

Wiris 2nd Gen 640 640 x 512 1.92 not specified Various yes

Wiris 2nd Gen 336 336 x 256 1.92 not specified Various yes

Duo Pro R 640 640 x 512 12 30 Various lens yes

Duo Pro R 336 336 x 256 12 30 Various lens yes

Duo 160 x 120 2 7.5 and 8.3 57° x 44° no

Duo R 160 x 120 2 7.5 57° x 44° yes

Vue 640 640 x 512 0 7.5 Various lens no

Vue 336 336 x 256 0 7.5 Various lens no

Vue Pro 640 640 x 512 0 7.5 Various lens no

Vue Pro 336 336 x 256 0 7.5 Various lens no

Vue Pro R 640 640 x 512 0 7.5 Various lens yes

Vue Pro R 336 336 x 256 0 7.5 Various lens yes

Zenmuse XT 640 640 x 512 0 7.5 Various lens no

Zenmuse XT 336 336 x 256 0 7.5 Various lens no

Zenmuse XT 336 R 336 x 256 0 7.5 Various lens yes

Zenmuse XT 640 R 336 x 256 0 7.5 Various lens yes

One 80 x 60 1.5 8.7 50° x 38° yes

One Pro 160 x 120 1.5 8.7 55° x 43° yes

Tau 2 640 640 x 512 0 7.5 Various lens yes

Tau 2 336 336 x 256 0 7.5 Various lens yes

Tau 2 324 324 x 256 0 7.6 Various lens yes

Lepton 3 160 x 120 160 x 120 0 8.8 56° available

Lepton 3 80 x 60 80 x 60 0 8.8 56° no

Boson 640 640 x 512 0 9.0 Various lens no

Boson 320 320 x 256 0 9.0 Various lens no

Quark 2 640 640 x 512 0 9 Various lens no

Quark 2 336 336 x 256 0 9 Various lens no

DroneThermal v3 80 x 60 0 8.6 25° no

Compact 206 x 156 0 9 36° no

CompactXR 205 x 156 0 9 20° no

Compact Pro 320 x 240 0 15 32° no

Therm-App 384 x 288 0 8.7 Various lens no

Therm-App TH 384 x 288 0 8.7 Various lens yes

Therm-App 25 Hz 384 x 288 0 25 Various lens no

Table B.3: Image quality

IR: InfraRed; SD: Standard; FOV: Field of View


Product Sensitivity (mK) Temperature range (degrees Celsius) Accuracy (degrees Celsius)

Wiris 2nd Gen 640 50 -25 to +150 / -40 to +550 2

Wiris 2nd Gen 336 50 -25 to +150 / -40 to +550 2

Duo Pro R 640 50 -25 to +135 / -40 to +550 5 / 20

Duo Pro R 336 50 -25 to +135 / -40 to +550 5 / 20

Duo not specified -40 to +550 5

Duo R not specified -40 to +550 5

Vue 640 not specified -58 to +113 not specified

Vue 336 not specified -58 to +113 not specified

Vue Pro 640 not specified -58 to +113 not specified

Vue Pro 336 not specified -58 to +113 not specified

Vue Pro R 640 not specified -58 to +113 not specified

Vue Pro R 336 not specified -58 to +113 not specified

Zenmuse XT 640 50 -40 to +550 not specified

Zenmuse XT 336 50 -40 to +550 not specified

Zenmuse XT 336 R 50 -40 to +550 not specified

Zenmuse XT 640 R 50 -40 to +550 not specified

One 150 -20 to +120 3

One Pro 150 -20 to +400 3

Tau 2 640 50 -40 to +550 not specified

Tau 2 336 50 -40 to +550 not specified

Tau 2 324 50 -40 to +550 not specified

Lepton 3 160 x 120 50 0 to +450 5

Lepton 3 80 x 60 50 0 to +450 5

Boson 640 40 0 to +500 not specified

Boson 320 40 0 to +500 not specified

Quark 2 640 50 -40 to +160 not specified

Quark 2 336 50 -40 to +160 not specified

DroneThermal v3 50 0 to +120 not specified

Compact not specified -40 to +330 not specified

CompactXR not specified -40 to +330 not specified

Compact Pro 70 -40 to +330 not specified

Therm-App 70 5 to +90 3

Therm-App TH 70 0 to +200 2

Therm-App 25 Hz 70 5 to +90 3

Table B.4: Thermal precision


Product USB MAVLink HDMI

Wiris 2nd Gen 640 Flash disk yes yes

Wiris 2nd Gen 336 Flash disk yes yes

Duo Pro R 640 Mini-USB yes micro-HDMI

Duo Pro R 336 Mini-USB yes micro-HDMI

Duo Mini-USB yes micro-HDMI

Duo R Mini-USB yes micro-HDMI

Vue 640 Mini-USB No No

Vue 336 Mini-USB no no

Vue Pro 640 Mini-USB yes Optional

Vue Pro 336 Mini-USB yes Optional

Vue Pro R 640 Mini-USB yes Optional

Vue Pro R 336 Mini-USB yes Optional

Zenmuse XT 640 Via DJI drone Via DJI drone Via DJI drone

Zenmuse XT 336 Via DJI drone Via DJI drone Via DJI drone

Zenmuse XT 336 R Via DJI drone Via DJI drone Via DJI drone

Zenmuse XT 640 R Via DJI drone Via DJI drone Via DJI drone

Tau 2 640 No no no

Tau 2 336 No no no

Tau 2 324 No no no

Lepton 3 160 x 120 No no no

Lepton 3 80 x 60 No no no

Boson 640 Yes no no

Boson 320 Yes no no

Quark 2 640 no no no

Quark 2 336 no no no

DroneThermal v3 no no no

Compact Smartphone storage no no

CompactXR Smartphone storage no no

Compact Pro Smartphone storage no no

Therm-App Smartphone storage no no

Therm-App TH Smartphone storage no no

Therm-App 25 Hz Smartphone storage no no

Table B.5: Interfaces


Product Power consumption (Watt) Input voltage (Volt)

Wiris 2nd Gen 640 4 6 - 36

Wiris 2nd Gen 336 4 6 - 36

Duo Pro R 640 10 5.0 - 26.0

Duo Pro R 336 10 5.0 - 26.0

Duo 2.2 5.0 - 26.0

Duo R 2.2 5.0 - 26.0

Vue 640 1.2 4.8 - 6.0

Vue 336 1.2 4.8 - 6.0

Vue Pro 640 2.1 4.8 - 6.0

Vue Pro 336 2.1 4.8 - 6.0

Vue Pro R 640 2.1 4.8 - 6.0

Vue Pro R 336 2.1 4.8 - 6.0

Zenmuse XT 640 Via DJI drone Via DJI drone

Zenmuse XT 336 Via DJI drone Via DJI drone

Zenmuse XT 336 R Via DJI drone Via DJI drone

Zenmuse XT 640 R Via DJI drone Via DJI drone

One approx. 1 h battery lifetime Battery

One Pro approx. 1 h battery lifetime Battery

Tau 2 640 1.3 4.0 - 6.0

Tau 2 336 1.3 4.0 - 6.1

Tau 2 324 1.3 4.0 - 6.2

Lepton 3 160 x 120 0.65 3.1

Lepton 3 80 x 60 0.65 3.1

Boson 640 0.5 3.3

Boson 320 0.5 3.3

Quark 2 640 1.2 3.3

Quark 2 336 1.2 3.3

DroneThermal v3 0.15 3.3 - 5

Compact Via smartphone Smartphone

CompactXR Via smartphone Smartphone

Compact Pro Via smartphone Smartphone

Therm-App 0.5 5

Therm-App TH 0.5 5

Therm-App 25 Hz 0.5 5

Table B.6: Energy consumption


Product Warranty (years) User Manual Phone support Email support FAQs

Wiris 2nd Gen 640 Not specified Yes Yes Yes Yes

Wiris 2nd Gen 336 Not specified Yes Yes Yes Yes

Duo Pro R 640 1 Yes Yes Yes Yes

Duo Pro R 336 1 Yes Yes Yes Yes

Duo 1 yes Yes Yes Yes

Duo R 1 yes yes yes yes

Vue 640 1 yes yes yes yes

Vue 336 1 yes yes yes yes

Vue Pro 640 1 yes yes yes yes

Vue Pro 336 1 yes yes yes yes

Vue Pro R 640 1 yes yes yes yes

Vue Pro R 336 1 yes yes yes yes

Zenmuse XT 640 0.5 yes yes yes yes

Zenmuse XT 336 0.5 yes yes yes yes

Zenmuse XT 336 R 0.5 yes yes yes yes

Zenmuse XT 640 R 0.5 yes yes yes yes

One 1 yes yes yes yes

One Pro 1 yes yes yes yes

Tau 2 640 1 yes yes yes yes

Tau 2 336 1 yes yes yes yes

Tau 2 324 1 yes yes yes yes

Lepton 3 160 x 120 1 yes yes yes yes

Lepton 3 80 x 60 1 yes yes yes yes

Boson 640 1 yes yes yes yes

Boson 320 1 yes yes yes yes

Quark 2 640 1 yes yes yes yes

Quark 2 336 1 yes yes yes yes

DroneThermal v3 not specified no no no no

Compact 1 yes yes yes yes

CompactXR 1 yes yes yes yes

Compact Pro 1 yes yes yes yes

Therm-App 1 yes yes yes yes

Therm-App TH 1 yes yes yes yes

Therm-App 25 Hz 1 yes yes yes yes

Table B.7: Help and support


Product Bluetooth Wi-Fi GPS Mobile app Storage

Wiris 2nd Gen 640 no on request Yes no yes

Wiris 2nd Gen 336 no on request yes no yes

Duo Pro R 640 yes no yes yes yes

Duo Pro R 336 yes no yes yes yes

Duo no no no no yes

Duo R no no no no yes

Vue 640 No no no no no

Vue 336 no no no no no

Vue Pro 640 yes no no yes yes

Vue Pro 336 yes no no yes yes

Vue Pro R 640 yes no no yes yes

Vue Pro R 336 yes no no yes yes

Zenmuse XT 640 Via DJI drone Via DJI drone Via DJI drone yes yes

Zenmuse XT 336 Via DJI drone Via DJI drone Via DJI drone yes yes

Zenmuse XT 336 R Via DJI drone Via DJI drone Via DJI drone yes yes

Zenmuse XT 640 R Via DJI drone Via DJI drone Via DJI drone yes yes

One no no no yes yes

One Pro no no no yes yes

Tau 2 640 no no no no yes

Tau 2 336 no no no no yes

Tau 2 324 no no no no yes

Lepton 3 160 x 120 no no no no no

Lepton 3 80 x 60 no no no no no

Boson 640 no no no no no

Boson 320 no no no no no

Quark 2 640 no no no no no

Quark 2 336 no no no no no

DroneThermal v3 no no no no no

Compact no no no yes yes

CompactXR no no no yes yes

Compact Pro no no no yes yes

Therm-App no no no yes yes

Therm-App TH no no no yes yes

Therm-App 25 Hz no no no yes yes

Table B.8: Auxiliary features


Appendix C

Last Post thermal dataset summary

The goal of this appendix is to provide a summary of the layout of the Last Post thermal dataset. The data was captured on the following days: the 24th of March 2018 and the 2nd, 3rd, 4th, 5th, 9th, 10th, 11th and 12th of April 2018. For each date, a small summary of the contents is given below. The summary consists of a description of the conditions that day and a listing of the video files and their contents.

C.1 24th of March 2018

Conditions

• Hours: 19:40 - 20:20

• Outside temperature range: 5 degrees Celsius - 12 degrees Celsius

• Clear

• Humidity: 76%

• Wind: 24 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 14 kilometers

Videos

• flir_20180324T195255.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video, the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people. A large crowd gathers on the right of the video.


• flir_20180324T195836.mp4: This video gives an overview of the inside of the Meningate ceremony. Many people can be seen watching the ceremony.

• flir_20180324T200421.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video, the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people. A large crowd is visible on the right side.

• flir_20180324T201448.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video, the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people. A large crowd is visible on the right side. The crowd is leaving the area after the ceremony.

• flir_20180324T202328.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video, the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people. A large crowd is visible on the right side. The crowd is leaving the area after the ceremony.

C.2 2nd of April 2018

Conditions

• Hours: 19:40 - 20:20

• Outside temperature range: 9 degrees Celsius - 15 degrees Celsius

• Light rain

• Humidity: 74%

• Wind: 18 kilometers per hour

• Precipitation: 0.4 centimeters

• Visibility: 8.1 kilometers

Videos

• 2018-04-02 194733.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video, the Meningate is in the bottom right. Several buildings are seen, a sign, a police car, several cars in the distance and small groups of people, sometimes with umbrellas, passing through.

• 2018-04-02 194952.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video, the Meningate is in the bottom right. Several buildings are seen, a sign, a police car and several cars in the distance.


• 2018-04-02 195518.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video, the Meningate is in the bottom right. Several buildings are seen, a sign, a police car and several cars in the distance.

• 2018-04-02 201322.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video, the Meningate is in the bottom right. Several buildings are seen, a sign, a police car and several cars in the distance. Crowds can be seen, as well as people holding umbrellas.

C.3 3rd of April 2018

Conditions

• Hours: 20:00 - 20:30

• Outside temperature range: 8 degrees Celsius - 16 degrees Celsius

• Heavy rain

• Humidity: 79%

• Wind: 25 kilometers per hour

• Precipitation: 0.5 centimeters

• Visibility: 10.1 kilometers

Videos

• 2018-04-03 201227.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video, the Meningate is in the bottom right. Several buildings are seen. People are holding umbrellas due to heavy rain on that day. Due to the rain and wind it was difficult to steady the camera, which can be seen in the shaky video.

• 2018-04-03 201727.mp4: In the beginning of the clip the camera is moving towards the other side of the Meningate. From 00:20 onwards the clip is useful. The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. The video shows people leaving from the Meningate towards the busses at the other side of the bridge. Most people are holding umbrellas due to heavy rain that day. The Meningate is in the bottom left of the picture. Several buildings can be seen in the distance. In the bottom right the water of the Kasteelgracht can be seen. Sometimes, in the left of the picture, the wall of the Meningate can be seen.

• 2018-04-03 202311.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video, the Meningate is in the bottom right. Several buildings are seen. People are holding umbrellas due to heavy rain on that day. Due to the rain and wind it was difficult to steady the camera, which can be seen in the shaky video.


C.4 4th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 10 degrees Celsius - 14 degrees Celsius

• Cloudy

• Humidity: 87%

• Wind: 18 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 12.9 kilometers

Videos

• 2018-04-04 200052.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video, the Meningate is in the bottom right. Several buildings are seen. In the video, a dense crowd of people slowly starts to form in the bottom right of the picture. Most people are coming from the left of the video to join the crowd.

• 2018-04-04 200728.mp4: This video shows the inside of the Meningate and the ceremony of the Last Post. Some people are up close in front. The large crowd can be seen through the hall.

• 2018-04-04 200913.mp4: This video shows the inside of the Meningate and the ceremony of the Last Post. The video switches between MSX mode, visual camera and thermal camera to show the differences.

• 2018-04-04 202859.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video, the Meningate is in the bottom right. Several buildings are seen. At the start of the video a crowd is seen in the bottom right. At the 01:00 mark the ceremony has ended and people are exiting the gate and coming onto the crossing. They form two rows to make place for the marching band exiting the Meningate. It can be seen marching through the crowd at the 02:50 mark.

C.5 5th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 11 degrees Celsius - 15 degrees Celsius


• Sunny

• Humidity: 77%

• Wind: 11 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 12.9 kilometers

Videos

• 2018-04-05 200217.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video, the Meningate is in the bottom right. Several buildings are seen. In the video, a dense crowd of people slowly starts to form in the bottom right of the picture. Most people are coming from the left of the video to join the crowd. The video shows the 15 minutes before the start of the ceremony.

• 2018-04-04 201838.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video, the Meningate is in the bottom right. Several buildings are seen. The video shows the first ten minutes after the end of the ceremony. The crowd, which can be seen on the left, leaves towards the square.

C.6 9th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 9 degrees Celsius - 10 degrees Celsius

• Light rain

• Humidity: 99%

• Wind: 14 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 8.1 kilometers

Videos

• 2018-04-09 200007.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video, the Meningate is in the bottom right. Several buildings are seen. People are coming from the left towards the Meningate in the right. Not a lot of people are seen due to the rain that day.


• 2018-04-09-202302.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video, the Meningate is in the bottom right. People are leaving from the right of the Meningate towards the square.

C.7 10th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 14 degrees Celsius - 17 degrees Celsius

• Partly cloudy

• Humidity: 52%

• Wind: 13 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 12.9 kilometers

Videos

• 2018-04-10 195029.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. On the video, an army unit can be seen on the left. It can be seen that they are standing in a very structured way.

• 2018-04-10 195131.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. On the video, an army unit can be seen on the left. It can be seen that they are standing in a very structured way.

• 2018-04-10 195748.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. On the video, an army unit can be seen on the left. It can be seen that they are standing in a very structured way. Some people are moving around the crowd.

• 2018-04-10 200122.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video, the Meningate is in the bottom right. Several buildings are seen. There is a big crowd that can be seen on the right. There are some schools there, so some people are wearing backpacks. It is quite warm and the cafe on the other side of the street has opened up its terrace.

• 2018-04-10 201427.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. On the video, an army unit can be seen on the left. It can be seen that they are standing in a very structured way. Some people are moving around the crowd. The image is not rotated well; a well-rotated version is found in 2018-04-10 201427_rotated.mp4.

• 2018-04-10 201515.mp4: This video shows the inside of the Meningate and the ceremony. A traditional 'Haka' from New Zealand soldiers can be heard in the video; the soldiers are difficult to spot due to thermal blurring, because many people are standing in one place.

• 2018-04-10 202558.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. On the video, an army unit can be seen on the left. It can be seen that they are standing in a very structured way. Some people are moving around the crowd. At the 02:00 mark the army unit marches to the end of the bridge. Very dense crowds can be seen afterwards. At 08:25 the army unit marches in a straight line towards the Meningate.

C.8 11th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 12 degrees Celsius - 16 degrees Celsius

• Sunny

• Humidity: 63%

• Wind: 14 kilometers per hour

• Precipitation: 0 centimeters

• Visibility: 12.9 kilometers

Videos

• 2018-04-11 200140.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video, the Meningate is in the bottom right. Several buildings are seen.

• 2018-04-11 200601.mp4: The video gives an overview of the bridge at the east side of the Meningate. This is where the Frenchlaan goes into the Menenstraat. A small crowd can be seen on the left of the video.

• 2018-04-11 201554.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video, the Meningate is in the bottom right. Several buildings are seen. People start leaving the ceremony from the 01:20 mark.


C9 12th of April 2018

Conditions

• Hours: 19:45 - 20:30

• Outside temperature range: 11 degrees Celsius - 14 degrees Celsius

• Rain

• Humidity: 94%

• Wind: 8 kilometers per hour

• Precipitation: 0.1 centimeters

• Visibility: 3.2 kilometers

Videos

• 2018-04-12 195219.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video, the Meningate is in the bottom right and several buildings are seen. Sometimes the hedge of the wall from where the video was filmed is visible due to camera shake. Not many people are seen due to the rain.

• 2018-04-12 201526.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video, the Meningate is in the bottom right and several buildings are seen. Sometimes the hedge of the wall from where the video was filmed is visible due to camera shake. Not many people are seen due to the rain. People are leaving towards the right.


Fig. 2. Schematic overview of a plugin

the stream. The REST paradigm is selected to build this API, with state, sources and listeners resources that need to be minimally implemented. A plugin has the following states: INACTIVE, PLAY, STOP and PAUSE. Figure 3 depicts the state transition diagram. A plugin implements the visible states STOP, PAUSE and PLAY, describing whether the media process of the plugin is stopped, paused or processing respectively. The INACTIVE state is only visible to the framework, as it indicates that there is no active process that runs the plugin; this is the initial state of a plugin in the framework. When a plugin is added to a stream, the plugin microservice is started, transitions to the STOP state and waits for commands.
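For illustration, the sketch below shows how such a state resource could look for a plugin, using Flask (the microservices framework used for the prototype REST APIs). The endpoint layout and JSON payloads are assumptions for the sketch, not the prototype's exact interface.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# A plugin becomes visible in the STOP state once its process is started;
# the INACTIVE state is only known to the framework itself.
state = {"current": "STOP"}
VISIBLE_STATES = {"STOP", "PAUSE", "PLAY"}

@app.route("/state", methods=["GET"])
def get_state():
    # Report the current state of the plugin's media process
    return jsonify(state)

@app.route("/state", methods=["PUT"])
def set_state():
    target = (request.get_json(silent=True) or {}).get("state")
    if target not in VISIBLE_STATES:
        return jsonify({"error": "unknown state"}), 400
    # A real plugin would start, pause or stop its media process here
    state["current"] = target
    return jsonify(state)

if __name__ == "__main__":
    app.run(port=5000)
```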

Fig. 3. State transition diagram of a plugin

C.2 Network topology and communication protocol

The microservices of the framework and the plugins need a communication protocol to exchange commands and video. For sending the commands, the HTTP/TCP protocol is used: a synchronous protocol that blocks on the response of a request. This ensures that commands are always received and acknowledged, increasing the reliability of the communication at the cost of performance [22]. The asynchronous RTP/UDP protocol is selected to transfer media between the plugins, to ensure low latency video transfer and enable real-time video streams. The recommended codec for transmitting video media is MJPEG, which transmits video frames as separately encoded JPEG images [23]. Other codecs such as MPEG-4 encode only certain frames as keyframes and the other frames as B-frames that encode differences from the keyframe [24]. This implies that, when receiving images from a stream, a keyframe must first be received before the video can be decoded. Using MJPEG, plugins receiving frames can directly perform analysis on each keyframe and do not have to wait for additional frames. An example of this network topology for a stream with 2 Producer plugins and 3 Consumer plugins is depicted in Figure 4.

Fig. 4. Network topology. The full lines represent HTTP/TCP communications, the dashed lines RTP/UDP communications.
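As a sketch of this transport, the following snippet builds a minimal MJPEG-over-RTP sender and receiver with GStreamer, the media framework used by the prototype plugins. The test video source, host and port are placeholders.

```python
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst, GLib

Gst.init(None)

# Sender: JPEG-encode every frame and packetize it as RTP over UDP,
# so each transmitted frame is independently decodable.
sender = Gst.parse_launch(
    "videotestsrc ! videoconvert ! jpegenc ! rtpjpegpay "
    "! udpsink host=127.0.0.1 port=5000"
)

# Receiver: depayload the RTP stream, decode the JPEG frames and display them
receiver = Gst.parse_launch(
    'udpsrc port=5000 caps="application/x-rtp, media=video, '
    'encoding-name=JPEG, payload=26" ! rtpjpegdepay ! jpegdec ! autovideosink'
)

sender.set_state(Gst.State.PLAYING)
receiver.set_state(Gst.State.PLAYING)
GLib.MainLoop().run()
```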

IV PROTOTYPE IMPLEMENTATION

The goal of the prototype implementation is to provide a proof of the concept architecture and meet the requirements presented in Section III. The core framework components are implemented; the distribution components are left out of scope, as they focus primarily on the distribution of supported plugins. The core microservices as well as the plugins are implemented using the Docker software containerization framework [25]. Containers virtualize on the operating system and allow for portable, lightweight software environments for processes, with a minor performance overhead. Using this technology, the core modules and plugins can be deployed in a local and distributed fashion and be set up and shut down easily [26]. The microservice containers communicate via the protocols presented in Section III-C.2. The REST APIs are built with the Flask framework [27], a lightweight Python web development framework ideal for prototyping. The Producer/Consumer components need access to the Docker daemon running on the Docker host in order to spin up and shut down Docker containers containing the plugin processes. This is achieved by mounting the Docker client socket in the Producer/Consumer containers, which gives the container root access to the host, a significant security threat [28, 29]. Two sample plugins were implemented: Filecam, a plugin that produces video read in from a file, and Display, a plugin that forwards media to the display of the local device. The plugins transmit media using the video streaming framework GStreamer [30].
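A minimal sketch of this pattern with the Docker SDK for Python is shown below; the plugin image name is a placeholder. The comment repeats the caveat from the text: mounting the socket effectively grants the container root access to the host.

```python
import docker

# Talk to the Docker daemon (this works inside a Producer/Consumer container
# if /var/run/docker.sock has been mounted into it)
client = docker.from_env()

# Spin up a plugin container and hand it the host's Docker socket as well.
# Warning: this gives the container root-level access to the host.
container = client.containers.run(
    "framework/filecam-plugin",  # placeholder plugin image name
    volumes={"/var/run/docker.sock": {"bind": "/var/run/docker.sock",
                                      "mode": "rw"}},
    detach=True,
)
print("started plugin container", container.short_id)
```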

V MOB DETECTION

A Dataset

Several publicly available datasets for thermal images exist [31-34]. None of these include large crowds of people, so a new dataset, called the Last Post dataset, was created. It consists of thermal video captured at the Last Post ceremony in Ypres, Belgium [35]. The videos were captured using the Flir One Pro thermal camera for Android [36], using the Iron colorscheme. Two main scenes are present in the dataset, depicted in Figure 5. Mobs are present in the thermal images but not in the visual images, due to the images being made on separate days. The images used for the experiment were manually annotated, outliers were removed and the dataset was randomly split in a training and a validation set.

Fig. 5. Last Post dataset main scenes: (a) thermal view of the square; (b) visual view of the square; (c) thermal view of the bridge; (d) visual view of the bridge.

B Model

Detecting and classifying objects of interest in images is known as the object detection problem in machine learning [37]. Several object detection algorithms and frameworks have been implemented in the past years. A distinction is made between traditional models [31, 38-40], deep learning two-stage networks [41-46] and deep learning dense networks [47-49]. The traditional and two-stage methods make predictions relatively slowly (order of seconds on GPU) when compared to the dense networks (order of milliseconds on GPU) [47]. Since the goal is to use the framework in real-time use cases, the latter is preferred. The YOLOv3 model is selected, as it achieves state of the art prediction performance, can make real-time predictions and is available via the open source neural network framework darknet [50, 51]. The model is pre-trained on the ImageNet dataset [52]. The model is trained on an NVIDIA GeForce GTX 980 GPU and optimizes the SSE loss using batch gradient descent [50]. To select the best weights, the average Intersection over Union (IoU) and mean Average Precision (mAP) [53] are calculated on predictions on the validation set. The weights that achieve the highest mAP are selected as the final weights.
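For reference, the IoU of a predicted and a ground-truth bounding box can be computed as in the sketch below, assuming boxes given as (x1, y1, x2, y2) corner coordinates.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Example: two partially overlapping boxes
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 = 0.1428...
```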

VI RESULTS

A Framework

To evaluate the framework, acceptance tests for the requirements from Section III were conducted. Common framework operations, such as manipulating and building a stream, have an average execution time of 0.84 seconds with a standard deviation of 0.37 seconds. Less common operations, such as deactivating a plugin, starting up the framework and shutting down the framework, have average execution times of 3.58, 8.40 and 24.02 seconds respectively, with standard deviations of 4.67, 0.50 and 0.48 seconds. Deactivating plugins (STOP to INACTIVE transitions) takes a long time, as the container running the process needs to be removed. Real-time streaming could not be tested directly, due to the GStreamer framework having no readily available testing endpoints. However, when streaming and displaying a video with the framework, human users could not differentiate between the streamed video and a video played using a native media player, making it plausible that the framework streams in real-time. Great care must be taken when building plugins, as their processing speed has a direct impact on the real-time streaming performance. Interoperability is achieved with the REST APIs and plugin model presented in Section III-C. The interoperability is tested by having the framework exchange information with a mock plugin implementing the specified interface and counting the number of correct exchanges. The average successful exchange ratio is 99.998%. The framework can install and detect new plugins at runtime, achieving runtime modifiability at plugin level. Different deployment schemes were not tested for the prototype.

B Mob detection

The weights generated at the 15700th training epoch achieved the highest mAP value, 90.52%, on the validation set. For comparison, the performance of other models on benchmark datasets achieves an average mAP of 74.8% [54]. The reason the model achieves such high values is that the validation set is temporally correlated with the training set, as both sets are extracted from videos in which frames have a temporal correlation. Performance when predicting on new datasets will be worse. Figure 6 depicts some predictions of the model. When predicting on a video, the model generated predictions at an average frame rate of 55 frames per second on a GPU.

Fig. 6. Model predictions on the validation set

VII CONCLUSION AND FUTURE WORK

In this dissertation, a modifiable drone thermal imaging analysis framework is proposed to allow end-users to build flexible video processing pipelines using different thermal cameras and analysis modules. The framework implements a microservice container plugin architecture. Users can build image processing applications with the framework by building media streams using plugins that are either thermal cameras or image analysis software modules. The framework is evaluated by building a proof of concept implementation, which is tested against the initial requirements. The proposed framework achieves the modifiability and interoperability requirements at the cost of performance and security. The framework is applied for detecting large crowds of people (mobs) during open-air events. A new dataset containing thermal images of such mobs is presented, on which a YOLOv3 neural network is trained. The trained model is able to detect mobs on new thermal images in real-time, achieving frame rates of 55 frames per second when deployed on a modern GPU. Some extensions to this research are deploying a detection model using the framework, testing the other deployment configurations, testing the framework with end-users in practice and building new object detection models specifically for thermal images.

REFERENCES

[1] R. Gade and T. B. Moeslund, "Thermal cameras and applications: a survey," Machine Vision and Applications, vol. 25, pp. 245-262, 2014.
[2] M. C. Harvey, J. V. Rowland, and K. M. Luketina, "Drone with thermal infrared camera provides high resolution georeferenced imagery of the Waikite geothermal area, New Zealand," 2016.
[3] S. Amici, M. Turci, S. Giammanco, L. Spampinato, and F. Giulietti, "UAV Thermal Infrared Remote Sensing of an Italian Mud Volcano," vol. 2, pp. 358-364, 2013.
[4] J. Bendig, A. Bolten, and G. Bareth, "Introducing a low-cost mini-UAV for thermal- and multispectral-imaging," 2012.
[5] A. J. Rivera, A. D. Villalobos, J. C. Monje, J. A. Marinas, and C. M. Oppus, "Post-disaster rescue facility: Human detection and geolocation using aerial drones," IEEE Region 10 Annual International Conference, Proceedings/TENCON, pp. 384-386, 2017.
[6] P. Christiansen, K. A. Steen, R. N. Jørgensen, and H. Karstoft, "Automated detection and recognition of wildlife using thermal cameras," Sensors (Basel, Switzerland), vol. 14, pp. 13778-93, Jul. 2014.
[7] S. Chowdhury, A. Emelogu, M. Marufuzzaman, S. G. Nurre, and L. Bian, "Drones for disaster response and relief operations: A continuous approximation model," 2017.
[8] Workswell, "Pipeline inspection with thermal diagnostics," 2016.
[9] DJI, "Zenmuse H3 - 2D."
[10] Workswell, "Applications of WIRIS - Thermal vision system for drones."
[11] Therm-App, "Therm-App - Android-apps op Google Play," 2018.
[12] B. Satzger, W. Hummer, C. Inzinger, P. Leitner, and S. Dustdar, "Winds of change: From vendor lock-in to the meta cloud," IEEE Internet Computing, vol. 17, no. 1, pp. 69-73, 2013.
[13] J. Divya, "Drone Technology and Usage: Current Uses and Future Drone Technology," 2017.
[14] B. Steffen and A. Seyfried, "Methods for measuring pedestrian density, flow, speed and direction with minimal scatter," Physica A: Statistical Mechanics and its Applications, vol. 389, pp. 1902-1910, May 2010.
[15] M. Wirz, T. Franke, D. Roggen, E. Mitleton-Kelly, P. Lukowicz, and G. Tröster, "Inferring crowd conditions from pedestrians' location traces for real-time crowd monitoring during city-scale mass gatherings," Proceedings of the Workshop on Enabling Technologies: Infrastructure for Collaborative Enterprises, WETICE, pp. 367-372, 2012.
[16] L.-L. Slattery, "DroneSAR wants to turn drones into search-and-rescue heroes," 2017.
[17] Amazon Web Services, Inc., "What Is Amazon Kinesis Video Streams?," 2018.
[18] T. Goedemé, "Projectresultaten VLAIO TETRA-project," tech. rep., KU Leuven, Louvain, 2017.
[19] L. Bass, P. Clements, and R. Kazman, Software Architecture in Practice. Addison-Wesley Professional, 3rd ed., 2012.
[20] M. Richards, Software Architecture Patterns. O'Reilly Media, first ed., 2015.
[21] C. Richardson, "Microservice Architecture pattern," 2017.
[22] C. De La Torre, C. Maddock, J. Hampton, P. Kulikov, and M. Jones, "Communication in a microservice architecture," 2017.
[23] On-Net Surveillance Systems Inc., "MJPEG vs MPEG4: Understanding the differences, advantages and disadvantages of each compression technique," 2006.
[24] D. Bull, Communicating Pictures: A Course in Image and Video Coding. Elsevier Science, 2014.
[25] Docker Inc., "Docker - Build, Ship and Run Any App, Anywhere," 2018.
[26] D. Merkel, "Docker: Lightweight Linux Containers for Consistent Development and Deployment," 2014.
[27] A. Ronacher, "Welcome to Flask - Flask Documentation (0.12)," 2017.
[28] Lvh, "Don't expose the Docker socket (not even to a container)," 2015.
[29] R. Yasrab, "Mitigating Docker Security Issues," tech. rep., University of Science and Technology of China, Hefei.
[30] GStreamer, "GStreamer: open source multimedia framework," 2018.
[31] J. W. Davis and M. A. Keck, "A Two-Stage Template Approach to Person Detection in Thermal Imagery," Proc. Workshop on Applications of Computer Vision, 2005.
[32] S. Hwang, J. Park, N. Kim, Y. Choi, and I. S. Kweon, "Multispectral Pedestrian Detection: Benchmark Dataset and Baseline," CVPR, 2015.
[33] Z. Wu, N. Fuller, D. Theriault, and M. Betke, "A Thermal Infrared Video Benchmark for Visual Analysis," IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2014.
[34] S. Z. Li, R. Chu, S. Liao, and L. Zhang, "Illumination Invariant Face Recognition Using Near-Infrared Images," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 4, pp. 627-639, 2007.
[35] Last Post Association, "Mission," 2018.
[36] FLIR, "FLIR One Pro."
[37] E. Alpaydin, Introduction to Machine Learning. MIT Press, 3rd ed., 2014.
[38] F. Xu, X. Liu, and K. Fujimura, "Pedestrian detection and tracking with night vision," IEEE Transactions on Intelligent Transportation Systems, vol. 6, no. 1, pp. 63-71, 2005.
[39] H. Nanda and L. Davis, "Probabilistic template based pedestrian detection in infrared videos," IEEE Intelligent Vehicles Symposium, Proceedings, vol. 1, pp. 15-20, 2003.
[40] R. Appel, S. Belongie, P. Perona, and P. Dollár, "Fast Feature Pyramids for Object Detection," PAMI, vol. 36, no. 8, pp. 1-14, 2014.
[41] J. R. R. Uijlings, K. E. A. Van De Sande, T. Gevers, and A. W. M. Smeulders, "Selective Search for Object Recognition," tech. rep., 2012.
[42] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Region-Based Convolutional Networks for Accurate Object Detection and Segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 1, pp. 142-158, 2014.
[43] R. Girshick, "Fast R-CNN," in Proceedings of the IEEE International Conference on Computer Vision, pp. 1440-1448, 2015.
[44] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137-1149, 2016.
[45] K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask R-CNN," arXiv, 2018.
[46] J. Dai, Y. Li, K. He, and J. Sun, "R-FCN: Object Detection via Region-based Fully Convolutional Networks," tech. rep., 2016.
[47] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," 2015.
[48] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, "SSD: Single Shot MultiBox Detector," arXiv, 2016.
[49] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, "Focal Loss for Dense Object Detection," arXiv, 2018.
[50] J. Redmon and A. Farhadi, "YOLOv3: An Incremental Improvement," arXiv, 2018.
[51] J. Redmon, "Darknet: Open source neural networks in C." http://pjreddiecom/darknet, 2013-2016.
[52] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A Large-Scale Hierarchical Image Database," in CVPR09, 2009.
[53] M. Everingham, S. M. A. Eslami, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, "The Pascal Visual Object Classes Challenge: A Retrospective," International Journal of Computer Vision, vol. 111, no. 1, pp. 98-136, 2014.
[54] A. Ouaknine, "Review of Deep Learning Algorithms for Object Detection," 2018.


Contents

1 Introduction
  1.1 Drones
  1.2 Concepts
    1.2.1 Thermal Cameras
    1.2.2 Aerial thermal imaging
  1.3 Problem statement
    1.3.1 Industry adoption
    1.3.2 Crowd monitoring
    1.3.3 Goal
    1.3.4 Related work
  1.4 Outline
2 System Design
  2.1 Requirements analysis
    2.1.1 Functional requirements
    2.1.2 Non-functional requirements
  2.2 Patterns and tactics
    2.2.1 Layers
    2.2.2 Event-driven architecture
    2.2.3 Microkernel
    2.2.4 Microservices
    2.2.5 Comparison of patterns
  2.3 Software architecture
    2.3.1 Static view
    2.3.2 Dynamic views
    2.3.3 Deployment views
3 State of the art and technology choice
  3.1 Thermal camera options
    3.1.1 Parameters
    3.1.2 Comparative analysis
  3.2 Microservices frameworks
    3.2.1 Flask
    3.2.2 Falcon
    3.2.3 Nameko
    3.2.4 Vert.x
    3.2.5 Spring Boot
  3.3 Deployment framework
    3.3.1 Containers
    3.3.2 LXC
    3.3.3 Docker
    3.3.4 rkt
  3.4 Object detection algorithms and frameworks
    3.4.1 Traditional approaches
    3.4.2 Deep learning
    3.4.3 Frameworks
  3.5 Technology choice
    3.5.1 Thermal camera
    3.5.2 Microservices framework
    3.5.3 Deployment framework
    3.5.4 Object detection
4 Proof of Concept implementation
  4.1 Goals and scope of prototype
  4.2 Overview of prototype
    4.2.1 General overview
    4.2.2 Client interface
    4.2.3 Stream
    4.2.4 Producer and Consumer
    4.2.5 Implemented plugins
  4.3 Limitations and issues
    4.3.1 Single client
    4.3.2 Timeouts
    4.3.3 Exception handling and testing
    4.3.4 Docker security issues
    4.3.5 Docker bridge network
    4.3.6 Single stream
    4.3.7 Number of containers per plugin
5 Mob detection experiment
  5.1 Last Post thermal dataset
    5.1.1 Last Post ceremony
    5.1.2 Dataset description
  5.2 Object detection experiment
    5.2.1 Preprocessing
    5.2.2 Training
6 Results and evaluation
  6.1 Framework results
    6.1.1 Performance evaluation
    6.1.2 Interoperability evaluation
    6.1.3 Modifiability evaluation
  6.2 Mob detection experiment results
    6.2.1 Training results
    6.2.2 Metrics
    6.2.3 Validation results
7 Conclusion and future work
  7.1 Conclusion
  7.2 Future work
    7.2.1 Security
    7.2.2 Implementing a detection plugin
    7.2.3 Different deployment configurations
    7.2.4 Multiple streams with different layouts
    7.2.5 Implementing the plugin distribution service (Remote Producer/Consumer)
    7.2.6 Using high performance microservices backbone frameworks
    7.2.7 New object detection models and datasets specifically for thermal images
A Firefighting department email conversations
  A.1 General email sent to Firefighting departments
  A.2 Conversation with Firefighting department of Antwerp, Belgium
  A.3 Conversation with Firefighting department of Ostend, Belgium
  A.4 Conversation with Firefighting department of Courtrai, Belgium
  A.5 Conversation with Firefighting department of Ghent, Belgium
B Thermal camera specifications
C Last Post thermal dataset summary
  C.1 24th of March 2018
  C.2 2nd of April 2018
  C.3 3rd of April 2018
  C.4 4th of April 2018
  C.5 5th of April 2018
  C.6 9th of April 2018
  C.7 10th of April 2018
  C.8 11th of April 2018
  C.9 12th of April 2018


List of Figures

2.1 Use case diagram
2.2 Overview of the framework software architecture
2.3 Framework network topology
2.4 Client Interface detailed view
2.5 Stream detailed view
2.6 Stream model
2.7 Plugin model
2.8 Plugin state transition diagram
2.9 Component-connector diagrams of the Producer and Consumer module
2.10 Producer and Consumer Distribution component-connector diagrams
2.11 Add plugin sequence diagram
2.12 Link plugins sequence diagram
2.13 Deployment diagrams
3.1 Thermal image and MSX image of a dog
3.3 Rethink IT: Most used tools and frameworks for microservices, results [54]
3.4 Containers compared to virtual machines [66]
4.1 filecam GStreamer pipeline
4.2 local plugin GStreamer pipeline
5.1 Last Post ceremony panorama
5.2 Last Post filming locations
5.3 Main scenes in the Last Post dataset
5.4 Outliers
6.1 Average training loss per epoch
6.2 Validation metrics per epoch
6.3 Predictions of the model on images in the validation set
7.1 GStreamer pipeline for a plugin with a detection model


List of Tables

2.1 Performance utility tree
2.2 Interoperability utility tree
2.3 Modifiability utility tree
2.4 Usability utility tree
2.5 Security utility tree
2.6 Availability utility tree
2.7 Architecture pattern comparison
6.1 Acceptance tests results summary
6.2 Performance test statistics summary, measured in seconds
6.3 Resource usage of the framework in several conditions
6.4 Total size of framework components
6.5 Interoperability tests results (S: Source, L: Listener)
B.1 Compared cameras, their producing companies and their average retail price
B.2 Physical specifications
B.3 Image quality (IR: InfraRed, SD: Standard, FOV: Field of View)
B.4 Thermal precision
B.5 Interfaces
B.6 Energy consumption
B.7 Help and support
B.8 Auxiliary features


List of Listings

1 Minimal Flask application
2 Vert.x example
3 Spring Boot example
4 docker-compose.yml snippet of the prototype
5 Mounting the Docker socket on the container
6 Starting a plugin container
7 Dynamic linking of the decodebin and jpegenc


List of Abbreviations

ACF Aggregated Channel Features

AMQP Advanced Message Queuing Protocol

API Application Programming Interface

AS Availability Scenario

ASR Architecturally Significant Requirement

CLI Command Line Interface

CNN Convolutional Neural Networks

CRUD Create Read Update Destroy

DNS Domain Name System

FR Functional Requirement

GPU Graphical Processing Unit

H High

HTTP Hyper Text Transfer Protocol

ICF Integral Channel Features

IoU Intersection over Union

IS Interoperability Scenario

IT Interoperability Tactic

JVM Java Virtual Machine

L Low


LXC Linux Containers

M Medium

mAP mean Average Precision

MJPEG Motion-JPEG

MS Modifiability Scenario

MSX Multi Spectral Dynamic Imaging

MT Modifiability Tactic

NFR Non-Functional Requirement

ONNX Open Neural Network Exchange Format

OS Operating System

PS Performance Scenario

PT Performance Tactic

QAR Quality Attribute Requirement

REST Representational State Transfer

RNN Recurrent Neural Network

RPN Region Proposal Network

RTP Real-time Transport Protocol

SS Security Scenario

SSE Sum of Squared Errors

SVM Support Vector Machine

TCP Transmission Control Protocol

UDP User Datagram Protocol

UI User Interface

US Usability Scenario

YOLO You Only Look Once


Chapter 1

Introduction

Throughout history, having an overview of the environment from high viewpoints held many benefits. Early civilizations used hills to monitor their surroundings and population, and to spot possible attackers. The discovery of flight meant that environments could now be viewed from a bird's-eye view, offering even more visibility and revealing much more of the world below. Recently, a much smaller type of aircraft was developed: the drone. Ranging from large plane-like to almost insect-like devices, and having a wide variety of uses, drones are quickly taking over the sky. Drones would not be as effective without proper cameras providing a detailed view on the world below. With digital video cameras offering superb quality for steadily decreasing costs, almost every scene can be captured in great detail. However, these cameras are limited to the visible light spectrum, which hinders drones from operating in all circumstances, such as nightly flights. Thermal cameras measure the emitted heat of a scene and can reveal information not visible to the eye, such as hidden persons or animals, pipelines, malfunctioning equipment, etc. The combination of these two technologies certainly holds many exciting opportunities for the future.

1.1 Drones

Drones are flying robots that can fly remotely or autonomously and don't carry a human operator. They can carry a variety of payloads: video cameras, delivery parcels, fluid containers, sensors, lights, but also lethal explosives [1].

Drones are classified in different categories based on varying parameters, such as physical characteristics (diameter, weight), aerial movement techniques, application domains, etc. Based on diameter, drones are classified as smart dust (1 mm to 0.25 cm), pico air vehicles (0.25 cm - 2.5 cm), nano air vehicles (2.5 cm - 15 cm), micro air vehicles (15 cm - 1 m), micro unmanned aerial vehicles (1 m - 2 m) and unmanned aerial vehicles (2 m and larger). Often depending on their diameter, the weight of these devices ranges from less than a gram up to more than 2000 kg. Drones have different flight techniques, such as propulsion engines with wings, rotors in various amounts, flapping wings and even balloons. They are used for all kinds of purposes, ranging from search and rescue missions and environmental protection to delivery, reconnaissance, etc. Hassanalian et al. provide an excellent overview of most types of drones [2].

Due to the increasing interest in commercial drone platforms [3], a variety of payloads were developed specifically tailored for these aerial robots, such as gimbals to mount action video cameras [4], gimbals for delivering packets [5] and thermal imaging platforms [6].

1.2 Concepts

1.2.1 Thermal Cameras

Thermal cameras are passive sensors that capture the infrared radiation emitted by all objects with a temperature above absolute zero (0 K). In contrast to visible light cameras, thermal cameras do not depend on an external energy source for visibility and colors of objects or scenes. This makes captured images independent of the illumination, colors, etc. Furthermore, images can be captured in the absence of visible light [7]. Originally, thermal camera technology was developed for night vision purposes for the military, and the devices were very expensive. Later, the technology was commercialized and has developed quickly over the last few decades, resulting in better and cheaper cameras [7]. This led to access for a broader public, and the technology is now introduced to a wide range of different applications, such as building inspection, gas detection, industrial appliances, medicinal science, agriculture, fire detection, surveillance, etc. [7]. Thermal cameras are now being mounted on drones to give an aerial thermal overview.

1.2.2 Aerial thermal imaging

Aerial thermal imaging is defined as the creation of thermal images using a flying device. This dissertation focuses on the usage of drones for aerial thermal imaging. There are many applications for aerial thermal imaging. Some examples are geography [8, 9], agriculture [10, 11], search and rescue operations [12], wildlife monitoring [13], forest monitoring [14, 15], disaster response [16], equipment and building maintenance [17-20], etc. In the past few years, several industry players have developed thermal cameras specifically aimed at these drone applications. Examples are FLIR [6], Workswell [21] and TEAX Technology [22].

1.3 Problem statement

1.3.1 Industry adoption

The implementation of thermal cameras on drone platforms faces some issues for wide adoption by the industry. Several vendors offer thermal camera products, some specifically designed for drone platforms, that often implement different image formats, color schemes and interfaces (e.g. [23-25]). This leads to issues if users want to modify their applications by changing the camera that is used, because the application must implement new software to interact with the camera, or when the camera is no longer supported by the vendor, leaving the application with outdated hardware and software. This leads to a problem called vendor lock-in, which makes customers dependent on a certain vendor, as they cannot switch product without incurring substantial costs, a problem already very tangible for cloud-based applications today [26].

Applications across various fields often have different functional and non-functional requirements. Some applications have hard real-time deadlines (such as firefighting, search and rescue, security, etc.) that must be respected; other applications require highly accurate predictions (e.g. person detection, agriculture, etc.). A single application domain can even have many different use cases.

Several firefighting departments in Belgium were contacted to get an overview of their needs for a drone thermal imaging application. It quickly became clear they had various detection problems, such as finding missing persons, locating hot explosives, measuring temperatures in silos, detecting invisible methane fires, etc. Equipment also wears down more quickly due to usage in harsh environments, such as fires in close proximity. A drone thermal application for them needs to be able to exchange functionality and hardware easily, and it must meet high performance constraints to deliver value. The email conversations can be read in Appendix A.

Other drone thermal imaging applications are mostly only used in the niche domain for which they were developed, because they aren't designed for flexibility [27]. These proprietary applications have some disadvantages: the development and support potentially have a large cost, vendor lock-in can occur when products are no longer supported, security issues could arise and customization is difficult [28, 29]. Applications could benefit from a backbone framework to aid in this modifiability/interoperability issue, aiding in developing end-to-end solutions connecting thermal cameras to various analysis/detection modules for various use cases.

1.3.2 Crowd monitoring

Festivals and other open air events are popular gatherings that attract many people. For every event organizer, it is important to ensure safety and avoid incidents. Large groups of people, so-called mobs, can create potentially dangerous situations through bottlenecks, blocking escape routes, etc. Therefore, having the ability to monitor crowds and predict their behavior is very important to avoid such scenarios. Data can be obtained by evaluating video footage from past comparable events or by real time video monitoring of current events [30]. By analyzing this footage, potentially dangerous situations can be avoided by acting on the mob formation, and safety regulations can be improved to help plan future events. Vision-based approaches face several limitations: mounted cameras cannot capture elements outside of their field of view, can't see in some conditions (for example during night time), and it is difficult to infer information from the raw footage [31].

Thermal cameras could help for crowd monitoring, because they can operate in any condition. Having precise and detailed object recognition for the images produced by these cameras is crucial to extract information correctly. In this context, classifying images is not satisfactory; localization of the objects contained within the images is needed. This problem is known as object detection [32]. There are several challenges for object detection in thermal images: the image quality is very low when compared to visible light images, there is a lack of color and texture information, temperature measures are relative measures, etc. This makes extracting discriminative information from these images difficult [33]. Most efforts towards object detection on thermal images have gone towards human detection. Most of the proposed algorithms focus on feature extraction using the Aggregated Channel Features technique and boosting algorithms for learning [33-35]. Novel approaches make use of so-called deep learning with neural networks, which achieve very good results given enough data [36].


1.3.3 Goal

The goal of the dissertation is to explore the requirements of the possible backbone framework suggested in Section 1.3.1 and its potential software architecture. The architecture is evaluated by building a proof of concept implementation of the framework and evaluating it against the proposed requirements. To verify its use in developing drone thermal imaging applications, the specific mob-detection use case is investigated.

1.3.4 Related work

The Irish start-up DroneSAR [37] developed a search-and-rescue (SAR) drone platform, allowing users to stream live images and video from a drone as it conducts a search for missing persons. The platform works with any camera, visual and thermal, but focuses on drones from vendor DJI, DroneSAR's industry partner. Amazon introduced the Amazon Kinesis Video Streams platform in January 2018 as a new service for the Amazon Web Services (AWS) cloud platform. It allows users to stream live video from devices to the AWS cloud and build applications for real-time video processing [38]. The VIPER project by EAVISE (KU Leuven) researched how thermal and visual video images could be used for real-time detection of persons using object detection algorithms such as deep learning [36]. The framework presented in this work combines elements from all three of these examples.

1.4 Outline

The remainder of this dissertation is organized as follows. Chapter 2 presents the requirements for the framework and the software architecture. Chapter 3 explores several state of the art technologies that can serve as backbone technologies for the framework. To test the viability of the software architecture, a prototype is implemented; Chapter 4 presents the different aspects of this prototype. Chapter 5 describes the experiment that is conducted to research the detection of mobs in thermal images. The results of both the framework and the detection experiment are presented and evaluated in Chapter 6. Finally, the conclusion and future research efforts are presented in Chapter 7.


Chapter 2

System Design

Finding out what users actually expect from a software system, and what makes it valuable for them, is of key importance for the success of that system. This chapter first explores the functional and non-functional requirements of the hypothetical framework suggested in Chapter 1, to find out what makes building the framework worthwhile. Well known architectural patterns enable certain software requirements very well and can be used for building the software architecture of the framework. The framework software architecture combines some of these patterns and is presented in several documents.

2.1 Requirements analysis

Requirements are the stated life-cycle customer needs and objectives for the system, and they relate to how well the system will work in its intended environment. They are those aspects of the framework that will provide value to the users.

2.1.1 Functional requirements

Functional requirements (FR) describe the necessary task, action or activity that must be accomplished by the system, often captured in use cases and/or user stories [39, 40]. Use cases provide a summary of the features described in the user stories. Several external people and/or systems, defined as actors, interact with the framework to achieve a certain goal [40]. Three actors are identified for the framework: an end-user, who uses the framework in order to build an image processing application for a specific use case, such as the ones described in Section 1.2.2; a camera developer, who creates support software for a specific thermal camera for the framework, so that the end-user can buy and use their product; and an analysis software developer, who creates analysis software for a specific use case (tracking objects, detecting objects, etc.), so that the end-user can use their software to build his¹ application. The camera and analysis software developers are generalized to an actor called plugin developer, who develops plugins to extend the functionality of the framework. These plugins are the building blocks with which the end-user can build image processing applications.

The general user scenario for the framework proceeds as follows. An end-user wants to build an image processing application, e.g. to detect fires in a landscape using a drone. He has a thermal camera for this and has read about hot-spot detection in video. The user looks for a plugin for the framework that can read video from his thermal camera, and for a plugin that does the hot-spot detection. If the user finds these plugins, he can add them to the framework and use them for the application he is building. He connects both plugins with the framework in a specific order to finish his application. For this simple example, the thermal camera plugin is connected to the hot-spot detection plugin, so that video coming from the thermal camera is transmitted to the detection plugin to find the fires in the landscape. The plugins in the application, and the specific order in which they are connected, are defined as a stream. This stream should be easily modifiable if additional or other functionalities are required. Suppose that the thermal camera delivers very high quality video and the detector cannot work with this, as it can only operate on low quality images. The end-user then searches for a plugin that scales the high quality video down to a quality accepted by the detector. This plugin is placed in between the thermal camera and the detector, and the application works again. By continuously adding plugins to the framework, the number of possible applications that can be built with the framework increases, making the framework useable for more aerial thermal imaging use cases. A small illustrative sketch of such a stream is given below.

¹To avoid unnecessary gender-specific pronoun clutter, the male pronoun is used by default.
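To make the stream concept concrete, the self-contained sketch below models a camera-to-detector pipeline; the Plugin and Stream classes and all their names are purely illustrative assumptions, not the framework's actual interface.

```python
from dataclasses import dataclass, field

@dataclass
class Plugin:
    name: str

@dataclass
class Stream:
    plugins: list = field(default_factory=list)
    links: list = field(default_factory=list)

    def add(self, *plugins):
        self.plugins.extend(plugins)

    def link(self, source, listener):
        # Video flows from 'source' to 'listener'
        self.links.append((source.name, listener.name))

camera = Plugin("thermal-camera")      # Producer plugin
scaler = Plugin("video-downscaler")    # intermediate Consumer plugin
detector = Plugin("hotspot-detector")  # analysis Consumer plugin

stream = Stream()
stream.add(camera, scaler, detector)
stream.link(camera, scaler)    # scale the high quality video down
stream.link(scaler, detector)  # feed the scaled video to the detector
print(stream.links)
```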

Instead of developing the application from scratch, users can use the already implemented plugins to build the applications in an ad hoc fashion. Because of this, the development time for such applications can be reduced, and users can switch hardware and/or algorithms easily. The FRs are summarized in a use case diagram that connects each actor with their respective requirements and the relationships among them [40], depicted in Figure 2.1. Trivial functionalities, such as launching and shutting down the framework, are omitted. The red use cases represent use cases to extend the functionality of the framework, the blue use cases represent use cases for building streams, and the white use cases modify the media processing of the stream. Some use cases depend on others: the blue and white use cases work with plugins of the framework, so their prerequisite use case is "Add plugin", as a plugin must be a part of the framework for a user to use it; the "(Un)Link plugins" and "Stop/Pause/Play stream" use cases depend on "Add plugins to stream", as a stream must contain plugins before they can be manipulated.

2.1.2 Non-functional requirements

A non-functional requirement (NFR) specifies how the framework is supposed to be, or in what manner it should execute its functionality [41]. These qualifications typically cover business and system quality requirements. A distinction is made between quality attribute requirements (QAR) and constraints. QARs are qualifications of the FRs or of the overall product, e.g. how fast a certain function must be executed or how resilient it must be to erroneous input. They are closely related to business requirements, which are specifications that, once delivered, provide value to the actors [40]. The QARs are captured in a utility tree [40] that has a root node representing the system. This root node is elaborated by listing the major QARs that the system is required to exhibit, and each QAR is subdivided into more specific QARs. To make the specific requirements unambiguous and testable, a scenario for the system or a specific function is written, and the scenarios are evaluated against business value and architectural impact [40]. A QAR can have High (H), Medium (M) or Low (L) business value and architectural impact. The business value is defined as the value for the end user if the QAR is enabled: High designates a must-have requirement, Medium a requirement which is important but would not lead to project failure, and Low a nice-to-have QAR that is not worth much effort. Architectural impact defines how much the architecture must be designed towards the QAR to enable it: High means that meeting the QAR will profoundly affect the architecture, Medium that it will somewhat affect the architecture, and Low that it will have little effect on the architecture. The following QARs are discussed: performance, interoperability, modifiability, usability, security and availability.


Figure 2.1: Use case diagram


architecture The following QARs are discussed performance interoperability modifiability usability security and availability

Performance

Performance indicates the framework's ability to meet timing requirements. It characterizes the events that can occur and the framework's time-based response to those events. Latency is defined as the time between the arrival of the stimulus and the system's response to it [40]. The system latency is the latency between the initialization of an action sequence and the first change to the system noticeable by the user. Streaming latency is defined as the time between the arrival of a video frame and the arrival of the next video frame. The jitter of the response is the allowable variation in latency. Scalability is the number of users that can use the framework at the same time. The utility tree is presented in Table 2.1. The software industry has not defined a quantified 'good latency' for end-users, but a 4 second latency rule is often used as a rule of thumb [42]. The average response time for general framework commands should then be less than 2 seconds, with a standard deviation of 1 second, ensuring most execution times respect the 4 second bound. As stated in Chapter 1, some use cases, such as fire fighting, require real-time video streaming. The notion of low latency real-time video loosely defines that video should be streamed almost simultaneously: if a camera is filming and a human user does not notice a latency between the video of the camera and the real world, the video stream is considered real-time. Real-time is thus a human time perception, and for visual inputs this bound is as low as 13 milliseconds: anything above 13 milliseconds becomes noticeable, and anything above 100 milliseconds hinders human performance [43, 44]. However, the framework focusses on the use of thermal cameras, most of which don't produce frames faster than 8 frames per second, or 125 milliseconds per frame (see Section 3.1). More expensive cameras can shoot at 25 frames per second, corresponding to a latency of 40 milliseconds, and this bound is selected for the streaming latency, with a standard deviation of 20 milliseconds, remaining below the frame rate of less expensive cameras. The number of users that can use the framework at the same time is assumed to be low, as current aerial thermal image applications are operated by only one user or a few. The assumption is that a maximum of five users can use the framework at the same time. All of these requirements are quantified as relatively 'good' values; these bounds should be evaluated for user satisfaction by having users use a prototype of the framework in practice.

Attribute refinement | Id | Quality attribute scenario
Latency | PS-1 | The average execution time of all framework commands does not exceed 2 seconds (H, M)
Latency | PS-2 | A playing stream should have an upper limit of 40 ms streaming latency (H, H)
Jitter | PS-3 | The average standard deviation of the execution time of all framework commands should not exceed 1 second under normal operation (H, M)
Jitter | PS-4 | The average standard deviation in streaming latency should not exceed 20 ms under normal operation (H, H)
Scalability | PS-5 | The system should be usable by five users at the same time (M, M)

Table 2.1: Performance utility tree

Interoperability

Interoperability is the degree to which two or more independently developed systems can usefully exchange meaningful information via interfaces in a particular context [40]. The framework will interoperate with cameras and analysis modules via the framework plugins. Henceforth, the camera and analysis plugins will be referred to as a Producer plugin and a Consumer plugin: a Producer plugin is a plugin that represents a camera that produces video, and a Consumer plugin a plugin that represents a module that processes or consumes video. The framework will thus interact with the Producer and Consumer plugins, with which it exchanges requests to link them together, control their media process, etc. The more correct exchanges there are between the two, the better the user can use the plugin for building applications with the framework. This QAR is quantified by the ratio of requests made by the framework that were correctly interpreted by the plugin (successful requests) to the total number of requests during a runtime of the framework [40]. Intuitively, one argues that the framework must achieve perfect interoperability, with a perfect exchange success rate of 100%. Reality, however, tends to not agree with perfection, and it can never be guaranteed that exchanges will always be correct. Therefore, it is better to aim for a good interoperability measure and prepare for failed exchanges, instead of naively assuming the framework will be perfect. An exchange success rate of 99.99% is selected; the motivation for this bound is as follows. A plugin is assumed to be always correct up to the first mistake, after which the plugin is faulty and the fault needs to be identified and it must be ensured that it won't occur again. An exchange success rate of 99.99% means that if 10000 plugins are installed and used by the framework, only one will fail during uptime. For one plugin during framework up time, the mean time between failures is then 10000 exchanges. It is suspected that this amount of exchanges is very high for normal framework use. Because the possibility of faulty exchanges is acknowledged, the framework will need to implement a fallback mechanism to compensate. The utility tree is presented in Table 2.2.

Attribute refinement | Id | Quality attribute scenario
Syntactic interoperability | IS-1 | The framework exchanges requests with a Producer plugin (known at runtime) with a success ratio of 99.99%; incorrect requests are undone by the framework and logged (H, H)
Syntactic interoperability | IS-2 | The framework exchanges requests with a Consumer plugin (known at runtime) with a success ratio of 99.99%; incorrect requests are undone by the framework and logged (H, H)

Table 2.2: Interoperability utility tree

Modifiability

Modifiability is the cost and risk of changing functionality of the system [40]. One of the most important values of the framework is modifiability of the supported thermal cameras and analysis modules. The framework needs to be extendable with new functionalities by enabling developers to add their support software in the form of a plugin. End-users should be able to modify the components that they use for their image processing applications easily and quickly, to allow for interchangeable hardware and software and to quickly set up new applications. Modifiability is defined in two environments: runtime, defined as the periods during which the system is up and running, and downtime, defined as the periods during which the system is not active. The utility tree is presented in Table 2.3.

To enable users to choose the extensions they need, the framework will need a distribution service that contains all plugins available for the framework, from which a user can select and install plugins for their local version of the framework. Adding new plugins to the distribution service should not affect versions of the framework installed by the user. When a user adds a plugin from the distribution to his version of the framework, the framework should only reload once before making the plugin usable, for user comfort. Deployability is defined as the different device configurations that specify how the framework can be deployed. If the framework can be deployed in different fashions, this can increase the value for the end-user. Consider a firefighting use case in which a forest fire is monitored on site. Computationally powerful devices might not be available on site, so moving some plugins processing media to a remote server or cloud could still allow usage of the framework. Perhaps the device processing the media is already remote, for example a drone on security patrol; in this case access via a remote device such as a smartphone is desirable. This leads to the deployment configurations described in the utility tree.

Attribute refinement | Id | Quality attribute scenario
Run time modifiability | MS-1 | Support for a new Producer plugin should be added to the distribution service within one day, without the framework having to restart. (H, H)
 | MS-2 | Support for a new Consumer plugin should be added to the distribution service within one day, without the framework having to restart. (H, H)
 | MS-3 | End-users should be able to extend their framework with new functionalities by installing new Consumer and Producer Plugins. (H, H)
 | MS-4 | End-users should be able to modify the plugins used to build their stream. (H, H)
Down time modifiability | MS-5 | New Producer plugins can be installed to the local framework at runtime; the framework can only reload once before the plugin is usable. (H, H)
 | MS-6 | New Consumer plugins can be installed to the local framework at runtime; the framework can only reload once before the plugin is usable. (H, H)
Deployability | MS-7 | The system should be deployable on a combination of a smartphone and cloud/remote server environment. (H, H)
 | MS-8 | The system should be deployable on a personal computer or laptop. (H, H)
 | MS-9 | The system should be deployable on a smartphone, laptop and cloud environment. (H, H)

Table 2.3: Modifiability utility tree

Usability

Usability indicates how easy it is for the user to accomplish a desired task and the kind of user support the system provides. Learnability indicates how easy it is for a user to gain knowledge on how to use the framework. Errors are the number of errors a user can make when trying to execute certain functions [40]. The utility tree is presented in Table 2.4.

Security

Security is a measure of the system's ability to protect data and information from unauthorized access while still providing access to users and systems that are authorized. An action taken against the system to cause it harm is called an attack. Security has three main characteristics. Confidentiality is the property that data or services are protected from unauthorized access. Integrity is the property that data or services are protected from unauthorized manipulation. Availability is the property of the system maintaining its functionality during an attack. Authentication verifies the identities of the parties of an interaction, checks if they are truly who they claim to be, and gives or revokes access [40]. Security is important for the framework if it is deployed on multiple devices that use a public network to communicate. The utility tree is presented in Table 2.5.

Availability

Availability in a general context (not only security) refers to how available the software is to carry out its functionality. Downtime is a measure of the time that the system is unavailable to carry out its functions. Availability is specified for the part of the framework that distributes the plugins. The utility tree is presented in Table 2.6.

Attribute refinement | Id | Quality attribute scenario
Downtime | AS-1 | The system should be up 99.5% per year. This means the system has an allowed scheduled downtime of 43 hours and 30 minutes per year for maintenance. (M, L)
 | AS-2 | The maximal duration of the interval during which the system is unavailable is 3 hours. (M, L)
Network | AS-3 | If there is no active network connection, the local device can be used for operation of the framework. (H, H)

Table 2.6: Availability utility tree


Attribute refinement | Id | Quality attribute scenario
Learnability | US-1 | A user should be able to learn how to build an image processing application in at most one hour. (H, L)
 | US-2 | An experienced developer should be able to start developing a Consumer plugin for the system within one day. (H, L)
 | US-3 | An experienced developer should be able to start developing a Producer plugin for the system within one day. (H, L)
Errors | US-4 | A user should not make more than 3 errors to build an image processing application. (H, L)

Table 2.4: Usability utility tree

Attribute refinement | Id | Quality attribute scenario
Confidentiality | SS-1 | Streams created by a user can only be accessed by that user and not by any other entity. (H, L)
Integrity | SS-2 | Streams can't be manipulated without authorization by the user that made the streams. (H, L)
Availability | SS-3 | During an attack the core functionality is still available to the user. (H, M)
Authentication | SS-4 | Users should authenticate with the system to perform functions. (H, L)
 | SS-5 | Developers should authenticate their plugins before adding them to the framework. (H, L)

Table 2.5: Security utility tree

Architecturally significant requirements

Architecturally Significant Requirements (ASR) are the requirements that are the most important to realize according to business value and that have the most impact on the architecture. From the utility trees and the measures of the quality attribute scenarios, the QARs modifiability, interoperability and performance are identified as ASRs.

2.2 Patterns and tactics

An architectural pattern is a package of design decisions that is found repeatedly in practice, has known properties that permit reuse, and describes a class of architectures. Architectural tactics are simpler than patterns, typically using just a single structure or computational mechanism; they are meant to address a single architectural force. Tactics are the 'building blocks' of design, and an architectural pattern typically comprises one or more tactics [40]. Based on the ASRs, several tactics are listed in Table 2.7 that are used for the base pattern selection. The explored patterns are layers, event-driven architecture, microkernel and microservices.

2.2.1 Layers

The layered pattern divides the software into units called layers that each perform a specific role within the application. Each layer is allowed to use the layer directly beneath it via its interface. Changes in one layer are isolated if the interfaces don't change, enabling MT-1, MT-2 and MT-5 [40]. While changes can be isolated to a layer, they remain difficult due to the monolithic nature of most implementations of this pattern. Layers contribute to a performance penalty due to the 'architecture sinkhole' phenomenon, in which requests simply propagate through layers for the sake of layers [45].

2.2.2 Event-driven architecture

This pattern consists of several event publishers that create events and event subscribers that process these events. The publishers and subscribers are decoupled by an event channel, to which the publishers publish events and which forwards them to the event subscribers. The subscribers should have a single purpose and execute asynchronously. Since the publishers and subscribers are single-purpose and completely decoupled from other components via the event channel, changes are isolated to one or a few components, enabling MT-1, MT-2, MT-4, MT-5 and MT-7. If the event channel adds a discovery mechanism, IT-1 can also be enabled. Overall the pattern is relatively easy to deploy due to the decoupled nature of the components. Performance in general can be very high through the asynchronous nature of the architecture, enabling PT-6 and PT-7. If the event channel is tweaked to contain extra functionality, PT-1, PT-3, PT-8, PT-9, PT-10 and PT-11 can be enabled as well. If the components have a limited event response, then PT-2 and PT-5 can also be enabled. Development can be somewhat complicated due to the asynchronous nature of the pattern [40, 45].

2.2.3 Microkernel

The microkernel pattern allows the addition of application features as plugins to the core application, providing extensibility as well as feature separation and isolation. The pattern consists of two components: a core system called the kernel, and plugins. The business logic is divided between independent plugins and the kernel. The kernel contains only the minimal functionality required to make the system operational. The plugins are standalone, independent components that contain specialized processing, additional features and custom code meant to enhance or extend the core system to produce additional business capabilities. In many implementations plugins are independently developed third-party modules. Changes can largely be isolated and implemented quickly through the loosely coupled plugins; all MTs can be enabled. Depending on how the pattern is implemented, the plugins can be dynamically added to the kernel at runtime. Via a resource discovery service in the kernel, the ITs can be enabled. In general most applications built using the microkernel pattern perform well, because applications can be customized and streamlined to only include the features that are needed [45].

2.2.4 Microservices

Microservices is an architectural pattern that structures an application as a collection of loosely coupled services that implement business capabilities. Each component of the pattern is deployed as a separate unit that can be deployed on one device or on multiple devices. The components can vary in granularity from a single module to a large portion of the application, and contain one or more modules that represent either a single-purpose function or an independent portion of a business application [45, 46]. Due to the separately deployed units, changes are isolated to individual components, enabling all MTs. Via service discovery mechanisms the ITs can also be enabled. The microservices pattern supports distributed deployment of the software across multiple devices by design. The pattern is not known to produce high-performance applications due to its distributed nature, which relies on communication via a network [45, 46].

2.2.5 Comparison of patterns

Table 2.7 summarizes the analysis of the patterns. A score is given based on how well the pattern enables each tactic. Low means that the pattern does not naturally enable the tactic. Medium indicates the pattern can be implemented with the tactic but does not include it itself. High means the tactic is enabled in the pattern. Excellent means that the tactic plays a key role in the pattern.

The microkernel pattern and the microservices pattern both enable most tactics. The microkernel pattern implements extensibility of the framework by design using plugins, which is the main idea of the framework, and is thus an excellent base pattern. Interoperability and deployability of these plugins can be ensured by the microservices pattern, as it designs the microservices to have well-defined interfaces for interoperability and allows the framework to be deployed in a distributed fashion. The architecture presented below is a combination of both the microkernel pattern and the microservices pattern.


Tactic | Layers | Event-driven | Microkernel | Microservices
MT-1 Split module | Medium | High | High | Excellent
MT-2 Increase semantic coherence | Medium | High | High | Excellent
MT-3 Encapsulate | Medium | High | High | Excellent
MT-4 Use an intermediary | Medium | High | High | Excellent
MT-5 Restrict dependencies | High | High | High | Excellent
MT-6 Anticipate expected changes | Low | High | Excellent | Excellent
MT-7 Abstract common services | Low | High | Excellent | Excellent
MT-8 Defer binding / Runtime registration | Low | Low | Medium | High
IT-1 Discover services | Low | Low | High | High
IT-2 Orchestrate interface | Low | Low | High | High
IT-3 Tailor interface | Low | Low | High | High
PT-1 Manage sampling rate | Low | High | High | Medium
PT-2 Limit event response | Low | High | High | Medium
PT-3 Prioritize events | Low | High | High | Medium
PT-4 Reduce overhead | Low | High | High | Low
PT-5 Bound execution time | Low | High | High | Medium
PT-6 Increase resource efficiency | Low | High | High | High
PT-7 Introduce concurrency | Low | High | Low | High
PT-8 Maintain copies of computation | Low | High | Low | High
PT-9 Load balancing | Low | High | Low | High
PT-10 Maintain multiple copies of data | Low | High | Low | High
PT-11 Bound queue sizes | Low | High | Low | Medium
PT-12 Schedule resources | Low | High | Low | Medium

Table 2.7: Comparison of how well the discussed patterns enable the tactics needed for the ASRs


2.3 Software architecture

The software architecture is documented in three categories of views: static views, dynamic views and deployment views. The static views comprise the different components of the system and their relationships among each other. The dynamic views describe the runtime behavior of the system. Finally, the deployment views provide different configurations describing how the system can be deployed on different devices [47].

2.3.1 Static view

Figure 2.2 presents an overview of the architecture using a component-connector UML diagram. Components are the boxes that represent different software entities that exist at runtime. The components have interfaces through which they interact with other components. These are indicated using the 'lollipop' notation, with the 'ball' representing the interface that a component provides and a socket indicating that another component is using this interface. The type of data exchanged is noted next to the interface. Multiple boxes indicate that multiple components of the same kind can exist at runtime [48].

Figure 2.2: Overview component-connector diagram of the architecture

The architecture consists of the following core components: Client Interface, Producer, Stream, Consumer, Producer Distribution, Consumer Distribution, Producer Plugin and Consumer Plugin. The clear components in Figure 2.2 form the core framework, which each user needs to install to use the framework. The colored components form a distribution service for framework plugins to extend the functionality; they are not installed with the core framework but run as remote instances with which the user can interact to extend his version of the core framework with new plugins. A user can use the framework via the Client Interface, building streams that are maintained in the Stream component. The Stream component makes requests to the Producer and Consumer components to activate and control the selected plugins to build the stream. Additional plugins can be added to the framework and are distributed via the Producer and Consumer Distribution components. The architecture implements a hybrid combination of the microservices and microkernel patterns. Each presented component is a microservice that implements its own interface to interact with other components. The Producer and Consumer components act as kernels in the microkernel pattern, with the Producer and Consumer Plugins acting as plugins. These patterns enable the tactics needed to meet the requirements presented in Section 2.1.

Communication protocol

To allow the microservices to communicate, a communication protocol must be designed. Communication protocols can roughly be classified in two categories: synchronous and asynchronous. Synchronous protocols block on requests, which means that the client waits for a response from the server and can only continue executing when a response is received. This makes a synchronous protocol inherently more reliable but also slower. An example synchronous protocol is the Hyper Text Transfer Protocol (HTTP). Asynchronous protocols just send messages and do not block on the response. This makes the protocol less reliable but also faster [49].

There are two types of traffic exchanged between microservices. First, there are the command requests that are exchanged between microservices to edit resources or change state. Second, there are the video frames that are exchanged between Producer and Consumer Plugins. Both types of traffic have different requirements. The commands must be communicated reliably and need to be executed once and only once. Reliability is more important than latency, so a synchronous protocol is preferred. Microservices traditionally implement the synchronous HTTP protocol with Representational State Transfer Application Programming Interfaces (REST APIs) that specify the application endpoints as textual resources [45]. This common protocol is used for the exchanged command requests in the framework.

The video frames need to be sent with low latency at a high frequency, but reliability is less important; an asynchronous protocol is preferred. For video streaming the Real-time Transport Protocol (RTP) running on top of the User Datagram Protocol (UDP) is selected, as it enables real-time transfer of data between processes [50]. RTP defines a standardized packet format to transmit video and audio over a network. It sequences each packet with a sequence number and a timestamp. This allows the application to detect missing packets and latencies in the network. UDP is a low-latency asynchronous transport protocol, as it doesn't guarantee packet delivery.

The recommended codec for transmitting video media is Motion-JPEG (MJPEG), which encodes video frames as separately encoded JPEG images. This makes analysis and processing in subsequent plugins easier, as only the received frame is needed to perform the analysis or processing. Other video compression formats such as MPEG-4 use key-frames and object-oriented differential compression. If a key-frame is received via the stream, the frame can be used as is; if a reference frame is received, the receiver needs to wait for the corresponding key-frame before it can construct the full video frame for analysis. This introduces extra complexity and lower quality detection, which is a clear trade-off against the quality and simplicity which MJPEG offers [51, 52].
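To make the combination concrete, the following Python sketch sends one JPEG-encoded frame prefixed with a minimal RTP header over UDP. It is a simplification, not the framework's implementation: a full sender would follow RFC 2435 (JPEG/RTP), which additionally fragments frames larger than the network MTU, and the file name and address below are placeholders.

import socket
import struct
import time

def send_mjpeg_frame(sock, addr, jpeg_bytes, seq, ssrc=0x12345678):
    # Minimal 12-byte RTP header (RFC 3550 layout): version 2, no padding,
    # no extension, no CSRCs; payload type 26 is JPEG.
    version_flags = 0x80
    payload_type = 26
    timestamp = int(time.time() * 90000) & 0xFFFFFFFF   # 90 kHz media clock
    header = struct.pack("!BBHII", version_flags, payload_type,
                         seq & 0xFFFF, timestamp, ssrc)
    sock.sendto(header + jpeg_bytes, addr)

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
with open("frame.jpg", "rb") as f:        # placeholder pre-encoded JPEG frame
    send_mjpeg_frame(sock, ("127.0.0.1", 5004), f.read(), seq=0)

The sequence number and timestamp in the header are exactly what lets a receiving plugin detect missing packets and network latency, as described above.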

Applying these protocols to the architecture results in the network topology depicted in Figure 2.3. The full lines represent communication via HTTP on top of the Transmission Control Protocol (TCP). The dashed lines represent the RTP protocol on top of the UDP protocol. The boxes represent the different microservice components of the framework.


Figure 2.3: Framework network topology. Each box is a microservice component of the framework. The full lines indicate communication over the HTTP/TCP protocol; the dashed lines indicate communication over the RTP/UDP protocol.

Client Interface

The Client Interface is the interface through which end-users interact with the framework. Figure 2.4 presents the detailed component-connector diagram. The Client Interface consists of a User Interface (UI) component and an API Gateway component. Devices can make requests to the Client Interface via the Client Requests interface provided by the API Gateway. The UI provides the UI Operation interface that is used by end-users to control the framework; this can be either a visual or a textual interface. The UI actions are translated to client requests that are forwarded to the API Gateway using the Client Requests interface. The API Gateway translates the client requests and forwards them to the other core framework components.

Figure 2.4: Client Interface detailed view

Stream

The Stream component maintains the logical representation of the streams built by the end-user for his image processing application. Figure 2.5 presents the detailed component-connector diagram.


Figure 2.5: Stream detailed view

It consists of an API, a Stream Manager and several Stream Model components. The API provides the Stream Commands interface used by the Client Interface to interact with the framework; it translates incoming requests to commands that the Stream Manager can then execute. Commands include creating a new stream, modifying the stream layout, modifying the stream state, etc. The Stream Manager creates and manages multiple streams, each represented by a Stream Model. The end-user thus builds Stream Models to create image processing applications. The Stream Model represents the logical model of these image processing application streams. As stated before, a stream consists of several plugins processing media, placed in some order and linked by the framework. Figure 2.6 illustrates this concept.

Figure 2.6: Logical model of a stream. The arrows represent the flow of media through the stream.

Logically, the Stream Model is represented as a tree with multiple roots and multiple leaves. The framework builds streams by initializing the needed plugins and connecting them in order. In the example Stream Model, plugins receive media from multiple source plugins and forward media to multiple targets. The Stream Model has a global state that represents the cumulative state of all plugins. To transition the global state from A to B, all plugins need to transition from A to B. This is done by first making the transition on the leaves of the Stream Model, after which the transition propagates to the root nodes. This ensures that no media is lost, because the plugins that transition first cannot process anything, as no media is being put into the tree yet.
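The leaf-first transition order amounts to a depth-first traversal of the stream graph in which a plugin only changes state after all of its listeners have. The sketch below illustrates the idea with a minimal, hypothetical PluginNode class; it is not the framework's actual implementation.

class PluginNode:
    """Minimal stand-in for a plugin in the Stream Model (illustrative only)."""
    def __init__(self, name, listeners=None):
        self.name = name
        self.listeners = listeners or []  # downstream plugins receiving our media
        self.state = "STOP"

    def transition(self, target_state):
        print(f"{self.name}: {self.state} -> {target_state}")
        self.state = target_state

def transition_stream(roots, target_state, visited=None):
    """Transition the whole stream leaf-first: each plugin changes state
    only after all of its listeners have."""
    if visited is None:
        visited = set()
    for node in roots:
        if node in visited:
            continue
        visited.add(node)
        transition_stream(node.listeners, target_state, visited)  # leaves first
        node.transition(target_state)

# Example: camera -> detector -> display
display = PluginNode("display")
detector = PluginNode("detector", [display])
camera = PluginNode("camera", [detector])
transition_stream([camera], "PLAY")  # prints display, detector, camera in order

The visited set makes the traversal safe when several sources share the same listener, which the tree-with-multiple-roots layout allows.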


Producer and Consumer plugins

A Plugin represents an independent media processing element, either of the Producer type (such as a thermal camera) or the Consumer type (such as an object detection software module). Plugins are deployed as standalone microservices providing a REST API that the framework uses to control the plugin. Figure 2.7 represents a general plugin model. A plugin receives media from other plugins, called its sources, processes this media, and forwards it to other plugins, called its listeners. A Producer plugin only has listeners; a Consumer plugin has both sources and listeners. Merging the media from multiple sources and forwarding the processed media to multiple listeners is the responsibility of the plugin.

Figure 2.7: Plugin model

The plugin REST API should at least provide a state resource representing how the plugin is processing media, a sources resource that represents the sources from which the plugin receives media to process, and a listeners resource that represents the listeners to which the plugin transmits the processed media. Only Consumers have both the sources and listeners resources, as Producer Plugins produce their own media and hence can only have listeners.

To indicate if and how the plugin is actively processing media, a finite state machine is implemented. The state transition diagram is presented in Figure 2.8. A plugin can be in four possible states: INACTIVE, STOP, PLAY and PAUSE. When a plugin is in the INACTIVE state, no active microservice is running the plugin. This is the initial state for all plugins of the framework. This state is only visible to the framework, as in this state the plugin is not instantiated. When a plugin is in the STOP state, the framework has instantiated a microservice running the plugin. The plugin is listening for commands on its API but is not processing any media. This state is visible to the plugin. In the PLAY state a plugin processes media received from its source(s), transmits processed media to its listener(s), and listens for commands. When in the PAUSE state, media processing is paused but media buffers are kept. This decreases the latency when the plugin transitions back to the PLAY state, since the plugin can continue processing from the point where it was paused. The difference with the STOP state is that when transitioning to the STOP state, the plugin clears its media buffers.

Figure 2.8: The state transition diagram for a plugin

The plugin starts in the INACTIVE state. When a microservice running the plugin is instantiated by the framework, the plugin initializes itself in the STOP state. From the STOP state the plugin can transition to the PLAY state to process media. This transition is only successful if sources and listeners are registered with the plugin. From the PLAY state a transition to both the STOP state and the PAUSE state can be made, which stops the processing of media and respectively drops or keeps the media buffers. The plugin cannot make multiple state transitions per command. When a transition is made to INACTIVE, the framework first transitions the plugin to the STOP state, after which the INACTIVE state can be reached.
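These rules can be captured in a small table-driven state machine. The following Python sketch mirrors the states and single-step transitions described here; the class and method names are hypothetical, not part of the framework.

class PluginStateMachine:
    # Allowed single-step transitions (Figure 2.8); a command may only
    # move the plugin one step at a time.
    TRANSITIONS = {
        "INACTIVE": {"STOP"},             # framework instantiates the microservice
        "STOP":     {"PLAY", "INACTIVE"},
        "PLAY":     {"STOP", "PAUSE"},
        "PAUSE":    {"PLAY"},
    }

    def __init__(self):
        self.state = "INACTIVE"

    def request(self, target, has_sources=True, has_listeners=True):
        if target not in self.TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {target}")
        if self.state == "STOP" and target == "PLAY" \
                and not (has_sources and has_listeners):
            raise ValueError("PLAY requires registered sources and listeners")
        if target == "STOP":
            self.clear_buffers()          # transitioning to STOP drops buffers
        self.state = target

    def clear_buffers(self):
        pass                              # placeholder: drop buffered frames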

A source/listener has the following fields: hostname, the hostname of the microservice running the plugin, and port, the port on which the source/listener is reachable.

On the sources and listeners resources, HTTP GET and POST methods must be provided: GET retrieves the sources/listeners and their details, POST adds a new source/listener to the plugin. Both resources additionally need to provide an individual endpoint per source/listener on which GET, PUT and DELETE must be provided, for individual manipulation of the source/listener: GET retrieves the details, PUT updates the fields, and DELETE removes the source/listener from the plugin.
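As an illustration, the listeners part of such a plugin API could be sketched in Flask as follows. This is a minimal sketch: the in-memory storage and port are assumptions, and a Consumer plugin would expose an analogous sources resource plus the state resource.

from flask import Flask, abort, jsonify, request

app = Flask(__name__)

listeners = {}   # id -> {"hostname": ..., "port": ...} (in-memory for the sketch)
next_id = 0

@app.route('/listeners', methods=['GET'])
def all_listeners():
    return jsonify(listeners)

@app.route('/listeners', methods=['POST'])
def add_listener():
    global next_id
    body = request.get_json()
    listeners[str(next_id)] = {"hostname": body["hostname"], "port": body["port"]}
    next_id += 1
    return jsonify(id=str(next_id - 1)), 201

@app.route('/listeners/<lid>', methods=['GET', 'PUT', 'DELETE'])
def one_listener(lid):
    if lid not in listeners:
        abort(404)
    if request.method == 'GET':
        return jsonify(listeners[lid])
    if request.method == 'PUT':
        listeners[lid].update(request.get_json())
        return jsonify(listeners[lid])
    del listeners[lid]                    # DELETE removes the listener
    return '', 204

if __name__ == '__main__':
    app.run(port=5000)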

Producer and Consumer

The Producer and Consumer components are responsible for interacting with and managing the Producer/Consumer plugins used in the streams. Figure 2.9 presents the component-connector diagrams of the Producer and Consumer components. Both components have a similar architecture but are kept separate, because their plugin models differ and because they are expected to often be deployed on different devices with specific hardware requirements: Producer Plugins could be deployed on the thermal camera itself, having a very specific operating system, whereas a Consumer plugin might need access to specific processors to speed up its execution.

The Producer and Consumer consist of the following components: API, Kernel, Plugin Model and Plugin Manager. The API translates requests coming from the Stream component into commands for the Kernel. The Kernel implements the core functionalities, such as activating (deploying) and deactivating plugins, managing their state and manipulating their resources. It creates a Plugin Model for each Plugin that the framework has installed. This model represents a plugin logically on framework level and keeps track of the Plugin resources. The Plugin Manager manages the plugins that were added to the framework, stored in the Plugin Directory. It manages the plugin installations, adding updates or installing additional plugins that can be retrieved from the Producer and Consumer Distribution components.


(a) Producer component-connector diagram (b) Consumer component-connector diagram
Figure 2.9: Component-connector diagrams of the Producer and Consumer modules

Producer and Consumer Distribution

The Producer and Consumer Distribution components are responsible for managing and maintaining the plugins for the framework. They act as online software repositories from which local versions of the framework can retrieve new plugins. The component-connector diagrams are presented in Figure 2.10. The Distribution components consist of the following subcomponents: API, Plugin Manager and Plugin Tester. Plugin developers make requests to the API, which translates these requests into Create, Read, Update, Destroy (CRUD) commands for the Plugin Manager. The Plugin Manager executes these commands on the Plugins that are kept in the Plugin Repository. The quality of the framework depends on the quality of the plugins that it offers; therefore plugins should be thoroughly tested before being added to the framework. The Plugin Tester component is responsible for this testing. Tests should include testing whether the plugin implements the Plugin Model correctly, whether the plugin meets the performance requirements, etc. When a plugin passes these tests, it is added to the Plugin Repository, so that end-users can install the plugin and use it for their applications.


(a) Producer Distribution (b) Consumer Distribution
Figure 2.10: Producer and Consumer Distribution component-connector diagrams

2.3.2 Dynamic views

Dynamic views depict the behavior of the system and complement the static views. They are documented using sequence diagrams that show an explicit sequence of messages between architecture elements to describe a use case [40]. Two key use cases are presented here: adding a plugin to the stream, and linking plugins to build the stream.

Add plugin to stream

Figure 2.11 presents the sequence diagram for adding a Producer plugin to the framework. The framework is assumed to be running, the user has created a stream S, and the Producer Plugin A is correctly installed. The end-user executes the command to add A to stream S on the Client Interface, which passes the command to the Stream component. The Stream component requests the creation of a microservice instance of A, which is created by the Producer Kernel. When the Producer Plugin is instantiated, the Producer Kernel creates a Plugin Model of A and adds it to its references, so that the instance of A can be reached for future commands. Afterwards the Stream Manager is informed of the success, upon which the Stream Manager can add A to the Stream Model, ready to be linked with other plugins. The user is notified of this success and can continue building. If A could not be instantiated (due to not being installed, not being installed correctly, etc.), A is marked as 'broken' and the user is notified that the action could not be completed. When the plugin is marked as 'broken' it can no longer be used and needs to be reinstalled. The sequence diagram for adding a Consumer Plugin is similar, but replaces the Producer components with the Consumer components.

Figure 2.11: Add a Producer Plugin to stream

Link plugins

Figure 2.12 presents the sequence diagram for linking two plugins in a stream. In the sequence diagram two Consumer Plugins A and B are linked; this extends to a Producer Plugin linking with a Consumer Plugin. The framework is assumed to be running, the user has created a stream S, and the plugins A and B have been instantiated and added to the stream. The end-user executes the command to link A and B in stream S on the Client Interface, which passes the command to the Stream component, which checks if the link is valid for the Stream Model S. Linking can only be done if the stream is in the STOP state and if the plugins are already in the stream. If the link is valid, the Stream Manager can begin linking the plugins. To link the plugins in the order A-B, A is added as a source for B and B is added as a listener for A. These subsequences are found in their corresponding frames in the diagram and are very similar. The Stream Manager makes the request to add the source/listener to the Kernel, which finds the corresponding plugin and makes the request on the corresponding Plugin Model. If the Plugin succeeded, the Plugin Model is updated and the Stream Manager is notified of this success. If both plugins have successfully set the source and listener, the Stream Model layout is updated with the link. Should the source/listener request fail for one of the plugins, the change is rolled back and the end-user is notified.
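The rollback behaviour can be sketched as follows, using the Python requests library against the plugin API described earlier. The base URLs, hostnames and ports are hypothetical, and the sketch assumes the POST response carries the id of the created resource, as in the Flask sketch above.

import requests

def link(a_base, b_base):
    """Link plugins in the order A-B: register A as a source of B, then
    B as a listener of A, rolling back on failure (illustrative sketch)."""
    resp = requests.post(f"{b_base}/sources",
                         json={"hostname": "plugin-a.local", "port": 5004})
    resp.raise_for_status()
    source_id = resp.json()["id"]
    try:
        resp = requests.post(f"{a_base}/listeners",
                             json={"hostname": "plugin-b.local", "port": 5004})
        resp.raise_for_status()
    except requests.RequestException:
        # Second request failed: undo the first so the link is all-or-nothing.
        requests.delete(f"{b_base}/sources/{source_id}")
        raise

# link("http://plugin-a.local:5000", "http://plugin-b.local:5000")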

2.3.3 Deployment views

The different deployment configurations are illustrated via deployment diagrams using the UML 2.5 deployment specification [48]. 'Host' specifies the device on which components are deployed. The 'microservice' indicates the isolated environment in which components are running. These isolated environments on the host are realized as software containers that enable portability of the components to other deployment configurations; this concept is further discussed in Section 3.3. The Producer and Consumer Distribution components were left out of the diagrams, as they are always deployed on a different host than the core framework. Two deployment configurations are presented: the local configuration, which deploys the components on a single device, and the distributed configuration, which deploys each component on a separate device. These configurations are presented in Figure 2.13.


Figure 2.12: Link two plugins in a stream. The 'format request' and 'translate request' actions in the API components have been omitted to reduce clutter in the diagram, but are executed by the API components.


Local configuration deployment

The local configuration deploys the framework on a single local device. The configuration is depicted in Figure 2.13a. Because the framework is deployed as a whole, it can operate offline. This configuration is useful for image processing applications that can't rely on a stable network connection, for example in remote locations or densely built-up areas. The components are still deployed as separate microservices, due to the architecture of the framework. This has an impact on the performance of the framework, because every interaction between components uses either the HTTP message protocol or the RTP protocol, which introduces extra overhead compared to direct invocation of commands.

Distributed configuration deployment

The distributed configuration deploys the framework on multiple devices. The components are distributed over these devices, made possible by the microservice isolation and communication protocols. This configuration is depicted in Figure 2.13b. Obviously, in this configuration each component of the framework must have a reliable network connection to communicate with the other framework components. This configuration could be used, for example, for a security application. The end-user has the Stream module running on a master node that controls several cameras. The end-user can configure his image processing application through the Client Interface running on his device, which communicates with the Stream module running on the master node. The master node can control each camera by communicating with the Producer component. If, for example, the security application requires autonomous detection of trespassing people, a computationally intensive task, the Consumer Plugins could need dedicated hardware that is only available on another device. The Consumer component can then be deployed on that dedicated device, and the Stream component can again communicate with it over the network. The success of this configuration depends on the availability and the capacity of the network. If the network fails, commands and media can't come through and the framework can no longer execute. Due to the distributed nature, performance will also be worse compared to the local configuration, because each request between the components travels over a network that can experience delays.


(a) Local configuration deployment diagram (b) Distributed configuration deployment diagram
Figure 2.13: Deployment diagrams


Chapter 3

State of the art and technology choice

To build and test a proof of concept implementation of the architecture presented in Chapter 2, several state of the art technologies can be used to support the framework. These are presented in Sections 3.1, 3.2, 3.3 and 3.4. For each category a choice is made that will serve as the basis for the implementation of the proof of concept discussed in Section 3.5. Readers already familiar with the presented technologies can safely skip ahead to Section 3.5.

3.1 Thermal camera options

This section provides an overview of some currently commercially available thermal cameras. The overview is not a complete overview of all products offered by all vendors. The data was gathered in September 2017, so some products may since have been discontinued and new products launched. Several parameters are collected for each product. Section 3.1.1 discusses why these parameters are important to assess the quality of a thermal camera. Section 3.1.2 aggregates these parameters and presents insights into the data. The full list of specifications can be found in Appendix B.

3.1.1 Parameters

The following parameters were considered for the comparison: price, physical specifications, image quality, thermal precision, interfaces, energy consumption, help and support, user experience, and auxiliary features.

Price

Thermal cameras are relatively expensive compared to visible light cameras. For example, a 20 megapixel (MP) visible light camera can cost as little as 100 euro, while thermal cameras, having a much lower image resolution, can cost as much as 15,000 euro. Prices for thermal cameras cover a very wide range, and budgets are limited in practice.


Physical specifications

Two specifications are considered: the weight of the camera and the dimensions of the camera. Drones have a limited carrying weight due to maximal carrying capacities, and battery life drains faster when carrying heavier loads. Lighter and smaller cameras are therefore preferred for usage with drones; these often offer lower image quality and fewer features than the heavier cameras.

Image quality

Image quality specifies how much information an image can possibly hold. It consists of the following parameters: resolution, capture frequency or frame rate, field of view, and radiometric information. Image resolution is the amount of detail an image holds. Higher resolution cameras can capture more details in a scene, resulting in a sharper image that holds more information. Due to more detail, smaller objects can also be seen, allowing scenes to be viewed from larger distances. Drones capture images from relatively large distances, so good resolutions are required for the images to be useful. Image resolution is measured in pixel density, presented as the product of the number of pixels in the width and height of the image. The highest resolution found for the compared cameras is 640 x 512 pixels. Some cameras offer a visual camera next to the thermal camera. This allows an overlay of the visual image and the thermal image, so-called Multi Spectral Dynamic Imaging (MSX). This creates artificially sharper images, because edges are more visible in the visual image. Figure 3.1 depicts a thermal-only image and an MSX image of a dog; the MSX image is clearly sharper. MSX is a lower-cost solution to produce sharper images compared to increasing the thermal resolution, as visible light cameras are less expensive [7].

(a) Thermal (b) MSX
Figure 3.1: Thermal image and MSX image of a dog

The capture frequency or frame rate dictates how many frames the camera can capture per second. Higher frequency cameras are able to track dynamic scenes better. The field of view is the angle through which the camera is sensitive to thermal radiation and determines the extent of the world that can be seen by the camera. Bigger fields of view can capture more of the environment in one image. Most cameras allow various lenses to be mounted onto the camera, which allows for greater flexibility in choosing the field of view. Radiometric image information is thermal information embedded with the infrared image that can be analyzed after recording. Radiometric information characterizes the distribution of the thermal radiation's power in space and specifies the temperature per pixel exactly. Regular thermal images use a relative scaling of temperatures that are mapped onto a colorspace, with some color being the hottest color in the image and another color the coldest. For example, in Figure 3.1a the Iron color scheme is used, which maps the cold regions of the image on blue color variants and the warmer regions on red and yellow variants. Radiometric information can give a very detailed description of the radiation pattern of a scene.

Thermal precision

Thermal precision specifies the temperature range, the sensitivity and the accuracy of the temperature measurements. The temperature range indicates the minimum and maximum temperatures a camera can detect. A larger temperature range comes with a trade-off in sensitivity and accuracy. Often cameras offer different modi of operation and operate using different intervals according to the accuracy needed in a scene. Sensitivity indicates the ability of the camera to record finer distinctions in temperature. Accuracy is the margin of error for temperature readings on the thermal camera. An accuracy of 5 degrees Celsius for small temperature ranges and 20 degrees Celsius for large temperature ranges is commonly found; the increase in error margin is a trade-off for the larger temperature interval. Objects emit infrared waves in various forms (due to black-body radiation [7]). To accurately compare temperatures, cameras often implement emissivity corrections that normalize the measurements.

Interfaces

Cameras can communicate with other devices via several interfaces during use. Cameras mounted on a drone cannot be accessed during flight and need these interfaces to transfer data. USB and HDMI are the most commonly found interfaces to connect the camera with an on-board processing unit, gimbal or battery. MAVLink [53] is a very lightweight, header-only message marshalling library for micro air vehicles such as drones. When a camera provides this interface, this allows for a very efficient communication scheme to control the camera remotely. Other interfaces include Bluetooth and Wi-Fi.

Energy consumption

A device mounted on a drone has a limited energy source at its disposal. The less energy the camera consumes, the longer the drone can operate. This can even lead to lighter batteries being used in-flight, reducing the carried weight and therefore also the energy consumption. Typically the energy consumption of a camera is much lower than that of the drone itself, so this is a minor specification. Input voltage and power consumption are specified.


Help and support

How the camera is supported by the company has a big impact on the ease of developing applications for the camera platform. User manuals, phone or email support and FAQs are very helpful. Should the camera malfunction, a product warranty is necessary to replace the broken product.

User experience

The user experience is another important factor, as there is a difference between the technical specifications and the actual experience of the user. The user experience is measured in a number of good and a number of bad reviews. Reviews are scored from zero to five stars, with zero being a very bad experience and five a very good experience. A good review is scored three or more stars, a bad review less than three stars.

Auxiliary features

Some cameras offer even more features than the ones mentioned above. These can be a connection with the Global Positioning System (GPS) to tag where images were captured, a software application to interact with the camera, analysis functionality, tracking, etc.

3.1.2 Comparative analysis

It can be seen that FLIR is the market leader in thermal solutions for drones: it offers the largest product line, and products from other companies often utilize one of its camera cores. Figure 3.2a plots the retail price against the thermal resolution. Cameras with high and low resolutions are found across all price ranges; clearly, other features determine the price of a thermal camera. A feature function is defined that maps the features of a thermal camera onto an integer (a sketch follows the list below). The function increments the integer if:

- The camera has MSX support
- The camera has a standard data format (not just an analog or digital signal)
- The camera offers radiometric information
- The image resolution is 640 x 512 pixels, the highest resolution found for these products
- The sensitivity is smaller than 100 mK
- The camera offers emissivity correction
- The camera offers a USB interface
- The camera offers a MAVLink interface
- The camera offers an HDMI interface
- The camera offers a Bluetooth connection
- The camera offers a Wi-Fi connection
- The camera offers GPS tagging
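As an illustration, the feature function can be sketched as a simple checklist over a camera record; the field names below are hypothetical stand-ins for the columns of the specification list in Appendix B.

def feature_points(cam):
    # cam is a dict of collected specifications; one point per feature above.
    checks = [
        cam.get("msx", False),
        cam.get("standard_data_format", False),
        cam.get("radiometric", False),
        cam.get("resolution_px", 0) >= 640 * 512,
        cam.get("sensitivity_mK", 1000) < 100,
        cam.get("emissivity_correction", False),
        cam.get("usb", False),
        cam.get("mavlink", False),
        cam.get("hdmi", False),
        cam.get("bluetooth", False),
        cam.get("wifi", False),
        cam.get("gps_tagging", False),
    ]
    return sum(bool(c) for c in checks)

print(feature_points({"msx": True, "usb": True, "resolution_px": 640 * 512}))  # 3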

Figure 3.2b plots these feature points versus the retail price. This gives a more log-like relationship: the features of a camera determine the price much more than just the image quality. For a price below 5000 euro, thermal cameras are found that implement most basic features. The price then increases rather fast for fewer added features; these are features like radiometry that require additional hardware, which greatly increases the price of the camera.

Figure 3.2: (a) Camera resolution compared to retail price; (b) camera feature points compared to price

3.2 Microservices frameworks

The architecture presented in Section 2.3 relies heavily on the microservices pattern. Therefore this section presents several microservices frameworks to support this architecture. Figure 3.3 depicts the results of the Rethink IT survey querying the most used frameworks for microservices by developers [54]. The most popular frameworks, Java EE and Spring Boot, are written in Java. The Java EE framework is more of a one-stop-shop framework, offering much more functionality than just a backbone microservices framework, and is therefore not considered. Spring Boot is clearly a very popular and mature framework, more streamlined for microservices. Vert.x is a more upcoming framework renowned for its performance, making it worthwhile to explore. Python is an upcoming language for web development and, because it is excellent for prototyping, several frameworks for this language are explored as well. The frameworks presented here are Vert.x version 3.5.1, Spring Boot version 2.0, Flask version 0.12, Falcon version 1.4.1 and Nameko version 2.9.0.

Figure 3.3: Rethink IT 'Most used tools and frameworks for microservices' results [54]

3.2.1 Flask

Flask is a micro web development framework for Python. The term 'micro' means that Flask aims to keep its core simple but extensible. Flask is an unopinionated framework, as it only provides a glue layer to build a REST API around the application. However, it provides a large list of extensions if extra functionality is required [55]. Starting a microservice is very simple, as illustrated in Listing 1. Flask uses the concept of Python decorators [56] to bind Python functions to a REST API. In Listing 1, for example, the function service_status() is linked to the '/' resource. When a user issues an HTTP GET request on this resource, the route() function on the app object is called by Flask. Because route() is a decorator for the service_status() function, service_status() is wrapped and passed to the route() function, so that when a user issues an HTTP GET request, the service_status() function that was passed gets called. This allows for an easy construction of the REST API: just decorate all the functions of the microservice with the correct Flask decorator.

from flask import Flask

app = Flask(__name__)

@app.route('/')
def service_status():
    return 'service status'

if __name__ == '__main__':
    app.run()

Listing 1: Minimal Flask application

Because Flask is a microframework, its memory footprint is small, with the binary file only being 535 KB large. It is in use by several large companies such as Netflix and Reddit [57]. In a production environment the default Flask web server is not sufficient, as it only serves one user at a time; for prototyping, however, it is an excellent framework [55].


3.2.2 Falcon

Falcon is a bare-metal Python web framework that differentiates itself in performance when compared to other frameworks. It targets microservices, being even more lightweight and faster than frameworks like Flask: in a benchmark test it achieves 27 times the speed of Flask [58]. The framework seems less mature and has not been adopted by many companies [59]. It is not considered for the prototype of the system, but could be used in production as it achieves better performance.

3.2.3 Nameko

Nameko is a framework specifically built for building microservices in Python. Next to offering a REST API, it also offers asynchronous events over the Advanced Message Queuing Protocol (AMQP). It is only meant to be used for microservices, not for web applications that serve content. It is a relatively young framework and is not backed by any major companies as of yet. It is, however, endorsed by the developer of the Flask framework [60].

3.2.4 Vert.x

Vert.x is a toolkit for building reactive applications on the Java Virtual Machine (JVM). This framework follows the reactive systems principles. These principles are used to achieve responsiveness and build systems that respond to requests in a timely fashion, even with failures or under load. To build such a system, reactive systems embrace a message-driven approach: all components interact using messages sent and received asynchronously. Reactive microservices built with Vert.x have the following characteristics: autonomy, asynchrony, resilience and elasticity. Vert.x is a toolkit and can be used as any other library, which makes it very flexible. It provides a large set of features: metrics, different programming languages, different protocols, templating, data access, cluster management, etc.

Vert.x embraces the asynchronous development model, which can be seen in Listing 2.

import io.vertx.core.AbstractVerticle;

public class Server extends AbstractVerticle {
    public void start() {
        vertx.createHttpServer().requestHandler(req -> {
            req.response()
                .putHeader("content-type", "text/plain")
                .end("Hello from Vert.x!");
        }).listen(8080);
    }
}

Listing 2: Vert.x example


The event which occurs is the HTTP request. On arrival of the event, the Handler is called and executed. The Handler is chained to a listen request and does not block the calling thread. The Handler is only notified when an event is ready to be processed or when the result of an asynchronous operation has been computed [61].

3.2.5 Spring Boot

Spring Boot is an opinionated Java framework for building microservices, based on the Spring dependency injection framework. It allows developers to create microservices with reduced boilerplate and configuration. For simple applications it provides a syntax similar to Flask in Python, using annotations for routing. An example is given in Listing 3. The framework handles most of the routing and request handling, but restricts the developer in application structure. The framework is not lightweight and performs less well than Vert.x [62].

@RestController
@RequestMapping("/api")
public class HelloRestController {

    @RequestMapping(method = RequestMethod.GET, value = "/hola",
                    produces = "text/plain")
    public String hello() {
        return "Hello Spring Boot!";
    }
}

Listing 3: Spring Boot example

3.3 Deployment framework

To meet the modifiability and interoperability requirements discussed in Section 2.1.2 and support the different deployment configurations in Section 2.3.3, Linux containers (LXC) are used. A container is a lightweight operating system running inside the host system, executing instructions native to the core CPU and eliminating the need for the instruction-level emulation that virtual machines use. Containers provide an identical, isolated runtime environment for processes without the overhead of virtualization. This makes them perfect for highly portable software, as only the container needs to be moved and it can directly be executed on any system supporting containers [63]. First, the concept of containers is introduced in Section 3.3.1. Then several container frameworks are presented in Sections 3.3.2, 3.3.3 and 3.3.4.

3.3.1 Containers

Containers sandbox processes from each other and are often described as the lightweight equivalent of virtual machines. The difference between a virtual machine and a container is the level of virtualization. Virtual machines virtualize at the hardware level, whereas containers do this at the operating system (OS) level. The achieved effect is similar, but there are significant differences. Containers make available protected portions of the OS and share its resources. Two containers running on one OS have their own OS abstraction layer and don't know they are running on the same host. This provides a significant difference in resource utilization. Virtual machines provide access to hardware only, so it is necessary to install an OS; as a result there are multiple OSs running, which gobble up resources. Containers piggyback on the running OS of the host environment. They merely execute in spaces that are isolated from each other and from certain parts of the OS. This allows for efficient resource utilization and for cheap creation and destruction of containers. Consequently, starting and stopping a container is equivalent to starting and stopping an application [64, 65]. This comparison is illustrated in Figure 3.4.

(a) Container stack (b) Virtual machine stack
Figure 3.4: Containers compared to virtual machines [66]

Containers offer several advantages over running a process directly on the system. Due to the OS virtualization of the containers, software is always deployed on the same operating system, defined in the container. This allows for a 'write once, run everywhere' scenario, which allows for portability of the system to a range of devices. Containers communicate with each other using protocols such as HTTP. This allows the processes in containers to be written in any programming language, using any external library that is needed. For the system this means that if the Producer and Consumer Plugins are packaged as containers, they can effectively be made in any available technology, greatly enhancing the extensibility of the system.

3.3.2 LXC

Linux containers (LXC) are the basis on top of which other container frameworks are built. LXC provides a normal OS environment similar to a VM, and containers in this framework behave almost identically to a VM: they can run multiple processes. LXC can be used directly, but offers only low-level functionality and can be difficult to set up [67].

3.3.3 Docker

Docker started as an open-source project at dotCloud in early 2013, as an extension of the technology the company had developed to run its cloud applications on thousands of servers [64]. Now Docker is a standalone, mature company providing a software container platform for the deployment of applications [66]. Docker provides two main services: a simple toolset and API for managing Linux containers, and a cloud platform which provides easy access to recipes for software containers created by other developers [68]. Docker is the container technology with the most public traction and is becoming the container standard at the time of writing, due to its functionality and very responsive community. It offers functionality to easily build and run containers, but also to manage them in large clusters. A design decision that limits Docker is that each container can only run one process at a time. Docker consists of a daemon that manages the containers and the Docker client, a REST client for the daemon's API Engine; should this client fail, dangling containers can arise [69].
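As an illustration of how cheap container management is, the following sketch uses the Docker SDK for Python to instantiate a hypothetical plugin container; the image name, container name and port mapping are assumptions, not part of the framework.

import docker  # Docker SDK for Python

client = docker.from_env()  # connects to the local Docker daemon

# Run a hypothetical Consumer Plugin image as an isolated microservice,
# mapping its REST API port onto the host.
container = client.containers.run(
    "framework/person-detector:latest",  # hypothetical plugin image
    detach=True,                         # return immediately; run in background
    ports={"5000/tcp": 8081},            # container port 5000 -> host port 8081
    name="consumer-person-detector",
)

print(container.status)
container.stop()    # stopping a container is as cheap as stopping a process
container.remove()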

3.3.4 rkt

CoreOS' rkt is an emerging container technology providing an API engine similar to the Docker API Engine that can run LXC containers as well as Docker containers. rkt focusses on security and standardization and is specifically designed to run in cloud environments. Unlike Docker, rkt does not use a daemon process with a REST client; the command line tool executes all the operations, which makes the framework more reliable. rkt is not as mature as Docker yet. It is portable to multiple Linux environments, but not yet to macOS and Windows [70].

3.4 Object detection algorithms and frameworks

As stated in Section 1.3.2, object detection is the computer vision task of detecting which objects are present in an image and where they are located. Several approaches to this problem have been proposed, some of which focus on thermal images. This section aims to give a small overview of the different existing techniques. For the technical details, the reader is referred to the respective articles on the algorithms.

3.4.1 Traditional approaches

Traditional approaches include hot-spot detection techniques and Adaptive Boosting (AdaBoost) with various feature extraction techniques, such as Aggregated Channel Features (ACF) and Integral Channel Features (ICF). These methods rely on clever feature engineering solutions that use domain knowledge or statistical insights to transform the raw dataset into a specific set of features in order to find patterns [32].

Hot-spot detection

Hot-spot techniques work on the assumption that people have an overall higher body temperature than most of the background in the thermal image. These techniques first select candidate objects: the hot-spots in the image. The hot-spots define the region on which a classifier is run, and are thus the localization step in the object detection problem. Afterwards, a classifier is trained on these candidates. Xu et al. used a Support Vector Machine (SVM) classifier to classify whether the hot-spot represented a pedestrian [71]; Nanda et al. used a Bayes classifier to classify the hot-spots [72]. These methods are not generally applicable, because people often are not the only hot-spots in thermal images.

AdaBoost

AdaBoost is a machine learning algorithm that combines the outputs of so-called weak learning algorithms (weak learners) into a weighted sum that forms the output of the boosted classifier. AdaBoost modifies the weak learners in favor of data points misclassified by previous classifiers [73].
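For reference, the boosted classifier described above is commonly written as follows (a standard formulation of the AdaBoost decision function, not quoted from [73]):

$$H(x) = \operatorname{sign}\Big(\sum_{t=1}^{T} \alpha_t h_t(x)\Big)$$

where the $h_t$ are the weak learners and the weights $\alpha_t$ are set according to each learner's error on the re-weighted training data.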

Viola and Jones built a detection algorithm that uses two consecutive frames of a video sequence and trains the AdaBoost classifier on both motion and appearance information [74]. Davis et al. use a two-stage approach that first performs a fast screening procedure with a generalized template, using a contour saliency map to locate potential person locations; any window located in the first phase is then forwarded to the AdaBoost algorithm to validate the presence of the person. Dollár et al. extracted features using different ICF and ACF [35]. ICF and ACF compute features by calculating several aggregations over the different channels of an image, such as gradients, gradient histograms and colors. Goedeme et al. expanded these detectors with extra thermal channels to achieve results comparable to those of Dollár et al., but for thermal images [36].

3.4.2 Deep learning

Over the past few decades, there has been a shift in proposed solution methods towards deep learning. Deep learning for object detection uses Convolutional Neural Networks (CNNs). CNNs are a specialized kind of neural network for processing data that has a known grid-like topology, such as images. CNNs generally consist of three steps: a convolution step that creates a feature map of a region of an image, a pooling step that summarizes the output of the convolution step, and finally a fully-connected network that learns from the features extracted in the previous steps [75]. The key difference is that these algorithms do the feature extraction in the convolutional layers and do not need feature engineering like the algorithms presented in Section 3.4.1. This requires considerably more computing power than the traditional methods; since deep learning made the shift to computing on Graphical Processing Units (GPUs), the computations became feasible and these models proved to achieve very good performance on various machine learning problems. Two model types are described: two-stage networks (R-CNN, R-FCN) that extract image regions first and make separate predictions on each region, and dense networks (YOLO, SSD, NASNet, RetinaNet) that operate on the image as a whole. A minimal sketch of the three-step CNN structure is given below.
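To make the three-step structure concrete, a minimal sketch is given below using the Keras API shipped with TensorFlow (all layer sizes and the input resolution are illustrative assumptions, not values from this work):

# Minimal CNN: convolution -> pooling -> fully-connected classifier.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(120, 160, 1)),  # convolution: build feature maps
    layers.MaxPooling2D((2, 2)),                                              # pooling: summarize the feature maps
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),                                      # fully-connected: learn from features
    layers.Dense(2, activation='softmax')                                     # e.g. mob / no mob
])
model.compile(optimizer='adam', loss='categorical_crossentropy')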

Region-based Convolutional Network (R-CNN)

R-CNN uses a selective search method to find objects, an alternative to an exhaustive search over the image. It initializes small regions in an image and merges them hierarchically; the detected regions are merged according to color spaces and other similarity metrics [76]. R-CNN combines this selective search with a CNN per region to identify the objects in these regions [77].


Fast(er) Region-based Convolutional Network (Fast(er) R-CNN)

Fast R-CNN was developed to reduce the time consumption related to the high number of models necessary to analyze the region proposals from the selective search method in R-CNN: instead of using a CNN for each region, a single CNN with multiple convolutional layers is used [78]. Faster R-CNN drops the region proposals detected with the selective search method (which is computationally expensive) and introduces the Region Proposal Network (RPN) to directly generate region proposals. This accelerates training and testing and improves performance [79]. Mask R-CNN is an extension of the Faster R-CNN model that adds a parallel branch to the bounding box detection to predict object masks, that is, the per-pixel segmentation of an object in the image [80].

Region-based Fully Convolutional Network (R-FCN)

R-FCN takes a more efficient approach to region detection: instead of applying a per-region subnetwork multiple times, R-FCN uses a fully convolutional network with computations shared across the entire image. This allows it to be compatible with multiple backbone networks, such as Residual Networks [81].

You Only Look Once (YOLO)

The previously discussed methods need to run the same computations on different parts of an image multiple times before generating a prediction, which makes them relatively slow. The YOLO model [82] was developed with the requirement to make predictions as fast as possible, trading off accuracy for speed, to move towards real-time object detection. YOLO directly predicts bounding boxes and class probabilities with a single CNN in a single evaluation, instead of first detecting object regions and predicting classes afterwards. This has some benefits over the other methods. YOLO is very fast compared to other methods, capable of processing images in real time, up to 155 frames per second for some variants. It also learns contextual information, because it trains on entire images instead of regions, and it generalizes better to other image types. All these benefits come at the cost of accuracy: YOLO struggles to precisely localize some objects, especially small ones. The subsequent versions of YOLO focus on delivering more accuracy; the algorithm is currently in its third version [83].

Single-Shot Detector (SSD)

SSD [84] is similar to YOLO and predicts all the bounding boxes and class probabilities in one single evaluation (single shot) using one CNN. The model takes an image as input, which passes through multiple convolutional layers. Compared to YOLO, SSD achieves higher accuracies by adding convolutional layers and including separate filters for detections at different aspect ratios.

Neural Architecture Search Net (NASNet)

NASNet takes a different approach and does not design the network architecture for the object detection beforehand, but instead trains a Recurrent Neural Network (RNN) to generate the model descriptions of the CNN that performs the object detection. The RNN is trained using reinforcement learning. The NASNets built for object detection perform as well as most networks, but are slower to train [85].

RetinaNet

RetinaNet is the latest state-of-the-art object detector. It is a simple dense detector, similar to YOLO and SSD, but matches the accuracy of two-stage detectors like the R-CNN variants. RetinaNet proposes that the foreground-background class imbalance encountered when training dense detectors leads to lower accuracy compared to two-stage detectors. RetinaNet uses a new method, called Focal Loss, that focuses training on a sparse set of examples to counter this class imbalance, which results in very good accuracy and very fast detection [86].

3.4.3 Frameworks

While the previous sections focused on different algorithms, actually implementing these algorithms is not straightforward. That is why, over the past years, several deep learning frameworks have emerged that try to provide easier access to this technology. Some frameworks provide APIs for some of the object detection algorithms presented above. This section gives a small overview of some frameworks. Most frameworks differ quite a bit from each other, which makes porting a model from one framework to another rather difficult. The Open Neural Network Exchange Format (ONNX) initiative hopes to propose a standard for interchangeable models, which should make switching between frameworks easier in the future [87]. Note that other frameworks are available, but those do not yet support object detection functions out of the box.

TensorFlow

Perhaps the most well-known framework, TensorFlow is an open-source machine learning library for neural networks with a Python interface. It was developed by Google for internal use and released to the public in 2015 [88]. Recently, an Object Detection API has been built for TensorFlow, which implements models pre-trained on benchmark datasets, such as SSD, Faster R-CNN, R-FCN and Mask R-CNN [89]. TensorFlow offers a lot of flexibility in its use and can be applied to many machine learning problems.

Darknet

Darknet is an open-source neural network framework written in C and CUDA. It is maintained by Joseph Redmon, the person behind the YOLO algorithm [90]. Darknet does not offer the flexibility that other frameworks offer, but is easy to install and use compared to others. Out of the box, Darknet offers an interface for YOLO. The open-source community offers some ports of this framework to other popular frameworks, such as TensorFlow.


CNTK

The Microsoft Cognitive Toolkit (CNTK) is an open-source toolkit for distributed deep learning. It offers a Python, C# or C++ interface. It is one of the first frameworks to support ONNX. CNTK offers an API for Fast R-CNN and Faster R-CNN [91].

3.5 Technology choice

This section presents the choices made for each technology described in the previous sections.

3.5.1 Thermal camera

The FLIR One Pro and Therm-App were selected as thermal cameras for the proof of concept. Both offer relatively high-quality images, 160 x 120 pixels and 320 x 240 pixels respectively. This is of course relative to their price, 469 and 937.31 euro respectively; these prices are at the low end of the product ranges offered. Both cameras are designed for use on a smartphone, which makes them ideal for prototyping, since these devices are widely available and setting up the camera via the apps from the respective companies is easy. Both cameras provide MPEG-4/H.264 encoded video output, easily understood by most playback software. Both cameras can be seen in the lower left of Figure 3.2b.

For deployment in production-ready applications with drones, these cameras are not the best choice: they aren't specifically designed to be used on a drone and don't offer the best image quality possible. In those applications, platforms like the FLIR Vue, Duo, Zenmuse or Workswell WIRIS are better candidates, due to their superior image quality, MAVLink interfaces, compatibility with commercially available gimbals to mount them on drones, and other features.

3.5.2 Microservices framework

Flask is selected as the microservices framework. The arguments for Flask are as follows. Flask is a mature web framework with major companies backing it; this means the APIs stay consistent and the framework is stable in use. Compared to some other frameworks, like Spring Boot, Flask is unopinionated, which allows for maximum flexibility during development. Flask also has a very small memory footprint, which makes it easier to deploy on less powerful on-board devices like drones. Flask is also easy to use and quick to set up, ideal for developing a proof of concept. A final argument is the familiarity of the author with Flask.
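As an illustration of how little code a Flask microservice needs, a minimal sketch is given below (the endpoint and returned payload are illustrative, not taken from the prototype):

# Minimal Flask microservice exposing one REST endpoint.
from flask import Flask, jsonify

app = Flask(__name__)

@app.route('/state', methods=['GET'])
def state():
    # Report a fixed state; a real plugin would return its actual state machine value.
    return jsonify({'state': 'PLAYING'})

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=80)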

3.5.3 Deployment framework

Docker is selected as the deployment framework. Docker is the most mature and well-supported container framework at the time of writing, and will likely remain important in the future. It offers the most features and is specifically designed for the microservices pattern [68].

3.5.4 Object detection

One of the requirements specified in Section 2.1 is real-time streaming. Real-time object detection is only achieved by a few of the models presented in Section 3.4; candidates are YOLO, SSD and RetinaNet. As there is, at the time of writing, no framework that provides an out-of-the-box implementation of the RetinaNet algorithm, this algorithm is not selected. SSD is implemented in the TensorFlow Object Detection API. However, at the time of writing, this API has not been found stable: when trying out the API, fallbacks to older versions of the software were needed to be able to test the models. This was due to the Object Detection API using older versions of the TensorFlow framework. Therefore, YOLO, implemented in the darknet framework, is selected. Darknet offers a stable distribution, and YOLO achieves good results and has proven to be a very fast detector, capable of real-time object detection.


Chapter 4

Proof of Concept implementation

To prove the concept of the architecture discussed in the previous chapters, a prototype is implemented. First, the goals and the scope of the prototype are presented in Section 4.1. Next, the components of the prototype are presented in Section 4.2. Finally, the known limitations and issues of the prototype are presented in Section 4.3.

4.1 Goals and scope of prototype

The goals of the prototype are to prove the QARs defined in Section 2.1. The prototype focuses on the ASRs: performance, interoperability and modifiability. The usability, security and availability requirements are left out of the scope of the prototype, because they are not ASRs and require significant resources (focus groups, long-term deployment, etc.) to test.

The components that are implemented in the prototype are the Client Interface, Stream, Consumer and Producer, because they represent the core functionality of the framework: building image processing application streams using plugins. The Producer and Consumer Distribution components enable third-party plugin developers to add their functionality to the framework; these are distribution functionalities, which are out of scope for the prototype. The prototype only supports one video stream. All functions presented in Figure 2.1 are implemented, with the exception of 'Install plugin', 'Uninstall plugin', 'Add plugin', 'View plugin', 'Remove plugin' and 'Update plugin', as they are only supported via the Producer and Consumer Distribution components. The prototype is deployed on a local device; distributed deployment configurations require small changes in the implementation (see Section 4.3).

4.2 Overview of prototype

4.2.1 General overview

The prototype consists of four main components: a cli, streamer, producer and consumer. The cli process is the Client Interface, implemented as a textual Command Line user Interface (CLI), which allows a user to interact with the prototype through textual commands in a shell. This process is deployed on the local machine. The streamer, producer and consumer processes are deployed as microservices in their own Docker containers. The prototype is initialized through the cli, which spins up the Docker containers of the other processes. This is achieved with the tool docker-compose: Compose is a tool for defining and running multi-container Docker applications. The compose YAML file defines the configurations for the microservices; Compose uses these configurations to start and stop the application with a single command [92]. A snippet of the compose file for the application is given in Listing 4. Containers are specified as services; the example service configuration given is that of the producer. First, the name of the container is specified, which overwrites the default name, as the container name is used as hostname for the container in Docker [93]. The build configuration specifies where the container build recipe is situated. The port mapping allows processes from the localhost to access processes in the container; for the producer service, this is only used for debugging. The volumes configuration specifies folders from the host to be mounted in the container. This configuration mounts in the source code and resources; it also provides access to the Docker socket, to allow interaction with the Docker host (see Section 4.2.4).

services:
  producer:
    container_name: producer
    build:
      context: ./producer
      dockerfile: Dockerfile
    ports:
      - "8080:80"
    volumes:
      - ./producer:/usr/producer
      - /var/run/docker.sock:/var/run/docker.sock

Listing 4: docker-compose.yml snippet of the prototype
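With this file in place, the whole application can be managed with the standard Compose commands, for example:

docker-compose up -d    # build (if needed) and start all services in the background
docker-compose down     # stop and remove the containers and the default network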

All containers are connected to a Docker bridge network [93] for communication. A bridge network uses a software bridge to allow connected containers to communicate, while providing isolation from containers which are not connected to that bridge network. The bridge network applies to containers running on the same Docker host; the network is thus confined to the local Docker host and is not distributed over different devices. The bridge network has some advantages:

• The bridge provides better isolation and interoperability between containers: containers automatically expose all ports to each other, and none to the outside world.
• The bridge provides automatic Domain Name System (DNS) resolution between containers. This means that containers resolve the IP address of each other by container name or alias.
• Containers can be attached to and detached from the networks on the fly.
• Environment variables are shared, which can be used to provide equal environment configurations for every container on the bridge.
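For reference, a user-defined bridge network like the one Compose creates for the prototype can also be created and inspected manually with the standard Docker CLI (the network name below matches the one used later in Listing 6; Compose normally creates it automatically):

docker network create -d bridge mosquito_default    # create a user-defined bridge network
docker network inspect mosquito_default             # list connected containers and their aliases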


4.2.2 Client interface

The Client Interface is implemented by the cli component. The cli is built in Python with the Click package by Armin Ronacher [94]. Click is a CLI creation kit which aims to make the implementation of CLIs easier. It resembles the Flask framework, as it also leverages Python decorators [56] for most of its functionality. The source code of the cli is located in the mosquito.py file. Commands can be executed by calling python mosquito.py, or by calling mosquito if the source code is installed into the Python environment. The following commands are implemented:

• mosquito: Displays a help page listing command groups.
• mosquito on: Starts the application.
• mosquito off: Shuts down the application.
• mosquito plugins: Groups all commands to manage plugins. Plugins can only be listed, not installed or uninstalled, as the Remote Producer and Remote Consumer are not implemented.
• mosquito plugins ls: Lists all locally installed plugins.
• mosquito stream: Groups all commands to manipulate the current stream.
• mosquito stream add: Adds a producer or consumer to the stream.
• mosquito stream delete: Deletes a producer or consumer from the stream.
• mosquito stream elements: Lists all producers and consumers that were added to the stream.
• mosquito stream link: Links two stream plugins.
• mosquito stream pause: Pauses the stream.
• mosquito stream play: Plays the stream; this means the stream is processing media.
• mosquito stream print: Prints the stream layout (which plugins are linked).
• mosquito stream stop: Stops the stream.
• mosquito stream view: Views the stream on the local device.

A typical use of the application would be the following. First, the application is started using mosquito on. Then, plugins are added to the stream using mosquito stream add [ELEMENT_TYPE] [ELEMENT]; this instantiates the corresponding plugins in the Producer and Consumer components. The plugins are linked in order using mosquito stream link [ELEMENT_1] [ELEMENT_2]. The stream is then set to play using mosquito stream play. When the last plugin is linked to the special local plugin, the user can view the output from that plugin using mosquito stream view, which opens up a window in which the stream is displayed. An example session is sketched below.
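An example session could look as follows (the element-type and plugin-name arguments shown are assumptions, based on the plugins described in Section 4.2.5):

$ mosquito on
$ mosquito stream add producer mycam
$ mosquito stream add consumer local
$ mosquito stream link mycam local
$ mosquito stream play
$ mosquito stream view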


As specified in the software architecture (see Section 2.3), the Client Interface can use the Stream Commands interface of the Stream component. As specified in Section 2.3.1, this interface is a REST API, so the client can use this interface through the HTTP protocol. This is done with the Python Requests library [95].

4.2.3 Stream

The Stream component is responsible for the logical representation of the stream (see Section 2.3.1) and is implemented as the streamer component. The component consists of three objects: api, which contains the REST API; StreamManager; and the Stream object, representing the Stream Model in the framework. Requests to the other microservices are sent using the Python Requests library. The prototype implementation only supports one stream with a chain-like model. This means that, unlike the stream depicted in Figure 2.6, a plugin can't have multiple sources or multiple listeners. The Stream object manages the logical representation of the stream and manipulates the references to the plugins by forwarding commands to the producer and consumer components respectively. It contains two data structures: outline, which is the logical structure of the stream, and elements, which contains all the plugins present in the stream. In the prototype, the Stream component provides the following functionalities on its API endpoints (an example client call is given after the list):

• /plugins: GET. Fetches all the plugins from the producer and consumer components and returns their information.
• /elements: GET, POST, DELETE. Resource to add and delete plugins from the elements bin.
• /stream/links: POST. Resource to create links between elements.
• /stream/state: GET, PUT. Resource to update the state.
• /shutdown: POST. Shuts down the framework.
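As an illustration, a client could exercise these endpoints with the Python Requests library as follows (the host, port and payload shapes are assumptions, not taken from the prototype):

# Hypothetical client calls against the streamer REST API.
import requests

BASE = 'http://localhost:8080'
print(requests.get(BASE + '/plugins').json())                                   # list available plugins
requests.post(BASE + '/elements', json={'type': 'producer', 'name': 'mycam'})   # add a plugin to the stream
requests.put(BASE + '/stream/state', json={'state': 'PLAY'})                    # start processing media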

Since the streamer component is the only component of the framework that interacts with outside users, it has the responsibility to gracefully shut down the framework. This is needed to solve the problem of dangling plugin containers: containers that run plugins that have not been stopped and removed after closing the application. Since only plugins that are contained in a stream have a running container associated with them, the stream can notify the Producer and Consumer components to stop and remove those containers.

4.2.4 Producer and Consumer

The Producer and Consumer components cover similar responsibilities in managing installed plugins. They are implemented in the producer and consumer components. Both components consist of the following objects: api, which contains the REST API; the Kernel, which implements the core functionalities; the PluginManager, which finds plugins installed on the device and checks if their installation is valid; and the Plugin, which is the logical representation of a plugin as described in Section 2.3.1. Commands to control the plugins are made using the Python Requests library.


For the component to be able to start, stop and interact with the plugin containers, it needs access to the Docker host and the Docker client running on that host. But because the component is running in its own container, it is isolated from the Docker host and can't interact with the Docker client by default. The workaround for this problem is to expose the socket on which the Docker client is running on the Docker host to the container. This is done by mounting the Docker socket of the host on the Docker socket in the container. In Docker Compose, the mounting is achieved as shown in Listing 5.

volumes:
  - /var/run/docker.sock:/var/run/docker.sock

Listing 5: Mounting the Docker socket on the container

This has some implications on security (see Section 4.3). To interact with the now exposed Docker client, the component uses the docker-py library [96]. Listing 6 shows how a connection is made to the Docker client and a plugin container is started. The container is started from the plugin image, on the network of the framework, and is given the plugin name as the container name. Docker thus creates a DNS entry with the plugin name, which makes the container addressable by its name. This implementation implies that there can only be one running container per plugin at any time in the current implementation.

import docker

client = docker.from_env()
container = client.containers.run(
    image=plugin_name,
    detach=True,
    name=plugin_name,
    network='mosquito_default'
)

Listing 6: Starting a plugin container
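The counterpart operations, stopping and removing a plugin container, can be sketched with the same library (this sketch is not quoted from the prototype code):

# Stop and remove a plugin container via docker-py;
# containers are addressable by their plugin name.
import docker

client = docker.from_env()
container = client.containers.get(plugin_name)
container.stop()
container.remove()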

When both components are initialized, the Kernel and PluginManager are created. The PluginManager searches for a plugin_directory, which contains information on which plugins are installed on the device. Each installed plugin should have a valid image on the device; these are contained in the images directory of the Docker daemon. If the image or information file cannot be found on the device, the plugin is marked as broken and can't be used by the framework. To describe the API, the consumer API is used; the producer API is analogous, but replaces consumer with producer and doesn't have the sources endpoints. The Producer and Consumer components provide the following functionalities on the API endpoints:

• /consumers: GET. Retrieves a list of the installed consumers on the device on which the component is running.
• /consumers/<hostname>: GET, DELETE. Retrieves the information of a consumer specified by the hostname value, which is the name of the consumer.
• /consumers/<hostname>/state: GET, PUT. Retrieves or updates, respectively, the state of a consumer specified by the hostname value.
• /consumers/<hostname>/sources: GET, POST. Retrieves the sources or adds a new source, respectively, to the consumer specified by the hostname value.
• /consumers/<hostname>/sources/<source_hostname>: GET, PUT, DELETE. Retrieves, updates or removes, respectively, the source specified by source_hostname of a consumer specified by hostname.
• /consumers/<hostname>/listeners: All listeners resources are analogous to the sources resources.

4.2.5 Implemented plugins

Three plugins are implemented and tested: filecam (called 'mycam' in the code), a producer that reads in a video file and transmits it in MJPEG encoding using the RTP protocol; testsrc, a producer which generates test video and transmits it in MJPEG encoding using the RTP protocol; and local, a consumer which captures incoming RTP MJPEG video frames and displays them on the local display. The filecam and local plugins are discussed, since the testsrc is similar to the filecam.

The plugins are implemented in Python and use the GStreamer library with the Python bindings [97] for media streaming and the Flask framework to implement the API. These libraries don't have to be used by future plugins, which can just implement a REST API and provide a media stream as specified in their descriptions.

Filecam plugin

The filecam image is based on the Ubuntu 17.10 image. It is chosen over lighter Linux distributions because it offers more functionality out of the box for prototyping. Other dependencies are Python 3.6, GStreamer 1.12, the GStreamer plugins-base, plugins-good, plugins-bad, plugins-ugly, libav, doc and tools packages, and python-gst.

The API of the plugin offers the following functionalities:

• /state: GET, PUT. Retrieve and update, respectively, the state of the plugin.
• /listeners: GET, POST. Retrieve and add, respectively, a listener on the plugin.
• /listeners/<hostname>: GET, PUT, DELETE. Retrieve, update and delete, respectively, a listener on the plugin.

The implemented GStreamer pipeline is depicted in Figure 4.1; a command-line equivalent is sketched after the element list.

Figure 4.1: filecam GStreamer pipeline

The pipeline consists of the following GStreamer elements:

1. filesrc: GStreamer element that reads data from a file in the local file system. This file can have any extension and is not limited to video or audio files [98]. The location property is set to the location of the file in the plugin container.

2. decodebin: GStreamer bin that automatically constructs a decoding pipeline using available decoders and demuxers via auto-plugging [99]. Note that for some media containers and codecs the appropriate decoders must be installed. For example, to decode the MPEG streams contained in MP4 files, an H.264 decoder is needed, which can be found in the 'libav' GStreamer plugins library.

3. jpegenc: GStreamer element that encodes raw video into JPEG images [100]. This implements the MJPEG video stream, as all video frames are encoded as JPEG images.

4. rtpjpegpay: GStreamer element that payload-encodes JPEG images into RTP packets according to RFC 2435 [101].

5. udpsink: GStreamer element that sends UDP packets to the network. When combined with an RTP payload plugin, it implements RTP streaming [102]. The host and port properties are set to the hostname and port of the listener of the plugin.
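For quick experimentation, an equivalent pipeline can be launched directly with the standard gst-launch-1.0 tool (the file location, host and port are placeholders; depending on the decoded format, a videoconvert element may be needed between decodebin and jpegenc):

gst-launch-1.0 filesrc location=video.mp4 ! decodebin ! jpegenc ! rtpjpegpay ! udpsink host=127.0.0.1 port=5000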

This pipeline is implemented using the Python GStreamer bindings. The process consists of creating each GStreamer element, adding them to the GStreamer pipeline, and linking the elements in order of appearance in the pipeline. The decodebin and jpegenc elements can't be linked when created, because there is no default sink pad available on the decodebin. Because the decodebin needs to inspect the media to decide how to decode it, the pipeline needs to be processing media first; if no media is flowing, the decodebin can't know what decoder it needs to decode the media and what media it can offer to the sink element. Therefore, the process of dynamic linking is used [103]. All elements which can be linked when the pipeline is not in the PLAYING state are linked. A handler is registered on the 'pad-added' signal, which is emitted when a new pad is added on the decodebin, indicating that it can forward media downstream. When media is flowing through the pipeline, the decodebin creates new pads when it can generate output data and emits the 'pad-added' signal. A callback is then performed on the handler, which links the decodebin with the jpegenc. Listing 7 illustrates this concept.

# callback handler
def on_pad(source, pad, sink):
    # get the sink pad from the sink element
    sink_pad = sink.get_static_pad('sink')
    # get the pad type
    pad_caps = pad.get_current_caps()
    pad_type = pad_caps.get_structure(0).get_name()
    # Only if the pad is raw video the link is made
    if pad_type == 'video/x-raw':
        # Perform the dynamic link
        pad.link(sink_pad)
    # Other pad types are ignored

filesrc = Gst.ElementFactory.make('filesrc')
decodebin = Gst.ElementFactory.make('decodebin')
jpegenc = Gst.ElementFactory.make('jpegenc')
# ... (create other elements and add elements to the pipeline)

# Only filesrc and decodebin can be linked statically
filesrc.link(decodebin)
# Register the on_pad handler on the 'pad-added' signal
handler_id = decodebin.connect('pad-added', on_pad, jpegenc)
# Set pipeline to PLAYING; the callback will be called to perform the dynamic link
pipeline.set_state(Gst.State.PLAYING)

Listing 7: Dynamic linking of the decodebin and jpegenc

Local plugin

The local plugin captures an incoming media stream and displays it on the local display. This plugin is special with respect to other plugins in that it is not deployed in a Docker container: it runs natively via the cli on the host, to allow access to the local display. This version is built for macOS High Sierra (version 10.13.4) and uses GStreamer 1.12 with the plugins-base, plugins-good, plugins-bad, plugins-ugly, libav, doc and tools packages to receive an incoming stream. When a plugin links to the local plugin, the Stream component does not instruct the Consumer component to start the plugin, but instead links the plugin to the local host. For macOS, the address of the host is host.docker.internal. The GStreamer pipeline used by the plugin is depicted in Figure 4.2; a command-line equivalent is sketched after the element list.

Figure 4.2: local plugin GStreamer pipeline

The pipeline consists of the following elements:

1. udpsrc: GStreamer element that reads UDP packets from the network [104]. The port property is set to the port to which the source is transmitting media.

2. rtpjpegdepay: GStreamer element that retrieves JPEG images from the received RTP packets [105]. This element can't process the media received from the udpsrc directly, because it can't know what type of data it will be receiving. Between the pads, a 'capabilities filter' is placed, which informs the elements of the type of data that will be flowing through. In this case, the capabilities are application/x-rtp, which indicates that RTP packets will be coming through; encoding-name=JPEG, which indicates that the payload of the RTP packets are JPEG images; and payload=26, which also indicates that the encoding is JPEG, according to RFC 3551 [50, 106].

3. jpegdec: GStreamer element that decodes JPEG images [107].

4. autovideosink: GStreamer element that automatically detects an appropriate video sink and forwards the video to it [108].
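The receiving side can likewise be prototyped on the command line with the standard gst-launch-1.0 tool (the port is a placeholder and must match the sender):

gst-launch-1.0 udpsrc port=5000 caps="application/x-rtp, encoding-name=(string)JPEG, payload=(int)26" ! rtpjpegdepay ! jpegdec ! autovideosink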

4.3 Limitations and issues

The implementation presented is a prototype and slimmed-down version of the architecture presented in Section 2.3. The following limitations and issues remain.

4.3.1 Single client

The current implementation deploys the Flask framework (on which each microservice relies) on the built-in Flask web server (Werkzeug), which is provided for development convenience. It is only built for use by a single user and by default can only handle one request at any given moment, which implies that the framework can also only be used by a single user [109].

4.3.2 Timeouts

The framework does not perform checks on request timeouts when passing commands to components and plugins. This can be a problem when the framework is deployed on several devices and the request latency is much higher. In case of timeouts, the framework will keep waiting for a response, which leads to a crash.

4.3.3 Exception handling and testing

The framework is only tested for the so-called 'happy path': the default scenario featuring no exceptional or error conditions. Some alternate paths are handled, but most still need to be tested. An example scenario would be if one of the plugin containers in a stream fails and stops: the framework is not able to detect this and will assume that the container is still running.

4.3.4 Docker security issues

The Docker client communicates with the daemon process, dockerd, through a UNIX domain socket called /var/run/docker.sock. The daemon is highly privileged, having root access to the host system; any process that can write to this socket effectively has root access. To allow the components of the framework to manipulate the plugin containers, they need access to this socket. Therefore, the socket is mounted in the container, which gives the container write access to the socket. This implies that the container now has root access on the host when writing to this socket. Because the container gets root access to the host, an attacker can walk the file tree of the host and extract sensitive information or run unwanted software. This type of attack is known as a 'Docker Breakout' or 'Container Escape' attack [110, 111].

4.3.5 Docker bridge network

The current implementation deploys the framework on a Docker bridge network, which can only be used if the framework is deployed on a single device; the current implementation is thus restricted to a single device. To deploy the framework on multiple devices, the framework must be deployed using a Docker overlay network [112].

4.3.6 Single stream

The implementation supports one stream, which must be a chain. Multiple streams in tree form, with media merging from multiple sources and broadcasting to multiple listeners, are not supported.

4.3.7 Number of containers per plugin

The framework uses the name of the plugin as identifier for the containers. The name is also the hostname on which the container can be reached. Therefore, there can only be one active container associated with a plugin at runtime.


Chapter 5

Mob detection experiment

To try out an actual drone thermal imaging application, the mob detection experiment is carried out. The goal of this experiment is to use existing object detection algorithms on a dataset of thermal images to try and detect large crowds of people, hereinafter referred to as a mob.

Several public datasets of thermal images exist. Most datasets focus on the detection of people in scenes [113-117], some on face recognition [118, 119], others on vehicle recognition [120]. Most of these datasets are freely available through the OTCBVS Benchmark Dataset Collection [121]. No datasets containing large amounts of people were found, so the Last Post thermal dataset was created for the detection of mobs and other analysis tasks. This dataset is presented in Section 5.1.

To detect mobs in the images of the dataset, a deep learning approach using neural networks is explored. The selection and training of the model is described in Section 5.2.

5.1 Last Post thermal dataset

The Last Post dataset consists of videos of the Last Post ceremony, taking place each night at 8:00 PM (Brussels timezone) under the Menin Gate in Ypres, Belgium. Section 5.1.1 gives some insight into this unique ceremony. The full dataset is described in Section 5.1.2.

5.1.1 Last Post ceremony

The Last Post ceremony is a nightly ceremony taking place under the Menin Gate in Ypres at 8:00 PM sharp. The ceremony is held in remembrance of the fallen soldiers of World War I (1914-1918). The Last Post Association [122] states its mission as follows:

True to its statutes, the Last Post Association wishes to honor and remember the soldiers of the British Empire who gave their lives during the Great War of 1914-1918. The Last Post ceremony seeks to express, day after day, the lasting debt of gratitude which we all owe to the men who fought and fell for the restoration of peace and the independence of Belgium.


Figure 5.1 gives an impression of the size of the ceremony. Because of the sheer number of people that gather under the gate each day, the Last Post is a unique open-air event offering repeatable conditions for capturing footage; the event was therefore a perfect opportunity to create the dataset.

Figure 5.1: Last Post ceremony panorama

5.1.2 Dataset description

Due to legislation in Belgium, drones cannot be flown in public areas without certification and a permit from the authorities. The creation of real aerial thermal images with a drone was thus not feasible. Therefore, an elevated position on the walls next to the Menin Gate (in order to simulate aerial images) was used to capture the footage of the adjacent square on one side and the bridge on the other side. Figure 5.2 shows the locations where the video footage was captured.

Figure 5.2: Locations where the video footage was captured. The black stars represent the captured scenes; the red stars represent the locations from where the scene was filmed.

The data was recorded with the FLIR One Generation 3 Pro camera for Android devices, hereafter referred to as "Camera" [123]. Since thermal images don't hold color information, a color scheme is used to represent the relative differences in temperature: the 'Iron' color scheme, which maps colder sections of a scene to blue colors and warmer sections to red and yellow colors. The videos are encoded using the H.264/MPEG-4 codec; decoded, the color information is captured in 4:2:0 YUV format. The frame rate of the videos varies from 7 Hz to 8 Hz, depending on the speed of the objects in the scene. There is sound present in the videos, which is encoded with the MPEG AAC codec. For a full list of sequences, the reader is referred to Appendix C.

The two locations that make up the main scenes in the dataset are presented in Figure 5.3. The thermal images and visual images of each scene are depicted next to each other. The thermal and visual images were not captured at the same time, so the mobs that are present in the thermal images can't be seen in the visual images. In both scenes, buildings are present that are quite warm compared to the surroundings, as can be seen in the thermal images. In Figure 5.3a it even becomes difficult to recognize the mob when they are standing close to the building. This is less the case for Figure 5.3c, where, due to the water present in the image, the mob has higher contrast thanks to the larger difference in emitted heat. Towards the far right of the image, the mob seemingly disappears into the background. The effect of two objects having a similar heat signature and no clear transition between them in thermal images is defined as thermal camouflage, a technique that is often used by animals and military units [124]. This effect is even visible when looking at the mobs present in both images: because people are standing so close together, it becomes difficult to recognize individual persons in the crowd.

Figure 5.3: Main scenes in the Last Post dataset. (a) Thermal view of the square in location A; (b) visual view of the square in location A; (c) thermal view of the bridge in location B; (d) visual view of the bridge in location B.

5.2 Object detection experiment

5.2.1 Preprocessing

The Last Post dataset was not used in its entirety for training the model, because there were not enough resources to manually annotate every image. Therefore, a smaller subset was used to train a baseline model.

The following videos were used: 2018-04-10 195029.mp4, 2018-04-10 200122.mp4, 2018-04-04 202859.mp4, 2018-04-10 202558.mp4 and 2018-04-04 200052.mp4, captured on the fourth and tenth of April 2018. These videos were used because of their contents: they contain images from locations A and B respectively, in which the mob behaves more dynamically compared to other videos. This was due to a marching band present on the fourth of April and a marching army unit on the tenth of April. See Appendix C for a summary of the contents of these videos. From these videos, images were extracted at a capture rate of 1 Hz. Each image was manually labelled using the Microsoft Visual Object Tagging Tool [125]. The tool allows exporting the training images to various formats, such as Pascal VOC for TensorFlow, YOLO and Microsoft CNTK.
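The 1 Hz extraction step can be sketched with OpenCV as follows (the file name and output directory are placeholders; the source does not state which tool performed this step):

# Extract roughly one frame per second from a video file.
import os
import cv2

os.makedirs('frames', exist_ok=True)
cap = cv2.VideoCapture('2018-04-10 195029.mp4')
fps = cap.get(cv2.CAP_PROP_FPS)  # 7-8 Hz for these videos
frame_id = saved = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    if frame_id % max(1, int(round(fps))) == 0:
        cv2.imwrite('frames/img_%05d.jpg' % saved, frame)
        saved += 1
    frame_id += 1
cap.release()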

Within the data, several visual outliers are present. An outlier is an observation point that is distant from other observations; it is created due to variability in capturing the videos or indicates experimental errors [126]. The errors detected here are of the latter form and are depicted in Figure 5.4. The first type of outlier is a system fault in the Camera: due to an error in the processing of the video, the Camera would sometimes not register any input. This causes the Camera to produce completely black images, as depicted in Figure 5.4a. The Camera software maps temperatures onto colors in the image; the variations of the colors are relative to the temperature interval ranging from the minimum to the maximum temperature detected by the Camera. If the minimum and/or maximum detected temperature changes, the Camera needs to adapt its color mapping. This causes the Camera to fade to bright colors for a short period of time (1 to 2 seconds); the resulting image is depicted in Figure 5.4b. Because the resulting image is too bright and objects are hard to detect, it is considered an outlier. Due to instabilities when capturing footage of sequences with fast motion, some images are very blurry. This makes it hard, even for a person, to decide what is visible in the frame; therefore it is considered an outlier, depicted in Figure 5.4c. Sometimes people would pass in front of the Camera, which resulted in brightly colored areas in the videos that were not part of the scene and therefore form another type of outlier, depicted in Figure 5.4d. Because the presented outliers are experimental errors and do not belong in the scenes, they were removed from the dataset.

Figure 5.4: Outliers. (a) System fault in the Camera: no input was detected; (b) the Camera updates to a new temperature interval; (c) due to moving the Camera too fast, the image becomes too blurry; (d) very warm object due to people passing in front of the Camera.

5.2.2 Training

The model that is used for training is YOLOv3, implemented using the darknet neural network framework [83]. The model is trained using convolutional weights that are pre-trained on the ImageNet database [127]. The concept of using weights from a model previously trained on large datasets is known as transfer learning. When choosing a pre-trained model, it is very important that the problem statement of the pre-trained model is close enough to the current problem statement. For the model pre-trained on ImageNet, this was to identify objects in images, which lies close to the detection of mobs in thermal images. Because the type of images (thermal versus visual) is fundamentally different, the model could suffer in performance. Goedeme et al. [36] solved a similar problem with thermal images and achieved good results, which gives an indication that detection should be feasible with the pre-trained model. Also, because the dataset is relatively small, training the model from scratch could actually hurt performance [128]. Training was carried out on an NVIDIA GeForce GTX 980 GPU, which allows training to be done much faster. To evaluate training progress, the Sum of Squared Error (SSE) loss function is calculated, defined as $\sum_{i=1}^{n}(x_{ij} - \hat{x}_j)^2$, where $n$ is the number of samples in a batch used in a single training epoch and $j$ is the dimension ($x$ or $y$), as defined in [83]. The result of this training is discussed in Chapter 6.
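For reference, this transfer-learning setup is typically launched in darknet with a command of the following form (the data and configuration file names are placeholders; darknet53.conv.74 holds the ImageNet pre-trained convolutional weights):

./darknet detector train cfg/mob.data cfg/yolov3.cfg darknet53.conv.74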


Chapter 6

Results and evaluation

The goal of this chapter is to present the results of the framework and the detection experiment. The results of the framework tests are presented in Section 6.1. The results of the object detection experiment are presented in Section 6.2.

6.1 Framework results

To evaluate the framework, acceptance tests are conducted that test if the framework meets the QARs defined in Section 2.1. As stated in Section 4.1, only the ASRs will be tested. A summary of which requirements are met by the framework is given in Table 6.1. Passed means that the framework has met the requirement; not passed, that the framework hasn't met the requirement; and plausible means that the framework might have met the requirement, but not enough data could be gathered to be certain.

6.1.1 Performance evaluation

To evaluate performance, the acceptance tests for the requirements are conducted, the impact of the framework on the processing resources is recorded, and the total size of the framework is measured.

Acceptance tests

To test the performance of the framework, the execution times of each command executed with the CLI (see Section 4.2.2) are measured. Each command is executed 200 times, except for the on, off and link commands: these are measured manually, 10 times. Because these commands launched system threads and their finish signal could not be captured, they had to be measured by hand. Commands were executed on a 2.6 GHz Intel Core i5-2540 processor running macOS High Sierra version 10.13.4. The summarized statistics of the tests are given in Table 6.2.

The average execution times for the Play, Stop, Pause, Add, Elements, Print, View and Link commands do not exceed the 2-second bound specified in PS-1, while the average execution times of the Delete, On and Off commands do exceed this bound; this performance requirement is thus not met by the framework. The same result is found for PS-3. Especially the Delete and Off commands exceed the requirements by quite a bit. The Delete command shuts down a plugin and removes the Docker container from the host; this action is costly in time. The Off command removes all the plugins and all the microservices of the framework, and thus suffers from the same costly action. This could be ameliorated by having the framework stop the containers instead of removing them, which requires fewer resources, as it only stops the process running in the container but does not delete the container from the system.

Requirement id | Status
PS-1 | Not Passed
PS-2 | Plausible
PS-3 | Not Passed
PS-4 | Plausible
PS-5 | Not Passed
IS-1 | Passed
IS-2 | Passed
MS-1 | Passed
MS-2 | Passed
MS-3 | Passed
MS-4 | Passed
MS-5 | Plausible
MS-6 | Passed
MS-7 | Plausible

Table 6.1: Acceptance test results summary

PS-2 and PS-4 could not be measured, due to the GStreamer pipeline of the prototype not allowing frames to be tracked. However, since real-time is a human time perception, real-time streaming is plausible if a person can't distinguish the streamed videos from videos played with a native video player [43, 44]. The videos were shown side by side to ten users, who could not distinguish between both videos, indicating presumable real-time streaming. Since the hard requirements cannot be measured, the requirements are not met, but are plausible. Real-time streaming performance also heavily depends on the plugins used and the hardware on which they are deployed: if a plugin can't process its media fast enough, due to lack of processing power or a slow implementation, it will slow down the whole stream.

The scalability requirement PS-5 could not be met, due to the Flask Werkzeug server only being able to process one request at a time (see Section 4.3).

Only two performance requirements are met by the prototype. However, this is mostly due to some actions being very slow, such as shutting down the framework or removing a plugin. As these actions should occur less frequently when a user is using the framework, they are less important for the perceived quality. Frequent actions, such as adding, linking and changing the state of the stream, do perform rather well and contribute more to the perceived quality. Overall, the performance of the framework is not stellar, but not bad either. This can partially be explained by the choice of supporting frameworks, such as Flask, that are not built for performance. Other, more performant frameworks, such as Vert.x, could improve performance.

Statistic | Play | Stop | Pause | Add | Delete | Elements | Print | View | On | Off | Link
Mean | 0.690 | 0.804 | 0.634 | 1.363 | 8.402 | 0.562 | 0.564 | 1.22 | 3.58 | 24.023 | 0.849
Std deviation | 0.050 | 0.059 | 0.088 | 1.037 | 4.669 | 0.070 | 0.0747 | 0.260 | 0.498 | 0.481 | 0.170
Minimum | 0.629 | 0.708 | 0.549 | 0.516 | 0.505 | 0.517 | 0.517 | 0.757 | 3.015 | 23.707 | 0.637
25% percentile | 0.665 | 0.775 | 0.594 | 1.049 | 1.154 | 0.534 | 0.536 | 0.998 | 3.143 | 23.750 | 0.798
Median | 0.678 | 0.800 | 0.623 | 1.11 | 11.132 | 0.550 | 0.552 | 1.214 | 3.500 | 23.886 | 0.853
75% percentile | 0.700 | 0.820 | 0.653 | 1.233 | 11.189 | 0.562 | 0.560 | 1.433 | 3.850 | 24.034 | 0.877
Maximum | 1.016 | 1.279 | 1.631 | 6.25 | 11.846 | 1.227 | 1.149 | 1.691 | 4.562 | 25.326 | 1.261

Table 6.2: Performance test statistics summary, measured in seconds

Resource usage

The resources used by the modules of the framework are measured using the Docker statistics tool [129]. A summary of the resources used is given in Table 6.3. When the framework is idle, resource usage is negligible. When a plugin is active, there is a slight increase in resources; this increase depends on the runtime size of the plugin, which is unknown to the framework. The increase peaks when the plugin is processing media: CPU usage is 40% on one core, which implies that only two plugins can be active simultaneously on one CPU core before reaching the ceiling of the processing power. In a production environment of the framework, plugins need to be tested thoroughly, so that these metrics are known beforehand. These metrics imply that the length of streams should be kept short, to avoid having many plugins active simultaneously.

Condition | Container | CPU usage [%] | Memory usage [MiB]
Idle | streamer | 1.00 | 42.09
 | consumer | 0.03 | 24.4
 | producer | 0.01 | 24.14
1 plugin active, not processing media | streamer | 1.56 | 42.48
 | consumer | 0.02 | 24.42
 | producer | 0.02 | 24.23
 | mycam plugin | 0.75 | 45.97
1 plugin active, processing media | streamer | 1.56 | 42.51
 | consumer | 0.02 | 24.42
 | producer | 0.02 | 24.24
 | mycam plugin | 40.03 | 99.24

Table 6.3: Resource usage of the framework in several conditions

Size of framework

The total sizes of the Docker images of the components of the framework are given in Table 6.4. Most images are quite large: the framework core components have an average size of 724 MB, and the plugins have sizes ranging from 1 GB to 3 GB. This size can be explained by the base images and the additionally installed software in the images. For development flexibility, the base images used are Linux Ubuntu images, which are typically larger than other Linux distributions. For the plugins, the full GStreamer library with all plugins was installed, which is more than 2 GB in size. The sizes of the components can be reduced in a production environment by choosing slimmer Linux distributions as base images and only installing the minimally needed libraries to get a working plugin.

Image | Size [MB]
streamer | 718
consumer | 729
producer | 729
testsrc | 1250
mycam | 3020

Table 6.4: Total size of framework components

6.1.2 Interoperability evaluation

The systems with which the framework exchanges data are the plugins. These plugins must follow the plugin model presented in Section 2.3.1: implement the presented resources using a REST API, the state machine and the protocols. If these specifications


are followed by a plugin, the framework should have no issues exchanging information with the plugin. To test this, a new mock plugin is implemented. For each resource of the plugin, the framework is given random mock input data to exchange with the plugin. When the exchange is complete, the values in the plugin are requested and compared with the given input; if the input matches the value in the plugin, the exchange was successful. These tests were executed 50,000 times. The results are summarized in Table 6.5. Play, pause and stop are the requests to change the state of the plugin; the source/listener add, update and delete commands manipulate the sources and listeners of the plugin. Overall, there were almost no errors made when exchanging information: only when updating a source and deleting a listener was there one incorrect exchange. The ratios achieved are always 100% correct exchanges, except for updating a source and deleting a listener, which are at 99.998%. IS-1 and IS-2 specify that commands exchanged with the plugins need to be correct 99.99% of the uptime, so this requirement is clearly met.

Value | Play | Pause | Stop | Add S | Update S | Delete S | Add L | Update L | Delete L
Correct | 50000 | 50000 | 50000 | 50000 | 50000 | 49999 | 50000 | 50000 | 49999
Incorrect | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1
Ratio (%) | 100 | 100 | 100 | 100 | 100 | 99.998 | 100 | 100 | 99.998

Table 6.5: Interoperability test results (S: Source, L: Listener)

Plugins also interact with each other by transmitting media to each other according to the stream layout. This interoperability is not directly controlled by the framework, as plugins can be developed by third parties. To solve this, a plugin needs to provide its specifications to the framework before being integrated as a plugin. This allows the framework to decide whether or not two plugins will be able to interact with each other in a stream. For example, if plugin A supports MJPEG streams transmitted via RTP/UDP, it will be able to interact with a plugin B implementing the same protocols; if plugin B implements another protocol, it will not be able to interact with plugin A. If this is specified, the framework can notify a user that two plugins are not compatible. These scenarios should be avoided, which is done by specifying standard protocols for plugins.

6.1.3 Modifiability evaluation

Plugins are installed for the prototype by building and adding their image to the image directory of the Docker host. The framework does not need a restart to install these images; therefore, requirements MS-1 and MS-2 are met. End-users can extend their version of the framework with new plugins by building the respective plugin images, meeting MS-3. Streams can be modified by linking different plugins by design, meeting MS-4. The framework can detect newly installed plugins when starting up if the image is installed to the image directory of the Docker host; therefore, requirements MS-5 and MS-6 are met. The current prototype is only deployable on a local device, as discussed in Section 4.1, meeting requirement MS-7. The other requirements can be met by deploying the framework using the Docker overlay network, as discussed in Section 4.3, without having to implement changes to the code base. Requirements MS-8 and MS-9 are thus not met, but are achievable with a different Docker deployment.
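For illustration, this install-and-detect flow maps directly onto the Docker SDK for Python [96]; the following is a sketch, with the plugin image naming convention assumed:

import docker

client = docker.from_env()

# Installing a plugin: build its image on the Docker host.
client.images.build(path="./plugins/mycam", tag="plugin-mycam")

# Detecting installed plugins: list images following the naming convention.
plugins = [img for img in client.images.list()
           if any(tag.startswith("plugin-") for tag in img.tags)]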

In general, the framework was designed to be modifiable for different video analysis tasks. The hybrid microkernel/microservices architecture enables this modifiability: the microkernel (plugin) architecture allows a user to modify a video analysis stream during framework use, while the microservices architecture allows for a modifiable deployment configuration.

6.2 Mob detection experiment results

To evaluate the detection experiment, the trained model is tested on the validation set that contains random images from the total annotated dataset presented in Section 5.1.2. First, the results of the training of the model are presented in Section 6.2.1. Second, the metrics that were used to evaluate the model are presented in Section 6.2.2. Finally, the results of the validation are presented in Section 6.2.3.


6.2.1 Training results

To monitor training, the average loss per training epoch was measured; the resulting training evolutions are depicted in Figure 6.1. Darknet does not shuffle training data automatically and creates training batches in the order of the training data provided. Since YOLO uses gradient descent for optimization, this can lead to YOLO getting stuck in local minima of the cost surface [130]. This effect is seen in Figure 6.1a around epoch 4500: every image in the training set has been loaded at least once at this point; the model was training on images from location B, and now images from location A are loaded (see Section 5.1.2). This leads to a peak in average loss, as YOLO was optimizing on images from location B and probably converging to a local minimum for that type of images. Therefore, in a second run, the data was shuffled, allowing the model to escape local minima more easily. Figure 6.1b shows the difference in training loss: the curve is much more irregular thanks to the shuffling of the data. Once again, the average loss decreases more around epoch 4500, when every image in the training set has been loaded at least once. The average loss stagnates at values in the interval [0.04, 0.07]. To avoid overfitting the model on the training data, which would yield worse generalization performance, early stopping is applied. Early stopping is a generalization technique that stops the training of a neural network early, before the network starts overfitting [131]. The stopping criterion used is progress, defined as the decrease of training error in successive training epochs [131], i.e., the slope of the loss curve depicted in Figure 6.1. This slope approaches 0 from epoch 13000 onward, so this epoch is selected as the early stopping point. Because the generalization error is not a smooth curve and consists of many local minima, it is a good idea to also validate model weights in the neighborhood of the early stopping point, as these could potentially yield better performance on the validation set [131].
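A minimal sketch of this progress criterion, assuming the per-epoch average losses are collected in a list; the window size and threshold are illustrative values:

def should_stop(losses, window=1000, threshold=1e-6):
    # losses: average training loss per epoch, oldest first.
    # Stop once the average decrease per epoch over the last
    # 'window' epochs (the slope of the loss curve) is near zero.
    if len(losses) < window:
        return False
    slope = (losses[-1] - losses[-window]) / window
    return abs(slope) < threshold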

6.2.2 Metrics

The model predicts bounding boxes for objects in the images of the validation sets. The bounding box provided by the annotated dataset is defined as the ground truth bounding box B_gt. The bounding box provided by the model is defined as the predicted bounding box B_p. To evaluate the performance of the model and select the best weights, several metrics are used. The standard metrics used to evaluate object detection problems are the Intersection over Union (IoU) and the mean Average Precision (mAP). The IoU is a metric used in common object detection challenges such as the Pascal VOC challenge [132]. If the function A(B_x) gives the area of a bounding box B_x, the IoU is defined as

\[
\mathrm{IoU} = \frac{A(B_p \cap B_{gt})}{A(B_p \cup B_{gt})} \tag{6.1}
\]

The mAP for a set of detections, another metric used in the Pascal VOC challenge, is defined as the mean over all classes of the interpolated AP for each class. A detection is considered a true positive if the IoU for the detection is greater than 0.5. The interpolated AP is given by the area under the precision-recall curve for the detections [132-134].
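A minimal sketch of the IoU of Equation (6.1) for axis-aligned boxes, with each box given as (x1, y1, x2, y2) corner coordinates:

def iou(bp, bgt):
    # Intersection rectangle of the predicted and ground truth boxes.
    ix1, iy1 = max(bp[0], bgt[0]), max(bp[1], bgt[1])
    ix2, iy2 = min(bp[2], bgt[2]), min(bp[3], bgt[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    union = area(bp) + area(bgt) - inter
    return inter / union if union else 0.0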

The model is also tested on several videos not included in the training and validation sets, to visually evaluate detection and measure the number of frames per second that can be processed by the model.


(a) Average training loss when data is not shuffled. Vertical: average loss; horizontal: time (in training epochs).
(b) Average training loss when data is shuffled. Vertical: average loss; horizontal: time (in training epochs).
Figure 6.1: Average training loss per epoch

6.2.3 Validation results

Every 100 epochs, YOLO creates a snapshot of the weights the model is using at that epoch [83]. This makes it possible to validate each set of weights on the validation set and show the evolution of the validation performance. Figure 6.2 shows these evolutions for the average IoU and mAP metrics. The mAP gradually grows from epoch 4500 onwards and stagnates around epoch 11500. This shows that the model is not learning anymore and is at risk of overfitting. The mAP stagnates in the interval [88%, 91%]. The average IoU shows a similar trend but varies more, because predictions on the same images are rarely exactly the same. The best mAP value, 90.52%, is achieved at epoch 15700. The weights from this epoch are used for further testing and validation. The mAP at the 0.5 IoU threshold of YOLOv3 on the COCO benchmark dataset [135] is 74.8%; compared to this, the mAP achieved on the Last Post dataset is very high.


(a) mAP (%) per epoch. Vertical: mAP (%); horizontal: time (in training epochs).
(b) IoU (%) per epoch. Vertical: IoU (%); horizontal: time (in training epochs).
Figure 6.2: Validation metrics per epoch

The reason for this difference is that the validation set has a high correlation with the training set. Due to the training set and validation set being extracted from videos, all images from one video are correlated in time with each other. Images from the validation set are thus correlated with images in the training set, and the model is optimized on these types of images, explaining the high mAP. This indicates that the model is somewhat overfitting on the training data. This was confirmed when testing the model on unseen videos: although the model could detect a mob most of the time, it produced more visual errors. Because this data was not annotated, no metrics could be extracted. Figure 6.3 depicts some predictions of the model on images from the validation set. Visually, the predicted bounding boxes resemble the ground truth bounding boxes quite accurately.

To test the speed of the predictions of the model, the total time to predict images in the validation set was measured. For the NVIDIA GeForce GTX 980 GPU, the average prediction time for one image is 14.673 milliseconds, with a standard deviation of 0.517 milliseconds. This indicates that the upper limit of the frame rate when making predictions on a video is approximately 68 frames per second on the GPU. For comparison, predictions with the model were also made on a CPU: a 2.6 GHz Intel Core i5-2540 processor with AVX instruction speedup. The average prediction time on the CPU is 5.849 seconds, with a standard deviation of 0.438 seconds, resulting in an upper limit for the frame rate on the CPU of 0.171 frames per second. Clearly, real-time object detection with this model is only possible on a GPU. When generating predictions on a test video, the average frame rate of the video was 55 frames per second.
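The frame-rate upper bound follows directly from the mean prediction time (1 / 0.014673 s is approximately 68). A sketch of such a measurement, where predict stands in for one forward pass of the model:

import time
import statistics

def measure(predict, images):
    # Time one forward pass per image; predict() is a placeholder
    # for the trained model's inference call.
    times = []
    for image in images:
        start = time.perf_counter()
        predict(image)
        times.append(time.perf_counter() - start)
    mean = statistics.mean(times)
    return mean, statistics.stdev(times), 1.0 / mean  # mean, std, fps bound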


(a) Prediction of a large mob at location B. (b) Prediction of the mob at location A.
(c) Prediction of a small mob at location B. (d) Prediction of the mob at location B.
Figure 6.3: Predictions of the model on images in the validation set


Chapter 7

Conclusion and future work

7.1 Conclusion

Aerial thermal imaging with drones is a promising technology that can deliver many applications for various use cases across many different domains, such as agriculture, fire fighting, search and rescue, etc. Most applications built with this technology are designed with a specific use case in mind, using a thermal camera and analysis software specific to that use case, and therefore struggle to exchange hardware and algorithms for new use cases. The goal of this dissertation was therefore to design, build, and test a possible backbone framework that allows building these applications in a modifiable way. The specific use case of mob detection in thermal images was investigated as a sample use case for the framework.

Chapter 2 explored the requirements of such a framework. The ASRs to achieve the goal of the framework are performance, interoperability, and modifiability. Performance is needed because some use cases (like fire fighting) require real-time video analysis. Interoperability enables the framework to interact with different thermal cameras and different processing/analysis modules. Modifiability enables the framework to interchange the thermal cameras and analyzers in its process to build applications for different use cases. A hybrid combination of the microkernel pattern and the microservices pattern is used to meet these requirements, as the microkernel pattern enables interchanging the cameras and analyzers via a plugin system, and the microservices pattern enables different deployment configurations for the framework. To build and test the framework, several technologies were needed: backbone technologies for the software architecture, a thermal camera, and an object detection algorithm for the mob detection use case.

Chapter 3 explored the state of the art of these technologies and presented the selected technologies. Thermal cameras come in all shapes and sizes and have different features according to their retail price. Contrary to intuition, the image quality is not the defining factor of the retail price, but rather the number of extra features, such as radiometry, communication interfaces, etc. The FLIR One Pro and Therm-App were selected for this dissertation, since they offer good quality images and features for their price, and their use via smartphone platforms makes these cameras excellent for prototyping. Microservices frameworks also vary widely, depending strongly on the use case of the application using the framework: some are aimed at quick prototyping, others focus on performance, etc. Flask was selected as the microservices framework, as it is easy to use and designed for prototyping with microservices; this does come with a performance trade-off.


To deploy the microservices in a plugin fashion, the concept of containers is applied. Containers virtualize on the OS level, allowing the microservices to be moved around on the host and distributed over different hosts. The current field has some frameworks implementing this technology, with Docker being the most well-known and mature framework, and it was selected for that reason. The field of object detection has a variety of solutions for the object detection problem, with varying accuracies; some can even create predictions in real time. The YOLOv3 algorithm, implemented in the darknet framework, was selected, as it generalizes well to other datasets (such as thermal images), makes relatively accurate predictions, and is able to make predictions in real time when deployed on a device with GPU processing capabilities.

Chapter 4 presented the implemented prototype of the framework using these technologies. Two sample plugins were implemented: the filecam plugin, which serves a video read in from a file, and the display plugin, which displays this video on the local device. The framework is limited to one video processing stream for one user at a time and is deployed on a local device. It also has a security risk, as the framework has to expose the Docker daemon socket to allow the framework to manipulate the containers running the plugins. This gives the containers that run the core framework processes root access to the host system, which can be abused by potential attackers.

The mob detection experiment is presented in Chapter 5. A new thermal image dataset, called the Last Post dataset, was collected for this experiment. The dataset features videos of the Last Post ceremony, filmed over the course of two weeks. What makes this dataset special is that, unlike publicly available datasets, it delivers footage of the movement of large crowds, filmed from a high vantage point to simulate footage captured from a drone platform. This dataset was used to train a pre-trained YOLOv3 model via transfer learning. The dataset was manually labeled and preprocessed by removing the outliers present. Training was done on an NVIDIA GTX 980 GPU and evaluated using the MSE loss metric.

Chapter 6 presented the tests conducted on the framework and the detection model, and their corresponding results. The performance requirements for the frequently used commands are met by the framework. Other commands, such as removing plugins and starting up and shutting down the framework, do not meet the performance requirements, since Docker requires significant time to start, stop, and remove containers. The real-time streaming requirements could not be proven, because the time between transmitting a frame and receiving a frame could not be measured directly. However, the processed videos were shown to human users, who could not distinguish between the processed video and the video played back on a local system, which makes it plausible that the framework achieves this requirement. Real-time streaming performance heavily depends on the plugin and the hardware on which it is deployed. When plugins in the framework are processing media, CPU usage increases significantly, even when only one plugin is active. This implies that the length of media processing streams should be kept as short as possible to achieve good performance. The framework is relatively big, with some plugins even having a size of 2 GB. This is mostly due to the base images and installed libraries of the plugins and core components. Due to each component and plugin having its own container, libraries cannot be shared, so they are redundantly installed, leading to large component sizes. This could be alleviated by using slimmer images and only installing the minimal libraries needed. The interoperability requirements are all met by the framework; this is proven by a test exchanging mock information between the framework and plugins. The modifiability requirements regarding the plugins are met by the framework. The modifiability requirements regarding the deployment schemes are not met by the framework, but can be achieved by deploying the framework using a Docker overlay network instead of the Docker bridge network.


To evaluate the trained model, the model made predictions on a separate validation set. The model achieves an mAP of 90.52%, which is much higher than what current state-of-the-art models achieve on benchmark datasets. This shows that the model is capable of learning the thermal features, but is also overfitting on the data due to the temporal correlation between the training and validation sets. The model can predict in real time, achieving an average frame rate of 55 frames per second when making predictions on a GPU.

7.2 Future work

This dissertation proposed a framework and implemented a prototype of it, which realizes only a part of the total framework. Object detection using deep learning, in general and applied to thermal images in particular, is still a young field. Several extensions to this research are possible.

7.2.1 Security

The framework prototype did not implement any security measures. Because communications rely on an external network in distributed configurations, these measures should be implemented to reduce the risk of attacks. To allow the components to manipulate Docker containers, the Docker host socket was exposed. As stated before, this is a serious security risk, as the container gets root access to the host. Workarounds for this problem could be to implement a Docker-in-Docker environment [136] or to deploy the containers in a VM.
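As an illustration, a Docker-in-Docker environment can be started from the official docker:dind image, so that the framework talks to an inner daemon instead of the host's socket; a sketch using the Docker SDK for Python [96]:

import docker

host = docker.from_env()
inner = host.containers.run(
    "docker:dind",       # official Docker-in-Docker image [136]
    privileged=True,     # dind itself still requires privileged mode
    detach=True,
    name="framework-dockerd",
)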

7.2.2 Implementing a detection plugin

Due to the scope and time limit of the dissertation, a working prototype plugin containing a trained model for detecting objects in a video stream could not be made. A possible GStreamer pipeline for such a plugin is depicted in Figure 7.1. This plugin is a Consumer and receives video via the udpsrc element. Frames are decoded, and the raw video is presented to the appsink GStreamer plugin, which allows the video to be dumped into an application: the detection model that can generate predictions on the frame. The predicted frame is then forwarded to an appsrc GStreamer plugin that puts the predicted frame into a new pipeline to transmit it to further framework plugins. It should be tested whether the detection model can run in a Docker container, since it needs GPU support to be able to predict in real time. A solution could be to use nvidia-docker, which leverages NVIDIA GPU support in Docker containers [137].

Figure 7.1: GStreamer pipeline for a plugin with a detection model
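A minimal sketch of this pipeline using the GStreamer Python bindings; detect() is a placeholder for the trained model's inference, and caps negotiation on the appsrc is omitted for brevity:

import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst, GLib

Gst.init(None)

def detect(frame_bytes):
    return frame_bytes  # placeholder for the YOLOv3 prediction step

# Receiving side: RTP/MJPEG over UDP, decoded and handed to the application.
receive = Gst.parse_launch(
    "udpsrc port=5000 caps=application/x-rtp,encoding-name=JPEG,payload=26 "
    "! rtpjpegdepay ! jpegdec ! videoconvert "
    "! appsink name=sink emit-signals=true")

# Transmitting side: predicted frames are pushed into a new pipeline.
transmit = Gst.parse_launch(
    "appsrc name=src ! videoconvert ! jpegenc ! rtpjpegpay "
    "! udpsink host=127.0.0.1 port=5001")
appsrc = transmit.get_by_name("src")

def on_new_sample(sink):
    sample = sink.emit("pull-sample")
    buf = sample.get_buffer()
    frame = buf.extract_dup(0, buf.get_size())
    appsrc.emit("push-buffer", Gst.Buffer.new_wrapped(detect(frame)))
    return Gst.FlowReturn.OK

receive.get_by_name("sink").connect("new-sample", on_new_sample)
for pipeline in (receive, transmit):
    pipeline.set_state(Gst.State.PLAYING)
GLib.MainLoop().run()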


7.2.3 Different deployment configurations

The prototype of the framework only implemented one of the deployment configurations presented in Section 2.3.3. Other configurations can be explored by changing the Docker bridge network to a Docker overlay network.
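For illustration, with the Docker SDK for Python [96] the change amounts to creating the network with a different driver; a sketch (assuming the host participates in a Docker swarm, which overlay networks require):

import docker

client = docker.from_env()
network = client.networks.create(
    "framework-net",
    driver="overlay",   # instead of the default "bridge"
    attachable=True,    # allow standalone containers to join
)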

7.2.4 Multiple streams with different layouts

The prototype only implemented one stream with a chain-like layout. Future effort could implement support for multiple streams that run concurrently. The layout can be changed by implementing a plugin that can forward media to multiple destinations, or merge media coming from different sources, which is the concept of sensor fusion.

7.2.5 Implementing the plugin distribution service (Remote Producer/Consumer)

Chapter 2 presented the Remote Producer and Consumer that distribute the plugins available for the framework. This was deemed out of scope for the prototype, but could be implemented in future versions.

7.2.6 Using high-performance microservices backbone frameworks

The current implementation uses the Flask framework, which is excellent for prototyping but not ideal for high performance. Other frameworks, such as Vert.x, focus on high performance through asynchronous messaging, which could improve the performance of the framework.

7.2.7 New object detection models and datasets specifically for thermal images

Current effort in object detection models goes towards challenges on benchmark datasets of visual images, such as ImageNet and Pascal VOC. There are some thermal datasets publicly available for certain detection purposes, but these are very small compared to the visual image datasets. Future research could create new benchmark datasets, similar to the visual image datasets, specifically for thermal images.

Currently, publicly available pre-trained neural network models are designed for and trained on the visual image datasets. Future research could go towards designing an architecture specifically for thermal images and training a model on such a benchmark dataset.

Thermal images use several colormaps to map the relative temperatures in a scene onto colors representing warm and cold regions. Well-known examples are the Iron scheme (used in this dissertation), White-hot, and Black-hot. Some companies implement threshold colors that highlight very hot or very cold spots in an image (for examples, see [138, 139]). Future research could investigate how models trained on images using different color schemes differ in their predictions and performance. Thermal images could also benefit from radiometric information, which adds substantial information through a temperature dimension for each pixel in the image instead of the relative coloring. This information could lead to more accurate predictions.


Bibliography

[1] S. G. Gupta, M. M. Ghonge, and P. Jawandhiya, "Review of Unmanned Aircraft System," International Journal of Advanced Research in Computer Engineering & Technology, vol. 2, no. 4, pp. 2278–1323, 2013. ISSN: 2278-1323.
[2] M. Hassanalian and A. Abdelkefi, "Classifications, applications, and design challenges of drones: A review," 2017. DOI: 10.1016/j.paerosci.2017.04.003. [Online]. Available: http://ac.els-cdn.com/S0376042116301348/1-s2.0-S0376042116301348-main.pdf
[3] M. Joel, The Booming Business of Drones, 2013. [Online]. Available: https://hbr.org/2013/01/the-booming-business-of-drones (visited on 01/30/2018).
[4] DJI, Zenmuse H3-2D. [Online]. Available: https://www.dji.com/zenmuse-h3-2d (visited on 01/30/2018).
[5] Gimbal Guard, Drop & Delivery Device for DJI Mavic Pro. [Online]. Available: http://www.gimbal-guard.com (visited on 01/30/2018).
[6] FLIR Systems, Aerial Thermal Imaging Kits. [Online]. Available: http://www.flir.com/suas/aerial-thermal-imaging-kits (visited on 01/30/2018).
[7] R. Gade and T. B. Moeslund, "Thermal cameras and applications: a survey," Machine Vision and Applications, vol. 25, pp. 245–262, 2014. DOI: 10.1007/s00138-013-0570-5.
[8] M. C. Harvey, J. V. Rowland, and K. M. Luketina, "Drone with thermal infrared camera provides high resolution georeferenced imagery of the Waikite geothermal area, New Zealand," 2016. DOI: 10.1016/j.jvolgeores.2016.06.014.
[9] S. Amici, M. Turci, S. Giammanco, L. Spampinato, and F. Giulietti, "UAV Thermal Infrared Remote Sensing of an Italian Mud Volcano," vol. 2, pp. 358–364, 2013. DOI: 10.4236/ars.2013.24038.
[10] J. Bendig, A. Bolten, and G. Bareth, "Introducing a low-cost mini-UAV for thermal- and multispectral-imaging," 2012. [Online]. Available: https://www.int-arch-photogramm-remote-sens-spatial-inf-sci.net/XXXIX-B1/345/2012/isprsarchives-XXXIX-B1-345-2012.pdf


[11] Workswell, "Using the UAV Thermography for Cultivation and Phenotyping of Cereals," Tech. Rep., 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/Cultivation-and-Phenotyping-1.pdf
[12] A. J. Rivera, A. D. Villalobos, J. C. Monje, J. A. Mariñas, and C. M. Oppus, "Post-disaster rescue facility: Human detection and geolocation using aerial drones," IEEE Region 10 Annual International Conference, Proceedings/TENCON, pp. 384–386, 2017. ISSN: 2159-3450. DOI: 10.1109/TENCON.2016.7848026.
[13] P. Christiansen, K. A. Steen, R. N. Jørgensen, and H. Karstoft, "Automated detection and recognition of wildlife using thermal cameras," Sensors (Basel, Switzerland), vol. 14, no. 8, pp. 13778–93, Jul. 2014. ISSN: 1424-8220. DOI: 10.3390/s140813778.
[14] J. Zhang, J. Hu, J. Lian, Z. Fan, X. Ouyang, and W. Ye, "Seeing the forest from drones: Testing the potential of lightweight drones as a tool for long-term forest monitoring," Biological Conservation, vol. 198, pp. 60–69, 2016.
[15] D. Ventura, M. Bruno, G. Jona Lasinio, A. Belluscio, and G. Ardizzone, "A low-cost drone based application for identifying and mapping of coastal fish nursery grounds," Estuarine, Coastal and Shelf Science, vol. 171, pp. 85–98, Mar. 2016. ISSN: 0272-7714. DOI: 10.1016/j.ecss.2016.01.030.
[16] S. Chowdhury, A. Emelogu, M. Marufuzzaman, S. G. Nurre, and L. Bian, "Drones for disaster response and relief operations: A continuous approximation model," 2017. DOI: 10.1016/j.ijpe.2017.03.024.
[17] Workswell, "Pipeline inspection with thermal diagnostics," 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/pipeline.pdf
[18] Workswell, "Thermo diagnosis of photovoltaic power plants," 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/Workswell-WIRIS_photovoltaic.pdf
[19] Workswell, "Thermodiagnostics of flat roofs," 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/roof.pdf
[20] Workswell, "Thermodiagnostics in the power engineering sector," Tech. Rep., 2016. [Online]. Available: https://www.drone-thermal-camera.com/wp-content/uploads/highvoltage.pdf
[21] Workswell, Workswell WIRIS - Product - Thermal camera for drones, 2016. [Online]. Available: https://www.drone-thermal-camera.com/wiris (visited on 01/30/2018).
[22] TEAX Technology, ThermalCapture - Thermal Imaging Technology | Capture raw radiometric thermal data with drones. [Online]. Available: http://thermalcapture.com (visited on 01/30/2018).


[23] DJI, Zenmuse XT - unlock the possibilities of sight - DJI, 2018. [Online]. Available: https://www.dji.com/zenmuse-xt (visited on 01/30/2018).
[24] Workswell, SOFTWARE - Workswell WIRIS - Thermal camera for drones, 2016. [Online]. Available: https://www.drone-thermal-camera.com/software (visited on 01/31/2018).
[25] Therm-App, Therm-App™ - Android apps on Google Play, 2018. [Online]. Available: https://play.google.com/store/apps/details?id=com.thermapp (visited on 01/31/2018).
[26] B. Satzger, W. Hummer, C. Inzinger, P. Leitner, and S. Dustdar, "Winds of change: From vendor lock-in to the meta cloud," IEEE Internet Computing, vol. 17, no. 1, pp. 69–73, 2013. ISSN: 1089-7801. DOI: 10.1109/MIC.2013.19.
[27] J. Divya, Drone Technology and Usage: Current Uses and Future Drone Technology, 2017. [Online]. Available: http://uk.businessinsider.com/drone-technology-uses-2017-7 (visited on 01/31/2018).
[28] A. Boulanger, "Open-source versus proprietary software: Is one more reliable and secure than the other?" IBM Systems Journal, vol. 44, no. 2, pp. 239–248, 2005. ISSN: 0018-8670. DOI: 10.1147/sj.442.0239. [Online]. Available: http://ieeexplore.ieee.org/document/5386727
[29] M. Kazmeyer, Disadvantages of Proprietary Software. [Online]. Available: http://smallbusiness.chron.com/disadvantages-proprietary-software-65430.html (visited on 01/31/2018).
[30] B. Steffen and A. Seyfried, "Methods for measuring pedestrian density, flow, speed and direction with minimal scatter," Physica A: Statistical Mechanics and its Applications, vol. 389, no. 9, pp. 1902–1910, May 2010. ISSN: 0378-4371. DOI: 10.1016/j.physa.2009.12.015.
[31] M. Wirz, T. Franke, D. Roggen, E. Mitleton-Kelly, P. Lukowicz, and G. Tröster, "Inferring crowd conditions from pedestrians' location traces for real-time crowd monitoring during city-scale mass gatherings," Proceedings of the Workshop on Enabling Technologies: Infrastructure for Collaborative Enterprises, WETICE, pp. 367–372, 2012. ISSN: 1524-4547. DOI: 10.1109/WETICE.2012.26.
[32] E. Alpaydin, Introduction to Machine Learning, 3rd ed. MIT Press, 2014, p. 591. ISBN: 026201243X.
[33] J. W. Davis and V. Sharma, "Robust background-subtraction for person detection in Thermal Imagery," IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, Jan. 2004. ISSN: 2160-7516. DOI: 10.1109/CVPR.2004.431.
[34] W. Wang, J. Zhang, and C. Shen, "Improved Human Detection and Classification in Thermal Images," pp. 2313–2316, 2010.
[35] R. Appel, S. Belongie, P. Perona, and P. Dollár, "Fast Feature Pyramids for Object Detection," PAMI, vol. 36, no. 8, pp. 1–14, 2014. ISSN: 0162-8828. DOI: 10.1109/TPAMI.2014.2300479.
[36] T. Goedemé, "Projectresultaten VLAIO TETRA-project," KU Leuven, Louvain, Tech. Rep., 2017.


[37] L.-L. Slattery, DroneSAR wants to turn drones into search-and-rescue heroes, 2017. [Online]. Available: https://www.siliconrepublic.com/start-ups/dronesar-search-and-rescue-drone-software (visited on 05/26/2018).
[38] Amazon Web Services Inc., What Is Amazon Kinesis Video Streams?, 2018. [Online]. Available: https://docs.aws.amazon.com/kinesisvideostreams/latest/dg/what-is-kinesis-video.html (visited on 05/26/2018).
[39] U.S. Government, "Systems Engineering Fundamentals," Defence Acquisition University Press, no. January, p. 223, 2001. [Online]. Available: http://www.dtic.mil/docs/citations/ADA387507
[40] L. Bass, P. Clements, and R. Kazman, Software Architecture in Practice, 3rd ed. Addison-Wesley Professional, 2012. ISBN: 0321815734, 9780321815736.
[41] J. Greene and M. Stellman, Applied Software Project Management, 2006, p. 324. ISBN: 978-0596009489. [Online]. Available: http://www.oreilly.com/catalog/appliedprojectmgmt
[42] S. Barber, Acceptable application response times vs. industry standard, 2018. [Online]. Available: https://searchsoftwarequality.techtarget.com/tip/Acceptable-application-response-times-vs-industry-standard (visited on 05/28/2018).
[43] T. Burger, How Fast Is Realtime? Human Perception and Technology | PubNub, 2015. [Online]. Available: https://www.pubnub.com/blog/how-fast-is-realtime-human-perception-and-technology (visited on 05/28/2018).
[44] S.-t. Modeling, P. Glennie, and N. Thrift, "Time perception models," Neuron, pp. 15696–15699, 1992.
[45] M. Richards, Software Architecture Patterns, 1st ed., H. Scherer, Ed. O'Reilly Media, 2015. [Online]. Available: http://www.oreilly.com/programming/free/files/software-architecture-patterns.pdf
[46] C. Richardson, Microservice Architecture pattern, 2017. [Online]. Available: http://microservices.io/patterns/microservices.html (visited on 12/02/2017).
[47] P. Clements, F. Bachmann, L. Bass, D. Garlan, J. Ivers, R. Little, P. Merson, R. Nord, and J. Stafford, Documenting Software Architectures, 2nd ed. Boston: Pearson Education, Inc., 2011. ISBN: 0-321-55268-7.
[48] Object Management Group, "Unified Modeling Language v2.5.1," Dec. 2017. [Online]. Available: http://www.omg.org/spec/UML/2.5.1
[49] C. de la Torre, C. Maddock, J. Hampton, P. Kulikov, and M. Jones, Communication in a microservice architecture, 2017. [Online]. Available: https://docs.microsoft.com/en-us/dotnet/standard/microservices-architecture/architect-microservice-container-applications/communication-in-microservice-architecture (visited on 04/27/2018).
[50] H. Schulzrinne and S. Casner, "RTP Profile for Audio and Video Conferences with Minimal Control," 2003. [Online]. Available: https://tools.ietf.org/html/rfc3551
[51] D. Bull, Communicating Pictures: A Course in Image and Video Coding. Elsevier Science, 2014. ISBN: 9780080993744. [Online]. Available: https://books.google.be/books?id=PDZOAwAAQBAJ
[52] On-Net Surveillance Systems Inc., "MJPEG vs. MPEG4: Understanding the differences, advantages and disadvantages of each compression technique," 2006. [Online]. Available: www.onssi.com


[53] MAVLink, Introduction - MAVLink Developer Guide, 2013. [Online]. Available: https://mavlink.io/en (visited on 09/14/2017).
[54] H. Schlosser, Microservices trends 2017: Strategies, tools and frameworks - JAXenter, 2017. [Online]. Available: https://jaxenter.com/microservices-trends-2017-survey-133265.html (visited on 03/24/2018).
[55] A. Ronacher, Welcome to Flask - Flask Documentation (0.12), 2017. [Online]. Available: http://flask.pocoo.org/docs/0.12 (visited on 03/24/2018).
[56] F. Reyes, PythonDecorators, 2017. [Online]. Available: https://wiki.python.org/moin/PythonDecorators (visited on 04/27/2018).
[57] StackShare, Companies that use Flask and Flask Integrations, 2018. [Online]. Available: https://stackshare.io/flask (visited on 03/24/2018).
[58] Falcon, Falcon - Bare-metal web API framework for Python. [Online]. Available: https://falconframework.org/#sectionAbout (visited on 03/24/2018).
[59] StackShare, Companies that use Falcon and Falcon Integrations, 2018. [Online]. Available: https://stackshare.io/falcon (visited on 03/24/2018).
[60] A. Ronacher, Nameko for Microservices, 2015. [Online]. Available: http://lucumr.pocoo.org/2015/4/8/microservices-with-nameko (visited on 03/24/2018).
[61] C. Escoffier, Building Reactive Microservices in Java, 2017. ISBN: 9781491986264.
[62] C. Posta, Microservices for Java Developers. ISBN: 9781491963081.
[63] R. Dua, A. R. Raja, and D. Kakadia, "Virtualization vs Containerization to support PaaS," in IEEE International Conference on Cloud Engineering, 2014. ISBN: 9781479937660. DOI: 10.1109/IC2E.2014.41.
[64] D. Merkel, "Docker: Lightweight Linux Containers for Consistent Development and Deployment," 2014. [Online]. Available: http://delivery.acm.org/10.1145/2610000/2600241/11600.html (visited on 03/19/2018).
[65] Docker Inc., Docker for the Virtualization Admin, 2016, p. 12.
[66] Docker Inc., What is a Container?, 2018. [Online]. Available: https://www.docker.com/what-container (visited on 03/24/2018).
[67] M. Helsley, LXC: Linux container tools, 2009. [Online]. Available: https://www.ibm.com/developerworks/linux/library/l-lxc-containers (visited on 05/21/2018).
[68] J. Fink, "Docker: a Software as a Service, Operating System-Level Virtualization Framework," 2014. [Online]. Available: http://journal.code4lib.org/articles/9669 (visited on 03/19/2018).


[69] C. Wang, What is Docker? Linux containers explained, 2017. [Online]. Available: https://www.infoworld.com/article/3204171/linux/what-is-docker-linux-containers-explained.html (visited on 05/21/2018).
[70] CoreOS, rkt: a security-minded, standards-based container engine. [Online]. Available: https://coreos.com/rkt (visited on 03/24/2018).
[71] F. Xu, X. Liu, and K. Fujimura, "Pedestrian detection and tracking with night vision," IEEE Transactions on Intelligent Transportation Systems, vol. 6, no. 1, pp. 63–71, 2005. ISSN: 1524-9050. DOI: 10.1109/TITS.2004.838222.
[72] H. Nanda and L. Davis, "Probabilistic template based pedestrian detection in infrared videos," IEEE Intelligent Vehicles Symposium, Proceedings, vol. 1, pp. 15–20, 2003. DOI: 10.1109/IVS.2002.1187921.
[73] R. E. Schapire, "Explaining AdaBoost," Empirical Inference: Festschrift in Honor of Vladimir N. Vapnik, pp. 37–52, 2013. DOI: 10.1007/978-3-642-41136-6_5.
[74] P. Viola, M. J. Jones, and D. Snow, "Detecting pedestrians using patterns of motion and appearance," International Journal of Computer Vision, vol. 63, no. 2, pp. 153–161, 2005. DOI: 10.1109/ICCV.2003.1238422.
[75] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016. [Online]. Available: http://www.deeplearningbook.org
[76] J. R. R. Uijlings, K. E. A. van de Sande, T. Gevers, and A. W. M. Smeulders, "Selective Search for Object Recognition," Tech. Rep., 2012. DOI: 10.1007/s11263-013-0620-5.
[77] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Region-Based Convolutional Networks for Accurate Object Detection and Segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 1, pp. 142–158, 2014. ISSN: 0162-8828. DOI: 10.1109/TPAMI.2015.2437384. arXiv: 1311.2524.
[78] R. Girshick, "Fast R-CNN," in Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 1440–1448. ISBN: 9781467383912. DOI: 10.1109/ICCV.2015.169. arXiv: 1504.08083.
[79] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137–1149, 2016. ISSN: 0162-8828. DOI: 10.1109/TPAMI.2016.2577031. arXiv: 1506.01497.
[80] K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask R-CNN," arXiv, 2018. arXiv: 1703.06870v3.
[81] J. Dai, Y. Li, K. He, and J. Sun, "R-FCN: Object Detection via Region-based Fully Convolutional Networks," Tech. Rep., 2016. DOI: 10.1109/ICASSP.2017.7952132. arXiv: 1605.06409. [Online]. Available: http://arxiv.org/abs/1605.06409
[82] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," 2015. DOI: 10.1109/CVPR.2016.91. arXiv: 1506.02640. [Online]. Available: http://arxiv.org/abs/1506.02640
[83] J. Redmon and A. Farhadi, "YOLOv3: An Incremental Improvement," arXiv, 2018. [Online]. Available: https://pjreddie.com/media/files/papers/YOLOv3.pdf
[84] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, "SSD: Single Shot MultiBox Detector," arXiv, 2016. arXiv: 1512.02325v5.


[85] B. Zoph and Q. V. Le, "Neural Architecture Search with Reinforcement Learning," in ICLR, 2017, pp. 1–16. arXiv: 1611.01578v2.
[86] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, "Focal Loss for Dense Object Detection," arXiv, 2018. arXiv: 1708.02002v2.
[87] Facebook Inc., ONNX - About, 2017. [Online]. Available: https://onnx.ai/about (visited on 05/21/2018).
[88] TensorFlow, TensorFlow, 2018. [Online]. Available: https://www.tensorflow.org (visited on 05/21/2018).
[89] J. Huang, V. Rathod, C. Sun, M. Zhu, A. Korattikara, A. Fathi, I. Fischer, Z. Wojna, Y. Song, S. Guadarrama, and K. Murphy, "Speed/accuracy trade-offs for modern convolutional object detectors," arXiv, 2017. arXiv: 1611.10012v3.
[90] J. Redmon, Darknet: Open source neural networks in C, http://pjreddie.com/darknet, 2013–2016.
[91] Microsoft, The Microsoft Cognitive Toolkit | Microsoft Docs, 2018. [Online]. Available: https://docs.microsoft.com/en-us/cognitive-toolkit/index (visited on 05/21/2018).
[92] Docker Inc., Overview of Docker Compose | Docker Documentation, 2018. [Online]. Available: https://docs.docker.com/compose/overview (visited on 04/27/2018).
[93] Docker Inc., Use bridge networks | Docker Documentation, 2018. [Online]. Available: https://docs.docker.com/network/bridge (visited on 04/27/2018).
[94] A. Ronacher, Click Documentation (5.0), 2017. [Online]. Available: http://click.pocoo.org/5 (visited on 04/27/2018).
[95] A. K. Reitz, Requests: HTTP for Humans - Requests 2.18.4 documentation, 2018. [Online]. Available: http://docs.python-requests.org/en/master (visited on 05/09/2018).
[96] Docker Inc., Docker SDK for Python - Docker SDK for Python 2.0 documentation, 2018. [Online]. Available: https://docker-py.readthedocs.io/en/stable (visited on 05/12/2018).
[97] GStreamer, GStreamer: open source multimedia framework, 2018. [Online]. Available: https://gstreamer.freedesktop.org (visited on 05/13/2018).
[98] E. Walthinsen, filesrc: GStreamer Core Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gstreamer-plugins/html/gstreamer-plugins-filesrc.html (visited on 05/13/2018).
[99] E. Hervey, decodebin: GStreamer Base Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-base-plugins/html/gst-plugins-base-plugins-decodebin.html (visited on 05/13/2018).
[100] W. Taymans, jpegenc: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-jpegenc.html (visited on 05/13/2018).


[101] Axis Communications, rtpjpegpay: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-rtpjpegpay.html (visited on 05/13/2018).
[102] W. Taymans, udpsink: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-udpsink.html (visited on 05/13/2018).
[103] GStreamer, Basic tutorial 3: Dynamic pipelines. [Online]. Available: https://gstreamer.freedesktop.org/documentation/tutorials/basic/dynamic-pipelines.html (visited on 05/13/2018).
[104] W. Taymans, udpsrc: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-udpsrc.html (visited on 05/14/2018).
[105] W. Taymans, rtpjpegdepay: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-rtpjpegdepay.html (visited on 05/14/2018).
[106] A. Loonstra, "Videostreaming with Gstreamer." [Online]. Available: http://mediatechnology.leiden.edu/images/uploads/docs/wt2014_gstreamer.pdf
[107] W. Taymans, jpegdec: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-jpegdec.html (visited on 05/14/2018).
[108] J. Schmidt, autovideosink: GStreamer Good Plugins 1.0 Plugins Reference Manual. [Online]. Available: https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-good/html/gst-plugins-good-plugins-autovideosink.html (visited on 05/14/2018).
[109] A. Ronacher, Deployment Options - Flask 0.12 documentation, 2018. [Online]. Available: http://flask.pocoo.org/docs/0.12/deploying (visited on 05/14/2018).
[110] R. Yasrab, "Mitigating Docker Security Issues," University of Science and Technology of China, Hefei, Tech. Rep. [Online]. Available: https://arxiv.org/pdf/1804.05039.pdf
[111] lvh, Don't expose the Docker socket (not even to a container), 2015. [Online]. Available: https://www.lvh.io/posts/dont-expose-the-docker-socket-not-even-to-a-container.html (visited on 05/15/2018).
[112] Docker Inc., Use overlay networks | Docker Documentation, 2018. [Online]. Available: https://docs.docker.com/network/overlay/#customize-the-docker_gwbridge-interface (visited on 05/15/2018).
[113] J. W. Davis and M. A. Keck, "A Two-Stage Template Approach to Person Detection in Thermal Imagery," Proc. Workshop on Applications of Computer Vision, 2005. [Online]. Available: http://vcipl-okstate.org/pbvs/bench/papers/wacv05.pdf


[114] J. W. Davis and V. Sharma, "Background-subtraction using contour-based fusion of thermal and visible imagery," Computer Vision and Image Understanding, vol. 106, no. 2-3, pp. 162–182, 2007. DOI: 10.1016/j.cviu.2006.06.010.
[115] S. Hwang, J. Park, N. Kim, Y. Choi, and I. S. Kweon, "Multispectral Pedestrian Detection: Benchmark Dataset and Baseline," CVPR, 2015. [Online]. Available: https://sites.google.com/site/pedestrianbenchmark
[116] Z. Wu, N. Fuller, D. Theriault, and M. Betke, "A Thermal Infrared Video Benchmark for Visual Analysis," IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2014. DOI: 10.1109/CVPRW.2014.39.
[117] R. Miezianko, Terravic research infrared database.
[118] R. Miezianko, Terravic research infrared database.
[119] S. Z. Li, R. Chu, S. Liao, and L. Zhang, "Illumination Invariant Face Recognition Using Near-Infrared Images," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 4, pp. 627–639, 2007. DOI: 10.1109/TPAMI.2007.1014. [Online]. Available: http://vcipl-okstate.org/pbvs/bench/papers/NIR.pdf
[120] A. Akula, R. Ghosh, S. Kumar, and H. K. Sardana, "Moving target detection in thermal infrared imagery using spatiotemporal information," J. Opt. Soc. Am. A, vol. 30, no. 8, pp. 1492–1501, Aug. 2013. DOI: 10.1364/JOSAA.30.001492.
[121] R. I. Hammoud, IEEE OTCBVS WS Series Bench. [Online]. Available: http://vcipl-okstate.org/pbvs/bench (visited on 05/18/2018).
[122] Last Post Association, Mission, 2018. [Online]. Available: http://www.lastpost.be/en/the-last-post/mission (visited on 05/18/2018).
[123] FLIR Systems Inc., FLIR One Pro, 2017. [Online]. Available: https://www.flir.com/globalassets/imported-assets/document/17-1746-oem-flir_one_pro_datasheet_final_v1_web.pdf
[124] R. J. Ramana, Introduction to Camouflage and Deception. Defence Scientific Information & Documentation Centre, pp. 99–164.
[125] A. Bornstein and I. Richter, Microsoft Visual Object Tagging Tool. [Online]. Available: https://github.com/Microsoft/VoTT (visited on 05/20/2018).
[126] F. E. Grubbs, "Procedures for Detecting Outlying Observations in Samples," Technometrics, vol. 11, no. 1, pp. 1–21, Feb. 1969. DOI: 10.1080/00401706.1969.10490657.
[127] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A Large-Scale Hierarchical Image Database," in CVPR09, 2009. [Online]. Available: http://www.image-net.org/papers/imagenet_cvpr09.pdf


[128] D. Gupta, Transfer learning & The art of using Pre-trained Models in Deep Learning, 2017. [Online]. Available: https://www.analyticsvidhya.com/blog/2017/06/transfer-learning-the-art-of-fine-tuning-a-pre-trained-model (visited on 05/20/2018).
[129] Docker Inc., docker stats | Docker Documentation, 2018. [Online]. Available: https://docs.docker.com/engine/reference/commandline/stats (visited on 05/24/2018).
[130] M. Gori and A. Tesi, "On the Problem of Local Minima in Recurrent Neural Networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 14, no. 1, pp. 76–86, 1992. DOI: 10.1109/34.107014.
[131] L. Prechelt, "Early stopping - but when?" in Neural Networks: Tricks of the Trade, G. B. Orr and K.-R. Müller, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 1998, pp. 55–69. ISBN: 978-3-540-49430-0. DOI: 10.1007/3-540-49430-8_3.
[132] M. Everingham, L. Van Gool, C. K. Williams, J. Winn, and A. Zisserman, "The Pascal visual object classes (VOC) challenge," International Journal of Computer Vision, vol. 88, no. 2, pp. 303–338, 2010. ISSN: 0920-5691. DOI: 10.1007/s11263-009-0275-4.
[133] M. Everingham, S. M. A. Eslami, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, "The Pascal Visual Object Classes Challenge: A Retrospective," International Journal of Computer Vision, vol. 111, no. 1, pp. 98–136, 2014. ISSN: 1573-1405. DOI: 10.1007/s11263-014-0733-5.
[134] P. Henderson and V. Ferrari, "End-to-end training of object class detectors for mean average precision," Lecture Notes in Computer Science, vol. 10115 LNCS, pp. 198–213, 2017. ISSN: 1611-3349. DOI: 10.1007/978-3-319-54193-8_13. arXiv: 1607.03476.
[135] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, "Microsoft COCO: Common objects in context," Lecture Notes in Computer Science, vol. 8693 LNCS, no. PART 5, pp. 740–755, 2014. ISSN: 1611-3349. DOI: 10.1007/978-3-319-10602-1_48. arXiv: 1405.0312.
[136] Docker Inc., Library/docker, 2018. [Online]. Available: https://hub.docker.com/_/docker (visited on 06/01/2018).
[137] NVIDIA, nvidia-docker. [Online]. Available: https://github.com/NVIDIA/nvidia-docker (visited on 05/25/2018).
[138] FLIR, "FLIR One." [Online]. Available: http://www.flir.com/uploadedFiles/Store/Products/FLIR-ONE/3rd-Gen/FLIR-ONE/FLIR-ONE-Gen-3-Datasheet.pdf
[139] FLIR, "FLIR Boson," p. 2, 2016.


Appendix A

Firefighting department email conversations

This appendix contains the email conversations with different firefighting departments in Belgium, as part of an exploration of the functional requirements of an aerial thermal imaging solution. Note that all conversations were translated from Dutch to English.

A.1 General email sent to firefighting departments

This email was sent to the departments later mentioned in this appendix. The responses in the following sections are responses to this email.

Subject: Firefighting department - Thesis thermal drones

Dear Sir/Madam

My name is Brecht Verhoeve. I am a student Master of Science in Computer Science Engineering at Ghent University. I am contacting your department with reference to the research of my master's dissertation. I am currently researching the applications of thermal cameras in combination with commercial drones. They can create an aerial overview of scenes and objects that often can't be spotted with visual detectors, like hidden persons, fires, or hot explosives. The eventual goal is to let a computer indicate these objects of interest autonomously on the thermal images of the drone. These images could aid firefighters with their work.

For this research I have some questions for you.

Functionality

I have enlisted some functionalities which I believe could be interesting for a firefighter:

• Detection of persons in buildings (find potential victims)
• Detection of hidden fires in buildings (to identify danger zones)
• Detection of fires on vast terrains (forests, industrial terrains)


• Indication of hot explosives

I have two questions:

• Do you agree that these are the most important functions?
• Are there any other functions that you deem important?

Quality of the application: Next to the functionality, the quality of the application is also important. For me the most important aspects are:

• Accuracy: The software must be accurate. There is no room for errors when detecting.
• Speed: The software must operate quickly. An overview must be created quickly to not waste time in case of an emergency.
• Usability: The software must be easy to use.

Once again I have two questions:

• Do you agree with these qualities?
• Are there any other important qualities that you deem important?

I would like to thank you in advance for your time.

Best regards

Brecht Verhoeve

A.2 Conversation with the firefighting department of Antwerp, Belgium

The answers were given inline in the email; for clarity, they are given explicitly here.

Subject: Re: Firefighting department Antwerp - Thesis thermal drones

Answers can be found in your email.

Best regards

Functionality: Detection of hidden fires in buildings and environments. Are there any other functions that you deem important? Capture the evolution of a fire with the thermal camera. Visualise incidents during night-time. Capture invisible fires such as hydrogen or methane fires.


A.3 Conversation with the firefighting department of Ostend, Belgium

The answers were given inline in the email; for clarity, they are given explicitly here.

Subject: Re: Firefighting department Ostend - Thesis thermal drones

Dear Brecht

You can find the answers after the questions in your email.

Best Regards

Functionality: Are there any other functions that you deem important? These are the most important for us at the moment.

Quality of the application: Are there any other important qualities that you deem important? The application must work autonomously.

A.4 Conversation with the firefighting department of Courtrai, Belgium

Subject: Re: Firefighting department Courtrai - Thesis thermal drones

Dear Brecht

Beneath you will find our answers (next to the already mentioned items).

Functionality

• The detection of persons in a landscape. For example, for missing persons after a traffic accident, searches are conducted in the dark for victims that were catapulted from a vehicle. Today this is done via a thermal camera on the ground, but with a drone this could hold potential benefits. Another example is searching for missing persons in nature reserves. The police sometimes ask for the assistance of firefighters to search the area.

Quality of the application

• The images need to be processed in real time, not after the drone has landed.

The drones must be deployable for multiple purposes.

The interpretation of the images can be important for automatic flight control of drones in the future. Currently there is a European project "3D Safeguard" in which KU Leuven is participating. They are already quite advanced in interpreting the images from a drone to spot persons through smoke. With this information the drone can be redirected. The application can thus use the interpretations of the images to control the drone in flight.

Best regards

A.5 Conversation with the firefighting department of Ghent, Belgium

Subject: Re: Firefighting department Ghent - Thesis thermal drones


Hi Brecht

I don't know if you've received the previous email, but there you received answers to your questions.

Best regards

Subject: Re: Firefighting department Ghent - Thesis thermal drones

With respect to the functionality I would like to add:

• Measuring the temperature of containers/silos

I agree with the quality of the application. It could be handy to be able to view the application from one or more devices. Everything should have a clear overview. If possible, information and controls should be available on one screen.

I will follow up.

Best regards


Appendix B

Thermal camera specifications

This appendix gives all the specifications for the compared thermal cameras. First, the different cameras, their producing companies, and average retail prices are listed in Table B.1. Second, their respective physical specifications are presented in Table B.2. Third, the image qualities are presented in Table B.3. Fourth, the thermal precisions are presented in Table B.4. Fifth, the available interfaces to interact with each camera are presented in Table B.5. Sixth, the energy consumption of each camera is presented in Table B.6. Seventh, how support is offered when developing for these platforms is presented in Table B.7. Finally, auxiliary features are presented in Table B.8.


Product              Company       Price (Euro)
Wiris 2nd Gen 640    Workswell     9995.00
Wiris 2nd Gen 336    Workswell     6995.00
Duo Pro R 640        FLIR          6409.00
Duo Pro R 336        FLIR          4384.84
Duo                  FLIR          949.99
Duo R                FLIR          1239.99
Vue 640              FLIR          2689.00
Vue 336              FLIR          1259.93
Vue Pro 640          FLIR          4032.18
Vue Pro 336          FLIR          2302.61
Vue Pro R 640        FLIR          5184.56
Vue Pro R 336        FLIR          3455.99
Zenmuse XT 640       DJI x FLIR    11810.00
Zenmuse XT 336       DJI x FLIR    6970.00
Zenmuse XT 336 R     DJI x FLIR    9390.00
Zenmuse XT 640 R     DJI x FLIR    14230.00
One                  FLIR          237.99
One Pro              FLIR          469.00
Tau 2 640            FLIR          6746.36
Tau 2 336            FLIR          4933.89
Tau 2 324            FLIR          2640
Lepton 3 160 x 120   FLIR          259.95
Lepton 3 80 x 60     FLIR          143.38
Boson 640            FLIR          1222.09
Boson 320            FLIR          938.42
Quark 2 640          FLIR          331.65
Quark 2 336          FLIR          331.65
DroneThermal v3      Flytron       341.15
Compact              Seek Thermal  275.00
CompactXR            Seek Thermal  286.46
Compact Pro          Seek Thermal  599.00
Therm-App            Opgal         937.31
Therm-App TH         Opgal         2950.00
Therm-App 25 Hz      Opgal         1990.00
Table B.1: Compared cameras, their producing companies, and their average retail price


Product              Weight (g)    Dimensions (mm)
Wiris 2nd Gen 640    390           135 x 77 x 69
Wiris 2nd Gen 336    390           135 x 77 x 69
Duo Pro R 640        325           85 x 81.3 x 68.5
Duo Pro R 336        325           85 x 81.3 x 68.5
Duo                  84            41 x 59 x 30
Duo R                84            41 x 59 x 30
Vue 640              114           57.4 x 44.45 x 44.45
Vue 336              114           57.4 x 44.45 x 44.45
Vue Pro 640          92.14         57.4 x 44.45 x 44.45
Vue Pro 336          92.14         57.4 x 44.45 x 44.45
Vue Pro R 640        92.14         57.4 x 44.45 x 44.45
Vue Pro R 336        92.14         57.4 x 44.45 x 44.45
Zenmuse XT 640       270           103 x 74 x 102
Zenmuse XT 336       270           103 x 74 x 102
Zenmuse XT 336 R     270           103 x 74 x 102
Zenmuse XT 640 R     270           103 x 74 x 102
One                  34.5          67 x 34 x 14
One Pro              36.5          68 x 34 x 14
Tau 2 640            72            44.4 x 44.4 x 44.4
Tau 2 336            72            44.4 x 44.4 x 44.4
Tau 2 324            72            44.4 x 44.4 x 44.4
Lepton 3 160 x 120   0.9           11.8 x 12.7 x 7.2
Lepton 3 80 x 60     0.9           11.8 x 12.7 x 7.2
Boson 640            7.5           21 x 21 x 11
Boson 320            7.5           21 x 21 x 11
Quark 2 640          8             22 x 22 x 12
Quark 2 336          8             22 x 22 x 12
DroneThermal v3      3             20 x 20 x 15
Compact              14.17         25.4 x 44.4 x 20.3
CompactXR            14.17         25.4 x 44.4 x 25.4
Compact Pro          14.17         25.4 x 44.4 x 25.4
Therm-App            138           55 x 65 x 40
Therm-App TH         123           55 x 65 x 40
Therm-App 25 Hz      138           55 x 65 x 40
Table B.2: Physical specifications


Product | IR resolution (pixels) | SD resolution (megapixels) | Frequency (Hz) | FOV | Radiometry
Wiris 2nd Gen 640 | 640 x 512 | 1.92 | not specified | Various | yes
Wiris 2nd Gen 336 | 336 x 256 | 1.92 | not specified | Various | yes
Duo Pro R 640 | 640 x 512 | 12 | 30 | Various (lens) | yes
Duo Pro R 336 | 336 x 256 | 12 | 30 | Various (lens) | yes
Duo | 160 x 120 | 2 | 7.5 and 8.3 | 57° x 44° | no
Duo R | 160 x 120 | 2 | 7.5 | 57° x 44° | yes
Vue 640 | 640 x 512 | 0 | 7.5 | Various (lens) | no
Vue 336 | 336 x 256 | 0 | 7.5 | Various (lens) | no
Vue Pro 640 | 640 x 512 | 0 | 7.5 | Various (lens) | no
Vue Pro 336 | 336 x 256 | 0 | 7.5 | Various (lens) | no
Vue Pro R 640 | 640 x 512 | 0 | 7.5 | Various (lens) | yes
Vue Pro R 336 | 336 x 256 | 0 | 7.5 | Various (lens) | yes
Zenmuse XT 640 | 640 x 512 | 0 | 7.5 | Various (lens) | no
Zenmuse XT 336 | 336 x 256 | 0 | 7.5 | Various (lens) | no
Zenmuse XT 336 R | 336 x 256 | 0 | 7.5 | Various (lens) | yes
Zenmuse XT 640 R | 640 x 512 | 0 | 7.5 | Various (lens) | yes
One | 80 x 60 | 1.5 | 8.7 | 50° x 38° | yes
One Pro | 160 x 120 | 1.5 | 8.7 | 55° x 43° | yes
Tau 2 640 | 640 x 512 | 0 | 7.5 | Various (lens) | yes
Tau 2 336 | 336 x 256 | 0 | 7.5 | Various (lens) | yes
Tau 2 324 | 324 x 256 | 0 | 7.6 | Various (lens) | yes
Lepton 3 160 x 120 | 160 x 120 | 0 | 8.8 | 56° | available
Lepton 3 80 x 60 | 80 x 60 | 0 | 8.8 | 56° | no
Boson 640 | 640 x 512 | 0 | 9.0 | Various (lens) | no
Boson 320 | 320 x 256 | 0 | 9.0 | Various (lens) | no
Quark 2 640 | 640 x 512 | 0 | 9 | Various (lens) | no
Quark 2 336 | 336 x 256 | 0 | 9 | Various (lens) | no
DroneThermal v3 | 80 x 60 | 0 | 8.6 | 25° | no
Compact | 206 x 156 | 0 | 9 | 36° | no
CompactXR | 205 x 156 | 0 | 9 | 20° | no
Compact Pro | 320 x 240 | 0 | 15 | 32° | no
Therm-App | 384 x 288 | 0 | 8.7 | Various (lens) | no
Therm-App TH | 384 x 288 | 0 | 8.7 | Various (lens) | yes
Therm-App 25 Hz | 384 x 288 | 0 | 25 | Various (lens) | no

Table B3: Image quality.

IR = InfraRed, SD = Standard, FOV = Field of View.
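When comparing resolutions and fields of view for aerial use, it can help to translate them into an approximate ground footprint. The following sketch is not part of the specification tables; it applies a standard pinhole-model approximation, with the Duo R values (160 x 120 pixels, 57° x 44° FOV) taken from Table B3.

```python
# Illustrative only: approximate per-pixel ground resolution of a
# downward-facing thermal camera, using a pinhole-model approximation.
import math

def ground_sample_distance(altitude_m, fov_deg, pixels):
    """Meters on the ground covered by one pixel at the given altitude."""
    footprint = 2 * altitude_m * math.tan(math.radians(fov_deg) / 2)
    return footprint / pixels

# Duo R at 50 m altitude: horizontal ground sample distance.
gsd_h = ground_sample_distance(50, 57, 160)
print(f"{gsd_h:.2f} m per pixel horizontally")  # roughly 0.34 m
```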


Product | Sensitivity (mK) | Temperature range (degrees Celsius) | Accuracy (degrees Celsius)
Wiris 2nd Gen 640 | 50 | -25 to +150 / -40 to +550 | 2
Wiris 2nd Gen 336 | 50 | -25 to +150 / -40 to +550 | 2
Duo Pro R 640 | 50 | -25 to +135 / -40 to +550 | 5 / 20
Duo Pro R 336 | 50 | -25 to +135 / -40 to +550 | 5 / 20
Duo | not specified | -40 to +550 | 5
Duo R | not specified | -40 to +550 | 5
Vue 640 | not specified | -58 to +113 | not specified
Vue 336 | not specified | -58 to +113 | not specified
Vue Pro 640 | not specified | -58 to +113 | not specified
Vue Pro 336 | not specified | -58 to +113 | not specified
Vue Pro R 640 | not specified | -58 to +113 | not specified
Vue Pro R 336 | not specified | -58 to +113 | not specified
Zenmuse XT 640 | 50 | -40 to +550 | not specified
Zenmuse XT 336 | 50 | -40 to +550 | not specified
Zenmuse XT 336 R | 50 | -40 to +550 | not specified
Zenmuse XT 640 R | 50 | -40 to +550 | not specified
One | 150 | -20 to +120 | 3
One Pro | 150 | -20 to +400 | 3
Tau 2 640 | 50 | -40 to +550 | not specified
Tau 2 336 | 50 | -40 to +550 | not specified
Tau 2 324 | 50 | -40 to +550 | not specified
Lepton 3 160 x 120 | 50 | 0 to +450 | 5
Lepton 3 80 x 60 | 50 | 0 to +450 | 5
Boson 640 | 40 | 0 to +500 | not specified
Boson 320 | 40 | 0 to +500 | not specified
Quark 2 640 | 50 | -40 to +160 | not specified
Quark 2 336 | 50 | -40 to +160 | not specified
DroneThermal v3 | 50 | 0 to +120 | not specified
Compact | not specified | -40 to +330 | not specified
CompactXR | not specified | -40 to +330 | not specified
Compact Pro | 70 | -40 to +330 | not specified
Therm-App | 70 | 5 to +90 | 3
Therm-App TH | 70 | 0 to +200 | 2
Therm-App 25 Hz | 70 | 5 to +90 | 3

Table B4: Thermal precision.


Product | USB | MAVLink | HDMI
Wiris 2nd Gen 640 | Flash disk | yes | yes
Wiris 2nd Gen 336 | Flash disk | yes | yes
Duo Pro R 640 | Mini-USB | yes | micro-HDMI
Duo Pro R 336 | Mini-USB | yes | micro-HDMI
Duo | Mini-USB | yes | micro-HDMI
Duo R | Mini-USB | yes | micro-HDMI
Vue 640 | Mini-USB | no | no
Vue 336 | Mini-USB | no | no
Vue Pro 640 | Mini-USB | yes | optional
Vue Pro 336 | Mini-USB | yes | optional
Vue Pro R 640 | Mini-USB | yes | optional
Vue Pro R 336 | Mini-USB | yes | optional
Zenmuse XT 640 | Via DJI drone | Via DJI drone | Via DJI drone
Zenmuse XT 336 | Via DJI drone | Via DJI drone | Via DJI drone
Zenmuse XT 336 R | Via DJI drone | Via DJI drone | Via DJI drone
Zenmuse XT 640 R | Via DJI drone | Via DJI drone | Via DJI drone
Tau 2 640 | no | no | no
Tau 2 336 | no | no | no
Tau 2 324 | no | no | no
Lepton 3 160 x 120 | no | no | no
Lepton 3 80 x 60 | no | no | no
Boson 640 | yes | no | no
Boson 320 | yes | no | no
Quark 2 640 | no | no | no
Quark 2 336 | no | no | no
DroneThermal v3 | no | no | no
Compact | Smartphone storage | no | no
CompactXR | Smartphone storage | no | no
Compact Pro | Smartphone storage | no | no
Therm-App | Smartphone storage | no | no
Therm-App TH | Smartphone storage | no | no
Therm-App 25 Hz | Smartphone storage | no | no

Table B5: Interfaces.


Product | Power consumption (Watt) | Input voltage (Volt)
Wiris 2nd Gen 640 | 4 | 6 - 36
Wiris 2nd Gen 336 | 4 | 6 - 36
Duo Pro R 640 | 10 | 5.0 - 26.0
Duo Pro R 336 | 10 | 5.0 - 26.0
Duo | 2.2 | 5.0 - 26.0
Duo R | 2.2 | 5.0 - 26.0
Vue 640 | 1.2 | 4.8 - 6.0
Vue 336 | 1.2 | 4.8 - 6.0
Vue Pro 640 | 2.1 | 4.8 - 6.0
Vue Pro 336 | 2.1 | 4.8 - 6.0
Vue Pro R 640 | 2.1 | 4.8 - 6.0
Vue Pro R 336 | 2.1 | 4.8 - 6.0
Zenmuse XT 640 | Via DJI drone | Via DJI drone
Zenmuse XT 336 | Via DJI drone | Via DJI drone
Zenmuse XT 336 R | Via DJI drone | Via DJI drone
Zenmuse XT 640 R | Via DJI drone | Via DJI drone
One | approx. 1 h battery lifetime | battery
One Pro | approx. 1 h battery lifetime | battery
Tau 2 640 | 1.3 | 4.0 - 6.0
Tau 2 336 | 1.3 | 4.0 - 6.1
Tau 2 324 | 1.3 | 4.0 - 6.2
Lepton 3 160 x 120 | 0.65 | 3.1
Lepton 3 80 x 60 | 0.65 | 3.1
Boson 640 | 0.5 | 3.3
Boson 320 | 0.5 | 3.3
Quark 2 640 | 1.2 | 3.3
Quark 2 336 | 1.2 | 3.3
DroneThermal v3 | 0.15 | 3.3 - 5
Compact | Via smartphone | smartphone
CompactXR | Via smartphone | smartphone
Compact Pro | Via smartphone | smartphone
Therm-App | 0.5 | 5
Therm-App TH | 0.5 | 5
Therm-App 25 Hz | 0.5 | 5

Table B6: Energy consumption.


Product | Warranty (years) | User manual | Phone support | Email support | FAQs
Wiris 2nd Gen 640 | not specified | yes | yes | yes | yes
Wiris 2nd Gen 336 | not specified | yes | yes | yes | yes
Duo Pro R 640 | 1 | yes | yes | yes | yes
Duo Pro R 336 | 1 | yes | yes | yes | yes
Duo | 1 | yes | yes | yes | yes
Duo R | 1 | yes | yes | yes | yes
Vue 640 | 1 | yes | yes | yes | yes
Vue 336 | 1 | yes | yes | yes | yes
Vue Pro 640 | 1 | yes | yes | yes | yes
Vue Pro 336 | 1 | yes | yes | yes | yes
Vue Pro R 640 | 1 | yes | yes | yes | yes
Vue Pro R 336 | 1 | yes | yes | yes | yes
Zenmuse XT 640 | 0.5 | yes | yes | yes | yes
Zenmuse XT 336 | 0.5 | yes | yes | yes | yes
Zenmuse XT 336 R | 0.5 | yes | yes | yes | yes
Zenmuse XT 640 R | 0.5 | yes | yes | yes | yes
One | 1 | yes | yes | yes | yes
One Pro | 1 | yes | yes | yes | yes
Tau 2 640 | 1 | yes | yes | yes | yes
Tau 2 336 | 1 | yes | yes | yes | yes
Tau 2 324 | 1 | yes | yes | yes | yes
Lepton 3 160 x 120 | 1 | yes | yes | yes | yes
Lepton 3 80 x 60 | 1 | yes | yes | yes | yes
Boson 640 | 1 | yes | yes | yes | yes
Boson 320 | 1 | yes | yes | yes | yes
Quark 2 640 | 1 | yes | yes | yes | yes
Quark 2 336 | 1 | yes | yes | yes | yes
DroneThermal v3 | not specified | no | no | no | no
Compact | 1 | yes | yes | yes | yes
CompactXR | 1 | yes | yes | yes | yes
Compact Pro | 1 | yes | yes | yes | yes
Therm-App | 1 | yes | yes | yes | yes
Therm-App TH | 1 | yes | yes | yes | yes
Therm-App 25 Hz | 1 | yes | yes | yes | yes

Table B7: Help and support.


Product | Bluetooth | Wi-Fi | GPS | Mobile app | Storage
Wiris 2nd Gen 640 | no | on request | yes | no | yes
Wiris 2nd Gen 336 | no | on request | yes | no | yes
Duo Pro R 640 | yes | no | yes | yes | yes
Duo Pro R 336 | yes | no | yes | yes | yes
Duo | no | no | no | no | yes
Duo R | no | no | no | no | yes
Vue 640 | no | no | no | no | no
Vue 336 | no | no | no | no | no
Vue Pro 640 | yes | no | no | yes | yes
Vue Pro 336 | yes | no | no | yes | yes
Vue Pro R 640 | yes | no | no | yes | yes
Vue Pro R 336 | yes | no | no | yes | yes
Zenmuse XT 640 | Via DJI drone | Via DJI drone | Via DJI drone | yes | yes
Zenmuse XT 336 | Via DJI drone | Via DJI drone | Via DJI drone | yes | yes
Zenmuse XT 336 R | Via DJI drone | Via DJI drone | Via DJI drone | yes | yes
Zenmuse XT 640 R | Via DJI drone | Via DJI drone | Via DJI drone | yes | yes
One | no | no | no | yes | yes
One Pro | no | no | no | yes | yes
Tau 2 640 | no | no | no | no | yes
Tau 2 336 | no | no | no | no | yes
Tau 2 324 | no | no | no | no | yes
Lepton 3 160 x 120 | no | no | no | no | no
Lepton 3 80 x 60 | no | no | no | no | no
Boson 640 | no | no | no | no | no
Boson 320 | no | no | no | no | no
Quark 2 640 | no | no | no | no | no
Quark 2 336 | no | no | no | no | no
DroneThermal v3 | no | no | no | no | no
Compact | no | no | no | yes | yes
CompactXR | no | no | no | yes | yes
Compact Pro | no | no | no | yes | yes
Therm-App | no | no | no | yes | yes
Therm-App TH | no | no | no | yes | yes
Therm-App 25 Hz | no | no | no | yes | yes

Table B8: Auxiliary features.


Appendix C

Last Post thermal dataset summary

The goal of this appendix is to provide a summary of the layout of the Last Post thermal dataset. The data was captured on the following days: the 24th of March 2018 and the 2nd, 3rd, 4th, 5th, 9th, 10th, 11th and 12th of April 2018. For each date a small summary of the contents is given below. Each summary consists of a description of the conditions that day and a listing of the video files and their contents.
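A per-day layout such as this maps naturally onto a small metadata record per capture session. The sketch below is purely illustrative: the RecordingDay type and its field names are assumptions, with the values hand-transcribed from Section C1.

```python
# Illustrative sketch: per-day metadata for the dataset, hand-transcribed
# from the summaries in this appendix. Field names are assumptions, not
# part of the dataset itself.
from dataclasses import dataclass, field

@dataclass
class RecordingDay:
    date: str               # ISO date of the capture session
    hours: str              # local start - end time
    temp_range_c: tuple     # (min, max) outside temperature
    weather: str
    humidity_pct: int
    wind_kmh: float
    precipitation_cm: float
    videos: dict = field(default_factory=dict)  # file name -> description

DAY_2018_03_24 = RecordingDay(
    date="2018-03-24",
    hours="19:40 - 20:20",
    temp_range_c=(5, 12),
    weather="Clear",
    humidity_pct=76,
    wind_kmh=24,
    precipitation_cm=0.0,
    videos={
        "flir_20180324T195255.mp4":
            "Crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat; "
            "a large crowd gathers on the right.",
    },
)

# Example query: list all sessions without rain.
dataset = [DAY_2018_03_24]
print([d.date for d in dataset if d.precipitation_cm == 0.0])
```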

C1 24th of March 2018

Conditions

• Hours: 19:40 - 20:20
• Outside temperature range: 5 degrees Celsius - 12 degrees Celsius
• Clear
• Humidity: 76%
• Wind: 24 kilometers per hour
• Precipitation: 0 centimeters
• Visibility: 14 kilometers

Videos

• flir_20180324T195255.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, as well as a sign, a police car, several cars in the distance and small groups of people. A large crowd gathers on the right of the video.


• flir_20180324T195836.mp4: This video gives an overview of the inside of the Meningate during the ceremony. Many people can be seen watching the ceremony.
• flir_20180324T200421.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, as well as a sign, a police car, several cars in the distance and small groups of people. A large crowd is visible on the right side.
• flir_20180324T201448.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, as well as a sign, a police car, several cars in the distance and small groups of people. A large crowd is visible on the right side. The crowd is leaving the area after the ceremony.
• flir_20180324T202328.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, as well as a sign, a police car, several cars in the distance and small groups of people. A large crowd is visible on the right side. The crowd is leaving the area after the ceremony.

C2 2nd of April 2018

Conditions

• Hours: 19:40 - 20:20
• Outside temperature range: 9 degrees Celsius - 15 degrees Celsius
• Light rain
• Humidity: 74%
• Wind: 18 kilometers per hour
• Precipitation: 0.4 centimeters
• Visibility: 8.1 kilometers

Videos

• 2018-04-02 194733.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, as well as a sign, a police car, several cars in the distance and small groups of people, sometimes with umbrellas, passing through.
• 2018-04-02 194952.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, as well as a sign, a police car and several cars in the distance.


• 2018-04-02 195518.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, as well as a sign, a police car and several cars in the distance.
• 2018-04-02 201322.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen, as well as a sign, a police car and several cars in the distance. Crowds can be seen, as well as people holding umbrellas.

C3 3rd of April 2018

Conditions

• Hours: 20:00 - 20:30
• Outside temperature range: 8 degrees Celsius - 16 degrees Celsius
• Heavy rain
• Humidity: 79%
• Wind: 25 kilometers per hour
• Precipitation: 0.5 centimeters
• Visibility: 10.1 kilometers

Videos

• 2018-04-03 201227.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. People are holding umbrellas due to the heavy rain that day. Due to the rain and wind it was difficult to steady the camera, which can be seen in the shaky video.
• 2018-04-03 201727.mp4: In the beginning of the clip the camera is moving towards the other side of the Meningate; from 00:20 onwards the clip is useful. The video gives an overview of the bridge at the east side of the Meningate, where the Frenchlaan goes into the Menenstraat. The video shows people leaving the Meningate towards the busses at the other side of the bridge. Most people are holding umbrellas due to the heavy rain that day. The Meningate is in the bottom left of the picture. Several buildings can be seen in the distance. In the bottom right the water of the Kasteelgracht can be seen. Sometimes the wall of the Meningate can be seen on the left of the picture.
• 2018-04-03 202311.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. People are holding umbrellas due to the heavy rain that day. Due to the rain and wind it was difficult to steady the camera, which can be seen in the shaky video.


C4 4th of April 2018

Conditions

• Hours: 19:45 - 20:30
• Outside temperature range: 10 degrees Celsius - 14 degrees Celsius
• Cloudy
• Humidity: 87%
• Wind: 18 kilometers per hour
• Precipitation: 0 centimeters
• Visibility: 12.9 kilometers

Videos

• 2018-04-04 200052.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. In the video a dense crowd of people slowly starts to form in the bottom right of the picture. Most people are coming from the left of the video to join the crowd.
• 2018-04-04 200728.mp4: This video shows the inside of the Meningate and the ceremony of the Last Post. Some people are up close in front. The large crowd can be seen through the hall.
• 2018-04-04 200913.mp4: This video shows the inside of the Meningate and the ceremony of the Last Post. The video switches between MSX mode, the visual camera and the thermal camera to show the differences.
• 2018-04-04 202859.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. At the start of the video a crowd is seen in the bottom right. At the 01:00 mark the ceremony has ended and people are exiting the gate and coming onto the crossing. They form two rows to make place for the marching band exiting the Meningate, which can be seen marching through the crowd at the 02:50 mark.

C5 5th of April 2018

Conditions

• Hours: 19:45 - 20:30
• Outside temperature range: 11 degrees Celsius - 15 degrees Celsius


• Sunny
• Humidity: 77%
• Wind: 11 kilometers per hour
• Precipitation: 0 centimeters
• Visibility: 12.9 kilometers

Videos

• 2018-04-05 200217.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. In the video a dense crowd of people slowly starts to form in the bottom right of the picture. Most people are coming from the left of the video to join the crowd. The video shows the 15 minutes before the start of the ceremony.
• 2018-04-04 201838.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. The video shows the first ten minutes after the end of the ceremony. The crowd, which can be seen on the left, leaves towards the square.

C6 9th of April 2018

Conditions

• Hours: 19:45 - 20:30
• Outside temperature range: 9 degrees Celsius - 10 degrees Celsius
• Light rain
• Humidity: 99%
• Wind: 14 kilometers per hour
• Precipitation: 0 centimeters
• Visibility: 8.1 kilometers

Videos

• 2018-04-09 200007.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. People are coming from the left towards the Meningate on the right. Not many people are seen due to the rain that day.


• 2018-04-09-202302.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. People are leaving from the right of the Meningate towards the square.

C7 10th of April 2018

Conditions

• Hours: 19:45 - 20:30
• Outside temperature range: 14 degrees Celsius - 17 degrees Celsius
• Partly cloudy
• Humidity: 52%
• Wind: 13 kilometers per hour
• Precipitation: 0 centimeters
• Visibility: 12.9 kilometers

Videos

• 2018-04-10 195029.mp4: The video gives an overview of the bridge at the east side of the Meningate, where the Frenchlaan goes into the Menenstraat. On the video an army unit can be seen on the left. It can be seen that they are standing in a very structured way.
• 2018-04-10 195131.mp4: The video gives an overview of the bridge at the east side of the Meningate, where the Frenchlaan goes into the Menenstraat. On the video an army unit can be seen on the left. It can be seen that they are standing in a very structured way.
• 2018-04-10 195748.mp4: The video gives an overview of the bridge at the east side of the Meningate, where the Frenchlaan goes into the Menenstraat. On the video an army unit can be seen on the left. It can be seen that they are standing in a very structured way. Some people are moving around the crowd.
• 2018-04-10 200122.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. There is a big crowd that can be seen on the right. There are some schools there, so some people are wearing backpacks. It is quite warm and the cafe on the other side of the street has opened up its terrace.
• 2018-04-10 201427.mp4: The video gives an overview of the bridge at the east side of the Meningate, where the Frenchlaan goes into the Menenstraat. On the video an army unit can be seen on the left. It can be seen that they are standing in a very structured way.


Some people are moving around the crowd. The image is not rotated well; a well rotated image is found in 2018-04-10 201427_rotated.mp4.

• 2018-04-10 201515.mp4: This video shows the inside of the Meningate and the ceremony. A traditional 'Haka' from New Zealand soldiers can be heard in the video; the soldiers are difficult to spot due to thermal blurring, because many people are standing in one place.
• 2018-04-10 202558.mp4: The video gives an overview of the bridge at the east side of the Meningate, where the Frenchlaan goes into the Menenstraat. On the video an army unit can be seen on the left. It can be seen that they are standing in a very structured way. Some people are moving around the crowd. At the 02:00 mark the army unit marches to the end of the bridge. Very dense crowds can be seen afterwards. At 08:25 the army unit marches in a straight line towards the Meningate.

C8 11th of April 2018

Conditions

• Hours: 19:45 - 20:30
• Outside temperature range: 12 degrees Celsius - 16 degrees Celsius
• Sunny
• Humidity: 63%
• Wind: 14 kilometers per hour
• Precipitation: 0 centimeters
• Visibility: 12.9 kilometers

Videos

• 2018-04-11 200140.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen.
• 2018-04-11 200601.mp4: The video gives an overview of the bridge at the east side of the Meningate, where the Frenchlaan goes into the Menenstraat. A small crowd can be seen on the left of the video.
• 2018-04-11 201554.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. People start leaving the ceremony from the 01:20 mark.


C9 12th of April 2018

Conditions

• Hours: 19:45 - 20:30
• Outside temperature range: 11 degrees Celsius - 14 degrees Celsius
• Rain
• Humidity: 94%
• Wind: 8 kilometers per hour
• Precipitation: 0.1 centimeters
• Visibility: 3.2 kilometers

Videos

• 2018-04-12 195219.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. The hedge of the wall from where the video was filmed is sometimes visible due to the shaky camera. Not many people are seen due to the rain.
• 2018-04-12 201526.mp4: This video gives an overview of the crossing of the Menenstraat, Bollingstraat and Kauwekijnstraat. In the video the Meningate is in the bottom right. Several buildings are seen. The hedge of the wall from where the video was filmed is sometimes visible due to the shaky camera. Not many people are seen due to the rain. People are leaving towards the right.
