A Human Detection Model for UAV-Assisted
Emergency Management Information Systems
Chu Myaet Thwal, Kyi Thar, Ye Lin Tun, and Choong Seon Hong*
Department of Computer Science and Engineering, Kyung Hee University, South Korea
{chumyaet, kyithar, yelintun, cshong}@khu.ac.kr
Abstract
With the advancement of automation and communications technologies, Unmanned Aerial Vehicles (UAVs) have become valuable
assistants in several critical applications. By combining computer vision (CV) techniques with UAV technology, this paper proposes a
human detection model for UAV-assisted emergency management information systems. The proposed scheme consists of a control center
that monitors the emergency management procedures. After the human detection model is trained and tested at the control center, it is
mounted on UAVs that are deployed in search and rescue (SAR) operations. Applied in real-world scenarios, the proposed system is
expected to reduce the operational delay in acquiring situational information about disaster-affected areas in SAR operations
and thus increase the survival rate of the victims.
Keywords: Unmanned Aerial Vehicle (UAV), Human Detection, Computer Vision (CV), Search and Rescue (SAR)
1. Introduction
Naturally triggered disasters such as earthquakes, tsunamis,
wildfires, or floods can create catastrophic situations that
disrupt the environment and have an immediate impact on human
lives. According to statistical data from the World Health
Organization (WHO), around 90,000 people are killed and nearly
160 million people become victims of large-scale natural
disasters every year [1]. Regardless of the type of disaster or
the size of the affected area, the authorities must carry out
emergency management processes during or after a disaster. In
such situations, time is a critical factor in limiting the number
of victims and deaths. It is important to collect disaster
information and provide the first responders with accurate data
on the critical situation within a short time. Traditional methods
such as ground inspection and aerial scouting operated by humans
can be time-consuming and have proven to be unreliable [2]. It can
also be difficult for a human rescuer to search for and detect
survivors at the damaged site.
With the advancement of automation and communications
technologies, Unmanned Aerial Vehicles (UAVs) have become
ubiquitous and popular assets, providing services and functions
in a wide range of crucial applications. Being cost-effective
and efficient, UAVs are broadly used in the fields of aerial
photography, package delivery, surveillance, search and rescue
(SAR), and military operations [3, 4]. In this paper, we integrate
UAVs into an emergency management information system to acquire
accurate and sufficient information for the authorities to manage
SAR operations. The automation and agility of UAVs, which let them
access a disaster area with ease, may improve the situational
awareness that supports SAR operations. To recognize and locate
survivors via UAV cameras, we design a deep learning model that
applies artificial neural networks to computer vision (CV) tasks.
Based on convolutional neural network (CNN) layers, we develop an
object detection model that can identify humans in images obtained
from UAV cameras by applying the transfer learning approach [5]
to a pre-trained model.
Thus, our objective is to provide an automated human detection
model, mounted on UAVs, that assists emergency management
information systems in mitigating the operational delay and
preserving human lives in critical situations. Applied in
real-world scenarios, the proposed scheme can help to acquire the
relevant information and detect survivors in disaster-affected
areas. Moreover, it is expected to drastically reduce the time
needed to accomplish SAR operations compared to the traditional
methods operated by humans. Our contributions are summarized as
follows:
• We design the system architecture of an automated search and
rescue operation to jointly work with UAVs.
• We analyze the potential of transfer learning and develop an
object detection model by fine-tuning the parameters of a pre-
trained model.
• We train our model on a relevant aerial image dataset to be
able to recognize humans via UAV cameras and deploy it in
the emergency management information system.
• We design a procedure to update our model with real-time
images and data obtained from the deployed UAVs to further
improve the model performance.
2020년 한국컴퓨터종합학술대회 논문집
2. System Model and Problem Formulation
Fig. 1 shows the system architecture of a search and rescue
operation that works jointly with our proposed emergency
management information system. We consider a region where natural
disasters occur frequently, with a control center that monitors
the emergency management processes. A set of UAVs
$\mathcal{U} = \{u_1, u_2, \ldots, u_U\}$ works under the
administration of the control center to acquire emergency
information and assist in SAR operations. Our system architecture
consists of three stages for utilizing the proposed scheme in
real-world scenarios: i) Model development stage, ii) Model
deployment stage, and iii) Model improvement stage.
Figure 1: System model.
At the model development stage, we adapt the Single Shot
Detector (SSD) MobileNet-V2 model [6], which is pre-trained on the
MS-COCO dataset [7] consisting of 330k images with several
features for object detection tasks. We apply the transfer learning
technique to the adapted model to fine-tune the model parameters
and develop an object detection model that can automatically
recognize humans in images obtained via UAV cameras. Then, we
train and test our model on the Semantic Drone dataset [8], which
contains 400 aerial images at a size of 6000 × 4000 px (24 Mpx),
taken by a high-resolution UAV camera at an altitude of 5 to 30
meters above the ground. We divide the dataset into three sets:
350 samples for the training set, 40 samples for the validation set,
and 10 samples for the test set. The accuracy of the model is
determined at the control center and measured by the average
precision and recall of the predictions. Precision measures how
many of the model's detections are correct, and recall measures
how many of the humans present in an image the model detects.
They are calculated by:
$$\text{Precision} = \frac{TP}{TP + FP} \tag{1}$$
$$\text{Recall} = \frac{TP}{TP + FN} \tag{2}$$
where TP (true positive) is the case in which our model detects a
human that actually exists in the image, FP (false positive) is
the case in which our model mistakenly detects a human that does
not exist in the image, and FN (false negative) is the case in
which our model fails to detect a human that exists in the image.
After training and testing the model locally at the control
center, we deploy the model in real-world scenarios.
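As a minimal sketch of Eqs. (1) and (2), the counts TP, FP, and FN can be obtained by matching predicted boxes against ground-truth boxes. The greedy matching rule, the intersection-over-union (IoU) threshold of 0.5, and all function names below are assumptions made for illustration; the matching criterion actually used in the evaluation is not stated here.

```python
from typing import List, Tuple

# A box is (ymin, xmin, ymax, xmax) in any consistent unit (assumed layout).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_matches(preds: List[Box], truths: List[Box],
                  iou_threshold: float = 0.5) -> Tuple[int, int, int]:
    """Greedily match each prediction to the best unmatched ground truth.

    Returns (TP, FP, FN). The threshold 0.5 is an illustrative assumption.
    """
    unmatched = list(truths)
    tp = fp = 0
    for p in preds:
        best = max(unmatched, key=lambda t: iou(p, t), default=None)
        if best is not None and iou(p, best) >= iou_threshold:
            unmatched.remove(best)  # each ground truth is matched at most once
            tp += 1
        else:
            fp += 1
    fn = len(unmatched)  # ground-truth humans that no prediction matched
    return tp, fp, fn


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Eq. (1) and Eq. (2); returns 0.0 when a denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

For example, with two ground-truth humans and two predictions of which one overlaps a ground-truth box, the sketch yields TP = 1, FP = 1, and FN = 1, so both precision and recall equal 0.5.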
At the model deployment stage, our proposed model is mounted
on UAVs that assist in SAR operations under the management of the
control center. The UAVs are employed to collect the relevant
information on critical situations and to detect survivors in
disaster-affected areas. As the UAVs inspect an area, the images
acquired via the UAV cameras are processed as the inputs to our
proposed model. The model detects each person present in the
images and generates a detection box around each one as an
output. The output images and the locations of the survivors are
forwarded to the control center for evaluating the model
performance and planning the SAR operations. Fig. 2 demonstrates
our model applied to one of the samples in the test set.
Figure 2: Demonstration for human detection.
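The step from raw model output to detection boxes can be sketched as follows. The sketch assumes the detector returns, per image, boxes normalized to [0, 1] in (ymin, xmin, ymax, xmax) order together with a confidence score and a class ID for each box, which is the layout used by the TensorFlow Object Detection API. The class ID for humans, the score threshold, and the function name are hypothetical values chosen for illustration, not settings taken from this work.

```python
from typing import List, Sequence, Tuple

PERSON_CLASS_ID = 1    # hypothetical label-map ID for the human class
SCORE_THRESHOLD = 0.5  # hypothetical confidence cut-off

# (xmin, ymin, xmax, ymax) in pixels, plus the confidence score
PixelDetection = Tuple[int, int, int, int, float]


def select_person_boxes(
    boxes: Sequence[Sequence[float]],
    scores: Sequence[float],
    classes: Sequence[int],
    image_height: int,
    image_width: int,
    person_class_id: int = PERSON_CLASS_ID,
    score_threshold: float = SCORE_THRESHOLD,
) -> List[PixelDetection]:
    """Keep confident human detections and convert them to pixel coordinates."""
    results: List[PixelDetection] = []
    for (ymin, xmin, ymax, xmax), score, cls in zip(boxes, scores, classes):
        if cls != person_class_id or score < score_threshold:
            continue  # drop other classes and low-confidence detections
        results.append((
            round(xmin * image_width),
            round(ymin * image_height),
            round(xmax * image_width),
            round(ymax * image_height),
            score,
        ))
    return results
```

The pixel boxes returned by such a function could then be drawn on the image and forwarded to the control center along with the image itself.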
At the model improvement stage, we use the output images and
data generated by the model at the end of the deployment stage and
update the model to improve the performance and the precision of
detection boxes.
3. Performance Evaluations
Figure 3: Localization loss of detection boxes.
In this section, we analyze the performance of our proposed
human detection model through simulations on the Semantic Drone
dataset. We run our simulations on the Google Colab GPU backend
using the TensorFlow API [9]. The results shown here cover 10,000
training steps. Fig. 3 shows the localization loss for the
bounding box offset prediction, computed as the sum of squared
errors. The loss decreases gradually over the training steps and
reaches its minimum value of 0.3142 at training step 10,000.
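To make the loss definition concrete, the following is a minimal sketch of a sum-of-squared-errors localization loss over bounding box offsets. It assumes each box is described by four offset values and that the loss is summed without any normalization or weighting; both are assumptions for illustration, and the function name is hypothetical.

```python
from typing import Sequence


def sse_localization_loss(
    predicted_offsets: Sequence[Sequence[float]],
    target_offsets: Sequence[Sequence[float]],
) -> float:
    """Sum of squared differences between predicted and target box offsets."""
    if len(predicted_offsets) != len(target_offsets):
        raise ValueError("predicted and target offsets must have the same length")
    return sum(
        (p - t) ** 2
        for pred, target in zip(predicted_offsets, target_offsets)
        for p, t in zip(pred, target)
    )
```

A perfect prediction gives a loss of zero, and the loss grows quadratically with the offset error, so large localization mistakes are penalized more heavily than small ones.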
Figure 4: Detection boxes precision.
Fig. 4 shows the mean average precision (mAP) of the detection
boxes, which increases with the training steps. Our model
achieves the maximum mAP value of 0.7818 at training step
10,000.
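For readers unfamiliar with the metric, the sketch below computes the average precision (AP) for a single class from a list of scored detections, as the area under the interpolated precision-recall curve. It assumes each detection has already been marked as a true or false positive at a fixed IoU threshold. Evaluation toolkits differ in their interpolation scheme and in the IoU thresholds they average over, so this is an illustration of the idea rather than the exact procedure behind the reported value; the function name is hypothetical.

```python
from typing import List, Sequence


def average_precision(
    is_true_positive: Sequence[bool],
    scores: Sequence[float],
    num_ground_truth: int,
) -> float:
    """Area under the interpolated precision-recall curve for one class."""
    if num_ground_truth == 0:
        return 0.0
    # Rank detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    precisions: List[float] = []
    recalls: List[float] = []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)
    # Interpolate: precision at each rank becomes the best precision
    # achievable at that recall level or any higher one.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Sum the area of the rectangles under the interpolated curve.
    ap = 0.0
    previous_recall = 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - previous_recall) * p
        previous_recall = r
    return ap
```

With a single object class, as in human detection, the mean over classes reduces to the AP of that one class.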
Figure 5: Detection boxes recall.
Fig. 5 shows the average recall (AR) of the model over 100
detections per image with respect to the training steps. Our model
achieves the maximum AR value of 0.8276 at training step
10,000.
4. Conclusions
In this paper, we proposed the architecture of a search and
rescue operation that works jointly with UAVs, for application in
the smart city domain. According to the evaluation results, our
proposed scheme can be deployed in emergency management
information systems to provide the authorities with relevant
data, thereby reducing the operational delay and increasing the
survival rate of the victims in search and rescue operations. As
future work, we aim to develop a model that can distinguish
between safe and injured victims and integrate it into UAVs, so
that the authorities can prioritize rescuing the injured victims
while UAVs provide safe victims with first-aid supplies.
Acknowledgement
This work was supported by Institute of Information & communications
Technology Planning & Evaluation (IITP) grant funded
by the Korea government (MSIT) (No. 2019-0-01287, Evolvable
Deep Learning Model Generation Platform for Edge Computing).
*Dr. CS Hong is the corresponding author.
References
[1] “Natural events,” Aug 2012. [Online]. Available:
https://www.who.int/environmental_health_emergencies/natural_events/en/
[2] N. Zhao, W. Lu, M. Sheng, Y. Chen, J. Tang, F. R. Yu, and K. Wong,
“UAV-assisted emergency networks in disasters,” IEEE Wireless Communications,
vol. 26, no. 1, pp. 45–51, 2019.
[3] M. Mozaffari, W. Saad, M. Bennis, Y. Nam, and M. Debbah, “A tutorial on
UAVs for wireless networks: Applications, challenges, and open problems,”
IEEE Communications Surveys & Tutorials, vol. 21, no. 3, pp. 2334–2360, 2019.
[4] C. M. Thwal and C. S. Hong, “A UAV-assisted intelligent delivery system for
smart city,” 2019.
[5] C. Tan, F. Sun, T. Kong, W. Zhang, C. Yang, and C. Liu, “A survey on deep
transfer learning,” CoRR, vol. abs/1808.01974, 2018. [Online]. Available:
http://arxiv.org/abs/1808.01974
[6] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen,
“MobileNetV2: Inverted residuals and linear bottlenecks,” in The IEEE Conference
on Computer Vision and Pattern Recognition (CVPR), June 2018.
[7] T. Lin, M. Maire, S. J. Belongie, L. D. Bourdev, R. B. Girshick, J. Hays,
P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, “Microsoft COCO:
common objects in context,” CoRR, vol. abs/1405.0312, 2014. [Online].
Available: http://arxiv.org/abs/1405.0312
[8] “News.” [Online]. Available: https://www.tugraz.at/index.php?id=22387
[9] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S.
Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp,
G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg,
D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens,
B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan,
F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and
X. Zheng, “TensorFlow: Large-scale machine learning on heterogeneous
systems,” 2015, software available from tensorflow.org. [Online]. Available:
http://tensorflow.org/