Post on 13-Apr-2017
transcript
UNIVERSITY OF SAO PAULO, BRAZIL
ICTAI 2016
Fire detection on unconstrained videos using color-aware spatial modeling and motion flow
Letricia P. S. Avalhais, Jose Rodrigues-Jr., Agma J. M. Traina
University of Sao Paulo
Institute of Mathematics and Computer Science
Sao Carlos, Brazil
Emergency context
Develop solutions to support an emergency command center using intelligent analysis of data provided by crowdsourcing.
OUTLINE
01 Introduction & Background
02 SPATFIRE Method
03 Experiments & Results
04 Conclusions
Automatic detection of fire in videos
‣ Motivation
o Take advantage of the cameras in mobile devices such as smartphones and tablets
o A low-cost and flexible alternative to fixed-location sensors
o Fast response to incidents such as fires and explosions
Goal
‣ Develop an effective solution to detect fire in unconstrained videos, focused on:
1. High coverage (recall)
2. Real-time response
Automatic detection of fire in video
‣ Methods from the literature
Static information only
o Rely mainly on color-based models from different color spaces: RGB, YCbCr, CIE Lab, and HSV
o Take advantage of the yellow-reddish appearance of fire
o May also combine shape or texture
o High false-positive rates due to ambiguity with non-fire objects of the same color
Alternative: incorporate dynamic features
Automatic detection of fire in video
‣ Methods from the literature
Dynamic information
o Generally combined with color models
o Temporal content: flickering patterns, background subtraction, shape variation
o Better performance than works that use only static information
o Assumptions: stationary cameras, controlled lighting conditions, short cropped video segments
o Does not fit the requirements of a crowdsourcing emergency system
SPATFIRE
SPAtio-Temporal segmentation of FIRe Events
‣ MAIN CONTRIBUTIONS
1. FPD - Fire-like Pixel Detector: a color model for spatial segmentation, based on the HSV color space, specifically tailored to the detection of fire-like regions
2. Motion compensation: an efficient technique to compensate for the camera motion observed in videos acquired with non-stationary cameras
3. Event segmentation: temporal segmentation of fire events in adverse, uncontrolled situations
SPATFIRE
OVERVIEW
[Pipeline diagram omitted; the output of the pipeline is the set of fire segments]
Spatial segmentation
FPD Color Model
[Equations (1) and (2) omitted; figure: visualization of the fire pixels in the HSV color space]
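The FPD equations themselves are not reproduced in this transcript. Purely as an illustration of what an HSV fire-like pixel test looks like, here is a minimal sketch; the hue/saturation/value thresholds below are hypothetical placeholders, not the values of the actual FPD model.

```python
import numpy as np

# Hypothetical thresholds for yellow-reddish, bright, saturated pixels.
# These are illustrative placeholders, NOT the FPD model's actual values.
H_MAX = 60.0   # hue in degrees: 0..60 spans red through yellow
S_MIN = 0.4    # fire regions tend to be saturated
V_MIN = 0.5    # and bright

def fire_like_mask(hsv):
    """hsv: float array of shape (H, W, 3), hue in [0, 360), S and V in [0, 1].
    Returns a boolean mask of candidate fire-like pixels."""
    h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]
    return (h <= H_MAX) & (s >= S_MIN) & (v >= V_MIN)

# Tiny usage example: a bright orange pixel and a bright blue pixel.
img = np.array([[[30.0, 0.9, 0.9],     # orange -> fire-like
                 [240.0, 0.9, 0.9]]])  # blue   -> not fire-like
print(fire_like_mask(img))  # [[ True False]]
```

A pure color test like this is exactly what produces the false positives discussed earlier, which is why SPATFIRE combines it with motion information.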
Motion estimation
OPTICAL FLOW
1. SPARSE FLOW ESTIMATION
‣ Match corner points from two consecutive frames on the regions of interest
‣ Harris corner detection
‣ Lucas-Kanade optical flow
2. DENSE FLOW ESTIMATION
‣ Match points sampled at uniform intervals in a grid
‣ Uses the "background" information
‣ Gunnar Farneback's optical flow
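To make the flow-estimation step concrete: in practice one would use pyramidal Lucas-Kanade and Farneback implementations (e.g. from OpenCV), but the core Lucas-Kanade idea can be sketched in a few lines of numpy as a single-window least-squares problem. This is a toy illustration, not the presented system's implementation.

```python
import numpy as np

def lucas_kanade_step(I1, I2):
    """Single-window Lucas-Kanade: least-squares solution of
    Ix*vx + Iy*vy = -It over all pixels of the window."""
    Iy, Ix = np.gradient(I1)        # gradients along rows (y) and columns (x)
    It = I2 - I1                    # temporal derivative
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    v, *_ = np.linalg.lstsq(A, -It.ravel(), rcond=None)
    return v                        # estimated flow (vx, vy)

# Synthetic check: a Gaussian blob translated by (0.6, 0.4) pixels.
ys, xs = np.mgrid[0:32, 0:32].astype(float)
def blob(cx, cy):
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / 50.0)

vx, vy = lucas_kanade_step(blob(16.0, 16.0), blob(16.6, 16.4))
print(round(float(vx), 2), round(float(vy), 2))  # close to 0.6 and 0.4
```

The real sparse estimator solves this same system only around Harris corners; the dense estimator instead produces a flow vector at every grid point.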
Non-stationary cameras
‣ Usually add an extra motion component from the camera movement
‣ Why is this a problem?
[Figures: sparse flow from the entire frame vs. sparse flow from the interest region]
Block-based motion compensation
BLOCK DOMINANT ORIENTATION
o Divide the frame into non-overlapping regions of 32 × 32 pixels and, for each block, compute the mean local flow [equation omitted]
ESTIMATE THE BACKGROUND MOTION FLOW
o Calculate the average of the orientations of the block dominant flows at the peak of the histogram, and define the approximated global background flow [equation omitted]
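The two steps above can be sketched as follows; this is a minimal numpy interpretation of the slide (block averaging, an orientation histogram over block means, and averaging the blocks in the peak bin), with the number of bins chosen here as an assumption, not taken from the paper.

```python
import numpy as np

def background_flow(flow, block=32, bins=32):
    """Sketch of block-based background estimation: average the flow per
    non-overlapping block, histogram the block orientations, and average
    the flows of the blocks that fall in the peak orientation bin."""
    h, w, _ = flow.shape
    means = []
    for r in range(0, h - block + 1, block):
        for c in range(0, w - block + 1, block):
            means.append(flow[r:r + block, c:c + block].reshape(-1, 2).mean(axis=0))
    means = np.array(means)
    ang = np.arctan2(means[:, 1], means[:, 0])         # block orientations
    hist, edges = np.histogram(ang, bins=bins, range=(-np.pi, np.pi))
    peak = hist.argmax()
    in_peak = (ang >= edges[peak]) & (ang < edges[peak + 1])
    return means[in_peak].mean(axis=0)                 # approximated global flow

# Usage: mostly uniform camera motion (1, 0) with one deviating region.
flow = np.tile([1.0, 0.0], (128, 128, 1))
flow[0:32, 0:32] = [0.0, 2.0]       # e.g. a fire region moving differently
print(background_flow(flow))        # close to [1. 0.]
```

Because the camera motion dominates most blocks, the peak of the orientation histogram isolates the background even when fire regions move differently.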
Feature vector representation and classification
The representation and classification are performed in the following steps:
1. Calculate the new compensated set of flows: for each original flow vector, the corresponding new flow is obtained by subtracting the estimated global background flow [equation omitted]
2. Calculate the histogram of oriented optical flow (32 bins) from the compensated set
3. Use an SVM classifier to determine the class (fire, not fire) using the histogram as its input
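Steps 1 and 2 can be sketched as below, assuming the compensation is a plain vector subtraction of the background flow (the slide's equation is omitted in this transcript); the SVM of step 3 would then consume the 32-dimensional histogram and is not implemented here.

```python
import numpy as np

def hoof_feature(flows, background, bins=32):
    """Subtract the estimated global background flow from each flow vector,
    then build a normalized histogram of the compensated orientations."""
    comp = flows - background                   # step 1: compensated flows
    ang = np.arctan2(comp[:, 1], comp[:, 0])    # orientations of each vector
    hist, _ = np.histogram(ang, bins=bins, range=(-np.pi, np.pi))
    return hist / max(hist.sum(), 1)            # step 2: 32-bin feature vector

# Usage with a few made-up flow vectors and background (1, 0):
flows = np.array([[1.5, 0.5], [1.0, 1.0], [0.5, 1.5]])
feat = hoof_feature(flows, background=np.array([1.0, 0.0]))
print(feat.shape)  # (32,)
```

In step 3 this vector would be passed to a trained SVM (e.g. scikit-learn's `SVC`) to decide fire vs. not fire.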
Experiments
‣ Evaluating the FPD color model
o How accurately does the FPD model select fire pixels?
[Figure: ground-truth fire pixels (set A), ground-truth non-fire pixels (set B), and the region C detected by the model]
TP = number of fire pixels in C
FP = number of non-fire pixels in C
FN = number of fire pixels in A − TP
TN = number of non-fire pixels in B − FP
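From these counts, the precision, recall, and F1 figures reported in the following slides are computed in the standard way; a small sketch with made-up counts (not values from the paper's experiments):

```python
def precision_recall_f1(tp, fp, fn):
    """Standard pixel-level metrics from the TP/FP/FN counts defined above."""
    precision = tp / (tp + fp)      # fraction of detected pixels that are fire
    recall = tp / (tp + fn)         # fraction of fire pixels that are detected
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Example with made-up counts:
p, r, f = precision_recall_f1(tp=8, fp=2, fn=2)
print(p, r, f)  # 0.8 0.8 0.8
```

High recall is the metric the goal slide prioritizes, since missing a real fire is costlier than a false alarm in the emergency-response setting.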
Experiments
‣ Evaluating the FPD color model
o The BoWFire dataset
• Training set: 80 cropped images of 50 × 50 pixels
• Test set: 226 images of various resolutions
‣ Comparison
o Çelik and Demirel [2009]
o Zhang et al. [2013]
o Chen et al. [2009]
[Figure: fire samples and non-fire samples]
Results
‣ BoWFire dataset (test)

Metric       FPD      Celik    Zhang    Chen
Precision    62.46%   52.8%    45.95%   37.2%
Recall       77%      67.7%    30.87%   84.8%
F1-measure   63.35%   53.23%   29.3%    45.13%

o Precision: FPD 18% higher than Celik
o Recall: Chen 10% higher than FPD
o F1-measure: FPD outperforms Celik by 19%
Results
‣ BoWFire dataset (training)

Metric       FPD      Celik    Zhang    Chen
Recall       85.81%   11.5%    31.4%    88%
F1-measure   92.3%    20.6%    47.8%    93.6%

o Recall: Chen 2.5% higher than FPD
o F1-measure: FPD and Chen nearly tied
Experiments
‣ Evaluating the SPATFIRE method
o How accurate is the resulting temporal segmentation?
[Figure: timeline of frames partitioned into fire segments and non-fire segments]
Experiments
‣ Evaluating the SPATFIRE method
o FireVid dataset
• Acquired from YouTube using web crawlers
• Keywords: "fire", "explosion", "flame", "burning"
• 83,675 frames labeled as "fire", "not-fire", or "ignore"
• Resolutions from 320 × 240 to 600 × 336 pixels; frame rates from 10 Hz to 30 Hz
o RESCUER dataset
• Videos from a fire simulation at an industrial area
• Balanced distribution of videos with resolutions from 320 × 240 to 1920 × 1080 pixels
• Also manually labeled as "fire", "not-fire", or "ignore"
Results
‣ FireVid dataset

Metric       SPATFIRE   Celik    Di Lascio
Precision    89.1%      79.16%   89.17%
Recall       63.7%      18.87%   51.37%
F1-measure   74.3%      30.48%   65.2%

o Precision: SPATFIRE and Di Lascio nearly tied
o Recall: SPATFIRE 24% higher than Di Lascio
o F1-measure: SPATFIRE outperforms Celik by 1.4x and Di Lascio by 14%
Results
‣ RESCUER dataset

Metric       SPATFIRE   Celik    Di Lascio
Precision    94.4%      78.6%    90.5%
Recall       73.62%     53.75%   56.1%
F1-measure   82.73%     63.82%   69.24%

o Precision: outperforms Celik by 20% and Di Lascio by 4.3%
o Recall: 37% and 31% higher than Celik and Di Lascio, respectively
o F1-measure: outperforms Celik by 29.6% and Di Lascio by 19.4%
Time evaluation
o Higher-resolution videos are resized so that the largest dimension is 600 pixels
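The resize rule can be sketched as below; the rounding behavior and the choice to leave already-small videos untouched are assumptions, since the slide only states the 600-pixel target.

```python
def resize_dims(w, h, largest=600):
    """Scale (w, h) so the largest dimension becomes `largest` pixels,
    preserving aspect ratio; downscale only (an assumption)."""
    scale = largest / max(w, h)
    if scale >= 1.0:                # already small enough: keep as-is
        return w, h
    return round(w * scale), round(h * scale)

print(resize_dims(1920, 1080))  # (600, 338)
print(resize_dims(320, 240))    # unchanged: (320, 240)
```

Capping the resolution bounds the per-frame cost of the spatial segmentation and flow estimation, which is what makes real-time response feasible on the larger RESCUER videos.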
Conclusions
‣ Combining static and dynamic information is key to detecting fire patterns
‣ The motion flow compensation technique helps reduce the influence of camera motion in videos shot by non-stationary cameras
‣ SPATFIRE effectively detects and segments fire events in unconstrained videos, outperforming state-of-the-art methods
Future Work
‣ Refine the background motion estimation by enlarging the time interval
‣ Apply spectral analysis to improve the spatial segmentation
‣ Explore the use of accelerometer data (when provided) to better determine the camera movement
‣ Propose alternative designs to monitor other circumstances, such as smoke, flood, and heavy wind