
Decision Making and Reasoning with Uncertain Image and Sensor Data

Pramod K Varshney, Kishan G Mehrotra, Chilukuri K Mohan

Main Themes

Decentralized decision-making
Multiple uncertain information streams
Dynamically changing environments
Algorithms for realistic battlefield scenarios

What is the agent’s current location?

What activities are other agents involved in?

What is the likelihood of damage at various locations?

What would be the safest paths to a goal/exit zone?

Main Contributions

Scenario recognition from video sequences
Improved activity recognition with audio+video information
Development of new algorithms for path planning in a battlefield
Formulation of path planning as a multi-objective optimization problem
Development of a new multi-objective evolutionary algorithm

1. Scenario Recognition and Classification

Event recognition and scene analysis with real time visual and audio information

Problem Formulation

Detect moving objects and classify activities
Identify sounds indicative of specific events
Quantify uncertainty in activity classification
Develop an enhanced scene representation by integrating audio and visual information
Related work

1.1. Video Component

Goal: To detect and track moving objects and classify activity in real time
Input: Real-time video stream
Output: Detected moving object and activity classification

Video Processing Pipeline (cont’d.)

Goal: Recognition of a moving object’s activities from a sequence of images (video)

Sequence of Frames -> Low Level Processing (filtering, detection, tracking, feature extraction) -> Extracted Features -> High Level Processing (frame classification, scenario recognition) -> Extracted Scenarios

Video Processing Pipeline

Real-time Video Acquisition -> Detection -> Tracking -> Feature Extraction -> Classification -> Scene Description Generator -> Visualization

Features Extracted

Aspect Ratio (AR) = d / (a+b+c)
Relative Upper Density (RUD) = a / (a+b+c)
Relative Middle Density (RMD) = b / (a+b+c)
Relative Lower Density (RLD) = c / (a+b+c)
Velocity and centroid

[Figure: bounding box of the moving object, labeled with the quantities a, b, c and d]
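The four ratio features can be computed directly once a, b, c and d have been measured. The sketch below is only an illustration: the function name `shape_features` and the sample measurements are assumptions, not taken from the slides.

```python
def shape_features(a, b, c, d):
    """Compute AR, RUD, RMD and RLD from the four figure quantities a, b, c, d."""
    total = a + b + c
    if total <= 0:
        raise ValueError("a + b + c must be positive")
    return {
        "AR": d / total,   # Aspect Ratio
        "RUD": a / total,  # Relative Upper Density
        "RMD": b / total,  # Relative Middle Density
        "RLD": c / total,  # Relative Lower Density
    }


# Hypothetical measurements, chosen only to exercise the function.
features = shape_features(a=3, b=4, c=3, d=2)
```

By construction the three relative densities always sum to 1, so only two of them carry independent information.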

Video Feature Analysis: Example

Feature   Walking   Bending
AR        0.2       0.3
RUD       0.3       0.2
RMD       0.4       0.5
RLD       0.3       0.3

Figure 1

Figure 2

Classification Algorithms Used for Activity Detection

Multi-module back-propagation neural network
Inductive Decision Tree Learning (C5) algorithm
Control Chart Approach
Bayesian networks

Visualization of Activity with Uncertainty Measure

Example activities shown here: sitting, bending and standing
Uncertainty is calculated from the classifier output for each event
The blue pointer indicates the level of certainty in the classifier decision

Control Chart Approach for Video Activity Classification

A control chart shows the variation in the values of a feature over time, with a graphical depiction of the upper and lower control limits for that feature.

High level detection with control charts:

1. Identification of each activity.
2. Recognition of when the activity begins and ends.

Control Chart Example (with Upper and Lower Control Limits for each activity)

[Figure: aspect ratio (y-axis, 0.00 to 1.20) versus frame number (x-axis, 1 to 81), with the upper control limit (UCL) and lower control limit (LCL) drawn for standing, sitting and bending]

1.2. Why Audio?

Role of Audio Component

Obtain information which may not be acquired visually
Provide additional comprehensive information enriching the scene context
Because of the large number of potential sounds to identify, the scope of the problem is vast

Audio Processing

Goal: To detect and classify sounds indicative of specific events
Input: A sample of sound in real time
Output: Detected class of specific sound
Example: Sound samples indicate specific objects/events such as explosions and vehicles

What’s New?

Fusion of audio and video for surveillance and scene analysis
New audio features: spectrum shape modeling coefficients

Audio Processing Pipeline

Audio acquisition -> Feature computation (histogram features, spectral features, relative band energies, linear predictive coding / cepstral coefficients) -> Choose features -> Multi-module back-propagation neural networks

Audio Features

Amplitude histogram features (width, symmetry, skewness and kurtosis, calculated on a histogram of a 3-second clip)
Spectral centroid and zero crossing rate
Relative band energies
Linear predictive coding coefficients
Cepstral coefficients
Spectrum shape modeling coefficients: what and why?

Audio Enhanced Visual Processing

[Diagram blocks: Video Acquisition; Sound Acquisition; Video Processing and Classification; Audio Processing and Classification; Fusion; Uncertainty; Description Generation; Visualization]

Audio Visual Classes

3 classes of video events: sitting, standing, bending
4 classes of sound events: silence, clear speech, babble or speech in noise, alarm sounds (smoke detector class)

Prototype Demonstration

Experimental Results - Video

Sub-scenario recognition accuracy of the Control Chart approach:

Video   Number of frames   Number of sub-scenarios   Number of recognized sub-scenarios
1       823                11                        10
2       512                6                         6
3       701                12                        12
4       514                9                         10

Experimental Results - Video

We used 4 different video sequences. Of a total of 2250 feature vectors, 1072 were used for training and the remaining 1478 were used for testing.

Classification accuracy using different methods:

Method                              Accuracy
Neural Network (back-propagation)   91.34%
Decision Tree (C5)                  92.86%
Naive Bayesian Network              89.61%
Control Chart                       95.70%

Experimental Results - Audio

In this 4-class problem, we obtain a classification accuracy of
  92% on recorded data (off-line classification)
  75% for real-time classification in the laboratory acoustic environment
    Acoustics of each environment can be different, leading to misclassifications
    Characteristics of the recording equipment

1.3. Representation Scheme

Audio and visual processing yields information about scene context
Need for a representation scheme for the acquired audio-video information
Generation of a document containing audio-visual information, which can be further processed

XML Based Description

We chose an XML based representation
  Widely accepted standard for information exchange
  More comprehensive forms such as XML Schema will be used for representation
  MPEG standards use XML based audio-visual content management
  Semi-structured, allowing for addition of user-defined data and information
An XML based representation allows for standardization, flexibility and extensibility
Automatic generation of XML based description
Descriptor gives the state of the observed scenario over a certain time period
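A rough sketch of automatic descriptor generation, using Python's standard `xml.etree.ElementTree` module. Every element and attribute name here (`SceneDescriptor`, `Header`, `MovingObject`, `Activity`, `Sound`, `certainty`) is hypothetical; the slides do not give the actual schema, and the sample values are made up.

```python
import xml.etree.ElementTree as ET


def build_descriptor(start, end, moving_objects, sounds):
    """Serialize one observation period into an XML string (hypothetical schema)."""
    root = ET.Element("SceneDescriptor")
    header = ET.SubElement(root, "Header")
    ET.SubElement(header, "StartTime").text = start
    ET.SubElement(header, "EndTime").text = end
    for obj in moving_objects:
        node = ET.SubElement(root, "MovingObject", id=str(obj["id"]))
        feats = ET.SubElement(node, "Features")
        for name, value in obj["features"].items():
            ET.SubElement(feats, name).text = f"{value:.2f}"
        activity = ET.SubElement(node, "Activity", certainty=f"{obj['certainty']:.2f}")
        activity.text = obj["activity"]
    for snd in sounds:
        node = ET.SubElement(root, "Sound", certainty=f"{snd['certainty']:.2f}")
        node.text = snd["label"]
    return ET.tostring(root, encoding="unicode")


# Made-up observation for illustration only.
xml_text = build_descriptor(
    "00:00:00",
    "00:00:03",
    [{"id": 1, "features": {"AR": 0.3, "RUD": 0.2}, "activity": "bending", "certainty": 0.91}],
    [{"label": "alarm", "certainty": 0.75}],
)
```

Because the format is semi-structured, further user-defined elements could be appended under the root without breaking consumers that only read the fields they know.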

Example Descriptor

[Figure: example XML descriptor, with callouts for the header and for the moving object's features and activity class]

Complete descriptor

Descriptor Utility

The combined audio-visual descriptor can serve as a base for
  Data mining for unusual events or correlation between events and activities
  Building case libraries of interesting scenarios or for particular cases
  Audio-visual fusion and visualization

Discussion

We have shown the feasibility of activity recognition using combined video and audio information.
Future work: integration, extension, elaboration
Next section (path planning): after activity recognition, the battlefield decision-maker must act.

2. Personnel Movement Planning in a Battlefield

Path computation algorithms for risk minimization

2.1 Path Planning in a Battlefield

Goal: To determine (escape) paths for personnel in a battlefield
Input: A node-weighted graph in which each node represents a geographical location in the battlefield and its weight is the associated risk
Quality Measure: The quality of an escape path is determined by the cumulative risk of the path

Problem Formulation

A path P is a non-cyclic sequence (L1, L2, ..., Ln), where L1 is the initial location of personnel, Ln is a target or exit point, and each Li is adjacent to Li+1 in the graph.

Determine escape paths which maximize the path quality Q(P), defined as

$$Q(P) = \sum_{i=1}^{n} \log\bigl(1 - \mathrm{risk}(L_i)\bigr)$$
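The quality measure is straightforward to evaluate for a candidate path. The sketch below assumes the per-location risks are given as a list of probabilities in [0, 1); the function names are illustrative. If the risks at different locations are additionally assumed to be independent, exp(Q(P)) is the probability of traversing the whole path without damage, which is one way to read the sum of logarithms.

```python
import math


def path_quality(risks):
    """Q(P): sum of log(1 - risk) over the locations on the path."""
    if any(not 0.0 <= r < 1.0 for r in risks):
        raise ValueError("each risk must lie in [0, 1)")
    return sum(math.log(1.0 - r) for r in risks)


def survival_probability(risks):
    """exp(Q(P)); meaningful only under the independence assumption above."""
    return math.exp(path_quality(risks))


# Hypothetical risk values: three mildly risky nodes versus one very risky node.
safe = path_quality([0.1, 0.1, 0.1])
risky = path_quality([0.0, 0.5])
```

Q(P) is never positive, and a higher (less negative) value means a safer path, so the three-node path above is preferred over the shorter one that crosses a node with risk 0.5.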

Modeling Risks

We define risk as the probability of occurrence of a high level of damage to personnel traversing a path

Two probabilistic risk models
  Gaussian distribution: models risks due to specific events such as explosions and chemical threats
  Beta distribution: models risks due to the distribution of events through the entire geographical region

Modeling Risks with Gaussian Distribution

Algorithms for Path Planning

Uniform Cost Search: finds the optimal solution (Dijkstra’s algorithm)
Simulated Annealing
Evolution Strategies (ES)
  µ+1 ES
  Stochastic ES
  Evolutionary Quenching Strategy (EQS)

Evolution Strategies

Initialize population
Generate offspring at each iteration from a population of size µ
Replacement strategy
  µ+1 ES: deterministic replacement; only offspring of higher quality are accepted
  Stochastic ES: probability of replacement is equal to min[1, Q(offspring)/Q(parent)]

Key Principle of EQS

An evolution strategy which accepts solutions of lower quality with a probability that decreases as the number of iterations increases (annealing principle)
Enables escape from local optima during the early stages of the algorithm
Emphasizes convergence to the optimal solution at the later stages of the algorithm

Optimal Route Planning for Battlefield Risk Minimization

[Figure: grid maps showing source and goal locations, with a legend for high risk, moderate risk, low risk and risk free regions]

Optimal Route Planning for Battlefield Risk Minimization (Contd.)

Simulation Results

The algorithms were simulated on a 100x100 grid with 15 target nodes on the periphery of the grid.
In all instances of the problem, EQS approximates the optimal solution, outperforming Simulated Annealing and the other variants of ES.
EQS and the other variants of ES require relatively little computation time (21 seconds) compared to uniform cost search (470 seconds).

Performance Comparison of Different Algorithms with a Gaussian Distribution for Risk Values

2.2 Multi-Objective Path Planning

In a battlefield, a path can be evaluated with respect to different objectives.
Some crucial aspects of a path to be considered are:
  Cumulative risk
  Length of the path
  Reward associated with the target node

Multi-objective Evolutionary Algorithms

Goal: To discover a set of non-dominated solutions with significant diversity
Evolutionary algorithms are best suited for multi-objective optimization since they simultaneously explore multiple solutions

Multi-objective Evolutionary Algorithms (Contd.)

We have implemented three multi-objective evolutionary algorithms for the path planning problem

Pareto Archived Evolution Strategy: J. D. Knowles and D. W. Corne, “On Metrics for comparing non dominated sets,” in Proc. IEEE Congress on Evolutionary Computation (CEC02), pp. 711-716, 2002.
Non-dominated Sorting Genetic Algorithm: K. Deb, S. Agarwal, A. Pratap, and T. Meyarivan, “A fast and elitist multi-objective genetic algorithm: NSGA II,” in Proc. Parallel Problem Solving from Nature VI, pp. 849-858, 2000.
Evolutionary Multi-objective Crowding Algorithm

Evolutionary Multi-objective Crowding Algorithm (EMOCA)

EMOCA considers crowding density in data space for path planning
Mating opportunities are given to better quality as well as substantially different individuals
A stochastic acceptance criterion is used, which depends on the crowding density difference between parent and offspring

EMOCA Main steps

Multi-objective Problem Scenario

[Figure: grid map showing the source and three goals (Goal-1, Goal-2, Goal-3), with a legend for high risk, moderate risk, low risk and risk free regions]

Multi-objective Problem Scenario (contd.)

Paths are evaluated with respect to three different measures: risk, path length and reward
Difficult tradeoffs exist: for example, should personnel follow a riskier path to increase the probability of finding a greater reward?

Illustrating Mutually Non Dominating Paths

[Figure: map showing the source, goal1, goal2, goal3 and the paths P1, P2 and P3, with a legend for high risk, moderate risk, low risk and risk free regions]

Path Quality with respect to Different Measures

Path   Risk   Path length   Reward
P1     0.7    9             0.2
P2     0.2    14            0.5
P3     0.7    12            1

Best Choice of Path

W-risk   W-path length   W-reward   Best path
Low      High            Low        P1
High     Low             Low        P2
Low      Low             High       P3

Performance Comparison

We have used a well-known metric, the C metric, for performance comparison. Smaller values of the C metric indicate better performance.
We have also obtained C metric values over multiple trials, comparing the solutions obtained by different algorithms for each trial

Simulation Results

EMOCA outperforms NSGA II and PAES for results obtained over 100 trials
EMOCA obtains more non-dominated solutions and has lower C metric values than the other algorithms.
The results indicate that EMOCA performs best for the path planning application

C-metrics for Various Pair-wise Algorithm Comparisons

Algorithm1                  Algorithm2   C(Algorithm2, Algorithm1)
EMOCA (without crossover)   PAES         0.15
EMOCA (with crossover)      PAES         0.00
EMOCA (with crossover)      NSGA II      0.06

Discussion

Efficient algorithms for risk minimization
Near-optimal solutions
Modeled path planning as a multi-objective optimization problem
Developed a new algorithm (EMOCA) outperforming state of the art multi-objective evolutionary algorithms

Future Work

Develop multi-objective evolutionary algorithms for other battlefield applications such as wireless sensor networks employed in surveillance systems
Develop algorithms for dynamic path planning
Multiple object detection and tracking, and work on a multi-camera platform
Develop a comprehensive library of recognizable sounds to provide richer context information
New methodologies for audio-visual fusion
Integration with VGIS

Mutation

The mutation step consists of replacing a randomly chosen edge of the path by another sub path between the same nodes.

In mutating the path a b c d e, a randomly chosen edge of the path, say c d, is replaced by an alternate sub-path c f h d, yielding a b c f h d e
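One possible way to realize this mutation on an adjacency-list graph is sketched below. The slides do not say how the alternate sub-path is found; here a breadth-first search is used, which is an assumption, as are the function names and the optional `edge_index` argument (added only so the example is reproducible). Nodes already on the path are excluded from the detour so that the mutated path stays non-cyclic.

```python
import random
from collections import deque


def alternate_subpath(graph, u, v, forbidden):
    """Shortest u-v path that skips the direct edge (u, v) and avoids forbidden nodes."""
    queue = deque([[u]])
    seen = {u}
    while queue:
        path = queue.popleft()
        for nxt in graph[path[-1]]:
            if path[-1] == u and nxt == v:
                continue  # this is the edge being replaced
            if nxt == v:
                return path + [v]
            if nxt not in seen and nxt not in forbidden:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def mutate(path, graph, edge_index=None, rng=random):
    """Replace one edge of the path by an alternate sub-path between its endpoints."""
    i = rng.randrange(len(path) - 1) if edge_index is None else edge_index
    u, v = path[i], path[i + 1]
    detour = alternate_subpath(graph, u, v, forbidden=set(path) - {u, v})
    if detour is None:
        return list(path)  # no alternative exists; keep the parent unchanged
    return path[:i] + detour + path[i + 2:]


# Small undirected graph containing the example from the text.
graph = {
    "a": ["b"], "b": ["a", "c"], "c": ["b", "d", "f"], "d": ["c", "e", "h"],
    "e": ["d"], "f": ["c", "h"], "h": ["f", "d"],
}
child = mutate(list("abcde"), graph, edge_index=2)  # replaces edge c-d
```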

Simulated Annealing: main steps

Initialize population: straight-line shortest paths from source node to target node
Mutation of parent to produce offspring
Stochastic replacement with probability

$$1 - e^{(Q(\text{offspring}) - Q(\text{parent}))/\text{temperature}}$$


Multi-objective Optimization: Preliminaries

The solution to a multi-objective optimization problem is a set of non-dominated vectors.
A solution vector x dominates a solution vector y (x >> y) if and only if

$$\forall i \in \{1, \dots, m\}: f_i(x) \ge f_i(y), \quad \text{and} \quad \exists j \in \{1, \dots, m\}: f_j(x) > f_j(y)$$

where m is the number of objectives. x and y are mutually non-dominating if neither dominates the other.
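The dominance test translates almost literally into code. The sketch below assumes every objective is to be maximized, so quantities to be minimized, such as risk and path length, are negated first; the objective vectors are illustrative values, not results from the study.

```python
def dominates(x, y):
    """True if x is at least as good as y in every objective and strictly better in one."""
    return all(a >= b for a, b in zip(x, y)) and any(a > b for a, b in zip(x, y))


def non_dominated(solutions):
    """Keep the solutions that no other solution in the list dominates."""
    return [s for s in solutions if not any(dominates(o, s) for o in solutions)]


# Illustrative objective vectors in the form (-risk, -path length, reward).
paths = {
    "P1": (-0.7, -9, 0.2),
    "P2": (-0.2, -14, 0.5),
    "P3": (-0.7, -12, 1.0),
    "P4": (-0.8, -15, 0.1),  # worse than P1 in every objective
}
front = non_dominated(list(paths.values()))
```

P1, P2 and P3 each win on a different objective, so none dominates another and all three remain on the front, while P4 is removed.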

EMOCA: Main Steps

Initialize
Generate mating population
Generate offspring by crossover, mutation
Create a new pool consisting of some parents and some offspring
Trim new pool to generate population of next iteration

Crossover

The Two Point Path Crossover operator (2PTPX) is less disruptive and preserves a major portion of the parent paths.
Consider two parent paths S N1 N3 E1 and S N2 N4 E2, where N1 and N2 are at least four path lengths away from E1 and E2, and nodes N3 and N4 are a few edges away from N1 and N2, respectively.
The crossover operator then generates the offspring S N1 N4 E2 and S N2 N3 E1.

Pareto Archived Evolution Strategy (PAES)

Uses a local search strategy and maintains an archive of non-dominated solutions.
Parent is mutated to produce offspring
If offspring dominates parent, it is accepted
If offspring and parent are non-dominated, then the acceptance decision is based on the squeeze factor of the solutions.

Non-dominated Sorting Genetic Algorithm (NSGA II)

Generates offspring population of size N from mating population of size N by crossover and mutation
Uses binary tournament to select mating pairs
A non-dominated sorting on the combined population (parent + offspring) is used to obtain the mating population for the next iteration

Crowding density

The data space crowding density of a path P is defined as L/E, where L is the number of paths in the current population passing through each edge of path P, and E is the total number of edges in path P
A relatively low crowding density indicates that path P does not share many edges with other paths in the population, giving it a relatively high diversity rank.
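The definition leaves some room for interpretation. The sketch below takes L to be the sum, over the edges of P, of the number of paths in the population that contain that edge, and it treats edges as undirected; both choices are assumptions, as are the function names. Whether P itself is counted depends on whether the caller includes it in `population`.

```python
def edges(path):
    """Undirected edges of a path given as a sequence of nodes."""
    return [frozenset(pair) for pair in zip(path, path[1:])]


def crowding_density(path, population):
    """L / E under the interpretation described above."""
    path_edges = edges(path)
    if not path_edges:
        raise ValueError("path must contain at least one edge")
    edge_sets = [set(edges(p)) for p in population]
    shared = sum(sum(1 for es in edge_sets if e in es) for e in path_edges)
    return shared / len(path_edges)


# Toy population: the first two paths share the edge a-b.
population = [list("abcd"), list("abed"), list("xy")]
density = crowding_density(population[0], population)
```

Under this reading, a path that shares no edge with any other member of a population that includes it has density exactly 1, and every shared edge raises the value.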

Salient features of EQS

The acceptance probability of EQS depends on $\tau = \frac{c + (1-c)\,i}{G} - \gamma$, where i is the current iteration, G is the maximum number of iterations, and c and $\gamma$ are algorithm parameters.
During the initial stages of the algorithm, when i = 0, $\tau = c/G - \gamma$, and the probability of acceptance is high.
During the later stages of the algorithm, when i approaches G, $\tau = c/G + (1-c) - \gamma$, and the probability of accepting the offspring is relatively low.

Trimming New Pool

The new pool is sorted based on the primary criterion of non-domination rank and the secondary criterion of diversity rank
The new population will consist of the first N elements of the sorted list, which contains solutions grouped into different fronts F1, F2, ..., Fn, where elements of Fi+1 are dominated only by elements in F1, F2, ..., Fi.

New Pool Generation

The offspring is compared with one of the parents to form the new pool. There are three possible cases:
Case 1: If the offspring dominates the parent, then the offspring is added to the new pool.
Case 2: If dominated by the parent, the offspring is added to the new pool with probability 1 - exp(density(offspring) - density(parent)), where density denotes the crowding density.
Case 3: Otherwise, if the offspring has a lower crowding density than the parent, then it is added to the new pool, else the parent is added to the new pool.

Mating Population Generation

Binary tournament selection is iterated to create the mating pool
In each step, two randomly chosen members of the current population are compared
The tournament to determine who enters the mating population is won by the solution with the lower total rank, which is the sum of its non-domination rank and diversity rank

Squeeze factor

The squeeze factor of a candidate solution is the number of archive elements located in the same cell of the objective function space, assuming that this space is a finite hypercube divided into $(2^d)^m$ equal-sized, non-overlapping hypercubes.

C-metric

C metric – calculates the fraction of solutions in one non-dominated set that are dominated by the non-dominated solutions of the other set.
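A minimal sketch of the C metric as described here, assuming every objective is to be maximized and using strict dominance exactly as the sentence above states it. The function names and the sample sets are illustrative.

```python
def dominates(x, y):
    """x is no worse than y in every objective and strictly better in at least one."""
    return all(a >= b for a, b in zip(x, y)) and any(a > b for a, b in zip(x, y))


def c_metric(set_a, set_b):
    """C(A, B): fraction of the solutions in B dominated by some solution in A."""
    if not set_b:
        raise ValueError("set_b must not be empty")
    covered = sum(1 for y in set_b if any(dominates(x, y) for x in set_a))
    return covered / len(set_b)


# Illustrative two-objective sets.
set_a = [(3, 3), (1, 4)]
set_b = [(2, 2), (0, 5), (1, 3)]
c_ab = c_metric(set_a, set_b)
c_ba = c_metric(set_b, set_a)
```

The metric is not symmetric, so both C(A, B) and C(B, A) are normally reported. In this toy case A covers two of the three solutions in B, while B covers none of A.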

Significance of audio features

Histogram features
  Features calculated on the histogram: width, symmetry, skewness, kurtosis
  Clear voice has an asymmetric, broad histogram
  Voice in noise has a narrower histogram, and is more symmetric
  Useful in detecting modulations in sound

Other sound environments

We conducted experiments
  To classify the following environments: air conditioned rooms, construction site, factory, rail tunnel, warehouse
  To distinguish between types of power tools in a construction setting: drills, hammers, generators, compressors, electric motors

Significance of audio features (cont’d)

Spectral centroid and zero crossing rate model the spectral distribution and the dominant frequency (pitch) of sound
Band relative energies measure the energy in several spectral bands. Speech mostly contains energy in the band below 1 kHz, whereas alarms might have a different distribution
LPC coefficients and cepstral coefficients give a direct indication of the sampled sound in the time and quefrency domains, respectively

Complete XML descriptor

Related Work

Interpretation system of dynamic scenes, INRIA France, 2003.
Robust, Online Event Detection and Classification for Video Monitoring (Cornell University)
Video Surveillance and Monitoring (Carnegie Mellon University, 2000)
Work dealing with situational context learning, such as Computational Auditory Scene Analysis, Wearable Audio Computing at MIT (2003), and the Technology for Enabling Awareness (TEA) project (2000)

Low Level Processing of Video

Moving object detection
  Background subtraction: luminance contrast method
  Background/template updating
Moving object tracking
  Dynamic template
  Infinite Impulse Response (IIR)
Feature extraction
  Bounding box is identified, and useful features are extracted from it

Uncertainty computation

$$\mathrm{unc}(i) = \frac{\mathrm{output}(i)}{\sum_j \mathrm{output}(j)}$$

Module     Class      Output   unc
Module 1   standing   0.987    0.987 / (0.987 + 0.01 + 0.092) = 0.9063
Module 2   standing   0.01     0.01 / (0.987 + 0.01 + 0.092) = 0.0092
Module 3   sitting    0.092    0.092 / (0.987 + 0.01 + 0.092) = 0.0845
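The normalization can be checked with a few lines of code, using the three module outputs from the example. The function name `normalized_outputs` is an illustrative choice.

```python
def normalized_outputs(outputs):
    """Divide each module output by the sum of all module outputs."""
    total = sum(outputs)
    if total <= 0:
        raise ValueError("the outputs must sum to a positive value")
    return [o / total for o in outputs]


# Module outputs from the example: 0.987, 0.01 and 0.092.
unc = normalized_outputs([0.987, 0.01, 0.092])
rounded = [round(u, 4) for u in unc]
```

The normalized values always sum to 1, so they can be displayed on a common scale regardless of how the raw module outputs are scaled.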

Spectral shape coefficients

Divide the spectrum into 5 bands
Do a linear regression to find the best-fit line for the spectral envelope in each band
Slopes of these lines give the coefficients
Inspired by the Kates coefficients
Indicate shape of spectrum

Frame based classification

The mean values and standard deviations are computed for each feature fk and for each class ci to be discriminated, using the available training data
For each class ci, the upper and lower bounds associated with the control chart are obtained:

upperBound(fk, ci) = mean(fk, ci) + α(fk, ci) · standardDeviation(fk, ci)
lowerBound(fk, ci) = mean(fk, ci) - α(fk, ci) · standardDeviation(fk, ci)

where α(fk, ci) is a multiplier specific to feature fk and class ci
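A small sketch of how such control limits could be derived from training data for a single feature, and then used to label a new value. Several details are assumptions rather than the authors' procedure: a single multiplier `alpha` shared by all classes, the sample standard deviation, and the rule that picks the class with the nearest mean when a value falls inside several intervals or none. The training values are made up.

```python
from statistics import mean, stdev


def control_limits(training, alpha=2.0):
    """Map each class to (lower bound, mean, upper bound) for one feature."""
    limits = {}
    for cls, values in training.items():
        m, s = mean(values), stdev(values)
        limits[cls] = (m - alpha * s, m, m + alpha * s)
    return limits


def classify_feature(value, limits):
    """Choose among the classes whose interval contains the value (nearest mean wins)."""
    inside = [cls for cls, (low, _, high) in limits.items() if low <= value <= high]
    candidates = inside or list(limits)
    return min(candidates, key=lambda cls: abs(value - limits[cls][1]))


# Made-up aspect-ratio samples for two classes.
training = {"standing": [0.2, 0.3, 0.4], "sitting": [0.6, 0.7, 0.8]}
limits = control_limits(training, alpha=2.0)
label = classify_feature(0.35, limits)
```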

Decision in Classification

Final classification uses the majority rule.
For instance, if [standing, standing, standing, bending] is the vector representing single-feature based classification for each of the four features, the final conclusion is standing.
Ties are broken by giving priority to one feature:
  A tie between standing and bending is broken in favor of ‘Standing’ if the value of the RUD feature for the candidate object is closer to mean(RUD, Standing) than to mean(RUD, Bending).
  A tie between standing and sitting is broken by AR.
  A tie between sitting and bending is broken by RLD.
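The majority rule with feature-based tie-breaking can be sketched as follows. The data layout (a list of per-feature votes, a dict of feature values, and a dict of per-class feature means) and the numeric values are assumptions made for illustration.

```python
from collections import Counter

# Feature used to break a tie between each pair of classes, as listed above.
TIE_BREAK_FEATURE = {
    frozenset({"standing", "bending"}): "RUD",
    frozenset({"standing", "sitting"}): "AR",
    frozenset({"sitting", "bending"}): "RLD",
}


def final_decision(votes, feature_values, class_means):
    """Majority vote over single-feature decisions; ties go to the nearer class mean."""
    ranked = Counter(votes).most_common()
    top = [cls for cls, n in ranked if n == ranked[0][1]]
    if len(top) == 1:
        return top[0]
    if len(top) != 2:
        raise ValueError("only two-way ties are covered by the tie-breaking rules")
    feature = TIE_BREAK_FEATURE[frozenset(top)]
    value = feature_values[feature]
    return min(top, key=lambda cls: abs(value - class_means[cls][feature]))


# Hypothetical class means for the tie-breaking feature.
means = {"standing": {"RUD": 0.3}, "bending": {"RUD": 0.2}}
clear = final_decision(["standing", "standing", "standing", "bending"], {}, means)
tied = final_decision(["standing", "standing", "bending", "bending"], {"RUD": 0.28}, means)
```

With four features and three classes, the only possible tie is two votes against two, which is why a pairwise tie-breaking table is sufficient.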

Recognition of Sub-Scenario

If c (> 0) consecutive decisions at times t, (t-1), ..., (t-c+1) are all different from the decision made at time (t-c), then we conclude that a new sub-scenario commenced at time (t-c+1).
Otherwise, we attribute the differences to noise and image quality, and presume that the sub-scenario has not changed.
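The rule can be sketched as a single pass over the per-frame decisions. This is a simplified reading: each decision is compared with the label of the current sub-scenario rather than strictly with the decision at time (t-c), and the new sub-scenario takes the label of its first frame. Both simplifications, and the function name, are assumptions.

```python
def sub_scenario_starts(decisions, c):
    """Return the frame indices at which a new sub-scenario is declared."""
    if not decisions:
        return []
    starts = [0]
    current = decisions[0]
    run_start, run_length = None, 0
    for t in range(1, len(decisions)):
        if decisions[t] != current:
            if run_length == 0:
                run_start = t
            run_length += 1
            if run_length == c:
                # c consecutive differing decisions: the change began at run_start.
                starts.append(run_start)
                current = decisions[run_start]
                run_length = 0
        else:
            run_length = 0  # isolated deviations are treated as noise
    return starts


# S = standing, B = bending, T = sitting (made-up frame decisions).
frames = list("SSSBSSTTTT")
starts = sub_scenario_starts(frames, c=3)
```

With c = 3 the single B at frame 3 is ignored as noise, and only the sustained change beginning at frame 6 is reported. A larger c suppresses more noise but delays the detection of a genuine change by more frames.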

Video features

Features derived from the moving object used for activity detection are
  Aspect ratio
  Velocity
  Relative densities of pixels in upper, lower and middle bands of bounding box
  Coordinates of centroid of bounding box