Video/Audio Networked surveillance system enhAncement
through Human-cEntered adaptIve Monitoring
Large-scale integrating project
Grant Agreement n°248907
01/02/2010 - 31/07/2014
Contractual delivery date: 31 January 2011
Actual delivery date: 11 February 2011
Deliverable D4.2
First report on audio features extraction
and multimodal activity modelling (v1)
Version: 2.0
Author: TCF
Contributors:
Reviewers: IDIAP, MULT
Dissemination level: PU
Related document(s):
Number of pages: 42
FP7 VANAHEIM IP project n°248907
Page 2 of 42
Document information
Ver.  Date        Changes        Author (partic.)
0.0   17/01/2011  Creation       F. Capman/S. Lecomte/B. Ravera (TCF)
1.0   02/02/2011  Final          F. Capman/S. Lecomte/B. Ravera (TCF)
2.0   04/02/2011  Minor changes  F. Capman/S. Lecomte/B. Ravera (TCF)

Ver.  Date        Approval/Rejection decision/comments  Author (partic.)
1.0   03/02/2011  Approved subject to minor changes     J.M. Odobez (IDIAP)
1.0   03/02/2011  Approved                              C. Carincotte (MULT)
The filename convention is defined as follows:
1. Project number: VANAHEIM-FP7-248907
2. Leading participant acronym (MULT, GTT, IDIAP ...): xxx
3. Type of document:
   Working Document (by default): WD
   Meeting Minutes: MM
   Management Report: MR
   Activity Report: AR
   Deliverable: DR
4. Distribution:
   Public: PU
   Consortium restricted: CO
5. Serial number (one letter + 2 digits corresponding to the task, deliverable or meeting number):
   Deliverables: D
   Meeting: M
   Report: R
6. Revision number:
   draft: d
   approved: a
   version sequence (one digit)
Copyright
Copyright 2010, 2014 the VANAHEIM Consortium
Consisting of:
Coordinator: Multitel asbl (MULT), Belgium
Participants:
Gruppo Torinese Trasporti (GTT), Italy
Institut Dalle Molle d'Intelligence Artificielle Perceptive (IDIAP), Switzerland
Institut National de Recherche en Informatique et en Automatique (INRIA), France
Régie Autonome des Transports Parisiens (RATP), France
Thales Communications (TCF), France
Thales Italia (THALIT), Italy
University of Vienna (UNIVIE), Austria
This document may not be copied, reproduced, or modified in whole or in part for any purpose
without written permission from the VANAHEIM Consortium. In addition to such written permission
to copy, reproduce, or modify this document in whole or part, an acknowledgement of the authors of
the document and all applicable portions of the copyright notice must be clearly referenced.
All rights reserved.
This document may change without notice.
1 Executive Summary
This document outlines the methods proposed for audio analysis and multimodal analysis
applied to automatic surveillance. This first report focuses solely on audio analysis and describes the
technical options we have followed.
The first technical issue concerns feature extraction and selection. Most standard audio features have
been implemented and are available in a software library. The second issue concerns the development of
an evaluation framework. For algorithm evaluation, we have designed and implemented a generic
performance evaluation framework that uses audio signals recorded at the test sites (Torino metro) as well as
audio signals extracted from professional databases.
Some algorithmic development has been carried out during this first year of the project. The main technical
options we have chosen are based on unsupervised learning. So as not to tie our audio
surveillance system to specific abnormal audio events, we chose to drive the training steps by modelling the
normal ambience. A GMM-based (Gaussian Mixture Model) solution has been adapted to this end, and a
One-Class SVM-based (Support Vector Machine) solution has been studied and evaluated. Finally,
building on promising video surveillance studies, a content analysis system based on PLSA (Probabilistic
Latent Semantic Analysis) has also been investigated. This document presents the outcomes of the
first year; the proposed solutions need further study and improvement before integration into the
final VANAHEIM multimodal surveillance system.
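The idea of training only on normal ambience and flagging whatever the model explains poorly can be illustrated with a minimal sketch. It is not the system described in this report: a single diagonal Gaussian stands in for the GMM, the feature frames are synthetic, and all names (`fit_normal_model`, `pick_threshold`, `is_abnormal`) and the 1% false-alarm rate are assumptions chosen for illustration.

```python
import math
import random


def fit_normal_model(frames):
    """Estimate per-dimension mean and variance from normal-ambience frames.

    Assumption: one diagonal Gaussian, a one-component stand-in for a GMM.
    """
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    variances = [
        # Floor the variance so the log-likelihood stays finite.
        max(sum((f[d] - means[d]) ** 2 for f in frames) / n, 1e-6)
        for d in range(dim)
    ]
    return means, variances


def log_likelihood(frame, model):
    """Log-likelihood of one feature frame under the diagonal Gaussian."""
    means, variances = model
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(frame, means, variances)
    )


def pick_threshold(frames, model, false_alarm_rate=0.01):
    """Choose the score below which roughly `false_alarm_rate` of the
    normal training frames fall (hypothetical operating point)."""
    scores = sorted(log_likelihood(f, model) for f in frames)
    return scores[int(false_alarm_rate * len(scores))]


def is_abnormal(frame, model, threshold):
    """A frame is flagged when the normal model explains it poorly."""
    return log_likelihood(frame, model) < threshold


# Synthetic 2-dimensional "ambience" features, for illustration only.
rng = random.Random(0)
normal_frames = [[rng.gauss(0.0, 1.0), rng.gauss(5.0, 0.5)] for _ in range(2000)]
model = fit_normal_model(normal_frames)
threshold = pick_threshold(normal_frames, model)
```

No abnormal examples are needed at training time: the threshold is set from the normal data alone, which is what keeps such a detector independent of any particular class of abnormal event.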
Table of contents
1 EXECUTIVE SUMMARY
2 INTRODUCTION
3 AUDIO FEATURES EXTRACTION
  3.1 IMPLEMENTED ACOUSTIC FEATURES
  3.2 AUDIO FEATURE EXTRACTION SOFTWARE TOOL
    3.2.1 Configuration file
    3.2.2 Features declaration file
4 AUDIO FEATURES SELECTION
  4.1 OVERVIEW OF FEATURE SELECTION
  4.2 STRATEGIES
  4.3 PARADIGMS AND CRITERIA
  4.4 STOPPING CRITERIA
5 METHODOLOGY FOR ABNORMAL AUDIO SEQUENCE GENERATION
  5.1 THE WEIGHTED MEASURE OF SNR
  5.2 DISCUSSION ON MEASURING EVENTS SNR IN AUDIO SURVEILLANCE SIGNALS
  5.3 AUDIO FOR SURVEILLANCE SIMULATION FRAMEWORK
6 UNSUPERVISED ABNORMAL AUDIO DETECTION
  6.1 GMM-BASED SYSTEM
  6.2 EVALUATION OF THE GMM-BASED SYSTEM
  6.3 ONE CLASS SVM-BASED SYSTEM
7 AUDIO ANALYSIS BASED ON PLSA
  7.1 PROBABILISTIC LATENT SEMANTIC ANALYSIS
  7.2 AUDIO PLSA MODEL FORMULATION
  7.3 AUDIO PLSA ANALYSIS EVALUATION ON REAL AUDIO SURVEILLANCE DATA
8 CONCLUSION
9 BIBLIOGRAPHY
10 GLOSSARY
List of Figures
Figure 1: AFE Configuration file
Figure 2: AFE file configuration (Features parameters)
Figure 3: Generic Feature Selection scheme
Figure 4: Taxonomy of strategies for Feature Selection Algorithms
Figure 5: Standardize weighting curves for noise level measurement
Figure 6: Typical weighted and unweighted long time spectrum shape of an ambience signal
Figure 7: Empirical variations of noise measurements depending on weighting function
Figure 8: Mean weighted SNR variation from flat measurements depending on event type
Figure 9: Simulation flowchart
Figure 10: DET curve calculated on the complete set of abn