Post on 22-Oct-2014
HUMAN-CENTERED AUDIO/VIDEO CONTENT ANALYSIS
FOR IMPROVED SURVEILLANCE IN METRO STATIONS
EXPO Ferroviaria, March 27, 2012
VANAHEIM project, February 2010 – July 2013
VANAHEIM CONSORTIUM
Collaboration of
• Computer vision & audio processing researchers
  • Multitel asbl (MULT), Belgium (Coordinator)
  • Institut Dalle Molle d'Intelligence Artificielle Perceptive (IDIAP), Switzerland
  • Institut National de Recherche en Informatique et Automatique (INRIA), France
  • Thales Communications France (TCF), France
• Human ethologists (sociologists)
  • University of Vienna (UNIVIE), Austria
• Surveillance system designer
  • Thales Italia (THALIT), Italy
• Public transport operators (metros)
  • Gruppo Torinese Trasporti (GTT), Italy
  • Régie Autonome des Transports Parisiens (RATP), France
Large-scale integrating project (IP)
• Duration: 42 months (February 2010 – July 2013)
• Budget: 5.471.851 € (EU contribution 3.717.998 €)
Integrate innovative audio/video analysis tools into a CCTV surveillance system
for assessment in a real-scale metro environment (Turin & Paris metros)
Technological objectives:
– Development and deployment of the system
– Technological & scientific assessments
Scientific objectives:
– Audio/video data stream modeling
– Human behavior analysis
• Human activity recognition (individual, group and crowd/flow of people)
• Collective behavior modeling
OBJECTIVES
CURRENT SITUATION
Most CCTV video streams are never watched (e.g. in Torino, 28 monitors for 1100 cameras).
Common situation: monitors in control rooms show empty scenes/spaces
(while many other cameras look at scenes in which something, even if normal, is happening)
→ The probability of watching the right streams at the right time is very low
VANAHEIM PROPOSAL – AUTOMATIC SENSOR SELECTION
• Mechanisms for selecting relevant/salient audio/video streams in control rooms
• Models to characterise video stream content
  • Trivial scenario when dealing with “empty vs occupied” scenes
  • Challenging problem when almost all scenes are occupied
• Need for unsupervised modelling is even more explicit for audio streams
  • “mosaicing” of data is impossible due to the transparent nature of sound
• Algorithms to model the statistical normality of audio/video streams
  and detect abnormal audio/video stream content
AUTONOMOUS STREAM SELECTION
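A minimal sketch of the selection mechanism, assuming each camera already has an abnormality (saliency) score from some upstream model: with far fewer monitors than cameras, the control room displays the streams with the highest scores. The function name, camera ids and scores are hypothetical illustrations, not the VANAHEIM implementation.

```python
import heapq


def select_streams(abnormality_by_camera, num_monitors):
    """Return the camera ids with the highest scores, most salient first."""
    if num_monitors <= 0:
        return []
    ranked = heapq.nlargest(
        num_monitors,
        abnormality_by_camera.items(),
        key=lambda item: item[1],  # rank by score, not by camera id
    )
    return [camera_id for camera_id, _score in ranked]


# Hypothetical scores for four cameras and a wall of two monitors.
scores = {"cam_01": 0.05, "cam_02": 0.91, "cam_03": 0.40, "cam_04": 0.12}
print(select_streams(scores, 2))  # ['cam_02', 'cam_03']
```

In the Torino example above, the same call would use 1100 scores and `num_monitors=28`.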
[Figure: discovered activity patterns. Panel labels: Leaving station; Entering station (from the right); Vending machine (leaving); Taking escalator up; Left to right; Right to left (slow); People leaving escalator; People arriving from platform; People going to platform; People on escalator; People arriving from platform (by taking stairs); People taking escalator]
Activity representation: time is represented with a color gradient, beginning in violet/blue and ending in red
• Extraction of object trajectories from videos
• Identification of activity patterns from trajectories
• Discovery of temporal relations between activity patterns
Automatic discovery of normal/usual activities (learning stage)
Automatic learning of normal activities from several hours of multi-camera videos
[Figure labels: scene activity; likelihood of trajectories; likelihood of activities; abnormality discovered]
Online recognition of current activities (most probable)
Cycle of activities recognized on-the-fly
Unusual/Abnormal activity detection
[Figure: abnormality index over time. Detected events: drunk person falling down; unusual trajectory; unusual crossing trajectories; loitering groups in the back; unusual group trajectory]
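The learning stage and the abnormality index can be illustrated with a deliberately simple stand-in model. The sketch below assumes trajectories are already quantised into grid-cell ids, learns first-order transition frequencies from normal trajectories, and scores a new trajectory by its average negative log-likelihood. The class name, the smoothing scheme and the cell encoding are assumptions made for illustration; the slides do not specify the models actually used in the project.

```python
import math


class TransitionModel:
    """First-order model of moves between grid cells (hypothetical sketch)."""

    def __init__(self, num_cells, smoothing=1.0):
        self.num_cells = num_cells  # number of distinct cells in the scene
        self.smoothing = smoothing  # additive smoothing for unseen moves
        self.counts = {}            # (src, dst) -> number of observations
        self.totals = {}            # src -> number of moves leaving src

    def fit(self, trajectories):
        """Learning stage: count transitions seen in normal trajectories."""
        for cells in trajectories:
            for src, dst in zip(cells, cells[1:]):
                self.counts[(src, dst)] = self.counts.get((src, dst), 0) + 1
                self.totals[src] = self.totals.get(src, 0) + 1
        return self

    def abnormality(self, cells):
        """Average negative log-likelihood per move (higher = more unusual)."""
        steps = list(zip(cells, cells[1:]))
        if not steps:
            return 0.0
        nll = 0.0
        for src, dst in steps:
            count = self.counts.get((src, dst), 0)
            total = self.totals.get(src, 0)
            prob = (count + self.smoothing) / (
                total + self.smoothing * self.num_cells
            )
            nll -= math.log(prob)
        return nll / len(steps)


# Normal flow goes 0 -> 1 -> 2 -> 3; a counter-flow scores much higher.
model = TransitionModel(num_cells=100).fit([[0, 1, 2, 3]] * 20)
print(model.abnormality([0, 1, 2, 3]) < model.abnormality([3, 2, 1, 0]))
```

Thresholding this score, or ranking streams by it, gives one possible form of abnormality index.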
Extension to multi-camera: unusual/abnormal activity detection
→ Counter flow
→ Falling people (people gathering)
→ Heckling
→ Lost person
→ Person distributing leaflets
→ Cleaning staff emptying a garbage bin
→ People making phone calls
→ …
Anomalies detected on 8 cameras (210h)
Semantic analysis of audio surveillance signals
[Figure: raw audio signal; time-varying spectral representation of the audio signal; recognised audio activity: train (arrival, departure), doors (open, closing), door alarms, station ambiance]
Unsupervised abnormal audio events detection
[Figure: raw audio data mixed with synthetic events; time-varying spectral representation of the audio signal; abnormality measure; positions of known abnormal events (children group synthetically added to raw audio data); known abnormal events detected; unknown abnormal event detected (beep)]
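As a rough illustration of an unsupervised abnormality measure on a time-varying spectral representation, the sketch below computes frame-wise magnitude spectra with a naive DFT, learns per-bin statistics from normal audio, and scores each new frame by its largest deviation. The frame size, the z-score measure and the synthetic signals are assumptions chosen to keep the example small; they are not the project's audio models.

```python
import cmath
import math
import random


def spectrogram(signal, frame_size=32):
    """Magnitude spectra of consecutive non-overlapping frames (naive DFT)."""
    frames = []
    for start in range(0, len(signal) - frame_size + 1, frame_size):
        frame = signal[start:start + frame_size]
        spectrum = []
        for k in range(frame_size // 2 + 1):
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(frame)
            )
            spectrum.append(abs(coeff) / frame_size)
        frames.append(spectrum)
    return frames


def fit_normal_model(frames):
    """Per-frequency-bin mean and standard deviation of normal frames."""
    means, stds = [], []
    for b in range(len(frames[0])):
        values = [f[b] for f in frames]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        means.append(mean)
        stds.append(math.sqrt(var) + 1e-6)  # avoid division by zero
    return means, stds


def abnormality(frame, model):
    """Largest absolute z-score over frequency bins."""
    means, stds = model
    return max(abs(v - m) / s for v, m, s in zip(frame, means, stds))


# Synthetic "ambiance": one tone plus noise. The "event" adds a second tone.
rng = random.Random(0)
tone = lambda n, k: math.sin(2 * math.pi * k * n / 32)
normal = [tone(n, 2) + rng.gauss(0, 0.05) for n in range(32 * 40)]
event = [tone(n, 2) + tone(n, 8) for n in range(32)]

model = fit_normal_model(spectrogram(normal))
normal_max = max(abnormality(f, model) for f in spectrogram(normal))
print(abnormality(spectrogram(event)[0], model) > normal_max)
```

Because the model is learned only from normal data, it needs no labelled examples of abnormal sounds.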
CURRENT SITUATION
Human behaviour modelling: rarely exploited in Video Content Analysis
→ Need for robust and reliable human-centred features
VANAHEIM PROPOSAL – HUMAN-CENTRED MONITORING
Move one step beyond scene understanding based on location features
• Investigate 3 levels of human behaviour characterization in surveillance data
  • Individual level
    → characterize an individual person by his/her activities
  • Group level
    → detect small groups of people and identify interactions within them
  • Crowd level
    → monitor crowd/flow of people (dynamics of collective people flow)
Two applications:
• Event detection applications for safety/security
• Environmental reporting for situational awareness
HUMAN-CENTRED MONITORING
Situational awareness
Real-time applications
HUMAN-CENTRED MONITORING (BEHAVIOR ANNOTATION)
Human behaviour modelling: development of a behavior catalogue that includes
not only the behaviors regarded as interesting by users,
but covers the behavior repertoire as completely as possible
→ Catalogue of all behaviors of all people visible in the video material
People tracking (tracking by detection)
HUMAN-CENTRED MONITORING (INDIVIDUAL)
Tracking by detection:
• Associate detections over time
• Fill the gaps
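The two steps of tracking by detection can be sketched as follows. This is a minimal illustration that assumes detections are 2D points, uses greedy nearest-neighbour association, and fills missing frames by linear interpolation. The function names and the `max_dist` gate are hypothetical; the tracker actually used in the project is not described on the slides.

```python
import math


def associate(tracks, detections, frame, max_dist=50.0):
    """Greedily extend each track with its nearest unused detection."""
    unused = list(detections)
    for track in tracks:
        if not unused:
            break
        _last_frame, last_pos = track[-1]
        nearest = min(unused, key=lambda d: math.dist(d, last_pos))
        if math.dist(nearest, last_pos) <= max_dist:
            track.append((frame, nearest))
            unused.remove(nearest)
    # Detections that matched no existing track start new tracks.
    tracks.extend([(frame, d)] for d in unused)
    return tracks


def fill_gaps(track):
    """Linearly interpolate positions for frames with no detection."""
    filled = [track[0]]
    for (f0, p0), (f1, p1) in zip(track, track[1:]):
        for f in range(f0 + 1, f1):
            t = (f - f0) / (f1 - f0)
            filled.append((f, tuple(a + t * (b - a) for a, b in zip(p0, p1))))
        filled.append((f1, p1))
    return filled


# One person walking right; the detector misses frame 2.
detections_by_frame = {0: [(0.0, 0.0)], 1: [(10.0, 0.0)], 3: [(30.0, 0.0)]}
tracks = []
for frame in sorted(detections_by_frame):
    associate(tracks, detections_by_frame[frame], frame)

print(fill_gaps(tracks[0]))
# [(0, (0.0, 0.0)), (1, (10.0, 0.0)), (2, (20.0, 0.0)), (3, (30.0, 0.0))]
```

Greedy matching is the simplest choice; a global assignment would behave better when people cross paths.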
Body orientation estimation
Body + head pose orientation estimation
[Figure labels: 3D circle (50 cm); body pose; head pose]
Group detection
Group detection & tracking
HUMAN-CENTRED MONITORING (GROUP)
Head detection & tracking
People & head detection
Event detection related to
• Position:
  • Group stays in zone (access zone, waiting zone…)
  • Group close to/far from equipment/walls
• Trajectory:
  • Group stands still, group walks, group runs
• Size:
  • Constant size → calm group
  • Medium variation → normal activity level
  • High variation → lively group
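The trajectory and size rules above could be turned into simple threshold tests, as in the sketch below. It assumes group size is measured as a spatial extent in metres and group positions are in metres; all thresholds and function names are hypothetical placeholders, since the slides give no numeric values.

```python
import math
from statistics import mean, pstdev


def size_activity(sizes, calm_max=0.05, normal_max=0.25):
    """Label a group from the relative variation of its size over time."""
    avg = mean(sizes)
    variation = pstdev(sizes) / avg if avg > 0 else 0.0
    if variation <= calm_max:
        return "calm group"
    if variation <= normal_max:
        return "normal activity level"
    return "lively group"


def motion_state(positions, fps, still_max=0.2, walk_max=2.0):
    """Label a group from its mean speed in metres per second."""
    if len(positions) < 2:
        return "stands still"
    path = sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))
    speed = path * fps / (len(positions) - 1)
    if speed <= still_max:
        return "stands still"
    if speed <= walk_max:
        return "walks"
    return "runs"


print(size_activity([2.0, 2.0, 2.0]))                       # calm group
print(size_activity([1.0, 3.0, 1.0, 3.0]))                  # lively group
print(motion_state([(0, 0), (1.5, 0), (3.0, 0)], fps=1.0))  # walks
```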
People counting / flow monitoring in escalator
HUMAN-CENTRED MONITORING (CROWD/FLOW)
Objectives
– Cumulative people counting
– People flow measurement (pers./min)
Qualitative exploitation
→ Identification of trends (e.g. weekdays vs weekend days)
Performance evaluation on one station (8 escalators)
– Depending on view type (close/medium/far)
– Correlation ~ 0.85 for close/medium views
Name                               Duration   Num. pers.   Flow correlation
DOD Accesso Cernaia (left)         2h         501          0.43
DOD Accesso Cernaia (left)         9h         4085         0.63
DOD Accesso Cernaia (right)        2h         2201         0.64
DOD Accesso Cernaia (left)         30 min.    647          0.67
DOD S M1 Accesso Cernaia (right)   2h         2426         0.77
DOD S M1 Accesso Cernaia (left)    2h         497          0.82
DOD Atrio Mezzanino 2              30 min.    178          0.83
BER Atrio Mezzanino 1              30 min.    91           0.83
DOD Atrio Mezzanino 1              30 min.    413          0.85
DOD Via 2 A                        30 min.    386          0.85
DOD Accesso Stazione (right)       30 min.    373          0.88
BER Atrio Mezzanino 2              30 min.    30           0.89
DOD Via 1 C                        30 min.    295          0.95
DOD Via 2 A                        9h         1305         0.95
DOD Atrio Mezzanino 1              9h         4127         0.96
DOD Via 1 C                        2h         810          0.97
DOD Via 1 C                        9h         4376         0.97
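One plausible reading of the quantities in this evaluation is sketched below: flow as people per one-minute bin, cumulative count as its running sum, and flow correlation as the Pearson correlation between automatic and manually annotated flows. That reading is an assumption, as the slides do not define how the correlation was computed, and the timestamps and annotated values here are made up for illustration.

```python
import math
from itertools import accumulate


def flow_per_minute(crossing_times_s, duration_s):
    """People per minute in consecutive one-minute bins."""
    flow = [0] * math.ceil(duration_s / 60)
    for t in crossing_times_s:
        if 0 <= t < duration_s:
            flow[int(t // 60)] += 1
    return flow


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return float("nan")  # undefined when one sequence is constant
    return cov / (sx * sy)


# Hypothetical escalator crossings (seconds) over a 3-minute clip.
estimated = flow_per_minute([5, 30, 70, 130, 150, 170], duration_s=180)
annotated = [2, 2, 3]  # made-up manual counts per minute

print(estimated)                    # [2, 1, 3]
print(list(accumulate(estimated)))  # [2, 3, 6]
print(round(pearson(estimated, annotated), 2))
```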
Occupancy rate at platform
HUMAN-CENTRED MONITORING (CROWD/FLOW)
Performance evaluation on different platforms
• 15% error in counting/occupancy in mid-crowded situations
• Under-estimation in dense crowds
Change point detection
Detect significant changes in crowd density
Detect fast modification of platform occupancy, mostly at metro arrivals.
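A simple way to flag such changes, given an occupancy or density time series, is to compare the mean of a short window before each sample with the mean of the window after it. The sketch below keeps only the strongest index of each run above the threshold. The window length, threshold and density values are assumptions for illustration, not the detector used in the project.

```python
def change_points(density, window=3, threshold=0.3):
    """Indices where mean density jumps by more than `threshold`."""
    points = []
    best_index, best_jump = None, 0.0
    for i in range(window, len(density) - window + 1):
        before = sum(density[i - window:i]) / window
        after = sum(density[i:i + window]) / window
        jump = abs(after - before)
        if jump > threshold:
            # Inside a run of large jumps: remember the strongest one.
            if jump > best_jump:
                best_index, best_jump = i, jump
        elif best_index is not None:
            # The run has ended: report its strongest index.
            points.append(best_index)
            best_index, best_jump = None, 0.0
    if best_index is not None:
        points.append(best_index)
    return points


# Hypothetical platform occupancy: a metro arrives, then the platform empties.
occupancy = [0.1] * 5 + [0.8] * 5 + [0.1] * 5
print(change_points(occupancy))  # [5, 10]
```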
Density-based approach to help in dense crowd situations
A first test with a simple feature shows promising results.
HUMAN-CENTRED MONITORING (SITUATIONAL REPORTING)
Report position of people on infrastructure map
Different algorithms used as input
• Escalator flow monitoring
• Occupancy rate at platform
• Human detector
• Multi-object tracking
CURRENT SITUATION
Transportation terminals are increasingly subject to capacity problems
• Need expressed by managers for analysis of passenger dynamics/behaviors
• Bottleneck consists in the high variety/complexity of passenger behaviours
VANAHEIM PROPOSAL – LONG-TERM COLLECTIVE BEHAVIOUR BUILDING
• System able to identify & characterize structures inherent in collective behavior
• Models that can learn, analyze and cluster individual behavioral information
Continuous monitoring of user information
• locations, routes,
• spatio-temporal activities (walking, waiting...),
• interactions with other passengers and/or equipment,
• contextual data (time of day, density of people...)
Goal: estimate trends of large-scale human behaviour
at an infrastructure level, e.g. to
• Localize common loitering areas and/or highly frequented aisles
• Identify traffic patterns in the infrastructure, etc.
LONG-TERM COLLECTIVE BEHAVIOUR BUILDING
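To make the goal concrete, the sketch below accumulates tracked positions on a grid over the infrastructure map. Cells with a long dwell time per visitor suggest loitering areas, while cells crossed by many distinct people suggest highly frequented aisles. The sample format `(track_id, x, y)`, the fixed sampling rate and the cell size are assumptions for illustration only.

```python
from collections import Counter, defaultdict


def cell_statistics(samples, cell_size=1.0):
    """Count position samples and distinct visitors per grid cell."""
    dwell = Counter()
    visitors = defaultdict(set)
    for track_id, x, y in samples:
        cell = (int(x // cell_size), int(y // cell_size))
        dwell[cell] += 1  # at a fixed sampling rate, samples ~ time spent
        visitors[cell].add(track_id)
    return dwell, visitors


def loitering_cells(dwell, visitors, top=1):
    """Cells ranked by dwell samples per distinct visitor."""
    ranked = sorted(dwell, key=lambda c: dwell[c] / len(visitors[c]), reverse=True)
    return ranked[:top]


def frequented_cells(visitors, top=1):
    """Cells ranked by number of distinct visitors."""
    ranked = sorted(visitors, key=lambda c: len(visitors[c]), reverse=True)
    return ranked[:top]


# Person 1 waits near (0.5, 0.5); persons 2 to 4 walk through x in [3, 4).
samples = [(1, 0.2, 0.3), (1, 0.4, 0.1), (1, 0.5, 0.5), (1, 0.6, 0.4),
           (2, 3.1, 0.5), (3, 3.5, 0.2), (4, 3.9, 0.8)]
dwell, visitors = cell_statistics(samples)
print(loitering_cells(dwell, visitors))  # [(0, 0)]
print(frequented_cells(visitors))        # [(3, 0)]
```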
Planning applications
Real-time monitoring applications
Collective behaviour building
LONG-TERM COLLECTIVE BEHAVIOUR BUILDING
USER-BOARD
Representatives of
• CCTV end-users (security/safety operators, public infrastructure managers...)
• Surveillance system designers, manufacturers and suppliers
• Video Content Analysis (VCA) solution providers
Register at www.vanaheim-project.eu
SURVEY ON AUDIO & VIDEO CONTENT ANALYSIS FOR
TRANSPORT APPLICATION
TECHNICAL VISIT - PROTOTYPE FOR VANAHEIM PROJECT
AT METROPOLITANA AUTOMATICA DI TORINO
The research leading to these results has received funding from the European Community’s Seventh Framework Programme FP7/2007-2013 – Challenge 2 – Cognitive Systems,
Interaction, Robotics – under grant agreement n° 248907-VANAHEIM.
www.vanaheim-project.eu
carincotte@multitel.be forchino.a@gtt.to.it
QUESTIONS?