VANAHEIM-FP7-248907 EXPO Ferroviaria 20120323 (PPTminimizer1502)

HUMAN-CENTERED AUDIO/VIDEO CONTENT ANALYSIS FOR IMPROVED SURVEILLANCE IN METRO STATIONS

EXPO Ferroviaria, March 27, 2012

VANAHEIM project, February 2010 – July 2013

VANAHEIM CONSORTIUM

Collaboration of:

• Computer vision & audio processing researchers

  • Multitel asbl (MULT), Belgium (Coordinator)

  • Institut Dalle Molle d'Intelligence Artificielle Perceptive (IDIAP), Switzerland

  • Institut National de Recherche en Informatique et Automatique (INRIA), France

  • Thales Communications France (TCF), France

• Human ethologists (sociologists)

  • University of Vienna (UNIVIE), Austria

• Surveillance system designer

  • Thales Italia (THALIT), Italy

• Public transport operators (metros)

  • Gruppo Torinese Trasporti (GTT), Italy

  • Régie Autonome des Transports Parisiens (RATP), France

Large-scale integrating project (IP)

• Duration: 42 months (February 2010 – July 2013)

• Budget: 5,471,851 € (EU contribution: 3,717,998 €)

Integrate innovative audio/video analysis tools into a CCTV surveillance system for assessment in a real-scale metro environment (Turin & Paris metros)

Technological objectives:

– Development and deployment of the system

– Technological & scientific assessments

Scientific objectives:

– Audio/video data stream modeling

– Human behavior analysis

  • Human activity recognition (individual, group and crowd/flow of people)

  • Collective behavior modeling

OBJECTIVES

CURRENT SITUATION

Most CCTV video streams are never watched (e.g. in Torino, 28 monitors for 1100 cameras).

Common situation: monitors in control rooms show empty scenes/spaces, while many other cameras look at scenes in which something (even normal) is happening

→ The probability of watching the right streams at the right time is very limited

VANAHEIM PROPOSAL – AUTOMATIC SENSOR SELECTION

• Mechanisms for selecting relevant/salient audio/video streams in control rooms

• Models to characterise video stream content

• Trivial scenario when dealing with “empty vs occupied” scenes

• Challenging problem when almost all scenes are occupied

• Need for unsupervised modelling is even more explicit for audio streams

• “Mosaicing” of data is impossible due to the transparent nature of sound

• Algorithms to model the statistical normality of audio/video streams and detect abnormal audio/video stream content
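
As a rough illustration of the idea of modelling stream normality and selecting the least usual streams, here is a minimal Python sketch. It is not the project's method: it assumes each stream is summarised by a single hypothetical "activity level" number, models normality as a mean and standard deviation, and ranks streams by absolute z-score. The camera names, values and function names are invented for the example.

```python
from statistics import mean, stdev

def fit_normality(history):
    """Learn a per-stream model (mean, std) from past activity levels."""
    return {cam: (mean(v), stdev(v)) for cam, v in history.items()}

def abnormality(model, cam, value):
    """Absolute z-score of the current value under the learned model."""
    mu, sigma = model[cam]
    return abs(value - mu) / max(sigma, 1e-9)

def select_streams(model, current, n_monitors):
    """Return the n_monitors cameras whose current content is least usual."""
    ranked = sorted(
        current,
        key=lambda cam: abnormality(model, cam, current[cam]),
        reverse=True,
    )
    return ranked[:n_monitors]

# Hypothetical activity levels observed during a learning period.
history = {
    "cam_a": [10, 12, 11, 9, 10],
    "cam_b": [50, 55, 52, 48, 51],
    "cam_c": [0, 1, 0, 0, 1],
}
model = fit_normality(history)

# Hypothetical current measurements: cam_b is far from its usual level.
current = {"cam_a": 11, "cam_b": 90, "cam_c": 0}
selected = select_streams(model, current, n_monitors=1)
print(selected)
```

Note that ranking by deviation from each stream's own normality, rather than by raw activity, is what lets a busy but ordinary scene stay off the monitors.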

AUTONOMOUS STREAM SELECTION

[Figure: discovered activity patterns. Labels: Leaving station; Entering station (from the right); Vending machine (leaving); Taking escalator up; Left to right; Right to left (slow); People leaving escalator; People arriving from platform; People going to platform; People on escalator; People arriving from platform (by taking stairs); People taking escalator]

Activity representation: time is represented with a color gradient, beginning in violet/blue and ending in red

• Extraction of object trajectories from videos

• Identification of activity patterns from trajectories

• Discovery of temporal relations between activity patterns

Automatic discovery of normal/usual activities (learning stage)

Automatic learning of normal activities from several hours of multi-camera videos
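
The three learning steps listed above can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not the project's algorithm: it assumes trajectories are already extracted, labels each one by its (entry zone, exit zone) pair on a coarse grid, and counts how often one pattern starts shortly after another. The grid size, time window, data and function names are all hypothetical.

```python
from collections import Counter

def zone(point, cell=5.0):
    """Quantize an (x, y) position into a coarse grid cell."""
    x, y = point
    return (int(x // cell), int(y // cell))

def activity_pattern(trajectory):
    """Label a trajectory by its (entry zone, exit zone) pair."""
    points = trajectory["points"]
    return (zone(points[0]), zone(points[-1]))

def temporal_relations(trajectories, window=10.0):
    """Count how often pattern B starts within `window` seconds after pattern A."""
    events = sorted((t["start"], activity_pattern(t)) for t in trajectories)
    relations = Counter()
    for i, (t_a, p_a) in enumerate(events):
        for t_b, p_b in events[i + 1:]:
            if t_b - t_a > window:
                break
            relations[(p_a, p_b)] += 1
    return relations

# Hypothetical trajectories: start time in seconds, positions in metres.
trajectories = [
    {"start": 0.0, "points": [(1, 1), (4, 8), (2, 12)]},
    {"start": 4.0, "points": [(12, 1), (8, 3), (1, 2)]},
    {"start": 30.0, "points": [(2, 2), (3, 9), (1, 13)]},
    {"start": 33.0, "points": [(13, 2), (7, 2), (2, 1)]},
]

patterns = Counter(activity_pattern(t) for t in trajectories)
relations = temporal_relations(trajectories)
print(patterns.most_common())
print(relations.most_common())
```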

[Figure panels: Scene activity; Likelihood of trajectories; Likelihood of activities; Abnormality discovered]

Online recognition of current activities (most probable)

Cycle of activities recognized on-the-fly

Unusual/Abnormal activity detection

[Figure: abnormality examples. Drunk person falling down; Unusual trajectory; Unusual crossing trajectories; Loitering groups in the back; Unusual group trajectory]

Extension to multi-camera: unusual/abnormal activity detection

→ Counter flow

→ Falling people (people gathering)

→ Heckling

→ Lost person

→ Person distributing leaflets

→ Cleaning staff emptying a garbage bin

→ People making phone calls

→ …

Anomalies detected on 8 cameras (210h)
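
One simple way to turn a "likelihood of trajectories" into an abnormality score is sketched below. It is an assumed stand-in, not the model used in the project: trajectories are reduced to sequences of named zones, a first-order Markov chain is learned from normal sequences, and a low average log-likelihood flags an unusual trajectory such as a counter flow. Zone names, smoothing constant and data are hypothetical.

```python
import math
from collections import Counter, defaultdict

def transitions(zones):
    """Consecutive (from, to) zone pairs of a sequence."""
    return list(zip(zones, zones[1:]))

def learn(normal_sequences):
    """Count zone-to-zone transitions observed in normal trajectories."""
    counts = defaultdict(Counter)
    for seq in normal_sequences:
        for a, b in transitions(seq):
            counts[a][b] += 1
    return counts

def avg_log_likelihood(counts, seq, n_zones, alpha=1.0):
    """Mean log-probability per transition, with add-alpha smoothing."""
    steps = transitions(seq)
    total = 0.0
    for a, b in steps:
        row = counts.get(a, Counter())
        p = (row[b] + alpha) / (sum(row.values()) + alpha * n_zones)
        total += math.log(p)
    return total / len(steps)

# Hypothetical normal traffic: people enter towards the platform or leave via the exit.
normal = [["entrance", "hall", "platform"]] * 20 + [["platform", "hall", "exit"]] * 20
model = learn(normal)

usual = ["entrance", "hall", "platform"]
counter_flow = ["platform", "hall", "entrance"]  # leaves through the entrance

score_usual = avg_log_likelihood(model, usual, n_zones=4)
score_counter = avg_log_likelihood(model, counter_flow, n_zones=4)
print(score_usual, score_counter)
```

The lower score of the counter-flow sequence comes entirely from the never-observed hall-to-entrance transition; a threshold on this score would be one possible abnormality test.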

[Figure: semantic analysis of audio surveillance signals. Panels: raw audio signal; time-varying spectral representation of audio signal; recognised audio activity. Classes: train (arrival, departure); doors (open, closing); door alarms; station ambiance]

[Figure: unsupervised abnormal audio event detection. Panels: raw audio data mixed with synthetic event; time-varying spectral representation of audio signal; abnormality measure. Annotations: positions of known abnormal events (children group synthetically added to raw audio data); known abnormal events detected; unknown abnormal event detected (beep)]

Unsupervised abnormal audio events detection

Semantic analysis of audio surveillance signals
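
The chain shown in the figures (raw signal, time-varying spectral representation, abnormality measure) can be illustrated with a toy Python sketch. Everything here is an assumption made for the example: the signal is a synthetic hum, the abnormal event is a synthetic tone added to one frame, the spectrum is a naive DFT (a real system would use an FFT library), and the abnormality measure is simply the distance to the mean spectrum of normal data.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for the lower half of the frequency bins."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2)
    ]

def spectrogram(signal, frame_len=32):
    """Time-varying spectral representation: one spectrum per non-overlapping frame."""
    return [
        magnitude_spectrum(signal[s:s + frame_len])
        for s in range(0, len(signal) - frame_len + 1, frame_len)
    ]

def abnormality_measure(spec, reference):
    """Euclidean distance of each frame spectrum to the mean reference spectrum."""
    bins = len(reference[0])
    mean_spec = [sum(f[k] for f in reference) / len(reference) for k in range(bins)]
    return [math.dist(f, mean_spec) for f in spec]

# Synthetic "station ambiance": a steady low-frequency hum, 8 frames of 32 samples.
hum = [math.sin(2 * math.pi * 2 * i / 32) for i in range(256)]

# Test signal: the same hum with a synthetic high-pitched tone added in frame 5.
test_signal = list(hum)
for i in range(160, 192):
    test_signal[i] += math.sin(2 * math.pi * 10 * i / 32)

scores = abnormality_measure(spectrogram(test_signal), spectrogram(hum))
print(scores.index(max(scores)))
```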

Human behaviour modelling: rarely exploited in Video Content Analysis

→ Need for robust and reliable human-centred features

VANAHEIM PROPOSAL – HUMAN-CENTRED MONITORING

Move one step beyond scene understanding based on location features

• Investigate 3 levels of human behaviour characterization in surveillance data

  • Individual level

    → characterize an individual person with his/her activities

  • Group level

    → detect small groups of people and identify interactions within them

  • Crowd level

    → monitor crowd/flow of people (dynamics of collective people flow)

Two applications:

• Event detection applications for safety/security

• Environmental reporting for situational awareness

HUMAN-CENTRED MONITORING

[Diagram labels: Situational awareness; Real-time applications]

HUMAN-CENTRED MONITORING (BEHAVIOR ANNOTATION)

Human behaviour modelling: development of a behavior catalogue that includes not only the behaviors regarded as interesting by the user, but covers the behavior repertoire as completely as possible

→ Catalogue of all behaviors of all people visible in the video material

People tracking (tracking by detection)

HUMAN-CENTRED MONITORING (INDIVIDUAL)

Tracking by detection:

• Associate detections over time

• Fill the gaps
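
The two steps of tracking by detection can be sketched as follows. This is a minimal illustration under stated assumptions, not the tracker used in the project: association is a greedy nearest-neighbour match with a hypothetical distance gate, gaps of up to two missed frames are tolerated, and missing positions are filled by linear interpolation.

```python
import math

def associate(frames, max_dist=2.0, max_gap=3):
    """Greedily link each track to its nearest unclaimed detection in each frame."""
    tracks = []  # each track maps frame index -> (x, y)
    for t, detections in enumerate(frames):
        free = list(detections)
        for track in tracks:
            last_t = max(track)
            # Assumption: a track is dropped after more than two missed frames.
            if not free or t - last_t > max_gap:
                continue
            best = min(free, key=lambda d: math.dist(d, track[last_t]))
            # Allow a larger search radius when frames were missed.
            if math.dist(best, track[last_t]) <= max_dist * (t - last_t):
                track[t] = best
                free.remove(best)
        # Unmatched detections start new tracks.
        tracks.extend({t: d} for d in free)
    return tracks

def fill_gaps(track):
    """Linearly interpolate positions for frames with no detection."""
    filled = dict(track)
    times = sorted(track)
    for t0, t1 in zip(times, times[1:]):
        (x0, y0), (x1, y1) = track[t0], track[t1]
        for t in range(t0 + 1, t1):
            w = (t - t0) / (t1 - t0)
            filled[t] = (x0 + w * (x1 - x0), y0 + w * (y1 - y0))
    return filled

# Hypothetical detections per frame; the first person is missed in frame 2.
frames = [
    [(0, 0), (10, 0)],
    [(1, 0), (10, 1)],
    [(10, 2)],
    [(3, 0), (10, 3)],
]
tracks = associate(frames)
first = fill_gaps(tracks[0])
print(len(tracks), first[2])
```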

Body orientation estimation

Body + head pose orientation estimation

[Figure labels: 3D circle (50 cm); body; head pose]

Group detection

Group detection & tracking

HUMAN-CENTRED MONITORING (GROUP)

Head detection & tracking

People & head detection

Event detection related to:

• Position: group stays in zone (access zone, waiting zone…); group close to/far from equipment/walls

• Trajectory: group stands still, group walks, group runs

• Size: constant size → calm group; medium variation → normal activity level; high variation → lively group
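
The trajectory and size rules above lend themselves to a small rule-based sketch. The thresholds, the use of the group centroid for speed, and the interpretation of "size" as a numeric group extent are all assumptions made for illustration; the slides do not give the actual criteria.

```python
import math
from statistics import mean, pstdev

def motion_state(centroids, dt=1.0, walk=0.3, run=2.5):
    """Classify group motion from mean centroid speed; thresholds (m/s) are assumed."""
    speeds = [math.dist(a, b) / dt for a, b in zip(centroids, centroids[1:])]
    v = mean(speeds)
    if v < walk:
        return "stands still"
    return "walks" if v < run else "runs"

def activity_level(sizes, low=0.05, high=0.25):
    """Classify by relative variation of group size; cut-offs are assumed."""
    variation = pstdev(sizes) / mean(sizes)
    if variation < low:
        return "calm group"
    return "normal activity level" if variation < high else "lively group"

# Hypothetical group centroid positions (metres, one per second) and group extents.
state = motion_state([(0, 0), (1, 0), (2, 0)])
calm = activity_level([4.0, 4.0, 4.1, 4.0])
lively = activity_level([2.0, 6.0, 3.0, 8.0])
print(state, calm, lively)
```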

People counting / flow monitoring in escalator

HUMAN-CENTRED MONITORING (CROWD/FLOW)

Objectives

– Cumulative People counting

– People flow measurement (pers./min)

Qualitative exploitation

→ Identification of trends (e.g. weekdays vs weekend days)

Performance evaluation on one station (8 escalators)

– Depending on view type (close/medium/far)

– Correlation ~ 0.85 for close/medium views

Name                              Duration          Correlation
DOD Accesso Cernaia (left)        2h        501     0.43
DOD Accesso Cernaia (left)        9h        4085    0.63
DOD Accesso Cernaia (right)       2h        2201    0.64
DOD Accesso Cernaia (left)        30 min.   647     0.67
DOD SM1 Accesso Cernaia (right)   2h        2426    0.77
DOD SM1 Accesso Cernaia (left)    2h        497     0.82
DOD Atrio Mezzanino 2             30 min.   178     0.83
BER Atrio Mezzanino 1             30 min.   91      0.83
DOD Atrio Mezzanino 1             30 min.   413     0.85
DOD Via 2 A                       30 min.   386     0.85
DOD Accesso Stazione (right)      30 min.   373     0.88
BER Atrio Mezzanino 2             30 min.   30      0.89
DOD Via 1 C                       30 min.   295     0.95
DOD Via 2 A                       9h        1305    0.95
DOD Atrio Mezzanino 1             9h        4127    0.96
DOD Via 1 C                       2h        810     0.97
DOD Via 1 C                       9h        4376    0.97
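
For readers unfamiliar with the two quantities reported here, the sketch below shows how a flow in persons per minute can be derived from cumulative counts, and how a Pearson correlation between automatic and manual measurements is computed. The counts are invented for the example and are not project data; the slides do not state which exact series were correlated.

```python
import math

def flow_per_minute(cumulative_counts, interval_s):
    """People flow (pers./min) from cumulative counts sampled every interval_s seconds."""
    return [
        (b - a) * 60.0 / interval_s
        for a, b in zip(cumulative_counts, cumulative_counts[1:])
    ]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical cumulative counts sampled every 30 seconds.
automatic = [0, 5, 12, 14, 25]
manual = [0, 6, 12, 15, 27]

auto_flow = flow_per_minute(automatic, interval_s=30)
manual_flow = flow_per_minute(manual, interval_s=30)
r = pearson(auto_flow, manual_flow)
print(auto_flow, round(r, 2))
```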

Occupancy rate at platform

HUMAN-CENTRED MONITORING (CROWD/FLOW)

Performance evaluation on different platforms

15% error in counting/occupancy for mid-crowded scenes

Under-estimation in dense crowds

Change point detection

Detect significant changes in crowd density

Detect fast modification of platform occupancy, mostly at metro arrivals.
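
A very simple form of change point detection on an occupancy series is sketched below: compare the mean of a short window before each instant with the mean of the window after it, and report instants where the difference exceeds a threshold. The window length, threshold and occupancy values are hypothetical, and the project may well have used a different detector.

```python
from statistics import mean

def change_points(series, window=3, threshold=0.4):
    """Indices where the mean after the index differs from the mean before it by more than threshold."""
    hits = []
    for i in range(window, len(series) - window + 1):
        before = mean(series[i - window:i])
        after = mean(series[i:i + window])
        if abs(after - before) > threshold:
            hits.append(i)
    return hits

# Hypothetical platform occupancy rate; it jumps at index 4.
occupancy = [0.10, 0.12, 0.11, 0.10, 0.60, 0.62, 0.61, 0.60]
detected = change_points(occupancy)
print(detected)
```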

Density-based approach to help in dense crowd situations

A first test with a simple feature shows promising results.

HUMAN-CENTRED MONITORING (SITUATIONAL REPORTING)

Report position of people on infrastructure map

Different algorithms used as input:

• Escalator flow monitoring

• Occupancy rate at platform

• Human detector

• Multi-object tracking

Transportation terminals are increasingly subject to capacity problems

• Need expressed by managers for analysis of passenger dynamics/behaviors

• Bottleneck consists in the high variety/complexity of passenger behaviours

VANAHEIM PROPOSAL – LONG-TERM COLLECTIVE BEHAVIOUR BUILDING

• System able to identify & characterize structures inherent in collective behavior

• Models that can learn, analyze and cluster individual behavioral information

Continuous monitoring of user information

• locations, routes,

• spatio-temporal activities (walking, waiting...),

• interactions with other passengers and/or equipment,

• contextual data (time of day, density of people...)

Goal: estimate trends of large-scale human behaviour at an infrastructure level, e.g. to

• Localize common loitering areas and/or highly frequented aisles

• Identify traffic patterns in the infrastructure, etc.
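
Localizing common loitering areas from long-term tracking data could, for example, be approached by accumulating the time people spend nearly stationary in each cell of a floor grid. The sketch below is only an illustration under assumed parameters (cell size, speed threshold, sampling period) with invented tracks; it is not the collective behaviour model developed in the project.

```python
import math
from collections import Counter

def dwell_map(tracks, cell=2.0, max_speed=0.2, dt=1.0):
    """Accumulate time spent nearly stationary in each grid cell."""
    dwell = Counter()
    for track in tracks:
        for (x0, y0), (x1, y1) in zip(track, track[1:]):
            speed = math.hypot(x1 - x0, y1 - y0) / dt
            if speed <= max_speed:
                dwell[(int(x1 // cell), int(y1 // cell))] += dt
    return dwell

# Hypothetical tracks (metres, one position per second).
tracks = [
    [(0, 0), (1, 0), (2, 0), (3, 0)],          # walking through
    [(5, 5), (5.1, 5), (5.1, 5.1), (5, 5.1)],  # loitering in one spot
]
dwell = dwell_map(tracks)
print(dwell.most_common(1))
```

Dropping the speed test and counting every visit per cell would give the complementary view, a map of highly frequented aisles.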

LONG-TERM COLLECTIVE BEHAVIOUR BUILDING

[Diagram labels: Planning applications; Real-time monitoring applications; Collective behaviors building]

LONG-TERM COLLECTIVE BEHAVIOUR BUILDING

USER-BOARD

Representatives of:

• CCTV end-users (security/safety operators, public infrastructure managers...)

• Surveillance system designers, manufacturers and suppliers

• Video Content Analysis (VCA) solution providers

Register at www.vanaheim-project.eu

SURVEY ON AUDIO & VIDEO CONTENT ANALYSIS FOR TRANSPORT APPLICATION

TECHNICAL VISIT - PROTOTYPE FOR VANAHEIM PROJECT AT METROPOLITANA AUTOMATICA DI TORINO

The research leading to these results has received funding from the European Community’s Seventh Framework Programme FP7/2007-2013 (Challenge 2: Cognitive Systems, Interaction, Robotics) under grant agreement n° 248907-VANAHEIM.

www.vanaheim-project.eu

carincotte@multitel.be forchino.a@gtt.to.it

QUESTIONS ?
