Context-dependent Detection of Unusual Events in Videos by Geometric Analysis of Video Trajectories...

Post on 23-Dec-2015


Context-dependent Detection of Unusual Events in Videos by Geometric Analysis of Video Trajectories

Longin Jan Latecki (latecki@temple.edu)
Computer and Information Sciences
Temple University, Philadelphia

Nilesh Ghubade and Xiangdong Wen (nileshg@temple.edu)

Agenda

• Introduction
• Mapping of video to a trajectory
• Relation between motion trajectory and video trajectory
• Discrete curve evolution
• Polygon simplification
• Key frames
• Unusual events in surveillance videos
• Results

Main Tools

• Mapping the video sequence to a polyline in a multi-dimensional space.
• The automatic extraction of relevant frames from videos is based on polygon simplification by discrete curve evolution.

Mapping of video to a trajectory

Mapping of the image stream to a trajectory (polyline) in a feature space.

Representing each frame as:

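[Figure: feature matrix with one row per frame (Frame 0 to Frame N) and one group of columns per bin (Bin 0 to Bin n). Each bin contributes three values: the x-coordinate of the bin's centroid, the y-coordinate of the bin's centroid, and the bin's frequency count.]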

Used in our experiments

Red-Green-Blue (RGB) Bins

Each frame is a 24-bit color image (8 bits per color intensity):
• Bin 0 = color intensities from 0-31
• Bin 1 = color intensities from 32-63
• ...
• Bin 7 = color intensities from 224-255

Three attributes per bin:
• Row of the bin's centroid
• Column of the bin's centroid
• Frequency count of the bin

(8 bins per color level * 3 attributes/bin) * 3 color levels = 72 features
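The 72-feature mapping can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' code: the function names `frame_features` and `video_trajectory` are hypothetical, frames are assumed to be H x W x 3 `uint8` arrays, and an empty bin is assumed to get the centroid (0, 0), which the slides do not specify.

```python
import numpy as np

def frame_features(frame, bins_per_channel=8):
    """Map one H x W x 3 uint8 RGB frame to a 72-dimensional feature vector."""
    height, width, channels = frame.shape
    bin_width = 256 // bins_per_channel            # 32 intensity levels per bin
    rows, cols = np.indices((height, width))       # pixel row/column coordinates
    features = []
    for c in range(channels):
        bin_index = frame[:, :, c] // bin_width    # bin number 0..7 for every pixel
        for b in range(bins_per_channel):
            mask = bin_index == b
            count = int(mask.sum())
            if count > 0:
                row_centroid = rows[mask].mean()
                col_centroid = cols[mask].mean()
            else:
                # Assumption: an empty bin is given the centroid (0, 0).
                row_centroid = col_centroid = 0.0
            features.extend([row_centroid, col_centroid, count])
    return np.array(features, dtype=float)

def video_trajectory(frames):
    """Stack per-frame feature vectors: one polyline vertex per frame."""
    return np.stack([frame_features(f) for f in frames])
```

Each row of the array returned by `video_trajectory` is one vertex of the polyline in the 72-dimensional feature space.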

Theoretical Results: relation between motion trajectory and video trajectory

Consider a video in which an object (a set of pixels) is moving on a uniform background. The object is visible in all frames and it is moving with a constant speed on a linear trajectory. Then the video trajectory in the feature space is a straight line.

If n objects are moving with constant speeds on linear trajectories, then the video trajectory is a straight line in the feature space.

Consider a video in which an object (a set of pixels) is moving on a uniform background. Then the trajectory vectors are contained in a plane.

If n objects are moving, then the dimension of the trajectory is at most 2n.

If a new object suddenly appears in the movie, the dimension of the trajectory increases at least by 1 and at most by 3.
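A small synthetic check of the first two statements, as a sketch only. It assumes a simplified feature vector with a single bin for the object (row centroid, column centroid, pixel count) instead of the full 72 features, and all names (`dot_frame`, `object_features`, `trajectory_rank`) are hypothetical. The rank of the matrix of steps between consecutive trajectory points serves as the dimension of the trajectory.

```python
import numpy as np

def dot_frame(top, left, size=3, height=64, width=64):
    """Binary frame: a size x size object on a uniform (zero) background."""
    frame = np.zeros((height, width), dtype=np.uint8)
    frame[top:top + size, left:left + size] = 255
    return frame

def object_features(frame):
    """(row centroid, column centroid, pixel count) of the non-background pixels."""
    rows, cols = np.nonzero(frame)
    return np.array([rows.mean(), cols.mean(), rows.size], dtype=float)

def trajectory_rank(positions):
    """Rank of the steps between consecutive points of the video trajectory."""
    traj = np.stack([object_features(dot_frame(r, c)) for r, c in positions])
    steps = np.diff(traj, axis=0)
    return int(np.linalg.matrix_rank(steps))

# Constant speed on a line: every step is the same vector.
linear = [(5 + 2 * t, 3 + 3 * t) for t in range(15)]

# Motion on a circle: steps vary but stay in the (row, column) plane.
angles = 2 * np.pi * np.arange(15) / 15
circular = [(30 + int(round(20 * np.sin(a))), 30 + int(round(20 * np.cos(a))))
            for a in angles]

linear_rank = trajectory_rank(linear)
circular_rank = trajectory_rank(circular)
```

With these inputs `linear_rank` is 1 (a straight line) and `circular_rank` is 2 (a planar curve), in line with the statements above for a single object.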

MovingDotMovieWithAdditionalDot.avi

Robust Rank Computation

Using singular value decomposition, based on:
• C. Rao, A. Yilmaz, and M. Shah. View-Invariant Representation and Recognition of Actions. Int. J. of Computer Vision 50, 2002.
• M. Seitz and C. R. Dyer. View-invariant analysis of cyclic motion. Int. J. of Computer Vision 16, 1997.

$\mathrm{err}^2(M) = \sum_{i=3}^{n} \sigma_i^2$, where $\sigma_i$ are the singular values of $M$.

We compute err in a window of 11 consecutive frames in our experiments.
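A sliding-window version of this computation might look as follows. It is a sketch under assumptions, not the authors' implementation: err is taken here as the sum of squared singular values from the third one onward (take the square root if a norm is preferred), the window matrix M is assumed to hold the trajectory vectors relative to the first frame of the window, and the names `window_err` and `sliding_err` are hypothetical.

```python
import numpy as np

def window_err(window, rank=2):
    """Energy of the window that is not captured by its best rank-`rank` fit."""
    # Assumption: trajectory vectors are taken relative to the window's first frame.
    vectors = window[1:] - window[0]
    singular_values = np.linalg.svd(vectors, compute_uv=False)  # sorted descending
    # Skip the `rank` largest singular values (1-based index i = 3 onward for rank=2).
    return float(np.sum(singular_values[rank:] ** 2))

def sliding_err(trajectory, window_size=11):
    """err for every window of `window_size` consecutive frames."""
    n = len(trajectory)
    return np.array([window_err(trajectory[s:s + window_size])
                     for s in range(n - window_size + 1)])
```

For a trajectory that stays in a plane the values are numerically zero; a window in which the trajectory leaves the plane, for example because a new object appears, produces a clearly positive value.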

[Figure: MovingDotMovieWithAdditionalDotBins, graph of Norm Dist for a window of 11 frames versus frame number. x-axis: Frame Number; y-axis: Norm Dist for the window of 11 frames.]

MovingDotMovieWithAdditionalDot.avi

Interpolation of video trajectory

MovingDotMovie_Clockwise.avi

MovingDotMovieWithAdditionalDot.avi

Polygon simplification

Relevance Ranking    Frame Number
0                    1
1                    100
99                   5
98                   12

Frames with decreasing relevance

Discrete Curve Evolution

P = P_0, ..., P_m

P_{i+1} is obtained from P_i by deleting the vertices of P_i that have minimal relevance measure

K(v, P_i) = K(u,v,w) = |d(u,v) + d(v,w) - d(u,w)|

[Figure: three consecutive polygon vertices u, v, w, shown in two configurations.]
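A minimal sketch of discrete curve evolution on an open polyline, assuming that u and w are the current neighbors of v, that d is the Euclidean distance, that the two endpoints are never deleted, and that one vertex is deleted per step (the first one on ties). The function names `relevance` and `discrete_curve_evolution` are hypothetical.

```python
import numpy as np

def relevance(u, v, w):
    """K(u, v, w) = |d(u, v) + d(v, w) - d(u, w)| with Euclidean d."""
    d = np.linalg.norm
    return abs(d(v - u) + d(w - v) - d(w - u))

def discrete_curve_evolution(points, n_keep=3):
    """Repeatedly delete the interior vertex with minimal relevance.

    Returns (kept, removed): indices of the surviving vertices in curve
    order, and indices of deleted vertices in order of deletion
    (least relevant first).
    """
    points = np.asarray(points, dtype=float)
    kept = list(range(len(points)))
    removed = []
    while len(kept) > max(n_keep, 2):
        scores = [relevance(points[kept[j - 1]], points[kept[j]], points[kept[j + 1]])
                  for j in range(1, len(kept) - 1)]
        j_min = 1 + int(np.argmin(scores))   # position of the least relevant vertex
        removed.append(kept.pop(j_min))
    return kept, removed
```

Reversing `removed` and prepending the surviving vertices gives a relevance ranking of the frames, which is one way to read off key frames from the video trajectory.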

Discrete Curve Evolution: Preservation of position, no blurring

Discrete Curve Evolution: robustness with respect to noise

Discrete Curve Evolution: extraction of linear segments

Key Frame Extraction

Key frames and rank

Security1: Bins Matrix, Distance Matrix

[Figure: security1Bins, graph of Norm Dist for a window of 11 frames versus frame number (err for the security1 video). x-axis: Frame Number; y-axis: Norm Dist for the window of 11 frames.]

M. S. Drew and J. Au: http://www.cs.sfu.ca/~mark/ftp/AcmMM00/

Predictability of video parts: Local Curveness computation

We divide the video polygonal curve P into parts T_i. For videos with 25 fps, T_i contains 25 frames.

We apply discrete curve evolution to each T_i until three points remain: a, b, c.

Curveness measure of T_i:

C(T_i,P) = |d(a, b) + d(b, c) - d(a, c)|

b is the most relevant frame in T_i and the first vertex of T_i+1.
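The part-wise computation can be sketched as follows. This is an illustration under assumptions: a and c are taken to be the first and last frame of each part, d is the Euclidean distance, one vertex is deleted per evolution step, and a final part shorter than three frames is skipped. The names `dist`, `reduce_to_three` and `local_curveness` are hypothetical.

```python
import numpy as np

def dist(p, q):
    """Euclidean distance between two trajectory points."""
    return float(np.linalg.norm(np.asarray(p, dtype=float) - np.asarray(q, dtype=float)))

def reduce_to_three(part):
    """Evolve one part until three vertices remain; return their local indices."""
    kept = list(range(len(part)))
    while len(kept) > 3:
        scores = [abs(dist(part[kept[j - 1]], part[kept[j]])
                      + dist(part[kept[j]], part[kept[j + 1]])
                      - dist(part[kept[j - 1]], part[kept[j + 1]]))
                  for j in range(1, len(kept) - 1)]
        kept.pop(1 + int(np.argmin(scores)))   # delete the least relevant vertex
    return kept

def local_curveness(trajectory, part_length=25):
    """Return a list of (frame index of b, C(T_i, P)) for consecutive parts T_i."""
    results = []
    start = 0
    while len(trajectory) - start >= 3:
        part = trajectory[start:start + part_length]
        a, b, c = reduce_to_three(part)
        value = abs(dist(part[a], part[b]) + dist(part[b], part[c])
                    - dist(part[a], part[c]))
        results.append((start + b, value))
        start += b                              # b becomes the first vertex of T_i+1
    return results
```

Parts in which the trajectory is nearly straight give values close to zero, while a part that contains a sharp turn gives a large value at the frame where the turn occurs.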

security7

[Figure: security7Bins, graph of Norm Dist for a window of 11 frames versus frame number (err for security7). x-axis: Frame Number; y-axis: Norm Dist for the window of 11 frames.]

2D projection by PCA of video trajectory for security7
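One common way to obtain such a 2D projection is PCA via the singular value decomposition of the mean-centered trajectory. The sketch below assumes the trajectory is an array with one feature vector per frame; the function name `pca_project_2d` is hypothetical, and the slides do not say how the PCA was computed.

```python
import numpy as np

def pca_project_2d(trajectory):
    """Project an (n_frames x n_features) trajectory onto its two main axes."""
    centered = trajectory - trajectory.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T
```

Plotting the two returned columns against each other and connecting consecutive frames gives a planar picture of the video trajectory.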

Mov3

[Figure: Mov3Bins, graph of Norm Dist for a window of 11 frames versus frame number. x-axis: Frame Number; y-axis: Norm Dist for the window of 11 frames.]

Mov3: Rustam waving his hand.

Bins Matrix
Key frames = 1 378 52 142 253 235 148 31 155 167

Distance Matrix
Key frames = 1 378 253 220 161 109 50 155 149 270

Hall_monitor

[Figure: HallMonitorBins, graph of Norm Dist for a window of 11 frames versus frame number (err for hall_monitor). x-axis: Frame Number; y-axis: Norm Dist for the window of 11 frames.]

Hall Monitor: 2 persons entering and exiting a hall.

Bins Matrix
Key frames = 1 300 35 240 221 215 265 241 278 280

Distance Matrix
Key frames = 1 300 37 265 241 240 235 278 280 282

CameraAtLightSignal.avi

Multimodal Histogram

Histogram of Lena

Segmented Image

Image after segmentation: we get an outline of her face, hat, etc.

Gray Scale Image - Multimodal

Original Image of Lena

Thank you