Multi-Scale Video Cropping

Hazem El-Alfy, David Jacobs and Larry Davis
Department of Computer Science
University of Maryland, College Park

Sep 25th 2007, ACM MM ’07

Modern Surveillance Systems
Networks of surveillance cameras.
Control room:
  Fewer monitors than cameras.
  Far fewer operators than monitors.
  Cameras cycle through monitors.

Modern Surveillance Systems
Typical control rooms: airports, subways, metropolitan areas, seaports, crowd control.

“Future” Control Rooms
“Continuous” display wall versus a fixed set of discrete monitors.
Algorithms to control:
  where to display videos,
  how much area to assign to them,
  how to display them.
Figure: Barco Control Room, Vienna, Austria

Video Cropping
Figure: Munich Airport (courtesy Siemens, NJ)

Why Cropping?
Resize video to save bandwidth or to fit the display area.
Crop before resizing to focus operator attention on important areas.

Problem Definition
Determine trajectories of cropping windows through the video:
  variable-size window,
  maximize captured saliency,
  smooth trajectory,
  occasional jumps (cuts) between trajectories.
Figure: axes x, y and t.

Problem Definition
Each frame $t$ is covered by variable-size overlapping windows $W_{i,t}$.
Saliency measure: $S(W_{i,t})$.
Objective: $\arg\max_{Q} \sum_{t} S(W_{i,t})$, over all window sequences $Q$.
Subject to constraints for smooth window motion and size change.
Figure: a frame with a window $W_{i,t}$.

Our Approach: Overview
Extract motion energy.
Model video as a graph.
Find trajectories as shortest paths in graph.
Merge trajectories.
Repeat for other segments of long videos.
Figure: pipeline diagram with stages Extract Motion Energy, Building Graph, Shortest Path + Smoothing, Wiping Frames and Merging Trajectories; data labels Video Frames, Motion Frames, Trajectories and Cropped Video.

Extracting Motion Energy
Motion energy as a saliency measure.
Frame differences are smoothed using morphological operations.
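The motion-energy step can be sketched as follows. This is only an illustration: the slide does not say which morphological operations are used, so the 3x3 opening (erosion followed by dilation), the threshold value and all function names here are assumptions. Frames are plain lists of grayscale rows.

```python
def frame_difference(prev, curr, threshold=15):
    """Binary mask of pixels whose absolute difference exceeds the threshold.

    The threshold value is an assumption, not taken from the slides.
    """
    return [[1 if abs(c - p) > threshold else 0 for p, c in zip(rp, rc)]
            for rp, rc in zip(prev, curr)]


def _morph(mask, op):
    """Apply op (min or max) over each pixel's 3x3 neighborhood, clipped at borders."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            neigh = [mask[j][i]
                     for j in range(max(0, y - 1), min(h, y + 2))
                     for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = op(neigh)
    return out


def erode(mask):
    return _morph(mask, min)


def dilate(mask):
    return _morph(mask, max)


def motion_energy(prev, curr, threshold=15):
    """Frame difference smoothed by a morphological opening (assumed choice)."""
    mask = frame_difference(prev, curr, threshold)
    return dilate(erode(mask))
```

With this choice, isolated noisy pixels are removed while compact moving regions keep their extent.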

Modeling Graph
Nodes: cropping windows in each frame.
Add dummy source and target nodes.
Edges: allowable window changes (location and size) between consecutive frames.
Figure: layered graph with a dummy source node, windows of the first frame, windows of the i-th frame, windows of the last frame and a dummy target node; two edge groups are labeled w=0.
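A minimal sketch of such a layered graph is given below. The regular grid of candidate windows, the per-axis limits `max_shift` and `max_resize`, and the node numbering are all assumptions made for illustration; the slides only state that edges encode allowable changes in location and size. Edge weights are left out here.

```python
from itertools import product


def candidate_windows(frame_w, frame_h, sizes, stride):
    """Enumerate (x, y, w, h) windows on a regular grid (grid layout is an assumption)."""
    windows = []
    for w, h in sizes:
        for y in range(0, frame_h - h + 1, stride):
            for x in range(0, frame_w - w + 1, stride):
                windows.append((x, y, w, h))
    return windows


def allowed(a, b, max_shift, max_resize):
    """True if window a may change into window b between consecutive frames."""
    return (abs(a[0] - b[0]) <= max_shift and abs(a[1] - b[1]) <= max_shift
            and abs(a[2] - b[2]) <= max_resize and abs(a[3] - b[3]) <= max_resize)


def build_graph(num_frames, windows, max_shift, max_resize):
    """Return (num_nodes, edges, source, target); edges are (u, v) pairs.

    Node 0 is the dummy source, the last node is the dummy target, and
    window i of frame t gets the id 1 + t * len(windows) + i.
    """
    n = len(windows)
    source, target = 0, 1 + num_frames * n

    def node(t, i):
        return 1 + t * n + i

    edges = [(source, node(0, i)) for i in range(n)]
    for t in range(num_frames - 1):
        for i, j in product(range(n), repeat=2):
            if allowed(windows[i], windows[j], max_shift, max_resize):
                edges.append((node(t, i), node(t + 1, j)))
    edges += [(node(num_frames - 1, i), target) for i in range(n)]
    return target + 1, edges, source, target
```

Because edges only link consecutive frames, the graph is acyclic and every source-to-target path picks exactly one window per frame.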

Modeling Graph
Multi-scale energy function for window $W$:
  $E(W) = S(W)$: always favors large windows.
  $E(W) = S(W)/A(W)$: favors small (dense) windows.
  $E(W) = S(W_{in})/A(W_{in}) - S_{belt}/K$
Edge weight: $w_{ij} = 1 - E_{Norm}(W_j)$
Figure: window $W_{in}$ and belt $S_{belt}$, with parts numbered 1 to 4.
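The third energy function and the edge weight can be illustrated with a small sketch. Several things here are assumptions rather than details from the slides: the belt is taken to be a band of `belt` pixels around the inner window, `k` stands in for the constant K with an arbitrary default, and E_Norm is assumed to be a min-max rescaling of E over the candidate windows.

```python
def region_sum(energy, x0, y0, x1, y1):
    """Sum of energy over rows y0..y1-1 and columns x0..x1-1, clipped to the frame."""
    h, w = len(energy), len(energy[0])
    x0, y0 = max(x0, 0), max(y0, 0)
    x1, y1 = min(x1, w), min(y1, h)
    return sum(energy[y][x] for y in range(y0, y1) for x in range(x0, x1))


def window_energy(energy, x, y, w, h, belt=1, k=10.0):
    """E(W) = S(W_in)/A(W_in) - S_belt/K for the inner window (x, y, w, h).

    belt (band width in pixels) and k are hypothetical parameters.
    """
    s_in = region_sum(energy, x, y, x + w, y + h)
    s_outer = region_sum(energy, x - belt, y - belt, x + w + belt, y + h + belt)
    s_belt = s_outer - s_in  # saliency left just outside the window
    return s_in / (w * h) - s_belt / k


def edge_weights(energies):
    """w = 1 - E_Norm, with E_Norm assumed to be a min-max rescaling to [0, 1]."""
    lo, hi = min(energies), max(energies)
    span = hi - lo
    return [1.0 - ((e - lo) / span if span else 0.0) for e in energies]
```

Under these assumptions, a window that tightly encloses the activity scores higher than one that is too large (low density) or too small (activity spills into the belt).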

Modeling Graph
Energy function computed for all windows in all frames.
Efficiently computed using integral images [Viola & Jones ’01]:
  $ii(x, y) = \sum_{x' < x,\, y' < y} i(x', y')$
  $E(W) = ii(x_3) - ii(x_2) - ii(x_4) + ii(x_1)$
Figure: video frame containing cropping window $W$ with corners $x_4$, $x_3$ (top row) and $x_1$, $x_2$ (bottom row).
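The integral-image trick can be sketched in a few lines. The table below uses the strict inequalities of the definition above, so it has one extra row and column of zeros; the function names and the (x, y, w, h) window convention are assumptions for illustration.

```python
def integral_image(img):
    """ii[y][x] = sum of img[y'][x'] over all y' < y and x' < x.

    The result has one more row and column than img; row 0 and column 0 are zero.
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def window_sum(ii, x, y, w, h):
    """Sum over the window with four lookups, whatever the window size.

    The corner farthest from the origin plays the role of x3 and the nearest
    one the role of x1; the other two corners are subtracted.
    """
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]
```

The table is built once per frame in time proportional to the number of pixels, after which every candidate window costs constant time.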

Shortest Path
Dial’s implementation of Dijkstra’s algorithm: linear in the number of graph nodes.
Smoothing: low-pass filter + cubic Hermite interpolation.
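A minimal sketch of Dial's bucket-based variant of Dijkstra's algorithm follows. It requires small non-negative integer weights, so it assumes the real-valued edge weights have already been quantized to integers in 0..`max_weight`; that quantization, the function name and the edge-list format are assumptions, not details from the slides.

```python
def dial_shortest_path(num_nodes, edges, source, target, max_weight):
    """Shortest path with integer weights in 0..max_weight using distance buckets.

    edges is a list of (u, v, w) triples. Returns (distance, path), or
    (None, []) when the target cannot be reached.
    """
    adj = [[] for _ in range(num_nodes)]
    for u, v, w in edges:
        adj[u].append((v, w))

    inf = float("inf")
    dist = [inf] * num_nodes
    parent = [None] * num_nodes
    dist[source] = 0

    # A simple path has at most num_nodes - 1 edges, which bounds any distance.
    buckets = [[] for _ in range(max_weight * (num_nodes - 1) + 1)]
    buckets[0].append(source)

    for d in range(len(buckets)):
        # The inner loop also drains nodes added to this bucket by zero-weight edges.
        while buckets[d]:
            u = buckets[d].pop()
            if dist[u] != d:
                continue  # stale entry: u was reached more cheaply already
            for v, w in adj[u]:
                nd = d + w
                if nd < dist[v]:
                    dist[v] = nd
                    parent[v] = u
                    buckets[nd].append(v)

    if dist[target] == inf:
        return None, []
    path, node = [], target
    while node is not None:
        path.append(node)
        node = parent[node]
    path.reverse()
    return dist[target], path
```

Replacing the priority queue with an array of buckets removes the logarithmic factor, which is what makes the bounded-integer-weight case attractive for large window graphs.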

Merging Trajectories
More cropping windows are needed to capture simultaneous activity.
Wipe captured activity from the motion frames and repeat the earlier process on the remaining motion.
Merge trajectories: find the shortest path through a graph of trajectories.
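The wiping step might look like the sketch below: motion energy inside the windows of an already-found trajectory is set to zero so that the next pass picks up the remaining activity. The data layout (one (x, y, w, h) window per frame, non-negative coordinates) and the function name are assumptions; the merging graph itself is not sketched because the slides do not describe it.

```python
def wipe_trajectory(motion_frames, trajectory):
    """Return copies of the motion frames with energy inside each window zeroed.

    trajectory holds one (x, y, w, h) window per frame, aligned with motion_frames.
    The input frames are left untouched.
    """
    wiped = [[row[:] for row in frame] for frame in motion_frames]
    for frame, (x, y, w, h) in zip(wiped, trajectory):
        for r in range(y, min(y + h, len(frame))):
            for c in range(x, min(x + w, len(frame[0]))):
                frame[r][c] = 0
    return wiped
```

Running the shortest-path search again on the wiped frames then yields a second trajectory covering activity the first one missed.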

Processing Long Videos
Problems:
  Graph gets too big if video is long.
  Latencies must be short in surveillance systems.
Solution:
  Break long videos into segments with overlap.
  Process each segment, then stitch results together.
Figure: video timeline with two points marked "break here".

Processing Long Videos
Issues:
  How short can segments be?
  Are there preferable locations to break video?
  Overlap amount needed for smooth transitions?
We ran many experiments for a fixed-size crop:
  Shortest paths converge quickly. Segments can be as short as 40 frames.
  Avoid periods of low activity when breaking video.
  Overlap intervals of 20 frames are sufficient.
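The segmentation can be sketched as below, using the 40-frame and 20-frame figures as defaults. Whether the segment length includes the overlap is not stated on the slide, so this sketch assumes it does; it also breaks at fixed positions and does not try to avoid low-activity periods. The function name is hypothetical.

```python
def split_into_segments(num_frames, segment_len=40, overlap=20):
    """Return half-open (start, end) frame ranges covering the video with overlap.

    Assumes segment_len includes the overlap shared with the previous segment.
    """
    if not 0 <= overlap < segment_len:
        raise ValueError("overlap must be non-negative and smaller than segment_len")
    if num_frames <= 0:
        return []
    step = segment_len - overlap
    segments = []
    start = 0
    while True:
        end = min(start + segment_len, num_frames)
        segments.append((start, end))
        if end == num_frames:
            break
        start += step
    return segments
```

Each segment can then be processed independently as soon as its frames arrive, which keeps both the graph size and the latency bounded.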

Results
Munich Airport: variable-size single window.

Results
Munich Airport: video-in-video display.

Results
Traffic at a stop sign on campus (2 windows).

Contributions
Variable-size smooth cropping window.
Simultaneous multiple cropping windows.
Relatively short video segments processed vs. the entire video (online).
  Empirically shown to be identical to processing the largest video that can be processed as a whole.