
Video Analysis

Page 1: Video Analysis

Video Analysis

Mei-Chen Yeh
May 29, 2012

Page 2: Video Analysis

Outline

• Video representation
• Motion
• Actions in Video

Page 3: Video Analysis

Videos

• A natural video stream is continuous in both spatial and temporal domains.

• A digital video stream samples pixels in both domains.

Page 4: Video Analysis

Video processing

YCbCr

Page 5: Video Analysis

Video signal representation (1)

• Composite color signal
– R, G, B
– Y, Cb, Cr

• Why Y, Cb, Cr?
– Backward compatibility (black-and-white to color TV)
– The eye is less sensitive to changes of Cb and Cr components

$Y = 0.299R + 0.587G + 0.114B$
$C_b = B - Y$
$C_r = R - Y$

Luminance (Y)

Chrominance (Cb + Cr)
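The luma/chroma split on this slide can be sketched in a few lines. This is a minimal illustration that assumes the luma weights 0.299, 0.587, 0.114 and the unscaled color differences B − Y and R − Y; broadcast standards additionally scale and offset the two chroma values, which is left out here. The function name is illustrative.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one RGB pixel to luma plus unscaled color differences."""
    y = 0.299 * r + 0.587 * g + 0.114 * b  # luminance (weighted sum of R, G, B)
    cb = b - y                              # blue color difference
    cr = r - y                              # red color difference
    return y, cb, cr


# A gray pixel carries no chroma; a pure red pixel has a large positive Cr.
gray = rgb_to_ycbcr(128, 128, 128)
red = rgb_to_ycbcr(255, 0, 0)
print(gray)
print(red)
```

Because a black-and-white receiver only needs Y, the same signal serves both kinds of TV set, which is the backward-compatibility argument above.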

Page 6: Video Analysis

Video signal representation (2)

• Y is the luma component and Cb and Cr are the blue and red chroma components.

Y Cb Cr

Page 7: Video Analysis

Sampling formats (1)

4:4:4 4:2:2 (DVB) 4:1:1 (DV)

Slide from Dr. Ding

Page 8: Video Analysis

Sampling formats (2)

4:2:0 (VCD, DVD)
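The sampling labels can be turned into an average bit cost per pixel. The sketch below assumes 8-bit samples and the usual reading of the J:a:b notation: a block J pixels wide and 2 rows high, with a chroma samples in the first row and b additional chroma samples in the second row. The function name is illustrative.

```python
def bits_per_pixel(j, a, b, bit_depth=8):
    """Average bits per pixel for a J:a:b chroma sampling format."""
    luma_samples = 2 * j            # one Y sample per pixel in the J x 2 block
    chroma_samples = 2 * (a + b)    # Cb and Cr each contribute a + b samples
    return bit_depth * (luma_samples + chroma_samples) / (2 * j)


for name, (j, a, b) in {"4:4:4": (4, 4, 4), "4:2:2": (4, 2, 2),
                        "4:1:1": (4, 1, 1), "4:2:0": (4, 2, 0)}.items():
    print(name, bits_per_pixel(j, a, b))
```

Under these assumptions 4:2:2 costs 16 bits per pixel and both 4:1:1 and 4:2:0 cost 12, compared with 24 for 4:4:4.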

Page 9: Video Analysis

TV encoding system (1)

• PAL
– Phase Alternating Line: a color encoding system used in broadcast television systems in large parts of the world.
• SECAM
– Séquentiel couleur à mémoire (French for "sequential color with memory"): an analog color television system first used in France.
• NTSC
– National Television System Committee: the analog television system used in most of North America, parts of South America, Burma, South Korea, Taiwan, Japan, the Philippines, and some Pacific island nations and territories.

Page 10: Video Analysis

TV encoding system (2)

Page 11: Video Analysis

Uncompressed bitrate of videos

Video Type   Pixels per Frame   Image Aspect Ratio   Frames per Second   Bits/pixel   Uncompressed Bitrate
NTSC         480 × 483          4:3                  29.97               16           111.2 Mb/s
PAL          576 × 576          4:3                  25                  16           132.7 Mb/s
CIF          352 × 288          4:3                  14.98               12           18.2 Mb/s
QCIF         176 × 144          4:3                  9.99                12           3.0 Mb/s
HDTV         1280 × 720         16:9                 59.94               12           662.9 Mb/s
HDTV         1920 × 1080        16:9                 29.97               12           745.7 Mb/s

Slide from Dr. Chang
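Each bitrate in this table is the product of the other columns: pixels per frame × frames per second × bits per pixel. A minimal sketch of that arithmetic, assuming 1 Mb = 10^6 bits (the function and variable names are illustrative):

```python
def uncompressed_bitrate_mbps(width, height, fps, bits_per_pixel):
    """Raw video bitrate in megabits per second (1 Mb = 1e6 bits)."""
    return width * height * fps * bits_per_pixel / 1e6


formats = [
    ("NTSC", 480, 483, 29.97, 16),
    ("PAL", 576, 576, 25, 16),
    ("CIF", 352, 288, 14.98, 12),
    ("QCIF", 176, 144, 9.99, 12),
    ("HDTV 1280x720", 1280, 720, 59.94, 12),
    ("HDTV 1920x1080", 1920, 1080, 29.97, 12),
]

rates = {name: round(uncompressed_bitrate_mbps(w, h, fps, bpp), 1)
         for name, w, h, fps, bpp in formats}
for name, rate in rates.items():
    print(f"{name}: {rate} Mb/s")
```

With this formula the 1280 × 720 row evaluates to about 662.9 Mb/s and the 1920 × 1080 row to about 745.7 Mb/s.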

Page 12: Video Analysis

Outline

• Video representation
• Motion
• Actions in Video

Page 13: Video Analysis

Motion and perceptual organization

• Sometimes, motion is the foremost cue

Page 14: Video Analysis

Motion and perceptual organization

• Even poor motion data can evoke a strong percept

Page 15: Video Analysis

Motion and perceptual organization

• Even poor motion data can evoke a strong percept

Page 16: Video Analysis

Uses of motion

• Estimating 3D structure
• Segmenting objects based on motion cues
• Learning dynamical models
• Recognizing events and activities
• Improving video quality (motion stabilization)
• Compressing videos
• …

Page 17: Video Analysis

Motion field

• The motion field is the projection of the 3D scene motion into the image

Page 18: Video Analysis

Motion field

• P(t) is a moving 3D point
• Velocity of scene point: V = dP/dt
• p(t) = (x(t), y(t)) is the projection of P in the image
• Apparent velocity v in the image:
– vx = dx/dt
– vy = dy/dt
• These components are known as the motion field of the image

[Figure: scene points P(t), P(t+dt) with velocity V; image points p(t), p(t+dt) with velocity v]
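To make the projection of V onto the image concrete, here is a small sketch. It assumes a pinhole camera with focal length f, so that x = f·X/Z and y = f·Y/Z; the slide does not name a camera model, so this is an assumption, as are the function names. Differentiating the projection with respect to time gives the motion field, and a finite-difference check confirms it.

```python
def project(P, f=1.0):
    """Pinhole projection of a 3D point P = (X, Y, Z) onto the image plane."""
    X, Y, Z = P
    return (f * X / Z, f * Y / Z)


def motion_field(P, V, f=1.0):
    """Image velocity (vx, vy) of the projection of P moving with velocity V."""
    X, Y, Z = P
    Vx, Vy, Vz = V
    vx = f * (Vx * Z - X * Vz) / Z ** 2  # d/dt of f*X/Z
    vy = f * (Vy * Z - Y * Vz) / Z ** 2  # d/dt of f*Y/Z
    return (vx, vy)


P = (1.0, 2.0, 4.0)
V = (0.5, 0.0, 1.0)
v = motion_field(P, V)

# Finite-difference check: move P by V*dt and see how far the projection moves.
dt = 1e-6
P_next = tuple(p + dt * vel for p, vel in zip(P, V))
p0, p1 = project(P), project(P_next)
v_fd = ((p1[0] - p0[0]) / dt, (p1[1] - p0[1]) / dt)
print(v, v_fd)
```

Note that the point has no velocity in Y, yet vy is non-zero: moving away from the camera shrinks the projection toward the image center.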

Page 19: Video Analysis

Motion estimation techniques

• Based on temporal changes in image intensities
• Direct methods
– Directly recover image motion at each pixel from spatio-temporal image brightness variations
– Dense motion fields, but sensitive to appearance variations
– Suitable when image motion is small
• Feature-based methods
– Extract visual features (corners, textured areas) and track them over multiple frames
– Sparse motion fields, but more robust tracking
– Suitable when image motion is large

Page 20: Video Analysis

Optical flow

• The velocity of observed 2-D motion vectors
• Can be caused by
– object motions
– camera movements
– illumination condition changes

Page 21: Video Analysis

Optical flow ≠ the true motion field

[Figure, left: Motion field exists but no optical flow]
[Figure, right: No motion field but shading changes]

Page 22: Video Analysis

Problem definition: optical flow

How to estimate pixel motion from image I(x,y,t) to image I(x,y,t+dt)?
• Solve pixel correspondence problem
– given a pixel in I_t, look for nearby pixels of the same color in I_{t+dt}

Key assumptions
• color constancy: a point in I_t looks the same in I_{t+dt}
– For grayscale images, this is brightness constancy
• small motion: points do not move very far

This is called the optical flow problem.

[Figure: frames I(x,y,t) and I(x,y,t+dt)]
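The correspondence search described on this slide can be sketched as a brute-force lookup. This is only an illustration of the two key assumptions, not a practical flow estimator: it looks in a small window of the second frame (small motion) for the pixel whose intensity is closest to the source pixel (brightness constancy), and prefers the shortest displacement on ties. The function name and the `radius` parameter are illustrative.

```python
def best_displacement(frame_a, frame_b, x, y, radius=2):
    """Find (dx, dy) so that frame_b[y+dy][x+dx] best matches frame_a[y][x]."""
    height, width = len(frame_b), len(frame_b[0])
    target = frame_a[y][x]
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            nx, ny = x + dx, y + dy
            if 0 <= nx < width and 0 <= ny < height:
                # Primary cost: intensity difference; tie-break: shorter move.
                cost = (abs(frame_b[ny][nx] - target), dx * dx + dy * dy)
                if best is None or cost < best[0]:
                    best = (cost, (dx, dy))
    return best[1]


# A single bright pixel moves one step to the right between the two frames.
frame_a = [[0, 0, 0, 0, 0],
           [0, 255, 0, 0, 0],
           [0, 0, 0, 0, 0]]
frame_b = [[0, 0, 0, 0, 0],
           [0, 0, 255, 0, 0],
           [0, 0, 0, 0, 0]]

moving = best_displacement(frame_a, frame_b, x=1, y=1)
static = best_displacement(frame_a, frame_b, x=4, y=2)
print(moving, static)
```

Matching single pixels is ambiguous whenever many nearby pixels share the same intensity, which is one reason the later slides turn to gradient-based constraints.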

Page 23: Video Analysis

Optical flow constraints (grayscale images)

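Let’s look at these constraints more closely:
• brightness constancy:
$I(x+u, y+v, t+dt) = I(x, y, t)$
• small motion: (u and v are small)
– using Taylor’s expansion:
$I(x+u, y+v, t+dt) \approx I(x, y, t) + \frac{\partial I}{\partial x}u + \frac{\partial I}{\partial y}v + \frac{\partial I}{\partial t}dt$

[Figure: frames $I(x, y, t)$ and $I(x, y, t+dt)$]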

Page 24: Video Analysis

Optical flow equation

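• Combining these two equations

$\frac{\partial I}{\partial x}u + \frac{\partial I}{\partial y}v + \frac{\partial I}{\partial t}dt = 0$

u, v: displacements

• Dividing both sides by dt

$\nabla I \cdot (v_x, v_y) + \frac{\partial I}{\partial t} = 0$

Known as the optical flow equation

$\nabla I = \left(\frac{\partial I}{\partial x}, \frac{\partial I}{\partial y}\right)^T$: spatial gradient vector
$(v_x, v_y) = (u/dt, v/dt)$: velocity vector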

Page 25: Video Analysis

• Q: how many unknowns and equations per pixel?
– 2 unknowns, one equation

$\nabla I \cdot (v_x, v_y) + \frac{\partial I}{\partial t} = 0$

• What does this constraint mean?
• The component of the flow perpendicular to the gradient (i.e., parallel to the edge) is unknown

If (vx, vy) satisfies the equation, so does (vx + u’, vy + v’) if $\nabla I \cdot (u', v') = 0$

[Figure: an edge and its gradient, with the vectors (vx, vy), (u’, v’) and (vx + u’, vy + v’)]
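A quick numeric check of this ambiguity, as a sketch with made-up gradient values: any flow that satisfies ∇I · (vx, vy) + ∂I/∂t = 0 still satisfies it after adding a multiple of a vector perpendicular to the gradient. Only the component along the gradient (often called the normal flow) is determined by one pixel. All names and numbers below are illustrative.

```python
def flow_residual(grad, v, It):
    """Left-hand side of the optical flow equation: grad . v + It."""
    return grad[0] * v[0] + grad[1] * v[1] + It


grad = (2.0, 1.0)   # spatial gradient (Ix, Iy) at one pixel
It = -4.0           # temporal derivative at the same pixel
v = (1.5, 1.0)      # one flow that satisfies the constraint

# Any vector perpendicular to the gradient (parallel to the edge).
perp = (-grad[1], grad[0])
alt = (v[0] + 3.0 * perp[0], v[1] + 3.0 * perp[1])

# The only component fixed by the constraint: the flow along the gradient.
g2 = grad[0] ** 2 + grad[1] ** 2
normal_flow = (-It * grad[0] / g2, -It * grad[1] / g2)

print(flow_residual(grad, v, It),
      flow_residual(grad, alt, It),
      flow_residual(grad, normal_flow, It))
```

All three residuals are zero, so a single pixel cannot distinguish between these three candidate flows.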

Page 26: Video Analysis

• Q: how many unknowns and equations per pixel?
– 2 unknowns, one equation

$\nabla I \cdot (v_x, v_y) + \frac{\partial I}{\partial t} = 0$

• What does this constraint mean?
• The component of the flow perpendicular to the gradient (i.e., parallel to the edge) is unknown

This explains the Barber Pole illusion

Page 27: Video Analysis

The aperture problem

Perceived motion

Page 28: Video Analysis

The aperture problem

Actual motion

Page 29: Video Analysis

The barber pole illusion

http://en.wikipedia.org/wiki/Barberpole_illusion

Page 30: Video Analysis

The barber pole illusion

http://en.wikipedia.org/wiki/Barberpole_illusion

Page 31: Video Analysis

To solve the aperture problem…

• We need more equations for a pixel.
• Example
– Spatial coherence constraint: pretend the pixel’s neighbors have the same (vx, vy)
– Lucas & Kanade (1981)
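The spatial coherence idea can be sketched as a least-squares problem: every pixel in a small window contributes one optical flow equation Ix·vx + Iy·vy + It = 0, and the shared (vx, vy) is the solution of the resulting 2 × 2 normal equations. This is a minimal single-window illustration in the spirit of Lucas & Kanade, without the weighting, pyramids or iteration that practical implementations add; the function name and input layout are assumptions.

```python
def lucas_kanade_window(gradients):
    """Least-squares flow for one window.

    gradients: iterable of (Ix, Iy, It) tuples, one per pixel in the window.
    Returns (vx, vy) minimizing the sum of (Ix*vx + Iy*vy + It)**2.
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for ix, iy, it in gradients:
        sxx += ix * ix
        sxy += ix * iy
        syy += iy * iy
        sxt += ix * it
        syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        # All gradients point the same way: the aperture problem remains.
        raise ValueError("gradients in the window are degenerate")
    # Solve [[sxx, sxy], [sxy, syy]] @ [vx, vy] = -[sxt, syt].
    vx = (-syy * sxt + sxy * syt) / det
    vy = (sxy * sxt - sxx * syt) / det
    return vx, vy


# Synthetic window: pick a true flow and generate consistent It values.
true_flow = (0.5, -0.25)
spatial = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, -1.0)]
window = [(ix, iy, -(ix * true_flow[0] + iy * true_flow[1])) for ix, iy in spatial]
estimate = lucas_kanade_window(window)
print(estimate)
```

The method only works when the window contains gradients in more than one direction, for example around a corner; on a straight edge the determinant vanishes and the function raises an error.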

Page 32: Video Analysis

Outline

• Video representation
• Motion
• Actions in Video
– Background subtraction
– Recognition of actions based on motion patterns

Page 33: Video Analysis

Using optical flow: recognizing facial expressions

Recognizing Human Facial Expression (1994) by Yaser Yacoob, Larry S. Davis

Page 34: Video Analysis

Example use of optical flow: visual effects in films

http://www.fxguide.com/article333.html

Page 35: Video Analysis

Slide credit: Birgi Tamersoy

Page 36: Video Analysis

Background subtraction

• Simple techniques can do OK with a static camera
• …But hard to do perfectly

• Widely used:
– Traffic monitoring (counting vehicles, detecting & tracking vehicles, pedestrians)
– Human action recognition (run, walk, jump, squat)
– Human-computer interaction
– Object tracking

Page 37: Video Analysis

Slide credit: Birgi Tamersoy

Page 38: Video Analysis

Slide credit: Birgi Tamersoy

Page 39: Video Analysis

Slide credit: Birgi Tamersoy

Page 40: Video Analysis

Slide credit: Birgi Tamersoy

Page 41: Video Analysis

Frame differences vs. background subtraction

• Toyama et al. 1999

Page 42: Video Analysis

Slide credit: Birgi Tamersoy

Page 43: Video Analysis

Pros and cons

Advantages:
• Extremely easy to implement and use
• Fast
• Background models need not be constant; they change over time

Disadvantages:
• Accuracy of frame differencing depends on object speed and frame rate
• Median background model: relatively high memory requirements
• Setting global threshold Th…

Slide credit: Birgi Tamersoy
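The two techniques named on this slide can be sketched on toy data. This is a minimal illustration, assuming grayscale frames stored as nested lists, a single global threshold Th, and a background model taken as the per-pixel median of a few stored frames; the function names are illustrative.

```python
from statistics import median


def frame_difference_mask(prev, curr, th):
    """Foreground where |curr - prev| exceeds the global threshold th."""
    return [[abs(c - p) > th for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev, curr)]


def median_background(frames):
    """Per-pixel median over a list of frames (all frames must be kept in memory)."""
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[median(f[r][c] for f in frames) for c in range(cols)]
            for r in range(rows)]


def background_subtraction_mask(frame, background, th):
    """Foreground where the frame differs from the background model by more than th."""
    return frame_difference_mask(background, frame, th)


# A bright object (200) moves right over a dark background (10), one pixel per frame.
f0 = [[200, 10, 10, 10]]
f1 = [[10, 200, 10, 10]]
f2 = [[10, 10, 200, 10]]

background = median_background([f0, f1, f2])
diff_mask = frame_difference_mask(f1, f2, th=50)
bg_mask = background_subtraction_mask(f2, background, th=50)
print(background, diff_mask, bg_mask)
```

In this toy case frame differencing marks both the old and the new object position, while subtracting the median background marks only the current position. The median model pays for this by keeping several frames in memory.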

Page 44: Video Analysis

Background subtraction with depth

How can we select foreground pixels based on depth information? Leap: http://www.leapmotion.com/
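One possible answer, sketched under assumptions the slide does not state: if a depth map of the empty scene is available, a pixel can be marked as foreground when it is closer to the camera than the background by more than some margin. The function name, the margin and the use of meters are all illustrative.

```python
def depth_foreground_mask(depth, background_depth, margin):
    """Foreground where the scene is closer than the stored background by more than margin."""
    return [[(bd - d) > margin for d, bd in zip(depth_row, bg_row)]
            for depth_row, bg_row in zip(depth, background_depth)]


background_depth = [[2.0, 2.0, 2.0]]  # empty scene, in meters
depth = [[2.0, 0.8, 1.95]]            # a hand at 0.8 m, small sensor noise at 1.95 m
mask = depth_foreground_mask(depth, background_depth, margin=0.1)
print(mask)
```

Unlike intensity-based subtraction, this test is unaffected by lighting or shadows, though real depth sensors also produce missing or invalid readings that this sketch ignores.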

Page 45: Video Analysis

Outline

• Video representation
• Motion
• Actions in video
– Background subtraction
– Recognition of action based on motion patterns

Page 46: Video Analysis

Motion analysis in video

• “Actions”: atomic motion patterns -- often gesture-like, single clear-cut trajectory, single nameable behavior (e.g., sit, wave arms)

• “Activity”: series or composition of actions (e.g., interactions between people)

• “Event”: combination of activities or actions (e.g., a football game, a traffic accident)

Modified from Venu Govindaraju

Page 47: Video Analysis

Surveillance

http://users.isr.ist.utl.pt/~etienne/mypubs/Auvinetal06PETS.pdf

Page 48: Video Analysis

Interfaces

• https://flutterapp.com/

Page 49: Video Analysis

Human activity in video: basic approaches

• Model-based action/activity recognition:
– Use human body tracking and pose estimation techniques, relate to action descriptions
– Major challenge: accurate tracks in spite of occlusion, ambiguity, low resolution
• Activity as motion, space-time appearance patterns
– Describe overall patterns, but no explicit body tracking
– Typically learn a classifier
– We’ll look at a specific instance…

Page 50: Video Analysis

The 30-Pixel Man

• Recognize actions at a distance [ICCV 2003]
– Low resolution, noisy data, not going to be able to track each limb
– Moving camera, occlusions
– Wide range of actions (including non-periodic)

[Efros, Berg, Mori, & Malik 2003] http://graphics.cs.cmu.edu/people/efros/research/action/

Page 51: Video Analysis

Approach

• Motion-based approach
– Non-parametric; use large amount of data
– Classify a novel motion by finding the most similar motion from the training set
• More specifically,
– A motion description based on optical flow
– An associated similarity measure used in a nearest neighbor framework

[Efros, Berg, Mori, & Malik 2003] http://graphics.cs.cmu.edu/people/efros/research/action/
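The nearest neighbor framework can be sketched generically. The code below is not the descriptor or similarity measure from the cited paper: it assumes each motion is already summarized as a flat list of flow values and uses plain normalized correlation as a stand-in similarity. The labels and numbers are made up for illustration.

```python
import math


def normalized_correlation(a, b):
    """Cosine-style similarity between two flat motion descriptors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(query, training_set):
    """Return the label and score of the most similar training descriptor."""
    best_label, best_score = None, -math.inf
    for descriptor, label in training_set:
        score = normalized_correlation(query, descriptor)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Hypothetical descriptors: (vx, vy) flow values at two locations, flattened.
training_set = [
    ([1.0, 0.0, 0.5, 0.0], "run right"),
    ([-1.0, 0.0, -0.5, 0.0], "run left"),
    ([0.0, 1.0, 0.0, -1.0], "jump"),
]
label, score = classify([0.9, 0.1, 0.4, 0.0], training_set)
print(label, round(score, 3))
```

Because the classifier is non-parametric, adding a new action only requires adding labeled examples to the training set; no model has to be refit.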

Page 52: Video Analysis

Motion description matching results

More matching results: video 1, video 2

Page 53: Video Analysis

Action classification result

• demo video

Page 54: Video Analysis

Gathering action data

• Tracking
– Simple correlation-based tracker
– User-initialized

Page 55: Video Analysis

Figure-centric representation

• Stabilized spatio-temporal volume
– No translation information
– All motion caused by person’s limbs, indifferent to camera motion!

Page 56: Video Analysis

Using optical flow: action recognition at a distance

Extract optical flow to describe the region’s motion.

[Efros, Berg, Mori, & Malik 2003] http://graphics.cs.cmu.edu/people/efros/research/action/

Page 57: Video Analysis

Using optical flow: action recognition at a distance

Input Sequence

Matched Frames

Use nearest neighbor classifier to name the actions occurring in new video frames.

[Efros, Berg, Mori, & Malik 2003] http://graphics.cs.cmu.edu/people/efros/research/action/

Page 58: Video Analysis

Football Actions: classification

[.67 .58 .68 .79 .59 .68 .58 .66]
(8 actions, 4500 frames, taken from 72 tracked sequences)

Page 59: Video Analysis

Application: motion retargeting

[Efros, Berg, Mori, & Malik 2003] http://graphics.cs.cmu.edu/people/efros/research/action/

SHOW VIDEO

Page 60: Video Analysis

Summary

• Background subtraction:
– Essential low-level processing tool to segment moving objects from static camera’s video
• Action recognition:
– Increasing attention to actions as motion and appearance patterns
– For constrained environments, relatively simple techniques allow effective gesture or action recognition

Page 61: Video Analysis

Closing remarks

• Thank you all for your attention and participation in the class!

• Please be well prepared for the final project (06/12 and 06/19). Come to class on time. Start early!

