AT&T Research at TRECVID 2009

Date post: 19-Dec-2014
Upload: kirill-lazarev
Transcript
Page 1: At&t research at trecvid 2009

AT&T Research at TRECVID 2009

Content-based Copy Detection

Page 2: At&t research at trecvid 2009

TRECVID 2009

• TREC Video Retrieval Evaluation
• Specials for 2009
• Tasks
▫ surveillance event detection
▫ high-level feature extraction
▫ search (interactive, manually-assisted, and/or fully automatic)
▫ content-based copy detection

Page 3: At&t research at trecvid 2009

Video data

• Sound and Vision
▫ The Netherlands Institute for Sound and Vision: news magazine, science news, news reports, documentaries, educational programming, and archival video
• BBC rushes: unedited material

All materials are in MPEG-1 (yep!)

Page 4: At&t research at trecvid 2009

Datasets

• Development
▫ tv7.sv.devel (32.9 GB) (reference)
▫ tv7.sv.test (31.4 GB) (reference)
▫ tv8.sv.test (64.3 GB) (reference)
▫ tv7.bbc.devel (12.2 GB) (non-reference)
▫ tv7.bbc.test (10.9 GB) (non-reference)
▫ tv8.bbc.test (10.8 GB) (non-reference)

• Test
▫ tv7.sv.devel (32.9 GB) (reference)
▫ tv7.sv.test (31.4 GB) (reference)
▫ tv8.sv.test (64.3 GB) (reference)
▫ tv9.sv.test (114.8 GB) (reference)
▫ tv7.bbc.devel (12.2 GB) (non-reference)
▫ tv7.bbc.test (10.9 GB) (non-reference)
▫ tv8.bbc.test (10.8 GB) (non-reference)
▫ tv9.bbc.test (19.0 GB) (non-reference)

Page 5: At&t research at trecvid 2009

Content-based copy detection

• copyright control
• business intelligence
• advertisement tracking
• law enforcement investigations

Page 6: At&t research at trecvid 2009

Video transformation

• Picture in picture (the original video is inserted in front of a background video)
• Insertions of pattern
• Strong re-encoding
• Change of gamma
• Decrease in quality
▫ Blur, change of gamma, frame dropping, contrast, compression, ratio, white noise
• Post production
▫ Crop, shift, contrast, caption (text insertion), flip (vertical mirroring), insertion of pattern, picture in picture (the original video is in the background)
• Random combination: 1 transformation chosen at random from each of the 3 main categories

Page 7: At&t research at trecvid 2009

AT&T Research at TRECVID 2009
Content-based Copy Detection

• Applications
▫ discovering copyright infringement of multimedia content
▫ monitoring commercial air time
▫ querying video by example
• Approaches
▫ digital video watermarking
▫ content-based copy detection (CBCD)

Page 8: At&t research at trecvid 2009

Overview

Page 9: At&t research at trecvid 2009

Content based sampling

• Shot boundary detection (SBD)
▫ Adopts a "divide and conquer" strategy
▫ Six independent detectors: cut, fade in, fade out, fast dissolve (less than 5 frames), dissolve, and motion
▫ Each detector is a finite state machine (FSM)
• FSMs depend on two types of visual features:
▫ Intra-frame (only one frame)
▫ Inter-frame (current frame + previous frame)

Note: a divide and conquer algorithm works by recursively breaking down a problem into two or more sub-problems of the same (or related) type, until these become simple enough to be solved directly.
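To make the "each detector is a finite state machine" idea concrete, here is a minimal sketch of what a cut detector driven by an inter-frame feature could look like. It is not the AT&T detector: the two states, the thresholds `high` and `low`, and the `frame_diffs` input (a per-frame dissimilarity in [0, 1]) are all assumptions made for illustration.

```python
from enum import Enum


class State(Enum):
    IDLE = 0       # no transition in progress
    CANDIDATE = 1  # a large inter-frame difference was just seen


def detect_cuts(frame_diffs, high=0.5, low=0.1):
    """Return indices i where a cut is declared between frame i-1 and frame i.

    frame_diffs[i] is a hypothetical inter-frame feature: the dissimilarity
    between frame i and frame i-1 (frame_diffs[0] is unused).
    """
    cuts = []
    state = State.IDLE
    candidate = None
    for i in range(1, len(frame_diffs)):
        d = frame_diffs[i]
        if state is State.IDLE:
            if d >= high:
                # Abrupt change: remember it and wait for confirmation.
                state, candidate = State.CANDIDATE, i
        else:  # State.CANDIDATE
            if d <= low:
                # Content is stable right after the jump: accept the cut.
                cuts.append(candidate)
                state = State.IDLE
            elif d >= high:
                # Another jump: keep waiting, tracking the latest one.
                candidate = i
            else:
                # Moderate ongoing change looks like motion or a gradual
                # transition, so the candidate is dropped.
                state = State.IDLE
    return cuts
```

Keeping one small machine per transition type is what lets the six detectors run independently over the same feature stream.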
Page 10: At&t research at trecvid 2009

Overview

Page 11: At&t research at trecvid 2009

Transformation detection and normalization for query keyframe

• Letterbox detection
• Picture-in-picture detection
• Query keyframe normalization

Page 12: At&t research at trecvid 2009

Transformation detection and normalization for query keyframe

• Letterbox detection
• Picture-in-picture detection
• Canny edge detection operator: http://en.wikipedia.org/wiki/Canny_edge_detector

Page 13: At&t research at trecvid 2009

Transformation detection and normalization for query keyframe

• Query keyframe normalization
▫ Equalize and blur the query keyframe to overcome the effect of the change-of-gamma and white-noise transformations.

Page 14: At&t research at trecvid 2009

Transformation detection and normalization for query keyframe

• This yields 10 types of query keyframe: original, letterbox removed, PiP scaled, equalized, and blurred, plus the flipped version of each of these five types
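A rough sketch of how some of these variants could be generated with NumPy alone is given below. It covers only the original, equalized, and blurred keyframes and their flips (6 of the 10 types); letterbox removal and PiP scaling are left out. The 3x3 box blur is a stand-in because the slides do not say which blur is used, and `np.fliplr` assumes that "flip" means mirroring about the vertical axis. All function names are hypothetical.

```python
import numpy as np


def equalize(gray):
    """Histogram-equalize an 8-bit grayscale image (2-D uint8 array)."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]                 # CDF at the darkest level present
    denom = max(int(cdf[-1] - cdf_min), 1)    # avoid division by zero
    lut = np.clip(np.round((cdf - cdf_min) / denom * 255), 0, 255)
    return lut.astype(np.uint8)[gray]


def box_blur(gray):
    """3x3 mean filter with edge replication (assumed blur, see lead-in)."""
    h, w = gray.shape
    padded = np.pad(gray.astype(np.float64), 1, mode="edge")
    acc = sum(padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3))
    return np.round(acc / 9.0).astype(np.uint8)


def query_variants(gray):
    """Return a dict of keyframe variants keyed by a descriptive name."""
    base = {
        "original": gray,
        "equalized": equalize(gray),
        "blurred": box_blur(gray),
    }
    # Assumption: "flipped" is a left-right mirror of each base variant.
    flipped = {name + " & flipped": np.fliplr(img) for name, img in base.items()}
    return {**base, **flipped}
```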

Page 15: At&t research at trecvid 2009

Overview

Page 16: At&t research at trecvid 2009

Reference keyframe transformation

• Only 2 transformations
▫ Half-resolution rescaling: for comparison with the detected PiP region in the query keyframes
▫ Strong re-encoding: for dealing with strongly re-encoded query keyframes
• This yields 3 types of reference keyframe: original, half-resolution, and strongly re-encoded

Page 17: At&t research at trecvid 2009

Overview

Page 18: At&t research at trecvid 2009

Scale-invariant feature transform (SIFT) extraction

Page 19: At&t research at trecvid 2009

Scale-invariant feature transform (SIFT) extraction

• SIFT is the main feature used for locating video copies
▫ Locate the keypoints that have local maximum Difference of Gaussian values both in scale and in space (each keypoint is specified by location, scale, and orientation).
▫ Compute a descriptor for each keypoint. The descriptor is the gradient orientation histogram, which is a 128-dimensional feature vector.
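For reference, the Difference of Gaussian function mentioned above is usually written as follows in the standard SIFT formulation. The symbols are not defined on the slides and are assumed here: $I$ is the input image, $G$ a Gaussian kernel with standard deviation $\sigma$, $k$ a constant scale factor between adjacent levels, and $*$ denotes convolution.

```latex
D(x, y, \sigma) = \bigl( G(x, y, k\sigma) - G(x, y, \sigma) \bigr) * I(x, y),
\qquad
G(x, y, \sigma) = \frac{1}{2\pi\sigma^{2}} \exp\!\left( -\frac{x^{2} + y^{2}}{2\sigma^{2}} \right)
```

A keypoint candidate is a sample of $D$ that is an extremum relative to its neighbours in both the spatial and the scale dimensions.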

Page 20: At&t research at trecvid 2009

Overview

Page 21: At&t research at trecvid 2009

Locality sensitive hashing (LSH)

• The basic idea
▫ hash the input items so that similar items are mapped to the same buckets with high probability

a – random vector following a Gaussian distribution with zero mean and unit variance
w – preset bucket size
b – random value in the range [0, w]
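The hash formula itself did not survive in this transcript. The definitions of a, b, and w match the standard p-stable LSH function h(v) = floor((a · v + b) / w), so the sketch below assumes that form; the class name `PStableHash` and the use of NumPy's random generator are illustrative choices, not details from the slides.

```python
import numpy as np


class PStableHash:
    """One hash function of the assumed form h(v) = floor((a . v + b) / w)."""

    def __init__(self, dim, w, rng):
        self.a = rng.standard_normal(dim)   # Gaussian, zero mean, unit variance
        self.w = float(w)                   # preset bucket size
        self.b = rng.uniform(0.0, self.w)   # random offset in [0, w)

    def __call__(self, v):
        # Project onto a, shift by b, and quantize into buckets of width w.
        return int(np.floor((np.dot(self.a, v) + self.b) / self.w))
```

For example, `PStableHash(128, 4.0, np.random.default_rng(0))` would hash a 128-dimensional SIFT descriptor to an integer bucket. Nearby descriptors project to nearby values on the line defined by `a`, so they tend to share a bucket.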

Page 22: At&t research at trecvid 2009

Overview

Page 23: At&t research at trecvid 2009

Indexing and search by LSH

• Sort LSH values independently
• Save them with the SIFT identifications in a separate index file
• SIFT identifications (string):
▫ Reference video ID
▫ Keyframe ID
▫ SIFT ID
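One way to picture a sorted index of this kind is a list of LSH values kept in order, with the SIFT identifications stored alongside, so that a query value can be found by binary search. The in-memory layout and the identifier strings below are hypothetical; the slides do not describe the actual file format.

```python
import bisect


def build_index(entries):
    """Sort (lsh_value, sift_id) pairs and split them into parallel lists."""
    ordered = sorted(entries)
    keys = [lsh for lsh, _ in ordered]
    ids = [sift_id for _, sift_id in ordered]
    return keys, ids


def lookup(index, lsh_value):
    """Return every SIFT identification stored under lsh_value."""
    keys, ids = index
    lo = bisect.bisect_left(keys, lsh_value)    # first position with this value
    hi = bisect.bisect_right(keys, lsh_value)   # one past the last position
    return ids[lo:hi]


# Hypothetical identifiers of the form "video:keyframe:sift".
index = build_index([
    (5, "ref01:kf003:s0042"),
    (2, "ref02:kf010:s0007"),
    (5, "ref07:kf001:s0113"),
])
matches = lookup(index, 5)
```

Sorting once at indexing time keeps each query lookup logarithmic in the number of stored features.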

Page 24: At&t research at trecvid 2009

Overview

Page 25: At&t research at trecvid 2009

Keyframe level query refinement

• Two issues:
▫ the original SIFT matching by Euclidean distance is not reliable
▫ it's possible that two SIFT features that are far apart are mapped to the same LSH value

Page 26: At&t research at trecvid 2009

Keyframe level query refinement: Random Sample Consensus (RANSAC)

Page 27: At&t research at trecvid 2009

Keyframe level query refinement: Random Sample Consensus (RANSAC)

• Randomly select 3 pairs of matching keypoints (having the same LSH value)
• Determine the affine model from these 3 pairs
• Transform all keypoints in the reference keyframe into the query keyframe
• Count the keypoints in the reference keyframe whose transformed coordinates are close to the coordinates of their matching keypoints in the query keyframe. These keypoints are called inliers
• Repeat the four steps above a certain number of times, and output the maximum number of inliers
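The steps above can be sketched with NumPy as follows. This is a minimal illustration, not the authors' code: the iteration count `iters`, the pixel tolerance `tol`, and the function names are assumptions, and the inputs are taken to be two (N, 2) arrays where row i of each array is one matched keypoint pair.

```python
import numpy as np


def fit_affine(src, dst):
    """Solve [x, y, 1] @ M = [x', y'] for the 3x2 matrix M from 3 point pairs."""
    A = np.hstack([src, np.ones((3, 1))])
    return np.linalg.solve(A, dst)


def ransac_inliers(ref_pts, qry_pts, iters=200, tol=3.0, seed=0):
    """Return the largest inlier count found over `iters` random affine models."""
    rng = np.random.default_rng(seed)
    ref_h = np.hstack([ref_pts, np.ones((len(ref_pts), 1))])
    best = 0
    for _ in range(iters):
        # Step 1: pick 3 matching pairs at random.
        idx = rng.choice(len(ref_pts), size=3, replace=False)
        try:
            # Step 2: affine model determined by those pairs.
            M = fit_affine(ref_pts[idx], qry_pts[idx])
        except np.linalg.LinAlgError:
            continue  # collinear sample, no unique affine model
        # Step 3: map every reference keypoint into the query keyframe.
        err = np.linalg.norm(ref_h @ M - qry_pts, axis=1)
        # Step 4: inliers land within `tol` pixels of their match.
        best = max(best, int(np.sum(err <= tol)))
    return best
```

The returned count can serve as a relevance score for the keyframe pair: matches that merely collided in an LSH bucket rarely agree on a single affine model, so they do not contribute inliers.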

Page 28: At&t research at trecvid 2009

Keyframe level query refinement

Transformations: PiP, shift, ratio..

Page 29: At&t research at trecvid 2009

Overview

Page 30: At&t research at trecvid 2009

Keyframe level result merge

• If a reference keyframe appears more than once in the 12 result lists, its new relevance score is set to the maximum of its scores

Pair | Query keyframes               | Reference keyframes
1    | Original                      | Original
2    | Flipped                       | Original
3    | Letterbox removed             | Original
4    | Letterbox removed & flipped   | Original
5    | Equalized                     | Original
6    | Equalized & flipped           | Original
7    | Blurred                       | Original
8    | Blurred & flipped             | Original
9    | Original                      | Encoded
10   | Flipped                       | Encoded
11   | Picture in Picture (PiP)      | Half
12   | PiP & flipped                 | Half
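The merge rule amounts to a maximum over the per-pair result lists. A small sketch, assuming each list holds hypothetical `(reference_keyframe_id, score)` tuples:

```python
def merge_keyframe_results(result_lists):
    """Merge result lists, keeping the maximum score per reference keyframe."""
    merged = {}
    for results in result_lists:
        for ref_keyframe, score in results:
            if score > merged.get(ref_keyframe, float("-inf")):
                merged[ref_keyframe] = score
    # Highest relevance first.
    return sorted(merged.items(), key=lambda item: item[1], reverse=True)


# Three of the 12 lists, with made-up identifiers and scores.
lists = [[("kfA", 0.4), ("kfB", 0.9)], [("kfA", 0.7)], []]
merged = merge_keyframe_results(lists)
```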

Page 31: At&t research at trecvid 2009

Overview

Page 32: At&t research at trecvid 2009

Video level result fusion

Get pair (i, j) with the best sum relevance

Page 33: At&t research at trecvid 2009

Overview

Page 34: At&t research at trecvid 2009

Video relevance score normalization

• Normalize the relevance scores into the range [0, 1]

x – original relevance score
y – normalized relevance score

Page 35: At&t research at trecvid 2009

Overview

Page 36: At&t research at trecvid 2009

CBCD result generation

• Query video ID
• Reference video ID
• Information about the copied reference video segment
• Starting frame of the copied segment in the query video
• Decision score

Page 37: At&t research at trecvid 2009

CBCD Evaluation Results

• Dataset
▫ 1407 short query videos
▫ 838 reference videos
▫ 208 non-reference videos
• Extracted
▫ For the entire reference video set: 268,000 keyframes, 57,000,000 SIFT features
▫ For the entire query video set: 18,000 keyframes, 2,600,000 SIFT features

Page 38: At&t research at trecvid 2009

CBCD Evaluation Criteria

Parameters for NoFA profile

Parameters for Balanced profile

Page 39: At&t research at trecvid 2009

CBCD Evaluation Results

Page 40: At&t research at trecvid 2009

CBCD Evaluation Results

Page 41: At&t research at trecvid 2009

CBCD Evaluation Results

Transformation | ATTLabs.NoFA.1 Actual | ATTLabs.NoFA.1 Minimum | ATTLabs.Balanced.2 Actual | ATTLabs.Balanced.2 Minimum
T2             | 55.4                  | 0.672                  | 1.283                     | 0.732
T3             | 55.0                  | 0.224                  | 0.59                      | 0.214
T4             | 0.381                 | 0.381                  | 0.413                     | 0.391
T5             | 0.239                 | 0.239                  | 0.7                       | 0.214
T6             | 55.1                  | 0.284                  | 0.974                     | 0.291
T8             | 0.269                 | 0.269                  | 1.585                     | 0.329
T10            | 0.515                 | 0.515                  | 2.045                     | 0.52

Page 43: At&t research at trecvid 2009

Want more information?

Kirill Lazarev

Skype: kirill_lazarev

Mail: [email protected]

Twitter: http://twitter.com/kslazarev

