files.meetup.com/4379272/Bostion Image Processing Meetup.pdf ·...

Activity Recognition in Video

Shashi Kant

Cognika

www.cognika.com

February 6, 2013

Cognika Introduction

2/7/2013 2

Machine Vision

Real-Time Search

Forensic “Search”

Real-Time Alerting

What we do

• “Search” within FMV

• By Image (OOI)

• By Video Clip

• By Text

• Real-Time

• Activity-based Searching – spatiotemporal querying

Inverted Indexing

Source: developer.apple.com
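As a rough illustration of the inverted-index idea, the sketch below maps each term to the list of documents that contain it and answers a query by intersecting those lists. The function names, the whitespace tokenization, and the sample documents are assumptions made for this example and are not taken from the slides.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

def search(index, query):
    """Return ids of documents containing every query term (AND semantics)."""
    postings = [set(index.get(term, ())) for term in query.lower().split()]
    if not postings:
        return []
    return sorted(set.intersection(*postings))

# Hypothetical sample documents, for illustration only.
docs = {
    1: "vehicle parked near gate",
    2: "person walking near vehicle",
    3: "person entering vehicle",
}
index = build_inverted_index(docs)
matches = search(index, "person vehicle")
```

Because a lookup touches only the posting lists of the query terms, its cost depends on those lists rather than on the total number of documents.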

Text Indexing Process

Diagram labels: Source Document; Analyze; Analyzer (Parser, Tokenizer, Stemmer); Tokens; Payloads; Write to Index; Inverted Index; Indexed Documents
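The analysis stages named on this slide can be sketched as a small pipeline: tokenize, stem, then write postings to the index. This is a minimal sketch under stated assumptions: the stemmer is a toy suffix stripper (a real system would use something like a Porter stemmer), and the token position stored with each posting merely stands in for a payload.

```python
import re

def tokenize(text):
    """Split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def stem(token):
    """Toy stemmer (assumption): strip a trailing 'ing' or 's' if 3+ chars remain."""
    for suffix in ("ing", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def analyze(text):
    """Return (term, position) pairs; the position stands in for a payload."""
    return [(stem(tok), pos) for pos, tok in enumerate(tokenize(text))]

def write_to_index(index, doc_id, text):
    """Append a (doc_id, position) posting for each analyzed term."""
    for term, pos in analyze(text):
        index.setdefault(term, []).append((doc_id, pos))
    return index

index = {}
write_to_index(index, "doc1", "Vehicles moving")
write_to_index(index, "doc2", "Moving vehicle")
```

After stemming, "Vehicles" and "vehicle" share one posting list, which is why a query for either form reaches both documents.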

Vector Space Model

• Documents and Queries are “Vectors”

– $D_i = (w_{i,1}, w_{i,2}, w_{i,3}, \ldots, w_{i,n})$

– where $w_{i,j}$ is the weight for “term” $j$ in document $i$

• Cosine Similarity = Cosine of angle between query and stored document
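Written out for the document vector defined above, the cosine similarity takes the standard form below. The query vector $Q$ and its weights $w_{q,j}$ are notation assumed here for illustration; the slide does not name them.

```latex
\cos\theta
  = \frac{D_i \cdot Q}{\lVert D_i \rVert \, \lVert Q \rVert}
  = \frac{\sum_{j=1}^{n} w_{i,j}\, w_{q,j}}
         {\sqrt{\sum_{j=1}^{n} w_{i,j}^{2}} \; \sqrt{\sum_{j=1}^{n} w_{q,j}^{2}}}
```

With non-negative weights the value lies between 0 (no shared terms) and 1 (same direction).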

TF-IDF Vector Space Querying

Document: $d_j = (w_{1,j}, w_{2,j}, \ldots, w_{n,j})$

Query: $q = (w_{1,q}, w_{2,q}, \ldots, w_{n,q})$
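A small sketch of TF-IDF vector-space querying follows. It assumes raw term counts for TF and log(N / df) for IDF, which is one common weighting among several; the slides do not say which variant is used. The sample documents and function names are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return (vocabulary, idf table, one TF-IDF vector per document)."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log(n / df[t]) for t in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[t] * idf[t] for t in vocab])
    return vocab, idf, vectors

def vectorize(text, vocab, idf):
    """Project a query onto the document vocabulary with the same weights."""
    counts = Counter(text.lower().split())
    return [counts[t] * idf[t] for t in vocab]

def cosine(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical documents, for illustration only.
docs = ["person walking", "vehicle parked", "person entering vehicle"]
vocab, idf, vectors = tfidf_vectors(docs)
q = vectorize("person walking", vocab, idf)
ranked = sorted(range(len(docs)), key=lambda j: cosine(q, vectors[j]), reverse=True)
```

Here `ranked` lists document positions from most to least similar to the query.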

Video Indexing Process

Diagram labels: Source Video(s); Frame Extraction; Frames; Blob Extraction; Blob Descriptors; Training Set; Object Classification; Metadata; Document Construction; Index “Documents”; Inverted Index

Simplified Example

Training Image Set: Circle, Triangle

Frame Image

Index Document Representation: Circle <x1,y1>, Triangle <x2,y2>, <x4,y4>

Descriptors: Color, Shape, Texture, Contour
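One way to picture the simplified example is to turn each classified blob into index terms for its frame. The sketch below is an assumption-laden illustration: quantizing the centroid into a coarse grid cell and the `label@col_row` term format are choices made here so that positions become searchable tokens, and are not described in the slides.

```python
def frame_to_document(frame_id, blobs, frame_size, grid=(4, 4)):
    """Build an index 'document' for one frame.

    blobs: list of (label, x, y) with pixel centroids.
    Each blob contributes its label plus a label@col_row term naming the
    grid cell that contains the centroid (hypothetical term format).
    """
    width, height = frame_size
    cols, rows = grid
    terms = []
    for label, x, y in blobs:
        # Clamp so that a centroid on the right or bottom edge stays in range.
        col = min(int(x * cols / width), cols - 1)
        row = min(int(y * rows / height), rows - 1)
        terms.append(label)
        terms.append(f"{label}@{col}_{row}")
    return {"id": frame_id, "terms": terms}

# Hypothetical detections in a 640x480 frame.
doc = frame_to_document(
    "frame-1",
    [("circle", 100, 50), ("triangle", 500, 400)],
    frame_size=(640, 480),
)
```

The resulting term list can be fed to a text-style inverted index, so "circle" finds every frame with a circle and "circle@0_0" narrows the match to one region.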

Flow Chart

Diagram labels: Video Stream; Is Camera Moving? (Yes / No); Stabilization; Motion Compensation; Blob Tracking; Extract Blob; Feature Vector; Build Frameset (Sliding Window); In-Memory Index; Disk-based Index; Alerting; Search

Sliding Window Approach

Diagram: a frame sequence (Frame-1, Frame-2, Frame-3, ... Frame-k, ... Frame-p, ... Frame-q, ...) covered by successive windows (Window 1, Window 2, ... Window w, ...)
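A sliding window over a frame stream can be sketched with a bounded deque. The window size and step below are illustrative parameters only; the slides do not give the values actually used.

```python
from collections import deque

def sliding_windows(frames, size, step=1):
    """Yield windows of `size` consecutive frames, advancing `step` frames each time."""
    window = deque(maxlen=size)
    for i, frame in enumerate(frames):
        window.append(frame)
        # The window starting at frame index (i - size + 1) is complete here.
        if len(window) == size and (i - size + 1) % step == 0:
            yield tuple(window)

frames = [f"Frame-{n}" for n in range(1, 7)]
windows = list(sliding_windows(frames, size=3, step=2))
```

Because the generator keeps only `size` frames in memory at a time, it also works on a live stream of unknown length.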

Sequences Hierarchy

Objects (e.g. Humans, Vehicles)

Events (e.g. Humans Moving, Vehicles Moving)

Activities (e.g. Persons Moving Away, Vehicles Driving away)

Scenarios (e.g. Humans Gathering around Parked Vehicles)
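To make the lower levels of this hierarchy concrete, the sketch below derives an event ("moving" or "stationary") from an object's track and refines it into an activity relative to a reference point. The displacement threshold, the reference point, and the function names are all assumptions for illustration; the slides do not describe how these levels are actually computed.

```python
import math

def classify_event(track, min_displacement=5.0):
    """Label a track 'moving' or 'stationary' from its net displacement."""
    (x0, y0), (x1, y1) = track[0], track[-1]
    moved = math.hypot(x1 - x0, y1 - y0)
    return "moving" if moved >= min_displacement else "stationary"

def classify_activity(track, reference, min_displacement=5.0):
    """Refine a 'moving' event into moving away from or toward a reference point."""
    if classify_event(track, min_displacement) == "stationary":
        return "stationary"
    start = math.dist(track[0], reference)
    end = math.dist(track[-1], reference)
    return "moving away" if end > start else "moving toward"

# Hypothetical centroid tracks (pixel coordinates) for two objects.
person_track = [(10, 10), (20, 10), (40, 10)]
vehicle_track = [(5, 5), (5, 6)]
person_activity = classify_activity(person_track, reference=(0, 10))
vehicle_event = classify_event(vehicle_track)
```

A scenario such as "Humans Gathering around Parked Vehicles" would then combine several of these per-object labels over a window of frames.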

Diagram labels: Video Index; Blob Extraction; Frameset; Object Classification; Metadata (e.g. Date-Time, Resolution etc.); Normalization to adjust for quality; Object-Frame Matrices; Inferred Latent Semantic Graph
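An object-frame matrix can be read as the video analogue of a term-document matrix: one row per object class, one column per frame. The sketch below builds such a count matrix from per-frame labels. The input format and names are assumptions, and the latent-semantic inference step is not shown.

```python
def object_frame_matrix(frames):
    """Count object labels per frame.

    frames: list of lists of object labels detected in each frame.
    Returns (labels, matrix) where matrix[i][j] is the number of times
    labels[i] appears in frame j.
    """
    labels = sorted({label for frame in frames for label in frame})
    row_of = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(frames) for _ in labels]
    for j, frame in enumerate(frames):
        for label in frame:
            matrix[row_of[label]][j] += 1
    return labels, matrix

# Hypothetical frameset of three frames.
frameset = [["human", "vehicle"], ["human", "human"], ["vehicle"]]
labels, matrix = object_frame_matrix(frameset)
```

Each column is then a count vector for one frame, which can be compared or factorized like a document vector.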

What we Index

• Color histograms

• Shape Descriptors

• Contour Descriptors

• Video Metadata (e.g. date-time, resolution etc.)

• Contextual information (e.g. Geo-location etc.)
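As an example of the first item, a color histogram for a blob can be computed by quantizing each pixel's RGB value into a small number of bins. The bin count and the histogram-intersection comparison below are assumptions chosen for illustration, not details from the slides.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Quantize 8-bit RGB pixels into a normalized joint histogram.

    Assumes bins_per_channel divides 256 evenly (e.g. 2, 4, 8, 16).
    """
    step = 256 // bins_per_channel
    hist = [0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        idx = ((r // step) * bins_per_channel + (g // step)) * bins_per_channel + (b // step)
        hist[idx] += 1
    total = len(pixels)
    return [count / total for count in hist] if total else hist

def histogram_intersection(h1, h2):
    """Similarity in [0, 1] for two normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))

# Hypothetical blobs: one mostly red, one blue.
red_blob = [(255, 0, 0)] * 3 + [(250, 10, 5)]
blue_blob = [(0, 0, 255)] * 4
h_red = color_histogram(red_blob)
h_blue = color_histogram(blue_blob)
```

A coarse histogram like this tolerates small lighting changes because nearby colors fall into the same bin.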

Query Clip

Result Clips

Query Response Times

Mean response time is in milliseconds, averaged over 5 consecutive queries.

Activity Query                              Mean Response Time (ms)   No. of Results
Parked Vehicle                              762                       482
Person Walking                              482                       891
Ingressing Vehicle                          319                       876
Egressing Vehicle                           410                       573
Moving Vehicles                             890                       1098
Vehicle Halting & Person Exiting            1028                      73
Person Entering Vehicle & Vehicle Moving    1176                      48
Persons Gathering                           908                       382

Sub-second Responses for Terascale & Larger possible
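A mean response time over five consecutive queries could be measured with a harness along these lines. This is only a sketch: `run_query` is a hypothetical callable standing in for whatever search call is being timed, and the original measurements may have been taken differently.

```python
import time
from statistics import mean

def mean_response_ms(run_query, query, repeats=5):
    """Run the same query `repeats` times and return the mean latency in ms."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query(query)
        timings.append((time.perf_counter() - start) * 1000.0)
    return mean(timings)

calls = []

def fake_search(query):
    """Hypothetical stand-in for a real index lookup."""
    calls.append(query)
    return []

avg_ms = mean_response_ms(fake_search, "Parked Vehicle")
```

Running the queries back to back means later runs may benefit from warm caches, which is worth keeping in mind when comparing figures.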

Prototype UI

Further Research

• Improved Feature Vectors (for sparse features)

• Improved Blob Classifiers

• Improved Stabilization, BG Subtraction & Motion Compensation

• “Super Resolution” Enhancements

We Are Hiring!

[email protected]

Machine Vision Engineers

• OpenCV, and other machine vision toolkits

• OpenGL, CUDA

• Bayesian, ANN, SVMs etc.

• Video Background desirable

Search Engineers

• Lucene, Solr, Elastic-Search

• Hadoop, Katta, ZooKeeper

• Terascale+ Real-Time Search Experience desirable
