VISUAL HUMAN TRACKING AND GROUP ACTIVITY ANALYSIS: A VIDEO MINING SYSTEM FOR RETAIL MARKETING

Date post: 26-Feb-2016
Page 1:

VISUAL HUMAN TRACKING AND GROUP ACTIVITY ANALYSIS: A VIDEO MINING SYSTEM FOR RETAIL MARKETING

PhD Thesis by: Alex Leykin, Indiana University

Page 2:

Motivation

• Automated tracking and activity recognition are missing from marketing research
• Hardware is already there
• Visual information can reveal a lot about human interactions with each other
• Help in making intelligent marketing decisions

Page 3:

Goals

Extract semantic information from the tracks (Activity Analysis)

Process visual information to get a formal representation of human locations (Visual Tracking)

Page 4:

Related Work: Detection and Tracking

• Yacoob and Davis, “Learned models for estimation of rigid and articulated human motion from stationary or moving camera”, IJCV 2000
• Zhao and Nevatia, “Tracking multiple humans in crowded environment”, CVPR 2004
• Haritaoglu, Harwood, and Davis, “W-4: Real-time surveillance of people and their activities”, PAMI 2000
• J. Deutscher, B. North, B. Bascle and A. Blake, “Tracking through singularities and discontinuities by random sampling”, ICCV 1999
• A. Elgammal and L. S. Davis, “Probabilistic Framework for Segmenting People Under Occlusion”, ICCV 2001
• M. Isard and J. MacCormick, “BraMBLe: a Bayesian multiple-blob tracker”, ICCV 2001

Page 5:

Related Work: Activity Recognition

• Haritaoglu and Flickner “Detection and tracking of shopping groups in stores” CVPR 2001

• Oliver, Rosario, and Pentland “A Bayesian computer vision system for modeling human interactions” PAMI 2000

• Buzan, Sclaroff, and Kollios “Extraction and clustering of motion trajectories in video” ICPR 2004

• Hongeng, Nevatia, and Bremond “Video-based event recognition: activity representation and probabilistic recognition methods” CVIU 2004

• Bobick and Ivanov “Action recognition using probabilistic parsing” CVPR 1998

Page 6:

System Components

Low-level Processing

Camera Model

Obstacle Model

Foreground Segmentation

Head Detection

Page 7:

Background Modeling

A codebook is a sequence of codewords. A color codeword stores:
• μRGB
• Ilow
• Ihigh

Page 8:

Adaptive Background Update

Match pixel p to the codebook b. Matching conditions:
• I(p) > Ilow
• I(p) < Ihigh
• (RGB(p)∙μRGB) < TRGB
• t(p)/thigh > Tt1
• t(p)/tlow > Tt2

If there is no match:
– if the codebook is saturated, then the pixel is foreground
– else create a new codeword

Else update the codeword with the new pixel information

If >1 matches, then merge the matching codewords
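The update rule above can be sketched for a single pixel as follows. This is a simplified, color-only illustration, not the thesis implementation: it assumes the matching conditions are combined with AND, it uses one minus the cosine similarity as a stand-in for the RGB test, it treats a pixel that creates a new codeword as background, and the names `MAX_CODEWORDS`, `T_RGB`, `MARGIN`, and `rate` are hypothetical parameters. The thermal conditions are omitted.

```python
import numpy as np

MAX_CODEWORDS = 4   # hypothetical saturation limit per pixel
T_RGB = 0.05        # hypothetical threshold on (1 - cosine similarity)
MARGIN = 10.0       # hypothetical half-width of a new intensity band


class Codeword:
    def __init__(self, pixel):
        intensity = float(np.linalg.norm(pixel))
        self.mu = pixel.copy()              # mean color (mu_RGB)
        self.i_low = intensity - MARGIN     # I_low
        self.i_high = intensity + MARGIN    # I_high

    def matches(self, pixel):
        intensity = float(np.linalg.norm(pixel))
        if not (self.i_low < intensity < self.i_high):
            return False
        denom = intensity * float(np.linalg.norm(self.mu)) + 1e-9
        cos = float(pixel @ self.mu) / denom
        return (1.0 - cos) < T_RGB

    def absorb(self, mu, i_low, i_high, rate):
        # Blend the color mean and widen the intensity band.
        self.mu = (1.0 - rate) * self.mu + rate * mu
        self.i_low = min(self.i_low, i_low)
        self.i_high = max(self.i_high, i_high)


def update_codebook(codebook, pixel, rate=0.1):
    """Update the codebook in place; return True if the pixel is foreground."""
    pixel = np.asarray(pixel, dtype=float)
    hits = [cw for cw in codebook if cw.matches(pixel)]
    if not hits:
        if len(codebook) >= MAX_CODEWORDS:
            return True                     # saturated codebook: foreground
        codebook.append(Codeword(pixel))    # otherwise learn a new codeword
        return False
    keep = hits[0]
    for other in hits[1:]:                  # more than one match: merge
        keep.absorb(other.mu, other.i_low, other.i_high, rate=0.5)
    codebook[:] = [cw for cw in codebook if cw is keep or cw not in hits]
    intensity = float(np.linalg.norm(pixel))
    keep.absorb(pixel, intensity - MARGIN, intensity + MARGIN, rate)
    return False
```

Calling `update_codebook` once per pixel per frame keeps one codebook per pixel location; the returned flags form the foreground mask.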

Page 9:

Background Subtraction

Page 10:

Head Detection

Vanishing Point Projection (VPP) Histogram

Vanishing Point in Z-direction

Page 11:

Camera Setup

• Two camera types: Perspective and Spherical
• Mixtures of indoor and outdoor scenes
• Color and thermal image sensors
• Varying lighting conditions (daylight, cloud cover, incandescent, etc.)

Page 12:

Camera Modeling

Perspective Projection

X, Y, Z from: [sx; sy; s] = P [X; Y; Ż; 1] using SVD, where P is the 3x4 projection matrix

Spherical Projection

X = cos(θ) tan(π-φ)(Zc-Ż)
Y = sin(θ) tan(π-φ)(Zc-Ż)
Z = Ż

Assumption: floor plane Zf = 0

[Figures: world axes X, Y, Z with camera position [Xc, Yc, Zc]; image axes x, y for the perspective camera; Lat and Lon angles for the spherical camera]
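Both mappings can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names are hypothetical, the height Ż is taken as known (0 for the floor plane), the perspective case is solved with NumPy's SVD-based least-squares routine, and the spherical case applies the two formulas literally without adding the camera's own Xc, Yc offsets.

```python
import math
import numpy as np


def world_from_pixel(P, x, y, Z=0.0):
    """Recover (X, Y) for image point (x, y) at known height Z.

    From [sx; sy; s] = P [X; Y; Z; 1], eliminating s gives two linear
    equations in X and Y, solved here in the least-squares sense.
    """
    r1 = P[0] - x * P[2]
    r2 = P[1] - y * P[2]
    A = np.array([r1[:2], r2[:2]])
    b = -np.array([r1[2] * Z + r1[3], r2[2] * Z + r2[3]])
    XY, *_ = np.linalg.lstsq(A, b, rcond=None)
    return float(XY[0]), float(XY[1]), float(Z)


def world_from_angles(theta, phi, Zc, Z=0.0):
    """Apply the spherical formulas as written, for angles theta and phi."""
    r = math.tan(math.pi - phi) * (Zc - Z)
    return math.cos(theta) * r, math.sin(theta) * r, Z
```

With the floor-plane assumption, passing `Z=0.0` places a detected foot point directly on the floor map.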

Page 13:

Tracking

Goal: find a correspondence between the bodies already detected in the current frame and the bodies which appear in the next frame.

Apply Markov Chain Monte Carlo (MCMC) to estimate the next state

[Diagram: states xt-1 and xt, observation zt]

Candidate moves:
• Add body
• Delete body
• Recover deleted
• Change Size
• Move

Page 14:

Tracking

Location of each pedestrian is estimated probabilistically based on:
• Current image
• Previous state of the system
• Physical constraints

The goal of our tracking system is to find the candidate state x’ (a set of bodies along with their parameters) which, given the last known state x, will best fit the current observation z

P(x’ | z, x) = L(z | x’) · P(x’ | x)

where L(z | x’) is the observation likelihood and P(x’ | x) is the state prior probability
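The search for the best candidate state can be illustrated with a toy Metropolis-Hastings sampler. This sketch makes strong simplifying assumptions that are not from the slides: the state is a single body position, both the likelihood and the prior are isotropic Gaussians, and only a "Move" proposal is used (add, delete, recover, and change-size moves are omitted). All function names and parameters are hypothetical.

```python
import math
import random


def log_gauss(p, q, sigma):
    """Unnormalized log of an isotropic 2D Gaussian centered at q."""
    return -((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) / (2.0 * sigma ** 2)


def log_posterior(cand, z, x_prev, sigma_obs=1.0, sigma_prior=1.0):
    # log P(x'|z,x) = log L(z|x') + log P(x'|x), up to a constant
    return log_gauss(z, cand, sigma_obs) + log_gauss(cand, x_prev, sigma_prior)


def track_step(x_prev, z, iters=2000, step=0.5, seed=0):
    """Return the highest-scoring position visited by the sampler."""
    rng = random.Random(seed)
    cur, cur_lp = x_prev, log_posterior(x_prev, z, x_prev)
    best, best_lp = cur, cur_lp
    for _ in range(iters):
        # "Move" proposal: small random displacement of the body
        cand = (cur[0] + rng.gauss(0.0, step), cur[1] + rng.gauss(0.0, step))
        lp = log_posterior(cand, z, x_prev)
        # Metropolis acceptance test in log space
        if math.log(rng.random() + 1e-300) < lp - cur_lp:
            cur, cur_lp = cand, lp
            if lp > best_lp:
                best, best_lp = cand, lp
    return best
```

With equal variances, the returned position settles near the midpoint between the previous position and the observed one, which is where the product of the two terms peaks.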

Page 15:

Tracking: Priors

Constraints on the body parameters:
• N(hμ, hσ2) and N(wμ, wσ2): body height and width
• U(x)R and U(y)R: body coordinates are weighted uniformly within the rectangular region R of the floor map
• N(μdoor, σdoor): distance to the closest door (for new bodies)

Temporal continuity:
• d(wt, wt−1) and d(ht, ht−1): variation from the previous size
• d(xt, x’t−1) and d(yt, y’t−1): variation from the Kalman-predicted position
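One way to combine these terms is to sum their logarithms. The sketch below is an assumption-laden illustration: the slides do not say how the d(·) terms are scored, so zero-mean Gaussians are assumed for them; the door term for new bodies is left out; and the dictionary keys and the `params` entries are hypothetical names.

```python
import math


def log_normal(x, mu, sigma):
    """Log density of a 1D normal distribution."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def log_prior(body, prev, predicted, region, params):
    """Sum of log prior terms for one body.

    body, prev: dicts with keys x, y, h, w
    predicted:  (x, y) position predicted from the previous state
    region:     (x0, y0, x1, y1) rectangle R on the floor map
    """
    x0, y0, x1, y1 = region
    if not (x0 <= body["x"] <= x1 and y0 <= body["y"] <= y1):
        return float("-inf")                      # outside R: impossible
    lp = -math.log((x1 - x0) * (y1 - y0))         # uniform over R
    lp += log_normal(body["h"], params["h_mu"], params["h_sigma"])
    lp += log_normal(body["w"], params["w_mu"], params["w_sigma"])
    # Temporal continuity, assumed Gaussian around zero change
    lp += log_normal(body["h"] - prev["h"], 0.0, params["size_sigma"])
    lp += log_normal(body["w"] - prev["w"], 0.0, params["size_sigma"])
    lp += log_normal(body["x"] - predicted[0], 0.0, params["pos_sigma"])
    lp += log_normal(body["y"] - predicted[1], 0.0, params["pos_sigma"])
    return lp
```

A candidate that keeps its size and lands on the predicted position scores higher than one that jumps, and any candidate outside R is rejected outright.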

Page 16:

Tracking Likelihoods: Distance weight plane

Problem: blob trackers ignore blob position in 3D (see Zhao and Nevatia, CVPR 2004)

Solution: employ a “distance weight plane” Dxy = |Pxyz, Cxyz|, where P and C are the world coordinates of the camera and the reference point, respectively, and [equation fragment: 2hPz]

Page 17:

Tracking Likelihoods: Z-buffer

0 = background, 1 = furthermost body, 2 = next closest body, etc.
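The labeling convention can be sketched by painting bodies from the furthest to the closest, so that closer bodies overwrite the ones they occlude. This is an illustrative simplification: bodies are assumed to be axis-aligned image rectangles with a known distance to the camera, and `build_z_buffer` is a hypothetical helper name.

```python
import numpy as np


def build_z_buffer(shape, bodies):
    """Return an integer label image for the given bodies.

    bodies: list of (top, left, bottom, right, distance_to_camera)
    Labels: 0 = background, 1 = furthermost body, 2 = next closest, ...
    """
    z = np.zeros(shape, dtype=np.int32)
    ordered = sorted(bodies, key=lambda b: b[4], reverse=True)  # furthest first
    for label, (top, left, bottom, right, _) in enumerate(ordered, start=1):
        z[top:bottom, left:right] = label   # closer bodies overwrite further ones
    return z
```

Because every pixel ends up with the label of the closest body covering it, occlusion between several bodies is resolved in a single pass over the bodies.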

Page 18:

Tracking Likelihoods: Color Histogram

Implementation of the z-buffer (Z) and distance weight plane (D) allows a multiple-body configuration to be computed in one computationally efficient step. Let:
• I - set of all blob pixels
• O - set of body pixels

[Two equations garbled in the transcript: likelihood terms combining I, O, Z (restricted to Zxy > 0) and D, normalized by |I| and |O| respectively]

Color observation likelihood is based on the Bhattacharyya distance between candidate and observed color histograms:

$$P_{color} = 1 - w_{color}\,\bigl(1 - B(c_t, c_{t-1})\bigr)$$
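A histogram comparison of this kind can be sketched as below. The exact form of the likelihood is an assumption here: the sketch uses the Bhattacharyya coefficient B (the sum of square roots of bin products, 1 for identical histograms and 0 for disjoint ones) and scores a candidate as 1 − w·(1 − B), with `w_color` a hypothetical weight.

```python
import numpy as np


def bhattacharyya_coefficient(h1, h2):
    """Similarity of two histograms after normalizing each to sum to 1."""
    p = np.asarray(h1, dtype=float)
    q = np.asarray(h2, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(np.sqrt(p * q)))


def color_likelihood(hist_t, hist_prev, w_color=0.5):
    """Assumed form: 1 - w_color * (1 - B(c_t, c_{t-1}))."""
    b = bhattacharyya_coefficient(hist_t, hist_prev)
    return 1.0 - w_color * (1.0 - b)
```

With this form, a perfect color match leaves the likelihood at 1, and the weight bounds how much a complete mismatch can reduce it.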

Page 19:

Tracking: Anisotropic Weighted Mean Shift

[Figure: Classic Mean-Shift compared with Our Mean-Shift, frames t-1 and t]

Page 20:

Page 21:

Actors and events

• Shopper groups are formed by individual shoppers who shop together for some amount of time
– More than fleeting crossing of paths
– Dwelling together
– Splitting and uniting after a period of time

Page 22:

Swarming

• Shopper groups detected based on “swarming” idea in reverse
– Swarming is used in graphics to generate flocking behaviour in animations.
– Rules define flocking behaviour:
  • Avoid collisions with the neighbors.
  • Maintain fixed distance with neighbors.
  • Coordinate velocity vector with neighbors.

Page 23:

Tracking Customer Groups

• We treat customers as swarming agents, acting according to simple rules (e.g. stay together with swarm members)

[Figure: Customer groups]

Page 24:

Terminology

• Actors: shoppers (bodies detected in tracking)
– (x, y, id)
• Swarming events defined as short time activity sequences of multiple agents interacting with each other.
– Could be fleeting (crossing paths)
– Later analysis sorts this out and ignores chance encounters.

Page 25:

Swarming

• The actors that best fit this model signal a Swarming Event
• Multiple swarming events are further clustered with fuzzy weights to find out shoppers in the same group over long periods.

[Figure: actors labeled 11, 12, and 13]

Page 26:

Event detection

• Two actors come sufficiently close according to some distance measure:
– Relative position pi=(xi, yi) of actor i on the floor
– Body orientations αi
– Dwelling state δi={T,F}.

Distance between two agents is a linear combination of co-location, co-ordination and co-dwelling
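A linear combination of the three components might look like the sketch below. The individual component measures are assumptions made for illustration: Euclidean distance for co-location, the wrapped angular difference for co-ordination, and a 0/1 mismatch for co-dwelling. The weights `w1`, `w2`, `w3` and the dictionary keys are hypothetical.

```python
import math


def actor_distance(a, b, w1=1.0, w2=0.5, w3=0.5):
    """Weighted sum of position, orientation, and dwelling differences.

    a, b: dicts with keys x, y (floor position), alpha (orientation in
    radians), and dwelling (True/False).
    """
    co_location = math.hypot(a["x"] - b["x"], a["y"] - b["y"])
    diff = abs(a["alpha"] - b["alpha"]) % (2.0 * math.pi)
    co_ordination = min(diff, 2.0 * math.pi - diff)   # wrap to [0, pi]
    co_dwelling = 0.0 if a["dwelling"] == b["dwelling"] else 1.0
    return w1 * co_location + w2 * co_ordination + w3 * co_dwelling
```

Two shoppers standing side by side, facing the same shelf, and both dwelling get a distance near zero; any of the three differences pushes the value up.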

Page 27:

Event detection

Perform agglomerative clustering of actors a into clusters C
• Initialize: N singleton clusters
• Do: merge two closest clusters
• While not: validity index I reaches its maximum

I consists of isolation Ini and compactness Inc
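The merge loop can be sketched as follows. The slides do not give the formulas for isolation and compactness, so this illustration substitutes simple stand-ins: isolation is the smallest single-link distance between clusters, compactness is the largest within-cluster distance, and the index is their difference. It also runs the full merge sequence and keeps the best-scoring partition rather than stopping at the first peak. All names are hypothetical.

```python
from itertools import combinations


def single_link(a, b, dist):
    """Smallest distance between any member of a and any member of b."""
    return min(dist(p, q) for p in a for q in b)


def validity(clusters, dist):
    """Stand-in validity index: isolation minus compactness."""
    if len(clusters) < 2:
        return float("-inf")
    isolation = min(single_link(a, b, dist) for a, b in combinations(clusters, 2))
    compactness = max(
        (dist(p, q) for c in clusters for p, q in combinations(c, 2)),
        default=0.0,
    )
    return isolation - compactness


def cluster_actors(actors, dist):
    """Agglomerative clustering; returns the partition with the best index."""
    clusters = [[a] for a in actors]            # N singleton clusters
    best = [list(c) for c in clusters]
    best_score = validity(clusters, dist)
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: single_link(clusters[ij[0]], clusters[ij[1]], dist),
        )
        clusters[i] = clusters[i] + clusters[j]  # merge the two closest (i < j)
        del clusters[j]
        score = validity(clusters, dist)
        if score > best_score:
            best, best_score = [list(c) for c in clusters], score
    return best
```

Any pairwise distance between actors can be passed as `dist`, including a weighted combination of position, orientation, and dwelling state.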

Page 28:

Event detection

[Figure: two plots against iteration number; final events]

Page 29:

Activity Detection

• The shopper group detection is accomplished by clustering the short term events over long time periods.
– The events could be separated in time, but they will be part of the same shopper group if the actors are the same (the first term).

Page 30:

Activity detection

• Higher level activities (shopper groups) detected using these events as building blocks over longer time periods
• Some definitions:
– Bei = {b ∈ ei}: the set of all bodies taking part in an event ei.
– τei and τej are the average times of events ei and ej happening.

Page 31:

Activity detection

Define a measure of similarity between two events, De(ei, ej), built from two terms:
• Overlap between the two sets of actors Bei and Bej
• Separation in time between τei and τej
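A measure with these two terms could be sketched as below. This is not the formula from the slide, which did not survive extraction: the overlap term is assumed to be the fraction of actors that appear in only one of the two events, the time term is assumed to be the squared difference of the average event times, and the weights `w_overlap` and `w_time` are hypothetical.

```python
def event_dissimilarity(bodies_i, bodies_j, tau_i, tau_j, w_overlap=1.0, w_time=0.01):
    """Smaller values mean the two events are more alike.

    bodies_i, bodies_j: non-empty sets of actor ids in each event
    tau_i, tau_j:       average times of the two events
    """
    union = bodies_i | bodies_j
    common = bodies_i & bodies_j
    # 0 when the actor sets are identical, 1 when they are disjoint
    overlap_term = len(union - common) / len(union)
    time_term = (tau_i - tau_j) ** 2
    return w_overlap * overlap_term + w_time * time_term
```

Keeping the time weight small relative to the overlap weight lets events that are far apart in time still cluster together when the same actors take part in both.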

Page 32:

Activity detection

• Perform fuzzy agglomerative clustering
• Minimize objective function
• where wij are fuzzy weights
• and ρ(.) and ψ(.) are asymmetric variants of Tukey’s biweight estimators:
– ρ(.) is the loss function from robust statistics.
– ψ(.) is the weight function

Adaptively choose only strong fuzzy clusters

Label remaining clusters as activities
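For reference, the standard symmetric form of Tukey's biweight is sketched below; the asymmetric variants mentioned above are not reproduced, since their definitions are not given in the transcript. The tuning constant `c` and the function names are illustrative choices.

```python
def tukey_loss(r, c=4.685):
    """Standard Tukey biweight loss: bounded, flat beyond |r| = c."""
    if abs(r) > c:
        return c * c / 6.0
    u = 1.0 - (r / c) ** 2
    return (c * c / 6.0) * (1.0 - u ** 3)


def tukey_weight(r, c=4.685):
    """Standard Tukey biweight weight: 1 at r = 0, 0 for |r| >= c."""
    if abs(r) >= c:
        return 0.0
    return (1.0 - (r / c) ** 2) ** 2
```

Because the weight drops to zero beyond `c`, events far from a cluster stop influencing it, which is what makes the clustering robust to chance encounters.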

Page 33:

Results: Swarming activities detected in space-time

• Dot location: average event location
• Dot size: validity
• Dots of same color: belong to same activity

Page 34:

Group Detection Results

Page 35:

Quantitative Results

Page 36:

Tracking

Sequence number   Frames   People   People missed   False hits   Identity switches
1                 1054     15       3               1            3
2                 601      8        0               0            0
3                 1700     16       5               1            2
4                 1506     3        0               0            0
5                 2031     2        0               0            0
6                 1652     4        0               0            0
Total / %         8544     48       12.5            4.1          10.4

Page 37:

Group Detection

Sequence   Groups   P+    P−     Partial
1          20       0     7      0
2          17       1     3      1
3          17       0     7      0
Total      54       1     12     2
Percent    100      1.8   22.2   3.7

• Groups: ground truth (manually determined)
• P+: false positives
• P−: false negatives (groups missed)
• Partial: partially identified groups (≥2 people in the group correctly identified)

Page 38:

Qualitative Assessments

• Longer paths provide better group detection (pval << 1)
• Two-people groups are easiest to detect
• Simple one-step clustering of trajectories is not sufficient for long-term group detection
• Employee tracks pose a significant problem and have to be excluded
• Several groups were missed by the operator in the initial ground truth
– System caught groups missed by the human expert after inspection of results.

Page 39:

Contributions

– BG subtraction based on codebook (RGB+thermal)
– Introduced head candidate selection method based on VPP histogram
– Resolving track initialization ambiguity and non-unique body-blob correspondence
– Informed jump-diffuse transitions in MCMC tracker
– Weight plane and z-buffer improve likelihood estimation
– Anisotropic mean-shift with obstacle model
– Two-layer formal framework for high-level activity detection
– Implemented robust fuzzy clustering to group events into activities

Page 40:

Future Work

• Improved Tracking (via feature points)
• Demographical analysis
• Focus of Attention
• Sensor Fusion
• Other Types of Swarming Activities

Page 41:

Questions?

Thank you!

Page 42:

$$d(b_i, b_j) = w_1\,|p_i, p_j| + w_2\,|\alpha_i, \alpha_j| + w_3\,|\delta_i, \delta_j|$$