MIT6870_ORSU_lecture3: Levels of categorization and multiclass object recognition

Post on 06-Apr-2018


8/3/2019 MIT6870_ORSU_lecture3: Levels of categorization and multiclass object recognition

http://slidepdf.com/reader/full/mit6870orsulecture3-levels-of-categorization-and-multiclass-object-recognition 1/97

Lecture 3: Levels of categorization and multiclass object recognition

6.870 Object Recognition and Scene Understanding
http://people.csail.mit.edu/torralba/courses/6.870/6.870.recognition.htm


Class business

Start thinking about the class project

Choose a partner (if you want to do it alone, that is fine too).

Brainstorm about ideas soon! It is also fine if it is part of your own research.

Consult with me before starting

Best presentation award (for credit only), after popular vote: a copy of the book "Vision Science" (or another book if the winner already has this one)


Class business

I have to travel

I am very flexible, but how flexible are you?


Wednesday

Sharing parts for intraclass transfer learning

Presenter: Sharat Chikkerur 

Evaluator: Hueihan Jhuang


Some goals for this lecture

What are categories?

How many categories are there?

Some theories about the structure of object categories

Multiclass object recognition and transfer learning in computer vision


An example of categorical perception

Continuous perception: graded response

Categorical perception: "sharp" boundaries

[Figure: two panels, one per case above; both axes ticked from 50 to 250]

Many perceptual phenomena are a mixture of the two: categorical at an everyday level of magnification, but continuous at a more microscopic level. It can also depend on cultural aspects, expertise, task, attention, ...


Another example

Continuous perception: graded response

Categorical perception: "sharp" boundaries

Emotions have categorical boundaries

[Figure: identification task; vertical axis: % identification; curves for Anger, Fear, Happiness; labels "happiness" and "fear"; tick labels 20-24, 25-29, 30-34, 35-39, 40-44, 45-49, 50-54]


Why do we care about categories?

Perception of function: We can perceive the 3D shape, texture, and material properties without knowing about objects. But the concept of category also encapsulates information about what we can do with those objects.

"We therefore include the perception of function as a proper - indeed, crucial - subject for vision science", from Vision Science, chapter 9, Palmer.


Why do we care about categories?

When we recognize an object we can make predictions about its behavior in the future, beyond what is immediately perceived.


The perception of function

Direct perception (affordances): Gibson

Flat surface, horizontal, knee-high, ... -> sittable upon

Mediated perception (categorization)

Flat surface, horizontal, knee-high, ... -> chair -> sittable upon

[Figure: example images labeled "Chair", "Chair", and "Chair?"]

One caveat of this comparison: deciding that something is a chair might require access to more features than the ones needed to decide that we can sit on something... (it is a different level of categorization)


Direct perception

Some aspects of an object's function can be perceived directly

Functional form: some forms clearly indicate a function ("sittable-upon", container, cutting device, ...)

[Figure: three objects each labeled "Sittable-upon", and a fourth captioned "It does not seem easy to sit upon this..."]


Direct perception

Some aspects of an object's function can be perceived directly

Observer relativity: function is observer-dependent

From http://lastchancerescueflint.org


Limitations of Direct Perception

The functions are the same at some level of description: we can put things inside both, and somebody will come later to empty them. However, we are not expected to put the same kinds of things inside...

Objects of similar structure might have very different functions

Not all functions seem to be available from direct visual information only.


Limitations of Direct Perception

Propulsion system

Strong protective surface

Something that looks like a door 

Sure, I can travel to space on this object

Visual appearance might be a very weak cue to function


Indirect perception of function by categorization

Well... this requires object recognition (for more details, see entire course)


So, which do we use: direct or indirect?

"It seems exceedingly unlikely (though logically possible) that we categorize everything in our visual fields", Palmer.

Hypothesis: we categorize the objects that are relevant for the specific task at hand, but we only extract affordances from the others.


How many categories?


Slide by Aude Oliva

"Muchas"


How many object categories are there?

Biederman 1987


How many categories?

Probably this question is not even specific enough to have an answer


Which level of categorization is the right one?

Car is an object composed of: a few doors, four wheels (not all visible at all times), a roof, front lights, windshield

If you are thinking of buying a car, you might want to be a bit more specific about your categorization.


Categorical hierarchies

From WordNet

Categories can be organized in hierarchies (tree structures are commonly used)

This is a mapping of the WordNet tree into the 2D plane

[Figure: WordNet tree. Node labels shown: entity, object, thing, artifact, structure, area, room, instrumentality, substance, living thing, organism, person, leader, scientist, chemist, animal, chordate, vertebrate, plant, body, stream, river, location, region, phenomenon, psychological, cognition, content, belief]
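As a rough illustration of a tree-structured category hierarchy, the sketch below stores a handful of the node labels from the figure as child-to-parent links and walks up to the root. The specific parent links are written by hand for illustration and are an assumption (they are not queried from WordNet); `PARENT`, `path_to_root`, and `lowest_common_category` are hypothetical names.

```python
# Hand-written child -> parent links; an illustrative assumption, not WordNet output.
PARENT = {
    "chemist": "scientist",
    "scientist": "person",
    "leader": "person",
    "person": "organism",
    "animal": "organism",
    "plant": "organism",
    "organism": "living thing",
    "living thing": "object",
    "artifact": "object",
    "object": "entity",
}


def path_to_root(node):
    """Return the chain of categories from node up to the root of the tree."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def lowest_common_category(a, b):
    """Return the most specific category that contains both a and b, or None."""
    ancestors_of_a = set(path_to_root(a))
    for node in path_to_root(b):
        if node in ancestors_of_a:
            return node
    return None


print(path_to_root("chemist"))
print(lowest_common_category("chemist", "animal"))
```

In a tree like this, each step up the path is a coarser level of categorization, which is one way to read the question of which level is "the right one".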


Organizing "things" into categories

(1) Feature-based

Definition: disassembling a concept into a set of featural components

Each feature is an essential element of the category: "for a thing to be an X, it must have that feature. Otherwise it is not an X"

(2) Prototype

Categories are formed on the basis of characteristic features, which describe the typical model of the category

Characteristic features are commonly present in exemplars of concepts, but they are not always present.


Prototypical Theory

According to the prototype view, an object will be classified as an instance of a category if it is sufficiently similar to the prototype.

Similarity ~ the number of features shared between an object and the prototype (however, some features should be weighted more heavily as being more central to the prototype than are other features).
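The weighted-feature similarity described above can be sketched in a few lines. This is only an illustration under stated assumptions: the feature names, weights, and threshold are made up, and `weighted_similarity` and `is_member` are hypothetical helpers, not a model taken from the lecture.

```python
def weighted_similarity(features, prototype, weights):
    """Sum the weights of the prototype features that the object also has."""
    return sum(weights[f] for f in prototype if f in features)


def is_member(features, prototype, weights, threshold):
    """Classify as a category member if the similarity reaches the threshold."""
    return weighted_similarity(features, prototype, weights) >= threshold


# Hypothetical "bird" prototype; feature names, weights, and threshold are made up.
BIRD_PROTOTYPE = {"has_feathers", "flies", "sings", "small"}
BIRD_WEIGHTS = {"has_feathers": 3.0, "flies": 1.0, "sings": 0.5, "small": 0.5}
THRESHOLD = 3.0

robin = {"has_feathers", "flies", "sings", "small"}
ostrich = {"has_feathers", "runs_fast", "large"}
bat = {"flies", "small", "has_fur"}

for name, feats in [("robin", robin), ("ostrich", ostrich), ("bat", bat)]:
    score = weighted_similarity(feats, BIRD_PROTOTYPE, BIRD_WEIGHTS)
    print(name, score, is_member(feats, BIRD_PROTOTYPE, BIRD_WEIGHTS, THRESHOLD))
```

With these made-up weights, an object that lacks several typical features can still be accepted when it has the heavily weighted one, which is the contrast with the feature-based view where every feature is essential.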


Prototype or Sum of Exemplars?

Prototype model: category judgments are made by comparing a new exemplar to the prototype.

Exemplar model: category judgments are made by comparing a new exemplar to all the old exemplars of a category, or to the exemplar that is the most appropriate.
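One way to make the contrast concrete is to treat exemplars as points in a feature space. In the sketch below, which is an assumption rather than the lecture's formulation, the prototype is taken to be the mean of the stored exemplars, and the exemplar model is reduced to its "most appropriate exemplar" variant (nearest stored exemplar). The toy 2D data and function names are hypothetical.

```python
import math


def mean_point(points):
    """Component-wise average of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def classify_by_prototype(x, exemplars_by_class):
    """Assign x to the class whose prototype (mean exemplar) is closest."""
    return min(
        exemplars_by_class,
        key=lambda c: math.dist(x, mean_point(exemplars_by_class[c])),
    )


def classify_by_exemplars(x, exemplars_by_class):
    """Assign x to the class that owns the single closest stored exemplar."""
    return min(
        exemplars_by_class,
        key=lambda c: min(math.dist(x, e) for e in exemplars_by_class[c]),
    )


# Toy data: class "A" has one atypical exemplar far from the other two.
exemplars = {
    "A": [(0.0, 0.0), (0.0, 2.0), (6.0, 1.0)],
    "B": [(4.0, 0.0), (4.0, 2.0)],
}
query = (5.5, 1.0)

print("prototype model:", classify_by_prototype(query, exemplars))
print("exemplar model: ", classify_by_exemplars(query, exemplars))
```

On this toy data the two models disagree: the query sits next to the atypical exemplar of "A" but is closer to the mean of "B".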


Levels of Categorization

The idea of prototypes and typicality led to the study of levels of categorization.

Rosch et al: "albeit concepts exist at many different levels of a hierarchy, one level is fundamental: basic level."

Basic level: the best compromise between grouping together similar objects and distinguishing among objects from the same category.

Willingham (247)


Levels of Categorization

BASIC-LEVEL CATEGORIES

SUPERORDINATE LEVEL CATEGORIES

SUBORDINATE LEVEL CATEGORIES


Rosch's Levels of Categorization

Definition of Basic Level:

Similar shape: Basic level categories are the highest-level category for which their members have similar shapes.

Similar motor interactions: ... for which people interact with its members using similar motor sequences.

Common attributes: ... there are a significant number of attributes in common between pairs of members.

[Figure: similarity plotted across levels: Sub, Basic, Superordinate]

Similarity declines only slightly going from subordinate to basic level, and then drops dramatically.


Levels of Categorization

Rosch et al (1976) found that when people are shown pictures of objects, they identify objects at a basic level more quickly than they identify objects at higher or lower levels.

Objects appear to be recognized first at their basic level, and only afterwards are they classified in terms of higher or lower level categories


Typicality effects

Typicality: how good or common an item is as a member of a given category.

The typical exemplar is like a representation of the average (or central tendency)

But the representation of a category varies with experience, and so does the "typical" exemplar


Entry-level categories (Jolicoeur, Gluck, Kosslyn 1984)

Typical members of a basic-level category are categorized at the expected level

Atypical members tend to be classified at a subordinate level.

[Figure: two example images labeled "A bird" and "An ostrich"]


We do not need to recognize the exact category

A new class can borrow information from similar categories


So, where is computer vision?

Well...


The first problem... the data


Datasets

Language: 10^6 samples

Character recognition (MNIST): 10^4 samples

Visual object recognition: 10^3 samples


Early datasets

Lena: a dataset in one picture


COIL

The Columbia Object Image Library (COIL-100) dataset consists of color images of 100 objects where the images of the objects were taken at pose intervals of 5 degrees, i.e., 72 poses per object.

http://www.cs.columbia.edu/CAVE

S. A. Nene, S. K. Nayar and H. Murase, Technical Report CUCS-006-96, February 1996.


Collecting datasets (towards 10^6 to 10^7 examples)

ESP game (CMU): Luis von Ahn and Laura Dabbish, 2004

LabelMe (MIT): Russell, Torralba, Freeman, 2005

StreetScenes (CBCL-MIT): Bileschi, Poggio, 2006

WhatWhere (Caltech): Perona et al, 2007

PASCAL challenge: 2006, 2007

Lotus Hill Institute: Song-Chun Zhu et al, 2007

Cortina (UCSB)


Labeling with games

L. von Ahn, L. Dabbish, 2004; L. von Ahn, R. Liu and M. Blum, 2006


The PASCAL Visual Object Classes Challenge 2007

M. Everingham, L. Van Gool, C. Williams, J. Winn, A. Zisserman, 2007

The twenty object classes that have been selected are:

Person: person

Animal: bird, cat, cow, dog, horse, sheep

Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train

Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor


Caltech 101 & 256

Caltech 101: Fei-Fei, Fergus, Perona, 2004

Caltech 256: Griffin, Holub, Perona, 2007


How to evaluate datasets?

How many labeled examples? How many classes? Segments or bounding boxes? How many instances per image? How small are the targets? Variability across instances of the same classes (viewpoint, style, illumination). How different are the images?

How representative of the visual world is it? What happens if you nail it?


Labelme.csail.mit.edu

B. Russell, A. Torralba, K. Murphy, W.T. Freeman. IJCV 2008

Tool went online July 1st, 2005

290,000 object annotations


Polygon quality


... things do not always look good...


Testing

Most common labels: test, adksdsa, woiieiie, ...


Online Hooligans

Do not try this at home


Tags quality


Matlab toolbox

LMquery(database, 'object.name', 'car,building,road,tree')


Stats

How many categories are in LabelMe?

How many levels are there?


10% of the objects account for 90% of the data

~Zipf's law

[Figure: curves for Caltech 101, Tiny images, and LabelMe]

We need transfer learning
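To see how a heavy-tailed frequency distribution concentrates the data in a few categories, the sketch below measures what share of all instances falls in the most frequent 10% of categories. The counts are synthetic and follow count(rank) = scale / rank^s; the exponent `s`, the scale, and the number of categories are assumptions chosen for illustration, not statistics from LabelMe or any other dataset.

```python
def top_share(counts, fraction):
    """Share of all instances covered by the most frequent `fraction` of categories."""
    ordered = sorted(counts, reverse=True)
    k = max(1, round(fraction * len(ordered)))
    return sum(ordered[:k]) / sum(ordered)


def zipf_counts(n_categories, s=1.0, scale=10000.0):
    """Synthetic counts following count(rank) = scale / rank**s (an assumption)."""
    return [scale / rank ** s for rank in range(1, n_categories + 1)]


counts = zipf_counts(1000, s=1.5)
share = top_share(counts, 0.10)
print(f"top 10% of categories cover {share:.0%} of instances")
```

The flip side is the point of the slide: most categories sit in the tail with very few examples each, which is where borrowing information from better-populated categories becomes attractive.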


Slide spain+perona


We have better low and mid-level vision

Better learning algorithms

Lots of computational power

And lots of data

...

We are running out of excuses


Multiclass object detection: the not so early days


Using a set of independent binary classifiers was a common strategy:

Schneiderman-Kanade multiclass object detection

(a) One detector for each class

Viola-Jones extension for dealing with rotations: two cascades for each view

There is nothing wrong with this approach if you have access to lots of training data and you do not care about efficiency.


Some symptoms of one-vs-all multiclass approaches

What is the best representation to detect a traffic sign?

Very regular object: template matching will do the job

~100% detection rate with 0 false alarms

Parts derived from training a binary classifier.

Some of these parts cannot be used for anything else than this object.


Some symptoms of one-vs-all multiclass approaches

Part-based object representation (looking for meaningful parts):

A. Agarwal and D. Roth

M. Weber, M. Welling and P. Perona

These studies try to recover parts that are meaningful. But is this the right thing to do? The derived parts may be too specific, and they are not likely to be useful in a general system.


Some symptoms of one-vs-all multiclass approaches

Computational cost grows linearly with Nclasses * Nviews * Nstyles ...
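A back-of-the-envelope calculation shows how quickly this product grows when every (class, view, style) combination gets its own detector. All numbers below (classes, views, styles, windows per image, features per detector) are hypothetical and only serve to make the multiplication concrete.

```python
def independent_detector_count(n_classes, n_views, n_styles):
    """Number of binary detectors when every (class, view, style) gets its own."""
    return n_classes * n_views * n_styles


def evaluations_per_image(n_detectors, n_windows, features_per_detector):
    """Feature evaluations for a sliding-window scan with no sharing at all."""
    return n_detectors * n_windows * features_per_detector


# Hypothetical setting: 20 classes, 12 views, 3 styles.
detectors = independent_detector_count(20, 12, 3)
cost = evaluations_per_image(detectors, n_windows=10_000, features_per_detector=100)

print(detectors, "independent detectors")
print(cost, "feature evaluations per image")
```

Doubling any one factor doubles the total, which is why features that can be reused across detectors are worth looking for.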


Shared features

Is learning object class number 1000 easier than learning the first?

Can we transfer knowledge from one object to another?

Are the shared properties interesting by themselves?

...


Multitask learning

R. Caruana. Multitask Learning. ML 1997

"MTL improves generalization by leveraging the domain-specific information contained in the training signals of related tasks. It does this by training tasks in parallel while using a shared representation".

[Figure: two network diagrams separated by "vs."]

Sejnowski & Rosenberg 1986; Hinton 1986; Le Cun et al. 1989; Suddarth & Kergosien 1990; Pratt et al. 1991; Sharkey & Sharkey 1992; ...
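The idea of a shared representation can be sketched structurally as one hidden layer that every task reads from, with a small output head per task. The code below only shows the forward pass and the parameter layout (no training), with random weights; the class name `MultitaskNet`, the layer sizes, and the task names are hypothetical and are not taken from Caruana's paper.

```python
import math
import random


def dense(x, weights, biases):
    """Affine layer: one output per row of weights."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]


def init_layer(n_out, n_in, rng):
    """Random weight matrix (n_out rows, n_in columns) and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class MultitaskNet:
    """One shared hidden layer feeding one linear output head per task."""

    def __init__(self, n_in, n_hidden, task_names, seed=0):
        rng = random.Random(seed)
        self.shared = init_layer(n_hidden, n_in, rng)
        self.heads = {t: init_layer(1, n_hidden, rng) for t in task_names}

    def hidden(self, x):
        return [math.tanh(h) for h in dense(x, *self.shared)]

    def predict(self, x):
        h = self.hidden(x)  # computed once, reused by every task head
        return {t: dense(h, *head)[0] for t, head in self.heads.items()}

    def parameter_count(self):
        def count(layer):
            return sum(len(row) for row in layer[0]) + len(layer[1])

        return count(self.shared) + sum(count(h) for h in self.heads.values())


net = MultitaskNet(n_in=4, n_hidden=3, task_names=["doorknob_x", "door_width"])
print(net.predict([0.1, 0.2, 0.3, 0.4]))
print(net.parameter_count(), "parameters in total")
```

In a trained version, gradients from every task would flow into the shared layer, which is how the training signals of related tasks would shape a common representation.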


Multitask learning

Primary task: detect door knobs

Tasks used:

horizontal location of doorknob

single or double door

horizontal location of doorway center

width of doorway

horizontal location of left door jamb

horizontal location of right door jamb

width of left door jamb

width of right door jamb

horizontal location of left edge of door

horizontal location of right edge of door

R. Caruana. Multitask Learning. ML 1997


Sharing invariances

S. Thrun. Is Learning the n-th Thing Any Easier Than Learning The First? NIPS 1996

Knowledge is transferred between tasks via a learned model of the invariances of the domain: object recognition is invariant to rotation, translation, scaling, lighting, ... These invariances are common to all object recognition tasks.

[Figure: toy world; results without sharing and with sharing]


Convolutional Neural Network

Translation invariance is already built into the network

The output neurons share all the intermediate levels

Le Cun et al, 98
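A small 1D example can show where the built-in translation behavior comes from. Because the same kernel weights are applied at every position, shifting the input shifts the feature map by the same amount, and a global max over the map is then unchanged. This is a minimal sketch with a made-up kernel and signal, not the architecture of Le Cun et al.; it assumes the shifted pattern stays inside the signal.

```python
def conv1d(signal, kernel):
    """Valid cross-correlation: the same kernel weights are applied at every position."""
    k = len(kernel)
    return [
        sum(kernel[j] * signal[i + j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def global_max_pool(feature_map):
    """Collapse the feature map to its strongest response."""
    return max(feature_map)


def shift_right(values, n):
    """Shift by n positions, padding with zeros on the left."""
    return [0] * n + values[: len(values) - n]


kernel = [1, -1, 1]
signal = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
shifted = shift_right(signal, 3)

feature_map = conv1d(signal, kernel)
shifted_map = conv1d(shifted, kernel)

print(feature_map)
print(shifted_map)
print(global_max_pool(feature_map), global_max_pool(shifted_map))
```

The weight sharing across positions is also a form of feature sharing: every output unit downstream reads from the same bank of detectors.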


Sharing transformations

Miller, E., Matsakis, N., and Viola, P. (2000). Learning from one example through shared densities on transforms. In IEEE Computer Vision and Pattern Recognition.

Transformations are shared and can be learnt from other tasks.


Models of object recognition

I. Biederman, "Recognition-by-components: A theory of human image understanding," Psychological Review, 1987.

M. Riesenhuber and T. Poggio, "Hierarchical models of object recognition in cortex," Nature Neuroscience 1999.

T. Serre, L. Wolf and T. Poggio. "Object recognition with features inspired by visual cortex". CVPR 2005


Sharing in constellation models (next Wednesday)

Pictorial Structures: Fischler & Elschlager, IEEE Trans. Comp. 1973

Constellation Model: Fergus, Perona, & Zisserman, CVPR 2003

SVM Detectors: Heisele, Poggio, et al., NIPS 2001

Model-Guided Segmentation: Mori, Ren, Efros, & Malik, CVPR 2004


Reusable Parts

Goal: Look for a vocabulary of edges that reduces the number of features.

Krempp, Geman, & Amit, "Sequential Learning of Reusable Parts for Object Detection". TR 2002

[Figure: number of features plotted against number of classes]

Examples of reused parts


Additive models and boosting (more details on Wednesday)

Torralba, Murphy, Freeman. CVPR 2004. PAMI 2007

Independent binary classifiers: screen detector, car detector, face detector

Binary classifiers that share features: screen detector, car detector, face detector
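The difference between the two settings can be sketched as an additive model in which each weak learner carries the set of classes that use it. The code below is only loosely in the spirit of feature-sharing boosting: it shows scoring with a fixed, hand-written model and does not implement the training procedure of Torralba, Murphy, and Freeman. The stump parameters, sharing sets, and function names are all made up for illustration.

```python
def stump(feature_index, threshold, a, b):
    """Regression stump: returns a if the feature exceeds the threshold, else b."""
    return lambda v: a if v[feature_index] > threshold else b


# Each entry: (weak learner, set of classes that share it). All values are made up.
SHARED_MODEL = [
    (stump(0, 0.5, 1.0, -1.0), {"screen", "car", "face"}),
    (stump(1, 0.2, 0.8, -0.8), {"screen", "car"}),
    (stump(2, 0.7, 1.2, -0.6), {"face"}),
]


def class_scores(v, model, classes):
    """Additive score per class; each weak learner is evaluated once per input."""
    scores = {c: 0.0 for c in classes}
    for weak_learner, sharing_set in model:
        response = weak_learner(v)  # one feature evaluation...
        for c in sharing_set:
            scores[c] += response  # ...reused by every class that shares it
    return scores


def evaluations_with_sharing(model):
    """Weak learners evaluated per input when classes share them."""
    return len(model)


def evaluations_without_sharing(model):
    """Cost if every class kept a private copy of each weak learner it uses."""
    return sum(len(sharing_set) for _, sharing_set in model)


scores = class_scores([0.9, 0.1, 0.8], SHARED_MODEL, ["screen", "car", "face"])
print(scores)
print(evaluations_with_sharing(SHARED_MODEL), "vs", evaluations_without_sharing(SHARED_MODEL))
```

In this toy model three weak learners serve three detectors, where independent classifiers with the same behavior would need six.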


Specific feature

Non-shared feature: this feature is too specific to faces.

[Figure: classes shown: pedestrian, chair, traffic light, sign, face, background class]


Shared feature

shared feature
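
A minimal sketch of the contrast between class-specific and shared features, assuming each detector is an additive model (a weighted sum of weak learners) and that each weak learner is tagged with the set of classes that use it. The stumps, thresholds, weights, and class subsets below are invented for illustration and are not taken from the experiments of Torralba, Murphy, and Freeman.

```python
import numpy as np

# Each weak learner: (feature index, threshold, weight, classes that share it).
# All numbers here are illustrative assumptions.
WEAK_LEARNERS = [
    (0, 0.5, 1.0, {"screen", "car", "face"}),  # shared by all three detectors
    (1, 0.2, 0.8, {"screen", "car"}),          # shared by two detectors
    (2, 0.7, 1.5, {"face"}),                   # class-specific: faces only
]

def stump(x, feature, threshold):
    """Weak learner: +1 where the feature exceeds the threshold, else -1."""
    return np.where(x[:, feature] > threshold, 1.0, -1.0)

def score(x, cls):
    """Additive score for one class: weighted sum of the weak learners it uses."""
    total = np.zeros(len(x))
    for feature, threshold, weight, classes in WEAK_LEARNERS:
        if cls in classes:
            total += weight * stump(x, feature, threshold)
    return total

def features_with_sharing():
    """Distinct features, each computed once and reused across detectors."""
    return len({feature for feature, _, _, _ in WEAK_LEARNERS})

def features_without_sharing():
    """Feature evaluations if every detector computed its own copy."""
    return sum(len(classes) for _, _, _, classes in WEAK_LEARNERS)

x = np.array([[0.9, 0.1, 0.8],
              [0.1, 0.9, 0.1]])
face_scores = score(x, "face")
car_scores = score(x, "car")
```

In this toy setup the three detectors together evaluate 3 distinct features when features are shared, against 6 feature evaluations when each detector keeps its own copy.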

50 training samples/class

29 object classes

2000 entries in the dictionary

Results averaged over 20 runs

Error bars = 80% interval

Torralba, Murphy, Freeman. CVPR 2004. PAMI 2007

Shared features

Class-specific features

Generalization as a function of object similarities

[Figure, two panels: area under ROC vs. number of training samples per class. Panel titles: 12 viewpoints; 12 unrelated object classes. Annotations: K = 2.1, K = 4.8]

Torralba, Murphy, Freeman. CVPR 2004. PAMI 2007

Sharing patches

Bart and Ullman, 2004

For a new class, use only features similar to features that were good for other classes:

Proposed dog features
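
A rough sketch of the selection idea on this slide, not Bart and Ullman's actual procedure: candidate features for a new class are kept only if they resemble some feature that was already useful for other classes. Representing features as vectors, using cosine similarity, the `min_similarity` threshold, and the function name `select_similar` are all assumptions made for illustration.

```python
import numpy as np

def select_similar(candidates, good_features, min_similarity=0.9):
    """Return indices of candidates whose best cosine similarity to any
    previously useful feature is at least min_similarity."""
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    g = good_features / np.linalg.norm(good_features, axis=1, keepdims=True)
    best_match = (c @ g.T).max(axis=1)  # best similarity per candidate
    return np.flatnonzero(best_match >= min_similarity)

# Features that worked well for other classes (made-up vectors).
good = np.array([[1.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0]])

# Candidate features for the new class (made-up vectors).
cands = np.array([[0.9, 0.1, 0.0],   # close to the first good feature
                  [0.0, 0.0, 1.0],   # unlike any good feature
                  [0.1, 2.0, 0.0]])  # close to the second good feature

kept = select_similar(cands, good)
```

Here the first and third candidates are kept and the second is discarded, so the new class starts from a much smaller pool of features.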

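Transfer Learning for Image Classification with Sparse Prototype Representations

A. Quattoni, M. Collins, T. Darrell, CVPR 2008

\[
W = \begin{bmatrix}
W_{1,1} & W_{1,2} & \cdots & W_{1,m} \\
W_{2,1} & W_{2,2} & \cdots & W_{2,m} \\
\vdots & \vdots & & \vdots \\
W_{d,1} & W_{d,2} & \cdots & W_{d,m}
\end{bmatrix}
\]

Row 2: coefficients for feature 2. Column 2: coefficients for classifier 2.

\[
\min_{W} \; \sum_{k=1}^{m} \frac{1}{|D_k|} \sum_{(x,y) \in D_k} l(f_k(x), y) \; + \; C \sum_{i=1}^{d} \max_{k} |W_{i,k}|
\]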
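
As a rough illustration of the regularizer on this slide, the sketch below computes the penalty: the sum over features i of the largest absolute coefficient across classifiers k, scaled by C. It then adds the penalty to an averaged per-task loss. The linear form of each classifier (scores `X @ W[:, k]`), the hinge loss, and the toy matrices are assumptions for illustration only; the paper's own feature representation and optimizer are not reproduced here.

```python
import numpy as np

def l1_inf_penalty(W):
    """Sum over features (rows) of the largest |coefficient| across
    classifiers (columns)."""
    return np.abs(W).max(axis=1).sum()

def hinge(scores, y):
    """Hinge loss per example (an assumed choice of l)."""
    return np.maximum(0.0, 1.0 - y * scores)

def joint_objective(W, datasets, C, loss=hinge):
    """Average loss of each task's linear classifier plus C times the penalty."""
    data_term = sum(loss(X @ W[:, k], y).mean()
                    for k, (X, y) in enumerate(datasets))
    return data_term + C * l1_inf_penalty(W)

# Two classifiers over two features. Both matrices have the same
# entrywise L1 norm, but only W_shared reuses a single feature row.
W_shared = np.array([[1.0, 1.0],
                     [0.0, 0.0]])
W_disjoint = np.array([[1.0, 0.0],
                       [0.0, 1.0]])

# One toy training example per task: (X, y).
datasets = [
    (np.array([[1.0, 0.0]]), np.array([1.0])),
    (np.array([[1.0, 0.0]]), np.array([-1.0])),
]
```

The penalty is 1.0 for `W_shared` and 2.0 for `W_disjoint`, which shows why this regularizer favors solutions in which all classifiers draw on the same small set of features.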

Hierarchical Topic Models

Topic models typically use a "bag of words" approximation:

- Learning topics allows transfer of information within a corpus of related documents

- Mixing proportions capture the distinctive features of particular documents

Latent Dirichlet Allocation (LDA). Blei, Ng, & Jordan, JMLR 2003

[Figure: LDA graphical model with topic assignments z, observed words x, and plates labeled J, N, and K; annotated with Pr(topic | doc) and Pr(word | topic)]

Hierarchical Topic Models

Latent Dirichlet Allocation (LDA). Blei, Ng, & Jordan, JMLR 2003

\[
\Pr(x = \text{word} \mid \text{doc}) = \sum_{\text{topic}} \Pr(x = \text{word} \mid z = \text{topic}) \, \Pr(z = \text{topic} \mid \text{doc})
\]
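
The mixture on this slide, Pr(x=word | doc) as a sum over topics of Pr(x=word | z=topic) Pr(z=topic | doc), can be written as a matrix product. The sketch below uses two topics, three words, and two documents with made-up probabilities; it only evaluates the mixture and does not perform LDA inference.

```python
import numpy as np

# Pr(word | topic): one row per topic, one column per word (rows sum to 1).
word_given_topic = np.array([[0.7, 0.2, 0.1],
                             [0.1, 0.3, 0.6]])

# Pr(topic | doc): one row per document, one column per topic (rows sum to 1).
topic_given_doc = np.array([[0.9, 0.1],
                            [0.5, 0.5]])

# Pr(word | doc) = sum over topics of Pr(word | topic) * Pr(topic | doc).
word_given_doc = topic_given_doc @ word_given_topic
```

Each row of `word_given_doc` is again a distribution over words: the first document, dominated by topic 1, gets [0.64, 0.21, 0.15], while the evenly mixed second document gets [0.40, 0.25, 0.35].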

Hierarchical Topic Models

Latent Dirichlet Allocation (LDA). Blei, Ng, & Jordan, JMLR 2003

"Bag of features" models:

Object Recognition (Sivic et al., ICCV 2005)

Scene Recognition (Fei-Fei et al., CVPR 2005)

Learning Shared Parts

Objects are often locally similar in appearance

Discover parts shared across categories

- How many total parts should we share?

- How many parts should each category use?

HDP Object Model

We learn the number of parts.

Each object uses a different number of parts.

The model assumes a known number of object categories.

Parts are distributions over appearances and locations.

HDP Object Model

There is no context, so the model happily creates impossible part combinations.

Sharing Parts: 16 Categories

Caltech 101 Dataset (Li & Perona)

Horses (Borenstein & Ullman)

Cat & dog faces (Vidal-Naquet & Ullman)

Bikes from Graz-02 (Opelt & Pinz)

Google…

Visualization of Shared Parts

Pr(appearance | part)

Pr(position | part)

Visualization of Shared Parts

Pr(appearance | part)

Pr(position | part)

Detection Results

[Figure panels: 6 training images per category; detection vs. training set size]

Recognition Task

VS.

Recognition Results

[Figure panels: 6 training images per category; recognition vs. training set size]

Recognition performance decreases. By sharing features, the classes look more similar.

Some more references

Baxter 1996

Caruana 1997

Schapire, Singer, 2000

Thrun, Pratt 1997

Krempp, Geman, Amit, 2002

E. L. Miller, Matsakis, Viola, 2000

Mahamud, Hebert, Lafferty, 2001

Fink et al. 2003, 2004

LeCun, Huang, Bottou, 2004

Holub, Welling, Perona, 2005

…

Wednesday

Sharing parts for intraclass transfer learning

Presenter: Sharat Chikkerur

Evaluator: Hueihan Jhuang