Object Recognition -Segregation of function -Visual hierarchy -What and where (ventral and dorsal...

Object Recognition

-Segregation of function

-Visual hierarchy

-What and where (ventral and dorsal streams)

-Single cell coding and ensemble coding

-Distributed representations of object categories

-Face recognition

-Object recognition as a computational problem

Higher Perceptual Functions

Segregation of function exists already in the early visual system:

M channel (magnocellular): runs from M-type retinal ganglion cells through the magnocellular LGN layers to layer IVB of V1. Cells are wavelength-insensitive in the LGN, show orientation selectivity in V1 (“simple cells”), and show binocularity and direction selectivity in layer IVB. The channel processes visual motion.

P channel (parvocellular): runs from P-type retinal ganglion cells through the parvocellular LGN layers to the interblob regions of layer III in V1. Many LGN cells show color opponency; cells in the interblob regions of V1 have strong orientation selectivity and binocularity (“complex cells”). The channel is also called P-IB and processes visual object shape.

Functional Segregation

Segregation of function can also be found at the cortical level:

- within each area: cells form distinct columns.

- multiple areas form the visual hierarchy …

Functional Segregation

The Visual Hierarchy

van Essen and Maunsell, 1983

van Essen et al., 1990


- functional segregation of visual features into separate (specialized) areas.

- increased complexity and specificity of neural responses.

- columnar groupings, horizontal integration within each area.

- larger receptive fields at higher levels.

- visual topography is less clearly defined at higher levels, or disappears altogether.

- longer response latencies at higher levels.

- large number of pathways linking each segregated area to other areas.

- existence of feedforward, as well as lateral and feedback connections between hierarchical levels.
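A toy calculation shows why receptive fields can grow along a feedforward hierarchy: if each unit pools over a few neighbouring units of the level below, the patch of input it covers widens at every level. The sketch below assumes a one-dimensional input and arbitrary values for `pool_width` and `step`. The function name `receptive_field_sizes` is hypothetical, and the scheme is a simplification, not a model of actual cortical connectivity.

```python
def receptive_field_sizes(levels, pool_width=3, step=2):
    """Receptive field width (in input units) at each level of a toy hierarchy.

    Assumptions (illustrative only): each unit pools over `pool_width`
    adjacent units of the level below, and neighbouring units within a
    level are `step` lower-level units apart.
    """
    sizes = []
    rf = 1        # a level-0 unit covers exactly one input unit
    spacing = 1   # distance between neighbouring units, in input units
    for _ in range(levels):
        # Pooling over pool_width units adds (pool_width - 1) gaps of `spacing`.
        rf = rf + (pool_width - 1) * spacing
        # Units at the next level sit further apart in input coordinates.
        spacing *= step
        sizes.append(rf)
    return sizes


if __name__ == "__main__":
    for level, size in enumerate(receptive_field_sizes(4), start=1):
        print(f"level {level}: receptive field covers {size} input units")
```

With the default values, `receptive_field_sizes(4)` returns `[3, 7, 15, 31]`: the covered region roughly doubles at each level, so position is coded less precisely the higher the level.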


The Architecture of Visual Cortex

Mishkin and Ungerleider, 1983

Lesion studies in the macaque monkey suggest that there are two large-scale cortical streams of visual processing:

Dorsal stream (“where”)

Ventral stream (“what”)

What and Where

Mishkin and Ungerleider, 1983

Object discrimination task

Landmark discrimination task

Bilateral lesion of the temporal lobe leads to a behavioral deficit in a task that requires the discrimination of objects.

Bilateral lesion of the parietal lobe leads to a behavioral deficit in a task that requires the discrimination of locations (landmarks).

The Architecture of Visual Cortex

Figure: lateral views of the macaque monkey brain, labeled for motion, color, and form.

Single Cells and Recognition

What is the cellular basis for visual recognition (visual long-term memory)?

1. Where are the cellular representations localized?

2. What processes generate these representations?

3. What underlies their reactivation during recall and recognition?


Visual recognition involves the inferior temporal cortex (multiple areas). These areas are part of a distributed network and are subject to both bottom-up (feature driven) and top-down (memory driven) influences.

Miyashita and Hayashi, 2000


Characteristics of neural responses in IT:

1. Object-specific (tuned to object class), selective for general object features (e.g. shape)

2. Non-topographic (large RF)

3. Long-lasting (100s of ms)

Columnar organization (“object feature columns”)

Specificity often has a rather broad range (distributed response pattern)

Distributed Representations

Are there specific, dedicated modules (or cells) for each and every object category?

No. – Why not?


Evidence points to a feature-based and widely distributed representation of objects across (ventral) temporal cortex.

What is a distributed representation?
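One way to make the question concrete is a toy comparison of a local code (one dedicated unit per category) with a distributed code (each category is a graded pattern across shared units). The response values below are invented for illustration and are not data from any study. The helper names `correlate`, `decode`, and `lesion` are hypothetical.

```python
import math

# Hypothetical response patterns over six units (made-up numbers, not measured data).
LOCAL = {
    "house": [1, 0, 0, 0, 0, 0],
    "face":  [0, 1, 0, 0, 0, 0],
    "chair": [0, 0, 1, 0, 0, 0],
}
DISTRIBUTED = {
    "house": [0.9, 0.2, 0.6, 0.1, 0.7, 0.3],
    "face":  [0.2, 0.9, 0.1, 0.8, 0.3, 0.2],
    "chair": [0.5, 0.3, 0.9, 0.2, 0.2, 0.8],
}


def correlate(a, b):
    """Pearson correlation between two equally long response vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat pattern carries no information about category
    return cov / math.sqrt(var_a * var_b)


def decode(pattern, code):
    """Return the category whose stored pattern correlates best with `pattern`,
    or None if no stored pattern correlates positively."""
    best = max(code, key=lambda cat: correlate(pattern, code[cat]))
    return best if correlate(pattern, code[best]) > 0 else None


def lesion(pattern, unit):
    """Silence one unit, as a crude stand-in for local damage."""
    damaged = list(pattern)
    damaged[unit] = 0
    return damaged


if __name__ == "__main__":
    print(decode(lesion(LOCAL["face"], 1), LOCAL))
    print(decode(lesion(DISTRIBUTED["face"], 1), DISTRIBUTED))
```

In this toy setup, silencing unit 1 abolishes the local “face” code entirely (`decode` returns `None`), whereas the damaged distributed pattern is still decoded as “face”. Robustness to partial damage and the ability to code many categories with few units are common arguments for distributed codes.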


Experiments conducted by Ishai et al.:

Experiment 1:

1. fMRI during passive viewing

2. fMRI during delayed match-to-sample

Experiment 2:

1. fMRI during delayed match-to-sample with photographs

2. fMRI during delayed match-to-sample with line drawings

Three categories: houses, faces, chairs.


Findings:

Experiment 1: Consistent topography in the areas that respond most strongly to each of the three categories.

Modules? No. Responses are distributed (more so for non-face stimuli).

Experiment 2: Are low-level features (spatial frequency, texture, etc.) responsible for the representation?

No. Line drawings elicit similar distributions of responses.


Figures from Ishai et al., 1999 (panels: houses, faces, chairs)

Face recognition achieves a very high level of specificity – hundreds, if not thousands of individual faces can be recognized.

Face Recognition

Visual agnosia specific to faces: prosopagnosia.

High specificity of face cells: “gnostic units”, “grandmother cells”

Many face cells respond to faces only – and show very little response to other object stimuli.

Face Recognition

Typical neural responses in the primate inferior temporal cortex:

Desimone et al., 1984

Face Recognition

Face cells (typically) do not respond to:

1. “jumbled” faces

2. “partial” faces

3. “single components” of faces (although some face-component cells have been found)

4. other “significant” stimuli

Face cells (typically) do respond to:

1. faces anywhere in a large bilateral visual field

2. faces with “reduced” feature content (e.g. b/w, low contrast)

Face cell responses can vary with: facial expression, view-orientation

Face Recognition

Face cells are (to a significant extent) anatomically segregated from other cells selective for objects. They are found in multiple subdivisions across the inferior temporal cortex (in particular in or near the superior temporal sulcus).

Face Recognition

Faces versus objects in a recent fMRI study (Halgren et al. 1999)

Object Recognition: Why is it a Hard Problem?

Objects can be recognized over huge variations in appearance and context!

The ability to recognize an object across a great number of different appearances is called object constancy (stimulus equivalence).

Sources of variability:

- Object position/orientation

- Viewer position/orientation

- Illumination (wavelength/brightness)

- Groupings and context

- Occlusion/partial views


Examples of variability (figure panels, each showing the field of view): translation invariance, rotation invariance.

More examples of variability: size invariance, color.

Variability in visual scenes: partial occlusion and presence of other objects.

Object Recognition: Theories

Representation of visual shape (set of locations):

Viewer-centered coordinate systems: the frame of reference is the viewer. Examples: retinotopic coordinates, head-centered coordinates. Easily accessed, but very unstable …

Environment-centered coordinate systems: locations are specified relative to the environment.

Object-centered coordinate systems: intrinsic to or fixed to the object itself (frame of reference: object). Less accessible.
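The difference between viewer-centered and object-centered descriptions can be sketched with a few 2D points. In the sketch below the same shape is placed in the viewer's frame at two different poses: the viewer-centered coordinates change, while the object-centered coordinates do not. The convention used to fix the object frame (origin at the centroid, x-axis toward the farthest point) is an assumption chosen for illustration, and the function names are hypothetical.

```python
import math


def to_viewer(points, angle, offset):
    """Place an object in the viewer's frame: rotate by `angle` (radians),
    then shift by `offset`."""
    c, s = math.cos(angle), math.sin(angle)
    ox, oy = offset
    return [(c * x - s * y + ox, s * x + c * y + oy) for x, y in points]


def to_object_centered(points):
    """Re-express points in a frame fixed to the object itself.

    Assumed convention (illustrative only): origin at the centroid, x-axis
    pointing at the point farthest from the centroid. That point must be
    unique for the frame to be well defined.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    centered = [(x - cx, y - cy) for x, y in points]
    ax, ay = max(centered, key=lambda p: math.hypot(p[0], p[1]))
    norm = math.hypot(ax, ay)
    ux, uy = ax / norm, ay / norm  # unit x-axis of the object frame
    # Project each point onto the object's x-axis and the perpendicular y-axis.
    return [(round(x * ux + y * uy, 6), round(-x * uy + y * ux, 6))
            for x, y in centered]


if __name__ == "__main__":
    shape = [(0, 0), (6, 0), (1, 2)]
    view_a = to_viewer(shape, 0.3, (10, -4))
    view_b = to_viewer(shape, 2.0, (-3, 7))
    print(view_a)                       # differs from view_b
    print(to_object_centered(view_a))   # matches the line below up to rounding
    print(to_object_centered(view_b))
```

The sketch also shows why object-centered descriptions are “less accessible”: the object's own axes have to be computed from the image before the description can be used, whereas viewer-centered coordinates are available directly.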

Object Recognition: Theories

A taxonomy:

1. Template matching models (viewer-centered, normalization stage and matching)

2. Prototype models

3. Feature analysis model

4. Recognition by components (object-centered)
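The first entry in the taxonomy, template matching with a normalization stage, can be sketched as follows. Shapes are sets of “on” pixels, normalization only removes translation, and matching uses pixel overlap. The two templates, the overlap measure, and all function names are assumptions made for illustration, not part of any specific published model.

```python
# Hypothetical stored templates: sets of (row, col) pixels that are "on".
TEMPLATES = {
    "T": {(0, 0), (0, 1), (0, 2), (1, 1), (2, 1)},
    "L": {(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)},
}


def normalize(pixels):
    """Normalization stage: shift the shape so its bounding box starts at (0, 0).
    Only translation is removed; size and rotation are left untouched."""
    r0 = min(r for r, _ in pixels)
    c0 = min(c for _, c in pixels)
    return {(r - r0, c - c0) for r, c in pixels}


def overlap(a, b):
    """Pixel overlap (intersection over union) between two shapes."""
    return len(a & b) / len(a | b)


def recognize(pixels, templates=TEMPLATES, normalize_first=True):
    """Matching stage: return (best template name, overlap score)."""
    candidate = normalize(pixels) if normalize_first else set(pixels)
    scores = {name: overlap(candidate, t) for name, t in templates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


if __name__ == "__main__":
    shifted_t = {(r + 5, c + 7) for r, c in TEMPLATES["T"]}
    print(recognize(shifted_t))                         # ('T', 1.0)
    print(recognize(shifted_t, normalize_first=False))  # overlap score is 0.0
```

Without the normalization stage, a shifted “T” shares no pixels with any stored template. Each further source of variability (size, rotation, occlusion) would need its own normalization step or additional templates, which is the usual objection to pure template matching.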

Object Recognition: Geons

Theory proposed by Irv Biederman.

Objects have parts.

Objects can be described as configurations of a (relatively small) number of geometrically defined parts.

These parts (geons) form a recognition alphabet: 24 geons, derived from four basic properties that are viewpoint-invariant.


How geons are constructed:


Irv Biederman, JCN, 2001

Geons in IT?

How does Invariance Develop?

Deficits of feature perception (such as achromatopsia) generally do not cause an inability to recognize objects.

Failure of knowledge or recognition = “agnosia” (here: visual agnosia).

In visual agnosias, feature processing and memory remain intact, and recognition deficits are limited to the visual modality. Alertness, attention, intelligence and language are unaffected.

Other sensory modalities (touch, smell) may substitute for vision in allowing objects to be recognized.

Higher Perceptual Functions: Agnosias

Apperceptive agnosia: a perceptual deficit that affects visual representations directly. Components of the visual percept are picked up but cannot be integrated. Effects may be graded; recognition of unusual views of objects is often affected.

Associative agnosia: visual representations are intact, but cannot be accessed or used in recognition. Lack of information about the percept. “Normal percepts stripped of their meaning” (Teuber)

This distinction introduced by Lissauer (1890)

Two Kinds of Agnosias

Apperceptive Agnosia

Diagnosis: ability to recognize degraded stimuli is impaired

Farah: Many “apperceptive agnosias” are “perceptual categorization deficits” …


Studies by E. Warrington:

Laterality in recognition deficits: patients with right-hemispheric lesions (parietal, temporal) showed lower performance on degraded images than controls or patients with left-hemispheric lesions.

Hypothesis: object constancy is disrupted (not contour perception)

Experiment: Unusual views of objects – patients with right-hemispheric lesions show a characteristic deficit for these views.


Is “perceptual categorization deficit” a general impairment of viewpoint-invariant object recognition?

1. Patients are not impaired in everyday life (unlike associative agnosics).

2. They are not impaired in matching different “normal” views of objects, only “unusual views”.

3. Impairment follows unilateral lesions, not bilateral (as would be expected if visual shape representations were generally affected).

Associative Agnosia

Patients do well on perceptual tests (degraded images, image segmentation), but cannot access names (“naming”) or other information (“recognition”) about objects. Agnosics fail to experience familiarity with the stimulus.

When given names of objects, they can (generally) give accurate verbal descriptions.

Warrington’s analysis places associative agnosia in the left hemisphere.

Associative Agnosia

Associative agnosics can copy drawings of objects but cannot name them (evidence for intactness of perceptual representations…) but…

Agnosia Restricted to Specific Categories

Specific deficits in recognizing living versus non-living things.

Warrington and Shallice (1984): patients with bilateral temporal lobe damage showed loss of knowledge about living things (failures in visual identification and verbal knowledge).

Their interpretation: distinction between knowledge domains – functional significance (vase-jug) versus sensory properties (strawberry-raspberry).

Evolutionary explanation…


Another view: Damasio (1990)

Many inanimate objects are manipulated by humans in characteristic ways.

Interpretation: inanimate objects will tend to evoke kinesthetic representations.

In agreement with Warrington, the difficulty is not due to visual characteristics or visual discriminability.


Yet another view: Gaffan and Heywood (1993)

Presented images (line drawings) of animate and inanimate objects to normal humans and normal monkeys, tachistoscopically (20 ms). Both subject groups made more errors in identifying animate than inanimate objects.

Interpretation of “category-specific agnosia”: living things are more similar to each other than non-living things are.

How is Semantic Knowledge Organized?

Category-based system

Property-based system

Network model by Farah and McClelland (1991).

Prosopagnosia

Is face recognition “special”?

- Anatomical localization

- Functional independence

Associative visual agnosia (prosopagnosia): Lost ability to recognize familiar faces.

Affects previously known faces as well as newly experienced faces (anterograde component).

Patients can recognize people by their voice, distinctive clothing, hairstyle etc.

Prosopagnosia

What is special about faces:

1. Higher specificity of categorization

2. Higher level of expertise

3. Higher degree of visual similarity

4. Evolutionary significance

Can face and object recognition be dissociated?

Neuropsychological evidence suggests that they can (study by McNeil and Warrington).

Also, remember Ishai et al. (object category map)

Date post:	28-Dec-2015
Category:	Documents
Upload:	jewel-robbins