Source: Current Directions in Psychological Science, Vol. 1, No. 5 (Oct., 1992), pp. 164-169
Published by: Sage Publications, Inc. on behalf of Association for Psychological Science
Stable URL: http://www.jstor.org/stable/20182166

CURRENT DIRECTIONS IN PSYCHOLOGICAL SCIENCE

Is an Object an Object an Object? Cognitive and Neuropsychological Investigations of Domain Specificity in Visual Object Recognition

Martha J. Farah

Martha J. Farah is Professor of Psychology at the University of Pennsylvania. The work of hers that is described herein was carried out while she was at Carnegie Mellon University. Address correspondence to Martha J. Farah, Department of Psychology, University of Pennsylvania, 3815 Walnut St., Philadelphia, PA 19104.

Are all types of objects recognized in the same way, or are different kinds of visual object recognition systems used to recognize different types of objects? Most current work on object recognition in cognitive science has assumed, explicitly or implicitly, that all visual stimuli are recognized by a common set of mechanisms. Cognitive scientists such as Marr¹ and Biederman,² who have proposed comprehensive theories of object recognition, do not specify different types of representations or processes for different types of stimuli. Rather, these scientists have described a single type of system capable of recognizing as wide a range of stimuli as possible.

Other researchers have questioned the existence of a general-purpose pattern recognition system, and have instead suggested that the visual system has evolved numerous specialized subsystems for recognizing different types of stimuli. Probably the most extreme proponent of this view was Konorski,³ who suggested that there were nine different subsystems used in visual recognition. Figure 1 shows his taxonomy of distinct recognition systems.

What is the evidence in favor of each of these two positions? Surprisingly, relatively little systematic empirical work has been directed toward this issue. Researchers favoring a single system are apparently guided by a faith in parsimony. In contrast, the alternative view draws empirical support from neuropsychology, although the evidence cited is often anecdotal. Thus, the question of whether we have a single, general-purpose object recognition system or multiple subsystems specialized for recognizing different kinds of objects is an open empirical question. In this article, I marshal evidence from previous work in neuropsychology, as well as current research with normal and brain-damaged subjects in my lab, to address the following three questions: First, are there specialized subsystems for recognizing different types of visual stimuli? Second, how many such subsystems are there? Third, what types of visual information processing are these specialized subsystems specialized for?

NEUROPSYCHOLOGICAL EVIDENCE FOR SPECIALIZATION WITHIN THE VISUAL OBJECT RECOGNITION SYSTEM

Damage to the visual areas of the brain can sometimes impair visual recognition ability, while leaving intact general intellectual abilities as well as perception of many of the basic elements of vision, such as local contour, color, depth, and motion.⁴ People with this condition, known as visual agnosia, retain full knowledge of the nonvisual aspects of an object, enabling them to recognize it by touching it or hearing any characteristic sound it might make. They can also identify it from a verbal description. In associative visual agnosia (a term coined in the 19th century, based on the belief that an inability to associate visual input with stored knowledge was the underlying cause), there is considerable residual perceptual ability, such that the person may be able to see an object well enough to draw a recognizable copy of it. For example, Figure 2 shows three pictures that an associative agnosic, case L.H., was unable to recognize, along with his copies of each. Within the framework of current theories of vision, such people are likely to have sustained damage to the highest levels of visual object representation (e.g., the 3D models of Marr).

Associative visual agnosia does not always affect the recognition of all types of stimuli equally. The selectivity observed in some cases of agnosia suggests that there may be some division of labor within the visual recognition system and provides clues as to the way in which visual recognition can be subdivided. The most common of these dissociations is pure alexia, an impairment of printed-word recognition.

Dissociations Between Printed-Word Recognition and the Recognition of Other Objects

People with pure alexia are impaired at reading, despite the preserved ability to recognize spoken words and the preserved ability to write. (This leads to the almost paradoxical situation in which people may be unable to read what they themselves have just written.) Although some pure alexics are entirely unable to read, the more usual form of the disorder involves extremely slow, letter-by-letter reading. If such patients are required to recognize even a short word in less than a few seconds, they may fail entirely. Most people with pure alexia are not agnosic for objects other than printed words, which suggests that printed-word recognition depends on at least some mechanisms that are not shared with other forms of visual recognition.

Copyright © 1992 American Psychological Society

Fig. 1. Konorski's nine visual-gnostic categories: (a) small, manipulable objects; (b) larger, partially manipulable objects; (c) nonmanipulable objects; (d) human faces; (e) emotional facial expressions; (f) animated objects; (g) signs; (h) handwriting; (i) positions of limbs.

Fig. 2. Examples of three pictures that L.H., an associative visual agnosic, could not recognize (left), and his copies of these pictures (right).

Before accepting this conclusion, we must consider an alternative hypothesis: that word recognition involves the same system that serves for the recognition of other kinds of objects, but taxes this system more heavily, perhaps because word recognition is learned later than other forms of visual recognition, or because different words resemble one another more than different nonword objects. According to this alternative hypothesis, the selective impairment in word recognition does not imply that different subsystems of visual object recognition are required for recognizing words and nonword objects.

The existence of the opposite dissociation, namely, associative agnosia for objects without pure alexia, helps to rule out this alternative explanation. There are a number of associative agnosics who are not alexic. For example, a man described by Gomori and Hawryluk⁵ was impaired at recognizing a variety of common objects, as well as the faces of his family and friends. Nevertheless, he was able to read easily, even when interfering lines had been drawn across the printed words. Whereas a single dissociation can be explained within the framework of a single, general-purpose object recognition system, by hypothesizing that the impaired class of stimuli taxes that system more heavily than the preserved class, a so-called double dissociation strongly implies two separate systems.
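The logic of this argument can be made concrete with a toy model. The sketch below assumes, purely for illustration, a single recognition system with one capacity value and hypothetical demand values for each stimulus class; none of these numbers come from the studies cited. It shows that lowering the capacity of a single system can impair the more demanding class alone, or both classes, but never the less demanding class alone.

```python
def impaired_single_system(capacity, demands):
    """Classes that fail when one shared system has the given capacity.

    Assumption: a class is recognized only if capacity >= its demand.
    """
    return frozenset(cls for cls, demand in demands.items() if capacity < demand)


# Hypothetical demands: words tax the shared system more than objects do.
demands = {"words": 8, "objects": 5}

# Sweep capacity from fully damaged (0) to fully intact (10).
patterns = {impaired_single_system(c, demands) for c in range(11)}

for pattern in sorted(patterns, key=len):
    print(sorted(pattern) or "nothing impaired")
```

Under these assumptions the pattern of impaired objects with spared words never arises, so observing it in a patient, as in the case described by Gomori and Hawryluk, counts against the single-system account.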

Dissociations Between Face Recognition and the Recognition of Other Objects

One of the dramatic dissociations within the neuropsychology of vision is found in prosopagnosia, the selective impairment of face recognition. Prosopagnosics cannot recognize familiar people by their faces alone and must rely on other cues for recognition, such as people's voices or distinctive clothing or hairstyles. The disorder can be so severe that even close friends and family members are not recognized. One prosopagnosic recounted sitting in his club and wondering why another member was staring so intently at him. When he asked one of the waiters to investigate, he learned that he had been looking at himself in a mirror!

Although many prosopagnosics have some degree of difficulty recognizing objects other than faces, in some cases the deficit appears strikingly selective for faces. DeRenzi⁶ described a man who was sufficiently prosopagnosic that "the identification of relatives and close friends posed an insurmountable problem if he could not rely on their voices." Nevertheless, he was able to identify all nonface objects with which he was presented, and to recognize his own razor, wallet, eyeglasses, and other personal items when presented along with several similar objects of the same type.

The most straightforward interpretation of prosopagnosia, with respect to the question posed at the outset, is that there is a specialized subsystem for recognizing faces, not needed for recognizing other types of objects, and that prosopagnosia results from damage to this subsystem. However, as noted earlier, it is possible that faces and common objects are recognized using a single recognition system, and that faces are simply the most difficult type of object to recognize. Prosopagnosia could then be explained as a mild form of agnosia in which the impairment is detectable only on the most taxing form of recognition task. This alternative hypothesis can be ruled out because the opposite dissociation also exists, namely, impaired recognition of common objects with preserved face recognition. For example, McCarthy and Warrington⁷ described a patient who was unable to recognize a single picture from a long series of pictures of common objects, but performed satisfactorily with pictures of the faces of famous people. The neuropsychological data therefore suggest that the recognition of faces and common objects is carried out by at least partially distinct subsystems of the visual system.

The dissociations among the agnosias for faces, common objects, and printed words cannot be explained, in any straightforward way, by the hypothesis that all three stimulus domains are recognized by a single, general-purpose object recognition system. Instead, these disorders suggest that people have evolved different types of specialized recognition systems for different types of stimuli. This raises the question: How many specialized subsystems are there?

PATTERNS OF CO-OCCURRENCE AMONG THE ASSOCIATIVE AGNOSIAS: DELINEATING THE SUBSYSTEMS OF VISUAL OBJECT RECOGNITION

At first glance, the pairwise dissociability of face, common-object, and printed-word recognition would seem to imply that there are three different subsystems of visual recognition, each specialized for one of these categories of stimuli. If this were true, then we should observe all combinations of spared and impaired face, common-object, and printed-word recognition, provided we look at a large enough number of cases. With the goal of testing this prediction, I recently reviewed 99 cases of associative agnosia, for each case noting the available information on the patient's face, common-object, and printed-word recognition.⁸ Table 1 shows the distribution of different patterns of ability and deficit for those cases in which information was given about the recognition of all three categories of stimuli. For two of the possible patterns, there was only one case each that appeared to instantiate the pattern. Furthermore, in each of those case reports, there was an inconsistency in the way the case was described, such that a description of the patient in one part of the case report conformed to the unusual pattern, whereas a description in a different part of the same case report did not.

Fig. 3. Diagram representing the relative importance of two hypothesized types of visual recognition ability for recognizing faces, common objects, and printed words.

The distribution of cases shown in Table 1 is consistent with two, rather than three, underlying types of visual recognition ability. As depicted by the diagram in Figure 3, one subsystem is essential for face recognition, useful for common-object recognition, and not at all needed for printed-word recognition, whereas the other subsystem is essential for printed-word recognition, useful for common-object recognition, and not at all needed for face recognition. According to this framework, one should never observe impaired recognition of common objects with intact recognition of faces and printed words, and rarely or never observe impaired recognition of faces and printed words with intact recognition of common objects. These are, in fact, the two patterns for which no clear cases exist.⁹
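The predictions of the two-subsystem framework can be enumerated mechanically. The sketch below is a toy model under stated assumptions, not the analysis used in the review: each hypothesized subsystem takes one of three hypothetical damage levels, "essential" is read as "any damage impairs the class," and "useful" is read as "only severe damage to either subsystem impairs the class." The case counts are those listed in Table 1, with the two doubtful single cases entered as 1.

```python
from itertools import product

# Hypothetical damage levels for each subsystem (illustrative assumption).
INTACT, MILD, SEVERE = 0, 1, 2


def impaired_classes(damage_a, damage_b):
    """Stimulus classes impaired under a toy two-subsystem model.

    Subsystem A: essential for faces, useful for common objects.
    Subsystem B: essential for words, useful for common objects.
    """
    impaired = set()
    if damage_a >= MILD:
        impaired.add("faces")
    if damage_b >= MILD:
        impaired.add("words")
    if max(damage_a, damage_b) >= SEVERE:
        impaired.add("objects")
    return frozenset(impaired)


# Map each producible pattern to the damage profiles that yield it.
predicted = {}
for a, b in product((INTACT, MILD, SEVERE), repeat=2):
    predicted.setdefault(impaired_classes(a, b), []).append((a, b))

# Case counts from Table 1 (pure alexia was not included in the search).
table1 = {
    frozenset({"faces"}): 27,
    frozenset({"faces", "objects"}): 15,
    frozenset({"faces", "objects", "words"}): 22,
    frozenset({"objects", "words"}): 16,
    frozenset({"objects"}): 1,          # doubtful case
    frozenset({"faces", "words"}): 1,   # doubtful case
}

for pattern, n_cases in table1.items():
    profiles = predicted.get(pattern, [])
    print(sorted(pattern), n_cases, f"{len(profiles)} damage profile(s)")
```

In this toy model, impaired objects with spared faces and words cannot be produced at all, and impaired faces and words with spared objects arises from only one profile (mild damage to both subsystems), which matches the "never" and "rarely or never" wording of the framework.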

TWO TYPES OF STRUCTURAL DESCRIPTION?

In the remainder of this article, I present a conjecture concerning the functions of the two types of visual recognition systems, and some attempts that my collaborators and I have made to test this conjecture. As a starting point, recall that many current theories of object recognition hypothesize some form of structural description, that is, a representation of an object's shape in terms of parts, which are explicitly represented as shapes in their own right, along with the relations among parts. The more extensive the part decomposition, the more parts there will be in the object's representation, but the simpler those parts will be. The less the part decomposition, the fewer parts there will be in an object's representation, but the more complex those parts will be. The conjecture being put forth here is that word recognition involves extensive part decomposition, and hence requires the ability to represent a large number of parts, whereas face recognition involves virtually no part decomposition, and hence requires the ability to represent complex parts.

Table 1. Results of literature review for possible combinations of impaired and spared recognition of faces, common objects, and printed words

Impaired and spared classes of stimuli                Number of cases
Impaired: faces; spared: common objects, words        27
Impaired: faces, common objects; spared: words        15
Impaired: faces, common objects, words                22
Impaired: words; spared: faces, common objects        Not included in search
Impaired: common objects, words; spared: faces        16
Impaired: common objects; spared: faces, words        1?
Impaired: faces, words; spared: common objects        1?

Reading and the Ability to Represent Multiple Parts

It would not surprise most nonpsychologists to learn that printed words are recognized by first recognizing their letters. In fact, experimental data and untutored intuitions agree on this issue: For example, Johnston and McClelland¹⁰ found that tachistoscopic word recognition was significantly more disrupted by a mask made up of letters than by one made up of letter fragments, consistent with the idea that a necessary stage in word recognition is the explicit recognition of the component letters. This finding suggests that words are a paradigm case of a type of object that must be decomposed into multiple parts to be recognized.¹¹

There is also evidence that the underlying impairment in pure alexia consists of an inability to recognize multiple shapes, either simultaneously or in rapid sequence, resulting in the laborious letter-by-letter reading that is the hallmark of this syndrome. Such evidence was first noted by Kinsbourne and Warrington,¹² using both orthographic and nonorthographic stimuli, and has since been confirmed in different ways by other researchers. In all these cases, however, the evidence has been associational: Subjects who have pure alexia are also found to have an impairment in the recognition of multiple items.

Wallace and I¹³ recently attempted to find out whether the latter was a causal factor in the word recognition impairment of pure alexia, or whether it was associated for some other reason (e.g., neighboring parts of the brain involved in the two abilities, such that a single lesion would be likely to impair both). We used additive factors logic to test the hypothesis that letter-by-letter reading results from difficulty with specifically visual processing of the multiple letters of a word. Because pure alexics read (if at all) letter by letter, the time it takes them to read a word is directly proportional to the number of letters in the word. If the slow, length-dependent reading times of these patients result from impairment at a visual stage of processing, then, according to additive factors logic, a manipulation known to affect the difficulty of visual encoding should exacerbate the word-length effects. By varying word length and visual quality, we should observe an interaction between their effects. As shown in Figure 4, we found just this pattern of results in a pure alexic subject, but not in control subjects who were instructed to read letter by letter.
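Additive factors logic can be illustrated with two toy reading-time models. The sketch below is only an illustration: the function names, the millisecond values, and the word lengths are all hypothetical and are not taken from the experiment. In the first model, masking slows the same per-letter visual stage that word length loads, so the two effects multiply; in the second, masking adds a fixed cost at a different stage, so the effects simply add. The interaction contrast is the word-length effect under masking minus the word-length effect without masking.

```python
def rt_same_stage(n_letters, masked, base=500, per_letter=400, mask_factor=2):
    """Reading time (ms) if masking slows the per-letter visual stage."""
    letter_time = per_letter * (mask_factor if masked else 1)
    return base + n_letters * letter_time


def rt_separate_stages(n_letters, masked, base=500, per_letter=400, mask_cost=300):
    """Reading time (ms) if masking adds a fixed cost at another stage."""
    return base + n_letters * per_letter + (mask_cost if masked else 0)


def interaction_contrast(rt, short=3, long=7):
    """Word-length effect when masked minus word-length effect when unmasked."""
    masked_effect = rt(long, True) - rt(short, True)
    unmasked_effect = rt(long, False) - rt(short, False)
    return masked_effect - unmasked_effect


print("same stage:", interaction_contrast(rt_same_stage))
print("separate stages:", interaction_contrast(rt_separate_stages))
```

A nonzero contrast corresponds to diverging lines in a plot of latency against word length, the pattern reported for the pure alexic subject; a zero contrast corresponds to parallel lines, the pattern reported for the control subjects.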

Face Recognition and the Ability to Represent Complex Wholes Without Part Decomposition

Just as words seem to have a natural decomposition into letters, so faces seem decomposable into such facial features as eyes, noses, and mouths. However, this intuition alone does not tell us whether such features play the role of psychologically real parts in the visual representations that underlie face recognition. A recent series of experiments I conducted in collaboration with Tanaka suggests that they do not, or that they do so to a lesser extent than the features of other, nonface, objects.¹⁴

Fig. 4. Reading latency as a function of word length and visual quality. (a) Results for a pure alexic, letter-by-letter reader instructed to read as quickly as possible. (b) Results for normal subjects instructed to read letter by letter as quickly as possible. [Plot legend: unmasked, masked; x-axis: word length.]

We reasoned as follows: To the extent that some portion of a pattern is explicitly represented as a part for purposes of recognition, when that portion is presented later in isolation, subjects should be able to identify it as a portion of a familiar pattern. To the extent that the portion does not correspond to the way the subject's visual system parses the whole pattern, that portion presented in isolation is less likely to be recognized. Tanaka and I taught subjects to identify a set of faces, along with a set of nonface objects, and then assessed the subjects' ability to recognize both the whole patterns and their parts. Examples of study and test stimuli are shown in Figure 5. Relative to the recognition of such nonface objects as houses, inverted faces, and scrambled faces, the recognition of intact upright faces showed a greater disadvantage for parts relative to wholes. Figure 6 shows the results of the experiment comparing recognition of faces and houses.

Given that normal subjects employ relatively less part decomposition in recognizing faces than in recognizing other objects, prosopagnosics' impairment in face recognition might be due to an inability to encode faces as complex, undecomposed wholes. To test this hypothesis directly, Tanaka, Drain, and I compared the relative advantage of whole faces over face parts for normal subjects and for a prosopagnosic subject, case L.H.¹⁵ Our initial plan was to administer to L.H. the same task that Tanaka and I used with the normal subjects, but despite intensive effort, L.H. could not learn to recognize a set of faces. We therefore switched to a short-term memory paradigm in which a face was presented for study, followed by a blank interval, followed by a second presentation of a face. The subjects' task was to say whether the first and second faces were the same or different. There were two different conditions for the presentation of the first face: either "exploded" into four separate frames containing the head, eyes, nose, and mouth (in their proper relative spatial position within each frame) or intact. The second face was always presented in the normal format, so that the two conditions can be called "parts-to-whole" and "whole-to-whole." Normal subjects performed better in the "whole-to-whole" condition, thus providing further evidence that their perception of a whole face is not equivalent to the perception of its parts. In contrast, L.H. performed equally well in the two conditions, despite an overall accuracy comparable to the normal subjects', consistent with the hypothesis that he has lost the ability to see faces as wholes.

CONCLUSIONS

The evidence reviewed suggests answers to the three questions posed at the outset. First, an object is not an object is not an object. The double dissociations that exist between disorders of face and nonface object recognition, and between disorders of word and nonword object recognition, are inconsistent with the operation of a single, general-purpose object recognition system. Instead, they suggest that there is a division of labor within the object recognition system, with different subsystems needed for different types of visual stimulus.

Second, although the pairwise dissociability of faces, common objects, and printed words might seem to imply the existence of three distinct subsystems, a closer look at the patterns of co-occurrence suggests that we need postulate only two subsystems: one that is essential for word recognition, useful for common-object recognition, and not needed for face recognition, and another that is essential for face recognition, useful for object recognition, and not needed for word recognition.

Third, a tentative interpretation of these subsystems, in terms of the types of visual information processing they carry out, is the following: The first subsystem is needed to recognize objects by extensive part decomposition, in which numerous parts must be encoded. The second subsystem is needed to recognize objects with little or no part decomposition, in which relatively complex parts must be encoded. Further work is needed to test the empirical truth of this interpretation. In addition, work is needed to clarify the reasons why different types of objects come to be recognized by these different subsystems. For example, what aspects of the statistics of similarity and difference among individual objects make extensive part decomposition useful for words, less so for common objects, and relatively useless for faces?

Fig. 5. Examples of pairs of test items from an experiment on the recognition of faces and houses. Subjects studied whole items individually and learned to identify them by name (e.g., Larry's face or Larry's house). The test was administered in a two-alternative forced-choice format, either for an isolated part (e.g., "Which is Larry's nose?" or "Which is Larry's door?") or for the whole item with only a single part changed (e.g., "Which is Larry's face?" or "Which is Larry's house?").

Fig. 6. Percentage of correct recognition of faces and houses and their parts from the experiment described in Figure 5. [Bar chart legend: isolated part condition, whole object condition; groups: faces, houses.]

Acknowledgments: Preparation of this review, and much of the work described herein, was supported by Office of Naval Research Grant N00014-91-J1546, National Institute of Mental Health Grant R01 MH48274, National Institutes of Health Career Development Award K04 NS01405, and a grant from the McDonnell-Pew Program in Cognitive Neuroscience. I gratefully acknowledge Jim Tanaka's collaboration in developing the ideas about face recognition that are presented here.

Notes

1. D. Marr, Vision (Freeman, San Francisco, 1982).
2. I. Biederman, Recognition-by-components: A theory of human image understanding, Psychological Review, 94, 115-147 (1987). Note that Biederman did suggest that different types of processes may be needed to recognize stimuli at different levels of specificity, and that his theory is primarily suited to recognition at the basic object level.
3. J. Konorski, Integrative Activity of the Brain (University of Chicago Press, Chicago, 1967).
4. For a review, see M.J. Farah, Visual Agnosia: Disorders of Object Recognition and What They Tell Us About Normal Vision (MIT Press, Cambridge, MA, 1990).
5. A.J. Gomori and G.A. Hawryluk, Visual agnosia without alexia, Neurology, 34, 947-950 (1984).
6. E. DeRenzi, Current issues in prosopagnosia, in Aspects of Face Processing, H.D. Ellis, M.A. Jeeves, F. Newcombe, and A. Young, Eds. (Martinus Nijhoff, Dordrecht, The Netherlands, 1986).
7. R.A. McCarthy and E.K. Warrington, Visual associative agnosia: A clinical-anatomical study of a single case, Journal of Neurology, Neurosurgery and Psychiatry, 49, 1233-1240 (1986).
8. M.J. Farah, Patterns of co-occurrence among the associative agnosias: Implications for visual object representation, Cognitive Neuropsychology, 8, 1-19 (1991).
9. This hypothesis also predicts that patients with apparently pure prosopagnosia or pure alexia should, with sufficiently sensitive testing (e.g., degraded or tachistoscopic presentations of stimuli), show impairment in common-object recognition as well, and that this impairment should be qualitatively different for the two types of patients.
10. J.C. Johnston and J.L. McClelland, Experimental tests of a hierarchical model of word identification, Journal of Verbal Learning and Verbal Behavior, 19, 503-524 (1980).
11. The word superiority effect, by which letters embedded in words are perceived better than letters presented in nonwords or alone, might appear to imply that words are perceived holistically, without decomposition into letters. However, the implications of this effect are weaker than this. It implies only that, in addition to individual letter representations, word or letter-cluster representations are activated, and that the activation states of the latter representations influence those of the former.
12. M. Kinsbourne and E.K. Warrington, A disorder of simultaneous form perception, Brain, 85, 461-486 (1962).
13. M.J. Farah and M.A. Wallace, Pure alexia as a visual impairment: A reconsideration, Cognitive Neuropsychology, 8, 313-334 (1991).
14. J.W. Tanaka and M.J. Farah, Parts and wholes in face recognition, Quarterly Journal of Experimental Psychology (in press).
15. M.J. Farah, J.W. Tanaka, and M. Drain (manuscript in preparation), University of Pennsylvania, Philadelphia.