Is an Object an Object an Object? Cognitive and Neuropsychological Investigations of Domain Specificity in Visual Object Recognition
Author(s): Martha J. Farah
Source: Current Directions in Psychological Science, Vol. 1, No. 5 (Oct., 1992), pp. 164-169
Published by: Sage Publications, Inc. on behalf of Association for Psychological Science
Stable URL: http://www.jstor.org/stable/20182166
CURRENT DIRECTIONS IN PSYCHOLOGICAL SCIENCE
Is an Object an Object an Object? Cognitive and Neuropsychological Investigations of Domain Specificity in Visual Object Recognition
Martha J. Farah
Martha J. Farah is Professor of Psychology at the University of Pennsylvania. The work of hers that is described herein was carried out while she was at Carnegie Mellon University. Address correspondence to Martha J. Farah, Department of Psychology, University of Pennsylvania, 3815 Walnut St., Philadelphia, PA 19104.

Are all types of objects recognized in the same way, or are different kinds of visual object recognition systems used to recognize different types of objects? Most current work on object recognition in cognitive science has assumed, explicitly or implicitly, that all visual stimuli are recognized by a common set of mechanisms. Cognitive scientists such as Marr¹ and Biederman,² who have proposed comprehensive theories of object recognition, do not specify different types of representations or processes for different types of stimuli. Rather, these scientists have described a single type of system capable of recognizing as wide a range of stimuli as possible.

Other researchers have questioned the existence of a general-purpose pattern recognition system, and have instead suggested that the visual system has evolved numerous specialized subsystems for recognizing different types of stimuli. Probably the most extreme proponent of this view was Konorski,³ who suggested that there were nine different subsystems used in visual recognition. Figure 1 shows his taxonomy of distinct recognition systems.

What is the evidence in favor of each of these two positions? Surprisingly, relatively little systematic empirical work has been directed toward this issue. Researchers favoring a single system are apparently guided by a faith in parsimony. In contrast, the alternative view draws empirical support from neuropsychology, although the evidence cited is often anecdotal. Thus, the question of whether we have a single, general-purpose object recognition system or multiple subsystems specialized for recognizing different kinds of objects is an open empirical question. In this article, I marshal evidence from previous work in neuropsychology, as well as current research with normal and brain-damaged subjects in my lab, to address the following three questions: First, are there specialized subsystems for recognizing different types of visual stimuli? Second, how many such subsystems are there? Third, what types of visual information processing are these specialized subsystems specialized for?
NEUROPSYCHOLOGICAL EVIDENCE FOR SPECIALIZATION WITHIN THE VISUAL OBJECT RECOGNITION SYSTEM
Damage to the visual areas of the brain can sometimes impair visual recognition ability, while leaving intact general intellectual abilities as well as perception of many of the basic elements of vision, such as local contour, color, depth, and motion.⁴ People with this condition, known as visual agnosia, retain full knowledge of the nonvisual aspects of an object, enabling them to recognize it by touching it or hearing any characteristic sound it might make. They can also identify it from a verbal description. In associative visual agnosia (a term coined in the 19th century, based on the belief that an inability to associate visual input with stored knowledge was the underlying cause), there is considerable residual perceptual ability, such that the person may be able to see an object well enough to draw a recognizable copy of it. For example, Figure 2 shows three pictures that an associative agnosic, case L.H., was unable to recognize, along with his copies of each. Within the framework of current theories of vision, such people are likely to have sustained damage to the highest levels of visual object representation (e.g., the 3D models of Marr).

Associative visual agnosia does not always affect the recognition of all types of stimuli equally. The selectivity observed in some cases of agnosia suggests that there may be some division of labor within the visual recognition system and provides clues as to the way in which visual recognition can be subdivided. The most common of these dissociations is pure alexia, an impairment of printed-word recognition.
Fig. 1. Konorski's nine visual-gnostic categories: (a) small, manipulable objects; (b) larger, partially manipulable objects; (c) nonmanipulable objects; (d) human faces; (e) emotional facial expressions; (f) animated objects; (g) signs; (h) handwriting; (i) positions of limbs.

Fig. 2. Examples of three pictures that L.H., an associative visual agnosic, could not recognize (left), and his copies of these pictures (right).

Copyright © 1992 American Psychological Society

Dissociations Between Printed-Word Recognition and the Recognition of Other Objects

People with pure alexia are impaired at reading, despite the preserved ability to recognize spoken words and the preserved ability to write. (This leads to the almost paradoxical situation in which people may be unable to read what they themselves have just written.) Although some pure alexics are entirely unable to read, the more usual form of the disorder involves extremely slow, letter-by-letter reading. If such patients are required to recognize even a short word in less than a few seconds, they may fail entirely. Most people with pure alexia are not agnosic for objects other than printed words, which suggests that printed-word recognition depends on at least some mechanisms that are not shared with other forms of visual recognition.

Before accepting this conclusion, we must consider an alternative hypothesis: that word recognition involves the same system that serves for the recognition of other kinds of objects, but taxes this system more heavily, perhaps because word recognition is learned later than other forms of visual recognition, or because different words resemble one another more than different nonword objects. According to this alternative hypothesis, the selective impairment in word recognition does not imply that different subsystems of visual object recognition are required for recognizing words and nonword objects.

The existence of the opposite dissociation, namely, associative agnosia for objects without pure alexia, helps to rule out this alternative explanation. There are a number of associative agnosics who are not alexic. For example, a man described by Gomori and Hawryluk⁵ was impaired at recognizing a variety of common objects, as well as the faces of his family and friends. Nevertheless, he was able to read easily, even when interfering lines had been drawn across the printed words. Whereas a single dissociation can be explained within the framework of a single, general-purpose object recognition system, by hypothesizing that the impaired class of stimuli taxes that system more heavily than the preserved class, a so-called double dissociation strongly implies two separate systems.
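The logic of this argument can be made concrete with a toy model. The sketch below is purely illustrative and is not taken from any of the studies cited: it assumes a single recognition system whose remaining capacity after a lesion is one number, and it assigns each stimulus class a hypothetical difficulty (the capacity that class demands). Under these assumptions every lesion produces nested deficits, so the model can reproduce one dissociation but never its reverse.

```python
# Toy single-system model: one shared capacity, one difficulty per stimulus class.
# All names and numbers are hypothetical, chosen only to illustrate the argument.

def impaired_classes(capacity, difficulty):
    """Return the stimulus classes whose demands exceed the remaining capacity."""
    return {name for name, demand in difficulty.items() if demand > capacity}


def dissociations(difficulty, capacities):
    """Collect every (impaired, spared) pair that some lesion severity produces."""
    pairs = set()
    for capacity in capacities:
        impaired = impaired_classes(capacity, difficulty)
        spared = set(difficulty) - impaired
        pairs |= {(i, s) for i in impaired for s in spared}
    return pairs


def has_double_dissociation(pairs):
    """True if some pair of classes dissociates in both directions."""
    return any((s, i) in pairs for (i, s) in pairs)


# Assumption: words tax the shared system more heavily than objects.
difficulty = {"words": 0.8, "objects": 0.4}
# Lesions of increasing severity leave progressively less capacity.
capacities = [1.0, 0.9, 0.6, 0.3, 0.0]

pairs = dissociations(difficulty, capacities)
print(pairs)                           # {('words', 'objects')}
print(has_double_dissociation(pairs))  # False
```

In this one-capacity sketch, the reverse pair (objects impaired with words spared, as in the Gomori and Hawryluk case) cannot arise at any lesion severity, which is why observing it counts against a single shared system.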
Dissociations Between Face Recognition and the Recognition of Other Objects
One of the dramatic dissociations within the neuropsychology of vision is found in prosopagnosia, the selective impairment of face recognition. Prosopagnosics cannot recognize familiar people by their faces alone and must rely on other cues for recognition, such as people's voices or distinctive clothing or hairstyles. The disorder can be so severe that even close friends and family members are not recognized. One prosopagnosic recounted sitting in his club and wondering why another member was staring so intently at him. When he asked one of the waiters to investigate, he learned that he had been looking at himself in a mirror!

Although many prosopagnosics have some degree of difficulty recognizing objects other than faces, in some cases the deficit appears strikingly selective for faces. DeRenzi⁶ described a man who was sufficiently prosopagnosic that "the identification of relatives and close friends posed an insurmountable problem if he could not rely on their voices." Nevertheless, he was able to identify all nonface objects with which he was presented, and to recognize his own razor, wallet, eyeglasses, and other personal items when presented along with several similar objects of the same type.

The most straightforward interpretation of prosopagnosia, with respect to the question posed at the outset, is that there is a specialized subsystem for recognizing faces, not needed for recognizing other types of objects, and that prosopagnosia results from damage to this subsystem. However, as noted earlier, it is possible that faces and common objects are recognized using a single recognition system, and that faces are simply the most difficult type of object to recognize. Prosopagnosia could then be explained as a mild form of agnosia in which the impairment is detectable only on the most taxing form of recognition task. This alternative hypothesis can be ruled out because the opposite dissociation also exists, namely, impaired recognition of common objects with preserved face recognition. For example, McCarthy and Warrington⁷ described a patient who was unable to recognize a single picture from a long series of pictures of common objects, but performed satisfactorily with pictures of the faces of famous people. The neuropsychological data therefore suggest that the recognition of faces and common objects is carried out by at least partially distinct subsystems of the visual system.
The dissociations among the agnosias for faces, common objects, and printed words cannot be explained, in any straightforward way, by the hypothesis that all three stimulus domains are recognized by a single, general-purpose object recognition system. Instead, these disorders suggest that people have evolved different types of specialized recognition systems for different types of stimuli. This raises the question: How many specialized subsystems are there?
PATTERNS OF CO-OCCURRENCE AMONG THE ASSOCIATIVE AGNOSIAS: DELINEATING THE SUBSYSTEMS OF VISUAL OBJECT RECOGNITION
At first glance, the pairwise dissociability of face, common-object, and printed-word recognition would seem to imply that there are three different subsystems of visual recognition, each specialized for one of these categories of stimuli. If this were true, then we should observe all combinations of spared and impaired face, common-object, and printed-word recognition, provided we look at a large enough number of cases. With the goal of testing this prediction, I recently reviewed 99 cases of associative agnosia, for each case noting the available information on the patient's face, common-object, and printed-word recognition.⁸ Table 1 shows the distribution of different patterns of ability and deficit for those cases in which information was given about the recognition of all three categories of stimuli. For two of the possible patterns, there was only one case each that appeared to instantiate the pattern. Furthermore, in each of those case reports, there was an inconsistency in the way the case was described, such that a description of the patient in one part of the case report conformed to the unusual pattern, whereas a description in a different part of the same case report did not.

Fig. 3. Diagram representing the relative importance of two hypothesized types of visual recognition ability for recognizing faces, common objects, and printed words.
The distribution of cases shown in Table 1 is consistent with two, rather than three, underlying types of visual recognition ability. As depicted by the diagram in Figure 3, one subsystem is essential for face recognition, useful for common-object recognition, and not at all needed for printed-word recognition, whereas the other subsystem is essential for printed-word recognition, useful for common-object recognition, and not at all needed for face recognition. According to this framework, one should never observe impaired recognition of common objects with intact recognition of faces and printed words, and rarely or never observe impaired recognition of faces and printed words with intact recognition of common objects. These are, in fact, the two patterns for which no clear cases exist.⁹
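The prediction can be checked by enumeration. The following sketch is an illustration only, not the procedure used in the literature review: it assumes each subsystem is either intact or damaged, that damage to a subsystem always impairs the class for which it is essential, and that common-object recognition may or may not be detectably impaired when one subsystem is damaged but is always impaired when both are. The function and variable names are hypothetical.

```python
from itertools import product

CLASSES = ("faces", "objects", "words")


def predicted_patterns():
    """Enumerate the impairment patterns the two-subsystem account allows.

    Assumptions (illustrative, not from the source data):
    - the face system is essential for faces, the word system for words;
    - objects draw on both, so damage to one system may or may not yield a
      detectable object deficit, while damage to both always does.
    """
    patterns = set()
    for face_damaged, word_damaged in product([False, True], repeat=2):
        if face_damaged and word_damaged:
            object_outcomes = [True]
        elif face_damaged or word_damaged:
            object_outcomes = [False, True]
        else:
            object_outcomes = [False]
        for objects_impaired in object_outcomes:
            flags = (face_damaged, objects_impaired, word_damaged)
            impaired = frozenset(n for n, f in zip(CLASSES, flags) if f)
            if impaired:  # skip the fully intact case
                patterns.add(impaired)
    return patterns


def all_patterns():
    """Every nonempty combination of impaired classes (seven in total)."""
    return {
        frozenset(n for n, f in zip(CLASSES, flags) if f)
        for flags in product([False, True], repeat=3)
        if any(flags)
    }


excluded = all_patterns() - predicted_patterns()
for pattern in sorted(sorted(p) for p in excluded):
    print("not predicted:", ", ".join(pattern))
# not predicted: faces, words
# not predicted: objects
```

Under these assumptions the two excluded patterns are impaired objects alone and impaired faces and words with objects spared, the same two patterns for which Table 1 lists only a single doubtful case each.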
Table 1. Results of literature review for possible combinations of impaired and spared recognition of faces, common objects, and printed words

Impaired and spared classes of stimuli                      Number of cases
Impaired: faces; spared: common objects, words              27
Impaired: faces, common objects; spared: words              15
Impaired: faces, common objects, words                      22
Impaired: words; spared: faces, common objects              Not included in search
Impaired: common objects, words; spared: faces              16
Impaired: common objects; spared: faces, words              1?
Impaired: faces, words; spared: common objects              1?

Published by Cambridge University Press

TWO TYPES OF STRUCTURAL DESCRIPTION?

In the remainder of this article, I present a conjecture concerning the functions of the two types of visual recognition systems, and some attempts that my collaborators and I have made to test this conjecture. As a starting point, recall that many current theories of object recognition hypothesize some form of structural description, that is, a representation of an object's shape in terms of parts, which are explicitly represented as shapes in their own right, along with the relations among parts. The more extensive the part decomposition, the more parts there will be in the object's representation, but the simpler those parts will be. The less the part decomposition, the fewer parts there will be in an object's representation, but the more complex those parts will be. The conjecture being put forth here is that word recognition involves extensive part decomposition, and hence requires the ability to represent a large number of parts, whereas face recognition involves virtually no part decomposition, and hence requires the ability to represent complex parts.
Reading and the Ability to Represent Multiple Parts
It would not surprise most nonpsychologists to learn that printed words are recognized by first recognizing their letters. In fact, experimental data and untutored intuitions agree on this issue: For example, Johnston and McClelland¹⁰ found that tachistoscopic word recognition was significantly more disrupted by a mask made up of letters than by one made up of letter fragments, consistent with the idea that a necessary stage in word recognition is the explicit recognition of the component letters. This finding suggests that words are a paradigm case of a type of object that must be decomposed into multiple parts to be recognized.¹¹

There is also evidence that the underlying impairment in pure alexia consists of an inability to recognize multiple shapes, either simultaneously or in rapid sequence, resulting in the laborious letter-by-letter reading that is the hallmark of this syndrome. Such evidence was first noted by Kinsbourne and Warrington,¹² using both orthographic and nonorthographic stimuli, and has since been confirmed in different ways by other researchers. In all these cases, however, the evidence has been associational: Subjects who have pure alexia are also found to have an impairment in the recognition of multiple items.
Wallace and I¹³ recently attempted to find out whether the latter was a causal factor in the word recognition impairment of pure alexia, or whether it was associated for some other reason (e.g., neighboring parts of the brain involved in the two abilities, such that a single lesion would be likely to impair both). We used additive factors logic to test the hypothesis that letter-by-letter reading results from difficulty with specifically visual processing of the multiple letters of a word. Because pure alexics read (if at all) letter by letter, the time it takes them to read a word is directly proportional to the number of letters in the word. If the slow, length-dependent reading times of these patients result from impairment at a visual stage of processing, then, according to additive factors logic, a manipulation known to affect the difficulty of visual encoding should exacerbate the word-length effects. By varying word length and visual quality, we should observe an interaction between their effects. As shown in Figure 4, we found just this pattern of results in a pure alexic subject, but not in control subjects who were instructed to read letter by letter.
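The additive factors prediction can be expressed as a simple contrast. The sketch below is illustrative and uses invented parameters, not data from the experiment: it assumes reading time is a base time plus a per-letter cost, and it compares two hypothetical accounts of how masking acts. If masking slows the same visual stage that produces the word-length effect, the per-letter cost grows and the two factors interact; if masking acts at a separate stage, it adds a constant and the two effects are additive.

```python
def reading_time(n_letters, masked, mask_affects_letter_stage,
                 base=500.0, per_letter=400.0,
                 mask_slowdown=1.5, mask_constant=300.0):
    """Hypothetical reading time in ms for a letter-by-letter reader.

    All parameter values are invented for illustration.
    """
    if masked and mask_affects_letter_stage:
        # Masking slows the encoding of every letter: the slope changes.
        return base + n_letters * per_letter * mask_slowdown
    if masked:
        # Masking acts at a separate stage: it adds a fixed cost.
        return base + mask_constant + n_letters * per_letter
    return base + n_letters * per_letter


def interaction_contrast(short, long, same_stage):
    """Difference between the word-length effects for masked and unmasked words.

    Zero means the two factors are additive; nonzero means they interact.
    """
    length_effect_masked = (reading_time(long, True, same_stage)
                            - reading_time(short, True, same_stage))
    length_effect_unmasked = (reading_time(long, False, same_stage)
                              - reading_time(short, False, same_stage))
    return length_effect_masked - length_effect_unmasked


print(interaction_contrast(3, 7, same_stage=True))   # 800.0
print(interaction_contrast(3, 7, same_stage=False))  # 0.0
```

In this toy setting, a nonzero contrast is the signature expected if the length-dependent slowing arises at a visual stage, and a zero contrast is what an account placing the two effects at separate stages would predict.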
Fig. 4. Reading latency as a function of word length and visual quality. (a) Results for a pure alexic, letter-by-letter reader instructed to read as quickly as possible. (b) Results for normal subjects instructed to read letter by letter as quickly as possible.
[Fig. 4 panels: x-axis, word length; curves, unmasked and masked.]

Face Recognition and the Ability to Represent Complex Wholes Without Part Decomposition

Just as words seem to have a natural decomposition into letters, so faces seem decomposable into such facial features as eyes, noses, and mouths. However, this intuition alone does not tell us whether such features play the role of psychologically real parts in the visual representations that underlie face recognition. A recent series of experiments I conducted in collaboration with Tanaka suggests that they do not, or that they do so to a lesser extent than the features of other, nonface, objects.¹⁴

We reasoned as follows: To the extent that some portion of a pattern is explicitly represented as a part for purposes of recognition, when that portion is presented later in isolation, subjects should be able to identify it as a portion of a familiar pattern. To the extent that the portion does not correspond to the way the subject's visual system parses the whole pattern, that portion presented in isolation is less likely to be recognized. Tanaka and I taught subjects to identify a set of faces, along with a set of nonface objects, and then assessed the subjects' ability to recognize both the whole patterns and their parts. Examples of study and test stimuli are shown in Figure 5. Relative to the recognition of such nonface objects as houses, inverted faces, and scrambled faces, the recognition of intact upright faces showed a greater disadvantage for parts relative to wholes. Figure 6 shows the results of the experiment comparing recognition of faces and houses.
Given that normal subjects employ relatively less part decomposition in recognizing faces than in recognizing other objects, prosopagnosics' impairment in face recognition might be due to an inability to encode faces as complex, undecomposed wholes. To test this hypothesis directly, Tanaka, Drain, and I compared the relative advantage of whole faces over face parts for normal subjects and for a prosopagnosic subject, case L.H.¹⁵ Our initial plan was to administer to L.H. the same task that Tanaka and I used with the normal subjects, but despite intensive effort, L.H. could not learn to recognize a set of faces. We therefore switched to a short-term memory paradigm in which a face was presented for study, followed by a blank interval, followed by a second presentation of a face. The subjects' task was to say whether the first and second faces were the same or different. There were two different conditions for the presentation of the first face: either "exploded" into four separate frames containing the head, eyes, nose, and mouth (in their proper relative spatial position within each frame) or intact. The second face was always presented in the normal format, so that the two conditions can be called "parts-to-whole" and "whole-to-whole." Normal subjects performed better in the "whole-to-whole" condition, thus providing further evidence that their perception of a whole face is not equivalent to the perception of its parts. In contrast, L.H. performed equally well in the two conditions, despite an overall accuracy comparable to the normal subjects', consistent with the hypothesis that he has lost the ability to see faces as wholes.
Fig. 5. Examples of pairs of test items from an experiment on the recognition of faces and houses. Subjects studied whole items individually and learned to identify them by name (e.g., Larry's face or Larry's house). The test was administered in a two-alternative forced-choice format, either for an isolated part (e.g., "Which is Larry's nose?" or "Which is Larry's door?") or for the whole item with only a single part changed (e.g., "Which is Larry's face?" or "Which is Larry's house?").

Fig. 6. Percentage of correct recognition of faces and houses and their parts from the experiment described in Figure 5.
[Fig. 6 bar chart: y-axis, percent correct; bars for the isolated part condition and the whole object condition, shown separately for faces and houses.]

CONCLUSIONS

The evidence reviewed suggests answers to the three questions posed at the outset. First, an object is not an object is not an object. The double dissociations that exist between disorders of face and nonface object recognition, and between disorders of word and nonword object recognition, are inconsistent with the operation of a single, general-purpose object recognition system. Instead, they suggest that there is a division of labor within the object recognition system, with different subsystems needed for different types of visual stimulus.

Second, although the pairwise dissociability of faces, common objects, and printed words might seem to imply the existence of three distinct subsystems, a closer look at the patterns of co-occurrence suggests that we need postulate only two subsystems: one that is essential for word recognition, useful for common-object recognition, and not needed for face recognition, and another that is essential for face recognition, useful for object recognition, and not needed for word recognition.

Third, a tentative interpretation of these subsystems, in terms of the types of visual information processing they carry out, is the following: The first subsystem is needed to recognize objects by extensive part decomposition, in which numerous parts must be encoded. The second subsystem is needed to recognize objects with little or no part decomposition, in which relatively complex parts must be encoded. Further work is needed to test the empirical truth of this interpretation. In addition, work is needed to clarify the reasons why different types of objects come to be recognized by these different subsystems. For example, what aspects of the statistics of similarity and difference among individual objects make extensive part decomposition useful for words, less so for common objects, and relatively useless for faces?
Acknowledgments: Preparation of this review, and much of the work described herein, was supported by Office of Naval Research Grant N00014-91-J1546, National Institute of Mental Health Grant R01 MH48274, National Institutes of Health Career Development Award K04 NS01405, and a grant from the McDonnell-Pew Program in Cognitive Neuroscience. I gratefully acknowledge Jim Tanaka's collaboration in developing the ideas about face recognition that are presented here.
Notes
1. D. Marr, Vision (Freeman, San Francisco, 1982).
2. I. Biederman, Recognition-by-components: A theory of human image understanding, Psychological Review, 94, 115-147 (1987). Note that Biederman did suggest that different types of processes may be needed to recognize stimuli at different levels of specificity, and that his theory is primarily suited to recognition at the basic object level.
3. J. Konorski, Integrative Activity of the Brain (University of Chicago Press, Chicago, 1967).
4. For a review, see M.J. Farah, Visual Agnosia: Disorders of Object Recognition and What They Tell Us About Normal Vision (MIT Press, Cambridge, MA, 1990).
5. A.J. Gomori and G.A. Hawryluk, Visual agnosia without alexia, Neurology, 34, 947-950 (1984).
6. E. DeRenzi, Current issues in prosopagnosia, in Aspects of Face Processing, H.D. Ellis, M.A. Jeeves, F. Newcombe, and A. Young, Eds. (Martinus Nijhoff, Dordrecht, The Netherlands, 1986).
7. R.A. McCarthy and E.K. Warrington, Visual associative agnosia: A clinical-anatomical study of a single case, Journal of Neurology, Neurosurgery and Psychiatry, 49, 1233-1240 (1986).
8. M.J. Farah, Patterns of co-occurrence among the associative agnosias: Implications for visual object representation, Cognitive Neuropsychology, 8, 1-19 (1991).
9. This hypothesis also predicts that patients with apparently pure prosopagnosia or pure alexia should, with sufficiently sensitive testing (e.g., degraded or tachistoscopic presentations of stimuli), show impairment in common-object recognition as well, and that this impairment should be qualitatively different for the two types of patients.
10. J.C. Johnston and J.L. McClelland, Experimental tests of a hierarchical model of word identification, Journal of Verbal Learning and Verbal Behavior, 19, 503-524 (1980).
11. The word superiority effect, by which letters embedded in words are perceived better than letters presented in nonwords or alone, might appear to imply that words are perceived holistically, without decomposition into letters. However, the implications of this effect are weaker than this. It implies only that, in addition to individual letter representations, word or letter-cluster representations are activated, and that the activation states of the latter representations influence those of the former.
12. M. Kinsbourne and E.K. Warrington, A disorder of simultaneous form perception, Brain, 85, 461-486 (1962).
13. M.J. Farah and M.A. Wallace, Pure alexia as a visual impairment: A reconsideration, Cognitive Neuropsychology, 8, 313-334 (1991).
14. J.W. Tanaka and M.J. Farah, Parts and wholes in face recognition, Quarterly Journal of Experimental Psychology (in press).
15. M.J. Farah, J.W. Tanaka, and M. Drain (manuscript in preparation), University of Pennsylvania, Philadelphia.