
Predictions and incongruency in object recognition: a cognitive neuroscience perspective

Helena Yardley, Leonid Perlovsky and Moshe Bar

Abstract The view presented here is that visual object recognition is the product of interaction between perception and cognition, a process budding from memories of past events. Our brain is proactive, in that it is continuously aiming to predict how the future will unfold, and this inherent ability allows us to function optimally in our environment. Our memory serves as a database from which we can form analogies from our past experiences, and apply them to the current moment. We are able to recognize an object through top-down and bottom-up processing pathways, which integrate to facilitate successful and timely object recognition. Specifically, it is argued that when we encounter an object our brain asks “what is this like” and therefore draws on years of experience, stored in memory, to generate predictions that directly facilitate perception. These feed-forward and feedback systems tune our perceptual and cognitive faculties based on a number of factors: predictive astuteness, context, personal relevance of the given event, and the degree to which its potential rarity differs from our original expectations. We discuss both computational and theoretical models of object recognition, and review evidence to support the theory that we do not merely process incoming information serially, and that during our attempts to interpret the world around us, perception relies on existing knowledge as much as it does on incoming information.

Helena Yardley
Martinos Center for Biomedical Imaging at Massachusetts General Hospital, USA
e-mail: [email protected]

Leonid Perlovsky
Harvard University and Air Force Research Laboratory, Hanscom AFB, USA
e-mail: [email protected]

Moshe Bar
Martinos Center for Biomedical Imaging at Massachusetts General Hospital, Harvard Medical School, Charlestown, MA 02129, USA
e-mail: [email protected]

Acknowledgments – Work on this chapter was supported by NIH, NSF, DARPA and the Air Force.


Introduction

Traditionally, perception and cognition are considered separately. By this mainstream view, perception pertains to the analysis of sensory input, while cognition is presumed to apply to higher-level processes that follow the interpretation of that input. But this boundary is superficial, as argued here, in that perception and cognition continuously interact with each other towards the goal of understanding our environment. There is an intimate relationship between perception and cognition: what we know shapes what we perceive. This perceptual-cognitive interface provides the basis by which we can experience, learn, and function more optimally. The result allows us to learn and form memories, and those memories subsequently allow us to create predictions about what might happen next.

Modeling this process mathematically by matching sensory data to memories has been attempted in artificial intelligence since the 1950s, but these attempts have had very limited success due to inadequate knowledge of the underlying neural mechanisms [71]. As an alternative, it is suggested that the human brain proactively generates predictions about the relevant sights, sounds, and situations that we might encounter in the imminent future [3,4,44,12,63,10]. These predictions serve an important purpose, by preparing us and directing our attention to what is likely to happen, and by alerting us to what is novel.

Predictions are an integral part of our everyday life, enabled by neural pathways of top-down and bottom-up processes of recognition that allow us to understand our world and the objects within it. For this review, we will focus on predictions in visual object and scene recognition. Our proposal is that when our brain encounters a new object, it does not attempt to recognize it by asking “what is this,” which is inherently a bottom-up process, but rather by asking “what is this like” [3], which is a process that emphasizes experience and the role of memory in our understanding of the world around us. We continuously, consciously or not, search for analogies between new inputs and familiar representations stored in memory from experience. Finding such analogies not only facilitates recognition of what is in front of us, but also provides immediate access to a web of associations in memory that is the basis for the activation of predictions about likely upcoming situations. We will use this framework to later explain how these systems adjust to identify rare and incongruent aspects of our environment that happen to fall outside the predicted “net” of possible outcomes.

Fig. 1. Perception, cognition, memory, and the resulting predictions are feed-forward and feedback systems, constantly changing the way we function in the future with the growing number of situations we encounter.

Predictions about the relevant future rely heavily upon our past memories; we are able to learn, and can therefore anticipate the future to some degree by drawing upon these memories and “lessons learned.” In addition to memories from real experiences, our minds are also capable of mental simulations, or “imagining” the outcomes of imaginary events. For example, you might be swept up in a daydream during the lull of a late afternoon workday about what you would do should the building catch fire and a swift evacuation be required. You would imagine how you would try to escape the building, help your fellow co-workers, or make a mental map of the nearest windows should the regular exits cease to be an option. These mental simulations then become part of your own memory database, and even though they are memories of events that have not yet happened, they could guide future actions as effectively as real memories.

At the center of this review is evidence to support the theory that we do not process our world simply by analyzing incoming information, but rather attempt to understand it by linking this incoming information to existing memories by way of analogy. When we encounter a novel stimulus (and all visual stimuli are novel to some extent, because we never experience the same situation twice under exactly the same conditions), our brains attempt to link this input with prior, similar experiences. In other words, we assess the global characteristics of, say, an object, and make a quick comparison as to how similar this object is to one we are already familiar with, to help us identify the new object in question. By linking a novel input to a similar representation in memory (i.e. an analogy), we immediately gain access to a vast web of associations that are linked to this representation, and correspondingly to this novel input. Activating these associations, which are typically context-sensitive, is the mechanism that underlies prediction.

1 Object Recognition: Overview

The visual cortex is built in a hierarchical manner, which might be the primary reason why bottom-up processing has been the dominant dogma in thinking about visual processing for so many years. Visual perception begins with an image projected onto the retina, which is then relayed to the lateral geniculate nucleus (LGN). The LGN then projects to the primary visual cortex (V1), giving rise to the ventral pathway, which continues onward to the prestriate cortex (V2), to an area in the prelunate gyrus (V4), and ends in the temporal lobe where object recognition is presumably accomplished [37]. As information progresses through this visual processing hierarchy, neurons respond to increasingly complex stimuli. For instance, neurons located in the V1 area respond to very simple stimuli (e.g. orientation of lines) [49], while neurons in the temporal areas are selective for complex visual input, such as faces [72,89,85,58]. In addition, receptive field sizes (i.e. the size of the visual field to which a certain neuron is responsive) become progressively larger along the visual hierarchy, allowing processing of objects of larger magnitude and varying location [52].
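To make the growth of receptive fields along the hierarchy concrete, here is a minimal toy sketch (our illustration, not from the chapter; the stage names and pooling factor are assumptions) showing how stacking local pooling stages makes each unit at a higher stage responsive to a progressively larger portion of the input, loosely analogous to the V1-to-IT progression described above.

```python
import numpy as np

def pool2x2(x):
    """Average-pool a 2D array by a factor of 2: each output unit
    summarizes a 2x2 neighborhood of the stage below."""
    h, w = x.shape
    return x[:h // 2 * 2, :w // 2 * 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

# Toy "retinal" input: a 32x32 image.
image = np.random.rand(32, 32)

# Stack pooling stages; the receptive field of each unit doubles per stage,
# mimicking the progressive enlargement along the ventral hierarchy.
stage, rf = image, 1
for level in ["V1", "V2", "V4", "IT"]:
    stage = pool2x2(stage)
    rf *= 2
    print(f"{level}: {stage.shape[0]}x{stage.shape[1]} units, "
          f"each covering ~{rf}x{rf} pixels of the input")
```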

Traditionally, visual object recognition has been regarded as being mediated by a hierarchical, bottom-up stream that processes an image by systematically analyzing its individual elements, and relaying the information to the next areas until the overall form and identity is determined. This idea has been challenged in recent years by theories and studies that are based on a more comprehensive model that combines bottom-up and top-down processing, which facilitates rapid and accurate object recognition as these “counter-streams” come to an agreement [54,90,42,15,31,56,1,6].

Top-down processing can be seen as making a quick and educated “guess,” consulting with prior memories to activate simultaneous predictions about the identity of the input object, and finding an analogy of “What is this like?” For instance, seeing a dim and fuzzy picture of a hairdryer would allow you to quickly determine what the object may be (hairdryer, power drill, gun, etc.), and what it definitely could not be (horse, car, toilet seat, etc.). In this way, we “prime” our systems for what to expect, and when our predictions are accurate, this facilitates faster recognition and further generation of association-based predictions. Once this global information is obtained, it is back-projected to the temporal cortex for collaboration in bottom-up processing [1]. A wealth of recent studies indicates that object identification is initiated and facilitated by top-down processes [48,17,60,32,6].

The extent of neural processing needed for object recognition is proportional to how congruent the input is with expectations, and to the frequency with which the object has been encountered in the past. Previous neuroimaging studies have consistently shown a reduction in neural activity, particularly in the prefrontal cortex (PFC) and temporal cortex, for repeated stimuli [27,14,92,91,23]. This could be attributed to more efficient processing of a familiar stimulus, or to increased “synchrony” between top-down processing in the PFC and bottom-up processing in the temporal cortex [39,92,91,47,55,35,23,34,80,26,61]. A recent study by Ghuman et al. (2008) suggested that there is a relationship between cross-cortical communication and priming. It is possible that repetitions of a stimulus strengthen the connections between the nodes of this PFC-inferotemporal (IT) network. Their results indicate that this feedback process is initiated in the PFC, which then influences processing in the temporal cortex. Top-down initiation of object recognition allows bottom-up processes to function more effectively, and to achieve object identification faster than a solely bottom-up process would. In some instances it is even impossible to recognize an input object based on bottom-up processes exclusively.

One of the main criticisms of an exclusively bottom-up processing framework is that it presupposes that object recognition occurs only after the area at the tail end of the processing hierarchy has received and analyzed all the required input from earlier areas [1]. This scheme attributes substantially less importance to the parallel and feedback connections known to exist [78,16,65,74]. Object recognition is not unimodal; there are a number of other cues that lend themselves to a better understanding of one particular object, such as contextual cues and our expectations about what we should encounter. In this review, we will examine the influence of top-down predictions in the process of object recognition, and explain how they might be integrated with other perceptual and cognitive processes to achieve visual object identification.

2 Mechanisms of Top-Down Activation

Top-down processing in object recognition can be seen as the brain’s initial guess of “what is this like?” and is greatly dependent on the global physical properties of the stimulus. Instead of processing individual details, this process relies on the “gist” of the object. Results from previous studies have indicated that people perceive the low spatial frequency (LSF) components of an object prior to perceiving high spatial frequency (HSF) components [25,50,66], potentially providing the means for triggering predictions in a top-down fashion [81,71,6,73,67]. According to one specific framework (Bar, 2003), this LSF information is projected rapidly from the early visual areas directly to the prefrontal cortex, possibly via the dorsal magnocellular pathway. This blurred delineation sparks a cascade of events that begins to generate predictions about the identity of the object in the orbitofrontal cortex (OFC) and downstream to the visual cortex (Figure 2) [6]. The process of LSF-to-HSF recognition proposed in [1] and demonstrated in [6] shares several principles with Carpenter & Grossberg’s (1987) ART framework, Perlovsky’s dynamic logic model of cognition (2006) and others. Attempting to process the details of an object sequentially would take substantially longer without top-down guidance. Also, in cases where an immediate assessment is needed (e.g. if you saw someone in your periphery coming at you), being able to rely on this LSF information could mean a faster reaction time and, in extreme cases, the difference between life and death. This process allows us to make a rapid, and possibly subconscious, “guess” as to what an object in our field of view may be.

Fig. 2. Information from the LSF components of an image is projected from early visual cortex to the OFC, possibly via the magnocellular pathway. As the OFC generates predictions about the object’s identity, information is simultaneously processed in a bottom-up manner along the relatively slower ventral visual pathway. Information from the two visual pathways is integrated to accomplish object recognition. (Adapted from Bar et al., 2006.)
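As a concrete illustration of the LSF/HSF distinction, the following sketch (our toy example, not the framework’s implementation; the Gaussian filter and the sigma value are arbitrary assumptions) low-pass filters an image to produce the kind of coarse, gist-like representation that could plausibly drive an early top-down guess, and subtracts it to recover the fine HSF detail.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def split_spatial_frequencies(image, sigma=4.0):
    """Split a grayscale image into low and high spatial frequency parts.
    The LSF part is a Gaussian-blurred 'gist'; the HSF part is the
    residual fine detail. sigma (in pixels) is an illustrative choice."""
    lsf = gaussian_filter(image.astype(float), sigma=sigma)
    hsf = image - lsf
    return lsf, hsf

# Toy usage: a random 64x64 "image". In the framework sketched above,
# something like `lsf` would be rushed ahead (e.g. via magnocellular
# projections) to generate candidate identities, while full-detail
# analysis proceeds more slowly along the ventral pathway.
image = np.random.rand(64, 64)
lsf, hsf = split_spatial_frequencies(image, sigma=4.0)
print(lsf.shape, hsf.shape)  # (64, 64) (64, 64)
```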

Neurophysiological studies have shed light on the issue, providing evidence for the proposition that top-down processes initiate object recognition [48,17,60,32,6]. Bar et al. (2006) found that object recognition based on LSF information elicited differential activity in the left OFC 50 ms earlier than in areas of the temporal cortex when trying to identify an object. From the initial activity in the OFC, the sequence of recognition progressed to the left occipitotemporal cortex (180 ms), then to the left fusiform gyrus (215 ms). The neural activity was most robust at 130 ms from the presentation of the object, and remained statistically significant for approximately 40 ms. In addition, trials carried out with an easily recognizable object elicited substantially less robust early OFC activity than the masked trials. This further supports the proposition that top-down processing initiates predictions for facilitating object recognition.

Other studies have shown evidence to support the presence of cortical and subcortical shortcuts in general, including projections of coarse information from the thalamus to the amygdala [57], from V1 and V2 to V4 and the posterior part of IT [65], and between V4 and TE [33]. Additionally, studies have shown that the PFC receives magnocellular projections [8,79,40]. Specifically, the magnocellular pathway has direct connections from as early as visual area V2 to the dorsolateral PFC, and from the ventral area V4 directly to the ventrolateral PFC [78,8]. Therefore, these data provide evidence of the neural infrastructure required for the rapid projection of LSF content from early visual areas to the PFC [1]. This information is then projected back to the ventral visual cortex, where top-down and bottom-up processes may be integrated. This feedback system allows more efficient processing by considerably narrowing the number of possible object identities that need to be considered. In one mathematical model of this interaction between top-down and bottom-up signals [71], it was argued that vague LSF top-down contents are essential for overcoming the combinatorial computational complexity that had plagued artificial intelligence and other mathematical attempts at object recognition in the past. Top-down processing provides a framework from which bottom-up processing can engage more efficiently. But in addition to the predictions that can be generated about an object’s identity based on its global “gist” properties, a significant source of predictability can be gleaned from the context in which objects appear.
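A minimal sketch of the vague-to-crisp matching idea behind dynamic logic [71] (our simplification, with made-up Gaussian object models and annealing schedule): candidate models start with large uncertainty, soft association weights are computed between models and input features, and the models sharpen iteratively, avoiding a combinatorial search over all hard model-to-feature assignments.

```python
import numpy as np

def dynamic_logic_match(features, means, sigma0=5.0, steps=20):
    """Vague-to-crisp matching (a simplification of dynamic logic).
    features: (N, d) observed feature vectors.
    means:    (K, d) initial centers of K candidate object models.
    Each model starts with a large, vague variance that anneals down;
    soft assignment weights avoid testing every hard assignment."""
    means = means.astype(float).copy()
    sigma = sigma0
    for _ in range(steps):
        # Soft association weights w(n, k) between features and models.
        d2 = ((features[:, None, :] - means[None, :, :]) ** 2).sum(-1)
        w = np.exp(-d2 / (2 * sigma ** 2))
        w /= w.sum(axis=1, keepdims=True)
        # Update each model toward the features it best explains.
        num = (w[:, :, None] * features[:, None, :]).sum(axis=0)
        means = num / w.sum(axis=0)[:, None]
        sigma = max(0.5, sigma * 0.8)  # anneal: vague -> crisp
    return means, w

# Toy usage: 2 candidate models, noisy features from two clusters.
rng = np.random.default_rng(0)
features = np.vstack([rng.normal(0, 1, (30, 2)),
                      rng.normal(8, 1, (30, 2))])
means, w = dynamic_logic_match(features,
                               np.array([[1.0, 1.0], [7.0, 7.0]]))
print(np.round(means, 2))  # centers converge near (0, 0) and (8, 8)
```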

3 Objects in Context: Overview

A scene is contextually coherent if it is composed of items that frequently appear together within that context. Seeing a street scene prompts predictions about what other objects might also be encountered: a mailbox, cars, buildings, people. These predictions are triggered by stored associations that are based on the frequency with which we encounter these objects together, and these context-driven predictions allow us to save valuable processing power by inferring what objects are present, rather than having to attend specifically to each object within the scene. These predictions are activated rapidly, and can be triggered by LSF global information, just as we proposed for single objects. It has indeed been shown thoroughly that LSF images of typical contexts are highly informative about the context [88,68], and thus can serve the purpose of triggering predictions reliably. This predictive association might be initiated using Bayesian inference methods, based on priming, attention, and expectation of scales (e.g. size, distances) [88,53], and is further reinforced through cellular-level mechanisms such as long-term potentiation (LTP) [29] and Hebbian-based learning [46].
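As an illustration of how Bayesian inference could combine a context prior with bottom-up evidence (a toy sketch with invented numbers, not a model from the cited work): the scene’s gist sets P(object | context), the object’s features supply a likelihood, and the posterior narrows the candidate identities.

```python
# Toy Bayesian combination of a context prior with bottom-up evidence.
# All probabilities below are invented for illustration.

def posterior(prior, likelihood):
    """P(object | context, features) is proportional to
    P(object | context) * P(features | object)."""
    joint = {obj: prior[obj] * likelihood[obj] for obj in prior}
    z = sum(joint.values())
    return {obj: p / z for obj, p in joint.items()}

# Context prior from a street-scene gist: mailboxes are plausible, toasters not.
prior = {"mailbox": 0.30, "car": 0.50, "toaster": 0.01, "hydrant": 0.19}

# Bottom-up likelihood for a boxy, waist-high blob (ambiguous on its own:
# it fits a mailbox or a toaster about equally well).
likelihood = {"mailbox": 0.6, "car": 0.05, "toaster": 0.55, "hydrant": 0.1}

print(posterior(prior, likelihood))
# The context prior resolves the ambiguity in favor of "mailbox".
```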

There are instances of conditionalized associative activation that suggest that associations are not activated automatically. An image of a book is associated with a number of other objects, and our brain does not know if we are in a library, and need to activate corresponding associations such as rows of book shelves, chairs, desks, computers, and librarians, or if we are at home and need to activate, instead, the representation of a coffee table, a couch, and a television. It has been proposed [2,34] that our brain initially activates all these associations simultaneously, but once more information about the context (e.g. library) becomes available, it suppresses associations from the irrelevant context (e.g. home), and keeps online only the associations of the appropriate context frame.
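A minimal sketch of this activate-then-suppress proposal (our toy interpretation of [2,34]; the association table and suppression gain are invented): a cue first activates associations from every context frame it belongs to, and an identified context then scales down the frames that do not match.

```python
# Toy activate-then-suppress dynamics for context frames.
# Association strengths and the suppression gain are invented values.

ASSOCIATIONS = {
    "library": {"shelves": 0.9, "desks": 0.7, "librarian": 0.6},
    "home":    {"coffee table": 0.8, "couch": 0.8, "television": 0.7},
}

def activate(cue_frames, identified_context=None, suppression=0.1):
    """Activate all frames linked to a cue; once a context is identified,
    scale down (suppress) associations from the irrelevant frames."""
    active = {}
    for frame in cue_frames:
        gain = 1.0
        if identified_context is not None and frame != identified_context:
            gain = suppression
        for obj, strength in ASSOCIATIONS[frame].items():
            active[obj] = gain * strength
    return active

# A book cues both frames at first...
print(activate(["library", "home"]))
# ...but once the scene is recognized as a library, home items are suppressed.
print(activate(["library", "home"], identified_context="library"))
```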

Aside from objects that are related to each other contextually because they typically appear together, objects can be related to each other in various other ways. For example, a toaster and a microwave are contextually related; two different brands of microwave are different types of the same basic-level concept and are thus semantically related; and the simple visual components of a microwave and a mailbox, although not contextually related, are perceptually related. These various relationships are stored differently in different areas of the brain. Recent evidence has supported the theory that cortical representations of objects might be grouped by physical properties in early visual cortex [41,59,84], by basic-level categories in the anterior temporal cortex [76,45,28], by contextual relations in the parahippocampal cortex (PHC) [5], and by semantic relations in the PFC [36]. We maintain multiple different representations for the same object, each of which is geared towards and deployed for dedicated purposes.

Computational attempts to identify what constitutes a coherent contextual scene have been undertaken since the 1960s and, similar to object recognition, have proven challenging. The mathematical and conceptual challenges associated with these topics share the same limiting factor: the multitude of conditional combinations a scene can contain. There are many objects in the field of view in any given direction, at any given moment. Which of them are relevant to which context? The human mind solves this problem so effortlessly that it is difficult to appreciate its computational complexity, which is even more involved than the previously discussed complexity of object recognition. Again, the gist-like nature of top-down projections has been fundamental in reducing computational complexity, helping to understand the brain mechanism of top-down signal interaction that facilitates the recognition of objects in context [2,70].

In addition to objects that commonly co-occur in the same scene, or that are conceptually related to each other in some way, Biederman et al. (1982) expounded on five types of object relations that characterize the rules governing a scene’s structure and their influence on perception: familiar size (known relative size of objects), position (typical spatial orientation), probability (frequency with which you have encountered this object within this scene), support (objects tend to be physically supported, not floating), and interposition (objects in the foreground will occlude what is behind them) [13]. These rules help to form our a priori knowledge about how the world should be, and summon our attention when objects appear to be incongruent with our expectations. In this way, scenes that follow these rules facilitate faster and more efficient processing, particularly when one’s predictions are congruent with the contextual frame and the object’s identity. To a large extent, then, the process of object identification is dependent upon how the object is related to its surroundings [69,22,51,21,2].
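To make these five relations concrete, here is a toy congruency checker (entirely our illustration; the scene encoding, probability table, and thresholds are invented) that flags violations of the kind Biederman et al. describe, such as an unsupported, floating object.

```python
# Toy checker for Biederman-style scene relations (illustrative only).
# Objects are dicts with invented fields; thresholds are arbitrary.

SCENE_PROBABILITY = {("sofa", "kitchen"): 0.05, ("kettle", "kitchen"): 0.9}

def violations(obj, scene):
    """Return which of the five relational rules the object violates."""
    broken = []
    if obj["height_ratio"] > 2.0:            # familiar size: far too large
        broken.append("familiar size")
    if abs(obj["tilt_deg"]) > 45:            # position: atypical orientation
        broken.append("position")
    if SCENE_PROBABILITY.get((obj["name"], scene), 0.5) < 0.1:
        broken.append("probability")         # rarely seen in this scene
    if not obj["supported"]:                 # support: floating object
        broken.append("support")
    if obj["background_occludes_it"]:        # interposition violated
        broken.append("interposition")
    return broken

sofa = {"name": "sofa", "height_ratio": 1.0, "tilt_deg": 0,
        "supported": False, "background_occludes_it": False}
print(violations(sofa, "kitchen"))  # ['probability', 'support']
```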

Our accumulated experience with the growing number of scenes we encounter allows us to “prime” for object recognition. This sort of priming can be seen as “predictive coding” [30], and it involves a matching process between our top-down predictions and bottom-up feature detection along the cortical visual systems [30,64,77,35,82]. One theory of why priming facilitates faster object recognition is that these “predictions” help to direct attention [75]; but beyond attention, which tells us where to expect something, the predictions also often contain “gist” information about what to expect in that location. Therefore, a prediction is typically a combination of space and physical features that together facilitate our perception and action.
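A minimal predictive-coding sketch in the spirit of Rao & Ballard [77] (our simplification; the linear generative model, learning rate, and step count are assumptions): a higher level maintains a hypothesis, sends down a prediction, and updates the hypothesis from the bottom-up prediction error until the two streams agree.

```python
import numpy as np

def predictive_coding(x, W, steps=50, lr=0.1):
    """Minimal predictive-coding loop (simplified after Rao & Ballard).
    x: observed bottom-up input. W: generative weights mapping the
    higher-level hypothesis r to a predicted input W @ r.
    Gradient descent on the squared prediction error refines r."""
    r = np.zeros(W.shape[1])           # initial top-down hypothesis
    error = x.copy()
    for _ in range(steps):
        prediction = W @ r             # top-down prediction of the input
        error = x - prediction         # bottom-up prediction error
        r += lr * (W.T @ error)        # error updates the hypothesis
    return r, error

# Toy usage: 2 hidden causes generate a 4-dimensional input.
W = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.2, 0.8]])
x = W @ np.array([2.0, -1.0])          # input actually caused by (2, -1)
r, residual = predictive_coding(x, W)
print(np.round(r, 2), np.round(np.abs(residual).max(), 4))
# r converges near (2, -1) and the residual error shrinks toward zero.
```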

4 Rare Events and “Surprises”

A key concept when thinking about how we detect rare and incongruent events in our environment is to realize that by generating a prediction about what might be, the brain also implicitly predicts what might not be. A prediction activates certain expectations, and by doing so inhibits, actively or not, alternatives that are unlikely to be relevant. As discussed, we typically have an underlying expectation of what we will encounter, and these expectations work synergistically with feedback and feed-forward systems to identify objects within our environment. It is a process that relies on memory, predictions, and collaborations among a number of different structures. But what happens when our predictions are incongruent with the visual input? We live in a complex world, and in order to flourish in our environment, we must possess the ability to adjust to a variety of situations flexibly, and to react to them appropriately. In addition to enabling us to respond flexibly when our expectations have been violated to some degree, prediction error provides one of the prime sources for learning [87,24]. The systems we have mentioned have a role in how this discrepancy is detected and learned from. We are particularly interested here in the detection of rare events within the context of prediction error, and in how our neural systems adjust to process the dissonance encountered within our environment.

There are two mechanisms that modulate how rare events are detected and processed: attention and expectation. Attention prioritizes how stimuli are processed based on personal relevance, and expectations constrain visual interpretation based on prior likelihood [83]. We typically don’t attend strongly to mundane objects within our environment, unless they are of some importance to us at that moment, though we may still perceive them on some lower level. Only when there is something out of the ordinary do we take closer notice. Our brains allow us to ignore, to some degree, the objects that are predictable, and activate visual object recognition regions more strongly when an object falls outside our pool of predicted objects. This creates a gradient of neural responses corresponding to how different the input material is from the original expectation, as well as to how emotionally salient the information is (see Fig. 3). The frequency with which we encounter the same situation aids in developing an internal statistical probability of the most likely outcome, and these regularities continuously construct causal models of the world [43]. When present events violate our predictions, the process of object recognition adjusts to reconcile the differences between what was expected and what is presented.
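One simple way to cast this graded response computationally (our toy sketch; the Shannon-surprise formulation, smoothing, and counts are assumptions, not the chapter’s model) is to track how often each object occurs in a context and treat the response to a new object as its surprise, -log P, so that rarer objects yield larger responses.

```python
import math
from collections import Counter

class ContextModel:
    """Toy frequency model of a context; surprise = -log P(object)."""
    def __init__(self):
        self.counts = Counter()
        self.total = 0

    def observe(self, obj):
        self.counts[obj] += 1
        self.total += 1

    def surprise(self, obj):
        # Laplace smoothing so never-seen objects get a finite probability.
        p = (self.counts[obj] + 1) / (self.total + len(self.counts) + 1)
        return -math.log(p)

street = ContextModel()
for obj in ["car"] * 50 + ["mailbox"] * 10 + ["dog"] * 5:
    street.observe(obj)

# Frequent objects evoke a small response; a novel one evokes a large one.
for obj in ["car", "dog", "ninja"]:
    print(obj, round(street.surprise(obj), 2))
```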

The information that is congruent with our expectations is processed differently than that of a rare event. By definition, learning occurs when we are confronted with novel information and have to make sense of it; the identification of some novelty means that a rare event has occurred. We hold a general expectation of how our environment should be, and rare events are identified when top-down predictions are mismatched with the incoming bottom-up information in a meaningful manner. This inconsistency is then noted, and forwarded to memory to avoid prediction error in the future [35,30], which is the basis of how we learn. At the moment that we are presented with a rare event, the stimulus’ discordant properties within the scene alert us to the conflict. This rare event is realized when an object’s individual characteristics conflict with our own expectations of the world. As information travels up the visual processing hierarchy, each lower-level processing center interprets incoming data, a process that is presumably guided by and compared with the top-down predictions. When an “error” in prediction has been perceived, the higher-level systems generate a new prediction in accordance with the lower-level information, and process the object’s features further until these two cognitive-perceptual streams come to an agreement [77]. There are a number of other factors that further influence this circuitry, including the degree of deviation from our expectation, the personal relevance of the deviation, as well as whether the situation poses a threat to our safety.

Fig. 3. The degree of neural activity associated with the process of object recognition is dependent upon how much the rare event differs from our expectations, as well as how much the mismatch between the event and our expectations is personally relevant. This effect is most pronounced in the OFC, where top-down predictions are thought to originate in the process of object recognition.

Emotions play a large part in perception as well. Every perceivable object and event we encounter elicits either a positive, negative, or neutral reaction, whether it be a fleeting thought or a small physiological reaction. This discernment is made by comparing the present moment to how a similar situation has affected us in the past. The OFC integrates our external perceptions with our internal bodily states to create a contextually specific representation of the situation and its personal relevance [7,9,18,20,62,38,11]. Current research has shown that emotionally salient stimuli initiate activity in the limbic and paralimbic systems [86]. There is strong activity in the amygdala, which has traditionally been seen as the “fear center,” when we are presented with a rare event that has the potential for personal harm (e.g. a ninja perched upon the counter in your periphery). Sometimes the presence of the rare event itself provides enough uncertainty, and therefore potential harm, to engage the amygdala.

5 Conclusions

Visual object recognition is typically a result of interplay between perception and cognition. Science has only begun to uncover the mechanisms by which we perceive and comprehend our world. Our brain is proactive in generating predictions about the relevant future, to better prepare us for what is to come next. We learn when we encounter rare and novel events, and store this novelty for future reference to be better prepared for a similar event in the future. Rare events themselves are approached and processed differently based upon our current state of mind and needs. In one moment, we are actively concerned with a particular outcome, and therefore attend strongly, and react strongly, to the resulting situation. In the next moment, the same outcome is no longer personally relevant, and that particular rarity goes unnoticed. Our sensory brain is an ever-changing entity, constantly acquiring new knowledge, computing statistical probabilities about our environment, and adapting to a changing environment. Our main purpose is survival, and we are therefore hard-wired to accurately and efficiently detect and reconcile events that violate our expectations. Our neural circuitry enables us to thrive in our respective environments, no matter what rare events we may encounter.


References

1. Bar, M. (2003). "A cortical mechanism for triggering top-down facilitation in visual object recognition." Journal of Cognitive Neuroscience 15: 600-609.
2. Bar, M. (2004). "Visual objects in context." Nature Reviews Neuroscience 5(8): 617-629.
3. Bar, M. (2007). "The proactive brain: Using analogies and associations to generate predictions." Trends in Cognitive Sciences 11(7): 280-289.
4. Bar, M., Ed. (2009). Predictions: A universal principle in the operation of the human brain (Introduction). Theme issue: Predictions in the brain: using our past to generate a future. Philosophical Transactions of the Royal Society B: Biological Sciences.
5. Bar, M. and E. Aminoff (2003). "Cortical analysis of visual context." Neuron 38(2): 347-358.
6. Bar, M., K. S. Kassam, et al. (2006). "Top-down facilitation of visual recognition." Proc Natl Acad Sci U S A 103(2): 449-454.
7. Barbas, H. (1993). "Organization of cortical afferent input to orbitofrontal areas in the rhesus monkey." Neuroscience 56(4): 841-864.
8. Barbas, H. (1995). "Anatomic basis of cognitive-emotional interactions in the primate prefrontal cortex." Neuroscience and Biobehavioral Reviews 19(3): 499-510.
9. Barbas, H. (2000). "Connections underlying the synthesis of cognition, memory, and emotion in primate prefrontal cortices." Brain Research Bulletin 52(5): 319-330.
10. Barbey, A. K., F. Krueger, et al. (2009). "Structured event complexes in the medial prefrontal cortex support counterfactual representations for future planning." Philos Trans R Soc Lond B Biol Sci 364(1521): 1291-1300.
11. Barrett, L. F. and M. Bar (2009). "See it with feeling: affective predictions during object perception." Philos Trans R Soc Lond B Biol Sci 364(1521): 1325-1334.
12. Barsalou, L. W. (2009). "Simulation, situated conceptualization, and prediction." Philosophical Transactions of the Royal Society B: Biological Sciences Special Theme Issue.
13. Biederman, I., R. J. Mezzanotte, et al. (1982). "Scene perception: detecting and judging objects undergoing relational violations." Cognit Psychol 14(2): 143-177.
14. Buckner, R., J. Goodman, et al. (1998). "Functional-anatomic correlates of object priming in humans revealed by rapid presentation event-related fMRI." Neuron 20: 285-296.
15. Bullier, J. (2001). "Integrated model of visual processing." Brain Research Reviews 36: 96-107.
16. Bullier, J. and L. G. Nowak (1995). "Parallel versus serial processing: New vistas on the distributed organization of the visual system." Current Opinion in Neurobiology 5: 497-503.
17. Cardin, V., K. J. Friston, et al. (2011). "Top-down modulations in the visual form pathway revealed with dynamic causal modeling." Cereb Cortex 21(3): 550-562.
18. Carmichael, S. T. and J. L. Price (1995). "Limbic connections of the orbital and medial prefrontal cortex in macaque monkeys." Journal of Comparative Neurology 363(4): 615-641.
19. Carpenter, G. A. and S. Grossberg (1987). Neural dynamics of category learning and recognition: Attention, memory consolidation, and amnesia. In J. Davis, R. Newburgh and E. Wegman (eds.), Brain Structure, Learning, and Memory. Hillsdale, NJ, Erlbaum: 233-290.
20. Cavada, C., T. Company, et al. (2000). "The anatomical connections of the macaque monkey orbitofrontal cortex. A review." Cerebral Cortex 10(3): 220-242.
21. Chun, M. M. (2000). "Contextual cueing of visual attention." Trends Cogn Sci 4(5): 170-178.
22. Cutler, B. L. and S. D. Penrod (1988). Context reinstatement and eyewitness identification. In G. M. Davies and D. M. Thomson (eds.), Memory in context: Context in memory. John Wiley & Sons Ltd.
23. Dale, A., A. K. Liu, et al. (2000). "Dynamic statistical parametric mapping: Combining fMRI and MEG for high-resolution imaging of cortical activity." Neuron 26(1): 55-67.


24. den Ouden, H. E., K. J. Friston, et al. (2009). "A dual role for prediction error in associative learning." Cereb Cortex 19(5): 1175-1185.
25. DeValois, R. L. and K. K. DeValois (1988). Spatial vision. New York, Oxford Science Publications.
26. Dhond, R. P., R. L. Buckner, et al. (2001). "Spatiotemporal maps of activity underlying word generation and their modification during repetition priming." J Neurosci 21: 3564-3571.
27. Dobbins, I. G., D. M. Schnyer, et al. (2004). "Cortical activity reductions during repetition priming can result from rapid response learning." Nature 428(6980): 316-319.
28. Downing, P. E., Y. Jiang, et al. (2001). "A cortical area selective for visual processing of the human body." Science 293(5539): 2470-2473.
29. Dudai, Y. (1989). The neurobiology of memory. Oxford, England, Oxford University Press.
30. Egner, T., J. M. Monti, et al. (2010). "Expectation and surprise determine neural population responses in the ventral visual stream." J Neurosci 30(49): 16601-16608.
31. Engel, A. K., P. Fries, et al. (2001). "Dynamic predictions: oscillations and synchrony in top-down processing." Nature Reviews Neuroscience 2(10): 704-716.
32. Esterman, M. and S. Yantis (2010). "Perceptual expectation evokes category-selective cortical activity." Cereb Cortex 20(5): 1245-1253.
33. Felleman, D. J. and D. C. Van Essen (1991). "Distributed hierarchical processing in primate visual cortex." Cerebral Cortex 1: 1-47.
34. Fenske, M. J., E. Aminoff, et al. (2006). "Top-down facilitation of visual object recognition: Object-based and context-based contributions." Progress in Brain Research 155: 3-21.
35. Friston, K. (2005). "A theory of cortical responses." Philos Trans R Soc Lond B Biol Sci 360(1456): 815-836.
36. Gabrieli, J. D., R. A. Poldrack, et al. (1998). "The role of left prefrontal cortex in language and memory." Proceedings of the National Academy of Sciences, USA 95(3): 906-913.
37. Horton, J. C. and L. C. Sincich (2004). "A new foundation for the visual cortical hierarchy." In M. Gazzaniga (ed.), The Cognitive Neurosciences, 3rd edition, Chapter 17, pp. 233-243.
38. Ghashghaei, H. and H. Barbas (2002). "Pathways for emotion: interactions of prefrontal and anterior temporal pathways in the amygdala of the rhesus monkey." Neuroscience 115: 1261-1279.
39. Ghuman, A. S., M. Bar, I. G. Dobbins and D. M. Schnyer (2008). "The effects of priming on frontal-temporal communication." Proceedings of the National Academy of Sciences 105(24): 8405-8409.
40. Goldman-Rakic, P. S. and L. J. Porrino (1985). "The primate mediodorsal (MD) nucleus and its projection to the frontal lobe." Journal of Comparative Neurology 242(4): 535-560.
41. Grill-Spector, K., Z. Kourtzi, et al. (2001). "The lateral occipital complex and its role in object recognition." Vision Research 41(10-11): 1409-1422.
42. Grossberg, S. (1980). "How does a brain build a cognitive code?" Psychological Review 87(1): 1-51.
43. den Ouden, H. E. M., K. J. Friston, et al. (2009). "A dual role for prediction error in associative learning." Cereb Cortex 19(5): 1175-1185. doi:10.1093/cercor/bhn161.
44. Hawkins, J., D. George, et al. (2009). "Sequence memory for prediction, inference and behaviour." Philos Trans R Soc Lond B Biol Sci 364(1521): 1203-1209.
45. Haxby, J. V., M. I. Gobbini, et al. (2001). "Distributed and overlapping representations of faces and objects in ventral temporal cortex." Science 293(5539): 2425-2430.
46. Hebb, D. O. (1949). The organization of behavior. New York, Wiley.
47. Henson, R. N. (2003). "Neuroimaging studies of priming." Progress in Neurobiology 70(1): 53-81.
48. Hirschfeld, G. and P. Zwitserlood (2011). "How vision is shaped by language comprehension--top-down feedback based on low-spatial frequencies." Brain Res 1377: 78-83.


49. Hubel, D. H. and T. N. Wiesel (1977). "Ferrier lecture. Functional architecture of macaque monkey visual cortex." Proc R Soc Lond B Biol Sci 198(1130): 1-59.
50. Hughes, H. C., G. Nozawa, et al. (1996). "Global precedence, spatial frequency channels, and the statistics of natural images." Journal of Cognitive Neuroscience 8(3): 197-230.
51. Intraub, H. (1997). "The representation of visual scenes." Trends Cogn Sci 1(6): 217-222.
52. Kandel, E. R., J. H. Schwartz, et al. (1991). Principles of neural science. New York, Elsevier.
53. Kersten, D., P. Mamassian, et al. (2004). "Object perception as Bayesian inference." Annu Rev Psychol 55: 271-304.
54. Kosslyn, S. M. (1994). Image and Brain. Cambridge, MA, MIT Press.
55. Kveraga, K., A. S. Ghuman, et al. (2007). "Top-down predictions in the cognitive brain." Brain and Cognition 65: 145-168.
56. Lamme, V. A. F. and P. R. Roelfsema (2000). "The distinct modes of vision offered by feedforward and recurrent processing." Trends in Neuroscience 23: 571-579.
57. LeDoux, J. E. (1996). The emotional brain. New York, Simon & Schuster.
58. Logothetis, N. K. and D. L. Sheinberg (1996). "Visual object recognition." Annu Rev Neurosci 19: 577-621.
59. Malach, R., I. Levy, et al. (2002). "The topography of high-order human object areas." Trends Cogn Sci 6(4): 176-184.
60. Malcolm, G. L. and J. M. Henderson (2010). "Combining top-down processes to guide eye movements during real-world scene search." J Vis 10(2): 4.1-11.
61. Marinkovic, K., R. P. Dhond, et al. (2003). "Spatiotemporal dynamics of modality-specific and supramodal word processing." Neuron 38(3): 487-497.
62. Mesulam, M. (2000). "Behavioral neuroanatomy: large-scale networks, association cortex, frontal syndromes, the limbic system, and hemispheric specializations." In M. Mesulam (ed.), Principles of behavioral and cognitive neurology, 2nd edn, pp. 1-120. New York, NY, Oxford University Press.
63. Moulton, S. T. and S. M. Kosslyn (2009). "Imagining predictions: Mental imagery as mental simulation." Philosophical Transactions of the Royal Society B: Biological Sciences Special Theme Issue.
64. Mumford, D. (1992). "On the computational architecture of the neocortex. II. The role of cortico-cortical loops." Biol Cybern 66(3): 241-251.
65. Nakamura, H., R. Gattass, et al. (1993). "The modular organization of projections from areas V1 and V2 to areas V4 and TEO in macaques." Journal of Neuroscience 13(9): 3681-3691.
66. Navon, D. (1977). "Forest before trees: The precedence of global features in visual perception." Cognitive Psychology 9: 1-32.
67. Neisser, U. (1967). Cognitive psychology. New York, Appleton-Century-Crofts.
68. Oliva, A. and A. Torralba (2001). "Modeling the shape of the scene: A holistic representation of the spatial envelope." International Journal of Computer Vision 42(3): 145-175.
69. Palmer, S. E. (1975). "The effects of contextual scenes on the identification of objects." Memory and Cognition 3: 519-526.
70. Perlovsky, L. I. and R. Ilin (2010). "Grounded symbols in the brain, computational foundations for perceptual symbol system." WebmedCentral PSYCHOLOGY 1(12): WMC001357.
71. Perlovsky, L. I. (2006). "Toward physics of the mind: Concepts, emotions, consciousness, and symbols." Phys Life Rev 3(1): 22-55.
72. Perrett, D. I., J. K. Hietanen, et al. (1992). "Organization and functions of cells responsive to faces in the temporal cortex." Philosophical Transactions of the Royal Society of London, B 335: 23-30.
73. Peyrin, C., C. M. Michel, et al. (2010). "The neural substrates and timing of top-down processes during coarse-to-fine categorization of visual scenes: a combined fMRI and ERP study." J Cogn Neurosci 22(12): 2768-2780.


74. Porrino, L. J., A. M. Crane, et al. (1981). "Direct and indirect pathways from the amygdala to the frontal lobe in rhesus monkeys." Journal of Comparative Neurology 198(1): 121-136.
75. Posner, M. I., C. R. Snyder, et al. (1980). "Attention and the detection of signals." J Exp Psychol 109(2): 160-174.
76. Puce, A., T. Allison, et al. (1996). "Differential sensitivity of human visual cortex to faces, letterstrings, and textures: a functional magnetic resonance imaging study." Journal of Neuroscience 16(16): 5205-5215.
77. Rao, R. P. and D. H. Ballard (1999). "Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects." Nature Neuroscience 2(1): 79-87.
78. Rempel-Clower, N. L. and H. Barbas (2000). "The laminar pattern of connections between prefrontal and anterior temporal cortices in the rhesus monkey is related to cortical structure and function." Cerebral Cortex 10(9): 851-865.
79. Russchen, F. T., D. G. Amaral, et al. (1987). "The afferent input to the magnocellular division of the mediodorsal thalamic nucleus in the monkey, Macaca fascicularis." Journal of Comparative Neurology 256(2): 175-210.
80. Schnyer, D. M., I. G. Dobbins, et al. (2005). Rapid decision learning alters the repetition N400 components in left frontal and temporal regions: Evidence from MEG recordings during repetition priming. Society for Neuroscience, Washington, D.C.
81. Schyns, P. G. and A. Oliva (1994). "From blobs to boundary edges: Evidence for time- and spatial-scale-dependent scene recognition." Psychological Science 5(4): 195-200.
82. Spratling, M. W. (2008). "Predictive coding as a model of biased competition in visual attention." Vision Res 48(12): 1391-1408.
83. Summerfield, C. and T. Egner (2009). "Expectation (and attention) in visual cognition." Trends Cogn Sci 13(9): 403-409.
84. Tanaka, K. (1993). "Neuronal mechanisms of object recognition." Science 262: 685-688.
85. Tanaka, K. (1996). "Representation of visual features of objects in the inferotemporal cortex." Neural Netw 9(8): 1459-1475.
86. Taylor, S. F., K. L. Phan, et al. (2003). "Subjective rating of emotionally salient stimuli modulates neural activity." Neuroimage 18(3): 650-659.
87. Tobler, P. N., J. P. O'Doherty, et al. (2006). "Human neural learning depends on reward prediction errors in the blocking paradigm." J Neurophysiol 95(1): 301-310.
88. Torralba, A. (2003). "Contextual priming for object detection." International Journal of Computer Vision 53(2): 153-167.
89. Tovee, M. J., E. T. Rolls, et al. (1994). "Translation invariance in the responses to faces of single neurons in the temporal visual cortical areas of the alert macaque." J Neurophysiol 72(3): 1049-1060.
90. Ullman, S. (1995). "Sequence seeking and counter streams: A computational model for bidirectional information flow in the visual cortex." Cerebral Cortex 1: 1-11.
91. Wig, G. S., S. T. Grafton, et al. (2005). "Reductions in neural activity underlie behavioral components of repetition priming." Nat Neurosci 8(9): 1228-1233.
92. Zago, L., M. J. Fenske, et al. (2005). "The rise and fall of priming: How visual exposure shapes cortical representations of objects." Cereb Cortex 15: 1655-1665.

