
Attention, Memory, and the “Noticing” Hypothesis

Date post: 02-Oct-2016
Language Learning 45:2, June 1995, pp. 283-331

Review Article

Peter Robinson
University of Queensland

Schmidt (1990) claimed that consciousness, in the sense of awareness of the form of input at the level of “noticing”, is necessary to subsequent second language acquisition (SLA). This claim runs counter to Krashen’s (1981) dual-system hypothesis that SLA largely results from an unconscious “acquisition” system, the contribution of the conscious “learning” system to SLA being limited and peripheral. Important to a theory of SLA that allows a central role to the act of noticing is a specification of the nature of the attentional mechanisms involved, and of their relationship to current models of the organization of memory. With this in mind the present paper reviews current research into the nature of attention and memory and proposes a model of the relationship between them during SLA that, it is argued, is complementary to Schmidt’s noticing hypothesis and oppositional to the dual-system hypothesis of Krashen. In light of this model, I argue that differential performance on implicit and explicit learning and memory experiments is caused by differences in the consciously regulated processing demands of training tasks and not by the activation of consciously and unconsciously accessed systems. I also argue that the attentional demands of pedagogical tasks and individual differences in memory and attentional capacity both affect the extent of noticing, thereby directly influencing SLA.

Correspondence concerning this article should be addressed to the author at University of Queensland, Centre for Language Teaching and Research, Brisbane, QLD 4072 Australia. Internet: peBrrBlingua.cltr.uq.oz.au

Schmidt (1990, 1993a, 1993b, 1994a; Schmidt & Frota, 1986) has recently proposed that noticing, or conscious attention to the form of input, is necessary to subsequent second language (L2) development. A number of other researchers have also claimed an important role for “consciousness raising” activities and a role for “focus on form” in promoting L2 development (e.g., Fotos & Ellis, 1991; Long, 1988, 1991; Rutherford, 1987; Sharwood Smith, 1991, 1993). These proposals run counter to claims that second language acquisition (SLA) is a largely subconscious process in which conscious learning serves merely to monitor or edit a nonconsciously acquired knowledge base (Krashen, 1981, 1982, 1985), and that separate consciously and nonconsciously accessed systems of memory are differentially responsible for L2 learning processes (Paradis, 1994). Both positions have influenced L2 pedagogy. The task-based proposals of, for example, Nunan (1989), Long and Crookes (1992), and R. Ellis (1993) stress the importance of noticing and attention to form, though these proposals differ with respect to methodological questions regarding how noticing should be facilitated and content questions regarding which aspects of language are important for learners to notice. On the other hand, the attribution of SLA to unconscious processes has led to the development of “The Natural Approach” (Krashen & Terrell, 1983), and has influenced the thinking underlying the Bangalore Procedural Syllabus (Prabhu, 1987, pp. 69-70). The issue of the role of conscious awareness in language learning is thus one where “deep divisions exist among language theorists” (Carr & Curran, 1994, p. 215), and which is of fundamental pedagogic concern (Schmidt, in press).

This paper aims to review research about two key cognitive mechanisms implicated in proposals for SLA based on the noticing hypothesis (attention and memory) and to present a model of the relationship between them, which I take to be complementary to the noticing hypothesis. In the process I also identify other models that are oppositional (Beretta, 1991) in the sense that they could be taken to provide theoretical support for Krashen’s dual-system hypothesis. By considering positions on the nature of attention and memory that support or contradict the noticing hypothesis, I hope to relate the debate about the role of consciousness in SLA to relevant empirical work and operational constructs in cognitive psychology, and thereby to provide an extended theoretical basis for the noticing hypothesis.

In the first section of this paper, I review the role of noticing in filter theories and capacity theories of attention and relate noticing to three functional classifications of attentional resources: alerting, orienting, and detecting. This review elaborates on a recent discussion of the role of attention during SLA (Tomlin & Villa, 1994), relating the concepts discussed to the noticing hypothesis and to a complementary model of short-term memory (Cowan, 1993). Although some discussion in the SLA literature has begun to draw on cognitive models of attention in theorizing noticing during L2 development, there has been little discussion of the role of memory in noticing. With this in mind, in the paper’s second section I review various type classifications of memory (episodic versus semantic, procedural versus declarative, short-term versus long-term), considering whether these are indeed different systems, or simply functional distinctions, and relating them to the discussion of attention.

Resources, Information Processing, and Consciousness

Schmidt’s (1990, 1993a, 1993b, 1994a; Schmidt & Frota, 1986) claim that awareness at the level of noticing is necessary for converting L2 input to intake invokes, but does not explain in detail, attentional mechanisms and their relationship to encoding and retrieval from the various subsystems of memory. Neither are attentional mechanisms and information-processing relationships explained in detail in other proposals which, following Schmidt, have attributed an important role to noticing in L2 pedagogy (e.g., R. Ellis, 1993; Fotos, 1993; Fotos & Ellis, 1991; Long, 1991; Zalewski, 1993). Although there have been attempts to describe the relationship between input and intake at the neurophysiological level (Sato & Jacobs, 1992), there has been no attempt at a systems-level characterization of why attention is allocated to input under certain task conditions and not others. Attentional mechanisms, in Schmidt’s view, are causal factors in L2 learning, because they are responsible for allocating the cognitive resources that lead to noticing, and subsequent encoding in memory. There are, though, competing characterizations of the role of attention during information processing; one recent explanation of implicit learning even claimed that the learning effects observed during a sequence learning task are due to a “nonattentional” form of processing (Curran & Keele, 1993).

Part of the difficulty in motivating an explanation of such mechanisms using current cognitive theory lies in the fact, as Baddeley (1986, p. 225) noted, that the study of attention has been dominated by theories of the role of attention in perception and visual processes, particularly signal detection and pattern recognition. The role of attention in the control of memory and action, arguably areas of greater potential application to SLA processes, has been less well studied until recently. Recent work on the memory/attention interface has progressed considerably beyond the early multistore model of memory and attentional control of Atkinson and Shiffrin (1968), rejecting many of its fundamental assumptions (McClelland & Rumelhart, 1985). However, as Cowan (1988) noted, cognitive psychology has yet to settle on an accepted view of the mutual constraints imposed by memory and attention during information processing. Two established frameworks for describing skill development and performance, Shiffrin and Schneider’s (1977) theory of automaticity and Anderson’s (1983) ACT* theory of skill acquisition, have been heavily cited in the SLA literature (e.g., R. Ellis, 1993; Færch & Kasper, 1984; Hulstijn, 1990; Kohonen, 1992; McLaughlin, 1987; O’Malley & Chamot, 1990; Robinson, 1989; Schmidt, 1992). However, more recent work in the study of action (Holding, 1989; Navon, 1984; Navon & Gopher, 1980; Schneider & Detweiler, 1988; Wickens, 1980, 1984, 1989), which describes the role of attentional processes in skilled performance using the dual-task paradigm for examining attentional allocation, has rarely been invoked by SLA researchers, despite its potential relevance to such current issues as task complexity and grading in task-based approaches to L2 syllabus design (Long & Crookes, 1992; Nunan, 1988, 1989; Robinson, 1995a, to appear).

Part of the difficulty in motivating a theory of attentional mechanisms in SLA by drawing upon an accepted body of relevant findings from cognitive psychology research lies also in challenges to traditional information-processing accounts of attention posed by more recent connectionist accounts. Information-processing models, such as Broadbent’s (1958), view attention as an executive process directing the serial passage of information between separate short-term and long-term memory stores. In contrast, connectionist accounts dispute the modular metaphor for cognitive architecture that the information-processing views are based on, as well as the assumption of seriality, arguing that executive attentional control is distributed throughout the entire processing system, in local patterns of neuronal excitation and inhibition, rather than in a central executive processor. Recent attempts to reconcile connectionist and control architectures in the study of attention (e.g., Schneider, 1993; Schneider & Detweiler, 1988) are not yet widely accepted. In the following section I survey theories of attention and the functions of attention, and propose a model of the relationship between attention, awareness, and detection that is compatible with Schmidt’s (1990, 1993a, 1993b, 1994a; Schmidt & Frota, 1986) noticing hypothesis.

Attention

The concept of attention has three uses. It can be used to describe the processes involved in “selecting” the information to be processed and stored in memory. For example, researchers have used dichotic listening tasks to examine the fact that attention has a variable focus and can select information to be processed to the exclusion of other information (Cherry, 1953; Moray, 1959). It can be used to describe our “capacity” for processing information. Studies of divided attention show that attention is capacity-limited and that decrements in performance increase as the number of task dimensions, or components to be processed, increase (Taylor, Lindsay, & Forbes, 1967). Finally, it can be used to describe the mental “effort” involved in processing information. Pupillary dilation, for example, can be measured as a physical index of the degree of mental effort required in attending to increasingly complex tasks (Kahneman, 1973). Each of these uses has influenced the development of theories of attention.

Filter Theories of Attention

Early filter theories of attention were based on pipeline models of information processing, in which information is conveyed in a fixed serial order from one storage structure to the next. In Broadbent’s (1958) “bottleneck” model, voluntary control of information processing is exercised by a selective attention mechanism or filter that selects information from a sensory register and relays it to a detection device. Once past the selective filter, information is analyzed for meaning rather than for physical properties alone, enters awareness, and is encoded in short-term memory (Fig. 1). Such a model, based largely on acoustic processing, suggests that selective attention operates on the form of the message first and cannot simultaneously be directed at form and meaning. If a mechanism of selective attention at the stage of detection is responsible for noticing, then this model does not fit well with recent claims that manipulations in instructional treatments can encourage noticing via a focus on form, meaning, or a combination of both (Hulstijn, 1989; VanPatten, 1990) because all noticing would initially be form-focused. However, there are arguments against identifying noticing with the process of detection (Fig. 4 and the discussion below) and subsequent developments in filter theory disputed Broadbent’s claims about the nature of detection. Treisman (1964), using evidence that participants noticed their own name when repeated in an unshadowed ear during dichotic listening tasks, argued for a filter mechanism that was sensitive to semantic information as well as sensory information, and argued against the proposal that all information in an unselected channel is completely tuned out, and unavailable for detection.

Figure 1. Selection in three models of attention and sensory processing. [Diagram not reproduced. Recoverable panel labels: Broadbent (sensory register, selective filter, detection device, short-term memory); Treisman (sensory register, attenuation control, detection device, short-term memory).] Note: From Cognitive Psychology (p. 43), by J. B. Best, 1992, New York: West Publishing. Copyright 1992 by Psychonomic Society Publications. Reprinted with permission.

Arguments against Treisman's (1964) attenuated filter model are that the preattentive processing, or analysis before detection, it requires is too complete and resource-demanding. Late selection theories (Norman, 1968; Watanabe, 1980) have proposed that all information is processed in parallel and enters working memory, where a decision is made about its importance. Information judged important is elaborated or rehearsed; that judged unimportant is forgotten.

Capacity Theories of Attention

Underlying filter theories of attention and their associated mechanisms of selective attention is the metaphor of a limited-capacity channel, in which information competes for limited attentional resources available to the passive processor. More recent theories emphasize the voluntariness of the participant’s control of attentional resources and the task-specificity of decisions about attentional allocation. The metaphor most suited to these theories is that of attention as a spotlight, with a variable focus, which can be narrowed and intensified, or broadened and dissipated, as task conditions demand. Kahneman’s (1973) model allocated resources to incoming stimuli from a pool of cognitive resources that varies as a function of the participant’s state of arousal. Allocation is divided between enduring predispositions (e.g., to recognize one’s own name) and momentary intentions (e.g., to eavesdrop). Divided attention does not necessarily lead to decrements in performance, given sufficient arousal and given that the demands of the tasks performed concurrently are not excessive (Fig. 2). In this respect, capacity theories such as Kahneman’s differ from filter theories, which characterize incoming stimuli as inevitably involved in competition for limited resources.

Figure 2. Kahneman’s Capacity Model of Attention. [Diagram not reproduced. Recoverable labels: arousal; manifestations of arousal.] Note: From Attention and Effort (p. 10), by D. Kahneman, 1973, Englewood Cliffs, NJ: Prentice-Hall. Copyright 1973 by Prentice-Hall, Inc. Reprinted with permission.

Wickens (1980, 1984, 1989) has recently expanded Kahneman’s (1973) view of attentional resource allocation, arguing that rather than a single pool of resources there are multiple pools. These pools occupy different points on three intersecting dimensions of resource systems: (a) the dimension representing perceptual/cognitive activities versus response processes; (b) the dimension representing processing codes required by analog/spatial activities versus verbal/linguistic activities; and (c) the dimension representing processing modalities, that is, auditory versus visual perception and vocal versus manual responses (Fig. 3). Wickens (1989) argued that the attentional demands of tasks, and so their relative difficulty, will be increased when concurrently performed tasks draw simultaneously on the same pool of resources. In the worst cases, like maintaining two separate conversations, interference effects may make time-sharing impossible and the person adopts the attentional mechanism of “serial processing”: separate task components are completed in succession. In other cases, where there is less “global similarity” between the tasks, and so less resource competition, as in driving a car while talking, the person may adopt the mechanism of parallel processing. However, although this may avoid the need for time-swapping, degradations in the quality of the attention allocated to both activities may lead to poor performance. When tasks draw on completely different pools of resources, or when one of the tasks is automatized, then successful time-sharing and dual-task performance are possible. In these circumstances “parallel processing” is always applied. However, Wickens noted, individuals may differ both in their time-sharing ability and in their store of available resources. Therefore, individual differences as well as task characteristics may determine which of these two mechanisms of attentional allocation the person adopts.

Figure 3. Wickens’ model of the structure of multiple resources. [Diagram not reproduced. Recoverable labels: stages (encoding, central processing, responding).] Note: From Human Skills (p. 82), edited by D. H. Holding, 1989, New York: John Wiley. Copyright 1989 by Academic Press. Reprinted with permission.
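
The time-sharing predictions attributed to Wickens above can be made concrete with a small sketch. It is an illustration only: the `TaskDemand` coding scheme, the overlap counts, and the rule mapping overlap to a mechanism are assumptions introduced here, not part of Wickens' model, and the sketch ignores the individual differences in time-sharing ability that the text mentions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskDemand:
    """Hypothetical coding of a task on the three dimensions in the text."""
    name: str
    stage: str      # "perceptual/cognitive" or "response"
    code: str       # "spatial" or "verbal"
    modality: str   # "visual", "auditory", "manual", or "vocal"
    automatized: bool = False


def shared_dimensions(a: TaskDemand, b: TaskDemand) -> int:
    """Count the dimensions on which two tasks draw on the same pool."""
    return sum([a.stage == b.stage, a.code == b.code, a.modality == b.modality])


def predicted_mechanism(a: TaskDemand, b: TaskDemand) -> str:
    """Map resource overlap to an allocation mechanism (illustrative rule)."""
    # An automatized task is assumed to leave resources free for the other.
    if a.automatized or b.automatized:
        return "parallel"
    overlap = shared_dimensions(a, b)
    if overlap == 3:
        # Identical pools: time-sharing fails, components run in succession.
        return "serial"
    if overlap == 0:
        # Completely different pools: successful time-sharing.
        return "parallel"
    # Partial overlap: parallel processing with possible loss of quality.
    return "parallel, possibly degraded"


conversation_1 = TaskDemand("conversation 1", "perceptual/cognitive", "verbal", "auditory")
conversation_2 = TaskDemand("conversation 2", "perceptual/cognitive", "verbal", "auditory")
driving = TaskDemand("driving", "perceptual/cognitive", "spatial", "visual")

print(predicted_mechanism(conversation_1, conversation_2))
print(predicted_mechanism(driving, conversation_1))
```

Under this coding, two simultaneous conversations share all three dimensions and are assigned serial processing, whereas driving while listening to a conversation shares only the processing stage and is assigned parallel processing with possible degradation.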

Noticing, Attentional Theory, and L2 Task Demands

Capacity theories, like those of Kahneman (1973) and Wickens (1980, 1984, 1989), are properly seen as extensions of the late selection filter theory. In addition to mechanisms of selective attention, they propose mechanisms for adjusting the deployment of attentional resources to suit the particular conditions of task demands. In this sense, more recent attentional theory provides a framework for relating the act of noticing to those L2 task conditions that facilitate it. Recent task-based approaches to L2 pedagogy present learners with communicative activities in which attention is directed to meaning and to the accomplishment of task goals requiring information exchange and joint problem solving. These activities contrast with form-focused approaches to pedagogy that explicitly direct learners’ attention to the formal properties of different types of input, such as formal grammatical rules, rehearsal of spoken dialogues, and so forth (Long & Crookes, 1992). Task-based approaches assume that noticing formal aspects of language will be determined jointly by the learners’ stage of development and the structure of the task (R. Ellis, 1993; Long, 1989). Recent research has explored the extent to which design features of tasks can manipulate learner attention and cause incidental noticing of certain formal features of task input that learners may overlook if exposed to the L2 in untutored conversational settings (Doughty, 1991; Hulstijn, 1989; Watanabe, 1992). One of the main claims for the advantages of task-based instruction, therefore, is that well-designed tasks can facilitate noticing of aspects of L2 syntax, vocabulary, and phonology that may lack perceptual and psychological saliency in untutored conversational settings and so may go unnoticed and unlearned (Long, 1991; Schmidt, 1990).

Wickens’ (1984, 1989) model of the structure of attentional resources (Fig. 3) suggested that noticing following detection is more likely to occur in dual-task performance that draws on distinct, not identical, resource pools, because in this case more attentional resources can be allocated to input processing. Examples of tasks drawing on distinct resource pools would be those involving object assembly (spatial visual encoding/responding), in which parts of the object to be assembled are labeled with words or phrases in the L2 (verbal visual encoding), as in the “describe and arrange” tasks using Lego blocks described by Ur (1988, p. 232), and discussed by Loschky and Bley-Vroman (1993, pp. 147-148). Tasks drawing on distinct resource pools could also involve object identification (spatial visual encoding) followed by description (verbal visual encoding), as in the one-way information gap “draw the picture” (Gass & Varonis, 1985) or “assemble the scene” (Pica, Doughty, & Young, 1987) tasks used in second language research (see Pica, Kanagy, & Falodun, 1993, pp. 25-27 for further examples). Wickens’ model also implies that noticing the form of the language input would be more likely in such labeled object assembly, or one-way picture description, tasks than in tasks drawing simultaneously on the visual verbal encoding resource pool, such as the L2 task described in Doughty (1991). The latter required learners to read for meaning, while simultaneously noticing the form of input made salient through highlighting (both drawing on the verbal visual encoding resource pool). Such distinctions between the attentional demands of tasks, made possible by Wickens’ model, are rarely examined by second language researchers, despite the important relationship between attention, resource allocation, noticing, and intake. Perhaps Wickens’ distinctions between the resource pools drawn on in task performance are still too coarse-grained to allow a successful analysis of the workload and attentional demands of second language tasks. Schneider and Detweiler (1988) argued that mechanisms are insufficiently specified in Wickens’ model; they proposed a five-phase model of skill acquisition that specifies the type of attention-switching that occurs at each level.

However, the dispute between cognitive theorists over the proper characterization of the mechanisms used to resolve competing task demands is arguably one that is outside the preserve of SLA theory construction per se, although one by which it must be informed and to which it must in turn contribute. Schmidt (1990, p. 143), for example, has argued that task demands are powerful determinants of what gets noticed. Others have speculated that certain task types may facilitate noticing of different amounts (Long, 1991; Robinson, Ting, & Urwin, in press) or aspects (Fotos & Ellis, 1991; Loschky & Bley-Vroman, 1993) of input. Second language acquisition theory needs a conceptual model for describing the extent to which changes in task demands can affect the difficulty of information processing for the L2 learner. Wickens’ model is sufficiently general that it can allow researchers to categorize some of the resource demands of both controlled laboratory tasks and more complex classroom tasks with a view to identifying differences in the extent to which such demands create the conditions for noticing of input.

Alertness, Orientation, and Detection

Although theories of attention have moved away from their early preoccupation with problems of signal detection and with the question of “where” selective attention occurs in relation to the information-processing sequence connecting the sensory register and short-term memory, there is still a need for a theory that identifies the mechanism whereby, and the point at which, selective attention occurs. Signal detection theory, acknowledging the parallelism of input processing, has moved from unidimensional to multidimensional formats for statistically modeling this process (Cohen & Massaro, 1992; Kadlec & Townsend, 1992). However, as Mandler (1992) pointed out, purely connectionist accounts of parallel input processing and representation are left without a mechanism for selectively directing attention to input, suggesting the need for “consciousness as an intervening limited serial process” (p. 54) functioning as a “gate between external information and internal representations” (p. 54).

Tomlin and Villa (1994), purposefully avoiding reference to consciousness, argued that a fine-grained analysis of the process of detection that leads to noticing is also necessary for SLA theory. They distinguished between the alerting, orienting, and detection functions of attention during the allocation of selective attention. Alertness for them concerns an individual’s “general readiness to deal with incoming stimuli or data” (p. 190). Warning signals, for example, can increase alertness or vigilance, leading to an increased likelihood of signal detection. Orientation concerns the allocation of resources based on expectations about the particular class of incoming sensory information, and involves activation of some higher level schema or plan of action and events. During detection, attention focuses on a specific bit of information. Detection demands more attentional resources and enables further processing of a stimulus at higher levels, such as storage and rehearsal in short-term memory. Tomlin and Villa claimed that alertness is of interest to SLA to the extent that learners must be ready to process information. Orientation is of interest to the extent that prior experience may predispose learners to attend, for example, to form or meaning in processing a stimulus. In this sense, orientation and the related processes of input facilitation and inhibition are important determiners of the extent to which information gets noticed during task performance.

Attention, Awareness, and Detection

Clearly, of the three functions of attention just described, detection is most similar to what Schmidt (1990) termed noticing. Detection is responsible for encoding in memory, and therefore is the attentional level at which Tomlin and Villa (1994, p. 192) claimed learning must begin. However, Tomlin and Villa pointed out that detection does not necessarily imply awareness. There can be detection without awareness (e.g., subliminally presented items can cause semantic priming; see Balota, 1983; Cowan, 1988, p. 174). It follows from Tomlin and Villa’s definition of detection and awareness, then, that there can be learning without awareness. This brings them into conflict with Schmidt, who has claimed conscious noticing is necessary for learning.

These different positions can be reconciled if the concept of noticing is defined to mean detection plus rehearsal in short-term memory, prior to encoding in long-term memory, a view associated with a number of late selection theories of attentional allocation (Cowan, 1988, 1993; Norman, 1968). Although some have suggested that the contents of awareness and the contents of short-term storage are identical (Stern, 1985), the position illustrated in Figure 4 is that activation in short-term memory must exceed a certain threshold before it becomes part of awareness (Cowan, 1988, p. 165; Shiffrin, 1993, p. 195). Further, short-term memory is the subset of long-term memory in a currently active state. Thus, noticing can be identified with what is both detected and then further activated following the allocation of attentional resources from a central executive. Rehearsal following detection would be a consequence of the allocation of resources to fulfill task demands (Baddeley, 1986, p. 99), as, for example, in Wickens’ (1989) capacity model. The nature of rehearsal and elaboration would vary according to whether the task demanded data-driven or conceptually-driven processing (Graf & Ryan, 1990; Jacoby, 1983; Roediger, Weldon, & Challis, 1989, p. 24). In data-driven processing, stimuli are encoded in “small pieces which are later assembled in working memory” (Best, 1992, p. 76, e.g., the visual marks making up a printed word). In contrast, conceptually-driven processing involves more effortful integration of encoded stimuli within the context of surrounding stimuli, drawing on “expectations or plans” (p. 76), themselves the result of the activation of schemata in long-term memory, for example, the formal or content reading schemata described by Carrell (1992). These different types of processing are initiated in response to task demands, and affect the nature of encoding. In the context of Figure 4, I now briefly summarize the roles of detection, attention, and awareness in learning.

Figure 4. Noticing as detection with awareness in short-term memory. [Diagram not reproduced. Recoverable label: long-term memory.] Note: From “Activation, attention, and short-term memory” by N. Cowan, 1993, Memory & Cognition, 21, p. 163. Copyright 1993 by The Psychonomic Society. Adapted with permission.

Detection. Detection requires attention. There is evidence that detection can occur without awareness, as Tomlin and Villa (1994) claimed, but subliminal exposure effects are unlikely to have effects over intervals longer than a few hundred milliseconds, are rapidly lost from memory, and cannot in any useful sense be claimed to be evidence of learning (Holender, 1986; Shanks & St. John, 1994).

Attention. Both detection and noticing require attention. Consequently, as Schmidt’s (1990) noticing hypothesis stated, there can be no learning, or encoding in memory, without attention. Although some researchers have described particular forms of learning as being nonattentional (Curran & Keele, 1993), this description refers to the degree of attention directed at the stimulus, and such learning mechanisms are properly considered less attention-demanding, not attentionless (see below).

Awareness. Awareness is critical to noticing, and distinguishes it from simple detection. Noticing is a consequence of encoding in short-term memory, and is necessary for learning. What is noticed may be subsequently encoded in long-term episodic memory (memory for personal experiences; see Tulving, 1985, and the discussion in the second section of this paper). It is possible to briefly notice and permanently or temporarily forget, and to notice and remember over time. More permanent encoding in long-term memory is a consequence of the level of activation of information in short-term memory, itself the result of rehearsal and elaboration. The nature of the rehearsal in short-term memory is a consequence of the processing demands of particular tasks, and can involve data-driven processing and conceptually-driven processing (Jacoby, 1983). Data-driven processing, as described above, involves simple maintenance rehearsal of instances of input in memory. Conceptually-driven processing involves elaborative rehearsal and the activation of schemata or higher-order relations from long-term memory. These are used to organize instances of input into more abstract configurations and can be induced, for example, by the instruction to search for rules underlying a sequence of stimuli (as in the artificial grammar learning experiments of Reber, 1989; see Carr & Curran, 1994; Robinson, 1994; Schmidt, 1994b), or to apply previously learned rules to new examples. Measures of awareness are difficult to operationalize given that: (a) the experience of noticing may be fleeting and thus difficult to recall; and (b) one may be aware of, yet unable to verbalize or otherwise articulate the nature of, that which one is aware of.

Attentional and Nonattentional Learning vs. Encoding Specificity

Schmidt’s (1990) noticing hypothesis claimed that all learning requires attention. Curran and Keele (1993) claimed that performance on a sequence learning task using a dual-task methodology reveals effects for a nonattentional form of learning. Their results, however, can be explained by invoking the notion of encoding specificity and the distinction between data-driven and conceptually-driven processing, both mediated by attention (illustrated in Fig. 4).

Single-task training and dual-task transfer. Curran and Keele (1993) used a “single-task training to dual-task transfer” (p. 191) paradigm for examining the influence of attention on the learning of sequences of stimuli (the letter X) displayed in numbered quadrants of a computer screen. One group of participants was shown the rules regulating the presentation of stimuli in these numbered quadrants, another group was not. Subsequently, participants in the noninstructed group were divided into aware and unaware groups on the basis of their ability to correctly identify the nature of the rules regulating the presented sequences. The researchers found that in single-task conditions learning, as measured in decreases in reaction times, was better for instructed learners, and for aware noninstructed learners than for unaware noninstructed learners. All groups showed evidence of learning, in the sense of improved reaction times over the period of single-task training. However, when a secondary task was added (counting high pitched tones occurring during each trial of the transfer session), the performances of the two groups were at the same level, though again all groups showed evidence of learning, in the sense of relatively faster reaction times compared to initial practice. Curran and Keele argued that this is evidence of two forms of learning: an attentional form, displayed by the instructed and aware noninstructed learners under single-task conditions, and nonattentional learning, displayed by the nonaware participants under single-task conditions. The attentional form is degraded by the attentional demands of the secondary task, but the nonattentional form is not. Because the instructed and aware noninstructed participants showed some evidence of learning in the dual-task transfer phase, Curran and Keele argued that this is the result of a switch to the nonattentional form of learning that had been occurring in parallel with the attentional form in the single-task training sessions. They concluded that this is evidence of two distinct, noninterfaced, parallel forms of learning. They thus differ from Hayes and Broadbent (1988), who claimed that selective, attention-demanding, and unselective, less attention-demanding forms of learning compete with each other, and are applied serially.

Attention and awareness. It is clear, though, from Curran and Keele’s (1993) discussion that both forms of learning require attention. It is less clear that the two are differentially dependent on awareness. Awareness was assessed as the ability to verbalize the rules of the sequence in answer to a question about a pattern underlying the stimulus presentations. This is precisely the awareness induced by the task instructions for the instructed group, and the consequence of the interpretation of the task made by the aware noninstructed group. However, the participants may also have been aware of the frequency of repetition of co-occurring stimuli. This might not qualify as a pattern, or rule, and so they may have failed to report it; however, it could still have contributed to their performance, and they could have consciously noticed such co-occurrences (cf. Perruchet & Pacteau, 1990; Vokey & Brooks, 1992). This type of awareness was not assessed by Curran and Keele, but it is precisely the type one could expect the learners in the nonattentional mode to have had. Shanks and St. John (1994) made the same point in discussing the evidence for implicit learning produced by Reber’s (1989) artificial grammar learning experiments (for a review see Reber, 1989; Carr & Curran, 1994; Schmidt, 1994b):

If subjects have learned something other than rules, then asking them about rules may lead to erroneous conclusions. On the other hand, if we ask the subjects questions about what they did in fact learn, we may get reasonable answers. (Shanks & St. John, 1994, p. 394)

Similarly, Munsell and Carr (1981), discussing Krashen’s (1985) claim that SLA is a subconscious process, argued that

if we looked at the right time and asked the right questions, we would get from the acquirer . . . responses that would in fact be quite specific and would in some general sense show a passing yet crucial “consciousness” of what was being acquired. (pp. 497-498)

Awareness, that is, is a function of the interpretation of the nature of the encoding and retrieval processes required by the task. Both forms of awareness are likely to have accompanied performance by participants in Curran and Keele’s experiment. The problem lies in the insensitivity of Curran and Keele’s measure of awareness.

Encoding specificity and awareness. Rather than invoking two distinct, noninterfaced forms of learning, one could invoke the distinction between data-driven and conceptually-driven processing, as a consequence of task demands, to account for Curran and Keele’s (1993) results. Data-driven processing requires accumulation and rehearsal of instances encountered in the input in memory, and may lead to the development of simple patterns of association between co-occurring items. Conceptually-driven processing requires the elaboration of input following activation of schemata (for, in this case, sequence relations) and the attendant rehearsal of more abstract patterns of hierarchical organization. Both require awareness, though the nature of what is noticed during processing will be relative to the interpretation placed on the task by the learner.

Summary of the Proposed Role of Attention in L2 Noticing

Four points can now be made about the model illustrated in Figure 4:

1. It is consistent with Schmidt’s (1990, 1993, 1994a) claim that there is no learning without awareness at the level of noticing.

2. It is consistent with one interpretation of claims by Krashen (1981, 1982); namely, information processing during “acquisition” and during “learning” both require conscious attention to form at input. However, information processing leading to acquisition is data-driven and results in the accumulation of instances, whereas information processing that leads to learning is conceptually-driven, involving access to schemata in long-term memory.

3. The model allows little, if any, influence (outside of fleeting subliminal exposure effects) of nonconscious information processing on the accumulation of L2 knowledge.

4. This model is consistent with the position that dissociations in the extent of knowledge, and awareness of knowledge arising during learning, are consequences of the particular encoding and retrieval operations required by particular tasks (Green & Shanks, 1993; Jacoby, 1983; Roediger et al., 1989). This implies the operation of a central, executive attentional mechanism allocating limited resources to satisfy known task demands, rather than the operation of two distinct, consciously and unconsciously accessed, systems.


Memory

The debate about the role of consciousness in SLA has involved considerable speculation about attention’s role in information processing but little discussion of the memory systems underlying such processing. Perhaps contemporary theorists and practitioners alike desire to dissociate themselves from the audiolingual approach to language teaching, which placed a heavy emphasis on memorization and rote learning of patterns (Chastain, 1976), or the exaggerated claims of memory-based methods like Suggestopedia (Scovel, 1979). In addition, they may sense, as perhaps confirmed by the discussion so far, that descriptions of memory should properly take place within the context of a theory of attention and its role in information processing. This may be true, but discussions of memory are due to receive more prominent treatment in SLA theory than they have so far. First, more extensive evidence of the neurophysiological substrata of memory is becoming available, as a result of experimental methodologies like magnetic resonance imaging and evoked brain potential (Cotman & Lynch, 1989; Squire & Zola-Morgan, 1991; Thompson, 1986), which appears to more firmly support previously speculative distinctions between memory systems. Second, a rapidly expanding literature on dissociations between conscious awareness and performance on “implicit” versus “explicit” memory tests argues that the separate systems support functionally differentiated forms of learning. In the remainder of this paper I survey both the separate memory systems that researchers have proposed and the distinction between implicit and explicit memory tests. I relate the discussion of this research to a view of memory, and of its relationship to attention during SLA, which I take to be complementary to Schmidt’s (1990) noticing hypothesis.

Short-Term and Long-Term Memory

Memory research makes a basic distinction between short- and long-term memory. Most current models of memory assume the need for this functional distinction, although whether it is a distinction between separate systems and stores (Shiffrin, 1993), or between separate storage functions within a single interconnected system (Baddeley & Hitch, 1993; Carlson, Khoo, Yaure, & Schneider, 1990) is in dispute. Proceduralist views of memory incline to the latter position, and evolve from Hebb’s (1949) view that one mechanism for the storage of memories “was the continued activity, or reverberation, of the cells and cell assemblies recruited by a perceptual act” (Crowder, 1993, p. 143) and that should this reverberation continue long enough, a mechanism causing structural changes in the neural units would lead to indefinite memory. Versions of proceduralist positions include Cowan’s (1993), that short-term memory is the subset of long-term memory in a current state of activation (represented in Fig. 4), and Schneider’s (1993) connectionist/control architecture. Even those who argued for a more modular approach (Shiffrin, 1993) have agreed that Atkinson and Shiffrin’s (1968) early, modal model, which proposed distinct, serially connected, short- and long-term stores, with separate phonological and semantic codes respectively, is wrong. All agree on three basic properties of short-term storage: (a) It involves temporary activation of neural structures; (b) it is the site of control processes such as directing focal and peripheral attention, rehearsing current information, and coding new inputs; and (c) it is capacity limited. Short-term memory “serves as the interface between everything we know and everything we can see or do” (Cowan, 1993, p. 166), and is conceived as the workspace where skill development begins (Anderson, 1983; Baddeley, 1986; Shiffrin & Schneider, 1977) and where knowledge is encoded into (and retrieved from) long-term memory. Hence, measures of short-term memory capacity have been used as indices of aptitude for, and skill development in, SLA (e.g., Carroll, 1993; Cook, 1977, 1986; Geva & Ryan, 1993; Harrington & Sawyer, 1992; Sasaki, 1993; Skehan, 1989). Two current areas of memory research, however, imply that the domain of the control processes in short-term memory may not be as extensive as researchers have usually claimed. The first idea challenges the role attributed to short-term memory in the development of automatic processing; the second challenges its role in controlling access to long-term memory.

Automaticity and Memory

Automatic processes are fast, effortless, obligatory, consistent, unconscious, and learned (Logan, 1988a, 1988b). Second language acquisition researchers have often referred to resource-based theories of automaticity and skill acquisition, such as Anderson’s (1983) ACT* theory, or Shiffrin and Schneider’s (1977) theory (e.g., R. Ellis, 1993; McLaughlin, 1987; McLaughlin, Rossman, & McLeod, 1983; O’Malley & Chamot, 1990; Robbins, 1992). These theories have argued that the mechanisms responsible for the development of the learned responses used in automatic skilled performance are located in short-term memory. Shiffrin and Schneider theorized that controlled processes activate sequential and temporary links in short-term memory. After ongoing practice in consistent environments, corresponding links in long-term memory grow stronger. Eventually, automatic processes activate permanent links in long-term memory in parallel each time exposure to a particular pattern of inputs occurs. In contrast, Logan (1988a, 1988b, 1990) has argued that modal resource-based theories of automaticity merely describe the phenomena associated with automaticity but do not explain “why” practice in constant environments is necessary to its development or “why” automaticity is accompanied by a gradual withdrawal of attention. Memory-based theories of automaticity, like Logan’s, claim initial performance is limited by lack of knowledge, not simply by availability of attentional resources in a capacity-limited short-term memory, and that automaticity is the result of an increasing long-term knowledge base that eventually becomes the basis for automatic responses. Such theories have received less attention in discussion of L2 automaticity than Shiffrin and Schneider’s (1977) or Anderson’s (1983) (though see Robinson & Ha, 1993; Schmidt, 1992). Logan’s “instance” theory rests on three main assumptions: (a) Stimuli are obligatorily encoded in memory once attended to; (b) each instance together with whatever aspects of the context were attended to is stored separately; and (c) retrieval of the memories is an obligatory consequence of attending to similar stimuli. Together these assumptions imply a learning mechanism: the accumulation of instance representations. Memory becomes stronger not through strengthening (Cohen, Dunbar, & McClelland, 1990; Schneider & Detweiler, 1988), but because “each experience lays down a trace that may be recruited at the time of retrieval” (Logan, 1988a, p. 494). As memory accumulates, retrieval becomes faster, until the speed of decisions about task performance made on the basis of retrieval from memory exceeds the speed of decisions made by a general algorithm. At this point, memory-based solutions become the basis of automatic responses.
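The race between algorithm-based and memory-based responding described above can be made concrete with a small simulation. The following is only an illustrative sketch, not Logan’s own formulation: the function names, the normally distributed algorithm time, the exponentially distributed retrieval times, and every numerical parameter are assumptions introduced here. It shows only the qualitative point that, as stored instances accumulate, the fastest retrieval tends to beat the algorithm more often and mean reaction time falls.

```python
import random


def trial_rt(n_instances, rng, algo_mean=1000.0, algo_sd=100.0,
             retrieval_mean=1500.0):
    """Simulate one trial as a race between an algorithm and memory retrieval.

    All parameters (in ms) are hypothetical values chosen for illustration.
    Returns (reaction_time, memory_won).
    """
    # Time needed by the general algorithm on this trial.
    algorithm_time = max(rng.gauss(algo_mean, algo_sd), 0.0)
    # Every stored instance is retrieved independently (obligatory retrieval);
    # the fastest retrieved instance is the one that counts.
    retrieval_times = [rng.expovariate(1.0 / retrieval_mean)
                       for _ in range(n_instances)]
    fastest_retrieval = min(retrieval_times, default=float("inf"))
    memory_won = fastest_retrieval < algorithm_time
    return min(algorithm_time, fastest_retrieval), memory_won


def simulate(n_trials_per_level=2000, levels=(0, 1, 4, 16, 64), seed=0):
    """Return {n_instances: (mean_rt, share_of_trials_won_by_memory)}."""
    rng = random.Random(seed)
    results = {}
    for n in levels:
        outcomes = [trial_rt(n, rng) for _ in range(n_trials_per_level)]
        mean_rt = sum(rt for rt, _ in outcomes) / len(outcomes)
        memory_share = sum(won for _, won in outcomes) / len(outcomes)
        results[n] = (mean_rt, memory_share)
    return results


if __name__ == "__main__":
    for n, (mean_rt, share) in simulate().items():
        print(f"instances={n:3d}  mean RT={mean_rt:7.1f} ms  "
              f"memory-based={share:.2f}")
```

In this sketch nothing is “strengthened”: the speedup comes entirely from having more stored traces that can win the race, which is the contrast with strength-based accounts drawn in the text.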

Instance theory, therefore, raises questions about previous views of automaticity that placed the responsibility for its development on the speedup of processes under attentional supervision in short-term memory. These questions are raised by three aspects of Logan’s proposals: (a) the parallelism of memory-based and algorithm-based processing; (b) the obligatoriness of encoding into memory; and (c) the obligatoriness of retrieval from memory. First, instance theory claims that automatic, implicit information processing does not follow consecutively from an explicit learning phase of controlled processing, but rather develops in parallel with it and is to a large extent independent of it. Second, the assumption of obligatory encoding of attended stimuli appears to allow some role for memory without awareness in the development of automaticity, because detection of attended stimuli can operate outside of awareness and not involve noticing (see Fig. 4). However, such memories would not contribute much to the knowledge base drawn on in automatic responses, because the quality of attention allocated at encoding is, for Logan, directly implicated in the speed of retrieval from memory: “Attention to an item will have some impact on memory; it does not assume that all conditions of attention produce the same impact” (1988a, p. 494). Consequently, noticing of instances appears necessary to establish a knowledge base that can prompt retrieval sufficiently fast to outperform algorithm-based processing. Third, the obligatoriness of retrieval also implies a process that develops in parallel with algorithm-based processing in the early stages of skill development but is not dependent on the attentional control processes in short-term memory.

Access to Long-Term Memory: Implicit versus Explicit Memory

Claims that memory without awareness may play a more significant role in cognitive processes than has hitherto been acknowledged have also come from studies of performance on recognition and recall tasks. Researchers have found that performance may differ between “direct” tasks requiring conscious retrieval of material presented during a study phase and “indirect” tasks facilitating retrieval of the material without apparent conscious attempts to recall (Merickle & Reingold, 1991; Richardson-Klavehn & Bjork, 1988; Schacter, 1987, 1989; Schacter, Chiu, & Ochsner, 1993). Although some SLA research has adapted the methodology used by Reber (1989) to examine dissociations between implicit and explicit learning (DeKeyser, in press; N. Ellis, 1993; Hulstijn & DeGraaff, 1994; Nation & McLaughlin, 1986; Nayak, Hansen, Kreuger, & McLaughlin, 1990; Robinson, 1994, in press), no SLA research has made use of the methodology for studying implicit and explicit memory. A number of potentially important consequences for SLA, however, arise from claims based on L1 implicit memory research, particularly with respect to questions regarding the proper estimate of memory for vocabulary items. Some researchers have claimed that implicit memory measures reveal more extensive recognition of items than is apparent from explicit conscious attempts to recall; therefore, estimates of the effect of vocabulary instruction based only on explicit measures of recall may be misleading (see Ellis & Beaton, 1993; Ellis, Tanaka, & Yamazaki, 1994; Robinson, 1995b), as may similarly-based measures of aptitude like the MLAT (Modern Language Aptitude Test) paired-associates test of memory (Carroll & Sapon, 1959). Researchers have also claimed that implicit memory for words is more robust, and longer lasting than explicit memory (Sloman, Hayman, Ohta, Low, & Tulving, 1988; Tulving, Hayman, & MacDonald, 1991; Tulving, Schacter, & Stark, 1982), again, a claim potentially significant for estimates of L2 lexical development and measures of the effect of vocabulary instruction.

Support for these claims for differences between explicit and implicit memory comes from a variety of memory tasks (Merickle & Reingold, 1991, for a review). Explicit memory tasks clearly instruct the testee to recall previous experience with the stimulus material. For example, a word list is presented in a study phase, and then in the test phase the participant is asked to recall words from the list, or to indicate which words from a new list appeared on the earlier one. In contrast, implicit memory tasks do not directly instruct participants to recall the material. For example, in word fragment completion tasks, participants study a list of words, then are presented with another task in the memory test. This next task is presented as if it were unrelated to the study task, thus avoiding conscious recall of the list. Participants then complete a series of word fragments (such as “s_n__r_”). Implicit memory is inferred from faster responses to, and accuracy in completing, fragments conforming to words from the list (e.g., “sincere”) compared to the accuracy and speed of responses to fragments based on novel words. Similarly, in lexical decision tasks participants are presented with letter strings and asked to judge if they are words or nonwords. Implicit memory is inferred from the reduced latency of responses to previously presented words compared to novel ones. The claim that implicit and explicit memory are dissociated rests on the observation that manipulating one variable during the study phase will affect performance on the later implicit and explicit test phases differently. For example, during the study phase, task instructions may focus attention on the studied words’ semantic aspects, or on their physical features, thereby manipulating differences in the depth of processing (Craik & Lockhart, 1972). This manipulation affects performance on explicit tests, which are facilitated by the greater depth of processing required by focusing on semantics, but not on implicit tests. Tests using amnesics (with damage to the hippocampus) have found that repetition priming effects are preserved on implicit tests, relative to normal populations, whereas performance on explicit tests is impaired (Cohen, 1984).
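The logic of scoring a word fragment completion task, as described above, can be sketched in a few lines. This is a hypothetical illustration only: the function names, the use of an underscore to mark each missing letter, and the toy response data are assumptions introduced here, and accuracy alone is scored (response speed, the other measure mentioned above, is left out). Priming is taken as the completion rate for fragments of studied words minus the rate for fragments of novel words.

```python
import re


def fragment_matches(fragment, word, blank="_"):
    """Return True if `word` fits `fragment`; `blank` marks one missing letter."""
    pattern = "".join("[a-z]" if ch == blank else re.escape(ch)
                      for ch in fragment)
    return re.fullmatch(pattern, word) is not None


def priming_score(responses):
    """Completion rate for studied-word fragments minus rate for novel ones.

    `responses` is a list of (was_studied, completed_correctly) pairs.
    """
    studied = [ok for was_studied, ok in responses if was_studied]
    novel = [ok for was_studied, ok in responses if not was_studied]
    if not studied or not novel:
        raise ValueError("need at least one studied and one novel item")
    return sum(studied) / len(studied) - sum(novel) / len(novel)


if __name__ == "__main__":
    # The fragment for "sincere", written with underscores for missing letters.
    print(fragment_matches("s_n__r_", "sincere"))
    # Invented responses: (was the word on the study list?, completed correctly?)
    toy_responses = [(True, True), (True, True), (True, False),
                     (False, True), (False, False), (False, False)]
    print(priming_score(toy_responses))
```

A positive score in such a scheme would be read as evidence of implicit memory for the studied items, since participants were never asked to recall the list.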

Some researchers have used such dissociations to support the argument that two distinct subsystems are responsible for these effects: an explicit memory, accessed by conscious retrieval, and an unconsciously accessed implicit memory (Schacter, 1989; Schacter & Tulving, 1982; Squire, 1992; Squire & Cohen, 1984). Paradis (1994) has recently invoked the argument for dissociable subsystems of memory to explain aspects of second language development. The same argument is implied, if not specified in detail, in Krashen’s (1981, 1982) Monitor model. Claims that different performance on implicit and explicit memory tests evidence the existence of two neurophysiologically separable systems might lend credibility to Krashen’s acquisition-learning distinction; unconscious acquisition could contribute to representations in implicit memory, in contrast to the learned representations in explicit memory. Paradis proposed exactly this, arguing that implicit and explicit memory, “each relying on different cerebral systems, are differentially involved during the acquisition/learning of a foreign language” (p. 393). Of course, one must distinguish claims about implicit/explicit memory from claims about implicit/explicit learning, because the former relates to the conditions whereby preexisting representations are accessed or activated, whereas the latter refers to the processes whereby such representations develop (see Reber, 1993; Robinson, 1993, 1994). However, it is not necessary to attribute the effects obtained on direct and indirect memory tests to discrete memory systems. Researchers have proposed a number of single-system explanations of these dissociations. I consider these explanations, and their implications for models of the architecture of memory, below.


Activation Views

Activation views hold that priming effects are due to the temporary activation of preexisting semantic representations. They are based on theories of the processes underlying the development of automaticity.

Auto-associative semantic activation. Graf and Mandler (1984) claimed that implicit memory performance is due to temporary activation of a representation; this occurs automatically, but carries with it no contextual information that could contribute to explicit remembering. Presentation of a word in a study list activates its representation, thus making it more accessible than nonstudied words. However, this activation “is not sufficient to determine recognition or recall operations, which are dependent on the relationships of the event to be retrieved and the characteristics of the context in which this event first occurred” (Besson, Fischler, Boaz, & Raney, 1992, p. 90). Activation theories like Graf and Mandler’s thus proposed a mechanism (spreading activation) variably related to tasks’ processing demands, which can either create the conditions for such automatic access or lead to more attentionally demanding processing that impedes it.

Similarity based recruitment and activation. Logan (1990) also offered an automatic activation explanation of repetition priming effects. In lexical decision tasks, judgments about words or nonwords are made initially on the basis of an algorithm. Subsequently, when the same words are presented a second or third time, decisions are made on the basis of single-step access to previous encounters with the stimulus, which are activated in memory. However, Logan stated that instances are encoded in memory together with whatever contextual coordinates happen to be attached to them. Consequently, at least some episodic, personal memories (associations of happiness, anxiety, etc.) could be activated by obligatory retrieval of instances.


Processing Views

Processing views argue that the nature of the match between the processing demands encouraged by the task instructions at study and at test contributes to differences in performance on tests.

Data-driven and conceptually-driven processing. Jacoby (1983) has argued that conceptually-driven processing (such as organizing and reconstructing) which is “participant initiated and guided by knowledge and expectations from long-term memory, can be distinguished from data-driven processing, which is initiated and guided by data, for example, the shapes of letters (cf. Blaxton, 1989; Hamann, 1990; Roediger & Blaxton, 1987; Roediger et al., 1989). Study tasks encourage a blend of both forms of processing, but task instructions can bias the participant to use more of one than the other. Task instructions in the test phase also bias participants to draw on one or the other form of processing; explicit tests draw primarily on conceptually-driven processing, and implicit tests draw primarily on data-driven processing. What distinguishes test performance is the transfer of appropriate processing strategies initiated during the study phase to the test. Implicit test performance reflects study-test overlaps in data-driven processing, and explicit test performance reflects study-test overlaps in conceptually-driven processing; different memory systems are not necessarily involved.

Integrative and elaborative memory processes. Graf and Ryan (1990) also took a processing perspective and distinguished integrative from elaborative processing. Integrative processing “results from processing that bonds the features of a target into a coordinated whole” (p. 990), whereas elaborative processing “associates a target with other mental contents” (p. 990). Following the same reasoning as Roediger et al. (1989), they claimed study task performance encourages a blend of both types of memorization, but implicit tests draw on integrative processes whereas explicit tests draw on elaborative processes.


Multiple Systems Views

Some researchers have attributed differences in performance on implicit and explicit tests to differences in the properties of underlying memory systems that are differentially activated by attempts at explicit, conscious recall. The difference between this type of explanation and the preceding explanations lies in the researchers’ preference for associating functionally differentiated evidence of memory performance with system-level characterizations of the architecture of memory. Processing views prefer to avoid this and explain functional differentiations at the level of task demands. Multiple systems views often refer to neurophysiological evidence of different patterns of brain activity and neural architecture to support their claims for different subsystems. Taken to an extreme, this approach ceases to explain data, because it proposes a new subsystem of memory to account for each apparent functional differentiation in performance on memory tasks (McKoon, Ratcliff, & Dell, 1986; Raaijmakers & Shiffrin, 1992; Richardson-Klavehn & Bjork, 1988; Tulving, 1984). Nonetheless, memory researchers have supported a number of distinctions between subsystems of memory.

Procedural and declarative memory. Squire and Cohen (1985) distinguished between consciously accessed declarative memory and unconscious procedural memory, the former responsible for performance on explicit tests, the latter for performance on implicit tests. Their explanation was inspired by neurophysiological evidence of the relation of brain impairments to memory functions. Combining studies of human amnesia with animal models of human amnesia, Squire and Zola-Morgan (1991) have identified the medial temporal lobe as the site of memory functions. The medial temporal lobe binds together distributed storage sites in the hippocampus and neocortex. The role of the medial temporal lobe in distributing information in the neocortex is short-lived; after time memories stored in the neocortex become independent of the medial temporal lobe. Squire and Zola-Morgan interpreted this as evidence of the neurophysiological basis of declarative memory for facts. The medial temporal lobe functions as a short-term organizer of information in long-term storage in the neocortex. Lesions to the hippocampus and neocortex surrounding the amygdala produce memory impairments, whereas lesions to the amygdala do not, suggesting that the amygdala is responsible, in part, for the memories that are preserved in amnesiacs: the ability to make associative links between stimuli and sensory modalities, and other functions important to skill development and procedural learning. At present, evidence from neurophysiology is suggestive of, though far from clear evidence for, different aspects of memory function being centralized in different sites. Impairments in performance on explicit tests of recognition and recall appear to be linked to impairments of the hippocampal system; performance on implicit memory tests is not affected by such impairments (Squire, Amaral, & Press, 1990). Differences in synaptic plasticity within anatomically distinct sites have also been identified and used to support claims for different forms of memory (Gluck & Granger, 1993, p. 668). Second language acquisition researchers have recently discussed this distinction (see Paradis, 1994; Robbins, 1992). Paradis maintained the procedural/declarative memory system distinction in claiming that “the memory system that subserves the formal learning of a second language (declarative memory) is neurofunctionally and anatomically different from the one that subserves the first language or a foreign language acquired in conversational settings (procedural memory)” (1994, p. 394). Like Krashen (1981), Paradis also maintained there is no interface between the knowledge bases of these two forms of memory: “metalinguistic knowledge formally learned in school is not integrated into linguistic competence and does not become available for automatic use” (1994, p. 394).

Episodic and semantic memory. Tulving (1984, 1985, 1986) has developed a multiple systems theory of memory that distinguishes between episodic memory for personal involvement with an event and semantic, decontextualized memory. Schacter (1989) has used this distinction to explain dissociations between implicit and explicit memory. Tulving (1985, 1986) argued that the episodic memory system is responsible for performance on tasks that require explicit remembering of recent events, whereas semantic memory is responsible for performance on tests that require access to preexisting decontextualized semantic representations. Episodic memory deals with “unique, concrete, personal and temporally dated events” (Tulving, 1986, p. 307), whereas semantic memory involves “general, abstract, timeless knowledge that a person shares with others” (Tulving, 1986, p. 307). Tulving’s multiple memory model is hierarchical. It sees episodic memory as a specialized subsystem of semantic memory, accessed by autonoetic, intentionally manipulated, consciousness. Both episodic and semantic systems deal with declarative knowledge. Declarative memory is a separate system embedded within a procedural memory. This distinction, then, is in some ways similar to the declarative/procedural distinction made above. However, it differs from Cohen’s (1984) and Squire and Zola-Morgan’s (1991) accounts by viewing declarative memory as nested within procedural memory rather than viewing them as separate coexisting systems functioning in parallel. It is also similar to the procedural/declarative distinction made in Anderson’s (1983) ACT* model of knowledge compilation. However, Anderson’s ACT* theory does not propose that episodic and semantic memory are separable systems within declarative memory. Rather than motivate the procedural/declarative distinction with reference to their neurophysiological evidence alone or to a model of information processing during skill acquisition, Tulving, like Reber (1993), has claimed that distinctions between memory systems are the result of evolutionary adaptive processes. Episodic memory is a late arrival, and phylogenetically derived from semantic memory. Like Reber, Tulving (1985, p. 388) claimed that ontogeny recapitulates phylogeny, thereby explaining why in early childhood, children can perform on indirect measures of memory, which, he claims, access the semantic system, and only later can perform on direct measures, which access the episodic system (cf. Parkin & Streete, 1988; Schacter & Moscovitch, 1984).

Multiple Systems and Multiple Explanations

As Schacter (1987) pointed out, there are advantages and disadvantages to each of these explanations, not all mutually exclusive. One major disadvantage of multiple systems views, however, is that they are inconsistent with each other. Note, for example, the inconsistency in the use of procedural and declarative to describe memory systems in the work of Squire and Zola-Morgan (1991), Tulving (1986), and Anderson (1983) described above. For Anderson, skill learning begins with conscious declarative knowledge, which is later proceduralized. This is an interface position. For Cohen (1984) and Squire and Zola-Morgan, consciously accessed declarative memory and unconsciously accessed procedural memory are dissociated and distinct forms, responsible for different kinds of learning. Clearly, despite the similarity that might be assumed on the basis of their common use of the terms declarative and procedural to distinguish forms of memory, Anderson, Cohen, and Squire and Zola-Morgan differ from each other fundamentally with respect to the role they attribute to consciousness. Multiple explanations underlie the multiple systems that have been proposed. If researchers are to avoid confusion about the role of key variables like attention and consciousness, they can only use the same pair of terms to indicate a contrast between memory mechanisms with reference to the assumptions of their specific explanation. Confusions may also arise through failure to consider the different levels at which they pitch each explanation of the different forms of memory. Claims for distinctions between procedural and declarative memory based on neurophysiological evidence may be true at that level, but the two systems interact at the level of information processing during task performance (Bunge, 1973; McLaughlin et al., 1983). Whether the function of different memory systems can be isolated in task performance by amnesics and monkeys with lesions to the hippocampus is, to a large extent, irrelevant to the question of how the memory systems interact and contribute to learning during task performance by normal L2 learners (though see Paradis, 1994, and Robbins, 1992, who base large claims for L2 learning and pedagogy on such neurological evidence). Of interest to the information-processing level of explanation is a close examination of the attentional demands of tasks, and the extent to which manipulations of these demands can lead to differences in measures of learning and retention.

Automaticity, Retrieval, and Context Instantiation

An alternative to inferring two distinct forms of memory would be to reconcile an explanation based on automatic activation of preexisting representations, as in Logan’s (1990) instance theory, with an explanation based on processing distinctions between conceptually-driven and data-driven processes. In this synthesis, explicit instructions to remember lead participants to associate each currently attended target with each previously experienced target and additionally to compare the currently attended context of recall with the entire previous situational context of study. This requires conceptually-driven processing (cf. Graf & Ryan’s, 1990, closely related elaborative memory strategy) rather than data-driven processing alone. However, although a currently attended target may find a perceptual match with a previously encoded target, the surrounding situational context differs, and Logan’s theory codes context as a part of instance representation. Thus, single-step access to a matching instance may be slower in situations requiring explicit recall than in situations not requiring it, like implicit tests, because the currently attended contextual coordinates of the stimuli will not find a satisfactory match. In implicit tests, recognition is data-driven, and task demands do not invoke differences in the study-test contexts. Therefore, access may be more automatic, because instances find a clearer match. This hybrid “task-based” position would have the virtue of accommodating two sources of implicit memory that Richardson-Klavehn and Bjork (1988) argued need to be accounted for, “ahistoric traces that depend on pre-existing codified representations, and historic traces that incorporate contextual information” (p. 533), without the need to invoke separate systems.

Summary of the Proposed Role of Memory in L2 Noticing

In summary, this review of the nature of the relationship of memory to attention during information processing leads to the following conclusions:

1. There is minimal evidence of encoding into long-term memory without awareness, a position consistent with Schmidt’s (1990, 1993a, 1993b, 1994a; Schmidt & Frota, 1986) noticing hypothesis. Existing evidence of encoding without awareness is largely irrelevant to the debate over the nature of implicit memory (see Holender, 1986, for a review). Most of this debate, and the debate over the role of attentional control processes in short-term memory, concerns evidence of memory activation and retrieval without awareness.

2. Short-term memory should be distinguished from long- term memory, adopting the definition that short-term memory is that subset of long-term memory in a currently activated state. Short-term memory is where noticing takes place.

3. Retrieval from long-term memory can be a consequence of conceptually driven, top-down processing under attentional control in short-term memory, as in explicit memory tasks. When this occurs, the contents of retrieval are determined by the interaction of conceptually-driven processing and the specific nature of task demands.

4. There can be automatic activation of previously attended information encoded into long-term memory. When this occurs, the contents of retrieval are determined by the interaction of data-driven, bottom-up processing and the nature of the specific task demands.

5. This position is inconsistent with the view that separate long-term memory systems are differentially responsible for conscious and unconscious information access during information processing (Paradis, 1994; Schacter, 1989; Tulving, 1986). It is consistent with the view that level of attentional awareness during retrieval is a function of task demands and automatic processes, which jointly determine access to a single long-term memory store.

Implications for SLA Research and Theory

I have drawn on current research and theory in cognitive psychology to specify a model of the relationship between attention and memory that complements Schmidt’s (1990, 1993a, 1993b, 1994a; Schmidt & Frota, 1986) noticing hypothesis. I have defined noticing as detection with awareness and rehearsal in short-term memory and have argued that this is necessary to learning and subsequent encoding in long-term memory (Fig. 4). I have also surveyed theoretical claims about these constructs, and their possible relationship, which oppose Schmidt’s hypothesis. My aim has been, by considering positions regarding the nature of attention and memory that support or contradict the hypothesis, to relate the debate about the role of consciousness in L2 learning to relevant empirical work and operational constructs in cognitive psychology and to thereby provide an extended theoretical basis for the noticing hypothesis. Two areas of work in cognitive psychology that I have described, and that I have argued are of great potential importance to SLA research, are (a) the interaction of attentional resources and the dimensions of task demands and (b) the role of conscious noticing in controlling access to memory.

The nature of the interaction between cognitive resources during information processing and language learning is little understood. However, it follows from the previous discussion that individual differences in the allocation of memory and attentional resources should affect the extent of noticing and subsequent L2 development. Short-term memory capacity determines how much can be noticed, and subsequently rehearsed, while performing a task; differences in short-term memory capacity should therefore predict differences in the rate of progress during L2 learning. The relationship of short-term memory capacity to processing in L1 and L2 has been discussed by a number of researchers (e.g., Cook, 1977, 1979, 1986; Geva & Ryan, 1993; Harrington & Sawyer, 1992; Lado, 1965; Papagno, Valentine, & Baddeley, 1991; Stevick, 1976). Cook (1977) assessed the extent to which adult participants differed on identical tests of short-term and long-term memory using L1 and then L2. He concluded that performance on short-term memory tests was more similar in the L1 and L2 than performance on long-term memory tests, and that short-term memory capacity was therefore more directly transferable to L2 processing than was long-term memory capacity. Cook (1979) later extended the research to child L2 learners, finding greater ability on digit span measures of short-term memory than on recall of words, and greater ability in both of these when the digit span and word list were presented in the L1. He tested short-term memory for sentences by presenting participants with L2 sentences of between 4 and 16 words in length and then asking for recall. Performance was significantly worse than when the sentences were presented to adults in their L1. Papagno et al. examined the influence of a proposed subsystem of short-term working memory, the phonological store, on adult acquisition of second language vocabulary. Phonological storage was disrupted by articulatory suppression of subvocal rehearsal. Papagno et al. contrasted performance on paired associate tasks in the L1, using English pairs, and in the L2, using pairs of English with Italian, Russian, or Finnish words. They found that articulatory suppression disrupted the learning of Finnish-English and Italian-English paired associates, but not Russian-English or English-English. They speculated that the formal similarity of word structure between English and Russian was greater than between English and the other two languages, allowing participants to circumvent the use of rehearsal in phonological short-term memory in learning the English-Russian associates. Harrington and Sawyer examined the effect of differences in L2 reading skill and working memory capacity, using a measure developed by Daneman and Carpenter (1983) for assessing reading span. They found that participants with larger measures of working memory capacity scored higher on tests of reading skill. In contrast, reading skill did not correlate highly with more passive measures of short-term storage, such as backward digit span or random word span. Geva and Ryan found significant positive correlations of scores on measures of L2 linguistic knowledge and scores on a static measure of L2 memory (the ability to remember lists of semantically unrelated one- and two-syllable nouns) and on two measures of working memory (the ability to remember the opposites of words presented in lists, and the ability to guess and then remember words to complete a list of sentences). Like Harrington and Sawyer, they also concluded that working memory measures were better predictors of L2 reading ability than were static memory measures, because correlations of L2 working memory and L2 reading ability remained high even when correlations for scores on an intelligence test were partialled out, though this diminished the correlation of static L2 memory measures and L2 reading ability.
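The notion of "partialling out" intelligence scores, as in the Geva and Ryan result just described, can be illustrated with the standard first-order partial correlation formula. The following is a minimal sketch only: the function name and the correlation values are hypothetical and are not taken from Geva and Ryan (1993) or any other study cited here.

```python
from math import sqrt


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y with z held constant.

    r_xy, r_xz, r_yz are the ordinary (zero-order) Pearson correlations
    between the three pairs of variables.
    """
    denominator = sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denominator == 0.0:
        raise ValueError("z is perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical values for illustration only (x = memory measure,
# y = L2 reading ability, z = intelligence test score).
working_memory = partial_correlation(r_xy=0.60, r_xz=0.30, r_yz=0.30)
static_memory = partial_correlation(r_xy=0.40, r_xz=0.60, r_yz=0.60)
```

With these invented inputs, the correlation for the measure that is only weakly related to intelligence changes little once intelligence is held constant, whereas the correlation for the measure that shares most of its variance with intelligence shrinks sharply. This is the general pattern the paragraph above describes, not a reproduction of the reported statistics.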

With respect to the relationship between memory and attentional capacity, differences in short-term memory capacity should be related to differential performance on single tasks, when resources in the L2 are limited and skills are relatively nonautomatic, and to a greater extent to performance on dual tasks that require time sharing and attention switching. Success in workload management, using Wickens’ (1989) model (Fig. 3), should be most clearly related to short-term memory capacity when concurrently performed tasks draw on similar resource pools. Further, when tasks require predominantly conceptually-driven processing, the availability of knowledge schemas to organize perception and to direct attention to relevant aspects of the stimulus domain should also be important. The extent to which relevant preexisting representations are available will determine the efficiency of attentional allocation, which in turn will lead to more successful task performance, a claim supported, for example, by research into the relationship between awareness, text schemata, and L2 reading comprehension (e.g., Carrell, 1992). In the domain of grammatical knowledge, familiarity with the basic metalinguistic principles for describing structural patterns and structural analogies would probably aid hypothesis testing by directing attention to relevant features of the input to be noticed. Carroll (1981, 1993; Carroll & Sapon, 1959) has claimed that grammatical sensitivity to structural patterns is a component of aptitude for learning languages and not necessarily related to previous foreign language study. However, explicit training in areas of metalinguistic knowledge with the aim of developing such sensitivity is certainly possible (O’Malley & Chamot, 1990).

However, those who claim SLA is a largely unconscious process (e.g., Krashen, 1981) argue that ultimate attainment in L2 is likely to be unrelated to currently available measures of cognitive abilities like short-term memory capacity. These, they argue, measure only the abilities drawn on in conscious information processing, where individual differences can be expected. Implicit nonconscious learning, or acquisition, draws on information-processing abilities, which are relatively homogeneous in the population, and unrelated to short-term memory capacity or the ability to consciously divide and maintain attention (see Reber, Walkenfield, & Hernstadt, 1991 and Zobl, 1992, for evidence supporting this claim and Robinson, 1994, in press, for counterevidence). Schmidt’s (1990, 1993a, 1993b, 1994a; Schmidt & Frota, 1986) noticing hypothesis makes the opposite prediction. Explaining the results of studies to resolve these and other disputes regarding the extent to which conscious noticing is necessary to SLA will no doubt involve reference to some of the relationships between cognitive constructs that I have described in this paper. Researchers will need to specify such relationships if the theory underlying the noticing hypothesis, or any opposing hypothesis, is to be fully explanatory.

Revised version accepted 15 January 1995

References

Anderson, J. R. (1983). The architecture of cognition. Cambridge, MA: Harvard University Press.

Atkinson, R., & Shiffrin, R. M. (1968). Human memory: A proposed system and its control processes. The Psychology of Learning and Motivation, 2, 89-195.

Baddeley, A. D. (1986). Working memory. Oxford: Oxford University Press.

Baddeley, A. D., & Hitch, G. (1993). The recency effect: Implicit learning with explicit retrieval? Memory & Cognition, 21, 146-155.

Balota, D. A. (1983). Automatic semantic activation and episodic memory encoding. Journal of Verbal Learning & Verbal Behavior, 22, 88-104.

Beretta, A. (1991). Theory construction in SLA: Complementarity and opposition. Studies in Second Language Acquisition, 13, 493-512.

Besson, M., Fischler, I., Boaz, T., & Raney, G. (1992). Effects of automatic associative activation on explicit and implicit memory tests. Journal of Experimental Psychology: Learning, Memory, & Cognition, 18, 89-105.

Best, J. B. (1992). Cognitive psychology. New York: West Publishing.

Blaxton, T. A. (1989). Investigating dissociations among memory measures: Support for a transfer-appropriate processing framework. Journal of Experimental Psychology: Learning, Memory, & Cognition, 15, 657-668.

Broadbent, D. E. (1958). Perception and communication. New York: Pergamon.

Bunge, M. A. (1973). Method, model, and matter. Dordrecht: Reidel.

Carlson, R. A., Khoo, B. H., Yaure, R. G., & Schneider, W. (1990). Acquisition of a problem-solving skill: Levels of organization and use of working memory. Journal of Experimental Psychology: General, 119, 193-214.

Carr, T. H., & Curran, T. (1994). Cognitive factors in learning about structured sequences: Applications to syntax. Studies in Second Language Acquisition, 16, 205-225.

Carrell, P. L. (1992). Awareness of text structure: Effects on recall. Language Learning, 42, 1-20.

Carroll, J. B. (1981). Twenty-five years of research on foreign language aptitude. In K. C. Diller (Ed.), Individual differences and universals in language learning aptitude (pp. 83-118). Rowley, MA: Newbury House.

Carroll, J. B. (1993). Human cognitive abilities: A survey of factor-analytic studies. Cambridge: Cambridge University Press.

Carroll, J. B., & Sapon, S. M. (1959). The modern language aptitude test. San Antonio, TX: The Psychological Corporation.

Chastain, K. (1976). Developing second language skills: Theory and practice. New York: Rand-McNally.

Cherry, E. C. (1953). Some experiments on the recognition of speech with one and two ears. Journal of the Acoustical Society of America, 25, 975-979.

Cohen, J. D., Dunbar, K., & McClelland, J. L. (1990). On the control of automatic processes: A parallel distributed processing account of the Stroop effect. Psychological Review, 97, 332-361.

Cohen, M. M., & Massaro, D. (1992). On the similarity of categorization models. In F. G. Ashby (Ed.), Multidimensional models of perception and cognition (pp. 395-449). Hillsdale, NJ: Lawrence Erlbaum.

Cohen, N. J. (1984). Preserved learning capacity in amnesia: Evidence for multiple systems. In L. R. Squire & N. Butters (Eds.), Neuropsychology of memory (pp. 83-103). New York: Guilford Press.

Cook, V. J. (1986). Do second language learners have a cognitive deficit? In V. J. Cook (Ed.), Experimental approaches to second language acquisition (pp. 73-79). Oxford: Pergamon Institute of English.

Cook, V. J. (1977). Cognitive processes in second language learning. International Review of Applied Linguistics in Language Teaching, 15, 1-20.

Cook, V. J. (1979). Aspects of memory in secondary school language learners. Interlanguage Studies Bulletin, 4, 161-172.

Cotman, C. W., & Lynch, G. S. (1989). The neurobiology of learning and memory. Cognition, 33, 201-241.

Cowan, N. (1988). Evolving conceptions of memory storage, selective attention, and their mutual constraints within the human information-processing system. Psychological Bulletin, 104, 163-191.

Cowan, N. (1993). Activation, attention, and short-term memory. Memory & Cognition, 21, 162-167.

Craik, F. I., & Lockhart, R. S. (1972). Levels of processing: A framework for memory research. Journal of Verbal Learning & Verbal Behavior, 11, 671-684.

Crowder, R. G. (1993). Short-term memory: Where do we stand? Memory & Cognition, 21, 142-145.

Curran, T., & Keele, S. W. (1993). Attentional and nonattentional forms of sequence learning. Journal of Experimental Psychology: Learning, Memory, & Cognition, 19, 189-202.

DeKeyser, R. (in press). Learning second language grammar rules: An experiment with a miniature linguistic system. Studies in Second Language Acquisition, 17.

Daneman, M., & Carpenter, P. A. (1983). Individual differences in integrating information between and within sentences. Journal of Experimental Psychology: Learning, Memory, & Cognition, 9, 561-583.

Doughty, C. (1991). Second language instruction does make a difference: Evidence from an empirical study of SL relativization. Studies in Second Language Acquisition, 13, 431-470.

Ellis, N. C. (1993). Rules and instances in foreign language learning: Interactions of explicit and implicit knowledge. European Journal of Cognitive Psychology, 5, 289-318.

Ellis, N. C., & Beaton, A. (1993). Psycholinguistic determinants of foreign language vocabulary learning. Language Learning, 43, 559-617.

Ellis, R. (1993). The structural syllabus and second language acquisition. TESOL Quarterly, 27, 91-113.

Ellis, R., Tanaka, Y., & Yamazaki, A. (1994). Classroom interaction, comprehension, and the acquisition of L2 word meaning. Language Learning, 44, 449-491.

Færch, C., & Kasper, G. (1984). Two ways of defining communication strategies. Language Learning, 34(1), 45-63.

Fotos, S. (1993). Consciousness-raising and noticing through focus on form: Grammar task performance versus formal instruction. Applied Linguistics, 14, 385-407.

Fotos, S., & Ellis, R. (1991). Communicating about grammar: A task-based approach. TESOL Quarterly, 25, 87-112.

Gass, S. M., & Varonis, E. M. (1985). Task variation and NNS/NNS negotiation of meaning. In S. M. Gass & C. G. Madden (Eds.), Input in second language acquisition (pp. 149-161). Rowley, MA: Newbury House Publishers.

Geva, E., & Ryan, E. B. (1993). Linguistic and cognitive correlates of academic skills in first and second languages. Language Learning, 43, 5-42.

Gluck, M., & Granger, R. (1993). Computational models of the neural bases of learning and memory. Annual Review of Neuroscience, 16, 667-706.

Graf, P., & Mandler, G. (1984). Activation makes words more accessible, but not necessarily more retrievable. Journal of Verbal Learning & Verbal Behavior, 23, 553-568.

Graf, P., & Ryan, L. (1990). Transfer-appropriate processing for implicit and explicit memory. Journal of Experimental Psychology: Learning, Memory, & Cognition, 16, 978-992.

Green, R. E., & Shanks, D. R. (1993). On the existence of independent explicit and implicit learning: An examination of some evidence. Memory & Cognition, 21, 304-317.

Hamann, S. B. (1990). Level-of-processing effects in conceptually driven implicit tasks. Journal of Experimental Psychology: Learning, Memory, & Cognition, 16, 970-977.

Harrington, M., & Sawyer, M. (1992). L2 working memory capacity and L2 reading skill. Studies in Second Language Acquisition, 14, 25-38.

Hayes, N. A., & Broadbent, D. E. (1988). Two modes of learning for interactive tasks. Cognition, 8, 249-276.

Hebb, D. O. (1949). The organization of behavior: A neuropsychological theory. New York: John Wiley.

Holding, D. H. (Ed.). (1989). Human skills (2nd ed.). New York: John Wiley.

Holender, D. (1986). Semantic activation without conscious identification in dichotic listening, parafoveal vision, and visual masking: A survey and appraisal. Behavioral & Brain Sciences, 9, 1-66.

Hulstijn, J. (1989). Implicit and incidental second language learning: Experiments in the processing of natural and partly artificial input. In H. Dechert (Ed.), Interlingual processing (pp. 50-73). Tübingen: Gunter Narr.

Hulstijn, J. (1990). A comparison between the information-processing and the analysis/control approaches to language learning. Applied Linguistics, 11, 30-53.

Hulstijn, J., & DeGraaff, R. (1994). Under what conditions does explicit knowledge of a second language facilitate the acquisition of implicit knowledge. AILA Review, 11, 97-112.

Jacoby, L. L. (1983). Remembering the data: Analyzing interactive processes in reading. Journal of Verbal Learning & Verbal Behavior, 22, 485-508.

Kadlec, H., & Townsend, J. T. (1992). Signal detection analyses of dimensional interactions. In F. G. Ashby (Ed.), Multidimensional models of perception and categorization (pp. 181-229). Hillsdale, NJ: Lawrence Erlbaum.

Kahneman, D. (1973). Attention and effort. Englewood Cliffs, NJ: Prentice-Hall.

Kohonen, V. (1992). Experiential language learning: Second language learning as cooperative learner education. In D. Nunan (Ed.), Collaborative language learning and teaching (pp. 14-39). Cambridge: Cambridge University Press.

Krashen, S. D. (1981). Second language acquisition and second language learning. Oxford: Pergamon.

Krashen, S. D. (1982). Principles and practice in second language acquisition. Oxford: Pergamon.

Krashen, S. D. (1985). The input hypothesis: Issues and implications. London: Longman.

Krashen, S. D., & Terrell, T. D. (1983). The natural approach: Language acquisition in the classroom. Oxford: Pergamon.

Lado, R. (1965). Memory span as a factor in second language learning. International Review of Applied Linguistics in Language Teaching, 1, 123-130.

Logan, G. D. (1988a). Toward an instance theory of automatization. Psychological Review, 95, 492-527.

Logan, G. D. (1988b). Automaticity, resources, and memory: Theoretical controversies and practical implications. Human Factors, 30, 583-598.

Logan, G. D. (1990). Repetition priming and automaticity: Common underlying mechanisms? Cognitive Psychology, 22, 1-35.

Long, M. H. (1988). Instructed interlanguage development. In L. M. Beebe (Ed.), Issues in second language acquisition: Multiple perspectives (pp. 115-141). New York: Harper & Row.

Long, M. H. (1989). Task, group and task-group interaction. University of Hawaii Working Papers in ESL, 8, 1-26.

Long, M. H. (1991). Focus on form: A design feature in language teaching. In K. de Bot, R. B. Ginsberg, & C. Kramsch (Eds.), Foreign language research in cross-cultural perspective (pp. 39-52). Amsterdam: John Benjamins.

Long, M. H., & Crookes, G. (1992). Three approaches to task-based syllabus design. TESOL Quarterly, 26, 26-56.

Loschky, L., & Bley-Vroman, R. (1993). Grammar and task-basedlearning. In

Page 44: Attention, Memory, and the “Noticing” Hypothesis

326 Language Learning Vol. 45, No. 2

G. Crookes & S. M. Gass (Eds.), Tasks and language 1earning:Zntegrating theory and practice (pp. 123-167). Clevedon, Avon: Multilingual Matters.

Mandler, G. (1992). Toward a theory of consciousness. In H.-G. Geissler, S. W. Link, & J. T. Townsend (Eds.), Cognition, information processing, and psychophysics:Basic issues (pp. 43-67). Hillsdale, NJ: Lawrence Erlbaum.

McClelland, J. L., & Rumelhart, D. E. (1985). Distributed memory and the representation of general and specific information. Journal ofExperimen- tal Psychology: Geneml, 114, 159-188.

McKoon, G., Ratcliff, R., & Dell, G. (1986). A critical evaluation of the semantic-episodic distinction. Journal ofExperimenta1 Psycho1ogy:Learn- ing, Memory, & Cognition, 12,295-306.

McLaughlin, B. (1987). Theories of second-language learning. London: Ed- ward Arnold.

McLaughlin, B., Rossman, T., & McLeod, B. (1983). Second language learn- ing: An information-processing perspective. Language Learning, 33,135- 158.

Merickle, P. M., & Reingold, E. M. (1991). Comparing direct (explicit) and indirect (implicit) measures to study unconscious memory. Journal of Experimental Psychology: Learning, Memory, & Cognition, 17, 224-233.

Moray, N. (1959). Attention in dichotic listening tasks: Affective cues and the influence of instructions. Quarterly Journal of Experimental Psychology, 11, 56-60.

Munsell, P., & Carr, T. H. (1981). Monitoring the monitor: A review of Second Language Acquisition and Second Language Learning by S. D. Krashen. Language Learning, 31, 493-502.

Nation, R., & McLaughlin, B. (1986). Novices and experts: An information-processing approach to the “good language learner” problem. Applied Psycholinguistics, 7, 41-35.

Navon, D. (1984). Resources-A theoretical soup stone? Psychological Review, 91, 216-234.

Navon, D., & Gopher, D. (1980). On the economy of the human-processing system. Psychological Review, 86, 214-255.

Nayak, N., Hansen, N., Kreuger, N., & McLaughlin, B. (1990). Language-learning strategies in monolingual and multilingual adults. Language Learning, 40, 221-244.

Norman, D. A. (1968). Toward a theory of memory and attention. Psychological Review, 84, 522-536.

Nunan, D. (1988). Syllabus design. Oxford: Oxford University Press.

Nunan, D. (1989). Designing tasks for the communicative classroom. Cambridge: Cambridge University Press.

O’Malley, J. M., & Chamot, A. U. (1990). Learning strategies in second language acquisition. Cambridge: Cambridge University Press.

Papagno, C., Valentine, T., & Baddeley, A. D. (1991). Phonological short-term memory and foreign-language vocabulary learning. Journal of Memory & Language, 30, 331-347.

Paradis, M. (1994). Neurolinguistic aspects of implicit and explicit memory: Implications for bilingualism and SLA. In N. C. Ellis (Ed.), Implicit and explicit learning of languages (pp. 393-419). London: Academic Press.

Parkin, A. J., & Streete, S. (1988). Implicit and explicit memory in young children and adults. British Journal of Psychology, 79, 361-369.

Perruchet, P., & Pacteau, C. (1990). Synthetic grammar learning: Implicit rule abstraction or fragmentary knowledge? Journal of Experimental Psychology: General, 119, 264-275.

Pica, T., Doughty, C., & Young, R. (1987). The impact of interaction on comprehension. TESOL Quarterly, 21, 737-758.

Pica, T., Kanagy, R., & Falodun, J. (1993). Choosing and using communication tasks for second language instruction. In G. Crookes & S. M. Gass (Eds.), Tasks and language learning: Integrating theory and practice (pp. 9-34). Clevedon, Avon: Multilingual Matters.

Prabhu, N. S. (1987). Second language pedagogy. Oxford: Oxford University Press.

Raaijmakers, J. G. W., & Shiffrin, R. M. (1992). Models for recall and recognition. Annual Review of Psychology, 43, 205-234.

Reber, A. S. (1989). Implicit learning and tacit knowledge. Journal of Experimental Psychology: General, 118, 219-235.

Reber, A. S. (1993). Implicit learning and tacit knowledge: An essay on the cognitive unconscious. Oxford: Clarendon Press.

Reber, A. S., Walkenfield, F., & Hernstadt, F. (1991). Implicit and explicit learning: Individual differences and IQ. Journal of Experimental Psychology: Learning, Memory, & Cognition, 17, 888-896.

Richardson-Klavehn, A., & Bjork, R. (1988). Measures of memory. Annual Review of Psychology, 39, 475-543.

Robbins, S. (1992). A neurobiological model of procedural skill acquisition. Issues in Applied Linguistics, 3, 235-265.

Robinson, P. (1989). Procedural vocabulary and language learning. Journal of Pragmatics, 13, 523-546.

Robinson, P. (1993). Problems of knowledge and the implicit/explicit distinction in SLA theory construction. University of Hawaii Working Papers in ESL, 12, 99-139.

Robinson, P. (1994). Learning simple and complex second language rules under implicit, incidental, rule-search and instructed conditions. Unpublished doctoral dissertation, University of Hawaii at Manoa, Honolulu.

Robinson, P. (1995a). Task complexity and second language narrative discourse. Language Learning, 45, 99-140.

Robinson, P. (1995b). Implicit and explicit memory for second language vocabulary. Unpublished manuscript, University of Queensland, Brisbane, Australia.

Robinson, P. (in press). Aptitude, awareness and the fundamental similarity of implicit and explicit second language learning. In R. Schmidt (Ed.), Attention and awareness in foreign language learning and teaching (Tech. Rep. No. 9). Honolulu: University of Hawaii at Manoa, Second Language Teaching and Curriculum Center.

Robinson, P. (Ed.). (to appear). Task complexity and second language syllabus design: Data-based studies and speculations [Special issue]. University of Queensland Working Papers in Applied Linguistics.

Robinson, P., & Ha, M. (1993). Instance theory and second language rule learning under explicit conditions. Studies in Second Language Acquisition, 15, 413-439.

Robinson, P., Ting, S., & Unwin, J. (in press). Investigating second language task complexity. RELC Journal, 26.

Roediger, H. L., & Blaxton, T. A. (1987). Effects of varying modality, surface features, and retention interval on priming in word-fragment completion. Memory & Cognition, 15, 379-388.

Roediger, H. L., Weldon, M. S., & Challis, B. H. (1989). Explaining dissociations between implicit and explicit measures of retention: A processing account. In H. L. Roediger III & F. I. M. Craik (Eds.), Varieties of memory and consciousness: Essays in honour of Endel Tulving (pp. 3-41). Hillsdale, NJ: Lawrence Erlbaum.

Rutherford, W. E. (1987). Second language grammar: Learning and teaching. London: Longman.

Sasaki, M. (1993). Relationships among second language proficiency, foreign language aptitude, and intelligence: A structural equation modeling approach. Language Learning, 43, 313-344.

Sato, E., & Jacobs, B. (1992). From input to intake: Towards a brain-based perspective on selective attention. Issues in Applied Linguistics, 3, 267-292.

Schacter, D. L. (1987). Implicit memory: History and current status. Journal of Experimental Psychology: Learning, Memory, & Cognition, 13, 501-518.

Schacter, D. L. (1989). On the relation between memory and consciousness: Dissociable interactions and conscious experience. In H. L. Roediger III & F. I. M. Craik (Eds.), Varieties of memory and consciousness: Essays in honour of Endel Tulving (pp. 355-389). Hillsdale, NJ: Lawrence Erlbaum.

Schacter, D. L., & Moscovitch, M. (1984). Infants, amnesics and dissociable memory systems. In M. Moscovitch (Ed.), Infant memory (pp. 173-216). New York: Plenum Press.

Schacter, D. L., & Tulving, E. (1982). Memory, amnesia and the episodic/semantic distinction. In R. L. Isaacson & N. E. Spear (Eds.), The expression of knowledge (pp. 33-65). New York: Plenum Press.

Schacter, D. L., Chiu, C.-Y. P., & Ochsner, K. N. (1993). Implicit memory: A selective review. Annual Review of Neuroscience, 16, 159-182.

Schmidt, R. (1990). The role of consciousness in second language learning. Applied Linguistics, 11, 129-158.

Schmidt, R. (1992). Psychological mechanisms underlying second language fluency. Studies in Second Language Acquisition, 14, 357-386.

Schmidt, R. (1993a). Awareness and second language acquisition. Annual Review of Applied Linguistics, 13, 206-226.

Schmidt, R. (1993b). Consciousness, learning and interlanguage pragmatics. In G. Kasper & S. Blum-Kulka (Eds.), Interlanguage pragmatics (pp. 21-42). New York: Oxford University Press.

Schmidt, R. (1994a). Deconstructing consciousness in search of useful definitions for applied linguistics. AILA Review, 11, 11-26.

Schmidt, R. (1994b). Implicit learning and the cognitive unconscious: Of artificial grammars and SLA. In N. C. Ellis (Ed.), Implicit and explicit learning of languages (pp. 165-209). London: Academic Press.

Schmidt, R. (Ed.). (in press). Attention and awareness in foreign language learning and teaching (Tech. Rep. No. 9). Honolulu: University of Hawaii at Manoa, Second Language Teaching and Curriculum Center.

Schmidt, R., & Frota, S. (1986). Developing basic conversational ability in a second language: A case study of an adult learner of Portuguese. In R. R. Day (Ed.), Talking to learn: Conversation in second language acquisition (pp. 237-326). Rowley, MA: Newbury House.

Schneider, W. (1993). Varieties of working memory as seen in biology and in connectionist/control architectures. Memory & Cognition, 21, 184-192.

Schneider, W., & Detweiler, M. (1988). The role of practice in dual-task performance: Toward workload modeling in a connectionist-control architecture. Human Factors, 30, 539-566.

Scovel, T. (1979). Review of Suggestology and outlines of Suggestopedy by Georgi Lozanov. TESOL Quarterly, 13, 255-266.

Shanks, D. R., & St. John, M. F. (1994). Characteristics of dissociable human systems. Behavioral and Brain Sciences, 17, 367-447.

Sharwood Smith, M. (1991). Speaking to many minds: On the relevance of different types of language information for the L2 learner. Second Language Research, 7, 118-132.

Sharwood Smith, M. (1993). Input enhancement in instructed SLA. Studies in Second Language Acquisition, 15, 165-179.

Shiffrin, R. M. (1993). Short-term memory: A brief commentary. Memory & Cognition, 21, 193-197.

Shiffrin, R. M., & Schneider, W. (1977). Controlled and automatic human information processing: II. Perceptual learning, automatic attending and a general theory. Psychological Review, 84, 127-190.

Skehan, P. (1989). Individual differences in second language learning. London: Edward Arnold.

Squire, L. R. (1992). Declarative and nondeclarative memory: Multiple brain systems supporting learning and memory. Journal of Cognitive Neuroscience, 4, 232-243.

Squire, L. R., & Cohen, N. (1984). Human memory and amnesia. In G. Lynch, J. L. McGaugh, & N. M. Weinberger (Eds.), Neurobiology of learning and memory (pp. 3-64). New York: Guilford Press.

Squire, L. R., & Zola-Morgan, S. (1991). The medial temporal lobe memory system. Science, 253, 1380-1386.

Squire, L. R., Amaral, D., & Press, G. (1990). Magnetic resonance imaging measurements of hippocampal formation and mammillary nuclei distinguish medial temporal lobe and diencephalic amnesia. Journal of Neuroscience, 10, 3106-3117.

Stern, L. (1985). The structures and strategies of human memory. Homewood, IL: Dorsey Press.

Stevick, E. (1976). Memory, meaning & method: Some psychological perspectives on language learning. Rowley, MA: Newbury House.

Taylor, M. M., Lindsay, P. H., & Forbes, S. M. (1967). Quantification of shared capacity processing in auditory and visual discrimination. Acta Psychologica, 27, 223-229.

Thompson, R. F. (1986). The neurobiology of learning and memory. Science, 233, 941-947.

Tomlin, R., & Villa, V. (1994). Attention in cognitive science and SLA. Studies in Second Language Acquisition, 16, 185-204.

Treisman, A. M. (1964). Verbal cues, language, and meaning in selective attention. American Journal of Psychology, 77, 533-546.

Tulving, E. (1984). Precis of Elements of episodic memory. Behavioral and Brain Sciences, 7, 223-238.

Tulving, E. (1985). How many memory systems are there? American Psychologist, 4, 385-398.

Tulving, E. (1986). What kind of a hypothesis is the distinction between episodic and semantic memory? Journal of Experimental Psychology: Learning, Memory, & Cognition, 12, 307-311.

Tulving, E., Hayman, C., & MacDonald, C. (1991). Long-lasting perceptual priming and semantic learning in amnesia: A case experiment. Journal of Experimental Psychology: Learning, Memory, & Cognition, 17, 595-617.

Tulving, E., Schacter, D. L., & Stark, H. (1982). Priming effects in word-fragment completion are independent of recognition memory. Journal of Experimental Psychology: Learning, Memory, & Cognition, 8, 336-342.

Ur, P. (1988). Grammar practice activities: A guide for teachers. Cambridge: Cambridge University Press.

VanPatten, B. (1990). Attending to form and content in the input. Studies in Second Language Acquisition, 12, 287-301.

Vokey, J., & Brooks, L. (1992). Salience of item knowledge in learning artificial grammars. Journal of Experimental Psychology: Learning, Memory, & Cognition, 18, 320-344.

Watanabe, L. (1980). Selective listening and attention. Japanese Psychological Review, 23, 335-354.

Watanabe, Y. (1992). Effects of increased processing on incidental learning of foreign language vocabulary. Unpublished master’s thesis, University of Hawaii at Manoa, Honolulu.

Wickens, C. D. (1980). The structure of attentional resources. In R. S. Nickerson (Ed.), Attention and performance VIII (pp. 239-257). Hillsdale, NJ: Lawrence Erlbaum.

Wickens, C. D. (1984). Processing resources in attention. In R. Parasuraman & D. Davies (Eds.), Varieties of attention (pp. 63-102). New York: Academic Press.

Wickens, C. D. (1989). Attention and skilled performance. In D. H. Holding (Ed.), Human skills (2nd ed., pp. 71-105). New York: John Wiley.

Zalewski, J. P. (1993). Number/person errors in an information-processing perspective: Implications for form-focused instruction. TESOL Quarterly, 27, 691-704.

Zobl, H. (1992). Sources of linguistic knowledge and uniformity of nonnative performance. Studies in Second Language Acquisition, 14, 387-403.

