Visual binding of English and Chinese word parts is limited to low temporal frequencies

Perception, 2007, volume 36, pages 49–74

Alex O Holcombe
School of Psychology, Tower Building, Park Place, Cardiff University, Cardiff CF10 3AT, Wales, UK; and Department of Psychology, University of California at San Diego, La Jolla, CA 92093, USA; e-mail: [email protected]

Jeff Judson
Department of Psychology, University of California at San Diego, La Jolla, CA 92093, USA

Received 19 August 2005, in revised form 26 January 2006; published online 5 January 2007

Abstract. Some perceptual mechanisms manifest high temporal precision, allowing reports of visual information even when that information is restricted to windows smaller than 50 ms. Other visual judgments are limited to much coarser time scales. What about visual information extracted at late processing stages, for which we nonetheless have perceptual expertise, such as words? Here, the temporal limits on binding together visual word parts were investigated. In one trial, either the word ‘ball’ was alternated with ‘deck’, or ‘dell’ was alternated with ‘back’, with all stimuli presented at fixation. These stimuli restrict the time scale of the word identities because the two sets of alternating words form the same image at high alternation frequencies. Observers made a forced choice between the two alternatives. Resulting 75% thresholds are restricted to 5 Hz or less for words and nonword letter strings. A similar result was obtained in an analogous experiment with Chinese participants viewing alternating Chinese characters. These results support the theory that explicit perceptual access to visual information extracted at late stages is limited to coarse time scales.

DOI:10.1068/p5582

1 Introduction

Physiological recordings show that initial coding in the visual system provides reliable information at fine temporal scales, with neurons in primary visual cortex following flicker frequencies well over 60 Hz (Gur and Snodderly 1997). Frequencies much higher than this are completely lost to perception, but still an impressive degree of precision is preserved in some cases. For example, when viewing a periodic pattern of bars (a sinusoidal grating) moving within a window at 20 Hz, humans can perceive its direction of motion (Burr and Ross 1982) and depth (Morgan and Castet 1995). This occurs despite the fact that for 20 Hz gratings, temporal integration over 50 ms or more yields a uniform field or bars at every position, totally obliterating the motion and depth information. The fact that we nonetheless perceive motion and depth at these rates implies that the mechanisms underlying our perception of motion and depth operate at a time scale better than 50 ms.
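The integration argument for 20 Hz gratings can be checked numerically. The sketch below is illustrative only: it assumes a unit-contrast sinusoidal luminance profile and a simple boxcar (uniform) temporal average, neither of which is specified in the text, and the function name `residual_modulation` is hypothetical.

```python
import math

def residual_modulation(temporal_hz, window_s, n_positions=64, n_steps=2000):
    """Peak-to-trough modulation left after boxcar-averaging a drifting grating.

    Assumed luminance profile: sin(spatial_phase - 2*pi*temporal_hz*t).
    """
    averaged = []
    for i in range(n_positions):
        spatial_phase = 2 * math.pi * i / n_positions
        total = 0.0
        for k in range(n_steps):
            t = window_s * (k + 0.5) / n_steps  # midpoint sampling of the window
            total += math.sin(spatial_phase - 2 * math.pi * temporal_hz * t)
        averaged.append(total / n_steps)
    return max(averaged) - min(averaged)

full_cycle = residual_modulation(20, 0.050)    # window spans one full 20 Hz cycle
short_window = residual_modulation(20, 0.010)  # window spans one fifth of a cycle
print(f"50 ms window: {full_cycle:.6f}")
print(f"10 ms window: {short_window:.3f}")
```

Under these assumptions the 50 ms average is flat to numerical precision at every position, whereas a 10 ms window retains most of the original modulation, which is the sense in which perceiving motion at 20 Hz implies a mechanism operating on a finer time scale.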

This high temporal precision is not restricted to perception of individual features: the pairing of colour with orientation can also be perceived with over 20 Hz stimulus alternation rates (Holcombe and Cavanagh 2001). In that experiment, participants viewed a display rapidly alternating between a red leftward-tilted pattern and a green rightward-tilted pattern. In other trials, the feature pairing was reversed, with green leftward tilt alternating with red rightward tilt. If the representation of the individual features at the point of binding had a precision worse than 50 ms, or if the binding process integrated over a period longer than 50 ms, then participants would not have been able to distinguish the feature pairings of these stimuli. The fast threshold found imposes strong constraints on the neural computations underlying the binding process, as do more recent studies of the binding of distributed shape elements (Clifford et al 2004). The results of these latter studies reveal that scattered dot pairs are also bound together with high, 20 Hz, temporal precision when they form certain regular global patterns such as spirals and concentric circles (Clifford et al 2004). Hence, the visual system is capable of combining spatially disparate features into global form with a precision better than 50 ms. Mechanisms for symmetry perception may have similarly impressive temporal resolution (Tyler et al 1995).

Interestingly, aside from the cases when disparate elements form certain global patterns, the pairing of disparate elements generally appears to be limited to low alternation rates. Recall that colour and orientation can be bound at a temporal precision better than 20 Hz when the features are spatially superposed. When these same features are spatially separated, their pairing can only be perceived at about 3 Hz or less (Holcombe and Cavanagh 2001; demonstration available from http://viperlib.com or author AOH's website), suggesting temporal precision worse than 100 ms.

The contrast between some visual binding judgments that reflect a precision better than 20 Hz and others limited to several hertz or worse sharpens some important issues in temporal processing. In particular, at high temporal frequencies, what prevents perceptual binding for some features but not others?

The perceptual experience of high-speed visual binding may hold a clue. Consider the experience of two red and green perpendicular gratings alternating at slow rates. One experiences first one pattern, then the other, and so on. However, when the two gratings are alternated at a rate faster than about 6–8 Hz, they no longer seem to be experienced as individuals. Instead, although one is aware that the stimuli are flickering, the gratings seem to be experienced simultaneously, as if they were both continuously available to cognition (Holcombe 2001; Holcombe and Cavanagh 2001; demonstrations may be viewed at http://viperlib.org and AOH's website). This ‘temporal transparency’ phenomenon (Holcombe 2001) also occurs with rapid alternation of Glass dot patterns (Clifford et al 2004). Temporal transparency is accompanied by a loss of functional access to the two stimuli as individuals (Holcombe 2001; Clifford et al 2004).

These phenomena suggest that, by the time visual signals reach awareness, they have been combined over long intervals, preventing explicit judgments of fine time scales. However, even when visual awareness does not seem to follow rapid alternations (Holcombe 2001), not all high-frequency information is lost. Certain aspects of the visual world, such as the identity of certain global forms and the pairing of local colour and orientation, are extracted before awareness by mechanisms with high temporal resolution. Those aspects that are not extracted early on, such as the pairing of spatially disparate colour and orientation, apparently cannot be perceived at fast alternation rates. The notion of a substantial loss in temporal resolution prior to cognitive stages suggests that visual representations from a range of times are combined, as if squeezed together through a bottleneck, to arrive as one to cognition and consciousness. To fully appreciate this theory, it is important to understand why low temporal resolution at cognition and consciousness nevertheless does not preclude fine-time-scale access to all aspects of the visual world. In the case of global form, the global-form mechanisms may extract the presence of a left spiral and a right spiral at fine time scales, effectively demodulating this aspect of the stimuli by sending constant labels on to higher levels, signaling the presence of a left spiral and a right spiral. But in perceptual experience, these signals have been combined, yielding the awareness of the presence of a left spiral and a right spiral, but losing the representation of which occurred when. But if the global-form mechanism did not have high temporal resolution, the observer could not perceive the spirals at all, for the dots from the successive patterns would be combined before the global-form mechanism operated (Clifford et al 2004).

With visual information apparently integrated over a long interval for explicit perception, an obvious question is where in the visual system this integration occurs. As well as being a fundamental aspect of the architecture of the visual system, the answer would help to identify the neural correlates of visual awareness. The evidence so far points to a site somewhere between the extrastriate areas responsible for global-form extraction and the unknown areas that putatively correspond to the visual awareness of successive events. This follows the assumptions of the popular quest for the neural correlates of consciousness (eg Crick and Koch 1995, 2003). But even if it is a mistake to speak of brain areas corresponding to visual awareness, understanding the bottleneck will nonetheless put informative constraints on future, perhaps more sophisticated, theories of visual awareness. In this paper, we seek to further localise the aspects of perception for which substantial loss of resolution occurs by measuring the visual time scale at which humans can access the identity of letter strings, English words, and Chinese characters.

Linguistic material was chosen for three reasons. First, visual words and letter strings are encoded very late in the visual processing hierarchy, apparently in the temporal lobe (McCandliss et al 2003). If late visual mechanisms are generally limited to coarse time scales, word perception should certainly show this limitation. A second reason for choosing words is that, if any high-level visual judgment were to occur at fine time scales, the recognition of words should be a top candidate. Words are processed automatically by adult human readers (Stroop 1935), which is no doubt related to the decades of daily experience most adults have with reading. This has further led to some unitisation in the processing of words (Tao et al 1997). A final advantage of words is that the way they are processed in normal reading has some resemblance to the way they are presented in the laboratory experiment. Indeed, most theories of reading suggest that only one or two words are processed per fixation in the rapid series of fixations made during normal reading (Brysbaert and Vitu 1998; Inhoff et al 2000; Rayner et al 2004). Potter and her colleagues have reported numerous experiments with rapid presentation of words. They find that even at presentation rates of 12 words per second, participants can recall most of the words when together they form a sentence (Potter 1993), which might suggest high temporal resolution. However, these experiments were not designed to isolate the duration of processing of individual words. For example, in these rapid serial-presentation experiments, temporal integration of a word with the preceding and following words does not completely obliterate the cues to the word's identity.

From the present point of view, the masked priming literature (Kinoshita and Lupker 2003) suffers from a limitation similar to that of the rapid serial-presentation work. The finding of semantic priming from very briefly presented words does suggest that word binding and processing are extremely rapid. However, such results do not provide good constraints on the temporal precision of the binding of letters into a word. In such experiments, typically a word is presented for a few dozen milliseconds before a patterned post-mask is presented. In this paradigm, considerable imprecision in the relative processing time of individual letters and binding of the letters together might still yield semantic activation. For example, even if letter units are activated at substantially different times, there might still be no ambiguity about which letters are to be bound together, since no other letters are presented. Hence, the binding mechanism could potentially integrate over a much longer period than the actual presentation time of the word.

Temporal limits on word binding have not been previously investigated, but in a number of studies binding errors have been revealed by brief, simultaneous presentation of multiple words or letter strings. Mozer (1983) presented pairs of four-letter words for a few hundred milliseconds and followed them with a post-mask. Letter migration errors sometimes resulted. For example, when ‘line’ and ‘lace’ were presented, occasionally participants reported seeing ‘lice’ or ‘lane’. These errors occur more frequently than would be expected by guessing (McClelland and Mozer 1986; Treisman and Souther 1986).


Harris and her colleagues showed that rapid presentation of letter strings can lead to repetition blindness for letter strings, words, and parts of words (Harris and Morris 2001). The patterns of letter migrations and repetition blindness that occur in these conditions are used for deciding among the various proposed models of how letter strings are represented and how reading is accomplished (Davis and Bowers 2004).

Our main interest, the relationship of binding limits for linguistic material to those for other visual materials, does not seem to have been directly investigated by anyone. However, the reverse-hierarchy theory of perception by Hochstein and Ahissar (2002) may appear relevant. It suggests that, when we view a scene, our initial conscious percept reflects high-level visual processing, such as the identity of faces and words, whereas perception of visual details, such as individual orientations, is slower and requires feedback from higher levels of the cortex. This reverse-hierarchy theory might be taken to imply that visual characteristics extracted at later cortical stages, such as the identity of a word, are processed more rapidly and have higher temporal resolution than those extracted at earlier stages.

However, the results for which Hochstein and Ahissar's theory was concocted do not directly address this issue. The theory is largely based on patterns of response times in visual-search experiments, which indicate that search for high-level properties, such as object category, can be faster than search for more basic features. The present interest is in the processing of an individual item rather than the apparently simultaneous processing of the many items of a search array. Furthermore, in visual search, the temporal precision of processing is unlikely to be the most important factor determining reaction time (RT). Search RTs are critically dependent on factors such as the heterogeneity of items in the display (Duncan and Humphreys 1989), and whether the items are processed in parallel (Egeth et al 1972). Hence, the results of visual-search experiments may reveal the characteristics of initial perception when viewing a cluttered scene without indicating the temporal resolution of the processing of an individual attended item.

Thorpe and his colleagues (Thorpe et al 1996; Fabre-Thorpe et al 2001; VanRullen and Thorpe 2001; Rousselet et al 2002) have measured the speed of high-level visual discrimination using evoked potential latencies. In one experiment, evoked potentials recorded at the scalp discriminated scenes containing an animal from those that did not. These potentials had latencies as short as 150 ms, suggesting fine temporal resolution for scene and animal processing. However, follow-up work found that the latency of the discriminating signal is quite variable, ranging from 150 ms to 300 ms (Johnson and Olshausen 2003). Thus, processing of the various features involved in these visual discriminations may yet be temporally imprecise, despite the impressive overall performance.

To address the temporal precision of high-level visual processing more directly, we use a task in which the participants must bind together linguistic features presented at the same time: either the two halves of a four-letter string or, in the Chinese experiment, two halves of a Chinese character. If word or letter-string binding mechanisms have temporal properties similar to those for motion, global form, spatially superposed colour and orientation, flicker, or stereopsis (Morgan and Castet 1995), thresholds should be better than 15 Hz. On the other hand, if the task relies on slower, perhaps more central mechanisms, with temporal characteristics like those of binding spatially separated colour and orientation features, we can expect that accurate performance will occur only for slow rates, less than 8 Hz. As it turns out, threshold rates for binding the linguistic materials do fall into this latter slow category. Different stimulus conditions were used for this investigation: Chinese characters, words, pseudowords, and mostly unpronounceable nonwords. Although most studies that use varying linguistic conditions such as these are designed to isolate the effect of various linguistic factors on small differences in performance (eg Murray and Forster 2004), this study is an exception.


Here, the interest was not in small differences in performance, but rather only in any dramatic differences in temporal threshold: essentially, whether each of the stimulus categories yielded slow (<8 Hz) or fast (>15 Hz) thresholds. Revealing smaller differences due to particular linguistic factors would require a much larger set of stimuli, carefully balanced in a way that would be very difficult given the constraints imposed by the paradigm (described below). As it turns out, a slow threshold is found in each condition. Given this, the use of the varying conditions provides confidence that the slow threshold is not an accident of the particular items chosen, as would be a worry if only one condition were used.

Figure 1 schematises an example trial in which the word ‘ball’, presented at fixation, is alternated with the word ‘deck’, also presented at fixation. In the experiment, this stimulus train is paired with another in which ‘dell’ is alternated with ‘back’. The utility of using these particular words is seen when alternating the two sets of words at very high rates. At rates exceeding the temporal resolution of the visual system, observers will perceive the sum, which is identical for the two pairs (figure 1).
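The identity of the two sums can be illustrated with a few lines of code. This is a minimal sketch rather than the authors' stimulus software: it abstracts the fused image as the set of letters occupying each position, and the helper names `swap_second_halves` and `summed_image` are hypothetical.

```python
def swap_second_halves(word_a, word_b):
    """Exchange the last two letters of two four-letter strings."""
    return word_a[:2] + word_b[2:], word_b[:2] + word_a[2:]

def summed_image(word_a, word_b):
    """Letters present at each position when two strings are fused over time."""
    return [frozenset(pair) for pair in zip(word_a, word_b)]

pair_1 = ("ball", "deck")
pair_2 = swap_second_halves(*pair_1)  # the alternative pair for the same trial

print(pair_2)
print(summed_image(*pair_1) == summed_image(*pair_2))
```

Because the two pairs differ only in which first half is shown together with which second half, any mechanism that pools both frames receives the same input from either pair, and only a mechanism that keeps the frames apart can tell them apart.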

Thanks to the identical sums of the two stimuli, this presentation method systematically limits the binding information to a particular temporal interval. Varying the alternation rate then varies this temporal interval, allowing an estimate of the resolution of the system. The method exploits the fact that all mechanisms, including those underlying word recognition, have a non-zero interval over which they integrate or, loosely, a temporal resolution. In other words, when two inputs to a mechanism are alternated at a sufficiently high rate, a mechanism will passively integrate them as if they were presented at the same time.

Figure 1 (panel labels: Alternating stimuli A and B; Sum; Two-alternative forced choice). A schematic representation of the possible stimuli for a trial in the English words experiment and the Chinese character experiment. The top half of the figure corresponds to a trial in which the participant was informed that either ‘ball’ and ‘deck’ would be presented in rapid alternation at the fixation point, or ‘dell’ and ‘back’ would alternate. Because the sums of these word pairs are the same, participants could only discriminate between the two alternatives when the presentation rate was slow enough for word perception mechanisms to individuate the two temporal intervals.

In the experimental trial corresponding to the stimuli shown in figure 1, observers are presented with one of the alternating pairs and asked to determine which was presented. If the presentation rate is such that both words are presented within the interval that the word recognition system averages over, then performance will be at chance. By varying the temporal frequency (presentation rate) of the stimuli across trials, the temporal resolution can be estimated. To compare the results to those with methods used previously in the literature, temporal thresholds were also measured with a single-presentation masked exposure.
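To make concrete how a temporal threshold might be read off accuracy measured at several alternation rates, here is a small sketch. It assumes simple linear interpolation to a 75%-correct criterion; this part of the paper does not describe the estimation procedure, so the method, the function name `threshold_hz`, and the numbers are all illustrative assumptions rather than the authors' analysis or data.

```python
def threshold_hz(freqs_hz, proportion_correct, criterion=0.75):
    """Frequency at which accuracy falls through the criterion (linear interpolation).

    Returns None when accuracy never crosses the criterion from above.
    """
    points = sorted(zip(freqs_hz, proportion_correct))
    for (f_lo, p_lo), (f_hi, p_hi) in zip(points, points[1:]):
        if p_lo >= criterion > p_hi:
            fraction = (p_lo - criterion) / (p_lo - p_hi)
            return f_lo + fraction * (f_hi - f_lo)
    return None

# Made-up numbers for illustration only; these are NOT data from the study.
toy_freqs = [2.0, 4.0, 6.0, 8.0]
toy_accuracy = [0.95, 0.85, 0.65, 0.50]
estimate = threshold_hz(toy_freqs, toy_accuracy)
print(estimate)
```

With chance at 50% in a two-alternative task, the criterion sits halfway between chance and perfect performance, so the estimate marks the fastest rate at which the pairing is still reliably reported.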

The word ‘binding’ is used to refer to many different things in psychophysics and neuroscience. In this paper, by binding we mean the ability to report the temporal pairing of two aspects of a visual stimulus. The results of our binding experiments are consistent with the idea that by the time visual signals reach awareness, they have been combined over an interval of the order of 100 ms. This notion is related to the ‘psychological moment’ hypothesis, first advanced over one hundred years ago (von Baer 1864). Historically, a variety of evidence has been marshaled in support of this somewhat vague idea that psychological processes operate on about a 100 ms time scale (Stroud 1956). A limitation of the previous evidence is that the temporal resolution of different visual judgments was not compared, and as seen here, comparing these may be quite informative for understanding visual mechanisms. The present work does not address the psychological-moment theory claim that processing occurs in discrete episodes (Geissler et al 1999; VanRullen and Koch 2003; but see Kline et al 2004, 2006). Instead, the issue here is the temporal precision of binding word parts, with the question set aside for now whether these limits reflect continuous averaging or discrete quantised processing.

Simplified movies of the stimuli give a sense of what it was like to be in the experiment and can be viewed at AOH's web page, currently http://www.psych.usyd.edu.au/staff/alexh.

2 Methods

2.1 Participants

For the experiments with English words, sixteen participants were recruited from local laboratories and undergraduate psychology classes at the University of California at San Diego. All spoke English as their first language and were paid by the hour. Six Chinese participants were recruited from campus Chinese associations and all were adults under the age of 35 years, raised in mainland China, and all reported fluency with the type of Chinese characters used in the study.

2.2 English language, repetitive-presentation experiments

In these experiments, participants were shown the stimuli at a variety of alternation rates and made a forced choice between two alternatives. At the beginning of the trial, two pairs of four-letter strings were presented. After studying the two pairs of strings for as long as they wished, the observers fixated on the central dot and hit a key to initiate the alternating presentation of one of the pairs of strings that they had just studied.

During the alternating stimulus, observers were to neither blink nor move their eyes, and their right eye was monitored as described in section 2.5. Immediately after the alternation ended, the two pairs of letter strings were again arrayed on the screen and observers hit a key to indicate which had been presented. Feedback indicated whether a correct response had been made.


The lighting in the room was dim, and during the trial the background of the screen was a uniform 84 cd m⁻². The observers viewed the screen from a chin-rest placed 68 cm away. The letter strings appeared in a black (about 1 cd m⁻²) rectangular region 85 pixels wide (2.41 deg) by 27 pixels high (0.76 deg). The letters were drawn in lower-case Courier font with the letter ‘l’ 20 pixels (0.57 deg) high. Each letter was drawn 22 pixels (0.62 deg) to the right of the previous letter.
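The pixel-to-degree conversions can be cross-checked from the 68 cm viewing distance. The sketch below makes one assumption that is not in the text: the pixel pitch, which is inferred here from the reported 85 pixels = 2.41 deg. The names `visual_angle_deg` and `pixel_cm` are hypothetical.

```python
import math

VIEWING_DISTANCE_CM = 68.0  # chin-rest distance given in the text

def visual_angle_deg(size_cm, distance_cm=VIEWING_DISTANCE_CM):
    """Visual angle subtended by an extent centred on the line of sight."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))

# Assumption: pixel pitch inferred from the reported 85 px = 2.41 deg.
pixel_cm = 2 * VIEWING_DISTANCE_CM * math.tan(math.radians(2.41 / 2)) / 85

for pixels, reported_deg in [(27, 0.76), (20, 0.57), (22, 0.62)]:
    computed = visual_angle_deg(pixels * pixel_cm)
    print(pixels, round(computed, 3), reported_deg)
```

Under that inferred pitch, the other three reported angles are reproduced to within about 0.01 deg, so the stated dimensions are mutually consistent apart from rounding.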

The CRT monitor was set to a refresh rate of 85 Hz, and the stimulus began with the presentation of the ‘XXXX’ pre-mask at 84 cd m⁻² for two frames (23 ms). The alternating letter strings followed. Each trial had a particular slowest temporal frequency, the target rate. The goal of the experiment was to determine how narrow an interval between successive words would still allow observers to accurately determine the pairing of the letters, with a 75%-correct criterion. To achieve this, however, the stimuli could not be simply alternated at different frequencies. Care must be taken to prevent observers from picking off a single stimulus from the beginning or the end of the stimulus train. Consider that the first stimulus in a train is not pre-masked in the same way as the subsequent items and the last stimulus is not post-masked in the same way as the previous items. This means that the observers' integration window would at certain times include only a single word, obscuring any effect of alternation rate. Indeed, previous results purporting to reveal high-frequency grouping (Usher and Donnelly 1998) have since been explained by lack of adequate pre-masking (Beaudot 2002; Dakin and Bex 2002). The transient, high-resolution mechanism engaged when pre-masking is inadequate (Beaudot 2002) may even extend to more than just the first item (Holcombe et al 2001). This transient mechanism can obscure the effect of temporal frequency on performance, by allowing even temporal differences finer than the integration interval of the visual system to affect performance (Beaudot 2002). Use of this transient mechanism can be excluded by beginning the stimulus with a period in which the stimuli are much too rapid and at too low a contrast to perceive the feature pairing, and then gradually increasing the contrast and stimulus exposure duration to the target rate (Holcombe and Cavanagh 2001; Holcombe et al 2001; Dakin and Bex 2002; Clifford et al 2003, 2004). Hence, there were competing constraints on the manner of the stimulus presentation: the desire for a smooth increase in stimulus duration at the beginning and decrease at the end, the desire for approximately equal total stimulus train durations across target rates, and the goal of at least two successive stimuli exposed at the target rate, all while keeping total trial length short enough that participants could fixate well throughout. In practice, pilot data from the authors and a few other observers indicated that total stimulus train duration was less important than having a smooth ramp, so the procedure was designed accordingly.

In these repetitive-presentation conditions the stimuli were always presented for two frames (23 ms), but the black (<0.3 cd m⁻²) interstimulus interval (ISI) varied to achieve various temporal frequencies. It is appropriate to think in terms of the stimulus onset asynchrony (SOA), which is the two-frame stimulus duration plus the ISI. This corresponds to the amount of time the stimulus was available given temporal integration. The results of pilot experiments yielded confidence that the relative proportion of exposure duration versus ISI in the SOA was not critical. Because the presentation of two successive stimuli forms a cycle, one period of the temporal frequency corresponds to the duration of two letter-string exposures plus their ISIs. As described in detail in the next paragraph, within a given trial, the alternation of the letter strings began at a very high rate and low luminance and gradually increased in luminance while slowing to the critical rate for that trial. After a few presentations at the critical rate, the presentation rate gradually increased and the luminance diminished until the trial ended with a 1-frame ISI terminated by an ‘XXXX’ post-mask for 2 frames. This ensured that observers could not make the discrimination based on the beginning or end of the stimulus train, when the stimulus would not be fully pre- and post-masked. In other words, without this procedure the presentations at the beginning and end would be available at lower than the intended temporal frequency. A detailed description of the initial ramp-up and terminating ramp-down of ISI and luminance for each target temporal frequency is given below.
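The relation between ISI, SOA, and alternation frequency described above can be written out directly. This is a minimal sketch using the 85 Hz refresh rate and two-frame exposure from the text; the function names are hypothetical and the ISI values in the loop are simply the ones used as examples in the methods.

```python
REFRESH_HZ = 85       # CRT refresh rate given in the text
STIMULUS_FRAMES = 2   # each letter string was shown for two frames

def soa_ms(isi_frames):
    """Stimulus onset asynchrony: exposure duration plus ISI, in milliseconds."""
    return 1000.0 * (STIMULUS_FRAMES + isi_frames) / REFRESH_HZ

def alternation_hz(isi_frames):
    """Temporal frequency: one cycle is two exposures plus their two ISIs."""
    return REFRESH_HZ / (2 * (STIMULUS_FRAMES + isi_frames))

for isi in (6, 13, 30):
    print(isi, round(soa_ms(isi), 1), round(alternation_hz(isi), 2))
```

For example, a 30-frame ISI gives an SOA of 32 frames, about 376 ms, and therefore an alternation rate of roughly 1.3 Hz, while a 6-frame ISI corresponds to about 5.3 Hz.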

The initial ramping-up of the exposure of the stimulus was carried out according to the following rules. While the duration of the stimulus was constant at 2 frames, the first three presentations had no ISI and the fourth was presented after a 1-frame ISI. The first two of these presentations were within a few multiples of contrast threshold (<1 cd m⁻²), with the third presented around 3 cd m⁻² and the fourth around 10 cd m⁻². The fifth was presented at about twice the luminance of the fourth, and all subsequent stimuli were presented at about 25 cd m⁻², until the last two, which were presented at 9 and about 1 cd m⁻², respectively. The duration of the ISIs after the fourth presentation increased until the target ISI was reached, with successive ISIs after the first three of 0, 0, and 1 being 2, 4, 9, and 19 frames. For those target ISIs that were greater than 19 frames, such as 30 frames, the target rate began immediately after the 19-frame presentation. Target ISIs of less than 19 frames began immediately after the next-lowest ISI duration in the ramp. For example, if the target ISI was 6 frames, the durations of the preceding ISIs were 0, 0, 1, 2, and 4 frames. The gradual ramp-down that began after the exposures at the target ISI included decrements of 10 frames until the ISI was less than 10 frames, followed by decrements of 3 frames until 1 frame was reached. Each stimulus train had at least one 1-frame ISI presented, with additional ones if needed to yield at least three exposures of the stimulus during the ramp-down. For example, when the target ISI was 30 frames, the following ISIs were 20, 10, 7, 4, 1 frames. For a target ISI of 13 frames, the ISIs of the ramp-down were 3, 1, 1, 1 frames. The effect of the 1-frame and 0-ISI exposures at the end and beginning was to provide a low-contrast mask which appeared as the sum of the 2 letter strings, hence camouflaging the letter pairing. The number of exposures at the target ISI was set to bring the total duration of the stimulus train close to 1.6 s. These constraints yielded a range of 1.6 to 1.9 s, mostly because the acceleration and deceleration at beginning and end took longer for longer rates. This variation is not very different from that in the duration of the target rates (376 ms). Participants quickly grew accustomed to waiting for the stimulus to slow and subsequently attempting to perform the task.
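The ramp-up rule can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text, not the code used in the experiment; the function name `ramp_up_isis` is hypothetical, and the treatment of a target ISI exactly equal to a ramp step (here the step is skipped) is an assumption. The ramp-down is left out because the text does not fully determine how many extra 1-frame ISIs are appended.

```python
# ISI steps of the ramp-up, in frames, as listed in the text.
RAMP_ISIS = [0, 0, 1, 2, 4, 9, 19]


def ramp_up_isis(target_isi):
    """Return the ISIs (in frames) that precede the first target-rate exposure.

    Every ramp step shorter than the target ISI is used, so the target rate
    begins right after the next-lowest ramp step. A target equal to a ramp
    step skips that step (an assumption, not stated in the text).
    """
    return [isi for isi in RAMP_ISIS if isi < target_isi]


print(ramp_up_isis(6))   # the text's example: 0, 0, 1, 2, 4
print(ramp_up_isis(30))  # targets above 19 frames use the whole ramp
```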

The letter strings were centred on the fixation point, which alternated between bright white and red, changing each time a letter string was presented. This flicker provided salient repeated events at the fovea, which helped the observer to maintain fixation. At the end of each trial, a tone and short message informed the observer whether the correct choice had been made.

There were three different trial types, tested in separate blocks: words, nonwords, and pseudowords. The ten possible word sets for the trials of the words condition are listed in table 1, as are the pairs of letter strings for the other conditions. In nonword trials, none of the four letter strings formed words, and observers had to choose which of the two pairs of nonwords had alternated. The nonword strings were formed by transposing the first two and last two letters of the words in the words condition. In the pseudoword condition, all the letter strings formed nonwords that were pronounceable, in contrast to the nonword condition, in which most were not pronounceable.

Sessions lasted less than 1 h. After several practice trials, the span of interstimulus intervals was set for that participant, guided by the participant's performance in the practice trials. Each subject then participated in 40 trials per condition per interstimulus interval. The three conditions (words, nonwords, pseudowords) were tested in different sessions, whereas the interstimulus intervals were randomly intermixed. Some participants were also in earlier preliminary experiments. When it was seen that participant GE's data departed at the fastest SOA of the word condition from the pattern of other participants, he was run in an additional session of the word condition.

56 A O Holcombe, J Judson

2.3 Chinese characters, repetitive presentation
Ten pairs of Chinese characters, shown in figure 2, were used. The Chinese characters utilised were adopted in their final form by the People's Republic of China over forty years ago and are the form used today in the overwhelming majority of written and printed material in mainland China. Certain parts of these characters are called `radicals'. For the present experiment, these radicals have been exchanged in a particular character pair, in order to create a corresponding pair of characters. As in the English case, this yields two pairs of lexemes that have the same sum (figure 1). This ensured that temporally integrating over successive characters would yield the same sum for either pair. A character pair was presented in similar fashion to the presentation of pairs of English letter strings, as described in the previous section. The only difference was the pre- and post-mask, which comprised the sum of the two successive characters presented on that trial, rather than the `XXXX' mask used in the English-language experiments. After several practice trials, the span of interstimulus intervals was set, guided by participants' performance in the practice trials. Each mainland-China-raised participant participated in 40 trials in each of 5 alternation rates, randomly intermixed, for a total of 200 trials. Sessions lasted less than 1 h.

Table 1. The stimuli for each of the three conditions of the letter string experiment for English readers. In each trial, participants discriminated between two possibilities. The two pairs of letter strings in each cell represent the two possible alternating stimuli specified to the participant at the beginning of a trial. The stimuli in the nonwords condition were created by spatially transposing the bigrams in the words condition, so that the same letters were used in the words and nonwords conditions. In all cases, temporally averaging across successively presented letter strings makes correct discrimination impossible.

         Words          Nonwords       Pseudowords
Frame    A      B       A      B       A      B

1        ball   back    llba   llde    rane   rall
2        deck   dell    ckde   ckba    moll   mone

1        pump   pull    mppu   mphe    runk   rute
2        hell   hemp    llhe   llpu    pote   ponk

1        dent   dell    ntde   ntpu    lort   lont
2        pull   punt    llpu   llde    hent   hert

1        tank   tape    nkta   nkmo    pime   pirt
2        mope   monk    pemo   peta    jurt   jume

1        ball   bank    llba   llhu    dert   demp
2        hunk   hull    nkhu   nkba    jomp   jort

1        pink   pimp    nkpi   mppi    hape   haks
2        jump   junk    mpju   nkju    miks   mipe

1        tame   tack    meta   ckta    zate   zams
2        lick   lime    ckli   meli    koms   kote

1        pike   pick    kepi   ckpi    nive   nink
2        jock   joke    ckjo   kejo    lank   lave

1        call   cart    llca   rtca    paln   pask
2        port   poll    rtpo   llpo    bisk   biln

1        milk   mire    lkmi   remi    ferm   fent
2        fore   folk    refo   lkfo    hont   horm

Binding words is slow 57

[Figure 2: images of the character pairs (columns A and B) are not reproduced here. The translations, percent correct, and mean log % frequency for each four-character set follow.]

Frame    Translation A             Translation B             %C    Mean log % frequency

1        why? how?                 live                      83    −1.04
2        pour                      river

1        mother                    excellent                 76    −0.83
2        sand                      code

1        town                      trouble                   71    −1.59
2        meet                      sincere

1        slave                     prostitute                82    −1.10
2        actress                   only

1        ice rain                  cave                      78    −2.13
2        copper                    ring

1        burn                      cook                      72    −1.50
2        drink                     forgive

1        bridge                    rubber                    72    −1.80
2        photo                     alien

1        OK                        chirp                     66    −1.22
2        flesh                     fat

1        media                     beautiful                 65    −1.35
2        burn                      coal

1        Mei (name of river)       flood                     73    −1.88
2        Lao (name of mountain)    Mei (name of mountain)

Figure 2. The pairs of Chinese characters used in the experiment with Chinese readers. As in the experiment with English readers, 10 pairs of stimuli were used. The sum of each pair of successive characters is equivalent to the sum of the complementary pair (A versus B). The meaning of most of these characters is ambiguous without additional characters as context. One possible meaning is listed for each. %C stands for percent correct.

2.4 English language, single presentation
Unlike in the repetitive-presentation experiments, only one letter string was shown, and it was presented just once. It was preceded by an `XXXX' mask shown for 2 frames (23 ms) and a 1-frame ISI (12 ms) following the pre-mask. The letter string was always shown for 2 frames and followed by an ISI of variable duration, varied across trials to determine threshold performance. The `XXXX' was presented subsequently for 3 frames (33 ms) to post-mask the string. As in the other experiments, at the end of each trial a tone and short message informed the observer whether the correct choice had been made.

The experiment consisted of two conditions: words and nonwords. As in the repetitive-presentation experiment, the stimuli were those in table 1. The pre-trial screen and response screen were identical to those of the multiple-presentation experiments. Four letter strings were shown in the preview screen, and the same four letter strings were presented again in the response screen. These sets of four were the same as those used in the repetitive-presentation condition. Because in this condition only one of the letter strings, randomly chosen, was presented as a stimulus, only one of the four alternatives was correct, meaning that chance performance was 25% instead of the 50% chance level of the repetitive-presentation experiments.

Each participant ran in 40 trials at each of 5 ISIs and each of the two conditions, for a total of 400 trials. Individual sessions lasted less than 1 h. Some of the participants (TM, TN, DB) had also participated in the repetitive-presentation experiment.

2.5 Eyetracking
Saccades and blinks might allow participants to reduce the intended masking by the stimuli in the train. To detect in which trials participants blinked or moved their eyes, the right eye was monitored during the experiment. In the experiments with English readers, the eyetracking setup typically allowed rejection of trials in which observers made larger than 0.5 deg movements from fixation. For the Chinese observers, apparently due to the differing average configuration of their eyelids, performance by the eyetracker was much more variable, and typically not nearly as precise, reliably allowing exclusion of movements only when they reached a few degrees in magnitude or greater. Nevertheless, the eyetracking did have a similar psychological effect in the Chinese experiment as in the English experiment. That is, the participants were constantly aware that their eye movements were being monitored and, in the experience of the experimenters, they maintained fixation much better than naive observers otherwise do.

Each participant was monitored for saccades and blinks with a Skalar Iris (http://www.skalar.nl) infrared eyetracker (Reulen et al 1988). The output of the eyetracker, representing the horizontal position of the right eye, was recorded throughout stimulus presentation. With the observer's chin on a rest, the eyetracker was calibrated first by manual adjustment to yield high gain and an approximately symmetric signal when fixating two targets at opposite ends of the screen. Observers then participated in an automated calibration session consisting of repeated saccades among five dots. Two were horizontally spaced 1.2 deg on either side of fixation, and two more were set 11.6 deg from fixation. These calibration data were used to create a criterion for deciding when a saccade was made. In the case of the experiments with letter strings, the criterion was exceeded when a saccade of approximately 0.5 deg or greater occurred. To create this criterion, the position signals from the calibration trials were filtered with a Chebyshev-type-two first-order low-pass filter. The maximum piecewise velocity during the time of the saccade was determined, and the mean maximum velocity from a sample of 1.2 deg calibration saccades was calculated. In the case of the experiment which included English words, the 1.2 deg calibration saccade with maximum velocity closest to the mean was examined more closely. The Chebyshev-filtered signal was convolved with the kernel [1 – 0.2 – 1], and the criterion for rejecting a trial was set to be significantly smaller than the magnitude of the convolved 1.2 deg record. In particular, the maximum magnitude of the convolved signal corresponding to noise around the time of the saccade was estimated, and the rejection threshold was set a bit higher than this noise-evoked signal. More precisely, the criterion was calculated by adding to the maximum noise-evoked convolution magnitude 20% of the difference between the saccade-evoked convolution magnitude and the noise-evoked convolution magnitude.
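The final step of this criterion is simple arithmetic, sketched below in Python. The function and argument names are hypothetical, and the filtering and convolution that produce the magnitudes are not reproduced; the sketch only shows how a criterion placed 20% of the way from the noise-evoked to the saccade-evoked magnitude would be computed and applied.

```python
def rejection_criterion(noise_peak, saccade_peak, fraction=0.2):
    """Criterion = noise peak plus a fraction of the saccade-noise difference.

    noise_peak:   maximum convolution magnitude attributable to noise
    saccade_peak: maximum convolution magnitude for the calibration saccade
    fraction:     0.2 corresponds to the 20% stated in the text
    """
    return noise_peak + fraction * (saccade_peak - noise_peak)


def reject_trial(convolved_signal, criterion):
    """Flag a trial when any convolved sample exceeds the criterion in magnitude."""
    return max(abs(v) for v in convolved_signal) > criterion


# Illustrative magnitudes in arbitrary units (not data from the experiment).
criterion = rejection_criterion(noise_peak=1.0, saccade_peak=6.0)
print(criterion)
print(reject_trial([0.2, -0.5, 2.4, 0.1], criterion))
```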

Apparently because of the smaller average eye opening of the Chinese participants, a reliable estimate of the magnitude of the convolved signal evoked by the small calibration saccades could not be made; indeed, often the noise was as large as this signal. The criterion for rejecting a trial was instead set at just above the noise seen in the perisaccadic interval around the 11.6 deg saccade time.

3 Results
3.1 Letter strings, repetitive presentation
Figure 3 shows performance as a function of alternation rate for the English words, nonwords, and pseudowords conditions, after trials with significant eye movement were eliminated. Accuracy was highest and usually close to perfect for the slowest alternation rates, when exposure duration was longest. At higher alternation rates, performance gradually declined. Pilot experiments had consistently shown that performance was near chance at frequencies above 8 Hz, and this was borne out in the data. An apparent exception was shown by participant GE for words at 8.5 Hz, but when he returned to the lab 5 months later (data shown in inverted triangles), the anomaly was not replicated, despite the benefit of practice evident at the slower rates. Hence, the one exception to poor performance at higher temporal frequencies appears to have been due to random chance.

A useful way to summarise performance in this task is with the 75% threshold of each subject in each condition. As such thresholds are commonly used in psychophysics, they have the further advantage of allowing rough comparisons with earlier psychophysical studies. In particular, a point near 75% is often used because it is typically at a portion of the curve where the accuracy versus frequency slope is high, making it sensitive to differences across conditions.

We extracted the 75% thresholds from the data by fitting percent correct (%C) to cycle duration, using a class of curves that capture the main features of psychometric functions. The curve class fitted, given in the equation below, was the top half of a cumulative Gaussian rescaled to go from 50% to 98.5% (98.5% used instead of 100% to allow for occasional lapses), with freely varying sigma (σ) and a freely varying exponent (r) on cycle duration in seconds (x). For convenience, the expression is given in terms of the MATLAB error function used (erf):

$$\%C = 0.5\left[1 + 0.985\,\operatorname{erf}\!\left(\frac{1}{\sqrt{2}}\left(\frac{x}{\sigma}\right)^{r}\right)\right].$$

The best-fitting curve was chosen by maximum likelihood via the Nelder–Mead simplex search method (MATLAB code available from the first author).
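As a rough illustration of how a 75% threshold follows from a fitted curve of this class, the Python sketch below evaluates the function and inverts it by bisection. It is not the authors' MATLAB code and it does not perform the maximum-likelihood fit; the function names and the parameter values (`sigma = 0.3`, `r = 2.0`) are hypothetical, and the form of the curve follows the reconstruction of the equation given above.

```python
import math


def percent_correct(x, sigma, r):
    """Proportion correct for cycle duration x (s), with scale sigma and exponent r."""
    z = (x / sigma) ** r / math.sqrt(2)
    return 0.5 * (1 + 0.985 * math.erf(z))


def threshold_duration(sigma, r, level=0.75, lo=0.0, hi=10.0):
    """Cycle duration (s) at which the curve reaches `level`, found by bisection.

    Relies on the curve increasing monotonically with x, which holds for r > 0.
    """
    for _ in range(100):
        mid = (lo + hi) / 2
        if percent_correct(mid, sigma, r) < level:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


sigma, r = 0.3, 2.0                 # hypothetical fitted parameters
x75 = threshold_duration(sigma, r)  # cycle duration at 75% correct
print(f"75% threshold: {x75:.3f} s per cycle, i.e. {1 / x75:.2f} Hz")
```

Expressing the threshold as 1/x converts cycle duration into the alternation rate in hertz used in the figures.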

The 75% thresholds in this 2-alternative forced-choice design were sometimes a bit higher than the 3–4 Hz found previously for judging the pairing of spatially separated shapes and colours (Holcombe and Cavanagh 2001), but they were far below the 20 Hz rates achievable for binding local elements into the global forms of Glass patterns (Clifford et al 2004).

Recall that the purpose of these experiments was not to contrast the different classes of stimuli. Instead, different classes of stimuli were used simply to determine whether the slow threshold result was a general one. As linguistic variables were not controlled across the different classes of stimuli, a number of factors could explain any differences, which should be kept in mind in the following report of the results in the various conditions.

[Figure 3 plots: one row of panels per participant (DB, GE, JY, SH, TL, TM, TN) and one column per condition (words, pseudowords, nonwords). Horizontal axes: alternation rate, AR/Hz (5, 10, 15), at bottom, and SOA/ms (100, 50, 33) at top. Vertical axes: %C (25 to 100). Inset 75% thresholds are listed below.]

Participant    Words     Pseudowords    Nonwords
DB             3.5 Hz    2.5 Hz         1.5 Hz
GE             3.3 Hz    3.2 Hz         1.5 Hz
JY             3.7 Hz    2.7 Hz         1.9 Hz
SH             2.7 Hz    2.8 Hz         1.9 Hz
TL             5.2 Hz    8.2 Hz         3.3 Hz
TM             4.3 Hz    2.8 Hz         2.1 Hz
TN             7.4 Hz    2.9 Hz         0.9 Hz

Figure 3. Each plot is percent correct (%C) as a function of alternation rate (AR) for a particular participant (inset initials). Alternation rate is expressed in hertz at bottom, with corresponding stimulus onset asynchrony shown at top. Chance performance is 50%. Trials in which significant eye movement was detected were excluded. Standard error bars are shown where they exceed the symbol size. Thin line is fit psychometric function. Dashed line shows 75% threshold. In the words condition subject GE shows an anomalous data point for the fastest rate, which did not recur in a second session (inverted triangles).

For the different conditions of this experiment, as shown in table 2, performance was highest with the words (75% threshold = 4.3 Hz), somewhat lower for the pseudowords condition (3.6 Hz), and still lower for the nonwords (1.9 Hz). Paired-samples t-tests indicate that the difference between the words and pseudowords is not significant (t6 = 0.82, p = 0.44), whereas all other differences are: for the words versus nonwords t6 = 3.4, p = 0.01, and for the pseudowords versus nonwords t6 = 3.1, p = 0.02.
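For readers who wish to retrace these comparisons, a paired-samples t statistic can be computed as in the Python sketch below. The function name `paired_t` is hypothetical. The thresholds are the inset values of figure 3, which are rounded to 0.1 Hz, so the statistics come out close to, but not exactly equal to, the reported ones; p values are omitted because they require the t distribution, which the standard library does not provide.

```python
import math

# Inset 75% thresholds from figure 3 (Hz), one value per participant,
# in the order DB, GE, JY, SH, TL, TM, TN.
words = [3.5, 3.3, 3.7, 2.7, 5.2, 4.3, 7.4]
pseudowords = [2.5, 3.2, 2.7, 2.8, 8.2, 2.8, 2.9]
nonwords = [1.5, 1.5, 1.9, 1.9, 3.3, 2.1, 0.9]


def paired_t(a, b):
    """Paired-samples t statistic for two equal-length lists (df = n - 1)."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    variance = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(variance / n)


for label, a, b in [("words vs pseudowords", words, pseudowords),
                    ("words vs nonwords", words, nonwords),
                    ("pseudowords vs nonwords", pseudowords, nonwords)]:
    print(f"{label}: t6 = {paired_t(a, b):.2f}")
```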

These results clearly indicate that temporal binding thresholds for the nonwords were lower than for the pseudowords and the words. The main point of this paper is that all the conditions yield slow thresholds, all of less than 5 Hz. Still, the difference between the words and nonwords is notable, especially since several strings in the nonwords condition were pronounceable (this was a consequence of satisfying the constraint that the same bigrams be used as in the words condition).

The effect of blocking can be examined by comparing the unpronounceable strings in the nonwords condition to the four strings within that condition which were pronounceable (`pemo', `meta', `kepi', `refo'). Excluding those pronounceable strings does not change the thresholds by much. Thresholds were lowered by 0.1055 Hz (not statistically significant, t6 = 0.46, p = 0.67). That the pronounceable-pseudowords condition nonetheless yielded much higher thresholds suggests some benefit of blocking pronounceable nonwords together, perhaps because in the less-pronounceable-nonwords condition participants were in the habit of using a less effective coding strategy for the letter strings.

In the analysis just described, thresholds from different-sized data sets were compared (nonwords condition with and without pronounceable nonwords). Some data-fitting algorithms can exhibit a bias dependent on the size of the data set. To ensure that the exclusion of stimuli did not bias the threshold estimation procedure with our algorithm, thresholds were re-estimated after excluding random four-stimulus subsets of the nonword condition. Considering five hundred random subsets drawn with replacement, the average threshold difference was negligible (0.0048 Hz) and certainly not statistically significant (t499 = 0.57, p = 0.57).

For further examination of the effect of pronounceability within the nonwords condition, one might compare thresholds for the four pronounceable strings with the thresholds for the rest. Unfortunately this was not possible, because the small numbers of trials with pronounceable strings made the thresholds unstable. However, some insight can be had into the variation due to particular items by examining percent correct by item, collapsed across duration, which is shown in table 3. The overall percent correct for the trials of the nonwords condition containing strings which happen to be pronounceable is at the low end of the accuracy for the items in the pseudowords condition containing exclusively pronounceable items.

Although the effect of blocking should be further explored, the difference between the nonwords condition and the other conditions appears reliable. The difference likely results from one of the various mental processes involved in determining which letters were presented simultaneously. The difference across conditions cannot be due to difficulties in identifying the individual letters at higher rates, as the letters used in corresponding trials of the words and nonwords conditions are identical (table 1). The possible reasons for the difference are described in section 5 below.

Table 2. 75% thresholds (cycles s⁻¹/ms SOA).

           Repetitive presentation
           words            pseudowords      nonwords         Chinese characters
Mean       4.3 Hz/116 ms    3.6 Hz/139 ms    1.9 Hz/263 ms    2.9 Hz/172 ms
SE/Hz      1.6              2.0              0.7              1.1
N          7                7                7                6
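The two units in table 2 are related by the fact that one alternation cycle contains two stimulus onsets, so the SOA is half the cycle duration. A minimal Python sketch of the conversion follows; the function names are hypothetical.

```python
def soa_ms(alternation_rate_hz):
    """SOA in ms for a given alternation rate: half of one cycle."""
    return 1000.0 / (2.0 * alternation_rate_hz)


def alternation_rate_hz(soa_in_ms):
    """Alternation rate in Hz for a given SOA in ms."""
    return 1000.0 / (2.0 * soa_in_ms)


# Mean thresholds from table 2.
for rate in (4.3, 3.6, 1.9, 2.9):
    print(f"{rate} Hz corresponds to an SOA of {soa_ms(rate):.0f} ms")
```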

The present experiment was not designed to investigate the role of word frequency on binding threshold. Still, its role in variation of performance within our words condition can be tentatively examined by regressing performance on word frequency. Frequencies were taken from the CELEX database (Baayen et al 1995; Davis 2005) and the mean of the log frequencies for each set of four words was calculated. `Mope' was not present in the frequency database, so it was assigned a nominal frequency of 0.6, lower than that of all the other words, which ranged from 0.67 to 1233. When overall percent correct for each four-word set is regressed on mean log frequency, the resulting r = 0.16 is small and not significant (p = 0.68). Unfortunately, the number of trials per four-word set per rate was too small (4) to allow separate estimation of temporal thresholds for each four-word set. The regression was performed separately for the five presentation rates but was far from significant in all cases, with the r negative for two.

In the context of the range of temporal thresholds reported in earlier literature, thresholds in all conditions of this experiment were decidedly slow. But rather than being a general result for linguistic stimuli, it is possible that these results may be dependent on the particular words and nonwords chosen, the language used, or even the writing system. To determine the generality of the slow thresholds found for these linguistic stimuli, we decided to go as far afield as possible, and investigate a case of different words from a different language with a different writing system.

Table 3. Percent correct (%C) for each stimulus pairing, collapsed across alternation rates.

Words                  Nonwords               Pseudowords
A      B      %C       A      B      %C       A      B      %C

ball   back            llba   llde            rane   rall
deck   dell   73       ckde   ckba   59       moll   mone   78

pump   pull            mppu   mphe            runk   rute
hell   hemp   77       llhe   llpu   70       pote   ponk   77

dent   dell            ntde   ntpu            lort   lont
pull   punt   75       llpu   llde   54       hent   hert   77

tank   tape            nkta   nkmo            pime   pirt
mope   monk   75       pemo   peta   74 a     jurt   jume   65

ball   bank            llba   llhu            dert   demp
hunk   hull   80       nkhu   nkba   56       jomp   jort   65

pink   pimp            nkpi   mppi            hape   haks
jump   junk   76       mpju   nkju   58       miks   mipe   71

tame   tack            meta   ckta            zate   zams
lick   lime   73       ckli   meli   64 a     koms   kote   69

pike   pick            kepi   ckpi            nive   nink
jock   joke   81       ckjo   kejo   62 a     lank   lave   69

call   cart            llca   rtca            paln   pask
port   poll   78       rtpo   llpo   58       bisk   biln   70

milk   mire            lkmi   remi            ferm   fent
fore   folk   80       refo   lkfo   67 a     hont   horm   71

a Note: These instances of the `nonwords' condition contained pronounceable strings.


3.2 Chinese characters (repeated presentation)
Figure 4 shows percent correct as a function of alternation rate for the Chinese characters, with one plot for each of the six Chinese natives. As in the experiment with letter strings, accuracy declined rapidly with increasing temporal frequency. The mean 75% threshold was 2.9 Hz, in the slow end of the range found with the letter strings. A quantitative statistical comparison with the results of the letter strings experiment is not advisable, given the use of different participants and materials. Still, there is no question that thresholds are a great deal lower than the 15–20 Hz frequencies found for binding local elements into regular global forms (Clifford et al 2004).

Although in reading, as with English words, Chinese character corpus frequency can have a large effect on performance (Seidenberg 1985; Hue 1992), in the present task no significant effect was found. Simplified Chinese character frequencies were taken from the combined corpus of Da (2004). The mean log percent frequency was calculated for each four-character set (figure 2). Regressing overall percent correct on log frequency yielded a small Pearson product-moment correlation (r = 0.049, p = 0.89). Trials per word at each rate were too few to yield good estimates of the 75% threshold. However, we were able to examine whether there might be a strong effect of word frequency confined to a particular difficulty level by combining percent correct across observers for the fastest rate for each observer, the second fastest rate for each, etc. Pearson correlations were low and never significant. From the slowest to the fastest rate, the rs were 0.34, 0.19, 0.18, −0.23, and −0.17, with corresponding ps = 0.34, 0.59, 0.62, 0.52, 0.64.

3.3 Letter strings, single presentation
The results with letter strings described above reflect a repeated-presentation method designed to test the temporal resolution of binding the parts together. This is in contrast with earlier studies of the perception of masked letter strings, in which a stimulus was presented only once in a trial. It was expected that, with this more traditional methodology, the items could be discriminated on the basis of much shorter exposure intervals as, without repeated presentation, binding during the exposure time interval was not necessary. The data shown in figure 5 show that, indeed, letter strings can be perceived on the basis of much briefer intervals when only one is shown, even when it is masked. The reasons for this are described in section 5. Although chance performance was 25% rather than 50% as in the other experiments, performance was so good that SOAs even for 75% performance for many participants were briefer than could be presented with our CRT, and were almost always much briefer than thresholds with repeated presentation. Since thresholds could not be estimated from these data, to compare performance across conditions paired t-tests on percent correct were performed. Successive screen refreshes were separated by 11.8 ms, and the briefest stimulus exposure in the experiment consisted of two refreshes drawing the letter string and a single blank frame before the post-mask, for a total exposure of 35.3 ms. At this shortest SOA, participants provided the correct answer in 87% of trials in the words condition against 71% of trials in the nonwords condition. A paired-samples t-test showed this difference to be significant (t10 = 2.9, p = 0.02). The difference was also significant at the next two longer durations, for which the advantage of the words condition was 10% and 13% (t10 = 2.25, p = 0.048; and t10 = 4.1, p = 0.002), but not at the longest two durations, for which the advantage was 3% and 2%. The absence of a statistically significant difference in the longest two durations is welcome, as it suggests that the performance difference was caused by the manipulation of exposure duration rather than always being present. The differences at the faster rates are comparable to, or somewhat larger than, those found in earlier literature (Manelis 1974).

[Figure 4 plots: one panel per participant (CL, DS, JL, JY, LH, NL). Horizontal axes: alternation rate, AR/Hz (5, 10, 15), at bottom, and SOA/ms (100, 50, 33) at top. Vertical axes: %C (25 to 100). Inset 75% thresholds range from 1.7 to 3.3 Hz.]

Figure 4. Each plot shows one subject's percent correct (%C) as a function of alternation rate (AR) of the Chinese characters. The thin line shows the psychometric fit, and the inferred 75% threshold is also shown. Standard error bars are shown where they exceed symbol size. All subjects were expert readers of simplified Chinese. Chance performance is 50%.

Accuracy in this experiment was not sensitive to mean log word frequency of each four-word set. Regressing overall percent correct on mean log word frequency per four-word set yields an inverse correlation (Pearson r = −0.33, p = 0.35). Performing the analysis separately by rate yields a slightly negative r for each rate, with none statistically significant.

[Figure 5 plots: one panel per participant (AS, CS, EL, AZ, DB, JO, JW, RC, TN, LJ, TM), with separate symbols for the words and nonwords conditions. Horizontal axes: SOA/ms (100, 50, 33) at top and equivalent alternation rate, AR/Hz (5, 10, 15), at bottom. Vertical axes: %C (25 to 100).]

Figure 5. With this single-presentation experiment, the same stimuli and exposure durations were used as in the repeated-presentation experiments. Overall performance and 75% thresholds are much better. Each plot shows one subject's percent correct (%C) as a function of SOA of the two successively presented words. Equivalent alternation rate (AR) is shown at bottom. Standard error bars are shown where they exceed symbol size.

4 Eye movements and 75% thresholds
4.1 Letter strings, repetitive presentation
In this experiment one concern was that eye movements might, on some trials, allow participants to circumvent the pre- and post-masking. Each successive stimulus in the repetitive-presentation paradigm is meant to mask the others by falling on the same patch of retina. This way, when presentation rate exceeds the temporal resolution of the neural processes that bind the letters into a string, successive unbound strings will be combined and alternating strings with the same sum will not be discriminated. However, if a saccade is made at particular times during the stimulus presentation, then successive strings will not land on the same patch of retina. Potentially, the brain could then combine visual information over long periods and still recover the string's identity. Indeed, in the case of an eye movement, the string would be combined with the empty background rather than a successively presented masking string. Hence, saccades might yield inflated temporal thresholds; moreover, a different pattern of eye movement in different conditions might change the pattern of thresholds across conditions. In the authors' experience, using saccades willfully to subvert the masking was difficult but occasionally effective, especially after extensive practice. Strategic eye blinking could also subvert the stimulus masking. Fortunately, eyeblinks were easily detected by the eyetracker and all but quite small saccades were also reliably detected.

Analysis of the data indicates that eye movements and blinks did not have a substantial effect on thresholds. Thresholds estimated from the full data set were compared with those estimated from the data after discarding trials in which saccades were detected (see section 2.5 for eyetracking details). The mean changes in threshold caused by rejecting trials with eye movements in each condition were, with associated standard error across the seven subjects, −0.6 ± 0.9 Hz in the words condition, 0.2 ± 0.5 Hz in the pseudowords condition, and −0.1 ± 0.2 Hz in the nonwords condition. None of these changes in estimated threshold was significant according to t-tests. Also nonsignificant were the differences between conditions in these threshold changes (one-way ANOVA: F2,18 = 2.73, p = 0.092). Nonetheless, the decrease in threshold caused by rejecting trials with eye movement in the words condition (t6 = 1.74, p = 0.13), although statistically nonsignificant, is consistent with the possibility that, in a small number of trials, eye movements defeated the pre- or post-masking. The magnitude of threshold changes due to rejecting eye movements is nearly as large as the difference in threshold between the words and pseudowords conditions. For this reason, any future studies of the difference between these conditions (nonsignificant in this study) should monitor eye movements.

The evidence also indicates that the number of eye movements and blinks detected differs across the conditions. The differences across the conditions in number of trials rejected were not very large, but they were statistically significant. Of 40 trials per condition per subject per interstimulus interval, the mean number and associated standard error of trials rejected were: 10.8 ± 1.14 for the words condition, 4.7 ± 0.88 for the pseudowords condition, and 8.6 ± 0.92 for the nonwords condition. In an ANOVA with condition, interstimulus interval, and their interaction as factors and number of trials rejected as the dependent variable, condition was significant (F2 = 4.43, p = 0.015), whereas interstimulus interval was not (F4 = 0.11, p = 0.98). It is not clear why participants apparently made a comparable number of eye movements in the words and nonwords conditions, but made fewer in the pseudowords condition. However, this pattern of differences does not correlate with the differences in temporal threshold, which provides further confidence that the measured differences in thresholds were not related to differences in eye movements in the different conditions.

Some participants exhibited significantly more eye movements than others, with the average number of trials rejected per 40-trial block ranging from 2.3 for one participant to 18.3 for another. This was expected, as it is commonly observed in psychophysical experiments that some participants fixate better than others. If these eye movements allowed some subjects to inflate their thresholds, then one would expect a positive relationship between number of eye movements and 75% thresholds estimated from the data when all trials, including eye-movement trials, are included. A small and statistically nonsignificant but positive correlation was found (r = 0.2, p = 0.39). That this already-small correlation diminished to nearly zero (r = 0.04, p = 0.86) when trials with detected eye movements were discarded further increases confidence that thresholds after screening by the eyetracker were not greatly inflated by eye movements.

4.2 Chinese characters (repetitive presentation)

The relationship of the data in the Chinese characters experiment to the detected eye movements did not differ markedly from that found in the letter-strings experiment. Chance performance was 50%, and the 75% thresholds were again estimated just as in the English repetitive-presentation experiment. In the Chinese characters experiment, of the 40 trials that each subject ran in each condition–ISI combination, an average of 11.9 were rejected owing to eye movements or blinks. Rejecting these trials caused a mean decrease of threshold of 0.4 Hz, which is not statistically significant (t₅ = 1.65, p = 0.16). As in the experiment with letter strings, there were differences among subjects in the number of eye movements made, ranging from 6 to 17. Again, as in the experiment with letter strings, statistically non-significant correlations were found between the number of eye movements made and the threshold, both before and after trials with eye movements were discarded (r = 0.45, p = 0.37; r = 0.35, p = 0.49, respectively).
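As a rough consistency check on the statistics reported for the Chinese characters experiment, the following sketch converts a Pearson correlation into a t statistic and a two-tailed p value. It assumes six participants, which is inferred here from the five degrees of freedom of the reported t test rather than stated in this section. The function names are illustrative, and the p value is obtained by numerically integrating the Student t density, not by any procedure the authors describe.

```python
import math

def t_density(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p value: 1 minus the central area between -|t| and +|t|."""
    upper = abs(t)
    h = upper / steps          # steps must be even for Simpson's rule
    area = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_density(i * h, df)
    half_area = area * h / 3   # integral of the density from 0 to |t|
    return 1 - 2 * half_area

def p_from_r(r, n):
    """p value for a Pearson correlation r over n participants (df = n - 2)."""
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
    return two_tailed_p(t, n - 2)

N_SUBJECTS = 6  # assumption: inferred from the reported t test with 5 degrees of freedom

print(round(two_tailed_p(1.65, 5), 2))       # t test on the change in threshold
print(round(p_from_r(0.45, N_SUBJECTS), 2))  # correlation before discarding trials
print(round(p_from_r(0.35, N_SUBJECTS), 2))  # correlation after discarding trials
```

Under that assumption the first two values agree with the reported p = 0.16 and p = 0.37, and the third lands close to the reported 0.49; a small discrepancy is expected because the published r values are rounded.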

5 Discussion

Of all visual skills, reading is one of the most overlearned. Indeed, most literate adults have decades of near-daily experience with the task of reading. Thanks to this practice, word reading is quite automatic (Stroop 1935). If automaticity and general experience determined temporal limits in perception, then we might expect thresholds for words to be among the fastest of all visual thresholds. Instead, in the current study we found that temporal binding thresholds for English words are comparable to those for nonword letter strings and arbitrary conjunctions of spatially disparate colour and shape elements. We found a similar result for binding the parts of Chinese characters, even though they may be processed more holistically than are English words (eg Tzeng et al 1979; Chen 1984). The enormous advantage of words over nonword stimuli in frequency of exposure, memorability, and other factors has not allowed temporal binding thresholds to approach those of the fastest in the visual system.

Another result of the present work further distinguishes the slow-threshold linguistic material from visual material that is bound at high rates: the phenomenology is clearly different. In the case of alternating Glass patterns and alternating gratings, at rates faster than several hertz observers report perceiving both patterns simultaneously, yet the features presented at the same time remain grouped in perception (Holcombe 2001; Clifford et al 2004). In the case of the letter strings, as the rate of alternation increases beyond 5 Hz, the successive letters eventually appear as if transparently overlaid, but observers reported that there is no strong grouping perceived between particular letters. In other words, binding is inaccurate not owing to perceptual misbindings, but rather because no particular binding is really perceived. Although this is very much like the phenomenology previously found for slow alternation thresholds, it is entirely different from the binding problems reported with single exposure. In those cases, participants often mistakenly had high confidence that their binding errors were correct (Shallice and McGill 1978; Mozer 1983). Further investigation will be needed to discover the reasons for this difference.


Frequency of experience with words has a large effect on response time in conventional word-reading tasks. In a study of visual-duration thresholds for identifying words, Howes and Solomon (1951) found large Pearson correlations with log frequency, of between r = 0.5 and 0.8. Unlike conventional identification tasks, with our forced-choice methodology participants were informed of the options before each trial. The intention was to minimise any time needed for lexical search and other linguistic processes, in order to maximise the possibility of a fast threshold. The absence of a significant frequency effect suggests that we succeeded. Previous studies of Chinese characters have sometimes found an even larger frequency effect than is found in English (Seidenberg 1985; Hue 1992). In the present investigation with Chinese characters this frequency effect was absent.

Our primary interest is what causes words and Chinese characters to be inaccessible at high temporal frequencies, in contrast to simple motion direction (Burr and Ross 1982), flicker, edges and texture boundaries (Forte et al 1999; Kandil and Fahle 2003; Ramachandran and Rogers-Ramachandran 1991), depth from binocular disparity (Morgan and Castet 1995), and pairings of superposed local colour and orientation. Perhaps the most striking discrepancy from the present result with pairing the spatially separated forms of a word was the result when many form elements were arranged into simple shapes such as spirals or radial patterns. These are integrated with high efficiency in both space and time (Wilson and Wilkinson 1998; Clifford et al 2004), whereas letters and Chinese character parts apparently are not.

These results fit well into the present theoretical framework of lower temporal resolution for central processes. First, consider the case of a series of bars drifting at high temporal frequencies, resulting in a retinal patch and its corresponding cortical cells receiving a rapidly fluctuating input. The motion direction of such stimuli is extracted by areas relatively early in the processing hierarchy (MT or earlier) whose neurons have high temporal resolution (Borghuis et al 2003). These mechanisms transform the rapidly varying stimulus information into a constant signal. That is, the rapidly varying (high temporal frequency) pattern of light and dark as the bars pass by in a particular direction is signaled by a relatively constant (low temporal frequency) firing by direction-selective cells. Hence, any subsequent loss of high temporal frequencies, such as may occur at more central stages, will preserve the now low-frequency motion signal. In the case of words and Chinese characters, the poor temporal resolution may result from the lexical recognition mechanisms being situated after a stage which averages over several dozen milliseconds or more.
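The argument that an early direction-selective stage converts a rapidly varying input into a steady signal can be illustrated with a small simulation. The sketch below is not a model proposed in this paper: it uses a textbook opponent delay-and-correlate (Reichardt-type) detector, chosen only for illustration, and the spatial frequency, time step, detector spacing, and delay are all assumed values. A 50 ms average of the raw 20 Hz grating contrast is compared with a 50 ms average of the detector output.

```python
import math

TEMPORAL_FREQ_HZ = 20.0   # drift rate from the grating example in the Introduction
SPATIAL_FREQ = 1.0        # cycles per unit distance (arbitrary, assumed)
DT = 0.0005               # simulation time step in seconds (assumed)
WINDOW_S = 0.05           # 50 ms integration window, one full cycle at 20 Hz

def grating(x, t, direction):
    """Contrast of a sinusoidal grating drifting rightward (+1) or leftward (-1)."""
    phase = SPATIAL_FREQ * x - direction * TEMPORAL_FREQ_HZ * t
    return math.sin(2 * math.pi * phase)

def reichardt(t, direction, dx=0.25, delay=0.0125):
    """Opponent delay-and-correlate output from two inputs separated by dx."""
    a_now, b_now = grating(0.0, t, direction), grating(dx, t, direction)
    a_old, b_old = grating(0.0, t - delay, direction), grating(dx, t - delay, direction)
    return a_old * b_now - a_now * b_old

def window_mean(signal):
    """Average a function of time over one 50 ms window (a crude low-pass stage)."""
    n = round(WINDOW_S / DT)
    return sum(signal(i * DT) for i in range(n)) / n

raw = window_mean(lambda t: grating(0.0, t, +1))      # averaged stimulus contrast
right = window_mean(lambda t: reichardt(t, +1))       # averaged detector output, rightward drift
left = window_mean(lambda t: reichardt(t, -1))        # averaged detector output, leftward drift
print(raw, right, left)
```

With these assumed parameters the averaged raw contrast is essentially zero, whereas the averaged detector output is close to +1 for rightward drift and close to -1 for leftward drift. The direction information therefore survives a later 50 ms averaging stage only if it is extracted before that stage, which is the ordering the paragraph above describes.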

Unfortunately, extant models of word processing have yet to address these temporal binding issues (McClelland and Rumelhart 1981; Brysbaert and Vitu 1998; Coltheart et al 2001; Engbert et al 2002; Pollatsek et al 2003; Reilly and Radach 2003; Davis, submitted). Indeed, usually no attempt is made to situate the efficiency of word processing in the context of other visual judgments. Instead, word models in the literature are designed primarily to explain differences in response time due to psycholinguistic factors such as lexicality, word frequency, length effects, and priming effects. A partial exception is the modeling work of Graboi and Lisman (2003), who grapple directly with some neurophysiological constraints on processing time. Their model does not address our repetitive-presentation method, but appears to predict that, with a 2-alternative task like ours, recognition should occur in only a few processing cycles, or about 50 ms or less. This is, of course, quite different from the thresholds found here, which were uniformly greater than 100 ms. If word-perception models were changed to accommodate temporal binding thresholds, this might help the models account for normal and rapid serial-presentation reading as well.

Progress in understanding visual thresholds for words, as opposed to response times, has been halting, and computational models that explain word perception do not seem to address the recent psychophysical threshold findings. Analysis to date, however, indicates that contrast and lateral-masking thresholds for words can be explained by independent detection of letters, without holistic word processing (Pelli et al 2003; Martelli et al 2005). This is quite different from contrast thresholds for certain static forms and motions (Morrone et al 1995; Wilson and Wilkinson 1998), in which detectors integrate efficiently over the constituent elements. As these same stimuli and visual judgments that efficiently pool over space also have precise temporal thresholds, we see a parallel here. Perhaps the presence of a specialised mechanism in the visual system leads to both benefits for the visual judgments it serves.

The fast thresholds found to date are robust, in that variants of the forced-choice task and of the particular stimulus parameters chosen do not change the >15 Hz result (Burr and Ross 1982; Holcombe and Cavanagh 2001; Clifford et al 2003, 2004), provided that the total duration of the stimulus train is sufficient (Bodelon et al 2004). The slow ~3 Hz thresholds are also robust, in that different arrangements of the spatially separated features had little effect on thresholds (Holcombe and Cavanagh 2001).

Given the robustness of previously found fast and slow thresholds to methodological changes, it is unlikely that changing the particulars of the presentation method used will allow thresholds for pairing spatially separated letters to improve by much. The present results suggest, however, that high-level factors like pronounceability might shift the slowness of the slow thresholds. This is in agreement with the present theory. To understand why, first consider the robustness of the fast thresholds. In cases where early visual mechanisms bind the relevant visual information, if the observer is to make the appropriate response, the visual representation must subsequently be recognised by central processes and consolidated into memory. These central processes will certainly take time, but thanks to the temporal transparency phenomenon the relevant visual representation is continuously available to central processes (Holcombe 2001). Hence, with a long train of stimuli any reasonable duration requirement of central processing is eventually met, as long as the temporal frequency does not exceed the resolution of the early visual-binding mechanism. The situation is quite different when temporal transparency does not occur, such as with the present linguistic material. Then, central processes no longer have the uninterrupted, extended duration of the stimulus train available to recognise and remember. Instead, without temporal transparency a particular stimulus is available centrally only in a series of short episodes interrupted by short episodes of the other stimulus. The duration of every episode corresponds to the interstimulus interval. With visual recognition and memory encoding and consolidation both likely to require a substantial amount of sustained processing time (Jolicoeur and Dell'Acqua 1998; Lawson and Jolicoeur 2003), factors that affect the duration of these processes would also affect the temporal threshold in our tasks. In other words, when peripheral processes do not transform the intermittent stimulus into a constant representation, cognitive stages can be a time-limiting step. The unpronounceable strings in our experiment likely had longer central encoding times, and this may have yielded the slower thresholds.

Interestingly, the advantage here for words over unpronounceable nonwords, although small when compared to the full range of resolution limits exhibited in visual performance, is sizeable by the standards of previous studies. In these previous studies using brief presentation of linguistic material, the letter string was presented just once in a trial, and usually post-masked. Duration identification thresholds were longer for words than for nonwords (eg Baron and Thurston 1973; Manelis 1974). Typically, however, participants were not informed of the alternatives before the stimulus was presented, allowing memory factors and bias to play a large role. Most published results from forced-choice designs did not find an advantage for words (Bjork and Estes 1973; Thompson and Massaro 1973). In the forced-choice study which did find a word advantage (Smith and Haviland 1972), the advantage was quite small, much smaller than that found with the repetitive-presentation paradigm of this paper. A 4%–7% difference in accuracy was found in that study, whereas our study yielded an accuracy difference of 16% at the shortest duration in the single-presentation case and even larger in the repeated-presentation experiment (as much as 22%).

It is always difficult to be sure that a difference between words and nonwords has not been caused by the greater ease with which words are remembered. In the present study we attempted to minimise the role of memory by presenting the four possible stimuli both before and immediately after each trial, but this does not guarantee that the observers invested the time necessary to completely internalise the options. If they did not, the poorer thresholds in the nonwords condition could be explained by a difference in the degree to which the alternatives were held in working memory during the stimulus viewing period. Even if the alternatives are learned well, differences in memory consolidation and retention processes could still result in poorer thresholds for nonwords. Determining the processing stage(s) at which words have an advantage over nonwords will require significant further work. One reason for the historically large difference found here between words and unpronounceable strings is likely the novel repeated-presentation methodology. This method also provides some other advantages over traditional methods, as discussed in the following section.

5.1 Masking, single presentation, and multiple presentation

In single-presentation conditions, the SOA between word and mask can be less than 40 ms and still the word can be reliably discriminated (Manelis 1974). This result was replicated in the present work. But the result with the novel, repetitive temporal-binding paradigm was quite different: 75% thresholds of 116 ms, even though the same items were used in both conditions. The present experiments, then, show that differences between the single-presentation and repetitive-presentation literatures are due to method rather than material.

A major factor distinguishing single from repetitive presentation is the difference in the time scale over which the target's visual information is available. This difference may go a long way towards explaining the much longer thresholds found in repetitive presentation. Consider that the repetitive-presentation condition was designed to be limited by the temporal precision with which the word parts are represented. In the repetitive-presentation condition, if the system loses enough precision in its representation of which letters were presented when, performance should be at chance. In the single-presentation case, temporal imprecision is of little consequence. As long as each of the letters is perceived, it matters not whether the system represents their time of occurrence precisely.

Furthermore, in single presentation, even when the letter strings are presented so briefly that the visual system integrates the letter string together with the conventional 'XXXX' masks, information is potentially still available to determine which letter string was presented. That is, summing the mask and the letter string does not obliterate cues to the identity of the letter string. This is in contrast to the repetitive-presentation case, where summing successive stimuli obliterates all clues to which of the two alternatives was presented.
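The contrast between the two cases can be made concrete with a toy abstraction in which each display frame is reduced to a set of position-tagged letters and temporal integration is modelled as pooling those sets. This is only a sketch: the four letter strings are hypothetical items invented for illustration, not stimuli from the experiments, and real visual summation is of course not a lossless pooling of letter identities.

```python
from collections import Counter

def features(string):
    """Position-tagged letters, e.g. 'ball' -> (0,'b'), (1,'a'), (2,'l'), (3,'l')."""
    return Counter(enumerate(string))

def temporal_sum(frames):
    """Pool the features of all frames, discarding when each frame occurred."""
    total = Counter()
    for frame in frames:
        total += features(frame)
    return total

# Hypothetical items built from the left halves ba/de and the right halves ll/sk.
pairing_a = ["ball", "desk"]   # frames alternated under one pairing
pairing_b = ["bask", "dell"]   # same halves, opposite pairing
MASK = "XXXX"

# Repetitive presentation: do the two alternatives look identical after pooling?
repetitive_ambiguous = temporal_sum(pairing_a) == temporal_sum(pairing_b)
# Single presentation: do two different strings look identical once pooled with the mask?
single_ambiguous = temporal_sum(["ball", MASK]) == temporal_sum(["desk", MASK])
print(repetitive_ambiguous, single_ambiguous)
```

In this abstraction the two repetitive-presentation alternatives pool to exactly the same feature set, so only temporally precise information can tell them apart, whereas a single string pooled with its mask still differs from another string pooled with the same mask.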

In efforts to determine the stages in processing that lexical factors modulate, researchers should attempt to minimise differences in the attention allocated to words versus nonwords. The repetitive-presentation technique is likely to be less affected by the dynamics of attention than is single presentation. Certain words may attract attention more than do nonwords (Mack and Rock 1998), which may contribute to the advantage of words in single presentation, where attention must be allocated at the right instant. If attention is accidentally engaged at the wrong time, subsequent stimuli can easily be missed (Duncan et al 1994). This is potentially a problem in simple masked displays, whereas the consequences of occasionally engaging attention on the wrong stimulus are less of an issue when targets are the only things presented, and are presented repeatedly.

As we have seen above, differences in the information available between the single- and repetitive-presentation methods at a given SOA may explain the difference in temporal thresholds. This should indeed be considered the most likely explanation of the difference. Still, there remains the possibility of an actual difference in the way the stimuli are treated in the two conditions. It is conceivable that the method of repetitive presentation itself may change the nature of the processing that occurs. Specifically, the temporal integration window conceivably might expand to reflect the temporally extended input of the repetitive-presentation condition. This could lead to the slow alternation thresholds found herein despite temporally precise processing when words are presented in non-repetitive fashion. The accurate performance at high rates of non-repetitive presentation found by Potter and others (Rubin and Turano 1992; Potter 1993) as well as Sperling et al (1971) inspires this suggestion that single presentation may indeed result in shorter integration intervals. Yet this possibility remains doubtful, because all the reasons described above for better performance with non-repetitive presentation apply to this literature as well.

5.2 Future directions

The present results militate for a model in which the temporal precision of binding is fine for efficient, specialised visual mechanisms, but coarse for those that rely on higher-level mechanisms, such as judgments of linguistic material. This is consistent with the theory that high-level mechanisms subserving explicit judgments have access only to coarse visual time scales.

In the tasks used in the present paper, only explicit judgments were solicited from the study participants. It is possible that high-level mechanisms, such as those which extract lexical information, process visual information with high temporal resolution, but that this information does not become available to explicit judgments. The existence of semantic priming from words even with very brief exposures (eg Greenwald et al 1996) is suggestive here, although the concerns described above for single presentation do apply. Still, Potter and her colleagues (Potter 1993; O'Connor and Potter 2002) have provided evidence that individual items are processed to a semantic level when embedded in a rapid stream of items, and that these items can be recalled only if they are conceptually related to other items in the stream. To test the temporal precision of this sort of implicit word processing, a neuroimaging study or behavioural study of priming should be carried out, using an appropriate methodology such as the repetitive-presentation method introduced here.

Acknowledgments. We thank Liqiang Huang for assistance with the Chinese aspects of the study, including his work to identify Chinese character pairs that could be used in our paradigm. Bill Holcombe and Janice Lai also contributed. Tom Sanocki, John Jacobson, and Catherine Harris commented extensively on the manuscript and Mark Elliott provided stimulating feedback. Discussions with Colin Davis were very beneficial.

References

Baayen R H, Piepenbrock R, Rijn H van, 1995 The CELEX Lexical Database. Release 2 [CD-ROM] (Philadelphia, PA: Linguistic Data Consortium, University of Pennsylvania)

Baer K V von, 1864 "Welche Auffassung der lebendigen Natur ist die richtige? Und wie ist diese Auffassung auf die Entomologie anzuwenden?" [Which view of living nature is correct? And how is this view to be applied to entomology?], in Reden gehalten in wissenschaftlichen Versammlungen und kleinere Aufsätze vermischten Inhalts Ed. H Schmitzdorff (St Petersburg: Verlag der kaiserlichen Hofbuchhandlung) pp 237–283

Baron J, Thurston I, 1973 "An analysis of the word superiority effect" Cognitive Psychology 4 207–228


Beaudot W H, 2002 "Role of onset asynchrony in contour integration" Vision Research 42 1–9

Bjork E, Estes W K, 1973 "Letter identification in relation to linguistic context and masking conditions" Memory & Cognition 1 217–223

Bodelon C, Fallah M, Reynolds J H, 2004 "Temporal resolution of orientation/color conjunctions", paper presented at the Society for Neuroscience Annual Meeting, San Diego, California

Borghuis B G, Perge J A, Vajda I, Wezel R J van, Grind W A van de, Lankheet M J, 2003 "The motion reverse correlation (MRC) method: a linear systems approach in the motion domain" Journal of Neuroscience Methods 123 153–166

Burr D C, Ross J, 1982 "Contrast sensitivity at high velocities" Vision Research 22 479–484

Brysbaert M, Vitu F, 1998 "Word skipping: Implications for theories of eye movement control in reading", in Eye Guidance in Reading and Scene Perception Ed. G Underwood (New York: Elsevier) pp 125–147

Chen H C, 1984 "Detecting radical component of Chinese characters in visual reading" Chinese Journal of Psychology 26 29–34

Clifford C W G, Arnold D H, Pearson J, 2003 "A paradox of temporal perception revealed by a stimulus oscillating in colour and orientation" Vision Research 43 2245–2253

Clifford C W, Holcombe A O, Pearson J, 2004 "Rapid global form binding with loss of associated colors" Journal of Vision 4 1090–1101 (http://journalofvision.org/4/12/8/, DOI:10.1167/4.12.8)

Coltheart M, Rastle K, Perry C, Langdon R, Ziegler J, 2001 "DRC: a dual route cascaded model of visual word recognition and reading aloud" Psychological Review 108 204–256

Crick F, Koch C, 1995 "Are we aware of neural activity in primary visual cortex?" Nature 375 121–123

Crick F, Koch C, 2003 "A framework for consciousness" Nature Neuroscience 6 119–126

Da J, 2004 "A corpus-based study of character and bigram frequencies in Chinese e-texts and its implications for Chinese language instruction", in Proceedings of the Fourth International Conference on New Technologies in Teaching and Learning Chinese Eds Z Pu, T Xie, J Xu (Beijing: Tsinghua University Press) pp 501–511

Dakin S C, Bex P J, 2002 "Role of synchrony in contour binding: some transient doubts sustained" Journal of the Optical Society of America A 19 678–686

Davis C J, 2005 "N-Watch: A program for deriving neighborhood size and other psycholinguistic statistics" Behavior Research Methods 37 65–70

Davis C J, submitted "The SOLAR (Self-Organizing Lexical Acquisition and Recognition) model of visual word identification, Part 1: Orthographic input coding and lexical matching"

Davis C J, Bowers J S, 2004 "What do letter migration errors reveal about letter position coding in visual word recognition?" Journal of Experimental Psychology: Human Perception and Performance 30 923–941

Duncan J, Humphreys G W, 1989 "Visual search and stimulus similarity" Psychological Review 96 433–458

Duncan J, Ward R, Shapiro K, 1994 "Direct measurement of attentional dwell time in human vision" Nature 369 313–316

Egeth H, Jonides J, Wall S, 1972 "Parallel processing of multielement displays" Cognitive Psychology 3 674–698

Engbert R, Longtin A, Kliegl R, 2002 "A dynamical model of saccade generation in reading based on spatially distributed lexical processing" Vision Research 42 621–636

Fabre-Thorpe M, Delorme A, Marlot C, Thorpe S, 2001 "A limit to the speed of processing ultra-rapid visual categorization of novel natural scenes" Journal of Cognitive Neuroscience 13 171–180

Forte J, Hogben J H, Ross J, 1999 "Spatial limitations of temporal segmentation" Vision Research 39 4052–4061

Geissler H G, Schebera F U, Kompass R, 1999 "Ultra-precise quantal timing: Evidence from simultaneity thresholds in long-range apparent movement" Perception & Psychophysics 61 707–726

Graboi D, Lisman J, 2003 "Recognition by top-down and bottom-up processing in cortex: the control of selective attention" Journal of Neurophysiology 90 798–810

Greenwald A G, Draine S C, Abrams R L, 1996 "Three cognitive markers of unconscious semantic activation" Science 273 1699–1702

Gur M, Snodderly M, 1997 "A dissociation between brain activity and perception: Chromatically opponent cortical neurons signal chromatic flicker that is not perceived" Vision Research 37 377–382

Harris C L, Morris A L, 2001 "Illusory words created by repetition blindness: a technique for probing sublexical representations" Psychonomic Bulletin & Review 8 118–126


Hochstein S, Ahissar M, 2002 "View from the top: hierarchies and reverse hierarchies in the visual system" Neuron 36 791–804

Holcombe A O, 2001 "A purely temporal transparency mechanism in the visual system" Perception 30 1311–1320

Holcombe A O, Cavanagh P, 2001 "Early binding of feature pairs for visual perception" Nature Neuroscience 4 127–128

Holcombe A O, Kanwisher N, Treisman A, 2001 "The midstream order deficit" Perception & Psychophysics 63 322–329

Howes D H, Solomon R L, 1951 "Visual duration threshold as a function of word probability" Journal of Experimental Psychology 41 401–410

Hue C W, 1992 "Recognition processes in character naming", in Language Processing in Chinese Eds H C Chen, O J L Tzeng (Amsterdam: Elsevier) pp 93–107

Inhoff A W, Starr M, Shindler K L, 2000 "Is the processing of words during eye fixations in reading strictly serial?" Perception & Psychophysics 62 1474–1484

Johnson J S, Olshausen B A, 2003 "Timecourse of neural signatures of object recognition" Journal of Vision 3 499–512 (http://journalofvision.org/3/7/4, DOI:10.1167/3.7.4)

Jolicoeur P, Dell'Acqua R, 1998 "The demonstration of short-term consolidation" Cognitive Psychology 36 138–202

Kandil F I, Fahle M, 2003 "Mechanisms of time-based figure-ground segregation" European Journal of Neuroscience 18 2874–2882

Kinoshita S, Lupker S J (Eds), 2003 Masked Priming: The State of the Art (New York: Psychology Press)

Kline K, Holcombe A O, Eagleman D M, 2004 "Illusory motion reversal is caused by rivalry, not by perceptual snapshots of the visual field" Vision Research 44 2653–2658

Kline K, Holcombe A O, Eagleman D M, 2006 "Illusory motion reversal does not imply discrete processing: Reply to Rojas et al." Vision Research 46 1158–1159

Lawson R, Jolicoeur P, 2003 "Recognition thresholds for plane-rotated pictures of familiar objects" Acta Psychologica 112 17–41

McCandliss B D, Cohen L, Dehaene S, 2003 "The visual word form area: expertise for reading in the fusiform gyrus" Trends in Cognitive Sciences 7 293–299

McClelland J L, Mozer M C, 1986 "Perceptual interactions in two-word displays: familiarity and similarity effects" Journal of Experimental Psychology: Human Perception and Performance 12 18–35

McClelland J L, Rumelhart D E, 1981 "An interactive activation model of context effects in letter perception: Part 1. An account of basic findings" Psychological Review 88 375–407

Mack A, Rock I, 1998 Inattentional Blindness (Cambridge, MA: MIT Press)

Manelis L, 1974 "The effect of meaningfulness in tachistoscopic word perception" Perception & Psychophysics 16 182–192

Martelli M, Majaj N J, Pelli D G, 2005 "Are faces processed like words? A diagnostic test for recognition by parts" Journal of Vision 5 58–70 (http://journalofvision.org/5/1/6/, DOI:10.1167/5.1.6)

Morgan M J, Castet E, 1995 "Stereoscopic depth perception at high velocities" Nature 378 380–383

Morrone M C, Burr D C, Vaina L M, 1995 "Two stages of visual processing for radial and circular motion" Nature 376 507–509

Mozer M C, 1983 "Letter migration in word perception" Journal of Experimental Psychology: Human Perception and Performance 9 531–546

Murray W S, Forster K I, 2004 "Serial mechanisms in lexical access: the rank hypothesis" Psychological Review 111 721–756

O'Connor K J, Potter M C, 2002 "Constrained formation of object representations" Psychological Science 13 106–111

Pelli D G, Farell B, Moore D C, 2003 "The remarkable inefficiency of word recognition" Nature 423 752–756

Pollatsek A, Reichle E D, Rayner K, 2003 "Modeling eye movements in reading: Extensions of the E-Z Reader model", in The Mind's Eye: Cognitive and Applied Aspects of Eye Movement Research Eds J Hyona, R Radach, H Deubel (Oxford: Elsevier) pp 361–390

Potter M C, 1993 "Very short-term conceptual memory" Memory & Cognition 21 156–161

Ramachandran V S, Rogers-Ramachandran D C, 1991 "Phantom contours: A new class of visual patterns that selectively activates the magnocellular pathway in man" Bulletin of the Psychonomic Society 29 391–394

Rayner K, Ashby J, Pollatsek A, Reichle E D, 2004 "The effects of frequency and predictability on eye fixations in reading: implications for the E-Z Reader model" Journal of Experimental Psychology: Human Perception and Performance 30 720–732


Reilly R, Radach R, 2003 "Foundations of an interactive activation model of eye movement control in reading", in The Mind's Eye: Cognitive and Applied Aspects of Eye Movement Research Eds J Hyona, R Radach, H Deubel (Oxford: Elsevier) pp 429–455

Reulen J P H, Marcus J T, Koops D, Vries F R de, Tiesinga G, Boshuizen K, Bos J E, 1988 "Precise recording of eye movement: the IRIS technique. Part 1" Medical & Biological Engineering & Computing 26 20–26

Rousselet G A, Fabre-Thorpe M, Thorpe S J, 2002 "Parallel processing in high-level categorization of natural images" Nature Neuroscience 5 629–630

Rubin G S, Turano K, 1992 "Reading without saccadic eye movements" Vision Research 32 895–902

Seidenberg M S, 1985 "The time course of phonological code activation in two writing systems" Cognition 19 1–30

Shallice T, McGill J, 1978 "The origins of mixed errors", in Attention and Performance VII Ed. J Requin (Hillsdale, NJ: Lawrence Erlbaum Associates) pp 193–208

Smith E E, Haviland S E, 1972 "Why words are perceived more accurately than non words: inference versus unitization" Journal of Experimental Psychology 92 59–64

Sperling G, Budiansky J, Spivak J, Johnson M C, 1971 "Extremely rapid visual search: The maximum rate of scanning letters for the presence of a numeral" Science 174 307–311

Stroop J R, 1935 "Studies of interference in verbal reactions" Journal of Experimental Psychology 18 643–662

Stroud J M, 1956 "The fine structure of psychological time", in Information Theory in Psychology Ed. H Quastler (Glencoe, IL: Free Press) pp 140–207

Tao L, Healy A F, Bourne L E, 1997 "Unitization in second-language learning: Evidence from letter detection" American Journal of Psychology 110 385–395

Thompson M C, Massaro D W, 1973 "Visual information and redundancy in reading" Journal of Experimental Psychology 98 49–54

Thorpe S, Fize D, Marlot C, 1996 "Speed of processing in the human visual system" Nature 381 520–522

Treisman A, Souther J, 1986 "Illusory words: the role of attention and of top-down constraints in conjoining letters to form words" Journal of Experimental Psychology: Human Perception and Performance 12 3–17

Tyler C W, Hardage L, Miller R T, 1995 "Multiple mechanisms for the detection of mirror symmetry" Spatial Vision 9 79–100

Tzeng O J, Hung D L, Cotton B, 1979 "Visual internalisation effect in reading Chinese characters" Nature 282 499–501

Usher M, Donnelly N, 1998 "Visual synchrony affects binding and segmentation in perception" Nature 394 179–182

VanRullen R, Koch C, 2003 "Is perception discrete or continuous?" Trends in Cognitive Sciences 7 207–213

VanRullen R, Thorpe S J, 2001 "The time course of visual processing: From early perception to decision-making" Journal of Cognitive Neuroscience 3 454–461

Wilson H R, Wilkinson F, 1998 "Detection of global structure in Glass patterns: implications for form vision" Vision Research 38 2933–2947

© 2007 a Pion publication


ISSN 0301-0066 (print)

Conditions of use. This article may be downloaded from the Perception website for personal research by members of subscribing organisations. Authors are entitled to distribute their own article (in printed form or by e-mail) to up to 50 people. This PDF may not be placed on any website (or other online distribution system) without permission of the publisher.

www.perceptionweb.com

ISSN 1468-4233 (electronic)

