American Scientist

Reprinted from July-August 1973, Volume 61, Number 4, pp. 394-403

Skill in Chess

Herbert A. Simon, William G. Chase

Copyright © 1973 by The Society of the Sigma Xi

Experiments with chess-playing tasks and computer simulation of skilled performance throw light on some human perceptual and memory processes

Herbert A. Simon took his bachelor's and doctor's degrees at the University of Chicago, the latter in 1943. He has served on the faculties of the University of California, Berkeley, and Illinois Institute of Technology, and, since 1949, on the faculty of Carnegie-Mellon University, where he is Richard King Mellon Professor of Computer Science and Psychology. Beginning with an interest in decision-making in organizations, Professor Simon has been led during the past fifteen years into research on human performance in complex tasks, using the computer to simulate cognitive processes. With Allen Newell, he is coauthor of a recent book, Human Problem Solving. William G. Chase is an Associate Professor of Psychology at Carnegie-Mellon University, where he has served since receiving his Ph.D. from the University of Wisconsin in 1969. His research has concentrated on the elementary information processes underlying cognition. He has edited a recent book, Visual Information Processing. This research was supported by Public Health Service Research Grant MH-07722, from the National Institute of Mental Health. Address of both authors: Department of Psychology, Carnegie-Mellon University, Pittsburgh, PA 15213.

As genetics needs its model organisms, its Drosophila and Neurospora, so psychology needs standard task environments around which knowledge and understanding can cumulate. Chess has proved to be an excellent model environment for this purpose. About a decade ago in the pages of this journal, one of us, with Allen Newell, described the progress that had been made up to that time in using information-processing models and the techniques of computer simulation to explain human problem-solving processes (1). A part of our article was devoted to a theory of the processes that expert chess players use in discovering checkmating combinations (2), a theory that was subsequently developed further, embodied in a running computer program, MATER, and subjected to additional empirical testing (3).

The MATER theory is an application to the chess environment of a more general theory of problem solving that employs heuristic search as its core element (4). The MATER theory postulates that problem solving in the chess environment, as in other well-structured task environments, involves a highly selective heuristic search through a vast maze of possibilities. Normally, when a chess player is trying to select his next move, he is faced with an exponential explosion of alternatives. For example, suppose he considers only ten moves for the current position; each of these moves in turn breeds ten new moves, and so on. Searching to a depth of six plies (three moves by White and three by Black) will already have generated a search space with a million paths. Hence, if every legal move is considered (as would be the case in an exhaustive search), an enormous search space would be generated. Such a search is beyond the capacity of the human player, as well as present-day computers. Humans seldom search more than a hundred paths in choosing a move or finding a checkmate, and they seldom consider more than two or three possible moves per position.
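The arithmetic behind the "million paths" figure can be checked with a few lines of Python. This is only an illustrative sketch: the function name `search_paths` is hypothetical, and the depth of four plies used for the selective case is an assumption chosen to show how a branching factor of three stays under a hundred paths.

```python
def search_paths(branching: int, plies: int) -> int:
    """Count the move sequences in a uniform tree: `branching` moves at each of `plies` levels."""
    return branching ** plies


# Ten candidate moves per position, searched six plies deep (figures from the text).
broad_search = search_paths(10, 6)

# Three candidate moves per position; the four-ply depth is an assumed value.
selective_search = search_paths(3, 4)

print(broad_search)      # 1000000
print(selective_search)  # 81
```

Cutting the number of moves considered per position shrinks the tree far more than trimming the depth by a ply or two, which is why selectivity matters so much.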

The MATER theory postulates that humans don't consider moves at random. Rather, they use information from a position and apply some general rules (heuristics) to select a small subset of the legal moves for further consideration. For example, one powerful heuristic that MATER uses in finding checkmates is to examine first those moves that permit the opponent the fewest replies. A comparison of the MATER program with thinking-aloud protocols from human chess players confirms the importance of heuristic search as a basic underlying process.
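The fewest-replies heuristic amounts to a move-ordering rule, which might be sketched as follows. This is not MATER's code: the function `order_by_fewest_replies`, the `keep` parameter, and the reply counts are all hypothetical, and a real program would obtain the counts from a legal-move generator.

```python
from typing import Callable, Iterable, List, TypeVar

Move = TypeVar("Move")


def order_by_fewest_replies(
    moves: Iterable[Move],
    count_replies: Callable[[Move], int],
    keep: int = 3,
) -> List[Move]:
    """Rank moves by how few replies they leave the opponent and keep the best few."""
    ranked = sorted(moves, key=count_replies)  # most restrictive moves first
    return ranked[:keep]


# Hypothetical reply counts for four candidate moves (illustrative values only).
reply_counts = {"Qh7+": 1, "Nf6+": 2, "Rd1": 20, "a3": 30}

candidates = order_by_fewest_replies(reply_counts, lambda move: reply_counts[move])
print(candidates)  # ['Qh7+', 'Nf6+', 'Rd1']
```

Checks and other forcing moves rise to the top of the list because they leave the opponent the least freedom.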

While the MATER theory was successful in accounting for much of what was known about chess thinking in mating situations, some important empirical phenomena (some of them known when the theory was formulated, some of them discovered subsequently) eluded the theory's grasp. In this paper, after describing the phenomena, we should like to tell the story of a ten-year effort to account for the recalcitrant facts.

An important by-product of this effort has been to bring about a convergence of the theory of problem solving with theories that have been developed to explain quite different phenomena, which psychologists label "perception," "rote learning," and "memory." In the past, both theorizing and experimentation relating to these different kinds of tasks (problem solving, perceiving, learning by rote, and remembering) have tended to go their separate ways. In the course of our story we will see how these theories come together to explain chess skill; we will see the important constraint that a limited-capacity short-term memory imposes on problem solving in chess and how this limit can be bypassed by specific perceptual knowledge acquired through long experience, stored in long-term memory, and accessed by perceptual discrimination processes.

The phenomena

In Amsterdam, Adriaan de Groot, who was the first psychologist to carry out extensive experiments on problem solving using chess as the task, also initially formulated his theory in terms of heuristic search (5). His subjects ranged from quite ordinary players to some of the strongest chess grandmasters in the world, including several former world champions. He was puzzled by one thing: none of the statistics he computed to characterize his subjects' search processes (number of moves examined, depth of search, speed of search) distinguished the grandmasters from the ordinary players. He could only separate them by the fact that the grandmasters usually chose the strongest move in the position, while ordinary players often chose weaker moves. Why were the grandmasters able to do this? Wherein lay their chess skill?

The perceptual basis of chess mastery. One clue to this riddle came when de Groot repeated and extended an experiment that had been performed earlier in the USSR (6). He displayed a chess position to his subjects for a very brief period of time (2 to 10 seconds) and then asked them to reconstruct the position from memory. These positions were from actual master games, but games unknown to his subjects. The results were dramatic. Grandmasters and masters were able to reproduce, with almost perfect accuracy (about 93% correct), positions containing about 25 pieces. There was a quite sharp drop-off in performance somewhere near the boundary between players classified as masters, who did nearly as well as grandmasters, and players classified as experts, who did significantly worse (about 72%). Good amateurs (Class A players in the American rating scheme) could replace only about half the pieces in the same positions, and novice players (from our own experiments) could recall only about eight pieces (about 33%). There is a quite nice gradation on this perceptual task as a function of chess skill, and we have verified this in our own experiments (7).

We went one step further: we took the same pieces that were used in the previous experiment, but now constructed random positions with them. Under the same conditions, all players, from master to novice, recalled only about three or four pieces on the average, performing significantly more poorly here than the novice did on the real positions. (The same result was obtained by W. Lemmens and R. W. Jongman in the Amsterdam laboratory, but their data have never been published, 8.)

In sum, these experiments show that chess skill cannot be detected from the gross characteristics of the search processes of chess players but can be detected easily using a perceptual task with meaningful chess content. The experiment with random boards shows that the masters' superior performance in the meaningful task cannot be explained in terms of any general superiority in visual imagery. The perceptual skill is chess-specific. Moreover, a theory of problem solving in chess that does not include perceptual processes cannot be an adequate theory: it cannot explain the superior ability of the strong player to choose the right moves.

Eye movements at the chess board. The second set of phenomena we must consider are also perceptual, but of a more recent discovery. Explanations in terms of heuristic search postulate that problem solving, and cognition generally, is a serial, one-thing-at-a-time process. (We are oversimplifying matters to make the issue clear, but the oversimplification will suffice for the present.) Many psychologists have found this postulate implausible and have sought for evidence that the human organism engages in extensive parallel processing (9). The intuitive feeling that much information can be "acquired at a glance" argues for a parallel processor. Of course, the correctness of the intuition depends both on the amount of information that can actually be acquired and upon what is meant by a "glance." If a glance means a single eye fixation (lasting anywhere from a fifth of a second to a half-second or longer), then we know that there are high-speed serial processes (e.g. short-term memory search, visual scanning) that operate within this time range (10). Thus, it is certainly interesting and relevant to find out how the human eye extracts information from a complex visual display like a chess position and to see whether this extraction process is compatible with the assumptions of the heuristic search theories.

A pair of Russian psychologists, Tichomirov and Poznyanskaya, placed an expert before a chess position with instructions to find the best move, and they observed his eye movements during the first 5 seconds of the task (11). The eye movements were inconsistent with the hypothesis that the subject, during these 5 seconds, was searching through a tree of possible moves and their replies.

To describe further what Tichomirov and Poznyanskaya found, we must say a word about how the eye operates. The eye has a central region of high resolution, the fovea (about 1° in radius), surrounded by a periphery of progressively lower resolution. Most information about visual patterns is acquired while the fovea is fixated on them; and the eye moves abruptly, in so-called saccadic movements, from one point of fixation to the next. There are at most about four or five saccadic movements per second.

In Tichomirov and Poznyanskaya's record of the first 5 seconds of their subject's eye movements, there were about 20 fixations. Most of these centered on squares of the board occupied by pieces that any chess player would consider to be of importance to the position. There were few fixations at the edges or corners of the board or on empty squares. Moreover, a large number of the saccades moved from one piece to another, where the former piece stood in a "chess" relation (that is, an attack or defense relation) to the latter. For example, the eye would move frequently from a pawn to a Knight that attacked it, or to a Knight that defended it, or from a Queen to a pawn it attacked.

It is important to note that the saccadic movements were not random and, therefore, that some information must have been acquired peripherally about the target square before the saccade began. From other evidence, we know that a strong chess player can recognize a piece within a radius of 5° to 7° from his point of fixation; for eye-movement studies show that he can frequently replace such a piece correctly on a board when he has had no closer point of fixation to it (12).

Figure 1. In this middle game position, used by Tichomirov and Poznyanskaya in their eye movement experiments, Black is to play.

The Russian experiments are of interest for two reasons. First, while the saccadic eye movements themselves are serial, some parallel visual capacity appears to be operating, for, since the saccade is not random, information about the target square must be acquired peripherally. From what we know about search and scanning rates, it can be concluded that the processes of scanning the periphery for the next target square and preparing the next saccade must overlap in time with the processes of searching memory for the identity and function of a piece (or square) presently occupying the fovea. Visual scanning experiments show that an eye fixation does not allow enough time both to recognize a pattern in the fovea and to scan the visual periphery for a likely target for the next fixation unless the two processes overlap in time (13, 14).

Even more important, the Russian experiments confirm the existence of an initial "perceptual phase," earlier hypothesized by de Groot, during which the players first learn the structural patterns of the pieces before they begin to look for a good move in the "search phase" of the problem-solving process. The experiments of Tichomirov and Poznyanskaya have been repeated and confirmed both in Amsterdam and in our own laboratory. How shall we extend the heuristic search theory of problem solving to accommodate them?

Explaining the eye movements

Among the ground rules that ought to be followed in building theories, one of the most important is the rule of parsimony. If, in order to explain each new phenomenon, we must invent a new mechanism, then we have lost the game. Theories, gradually modified and improved over time, are convincing only if the range of phenomena they explain grows more rapidly than the set of mechanisms they postulate.

In the present instance, there are two ways in which we may seek to preserve parsimony as we extend the theory. First, we may examine our existing theory to see whether the mechanisms already incorporated in it might be adequate if they were reorganized. Second, if we need additional mechanisms to explain some of the phenomena, then, instead of inventing them ad hoc, we may draw upon mechanisms already postulated or known in other parts of psychology: mechanisms whose existence already has empirical support. We will explore both of these routes for improving the theory while preserving parsimony.

Perceptual processes in MATER. Let us return to the MATER theory and see how much we must add to, or subtract from, it in order to account for the eye movement data. MATER, as noted earlier, is a program for discovering mating combinations by selective search. What is the basis for the selectivity? A fundamental idea imbedded in MATER is that forceful moves should be explored first, where a forceful move is one that accomplishes some significant chess function, like attacking or capturing a piece or restricting the movements of the opponent. Discovering the opportunities for forceful moves in any chess position involves perceiving the attack, defense, and threat relations that hold among pairs and clusters of pieces on the chessboard; it is basically a perceptual process.

Hence, if we examine MATER a level or two below the executive routine that organizes its search, we see that the program is composed chiefly of a collection of processes for noticing significant chess relations among pieces or squares. In the program as originally organized, these processes were enlisted in the service of the heuristic search for a mating combination. Are these noticing processes a sufficient base on which to build a theory of the eye movements?

The PERCEIVER program. It proved surprisingly easy to simulate the eye movements. It was not difficult to replace MATER's executive program with a new program that used the same perceptual processes to guide the scanning of the board, and when this was done, a good correspondence was found between the squares fixated during the first 20 saccades by the human player and the squares fixated by the program (15).

The program, dubbed PERCEIVER, operates in a very simple manner. With the simulated fovea fixated on a square of the board, information is acquired peripherally about pieces standing on nearby squares that attack or defend the fixated square, or that are attacked or defended by the piece on that square. Attention is then assumed to switch to one of these nearby squares, and, unless it immediately returns to the square already fixated, causes a saccadic movement to the new square. With the fovea fixated on the new square, the process simply repeats. A moment's reflection will convince the reader that a process having this structure will cause a biased random walk of the fixation point around the board, returning most frequently to those regions where relations among pieces are densest and spending little time on the edges of the board.
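The loop just described can be imitated by a random walk over a graph of attack and defense relations. The sketch below is an assumption-laden illustration, not the PERCEIVER program: the function `simulate_fixations`, the `p_return` probability, and the small relation table are all hypothetical, and the table does not correspond to any position in the figures.

```python
import random
from collections import Counter
from typing import Dict, List


def simulate_fixations(
    relations: Dict[str, List[str]],
    start: str,
    n_steps: int,
    p_return: float = 0.2,
    seed: int = 0,
) -> List[str]:
    """Random walk of a simulated fixation point over squares linked by chess relations."""
    rng = random.Random(seed)
    current = start
    fixations = [current]
    for _ in range(n_steps):
        related = relations.get(current, [])
        if not related:
            break  # nothing is noticed peripherally, so attention has nowhere to go
        target = rng.choice(related)  # attention switches to a related square
        if rng.random() < p_return:
            continue  # attention returns to the fixated square: no saccade
        current = target  # saccade to the new square
        fixations.append(current)
    return fixations


# Hypothetical relation table: squares s1..s4 are mutually related, "edge" is isolated.
relations = {
    "s1": ["s2", "s3"],
    "s2": ["s1", "s3", "s4"],
    "s3": ["s1", "s2"],
    "s4": ["s2"],
    "edge": [],
}

fixations = simulate_fixations(relations, start="s1", n_steps=20)
visit_counts = Counter(fixations)
```

In such a walk a square with no relations is never reached, and squares with many relations are revisited more often, which is the bias the text describes.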

Figure 1 is one of the positions used by Tichomirov and Poznyanskaya in their eye-movement experiments; Figure 2 is a record of the first 20 fixations of their expert in this position; and Figure 3 shows the first 15 fixations produced by PERCEIVER in the same position. Of interest is the fact that the PERCEIVER simulation, by means of its simple mechanism of attending to attack and defense relations, shows the same preoccupation with the important pieces as does the human expert.

There are three points we need to make about this simulation. First, no new mechanisms were invoked; it was sufficient to reorganize the lower-level perceptual mechanisms of MATER. The difference between the behavior of MATER and the behavior of PERCEIVER lies largely in a difference in goal or motivation at different stages in the problem solving process. The empirical data from human subjects indicate that initially the player sets himself (not necessarily consciously or deliberately, but perhaps habitually) the task of acquiring information about the chess-significant relations on the board (PERCEIVER). Having acquired this information, he turns to generating moves and exploring their consequences (MATER). There would be no great difficulty in revising MATER to conform to this pattern, with the perceptual, information-gathering phase preceding the cognitive, heuristic search phase. As a matter of fact, one earlier computer chess program, written by Newell, Shaw, and Simon in 1958, had much of this flavor (16), and another such program is now being constructed by Berliner (17).

Second, there is nothing a priori parallel about PERCEIVER; the simple rules that drive the simulated eye around the chessboard are, in fact, serially organized, and it is a simple matter to simulate them in real time on standard computers. Even if realistic time parameters, estimated from human performance, were assigned to the various processes of PERCEIVER, it is still not clear that anything resembling a parallel process would be necessary. This problem is related to the third point.

Third, there is one level of perceptual processing that is finessed and one level that is entirely missing in PERCEIVER. The part that is finessed is the mechanism that recognizes the chess pieces in the first place. What is more important, while PERCEIVER notices attacks and defenses, it has no processes for organizing and remembering this information once it is attended to. But, as we shall see, the organizing process itself drives the eye movements. It is quite plausible that these missing processes operate partly in parallel with the scanning processes of PERCEIVER.

Figure 2. Eye movements of an expert player are recorded for the first 5 seconds, by Tichomirov and Poznyanskaya. The 10 squares occupied by the most active pieces (see Fig. 1) are shaded.


Figure 3. The solid line represents eye movements and the broken lines represent relations noticed peripherally in this record of simulated eye movements during the period of initial orientation from the PERCEIVER program. The 10 squares occupied by the most active pieces (see Fig. 1) are shaded.

The board reconstruction experiment

Nothing in the perceptual mechanisms we have described so far will allow us to account for the spectacular skill of chess masters in reconstructing positions that they have seen for only a few seconds. Both MATER and PERCEIVER gloss over details of the process for recognizing a chess piece: noticing that it is a Bishop, say, rather than a pawn. Each piece is represented by a little bundle of features, such as its color and its type (King, Queen, etc.). The programs do not undertake to explain or simulate the feature extraction process, but simply assume that it is performed and that previous learning has stored in long-term memory the requisite information about the capabilities of the different kinds of pieces. More important, neither program contains any mechanisms for the recognition of meaningful, familiar patterns of pieces; neither program has a mechanism for the extensive storage in long-term memory of familiar patterns, nor indeed do they have a long-term memory of any complexity. But it is precisely this kind of pattern-recognition process that lies at the heart of the master's reconstructive ability.

Elementary perceiver and memorizer. Still retaining our respect for parsimony, we note that there already exists in psychology an information processing theory to explain how feature-bundles can become familiarized, associated with other information in long-term memory, and used as components in larger organizations of structures. This theory, called EPAM (Elementary Perceiver and Memorizer), was initially developed by Feigenbaum to explain some of the principal empirical findings about the rote learning of nonsense syllables in the standard serial anticipation and paired-associate paradigms (18).

Among the striking phenomena that had been observed in rote learning are: (1) a characteristic shape of the serial position curve (in serial anticipation learning), (2) a three-to-one (approximately) time advantage in learning meaningful over meaningless and familiar over unfamiliar syllables, (3) certain characteristic differences in learning times between similar and dissimilar stimulus and response items, and (4) certain conditions that determine whether rote learning will have an incremental or an all-at-once appearance. EPAM has been successful in accounting for all of these phenomena (19).

The program of EPAM, and hence the theory it embodies, is quite simple. EPAM learns by growing a discrimination net: a tree-like structure whose nodes contain tests that may be applied to objects that have been described as bundles of perceptual features. When a familiar object is perceived, it is recognized by being sorted through the EPAM net. At the terminal branches of the EPAM net are stored partial "images" (also in the form of feature bundles) of the objects sorted to the respective terminals, together with other information about the objects.

The EPAM theory also plays an important role in explaining the eye movements. Recall that in the previous section, PERCEIVER was found inadequate because it contained no mechanism for recognizing pieces and patterns of pieces. A more complete theory of eye movements would require that PERCEIVER have access to EPAM.

The processes of EPAM influence the eye movements via the way the discrimination net is searched. Figure 4 illustrates a small section of the net with two terminal nodes. Observe that the nodes contain questions about the contents of specific squares; depending upon what is found at a square, a decision is made concerning which square to query next. In short, the EPAM net is organized as a set of instructions, albeit abstract, for scanning the board for familiar patterns. These instructions must then be interpreted by the perceptual system (PERCEIVER) in order to extract the information, and eye movements may well be necessary to execute the instructions. For small clusters of pieces, some of these successive recognition steps may be executed in a single foveal fixation, without saccadic movement. Thus, eye movements may be of two kinds: (1) initial familiarization, in which simple chess functions (attack, defense) are noticed, and (2) recognition, in which complex patterns are scanned.
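A discrimination net of this kind can be sketched as a small tree whose nodes test one square each. The code below is a toy illustration under stated assumptions, not EPAM itself: the classes `NetNode` and `Terminal`, the function `recognize`, and the particular arrangement of tests are hypothetical. Only the "KR2?" style of test and the two pattern names come from the Figure 4 caption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple, Union


@dataclass
class Terminal:
    """Terminal node: holds the internal name of a familiar pattern (a chunk)."""
    name: str


@dataclass
class NetNode:
    """Test node: asks which piece stands on `square` and branches on the answer."""
    square: str
    branches: Dict[Optional[str], Union["NetNode", Terminal]] = field(default_factory=dict)


def recognize(net: NetNode, board: Dict[str, str]) -> Tuple[Optional[str], List[str]]:
    """Sort a board through the net; return the chunk name (or None) and the squares queried."""
    node: Union[NetNode, Terminal, None] = net
    scanned: List[str] = []
    while isinstance(node, NetNode):
        scanned.append(node.square)      # this query may require an eye movement
        piece = board.get(node.square)   # None means the square is empty
        node = node.branches.get(piece)
        if node is None:
            return None, scanned         # no familiar pattern along this path
    return node.name, scanned


# Hypothetical net (assumed layout): test KR2 first, then KN2, then KB2 or KN3.
net = NetNode("KR2", {
    "P": NetNode("KN2", {
        "P": NetNode("KB2", {"P": Terminal("three-pawns-on-second-rank")}),
        "B": NetNode("KN3", {"P": Terminal("fianchettoed-bishop")}),
    }),
})

pawn_wall = recognize(net, {"KR2": "P", "KN2": "P", "KB2": "P"})
fianchetto = recognize(net, {"KR2": "P", "KN2": "B", "KN3": "P"})
unfamiliar = recognize(net, {})
```

The list of squares returned alongside the chunk name is the scanning order that the net imposes, which is how recognition can drive eye movements in this account.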

This explanation of the eye movements gains additional support from the work of Noton and Stark, who developed independently a similar theory (20). They proposed that people's memory of a picture will determine how that picture is subsequently scanned for recognition, and they presented evidence that, under the appropriate conditions, eye movements followed stereotypic "scanpaths" before the picture was recognized. EPAM makes this same strong assumption: that patterns are recognized by scanning the configuration for specific features in a particular order.

EPAM has a recursive structure. This means that any object, once familiarized and incorporated in the net, can itself serve as a perceptual feature of a more complex object. Thus, once the various types of chess pieces (Kings, pawns, Bishops) have become familiarized, these can become features of more complex configurations, say, a "fianchettoed castled Black King's position" (see Fig. 1 for this pattern in the upper-right part of the board). Once familiarized (and this particular pattern is known to every strong player), such a complex can, in turn, serve as a perceptual feature of a still more complex pattern, e.g. an entire chess position.

We have now illustrated the recursive structure of EPAM with a chess example, but the EPAM program was not constructed with this application in mind. In the context of rote verbal learning, the lowest-level features in EPAM are the geometrical and topological properties of English letters. With familiarization, the EPAM net expands to encompass the letters themselves, which then can be used as components (test nodes) of nonsense syllables. Familiarization of the syllables, in turn, makes these available as components of syllable pairs or lists, and so on. Thus, EPAM postulates a single learning process, identical with what we have been calling familiarization, and a single kind of output of that process, a new unit or chunk.

Figure 4. A portion of the EPAM net for chess shows the terminal nodes for two patterns: (1) three pawns on second rank, and (2) fianchettoed Bishop. At each node is shown the test executed there. For KR2?, for example, read: "What piece stands on the King's Rook Two Square?" The patterns at the terminal nodes are for illustrative purposes only: all the information needed to recognize the pattern is imbedded in the logic of the discrimination net. The terminal node has the internal name of the node, an abstract symbolic reference (internal address) that can be stored in short-term memory as a single chunk.

The EPAM theory implies that the length of time required for a learning task will be proportional to the number of new chunks that have to be familiarized in order to perform the task. This implication also fits the empirical evidence very well, the basic learning time being about 5 seconds per chunk (21).

Chunks and short-term memory. Finally, an additional mechanism, short-term memory, is needed in order to understand the reproduction experiment: a mechanism for holding all that information for the short period of time before it is recalled. George Miller, in order to account for the observed invariances in memory-span experiments, first postulated such a memory system with a constant capacity of about seven chunks (22). Miller showed that the well-known limit on the amount of information that can be held in short-term memory is not to be measured in bits, but in chunks: the capacity is about "seven, plus or minus two" familiar units of any kind. By acquiring new familiar units (e.g. octal digits) and learning to recode information in terms of those units (e.g. recoding from binary to octal), one can hold an increased number of bits in a constant number of chunks in short-term memory (in the example, a gain of three to one). The chunk of EPAM theory has these same characteristics.
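The binary-to-octal recoding can be made concrete in a few lines. This is an illustrative sketch only: the function name `recode_binary_to_octal` and the particular 21-bit string are assumptions, chosen so that the recoded string comes out at seven octal digits.

```python
def recode_binary_to_octal(bits: str) -> str:
    """Recode a binary string into octal digits, three bits per digit."""
    if len(bits) % 3 != 0:
        raise ValueError("length must be a multiple of three")
    return "".join(str(int(bits[i:i + 3], 2)) for i in range(0, len(bits), 3))


bits = "101000111001110010101"         # 21 binary digits: too many to hold one by one
octal = recode_binary_to_octal(bits)   # 7 octal digits: about one memory span

print(octal)                    # 5071625
print(len(bits) // len(octal))  # 3
```

The same seven slots carry three times as many bits once each slot holds an octal digit, which is the three-to-one gain mentioned in the text.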

Since Miller's influential article was published, there has been a tremendous amount of research on short-term memory, and virtually every present-day theory about cognitive processes incorporates such a memory system. Much research on thinking and problem solving has shown that, outside of strategies, the only other human characteristic that consistently limits performance in a wide variety of tasks is the small capacity of short-term memory. And without a short-term memory, EPAM theory by itself does not account for the verbal learning phenomena mentioned earlier. Short-term memory, then, is one of the basic cognitive capacities. For our purposes, we assume that what gets stored in short-term memory are the internal names of chunks (e.g. "fianchettoed castled Black King's position"), which serve as memory addresses or retrieval cues for information about the chunks in long-term memory.

Let us return now to the chessboard reconstruction phenomena. From Miller's chunking hypothesis, EPAM theory, and the limited capacity of short-term memory, we would predict that a chessboard can be reconstructed from information held in short-term memory if, and only if, it can be encoded in not more than about seven familiar perceptual chunks. If a single piece on a particular square constitutes a chunk for a subject, then he should be able to recall only about seven pieces. If he can recall the positions of more than twenty pieces, then it must be that each chunk consists, on average, of a configuration of about three pieces.
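The inference in this paragraph is a one-line division, shown here as a small sketch. The function `pieces_per_chunk` and the constant name are hypothetical, and 21 pieces is simply an assumed example of "more than twenty."

```python
SHORT_TERM_CAPACITY_CHUNKS = 7  # "about seven" chunks, per the text


def pieces_per_chunk(pieces_recalled: int, capacity: int = SHORT_TERM_CAPACITY_CHUNKS) -> float:
    """Average chunk size implied by recalling `pieces_recalled` pieces in `capacity` chunks."""
    return pieces_recalled / capacity


one_piece_chunks = pieces_per_chunk(7)   # a subject whose chunks are single pieces
larger_chunks = pieces_per_chunk(21)     # assumed example of recalling over twenty pieces

print(one_piece_chunks)  # 1.0
print(larger_chunks)     # 3.0
```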


We now have a proposed explanation for the remarkable ability of chess masters to reconstruct positions, an explanation that meets our requirements of parsimony. We have employed only mechanisms that are well rooted in other parts of psychological theory: (1) a limited-capacity short-term memory that can hold the names of only about seven chunks, (2) a vast repertoire of familiar patterns stored as chunks in long-term memory, and a recognition mechanism (the EPAM net) for getting at them, and (3) the related chunking process that builds these patterns and their retrieval mechanisms in the first place.

The next task is to find more direct ways to test the theory. Several routes are open: we can seek direct empirical evidence for the existence of these chunks and see if the memory span for chunks is of the order of seven; we can attempt to simulate the reproduction task using the mechanisms of the theory within a computer program; and we can calculate whether the hypothesis leads to reasonable estimates of the number of familiar chunks a chess master must have stored in long-term memory. We consider these in turn.

Empirical identification of chunks. The logic we used in isolating the chunks was to see if, during the reconstruction of a position, chunk boundaries could be identified by long pauses. Time measurements have been used for identifying chunks in other experimental tasks. McLean and Gregg, for example, had subjects memorize permutations of the alphabet (23). They then timed the intervals (latencies) between successive letters in the subjects' recitals of the lists. They obtained convincing evidence that the permuted alphabet was stored in memory, not as a single uniform list, but as a hierarchy of segments; the individual letter segments most frequently were three or four letters in length. Within-chunk latencies were much shorter than between-chunk latencies.

Adapting this technique to our task, we videotaped subjects reconstructing chess positions and measured the latencies in placing successive pieces. In order to estimate what interval would correspond to a chunk boundary, we performed a second experiment, in which the subject also reconstructed a chess position but with the original position in view. The two boards were so placed that the subject had to turn his head to look from the one to the other. We found that, when the subject placed two or more pieces on the board without turning his head, each latency was almost always under 2 seconds. We assumed that, under these speeded conditions, subjects load a single chunk into short-term memory when they view the board and then look directly over and recall that chunk. (It would be inefficient, under these conditions, to store more than one chunk, because they would then have to store the chunk names, since there isn't enough room in short-term memory to store the structural information comprising more than one chunk, and then at recall use each chunk name in succession to retrieve the chunk from long-term memory, a time-consuming procedure.) We therefore assumed that, in the reconstruction task, a pause longer than 2 seconds indicated the retrieval of a chunk from long-term memory via the chunk name in short-term memory.
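The 2-second criterion described above can be illustrated with a short sketch. This is only a minimal illustration under stated assumptions, not the procedure actually used to score the videotapes: the function name `segment_into_chunks`, the `(piece, latency)` record format, and the sample latencies are all hypothetical.

```python
from typing import List, Tuple

# Threshold taken from the 2-second criterion in the text.
CHUNK_BOUNDARY_SECONDS = 2.0

# Hypothetical record format: (piece label, seconds since the previous placement).
Placement = Tuple[str, float]


def segment_into_chunks(placements: List[Placement],
                        boundary: float = CHUNK_BOUNDARY_SECONDS) -> List[List[str]]:
    """Group successively placed pieces into chunks.

    A latency longer than `boundary` is treated as the start of a new chunk.
    The first piece always opens the first chunk, whatever its latency.
    """
    chunks: List[List[str]] = []
    for piece, latency in placements:
        if not chunks or latency > boundary:
            chunks.append([piece])      # long pause: a new chunk begins
        else:
            chunks[-1].append(piece)    # short pause: same chunk continues
    return chunks


if __name__ == "__main__":
    # Made-up latencies for illustration only (not data from the experiments).
    trial = [("Kg1", 0.0), ("Rf1", 0.8), ("Pg2", 1.1),
             ("Pa7", 3.4), ("Pb6", 0.9), ("Qd8", 2.6)]
    for chunk in segment_into_chunks(trial):
        print(chunk)
```

With a segmentation of this kind, the number of chunks per position and the average chunk size can be read off directly from the returned list.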

To check the plausibility of this 2-second criterion, we counted the number of chess relations that held between pairs of successively placed pieces. The relations counted were attacks, defenses, proximity, identity of type (e.g. both Rooks or pawns), and color. There was a strong negative correlation between numbers of relations and latency (see Fig. 5).

Next, we compared the pattern of frequencies of the between-chunk relations (greater than 2 seconds) with the pattern of the within-chunk relations (less than 2 seconds) and both of these with the pattern that would have been observed had the pieces been replaced in random order. We made this comparison for both forms of the reconstruction experiment, from memory and in sight of the board (see Table 1). For the two forms of the experiment, the within-chunk relational patterns were highly correlated (Pearson correlation coefficient of .89), but these patterns were only slightly correlated with the corresponding between-chunk patterns (coefficients of .12, .18, .10, and .23) and not at all correlated with the random pattern (-.04 and -.03). On the other hand, the two between-chunk patterns were strongly correlated with each other (.91) and with the random pattern (.87 and .81). Thus, there is strong evidence that the 2-second criterion in fact marks chunk boundaries.

Figure 5. Mean latencies between successively placed pieces in the reconstruction task are plotted as a function of the number of chess relations between the pieces. (Horizontal axis: number of relations, 0 to 4.)
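The comparison of relational patterns can be sketched as follows. This is a hedged illustration only: the text does not say exactly how the frequency patterns were tabulated, so the profile used here (relative frequency of each of the five named relations) is an assumption, and the names `split_by_latency`, `relation_profile`, and `pearson` are hypothetical.

```python
from math import sqrt
from typing import Iterable, List, Sequence, Tuple

# The five relations named in the text.
RELATIONS = ("attack", "defense", "proximity", "same_type", "same_color")

# Hypothetical record: (relations holding between two successive pieces, latency in seconds).
Observation = Tuple[Sequence[str], float]


def split_by_latency(observations: Iterable[Observation],
                     boundary: float = 2.0):
    """Separate within-chunk pairs (latency <= boundary) from between-chunk pairs."""
    within, between = [], []
    for relations, latency in observations:
        (between if latency > boundary else within).append(relations)
    return within, between


def relation_profile(pairs: Sequence[Sequence[str]]) -> List[float]:
    """Relative frequency of each relation over a set of successive piece pairs.

    Assumed tabulation: one frequency per relation, in the order of RELATIONS.
    """
    n = len(pairs)
    counts = {r: 0 for r in RELATIONS}
    for relations in pairs:
        for r in relations:
            counts[r] += 1
    return [counts[r] / n if n else 0.0 for r in RELATIONS]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)
```

Under these assumptions, one would compute a profile for each condition (within-chunk, between-chunk, random replacement) and pass pairs of profiles to `pearson` to obtain entries of the kind reported in Table 1.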

What was the nature of the chunks thus delineated? Most of them were local clusters of pieces in arrangements that recur with high frequency in actual chess positions. (The fianchettoed castled King's position mentioned earlier actually occurs in about ten percent of all recent games between grandmasters.) In the case of a subject who is a chess master, we were able to classify 75% of his chunks as highly stereotyped. Of the 77 chunks observed in his performance of the memory experiment, 47 were pawn chains, sometimes with a nearby supporting or blockading piece. Ten chunks were castled King's positions. Twenty-seven chunks were other clusters of pieces of the same color, and 19 of these were of common types: 9 consisted of pieces on their original squares in the back rank, and 9 of connected Rooks or connected Queen and Rook. These are configurations a chess master has seen thousands of times, as often as we have seen many of the familiar words in our reading vocabularies. There is as much reason to suppose in the one case as in the other that they are stored in his long-term memory and that he will usually recognize them when he sees them.

Table 1. Intercorrelation matrix for the Sight-of-Board Constructions (1 and 3), Memory Constructions (2 and 4), and Hypothetical Random Constructions (5).

                             1      2      3      4      5
1. Within-chunk                    .89    .12    .18   -.04
2. Less than 2 sec                        .10    .23   -.03
3. Between-chunk                                 .91    .81
4. Greater than 2 sec                                   .87
5. Random

Thus far the empirical data support our theory, but we must mention one piece of evidence that is equivocal. If we accept the 2-second criterion for chunk boundaries, then we can measure directly the number of chunks our subjects are holding in short-term memory when they attempt to reconstruct the board. Our theory predicts that the number of chunks will be the same for strong and weak players, but that the average chunk size will vary by a factor of two or three with chess skill.

This prediction is not borne out fully. When we compare, for example, the data from the memory experiment for a chess master with the data for a Class A player, we find that the master recalled about twice as many pieces as the Class A player, but the former's chunks averaged only about 50% larger than the latter's, while the average number of chunks he recalled also averaged about 50% more. The average sizes of the first chunks recalled by master and Class A player were 3.8 and 2.6, respectively; the average numbers of chunks per position were 7.7 and 5.7, respectively. Now the latter numbers are of the right order of magnitude, not far from the memory span of seven, but the difference between them is not predicted by the theory. At the moment, we have no good explanation for the discrepancy, but have simply placed it as an item high on the research agenda. Our hunch is that a less simplistic model of the structure of chunks and their interrelations, or of the organization of chunks in short-term memory, will be needed to attain a better second approximation.

The MAPP simulation. A second approach to testing the theory of the chessboard reconstruction task was to build a computer program, MAPP, to simulate the observed phenomena (24). The general outlines of the program follow immediately from our description of the theory. The program contains a learning component to acquire and store in memory a large set of configurations of chess pieces and a performance component to carry out the board reconstruction task (Fig. 6).

Consider first the performance component. When a chess position is presented, the program must scan the board in some way in order to notice the pieces and their relations. The scanning program is a simplified version of PERCEIVER, hence can be viewed as a simulation of the eye movements and control of attention. When a piece is fixated (salient piece), an EPAM-like discrimination process seeks to recognize the cluster of pieces surrounding the fixated piece as a familiar chunk. If it is successful, the symbol designating this chunk is stored in short-term memory. This process is repeated at successive points of fixation until no more pieces become salient or short-term memory capacity is reached, whichever occurs first. Finally, in the reconstruction phase, the terminal information in the EPAM net is used to decode the symbols held in short-term memory into locational information for each of the pieces in a chunk and thus to reconstruct the position.

The learning component of MAPP is a simplified version of the portion of EPAM that grows or elaborates the discrimination net and stores information at its terminal nodes. The input to the program consists of many different configurations of pieces (of two to seven pieces each) that occur frequently as components of chess positions. If such a pattern has been familiarized previously, the program will simply recognize it; if it has not, it will discriminate it from patterns previously learned, will add tests to the EPAM net to implement the discrimination, will create a new terminal node to designate the new pattern, and will store information about the pattern at that node.

Figure 6. A schematic representation of the principal components of MAPP shows the learning and performance processes used to reconstruct a chess position. (Diagram labels: pattern of chess pieces, EPAM-like pattern learner, EPAM net, chess position, salient piece, chunks in short-term memory, reconstructed chess position.)
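The performance loop described for MAPP (fixate a salient piece, recognize a surrounding chunk, store its symbol in short-term memory, then decode) can be sketched roughly as below. This is not the MAPP program: the EPAM discrimination net is replaced here by a plain lookup table, salience is supplied as a ready-made list rather than computed by a PERCEIVER-like scanner, and all names (`PatternMemory`, `reconstruct`, `STM_CAPACITY`) are hypothetical.

```python
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

Piece = Tuple[str, str]        # (piece code, square), e.g. ("wK", "g1")
Pattern = FrozenSet[Piece]

STM_CAPACITY = 7               # "about seven chunks", per the text


class PatternMemory:
    """Stand-in for the EPAM net: a lookup table of familiar patterns.

    Assumption: no discrimination tests are grown; learning just stores the pattern.
    """

    def __init__(self) -> None:
        self._patterns: Dict[str, Pattern] = {}

    def learn(self, name: str, pieces: List[Piece]) -> None:
        self._patterns[name] = frozenset(pieces)

    def recognize(self, fixated: Piece, position: Set[Piece]) -> Optional[str]:
        """Return the name of the largest familiar pattern that contains the
        fixated piece and is fully present in the position, if any."""
        best: Optional[str] = None
        for name, pattern in self._patterns.items():
            if fixated in pattern and pattern <= position:
                if best is None or len(pattern) > len(self._patterns[best]):
                    best = name
        return best

    def decode(self, name: str) -> Pattern:
        """Turn a chunk symbol back into piece locations."""
        return self._patterns[name]


def reconstruct(position: Set[Piece], salient: List[Piece],
                memory: PatternMemory, capacity: int = STM_CAPACITY) -> Set[Piece]:
    """Simulate one presentation and recall of a position."""
    stm: List[str] = []                        # chunk symbols held in short-term memory
    for fixated in salient:                    # successive points of fixation
        if len(stm) >= capacity:               # stop when short-term memory is full
            break
        name = memory.recognize(fixated, position)
        if name is not None and name not in stm:
            stm.append(name)
    recalled: Set[Piece] = set()
    for name in stm:                           # reconstruction phase: decode each symbol
        recalled |= memory.decode(name)
    return recalled
```

In this sketch, recall improves only by adding patterns to the memory, which mirrors the claim that the number of chunk symbols held is fixed while the size of the chunks they designate grows with skill.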

Thus the MAPP program is a hybrid of a simplified PERCEIVER with a simplified EPAM; the finer details of those prior programs are not essential to demonstrating the phenomena. With a net of about 1,000 patterns, the performance of MAPP on the reconstruction task is about equal to that of a Class A player, twice as good as a beginner's, but only half as good as a master's. In a typical set of positions, MAPP recalled 51% of the pieces placed correctly by the master, but only 30% of the pieces missed by the master, indicating that its chunks were not dissimilar from the master's. Finally, the within-chunk chess relations of pieces recalled successively by MAPP were highly similar to those of the human subjects, while the between-chunk relations were close to the random pattern.

The chess master's vocabulary. We can extrapolate from the present performance of the MAPP program to estimate how large a vocabulary of chess patterns would have to be stored in the EPAM net to match the performance of the chess master. The distribution of different patterns by frequency is highly skewed, like the frequency distribution of words in natural language. Assuming that the patterns in the present MAPP net are those most frequently encountered in chess games, and assuming the same degree of skewness for chess patterns as for words, we can estimate that something of the order of 50,000 patterns would have to be stored to match the master's performance. Is this a plausible estimate from other viewpoints? We can check its plausibility in two ways.

First, there are no instant experts in chess, certainly no instant masters or grandmasters. There appears not to be on record any case (including Bobby Fischer) where a person has reached grandmaster level with less than about a decade's intense preoccupation with the game. We would estimate, very roughly, that a master has spent perhaps 10,000 to 50,000 hours staring at chess positions, and a Class A player 1,000 to 5,000 hours. For the master, these times are comparable to the times that highly literate people have spent in reading by the time they reach adulthood. Such people have reading vocabularies of 50,000 words or more. If a chunk is a chunk is a chunk as to learning time (as EPAM theory proposes), then we would expect the chess master to have a comparable chess vocabulary. Our estimate agrees well with that reached previously.

Finally, we may ask: given the variety of possible chess positions from well-played games, how big a vocabulary of patterns must we have so that each position could be represented by a distinct set of seven, or so, patterns? If $N$ is the number of possible positions, while $P$ is the number of patterns, then the requirement is $P^7 > N$. If $P = 50{,}000$, then $P^7$ is approximately $8 \times 10^{32}$. The latter number, in turn, is close to $6^{40}$. Now if we played chess games to a depth of 20 moves for each player and at each choice an average of 6 reasonable moves were available, approximately $6^{40}$ different games could be played. Since there are probably not, on the average, six reasonable moves at each choice point, 50,000 patterns should be more than enough to accommodate the positions that could be reached in such games. It should be emphasized that this estimate is very crude, since it does not take into account that some patterns are much more frequent than others. Nevertheless, it is reassuring that it gives results that are not inconsistent with those arrived at by other routes. Until we can get better data, possibly by expanding the EPAM net, it seems reasonable to assume that a chess master can recognize at least 50,000 different configurations of pieces on sight, and a grandmaster even more.

Familiarity breeds competence

If the MAPP theory provides an explanation (at least a first approximation) of the chess master's superior skill in quickly perceiving chess positions and then reconstructing them from memory, it leaves unexplained the link between this superiority and his chess-playing prowess. How does the theory solve the riddle with which we began: that the statistics of the master's search appear indistinguishable from the statistics of the weaker player's search?

Two facts that have not been much studied in the laboratory, but which are well known in chess circles, need to be mentioned. First, the master and grandmaster not only select good moves but they often, much oftener than weak players, notice these moves in the first few seconds after they look at a new position. Having noticed such a move, the master may continue to analyze the position for some minutes before he is satisfied that it is the best move, and sometimes his analysis will show that his first impulse was wrong. Nevertheless, his ability to notice moves "at a glance" is always astonishing to lesser players.

Second, although the average time per move in serious tournament chess is 3 to 4 minutes (which means that some moves are made rapidly, while others are brooded over for as much as half an hour), a master or grandmaster can beat players of inferior skill while taking only a few seconds per move and playing simultaneously against many players. His play in these games is not of the same quality as in his more deliberate tournament games, but it is strong enough to beat most experts and almost all players of lower class.

The most likely explanation of these facts is that the chess master is not only acquainted with tens of thousands of familiar patterns of pieces, but that with many of these patterns are associated plausible moves that take advantage of the features represented by the pattern (25). Many of the basic heuristics that guide the search for good moves are based on the presence of a pattern on the board. For example, every chess player of even moderate skill is familiar with the advice: "If there's an open file, put a Rook on it." He knows that the advice is not meant quite literally, that what is really meant is "consider putting a Rook on it." The pattern of an open file will trigger the heuristic and initiate a move in the heuristic search. Some patterns (perhaps many hundreds) may actually be associated with an algorithmic solution (traps and combinations that lead to the guaranteed win of a piece, a checkmate, or whatnot) in which a series of moves may be played almost by rote.

Thus, we suggest that the key to understanding chess skill, and the solution to our riddle, lies in understanding these perceptual processes. The patterns that masters perceive will suggest good moves to them. The structure of the search process through possible moves will not be very different from that of weaker players; only the paths suggested by the patterns will be different.

Such a view of chess skill is quite amenable to theorizing in terms of production systems. By a production is meant a routine consisting of two parts: a condition part and an action part. The condition part tests the presence or absence of a specific (perceptual) feature (e.g. an open file); the action part, which is executed whenever the condition is satisfied (whenever the feature is recognized as being present), generates a chess move for consideration that is relevant to that specific feature (e.g. putting a Rook on the open file). A separate analysis routine can then carry out the tree search required for a final evaluation of proposed moves. The advantage of modeling human behavior with production systems is that such systems are very simple and rulelike, avoiding many of the inflexibilities of algorithmic programming languages. They can mimic learning by simply adding new productions (26), and they have the perceptual flavor we need to simulate the pattern-recognition processes in chess.

While the evidence is not yet in, it becomes increasingly plausible that the cognitive processes underlying skilled chess performance have some such organization as this. Such a scheme would account for the association of chess-playing skill with the ability to recognize numerous perceptual patterns on the board.

There is another question which we haven't addressed directly, but whose answer is implicit in what we have been saying. The question is: how does one become a master in the first place? The answer is practice: thousands of hours of practice. This is implicit in the EPAM theory; what is needed is to build up in long-term memory a vast repertoire of patterns and associated plausible moves. Early in practice, these move sequences are arrived at by slow, conscious heuristic search ("If I take that piece, then he takes this piece . . ."), but with practice, the initial condition is seen as a pattern, quickly and unconsciously, and the plausible move comes almost automatically. Such a learning process takes time (years) to build the thousands of familiar chunks needed for master-level chess.

Clearly, practice also interacts with talent, and certain combinations of basic cognitive capacities may have special relevance for chess. But there is no evidence that masters demonstrate more than above-average competence on basic intellectual factors; their talents are chess-specific (although World Champion caliber grandmasters may possess truly exceptional talents along certain dimensions). The acquisition of chess skill depends, in large part, on building up recognition memory for many familiar chess patterns.

We now have an account of perceptual skills in chess that is consistent with theories drawn from other parts of psychology. There is no lack of tasks for continuing research, and the environment of chess continues to be one of the most fruitful for cognitive studies.

References

1. Simon, H. A., and A. Newell. 1964. Information processing in computer and man. American Scientist 52:281-300.

2. Simon, H. A., and P. A. Simon. 1962. Trial and error in solving difficult problems: Evidence from the game of chess. Behavioral Science 7:425-29.

3. Baylor, G. W., Jr., and H. A. Simon. 1966. A chess mating combinations program. AFIPS Conference Proceedings, 1966 Spring Joint Computer Conference. Washington, D.C., Spartan

4. Newell, A., and H. A. Simon. 1972. Human Problem Solving. Englewood Cliffs, New Jersey: Prentice-Hall.

5. de Groot, A. D. 1965. Het Denken van den Schaker. Trans. as Thought and Choice in Chess. The Hague: Mouton & Company.

6. Djakow, I. N., N. W. Petrowski, and P. A. Rudik. 1927. Psychologie des Schachspiel. Berlin: Walter de Gruyter.

7. Chase, W. G., and H. A. Simon. 1973. Perception in chess. Cognitive Psychology 4:55-81.

8. Jongman, R. W. 1968. Het Oog van de Meester. (Doctoral dissertation, University of Amsterdam.) Assen: Van Gorcum & Company.

9. Neisser, U. 1963. The imitation of man by machine. Science 139:193-97.

10. Sternberg, S. 1969. Memory-scanning: Mental processes revealed by reaction-time experiments. American Scientist 57:421-57.

11. Tichomirov, O. K., and E. D. Poznyanskaya. 1966. An investigation of visual search as a means of analyzing heuristics. Soviet Psychology 5:2-15. (Trans. from Voprosy Psikhologii 2(4):39-53.)

12. Noordzij, P. 1967. Registratie van oogbewegingen bij schakers. Unpublished working paper, Psychology Laboratory of the University of Amsterdam.

13. Williams, L. G. 1966. The effect of target specification on objects fixated during visual search. Perception & Psychophysics 1:315-18.

14. Ellis, S. H., and W. G. Chase. 1971. Parallel processing in item recognition. Perception & Psychophysics 10:379-84.

15. Simon, H. A., and M. Barenfeld. 1969. Information-processing analysis of perceptual processes in problem solving. Psychological Review 76:473-83.

16. Newell, A., J. C. Shaw, and H. A. Simon. 1958c. Chess-playing programs and the problem of complexity. IBM Journal of Research and Development 2:320-35.

17. Private communication.

18. Feigenbaum, E. A. 1961. The simulation of verbal learning behavior. Proceedings of the Western Joint Computer Conference, 121-32. (Reprinted in Feigenbaum & Feldman, eds. 1963. Computers and Thought. New York: McGraw-Hill.)

19. Simon, H. A., and E. A. Feigenbaum. 1964. An information processing theory of some effects of similarity, familiarization, and meaningfulness in verbal learning. Journal of Verbal Learning and Verbal Behavior 3:385-96.

20. Noton, D., and L. Stark. 1971. Scanpaths in eye movements during pattern perception. Science 171:308-11.

21. Simon, H. A. 1969. The Sciences of the Artificial. Cambridge: M.I.T. Press, pp. 35-38.

22. Miller, G. A. 1956. The magical number seven, plus or minus two. Psychological Review 63:81-97.

23. McLean, R. S., and L. W. Gregg. 1967. Effects of induced chunking on temporal aspects of serial recitation. Journal of Experimental Psychology 74(4):455-59.

24. Simon, H. A., and K. Gilmartin. 1973. A simulation of memory for chess positions. Cognitive Psychology (in press).

25. Chase, W. G., and H. A. Simon. 1973. The mind's eye in chess. In Visual Information Processing, ed. W. G. Chase. Proceedings of Eighth Annual Carnegie Psychology Symposium. New York: Academic Press.

26. Newell, A., and H. A. Simon. 1972. Human Problem Solving. Englewood Cliffs, New Jersey: Prentice-Hall.

