8/17/2019 How Chess Players Think-Patrick Turner
How chess players think:
evidence for the role of search
at Expert level and below
Patrick Turner
First degree: BSc. (Hons) Mathematics
Open University personal identifier: U6094525
Dissertation submitted for:
MSc. in Psychological Research Methods
March 2005
Abstract
There are two competing views of the dominant mechanism underpinning chess
thinking – pattern recognition or search-and-evaluation? Whilst the recent
development of template theory has gone some way to unifying the two existing
theories, there still remain a great many unanswered questions concerning the nature
of the chess thinking process – in particular the relative contribution of recognition
and search-and-evaluation to chess skill. Although recognition-based theories of
chess thinking do not deny that search is part of the thought process, they emphasise
that recognition of the position provides for highly selective search. Thus an Expert
need not search any faster, or deeper, to arrive at a good move – he narrows down his
search by pattern recognition to focus his analysis on the good moves. Conversely,
search-and-evaluation theories emphasise the ability to search deeper, wider, faster
and more thoroughly, coupled with the ability to evaluate leaf nodes more accurately,
as the basis for the selection of good moves. They do not claim that recognition is not
involved in directing search – merely that it is not the dominant mechanism.
The aim of the research discussed here was to investigate support for both recognition
and search theories of chess skill through experimentation involving chess players at
two levels (Expert and Class A/B) completing a ‘choice of next move’ task for three
chess positions. Two major conclusions are drawn from the results. Firstly, there is
strong evidence for differences in search capabilities across skill levels in chess
players, supporting the results of Gobet (1998a) and others. Such evidence argues
against the basis of de Groot’s main conclusion (1965) that recognition is the
dominant mechanism underpinning chess skill. Proponents of template theory (e.g.
Gobet & Simon, 1998a) argue that such continued results for search differences
across skill levels do not undermine the recognition-based theory of chess skill itself.
The second major conclusion to be drawn, however, suggests that there is less support
for the role of recognition than in previous studies, such as Gobet’s (1998a). It may
be that the results hold only between Class A/B players and Experts. This would
provide evidence that the better players at club level are superior primarily
because of their search capabilities and not recognition. A different model of chess
skill may be required for players below the level of Master.
Table of contents
Introduction
Literature review
Methodology
Analysis
Project Review
Conclusions
Appendix I: de Groot positions
Appendix II: Protocol analysis
Bibliography
Introduction
The game of chess provides an ideal environment for the study of human
decision-making in complex domains. As such, it has provided the basis for a
number of studies into human cognition, including perception, memory and
decision-making. Over the decades following the publication, in 1965, of
Adriaan de Groot’s original research into chess thinking, there have emerged two
schools of thought concerning how chess players think – the family of
recognition-based theories typified by chunking theory, due to de Groot (1965),
Chase & Simon (Gobet and Simon, 1998a, 1998b; Gobet, 2004) among others;
and the search-and-evaluation theory of Holding (Holding, 1985; Gobet, 2004).
Whilst the recent development of template theory has gone some way to unifying
the two theories, there still remain a great many unanswered questions
concerning the nature of the chess thinking process – in particular the relative
contribution of recognition and search-and-evaluation (often simply referred to
as ‘search’) to chess skill.
The structure of chess thinking
The two theories agree on the basic structure of the chess thought process. De
Groot (1965) showed that this process can be represented as a sequence of
mental operations on not only the perceived position that the player is confronted
with but also imagined positions as might occur if certain sequences of moves
are played – a development of Selz’s Framework of Productive Thinking (de
Groot, 1965). Briefly, the chess thinking process comprises three main phases –
a phase of orientation, noting possible threats, plans and candidate moves; a
phase of elaboration, within which specific sequences of moves are considered
(“I move here, then he moves here” etc.), each of which terminates in an
evaluation of the desirability of an imagined position (a ‘leaf node’); and a final
phase within which the best move so far considered may be checked before the
player commits to it (de Groot, 1965, pp100-116). It is within the middle phase
that search activity is carried out. Although recognition-based theories of chess
thinking do not deny that search is part of the thought process, they emphasise
that recognition of the position (and good moves or general plans to undertake in
such a position) serves to make search activity highly selective. Thus an expert
player need not search any faster, or deeper, to arrive at a good move – he
narrows down his search by recognition to focus his analysis on the good moves.
Conversely, search-and-evaluation theories emphasise the ability to search
deeper, wider, faster and more thoroughly, coupled with the ability to evaluate
leaf nodes more accurately, as the basis for the selection of good moves. They
do not claim that recognition is not involved in directing search – merely that it is
not the dominant mechanism.
Newell & Simon (1972) formalised de Groot’s framework in the Problem
Behaviour Graph (PBG) model. A PBG characterises the phase of elaboration in
chess thinking, where search is undertaken. It consists of sequences
of moves, each beginning with a candidate move (or base move) and alternating
between moves for each side, with possible branching in each sequence. Each branch
ends in a leaf node and each leaf node is evaluated, usually only as ‘good’ or
‘bad’ for the player on move. As such, PBGs allow for the extraction of search
variables such as ‘number of nodes searched’ and ‘maximum depth of search’.
It is more difficult to extract variables characterising recognition although
‘number of base moves considered’ serves to characterise option generation
before any search is conducted.
Aims
The aim of the research discussed here was to investigate support for both
recognition and search theories of chess skill through experimentation involving
chess players of different calibres completing a ‘choice of next move’ task for a
small number of chess positions with varying character.
The experimental aims were to establish significant differences in choice of next
move and search behaviour across two groups of chess players of differing
calibres, for three different chess positions. This was to be achieved through the
application of de Groot’s experimental procedure and using the analysis methods
of de Groot (1965) and Newell & Simon (1972). Data from the most recent
study of this kind, that of Gobet (1998a), was also to be used for comparisons of
results.
The specific research questions included:
Do club-level chess players of differing calibres differ in terms of quality
of move selection?
Do club-level chess players of differing calibres differ in terms of
capacity of search, mean and maximal search depth, and thoroughness of
search?
To what degree do the levels of search activity in club-level players fit
with existing models of chess thinking?
Novelty
The experimentation and analysis outlined above is not completely novel. It
draws much of the experimental procedure, analysis methods and study variables
from existing research in the field, such as de Groot (1965) and subsequent
replications of that original set of experiments (Newell & Simon, 1972; Gobet,
1998a). It is novel in two respects, however:
It comprises a repeated measures choice of next move task across three
positions; each of the three studies named above focused only on one
position;
It samples from club-level players only (Experts down to Class B) and
therefore serves to test some of de Groot’s original conclusions, which
were based on an extremely high calibre sample including Grandmasters.
Motivation for this dissertation
The choice of subject matter for this dissertation is motivated by twin interests in
human decision-making in naturalistic settings and empirical research into
human decision-making. An enduring methodological problem that human
decision-making research faces is the design of experiments that both preserve
ecological validity (i.e. a naturalistic decision-making setting and task) and
enable the valid measurement of important variables. Chess is a rare case of a
structured and bounded decision-making environment that still affords
ecologically valid, yet well-defined, experimentation.
Structure of this dissertation
The remainder of this dissertation is structured as follows:
The Literature Review introduces the main arguments for both
recognition- and search-based theories of chess skill.
The Methodology chapter outlines the experimental design, experimental
procedure and analysis techniques undertaken.
The Analysis chapter sets out the results and analysis from the
experiment.
The Project Review reflects upon the changes in focus for the research
throughout its course, including modifications to the design, the success
of the experiment, the focusing of the analysis and the validity of the
methods.
The Conclusions chapter revisits the main findings of the analysis in the
context of the original research questions and the wider debate
concerning the nature of chess skill.
Literature review
The game of chess is ideally suited to a range of studies in cognitive psychology,
particularly memory, expertise and decision-making. Success at chess is
completely dependent upon skill and, whilst the configuration of the board and
pieces, and the rules of the game can be understood relatively quickly, a typical
chess position offers a non-trivial decision-making task, even for highly skilled
players. This is because of the inherent complexity that the game offers and,
although information about each position is known perfectly and the ultimate
goal of the game is certain, this complexity renders chess a credible domain of
interest for the study of human decision-making. There is also a substantial
amount of psychological literature on chess, perhaps because of the relatively
simple manner in which experiments can be conducted.
Cognitive psychology and chess enjoy a history of over a century of research; the
key question that has engaged psychologists throughout has been, “What
constitutes skill at chess?” Although it is generally agreed that chess skill is
based upon both recognition (the ability to match patterns based on the
possession of ‘good’ patterns) and look-ahead search (essentially the ability to
compute sequences of moves), opinions are polarised and there are distinct
camps that espouse the dominance of one mechanism over the other.
Most modern research on chess skill has its foundations in the studies of the
Dutch international chess master Adriaan de Groot, whose original experiments,
conducted between 1938 and 1944, served to develop both theories of expertise
and decision-making, and corresponding experimental methods. The remainder
of this chapter is divided into sections, each of which discusses a key
development in one or both of the competing recognition-based and search-based
theories of chess skill.
The role of recognition: de Groot
De Groot (1965) was concerned with the thought processes underlying expert
chess players’ choice of next move decisions. His main experiment was a
‘choice of next move’ task, conducted with a relatively small sample of good
chess players, ranging from grandmasters (including Alexander Alekhine and
Max Euwe) to Class C players (approximately average club level). De Groot
used a set of chess positions, typically middlegame positions taken from games
which he had played. De Groot set these positions up on a chessboard and asked
his subjects, assuming the role of the player on move, to think of a move and
play it on the board as if they were involved in an actual tournament game. The
only extra stipulation was that the subject ‘thought aloud’ as he or she did so, so that
de Groot could record the way in which the subject arrived at his or her next
move. (This method is discussed in further detail in the next section.)
De Groot recorded each subject’s thought as a verbal protocol which he then
coded, using Otto Selz’s framework of productive thinking (de Groot, 1965). De
Groot was motivated by Selz’s framework, which described thinking as a
‘hierarchically organised linear series of operations’ (de Groot, 1965, p vi) and,
in fact, sought to test it through the coding of the protocols. De Groot
demonstrated that he could successfully represent the protocols within this
framework, which, at the macro-structural level, comprises three phases: a first
phase of orientation that may include a listing of candidate moves for
consideration; a phase of elaboration whereby candidate moves are examined in
detail through the consideration of possible sequences of moves that they
precipitate; and a final phase in which a move is selected, possibly following
some form of summarisation. De Groot’s coding, which was later formalised by
Newell and Simon (1972) as a Problem Behaviour Graph (PBG), captured the
history of all sequences of moves, each beginning with a base move (candidate
next move) considered by the subject. Such sequences included branching,
whereby the subject considered two or more possible sequences from some
branching move coming after the base move. Each sequence terminated in an
evaluation (positive, negative or unexpressed). Since this coding captured all the
moves considered it allowed for the reinvestigation of base moves.
De Groot did not expose every player to every position; positions A, B and C
were most commonly used and de Groot chose only to extract quantitative
variables from the encoded protocols for these positions (seen by 19, 6 and 6
players, respectively). These variables included the chosen move, the time taken
for each phase, the ordered sequence of base moves considered (candidate next
moves), the total number of moves, and variables concerning the frequency of
both immediate and non-immediate reinvestigations. De Groot had also analysed
positions A to C extensively to generate an order of ‘move quality’ for each of
the legal moves in each position.
De Groot’s first result was that stronger players chose better quality moves
than weaker players. Secondly, there was little difference between masters and
Experts (capitalised here when referring to the class of players directly below
masters, and not capitalised when referring, in general, to people possessing
expertise) on the various ‘search variables’, including the total number of moves
considered (typically fewer than 100), depth of search or rate of search (number of
moves per minute). De Groot then asserted “the master does not necessarily
calculate deeper, but the variations that he does calculate are much more to the
point; he sizes up positions more easily and, especially, more accurately” (1965,
p320). Although de Groot stated that he still expected greater search abilities in
high calibre players, he conceded that such differences did not explain the
observed performance differences. Having failed to establish skill differences on
these search variables, de Groot therefore conducted a second experiment based
on a ‘recall’ task, originally conducted – in flawed form – by Djakow, Rudik and
Petrowski in 1927 (Gobet, 2004). Players were exposed to 16 positions, taken
from relatively obscure master games, each for a short length of time (between 2
and 15 seconds). After each presentation the player was requested to reproduce
the position verbally and de Groot developed a scoring scheme for assessing the
corresponding verbal protocols. The results showed, significantly, that
grandmasters outperformed weaker players. De Groot inferred that experience
(in its effect upon perceptual processes) was the contributory factor, asserting
that “the position is perceived in large complexes, each of which hangs together as
a genetic, functional and/or dynamic unit. For the master such complexes are of
a typical nature.” (1965, p329, italics from original text). De Groot also
suggested that “eye movements undoubtedly come into play” – a hypothesis
proved, in 1996, by de Groot and Gobet (Gobet, 2004). De Groot conducted a
detailed analysis of the verbal protocols for the recall task and identified content-
specific themes that demanded differing degrees of attention. It is interesting to
compare this approach with the quantitative (information-theoretic) approach of
Chase and Simon in the development of chunking theory (see below).
Returning to the results of the ‘choice of next move’ experiment, one of de
Groot’s innovations was an extension of the Selzian framework of productive
thinking. De Groot noticed that players employed a method that he denoted
‘progressive deepening’ – the reinvestigation of sequences emanating from the
same base move several times, either immediately or non-immediately, with the
tendency to search both progressively wider (examining more branches) and
deeper each time before evaluating at leaf nodes. This is referred to as ‘rough
cut, fine cut’ by Newell and Simon (1972, p752). Selz’s concept of ‘subsidiary
methods’ stated that human problem solving is based on, essentially, exhaustive
depth-first search in support of one plan followed by depth-first search for a
second plan if the first fails etc. (where ‘plan’ defines the context of evaluation
of leaf nodes). De Groot effectively redefined ‘exhaustiveness’ in relative terms
(1965, p270). This allowed for the reinvestigation of any base move, with the
examination of ever deeper and wider extensions to the search tree emanating
from each move. De Groot proposed that the varying criteria by which a
sequence is considered to be ‘exhausted’ upon investigation/reinvestigation –
and thus the criteria by which the corresponding base move is evaluated as good
or not – are based on recognition.
De Groot’s main conclusion, across both of his experiments, was that
recognition (based on the possession of perceptual chess-specific knowledge)
and the application of an effective set of heuristic goal-driven rules were
the major components of chess skill. The identification of recognition, in
particular, as a key mechanism refuted the then commonly held view that chess
skill was innate and had a large impact on theories of expertise that still persists.
Information processing and Problem Behaviour Graphs
The representation of human problem solving in the Selzian framework was
attractive to Herbert Simon, who viewed such an activity, essentially, as
information processing. Simon was also the originator of the concepts of
bounded rationality, which states that there are limits on human information
processing that, in turn, impose limits on human rationality, and satisficing,
which describes the sufficient, yet sub-optimal, human approach to decision-
making where bounded rationality is enforced, e.g. due to the complexity of the
decision-making environment. Chess is certainly one such environment and
there are clear parallels between satisficing and de Groot’s progressive
deepening, the latter of which seeks a positive evaluation of a move even though
a thorough analysis may be lacking.
In 1965, Newell and Simon (1972) reinvestigated and replicated de Groot’s
‘choice of next move’ experiment with the aim of investigating whether the
human decision-maker, in selecting his next move in chess, could be considered
an Information Processing System (IPS) and whether a thorough task analysis
would enable them to enrich their IPS model. Newell and Simon advocated the
elicitation of verbal protocols but emphasised their quantitative analysis rather
than de Groot’s extensive qualitative analysis. As such they built on de Groot’s
enhanced Selzian framework and formalised the coding of the verbal protocol as
a Problem Behaviour Graph (PBG).
A PBG is a descriptive chronological model of an individual’s thinking
throughout the course of a problem-solving task. It concerns the navigation of a
human decision maker along sequences of linked nodes, each representing some
projected state of the environment with links representing the application of an
operator to a previous node. This forms a chronologically ordered set of sequences
of linked nodes, possibly with branching (representing the conception of two
different operators on a given node), ending at given leaf nodes. A PBG for
choosing the next move in a chess position represents, as nodes, future chess
positions that may be arrived at through the application of a sequence of moves
for white and black. Each initial move, or base move, represents one of the candidate
moves that a player conceives, and chooses from, in completing the task. Each
leaf node terminates in an evaluation (including a ‘non-evaluation’) of the
position at that point. Note that a PBG is not equivalent to a search tree because
the latter models all sequences of moves considered by the chess player in
selecting his next move once only whereas a PBG provides a chronological view
on that player’s considerations. As such, PBGs may contain a number
of sequences beginning with the same base move, which may or may not be
different (indeed, identical sequences may or may not include different
evaluations). Whilst most of the work underpinning PBGs is due to Selz and de
Groot, Newell and Simon added the graphical formalism. To differentiate
between different sequences, they redefined de Groot’s ‘sub-phases’ as
episodes – distinct chains of reasoning beginning with a base move, whether it be
different or the same as that considered beforehand.
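As an illustration only, the structure described above can be sketched as a small data model in Python. The names used here (Node, Episode, summarise), the move labels and the convention of counting depth in half-moves from the base move are assumptions made for this sketch; they are not taken from Newell and Simon’s formalism. The sketch keeps episodes in chronological order, so that a base move can recur, and then collapses them into unique lines to give the search-tree view.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Node:
    """An imagined position, reached by playing `move` from the parent position."""
    move: str                                    # move label (illustrative notation)
    children: List["Node"] = field(default_factory=list)
    evaluation: Optional[str] = None             # "+", "-" or None (a 'non-evaluation')


@dataclass
class Episode:
    """One chain of reasoning; `base` is the base (candidate) move that starts it."""
    base: Node


def count_nodes(node: Node) -> int:
    """Number of positions visited in the subtree rooted at `node`."""
    return 1 + sum(count_nodes(child) for child in node.children)


def depth(node: Node) -> int:
    """Length, in half-moves, of the longest line starting at `node` (assumed convention)."""
    return 1 + max((depth(child) for child in node.children), default=0)


def lines(node: Node, prefix: Tuple[str, ...] = ()) -> Set[Tuple[str, ...]]:
    """All move sequences running from `node` down to a leaf node."""
    path = prefix + (node.move,)
    if not node.children:
        return {path}
    found: Set[Tuple[str, ...]] = set()
    for child in node.children:
        found |= lines(child, path)
    return found


def summarise(pbg: List[Episode]) -> Dict[str, int]:
    """Extract simple search variables from a chronological list of episodes."""
    bases = [episode.base.move for episode in pbg]
    # An immediate reinvestigation repeats the base move of the previous episode.
    immediate = sum(1 for prev, cur in zip(bases, bases[1:]) if prev == cur)
    unique_lines: Set[Tuple[str, ...]] = set()
    for episode in pbg:
        unique_lines |= lines(episode.base)
    return {
        "episodes": len(pbg),
        "base_moves": len(set(bases)),
        "immediate_reinvestigations": immediate,
        "non_immediate_reinvestigations": len(bases) - len(set(bases)) - immediate,
        "nodes_searched": sum(count_nodes(e.base) for e in pbg),
        "max_depth": max((depth(e.base) for e in pbg), default=0),
        "unique_lines": len(unique_lines),  # the search-tree view: each line once only
    }


# Invented example: the subject returns to Bxd5 and looks wider and deeper.
pbg = [
    Episode(Node("Bxd5", [Node("exd5", evaluation="+")])),
    Episode(Node("Nxd5", [Node("Nxd5", evaluation="-")])),
    Episode(Node("Bxd5", [Node("exd5", [Node("Qf3", evaluation="+")]),
                          Node("Nxd5", evaluation="-")])),
]
stats = summarise(pbg)
```

In this invented protocol the third episode reinvestigates the first base move non-immediately, so the PBG holds three episodes but only two distinct base moves.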
The advantage of the PBG formulation is that it provides for the quantitative
analysis of the search-and-evaluation process. Newell and Simon (1972)
examined the quantitative variables derived from the protocol of a single subject
(S2) and compared them with those of de Groot’s sample, noting the consensus
in results in terms of both quality of move and decision-making method. In
particular, S2 exhibited progressive deepening.
Perhaps the most important contribution of Newell and Simon’s 1965 research
was their detailed analysis of the search strategies of S2 and de Groot’s subjects.
They proposed a small number of principles for the generation of moves and
episodes – essentially an attempt at naming the ‘heuristic rules’ that de Groot had
suggested contributed to chess skill. Newell and Simon did not find much
evidence, in the protocols, of means-ends analysis (goal-setting and the
identification and analysis of means – i.e. moves – to achieve those goals)
although they noted both that all protocols studied concerned position A – a
highly tactical position in which strategic plans are of less consequence – and
that de Groot had observed numerous examples of goal-setting in more strategic
positions (1965, pp157-9). Despite their characterisation of search strategies,
Newell and Simon share de Groot’s view on the importance of recognition in
chess skill, particularly upon immediate consideration of a position and prior to
any search: “players notice a small number of considerable moves, and do not
notice (or at least do not mention noticing) the large number of remaining legal
moves” (Newell & Simon, 1972, p775), that is, there is a perceptual process guiding
search from the outset. This embodies the ‘first phase’ in de Groot’s
macro-structural model of next-move selection.
Chase and Simon’s Chunking theory
Chunking theory emerged from the 1973 experiments of Chase and Simon
(Gobet, 2004) as a general theory of expertise, originally applied to chess. In
line with de Groot’s conclusions, it asserts that recognition is the key mechanism
underpinning expertise. In the experiment, three classes of player (Masters,
Experts and novices) were exposed to middle and end-game positions of two
types: positions from actual games and random positions matched for the number
of pieces present. There were two tasks: the ‘recall’ task was essentially a
modification of de Groot’s procedure although all positions were shown for 5
seconds and the players were subsequently asked to reconstruct them on a chess
board; the ‘copy’ task differed in that the positions were not hidden from the
subjects during the reconstruction phase. For the positions drawn from
actual games, success at reconstruction (according to the number of pieces
correctly placed) was found to be proportional to skill level. For the random
positions, however, there were no significant differences across the three groups
of players. Chase and Simon concluded that the improved performance for more
skilled players was not due to any superiority in short-term memory, but to the
recognition of familiar patterns.
Chase and Simon (Gobet, 2004) noted that, in both tasks, subjects reconstructed
pieces in groups, as defined by the intervals between piece placement in the
recall task, and by glances at the stimulus position in the copy task; further,
pieces in the same group tended to share more meaningful relations (e.g.
attacking, defending, same colour, same type etc. – judged by skilled players)
than those in different groups. Chase and Simon denoted these patterns of pieces
‘chunks’. The experiment also provided evidence that better players possess
bigger chunks (in terms of number of component pieces) and more chunks.
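The grouping of placements by latency in the recall task can be illustrated with a minimal Python sketch. It assumes that each placement is recorded as a piece label with a timestamp in seconds, and it uses a 2-second threshold, the figure reported for the 1973 recall experiments; the function name, the piece labels and the timings are invented for illustration.

```python
from typing import List, Optional, Tuple


def segment_into_chunks(placements: List[Tuple[str, float]],
                        max_gap: float = 2.0) -> List[List[str]]:
    """Group successive placements whose latency is below `max_gap` seconds.

    `placements` holds (piece label, time of placement in seconds) in the
    order in which the pieces were put on the board.
    """
    chunks: List[List[str]] = []
    previous_time: Optional[float] = None
    for piece, time in placements:
        # A pause of max_gap seconds or more starts a new group.
        if previous_time is None or time - previous_time >= max_gap:
            chunks.append([])
        chunks[-1].append(piece)
        previous_time = time
    return chunks


# Invented timings: two long pauses split the recall into three groups.
placements = [("Kg1", 0.0), ("Rf1", 0.8), ("Pf2", 1.5), ("Pg2", 2.1),
              ("Nc6", 5.0), ("Pd5", 5.9), ("Qd8", 9.2)]
chunks = segment_into_chunks(placements)
```

The threshold is kept as a parameter because the appropriate latency is itself open to question.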
Chase and Simon (Gobet, 2004) asserted that chunks are stored in short-term
memory (STM) as pointers to patterns encoded in semantic long-term memory
(LTM). Essentially, chunks are akin to the conditions of productions in LTM
that associate patterns with moves. Chase and Simon also expressed time
parameters for the rate of learning (approximately 8 seconds per chunk) and
STM limits (7 chunks, in line with Miller’s predictions).
In a second 1973 paper, Chase and Simon also proposed that a secondary
transient memory store, a visuo-spatial store known as the mind’s eye, provides
an internal representation of the position upon which mental operations may be
carried out (e.g. the moves suggested by LTM). The position in the mind’s eye is
also available to perceptual processes and thus chunks in a projected position
following a potential move may also be perceived and matched against patterns
in LTM. Thus chunking theory offers an explanation of how recognition may be
combined with mental simulation to arrive at good moves. It should be noted,
however, that the mind’s eye extension to the theory is not supported by
empirical evidence since the experiment did not include a decision-making task.
Chase and Simon conducted a second experiment to demonstrate the stability of
chunks. The criterion for stability was: a chunk is considered to be repeated if at
least two thirds of its component pieces are recalled together. Stability of chunks
for class A players was 96%, versus 65% for the master player in the sample.
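The stability criterion lends itself to a short worked example. The Python sketch below treats each chunk as a set of piece labels and counts a chunk as repeated when at least two thirds of its pieces appear together in a single chunk of a second reconstruction. The function names, the labels and the way the two reconstructions are compared are assumptions for illustration; the text above gives only the two-thirds criterion.

```python
from typing import List, Set


def is_repeated(chunk: Set[str], later_chunks: List[Set[str]]) -> bool:
    """True if at least two thirds of `chunk` is recalled together in one later chunk."""
    # Integer comparison avoids rounding problems with the fraction 2/3.
    return any(3 * len(chunk & other) >= 2 * len(chunk) for other in later_chunks)


def stability(first: List[Set[str]], second: List[Set[str]]) -> float:
    """Proportion of chunks from the first reconstruction repeated in the second."""
    if not first:
        return 0.0
    return sum(is_repeated(chunk, second) for chunk in first) / len(first)


# Invented reconstructions of the same position on two occasions.
first = [{"Kg1", "Rf1", "Pg2"}, {"Nc6", "Pd5", "Qd8"}, {"Bb2", "Pa2"}]
second = [{"Kg1", "Rf1", "Ph2"}, {"Nc6", "Pe6"}, {"Bb2", "Pa2", "Pb3"}]
score = stability(first, second)
```

Here the first and third chunks meet the criterion and the second does not, so two of the three chunks count as stable.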
Support for chunking theory comes from Charness (Gobet, 2004) who, in 1974,
conducted a recall experiment with positions presented verbally, at a rapid rate
(average latency 2.3 seconds per piece) in three ways: by Chase and Simon’s
relations; by columns (on the board); or randomly. The best recall was found for
Chase and Simon’s relations and the worst for the random condition.
Criticisms of chunking theory
Chunking theory was not without its critics, however. The criticisms rest on a
number of bases and include both methodological and theoretical
criticisms. Gobet and Simon (1998b) summarised the methodological criticisms
raised by many authors, including Holding (1985), and highlighted some
methodological concerns of their own, including the small sample size in the
1973 experiments and the one-to-one mapping of pieces placed in a single ‘burst
of activity’ onto chunks. A single burst of activity was defined, in the 1973
recall experiments, as a sequence of piece placement with latencies less than 2
seconds between pieces. Gobet and Simon (1998a) argued that this latency may
actually increase over the recall period. Further, a burst of activity is also
dependent upon the physical limitations of picking up all component pieces of a
chunk in one hand. The most outspoken critic of the theory was, perhaps,
Holding (1985), who advocated the roles of both search and conceptual
knowledge (rather than perceptual chunks) in chess skill. Holding’s specific
arguments included the following:
Chunks may be encoded into LTM in less than 8 seconds;
The size of chunks is too small to reflect conceptual knowledge;
Although chess skill can explain memory performance, there is no
evidence for a causal relationship in the opposite direction, that is that
memory (and recognition) explains chess skill.
The first criticism was based on recall experiments with interpolated tasks
designed to cause STM interference (e.g. Charness’s experiment of 1976,
reported in Holding, 1985), which had shown no effect on memory performance,
suggesting that LTM encoding for chunks was rapid. The second criticism is
based on Holding’s assertion that chunking theory “does not provide a sufficient
basis for maintaining that chess memory is organised in small chunks whose
labels are held in STM. Instead it appears that chess players who actively
process the given positions are able to integrate the general characteristics of
these positions in a hierarchical, prototypical or schematic format, not necessarily
based on pairs of pieces, that constitutes an ‘understanding’ of the positions”
(Holding, 1985 p130). Key to this argument is Holding’s inspection of both
positions and corresponding chunks from Chase and Simon’s experiments. He
claimed that the actual chunks identified bear little relation to the important
playing themes in that same position and concluded “if we assume that all the
chunks for memorising purposes are to be identified on one basis and the patterns
for move selection on another, the theory loses a good deal of its economy”
(Holding, 1985 p103). Indeed, if we accept the criterion for the stability of chunks
across experiments, it appears that better players perceive positions in a number
of ways (65% stability is a fairly low figure). The final criticism is backed up
with evidence from Holding and Reynold‟s (1982) experiment with random
positions. Players of different skill levels from novice to Expert completed two
tasks: the first was a recall task and the second was a choice of next move task on
the corrected positions. As expected, there was no effect of skill on memory, but
there was a significant effect of skill on (assessed) quality of next move.
Holding and Reynolds concluded that “the evidence shows that skill differences
continue to appear in situations where recognition by chunking is impossible”
(Holding, 1985 p133). In light of such criticisms, Gobet and Simon‟s replicated
the 1973 experiments and made corresponding modifications to the theory
(discussed in Gobet and Simon’s template theory, below).
SEEK Theory: the contribution of Holding
Above all of Holding’s specific criticisms of Chunking Theory, his central belief
was that it was basically flawed – although he accepted the result that skill has an
effect on memory for meaningful chess positions, he believed that the role of
recognition (based on memory) was insufficient in explaining chess skill.
Holding promoted the importance of search, evaluation and knowledge to chess
skill and expressed this idea in his SEEK theory. It is important to understand
Holding’s distinction between the mechanisms of ‘recognition’ and ‘search’
since his use of terminology differs slightly from that of other researchers. To
Holding ‘recognition’ defines the key mechanism of Chunking Theory as the
association between perceived patterns (chunks) and good moves – without
search. ‘Search’ involves a combination of planning a selective search through
candidate moves and sequences, and evaluating the utility of these moves to
support next move selection. Perhaps the most confusing aspect of Holding’s
definitions is that he asserts that pattern recognition from semantic knowledge
also plays a key role in directing search by suggesting good moves. To Holding,
“patterns may be general rather than specific chunks” (1985, p174) and the
corresponding recognition mechanism is almost certainly less ‘automatic’ in its
cueing of moves than that of Chunking Theory. In fact, it appears that ‘search’,
in itself, is an extremely low-level skill, involving only focusing one’s evaluative
skills on different moves. It should be noted that Holding (and others) refers to
‘search’ when he really means the wider set of skills described above, i.e. search,
evaluation and knowledge – all three of which are embodied in SEEK theory.
Holding claimed that, within de Groot’s verbal protocols, there was, in fact, a
relationship between skill level and both number of moves considered and speed
of search (number of moves considered per minute), although this was not
statistically significant. He argued that the real effect was obscured by the highly
tactical nature of the only position for which a meaningful number of protocols
were published, i.e. position A. Other studies have supported this claim. In
particular, Charness’s 1981 experiment (Holding, 1985; Gobet, 2004), conducted
with 34 skilled players and a balance of tactical and strategic positions different
from those used by de Groot, suggests a linear relationship between skill level (in
terms of Elo points) and depth of search (in terms of number of moves). Holding
reports that average maximal depth of search increases by 1.4 plies per standard
deviation of skill (200 points) and Gobet reports that the average depth of search
increases by 0.5 plies for the same interval.
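The two reported slopes imply rather different depth gaps between players. As a rough illustration only, the sketch below converts a rating difference into the implied difference in search depth under each figure, assuming the relationship is exactly linear. The function name and the 400-point example are hypothetical choices made for illustration, and the two figures refer to different measures (average maximal depth versus average depth).

```python
# Illustrative sketch: implied search-depth gap under a strictly linear model.
# The slopes are the two figures quoted in the text; the function name and
# the example rating gap are hypothetical.

PLIES_PER_200_ELO = {
    "Holding (average maximal depth)": 1.4,
    "Gobet (average depth)": 0.5,
}


def implied_depth_gap(elo_difference: float, plies_per_200: float) -> float:
    """Depth difference in plies implied by a rating difference in Elo points."""
    return plies_per_200 * elo_difference / 200.0


if __name__ == "__main__":
    for source, slope in PLIES_PER_200_ELO.items():
        gap = implied_depth_gap(400, slope)
        print(f"{source}: {gap:.1f} plies per 400 Elo points")
```

On these figures, two players separated by 400 Elo points would differ by 2.8 plies on Holding’s account but by only 1 ply on Gobet’s.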
In 1979, Holding (1985) developed a single scale to evaluate positions on the
basis of advantage to one side over the other using the expert judgement of
skilled players. He then asked 50 Class A-E players to evaluate a set of
quiescent positions, with level material, from actual grandmaster games on this
scale. The players were also asked to select a next move. Evaluations were
scored in comparison with the actual outcomes of the games. The results showed
that there is an effect of skill on evaluations. In Holding and Reynolds’s 1982
experiment (Holding, 1985) on recall of random positions, players were also
asked to evaluate the position immediately (after it had been corrected following
the recall task) and after 5 minutes of consideration. There were no skill
differences for ‘correctness of evaluation’ at either measurement point. Holding
concluded that evaluative skill is influenced by memory, including “generic
[semantic] memory for the type of specific… formations that are known to give
rise to advantages and disadvantages” (Holding, 1985 p208).
Holding‟s main conclusion is that differences in chess skill are due to search,
evaluation and knowledge: “the better players show greater competence in every
phase of the SEEK processes, conducting more knowledgeable evaluations, in
order to anticipate events on the chessboard” (1985, pp255-256).
Gobet and Simon’s template theory
Gobet and Simon (1996) set out to test Holding’s conclusion by means of a
‘natural experiment’, observing the performance of the then-world champion
Grand Master Garry Kasparov, in both a series of matches of simultaneous games
and tournaments against expert opponents (predominately Masters and Grand
Masters). The average time afforded to Kasparov for each move was 3 minutes
in tournament play, compared with 3 minutes per round in the matches of
simultaneous games (each played against between four and eight opponents).
Gobet and Simon reasoned that the increased time-pressure in the simultaneous
games would provide Kasparov with less time to evaluate moves and, therefore,
if Holding’s conclusion were true, he should perform less well in the
simultaneous games than in the tournament. The results showed that Kasparov’s
performance did not greatly differ across the two conditions. Indeed, in the
simultaneous matches, Kasparov played at the level of a very strong Grand
Master. Gobet and Simon concluded that it was Kasparov’s pattern-matching that
accounted for his similar performances in both simultaneous matches and normal
tournament play, and that this result could be generalised to all expert chess
players. This is supported by a similar result from Calderwood, Klein and
Randall (1988).
Gobet and Simon (1998b) asserted that some of Holding‟s criticisms were valid
(e.g. those concerning LTM encoding and chunk size) whilst others were
incorrect (or had been shown to be incorrect). For example, Holding’s result for
skill differences for choice of next move decisions in random positions was
countered by Gobet and Simon’s experimental results (1998b) that indicated that
chunking theory does predict a small skill difference in the recall of such
positions – contrary to de Groot’s and Chase and Simon’s earlier results and
preserving the possibility of a relationship between memory and skill. Gobet and
Simon state that Holding’s main issue with chunking theory – that it consists of
pattern recognition without search – is a misunderstanding, since the ‘mind’s
eye’ extension to the theory clearly describes the use of pattern recognition to
support a ‘think-ahead’ process, thus generating subsequent moves for
consideration (this account also largely equates pattern recognition of non-base
moves with Holding’s evaluation mechanism).
In 1996, Gobet and Simon (1998a) replicated Chase and Simon‟s original
experiments, with some key modifications, including an increased sample size of
26 (ranging from Masters to Class A players) and computer-aiding for the
reconstruction of positions, to eliminate the physical limitations on piece
replacement in the original experiment that may have confounded results on
chunk size. The main results concurred with Chase and Simon‟s original study –
that is, skill effects on recall in both tasks disappeared for random positions. The
most startling difference in results, however, related to the size of chunks.
Whilst the effect of skill level on chunk size was again present, mean largest
chunk size at all skill levels was greater. In particular, for Masters this figure
was 16.8 in the recall task (compared with 7 in the original experiment), and 14
in the copying task. Moreover, some positions were reconstructed by Masters
using only one chunk.
This new data confirmed Gobet and Simon‟s development of chunking theory,
namely template theory (1998a, 1998b). Template theory uses the same basic
mechanism as chunking theory, so that chunks are stored in STM as pointers to
patterns in LTM; they are also used to reconstruct visuo-spatial images in the
mind‟s eye (the secondary transient memory store). Gobet and Simon stated that
the more typical the position, the stronger the associations that chunk will have
with semantic memory, including moves, plans and other patterns. Further, they
proposed that such positions are actually represented by templates, which are
essentially chunks with slots for variables. They therefore comprise a „core
chunk‟ and their parameters allow them to describe a range of chunks within a
class defined by the range of variable values. Templates can provide for large
constellations of pieces to be considered together where large chunks alone
cannot, since the number of chunks with, e.g. more than 10 pieces, required to
hold all meaningful patterns on those pieces would be unmanageably large.
Templates, instead, provide for the redundancy that occurs because classes of
chunks tend to share good moves, plans, tactical and strategic features etc. Gobet
and Simon emphasise, within template theory, the associations of chunks
and templates with semantic knowledge. As with chunking theory, the authors
suggested a learning time of 8 seconds for chunks and templates. Two learning
parameters are proposed. Gobet and Simon also assert that “like the chunking
theory, template theory is not limited to chess” (Gobet 1998b p.127).
Template theory served to address the outstanding criticisms of chunking theory
in the following ways. The null effect of interference for recall of chess
positions could be accounted for by chunk size, since if fewer STM pointers are
required to encode a single position (possibly only one for Masters) then noise
will not necessarily eradicate that memory. Likewise, Holding‟s criticisms on
chunk size and conceptual knowledge were countered by direct modifications to
the theory, which were supported by empirical evidence. Finally, Gobet (1998a)
has used template theory to explain skill differences for search variables; this is
discussed in the next section.
The integration of pattern recognition and search
Gobet (1998a) conducted a replication of de Groot‟s choice of next move
experiment with 48 Swiss players (ranging from Master to Class B) using de
Groot‟s position A, and conducted an extensive analysis of the resultant verbal
protocols, including the generation of problem behaviour graphs (Newell &
Simon, 1972) and the extraction of the same quantitative variables as de Groot,
with the aim of comparing results and reinvestigating the effects of search
variables on quality of next move. Gobet was motivated both by empirical
evidence that opposed de Groot’s result that search variables did not differ across
skill levels, e.g. due to Charness (Gobet 2004), and by the lack of replication of
de Groot’s original experiment; he was undoubtedly also motivated to seek
empirical evidence to support his own work at that time with Herbert Simon in
developing template theory, since although the research was published in 1998,
the original data was collected as part of a different study in 1986. As well as a
small skill difference for the mean depth of search, Gobet discovered a skill
effect for the way in which progressive deepening was conducted. The variables
in the study characterising progressive deepening behaviour related to the
number of reinvestigations of sequences starting with the same base move; these
were sub-divided into immediate reinvestigations (same base move considered
twice in succession) and non-immediate reinvestigations (same base move
considered twice with at least one different base move considered in between),
and also maximal and total values, with the former providing the largest number
of reinvestigations (immediate or non-immediate) among all base moves
considered. The maximal number of immediate reinvestigations had a positive
association with skill level and the maximal number of non-immediate
reinvestigations had a negative association with skill level.
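The definitions of the reinvestigation variables can be made concrete with a short sketch. It assumes that a protocol has already been reduced to the ordered list of base moves, one per episode, and that every repeat of a base move counts as exactly one reinvestigation; both the representation and the function names are hypothetical, and the coding rules actually used by Gobet may differ in detail.

```python
from collections import Counter


def count_reinvestigations(base_moves):
    """Count immediate (IR) and non-immediate (NIR) reinvestigations.

    base_moves: the base move of each episode, in the order considered.
    Returns two Counters keyed by base move.
    """
    ir, nir = Counter(), Counter()
    seen = set()
    previous = None
    for move in base_moves:
        if move == previous:
            ir[move] += 1   # same base move twice in succession
        elif move in seen:
            nir[move] += 1  # returned to after a different base move
        seen.add(move)
        previous = move
    return ir, nir


def summarise(base_moves):
    """Total and maximal IR/NIR values for one protocol."""
    ir, nir = count_reinvestigations(base_moves)
    return {
        "total_ir": sum(ir.values()),
        "total_nir": sum(nir.values()),
        "max_ir": max(ir.values(), default=0),
        "max_nir": max(nir.values(), default=0),
    }


if __name__ == "__main__":
    # Hypothetical sequence of episode base moves, not taken from any protocol.
    episodes = ["Bxd5", "Bxd5", "Rc2", "Bxd5", "Rc2", "Rc2"]
    print(summarise(episodes))
```

In this invented sequence each of the two base moves is reinvestigated once immediately and once non-immediately, so both totals are 2 and both maximal values are 1.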
Gobet’s main conclusions were that players in his sample differed along more
dimensions than those in de Groot’s sample, and that the average values on all
variables (pooled across skill levels) did not differ significantly between studies.
Gobet notes that the differences he found within his sample were mainly between
Masters and Class players. Since de Groot’s sample only included 2 players at
Class level, it is perhaps not surprising that such differences did not show up in
the original experiments.
Importantly, Gobet claims that his skill effects for search can still be accounted
for by pattern recognition models of chess thinking because sequences of moves
are likely to be associated with patterns: “pattern recognition should facilitate the
generation of moves in the mind’s eye, permitting a smooth search” (1998a p24).
Saariluoma presented further evidence of the pattern-recognition-based search
hypothesis (Gobet 1998a, 2004) with his ‘smothered mate’ experiment, in which
high calibre players were asked to choose a move that would lead to mate in a
specially devised endgame. The position was one that had an efficient yet
unusual sequence of moves that led to mate as well as a longer, more familiar
sequence. Players tended to choose the move at the beginning of the stereotyped
sequence.
Summary
In summary, the relative influences of recognition and search-and-evaluation on
chess skill are not fully understood. Further, the degree to which these are, in
fact, separate processes rather than alternative descriptions of the same process,
is unclear. Certainly most advocates of either theory believe that both
recognition and search mechanisms are fundamental to chess skill. For example,
de Groot’s (Gobet, 2004, p120) assertion that recognition serves to direct the
look-ahead search-and-evaluation suggests that these processes are, in some
sense, interdependent. Further, Holding’s (1985) conclusion that search-and-
evaluation is the dominant process is based on the assertion that better players
plan these evaluations in a more effective way. Yet Holding’s “knowledgeable
evaluations” (1985, p256) might well be directed by effective pattern-matching,
which is essentially de Groot’s conclusion. Gobet and Simon’s template theory,
developed in part due to criticisms of chunking theory from advocates of search-
and-evaluation, provides for a credible explanation of skill differences for search
(if it is accepted that templates can store sequences of moves). This extended
theory apparently leaves no room for alternative explanations of chess skill
wherever it could be argued that patterns exist (e.g. any experimentation
involving real chess positions). It therefore offers the possibility of unifying both
recognition-based and search-based theories. To refine the template theory
explanation of skill differences on search variables, further data concerning such
differences would be of great benefit.
Further, the balance of chess research has been in favour of recall tasks, rather
than choice of next move tasks. The attractions of recall tasks (over choice of
next move tasks) in explaining chess skill are the objectivity of the measures and
the ease with which data can be analysed. Since chess skill is primarily
concerned with decision-making, however, it seems strange that there are not
more studies based on the choice of next move task. Finally, research based on
the choice of next move task, perhaps because of the analytical overheads the
task usually imposes, tends to focus on a small number of positions, often only
one – notably Gobet (1998a). An obvious danger in generalising results from a
single position is that any position effects are discounted.
Methodology
This chapter outlines the experimental design, procedure and analytical methods
employed in the research. It also includes a section on ethics. The ecological
nature of the experimentation in this study meant that a great deal of relatively
unstructured data (verbal protocols) were generated through the experimental
procedure. These data were subjected to a detailed and structured (qualitative)
protocol analysis that provided a set of quantitative variables to be entered into
statistical analyses. The intermediate results of the protocol analysis offer the
best means of conveying this part of the methodology and serve to precipitate the
relevant section of the Project Review. Appendix II therefore contains details of
the protocol analysis, including an example verbal protocol and Problem
Behaviour Graph (PBG).
Participants
Eight male chess players from four different clubs in Worcestershire and the
West Midlands took part in the experiment. Although their ages were not
recorded, all had been playing chess as graded players for between 30 and 45
years (mean 34.75 years, standard deviation 5.39 years). Their British Chess
Federation (BCF) grades were converted into the Fédération Internationale Des
Échecs (FIDE) standard Elo ratings using the BCF conversion formula (BCF,
2003) and subsequently mapped onto United States Chess Federation (USCF)
class divisions to facilitate comparisons between the results of this experiment
and those of existing studies (e.g. Gobet, 1998a). The players were assigned to
two skill levels according to their equivalent USCF class as described in Table 1,
below.
                                 Level 1 (Expert; n=4)   Level 2 (Class; n=4)
Sample mean (BCF grading)        168                     120
Sample mean (FIDE Elo rating)    2087                    1849
Equivalent USCF class            Expert                  Class A/ Class B
Equivalent Elo rating band       2000 – 2200             1600 – 2000
Table 1; Description of Skill levels of experiment players
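The assignment of players to skill levels can be sketched as a simple banding rule. The band edges below are those of Table 1; treating each band as inclusive at its lower edge and exclusive at its upper edge is an assumption, as are all names used. The ratings in the example are the eight per-player Elo ratings reported later in Table 5.

```python
# Sketch: assigning players to the two skill levels of Table 1 by Elo rating.
# Band edges come from Table 1; the lower-inclusive, upper-exclusive
# convention and all identifiers are assumptions made for illustration.

SKILL_BANDS = [
    ("Expert", 2000, 2200),
    ("Class", 1600, 2000),  # USCF Class A and Class B combined
]


def skill_level(elo: int) -> str:
    """Return the skill level whose Elo band contains the given rating."""
    for level, low, high in SKILL_BANDS:
        if low <= elo < high:
            return level
    raise ValueError(f"Elo {elo} falls outside the bands used in this study")


def group_by_level(ratings):
    """Group a list of Elo ratings by skill level."""
    groups = {}
    for rating in ratings:
        groups.setdefault(skill_level(rating), []).append(rating)
    return groups


if __name__ == "__main__":
    ratings = [1720, 1780, 1925, 1970, 2010, 2045, 2105, 2190]
    for level, members in group_by_level(ratings).items():
        print(level, members, sum(members) / len(members))
```

Applied to those ratings, the rule places four players at each level, and the group means agree with the sample means in Table 1 to within rounding.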
Materials
Three chess positions were used in the experiment. They were positions A, B1
and C of de Groot‟s original choice of next move experiments (de Groot, 1965
pp88-93) and were labelled A, B and C, respectively. They were depicted as
standard chess position images on A4 card, complete with full move histories for
the games from which they were taken. The positions themselves can be found
in Appendix I.
Portable digital recording equipment, and pen and paper, were also used in the
experiment. The recording time display on the equipment was made available to
the players in place of a chess clock.
Experimental Design and procedure
A 2 x 3 mixed design was used, with the following independent variables: Skill
(Expert; Class), which varied between subjects, and Position (A; B; C), which
was a repeated measure.
The experiment, which was conducted with each participant individually and in a
quiet and undisturbed environment, consisted of a single „choice of next move‟
task repeated across three conditions, defined by the three positions described
above (A, B and C). The procedure was essentially the same as in the original de
Groot experiments of 1938-43 (de Groot, 1965). Before the first task began the
experimenter instructed the player that he would be presented with the positions
one by one and, for each, would be required to choose his next move, as if he
were playing over the board in normal tournament play; the only difference being
that he was requested to think aloud as he did so. The experimenter clarified that
„thinking aloud‟ was not the same as providing a commentary on one‟s thought
process, i.e. it was simply a natural verbal expression of thought. Further, the
player was informed that the positions were from real games and were not chess
„problems‟ (typified by a single provable winning move); and that there were no
time limits imposed, although a guideline was provided: that the player should
aim to spend as much time on the task as they might reasonably expect to in a
tournament game. Once the experimenter had checked that instructions had been
understood and had gained the player‟s informed consent for their participation,
the task began.
The conditions were conducted sequentially with the offer of a short break
between each if required. The position was presented to the player at the same
time the recording began. Thereafter the experimenter only intervened if asked a
direct question concerning procedure or if the participant had remained quiet for
a period of approximately 30 seconds; in the latter case the experimenter
prompted the player by asking, “What are you thinking now?” Throughout the
recording and wherever necessary, the experimenter noted questions for
clarification. At the end of each condition the recording was stopped and the
experimenter requested clarification accordingly. Most such instances concerned
a misreported or unspecified move, piece or square.
Upon completion of the three iterations of the „choice of next move task‟ the
experiment concluded.
Protocol Analysis
The data collected from the experiment consisted of a single verbal protocol for
each player in each of the three positions, giving 24 protocols in total. Each
protocol was transcribed into tabular format and used to generate a Problem
Behaviour Graph (PBG) according to the coding scheme set out in de Groot
(1965), Newell & Simon (1972) and Gobet (1998a). Appendix II describes the
coding scheme in greater detail and includes an example verbal protocol and the
PBG that was generated from it. It also provides definitions of the important
elements of PBGs from which the quantitative variables may be extracted.
Derivation of quantitative variables
Table 2 describes the set of quantitative variables derived from each graph, and
their means of derivation. Although most of these variables were originally
devised by de Groot (1965) and also used by Gobet (1998a), two were novel and
are indicated in the table.
Variable                    Description and means of derivation
Quality of Move             Subjective assessment of the quality of the chosen
                            move (see Appendix A for the derivation of scores)
Total Time                  Total time taken for choice of next move: time
                            elapsed from initial presentation of position to
                            confirmation of next move selection
Time of First Phase         Total time elapsed before first Episode begins
Number of Base Moves        Number of distinct base moves (null moves permitted)
Number of Episodes          Number of distinct Episodes of problem-solving
                            behaviour
Number of Nodes             Number of nodes (moves) considered, including
                            repeated and null moves
Total Depth                 Aggregate of search depths for each episode, with
                            null moves included in the totals. Episodic depth is
                            defined by the longest sequence of moves, beginning
                            with the base move, among all branches. This variable
                            is only measured to enable the calculation of Mean
                            and Maximal Search Depths.
Maximal Depth of Search     The maximal number among all episodic depths, with
                            null moves omitted from the totals
Mean Depth of Search        Mean episodic depth with null moves included; Total
                            Depth divided by Episodes
Standard Deviation of       Standard deviation of episodic depth with null moves
Depth of Search             included. This is a new variable.
Rate of Base Moves          Rate of generation of distinct base moves; Base
                            Moves divided by Total Time
Rate of Nodes               Rate at which nodes are considered; Nodes divided by
                            Total Time
Total IR                    Total number of immediate reinvestigations of all
                            base moves
Total NIR                   Total number of non-immediate reinvestigations of
                            all base moves
Maximal IR                  The maximal IR amongst all base moves
Maximal NIR                 The maximal NIR across all base moves
Number of Null Moves        Total number of null moves among all nodes. This is
                            a new variable and is only measured to enable the
                            calculation of Proportion of Null Moves.
Proportion of Null Moves    Proportion of total number of nodes that are null
                            moves; Null Moves divided by Nodes. This is a new
                            variable.
Table 2; quantitative variables derived from Problem Behaviour Graphs
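The derived variables in Table 2 are simple functions of the raw counts read off a Problem Behaviour Graph. The sketch below shows one way they might be computed. It takes a rate to be a count per minute, in line with the ‘Number of Nodes per minute’ axis of Figure 4, and a proportion to be null moves over nodes; it also assumes the sample standard deviation, since the text does not say which form was used. The container class and field names are hypothetical.

```python
from dataclasses import dataclass
from statistics import stdev


@dataclass
class GraphCounts:
    """Raw counts read off one Problem Behaviour Graph (hypothetical container)."""
    total_time_min: float  # Total Time, in minutes
    base_moves: int        # Number of Base Moves
    nodes: int             # Number of Nodes (repeated and null moves included)
    null_moves: int        # Number of Null Moves
    episode_depths: list   # depth of each episode, null moves included


def derived_variables(g: GraphCounts) -> dict:
    """Compute the quotient-based variables of Table 2 from raw counts."""
    total_depth = sum(g.episode_depths)
    episodes = len(g.episode_depths)
    return {
        "mean_depth": total_depth / episodes,
        # Assumption: sample standard deviation (n - 1 in the denominator).
        "sd_depth": stdev(g.episode_depths) if episodes > 1 else 0.0,
        "rate_base_moves": g.base_moves / g.total_time_min,
        "rate_nodes": g.nodes / g.total_time_min,
        "proportion_null": g.null_moves / g.nodes,
    }


if __name__ == "__main__":
    # Invented counts for illustration; not data from this study.
    example = GraphCounts(total_time_min=10.0, base_moves=4, nodes=40,
                          null_moves=4, episode_depths=[2, 4, 6])
    print(derived_variables(example))
```

Maximal Depth of Search is not included because Table 2 defines it with null moves omitted, so it cannot be recovered from episodic depths that include them.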
Ethics
The only serious ethical consideration for this research is the non-disclosure of
any personally identifiable data both during and after the life of the study.
Although all data has been rendered anonymous before reporting, players’
choices of next move have been assessed and thus they may have reason to feel
that their individual performance is under scrutiny. To guard against any such
misconceptions, the experimenter explained that each player’s data was to
remain anonymous and protected from unauthorised use under the Data
Protection Act 1998. The experimenter also explained that the anonymous
results would be published as part of the MSc. dissertation. The players were
also advised of their right to withdraw from the study, even retrospectively, and
the experimenter provided contact details to each player if they wished to
exercise this right.
The experimental procedure itself was totally innocuous – there were no risks to
the players‟ physical or mental well-being as a result of taking part.
Analysis
Each dependent variable in Table 2 except Total Depth of Search and Number of
Null Moves was subjected to a repeated measures factorial analysis of variance
(ANOVA) with the between-subjects variable Skill and the within-subjects
variable Position. The criterion of sphericity was satisfied for all variables
entering each analysis except for Number of Non-immediate Reinvestigations,
which was subsequently excluded from the analysis. The results for each
variable are provided in the next section in meaningful groups; details of other
tests are provided under the appropriate headings. The second section compares
the results with those of similar studies, notably Gobet (2004), and the final
section provides a higher level discussion of all findings.
Results from this study
Quality of Move
The main effect of Skill on Quality of Move is significant (F(1,6)=9.757,
MSE=15.042, p
Skill level   Position A        Position B        Position C
              Move    Quality   Move    Quality   Move    Quality
Expert        Rc2     1         Rb8     5         Ne4     3
              Bxd5    5         Rb8     5         Kh8     2
              Bxd5    5         Rb8     5         Bd7     3
              Bxd5    5         Rb8     5         e5      5
Class         Rc2     1         Kf8     4         d5      1
              b4      1         Rb8     5         Bd7     3
              b4      1         Kg7     3         e5      5
              Kh1     1         h5      2         Ne4     3
Table 3; Moves chosen and Quality of Move for all players across all positions
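Because Table 3 lists every score, the test of the main effect of Skill can be checked by hand. The sketch below computes the between-subjects F ratio for a balanced mixed design, on the assumption that each row of Table 3 holds one player’s scores for Positions A, B and C. The function and variable names are illustrative and this is not the output of the package used for the original analysis.

```python
# Quality of Move scores from Table 3, one tuple per player (Positions A, B, C).
# Assumption: each table row holds a single player's three scores.
SCORES = {
    "Expert": [(1, 5, 3), (5, 5, 2), (5, 5, 3), (5, 5, 5)],
    "Class": [(1, 4, 1), (1, 5, 3), (1, 3, 5), (1, 2, 3)],
}


def between_subjects_f(scores):
    """F ratio for the between-subjects factor of a balanced mixed design.

    Returns (F, df_effect, df_error, MS_effect, MS_error).
    """
    groups = list(scores.values())
    n_positions = len(groups[0][0])
    n_per_group = len(groups[0])
    n_subjects = sum(len(players) for players in groups)

    group_totals = [sum(sum(p) for p in players) for players in groups]
    subject_totals = [sum(p) for players in groups for p in players]
    grand_total = sum(subject_totals)

    correction = grand_total ** 2 / (n_subjects * n_positions)
    groups_term = sum(t ** 2 for t in group_totals) / (n_per_group * n_positions)
    ss_effect = groups_term - correction
    # Error term: variation between subjects within each skill group.
    ss_error = sum(s ** 2 for s in subject_totals) / n_positions - groups_term

    df_effect = len(groups) - 1
    df_error = n_subjects - len(groups)
    ms_effect = ss_effect / df_effect
    ms_error = ss_error / df_error
    return ms_effect / ms_error, df_effect, df_error, ms_effect, ms_error


if __name__ == "__main__":
    f, df1, df2, ms_effect, ms_error = between_subjects_f(SCORES)
    print(f"F({df1},{df2}) = {f:.3f}; MS effect = {ms_effect:.3f}; "
          f"MS error = {ms_error:.3f}")
```

Under that assumption the calculation gives F(1,6) = 9.757, in agreement with the value reported for Quality of Move, and the figure reported there as MSE (15.042) coincides with the mean square for the Skill effect rather than with the error mean square of about 1.542.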
[Plot labels: Quality of Move (scale 0 to 6); Skill level (Expert, Class); Position (A, B, C)]
Figure 1; estimated marginal means for Quality of Move
The most interesting feature of the data illustrated above is that, although
Position A appears to split Experts from Class players in terms of Quality of
Move, Move Quality in the other two Positions is better balanced across Skill
levels. In particular, the marginal means for Quality of Move across Skill levels
in position C are almost identical (Class = 3; Expert = 3.25). Further, no player
selected a ‘bad move’ in Position B, with no Quality of Move score below 2.
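The cell means behind these observations follow directly from the scores in Table 3. As a minimal sketch (the data layout and function name are illustrative only), they can be recomputed as follows.

```python
from statistics import mean

# Quality of Move scores from Table 3, grouped by skill level and position.
QUALITY = {
    "Expert": {"A": [1, 5, 5, 5], "B": [5, 5, 5, 5], "C": [3, 2, 3, 5]},
    "Class": {"A": [1, 1, 1, 1], "B": [4, 5, 3, 2], "C": [1, 3, 5, 3]},
}


def cell_means(quality):
    """Mean Quality of Move for each combination of skill level and position."""
    return {
        skill: {position: mean(values) for position, values in by_position.items()}
        for skill, by_position in quality.items()
    }


if __name__ == "__main__":
    for skill, by_position in cell_means(QUALITY).items():
        print(skill, by_position)
    lowest_b = min(QUALITY["Expert"]["B"] + QUALITY["Class"]["B"])
    print("Lowest score in Position B:", lowest_b)
```

The means for Position A are 4 (Expert) and 1 (Class), whereas those for Position C are 3.25 and 3.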
Time variables
There is no main effect of Skill on Total Time (F(1,6)=0.605, MSE=29.592, ns)
and, in fact, Experts apparently took longer than Class players in choosing their
next move in all three positions; the biggest difference was observed for Position
A (a mean total time of 14.5 minutes for Experts versus 9.2 minutes for Class
players). The same pattern is observed for the Time of First Phase; the main
effect of Skill is non-significant here also (F(1,6)=3.604, MSE=3.604, ns).
There is, however, a significant main effect of Position on Total Time
(F(2,12)=8.117, MSE=64.528, p
Position   Number of Base Moves   Number of Episodes   Number of Legal Moves
           (marginal mean)        (marginal mean)
A          4.625                  10.25                56
B          3                      7.75                 35
C          6.375                  13.625               37
Table 4; Marginal Means for Base Moves/ Episodes and Number of Legal Moves
As can be seen in Table 4, the relationship between Position and Number of Base
Moves does not apparently stem from the number of legal moves available in
each position: an average of 4.625 base moves are generated for position A (56
legal moves) and 3 for position B (35 legal moves), yet 6.375 of the possible 37
legal moves are generated for position C. Further, it can be seen that there
appears to be a linear relationship between the mean Number of Base Moves and
the mean Number of Episodes.
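Both observations can be checked against the figures in Table 4. The sketch below computes the share of legal moves generated as base moves in each position and the Pearson correlation between the two marginal means. With only three positions the correlation is descriptive at best, and the helper names are illustrative.

```python
from math import sqrt

# Table 4: position -> (mean base moves, mean episodes, legal moves)
TABLE_4 = {
    "A": (4.625, 10.25, 56),
    "B": (3.0, 7.75, 35),
    "C": (6.375, 13.625, 37),
}


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def share_of_legal_moves(table):
    """Mean number of base moves as a fraction of the legal moves available."""
    return {pos: base / legal for pos, (base, _, legal) in table.items()}


if __name__ == "__main__":
    for pos, share in share_of_legal_moves(TABLE_4).items():
        print(f"Position {pos}: {share:.1%} of legal moves generated")
    base = [row[0] for row in TABLE_4.values()]
    episodes = [row[1] for row in TABLE_4.values()]
    print(f"r(base moves, episodes) = {pearson_r(base, episodes):.3f}")
```

Players generate roughly 8 to 9 per cent of the legal moves in Positions A and B but about 17 per cent in Position C, while the correlation between mean Number of Base Moves and mean Number of Episodes is close to 1.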
Number of Nodes
The main effect of Skill upon Number of Nodes is significant (F(1,6)=6.593,
MSE=4056, p
[Plot labels: Number of Nodes (scale 10 to 70); Skill (Expert, Class); Position (A, B, C)]
Figure 2; Marginal Means for Number of Nodes
Finally, the distribution of Number of Nodes is shown in Figure 3. Apart from
the outlier (117 nodes searched by one of the Expert players in Position A),
Number of Nodes is fairly normally distributed with all values < 100.
[Histogram labels: Number of Nodes in bins of width 10 from 0 to 110; frequency scale 0 to 6; Mean = 41, Std. Dev = 26.51, N = 24]
Figure 3; Frequency distribution of Number of Nodes
Rate of generation
There are no effects (main or interaction) of Skill or Position on Rate of Base
Moves. The main effect of Skill level on Rate of Nodes is weakly significant
(F(1,6)=5.646, MSE=13.777, p
(F(2,12)=0.590, MSE=0.001978, ns) and no interaction effect. Better players
generate nodes more rapidly (Expert: mean 4.09, s.d. 1.03; Class: mean 2.58,
s.d. 1.48), as illustrated in Figure 4.
[Plot labels: Number of Nodes per minute (scale 1.5 to 5.0); Skill level (Expert, Class); Position (A, B, C)]
Figure 4; Estimated marginal means of Number of Nodes per minute
Depth of Search
The main effect of Skill for Mean Depth of Search is significant (F(1,6)=3.977,
MSE=3.899, p
a. selecting the maximal search depth of all episodes undertaken to derive
Maximal Depth of Search (pooled);
b. Pooling both Total Depth of Search and Number of Episodes to derive the
new quotient Mean Depth of Search (pooled).
Table 5 summarises the corresponding search data entering the analysis.
Elo rating   Maximal Depth of   Total Depth of    Number of episodes   Mean Depth of
             Search (pooled)    Search (pooled)   (pooled)             Search (pooled)
1720         4                  8                 7                    1.14
1780         8                  79                25                   3.16
1925         5                  111               36                   3.08
1970         7                  128               29                   4.41
2010         14                 170               43                   3.95
2045         11                 100               26                   3.85
2105         9                  170               44                   3.86
2190         9                  144               43                   3.35
Table 5; Pooled Mean and Maximal Depth of Search by Player
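The pooled quotient in Table 5 is easy to reproduce, and the same eight rows allow a rough look at how maximal depth varies with rating. The sketch below recomputes Mean Depth of Search (pooled) as Total Depth divided by Number of episodes and fits an ordinary least-squares line of Maximal Depth on Elo rating. This eight-point fit is offered purely as an illustration and is an assumption of this sketch; it is not the regression reported in the text, whose degrees of freedom indicate a larger number of observations.

```python
# Rows of Table 5:
# (Elo, maximal depth, total depth, episodes, reported mean depth)
TABLE_5 = [
    (1720, 4, 8, 7, 1.14),
    (1780, 8, 79, 25, 3.16),
    (1925, 5, 111, 36, 3.08),
    (1970, 7, 128, 29, 4.41),
    (2010, 14, 170, 43, 3.95),
    (2045, 11, 100, 26, 3.85),
    (2105, 9, 170, 44, 3.86),
    (2190, 9, 144, 43, 3.35),
]


def pooled_mean_depth(total_depth, episodes):
    """Mean Depth of Search (pooled): total depth over number of episodes."""
    return total_depth / episodes


def least_squares(xs, ys):
    """Slope and intercept of the ordinary least-squares line of ys on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    for elo, _, total, episodes, reported in TABLE_5:
        print(elo, round(pooled_mean_depth(total, episodes), 2), reported)
    slope, _ = least_squares([r[0] for r in TABLE_5], [r[1] for r in TABLE_5])
    print(f"Maximal depth: {slope * 200:.2f} plies per 200 Elo points")
```

Each recomputed quotient matches the tabulated value to two decimal places, and the illustrative fit gives a slope of roughly 2.2 plies of maximal depth per 200 Elo points across these eight players.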
The regression of Maximal Depth of Search on Elo Rating is significant
(F(1,22)=10.597, MSE=59.802, p
Reinvestigations
There are no main effects of Skill or Position on any of the reinvestigation
variables although the interaction effect upon Maximal Number of IR is
significant (F(1,6)=7.895, MSE=6.25, p
[Plot labels: Proportion of Null Moves (scale .04 to .18); Skill (Expert, Class); Position (A, B, C)]
Figure 6; Estimated Marginal Means of Proportion of Null Moves
Summary
The following table summarises the main effects of Skill level and Position on
each of the dependent variables entered into the analysis.
Dependent variable    Main effect of Skill2    Main effect of Position    Interaction effect
Quality of Move p
Comparison with other studies
The results reported above are interpreted in the context of the design and sample
size. This is particularly important for comparisons with results from other
related studies, i.e. Gobet (1998a) and de Groot (1965). The sample was fairly
small, with a relatively narrow range of skill levels; in particular there
were no Masters among the sample. De Groot’s sample3 included players of all
skill levels down to Class (n=14; Grandmasters=5; Masters=2; Experts=5;
Class=2). Gobet’s sample was larger (n=48) with average skill level
somewhere in between de Groot’s and the sample used in this study
(Masters=12; Experts=12; Class A=12; Class B=12). Conversely, the data in
both of the other studies is based on Position A only, whereas this study
employed three very different types of position (see Appendix I).
Quality of Move
The results of both this study and Gobet’s confirm de Groot’s assertion that
better players choose stronger moves. The significance of the effect of Position
on Quality of Move in this study, however, suggests that it is harder to select
a good move in some positions than in others – in particular, Position A.
Interestingly, the position that the players were least comfortable with (Position
C) generated the best quality moves on average. Figure 1 suggests an interaction
effect, with the tactical and complex Position A splitting the two groups
effectively and the strategic and quieter Position B showing little difference, but
the corresponding F ratio is non-significant.
3 For the purposes of comparison, this sample includes only the players for whom detailed statistics have been extracted from their Position A protocols, courtesy of Gobet (1998a).
Time variables
Gobet (1998a) found a weakly significant result for Total Time, suggesting that
Masters choose their next move more rapidly than lower calibre players. The
results above show no differences between Experts and Class players, although
the marginal means indicate that Experts are slower than Class players (12.68
minutes versus 10.46 minutes). The implication is that there are, in fact, no
differences between players of different levels in the time taken to choose their
next move. An observation from the experiment is that some players consciously
truncated their thought processes on the basis that, in a tournament game, too
much time spent on the single choice would lead them into time trouble. Gobet
found a significant reduction in the Time of First Phase for higher calibre players
whereas the results here are also non-significant. Time of First Phase was
perhaps one of the more difficult variables to extract from the protocols due to
the poorly defined boundary it shares with the Phase of Elaboration (de Groot,
1965). Although certain players deliberately sized up the situation and discussed
general plans before entering a longer phase of search and evaluation, others
apparently focused immediately on base moves and corresponding sequences,
whilst one player spent the majority of his time apparently in the First Phase
before committing to a move. This issue is revisited in the Methodological
Discussion.
Base Moves and Episodes
Gobet’s results suggest a curvilinear relationship for both variables with Skill,
since Class A players generate more base moves and episodes than either Experts
or Class B players, although only the effect on Number of Base Moves is
significant (Gobet 1998a). Perhaps unsurprisingly, with Class A and B players
pooled in this experiment, there are no significant effects of Skill. The
significant effects of Position on both Number of Base Moves and Number of
Episodes, however, again suggest that different types of position give rise to
different search and evaluation strategies irrespective of skill level, but that this
relationship is not explained by the complexity of the position (as measured by
number of legal moves). Position C demanded the widest search for base moves
and generated the most episodes; it may be argued that the character of this
position is perhaps more ambiguous than the other two, containing strategic and
tactical themes. It is possible that this required players to pursue potential
tactical lines as well as more strategic moves.
Search variables 4
De Groot (1965) based his main conclusion, that recognition is the dominant
mechanism in chess thinking, on two results suggesting that search behaviour
does not differ across skill levels (at least at the higher levels of chess skill):
1. Chess players rarely search more than 100 nodes in any position;
2. There are no significant effects of skill on any search variable (e.g. Number
of Nodes, Mean Depth of Search, Maximal Depth of Search).
Whilst both this study and Gobet’s (1998a) provide evidence in support of the
first result, this study shows that Experts do search more nodes than Class
players. This is partially backed up by Gobet (1998a): although he did not find a
skill effect for Number of Nodes in position A, the average number of Nodes was
considerably lower for the Class B group (33.9) than for the other groups (58 for
4 The variables in the previous groups Number of Nodes, Rate of generation and Depth of Search are considered here together.
Masters, 58.3 for Experts and 56.8 for Class A players; Gobet 1998a p13). The
significant difference found here, therefore, might be due, in part, to the reduced
skill range among the players in the experiment; it could be that the biggest skill
differences for this search variable are actually to be found between Experts and
Class players. This suggests that there is an improvement in search capacity up to
Expert level, beyond which this measure remains fairly constant – and that de
Groot’s second result, above, does not hold below the level of Expert.
This study also confirms the significant result from Gobet (1998a) concerning
the effect of Skill on Mean Depth of Search, and adds evidence to the argument
(counter to that of de Groot) that higher calibre players employ greater search
than lower calibre players – due to the significant result on Maximal Depth of
Search.
To investigate such effects in more detail, Charness (Holding 1985; Gobet, 2004)
and Gobet (1998) made predictions of search capabilities for different skill levels
by analysing the relationship between Elo rating and selected depth of search
variables (Maximal Depth of Search and Mean Depth of Search). Charness, in
his 1981 experiment investigating the effects of age and skill on search
capabilities, used four positions, two of which were strategic whilst the other two
were tactical in nature. Gobet used only one position, de Groot’s position A,
which is highly tactical in nature. The regression equations calculated from the
pooled data in this study suggest slightly larger increases in Maximal Depth of
Search and Mean Depth of Search per 200 Elo points than evidenced by the
previous studies (see Table 7).
Prediction                                               This study   Charness   Gobet
Increase in Maximal Depth of Search per 200 Elo points   2.1          1.4        N/A
Increase in Mean Depth of Search per 200 Elo points      0.8          0.5        0.6
Table 7; predicted gain in search capabilities as a function of Elo rating
In interpreting this result it is noted that:
1. De Groot’s results are based on a sample dominated by Grandmasters,
Masters and Experts;
2. Charness and Gobet found skill differences for search capabilities when
lower calibre players were more prevalent in the sample;
3. Both Charness and Gobet have suggested that the relationship between skill
level and search capabilities across all playing levels is not linear. Whilst
Charness proposes a plateau effect for high calibre players, Gobet suggests a
curvilinear relationship, whereby high calibre players actually search less due
to better recognition-led evaluation capabilities.
Given the relatively low calibre of the players in this sample, the data presented
here therefore extends the model of Gobet in suggesting that rate of change of
search capability (as measured by Mean and Maximal Depth of Search) is
greater at lower skill levels (e.g. between Class A/B and Expert). Note that the
predictions for Mean Depth of Search are similar across three studies that used
different combinations of types of position. This backs up the result of the
previous section that states that there is no significant effect of Position on either
Mean Depth of Search or Maximal Depth of Search.
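The per-200-point gains reported in Table 7 are regression slopes scaled by 200 Elo points. As a rough illustration of that arithmetic, the sketch below fits an ordinary least-squares line to the eight pooled per-player rows of Table 5. This is an assumption made for illustration only: the reported regression has F(1,22) degrees of freedom, which suggests it was fitted to more observations than the eight pooled rows, so the sketch should not be read as a reproduction of the study's regression equations. The names ols_slope and gain_per_200 are hypothetical.

```python
# Pooled per-player values transcribed from Table 5.
ELO = [1720, 1780, 1925, 1970, 2010, 2045, 2105, 2190]
MAX_DEPTH = [4, 8, 5, 7, 14, 11, 9, 9]
MEAN_DEPTH = [1.14, 3.16, 3.08, 4.41, 3.95, 3.85, 3.86, 3.35]


def ols_slope(xs, ys):
    """Slope of the ordinary least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def gain_per_200(xs, ys):
    """Predicted change in the search variable for a 200-point rise in Elo."""
    return 200 * ols_slope(xs, ys)


if __name__ == "__main__":
    print("Maximal Depth gain per 200 Elo:", round(gain_per_200(ELO, MAX_DEPTH), 2))
    print("Mean Depth gain per 200 Elo:", round(gain_per_200(ELO, MEAN_DEPTH), 2))
```

Fitted this way, both gains come out positive and broadly in line with the 'This study' column of Table 7, although exact agreement is not expected given the different data arrangement.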
Rate of generation
The weakly significant effect of Skill on Rate of Nodes diverges from Gobet’s
(1998a) result. Although neither study provides evidence for an effect of Skill on
Rate of Base Moves, Charness’s 1981 result (Gobet, 1998a) suggests that
Grandmasters generate more base moves per minute than Experts. The reduced
sample size in this study might explain why such a result was not identified here.
Reinvestigations
There was a degree of convergence with Gobet (1998a) concerning
reinvestigation variables. Gobet’s only significant results in this area were for
the main effects of Skill on Maximal Number of IR (p
Gobet (1998a) also asserted that Maximal Number of NIR is inversely
proportional to Skill, yet an ANOVA with the current data (Position A only)
generates a non-significant result, as Figure 7 indicates.
Figure 7; Estimated Marginal Means for Maximal Number of NIR (y-axis: Maximal Number of NIR; x-axis: Skill, Expert and Class; one line per Position A, B and C)
Null Moves
The significant skill effect for Proportion of Null Moves suggests that better
players think in terms of completely specified sequences of moves more often
than lesser players. By means of a comparison, Saariluoma and Hohlfeld (Gobet
2004)5 examined the proportion of null moves as a function of position type
(strategic or tactical) and found that it is greater, at approximately 12%, in
strategic positions; Charness (Gobet 2004) previously found this percentage to be
approximately 10%. Interestingly, although the result in the current study holds
for Expert players (Position B = 11%; Position A = 5.5%; Position C = 5%),
Class players search approximately 15-16% null moves irrespective of position
type. (See also Figure 6.)
5 Calibre of players involved in the study unspecified.
The differences in proportions across the 3 positions at each skill level lead to
two alternative interpretations:
1. Strategic positions (Position B) demand more generalised ‘plan
formulation’ than tactical positions (Position A and, to a certain extent,
Position C). The result is an increased proportion of templates of move
sequences;
2. Better players are simply more thorough in their analysis of tactical
sequences.
Summary
The results generated by this study broadly agree with those of Gobet (1998a),
Charness (Holding, 1985; Gobet, 2004) and Saariluoma and Hohlfeld (Gobet
2004) and argue against some of de Groot’s earlier conclusions. Better players
make better choices of move, as shown by de Groot (1965) and Gobet (1998a),
but they also search more, to a greater depth and more thoroughly than lesser
players. The exact relationship between skill and both capacity and depth of
search is probably not linear. It appears that the rate of increase in search
capacity plateaus at the level of Master and above; and that depth of search may
actually vary in a curvilinear fashion with skill level, with a rate of increase that
itself decreases, and actually changes sign, as skill level increases from Class B
to Grandmaster. Given the difference in calibre of players in the samples
considered across the various studies, it is entirely possible that de Groot’s
results on search variables were actually correct – it is merely the applicability of
the conclusions to lower skill levels that is in question.
Project Review
This chapter reflects upon two key issues: the necessary refocusing of the
research throughout its course (including modifications both to the design and
the analysis) and the validity of the data collection and analysis methods used in
support of the choice of next move task.
Focus of research
The final dissertation is far more focused than the original research proposal
suggested it might be. The main reason for this is that one half of the study was
suspended to keep the study to a manageable size, both in a positive sense (due to
the healthy amount of material available from the choice of next move task) and
a negative sense (due to both access difficulties and increased overheads of
qualitative analysis). The original experimental design included a choice of next
move task and a personal construct elicitation task, the latter conceived with the
aim of investigating the nature of conceptual knowledge that chess players
possess. Holding (1985) postulated that conceptual knowledge, along with
search and evaluation, explain skill in chess and one of his main criticisms of
chunking theory was that chunks were too small in size to reflect conceptual
knowledge (Gobet & Simon, 1998b). Template theory (Gobet & Simon, 1998a)
addresses this criticism by introducing larger perceptual structures known as
templates, which are large enough, in theory, to encode entire positions.
Personal Construct Psychology (PCP) is concerned with how individuals
construe the world, based on the assertion that each man possesses an ever
changing set of hypotheses about the world that are represented on personal
constructs – essentially axes of reference characterised by contrasting poles (e.g.
we may hypothesise about people on the construct ‘good-bad’ or we may
hypothesise about chess positions on the construct ‘tactical-strategic’). Much of
PCP is due to George Kelly, who also devised the Repertory Grid technique,
which includes methods for the elicitation of personal constructs (Fransella, Bell
& Bannister, 2004).
Under the assumption that personal constructs, which may exist at any level of
abstraction, are equivalent ways of classifying/describing both templates and the
higher level schemata that they relate to, the research questions that the second
half of the study concerned, therefore, were:
How many constructs do chess players of a given skill level possess?
How are the construct systems of chess players organised?
What degree of overlap is there between different chess players’ construct systems, particularly those players with similar skill levels?
What are the most concrete constructs and do they correspond to Chase & Simon’s piece relations in chunking theory?
Thus the questions for this part of the study were fairly open-ended and the
analysis was intended to be investigative. The basic procedure chosen was the
method of triads, whereby three ‘elements’ (in this case, chess positions) are