Uncovering the Problem-Solving Process to Design Effective Worked Examples

Uncovering the Problem-Solving Process to Design Effective Worked Examples


The research reported here was carried out at the OUN (Open Universiteit Nederland) in the context of the research school ICO (Interuniversity Center for Educational Research) and was funded by The Netherlands Organisation for Scientific Research (project no. 411-01-010). The work reported in Chapter 2 was facilitated by an additional Internationalization grant in 2003.

© Tamara van Gog, Heerlen, The Netherlands, 2006
ISBN-10: 90-9020490-3
ISBN-13: 978-90-9020490-1
Cover design: Jeroen Berkhout
All rights reserved
Printed by Datawyse, Maastricht, The Netherlands


Uncovering the Problem-Solving Process to Design Effective Worked Examples

PROEFSCHRIFT (Doctoral Dissertation)

To obtain the degree of doctor at the Open Universiteit Nederland, by authority of the rector magnificus, prof. dr. ir. F. Mulder, to be defended in public before a committee appointed by the Board for Doctoral Degrees (College voor promoties),

on Friday 28 April 2006 in Heerlen, at 15.30 precisely,

by Tamara van Gog,

born on 18 June 1979 in Raamsdonk


Promotores (supervisors):
Prof. dr. J. J. G. van Merriënboer, Open Universiteit Nederland
Prof. dr. G. W. C. Paas, Open Universiteit Nederland

Overige leden beoordelingscommissie (other members of the assessment committee):
Prof. dr. S. Dijkstra, Universiteit Twente
Prof. dr. P. H. Gerjets, Knowledge Media Research Center/University of Tübingen
Prof. dr. J. Sweller, University of New South Wales
Dr. R. M. J. P. Rikers, Erasmus Universiteit Rotterdam
Prof. dr. H. P. A. Boshuizen, Open Universiteit Nederland


Contents

Preface 7
Chapter 1: General introduction 9
PART I: Uncovering the problem-solving process
Chapter 2: Instructional design for advanced learners: Establishing connections between the theoretical frameworks of cognitive load and deliberate practice 17
Chapter 3: Uncovering the problem-solving process: Cued retrospective reporting versus concurrent and retrospective reporting 29
Addendum: Expertise-related differences in experience of cued retrospective, concurrent, and retrospective reporting 45
Chapter 4: Uncovering expertise-related differences in troubleshooting performance: Combining eye movement and concurrent verbal protocol data 47
PART II: Process-oriented worked examples
Chapter 5: Process-oriented worked examples: Improving transfer performance through enhanced understanding 69
Chapter 6: Effects of process-oriented worked examples on troubleshooting transfer performance 83
Chapter 7: Effects of sequencing process-oriented and product-oriented worked examples on troubleshooting transfer performance 101
Chapter 8: General discussion 115
References 119
Summary 127
Samenvatting (Dutch summary) 133
ICO dissertation series 141


Preface

I have many people to thank for the important role they played in the realization of this dissertation.

First and foremost, my supervisors Fred Paas and Jeroen van Merriënboer, for giving me excellent examples to study, and providing the right guidance, advice, and support, as well as a lot of freedom. Fred, thanks for all the interesting ideas, for all the challenges, and for motivating me in your very unique way to seek and stretch my boundaries, while at the same time keeping me with my feet on the ground. Jeroen, thanks for your down-to-earth and relaxed way of providing feedback, and for sometimes giving discussions an unexpected but interesting twist by looking at things from a different perspective.

Puk Witte, for doing a great job and spreading a lot of positive energy. You were more like a co-worker than an “intern”; I think we made a good team.

Gerard van den Boom, for supervising me when I was an “intern” at the OTEC; if that had not been such a good experience there’s no telling whether I’d be writing this now.

My ‘supervising committee’, Els Boshuizen, Frans Prins, Remy Rikers, and John Sweller, for their critical comments and interesting suggestions throughout this four-year trajectory.

Our colleagues at Florida State University, for making my short visit in 2003 into a wonderful learning experience. In particular I’d like to thank Anders Ericsson for very inspiring conversations.

Ivo Hamers (Sintermeerten College, Heerlen), Nico Pluijmaekers (Arcus College, Heerlen), Roger Sliepen (Leeuwenborgh Opleidingen Sittard), Louise Verhoeven (Leeuwenborgh Opleidingen Maastricht), Jan Gielen (Bisschoppelijk College, Weert), and the ICT-maintenance staff at all those schools for their help with organizing the experiments. The students of the aforementioned schools and of the Hogeschool Zuyd Heerlen for participating.

Mihály Koltai of DesignSoft, Inc. for enabling us, free of charge, to also run the experiments with the TINA Pro software at schools where this software was not (yet?) used.

All colleagues at OTEC for a nice work environment, but in particular the former and current PhD students (Dominique, Huib, Liesbeth K., Angela, JW, Silvia, Ron, PJ, Judith, Karen, Marieke, Liesbeth B., Pieter, Gemma, Fleurie, Wendy, Sandra, and Femke) for being a great peer group, and my roommate Olga, for putting up with me and my stacks of paper for four years, and for being a “richtik Yiddische mame”: baking great cookies and always worrying about my daily vitamin intake.

Our colleagues from Rotterdam (especially Anique, Gino, Huib, Peter, Remy, Sofie, and Wilco) and Tübingen (especially Peter, Katharina, and Maria), and fellow PhD students in the VPO and JURE networks, for providing pleasant atmospheres to discuss research in and for being good fun to hang out with.

Some colleagues, many of whom have become more like friends over the years, for livening up the evenings in Zuid-Limburg with dinners, movies, games of cards, single malts, and concert/festival visits; I cannot be exhaustive here, but I’d like to mention in particular Dominique, Frans, Fred, Gerard & Loes, Judith & Flip, Iwan, JW, Liesbeth & Rob, Linda, PJ, Puk, and Ron.

My friends (again, I cannot be exhaustive, but I’d like to mention particularly Esther, Renée, Renske & Denny, Veerle, and Viola) for always being there for me despite the fact that we did not see each other as much as we’d like to because of the distance and often heavily loaded work-schedules.

My “family-in-law”, Jeanne, Peter, and Inge, for their love and support.

My mother Elly, who taught me that I could always speak my mind to anyone, as long as I did so politely, my as-a-father Wim, who always reminded me that “people who do not get a move on will always come second”, and my sister Lieke, who often pointed out that there’s more to life than work, for their love and support.

Finally, I would like to thank my paranimfen (the two attendants who stand by the candidate at the defense), Judith and Bas, who were always there for me in the last four years, both regarding work and private life. Judith, thanks for all the laughter (which, to quote Victor Borge, really is the shortest distance between two people), you are a soulmate. Bas, thank you for the truly incredible amount of love and support, as well as patience when work once again invaded evenings and weekends. I’d probably be lying if I said that’s all over now, so I won’t. But I’ll try.

Thank you!


Chapter 1

General Introduction

“Kunnen zonder kennen kan niet.” This Dutch alliterated statement, meaning “one cannot have skill without knowledge”, may seem obvious, but the tendency in present-day curricular innovations seems to be to consign “knowledge transmission” to the wastepaper basket and to focus on “competency-based education”, “learning by doing”, and “problem-based learning”. Yet, procedural knowledge lies at the heart of skilled performance, and a large body of research has shown that in the initial phases of skill acquisition students learn more effectively and more efficiently from studying worked examples than from solving problems (for overviews, see Atkinson, Derry, Renkl, & Wortham, 2000; Sweller, Van Merriënboer, & Paas, 1998).

Admittedly, the tendency to discard knowledge transmission is mainly a response to the teaching of factual knowledge and simple procedures in isolation, which indeed is not the best way to go because knowledge does not guarantee skill. Knowledge is a necessary but not a sufficient condition for skilled performance, just as extensive experience in a domain is a necessary but not a sufficient condition for expert performance (Ericsson & Lehmann, 1996). Also, a lot of worked examples research has focused on relatively simple procedures (which may nonetheless be complex in terms of imposed cognitive load for novice learners; see below), for example in arithmetic, that are usually just a small component of more complex, ‘whole tasks’ that are often a core component of curricular innovations (Van Merriënboer, 1997).

Yet, worked examples are well suited for (initial) learning of ‘whole tasks’ as well. For example, the research in this dissertation centers on learning to troubleshoot electrical circuits from worked examples, which is a relevant, reasonably authentic, and relatively complex whole task, in which declarative domain knowledge, system knowledge, heuristics, and procedural knowledge have to be used in unison in order to reason out successfully what causes the malfunction (see e.g., Gitomer, 1988; Gott, Parker-Hall, Pokorny, Dibble, & Glaser, 1993; Jonassen & Hung, 2005; Schaafstal, Schraagen, & Van Berlo, 2000).

Learning from Worked Examples: Decreasing Extraneous Cognitive Load

Cognitive load theory (Sweller, 1988; Sweller et al., 1998; Van Merriënboer & Sweller, 2005) explains the effectiveness of training that consists (mainly) of studying worked examples over training consisting of solving equivalent problems, in terms of reduced extraneous or ineffective cognitive load on working memory during training. Working memory capacity is considered limited to seven plus or minus two elements or chunks of information (Miller, 1956). Hence, tasks that contain a high number of interacting elements that have to be processed in working memory simultaneously place high demands on working memory. In cognitive load theory, this is referred to as intrinsic cognitive load. Next to the load imposed by the task, there is also load imposed by the instructional design. This can take two forms: when it is ineffective for learning, it is called extraneous cognitive load; when it is effective for learning, it is referred to as germane cognitive load (Sweller, 1988; Sweller et al., 1998).

When learners have to solve problems that are high in intrinsic load, searching for a solution with the weak strategies (e.g., means-ends analysis) that novice learners often employ is not effective for learning. Studying worked examples reduces the extraneous load, because learners can devote all available working memory capacity to studying the worked-out solution and constructing a schema for solving such problems in long-term memory (i.e., learning; Sweller, 1988, 2004). As a result, novice learners are capable of attaining more efficient transfer performance, that is, higher performance on transfer test problems combined with lower investment of cognitive resources in solving those problems (Paas & Van Merriënboer, 1993), after training that consists of studying worked examples than after training that consists of solving problems.
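The efficiency notion cited here can be made concrete. The sketch below, with invented example data, follows the standardized-score formulation commonly attributed to Paas and Van Merriënboer (1993), in which performance (P) and mental effort (R) are z-scored across all participants and efficiency is computed as E = (zP - zR) / sqrt(2):

```python
# Sketch of the relative condition efficiency measure attributed to
# Paas and Van Merriënboer (1993); the data below are invented for
# illustration, not taken from the dissertation.
from math import sqrt
from statistics import mean, stdev

def z_scores(values):
    """Standardize scores across all participants."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def efficiency(performance, effort):
    """E = (zP - zR) / sqrt(2): high performance at low effort -> E > 0."""
    return [(p - r) / sqrt(2)
            for p, r in zip(z_scores(performance), z_scores(effort))]

# Hypothetical learners: the first three studied worked examples
# (higher transfer performance, lower reported effort); the last
# three solved conventional problems.
perf = [9, 8, 8, 5, 4, 5]      # transfer test performance
effort = [3, 2, 4, 7, 8, 6]    # self-reported mental effort (1-9 scale)
E = efficiency(perf, effort)   # positive for the first three learners
```

Geometrically, E is the distance of a learner's (zR, zP) point from the line zP = zR, so a condition that yields high performance at low effort scores as highly efficient.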

Enhancing Learning from Worked Examples: Increasing Germane Cognitive Load

The cognitive capacity that is freed up by the reduction of extraneous cognitive load can, within capacity limits, be devoted to activities that contribute to learning and further enhance transfer performance. However, learners are unlikely to do so spontaneously. Hence, cognitive load research has started to shift attention towards the identification of instructional techniques that stimulate learners to invest cognitive resources in activities relevant for learning, that is, techniques that are successful at inducing germane cognitive load (Paas, Renkl, & Sweller, 2003, 2004; Sweller et al., 1998).

Strategies that are known to increase germane cognitive load are, for example, increasing the variability (Paas & Van Merriënboer, 1994a) or contextual interference (Van Merriënboer, Schuurman, De Croock, & Paas, 2002) of worked examples during practice. Prompting students to self-explain the rationale behind the presented solution steps (Atkinson, Renkl, & Merrill, 2003; Chi, Bassok, Lewis, Reimann, & Glaser, 1989; Renkl, 1997) may also induce a germane cognitive load, provided that learners are capable of providing adequate explanations (see Chi et al., 1989; Renkl, 1997). However, students may lack the domain knowledge necessary to do so, especially very early in training, and when this pre-condition is not met, requiring learners to self-explain is likely to induce an extraneous instead of germane cognitive load (i.e., it will be ineffective for learning).

In this dissertation the question is addressed whether providing process-oriented worked examples, which not only present learners with worked-out solution steps but also explicate the rationale behind them, may be more effective at inducing a germane load than so-called product-oriented worked examples (Chapters 5, 6, and 7). Students can use the capacity that is freed up by the reduced extraneous load through the availability of a worked-out solution to study the “process information” and gain a better understanding of the solution procedure. Understanding is considered imperative for transfer, especially for far transfer, because far transfer tasks have different structural features from the trained tasks, and therefore do not allow learners to merely apply a memorized procedure. Recognizing and flexibly applying those parts of a learned procedure that are relevant for a new problem is impossible unless a learner “not only knows the procedural steps for problem-solving tasks, but also understands when to deploy them and why they work” (Gott et al., 1993, p. 260; see also Catrambone, 1996, 1998).
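The distinction described above can be illustrated with a small invented example in the dissertation's troubleshooting domain (the steps and rationales below are hypothetical, not taken from the studies): a product-oriented worked example presents only the solution steps, while a process-oriented one pairs each step with the rationale behind it.

```python
# Invented illustration of product- vs. process-oriented worked examples;
# the steps and rationales are hypothetical, not taken from the studies.

# Product-oriented: only the worked-out solution steps.
product_oriented = [
    "Measure the voltage across resistor R1.",
    "Compare the measurement with the expected voltage drop.",
    "Replace R1.",
]

# Process-oriented: the same steps, each explicated with the "why"
# behind it, supporting understanding and hence (far) transfer.
process_oriented = [
    ("Measure the voltage across resistor R1.",
     "A faulty component can be located by comparing measured and "
     "expected values at suspect points in the circuit."),
    ("Compare the measurement with the expected voltage drop.",
     "A deviation indicates that the malfunction lies in this part "
     "of the circuit."),
    ("Replace R1.",
     "The deviating measurement identifies R1 as the faulty component."),
]

# The process-oriented example contains the product-oriented one.
steps_only = [step for step, rationale in process_oriented]
```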

The Role of Expertise in Learning from Examples

Alongside the shift in focus towards identifying measures that enhance germane load, cognitive load research also started to consider the role of expertise¹ in the effectiveness of instructional formats. This resulted in the identification of the “expertise reversal effect” (for a thorough discussion see Kalyuga, Ayres, Chandler, & Sweller, 2003), which suggests that instructional techniques that are effective for novice learners may not be so for more advanced learners (e.g., worked examples; Kalyuga, Chandler, Tuovinen, & Sweller, 2001), and vice versa (e.g., imagining; Cooper, Tindall-Ford, Chandler, & Sweller, 2001). This raises the question of how increasing expertise influences the cognitive load imposed by instructional materials and how instruction should be altered in response.

¹ Note that the term expertise as it is used here refers to knowledge. A definition of expertise as the amount of knowledge/experience should not be confused with a definition of expertise as expert (excellent) performance, for which it is a necessary, but not sufficient, condition (Ericsson & Lehmann, 1996).


Regarding relatively short-term increases in knowledge, cognitive load research has found that a completion or fading strategy should be applied in order to adapt to the learner’s increasing knowledge (Renkl & Atkinson, 2003; Van Merriënboer & Krammer, 1990). In such a strategy, the learner starts with studying worked examples and progresses to completing partially worked examples with increasingly more blanks, until s/he solves problems without any instructional assistance. Rather than through a fixed instructional sequence, increases in knowledge can also be taken into account in a more flexible way, by adaptive selection of instructional tasks in response to learner characteristics. For example, a learner’s level of performance, mental effort, or a combination of the two on one task can be used to decide whether the next task should be more or less complex, or should contain more or less instructional guidance (Camp, Paas, Rikers, & Van Merriënboer, 2001; Corbalan, Kester, & Van Merriënboer, in press; Salden, Paas, Broers, & Van Merriënboer, 2004). Recently, Kalyuga and Sweller (2004, 2005) developed a rapid method for dynamic assessment of knowledge, presenting learners with a problem state and asking them to indicate the appropriate move, which is assumed to reflect the current level of schema development and can be used to select an optimal level of instructional guidance in the next task.
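The adaptive selection idea sketched in this paragraph can be illustrated with a toy decision rule; the function name and thresholds below are invented for illustration and are not taken from the cited studies.

```python
# Toy sketch of adaptive task selection based on performance and mental
# effort on the previous task; all thresholds are invented for illustration.

def next_task_level(performance, effort, level):
    """performance and effort are on 0-1 scales; level indexes task
    complexity (or, inversely, the amount of instructional guidance)."""
    # High performance at low effort suggests spare capacity:
    # increase complexity / fade guidance (e.g., more blanks to complete).
    if performance >= 0.8 and effort <= 0.4:
        return level + 1
    # Low performance at high effort suggests overload: step back down.
    if performance < 0.5 and effort > 0.7:
        return max(level - 1, 0)
    # Otherwise, stay at the current level for further practice.
    return level

# A strong, low-effort performance moves the learner up a level.
assert next_task_level(0.9, 0.2, level=3) == 4
```

Combining performance with mental effort, rather than using either alone, is what distinguishes this family of adaptive methods: two learners with the same score may differ greatly in how much capacity the task consumed.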

However, concerning longer-term expertise development, that is, not just acquiring knowledge of how to perform one particular task (constructing a problem schema), but acquiring expertise in a domain (constructing many, interrelated schemata), it becomes more complex to predict the level and type of cognitive load imposed by instruction, and hence its effectiveness. Therefore, researchers need to attain detailed insight into the cognitive structures learners have developed, and process-tracing techniques seem promising in this respect (Chapter 2).

Process-Tracing

Process-tracing techniques can be “used to make inferences about the cognitive processes or knowledge underlying task performance” (Cooke, 1994, p. 814). This makes them valuable for educational researchers as well as instructional designers, because they can be used both for gaining insight into changes in cognitive structures with increasing expertise development, and for eliciting problem-solving process information from individuals with high levels of expertise to develop instruction (e.g., process-oriented worked examples).

For an overview of process-tracing techniques, the reader is referred to Cooke (1994). In this dissertation, two techniques are investigated (Chapters 3 and 4): collecting verbal reports (Ericsson & Simon, 1993), which is quite widely applied in all kinds of disciplines, and eye tracking (Rayner, 1998), which is much less used (especially in educational research), but is gaining more attention with advancing technology that makes the collection and analysis of eye movement data less burdensome.

Overview of the Dissertation

This dissertation is divided into two parts. Part I, Uncovering the Problem-Solving Process, is concerned with process-tracing techniques for eliciting information about the problem-solving process and expertise-related differences therein. Part II, Process-Oriented Worked Examples, is concerned with the effects on learning of explicating problem-solving process information in worked examples.

The first part starts with a discussion of the connections between the theoretical frameworks of cognitive load theory and expert performance, more specifically, deliberate practice (Chapter 2), and identifies, among other things, the (process-tracing) techniques used in expert performance research that are valuable for cognitive load research. Chapters 3 and 4 empirically investigate the use of process-tracing techniques to uncover detailed problem-solving process information. Chapter 3 describes the effectiveness of cued retrospective reporting, concurrent reporting, and retrospective reporting in terms of the quantity of different types of problem-solving information elicited; in the addendum expertise-related differences in the way participants experience those reporting methods are explored. Chapter 4 investigates the usefulness of eye movement data in addition to concurrent verbal reports for uncovering subtle expertise-related differences in performance.

The second part starts with a chapter on the theoretical rationale for the effectiveness of process-oriented worked examples (Chapter 5). Chapters 6 and 7 describe experiments on the effects of process-oriented worked examples on learning and transfer. The experiment reported in Chapter 6 compares the effectiveness of worked examples with and without process information (i.e., process- and product-oriented) to that of conventional problems with and without process information. Chapter 7 compares the effectiveness of a sequence of process-oriented and product-oriented worked examples to a sequence of product-oriented and process-oriented, only product-oriented, or only process-oriented worked examples. The final chapter contains a general discussion of the findings (Chapter 8).


PART I

Uncovering the Problem-Solving Process


Chapter 2

Instructional Design for Advanced Learners: Establishing Connections between the Theoretical Frameworks of Cognitive Load and Deliberate Practice¹

Cognitive load theory (CLT) has been successful in identifying instructional formats that are more effective and efficient than conventional problem solving in the initial, novice phase of skill acquisition. However, recent findings regarding the “expertise reversal effect” have begun to stimulate cognitive load theorists to broaden their horizon to the question of how instructional design should be altered as a learner’s knowledge increases. To answer this question, it is important to understand how expertise is acquired and what fosters its development. Expert performance research, and, in particular, the theoretical framework of deliberate practice, has given us a better understanding of the principles and activities that are essential in order to excel in a domain. This article explores how these activities and principles can be used to design instructional formats based on CLT for higher levels of skill mastery. The value of these formats for e-learning environments in which learning tasks can be adaptively selected on the basis of online assessments of the learner’s level of expertise is discussed.

Nowadays, most researchers agree that, ideally, instruction for complex skill learning should center on authentic tasks, should be adaptive to the individual learner’s needs and capacity, and should support and motivate learners in acquiring the ability to plan, monitor, and evaluate their own learning process. Modern e-learning tools allow the incorporation of sophisticated online assessments of the level of learner expertise, and are, therefore, very helpful in the delivery of this kind of instruction. However, the challenge for instructional designers is to develop instruction that meets these demands and is not only effective, but also as efficient as possible. To do so, insight is required into the mechanisms that underlie or mediate the acquisition of particular complex skills at different levels of expertise.

In investigating the acquisition of complex skills, the lines of research on cognitive load theory (CLT; Sweller, 1988) and expert performance (Ericsson, 2002) have very different foci. Research on expert performance investigates the history of skill acquisition of highly skilled professionals in order to identify the mechanisms that underlie their superior achievements, without the aim of translating these very specific mechanisms into general instruction for complex skills in educational settings. In contrast, CLT research has mainly focused on developing effective and efficient instructional strategies to support initial skill acquisition in educational settings.

¹ This chapter was published as Van Gog, T., Ericsson, K. A., Rikers, R. M. J. P., & Paas, F. (2005). Instructional design for advanced learners: Establishing connections between the theoretical frameworks of cognitive load and deliberate practice. Educational Technology Research and Development, 53(3), 73-81.

Complementary to CLT’s almost exclusive focus on the initial learning phase, the recently found “expertise reversal effect” (Kalyuga, Ayres, Chandler, & Sweller, 2003; Kalyuga, Chandler, & Sweller, 1998), indicating that most CLT effects become less effective as a function of increasing expertise, can be considered a strong indication that CLT research needs to broaden its scope toward providing instructional design recommendations beyond the initial phase of skill acquisition. From both a theoretical and a practical point of view, the basic tenet of CLT, to take the limitations of human cognitive architecture into account when designing instruction, does not seem to put any obstacles in the way of fulfilling this need. However, we argue that in order to successfully develop design guidelines for instruction beyond the initial phase, CLT research should take into account the mechanisms that underlie the superior achievements of the experts identified in expert performance research. Although this research has not aimed to provide general instructional guidelines, it has provided valuable insights into the interaction between human cognitive architecture and developing expertise.

CLT and the Expertise Reversal Effect

CLT (Paas, Renkl, & Sweller, 2003) holds that instructional design should explicitly consider the human cognitive architecture and its limitations in order to be effective. According to CLT, cognitive architecture consists of a general-purpose working memory that has a limited capacity of about seven chunks of information when just holding information, and not more than two or three chunks when processing information, and a long-term memory that has a virtually unlimited capacity, and holds information stored in schemas. Schemas can reduce working memory load, because once they have been acquired and automated, they can be handled in working memory with very little conscious effort. In addition, no matter how extensive a schema is, it will be treated as one chunk of information, thereby increasing the amount of information that can be held and processed in working memory without requiring more conscious effort. This ensures that there is enough cognitive capacity available to solve very complex problems. However, when schemas have not yet been acquired, all information elements (chunks) of the problem have to be kept in working memory as separate items, which might lead to a high or excessive demand on working memory capacity. Consequently, there would not be enough capacity left for the formation of a problem schema, and learning would be hampered.
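The chunking mechanism described above can be made tangible with a toy model; the letter strings and the greedy matching rule below are invented for illustration.

```python
# Toy model of schema-based chunking: elements covered by a schema in
# long-term memory occupy a single working-memory slot.
WM_CAPACITY = 7  # "seven plus or minus two" (Miller, 1956)

known_chunks = {"CIA", "FBI", "USA"}  # familiar three-letter schemas

def wm_items(letters):
    """Count working-memory items, greedily matching known chunks."""
    items, i = 0, 0
    while i < len(letters):
        if letters[i:i + 3] in known_chunks:
            items += 1  # a whole known chunk fills one slot
            i += 3
        else:
            items += 1  # an unfamiliar letter fills one slot on its own
            i += 1
    return items

# Nine unfamiliar letters exceed capacity (9 items > 7 slots), but the
# same amount of material fits easily once it maps onto three schemas.
novice_load = wm_items("XQZKWJPLM")   # 9 separate items
expert_load = wm_items("CIAFBIUSA")   # 3 chunked items
```

The same nine elements thus impose very different loads depending on the schemas available in long-term memory, which is the point the paragraph makes about novices versus learners with acquired schemas.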

CLT is concerned with instructional techniques for managing working memory load in order to facilitate the changes in long-term memory associated with schema construction and automation. These techniques aim at minimizing extraneous, ineffective cognitive load (i.e., not requiring complex reasoning processes with many interacting unknown chunks of information), and increasing germane, effective cognitive load that facilitates domain-specific knowledge acquisition. CLT research on instructional formats that take these principles into account has identified the following effects: the “goal-free effect,” the “worked example effect,” the “split-attention effect,” the “redundancy effect,” the “modality effect,” the “completion effect,” the “variability effect,” and the “imagination effect” (see Sweller, 2004; Sweller, Van Merriënboer, & Paas, 1998).

In recent years, most of these CLT effects have been found to facilitate learning for novices, but to become less effective or even dysfunctional as a function of increasing expertise, which is known as the “expertise reversal effect” (Kalyuga et al., 2003). As an example, the “split-attention effect” occurs when learners have to divide their attention over two (or more) information sources that cannot be understood in isolation, such as a mutually referring text and diagram; therefore, physically integrating the text and diagram is beneficial for learning. But with increasing expertise, a learner might be able to understand the information sources in isolation, and when this is the case, the integrated format will lead to a “redundancy effect,” which hampers learning (e.g., Chandler & Sweller, 1991).

However, the expertise reversal effect has been found in studies that use the same instructional materials for students of very low and somewhat higher levels of expertise. Therefore, the conclusion that some of the instructional formats based on CLT do not work for more experienced learners (Kalyuga et al., 2003) may be premature. It is still possible that CLT-based formats (e.g., worked examples) can benefit learners at higher levels of expertise, by taking into account their prior knowledge. Nonetheless, the findings regarding the expertise reversal effect have drawn CLT researchers’ attention to the fact that the learner’s level of expertise is an important factor mediating the relation between cognitive architecture, information structures, and learning outcomes. Consequently, researchers have started to explore how to design effective and efficient instruction for learning beyond the initial levels of mastery, and, therefore, need to understand what it means to acquire expertise, and what fosters its development.

Expert Performance Research

Research on expert-novice differences (Chi, Glaser, & Farr, 1988) has shown that experts excel mainly in their domain of expertise, are faster than novices at performing skills, perform their tasks (almost) error free, have superior short-term and long-term memory, and have deeper and more principled problem representations than novices, who tend to build superficial representations of a problem. As Ericsson and Lehmann (1996) noted, this research on expert-novice differences has taken a knowledge-based approach to expertise, which equates expertise with having acquired a lot of knowledge during many years of experience in a domain. However, there is evidence (see Ericsson & Lehmann, 1996) that experts by this definition often do not show superior performance on relevant tasks compared to less experienced individuals. Expert performance research is concerned with identifying the mechanisms that have enabled individuals to attain expert performance, that is, “consistently superior performance on a specified set of representative tasks for a domain” (Ericsson & Lehmann, 1996, p. 277). It uses techniques such as collecting retrospective verbal protocols, diaries, and interview data to study small groups of people differing in their current (high) levels of performance under normal conditions (e.g., Ericsson, Krampe, & Tesch-Römer, 1993), as well as process-tracing methods such as eye tracking, reaction-time tasks, recall tasks, and verbal reports to study expert performance under experimentally varied conditions (Ericsson & Lehmann, 1996).

Expert performance research has shown that it is not the amount of experience in a domain that is relevant for acquiring expert performance, but rather the amount of deliberate effort to improve performance. As Ericsson et al. (1993) argued, expertise and expert performance are acquired through extensive engagement in relevant practice activities, and individual differences in performance are for a large part accounted for by differences in the amount of relevant practice. Relevant practice activities for improving performance are referred to as deliberate practice; typically, in domains such as sports, typing, chess, and music, these activities are initially designed by the teacher or coach to help students improve specific aspects of their performance. Deliberate practice activities are at an appropriate, challenging level of difficulty, and enable successive refinement by allowing for repetition, giving room to make and correct errors, and providing informative feedback to the learner (Ericsson et al., 1993; Ericsson & Lehmann, 1996). Given that deliberate practice requires students to stretch themselves to a higher level of performance, it requires full concentration and is effortful to maintain for extended periods. Students do not engage in deliberate practice because it is inherently enjoyable, but because it helps them improve their performance. That is why deliberate practice activities are often scheduled for a fixed period during the day (at which body and mind are best capable of the effort), and this daily period is of limited duration (Ericsson et al., 1993).

A few other important findings from expert performance research should be mentioned here. The first is that, for domain-relevant tasks, expert performers are able to acquire cognitive mechanisms and physiological adaptations that circumvent or change the limits constraining the performance of novices. For example, working memory limitations are expanded by the acquisition of long-term working memory (Ericsson & Kintsch, 1995), reasoning is improved by knowledge encapsulation (Rikers, Schmidt, & Boshuizen, 2002), and rapid responses are attained through anticipation and many other qualitative changes (Ericsson, 2002, 2003; Ericsson & Lehmann, 1996). A second finding is that, in contrast to the assumptions of theories of skill acquisition (e.g., Fitts & Posner, 1967), the most important aspects of expert performance are not fully automated, and the expert performer retains control over them. Ericsson (2002; Ericsson & Lehmann, 1996) proposed that maintaining high levels of conscious monitoring and control is essential for further improvement of a skill through deliberate practice. Finally, teachers and coaches help future expert performers to become independent learners who design and monitor their own training activities (Glaser, 1996; Zimmerman, 2002), which is critical to the ultimate goal for expert performers, namely to make a major creative contribution to their domain of expertise (Ericsson & Lehmann, 1996).

Establishing Connections between the Two Frameworks and Suggestions for New Research in Education

On the long road from beginner to expert performer, there are many fundamental changes in the structure of the mechanisms mediating performance as well as in the conditions of learning and practice. Although the designed instruction of beginners and the self-guided deliberate practice of expert performers may have very little in common, in this section we will explore how methods and insights from the study of expert performers might allow instructional designers to extend and supplement the methods of CLT to increased levels of skill and expertise. First, we will discuss methods for identifying and describing acquired cognitive structures and mechanisms, such as schemas, which are necessary if we want to make predictions about how instructional methods might aid advanced learners. We will then examine how the insights from the deliberate practice of expert performers can be adapted and incorporated into the instruction and training of less advanced students.

Identifying Cognitive Structures and Skilled Mechanisms of Advanced Learners

Whether it is possible to extend CLT-based instructional strategies that are effective for novices to more skilled learners depends on several issues that have received relatively little attention from researchers. First, it must be possible to study and describe advanced performers’ problem solving and their use of explicit schemas.

If a learner’s knowledge base changes, and this change influences future learning, we need a way to accurately judge the content of this knowledge base in order to design effective instruction. Although CLT research relies heavily on assumptions about schema construction, it has predominantly evaluated the effectiveness of instructional interventions using the combined measures of transfer test performance and mental effort. Although it is valid to conclude that an instructional format that results in higher transfer performance with a lower investment of mental effort is more efficient, this does not provide insight into the mechanisms that underlie the enhanced performance. CLT assumes that schema acquisition was successful based on the higher performance measures, but these measures say very little about the content of such a schema, how it is used when solving transfer problems, and whether errors in performance occur because of a lack of understanding or because of, for example, computational errors (see also Van Gog, Paas, & Van Merriënboer, 2004). More importantly, whereas this assumption works fine for the initial phase of skill acquisition, when no appropriate schemas for the specific problems are present, it will fail when designing instruction for more advanced students, because according to CLT, existing schemas largely determine the cognitive load. Another assumption that is hardly ever tested is that a given instructional design will elicit the same specific learner activities in all learners (Gerjets & Scheiter, 2003). So, to make predictions about the cognitive load imposed by different designs on learners who have passed the initial phase of skill acquisition, existing schemas and their organization, as well as learners’ processing strategies, will have to be taken into account.
Techniques that are often used in expert performance research might be of use here, such as the analysis of concurrent or retrospective verbal protocols (Ericsson & Simon, 1993) and other process-tracing methods such as eye tracking (e.g., Charness, Reingold, Pomplun, & Stampe, 2001). Using these methods to explain differences in performance not only during transfer tests but also during practice might help researchers pinpoint the mechanisms that underlie schema acquisition.

Another interesting approach to identifying schema content would be to design experiments that directly compare the problem representations established in long-term memory by different instructional formats. For example, learners could be asked to reproduce from memory the solution path of either the worked example they just studied or the conventional problem they just solved. If the worked example group has indeed acquired a better problem schema, they should perform better on this task than the problem-solving group. Similarly, the assumption that completion problems (worked examples with blanks that learners have to fill in) force learners to pay more attention than studying complete worked examples does (e.g., Paas, 1992) could be tested by removing the completion problem or worked example after a given study time and asking learners to reproduce it. The combination of these memory reproduction tasks and concurrent verbal protocols during study might provide more direct evidence on what and how students learn from different instructional formats.

Designing Instruction and Training Based on the Characteristics of Deliberate Practice

The specific nature of deliberate practice depends on the structure and the amount of previously acquired skill, and will differ greatly across individuals. We will therefore limit the discussion of applying the characteristics of deliberate practice to instruction to the ideal of instruction for complex skill learning sketched in the introduction: adaptive, individualized instruction, based on authentic tasks, that gradually allows learners to take control over the process (an approach for which adaptive e-learning environments are well suited). A first step in applying these ideas would be to identify aspects of skilled performance on representative tasks in a given domain, together with the performance criteria associated with different levels of expertise. Tasks that improve performance on these aspects (i.e., deliberate practice tasks) could then be developed, or existing tasks could be identified. However, as Ericsson et al. (1993) emphasized, it is important that activities to improve specific aspects of a skill are carried out in the context of the entire skill.

Selection rules and variables. For any learner, the current level of performance and areas for improvement should be identified, and this assessment can be based on the performance criteria for aspects of skilled performance on representative tasks. This provides the first input for selection rules for assigning deliberate practice tasks. Because the same level of performance can be attained by different individuals, but with different mediating processes and at very different costs, those rules should be composed from task and learner variables. In addition to task performance, variables such as mental effort, time on task, and strategies should be considered in the selection of practice tasks. Recently, Camp, Paas, Rikers, and Van Merriënboer (2001) investigated the use of a combined measure of performance and mental effort (i.e., efficiency) as a basis for dynamic task selection in the domain of air traffic control (see also Kalyuga & Sweller, 2005; Salden, Paas, Broers, & Van Merriënboer, 2004). However, it remains to be determined whether CLT efficiency measures can be applied to the selection of deliberate practice activities, because deliberate practice requires the investment of a high level of effort. On the other hand, a high level of effort does not necessarily imply engagement in deliberate practice. Comparable to the concept of germane cognitive load, the increased concentration and effort in deliberate practice has a special function, namely to allow students to attain a higher level and to improve the targeted aspect of performance.
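As a sketch of how such an efficiency measure is typically operationalized in CLT research (the standardized-score formulation below is the common one; the studies cited here may use variants):

```latex
E = \frac{z_{\mathit{performance}} - z_{\mathit{effort}}}{\sqrt{2}}
```

Here performance and mental effort scores are standardized across learners and tasks; E > 0 indicates higher-than-average performance for the effort invested, and E < 0 the reverse. The division by \(\sqrt{2}\) expresses E as the perpendicular distance from the line on which standardized performance equals standardized effort.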

More generally, we see the need for studying how detailed assessments of student performance can guide the selection of appropriate practice activities. A good starting point might be to investigate the actual training practices as well as decision processes and deliberations of master teachers and coaches who have experience in selecting deliberate practice activities (e.g., in domains such as sports, typing, chess and music; see Ericsson, 2002). However, it is likely that rules used by teachers and coaches would not be sufficiently explicit to allow simple translation into selection rules that can be used in instructional design. We believe that the issue of selecting appropriate training activities for more advanced students will provide a very fruitful area of research where analysis of measurable aspects of performance and assessment of mediating cognitive mechanisms will lead to the discovery of valid rules for selection of training tasks.

Format and scheduling of activities. In spite of the evidence for the expertise reversal effect (Kalyuga et al., 2003), there is some suggestion that instructional formats based on CLT may be effective when adapted to advanced levels of expertise. In particular, some recent formats that aim at enhancing germane cognitive load might qualify as deliberate practice activities for students at certain levels of expertise, such as instructing students to self-explain (Renkl, 2002), or to imagine or anticipate next steps (Cooper, Tindall-Ford, Chandler, & Sweller, 2001). These formats encourage students to generate rich responses and thus learn from errors and difficulties, and feedback in the form of the right explanation or steps gives students the opportunity to diagnose and learn from their errors. Important for both the concept of germane cognitive load and deliberate practice is that they will have positive effects on performance only if learners are motivated to invest the effort. Motivation is thus a mediating variable and can be an important constraint on effectiveness. Monitoring learner motivation, or the interaction between a particular format, motivation, and effectiveness, might provide important information on how to schedule different types of activities (see also Paas, Tuovinen, Van Merriënboer, & Darabi, 2005).

Ericsson et al. (1993) have shown that deliberate practice activities are usually of limited duration (2–4 hr) and are often scheduled for a fixed time during the day, because they require such high amounts of effort. A very interesting question is whether this is more efficient, that is, whether the improvements in performance are equal to or higher than those attained with traditional forms of instruction, even though the time spent per day is limited. Zhu and Simon’s (1987) findings are interesting in this respect. They compared a traditional 3-year mathematics curriculum to a redesigned curriculum based on carefully chosen sequences of worked-out examples and problems. They found that most of the students in the new curriculum were able to complete the entire curriculum in 2 years and were at least as successful as students learning by conventional methods.

However, as indicated before, there are probably other principles that come into play at higher levels of expertise or task complexity. Highly interesting in this respect are recent efforts to study the microstructure of the practice activities of expert performers by process-tracing methods, such as “think aloud” and detailed observation (Deakin & Cobley, 2003; for reviews, see Ericsson, 2002, 2003). In addition, longitudinal research should be used to identify those principles, as well as to provide answers to the questions raised here. Such research can also provide more detailed information on how learners use the informative feedback provided by deliberate practice, how they use the opportunity for repetition (i.e., how many times), and how they go about correcting errors if they make any.

Learner control. If students are to continue improving their performance after the end of formal education, they have to be prepared to shape their own learning processes, that is, to diagnose their needs for improvement, seek out their own activities, and plan, monitor, and evaluate this entire process (see Zimmerman, 2002). The most common method to prepare students for increased independence in domains of expertise is to gradually reduce the teacher-controlled selection of tasks, and thus force the learner to develop skill in the selection of tasks in parallel with the development of other aspects of expertise (Glaser, 1996).

For highly skilled performers, a high level of learner control is possible: analyses show that they are capable of monitoring their performance, diagnosing and modifying the mediating cognitive mechanisms in response to inferior achievement, and designing their own training activities to improve their weaknesses (Ericsson, 2002). Therefore, it is likely that in acquiring increasingly complex cognitive mechanisms, individuals will also acquire mechanisms for monitoring and assessing their performance, and become able to use general feedback and their expected and observed performance outcomes to diagnose and make appropriate adjustments in the mechanisms controlling their performance.

This raises both an interesting question for further research and a major challenge for instructional design. The question is whether it is possible to facilitate students’ development of the skill to diagnose their own needs for improvement; in other words, to teach them how to identify appropriate training activities. For example, a specific type of process-oriented worked example (Van Gog et al., 2004) that explicitly shows how teachers and advanced learners select tasks by monitoring the processes mediating task performance might benefit the acquisition of this skill. A major challenge, but also a major benefit, for instruction for more skilled performers would be to develop a collection of authentic training tasks that qualify as deliberate practice activities and support self-regulated learning, the generation of feedback, and repeated practice of corrected performance.

General Discussion

Researchers studying CLT and expert performance have focused on the beginning and the ultimate goal of the acquisition of skilled performance, respectively. As instructional designers working within the CLT framework extend their work toward increasingly skilled students, we see great promise in developing connections to the framework of expert performance research. In this article, we have indicated a number of interesting relations between both frameworks; their implications for instructional design research are summarized in Table 1.

Skilled and expert performers have acquired a wide range of complex cognitive mechanisms that mediate their superior performance and allow them to circumvent the processing limits that constrain novices. These mechanisms also allow developing performers to monitor and gradually refine their performance during deliberate practice.


A fundamental question is whether and how instructional interventions might facilitate this learning process, and whether the duration of this training can be shortened by designing training activities according to CLT, as Rikers, Van Gerven, and Schmidt (2004) suggested. Another highly relevant question emerging from the expert performance perspective concerns the motivational factors that lead skilled performers to focus their lives on attaining high levels of performance and to spend thousands of hours in deliberate practice (Ericsson, 2002). Might a better understanding of these motivational factors help instructional designers to facilitate the engagement of less skilled students in deliberate training activities?

Table 1
Suggestions for Instructional Design Research

Memory / Cognitive Structures. ID research should identify the actual effects on memory structures of different instructional formats for learners at different levels of expertise, instead of deriving assumptions about these structures from subsequent task performance.

Improving Learning & Performance. ID research should identify what instructional formats are capable of increasing germane cognitive load, or may constitute deliberate practice, for learners of different levels of expertise. This requires the identification of aspects of skilled performance on representative tasks in a domain and the associated performance criteria at different levels of expertise.

Adaptive Training Design. ID research should identify what the relevant aspects of performance (i.e., a more fine-grained performance measure) and other relevant variables are in a domain, and investigate how these variables can be assessed and used in subsequent task selection to make instruction adaptive to the needs for improvement of individual learners.

Effort / Motivation. ID research should focus more on the relationship between motivation, investment of mental effort, and effectiveness of different instructional formats.

Self-regulation / Control. ID research should identify at what point in their development learners become capable of self-assessment and self-selection and what the relevant mechanisms are that support these skills, and investigate whether these skills can be trained.

One of the exciting challenges of developing instruction for advanced learners is that their learning will involve the modification of skills and information that have been previously organized in long-term memory. In this article we sketched the opportunities for instructional designers to use process-tracing methods to assess individuals’ organization of these preexisting structures and to develop instructional methods that fit the attributes of the individual advanced learner. Once this can be realized for a given domain, there is great promise for instruction in the development of a collection of suitable training tasks. E-learning tools would have considerable benefits in storing such a collection of tasks in a database, in allowing online assessment of the level of expertise based on a number of variables, and in translating this assessment into selection rules for retrieving tasks. Current technological developments increasingly allow the development of tools that incorporate these functionalities (Shute & Towle, 2003). However, when it comes to instruction aimed at improving performance instead of knowledge, we still have a long way to go, because the domain should allow the use of e-learning tools without devaluing task authenticity, and those tools should allow for gradually increasing learner control in assessment and selection. The questions raised in this article show that establishing connections between the theoretical frameworks of CLT and expert performance research provides fertile ground for future research on instructional design. Eventually, integrating those theoretical perspectives and their empirical findings could be attempted.


Chapter 3

Uncovering the Problem-Solving Process: Cued Retrospective Reporting versus Concurrent and Retrospective Reporting1

This study investigated the amounts of problem-solving process information (‘action’, ‘why’, ‘how’, and ‘metacognitive’) elicited by means of concurrent, retrospective, and cued retrospective reporting. In a within-participants design, 26 participants completed electrical circuit troubleshooting tasks under different reporting conditions. The method of cued retrospective reporting used the original computer-based task and a superimposed record of the participant’s eye fixations and mouse/keyboard operations as a cue for retrospection. Cued retrospective reporting (with the exception of ‘why’ information) and concurrent reporting (with the exception of ‘metacognitive’ information) resulted in a higher number of codes on the different types of information than retrospective reporting.

Process-tracing methods, such as concurrent reporting (‘thinking aloud’), retrospective reporting, eye tracking, and decision analysis, can be used to elicit information that allows for making inferences about the cognitive processes underlying problem-solving performance (Cooke, 1994). Hence, these methods are widely applied, for example in usability studies, to investigate how people interact with a system or device in order to improve it (e.g., Van den Haak, De Jong, & Schellens, 2003); in the design of expert systems, to uncover expert cognitive processes in order to model a system (see Richman, Gobet, Staszewski, & Simon, 1996); and in educational research either to uncover problem-solving processes as a goal in itself, or to improve instruction (e.g., Renkl, 1997).

Problem solving is defined as getting from an initial problem state to a desired goal state without knowing exactly what actions are required to get there (Newell & Simon, 1972). In problem solving, different types of knowledge are applied. Domain knowledge (principles) is used to mentally represent the problem and to narrow down the problem space to those problem-solving operators (i.e., solution steps/actions) that may be relevant for this kind of problem. It interacts with strategic knowledge (heuristics, systematic approaches to problem solving), which is used to select the operators that are most likely to lead to the goal state. Metacognitive knowledge is used to monitor this process of selection and application of operators by keeping track of the progress towards the goal state.

1 This chapter was published as Van Gog, T., Paas, F., Van Merriënboer, J. J. G., & Witte, P. (2005). Uncovering the problem-solving process: Cued retrospective reporting versus concurrent and retrospective reporting. Journal of Experimental Psychology: Applied, 11, 237-244.

For the design of instruction that makes all the knowledge used in the problem-solving process explicit to learners (e.g., process-oriented worked examples, see Van Gog, Paas, & Van Merriënboer, 2004), a process-tracing technique is required that is able to uncover information about problem-solving actions taken (‘action’), domain principles used (‘why’), strategies used (‘how’), and self-monitoring (‘metacognitive’). However, the results obtained with the two most widely applied verbal methods, concurrent reporting and retrospective reporting, suggest that neither of these methods is suitable for providing a comprehensive picture of the problem-solving process in terms of ‘action’, ‘why’, ‘how’, and ‘metacognitive’ information.

With the method of concurrent reporting (Ericsson & Simon, 1993; Van Someren, Barnard, & Sandberg, 1994), participants are instructed to “think aloud”, that is, to verbalize all thoughts that come to mind while working on a task (i.e., on-line). In retrospective reporting (Ericsson & Simon, 1993), participants are instructed to report the thoughts they had while they were working on a task immediately after completing it (i.e., off-line). It should be noted that in order to allow for valid inferences about the cognitive processes underlying task performance, the wording of verbalization instructions and prompts is crucial. Only when the instructions and prompts are worded in a way that the evoked responses do not interfere with the cognitive processes can concurrent and retrospective reporting result in verbal protocols that reflect the reported cognitive processes (Ericsson & Simon, 1993).

However, there is an important distinction between the information contained in concurrent and retrospective protocols, related to their on-line and off-line generation, respectively. Concurrent protocols reflect the information available in short-term memory during the process, whereas retrospective protocols reflect the memory traces of the process that are retrieved from short-term memory (in tasks of very short duration) or long-term memory directly after it is finished (Camps, 2003; Ericsson & Simon, 1993). This reliance on different memory systems seems to result in differences in the problem-solving information contained in the protocols. For example, Taylor and Dionne (2000) noted that concurrent protocols seem to contain predominantly information on actions and their outcomes, while retrospective protocols seem to contain more “references to strategies that control the problem solving process” and “information such as the conditions that elicited a particular response” (cf. our categories of ‘why’, ‘how’, and ‘metacognitive’ information; Taylor & Dionne, 2000, p. 414). They reported only means and standard deviations, but an analysis of their data (all participants, N = 36) shows that the number of codes on their category of ‘actions’ was significantly higher in concurrent protocols, with a large correlation effect size r (Rosenthal, Rosnow, & Rubin, 2000) of .84, and that the numbers of codes on their categories of ‘conditional knowledge’, ‘beliefs’, and ‘strategy acquisition knowledge’ (cf. our categories of ‘why’, ‘how’, and ‘metacognitive’ information) were significantly higher in retrospective protocols (large correlation effect sizes r of .91, .93, and .94, respectively)2. Kuusela and Paul (2000) found that concurrent protocols contained more information than retrospective protocols, because the latter often contained only references to the effective actions that led to the solution3. This might be a result of participants’ selectively reporting the correct solution steps to represent their performance as better than it actually was, but a more likely explanation is that only the correct steps that led to attainment of the goal are stored in long-term memory, since only these steps are relevant for future use. Thus, because retrospective reporting requires the retrieval of episodic memories from long-term memory, reports can be subject to forgetting, which may explain why less information on actions is elicited with this method. Besides forgetting, another problem with retrospective reports is fabricating, that is, the off-line reporting of information that was not actually part of the on-line process. It is unclear whether this might explain the finding that retrospective protocols seem to contain more ‘why’, ‘how’, and ‘metacognitive’ information, as this knowledge might also have been used during the process but omitted in concurrent reporting as a result of the greater processing demands of this method (see Russo, Johnson, & Stephens, 1989).
What is clear, however, is that neither of these methods seems able to provide a comprehensive picture of the problem-solving process in terms of ‘action’, ‘why’, ‘how’, and ‘metacognitive’ information.

A technique is needed that combines the advantages of concurrent reporting (i.e., more ‘action’ information) and retrospective reporting (i.e., more ‘why’, ‘how’ and ‘metacognitive’ information). Of course, one could use both methods in a complementary fashion (Camps, 2003; Ericsson & Simon, 1993; Taylor & Dionne, 2000). However, there is a methodological risk involved in retrospective reporting on the same task that was

2 The authors gratefully acknowledge Dr. K. Lynn Taylor’s efforts at providing the effect sizes for the data reported in Taylor and Dionne (2000).
3 Kuusela and Paul (2000) used a between-participants design; however, since they did not report standard deviations, effect sizes could not be computed based on the data presented in that article. Our attempts to obtain the required data were unsuccessful.


carried out while reporting concurrently, because the episodic memory retrieved and reported might be the memory of the concurrent verbalizations instead of the memory of the process. We propose that a method of cued retrospective reporting, using a combined record of eye movements and mouse/keyboard operations that is superimposed on the problem as a cue, might be able to combine the advantages of both other methods.

In cued retrospective reporting, participants are instructed to report retrospectively based on a record of observations or intermediate products of their problem-solving process, which they are shown to cue their memories of this process. This is known to lead to better results due to less forgetting and/or fabricating of thoughts than plain retrospective reporting (Van Someren et al., 1994). More importantly for our purposes, cued retrospective reporting based on a cue that shows participants’ actions might lead to more actions being reported, without losing the retrospective nature and its associated information types. A videotape of the problem-solving session can be used as a cue of actions, but for computer-based tasks in domains in which visual inspection plays a key role (e.g., troubleshooting electrical circuit simulations, numerically controlled machinery programming), a cue consisting of a combined record of eye movements and mouse/keyboard operations that is superimposed on the problem can be expected to lead to better results. Because eye movements reflect cognitive processes (Lauwereyns & d’Ydewalle, 1996; Rayner, 1998), they might cue the participants to report on those processes, including visual problem-solving actions that would not be observable on videotape or in a record of mouse/keyboard operations. For example, when troubleshooting electrical circuit simulations, the problem-solving process would (ideally) start with surveying the circuit, and diagnosing possible malfunctioning components. Such activities would only become visible in a record of eye movements, and using such a record superimposed on the problem-solving task as a cue can lead to these actions and the underlying ‘why’ and ‘how’ knowledge being reported, even if this concerns implicit (tacit) knowledge (Lauwereyns & d’Ydewalle, 1996).

Support for the assumptions with regard to this type of cue comes from the work of Russo (1979) and Hansen (1991). Although Russo (1979) did not distinguish between different types of information, he found that retrospective reporting based on a record of eye fixations resulted in protocols that contained more words and were of longer


duration than concurrent and retrospective protocols4. Russo et al. (1989) conducted a study that might have provided interesting information on the use of eye movements as a cue. They intended to compare errors of fabricating and forgetting in different types of retrospective reporting: based on the initial stimulus, the response, or on a record of eye movements. Unfortunately, however, the use of an improper instruction for retrospective reporting based on eye-movements made comparisons between this type and the other two types pointless. Hansen (1991) compared the number of cognitive, manipulative (physical actions on the PC, e.g., typing) and visual operational (visual actions on the PC, e.g., reading) comments elicited with retrospective reporting cued by either a record of eye movements or a video recording. Although both methods resulted in the same number of comments on cognitive operations, the condition with the video record as a cue resulted in the highest number of manipulative comments, whereas the condition with the eye movement record as a cue yielded more visual operational comments. So, for computer-based tasks that rely on visual inspection, a combined record of mouse/keyboard operations and eye movements is expected to lead to the best results, because it can trigger memory of physical as well as visual cognitive actions.

The question addressed in this study concerns the differences in the amounts of ‘action’, ‘why’, ‘how’, and ‘metacognitive’ information elicited with the methods of concurrent reporting, retrospective reporting and cued retrospective reporting. The tasks used are computer-simulated electrical circuit troubleshooting problems. It was expected that concurrent protocols would contain more ‘action’ information than retrospective protocols (cf. Kuusela & Paul, 2000; Taylor & Dionne, 2000), and that retrospective protocols would contain more ‘why’, ‘how’ and ‘metacognitive’ information than concurrent protocols (cf. Taylor & Dionne, 2000). By combining the advantages of concurrent and retrospective protocols, cued retrospective protocols were expected to reveal the most comprehensive picture of the problem-solving process by a) providing more ‘action’ information than retrospective protocols, and b) providing more ‘why’, ‘how’ and ‘metacognitive’ information than concurrent protocols.

4 Russo (1979) used a within-participants design, and hence, effect sizes could not be computed based on the means and standard deviations presented in that article. Our attempts to obtain the required data were unsuccessful.


Method

Design

A within-participants design with four conditions was used: concurrent reporting, retrospective reporting, cued retrospective reporting, and concurrent reporting with eye tracking. Each condition was paired with two tasks, and the conditions were counterbalanced, resulting in four sequences to which participants were randomly assigned. The concurrent reporting with eye tracking condition was added to gather data outside the scope of this article. Note, therefore, that other data from the participants in this study are reported in Van Gog, Paas, and Van Merriënboer (2005).

Participants

Participants were 26 students (17 male, 9 female; age range 17-21 years) in their fifth year of pre-university education or their first or second year of higher professional education, all of whom had good eyesight, either uncorrected or corrected with hydrophilic contact lenses. They had studied at least the basic theory of electricity, so each participant knew how circuits and the individual components should function. They volunteered to participate and received € 12.50 after the experiment.

Apparatus and Materials

Registration of eye movements. A 50 Hz video-based remote eye-tracking device from SensoMotoric Instruments (SMI) was used to record eye movements. This camera with infrared source was placed under the 21-inch screen of the stimulus PC, located in the recording room. By means of an adjustable forehead rest that was placed in front of the screen, participants’ eyes were positioned at approximately 70 cm from the screen’s center. On a connected PC in the adjoining observation room, I-View software (SMI) was used to operate the camera and to calibrate the eye-tracking system. To enable the experimenter to perform the necessary actions on the stimulus PC when calibrating the system from the observation room, an extra mouse, keyboard and monitor located in the observation room were connected to the stimulus PC. Participants’ eye movements and mouse and keyboard operations were registered and replayed using GazeTrackerTM software (Lankford, 2000). A one-way screen enabled the experimenter to observe the recording room from the observation room, and microphones enabled verbal communication between both rooms. These microphones were attached to a digital audio-recorder to enable recording of participants’ verbal reports.


Troubleshooting tasks. The troubleshooting tasks consisted of computer-simulated malfunctioning electrical circuits. They were constructed by a science teacher in the Crocodile Physics 1.5® software program, and were at a level of difficulty appropriate for fourth year higher general secondary education and pre-university education. The circuits contained at least a toggle switch, a lamp, a battery, a voltmeter and an ammeter. These components were supplemented in varying ways by other toggle switches, lamps, batteries, voltmeters, ammeters, and/or push switches, resistors, variable resistors, fuses, buzzers, and gears driven by constant speed motors.

Each circuit contained multiple faults; for example, batteries or meters could be connected the wrong way, components could be short-circuited by redundant wire, or the voltage or current could be too high or too low. The participants were informed that good functioning after repair meant that: a) all components with outwardly observable functions function visibly when the circuit is closed (e.g., lamps burn and gears turn visibly); b) the repaired circuit contains at least the same components, that is, components could be added to the initial, malfunctioning circuit, or could be changed, but not removed; c) all components are properly connected; and d) in case of multiple switches, the circuit functions appropriately at different switch (on/off) combinations.

The tasks were preceded by a general introduction that was intended to familiarize participants with the functioning of Crocodile Physics®. It contained a textual explanation of the program functions, an example of a simple circuit that participants had to copy so that they could practice with placing and connecting components, and an example of a simple troubleshooting task. The introduction also contained a simple task to practice concurrent reporting that was somewhat simpler than the actual experimental tasks, but not so simple that participants could solve it right away. So, they had time to familiarize themselves with thinking aloud, without being distracted by task demands. Furthermore, it contained a demonstration of an eye-movement recording (the cue), which showed the experimenter’s eye movements and mouse/keyboard actions on the same task that was used to practice concurrent reporting. Figure 1 gives an indication of what the cue looked like, by fictitiously displaying the replay of a record over one of the tasks in the cued retrospective reporting condition. The definition of a properly functioning circuit (as described above) was given halfway through as well as at the end of the introduction.


Figure 1 An example of the cue, showing the eye movements replayed over the task (replay of mouse/keyboard actions is not simulated here). Participants saw a red cross, which indicated their eye fixations, moving across the screen. This is fictitiously displayed here through multiple crosses connected by the thinner lines.

Verbal instructions. Instructions and prompts were worded in line with the standards described by Ericsson and Simon (1993). The instruction before the practice task in the introduction was “you should really think aloud, that is, verbalize everything that comes to mind, and not mind my presence in doing so, even when curse words come to mind for example, these should also be verbalized. Act as if you were alone, with no one listening, and just keep talking”. Whenever participants stopped verbalizing their thoughts, the experimenter prompted them after 5 seconds by saying, “Please try to keep talking.” For the concurrent reporting condition, the instruction was “please think aloud while you are working on the next two tasks”. The instructions for the retrospective and cued retrospective reporting conditions consisted of two parts. For the retrospective condition the first part was “Please complete this task, you can work in silence.” The second part was “This is the begin state of the task. Can you please tell me what you were thinking during problem solving?” The first part of the instruction to the cued


retrospective condition was “I am going to record your eye movements while you are working on the next task, so I will calibrate the eye-tracking system before you start. In a minute you will see a red square appearing on your screen, please follow it with your eyes.” After calibration: “Thank you, the system is calibrated. Please complete the next task, you can work in silence.” The second part of the instruction was “This is a record of your eye movements and your actions. I am going to replay it, please watch it and tell me what you were thinking during problem solving. If you want the record to pause, you can use the key F2, when you want it to proceed, you can press the ‘play’ button in the program menu.” When participants stopped verbalizing, the prompting procedure described above was used in each condition.

Procedure

Each participant was scheduled for an individual session of approximately 90 minutes. Participants were seated in the chair in front of the stimulus PC in the recording room and they were informed that they would start with an introduction to the program, before they could start working on the tasks. The experimenter also indicated that the tasks would have to be completed under different conditions, and that instructions regarding those conditions would be given at the right moment. If participants had no further questions, they were asked to put their head in the forehead rest and to adjust the chair so that they would be seated comfortably in that position when eye movements would be recorded (when eye movement recording was not necessary, they did not have to keep their head in the forehead rest). The experimenter then went into the observation room and participants started with the introduction.

After finishing the introduction practice task, the participants were asked whether they felt comfortable enough with the concurrent reporting procedure to do it ‘for real’, and the experimenter also judged whether this was the case before continuing. Although some participants initially needed quite a few prompts, neither the participants nor the experimenter deemed additional practice tasks necessary.

When they had finished the introduction, participants were informed that they were allowed to ask questions, but that questions related to the content of the tasks would not be answered, only those with regard to the functioning of Crocodile Physics®. Then they were given the appropriate instruction for the first task of their assigned sequence. The order of the instructions depended on the sequence of conditions participants were assigned to; instructions were given after they finished either one (in the retrospective and cued


retrospective conditions) or both (in the concurrent condition) tasks in a condition. In the retrospective and cued retrospective conditions, the first part of the instructions was given before each task, and the second part after each task. Before each task in the cued retrospective condition, the eye-tracking system was calibrated, and the recording of eye movements and mouse/keyboard actions was started with the GazeTrackerTM software. In the concurrent and retrospective conditions, only participants’ mouse/keyboard actions were recorded with the GazeTrackerTM software, which was started before both tasks (in the concurrent condition) or before each task (in the retrospective condition). After each task, a screen followed that indicated that they had to wait for the experimenter (in the retrospective and cued retrospective conditions) or for instructions. After retrospective reporting or cued retrospective reporting, the instructions for the next condition were given face-to-face by the experimenter. The concurrent and retrospective verbalizations were recorded on digital audiocassettes.

Data Analysis

Coding scheme. The coding scheme was developed based on our definitions of information types, and refined by analyzing samples of the protocols. The task-oriented sub and main categories, which were used in the later analyses, are shown in the Appendix. In addition, categories like ‘questions asked’, ‘answers given’, ‘program-related remarks’ and ‘no code’ (i.e., unintelligible) were included. Protocols were segmented at a small grain size: each sentence, or utterance preceded and followed by a pause, was considered a separate segment. If a segment clearly contained two types of information, it received two codes. Two raters were familiarized with the experimental tasks and program, so that they could meaningfully interpret possible references to the task or program in the protocols. They scored 20 percent of the protocols, which were not used in the refining phase of the coding scheme, with an inter-rater reliability of .79 (Cohen’s kappa). When considering only the task-oriented main categories, the inter-rater reliability was also .79 (Cohen’s kappa). Since the inter-rater reliability was sufficiently high (i.e., higher than .70; Van Someren et al., 1994), one rater scored the remaining protocols.
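Cohen’s kappa, used here as the inter-rater reliability index, corrects the observed proportion of agreement for the agreement expected by chance from each rater’s marginal totals. A minimal sketch; the codes below are invented for illustration, not actual protocol data:

```python
from collections import Counter

def cohens_kappa(rater1, rater2):
    """Cohen's kappa for two raters coding the same segments:
    kappa = (p_o - p_e) / (1 - p_e), with p_o the observed agreement and
    p_e the chance agreement implied by each rater's marginal totals."""
    assert len(rater1) == len(rater2) and rater1
    n = len(rater1)
    p_o = sum(a == b for a, b in zip(rater1, rater2)) / n
    c1, c2 = Counter(rater1), Counter(rater2)
    p_e = sum(c1[cat] * c2[cat] for cat in c1) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Invented main-category codes assigned by two raters to eight segments:
r1 = ["action", "action", "why", "how", "meta", "action", "why", "action"]
r2 = ["action", "action", "why", "why", "meta", "action", "why", "action"]
print(round(cohens_kappa(r1, r2), 2))
```

Because kappa discounts chance agreement, it is a stricter criterion than raw percentage agreement, which is why .70 is commonly cited as the threshold for sufficient reliability.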

Dependent variables. For each participant, the number of codes on the task-oriented main categories, ‘action’, ‘why’, ‘how’, and ‘metacognitive’ information per method, was calculated by summing the number of codes on the constituting subcategories (see Appendix).


Results

The data were analyzed using Friedman’s tests with Conover’s (1999) procedure for comparisons (an equivalent of the parametric Fisher’s least significant difference procedure), with a .05 significance level (two-tailed). Table 1 shows the differences per method in the median values of the number of codes assigned to each category of information. Table 2 shows the sums of ranks.

Table 1
The first quartile, median and third quartile values for the number of codes assigned to each information type per reporting method

                        Concurrent              Retrospective           Cued Retrospective
Information Type    1st     Median  3rd     1st     Median  3rd     1st     Median  3rd
Action              17.50   39.00   65.75   12.00   16.00   22.25   19.00   28.50   36.75
Why                  8.75   12.00   18.25    7.00    8.00   11.25    6.00   11.00   14.25
How                  2.00    3.00    5.25    0.00    1.00    1.25    1.00    2.50    4.25
Metacognitive        0.75    3.00    6.25    1.00    2.00    3.00    2.75    5.50    7.00

Table 2
Sums of ranks for each information type per reporting method

Reporting Method        Action   Why     How     Meta
Concurrent              62.92    61.62   61.10   47.58
Retrospective           38.48    42.38   37.44   42.38
Cued retrospective      54.60    52.00   57.46   66.04

Note. Meta = metacognitive

The Conover procedure is based on the difference between the sums of ranks. A difference is significant when the following inequality (Conover, 1999, p. 371) holds:

|R_j − R_i| > t_{1−α/2} [2(bA_1 − Σ_{j=1..k} R_j²) / ((b−1)(k−1))]^{1/2}

where R_i and R_j are the sums of ranks being compared, b is the number of participants, k is the number of conditions, A_1 is the sum of the squared ranks over all participants and conditions, Σ R_j² is the sum of the squared rank sums over the k conditions, and t_{1−α/2} is the critical value of Student’s t distribution with (b−1)(k−1) degrees of freedom.

The number of codes on ‘action’ information differed significantly between the three methods (χ² = 12.51, df = 2, p < .01). The Conover procedure showed that in order to be significant at the .05 level, the difference between sums of ranks should exceed 12.56. As can be inferred from Table 2, for ‘action’ information, the difference between concurrent protocols and retrospective protocols was 24.44, and that between retrospective and cued retrospective protocols was –16.12; hence, both were significant. The direction of those differences can be inferred from Table 1. In line with our hypothesis, the number of codes on ‘action’ information in concurrent protocols was higher than in retrospective protocols, and was higher in cued retrospective protocols than in retrospective protocols.
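The critical differences used here can be computed directly from the rank sums. A minimal sketch (not the chapter’s own analysis code): A_1 is computed below for the no-ties case, so the resulting value differs somewhat from the reported 12.56, which was based on the actual, partly tied, ranks; the critical t value is passed in rather than looked up from a table.

```python
import math

def conover_critical_difference(rank_sums, b, t_crit):
    """Critical difference for pairwise comparisons after a Friedman test
    (Conover, 1999): a difference |R_j - R_i| between two rank sums is
    significant when it exceeds
    t_crit * sqrt(2 * (b * A1 - sum of R_j^2) / ((b - 1) * (k - 1))).
    A1, the sum of squared within-participant ranks, is computed here for
    the no-ties case: A1 = b * k * (k + 1) * (2k + 1) / 6."""
    k = len(rank_sums)
    a1 = b * k * (k + 1) * (2 * k + 1) / 6
    sum_rj_sq = sum(r * r for r in rank_sums)
    df = (b - 1) * (k - 1)
    return t_crit * math.sqrt(2 * (b * a1 - sum_rj_sq) / df)

# Rank sums for 'action' information from Table 2 (b = 26 participants,
# k = 3 reporting methods); 2.009 approximates t(.975) at df = 50.
print(round(conover_critical_difference([62.92, 38.48, 54.60], 26, 2.009), 2))
```

With the no-ties A_1, this yields approximately 12.98 for ‘action’ information, close to (but, because of tied ranks in the data, not identical with) the 12.56 reported above.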

Significant differences between the methods were also found in the number of codes on ‘why’ (χ² = 7.15, df = 2, p < .05), ‘how’ (χ² = 15.49, df = 2, p < .01), and ‘metacognitive’ (χ² = 12.51, df = 2, p < .01) information. The Conover procedure showed that in order to be significant at the .05 level, the difference between sums of ranks should exceed 13.52 for ‘why’ information, 11.06 for ‘how’ information, and 12.05 for ‘metacognitive’ information. As can be inferred from Table 2, on ‘why’ and ‘how’ information, there was a significant difference between concurrent and retrospective protocols (differences in sums of ranks of 19.24 and 23.66, respectively). However, Table 1 reveals that this effect was in the direction opposite to the predicted one: concurrent protocols contained more ‘why’ and ‘how’ information than retrospective protocols. On ‘metacognitive’ information, there was no significant difference (the difference in sums of ranks was 5.20).

Given this unexpected finding that retrospective protocols did not contain more ‘why’, ‘how’, and ‘metacognitive’ information than concurrent protocols, it does not make much sense to test our initial hypothesis that cued retrospective protocols, like retrospective protocols, would contain more ‘why’, ‘how’, and ‘metacognitive’ information than concurrent protocols. We will therefore test whether cued retrospective protocols, like concurrent protocols, also contain more ‘why’ and ‘how’ information, and not significantly different or more ‘metacognitive’ information, than retrospective protocols.

From Table 2 it can be inferred that there was no significant difference in the number of codes on ‘why’ information in cued retrospective and retrospective protocols (the difference in sums of ranks was -9.62), but there was a significant difference on ‘how’ and ‘metacognitive’ information (differences in sums of ranks of –20.02 and -23.66, respectively). Table 1 reveals that cued retrospective protocols contained a higher number of codes on ‘how’ and ‘metacognitive’ information than retrospective protocols.


Discussion

Regarding our hypothesis that concurrent reporting would provide more ‘action’ information than retrospective reporting and that retrospective reporting would yield more ‘why’, ‘how’ and ‘metacognitive’ information than concurrent reporting, only the first part was confirmed. Unexpectedly, concurrent reporting not only resulted in more ‘action’ information, but also in more ‘why’ and ‘how’ information than retrospective reporting, and there was no significant difference in the amount of ‘metacognitive’ information provided by these methods. A possible explanation for this unexpected finding may lie in the fact that we first segmented the protocols based on utterances and then coded the segments, whereas Taylor and Dionne (2000) seem to have coded meaningful episodes (in which case the segmenting and coding processes become intertwined).

Our hypothesis that cued retrospective reporting, like concurrent reporting, would elicit more ‘action’ information than retrospective reporting was confirmed. Given the unexpected finding that concurrent reporting yielded a higher amount of ‘why’ and ‘how’ information, and a not significantly different amount of ‘metacognitive’ information, than retrospective reporting, we tested whether cued retrospective reporting would also result in more ‘why’, ‘how’ and ‘metacognitive’ information than retrospective reporting. This was indeed the case for ‘how’ and ‘metacognitive’ information, but there was no significant difference between cued retrospective and retrospective reporting in the amount of ‘why’ information elicited.

Some critical observations must be made with regard to this study. First of all, retrospective reports are known to be sensitive to fabrication. On the one hand, in cued retrospective reporting, fabrication of thoughts related to actions may be less likely because of the cue. On the other hand, the cue might lead to an active reconstruction of thoughts as a result of re-viewing one’s own problem-solving process, instead of reporting based solely on memory. Based on the present study, it is not possible to distinguish between purely memory-based reporting and active reconstruction processes. This issue should be addressed in future research, especially since it might explain why cued retrospective reporting did result in more ‘metacognitive’ information being reported than retrospective reporting, whereas concurrent reporting did not. Secondly, we decided not to use probes in retrospective and cued retrospective reporting, but only prompts, and kept instructions and prompts as similar as possible across conditions to avoid possible bias. Although this approach is less of a threat to reliability and validity, the use


of probes is often seen as a great benefit of retrospective reporting and does not have to influence reliability and validity when probes are neutral, non-evaluative and do not restrict the participant’s reporting (Ericsson & Simon, 1993; Taylor & Dionne, 2000; Van Someren et al., 1994). This implies that the full potential of both retrospective reporting and cued retrospective reporting may not have been realized in this study. Thirdly, some technical aspects in cued retrospective reporting were not entirely optimal. Because of the real-time replay of the record, the speed was equal to that of the actual problem-solving session. It might be interesting to investigate whether and how slower replay or participant control over replay would affect the results.

In sum, both concurrent reporting (with the exception of ‘metacognitive’ information) and cued retrospective reporting (with the exception of ‘why’ information) resulted in a higher number of codes on the different types of information than retrospective reporting. These results suggest that the method of cued retrospective reporting based on a record of eye movements and mouse/keyboard operations superimposed on the task may be a promising one for eliciting a broad range of information about solving computer-based problems. It seems worthwhile to further investigate the possible benefits of this method in relation to concurrent reporting. Some important qualitative questions should be addressed, such as whether or not concurrent and cued retrospective reports capture the same content, and whether the methods are differentially effective for different groups of participants (e.g., different levels of expertise). Another qualitative issue that would be interesting to study is whether the different communicative nature of concurrent and cued retrospective reporting (e.g., the latter might be perceived as more “conversational”) influences the content of the protocols. If this were the case, it would be another methodological issue to address in future studies, in addition to the ones described above.


Appendix
The task-oriented categories of the coding scheme. Subcategories are grouped under their main category; each entry describes the kind of remarks it covers (“remarks relating to…”), followed by an example utterance.

Action
1. Survey: the organization of the circuit. Example: “This one has three lamps, two connected in parallel and one in series”
2. Practical evaluation of initial state: a practical evaluation of the initial state. Example: “Nothing happens”
3. Evaluation of information in initial state: evaluation of information from the Crocodile function, or components in the initial state. Example: “The current here is 12 Volt”
4. Try: trying the circuit. Example: “Let’s see what it does”
5. Execute: actions/operations. Example: “I am now adding a lamp”
6. Practical evaluation of changed state: a practical evaluation of the executed solution step. Example: “Damn, the lamp still explodes”
7. Evaluation of information in changed state: evaluation of information from the Crocodile function, meters or wires. Example: “The lamp got 12 Volt, while its maximum is 9 Volt”

Why
8. Theoretical evaluation of initial state: a theoretical evaluation of the initial state. Example: “The voltage is very low, the lamp won’t burn”
9. Theoretical evaluation of executed solution step: a theoretical evaluation of an executed solution step. Example: “Now it should work”
10. Theoretical evaluation of possible solution step: a theoretical evaluation of a possible but not executed solution step. Example: “But if I lower the voltage, the other lamp gets too little”
11. Theory about circuits in general: theory about circuits in general. Example: “In parallel connections each branch gets the same voltage”
12. Theory about specific parts of circuits: theory about components of circuits. Example: “An ammeter should be connected in series”
13. Definition of problem: the problem. Example: “This battery is connected in the opposite direction from the others”
14. Definition of solution: a (possible) solution. Example: “The resistance should be higher”
15. Exclusion of possibilities: exclusion of possible causes. Example: “The batteries are properly connected, so that isn’t the problem”

How
16. Evaluation of previous problem states: evaluation of previous problem states. Example: “When the resistor was set at less Ohm, this lamp exploded”
17. Heuristics: the use of heuristics. Example: “I always check first whether all components are connected properly”
18. General approach to/searching for a solution: a general approach to or searching for a solution. Example: “How can I do this?”
19. Specific approach to a solution: a specific approach to a solution. Example: “Then I am going to check this first”

Metacognitive
20. Goal-orientation: goal orientation. Example: “All lamps should burn visibly”
21. Self-evaluation-knowledge: the evaluation of knowledge. Example: “Remarkable how much I remember”
22. Self-evaluation-actions: the evaluation of actions. Example: “This was not a smart thing to do”
23. Self-evaluation-strategy: the evaluation of strategy. Example: “I am going in the right direction”
24. Task evaluation: the evaluation of the task. Example: “This one is easy”


Addendum

Expertise-Related Differences in the Way Participants Experience Cued Retrospective, Concurrent, and Retrospective Reporting

Differences in expertise might influence the way participants experience a reporting method. It is known, for example, that concurrent reporting may become difficult when performing a task imposes a very high cognitive load (Ericsson & Simon, 1993). Since the cognitive load imposed by a task is related to the level of expertise of the performer (Kalyuga, Ayres, Chandler, & Sweller, 2003), expertise may influence the experience of the reporting method and, possibly, the results obtained with it. Participants’ opinions about the methods were obtained via debriefing questions at the end of the experiment, and the answers from the five highest and five lowest expertise participants in our sample were explored.

The reader is referred to Chapter 4 for a description of how those participants were selected; suffice it to say here that a) the two groups differ significantly on attained performance, invested mental effort, and time-on-task on all eight experimental tasks (the higher expertise group had a significantly higher performance score, combined with significantly lower investment of time and effort), and b) this distinction in levels of expertise indicates a relative difference between participants from our sample, rather than an absolute classification, hence we will use the terms ‘lower expertise’ and ‘higher expertise’.

The three open-ended debriefing questions asked at the end of the (within-subjects) experiment were simply: “How did you feel about concurrent reporting?”, “How did you feel about retrospective reporting?”, and “How did you feel about cued retrospective reporting?”. Participants’ answers were recorded and qualitatively analyzed for indications of a positive or negative experience, preference for a method, and factors that mediate experience and/or preference.

Most of the ‘lower expertise’ participants found concurrent reporting a negative experience (four out of five), and preferred cued retrospective reporting over concurrent reporting and retrospective reporting (four out of five). Most of the ‘higher expertise’ participants did not find any of the methods a negative experience (four out of five; only one of them indicated a negative experience, with cued retrospective reporting), nor did they seem to have a clear preference for a method. Across both groups, time-on-task and the cue were mentioned as mediating factors for preference of method. With regard to time-on-task, it was indicated that the cue would


be most useful in tasks of longer duration, because for tasks of shorter duration they could remember easily what they had done and what they were thinking (which is why one of the ‘higher expertise’ participants found cued retrospective reporting a negative experience). With regard to the cue, it was indicated that seeing the replay of their eye movements and mouse/keyboard operations helped participants to remember what they were thinking, although some of them indicated that the “red cross” that showed their eye fixations moved across the screen too fast to respond to.

The finding that the ‘lower expertise’ participants found thinking aloud a negative experience is in line with Ericsson and Simon’s (1993) remark that concurrent reporting may be difficult when the task poses a high demand on the cognitive system; those participants invested significantly more mental effort than the ‘higher expertise’ group. The finding that they preferred cued retrospective reporting may be due to the influence of time-on-task, which participants mentioned as a factor mediating their experience and preference: it was indicated that the cue would be most useful in tasks of longer duration, and ‘lower expertise’ participants had a significantly higher time-on-task.

These findings may have important implications for deciding which methods to use with which participants (high/low expertise) and tasks (short/longer duration). Future research with larger groups of participants and larger differences in expertise should establish whether or not differences in experience influence the quantitative and qualitative results obtained with concurrent and cued retrospective reporting methods. Depending on the outcome, it might be preferable to use cued retrospective instead of concurrent reporting when participants have a low level of expertise.


Chapter 4

Uncovering Expertise-Related Differences in Troubleshooting Performance: Combining Eye Movement and Concurrent Verbal Protocol Data¹

This study explored the value of eye movement data for uncovering relatively small expertise-related differences in electrical circuit-troubleshooting performance, and describes that value in relation to concurrent verbal protocols. Results show that in the ‘problem orientation’ phase, higher expertise participants spent relatively more time, had a shorter mean fixation duration, and fixated more on a major fault-related component than lower expertise participants. In the ‘problem formulation’ part of the ‘problem formulation and action decision’ phase, the mean fixation duration of the higher expertise participants was longer. In the ‘action evaluation and next action decision’ phase, higher expertise participants spent relatively more time than the lower expertise participants. Over the different phases, only the mean fixation duration of the higher expertise participants differed significantly. The relation between the eye movement and concurrent verbal protocol data is qualitatively described. The results are discussed in perspective of the combined value of eye tracking and concurrent reports for expertise research and instructional design.

Technical troubleshooting, that is, diagnosing and repairing faults in a technical system, is considered a complex process to carry out and learn. Effective performance of a troubleshooting task requires adequate domain, system, and strategic knowledge (organized in mental models), and adequate reasoning based on this knowledge (Gitomer, 1988; Schaafstal, Schraagen, & Van Berlo, 2000). By definition, experts possess much more extensive knowledge than non-experts, and in addition, their knowledge is more effectively organized and better accessible in long-term memory (Chi, Glaser, & Farr, 1988; Ericsson & Lehmann, 1996), which makes their performance more efficient.

For example, expert troubleshooters’ system knowledge allows them to exhibit failure-based reasoning when troubleshooting familiar systems. That is, their mental models contain a large body of knowledge of previously encountered failures that caused similar malfunctioning, which allows them to diagnose and act based on that similarity. In the absence of such extensive experience, non-experts have to rely on system-based reasoning, that is, they have to build a mental representation of the system and use that

¹ This chapter was published as Van Gog, T., Paas, F., & Van Merriënboer, J. J. G. (2005). Uncovering expertise-related differences in troubleshooting performance: Combining eye movement and concurrent verbal protocol data. Applied Cognitive Psychology, 19, 205-221.


representation to reason about the system’s behavior in order to diagnose possible faults. The same goes for experts when troubleshooting unfamiliar systems; however, in those cases the amount and organization of their domain knowledge allow them to build a representation faster, and their strategic knowledge allows them to apply effective strategies (e.g., the structured approach to troubleshooting described by Schaafstal et al., 2000). Non-experts have to rely on weaker, domain-general strategies (e.g., means-ends analysis, which relies on backward reasoning; Patel, Arocha, & Kaufman, 1994; Patel, Groen, & Norman, 1993; Sweller, 1988). Evidence that the amount and organization of experts’ domain knowledge support a faster construction of mental system representations comes from the work of Egan and Schwartz (1979). They found that more skilled technicians formed chunks of components in schematic drawings according to their functional units, and were able to recall more information than less skilled technicians after brief exposure to a drawing. In addition, chunking mechanisms (Gobet et al., 2001) reduce working memory load so that experts can devote more cognitive capacity to reasoning. Approximately 5-9 elements or chunks of information can be held in working memory simultaneously (and fewer when information is not only to be remembered, but also processed; Sweller, 2004). So, when the chunks that are formed or retrieved from long-term memory are larger (e.g., an entire functional unit instead of one component), more information can be held in working memory. Hence, an individual with more expertise is able to keep the same amount of information in working memory while using less capacity than an individual with less expertise, and can therefore devote more cognitive capacity to reasoning.

Although an extensive body of research exists on expert-novice differences in knowledge, memory, and performance in a substantial number of domains, including technical troubleshooting (e.g., in general: Chi et al., 1988; Ericsson & Lehmann, 1996; in troubleshooting: Gitomer, 1988; Schaafstal et al., 2000), much less is known about the relatively long intermediate phase on the developmental continuum from being a novice to becoming an expert in a domain². For most complex cognitive skills, this intermediate phase, in which students gradually acquire competence, can have a very long duration (cf. “the 10-year rule of necessary preparation” for attaining excellence; Ericsson & Lehmann, 1996). Consequently, there are expertise differences, or (sub)levels of less to more skilled performance, within this phase. Research on the differences in knowledge and performance at those sublevels of expertise will advance

² The medical domain is somewhat of an exception; see for example Rikers, Schmidt, and Boshuizen (2000).


expertise research. Moreover, such research on more subtle expertise differences is imperative for instructional design (Alexander, 2003; Van Gog, Ericsson, Rikers, & Paas, 2005). Instructional designers acknowledge that in order to foster students’ expertise development as far as possible during a formal instructional period, instruction should be adaptive to the individual learner’s level of expertise (Shute & Towle, 2003). However, educational research has not systematically addressed the questions of exactly how expertise in a domain develops, and what aspects of performance distinguish students at different (sub)levels of expertise (Alexander, 2003; Van Gog, Paas, & Van Merriënboer, 2005).

Uncovering Expertise-Related Performance Differences: Process-Tracing Techniques

For studying expertise-related differences in performance on complex cognitive tasks, process-tracing techniques are very promising because “… the data that are recorded are of a pre-specified type (e.g., verbal reports, eye movements, actions) and are used to make inferences about the cognitive processes or knowledge underlying task performance” (Cooke, 1994, p. 814; italics added). Verbal reports, such as concurrent (‘think aloud’) and retrospective reporting (Ericsson & Simon, 1993) are probably the most widely used process-tracing techniques. With the method of concurrent reporting, participants are instructed to “think aloud”, that is, verbalize everything that comes to mind, while they are working on a task. With the method of retrospective reporting, participants are instructed to verbalize the thoughts they had during problem solving immediately after finishing the task. Both methods can result in verbal protocols that allow for making inferences about cognitive processes, but to ensure validity of those inferences, the wording of verbalization instructions and prompts is crucial (Ericsson & Simon, 1993). Furthermore, their on-line (concurrent) and off-line (retrospective) generation may lead to differences in the kind of information contained in the protocols, and hence, the kind of inferences made (for an in-depth discussion of these methods, instructions, and results, see Ericsson & Simon, 1993).

Eye tracking, that is, recording eye movement data while participants are working on a task, is less commonly used as a process-tracing method. However, eye movement data provide insight into the allocation of attention and therefore allow (albeit cautious) inferences to be made about cognitive processes (Rayner, 1998). Attention can shift in response to exogenous or endogenous cues (Rayner, 1998; Stelmach, Campsall, & Herdman, 1997). Whereas exogenous shifts occur mainly in response to salient features in the environment, endogenous shifts are driven by knowledge of the task, of the


environment, and of the importance of information sources, and are therefore influenced by expertise (cf. Underwood, Chapman, Brocklehurst, Underwood, & Crundall, 2003)³. For example, Haider and Frensch’s (1999) information-reduction hypothesis, stating that with practice, people learn to ignore task-redundant information and limit their processing to task-relevant information, was corroborated by eye movement data, and in the domain of chess it was found that experts fixated proportionally more on relevant pieces than non-expert players (Charness, Reingold, Pomplun, & Stampe, 2001). Furthermore, eye movement data can provide information about the cognitive load particular cognitive processes impose. For example, pupil dilation (Van Gerven, Paas, Van Merriënboer, & Schmidt, 2004) and fixation duration (Underwood, Jebbett, & Roberts, 2004) are known to increase with increased processing demands (task difficulty), whereas the length of saccades decreases (for an in-depth discussion of eye movement data and cognitive processes, see Rayner, 1998).

So, eye movement data presumably have the potential to show differences in the problem-solving process at a more fine-grained level than verbal protocol data, as well as to provide information about the cognitive demands of those processes that cannot be inferred from verbal protocols. Therefore, the combination of eye movement and concurrent verbal protocol analysis to obtain insight into the content of ongoing cognitive processes and the cognitive demands they impose may be especially useful when investigating relatively small expertise differences.

This study aims to explore the value of eye movement data, and of the combination of these data with concurrent verbal protocol data, for discovering expertise-related differences in troubleshooting performance between students at lower and higher sublevels of expertise in the early intermediate phase. In secondary education science curricula, simple (and nowadays often computer-simulated) troubleshooting tasks are used to teach and test students’ ability not only to memorize certain principles (e.g., Ohm’s law), but also to understand how they work and to use this understanding to reason about a technical system’s behavior. The troubleshooting tasks used in this study consist of malfunctioning computer-simulated electrical circuits. In the process of solving such problems, it is possible to distinguish the following phases based on the physical (i.e., mouse/keyboard) actions taken on the circuit:

³ It is generally held that a movement of attention precedes the movement of the eyes, but it must be noted that although this is the case in response to exogenous cues, findings by Stelmach et al. (1997) suggest that this may not always be so in response to endogenous cues.


• Problem orientation: orienting to the circuit;
• Problem formulation and action decision: formulating a problem description and deciding on the first action, which involves diagnosis of possible causes;
• Action evaluation and next action decision: evaluating the outcome of the action and deciding on the next, which again involves diagnosis; and
• Evaluation: a final evaluation of the problem solution (e.g., in terms of outcome versus costs).

Because the diagnosed and tested possible cause may not be the actual or only cause of the malfunctioning, troubleshooting is often a cyclic process (even for individuals with a substantial amount of expertise), and therefore the third phase will be repeated until the problem is solved and the final solution can be evaluated. Note that in phases 2 and 3, two cognitive processes presumably take place (‘subphases’), although these cannot be distinguished as separate phases based on the physical (i.e., mouse/keyboard) actions taken on the circuit.

This study centers on the first three phases of the process: ‘problem orientation’, ‘problem formulation and action decision’ (first action), and ‘action evaluation and next action decision’ (second action). Given that all participants are in the early intermediate phase, we expect the higher expertise participants to spend more time on problem orientation, because they will try to build a mental representation of the problem, as well as on problem formulation, because they can use their mental models in combination with the representation to reason about how this specific circuit should function if it were intact. In contrast, lower expertise participants will find it harder to build a representation and be more likely to test the functioning of the circuit and try to use this information to generate hypotheses⁴. Higher expertise participants might spend relatively more time on deciding on actions and evaluating them, because they might try to consider the impact of their actions and will evaluate (or monitor) whether these have brought them closer to their goal, whereas lower expertise participants would be more likely to try what comes to mind and use the outcome to generate new hypotheses, without a clear sense of the goal. This refers back to the problem orientation and formulation: if they do not consider how the circuit should function if it were intact, they have not explicitly and concretely defined the goal state, which makes monitoring difficult. With

⁴ Had the participants been further in the intermediate phase, one would expect the opposite pattern, since then both lower and higher expertise participants would likely build representations, but the higher expertise participants would have been faster at this.


regard to processing demands as reflected in fixation duration, we expect orientation and evaluation to be less cognitively demanding than reasoning, that is, formulating the problem and deciding on actions, and we expect all these processes to be more demanding for lower than for higher expertise participants. In line with the findings by Haider and Frensch (1999) and Charness et al. (2001), we hypothesize that in the ‘problem orientation’ phase, higher expertise participants will have a higher proportion (percentage) of fixations on, and gaze switches between, components related to major faults in the electrical circuit. Finally, we will qualitatively describe the relation between the eye movement and concurrent verbal protocol data, to get an indication of the unique contribution eye movement data may make to the investigation of cognitive processes.
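The two fixation-based measures hypothesized above, the proportion of fixations on fault-related components and the mean fixation duration, can be sketched as follows. The fixation records and the component bounding box are hypothetical illustrations; they do not reflect the actual GazeTracker data format.

```python
# Sketch of two fixation measures: proportion of fixations on fault-related
# components (defined by bounding boxes) and mean fixation duration.
# Coordinates, durations, and the battery box are invented for illustration.

def in_box(x, y, box):
    """True if point (x, y) lies inside box = (left, top, right, bottom), in pixels."""
    left, top, right, bottom = box
    return left <= x <= right and top <= y <= bottom

def fixation_measures(fixations, fault_boxes):
    """fixations: list of (x, y, duration_ms) tuples; fault_boxes: list of boxes."""
    on_fault = [f for f in fixations
                if any(in_box(f[0], f[1], box) for box in fault_boxes)]
    proportion = len(on_fault) / len(fixations)
    mean_duration = sum(f[2] for f in fixations) / len(fixations)
    return proportion, mean_duration

# Three fixations; two fall on the (hypothetical) battery bounding box.
fixations = [(100, 120, 250), (400, 300, 180), (105, 130, 300)]
battery_box = (90, 110, 150, 160)
prop, mean_dur = fixation_measures(fixations, [battery_box])
```

The same bounding-box logic would extend to counting gaze switches between fault-related components, by tracking which box each successive fixation falls in.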

Method

The data reported here were collected as part of a larger experiment (see Chapter 3). The first aim of this experiment was to compare three verbal methods of knowledge elicitation (concurrent reporting, retrospective reporting, and cued retrospective reporting based on a record of eye movements and mouse/keyboard operations) on the types of problem-solving information they elicited, and to study the possible influence of expertise on the results. The second aim was the one reported here: to investigate the value of eye movement data for uncovering expertise-related performance differences and to describe that value in relation to concurrent verbal protocols. These data were obtained in the fourth condition included in the experiment: concurrent reporting with eye tracking.

Design

The experiment was set up as a within-subjects, balanced Latin square design, resulting in four sequences of reporting methods and tasks, to which participants were randomly assigned (e.g., the first sequence was: concurrent reporting, tasks 1 and 2; concurrent reporting with eye tracking, tasks 3 and 4; retrospective reporting, tasks 5 and 6; cued retrospective reporting, tasks 7 and 8). The task reported on here was the first of the concurrent reporting with eye tracking condition. It was the same for each participant, irrespective of their assigned sequence, although it could be either the first, third, fifth, or seventh task they worked on.


Participants

Participants were students in their fifth year of pre-university education or in their

first or second year of higher professional education. All participants had uncorrected good eyesight or good eyesight when corrected with hydrophilic contact lenses, and all had studied at least the basic theory of electricity (i.e., all of them knew the relevant domain principles and the function of the circuit components). Participation was voluntary and was rewarded with € 12.50 after the experiment. From the entire group of 26 participants, the 5 with the highest and the 5 with the lowest expertise were selected for this study, based on a measure of performance efficiency on all tasks in the experiment. This measure consists of a combination of performance, mental effort, and time-on-task scores (see Table 1 for these scores and the ‘apparatus and materials’ section for a description of the measures), and is based on the rationale that students of higher expertise are able to obtain equal or higher performance with lower investment of time and mental effort (Anderson & Fincham, 1994; Kalyuga & Sweller, 2005; see also Yeo & Neal, 2004)⁵.

Table 1
Means and standard deviations of the lower and higher expertise participants’ mental effort, performance, and time-on-task scores on all experimental tasks

                            Lower expertise      Higher expertise
                            M        SD          M        SD       Significant (2-tailed)
Mental effort (scale 1-9)   5.28     .42         3.63     .54      U = 0, p = .008
Performance (scale 1-7)     4.55     .40         5.69     .34      U = 0, p = .008
Time-on-task (seconds)      359.98   45.46       163.29   55.47    U = 0, p = .008
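The U = 0, p = .008 values in Table 1 follow from complete separation of two groups of n = 5: with no overlap between the groups, the exact two-tailed Mann-Whitney p equals 2/252 ≈ .008. A minimal sketch that recomputes this, using invented scores rather than the study’s raw data:

```python
# Exact Mann-Whitney U test by enumerating all possible group assignments of
# the pooled scores. With two non-overlapping groups of 5, U = 0 and the
# exact two-tailed p is 2 / C(10, 5) = 2/252, which rounds to .008.
from itertools import combinations

def u_statistic(a, b):
    """Mann-Whitney U for sample a: pairs (x, y) with x < y count 1, ties 0.5."""
    return sum(1.0 if x < y else 0.5 if x == y else 0.0 for x in a for y in b)

def exact_two_tailed_p(a, b):
    """Exact two-tailed p: share of assignments at least as extreme as observed."""
    pooled = a + b
    n1, n2 = len(a), len(b)
    extremeness = lambda u: min(u, n1 * n2 - u)  # distance to the nearest tail
    observed = extremeness(u_statistic(a, b))
    hits = total = 0
    for idx in combinations(range(n1 + n2), n1):
        g1 = [pooled[i] for i in idx]
        g2 = [pooled[i] for i in range(n1 + n2) if i not in idx]
        total += 1
        hits += extremeness(u_statistic(g1, g2)) <= observed
    return hits / total

# Hypothetical mental effort ratings: every rating in the higher expertise
# group is lower than any rating in the lower expertise group.
lower = [4.8, 5.1, 5.3, 5.6, 5.9]
higher = [3.1, 3.4, 3.6, 3.9, 4.2]
p = exact_two_tailed_p(lower, higher)  # 2/252, rounds to .008
```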

So, the ‘higher expertise’ group consisted of the 5 participants with the highest

performance efficiency. All of them were in the first or second year of higher professional education. One of them had worked on sequence 1, two had worked on sequence 2, one on sequence 3, and one on sequence 4. The ‘lower expertise’ group consisted of the 5 participants with the lowest performance efficiency. Four of them

⁵ The performance efficiency scores of all participants were calculated by subtracting the mean standardized mental effort score (zE) and the mean standardized time-on-task (zT) from the mean standardized performance score (zP), and dividing the outcome by the square root of 3:

(zP − zE − zT) / √3

These mean standardized scores were based on the scores on all tasks in the experiment. The reader is referred to Tuovinen and Paas (2004) for a discussion of this formula in relation to determining the efficiency of instructional conditions.
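The performance efficiency formula in footnote 5 can be sketched as follows, using illustrative raw scores for four hypothetical participants (not the study’s data):

```python
# Sketch of the performance efficiency measure: standardize performance,
# mental effort, and time-on-task within the group, then compute
# (zP - zE - zT) / sqrt(3) per participant. Raw scores are invented.
from math import sqrt
from statistics import mean, pstdev

def z_scores(values):
    """Standardize against the group mean and (population) standard deviation."""
    m, sd = mean(values), pstdev(values)
    return [(v - m) / sd for v in values]

def performance_efficiency(performance, effort, time_on_task):
    zP = z_scores(performance)
    zE = z_scores(effort)
    zT = z_scores(time_on_task)
    return [(p - e - t) / sqrt(3) for p, e, t in zip(zP, zE, zT)]

# Higher performance with less effort and time yields higher efficiency:
performance = [5.0, 6.0, 4.0, 7.0]            # scale 1-7
effort = [4.0, 3.0, 6.0, 2.0]                 # scale 1-9
time_on_task = [200.0, 150.0, 300.0, 120.0]   # seconds
eff = performance_efficiency(performance, effort, time_on_task)
```

By construction, the efficiency scores of a group sum to (approximately) zero, since each set of z-scores does.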


were in the fifth year of pre-university education, and one was in the first year of higher professional education. Two of them had worked on sequence 1, one had worked on sequence 2, one on sequence 3, and one on sequence 4.

Apparatus and Materials

Troubleshooting task. All troubleshooting tasks in the experiment consisted of malfunctioning electrical circuits, constructed and offered in a simulation program, Crocodile Physics®, version 1.5. A science teacher constructed them at the difficulty level of fourth-year higher general secondary education and pre-university education. All circuits contained at least the following components: a toggle switch, a lamp, a battery, a voltmeter, and an ammeter. Those components were supplemented in varying ways by the following components: toggle switches, push switches, lamps, batteries, voltmeters, ammeters, resistors, variable resistors, fuses, buzzers, and gears driven by constant speed motors.

Each of the circuits contained multiple faults. For example, components (like batteries and meters) could be connected in the wrong way, components could be short-circuited, and there could be problems with the voltage or current, resulting from too high or too low power supplies (batteries) or resistance (resistors). Participants were instructed to troubleshoot the circuits so that they would function properly: a) all components with outwardly observable functions should function visibly when the circuit was closed (e.g., lamps should burn and gears should turn visibly), b) the repaired circuit should contain at least the same components as the initial, malfunctioning circuit, that is, components could be added, or changed, but not removed, c) all components should be properly connected, and d) in the case of multiple switches, the circuit should function when all switches were in the “on” position.

In the introduction preceding the tasks, participants were acquainted with the functioning of Crocodile Physics® and with thinking aloud by means of a simple practice task. No domain-specific information was provided either in the introduction or with the tasks.

The first troubleshooting task of the concurrent reporting with eye tracking condition is shown in Figure 1. The first major fault in this circuit is that the voltage of the battery (3V) is far too low to provide enough power for three lamps (which have a maximum rating of 9V). Hence, when the functioning of the circuit is tried by pressing the switch, participants will see that the lights do not burn and can read the low current from the ammeter. A second major fault is that lamp 1 is short-circuited by the switch, so that


even if the battery voltage is raised, this lamp will only burn when the switch is open. A minor fault is that the meters are connected in the wrong direction (as indicated by the ‘-’ symbol). To repair the circuit (in correspondence with the definition of proper functioning), the meters should be connected in the right direction, the switch and lamp 1 should be connected in series, and the voltage of the battery should be raised to the range of 15V to 22V to make all lights burn visibly (or alternatively, the resistance could be reduced by lowering the value of the resistor and raising the voltage within a lower range; e.g., in the case of setting the resistor to 10 Ohm, the voltage should be raised to the range of 12V to 14V). So, the task allows multiple approaches to reaching a correct solution, by allowing the actions to be carried out in a different order, and by allowing choice between different options to reach the same goal.

Figure 1

The troubleshooting task (text added)

Performance rating. Participants’ task solutions were scored on the following aspects: a) functioning at each switch position, b) intensity (e.g., of the lamps, of motor rotation), c) optimal functioning (e.g., in the case of a variable resistor and a lamp, the resistor really had to work as a dimmer for the lamp, not just as a resistor), d) proper connection of components, e) proper direction of meters, f) no unnecessary addition of components, g) no extreme values of components. Not all aspects applied to each task,


for example, when the meters were already properly directed, the proper direction of meters was not relevant. Also, for some tasks certain aspects could have a maximum score of two points instead of one; when a circuit had multiple switches, for example, two points could be gained for aspect ‘a’. For each of the tasks, a maximum score of 7 points could be gained. To determine the reliability of the scoring form, two raters scored the performance of 15 participants. The reliability was .94 (intraclass correlation coefficient 3,2; Shrout & Fleiss, 1979), and the internal consistency was .97 (Cronbach’s alpha).
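The scoring scheme described above can be sketched as a simple sum over the aspects applicable to a given task. The aspect names and per-task point values below are hypothetical examples, not the actual scoring form:

```python
# Sketch of the 7-point performance rating: each task has a set of applicable
# aspects whose maximum points sum to 7; the score is the sum of points
# earned on those aspects, capped at each aspect's maximum.
def performance_score(earned, applicable):
    """earned: dict aspect -> points earned; applicable: dict aspect -> max points."""
    return sum(min(earned.get(aspect, 0), maximum)
               for aspect, maximum in applicable.items())

# A hypothetical task with two switch positions (aspect 'a' worth two points):
applicable = {"a_functioning_per_switch_position": 2, "b_intensity": 1,
              "c_optimal_functioning": 1, "d_proper_connection": 1,
              "e_meter_direction": 1, "g_no_extreme_values": 1}
earned = {"a_functioning_per_switch_position": 2, "b_intensity": 1,
          "d_proper_connection": 1, "e_meter_direction": 1}
score = performance_score(earned, applicable)  # 5 out of a maximum of 7
```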

Mental effort. Participants indicated the amount of mental effort it took them to complete the task on a 9-point rating scale ranging from 1 “very, very low effort” to 9 “very, very high effort”, which is known to be a reliable measure of experienced mental effort (Paas, 1992; Paas, Tuovinen, Tabbers, & Van Gerven, 2003).

Registration of eye movements. Participants’ eye movements were recorded with a 50 Hz video-based remote eye-tracking device (RED) from SensoMotoric Instruments (SMI) with an angular resolution of less than 0.5°. This infrared camera was placed under the 21-inch screen of the stimulus PC, located in a recording room. The resolution of the stimulus PC’s screen was set at 1024 x 768 pixels. The size of the diagram on the screen was 740 x 480 pixels. An adjustable forehead rest was placed in front of the screen, so that the participant’s eyes were positioned at a distance of approximately 70 cm from the center of the screen. On a PC in an adjoining observation room, I-View software (SMI) operated the camera and the calibration of the eye-tracking system. An extra mouse, keyboard and monitor were connected to the stimulus PC and located in the observation room. This enabled the experimenter to perform the necessary actions on the stimulus PC when calibrating the system from the observation room. GazeTracker™ software (Lankford, 2000) ran on the stimulus PC to register participants’ eye movements and their mouse and keyboard operations. These registration files also enabled determination of time-on-task. The recording room was visible from the observation room through a one-way screen, and microphones that were attached to a digital audio-recorder enabled verbal communication between both rooms and the recording of participants’ verbalizations.

Procedure

Before they started working on the task, the eye-tracking system had to be calibrated, so the experimenter asked the participants to place their head in the forehead rest and instructed the participants: “I am going to record your eye movements while you are working on the next task, so I will calibrate the eye-tracking system before you start. In a minute you will see a red square appearing on your screen, please follow it with your eyes”. After calibration, the experimenter instructed the participants: “Thank you, the system is calibrated. Please think aloud while you are working on the next task”. The GazeTrackerTM software was started to record participants’ eye movements and mouse/keyboard operations. Participants’ verbalizations were recorded on digital audiocassettes (and were transcribed after the session). After they finished the task, participants went to the next screen, where they indicated their perceived mental effort on the 9-point rating scale, and then waited for instructions from the experimenter.

Data Reduction and Analysis

GazeTrackerTM saves the eye movement and mouse/keyboard data in a Microsoft Access database. Based on the mouse-click data in participants’ databases, it was determined how much time they spent on each phase. The first phase, ‘problem orientation’, started at 0 s and ended when the participant tried the functioning of the circuit by pressing the switch. The second phase, ‘problem formulation and action decision’, started at the end of the first phase and ended when the participant initiated the first action (repair) on the circuit. The third phase, ‘action evaluation and next action decision’, started after completion of the first action (so there was a time lag between the end of the second phase and the onset of the third, during which the action was carried out), and ended with the initiation of the second action. The time participants spent on each phase was converted to a percentage of their total time on task.
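The segmentation rule described above can be sketched as follows, assuming timestamped interaction events have been extracted from the registration files; the event names and times below are hypothetical.

```python
# Hypothetical timestamped events (seconds from task onset) extracted
# from the mouse-click data in a participant's database.
events = {
    "switch_pressed": 9.2,        # ends phase 1 (problem orientation)
    "first_action_start": 24.6,   # ends phase 2
    "first_action_end": 31.0,     # action itself (gap between phases 2 and 3)
    "second_action_start": 47.5,  # ends phase 3
}
total_time = 180.0  # hypothetical total time on task (s)

phases = {
    "1: problem orientation": (0.0, events["switch_pressed"]),
    "2: problem formulation & action decision":
        (events["switch_pressed"], events["first_action_start"]),
    "3: action evaluation & next action decision":
        (events["first_action_end"], events["second_action_start"]),
}
for name, (start, end) in phases.items():
    dur = end - start
    print(f"phase {name}: {dur:.1f} s ({100 * dur / total_time:.1f}% of task)")
```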

To be able to roughly distinguish between the problem formulation and action decision subphases in phase 2, or between the action evaluation and next action decision subphases in phase 3, these phases were split half-way (end-time minus start-time divided by two), and the mean fixation duration for the first and second half of phases 2 and 3 was calculated.
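A minimal sketch of this half-split, with hypothetical fixation data (onset in seconds, duration in milliseconds), assuming each fixation is assigned to a half by its onset time:

```python
# Hypothetical fixations within one phase: (onset_s, duration_ms) pairs.
fixations = [(10.1, 220), (10.5, 180), (11.2, 340), (12.0, 410), (12.9, 300)]
phase_start, phase_end = 10.0, 13.0
midpoint = phase_start + (phase_end - phase_start) / 2  # end minus start, halved

first = [d for t, d in fixations if t < midpoint]
second = [d for t, d in fixations if t >= midpoint]
print(f"mean fixation duration, first half : {sum(first) / len(first):.1f} ms")
print(f"mean fixation duration, second half: {sum(second) / len(second):.1f} ms")
```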

We used a dispersion-based method of fixation identification (Salvucci, 1999), and identified fixation points by a minimum number of 3 gaze points that fell within a certain dispersion, that is, were grouped within a radius of 40 pixels, and together had a minimal duration of 100 ms. Using GazeTrackerTM, the fixation data for each phase were exported to Microsoft Excel. This resulted in an overview of the coordinates and duration of each fixation and (where applied) the LookZone (Area of Interest) in which a fixation fell, as well as “summary” data such as the number of fixations, mean fixation duration, possible tracking time lost, and percentage of time fixated (where LookZones were applied, these “summary” data were also provided for each LookZone). The areas around the circuit components related to the primary faults (the battery, lamp 1, and the switch) were defined as LookZones. The LookZones were 115 by 146 or 146 by 115 pixels, depending on whether the components were horizontally or vertically located.
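The fixation identification and LookZone assignment described here can be sketched roughly as follows. This is not GazeTrackerTM's actual algorithm, but a simple dispersion-based variant under the stated criteria (at least 3 consecutive samples within a 40-pixel radius of their centroid, for at least 100 ms; at 50 Hz each gaze sample spans 20 ms); the gaze samples and zone coordinates are hypothetical.

```python
RADIUS_PX = 40    # dispersion radius criterion
MIN_POINTS = 3    # minimum number of gaze points per fixation
MIN_DUR_MS = 100  # minimum fixation duration
SAMPLE_MS = 20    # one sample per 20 ms at 50 Hz

def within_radius(points):
    """True if all points lie within RADIUS_PX of their centroid."""
    cx = sum(x for x, y in points) / len(points)
    cy = sum(y for x, y in points) / len(points)
    return all((x - cx) ** 2 + (y - cy) ** 2 <= RADIUS_PX ** 2
               for x, y in points)

def identify_fixations(samples):
    """samples: list of (x, y) gaze points. Returns (cx, cy, duration_ms)."""
    fixations, i = [], 0
    while i < len(samples):
        j = i + 1
        while j < len(samples) and within_radius(samples[i:j + 1]):
            j += 1
        n = j - i
        if n >= MIN_POINTS and n * SAMPLE_MS >= MIN_DUR_MS:
            pts = samples[i:j]
            cx = sum(x for x, y in pts) / n
            cy = sum(y for x, y in pts) / n
            fixations.append((cx, cy, n * SAMPLE_MS))
            i = j
        else:
            i += 1
    return fixations

def in_lookzone(fx, fy, zone):
    """zone: (left, top, width, height) in pixels, e.g. a 115 x 146 AOI."""
    left, top, w, h = zone
    return left <= fx <= left + w and top <= fy <= top + h

# 8 stable samples near (200, 300) = a 160 ms fixation, then a saccade
gaze = [(200 + d, 300 - d) for d in (0, 2, 1, 3, 2, 1, 0, 2)] + \
       [(420, 500), (600, 120)]
fixs = identify_fixations(gaze)
battery_zone = (150, 250, 115, 146)  # hypothetical LookZone around the battery
for cx, cy, dur in fixs:
    print(f"fixation at ({cx:.0f}, {cy:.0f}), {dur} ms, "
          f"on battery: {in_lookzone(cx, cy, battery_zone)}")
```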

Results

Because of the small sample sizes, non-parametric tests were used for the analyses. First, the ‘higher expertise’ and ‘lower expertise’ participants’ relative time spent in a phase and mean fixation duration in a phase (and, for phase 1, also the percentage of fixations on the battery and the number of gaze switches between the switch and lamp 1) were compared using Mann-Whitney U tests. For these analyses, the exact 2-tailed significance is reported. Given our initial hypotheses, a less stringent significance level of .10 is used to avoid type II errors (i.e., to increase power). Second, the ‘lower expertise’ and ‘higher expertise’ participants’ fluctuation of mean fixation duration over time was analyzed using a Friedman test for K related samples with a Nemenyi post-hoc procedure. Third, the concurrent verbal protocol data were qualitatively related to the eye movement data.

Phase 1: Problem Orientation

Table 2 shows the means and standard deviations for the absolute time spent, the relative time spent, the number of fixations, and fixation duration in each phase. The medians and quartiles for the percentage of time and the mean fixation duration data are summarized in Table 3.

The ‘higher expertise’ participants spent relatively more time on this phase than the ‘lower expertise’ participants (U = 4.0, p = .095). The mean fixation duration in this phase was higher for the ‘lower expertise’ participants (U = 4.0, p = .095). The median of the ‘higher expertise’ participants’ percentage of fixations on the battery, relative to the total number of fixations on the diagram⁶ in this phase, was 10.53 (interquartile range [IQR] = 5.00 – 15.88); that of the ‘lower expertise’ participants was 5.45 (IQR = .00 – 7.69). The ‘higher expertise’ participants had a higher percentage of fixations on the battery than the ‘lower expertise’ participants (U = 4.0, p = .087). The median number of the ‘higher expertise’ participants’ gaze switches between the switch and lamp 1 was 2.0 (IQR = .5 – 4.0); for the ‘lower expertise’ participants it was 1.0 (IQR = .0 – 1.0). The number of gaze switches of the ‘higher expertise’ participants was not significantly higher, although the difference approached significance (U = 5.5, p = .175).

⁶ This excludes fixations outside the diagram, for example those on the toolbar.

Phase 2: Problem Formulation and Action Decision

The relative time spent on this phase did not differ significantly between the ‘higher expertise’ participants and the ‘lower expertise’ participants (U = 6.0, p = .222), and neither did the mean fixation duration in this phase (U = 6.0, p = .222). In the problem formulation part of phase 2 (2.1), the mean fixation duration of the ‘higher expertise’ participants was higher than that of the ‘lower expertise’ participants (U = 4.0, p = .095). However, in the action decision part of phase 2 (2.2), the mean fixation duration did not differ significantly between the ‘higher expertise’ participants and the ‘lower expertise’ participants (U = 12.0, p = 1.000).

Phase 3: Action Evaluation and Next Action Decision

The ‘higher expertise’ participants spent relatively more time on this phase than the ‘lower expertise’ participants (U = 3.0, p = .056). However, the mean fixation duration did not differ significantly between the ‘higher expertise’ and the ‘lower expertise’ participants, either in the entire phase (U = 6.0, p = .222), or in the action evaluation (U = 12.0, p = 1.000) or next action decision (U = 6.0, p = .222) parts of this phase.

Table 2
Means and standard deviations of the lower and higher expertise participants’ absolute time spent, relative time spent, number of fixations made, and mean fixation duration per phase

               Time (s)                      Time (%)
           L             H             L             H
Phase    M      SD     M      SD     M      SD     M      SD
1        7.48   6.70   10.14  5.07   2.55   3.07   7.15   2.94
2        14.94  6.31   22.40  21.41  4.40   3.44   17.78  20.68
2.1      n.a.   n.a.   n.a.   n.a.   n.a.   n.a.   n.a.   n.a.
2.2      n.a.   n.a.   n.a.   n.a.   n.a.   n.a.   n.a.   n.a.
3        12.66  12.87  15.07  7.30   4.38   5.21   10.87  5.09
3.1      n.a.   n.a.   n.a.   n.a.   n.a.   n.a.   n.a.   n.a.
3.2      n.a.   n.a.   n.a.   n.a.   n.a.   n.a.   n.a.   n.a.
Note. L = lower expertise; H = higher expertise; n.a. = not applicable


Table 2 (continued)

           Mean Nr. Fixations            Mean Fixation Duration (ms)
           L             H               L              H
Phase    M      SD     M      SD       M      SD      M      SD
1        19.60  20.07  23.40  13.63    255.8  86.5    195.5  17.8
2        35.40  24.24  50.80  56.79    261.7  54.9    343.9  85.3
2.1      21.60  13.07  24.50  28.52    227.1  27.2    355.5  123.6
2.2      13.80  11.63  26.30  28.79    346.2  135.7   370.1  146.6
3        23.40  13.83  32.40  22.32    254.3  75.0    308.2  60.6
3.1      11.60  8.11   16.00  11.45    285.2  84.7    281.6  68.5
3.2      11.60  6.07   16.40  11.44    247.0  90.1    346.2  100.7

Table 3
The first quartile, second quartile (median), and third quartile values for the relative time spent and mean fixation duration per phase

         Time (%)                                 Mean Fixation Duration (ms)
         Lower                Higher              Lower                  Higher
Phase    1st–2nd–3rd          1st–2nd–3rd         1st–2nd–3rd            1st–2nd–3rd
1        .47–1.03*–5.38       4.49–7.33*–9.70     202.1–217.5*–328.6     181.2–201.6*–206.9
2        1.66–2.85–7.92       4.39–7.34–36.40     219.8–245.6–311.6      271.7–317.7–429.3
2.1      n.a.                 n.a.                204.7–224.3*–250.9     255.2–348.4*–459.5
2.2      n.a.                 n.a.                241.7–285.0–481.3      256.0–330.9–503.6
3        .60–3.21*–8.76       6.46–9.76*–15.83    194.9–233.9–323.9      254.0–291.2–378.0
3.1      n.a.                 n.a.                214.2–252.0–372.8      211.2–302.6–341.4
3.2      n.a.                 n.a.                169.9–225.3–334.8      264.0–327.6–437.7
Note. Medians marked with * differ significantly between groups; n.a. = not applicable.

Mean Fixation Duration over Phases

Figure 2 shows the median values of the ‘lower expertise’ and ‘higher expertise’ participants’ mean fixation duration over time (these values are also shown in Table 3).

A Friedman test (2-tailed) showed no significant differences in mean fixation duration over the phases for the ‘lower expertise’ participants (χ² = 6.40, df = 4, p = .174). The ‘higher expertise’ participants’ mean fixation duration over the phases differed significantly (χ² = 10.40, df = 4, p = .022). A Nemenyi post-hoc procedure showed that the mean fixation duration in phase 1 was significantly lower than that in phases 2.1, 2.2, and 3.2, and that the mean fixation duration in phase 2.1 was significantly higher than that in phase 3.1.⁷

⁷ The same pattern emerges when looking at phases 2 and 3 in their entirety instead of at the subphases: there are no significant differences in mean fixation duration over the phases for the ‘lower expertise’ students (χ² = .40, df = 4, p = .954), and significant differences are found for the ‘higher expertise’ students (χ² = 7.60, df = 4, p = .024), with the mean fixation duration in phase 1 being significantly lower than that in the other phases.


[Figure 2: line graph of the median mean fixation duration (ms), y-axis 0–400, for the lower and higher expertise groups across phases 1, 2.1, 2.2, 3.1, and 3.2]

Figure 2
Fixation durations by expertise level (lower, higher) over the phases

Combining Eye Movement and Concurrent Verbal Protocol Data

In Table 4, participants’ verbalizations in Phases 1, 2, and 3 are reported. Verbalizations of one participant in the ‘higher expertise’ group are missing due to a recording error. A qualitative interpretation of the verbal protocol data is in line with the finding that ‘higher expertise’ participants spent relatively more time on the ‘problem orientation’ phase. These data suggest that the ‘lower expertise’ participants hardly oriented themselves and were immediately focused on testing the functioning of the circuit (e.g., ‘lower expertise’ participants 3, 4, and 5), whereas the ‘higher expertise’ participants seemed more inclined to make an inventory of the circuit’s components and predict its functioning before trying it (e.g., ‘higher expertise’ participants 1 and 3). Probably related to this predictive behavior is the finding that the ‘higher expertise’ participants devoted more attention (a higher proportion of fixations) to the battery, which was a major fault-related component. Even though the ‘higher expertise’ participants tended to engage in more thorough orientation, the comparison of mean fixation durations in this phase suggests that this phase involved more extensive processing for the ‘lower expertise’ participants. Furthermore, the within-subjects findings (Figure 2) suggest that for the ‘higher expertise’ participants this orientation involved less extensive processing than problem formulation and deciding on actions.


Table 4
Lower and higher expertise participants’ verbalizations per phase

L1
  Phase 1: Uh.. well…..
  Phase 2: Uhm ... ok let’s (incomprehensible)
  Phase 3: Uhm.....

L2
  Phase 1: Let’s see first what’s in the circuit ... hm. Try it..
  Phase 2: Hm, well, hm. hm yes, the ammeter is right
  Phase 3: That lamp here…

L3
  Phase 1: I check first whether the lamp works…
  Phase 2: The lamp.. none of the three lamps works.. first … I myself think that the power supply is too low. Check whether that’s the case
  Phase 3: So now I’m done…right? Uh..is it off now? Yes it’s off right? (Experimenter: “yes, the switch is open now”). Hm, that’s not good. … uhm … I will lower the power to 15

L4
  Phase 1: Yes. Ok. I’ll try first to see if it works. Probably not.
  Phase 2: .. Uhm.. let me think Yes, I think I will here..or wait, I’ll first raise the voltage, maybe there is too little
  Phase 3: Well, that was too much … (laughs). Let’s see if 12 … I can always raise it later ..

L5
  Phase 1: Ok I’ll see first what it does and does not do.
  Phase 2: Ok, it does next to nothing. ..-10 …-1V..I was thinking first … it needs more voltage …
  Phase 3: Nothing … again more ..

H1
  Phase 1: Ok…hm….Circuit. 3V. So nothing will happen when I close this, voltage is too low.
  Phase 2: Let’s just raise the voltage a little
  Phase 3: Then all three of them burn. Those glow a bit slow, uh, soft. Switch is still open…has to be closed

H2
  Phases 1–3: MISSING

H3
  Phase 1: Well, what do we have here? Just two lamps with all kinds…aha, yes, so this one’s wrong…oh no, this is a meter. Uh sorry, resistor. Two lamps. These are in series. In principle they would have to burn when I close this
  Phase 2: But the voltage is too low. What do I think…what do I think now? This one has too be raised. 3.5, 4.5 should it be. But oh, there’s also a resistor.
  Phase 3: Now see what it does. That lamp can only take 9V…and this one to. So when I’d connect them in series ….this one could …9.5 … yeah ..could be…they burn…

H4
  Phase 1: Ok..uhm.. first close this
  Phase 2: It does nothing at all.
  Phase 3: That’s more like it

H5
  Phase 1: Let’s see, a switch, so we’ll check first what will blow, then we can work towards that. I close it.
  Phase 2: And nothing burns. 10mA, that’s very strange, but let’s see … uh ..does it say on those lamps how much…oh. 6mA should go through it then …or am I wrong? Well, no I’m not wrong. Uh ..off… I think the current is too low, so I’ll raise the voltage here.
  Phase 3: Well, those two already burn a little…this one still does nothing. Because the switch is open of course. Those lamps, how bright should they burn, because..40mA.. 20...

Note. P = participant; L = lower expertise; H = higher expertise; m = milli; A = Ampère; V = Volt


It seems that the ‘higher expertise’ participants stated their problem formulation with somewhat more certainty than the ‘lower expertise’ participants, which might relate to the finding from the mean fixation duration data that the ‘higher expertise’ participants showed more extensive processing in the ‘problem formulation’ part of the ‘problem formulation and action decision’ phase.

With regard to the finding that ‘higher expertise’ participants spent relatively more time on the ‘action evaluation and next action decision’ phase, the verbal protocols suggest that this might be because ‘lower expertise’ participants did not evaluate the outcome of their action in as much detail as the ‘higher expertise’ participants did. Although the ‘higher expertise’ participants tended to engage in more thorough action evaluation, there were no significant differences in mean fixation duration between the groups. The within-subjects findings suggest that the ‘higher expertise’ participants’ action evaluation involved less extensive processing than problem formulation and deciding on actions.

Discussion

Our expectation that the ‘higher expertise’ participants would spend relatively more time on problem orientation, problem formulation, deciding on actions and evaluating them, was confirmed for the ‘problem orientation’ and ‘action evaluation and next action decision’ phases, but not for the ‘problem formulation and action decision’ phase. The verbal data suggest that the ‘lower expertise’ participants hardly oriented at all, but were immediately focused on testing the functioning of the circuit, whereas the ‘higher expertise’ participants tended to show predictive behavior during orientation, which may have helped to limit the relative time they spent on the ‘problem formulation and action decision’ phase. This is in line with our assumption that the mental models of individuals with more expertise are better developed.

The lower expertise students should have been capable of making a mental representation of the circuit by inventorying its components, which in turn would have allowed them to reason about the circuit’s behavior without testing it (given that they knew the basic principles and the function of components). That they favored the opposite approach, testing first and then starting to reason, may be because this imposed less cognitive load on them than constructing a mental representation. Probably, the ‘higher expertise’ participants’ more effective mental models allowed them to allocate capacity to the construction of a representation and even to mentally test the circuit and make predictions about its behavior, as well as to focus on critical components (the higher proportion of fixations on the battery; although the number of gaze switches between the switch and lamp 1 did not differ significantly). Apart from differences in cognitive load, this may also relate to differences in metacognitive knowledge. Individuals with less expertise are known to start working in the first direction that comes to mind and tend to stick to that direction, whereas individuals with more expertise are known to spend relatively more time before deciding on a direction and to evaluate (or monitor) whether their actions bring them closer to the desired goal (Schoenfeld, 1987).

Even though ‘higher expertise’ participants spent relatively more time on problem orientation (and possibly on action evaluation as well, although that cannot be inferred from these data), these phases seemed to impose lower processing demands on them than formulating the problem and deciding on actions. A possible explanation might be that shorter fixations signal perceptual encoding processes, whereas longer fixations represent problem-solving processes. Unexpectedly, for the ‘lower expertise’ participants all of these processes seemed to impose equal processing demands. Furthermore, with the exception of the ‘problem orientation’ phase, in which the ‘lower expertise’ participants did have a longer mean fixation duration (as we expected), the ‘higher expertise’ participants tended to have a longer mean fixation duration in all other phases, although this difference was only significant in the ‘problem formulation’ subphase. This is surprising, given that the ‘higher expertise’ participants had reported significantly lower overall mental effort than the ‘lower expertise’ participants. Apparently, the cognitive demands that are measured through the fine-grained fixation data and through the more global perceived mental effort scale differ.

Finally, a number of critical observations must be made with regard to this study. First of all, participants had been randomly assigned to task sequences, and all four sequences were represented in both the ‘lower expertise’ and the ‘higher expertise’ groups. Hence, our findings are not likely to be sequence artifacts. Second, a weak point of the use of eye movement (and mouse/keyboard) data is that it does not enable a distinction to be made between the cognitive processes of problem formulation or action evaluation on the one hand and action decisions on the other. In order to get some indication, we made a rough distinction by splitting phases 2 and 3 in half, but these results should be interpreted with caution. However, the verbal protocol data did not allow us to make a better distinction. Trying to define a split-point based on protocol data would run into the problems that participants may not always verbalize everything that comes to mind (despite clear instructions and practice), and that there may be a pause between utterances relating to different processes. In that case, one has the same problem of trying to decide where one process ends and the other begins, only at a smaller scale. Third, the performance efficiency measure we used seems adequate to distinguish between individuals from one participant group who differ significantly on those variables. However, this is a quite relative distinction, and it does not position an individual on an exact point of the continuum from novice to expert. Since expertise and instructional design research would benefit from a way to classify students at different sublevels of expertise, it would be interesting to study the potential of this measure at a larger scale. Fourth, it would be interesting to conduct more research on the relationship between the cognitive demands measured through eye movement data and more traditional measures of cognitive load, such as self-report or secondary task data. Finally, a replication with larger sample sizes is desirable, because these small sample sizes do not warrant any definite claims.

Nonetheless, the results of this study suggest that the combined use of verbal-protocol and eye-movement data can enhance our insight into (implicit) cognitive processes. Although concurrent verbal protocol data reveal important information on the content of the cognitive processes, eye fixation data provide insights into processing demands and into the specific content of processes (i.e., the allocation of attention to specific components), which cannot be obtained from verbal protocol data. Therefore, studies using a combination of eye movement and verbal protocol data may contribute to our knowledge of the microstructure of expertise development in a domain and provide valuable input for instructional design.


PART II

Process-Oriented Worked Examples



Chapter 5

Process-Oriented Worked Examples: Improving Transfer Performance through Enhanced Understanding¹

The research on worked examples has shown that for novices, studying worked examples is often a more effective and efficient way of learning than solving conventional problems. This theoretical paper argues that adding process-oriented information to worked examples can further enhance transfer performance, especially for complex cognitive skills with multiple possible solution paths. Process-oriented information refers to the principled (“why”) and strategic (“how”) information that experts use when solving problems. From a cognitive load perspective, studying the expert’s “why” and “how” information can be seen as constituting a germane cognitive load, which can foster students’ understanding of the principles of a domain and the rationale behind the selected operators, and their knowledge about how experts select a strategy, respectively. Issues with regard to the design, implementation, and assessment of effects of process-oriented worked examples are discussed, as well as the questions they raise for future research.

One of the main problems in instruction is the lack of transfer of skills. More often than not, students are unable to apply the knowledge and skills they have acquired to novel settings or novel problems. This lack of transfer can occur on different levels. It can be seen between settings, when students cannot apply what they have learned in one setting (e.g., school) to another setting (e.g., their first job). But it can also occur between different but related domains (e.g., Maths and Science) within the same setting (e.g., secondary education) and even between different problem categories within the same domain. Another problem is that time for instruction is often limited. Research on instructional design has therefore concerned itself with finding instructional methods that are more effective than “traditional methods” in terms of transfer performance and more efficient in terms of time and mental effort required during practice.

There are two different approaches to effective instructional strategies for bringing about transfer: process approaches and product approaches (Paas & Van Merriënboer, 1994a). Whereas product approaches only focus on the results of effective task performance, process approaches seek to attain transfer by having novices mimic experts’ problem-solving behavior during training. In air traffic control (ATC), for example, a result of effective task performance is the skill to direct the safe landing of a group of approaching planes. An example of a product approach to ATC training is practicing this skill with a simulation program. An example of a process approach is confronting learners with an expert’s approach, which they mimic during practice with the simulation program.

¹ This chapter was published as Van Gog, T., Paas, F., & Van Merriënboer, J. J. G. (2004). Process-oriented worked examples: Improving transfer performance through enhanced understanding. Instructional Science, 32, 83-98.

One line of research that has established an effective product approach to training is that on worked examples (for an overview, see Atkinson, Derry, Renkl, & Wortham, 2000; Sweller, Van Merriënboer, & Paas, 1998). We argue here that taking a process approach to instruction based on worked examples might further increase its effectiveness by improving transfer performance through enhanced understanding.

Effects of Studying Worked Examples on Learning and Transfer

Research on studying worked examples as opposed to solving problems has shown convincingly that problem solving is neither the most effective nor the most efficient form of instruction for novices in a domain. For novices, studying worked examples (which can be alternated with solving problems; see, for example, Paas, 1992; Sweller, Chandler, Tierney, & Cooper, 1990) is more effective than solving the equivalent problems in terms of performance on transfer tests (e.g., Cooper & Sweller, 1987; Kalyuga, Chandler, Tuovinen, & Sweller, 2001; Paas, 1992; Paas & Van Merriënboer, 1994b; Sweller et al., 1990; Ward & Sweller, 1990). That it is also more efficient is shown by the fact that students need to invest less time and mental effort to reach this better transfer performance (e.g., Kalyuga et al., 2001; Paas & Van Merriënboer, 1994b).

The key to the effectiveness of worked examples is believed to lie in their prevention of unnecessary search and their promotion of schema construction (Sweller, 1988; Sweller et al., 1998). For students to be able to solve a problem, it is necessary that they recognize the structural problem features as belonging to a particular problem category and remember the related sequence of operators necessary to reach a solution. Since the information on problem types and their associated operators is stored in problem schemata, transfer performance relies heavily on the acquisition of appropriate schemata. Sweller’s Cognitive Load Theory (Paas, Renkl, & Sweller, 2003; Sweller, 1988; Sweller et al., 1998) differentiates between cognitive load related to the nature of the learning material, which is called intrinsic cognitive load, and cognitive load related to the presentation of the material and the required learning activities. The part of the latter that does not contribute to learning processes (due to poor instructional design) is called extraneous or ineffective cognitive load. The part that is relevant for learning processes is referred to as germane or effective cognitive load. According to Cognitive Load Theory, the weak methods (e.g., means-ends analysis) used by novices to solve goal-specific problems impose a high extraneous cognitive load, which hinders learning. These methods require so much cognitive capacity that little or no resources are left for processes relevant to schema construction. When studying worked examples, however, students’ attention can be entirely devoted to studying the operators necessary to obtain a problem solution and to noticing distinctions between types of problems and the associated operators. It is assumed that studying worked examples thus enhances the development of appropriate problem schemata, which in turn leads to better transfer performance.

This explanation, based on Cognitive Load Theory, is supported by the findings that students are indeed capable of performing well on transfer problems after studying worked examples: they are able to recognize structural problem features of the problem categories they have learned, and to apply the solution paths they have linked to those problem categories, even if the surface features of the problems differ. What can be doubted, however, is whether the developed problem schemata are such that they allow for ‘far’ transfer. Can students solve a problem from an unknown problem category by combining previously learned operators from other problem categories? For far transfer performance, understanding is critical (Byrnes, 1996; Gerjets, Scheiter, & Kleinbeck, 2004; Mayer & Wittrock, 1996; Ohlsson & Rees, 1991). As Ohlsson and Rees (1991) state, “procedures learned without conceptual understanding tend to be error prone, are easily forgotten, and do not transfer easily to novel problem types” (p. 104). We adopt their definition of understanding, holding that “understanding of a procedure must involve both knowledge of its domain and of its teleology” (p. 118). Knowledge of a domain consists of principled knowledge about objects and events in that domain, and knowledge of the teleology of a procedure is knowledge of the rationale behind or purpose of the steps in a procedure (Ohlsson & Rees, 1991).

In summary, students have to understand a procedure to be capable of solving problems from novel categories. They need to know the domain principles, why the solution steps are taken, and why they are performed in this particular order. In other words, they need to know why a particular structural problem feature is associated with a particular operator. It follows that the worked examples studied have to allow students to acquire schemata in which principled knowledge relevant for the application of procedures is integrated. The term mental models will be used to refer to such elaborate schemata. In Young’s (1983) terminology, mental models of good system users (experts) allow not only for performance (as simple problem schemata do), but also for learning (forming generalizations and retention) and reasoning (inventing new methods and explanations).

From the perspective of Young’s (1983) terminology, it can be argued that most of the worked examples that have been used in previous research do not in themselves allow for the development of mental models that enable performance, learning and reasoning, and therefore do not stimulate conceptual understanding and far transfer. That work has primarily taken what Paas and Van Merriënboer (1994a) refer to as a product approach, using examples that show a given state (i.e., the problem formulation), a sequence of operators (i.e., solution steps), and the goal state (i.e., the final solution; Renkl, 2002; Van Merriënboer, 1997). These examples confront learners with the “products” experts produce at each step in the problem-solving process: an applied operator leading to an altered problem state until the goal state is reached. One might therefore call these examples product-oriented worked examples. Information underlying the selection and application of operators is not necessarily integrated in a schema because it is not included in the examples. Only when students are stimulated and able to generate adequate explanations or justifications themselves (i.e., understand the selection and application of solution steps), does this information become part of their knowledge structures. Results from the research on self-explaining while studying examples (Chi, Bassok, Lewis, Reimann, & Glaser, 1989; Renkl, 1997, 2002) illustrate the importance of understanding the solution steps presented in worked examples.

Taking a process approach to the use of worked examples in instruction might have beneficial effects on understanding and far transfer performance. In line with the definition of understanding as knowledge of the principles governing a domain and the ability to use this knowledge to justify the procedure, the “why” information, which informs the learner of the rationale behind the selection and application of operators, consists of descriptive and predictive principles. Principled knowledge is declarative: it is goal-independent and concerns the relationships of objects or events with other objects or events (Ohlsson & Rees, 1991). An example from the domain of electricity is “electrons cannot pass through a voltmeter and therefore it should be connected in parallel”. The “how” information, which informs the learner of the strategic knowledge used by experts in selecting the operators, consists of a description of an expert’s systematic approach to problem solving (SAP) and the heuristics that accompany it. A SAP is a general, prescriptive plan for action that specifies a sequence of phases or subgoals, while heuristics are prescriptive principles, guidelines, or rules-of-thumb (Van Merriënboer, 1997). This knowledge is also declarative, and although it is directed at goals and actions, it has to be interpreted before it can be applied, in contrast to procedural information. A heuristic for troubleshooting electrical circuits is, for example, “When an electrical circuit does not seem to function, first check whether the circuit is closed”. Note that a SAP is general in nature: it provides guidelines that can be of use in reaching (sub)goals, but these do not guarantee a solution.

In summary, using process-oriented worked examples which include “why” and “how” information on the selection and application of operators used by expert problem solvers, could lead to mental models that support understanding and far transfer. Before turning to the design of these process-oriented worked examples, we take a closer look at results from the few studies in which successful attempts were made to redesign worked examples in order to enhance understanding, and at the results of studies investigating the effects of providing students with principled information in instructional materials.

Enhancing Understanding: Effects of Example Format and Principled Information

Example Formats that Enhance Understanding

In the domain of statistics, research has shown that example formats that aim at subgoal learning (Atkinson & Catrambone, 2000; Catrambone, 1996, 1998) and derivational (or modular) example formats (Gerjets, Scheiter, & Kleinbeck, 2004; Gerjets, Scheiter, & Catrambone, 2004) can enhance understanding and transfer performance. Catrambone (1996, 1998) defines subgoals as meaningful conceptual pieces of an overall procedure. Subgoals can help learners to solve novel problems that require different steps to reach the same subgoal, by limiting the search space to the few steps needed to reach the subgoal instead of an entire procedure to reach an overall goal state. Cues like visual isolation or labels that indicate which solution steps belong together can foster students’ self-explanations of the purpose of the group of steps, that is, foster subgoal learning (Catrambone, 1996, 1998). Another technique that fosters subgoal learning in statistics is the use of conceptual instead of computational equations (Atkinson & Catrambone, 2000).

The derivational (or modular) example format (Gerjets, Scheiter, & Kleinbeck, 2004; Gerjets, Scheiter, & Catrambone, 2004) aims at increasing transfer through step-by-step replication of a procedure instead of mapping it as a whole. A structural feature is explicitly converted into an aspect of the solution procedure, which gives learners insight into the purpose of the steps in the procedure and the possibility of recognizing the steps that require adaptation in novel problems. The commonly used transformational (or molar) example format, in contrast, explicitly identifies the structural problem features that assign a problem to a category, and because each category is represented by a formula for solving problems within that category, transfer problems are solved by assigning the problem to a category and mapping the procedure as a whole. The derivational (or modular) example format was found to be more effective and efficient than the transformational (or molar) format: it led to better transfer performance with less investment of time and effort (Gerjets, Scheiter, & Catrambone, 2004). So it seems that, at least for examples in the domain of statistics, an example design that aims at increasing understanding by enabling students to infer justifications for or purposes of solution steps is effective.

Enhancing Understanding by Providing Principled Information

A number of studies have investigated the effects of providing students with principled (declarative) information in worked examples (Atkinson & Catrambone, 2000; Catrambone, 1996; Gerjets, Scheiter, & Catrambone, 2004) and of providing this information in learning material studied before practice (Kieras & Bovair, 1984; Singley & Anderson, 1989). Catrambone (1996) included rule-based elaborations in worked examples, but whereas labeling was found to enhance transfer performance, rule-based elaborations did not. In another experiment (Atkinson & Catrambone, 2000), conceptual elaborations were used, but these did not seem to affect transfer performance either. In a recent experiment, Gerjets, Scheiter, and Catrambone (2004) investigated the effects of example format, example elaborations, and prior knowledge, and found that example format affected transfer performance the most.

Singley and Anderson (1989) indicate that they initially concluded that transfer is use-specific, but later had to refine that statement after studying the role of declarative information in experiments in the domains of calculus and logic. They found that “declarative knowledge provides a basis for transfer between different uses of the same knowledge” (p. 220). Different procedures can be inferred from the same declarative knowledge. They do note, however, that the beneficial effects of declarative knowledge on transfer might be restricted to the early phases of training. Kieras and Bovair (1984) compared the effects of providing students with declarative “how-it-works” information (a device model) before learning the operating procedures of a fictitious device versus having them learn the operating procedures by rote. Learning with a device model was found to be far more effective and efficient, because the device model provided a meaningful context from which learners could understand and infer the procedures.

How can these mixed results be explained? It is likely that a number of factors interact to determine the effectiveness of including principled information. Kieras and Bovair (1984) suggest that providing users with “how-it-works” information may not be necessary when the device that they have to operate is very simple, since simplicity reduces the need to understand and infer the operations. It is possible that this applies to problem categories as well: adding principled information to very simple problems may not have a beneficial effect. Another possibility is that students do not attend to this information. As Renkl (2002) indicates, instructional explanations (i.e., explanations provided by persons with more expertise) are very often ineffective and inferior to self-explanations. He attributes this to three aspects of these instructional explanations: the fact that they are often not adapted to learners’ prior knowledge; timing (instructional explanations seem to be effective only when integrated in an ongoing activity, and therefore timing is critical); and a generation effect (self-generated information is better remembered). With regard to the timing aspect, drawing students’ attention to the explanations is probably equally if not more critical than timing for integration in problem solving and reasoning. In Renkl’s (2002) study, the instructional explanations were provided on learner demand, were kept minimal but contained the possibility of progressive help, and focused on principles. In general, the explanations were used moderately by the students. One group of students who did not use them had high prior knowledge and probably did not need them, but another group had low prior knowledge and might have benefited from the explanations. The students who did use them had low prior knowledge, and those who used both the minimalist and the extensive explanations benefited from them, as shown by higher transfer performance (Renkl, 2002).
Whether or not these findings might explain the failure to find an effect of rule-based and conceptual elaborations in Catrambone’s studies (Atkinson & Catrambone, 2000; Catrambone, 1996) is as yet unclear.

In summary, the practical tests of the assumption that the provision of principled information during learning is important for understanding have led to mixed results. Whether these mixed results can be explained by an interaction between prior knowledge, task complexity, and example design is unclear. It seems important for the development of process-oriented worked examples, however, to further investigate and fine-tune the interplay of task complexity, students’ prior knowledge, and example design. With regard to example design, the timing of information presentation, challenging students to attend to the information, and integrating this information in the example seem important factors.

Design of Process-Oriented Worked Examples

For the design of process-oriented worked examples it is important to consider what skills are to be trained, since this partly determines the usefulness of principled or strategic knowledge. Complex cognitive skills consist of multiple related constituent skills. Constituent skills can be either recurrent, that is, they should be performed as algorithmic, rule-based behavior after training, or non-recurrent, that is, they have to be performed in varying ways across problem situations (Van Merriënboer, 1997). The aspects of a task that should be performed as recurrent, algorithmic skills after training have a narrow problem space, and correct application of a particular set of operators associated with a problem type always leads to a correct solution. Therefore, problem solving is a matter of recognizing the appropriate set of operators instead of selecting one amongst many possibilities. On the other hand, the aspects of a task that should be performed as non-recurrent skills after training have multiple possible solution paths, so one needs a strategy to narrow the search space and select those operators that are most likely to lead to a solution. For the task aspects that are to be performed as recurrent skills, adding “why” information in order to “explain” the function of the operators is useful. Because of the algorithmic nature of these elements, there is little strategic knowledge involved and there is no need for “how” information. For the aspects that are to be performed as non-recurrent skills, both “why” and “how” information is useful. Figure 1 shows the elements of process-oriented worked examples for modeling non-recurrent and recurrent aspects of a task for training a complex cognitive skill.

As an example, consider a training program for secondary school students aimed at the complex skill of troubleshooting electrical circuits (see also Kester, Kirschner, & Van Merriënboer, 2004). The desired exit behavior of this training is that students are able to repair a malfunctioning electrical circuit by reasoning about the system principles. An example of a recurrent constituent skill for this complex skill is properly connecting voltmeters and ammeters (in parallel and in series connection, respectively). An example of a non-recurrent constituent skill is reasoning about the location and function of a resistor in the specific circuit at hand. As Kester et al. (2004) note: “although the principles of series and parallel connections are always the same, the features of specific series or parallel connections determine their influence on current and current intensity in the circuit, and therefore the results are always different”. See Figure 2 for a process-oriented worked example of a malfunctioning circuit.

Figure 1. Elements of process-oriented worked examples for modeling recurrent and non-recurrent aspects of a complex cognitive skill. Note: The darker gray area encloses the elements of product-oriented worked examples.

Figure 2. Process-oriented worked example of a complex cognitive task: Troubleshooting an electrical circuit.

Page 78: Uncovering the Problem-Solving Process to Design Effective ......Merriënboer, Schuurman, De Croock, & Paas, 2002) of worked examples during practice. Prompting students to self-explain

78

Cognitive Load Theory and Example Design

According to Cognitive Load Theory (Sweller et al., 1998), instructional designers should aim at creating instructional formats that diminish extraneous cognitive load and increase germane cognitive load, within the threshold of available cognitive resources. Instructional formats that have proven successful at this are discussed in this section, and process-oriented worked examples are added to the list of formats that may increase germane cognitive load. It should be noted that extraneous cognitive load is ineffective and is imposed by a less than optimal instructional design. For a design to impose an effective, germane cognitive load, however, learners have to invest mental effort in building or elaborating their mental model. Designs aimed at increasing germane load should therefore challenge learners to do so.

From a large body of previous research, general design guidelines for worked examples have emerged that are effective at diminishing extraneous cognitive load. These guidelines and the questions they raise for process-oriented worked examples are summarized in Table 1. For a thorough discussion of these guidelines the reader is referred to Sweller et al. (1998), Atkinson et al. (2000), and Paas et al. (2003).

Table 1. Design guidelines for worked examples aimed at reducing extraneous cognitive load

Avoid split-attention
  Implication: Multiple mutually referring information sources (e.g., text and picture, text and text) should be offered in an integrated format.
  For process-oriented examples: How should the extra information source (“why” and/or “how” information) be integrated?

Avoid redundancy
  Implication: If one information source is redundant (e.g., text) because the other is intelligible by itself (e.g., diagram), the redundant source should be left out.
  For process-oriented examples: Is the “why” information redundant if students are able to self-explain the procedure?

Use multiple modalities
  Implication: In multimedia instructions consisting of pictorial and textual information, the text should be presented orally rather than visually.
  For process-oriented examples: In what modality should the “why” and/or the “how” information be presented?

If extraneous load can be successfully diminished, the designer could aim at invoking germane cognitive load to make instruction even more effective. By using completion problems (Paas, 1992; Van Merriënboer, Schuurman, De Croock, & Paas, 2002), that is, examples with a partly worked-out solution path which learners have to complete, the learners are forced to study the solution steps more carefully. Increasing the variability of examples during training is another way of increasing germane load. Provided that extraneous load is kept low (Paas & Van Merriënboer, 1994a; Van Merriënboer et al., 2002), variability during practice has a positive effect on transfer performance. Requiring learners to self-explain can also be seen as inducing a germane cognitive load (Renkl, 1997, 2002), although this might not be the case for students who are poor self-explainers. The possibility exists that ineffective self-explainers are not able to anticipate, or to reason in a principle-based fashion, because they have no cognitive resources left for doing so. As Chi et al. (1989) acknowledge, understanding by self-explaining requires conscious effort. For good students, this might well be a germane cognitive load. For low-ability students, however, the requirement to self-explain (because of the lack of explanation in the example design) may actually constitute an extraneous load.

Creating process-oriented worked examples, that is, adding “why” information, “how” information, or both to worked examples, can also be seen as a way to increase germane load, since students’ learning can benefit from this expert knowledge. However, adding this information does not guarantee that students attend to it and learn from it, which is why process-oriented worked examples are probably most effective when implemented in combination with one of the aforementioned methods.

Implementing Process-Oriented Worked Examples and Assessing their Effects

A very effective strategy for the implementation of worked examples, especially in design-oriented domains (e.g., computer programming), is the completion or fading strategy (Paas, 1992; Renkl & Atkinson, 2002; Van Merriënboer, 1997; Van Merriënboer & Krammer, 1990; Van Merriënboer et al., 2002). In a completion strategy, the learner starts by studying worked examples, then proceeds via completion problems with increasingly more blanks in the solution path, to solving conventional problems. The completion problems provide a bridge between the worked examples and the conventional problems. The benefit of a fading strategy is that extraneous (ineffective) cognitive load is kept low while germane (effective) cognitive load is increased, as the learner is gradually forced to become more constructive (Van Merriënboer, 1997). For example, for the first process-oriented completion problems in a fading strategy, the “why” and “how” information could be faded. In subsequent completion problems, both the solution steps and the “why” and “how” information could be faded, so that the students have to fill in the steps as well as provide an adequate justification for their selection and purpose. This justification could then in turn be used to assess understanding during training. A similar approach might be used during a transfer test to assess understanding: asking students to think aloud or to explain their actions while solving the test problems.
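This fading logic can be sketched in a few lines of code. The representation of a worked example as (step, explanation) pairs, the step texts, and the function name `fade` are hypothetical choices for this sketch, not part of the designs discussed above.

```python
# Illustrative sketch of a fading (completion) strategy: a worked example is
# represented as a list of (solution step, explanation) pairs. Successive
# completion problems blank out the "why"/"how" explanations first, and
# later the solution steps as well, until the learner solves conventionally.

BLANK = "___"

def fade(example, blank_steps, blank_explanations):
    """Blank the last `blank_steps` steps and the last
    `blank_explanations` explanations of a worked example."""
    n = len(example)
    return [
        (step if i < n - blank_steps else BLANK,
         explanation if i < n - blank_explanations else BLANK)
        for i, (step, explanation) in enumerate(example)
    ]

# Hypothetical troubleshooting example as (step, justification) pairs:
example = [
    ("Check whether the circuit is closed", "An open circuit carries no current"),
    ("Measure the voltage over the resistor", "A deviating value localizes the fault"),
    ("Replace the faulty component", "This restores the calculated behavior"),
]

# Early completion problem: all steps shown, last two justifications faded
problem1 = fade(example, 0, 2)
# Later completion problem: steps and justifications both faded
problem2 = fade(example, 2, 2)
```

In such a sketch, the learner's fill-ins for the blanked justifications are exactly the material that could be used to assess understanding during training.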

It is necessary to look for such direct measures of understanding, because scores on tests of near and far transfer only partly reflect understanding. Of course, it will often be the case that a student solves a problem correctly because s/he understands why s/he is doing what s/he is doing, but this is just an indirect measure. Furthermore, what can we infer from the unsolved problems? The quantitative measure of the number of solved problems does not allow for a judgment of schema content. All it allows us to say is that mistakes were made on a number of unsolved problems. The types of mistakes are ignored, while such qualitative information is very important in judging a student’s knowledge and understanding. For example, when solving a math problem on a near transfer test, one student might fail to solve the problem correctly because of a wrong classification of the problem type, which leads to a failure in selecting the appropriate solution steps, while another student may have classified the problem correctly and selected the appropriate solution path, but made an incorrect computation at one of the steps. When not only “why” but also “how” information is contained in a process-oriented worked example, it is even more important to assess directly whether or not students use this information in subsequent problem solving.
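One way to make such a qualitative judgment concrete is a scoring scheme that records the reason a problem went unsolved. The error categories and input format below are hypothetical illustrations, not an instrument used in this research.

```python
# Illustrative qualitative scoring: rather than only counting solved
# problems, record why each unsolved problem failed. The categories mirror
# the example in the text: a misclassified problem type versus a computation
# slip within an otherwise correct solution path.
from collections import Counter

def score(attempts):
    """attempts: dicts with boolean keys 'correct_category',
    'correct_steps', and 'correct_computation'."""
    outcomes = []
    for a in attempts:
        if not a["correct_category"]:
            outcomes.append("classification error")
        elif not a["correct_steps"]:
            outcomes.append("step-selection error")
        elif not a["correct_computation"]:
            outcomes.append("computation error")
        else:
            outcomes.append("solved")
    return Counter(outcomes)

results = score([
    {"correct_category": False, "correct_steps": False, "correct_computation": False},
    {"correct_category": True, "correct_steps": True, "correct_computation": False},
    {"correct_category": True, "correct_steps": True, "correct_computation": True},
])
# Two students fail the problem, but for qualitatively different reasons.
```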

General Discussion

In the previous sections, we have proposed the development and use of process-oriented worked examples as a means to further enhance understanding and transfer performance. Understanding was defined as knowledge of the principles of a domain and knowledge of the purpose of the solution steps in a problem-solving procedure. For aspects of tasks that are to be performed as recurrent, algorithmic skills after training, a process-oriented worked example that contains not only a given state, solution steps, and a goal state (the elements of product-oriented worked examples) but also the principled (“why”) information that experts use when solving such problems could enhance understanding. For the aspects that are to be performed as non-recurrent skills, with multiple possible solution paths, including experts’ strategic (“how”) information next to the “why” information could enhance students’ understanding and transfer performance. It was indicated that studying this expert information can be seen as a form of germane cognitive load, and suggestions were made for the implementation of these examples and the assessment of their effects. However, some questions regarding the development of process-oriented worked examples have arisen from the previous sections. They will be summarized and discussed in this section.

The question that was raised previously on the interaction of task complexity, students’ prior knowledge, and the information included in process-oriented worked examples seems a good starting point. To be able to test whether our assumption is correct that students’ understanding is enhanced by principled and strategic information, the process-oriented worked examples have to meet two important criteria.

Firstly, the design should be consistent with the guidelines for diminishing extraneous cognitive load presented in Table 1. The design of the example in Figure 2 almost certainly violates the split-attention guideline. One way to avoid split-attention might be to offer the SAP as spoken text (using the modality principle), since students can then attend to the diagram and the textual information at the same time. With regard to the redundancy principle, the “why” information might be redundant if students are able to self-explain the procedure. Whether a source of information is intelligible by itself is also determined by the student’s level of prior knowledge: if information is already integrated in a learner’s mental model, its integration in instruction is redundant (e.g., Kalyuga, Ayres, Chandler, & Sweller, 2003; Kalyuga et al., 2001). Clearly, more research is needed to define how these design guidelines apply to process-oriented worked examples.

Secondly, the design should be such that students actually attend to the expert information and integrate it in their own mental models, that is, the design should induce germane cognitive load. We have proposed the use of a completion strategy, but at the moment we can only assume that this is also effective when students are asked to complete not only solution steps, but gaps in the principled and strategic information as well. It seems logical that students who complete a procedure can provide a justification for a chosen solution step or for their choice of strategy, and it seems plausible that this has a positive effect on learning, but research is needed to establish that doing both at the same time will not interfere with learning in a negative way.

So, more research on these criteria is necessary to design process-oriented worked examples in such a way that their effects can be assessed fairly, in other words, in such a way that poor design or lack of attention can be ruled out as alternative explanations for a failure to find an effect. In addition, when we want to assess effects, transfer tests have to allow a judgment of understanding, for example by asking students to think aloud while they are working on the test problems (see also Atkinson & Catrambone, 2000; Renkl, 2002).

Another interesting question for future research is whether process-oriented examples are effective in different domains or have different effects for different types of students. Most worked-examples research has been carried out in well-structured technical domains, mainly with rule-based content. Our argument for the effectiveness of process-oriented worked examples builds on this previous research. However, the expert “why” and “how” information could also be very helpful in fostering students’ understanding in less structured domains, because for tasks in these domains the goal state is not always well defined, and multiple paths might lead to an acceptable solution. Furthermore, student characteristics like learning styles or the ability to self-explain might influence the effectiveness of process-oriented worked examples. Offering students an expert’s explanation might reduce in part the necessity to self-explain, and thus result in better learning for ‘poor’ self-explainers. Of course, students who actively study these explanations, relate them to their prior knowledge, and try to anticipate the explanations would still be expected to gain more from them than students who do not.

In conclusion, process-oriented worked examples seem a promising instructional means to further enhance understanding and far transfer. Students have to be challenged to invest ‘germane’ effort in studying the “why” and “how” information, and the design and implementation of the examples should help them to do so in an effective and efficient way. Research is needed to come to clear design guidelines, and to verify whether this method is indeed more effective and efficient than conventional problem solving and studying product-oriented worked examples in attaining understanding and far transfer.


Chapter 6

Effects of Process-Oriented Worked Examples on Troubleshooting Transfer Performance1

In the domain of electrical circuits troubleshooting, a full factorial experiment investigated the hypotheses that a) studying worked examples would lead to better transfer performance than solving conventional problems, with less investment of time and mental effort during training and test, and b) adding process information to worked examples would increase investment of effort during training and enhance transfer performance; whereas adding it to conventional problems would increase investment of effort, but would not positively affect transfer performance. The first hypothesis was largely confirmed by the data; the second was not: adding process information indeed resulted in increased investment of effort during training, but not in higher transfer performance in combination with worked examples.

Troubleshooting, that is, diagnosing and repairing faults in a technical system, constitutes an important part of most technical jobs. Fault diagnosis is considered a complex cognitive skill to carry out, and an even more complex one to acquire (Gott, Parker Hall, Pokorny, Dibble, & Glaser, 1993; Schaafstal, Schraagen, & Van Berlo, 2000). This study addresses the question of how the initial acquisition of this skill can be fostered by the design of effective troubleshooting instruction and, in particular, by the support given to students during practice. Specifically, the study investigates the effects on learning outcomes of support formats that are assumed to help students use their cognitive resources more effectively.

According to Cognitive Load Theory (CLT; Sweller, 1988; Sweller, Van Merriënboer, & Paas, 1998; Van Merriënboer & Sweller, 2005) instructional materials draw on students’ cognitive resources in three ways, related to the intrinsic, extraneous, and germane cognitive load they impose. Intrinsic cognitive load is imposed by the complexity of the instructional task and depends on the number of interacting elements that have to be related, controlled, and kept active in working memory during learning activities. By nature, troubleshooting tasks are complex and require processing of numerous information elements. For example, in troubleshooting a simple electrical DC circuit, specific knowledge of the function of its components (e.g., voltage sources, resistors) and general knowledge about the relation between voltage, current, and resistance (Ohm’s law) and about the conservation of energy and charge (Kirchhoff’s laws) is needed to determine how the circuit should function. Knowledge about how to use different meters is required to measure the voltage, current, and resistance at different points in the circuit. The troubleshooter has to compare those measurements to his/her own calculations of optimal functioning. Furthermore, s/he needs to relate the outcome of that comparison to knowledge about how certain symptoms (e.g., no current in the entire circuit) relate to certain faults (e.g., a defective voltage source or an open wire). And this is only an example of a very simple system; such circuits very often form just one subsystem in a more complex whole.

1 This chapter will be published as Van Gog, T., Paas, F., & Van Merriënboer, J. J. G. (in press). Effects of process-oriented worked examples on troubleshooting transfer performance. Learning and Instruction.
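To make this reasoning concrete, the prediction step can be sketched in a few lines of code. This is an illustration only: the component values are hypothetical and no such code appeared in the study's materials.

```python
# Illustrative sketch: predicting how a simple parallel DC circuit *should*
# behave from Ohm's and Kirchhoff's laws, so that measurements can be
# compared against the prediction. All values are hypothetical.

def expected_branch_currents(voltage, resistances):
    """Ohm's law per branch (I = V / R); parallel branches share the source voltage."""
    return [voltage / r for r in resistances]

def expected_total_current(voltage, resistances):
    """Kirchhoff's current law: the branch currents sum to the total current."""
    return sum(expected_branch_currents(voltage, resistances))

# A 12 V source across three parallel resistors (values as given in a diagram).
V, R = 12.0, [4.0, 6.0, 12.0]
print(expected_branch_currents(V, R))  # [3.0, 2.0, 1.0] (amperes)
print(expected_total_current(V, R))    # 6.0

# A measured branch current of 0 A where 3 A is expected would point to an
# open resistor in that branch; a higher-than-expected current to a lowered
# resistance or a short.
```

The troubleshooter's comparison of measured against calculated values corresponds to the final comment: any discrepancy between prediction and measurement narrows down the fault.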

Ineffective, or extraneous, cognitive load is imposed by the design of the instructional task or by the activities required of the learner, and is known to hamper learning. Effective cognitive load is also imposed by instructional design, but is germane to learning, as it focuses attention on activities relevant to the acquisition of knowledge and/or skills. Thus, in order to be effective, especially for tasks that impose a high intrinsic cognitive load such as troubleshooting, instruction should be designed in such a way that extraneous cognitive load is minimised and learners are challenged to use the resulting freed-up cognitive capacity for processes and activities directed at learning.

As the above example shows, troubleshooting tasks impose a high intrinsic cognitive load because effective performance requires the interactive use of system knowledge or given system information, principled domain knowledge, strategic knowledge, and information provided through measurements conducted on the system, to reason about the system’s (mal)functioning (Gitomer, 1988; Gott et al., 1993; Schaafstal et al., 2000). Troubleshooting remains complex, even for highly experienced individuals. However, experience does offer tremendous advantages. For example, expert troubleshooters have well-developed cognitive schemas that contain quantitatively more and qualitatively better system, principled, and strategic knowledge than novices’ schemas do (Chi, Glaser, & Rees, 1982; Larkin, McDermott, Simon, & Simon, 1980). When troubleshooting familiar systems, experts can also use the case-based knowledge they gained through previous fault-finding experiences. When troubleshooting unfamiliar systems, their schema-based knowledge helps them build a mental representation of that system more rapidly than less experienced troubleshooters can (Egan & Schwartz, 1979). These sophisticated mental representations can then be used to reason about the system’s (mal)functioning.

Since novices lack both experience (case-based knowledge) and effective schemas, they have to keep all the system elements in working memory to construct an appropriate mental representation. Given that working memory capacity is limited to about 7 plus or minus 2 elements when merely holding information, and considerably less when processing it (Miller, 1956; Sweller, 2004), processing the system elements alone imposes very high demands on a novice’s cognitive system. In fact, little capacity is likely to be left for reasoning based on this representation. Furthermore, when reasoning about system (mal)functioning, experts can rely on their strategic knowledge, which allows them to apply more effective strategies (e.g., the structured approach to troubleshooting described by Schaafstal et al., 2000), whereas novices have to rely on weaker strategies (using domain-general heuristics), which impose a high extraneous cognitive load (Sweller, 1988; Sweller et al., 1998).

Instruction that consists mainly of solving conventional problems (with only a formulation of criteria for an acceptable goal state and some “givens”) forces novices to resort to weak problem-solving strategies (such as means-ends analysis), which is known to be ineffective for learning. By offering instructional formats that prevent the use of weak strategies, such as studying well-structured worked examples (possibly alternated with solving problems), extraneous cognitive load is reduced and learning is enhanced (Carroll, 1994; Cooper & Sweller, 1987; Ward & Sweller, 1990). Worked examples present the learner not only with the initial (or problem) state and a description of the criteria for an acceptable goal state, as conventional problems do, but also show the solution steps that are to be taken to reach the goal state. So, the use of means-ends analysis is prevented because the learner does not have to search for a solution and can instead devote all available cognitive capacity to studying the given solution and constructing an appropriate problem schema (see Sweller, 1988). The ‘worked example effect’ demonstrates that for novices, instruction that relies more heavily on studying worked examples rather than exclusively on conventional problem solving is superior with regard to learning outcomes, as measured by both near and far transfer tasks (for an overview of the benefits of worked examples, see Atkinson, Derry, Renkl, & Wortham, 2000; Sweller et al., 1998; Sweller, 2004).

The cognitive capacity that is freed up by reducing the extraneous load can, within working memory limits, be used to induce germane cognitive load: activities that stimulate learning. For example, asking students to self-explain the solution steps may be an effective way to increase germane cognitive load and enhance learning (Chi, Bassok, Lewis, Reimann, & Glaser, 1989; Renkl, 1997). The reason why worked examples require self-explaining is that the solution steps are given, but the rationale for taking each step is not. Such examples can be called product-oriented (Van Gog, Paas, & Van Merriënboer, 2004), because they show the problem solved, that is, which solution steps are applied to attain the goal state (the product). They do not, however, explain the problem-solving process, that is, why those steps are chosen (i.e., strategic knowledge) or why they are appropriate (i.e., principled knowledge). Recently, Van Gog et al. (2004) have argued that including in worked examples not only the solution steps, but also the strategic (“how”) and principled (“why”) information used in selecting the steps, may enhance learners’ understanding of the solution procedure. This enhanced understanding due to the given process information (i.e., “how” plus “why” information) is expected to lead to higher transfer test performance. Far transfer performance, especially, can be expected to increase, because in contrast to near transfer tasks, which have structural features comparable to those of the training tasks but different surface features, far transfer tasks have different structural features, and therefore do not allow learners to merely apply a memorized procedure. Flexibly using those parts of a learned procedure that are relevant for a new (far transfer) problem requires that the learner understands the rationale behind (subgroups of) solution steps (cf. Catrambone, 1996, 1998), that is, “not only knows the procedural steps for problem-solving tasks, but also understands when to deploy them and why they work” (Gott et al., 1993, p. 260).

Research comparing the unsolicited or on-demand provision of principle-based instructional explanations/elaborations in examples with prompting for self-explanations has suggested that prompting self-explanations is more effective for increasing transfer performance (e.g., Schworm & Renkl, 2002). However, a precondition is that students are capable of providing high quality self-explanations, which is not always the case (see Chi et al., 1989; Lovett, 1992; Renkl, 1997). If this precondition is not met, providing high quality instructional explanations may improve performance (Lovett, 1992).

This study intended to empirically address the issues raised by Van Gog et al. (2004), by investigating the effectiveness for novices of a computer-simulated electrical circuits troubleshooting training program consisting of solving conventional problems with or without process information, and studying worked examples with or without process information (i.e., process-oriented and product-oriented worked examples). In line with consistent findings in the field of instructional design and CLT-inspired research (see Atkinson et al., 2000; Sweller, 2004), we hypothesized that practice consisting of studying worked examples would result in more effective learning than practice consisting of solving conventional problems. This ‘worked example effect’ would be demonstrated by higher near and, especially, far transfer test performance, with less investment of time and mental effort during both the training and the subsequent test.

Furthermore, we hypothesized that process information added to worked examples and conventional problems would result in higher investment of mental effort during practice compared to the conditions without process information (i.e., conventional problems and product-oriented worked examples). This higher investment of effort during practice is expected to produce differential effects on learning for the worked examples and conventional problems conditions (cf. Van Merriënboer, Schuurman, De Croock, & Paas, 2002). Combined with worked examples, process information is expected to produce higher (far) transfer test performance than studying worked examples without process information would, because this higher effort is assumed to be an indication of germane load. Learners would be able to handle this germane load because extraneous load has been reduced through the implementation of worked-out solutions. Combined with conventional problems, in contrast, process information is expected to produce equally low or even lower transfer test performance than solving conventional problems without process information would, because the conventional problems already impose a high extraneous load. This would make it impossible to handle the process information in such a way that it facilitates learning.

Method

Participants

Sixty-eight first-year electrotechnics students from three schools of senior secondary vocational education were asked to participate in the experiment. Because some of them did not attend school when the experiment took place, only 61 students actually participated (all male; age M = 17.04 years, SD = .90). They were rewarded with € 7.50. In the 4.5 months prior to the experiment, the school curricula (all schools used the same textbooks) offered electrotechnics instruction from which participants had gained the basic knowledge required to perform the experimental tasks. For example, students had received instruction in the function of basic circuit components such as voltage sources, resistors, ammeters, and voltmeters. They were familiarized with Ohm’s law and Kirchhoff’s laws, and knew how to restate those laws in order to find an unknown value from the givens. They had also been instructed on how to design basic parallel and series circuits. However, they had not yet acquired any troubleshooting experience.

Design

A 2 x 2 between-subjects factorial design was used, with the factors ‘Solution Worked Out’ (No/Yes) and ‘Process Information Given’ (No/Yes). The resulting four training conditions were: conventional problem solving (CP: no solution given, no process information), conventional problem solving with process information (PCP: no solution given, process information), studying product-oriented worked examples (WE: solution given, no process information), and studying process-oriented worked examples (PWE: solution given, process information).

Materials

Training and test tasks. All experimental tasks were designed and delivered with TINA Pro software, version 6.0 (TINA = Toolkit for Interactive Network Analysis; DesignSoft, Inc., 2002) and consisted of simulations of malfunctioning electrical DC circuits with one or two faults. The training tasks were preceded by an introduction to the TINA program (on paper) and an “introduction practice task” (in TINA) on which students could try out the functioning of the program described in the introduction. For example, participants could ascertain where to find the meter in the menu, how to use that meter, and how to repair a circuit component with TINA.

The training consisted of six parallel circuit tasks that contained the following faults: a resistor could be open, shorted, or its resistance could have changed to a higher or lower value than it should have according to the circuit diagram. In the first three tasks, those faults occurred in isolation, and in the last three tasks two different faults occurred in combination, so that each fault occurred three times during the training: once in isolation and twice in combination with another fault. In the CP training condition, only the circuit diagrams were presented, with a formulation of the criterion for an acceptable goal state (see Appendix A). In the PCP condition, the circuit diagrams, a formulation of the goal state criterion, and the process information (the text printed in bold and bold italics in Appendix B) were given. In the WE condition, the circuit diagram was shown, together with a formulation of the goal state criterion and a worked-out solution (the non-bold text in Appendix B). In the PWE condition, the circuit diagram, the goal state criterion formulation, and a worked-out solution complemented with the process information (Appendix B) were given.

The transfer test consisted, in order, of three near and three far transfer tasks. The structural features of the near transfer tasks were comparable to those of the training tasks: they were parallel circuit tasks with the same types of faults. The first near transfer task contained one fault; the second and third contained two different faults. The far transfer tasks had different structural features and consisted of one parallel circuit task with a new fault (voltage source) and two combined circuit (i.e., series-parallel) tasks with a familiar fault. The process information was expected to lead to better performance on those tasks because, for example, it taught students to always measure again after repairing a component (increasing the likelihood of finding both faults in the near transfer tasks), and contained principles that helped identify the type of fault and the faulty component’s location in the circuit (for example, knowing the principle that no current in the entire circuit, as opposed to in just one branch, involves an open component/wire outside the branches would increase the likelihood of finding the new fault).
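By way of illustration, such symptom-to-fault principles amount to a simple lookup from observed symptom to candidate fault. The actual process information was prose, not code, and the wording of the entries below is ours, not the study's.

```python
# Hypothetical encoding of the kind of symptom-fault principles conveyed by
# the process information; the study's materials presented these as prose.

PRINCIPLES = {
    "no current in the entire circuit":
        "open component or wire outside the branches (e.g., defective source)",
    "no current in one branch only":
        "open component in that branch",
    "branch current higher than calculated":
        "resistance lower than the diagram value, or a shorted resistor",
    "branch current lower than calculated":
        "resistance higher than the diagram value",
}

def hypothesis(symptom):
    """Map an observed symptom to a candidate fault; default to measuring further."""
    return PRINCIPLES.get(symptom, "measure further to narrow down the fault")

print(hypothesis("no current in one branch only"))  # open component in that branch
```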

Performance. On pre-printed training and test answer sheets, participants were asked to indicate for each task which components were faulty and what the fault was: the component a) “is open”, b) “is shorted”, c) “has changed value: from … (given in diagram) to ... ”, or d) “I do not know”. They were instructed to fill out the values when they indicated ‘c’. Although for both worked examples conditions the faulty components and the nature of the faults were given, participants in those conditions were asked to fill out the training answer sheet anyway, to ensure that this activity would not lead to time-on-task differences between conditions (i.e., to ensure that possible differences between the worked examples and conventional problems conditions could not be attributed to the longer study time participants in the worked examples groups would have had if they did not have to fill in the answer sheets).

Mental effort. On the answer sheets participants also had to indicate how much mental effort they invested to complete each task, on a 9-point rating scale ranging from 1 “very, very low effort” to 9 “very, very high effort” (Paas, 1992; Paas, Tuovinen, Tabbers, & Van Gerven, 2003).

Time-on-task. To be able to determine the time participants spent on each task, the screen coordinates of their mouse clicks and the time (in seconds) at which these were made, were logged with GazeTrackerTM software (Lankford, 2000).

Procedure

Before the experiment, the 68 participants were randomly assigned to one of the four conditions, in such a way that each condition contained 17 participants. As mentioned in the ‘Participants’ section, only 61 students actually participated, with 16 participants in the CP and PWE conditions, 15 in the PCP condition, and 14 in the WE condition. The study was run in five sessions in a computer room at the participants’ respective schools. When participants arrived in the computer room, a PC was marked with their name, and the introduction and answer sheets were placed next to the PC. GazeTrackerTM was already recording at that moment and the introduction practice task was already visible on the screen. Participants were instructed to start reading the introduction to the program and to familiarize themselves with the functioning of the program through the “introduction practice task”. At this point, they were not yet allowed to start on the actual training tasks. When all participants had finished the introduction and had no more queries regarding the program, they were allowed to move on to the training tasks. Participants were allocated a maximum time of three minutes per training task, and although they could complete a task faster, they could not move on to the next task until three minutes had passed. On the test tasks, they were allowed to work at their own pace. Both during the training and the test, a “task list” was always visible on the right-hand side of their screen. When they had finished a task, they were instructed to click on the “submit” button, located under the “task list”, before proceeding to the next task in that list. Participants could use a calculator (‘real’ or software) both during the training and the test, to ensure that simple computational errors would not affect task performance. After completing each task, participants indicated the faulty component(s) and the type of fault(s), as well as the mental effort they invested in the task, on the answer sheets.

After the experiment, participants’ performance on the near and far transfer test tasks was scored as follows. For each correctly diagnosed faulty component, 1 point was given, and for correct diagnosis of the fault in that component an additional point was given (if ‘c’ was indicated but the value was wrong, ½ point was given instead). So, the maximum mean performance score on the near transfer test tasks was 10 points / 3 tasks = 3⅓ points (one task containing one fault with a maximum score of 2 points and two tasks containing two faults with a maximum score of 4 points each), and on the far transfer tasks it was 6 points / 3 tasks = 2 points (three tasks containing one fault with a maximum score of 2 points each).
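The scoring rule can be expressed as a short sketch. The study scored answer sheets by hand; the function name and its arguments here are ours, introduced only to illustrate the arithmetic.

```python
# Sketch of the scoring rule described above (the study scored answer
# sheets by hand; the function name is ours).

def score_fault(component_correct, type_correct, value_correct=True):
    """1 point for identifying the faulty component, +1 for the fault type;
    a 'changed value' answer with the right type but wrong value earns +1/2."""
    if not component_correct:
        return 0.0
    points = 1.0
    if type_correct:
        points += 1.0 if value_correct else 0.5
    return points

print(score_fault(True, True))         # 2.0  (component and fault both correct)
print(score_fault(True, True, False))  # 1.5  (type 'c' correct, value wrong)
print(score_fault(False, True))        # 0.0  (wrong component)

# Maximum mean scores per test part:
near_transfer_max = (2 + 4 + 4) / 3  # one one-fault and two two-fault tasks
far_transfer_max = (2 + 2 + 2) / 3   # three one-fault tasks
```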

Training and test time-on-task were calculated by first determining the screen coordinates of the tasks in the “task list” and of the “submit” button. Based on those coordinates, the time at which each task was selected and submitted could be determined from the mouse-click logging files. By subtracting the time of selection from the time of submission, the time-on-task was obtained. Due to recording errors, all time-on-task data were lost for three participants in the CP condition and two participants in the WE condition, test time-on-task data were lost for one participant in the PWE condition, and training time-on-task data for one participant in the PCP condition.
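The computation amounts to pairing each task's selection click with its submit click. A minimal sketch follows; the log entries and their layout are invented for illustration, and GazeTrackerTM's actual log format differs.

```python
# Minimal sketch of the time-on-task computation described above.
# The log entries below are invented; GazeTracker's real logs differ.

def time_on_task(click_log):
    """Per task: time of the 'submit' click minus time of the 'select' click."""
    selected, durations = {}, {}
    for seconds, action, task in click_log:
        if action == "select":
            selected[task] = seconds
        elif action == "submit":
            durations[task] = seconds - selected[task]
    return durations

log = [  # (time in s, action, task), as recovered from mouse-click coordinates
    (10, "select", "task 1"), (165, "submit", "task 1"),
    (170, "select", "task 2"), (340, "submit", "task 2"),
]
print(time_on_task(log))  # {'task 1': 155, 'task 2': 170}
```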

Results

Because of random assignment to conditions, it is unlikely that there were differences in prior knowledge between the conditions. An ANOVA on the scores on school exams covering the subject matter (recall that all schools used the same textbooks), which were taken after 3 months of instruction (i.e., 1.5 months before the experiment) by participants from two schools (N = 48; the third school could not provide this information), indeed showed no differences in exam performance between the four conditions, F (3, 44) = .198, ns. Hence, the results reported here are not likely to be artefacts of prior knowledge differences between conditions.

Data under analysis with respect to the training phase are mean training time-on-task (in seconds) and mean mental effort during the training. With regard to the test phase, the data under analysis are mean performance on near and far transfer test tasks, mean time spent on near and far transfer test tasks, and mean mental effort on near and far transfer test tasks. A series of 2 x 2 ANOVAs with ‘solution worked out’ and ‘process information given’ as between-subjects factors were conducted. The means and standard deviations for all dependent variables are presented in Table 1. Given the relatively large number of dependent variables examined, only the significant main and interaction effects are reported here. Cohen’s f is provided as a measure of effect size, with f = .10 corresponding to a small effect, f = .25 to a medium, and f = .40 to a large effect (Cohen, 1988).
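For reference, Cohen's f can be recovered from a reported F ratio via partial eta-squared. The sketch below shows one common computation; the values reported in this chapter may rest on a slightly different (e.g., bias-corrected) variance estimate, so small discrepancies are possible.

```python
import math

# Recovering Cohen's f from a reported F ratio via partial eta-squared.
# One common computation; a bias-corrected estimate may differ slightly.

def partial_eta_squared(F, df_effect, df_error):
    """Partial eta-squared recovered from an F ratio and its dfs."""
    return (F * df_effect) / (F * df_effect + df_error)

def cohens_f(eta_squared):
    """Cohen's f = sqrt(eta^2 / (1 - eta^2))."""
    return math.sqrt(eta_squared / (1 - eta_squared))

# e.g., the near transfer performance effect reported below, F(1, 57) = 6.53:
eta2 = partial_eta_squared(6.53, 1, 57)
print(round(cohens_f(eta2), 2))  # 0.34, between a medium and a large effect
```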

Training Data

The ANOVA on mean training time-on-task showed significant main effects for ‘Solution Worked Out,’ F (1, 51) = 9.18, MSE = 517.64, p < .01, f = .39, and for ‘Process Information Given,’ F (1, 51) = 6.60, MSE = 517.64, p < .05, f = .32, and a significant interaction effect, F (1, 51) = 5.78, MSE = 517.64, p < .05, f = .30. The conditions with worked-out solutions (WE and PWE) showed lower time-on-task (M = 146.04, SD = 30.39) than the conditions without worked-out solutions (CP and PCP; M = 162.57, SD = 17.45), and the conditions with process information (PCP and PWE) showed higher time-on-task (M = 160.99, SD = 18.39) than the conditions without process information (CP and WE; M = 145.95, SD = 31.44). The interaction (depicted in Figure 1) suggests that the availability of a worked-out solution had a large beneficial effect on time-on-task only when no process information was given; that is, only the product-oriented worked examples group (WE) spent less time on the training tasks.

Table 1
Means and standard deviations of time-on-task and mental effort during training and test, and test performance as a function of experimental condition

                               Solution worked out          Solution not worked out
                               Process       No process     Process       No process
Dependent variable             M      SD     M      SD      M      SD     M      SD
Training
  Time-on-task (s)             159.19 17.18  128.51 35.76   163.06 20.13  162.04 14.83
  Mental effort (1-9)          5.10   1.93   4.01   1.82    5.98   1.85   5.22   1.56
Test, near transfer
  Performance (0-3⅓)           1.98   .66    2.25   .74     1.68   .54    1.70   .68
  Time-on-task (s)             247.60 99.00  212.58 77.24   172.49 44.70  135.46 60.19
  Mental effort (1-9)          6.27   1.43   4.62   1.57    5.71   1.80   4.73   1.68
Test, far transfer
  Performance (0-2)            1.21   .44    1.40   .35     1.13   .37    1.02   .28
  Time-on-task (s)             133.07 64.10  151.64 71.95   134.09 49.58  125.56 74.28
  Mental effort (1-9)          5.90   1.62   4.81   1.66    6.02   1.40   5.21   1.78

On mean mental effort during the training, significant main effects in the expected direction were found for ‘Solution Worked Out’, F (1, 57) = 5.12, MSE = 3.21, p < .05, f = .29, and for ‘Process Information Given’, F (1, 57) = 4.05, MSE = 3.21, p < .05, f = .26. Specifically, participants in the conditions with worked-out solutions (WE and PWE) had to invest less mental effort during training (M = 4.59, SD = 1.93) than their counterparts in the conditions without worked-out solutions (CP and PCP; M = 5.59, SD = 1.72). Furthermore, participants in the conditions with process information given (PCP and PWE) had to invest more mental effort during training (M = 5.53, SD = 1.91) than participants in the conditions without this information (CP and WE; M = 4.66, SD = 1.77).

Figure 1 Interaction between the factors ‘Solution Worked-Out’ and ‘Process Information Given’ on training time-on-task

Test Data

For the near transfer test, the ANOVA on mean test performance yielded a significant main effect for ‘Solution Worked Out,’ F (1, 57) = 6.53, MSE = .44, p < .05, f = .34. In line with our expectation, participants in the conditions with worked-out solutions (WE and PWE) obtained higher performance (M = 2.11, SD = .70) than participants in the conditions without worked-out solutions (CP and PCP; M = 1.68, SD = .61). With respect to mean time-on-task, a significant main effect was also found for ‘Solution Worked Out,’ F (1, 51) = 14.68, MSE = 5378.36, p < .001, f = .52. This effect was, however, not in the expected direction: participants who had studied worked-out solutions (WE and PWE) spent more time on those test tasks (M = 232.04, SD = 90.09) than participants who had solved conventional problems (CP and PCP; M = 155.30, SD = 54.77). The extra time participants in the conditions with worked-out solutions spent on the test tasks might explain their higher performance. Therefore, an additional ANCOVA was performed with time-on-task as a covariate (for this analysis, the missing test time-on-task data of 6 participants were replaced with the means of their conditions). Again, a significant main effect on near transfer test performance, in the same direction, was found for ‘Solution Worked Out,’ F (1, 56) = 5.85, MSE = .44, p < .05, f = .33. Thus, the extra time participants in the conditions with worked-out solutions spent on the near transfer test tasks did not explain their higher performance. Finally, for the mean mental effort invested in the near transfer test tasks, a significant main effect for ‘Process Information Given’ was found, F (1, 57) = 10.00, MSE = 2.64, p < .01, f = .41. Participants in the conditions with process information (PCP and PWE) invested more mental effort in solving the test tasks (M = 6.00, SD = 1.61) than participants in the conditions without this process information (CP and WE; M = 4.68, SD = 1.60).

The results on the far transfer test are in line with those on the near transfer test. The ANOVA on far transfer test performance yielded a significant main effect for ‘Solution Worked Out,’ F (1, 57) = 5.99, MSE = .13, p < .05, f = .32. As expected, participants in the conditions with worked-out solutions (WE and PWE) obtained higher performance (M = 1.30, SD = .40) than participants in the conditions without worked-out solutions (CP and PCP; M = 1.08, SD = .33). With respect to mean time-on-task on the far transfer test, no significant main or interaction effects were found. For the mean mental effort invested in this test, the pattern is the same as for the near transfer test with a significant main effect for ‘Process Information Given,’ F (1, 57) = 5.02, MSE = 2.64, p < .05, f = .30. Participants who received process information (PCP and PWE) invested more mental effort in solving the far transfer test tasks (M = 5.96, SD = 1.50) than participants who did not receive such information (CP and WE; M = 5.02, SD = 1.71).

Discussion

The hypothesis that training consisting of studying worked examples would lead to higher near and far transfer test performance, with less investment of time and mental effort during the training and the test, than training consisting of solving conventional problems, was largely confirmed. On both near and far transfer test tasks, participants who had studied worked examples obtained higher performance, with lower investment of effort and time-on-task during the training (the ‘worked example effect’). Furthermore, the interaction between worked-out solution and process information showed a disproportionately low time-on-task for the product-oriented worked examples group compared to the other groups, including the process-oriented worked examples group. With regard to investment of time and mental effort during the test, only a main effect on near transfer test time-on-task was found, and in the direction opposite to our prediction. Specifically, participants who had studied worked examples during training spent more time on the near transfer test tasks than participants who had solved conventional problems. The fact that they spent more time on those tasks could not, however, explain their higher transfer performance.

A possible explanation for this unexpected finding that studying worked examples did not result in decreased time-on-task and invested mental effort on the test tasks might be that the duration of the training was too short for schema automation to occur. Schema construction was fostered, as is reflected in the higher performance outcomes. However, only when automation has been established can schemas be handled quickly and effortlessly in working memory (Sweller et al., 1998). The fact that near transfer test time-on-task was even higher for participants who had studied worked examples might be due to motivational aspects. Participants who had solved conventional problems during training may have given up on the test tasks when they felt that they would not be able to solve the problems. This would result in an (artificially) lower solution time.

The results clearly confirm the first part of the second hypothesis: process information added to worked examples and conventional problems resulted in higher investment of mental effort during the training compared to the conditions without process information. Participants who were given process information invested more mental effort during practice tasks, near transfer test tasks, and far transfer test tasks. During practice, process information also increased time-on-task, which might be due to the additional processing of the “why” and “how” information. We did not, however, find any interaction effects of ‘solution worked out’ and ‘process information given’ on near and far transfer test performance. So, the second part of that hypothesis was not supported by our data: the higher investment of effort during training did not lead the process-oriented worked examples condition to obtain higher transfer test performance than the product-oriented worked examples condition, nor did it lead the conventional condition with process information to obtain equal or even lower transfer performance than the conventional condition without that information.

There may be two possible explanations for why we did not find beneficial effects on performance of process information added to worked examples. The first is in terms of extraneous load. The ‘split-attention effect’ demonstrates that mutually referring information sources (e.g., text and diagram) that cannot be understood in isolation are best presented in an integrated format, for instance by dividing the text into small pieces that can be included at the appropriate places in the diagram, or by presenting the text as spoken sentences while simultaneously highlighting the parts of the diagram each
sentence refers to. Thanks to the integrated format, the learner is not required to switch attention between different information sources, a process that imposes an extraneous cognitive load that is not effective for learning (Chandler & Sweller, 1992; Tarmizi & Sweller, 1988; Ward & Sweller, 1990). Whereas in our training tasks the two types of text (worked-out solution and process information) in the process-oriented worked examples were integrated with each other, they were not integrated in the diagram or presented as spoken text. Although text and diagram were separated for all worked examples, the text of the process-oriented worked examples was much longer than that of the product-oriented worked examples (see Appendix B for the information given with both types of worked examples). So, because of the short text, switching attention may not have imposed (high) extraneous load for participants studying the product-oriented worked examples, but may have imposed high extraneous load for those studying process-oriented worked examples. In sum, this first explanation holds that the form in which the process information was offered may have caused a high extraneous load, which resulted in more investment of effort during the training but not in higher performance. To test this explanation, future research should also use process-oriented worked examples in which the texts are fully integrated with the diagram or in which the texts and diagrams are presented in an audiovisual format.

The second explanation is in terms of intrinsic cognitive load. Although intrinsic cognitive load is considered fixed because it is innate to the task (i.e., the number of interacting elements), and instructional procedures are considered to influence only extrinsic (i.e., extraneous and germane) cognitive load (Ayres, in press; Sweller et al., 1998), adding process information in essence means adding extra information to the task. Although we conceived of this as an instructional procedure and expected it to induce a germane cognitive load, the fact that information was added to the task may have “changed” the task and increased the intrinsic cognitive load (i.e., task complexity; see Ayres, in press); the principled information in particular (Appendix B, bold italics text) is highly interactive with other information elements. Novice learners might not have been able to handle this increased complexity, even though the extraneous load was reduced through a worked-out solution (and this might even be the case when an integrated worked-out solution is presented). Therefore, it might have been better for novice learners if we had presented all or part of the process information before or after the worked example. This would be in accordance with the findings of Kester, Kirschner, and Van Merriënboer (in press) that presenting different types of information together, either before or during task execution, leads to lower learning efficiency than presenting one of
these information types before and the other during task execution. More advanced learners have already acquired a basic problem schema and the task therefore imposes a lower intrinsic cognitive load for them. For that reason, they do not necessarily benefit from studying product-oriented worked examples (the ‘expertise reversal effect’; Kalyuga, Ayres, Chandler, & Sweller, 2003), but they might be able to use process-oriented worked examples to their advantage.

In sum, this second explanation holds that the content of the process information may have imposed an intrinsic load that was too high for our participants, which resulted in more investment of effort during the training and the test, but not in higher performance. Future research should compare the effects of process-oriented worked examples for novice and advanced learners. Furthermore, it might be necessary to reconsider the definitions of the different kinds of cognitive load. Our instructional procedure differed from other procedures to induce germane cognitive load: self-explaining examples (Renkl, 1997) or being offered a sequence of examples with high variability (Paas & Van Merriënboer, 1994) require different activities from learners than merely studying examples, but do not involve the processing of additional information elements. Even though the process information may not change the task complexity in the sense that the ultimate target skills (effective performance of the task after the study phase) are not changed, it does make the “task” of studying the example more complex (by adding information elements that interact with other information elements).

Unfortunately, we have no means to support either of these theoretical explanations, since a pilot study with four first-year electrotechnical vocational education students had shown that they had great cognitive and motivational difficulties with thinking aloud (and those were students who volunteered), resulting in protocols of extremely poor quality. Hence, we decided not to implement a think-aloud procedure in the experiment, as we had originally intended, but such data would have been very informative.

The results of this study show that implementing more support in the form of worked examples in troubleshooting instruction makes that instruction more effective (it leads to better transfer performance) as well as more efficient (better performance is obtained with less investment of time and effort) for novice learners. This study also raised specific questions for further research that need to be addressed in order to find out whether and how additional support in the form of process information added to worked examples can be effective for novice learners.


Appendix A:

A Training Task (Conventional Problem)

Note for the reader: in the diagram, SW = Switch, VS = Voltage Source, R = Resistor, AM = Ampère measurement point. The values of the voltage source and resistors are given after the component labels (V = Volt, k = kilo [Ohm]).

This circuit is not functioning correctly.

Find the fault(s) and repair the circuit so that it does function correctly.

[Circuit diagram: a 12 V voltage source (VS1) and switch SW1, with three parallel resistors R1 (6k), R2 (6k), and R3 (3k); measurement points AM1, AM2, and AM3 in the branches, and AM4 for the total current.]


Appendix B:

Process-Information (Bold and Bold Italics) and Worked-out Solution of the Task Shown in Appendix A

1. Determine how this circuit should function, using Ohm’s law (so what the current is that you should measure at the different measurement points).
In parallel circuits, the total current equals the sum of the currents in the parallel branches. The total current should be It = I1 + I2 + I3, or It = U1 / R1 + U2 / R2 + U3 / R3, or It = 12V / 6kOhm + 12V / 6kOhm + 12V / 3kOhm = 2mA + 2mA + 4mA = 8mA.
You should measure: AM1 = 2mA, AM2 = 2mA, AM3 = 4mA, AM4 = 8mA.

2. Measure how it actually functions, using the Multimeter (so what the current is that you actually measure at the different measurement points).
Go to T&M > Multimeter and measure the current at AM1, AM2, AM3, and AM4.
You get: AM1 = 2mA, AM2 = 2mA, AM3 = 12nA, AM4 = 4mA.

3. Compare the outcomes of 1 and 2.
They do not correspond; something is wrong.

4. Determine which component is faulty and what the fault in that component is, using the principles given below.
If the total current is lower than you would expect, the resistance in one or more of the parallel branches is too high (the same voltage [U] divided by a higher resistance [R] results in a lower current [I]).
If the total current is higher than you would expect, the resistance in one or more of the parallel branches is too low (the same voltage [U] divided by a lower resistance [R] results in a higher current [I]).
Infinitely low current in a parallel branch means that there is infinitely high resistance in that branch; very likely the resistor is open, but it can also be another component or the wire that is open.
No or infinitely low current in the entire circuit (in all branches) indicates that there is infinitely high resistance somewhere outside the branches; possibly the voltage source, the switch, or the wire outside the branches is open.
Infinitely high current in a parallel branch plus infinitely high total current indicates that the resistance is infinitely low; very likely the resistor in that one branch is shorted.
I3 = 12nA. Conclusion: R3 = open.

5. Repair the component.
Repair R3.

6. Measure again.
Go to T&M > Multimeter and measure the current at AM1, AM2, AM3, and AM4.
You get: AM1 = 2mA, AM2 = 2mA, AM3 = 4mA, AM4 = 8mA.

7. Determine if the measures correspond to those you determined at step 1. If so, the circuit now functions correctly. If not, start over again at step 4.
The circuit now functions correctly.
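The diagnostic logic of steps 1-4 above (derive the expected branch currents from Ohm's law, measure, compare, and classify the deviation using the fault principles) can be sketched in code. This is an illustration only; the function names and the 5% tolerance are our own assumptions, not part of the TINA software or the experimental materials.

```python
# Illustrative sketch (not part of the TINA materials) of steps 1-4 of the
# worked example: expected branch currents follow from Ohm's law (I = U / R),
# and a deviating measurement is classified with the fault principles above.
# Function names and the 5% tolerance are hypothetical choices.

def expected_currents(voltage, resistances):
    """Step 1: current per parallel branch, plus the total current (in A)."""
    branch = [voltage / r for r in resistances]
    return branch + [sum(branch)]

def classify_fault(expected, measured, tol=0.05):
    """Step 4: compare one branch's expected and measured current (in A)."""
    if measured < 1e-6:                  # (near-)zero current: infinite resistance
        return "open"
    if measured > 100 * expected:        # excessive current: near-zero resistance
        return "shorted"
    if measured < expected * (1 - tol):
        return "value changed: resistance too high"
    if measured > expected * (1 + tol):
        return "value changed: resistance too low"
    return "ok"

# The circuit of Appendix A: 12 V source, parallel branches of 6k, 6k, 3k Ohm.
print(expected_currents(12, [6000, 6000, 3000]))  # [0.002, 0.002, 0.004, 0.008]
print(classify_fault(0.004, 12e-9))               # AM3 reads 12 nA -> "open"
```

With the fault in AM3's branch repaired, re-running the classification on the new measurement of 4 mA returns "ok", mirroring steps 6 and 7.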


Chapter 7

Effects of Sequencing Process-Oriented and Product-Oriented Worked Examples on Troubleshooting Transfer Performance1

This study investigated the effects of sequences of product-oriented worked examples, presenting a problem solution, and process-oriented worked examples, also explicating the rationale behind a presented solution, on the efficiency of transfer (combination of performance with mental effort invested to attain it) of initial electrical circuits troubleshooting skill. The hypothesis that process-oriented worked examples would initially be more efficient, but that the process information should be removed when it becomes redundant (i.e., a process-product sequence would be more efficient than a process only, product only or product-process sequence) was confirmed in the present study. The results are discussed in terms of practical and theoretical implications.

Research has convincingly shown that studying worked examples is more effective for learning than solving the equivalent problems in the initial phases of skill acquisition (for overviews, see Atkinson, Derry, Renkl, & Wortham, 2000; Sweller, Van Merriënboer, & Paas, 1998). Cognitive load theory (Sweller, 1988; Sweller et al., 1998; Van Merriënboer & Sweller, 2005) explains the effectiveness of worked examples in terms of reduced extraneous, or ineffective, cognitive load during training. The theory distinguishes cognitive load inherent to the task from cognitive load imposed by the instructional design. The former is called intrinsic cognitive load and results from the number of interacting elements in a task. The latter is called extraneous cognitive load when it is ineffective for learning and germane cognitive load when it is effective for learning. Given that working memory capacity is considered limited to seven plus or minus two elements or chunks of information (Miller, 1956), tasks high in intrinsic load place high demands on working memory. Under conditions of high intrinsic load, searching for a problem solution with the weak strategies that novice learners often employ, such as means-ends analysis, can easily overload working memory, which is not effective for learning. Studying worked examples reduces the extraneous load, because learners do not have to devote cognitive resources to weak problem-solving strategies (Sweller et al., 1998). Instead, they can invest all available resources in studying the solution and constructing and automating a cognitive schema for solving such problems. This advantage manifests itself in more efficient transfer performance, that is, higher performance on transfer test problems combined with lower investment of mental effort in solving those problems (Paas & Van Merriënboer, 1993).

1 This chapter is submitted as Van Gog, T., Paas, F., & Van Merriënboer, J. J. G. (2006). Effects of sequencing process-oriented and product-oriented worked examples on troubleshooting transfer performance. Manuscript submitted for publication.

The cognitive capacity that is freed up by the reduction of extraneous load can, within capacity limits, be used for processes that are effective for learning. Hence, over the years, the focus of cognitive load theory has shifted from identifying instructional measures to reduce extraneous load towards identifying measures that exploit the freed capacity by inducing a germane cognitive load (Paas, Renkl, & Sweller, 2003, 2004). Instructional measures that successfully induce a germane load stimulate learners to invest mental effort in the development of rich cognitive schemata during training, which subsequently allow for effective and efficient transfer performance (i.e., higher performance with lower investment of effort when solving transfer problems; Paas & Van Merriënboer, 1993). Instructional measures that are known to induce a germane cognitive load in studying worked examples are, for example, increasing the variability (Paas & Van Merriënboer, 1994a) or the contextual interference (Van Merriënboer, Schuurman, De Croock, & Paas, 2002) in series of worked examples.

One shortcoming of conventional worked examples is that they only show the product of problem solving (i.e., a solution), but not the process of problem solving (i.e., the rationale behind that solution; Van Gog, Paas, & Van Merriënboer, 2004). Understanding the rationale behind (subgroups of) solution steps is considered imperative for the ability to recognize and flexibly apply the relevant parts of a previously learned procedure, that is, for attaining transfer (especially far transfer; Catrambone, 1996, 1998; Gott, Parker-Hall, Pokorny, Dibble, & Glaser, 1993). Some students may be capable of overcoming this shortcoming by adequately self-explaining the rationale (Chi, Bassok, Lewis, Reimann, & Glaser, 1989), and when they do, this activity can contribute to learning (i.e., impose a germane cognitive load; see Atkinson, Renkl, & Merrill, 2003). However, students may lack the domain knowledge necessary to do so (see Chi et al., 1989; Renkl, 1997), especially very early in training. By providing process-oriented worked examples that also include the rationale behind the steps, students can use the capacity that is freed up by the reduced extraneous load (through the availability of a worked-out solution) to study the process information, which would impose a germane cognitive load (Van Gog et al., 2004).

However, once students have gained understanding, the process information is likely to become redundant, and continuing to offer it during training might impose an extraneous instead of a germane load and hamper rather than foster learning. Indeed, a
recent study showed that studying only process-oriented worked examples required higher investment of mental effort during training compared to studying product-oriented worked examples, but this led to lower transfer efficiency (combination of transfer performance and effort invested to attain that performance; Van Gog, Paas, & Van Merriënboer, in press). A number of studies have shown that instructional guidance needs to be faded with increasing knowledge/skill development (Atkinson et al., 2003; Kalyuga & Sweller, 2004; Renkl & Atkinson, 2003). As soon as learners have incorporated the information provided by the instructional guidance in a cognitive schema, offering this information in instruction is redundant and no longer contributes to learning; in fact, it might even hamper learning (the well-documented “expertise reversal effect”; Kalyuga, Ayres, Chandler, & Sweller, 2003; Kalyuga, Chandler, Tuovinen, & Sweller, 2001).

In sum, studying process-oriented worked examples would be expected to initially impose a germane cognitive load, but once learners have gained an understanding of the solution procedure, the redundancy of the process information would impose an extraneous load. Hence, at that point, the instructional guidance should be reduced to control cognitive load, by replacing process-oriented worked examples with product-oriented worked examples. This would have the additional benefit that, at this point, learners might both attempt and be capable of self-explaining the solution provided in those product-oriented worked examples, and would have the cognitive capacity to do so. The present experiment investigates the hypothesis that studying a process-product sequence of worked examples is more efficient than studying a product-process sequence, product-oriented worked examples only, or process-oriented worked examples only, in the domain of learning to troubleshoot electrical circuits.

Method

Participants

Participants were 82 fifth-year pre-university education students of two Dutch schools with physics in their curriculum (mean age 16.10 years, SD = .49; 40 male, 42 female). They received € 7.50 for their participation. They had studied the basics of electrical circuits, but had no experience with applying that knowledge to troubleshooting.

Design

A repeated measures design was used, in which participants studied two series of training worked examples, each followed by a series of transfer test problems. The
conditions, to which participants were randomly assigned before the experiment, were: product-product sequence (n = 21), process-process sequence (n = 21), product-process sequence (n = 20), and process-product sequence (n = 20).

Materials

Prior knowledge questionnaire. As a check on randomization, a prior knowledge questionnaire, consisting of seven open-ended questions on troubleshooting and parallel circuits principles was administered, with a maximum total score of 10 points.

Introduction. An introduction to the TINA Pro software, version 6.0 (TINA = Toolkit for Interactive Network Analysis; DesignSoft, Inc., 2002) was provided on paper, together with an “introduction practice circuit” in TINA on which students could try out the functioning of the program as described in that introduction (how to take measurements, repair components, etc.).

Training examples. In total, there were eight training examples, offered in two series of four, consisting of malfunctioning parallel electrical circuit simulations in TINA for which either a worked-out solution (product-oriented) or a worked-out solution plus process information (process-oriented) was provided. Four types of faults occurred in the circuits: a resistor could be open, shorted, or its value could have changed to one higher or lower than indicated in the circuit diagram. Each training task contained one fault, and each fault occurred once in each of the two series, though not in a fixed order. The training series were constructed this way because, had more tasks per series been given, the worked-out solution would likely have become redundant (Atkinson et al., 2003; Kalyuga et al., 2003; Kalyuga et al., 2001). A product-oriented worked example is shown in Appendix A, and the same example as a process-oriented one in Appendix B.

Test problems. In total, there were eight test problems, offered in two series of four, consisting of malfunctioning electrical circuit simulations in TINA that participants had to troubleshoot without any further information available. Each series consisted of two near transfer problems and two far transfer problems. The near transfer problems were similar to the trained ones (parallel circuit and same types of faults), whereas the far transfer problems were dissimilar (different fault or different type of circuit). In each series, one near and one far transfer problem contained one fault, and one near and one far transfer problem contained two faults.

Performance. On pre-printed answer sheets, participants were asked to write down for each test problem which component/s was/were faulty, and to indicate what the fault in
the component/s was by selecting it from a list: the component a) “is open”, b) “is shorted”, c) “has changed value: from … [given in diagram] to ... [calculated based on measurements]”, or d) “I do not know”. They were instructed to fill out the values when they indicated ‘c’. For each correctly diagnosed faulty component 1 point was given, and for a correct diagnosis of the fault in that component another point was given (in case ‘c’ was indicated but the values were wrong, ½ point was given). So, the maximum mean performance score on each series of test problems is 3 (the two problems with one fault have a maximum of 2 points per problem and the two problems with two faults have a maximum of 4 points per problem; this is a maximum of 12 points in total, which divided by four problems gives a maximum mean score of 3 points).
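As a sketch of this scoring rule (the function name and arguments are our own illustrative choices, not part of the actual scoring procedure):

```python
# Illustrative implementation of the scoring rule: 1 point for a correctly
# identified faulty component, 1 point for a correctly diagnosed fault type,
# but only 1/2 point when "changed value" is chosen with the wrong values.
# (Hypothetical helper, not the study's actual scoring code.)
def score_answer(component_correct, fault_correct, values_correct=True):
    points = 1.0 if component_correct else 0.0
    if fault_correct:
        points += 1.0 if values_correct else 0.5
    return points

# A two-fault problem (maximum 4 points): one fault fully diagnosed; for the
# other, component and fault type are right but the calculated value is wrong.
print(score_answer(True, True) + score_answer(True, True, values_correct=False))  # 3.5
```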

Mental effort. Participants were asked to indicate after each training example how much mental effort they invested in studying it, and after each test problem how much mental effort they invested solving it, on a 9-point rating scale ranging from 1 “very, very low effort” to 9 “very, very high effort” (Paas, 1992; Paas, Tuovinen, Tabbers, & Van Gerven, 2003).

Time-on-task. In the Teacher-Supervisor module, TINA logged the time-on-task.

Procedure

The study was run in three sessions (max. 1.5 hours) in a computer room at the participants’ own schools; sessions 2 and 3 took place on the same day, two days after session 1. Per session, participants were equally distributed across conditions (session 1: n = 16, 4 participants per condition; session 2: n = 46, 12 participants in conditions 1 and 2, and 11 in conditions 3 and 4; session 3: n = 20, 5 participants per condition). The paper materials lay beside the PCs, and the practice problem in TINA was already open on the PC screen. Participants were instructed to fill out the prior knowledge questionnaire first, after which they could familiarize themselves with TINA by reading the introduction and trying out functionalities on the “introduction practice circuit”. When all participants had finished the introduction and had no more questions regarding the program, they were allowed to start studying the first series of training examples (either product-oriented or process-oriented worked examples). Participants could display the training and test circuits by selecting them from the “task list” on the right-hand side of the screen. After studying each training example, participants indicated the amount of effort they had invested in studying it on the 9-point rating scale. After the first series of training examples, they completed the first series of test problems, for each problem indicating the faulty component/s and the type of fault/s, as well as the mental
effort invested in solving it, on the answer sheets. Then, they studied the second series of training examples (either product-oriented or process-oriented worked examples), and completed the second series of test problems in the same way. Participants were allowed to use a calculator when solving the test problems, to ensure that simple computational errors would not affect performance.

Data-Analysis

Participants’ mean performance and mental effort scores on the transfer tests were combined into one score using the method described by Paas and Van Merriënboer (1993; see also Paas et al., 2003; Tuovinen & Paas, 2004). The performance (P) and mental effort (E) scores are first standardized, and then the z-scores are entered into the formula:

Efficiency = (zP − zE) / √2

This measure indicates the quality of the cognitive schemata participants have acquired during training: higher performance with lower investment of mental effort is indicative of a better schema.
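The computation can be sketched as follows; this is a minimal illustration with hypothetical function names and example data, not the analysis code used in the study.

```python
# Minimal sketch (hypothetical names and data, not the study's analysis code)
# of the efficiency measure: z-standardize performance (P) and mental effort
# (E) across participants, then take efficiency = (zP - zE) / sqrt(2).
from math import sqrt
from statistics import mean, pstdev

def z_scores(values):
    m, sd = mean(values), pstdev(values)
    return [(v - m) / sd for v in values]

def efficiency(performance, effort):
    zp, ze = z_scores(performance), z_scores(effort)
    return [(p - e) / sqrt(2) for p, e in zip(zp, ze)]

# Three participants: higher performance at lower effort yields high efficiency.
print(efficiency([1.0, 2.0, 3.0], [6.0, 5.0, 4.0]))
```

A participant who performs above the group mean while investing below-average effort thus receives a positive efficiency score, and vice versa.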

Results

One participant (male; process-process condition) was excluded from the analyses because of a large number of missing values in the test answers and mental effort ratings (ca. 70 %). Fifteen participants had a missing mental effort value on one of the training series or one of the test series; those were replaced by their mean mental effort in that series. One participant in the product-product condition failed to fill out or hand in the prior knowledge questionnaire.

An ANOVA on the prior knowledge scores showed no significant differences between the conditions, F (3, 76) < 1, ns, so random assignment to conditions was successful, and differences in prior knowledge between conditions can be ruled out as an alternative explanation for the results reported here. Means and standard deviations of all measures reported here are provided in Table 1. A significance level of .05 is used for all analyses, and Cohen’s f is provided as an estimate of effect size, with f = .10 corresponding to a small effect, f = .25 to a medium effect, and f = .40 to a large effect (Cohen, 1988).


Table 1
Means and standard deviations of prior knowledge (maximum score of 10), time-on-task (in minutes) and mental effort (9-point scale) invested in training and test, test performance (maximum score of 3), and efficiency

                     Product-        Process-        Product-        Process-
                     Product         Process         Process         Product
                     M      SD       M      SD       M      SD       M      SD
Prior knowledge      5.10   .70      5.20   1.08     4.75   1.31     4.75   1.28
Time-on-task
  Training 1         3.35   .50      3.31   .61      3.47   .71      3.49   .64
  Training 2         1.35   .51      1.51   .56      1.94   .77      1.19   .46
  Test 1             5.08   1.20     4.99   1.51     4.83   1.29     4.87   1.45
  Test 2             2.93   .58      3.04   1.09     3.37   1.11     3.15   .95
Mental effort
  Training 1         3.38   1.48     3.84   1.69     2.98   1.22     3.16   1.40
  Training 2         2.95   1.53     3.33   1.83     3.38   1.16     2.99   1.32
  Test 1             5.30   1.25     4.49   1.56     5.56   .98      5.16   1.32
  Test 2             4.55   1.70     4.39   1.93     4.64   1.17     4.73   1.24
Performance
  Test 1             1.70   .59      1.75   .47      1.70   .45      1.87   .44
  Test 2             2.23   .48      1.93   .45      2.28   .56      2.45   .53
Efficiency
  Test 1             -.17   1.22     .34    1.02     -.31   .84      .15    1.10
  Test 2             .02    .99      -.30   1.06     .05    .94      .24    1.10

Transfer Test Performance

A General Linear Model (GLM) Repeated Measures Analysis on transfer performance at the two test moments, with performance on Test 1 and performance on Test 2 as within-subjects variables, and condition as between-subjects variable, showed a significant main effect of test moment, F (1, 77) = 74.84, MSE = .12, p < .001, f = .48, indicating that performance on Test 2 (M = 2.22, SD = .53) was significantly higher than on Test 1 (M = 1.75, SD = .49), as well as a significant interaction effect between test moment and condition, F (3, 77) = 3.25, MSE = .12, p = .026, f = .17. Contrasting the process-product to the process-process condition on the second test moment, shows that the process-product sequence (M = 2.45, SD = .53) has a higher performance on Test 2 than the process-process sequence (M = 1.93, SD = .45), t(77) = 3.28, p = .001 (one-tailed), f = .52. Hence, the interaction effect seems to indicate that after a first training consisting of process-oriented worked examples, an increase in performance from the first to the second test will only occur when the second training consists of product-oriented worked examples. Figure 1 visualizes the main and interaction effects on test scores.


Figure 1 Differences per condition in mean transfer performance scores on Test 1 and Test 2

Mental Effort Invested during Training and Test

On mental effort invested during the two training moments, a GLM Repeated Measures Analysis did not show any main or interaction effects. A similar analysis on mental effort invested during the two transfer test moments showed a significant within-subjects main effect of test moment, F (1, 77) = 74.84, MSE = .12, p < .001, f = .20, indicating that the mean mental effort participants invested in Test 2 (M = 4.57, SD = 1.52) was lower than in Test 1 (M = 5.13, SD = 1.33).

Efficiency on Transfer Test

A GLM Repeated Measures Analysis on efficiency at the two transfer test moments showed a significant interaction effect between the within-subjects factor test moment and the between-subjects factor condition, F(3, 77) = 4.87, MSE = .40, p = .004, f = .19, depicted in Figure 2. On the first test moment, contrasting the process-process and process-product conditions with the product-product and product-process conditions shows that efficiency on the first transfer test was higher for students who had studied process-oriented worked examples (M = .24, SD = 1.05) than for students who had studied product-oriented worked examples (M = -.24, SD = 1.04), t(77) = 2.06, p = .021 (one-tailed), f = .23. On the second test moment, contrasting the process-product with the process-process condition shows that the process-product condition (M = .24, SD = 1.10) had a higher efficiency than the process-process condition (M = -.30, SD = 1.06), t(77) = 1.67, p = .050 (one-tailed), f = .25; it did not, however, differ significantly from the product-product and product-process conditions. Hence, the interaction effect seems to indicate that a first training consisting of process-oriented worked examples results in higher transfer test performance combined with lower investment of effort during the first test, but that efficiency on the second test decreases when the second training also consists of process-oriented worked examples.

Figure 2

Differences per condition in mean efficiency scores on Test 1 and Test 2

Time-on-Task Invested during Training and Test

On time-on-task invested during the two training moments, a GLM Repeated Measures Analysis revealed a significant within-subjects main effect of training moment, F(1, 77) = 383.97, MSE = .38, p < .001, f = 1.37, indicating that the time-on-task invested in Training 2 (M = 1.50, SD = .64) was lower than that in Training 1 (M = 3.41, SD = .61), a significant between-subjects main effect of condition, F(3, 77) = 3.07, MSE = .36, p = .033, f = .35, as well as a marginally significant interaction effect, F(3, 77) = 2.72, MSE = .38, p = .050, f = .23. A post-hoc LSD analysis showed that the product-process condition had a significantly higher time-on-task in Training 2 than the other conditions. Figure 3 visualizes the effects on training time.


The analysis on time-on-task invested during the two test moments showed a significant within-subjects main effect of test moment, F(1, 77) = 149.36, MSE = .90, p < .001, f = .79, indicating that the mean time-on-task invested in Test 2 (M = 3.12, SD = .95) was lower than that in Test 1 (M = 4.94, SD = 1.34).

Figure 3

Differences per condition in mean time-on-task on Training 1 and Training 2

Discussion

This study investigated whether learning to troubleshoot electrical circuits by studying a sequence of process-oriented and product-oriented worked examples would lead to the development of more efficient cognitive schemata than studying a sequence of product-oriented and process-oriented worked examples, or only product-oriented or only process-oriented worked examples. Our hypothesis that studying process-oriented worked examples would initially impose a germane cognitive load, resulting in higher efficiency than studying product-oriented worked examples, that is, in higher performance on transfer test problems combined with lower investment of effort in solving those problems, was confirmed. Also in line with our hypothesis, the process information became redundant, imposing an extraneous load that started to hamper learning: continuing to study process-oriented worked examples led to lower efficiency on the second test than continuing with product-oriented worked examples. However, contrary to the first test, the efficiency of the process-product condition on the second test was not significantly higher than that of the product-product or product-process conditions.

This might be caused by the fact that after first studying process-oriented worked examples and then being confronted with product-oriented worked examples, students should be able to self-explain the now absent rationale, both because of the knowledge they have acquired and because the extraneous load imposed by the process information is reduced. However, not all students may do so unsolicited. Using process-tracing techniques in future studies, such as different types of concurrent or (cued) retrospective verbal reports (Ericsson & Simon, 1993; Van Gog, Paas, Van Merriënboer, & Witte, 2005), may shed light on how students actually respond to the reduced instructional guidance. In addition, it would be interesting to investigate whether, after initial provision of process-oriented worked examples, combining product-oriented worked examples with self-explanation prompts would be more efficient than providing product-oriented examples alone or product-oriented worked examples with prompts, and what learner characteristics mediate this efficiency.

It should be noted that this study used the efficiency measure as it was originally proposed (Paas & Van Merriënboer, 1993), combining test performance with mental effort invested on the test. A large number of studies (see Tuovinen & Paas, 2004, for an overview) have adopted, but also adapted, this formula by combining test performance with mental effort invested in the training. This may provide interesting information for educational practice. For example, when two instructional formats lead to equal test outcomes, but format A requires learners to invest much more effort during the training than format B, then it seems a better choice for teachers to implement format B in their classroom. However, in defining those test outcomes, looking only at performance does not suffice. When drawing conclusions on cognitive effects of instructional formats, the combination of performance and the effort invested in attaining that performance is more informative, since the level of effort problem solving requires is indicative of the quality of the cognitive schemata a learner has constructed. This study also shows that the use of the efficiency formula provides more subtle insight into cognitive effects of different instructional approaches than either mental effort or performance measures alone.
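For reference, the original measure standardizes performance and effort across all learners and combines them as E = (z_performance - z_effort) / sqrt(2); a positive score means relatively high performance attained with relatively low effort. A minimal sketch with hypothetical values (a 0-3 performance score and a 9-point effort rating):

```python
import numpy as np

def efficiency(performance, effort):
    # Standardize each measure across learners (sample SD), then combine:
    # E = (z_performance - z_effort) / sqrt(2).
    zp = (performance - performance.mean()) / performance.std(ddof=1)
    ze = (effort - effort.mean()) / effort.std(ddof=1)
    return (zp - ze) / np.sqrt(2)

# Hypothetical scores for six learners: transfer performance (0-3) and
# invested mental effort (1-9 rating scale); not the study's data.
perf = np.array([2.4, 1.6, 2.1, 1.5, 2.6, 1.8])
effort = np.array([4.0, 6.0, 4.5, 6.5, 3.5, 5.5])
print(np.round(efficiency(perf, effort), 2))
```

By construction the efficiency scores average to zero across the whole sample, so condition means above and below zero can be compared directly.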

To conclude, the fact that continuing to offer process information was found to result in a rather strong redundancy effect (i.e., hampering learning), even though this study was conducted in an ecologically valid setting with learner-controlled time-on-task, underlines the strength of this effect and shows that it should not be ignored in educational/instructional practice. Considering previous work on fading worked-out solution steps when those become redundant (see e.g., Atkinson et al., 2003; Renkl & Atkinson, 2003), an optimal training sequence for novices would likely proceed from studying process-oriented worked examples, via product-oriented worked examples and completion problems with increasingly more blanks to fill in, to solving conventional problems.


Appendix A

A Product-Oriented Worked Example

Note for the reader regarding both Appendices: In the diagram, AM = Ampère measurement point, SW = switch, V = voltage source, R = resistor. The Multimeter and screwdriver are TINA functionalities for measuring and repairing, respectively.

[Circuit diagram: voltage source V1 (9V) and switch SW1 in series with four parallel branches containing R1 (1kΩ), R2 (500Ω), R3 (2kΩ), and R4 (500Ω); current is measured at AM1-AM4 in the branches and at AM5 for the total current.]

The total current should be: It = I1 + I2 + I3 + I4, or:

It = U/R1 + U/R2 + U/R3 + U/R4 = 9V/1kΩ + 9V/500Ω + 9V/2kΩ + 9V/500Ω = 9mA + 18mA + 4.5mA + 18mA = 49.5mA

Hence, you should measure:
AM1 = 9mA
AM2 = 18mA
AM3 = 4.5mA
AM4 = 18mA
AM5 = 49.5mA

Go to T&M Multimeter, and measure the current at AM1, AM2, AM3, AM4, and AM5. You see:
AM1 = 9mA
AM2 = 18mA
AM3 = 9nA
AM4 = 18mA
AM5 = 45mA

The calculation and measurement do not correspond; something is wrong. I3 = 9nA, so R3 is open. Repair R3 using the screwdriver.

Go to T&M Multimeter, and measure the current at AM1, AM2, AM3, AM4, and AM5. You see:
AM1 = 9mA
AM2 = 18mA
AM3 = 4.5mA
AM4 = 18mA
AM5 = 49.5mA

The measures correspond; the circuit now functions correctly.
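The calculation step of this example (Ohm's law per branch, then summing the branch currents) can be sketched in code, using the circuit values from this appendix:

```python
# Expected currents for the appendix's parallel circuit: U = 9 V across
# branches R1 = 1 kOhm, R2 = 500 Ohm, R3 = 2 kOhm, R4 = 500 Ohm.
U = 9.0  # volts
resistors = {"R1": 1000.0, "R2": 500.0, "R3": 2000.0, "R4": 500.0}  # ohms

# I = U / R per branch, converted to milliamperes.
expected_mA = {name: U / R * 1000 for name, R in resistors.items()}
# In a parallel circuit the total current is the sum of the branch currents.
total_mA = sum(expected_mA.values())

print(expected_mA)  # {'R1': 9.0, 'R2': 18.0, 'R3': 4.5, 'R4': 18.0}
print(total_mA)     # 49.5
```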


Appendix B

A Process-Oriented Worked Example

[Circuit diagram: voltage source V1 (9V) and switch SW1 in series with four parallel branches containing R1 (1kΩ), R2 (500Ω), R3 (2kΩ), and R4 (500Ω); current is measured at AM1-AM4 in the branches and at AM5 for the total current.]

1. Using Ohm's law, determine how this circuit should function.
In parallel circuits, the total current (It) equals the sum of the currents in the parallel branches (I1, I2, etc.). Therefore, the total current should be: It = I1 + I2 + I3 + I4, or:
It = U/R1 + U/R2 + U/R3 + U/R4 = 9V/1kΩ + 9V/500Ω + 9V/2kΩ + 9V/500Ω = 9mA + 18mA + 4.5mA + 18mA = 49.5mA
Hence, you should measure: AM1 = 9mA, AM2 = 18mA, AM3 = 4.5mA, AM4 = 18mA, AM5 = 49.5mA.

2. Use the Multimeter to measure how the circuit actually functions.
Go to T&M Multimeter, and measure the current at AM1, AM2, AM3, AM4, and AM5. You see: AM1 = 9mA, AM2 = 18mA, AM3 = 9nA, AM4 = 18mA, AM5 = 45mA.

3. Compare the outcomes of 1 and 2.
The calculation and measurement do not correspond; something is wrong.

4. Determine which component is faulty and what the fault is.
In case of infinitely low current in a parallel branch, the resistance in that branch is infinitely high; very likely that resistor is open (but possibly another component or the wire is open), unless there is infinitely low current in the entire circuit, in which case there is an infinitely high resistance somewhere outside the parallel branches; very likely the battery, switch, or wire outside the branches is open. There is only infinitely low current in one branch (I3 = 9nA), not in the entire circuit, so likely R3 is open.

5. Repair the component.
Repair R3 using the screwdriver.

6. Measure again; there might be more faults.
Go to T&M Multimeter, and measure the current at AM1, AM2, AM3, AM4, and AM5. You see: AM1 = 9mA, AM2 = 18mA, AM3 = 4.5mA, AM4 = 18mA, AM5 = 49.5mA.

7. Do the measures correspond to the calculations at step 1? If yes, the problem is solved; if no, start again at step 4.
The measures correspond; the circuit now functions correctly.
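The seven-step procedure above can be sketched as a measure-compare-repair loop. The measure() and repair() functions below are hypothetical stand-ins for the TINA Multimeter and screwdriver functionalities, with R3 simulated as open:

```python
U = 9.0  # volts
resistors = {"R1": 1000.0, "R2": 500.0, "R3": 2000.0, "R4": 500.0}  # ohms
# Step 1: expected branch currents (mA) from Ohm's law.
expected = {r: U / R * 1000 for r, R in resistors.items()}

faulty = {"R3"}  # simulated fault: R3 is open (near-zero current)

def measure(branch):
    # Steps 2/6: hypothetical Multimeter reading at the branch's AM point.
    return 0.0 if branch in faulty else expected[branch]

def repair(branch):
    # Step 5: hypothetical screwdriver repair.
    faulty.discard(branch)

repaired = []
while True:
    readings = {r: measure(r) for r in resistors}
    # Steps 3/4: compare readings with calculations to locate the fault.
    broken = [r for r in resistors if abs(readings[r] - expected[r]) > 1e-6]
    if not broken:  # Step 7: measures correspond; circuit functions correctly.
        break
    repair(broken[0])
    repaired.append(broken[0])

print(repaired)  # ['R3']
```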


Chapter 8

General Discussion

In this final chapter, implications for research of the studies reported in Parts I and II of this dissertation are presented first. Then the implications for educational practice are discussed. Finally, ideas are presented for bringing Parts I and II together in future research and practice, that is, the possibility of not separating "uncovering" the problem-solving process and "explicating" it in instruction, but of using process-tracing as instruction.

Implications for Research

Part I

Some issues resulting from Chapter 3 that require further attention are, for example, the influence of grain-size of segmentation on the results, and the questions of whether concurrent and cued retrospective reporting fundamentally differ with regard to qualitative results, or with regard to the quantitative and qualitative results obtained with participants at different levels of expertise. The possibility of fabrication in cued retrospective reporting is another issue; although the fact that this method resulted in more 'metacognitive' information than retrospective and concurrent reporting1 may be interesting for practice (see below), from a methodological research perspective it should be established whether this is merely an artifact of reviewing the process. Lastly, it would be desirable to investigate the added value of the eye movements in cued retrospective reporting, by comparing the method used here, based on a record of eye movements and mouse/keyboard operations, to cued retrospective reporting based on a record of mouse/keyboard operations only.

Depending on possible qualitative differences between the methods, or possible differential effectiveness at different expertise levels, the finding that lower expertise students found concurrent reporting a negative experience and preferred cued retrospective reporting (Addendum to Chapter 3) may have implications for when to use which methods in future studies.

A limitation of both the Addendum to Chapter 3 and Chapter 4 is that the number of participants was very small and that a relative measure was used to determine the highest and lowest expertise participants. Hence, replication with larger sample sizes and validation of the performance efficiency measure in future research is desirable.

1 The comparison with concurrent reporting is not explicitly made in Chapter 3, but this can be inferred from the Tables.

Part II

In Chapter 6, two possible explanations were given for the fact that studying process-oriented worked examples did not lead to better learning and transfer than studying product-oriented worked examples: a split-attention effect (Chandler & Sweller, 1992) due to the fact that text and diagram were not integrated in the examples, and increased intrinsic load due to added process information. However, progressing insight led us to conclude that there was an even more likely explanation: the possibility that the process information becomes redundant and starts to hamper learning (as measured by efficiency on the transfer test). Efficiency on the transfer test was not analyzed as such in Chapter 6, but the fact that performance between conditions with and without process information did not differ, whereas the mental effort invested in the test did (being higher for the process information conditions), indicates that efficiency was lower for the groups that received process information. This explanation was tested and confirmed in Chapter 7. Nonetheless, the possible explanation in terms of intrinsic load might hold true for more complex tasks. That is, for tasks higher in intrinsic load, the addition of process information to worked examples might really be problematic, and other options for presenting this information to learners may need to be sought in those cases.

Implications for Practice

The "worked example effect" (see Sweller, 2004), that is, the finding that training consisting of studying worked examples leads to better learning than training consisting of solving equivalent problems, was replicated in Chapter 6. Even though the first findings on this effect date from the 1980s (e.g., Cooper & Sweller, 1987; Sweller & Cooper, 1985), a heavier reliance on worked examples in the early phases of instruction has still not been implemented on a wide scale. As shown here, it is possible to create worked examples for 'whole tasks', and in curricula in which students are increasingly expected to study on their own without the continuous presence of a teacher, these whole-task examples may play an even more important role.

Chapters 6 and 7 show that adding process information to worked examples initially leads to more efficient learning, but offering this information should be avoided once it becomes redundant. Combining this result with previous work on fading worked-out solution steps with progressing training (see Renkl & Atkinson, 2003), an optimal training sequence (at least when "fixed" training is the only option) for novice learners seems to proceed from studying process-oriented worked examples, followed by studying product-oriented worked examples, via completing worked examples with increasingly more blanks, to solving conventional problems.

Future Research and Practice: Process Tracing as a Form of Instruction?

To conclude, let us consider the implications of bringing Parts I and II together: what would be the effects of presenting different types of verbal reports or eye movement recordings, either from individuals with more expertise or from the learner him/herself, as instruction?

The Process of Individuals with More Expertise as Instruction

In the worked examples used here and in other studies, the researcher/instructional designer captures the problem-solving process of a “model” (in our case a teacher) in the examples. It might be interesting to have the problem-solving actions and thought processes reflected in examples exactly as they occur (“cognitive modeling examples”; CMEs), comparable to instructional situations (e.g., workplace/apprenticeship learning) in which students rely for a substantial part on observational learning from more experienced individuals. In “live” situations, observing the actions, possibly (when circumstances allow it) combined with a concurrent report from the model, is the only option.

However, the model's problem solving could be recorded, as well as his/her verbalization of thoughts during problem solving, and the record of performance could be integrated with the audio record of thought processes ("video cognitive modeling examples"; VCMEs). In this case, concurrent reporting is no longer the only option; it is also possible to have the model review the performance record and provide a cued retrospective report that is audio-taped and integrated with the performance record later on (given the little information contained in un-cued retrospective reports, these do not seem a good option for inclusion in VCMEs; see Chapter 3). This might lead to different learning outcomes, given that our study showed that cued retrospective reporting resulted in more metacognitive information than concurrent reporting, and that those methods might differ qualitatively (which should be established through future research, see above).

Furthermore, in CMEs and VCMEs, the model's level of expertise may very well influence learning outcomes. Previous research has on the one hand provided indications that the level of abstraction of expert knowledge may be too high for novices (Bromme, Rambow, & Nückles, 2001), but on the other hand suggested that this abstraction may have learning benefits, leading to better performance on novel tasks (i.e., far transfer performance; Hinds, Patterson, & Pfeffer, 2001).

Finally, because eye movement research has shown that more experienced individuals focus their attention faster and in greater proportion on the relevant elements of a task than less experienced individuals (see Chapter 4), there will likely be a discrepancy between the allocation of attention by the model and by the learner. Hence, guiding the attention of learners in the right direction by showing them an eye movement recording of the model in the cognitive modeling example might help them to better attend to and encode the available information. Findings by Velichkovsky (1995) on cooperative problem solving by expert-novice pairs, using eye movements to demonstrate on which task aspects the partners focused their attention, support this assumption.

The Learner’s Own Process as Instruction

Although, from a research perspective, it should be established whether the finding that the cued retrospective reporting method elicited more metacognitive (i.e., process monitoring) information than retrospective reporting and concurrent reporting (Chapter 3) is merely an artifact of reviewing the process, this finding does suggest an interesting application. Confronting students with a record of their problem solving and asking them to report what they were thinking during the process might be an effective means to prompt reflection on their problem-solving strategies. Such reflection is considered to play a central role in the acquisition of self-regulated learning competence (SRLC; see Van den Boom, Paas, Van Merriënboer, & Van Gog, 2004; Zimmerman, 2002), and stimulating students' SRLC development is an important element of many Dutch curricular innovations.


References

Alexander, P. A. (2003). The development of expertise: The journey from acclimation to proficiency. Educational Researcher, 32, 10-14.

Anderson, J. R., & Fincham, J. M. (1994). Acquisition of procedural skills from examples. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 1322-1340.

Atkinson, R. K., & Catrambone, R. (2000). Subgoal learning and the effect of conceptual vs. computational equations on transfer. In L. R. Gleitman & A. K. Joshi (Eds.), Proceedings of the 22nd annual conference of the Cognitive Science Society (pp. 591-596). Mahwah, NJ: Erlbaum.

Atkinson, R. K., Derry, S. J., Renkl, A., & Wortham, D. (2000). Learning from examples: Instructional principles from the worked examples research. Review of Educational Research, 70, 181-214.

Atkinson, R. K., Renkl, A., & Merrill, M. M. (2003). Transitioning from studying examples to solving problems: Effects of self-explanation prompts and fading worked-out steps. Journal of Educational Psychology, 95, 774-783.

Ayres, P. (in press). Using subjective measures to detect variations of intrinsic load within problems. Learning and Instruction.

Bromme, R., Rambow, R., & Nückles, M. (2001). Expertise and estimating what other people know: The influence of professional experience and type of knowledge. Journal of Experimental Psychology: Applied, 7, 317-330.

Byrnes, J. P. (1996). Cognitive development and learning in instructional contexts. Boston, MA: Allyn & Bacon.

Camp, G., Paas, F., Rikers, R. M. J. P., & Van Merriënboer, J. J. G. (2001). Dynamic problem selection in air traffic control training: A comparison between performance, mental effort and mental efficiency. Computers in Human Behavior, 17, 575-595.

Camps, J. (2003). Concurrent and retrospective verbal reports as tools to better understand the role of attention in second language tasks. International Journal of Applied Linguistics, 13, 201-221.

Carroll, W. M. (1994). Using worked examples as an instructional support in the algebra classroom. Journal of Educational Psychology, 86, 360-367.

Catrambone, R. (1996). Generalizing solution procedures learned from examples. Journal of Experimental Psychology: Learning, Memory and Cognition, 22, 1020-1031.

Catrambone, R. (1998). The subgoal learning model: Creating better examples so that students can solve novel problems. Journal of Experimental Psychology: General, 12, 355-376.

Chandler, P., & Sweller, J. (1991). Cognitive load theory and the format of instruction. Cognition and Instruction, 8, 293-332.

Chandler, P., & Sweller, J. (1992). The split-attention effect as a factor in the design of instruction. British Journal of Educational Psychology, 62, 233-246.


Charness, N., Reingold, E. M., Pomplun, M., & Stampe, D. M. (2001). The perceptual aspect of skilled performance in chess: Evidence from eye movements. Memory and Cognition, 29, 1146-1152.

Chi, M. T. H., Bassok, M., Lewis, M. W., Reimann, P., & Glaser, R. (1989). Self-explanations: How students study and use examples in learning to solve problems. Cognitive Science, 13, 145-182.

Chi, M. T. H., Glaser, R., & Farr, M. J. (1988). The nature of expertise. Hillsdale, NJ: Erlbaum.

Chi, M. T. H., Glaser, R., & Rees, E. (1982). Expertise in problem solving. In R. J. Sternberg (Ed.), Advances in the psychology of human intelligence (Vol. 1, pp. 7-76). Hillsdale, NJ: Erlbaum.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.

Conover, W. J. (1999). Practical nonparametric statistics (3rd ed.). New York: Wiley.

Cooke, N. J. (1994). Varieties of knowledge elicitation techniques. International Journal of Human-Computer Studies, 41, 801-849.

Cooper, G., & Sweller, J. (1987). Effects of schema acquisition and rule automation on mathematical problem-solving transfer. Journal of Educational Psychology, 79, 347-362.

Cooper, G., Tindall-Ford, S., Chandler, P., & Sweller, J. (2001). Learning by imagining. Journal of Experimental Psychology: Applied, 7, 68-82.

Corbalan, G., Kester, L., & Van Merriënboer, J. J. G. (in press). Towards a personalized task selection model with shared instructional control. Instructional Science.

Deakin, J. M., & Cobley, S. (2003). A search for deliberate practice: An examination of the practice environment in figure skating and volleyball. In J. Starkes & K. A. Ericsson (Eds.), Expert performance in sport: Recent advances in research on sport expertise (pp. 115-132). Champaign, IL: Human Kinetics.

DesignSoft, Inc. (2002). TINA Pro (Version 6) [Computer software]. Available from http://www.tina.com

Egan, D. E., & Schwartz, B. J. (1979). Chunking in recall of symbolic drawings. Memory and Cognition, 7, 149-158.

Ericsson, K. A. (2002). Attaining excellence through deliberate practice: Insights from the study of expert performance. In M. Ferrari (Ed.), The pursuit of excellence through education (pp. 21-55). Hillsdale, NJ: Erlbaum.

Ericsson, K. A. (2003). How the expert-performance approach differs from traditional approaches to expertise in sports: In search of a shared theoretical framework for studying expert performance. In J. Starkes & K. A. Ericsson (Eds.), Expert performance in sport: Recent advances in research on sport expertise (pp. 371-401). Champaign, IL: Human Kinetics.

Ericsson, K. A., & Kintsch, W. (1995). Long-term working memory. Psychological Review, 102, 211-245.

Ericsson, K. A., Krampe, R. T., & Tesch-Römer, C. (1993). The role of deliberate practice in the acquisition of expert performance. Psychological Review, 100, 363-406.


Ericsson, K. A., & Lehmann, A. C. (1996). Expert and exceptional performance: Evidence for maximal adaptation to task constraints. Annual Review of Psychology, 47, 273-305.

Ericsson, K. A., & Simon, H. A. (1993). Protocol analysis: Verbal reports as data (Rev. ed.). Cambridge, MA: MIT Press.

Fitts, P. M., & Posner, M. I. (1967). Human performance. Belmont, CA: Brooks/Cole.

Gerjets, P., & Scheiter, K. (2003). Goal configurations and processing strategies as moderators between instructional design and cognitive load: Evidence from hypertext-based instruction. Educational Psychologist, 38, 33-41.

Gerjets, P., Scheiter, K., & Catrambone, R. (2004). Reducing cognitive load and fostering cognitive skill acquisition: Benefits of category-avoiding examples. In: R. Alterman & D. Kirsch (Eds.), Proceedings of the 25th annual conference of the Cognitive Science Society (pp. 450-455). Mahwah, NJ: Erlbaum.

Gerjets, P., Scheiter, K., & Kleinbeck, S. (2004). Instructional examples in hypertext-based learning and problem solving: Comparing transformational and derivational approaches to example design. In: H.M. Niegemann, R. Brünken & D. Leutner (Eds.), Instructional design for multimedia learning (pp. 165-179). Münster: Waxmann.

Gitomer, D. H. (1988). Individual differences in technical troubleshooting. Human Performance, 1, 111-131.

Glaser, R. (1996). Changing the agency for learning: Acquiring expert performance. In K. A. Ericsson (Ed.), The road to excellence: The acquisition of expert performance in the arts and sciences, sports, and games (pp. 303-311). Mahwah, NJ: Erlbaum.

Gobet, F., Lane, P. C. R., Croker, S., Cheng, P. C-H., Jones, G., Oliver, I., & Pine, J. M. (2001). Chunking mechanisms in human learning. Trends in Cognitive Sciences, 5, 236-243.

Gott, S. P., Parker Hall, E., Pokorny, R. A., Dibble, E., & Glaser, R. (1993). A naturalistic study of transfer: Adaptive expertise in technical domains. In D. K. Detterman & R. J. Sternberg (Eds.), Transfer on trial: Intelligence, cognition, and instruction (pp. 258-288). Norwood, NJ: Ablex.

Haider, H., & Frensch, P. A. (1999). Eye movement during skill acquisition: More evidence for the information reduction hypothesis. Journal of Experimental Psychology: Learning, Memory and Cognition, 25, 172-190.

Hansen, J. P. (1991). The use of eye mark recordings to support verbal retrospection in software testing. Acta Psychologica, 76, 31-49.

Hinds, P. J., Patterson, M., & Pfeffer, J. (2001). Bothered by abstraction: The effect of expertise on knowledge transfer and subsequent novice performance. Journal of Applied Psychology, 86, 1232-1243.

Jonassen, D. H., & Hung, W. (2005). Learning to troubleshoot: A new theory-based design architecture. Manuscript submitted for publication.

Kalyuga, S., Ayres, P., Chandler, P., & Sweller, J. (2003). The expertise reversal effect. Educational Psychologist, 38, 23-32.

Kalyuga, S., Chandler, P., & Sweller, J. (1998). Levels of expertise and instructional design. Human Factors, 40, 1-17.


Kalyuga, S., Chandler, P., Tuovinen, J.E., & Sweller, J. (2001). When problem solving is superior to studying worked examples. Journal of Educational Psychology, 93, 579-588.

Kalyuga, S., & Sweller, J. (2004). Measuring knowledge to optimise cognitive load factors during instruction. Journal of Educational Psychology, 96, 558-568.

Kalyuga, S., & Sweller, J. (2005). Rapid dynamic assessment of expertise to improve the efficiency of adaptive e-learning. Educational Technology Research and Development, 53(3), 83-93.

Kester, L., Kirschner, P. A., & Van Merriënboer, J. J. G. (2004). Information presentation and troubleshooting in electrical circuits. International Journal of Science Education, 26, 239-256.

Kester, L., Kirschner, P. A., & Van Merriënboer, J. J. G. (in press). Just-in-time information presentation: Improving learning a troubleshooting skill. Contemporary Educational Psychology.

Kieras, D. E., & Bovair, S. (1984). The role of a mental model in learning to operate a device. Cognitive Science, 8, 255-273.

Kuusela, H., & Paul, P. (2000). A comparison of concurrent and retrospective verbal protocol analysis. American Journal of Psychology, 113, 387-404.

Lankford, C. (2000). GazeTrackerTM: Software designed to facilitate eye movement analysis. Proceedings of the Eye Tracking Research and Applications Symposium (pp. 57-63). New York: ACM Press.

Larkin, J., McDermott, J., Simon, D. P., & Simon, H. A. (1980). Expert and novice performance in solving physics problems. Science, 208, 1335-1342.

Lauwereyns, J., & d’Ydewalle, G. (1996). Knowledge acquisition in poetry criticism: The expert’s eye movements as an information tool. International Journal of Human-Computer Studies, 45, 1-18.

Lovett, M. C. (1992). Learning by problem solving versus by examples: The benefits of generating and receiving information. In Proceedings of the 14th Annual Conference of the Cognitive Science Society (pp. 956-961). Hillsdale, NJ: Erlbaum.

Mayer, R. E., & Wittrock, M. C. (1996). Problem-solving transfer. In D. C. Berliner & R. C. Calfee (Eds.), Handbook of Educational Psychology (pp. 47-62). New York: Macmillan.

Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity to process information. Psychological Review, 63, 81-97.

Newell, A., & Simon, H. A. (1972). Human problem solving. Englewood Cliffs, NJ: Prentice-Hall.

Ohlsson, S., & Rees, E. (1991). The function of conceptual understanding in the learning of arithmetic procedures. Cognition and Instruction, 8, 103-179.

Paas, F. (1992). Training strategies for attaining transfer of problem-solving skill in statistics: A cognitive load approach. Journal of Educational Psychology, 84, 429-434.

Paas, F., Renkl, A., & Sweller, J. (2003). Cognitive load theory and instructional design: Recent developments. Educational Psychologist, 38, 1-4.

Paas, F., Renkl, A., & Sweller, J. (2004). Cognitive load theory: Instructional implications of the interaction between information structures and cognitive architecture. Instructional Science, 32, 1-8.


Paas, F., Tuovinen, J. E., Tabbers, H., & Van Gerven, P. W. M. (2003). Cognitive load measurement as a means to advance cognitive load theory. Educational Psychologist, 38, 63-71.

Paas, F., Tuovinen, J. E., Van Merriënboer, J. J. G., & Darabi, A. (2005). A motivational perspective on the relation between mental effort and performance: Optimizing learners’ involvement in instructional conditions. Educational Technology Research and Development, 53(3), 25-33.

Paas, F., & Van Merriënboer, J. J. G. (1993). The efficiency of instructional conditions: An approach to combine mental-effort and performance measures. Human Factors, 35, 737-743.

Paas, F., & Van Merriënboer, J. J. G. (1994a). Variability of worked examples and transfer of geometrical problem-solving skills: A cognitive load approach. Journal of Educational Psychology, 86, 122-133.

Paas, F., & Van Merriënboer, J. J. G. (1994b). Instructional control of cognitive load in the training of complex cognitive tasks. Educational Psychology Review, 6, 351-372.

Patel, V. L., Arocha, J. F., & Kaufman, D. R. (1994). Diagnostic reasoning and medical expertise. In D. L. Medin (Ed.), Psychology of learning and motivation: Advances in research and theory (pp. 187-252). San Diego, CA: Academic Press.

Patel, V. L., Groen, G. J., & Norman, G. R. (1993). Reasoning and instruction in medical curricula. Cognition and Instruction, 10, 335-378.

Rayner, K. (1998). Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124, 372-422.

Renkl, A. (1997). Learning from worked-out examples: A study on individual differences. Cognitive Science, 21, 1-29.

Renkl, A. (2002). Worked-out examples: Instructional explanations support learning by self-explanations. Learning and Instruction, 12, 529-556.

Renkl, A., & Atkinson, R. K. (2003). Structuring the transition from example study to problem-solving in cognitive skills acquisition: A cognitive load perspective. Educational Psychologist, 38, 15-22.

Richman, H. B., Gobet, F., Staszewski, J. J., & Simon, H. A. (1996). Perceptual and memory processes in the acquisition of expert performance: The EPAM model. In K. A. Ericsson (Ed.), The road to excellence: The acquisition of expert performance in the arts and sciences, sports and games (pp. 167-187). Mahwah, NJ: Erlbaum.

Rikers, R. M. J. P., Schmidt, H. G., & Boshuizen, H. P. A. (2000). Knowledge encapsulation and the intermediate effect. Contemporary Educational Psychology, 25, 150-166.

Rikers, R. M. J. P., Schmidt, H. G., & Boshuizen, H. P. A. (2002). On the constraints of encapsulated knowledge: Clinical case representations by medical experts and subexperts. Cognition and Instruction, 20, 27-45.

Rikers, R. M. J. P., Van Gerven, P. W. M., & Schmidt, H. G. (2004). Cognitive load theory as a tool for expertise development. Instructional Science, 32, 173-182.

Rosenthal, R., Rosnow, R. L., & Rubin, D. B. (2000). Contrasts and effect sizes in behavioral research: A correlational approach. Cambridge: Cambridge University Press.


Russo, J. E. (1979). A software system for the collection of retrospective protocols prompted by eye fixations. Behavior Research Methods & Instrumentation, 11, 177-179.

Russo, J. E., Johnson, E. J., & Stephens, D. L. (1989). The validity of verbal protocols. Memory and Cognition, 17, 759-769.

Salden, R. J. C. M., Paas, F., Broers, N. J., & Van Merriënboer, J. J. G. (2004). Mental effort and performance as determinants for the dynamic selection of learning tasks in air traffic control training. Instructional Science, 32, 153-172.

Salvucci, D. D. (1999). Mapping eye movements to cognitive processes (Doctoral dissertation, Carnegie Mellon University, 1999). Dissertation Abstracts International, 60, 5619.

Schaafstal, A., Schraagen, J. M., & Van Berlo, M. (2000). Cognitive task analysis and innovation of training: The case of structured troubleshooting. Human Factors, 42, 75-86.

Schoenfeld, A. H. (1987). What’s all the fuss about metacognition? In A. H. Schoenfeld (Ed.), Cognitive science and mathematics education. Hillsdale, NJ: Erlbaum.

Schworm, S., & Renkl, A. (2002). Learning by solved example problems: Instructional explanations reduce self-explanation activity. In W. D. Gray & C. D. Schunn (Eds.), Proceedings of the 24th Annual Conference of the Cognitive Science Society (pp. 816-821). Mahwah, NJ: Erlbaum.

Shrout, P. E., & Fleiss, J. L. (1979). Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin, 86, 420-428.

Shute, V., & Towle, B. (2003). Adaptive e-learning. Educational Psychologist, 38, 105-114.

Singley, M. K., & Anderson, J. R. (1989). The transfer of cognitive skill. Cambridge, MA: Harvard University Press.

Stelmach, L. B., Campsall, J. M., & Herdman, C. M. (1997). Attentional and ocular movements. Journal of Experimental Psychology: Human Perception and Performance, 23, 823-844.

Sweller, J. (1988). Cognitive load during problem-solving: Effects on learning. Cognitive Science, 12, 257-285.

Sweller, J. (2004). Instructional design consequences of an analogy between evolution by natural selection and human cognitive architecture. Instructional Science, 32, 9-31.

Sweller, J., Chandler, P., Tierney, P., & Cooper, M. (1990). Cognitive load as a factor in the structuring of technical material. Journal of Experimental Psychology: General, 119, 176-192.

Sweller, J., & Cooper, G. (1985). The use of worked examples as a substitute for problem solving in learning algebra. Cognition and Instruction, 2, 59-89.

Sweller, J., Van Merriënboer, J. J. G., & Paas, F. (1998). Cognitive architecture and instructional design. Educational Psychology Review, 10, 251-295.

Tarmizi, R., & Sweller, J. (1988). Guidance during mathematical problem solving. Journal of Educational Psychology, 80, 424-436.

Taylor, K. L., & Dionne, J. P. (2000). Accessing problem-solving strategy knowledge: The complementary use of concurrent verbal protocols and retrospective debriefing. Journal of Educational Psychology, 92, 413-425.


Tuovinen, J. E., & Paas, F. (2004). Exploring multidimensional approaches to the efficiency of instructional conditions. Instructional Science, 32, 133-152.

Underwood, G., Chapman, P., Brocklehurst, N., Underwood, J., & Crundall, D. (2003). Visual attention while driving: Sequences of eye fixations made by experienced and novice drivers. Ergonomics, 46, 629-646.

Underwood, G., Jebbett, L., & Roberts, K. (2004). Inspecting pictures for information to verify a sentence: Eye movements in general encoding and in focused search. The Quarterly Journal of Experimental Psychology, 57, 165-182.

Van den Boom, G., Paas, F., Van Merriënboer, J. J. G., & Van Gog, T. (2004). Reflection prompts and tutor feedback in a web-based learning environment: Effects on students’ self-regulated learning competence. Computers in Human Behavior, 20, 551-567.

Van den Haak, M., De Jong, M. D. T., & Schellens, P. J. (2003). Retrospective versus concurrent think-aloud protocols: Testing the usability of an online library catalogue. Behaviour and Information Technology, 22, 339-351.

Van Gerven, P. W. M., Paas, F., Van Merriënboer, J. J. G., & Schmidt, H. (2004). Memory load and the cognitive pupillary response in aging. Psychophysiology, 41, 167-174.

Van Gog, T., Ericsson, K. A., Rikers, R. M. J. P., & Paas, F. (2005). Instructional design for advanced learners: Establishing connections between the theoretical frameworks of cognitive load and deliberate practice. Educational Technology Research and Development, 53(3), 73-81.

Van Gog, T., Paas, F., & Van Merriënboer, J. J. G. (2004). Process-oriented worked examples: Improving transfer performance through enhanced understanding. Instructional Science, 32, 83-98.

Van Gog, T., Paas, F., & Van Merriënboer, J. J. G. (2005). Uncovering expertise-related differences in troubleshooting performance: Combining eye movement and concurrent verbal protocol data. Applied Cognitive Psychology, 19, 205-221.

Van Gog, T., Paas, F., & Van Merriënboer, J. J. G. (in press). Effects of process-oriented worked examples on troubleshooting transfer performance. Learning and Instruction.

Van Gog, T., Paas, F., Van Merriënboer, J. J. G., & Witte, P. (2005). Uncovering the problem-solving process: Cued retrospective reporting versus concurrent and retrospective reporting. Journal of Experimental Psychology: Applied, 11, 237-244.

Van Merriënboer, J. J. G. (1997). Training complex cognitive skills: A four-component instructional design model for technical training. Englewood Cliffs, NJ: Educational Technology Publications.

Van Merriënboer, J. J. G., & Krammer, H. P. M. (1990). The “completion strategy” in programming instruction: Theoretical and empirical support. In S. Dijkstra, B. H. A. M. Van Hout-Wolters, & P. C. Van der Sijde (Eds.), Research on instruction: Design and effects (pp. 45-61). Englewood Cliffs, NJ: Educational Technology Publications.

Van Merriënboer, J. J. G., Schuurman, J. G., De Croock, M. B. M., & Paas, F. (2002). Redirecting learners’ attention during training: Effects on cognitive load, transfer test performance and training efficiency. Learning and Instruction, 12, 11-37.


Van Merriënboer, J. J. G., & Sweller, J. (2005). Cognitive load theory and complex learning: Recent developments and future directions. Educational Psychology Review, 17, 147-177.

Van Someren, M. W., Barnard, Y. F., & Sandberg, J. A. C. (1994). The think aloud method: A practical guide to modeling cognitive processes. London: Academic Press.

Velichkovsky, B. M. (1995). Communicating attention: Gaze position transfer in cooperative problem solving. Pragmatics and Cognition, 3, 199-224.

Ward, M., & Sweller, J. (1990). Structuring effective worked examples. Cognition and Instruction, 7, 1-39.

Yeo, G. B., & Neal, A. (2004). A multilevel analysis of effort, practice and performance: Effects of ability, conscientiousness, and goal orientation. Journal of Applied Psychology, 89, 231-247.

Young, R. M. (1983). Surrogates and mappings: Two kinds of conceptual models for interactive devices. In D. Gentner & A. L. Stevens (Eds.), Mental models (pp. 35-52). Hillsdale, NJ: Erlbaum.

Zhu, X., & Simon, H. A. (1987). Learning mathematics from examples and by doing. Cognition and Instruction, 4, 137-166.

Zimmerman, B. J. (2002). Achieving academic excellence: A self-regulatory perspective. In M. Ferrari (Ed.), The pursuit of excellence through education (pp. 85-110). Hillsdale, NJ: Erlbaum.


Summary

Research contrasting learning from problem solving with learning from worked examples has shown that the latter is often more effective and efficient in the initial phases of skill acquisition. Cognitive Load Theory (CLT; Sweller, 1988; Sweller, Van Merriënboer, & Paas, 1998) explains this in terms of a reduction of ineffective (extraneous) cognitive load: learners do not have to invest cognitive resources in searching for a solution, but can devote all available resources to studying the solution. To further increase the effectiveness of worked examples, learners should be stimulated to use the capacity that is freed up through the reduction in ineffective load for processes that contribute to learning, that is, processes that impose an effective (germane) load. This dissertation investigates whether worked examples that include not only the solution steps, but also information on the solution process, would induce such a germane load. The first part of the dissertation focuses on process-tracing techniques as a means to uncover such problem-solving process information and expertise-related differences in performance. The second part centers on the effects of process-oriented worked examples on learning.

Part I: Uncovering the Problem-Solving Process

Chapter 2 discusses the connections between the theoretical frameworks of CLT and deliberate practice, and the resulting interesting directions for CLT-based research. CLT research has been very successful in identifying effective and efficient instructional formats for novice learners. With the identification of the expertise reversal effect (see Kalyuga, Ayres, Chandler, & Sweller, 2003), however, it became clear that formats that are effective for enhancing novices’ learning might not necessarily be so for advanced learners.

Expert performance research, and especially research on deliberate practice, provides insight into the requirements for developing excellence. Deliberate practice activities are defined as practice activities that are at an appropriate, challenging level of difficulty and enable successive refinement by allowing for repetition, giving room to make and correct errors, and providing informative feedback to the learner (Ericsson, Krampe, & Tesch-Römer, 1993; Ericsson & Lehmann, 1996). Hence, deliberate practice requires the learner’s full concentration and is effortful to maintain for extended periods. This bears resemblance to the concept of germane cognitive load. Positive effects on learning from the germane activities that the instructional design tries to engage students in will only occur if learners are actually motivated to invest cognitive effort in those activities. Thus, one of the interesting directions for CLT research would be to study the relationship between cognitive load, learning outcomes, and learner motivation, as well as the other requirements of deliberate practice tasks: feedback and the possibility to make and correct errors.

Another aspect of expert performance and deliberate practice research that is interesting for CLT research is the kind of methods and techniques used. To design effective instruction for learners beyond the novice phase, more detailed information is required on how schemata actually develop (construction, elaboration, automation) with increasing expertise. This information is necessary to predict the cognitive load a certain instructional format might impose. It may require measurement techniques other than the performance and mental effort scores on training and transfer tasks that are usually used in CLT research to draw conclusions about the cognitive load and learning effects of instruction, because those measures have a rather large grain size. The techniques used in expert performance research to study, for example, memory (development) and (the microstructure of) cognitive processes at different levels of expertise might be interesting in this regard.

Those techniques are referred to as process-tracing techniques (Cooke, 1994), and are studied in Chapters 3 and 4. The data reported in those chapters come from one within-subjects experiment with four conditions: concurrent reporting, retrospective reporting, cued retrospective reporting, and concurrent reporting with eye tracking. Participants worked on two computer-simulated electrical circuits troubleshooting tasks under each reporting condition (eight tasks in total).

Chapter 3 focuses on the differences in problem-solving process information (‘action’, ‘why’, i.e., use of domain principles, ‘how’, i.e., use of strategies/heuristics, and ‘metacognitive’) elicited with concurrent, retrospective and cued retrospective reporting. Results from previous research (e.g., Kuusela & Paul, 2000; Taylor & Dionne, 2000) suggested that concurrent reporting would lead to more information on actions being reported, whereas retrospective reporting would lead to more “references to strategies that control the problem solving process” and “information such as the conditions that elicited a particular response” (cf. our categories of ‘why’, ‘how’, and ‘metacognitive’ information; Taylor & Dionne, 2000, p. 414). Because all this information is relevant for instructional designers (e.g., for the development of process-oriented worked examples), a method is needed that is able to combine those results.

Cued retrospective reporting, based on a record of eye movements and mouse/keyboard operations replayed superimposed on the original task, was hypothesized to be able to do this. Since the task was computer-based and had a large visual component (diagram inspection), the eye movements and mouse/keyboard operations in the cue could trigger memory of thoughts related to cognitive and physical actions, respectively, while maintaining the retrospective nature of the report. Protocols were segmented based on utterances and coded.

Results showed that, in line with our hypothesis, both concurrent and cued retrospective reporting resulted in a higher number of codes on ‘action’ information. Unexpectedly, however, retrospective reporting did not result in a higher number of codes on ‘why’, ‘how’, and ‘metacognitive’ information than concurrent reporting; in fact, the effect was reversed for ‘why’ and ‘how’ information. Given this unexpected finding, we analyzed whether cued retrospective reporting also resulted in more ‘why’, ‘how’, and ‘metacognitive’ information than retrospective reporting, which was found to be the case for the last two categories. It is concluded that possible qualitative differences in the content captured by concurrent and cued retrospective reporting should be studied, but that cued retrospective reporting seems a promising method that should be further investigated and refined.
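Purely as an illustration of this kind of analysis, coded protocol segments can be tallied per reporting condition and category. The category labels follow the chapter; the condition names, helper function, and data below are hypothetical:

```python
from collections import Counter

# Hypothetical coded protocol segments: one (condition, category) pair per
# segmented utterance. The data are invented for illustration only.
coded_segments = [
    ("concurrent", "action"), ("concurrent", "action"), ("concurrent", "how"),
    ("retrospective", "why"), ("retrospective", "metacognitive"),
    ("cued_retrospective", "action"), ("cued_retrospective", "how"),
]

def tally_codes(segments):
    """Count how often each category was coded under each reporting condition."""
    counts = {}
    for condition, category in segments:
        counts.setdefault(condition, Counter())[category] += 1
    return counts

tally = tally_codes(coded_segments)
```

Such per-condition frequency tables are the basis for the between-method comparisons reported above.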

In the addendum to this chapter, it is explored whether expertise influences the way in which participants experience the methods. Expertise was computed by means of standardized performance, mental effort, and time on task scores on all experimental tasks, using a formula originally proposed to study the efficiency of instructional conditions (Paas & Van Merriënboer, 1993; Tuovinen & Paas, 2004). Performance efficiency can be seen as an indicator of expertise, since an individual with high expertise will be able to attain a higher performance score combined with lower investment of time-on-task and mental effort than a person with less expertise.
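Written out, the efficiency measure referred to above combines standardized (z) scores. A sketch based on the cited sources: the three-factor variant (Tuovinen & Paas, 2004) adds time on task to the original two-factor measure (Paas & Van Merriënboer, 1993), with signs chosen so that high performance attained with low effort and little time yields high efficiency:

```latex
E = \frac{Z_{\text{performance}} - Z_{\text{effort}} - Z_{\text{time}}}{\sqrt{3}}
```

The original two-factor measure omits the time term and divides by \(\sqrt{2}\) instead.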

Open-ended questions were asked after the experiment. Participants’ answers were recorded and analyzed qualitatively for indications of positive or negative experience, preference for a method, and factors that mediate experience and preference. The answers of the five participants with the highest expertise and the five participants with the lowest expertise were compared. Lower expertise participants seemed to experience concurrent reporting negatively and to prefer cued retrospective reporting. Across both groups, time on task and the cue were mentioned as factors mediating method preference.

In Chapter 4, the analysis of data from the five highest and five lowest expertise participants (see above) on one of the tasks in the concurrent reporting with eye tracking condition is reported. Those data were analyzed to explore the value of eye movement data, in relation to concurrent verbal protocols, for uncovering relatively small expertise-related differences in electrical circuits troubleshooting performance. The first three phases of the problem-solving process are considered: “problem orientation”, “problem formulation and action decision”, and “action evaluation and next action decision”. Higher expertise participants spent relatively more time on the first and third phases. In the first phase, higher expertise participants had a shorter mean fixation duration (an indicator of processing demands), fixated proportionally more on a major fault-related component, and showed a trend towards a higher number of gaze switches between two fault-related components. Within-subjects analyses showed that only the mean fixation duration of the higher expertise participants differed significantly, being lower during the first phase (orientation) and the first half of the third phase (evaluation). The concurrent verbal protocol data were qualitatively analyzed and related to the eye movement data.
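The fixation-based measures used above can be computed directly from a fixation sequence. A minimal sketch, where the area-of-interest names and fixation data are invented (a real analysis would operate on eye-tracker output):

```python
# Each fixation: (area_of_interest, duration_in_ms). Data invented for illustration.
fixations = [
    ("component_A", 180), ("component_B", 240), ("component_A", 150),
    ("background", 300), ("component_B", 210), ("component_A", 120),
]

def mean_fixation_duration(fixs):
    """Mean fixation duration, an indicator of processing demands."""
    return sum(d for _, d in fixs) / len(fixs)

def proportion_on(fixs, aoi):
    """Proportion of fixations landing on a given area of interest."""
    return sum(1 for a, _ in fixs if a == aoi) / len(fixs)

def gaze_switches(fixs, aoi_a, aoi_b):
    """Transitions between two areas of interest, ignoring fixations elsewhere."""
    relevant = [a for a, _ in fixs if a in (aoi_a, aoi_b)]
    return sum(1 for prev, nxt in zip(relevant, relevant[1:]) if prev != nxt)
```

Comparing these statistics per problem-solving phase and per expertise group yields the kind of differences reported in the chapter.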

Concurrent protocols seemed to provide more information on the general content of cognitive processes (what a person is actually thinking); eye movement data, however, seemed to provide very specific content information that is not necessarily captured in a verbal protocol, such as differential allocation of attention to components. Furthermore, the processing demands reflected in eye movement data provide interesting and detailed information, both within and between subjects, that is hard to infer from concurrent protocols. It is therefore concluded that combining those two data sources has added value for expertise researchers and instructional designers interested in detailed insight into cognitive processes during problem solving.

Part II: Process-Oriented Worked Examples

This part starts with a theoretical chapter (Chapter 5) on the assumed effects of process-oriented worked examples. It is argued in this chapter that the worked examples used in previous research can be called product-oriented, because they provide only the solution steps and not the rationale behind those steps. Strategic knowledge, such as heuristics and systematic approaches to problem solving, and the domain principles used to select the steps are not made explicit in those examples. However, understanding the rationale behind those steps is considered a crucial factor for transfer, especially far transfer. In contrast to near transfer tasks, which have structural features comparable to those of the training tasks but different surface features, far transfer tasks have different structural features and therefore do not allow learners to merely apply a memorized procedure. Flexibly using those parts of a learned procedure that are relevant for a new (far transfer) problem requires that the learner understands the rationale behind (subgroups of) solution steps (cf. Catrambone, 1996, 1998); that is, that a learner “not only knows the procedural steps for problem-solving tasks, but also understands when to deploy them and why they work” (Gott, Parker Hall, Pokorny, Dibble, & Glaser, 1993, p. 260).

It is therefore assumed that the use of process-oriented worked examples that do make this knowledge explicit would induce a germane, or effective cognitive load. Cognitive capacity that is freed-up through the reduction of extraneous load can be used to study the additional process information, which is expected to increase understanding and hence transfer of learning (i.e., studying process information would induce higher cognitive load, but also lead to better learning).

In the experiment described in Chapter 6, this assumption was tested in the domain of electrical circuits troubleshooting, using a 2 × 2 factorial design with the factors ‘solution worked-out’ (no/yes) and ‘process information given’ (no/yes). The resulting training conditions were: 1) solving conventional problems, 2) solving conventional problems with process information available, 3) studying product-oriented worked examples, and 4) studying process-oriented worked examples. The training consisted of six parallel electrical circuits troubleshooting tasks, three circuits containing one fault and three circuits containing two faults. The test consisted of six conventional problems. Three were near transfer problems (different surface features but similar structural features; i.e., parallel circuits with the same types of faults as the training tasks contained), and three were far transfer problems (different structural features; i.e., a different type of fault or a different type of circuit than the training tasks contained). After each training task and after each test task, participants indicated the amount of mental effort they invested on a 9-point rating scale.

In line with our expectation, it was found that studying worked examples required less investment of mental effort during training, but led to higher near and far transfer test performance than solving conventional problems. Furthermore, studying process information indeed led to higher investment of effort during the training; however, for the process-oriented worked examples group it did not increase transfer performance compared to the product-oriented worked examples group.

In Chapter 7, a possible explanation was investigated for the finding that the mental effort invested in studying process-oriented worked examples during training was higher, whereas performance and efficiency (a combination of test performance and mental effort invested in the test) were not. It was hypothesized that process-oriented worked examples may initially have been more efficient than product-oriented worked examples, but that the process information becomes redundant at some point, starts to hamper learning, and should therefore be removed at that point. Hence, a process-product worked examples sequence was assumed to be more efficient than a process-only, product-only, or product-process sequence. Participants studied two series of four training worked examples (parallel circuits with one fault, cf. Chapter 6), each followed by a series of four transfer test problems (two near and two far transfer problems, cf. Chapter 6). The conditions were: product-product sequence, process-process sequence, product-process sequence, and process-product sequence. After each training task and after each test task, participants indicated the amount of mental effort they invested on a 9-point rating scale.
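The efficiency measure combining test performance and mental effort can be sketched as follows, using the two-factor formula (Paas & Van Merriënboer, 1993). The learner scores and the group used for standardization are invented for illustration:

```python
from statistics import mean, stdev

def z_scores(xs):
    """Standardize raw scores against the group mean and standard deviation."""
    m, s = mean(xs), stdev(xs)
    return [(x - m) / s for x in xs]

def efficiency(performance, effort):
    """Two-factor instructional efficiency: E = (z_P - z_E) / sqrt(2).
    High performance attained with low mental effort yields high efficiency."""
    return [(zp - ze) / 2 ** 0.5
            for zp, ze in zip(z_scores(performance), z_scores(effort))]

# Invented test performance scores and 9-point effort ratings for three learners.
E = efficiency([8, 6, 4], [2, 5, 8])
```

A learner who scores above the group mean while reporting below-average effort obtains a positive efficiency value; the reverse pattern yields a negative value.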

In line with our hypothesis, having studied process-oriented worked examples resulted in higher efficiency on the first transfer test than having studied product-oriented worked examples. Also in line with our hypothesis, the process information became redundant: continuing to study process-oriented worked examples led to lower efficiency on the second test than continuing with product-oriented worked examples. However, contrary to the first test, the efficiency of the process-product condition on the second test was not significantly higher than that of the product-product or product-process conditions. It can be concluded that when process information is offered in worked examples, this should be done at the beginning of training, and the information should be removed once it becomes redundant. Combined with previous work on fading worked-out solution steps when those become redundant (see Renkl & Atkinson, 2003), our results suggest that an optimal training sequence for novices would proceed from studying process-oriented worked examples, via product-oriented worked examples and completion problems with increasingly many blanks to fill in, to solving conventional problems.


Samenvatting

Research comparing learning from problem solving with learning from worked examples has shown that the latter is often more effective and efficient, at least for novice learners. Cognitive Load Theory (CLT; Sweller, 1988; Sweller, Van Merriënboer, & Paas, 1998) explains this in terms of reduced ineffective cognitive load: learners do not have to invest cognitive capacity in searching for a solution, but can direct all their attention to studying useful solution steps. To further increase the effectiveness of worked examples, learners can be stimulated to invest the “room” freed up by the reduction of ineffective load in processes that contribute to learning, that is, processes that impose an effective load. This dissertation investigates whether not only showing the solution in worked examples, but additionally providing information about the solution process, can be such a source of effective load. The first part of the dissertation addresses so-called process-tracing techniques, which can be used to gather information about the problem-solving process and to study performance differences that result from differences in expertise. The second part concerns the effects of process-oriented worked examples on learning.

Part I: Uncovering the Problem-Solving Process

Chapter 2 discusses the connections between the theoretical frameworks of CLT and deliberate practice, and the resulting promising directions for further CLT-based research. CLT research has made an important contribution to identifying instructional formats that are effective and efficient for novice learners. With the discovery of the expertise reversal effect (see Kalyuga, Ayres, Chandler, & Sweller, 2003), however, it became clear that instructional formats that stimulate novices’ learning are not necessarily also effective for advanced learners.

Research on expert behavior, and especially on deliberate practice, has yielded important insights into the requirements for developing expert performance. Activities that qualify as deliberate practice are at a challenging level of difficulty, enable successive refinement of performance through repetition, give the learner room to make and to correct errors, and provide informative feedback to the learner (Ericsson, Krampe, & Tesch-Römer, 1993; Ericsson & Lehmann, 1996). These activities therefore demand full concentration, and sustaining them for extended periods takes considerable effort. This is comparable to the notion of effective cognitive load. Positive effects on learning from the activities stimulated by the instructional measures only occur if learners actually invest effort in these activities. One of the interesting directions for CLT research is therefore to investigate the relationship between cognitive load, learning outcomes, and learner motivation, as well as the other requirements of deliberate practice: the role of feedback and the possibility to make and to correct errors.

Another interesting aspect of research on expert performance and deliberate practice concerns the methods and techniques used. To design effective instruction for more advanced learners, more detailed insight is needed into how cognitive schemata actually develop (construction, elaboration, automation) with increasing expertise. This insight is needed to predict the effects of particular instructional formats on cognitive load. This will probably require techniques other than measuring performance and mental effort on training and test tasks, because these measures, commonly used in CLT research to draw conclusions about the effects of instructional formats on cognitive load and learning outcomes, are rather coarse. The techniques used in expert performance research to study, for example, memory (development) and (the microstructure of) cognitive processes at different levels of expertise may be valuable in this respect.

These techniques are also known as process tracing techniques (Cooke, 1994) and are the subject of Chapters 3 and 4. The data reported in these chapters come from a single within-subjects experiment with four conditions: thinking aloud during the task, retrospective reporting, cued retrospective reporting, and thinking aloud during the task with eye movement registration. In each condition, participants worked on troubleshooting two electrical circuits in a computer simulation program (eight tasks in total).

Chapter 3 focuses on the differences in problem-solving process information ('action'; 'why', i.e., use of domain principles; 'how', i.e., use of strategies; and 'metacognitive') obtained through thinking aloud, retrospective reporting, and cued retrospective reporting. Results of earlier research (e.g., Kuusela & Paul, 2000; Taylor & Dionne, 2000) suggested that thinking aloud would yield more information about actions, whereas retrospective reporting would yield more references to strategies that drive the problem-solving process and more information about the conditions that resulted in a particular action (cf. our 'why', 'how', and 'metacognitive' categories; Taylor & Dionne, 2000, p. 414). Because all of this information is relevant for instructional designers (e.g., for developing process-oriented worked examples), a method was sought that would combine these results.


Cued retrospective reporting based on a recording of eye movements and mouse/keyboard operations, replayed over the original task, seemed suitable for this purpose. Since the tasks were performed on the computer and had a large visual component (inspection of the circuit diagram), replaying the eye movements and mouse/keyboard operations might cue recall of thoughts about cognitive and physical actions, respectively, while preserving the retrospective nature of the reports. The verbal protocols were segmented into utterances (sentences or text fragments separated by a clear pause before and after) and coded.

In line with our hypothesis, the analyses showed that both thinking aloud and cued retrospective reporting yielded more 'action' information than retrospective reporting. Unexpectedly, however, retrospective reporting did not yield more 'why', 'how', and 'metacognitive' information than thinking aloud; for 'why' and 'how' information it was even exactly the other way around. Because of this unexpected finding, we examined whether cued retrospective reporting would then also yield more 'why', 'how', and 'metacognitive' information than retrospective reporting, which indeed proved to be the case for the latter two categories. It is concluded that research into possible qualitative differences in the content of protocols obtained through thinking aloud and cued retrospective reporting is still needed, but that the latter method seems promising and certainly merits further investigation and refinement.

The Addendum to this chapter explores whether expertise influences how participants experience the different reporting methods. Expertise was determined on the basis of standardized performance, mental effort, and time scores on all experimental tasks, using a formula originally developed to measure the efficiency of instructional conditions (Paas & Van Merriënboer, 1993; Tuovinen & Paas, 2004). Performance efficiency can be regarded as an indicator of expertise, because someone with more expertise will attain a higher performance score with less mental effort and in less time than someone with less expertise.
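The three-dimensional variant of this efficiency measure combines standardized (z) scores so that high performance attained with low effort in little time yields a high value. A minimal sketch of that computation follows; the function and variable names are illustrative, not taken from the dissertation:

```python
import math
import statistics

def z_scores(values):
    """Standardize raw scores within the sample (sample standard deviation)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def efficiency_3d(z_perf, z_effort, z_time):
    """Three-dimensional efficiency (cf. Tuovinen & Paas, 2004):
    E = (zPerformance - zEffort - zTime) / sqrt(3)."""
    return (z_perf - z_effort - z_time) / math.sqrt(3)

def participant_efficiencies(perf, effort, time):
    """Efficiency per participant, given raw scores across the sample."""
    return [efficiency_3d(p, m, t)
            for p, m, t in zip(z_scores(perf), z_scores(effort), z_scores(time))]
```

For example, a participant who scores above the sample mean on performance while reporting below-average effort and time receives a positive efficiency value, and vice versa.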

At the end of the experiment, open questions were asked. The answers were recorded and qualitatively analyzed for statements about positive/negative experiences, preference for a method, and factors playing a role in experience or preference. The answers of the five participants with the highest expertise and the five participants with the lowest expertise were compared. Participants with lower expertise seemed to find thinking aloud an unpleasant experience and preferred cued retrospective reporting. Both groups mentioned the time spent on the task and the cue as important factors in how they experienced the methods.

Chapter 4 reports data from the five participants with the highest expertise and the five participants with the lowest expertise (see above) on one of the tasks in the thinking-aloud-with-eye-movement-registration condition. These data were analyzed to explore the usefulness of eye movement data, in relation to verbal protocols, for studying rather small, expertise-related differences in troubleshooting performance. The first three phases of the problem-solving process were examined: 'problem orientation', 'problem formulation and action selection', and 'action evaluation and selection of the next action'. Participants with more expertise spent relatively more time on phases 1 and 3. In the first phase, the mean fixation duration (an indicator of the processing demands of cognitive processes) of participants with more expertise was shorter, they fixated proportionally more on a component related to a major fault, and they tended more to switch their gaze back and forth between two other fault-related components. Within-subjects analyses showed significant differences only in the mean fixation duration of the participants with more expertise, which was lower in the orientation phase and in the first half of the third phase, 'evaluation'. The think-aloud protocols were qualitatively analyzed and related to the eye movement data.
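The three eye-movement measures used here — mean fixation duration, proportion of fixations on a component, and gaze switches between two components — can be computed from a list of fixations, each tagged with a duration and an area of interest (AOI). A minimal sketch, assuming a simple dictionary representation that is my own rather than the dissertation's:

```python
def mean_fixation_duration(fixations):
    """Mean fixation duration (ms): an indicator of processing demands."""
    return sum(f["duration"] for f in fixations) / len(fixations)

def proportion_on(fixations, aoi):
    """Proportion of all fixations landing on a given AOI (e.g., a circuit component)."""
    return sum(1 for f in fixations if f["aoi"] == aoi) / len(fixations)

def transitions_between(fixations, aoi_a, aoi_b):
    """Count gaze switches between two AOIs, ignoring fixations elsewhere."""
    sequence = [f["aoi"] for f in fixations if f["aoi"] in (aoi_a, aoi_b)]
    return sum(1 for prev, cur in zip(sequence, sequence[1:]) if prev != cur)
```

Comparing such measures between expertise groups, or between phases within a group, is how the chapter relates gaze behavior to the verbal protocols.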

Think-aloud protocols seemed to provide more information about the generic content of cognitive processes; eye movement data, however, seemed to provide more specific content information that cannot necessarily be inferred from a verbal protocol, such as the distribution of attention across different components. Moreover, eye movement data yield interesting and detailed information about the processing demands of cognitive processes, both when comparing groups and when examining within-group variation across phases. Such information can hardly, if at all, be inferred from verbal protocols. It is therefore concluded that the combination of these two methods has added value for expertise researchers and instructional designers who need a detailed picture of cognitive processes during problem solving.


Part II: Process-Oriented Worked Examples

This part starts with a theoretical chapter on the hypothesized effects of process-oriented worked examples (Chapter 5). It is argued that the worked examples used in earlier research were mainly product-oriented, because they merely showed the solution steps and not the rationale behind them. Strategic knowledge, such as heuristics or a systematic approach to problems, and knowledge of the domain principles used in selecting the solution steps, are not made explicit in these examples. Understanding the rationale behind solution steps, however, is regarded as a crucial factor for achieving transfer of learned skills, especially far transfer. In contrast to near transfer problems, which share the structural features of the learning tasks but have different surface features, far transfer problems also have different structural features and therefore cannot be solved by simply applying the learned procedure. To use flexibly those parts of the procedure that are relevant to the new (far transfer) problem, the learner needs to understand the rationale behind (subgroups of) steps (cf. Catrambone, 1996, 1998). As Gott, Parker-Hall, Pokorny, Dibble, and Glaser (1993, p. 260) put it, a learner must not only know the procedural steps for solving a problem, but also understand when they can be used and why they work.

For these reasons, it can be assumed that using process-oriented worked examples, which do make this knowledge explicit, imposes an effective cognitive load. The cognitive capacity freed up by reducing ineffective load can be used to study the added process information, which may enhance understanding of the procedure and hence transfer to new problem situations. That is, studying process information demands more mental effort, but also leads to better learning.

The experiment described in Chapter 6 tests this assumption in the domain of troubleshooting electrical circuits, using a 2 x 2 factorial design with the factors 'solution worked out' (no/yes) and 'process information provided' (no/yes). The training conditions were thus: 1) solving conventional problems, 2) solving conventional problems with process information provided, 3) studying product-oriented worked examples, and 4) studying process-oriented worked examples. The training consisted of six troubleshooting tasks: three parallel circuits with one fault and three parallel circuits with two faults. The test consisted of six conventional problems: three near transfer problems (same structural features, i.e., parallel circuits and faults encountered during training) and three far transfer problems (different structural features, i.e., a different type of circuit or a fault not encountered during training). After each training task and each test task, participants indicated on a 9-point scale how much mental effort studying or solving it had cost them.

As expected, studying worked examples demanded less mental effort during training, yet led to better performance on the near and far transfer problems than solving problems. Moreover, studying process information did indeed demand more mental effort during training; however, this did not lead to better performance on the transfer problems for the process-oriented worked examples condition.

Chapter 7 investigated a possible explanation for the finding that studying process-oriented examples did demand more mental effort during training, but did not lead to better transfer performance and efficiency (a combination of performance and mental effort on the test). The hypothesis was that process-oriented worked examples initially lead to higher efficiency than product-oriented worked examples, but that at some point the process information becomes redundant, starts to hamper learning, and is best omitted from then on. Thus, it was assumed that a sequence of process-/product-oriented worked examples would lead to higher efficiency than a sequence of process-/process-oriented, product-/product-oriented, or product-/process-oriented worked examples. Participants studied two training series of four worked examples (parallel circuits with one fault, cf. Chapter 6), each followed by a test series of four transfer problems (two near and two far transfer problems, cf. Chapter 6). The conditions were: product-product sequence, process-process sequence, product-process sequence, and process-product sequence. After each training task and each test task, participants indicated on a 9-point scale how much mental effort studying or solving it had cost them.

In line with our hypothesis, having studied process-oriented examples led to higher efficiency on the first test than having studied product-oriented examples. The process information did indeed become redundant: continuing to study process-oriented examples led to lower efficiency on the second test than switching to product-oriented examples. However, in contrast to the first test, the efficiency of the process-product condition on the second test was not significantly higher than that of the product-product or product-process conditions. It can be concluded that when process information is provided in worked examples, this should be done at the beginning of training, and the information should be omitted once it becomes redundant. Combined with earlier work on fading worked-out solution steps (Renkl & Atkinson, 2003), these results suggest that an optimal training sequence for novices runs from studying process-oriented examples, to studying product-oriented examples, via completing more and more steps in partially worked-out examples, to independent problem solving.


ICO Dissertation Series

In the ICO Dissertation Series, dissertations are published of graduate students from faculties and institutes of educational research within the following universities: University of Twente, University of Groningen, Maastricht University, University of Amsterdam, Utrecht University, Open University, Leiden University, Wageningen University, Technical University of Eindhoven, and Free University (and formerly University of Nijmegen and University of Tilburg). Over one hundred dissertations have been published in this series. The most recent ones are listed below.

98. Kruiter, J.H. (13-05-2002). Groningen Community Schools: Influence on Child Behaviour Problems and Education at Home. Groningen: University of Groningen.

99. Braaksma, M.A.H. (21-05-2002). Observational Learning in Argumentative Writing. Amsterdam: University of Amsterdam.

100. Lankhuijzen, E.S.K. (24-05-2002). Learning in Self-Managed Management Career: The Relation between Managers’ HRD-Patterns, Psychological Career Contracts and Mobility Perspectives. Utrecht: Utrecht University.

101. Gellevij, M.R.M. (06-06-2002). Visuals in Instruction: Functions of Screen Captures in Software Manuals. Enschede: University of Twente.

102. Vos, F.P. (26-06-2002). Like an Ocean Liner Changing Course: The Grade 8 Mathematics Curriculum in the Netherlands, 1995-2000. Enschede: University of Twente.

103. Sluijsmans, D.M.A. (28-06-2002). Student Involvement in Assessment: The Training of Peer Assessment Skills. Heerlen: Open University.

104. Tabbers, H.K. (13-09-2002). The Modality of Text in Multimedia Instructions: Refining the Design Guidelines. Heerlen: Open University.

105. Verhasselt, E. (14-11-2002). Literacy Rules: Flanders and the Netherlands in the International Adult Literacy Survey. Groningen: University of Groningen.

106. Beekhoven, S. (17-12-2002). A Fair Chance of Succeeding: Study Careers in Dutch Higher Education. Amsterdam: University of Amsterdam.

107. Veermans, K.H. (09-01-2003). Intelligent Support for Discovery Learning: Using Opportunistic Learner Modeling and Heuristics to Support Simulation Based Discovery Learning. Enschede: University of Twente.

108. Tjepkema, S. (10-01-2003). The Learning Infrastructure of Self-Managing Work Teams. Enschede: University of Twente.

109. Snellings, P.J.F. (22-01-2003). Fluency in Second Language Writing: The Effects of Enhanced Speed of Lexical Retrieval. Amsterdam: University of Amsterdam.

110. Klatter, E.B. (23-01-2003). Development of Learning Conceptions during the Transition from Primary to Secondary Education. Nijmegen: University of Nijmegen.


111. Jellema, F.A. (21-02-2003). Measuring Training Effects: the Potential of 360-degree Feedback. Enschede: University of Twente.

112. Broekkamp, H.H. (25-02-2003). Task Demands and Test Expectations: Theory and Empirical Research on Students’ Preparation for a Teacher-made Test. Amsterdam: University of Amsterdam.

113. Odenthal, L.E. (21-03-2003). Op zoek naar balans: Een onderzoek naar een methode ter ondersteuning van curriculumvernieuwing door docenten. Enschede: University of Twente.

114. Kuijpers, M.A.C.T. (21-03-2003). Loopbaanontwikkeling: onderzoek naar ‘Competenties’. Enschede: University of Twente.

115. Jepma, IJ. (01-07-2003). De schoolloopbaan van risico-leerlingen in het primair onderwijs. Amsterdam: University of Amsterdam.

116. Sotaridona, L.S. (05-09-2003). Statistical Methods for the Detection of Answer Copying on Achievement Tests. Enschede: University of Twente.

117. Kester, L. (05-09-2003). Timing of Information Presentation and the Acquisition of Complex Skills. Heerlen: Open University.

118. Waterreus, J.M. (05-09-2003). Lessons in Teacher Pay: Studies on Incentives and the Labor Market for Teachers. Amsterdam: University of Amsterdam.

119. Toolsema, B. (23-10-2003). Werken met competenties. Enschede: University of Twente.

120. Taks, M.M.M.A. (20-11-2003). Zelfsturing in leerpraktijken: Een curriculumonderzoek naar nieuwe rollen van studenten en docenten in de lerarenopleiding. Enschede: University of Twente.

121. Driessen, C.M.M. (21-11-2003). Analyzing Textbook Tasks and the Professional Development of Foreign Language Teachers. Utrecht: Utrecht University.

122. Hubers, S.T.T. (24-11-2003). Individuele leertheorieën en het leren onderzoeken in de tweede fase. Eindhoven: Technical University of Eindhoven.

123. Sun, H. (04-12-2003). National Contexts and Effective School Improvement: An Exploratory Study in Eight European Countries. Groningen: University of Groningen.

124. Bruinsma, M. (09-12-2003). Effectiveness of Higher Education: Factors that Determine Outcomes of University Education. Groningen: University of Groningen.

125. Veneman, H. (01-07-2004). Het gewicht van De Rugzak: Evaluatie van het beleid voor leerlinggebonden financiering. Groningen: University of Groningen.

126. Annevelink, E. (27-08-2004). Class Size: Linking Teaching and Learning. Enschede: University of Twente.

127. Emmerik, M.L. van (22-09-2004). Beyond the Simulator: Instruction for High Performance Tasks. Enschede: University of Twente.

128. Vries, B. de (15-10-2004). Opportunities for Reflection: E-mail and the Web in the Primary Classroom. Enschede: University of Twente.

129. Veenhoven, J. (05-11-2004). Begeleiden en beoordelen van leerlingonderzoek: Een interventiestudie naar het leren ontwerpen van onderzoek in de tweede fase van het voortgezet onderwijs. Utrecht: Utrecht University.


130. Strijbos, J.W. (12-11-2004). The Effect of Roles on Computer-Supported Collaborative Learning. Heerlen: Open University.

131. Hamstra, D.G. (22-11-2004). Gewoon en Anders: Integratie van leerlingen met beperkingen in het regulier onderwijs in Almere. Groningen: University of Groningen.

132. Lubbers, M.J. (09-12-2004). The Social Fabric of the Classroom: Peer Relations in Secondary Education. Groningen: University of Groningen.

133. Nijman, D.J.J.M. (10-12-2004). Supporting Transfer of Training: Effects of the Supervisor. Enschede: University of Twente.

134. Dewiyanti, S. (25-02-2005). Learning Together: A positive experience. The effect of reflection on group processes in an asynchronous computer-supported collaborative learning environment. Heerlen: Open University.

135. Stoof, A. (04-03-2005). Tools for the identification and description of competencies. Heerlen: Open University.

136. Groot, R.W.A. de (10-03-2005). Onderwijsdecentralisatie en lokaal beleid. Amsterdam: University of Amsterdam.

137. Salden, R.J.C.M. (22-04-2005). Dynamic Task Selection in Aviation Training. Heerlen: Open University.

138. Huong, N.T. (23-05-2005). Vietnamese learners mastering English articles. Groningen: University of Groningen.

139. Gijlers, A.H. (23-09-2005). Confrontation and co-construction: Exploring and supporting collaborative scientific discovery learning with computer simulations. Enschede: University of Twente.

140. Stevenson, M.M.C. (27-09-2005). Reading and writing in a foreign language: A comparison of conceptual and linguistic processes in Dutch and English. Amsterdam: University of Amsterdam.

141. Saab, N. (14-10-2005). Chat and explore: The role of support and motivation in collaborative scientific discovery learning. Amsterdam: University of Amsterdam.

142. Löhner, S. (11-11-2005). Computer-based modeling tasks: The role of external representation. Amsterdam: University of Amsterdam.

143. Beers, P.J. (25-11-2005). Negotiating common ground: Tools for multidisciplinary teams. Heerlen: Open University.

144. Tigelaar, E.H. (07-12-2005). Design and Evaluation of a Teaching Portfolio. Maastricht: Maastricht University.

145. Van Drie, J.P. (20-12-2005). Learning about the Past with New Technologies: Fostering Historical Reasoning in Computer-Supported Collaborative Learning. Utrecht: Utrecht University.

