The Effects of Generative Testing on Text Retention and Text Comprehension
Kim J. H. Dirkx, Liesbeth Kester, Paul A. Kirschner
Centre for Learning Sciences and Technologies Open Universiteit
Nederland
Content
• Testing effect
• Theoretical framework
• Experimental set-up
• Results
• Conclusion
• Discussion
The Testing Effect
• Retrieving information by means of testing
is very effective for long-term retention of
facts (see Roediger & Karpicke, 2006).
• Long research history (Glover, 1989; Rothkopf, 1966; Spitzer, 1939).
The Testing Effect
• Word lists
• Expository texts
• Different test formats
• Different retention intervals
• With or without feedback/restudy phase
Theoretical framework
• Most research used verbatim factual questions
• Simply retrieving facts is not enough in our knowledge-based society (LLL; Bloom et al.,
1956).
• Few studies have investigated the effect of testing on higher-order
learning goals (Marsh, Roediger, Bjork & Bjork, 2007; Karpicke & Blunt, 2011).
The present experiment
• Many new testing methods available.
• These foster and/or examine higher-order
learning goals (Fletcher & Bloom; Mannes &
Kintsch, 1987).
• Summarizing is a frequently used test.
The present experiment
• In a previous study we investigated the effects of summarization on comprehension
• No effects were found
• In this second study we made some major changes….
Figure: Mean post-test score. Comprehension test score (y-axis, 0 to 4) by learning activity (x-axis: Re-Read, Summarize).
The present experiment
• Students were trained in summarization
• We included a retention test (can we replicate the traditional testing effect using summarization or free recall?)
• We gave students more time for reading and for summarization/free recall
• We included a retrieval effort measure
• We included a free recall condition (is summarization better than free recall?)
Research Questions
1. Can the testing effect (for facts) be replicated
using new testing methods (i.e.,
summarization)?
2. Are new testing methods (i.e., summarization)
more effective for enhancement of
comprehension compared to traditional testing
methods (i.e., free recall)?
3. Can evidence be found for the retrieval effort
hypothesis?
Method and Materials
• 146 secondary school students from pre-university education.
• 2 x 3 design with retention interval (5 minutes; 1
week) and learning activity (re-read, summarize, free
recall).
• Expository text of 500 words.
• Final test: 8 verbatim factual questions (Andre, 1979);
8 general questions (Andre, 1979).
• Effort and usefulness were measured on a 9-point
scale.
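The 2 x 3 design above can be laid out as six cells. The following Python sketch only enumerates the crossing of the two factors named on this slide; the variable names are illustrative, and it makes no assumption about how the 146 students were divided over the cells.

```python
from itertools import product

# Factor levels as named on the slide; the variable names are illustrative.
retention_intervals = ["5 minutes", "1 week"]
learning_activities = ["re-read", "summarize", "free recall"]

# Crossing the two factors yields the 2 x 3 = 6 cells of the design.
cells = list(product(retention_intervals, learning_activities))

for interval, activity in cells:
    print(f"{activity:<12} final test after {interval}")
```

Each printed line corresponds to one combination of learning activity and retention interval.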
Examples
• Verbatim Question: ‘Where is excessive food stored?’
Answer: In connective tissue under the skin.
• General Question: ‘The author says in the text that people often eat more food than they need. Can you explain why this surplus food is stored and not excreted?’
Answer: In the past, sometimes there was much food available, but at other times, there was no food available. The body preserved the surplus of food for times when less food was available.
• Mental Effort (Paas, 1992): ‘Indicate, by putting a mark on the bar below, how much effort it took you to free recall/make a summary/re-read the text’
• Usefulness: ‘Indicate, by putting a mark on the bar below, how useful you found it to make a summary of the text/free recall/re-read the text’
Examples
• Effort scale: 1 = ‘It cost me very, very little effort’; 9 = ‘It cost me very, very much effort’
• Usefulness scale: 1 = ‘I found the learning activity not useful at all’; 9 = ‘I found the learning activity very, very useful’
Design
• Instruction phase: instruction in the learning activity
• Intervention: read text (7 minutes); Sudoku; re-read, summarize, or free recall (10 minutes); effort and usefulness ratings; Sudoku
• Final test (after 5 minutes): final test (12 minutes)
• Final test (after 1 week): final test (12 minutes)
Results (1)
Figure 1. The interaction between Learning Activity and Retention Interval on the verbatim factual test.
Figure 2. The interaction between Learning Activity and Retention Interval on the general question test.
Results (2)
Figure 3. The interaction between Learning Activity and Retention Interval on effort.
Figure 4. The interaction between Learning Activity and Retention Interval on usefulness.
Conclusion
1. Can the testing effect (for facts and comprehension) be
replicated using new testing methods (i.e., summarization)? YES
2. Are new testing methods (i.e., summarization) more effective
for enhancement of comprehension compared to traditional
testing methods (i.e., free recall)? NO
3. Can evidence be found for the retrieval effort hypothesis? NO
Discussion
• The students used summarization to retrieve information verbatim.
• Previous studies also showed ambiguous results.
• The effect of higher order questions spreads less easily to unrelated questions (Hamaker, 1984).
• Students are not well trained in higher-order processing activities (Hamaker, 1984, p. 38).