5 ways to get retrieval practice wrong and reflections on a new study which shows that when material is complex, testing can stop being a desirable difficulty and start being an undesirable one.
Hi Carl. I found myself tripping over the early paragraph re the student who has 'read something once and understood 60%.'
You mention that asking this student to close the book and (attempt to) retrieve is 'floundering' rather than learning. However, doesn't this contradict the research?
My understanding of the literature (and my longtime experience with Anki) is that the struggle to retrieve fuzzy, partially-encoded material (i.e., that 60%) is exactly what prevents the 'illusion of competence.' If we wait until encoding is 'secure' (near 100%) before we test, aren't we missing the biggest ROI window for retrieval?
Surely identifying the 40% we don't know via a failed retrieval attempt is more valuable than just re-reading the text?
Essentially, the paragraph seems to go against most of my reading and understanding: that testing is not only good, but is literally learning in-and-of-itself.
Did I misinterpret?
--
P.S. I've just read the Redifer paper, and I'm not sure about it. Asking students to 'free recall' a very heavy/dense 2,500-word academic article after a single 15-minute read seems less like a fair test of retrieval practice and more like a 'working memory torture test' that was destined to cause overload.
I wonder if this doesn’t address your question: “retrieval practice is a consolidation strategy, not a comprehension strategy”.
So you are asking more about Check for Understanding than about Retrieval Practice? https://chatgpt.com/share/693fe778-5f80-800b-bd50-3e6335c91994
There is so much to mine in this piece, but I'll begin with a statement that goes to the heart of the knowledge building vs. strategy instruction debate, which has created a false binary. We use literary works and knowledge-rich texts to guide students toward understanding by engaging them in interpretation, analysis, and synthesis. The knowledge in the piece is the "what" we teach; making sense of it is the "how" we teach.
"An interpretation of a novel, an analysis of historical causation, a synthesis of competing theoretical perspectives; these are not items stored in memory awaiting collection. They must be constructed in the moment, assembled from understanding rather than retrieved from storage."
“Assembled from understanding” … yes. But in what ways do the cognitive operations underlying this complex and interleaved ‘understanding’ actually function? Listing concepts such as ‘interpretation’, ‘analysis’ and ‘synthesis’ is one thing; designing meaningful ‘retrieval’ of complex texts seems to require at least (1) students’ background knowledge (inculcated specifically and generally over time, in relation to the topic at hand) and (2) a clear grasp of key concepts (more easily amenable to direct teaching and retrieval). Both depend on the existing level of familiarity with the specific subject matter, vocabulary and so on. These need to be inculcated or taught specifically in the first place, and this is where some direct ‘retrieval’ practice is useful. For the rest, formative guided practice (i.e., the ‘how’) in comprehension, précis, paraphrasing, etc. is what builds knowledge and skills over time.
"But in what ways do the cognitive operations underlying this complex and interleaved ‘understanding’ actually function?"
Let me know if this comes close to addressing your question: Daniel Willingham Reminded Me That Memory is the Residue of Thought (https://harriettjanetos.substack.com/p/daniel-willingham-reminded-me-that?r=5spuf).
"While the academic benefits of reading are clear, schools may inadvertently transmit the perception that reading is something done solely for the purposes of testing and learning, and not for enjoyment. Research findings from the WASCBR indicate that this conceptualization of reading may not be uncommon, and that teachers' focus on reading for learning rather than enjoyment may play a role in cementing the divorce between reading and enjoyment." -- Margaret K. Merga, Creating a Reading Culture in Primary and Secondary Schools: A Practical Guide (2023)
Based on some recent classroom observations, I too have had thoughts about the limits of retrieval practice, and I found this article thought-provoking. I agree that retrieval practice seems to work best for discrete pieces of information, but at least one of the well-known retrieval practice studies did NOT involve that kind of information.
In 2011, Karpicke and Blunt described a retrieval practice study in which the to-be-learned content was "a science text," and they had undergraduates in four conditions, one of which was retrieval practice. That group engaged in free recall, reading the text and then putting it aside and writing down everything they could remember--and then they did that again. When tested a week later, the retrieval practice group remembered the concepts significantly better. (See https://www.science.org/cms/asset/882cef39-74f8-4177-a127-dd7191b24d4e/pap.pdf).
I don't know whether the science text used in 2011 was less complex than the article in the recent study, or whether (perhaps) in the time between the two studies, students have become less able to understand complex text (we do have some evidence of that).
I do know that when Karpicke tried to repeat the experiment with 4th graders, using a 4th grade level science text, there was no difference between the retrieval practice group and the other groups. Karpicke told me he suspected that there was simply too much information in the text for the 4th graders to be able to retrieve. When he simplified the task (less reading and/or more support for the writing), the retrieval practice group did do significantly better.
So I wonder: have undergrads become more like 4th graders in this respect in the last 15 years? Or maybe the Karpicke results were different because the undergrads engaged in retrieval practice twice, even though I think little or no time elapsed between the two efforts?
Re: "or whether (perhaps) in the time between the two studies, students have become less able to understand complex text (we do have some evidence of that)."
Oh, kids these days. What is the world coming to? Does anyone even make handbaskets around here anymore, for the trip to Hades? Or will I have to have Amazon ship my handbasket all the way from Qingdao? And will I have to pick up a gig job to afford it?
"The 'kids today' crisis rhetoric, I believe, is much more a reflection of adults, the cynicism of aging and the loss we all feel as we move further and further away from our childhood and teens years." -- P. L. Thomas
https://radicalscholarship.com/2025/12/17/kids-today-perpetually-dumb-and-lazy-as-a-box-of-rocks/
Why are you using AI to generate your texts? I like your content, it’s really good, and I (and I guess most people, too) would happily take fewer posts in return for the posts being written fully by you and not AI.
It’s super obvious that you are using LLMs, sadly. :(
Hi Carl - another incredibly rich article. Thank you. Along similar lines to the comment from shaeda.io, I was hoping you could clear up my understanding on the point regarding retrieval only working on 'fully encoded' knowledge. Isn't the point of retrieval to strengthen the trace on key/priority curriculum information that has been partially forgotten? Indeed, isn't it an advantage, even a necessity, for retrieval to be carried out on information that has been partially forgotten (i.e., not fully encoded) to a) allow desirable difficulty to be utilised on forgotten information and b) enable the benefits of the 'prediction error' effect to kick in?
Also, where does this leave pre-questioning and pre-testing and their supposed benefits? As I understand them, those are a) for the teacher to see what's known and what isn't (this would be true of retrieval in terms of what's been genuinely remembered) and b) to position what the learner should be paying attention to. Many thanks.
I think this article opens up space for more Concept-based Curriculum and Instruction. I always felt that retrieval practice was only for simple, discrete facts in a science classroom. We can teach more for conceptual understanding and use retrieval practice to consolidate learning, and leave comprehension to language work and figuring out.
Re: "this means the distortion of a practice with good evidence to support it in the laboratory but then gets misapplied in the field and becomes a pale imitation of its former self"
Or maybe the only place it ever really worked is in the laboratory.
"Normal, sensible learning is excluded from the research laboratory." -- Frank Smith, Joining the Literacy Club (1988, p. 114)