A single 10-minute retrieval practice activity significantly improved final exam performance.
How a Brief Quiz Reduced Anxiety, Boosted Performance and Worked Better Than Students Expected
A new study from Kenneth Barideaux Jr. explored whether a single, brief retrieval practice session could outperform traditional review methods in preparing diverse college students for final exams. Using a quasi-experimental design with 38 psychology students, Barideaux compared the outcomes of a 10-minute practice test against a conventional instructor-led review session covering identical content. What made this study particularly significant was its racially diverse sample (50% students of colour and 37% first-generation college students), which directly challenges the demographic limitations of much previous retrieval practice research.
The intervention consisted of students taking an unannounced, closed-notes, 10-minute practice test comprising:
10 multiple-choice questions created by the instructor
Questions focused on key concepts likely to appear on the final exam
Each question had four answer choices
Questions assessed recall or comprehension of foundational concepts
Students were told the test was ungraded, and it was framed as preparation for the final exam. Immediately after the 10-minute test, the instructor provided corrective feedback, explaining why each answer was correct or incorrect.
The passive review was a brief PowerPoint-based presentation in which the instructor delivered key concepts as bullet points to the class. Specifically, the review group received:
The same content that was tested in the retrieval practice group
Information presented in bullet-point format on slides
Instructor clarification of misconceptions
A structured overview of concepts likely to appear on the final exam
This is what the study calls a "more common instructional approach": essentially a traditional pre-exam review session in which students passively receive information rather than actively retrieve it from memory.
The retrieval practice group answered 83% of questions correctly compared with 72% for the review group: an 11-percentage-point improvement that could easily mean the difference between letter grades (on a conventional ten-point grading scale, 83% is a B while 72% is a C).
The crucial difference was the cognitive demand: the review group sat and listened while information was presented to them, whereas the retrieval practice group had to actively recall and produce answers from memory before receiving feedback.
This design choice makes the study particularly valuable because it compares retrieval practice to what actually happens in most classrooms, not the artificial comparison to "re-reading" that dominates much of the research literature.
Students who engaged in retrieval practice didn’t report feeling any more prepared for the final exam than their peers who reviewed the material passively. In fact, their self-assessed preparedness was nearly identical. However, their actual performance told a different story: they scored significantly higher across all types of questions, including applied and unrelated items.
Even more intriguing, students who completed the practice test reported feeling less anxious about the exam and found it easier than those who received the traditional review. This suggests retrieval practice may serve a dual function: enhancing performance while reducing test anxiety.
For me this is a fascinating aspect of the science of learning: the disconnect between how we think we learn and how we actually learn. The students who quizzed themselves didn’t feel more prepared, yet they performed better. This crucial gap between feeling and reality underscores why educators should design instruction around hard evidence like this rather than flimsy constructs like "engagement".
Another significant aspect of this study is that it addressed a critical gap: testing retrieval practice with racially diverse students (50% students of colour). The benefits held across all demographic groups.
This finding is more consequential than it might initially appear. The study notes that approximately 94% of classroom-based retrieval practice research has been conducted with WEIRD samples (Western, Educated, Industrialized, Rich, Democratic populations), which typically translates to predominantly white, middle-class student populations. This demographic homogeneity creates what researchers call the "generalizability crisis". In other words, we simply don't know whether findings from cognitive psychology apply universally or represent the learning patterns of a privileged subset.
There are, however, some methodological limitations that warrant caution. The quasi-experimental design is the most significant weakness: students weren't randomly assigned but allocated by pre-existing class sections, meaning unmeasured group differences could explain the performance gap (a possibility sketched in the simulation below). Additional concerns include the small sample size (38 students), a single 10-minute intervention with only a 48-hour follow-up, potential instructor bias, and testing limited to factual questions in one subject area. These weaknesses don't invalidate the study's contribution to the broader evidence base, however: the findings align with decades of laboratory research on retrieval practice, and the ecologically valid classroom setting actually strengthens real-world applicability.
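To make the quasi-experimental concern concrete, here is a minimal, hypothetical simulation. None of these numbers come from the paper: the section sizes, baseline means, and spread are illustrative assumptions. The point is simply that two intact class sections differing in baseline ability can show a gap of this size even when no intervention is applied to either group.

```python
# Hypothetical illustration, NOT the study's data: two intact class
# sections that differ in baseline ability produce an apparent
# "treatment effect" even though neither group receives any treatment.
import random

random.seed(42)

def section_scores(n, baseline_mean, sd=10):
    """Simulate final exam scores (0-100) for one intact class section."""
    return [max(0, min(100, random.gauss(baseline_mean, sd))) for _ in range(n)]

# Assumed split of the 38 students and assumed baselines (illustrative only).
stronger_section = section_scores(19, baseline_mean=80)  # no treatment given
weaker_section = section_scores(19, baseline_mean=72)    # no treatment given

def mean(xs):
    return sum(xs) / len(xs)

gap = mean(stronger_section) - mean(weaker_section)
print(f"Gap with zero treatment effect: {gap:.1f} percentage points")
```

This is exactly the confound that random assignment rules out and that allocation by pre-existing class section cannot.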
Read the paper here. 🔒
Good to know: Taking multiple choice tests is the best way to prepare for taking multiple choice tests.
If you're going to talk about it being WEIRD or not, why didn't you say WHERE the research was conducted? Just because they're "students of color" doesn't mean they aren't WEIRD.
Really useful for teachers to apply and explain the ‘why’ to students. Thanks for the explanation.