Which Multimedia Strategies Actually Improve Learning?
New research evaluates nearly three decades of work on multimedia learning, and explores the boundary conditions of when it enhances or hinders learning.
Richard Mayer has been a huge influence on my understanding of how learning happens. If I were to sum up his work in one idea, it’s that clarity beats novelty, and that students learn by actually thinking hard about stuff, not merely by being entertained. For years, however, I have thought that not enough attention is paid to the boundary conditions of multimedia design principles. How well a given principle works varies hugely with age, subject and relative knowledge.
Mayer himself addressed this in a 2024 reflection, which called for researchers to specify 'for whom the principle applies, for which kind of lesson the principle applies, and under what circumstances the principle applies', essentially asking for a shift from universal claims to conditional ones: who benefits, under what conditions, and to what extent?
So it was with great interest that I read a new meta-analysis of Mayer's multimedia learning research, which found a moderate overall effect on learning, influenced by various factors, including the type of multimedia, the specific design principle used, and the age of the learners.
This study asks a straightforward question: Which multimedia design strategies actually make learning better, and when? Pooling Mayer’s studies from 1990–2022, the average impact is a respectable “medium” (g=0.37). The most reliable boosts come from stripping out seductive details (irrelevant pictures, fun facts), using spoken words with visuals when pacing isn’t under student control (the modality principle), keeping language personal (“you”, conversational tone), and getting students to generate stuff, especially self-explanations. Plain text + diagrams performs strongly and consistently across factual, inference, and transfer outcomes.
By contrast, VR (virtual reality) in Mayer’s own work doesn’t show a significant overall advantage; animation, games, and simulations are spottier, tending to help more on inference/transfer than on isolated facts. Effects differ by outcome (facts vs inference vs transfer) and media type, and many effects have declined slightly by publication year, likely as tech novelty wears off and contexts diversify.
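For readers unfamiliar with the g values quoted above: assuming they are Hedges' g (the usual convention in meta-analysis, though I am inferring that here), g is the difference between two group means in pooled standard deviation units, with a small correction for sample size. A minimal sketch of that calculation follows; the function name and the scores are invented purely for illustration and are not taken from the meta-analysis.

```python
import math

def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Standardised mean difference with Hedges' small-sample correction."""
    # Pooled variance across the treatment and control groups
    pooled_var = ((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2)
    # Cohen's d: raw mean difference in pooled standard deviation units
    d = (mean_t - mean_c) / math.sqrt(pooled_var)
    # Common approximation to the small-sample correction factor
    j = 1 - 3 / (4 * (n_t + n_c) - 9)
    return j * d

# Hypothetical test scores: multimedia group vs control group
g = hedges_g(mean_t=75, mean_c=70, sd_t=10, sd_c=10, n_t=30, n_c=30)
print(round(g, 2))  # 0.49
```

On that scale, the pooled g=0.37 means the average multimedia group scored roughly a third of a standard deviation above its comparison group.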
The Low Technology Rankings
When it comes to different types of multimedia, the results will not surprise anyone who has been paying attention to the broader field of cognitive science and the many decades of expensive edtech failures. Despite the fact that a lot of “21st Century learning” was basically digitised worksheets that didn’t work in the first place, there is some really interesting new work on harnessing the science of learning in app development (something I’ll return to again).
So what were the key findings?
Text combined with diagrams emerged as the most reliable approach across all learning outcomes. This humble combination consistently produced meaningful learning gains whether students were trying to memorise facts, make inferences, or transfer knowledge to new situations.
Virtual reality, despite generating enormous excitement and investment, showed non-significant effects overall. This doesn't mean VR is useless, but right now it should be approached with caution. (I know it’s a rapidly evolving field but I’m sceptical of kids ‘conversing’ with AI-generated versions of Einstein or Aristotle, and this research does nothing to allay those concerns.)
Educational games and simulations: Across Mayer’s studies, both games and simulations show moderate average benefits for learning (games: g≈0.41; simulations: g≈0.41). However, the gains are uneven across outcomes: effects are larger for inferential and transfer measures and smaller for factual recall. Put simply, games/sims help most when pupils must reason with knowledge, not just repeat it.
The decline in effect sizes over time is particularly sobering, suggesting that as multimedia becomes commonplace, its advantages may diminish. This reinforces the importance of focusing on pedagogical principles rather than technological novelty.
Boundary Conditions: Context Matters
Now the important bit. The research reveals that multimedia principles work differently depending on several crucial factors:
Age and education level: University students and medical trainees showed the largest benefits, while elementary and graduate students showed smaller or non-significant effects. This suggests that multimedia may work best for learners with some background knowledge but who aren't yet experts.
Academic domain: STEM and medical subjects benefited more than liberal arts topics, possibly because these fields more naturally lend themselves to visual representation. I think this is because a field like English literature is what John Sweller calls “an ill-defined domain”. In other words, it lacks the kind of hierarchical knowledge structure found in Maths or Science.
Learning goals: Transfer and inferential learning showed larger benefits than simple factual recall, suggesting multimedia is particularly valuable when students need to apply knowledge in new contexts.
So what is the explanation for this? Are there some general principles we can bring together to make better decisions about using multimedia for learning?
The Cognitive Architecture That Explains So Much
The boundary conditions make perfect sense when you understand what's actually happening in learners' minds. Mayer's research is built on a three-process framework that reveals why multimedia principles work differently for different types of learning. When students encounter multimedia content, their brains must:
Select relevant information from what they see and hear—filtering out distractions and focusing on what matters. This is the foundation of factual learning: students need to pick out and remember key information.
Organize that information into coherent mental models—connecting pieces within the learning material to understand relationships and patterns. This drives inferential learning: students must see how elements relate to draw conclusions.
Integrate new information with existing knowledge—linking what they're learning now with what they already know to apply it in novel situations. This enables transfer learning: using principles from one context to solve problems in another.

Here's the crucial insight: different multimedia principles support different cognitive processes, and different learning goals depend on different processes working well. Here are some key takeaways:
For factual learning (selection-heavy), simple design moves work best: removing seductive details helps students focus on relevant information, while basic visual cues guide attention to key elements.
For inferential learning (organisation-heavy), structural supports become more important: segmenting complex information and providing organisational scaffolds help students see connections within the material.
For transfer (integration-heavy), active engagement strategies dominate: self-explanation, personalisation, and other approaches that force students to connect new material with existing knowledge show the largest effects.
This framework explains why text-plus-diagrams works so consistently; it efficiently supports both selection (clear visual organisation) and integration (explicit connections between verbal and visual information). It also explains why VR often disappoints: the cognitive overhead of navigating immersive environments can overwhelm selection and organisation processes, making integration more difficult.
Most importantly, it gives educators a practical lens for evaluating any multimedia intervention: Which cognitive process does this support? Does that match what my students need to achieve their learning goals?
What This Means for Practice: Key Design Principles
Prioritise Cognitive Engagement Not Entertainment: The most effective strategies are those that actively engage the learner, rather than just presenting information. This kind of generative learning is about helping students select, organise, and integrate new information with what they already know. Simple methods like text with diagrams work consistently because they support these core mental processes efficiently.
Match Tools to Goals: Different multimedia types and principles are suited for different learning objectives and contexts. For basic factual recall, simple and clear presentations are often best. For deeper, more complex tasks like problem-solving and knowledge transfer, you should use multimedia that encourages active reasoning, such as simulations or games.
Consider Your Context: The effectiveness of multimedia varies depending on the learner's age, the subject matter, and the specific learning goals. For instance, the research found that effects were strongest for university and medical students and in STEM subjects.
The Broader Implications
For me, this research joins a growing body of evidence suggesting that educational technology's benefits have been oversold. It doesn't mean multimedia learning is ineffective (in fact I think there are some really exciting developments in applying the science of learning with apps like Math Academy and also
’s work for example) but rather that effectiveness depends heavily on careful implementation and context. If it takes an inordinate investment of time, technology and training to get it “working”, then it’s probably not worth the investment when we already have methods that demonstrably work. However, we do have some key multimedia principles that, if used thoughtfully, can certainly improve student learning. It’s worth reading the work of Oliver Caviglioli, who has been one of the best thinkers on this topic and has pushed my thinking, particularly on the lack of attention paid to the transient information effect and the lethal mutations of Mayer’s work and cognitive load theory more broadly.


Thanks for taking the heavy lifting out of reading the research. Excellent piece of research.
For a better exploration of education technology, including its scandals, I recommend Audrey Watters' Teaching Machines: The History of Personalized Learning.
https://direct.mit.edu/books/book/5138/Teaching-MachinesThe-History-of-Personalized
My own take, in part based on what Frank Smith said long ago, is that technology can help insomuch as it helps children join a club of literate people. They learn "by participating in literate activities with people who know how and why to do those things." So for instance, students will readily pick up the lingo of their gaming communities or online social groups because they want to belong. If you doubt this, ask a teen or preteen right now what "6-7" means. If you give them a club to belong to, something meaningful to get done, they will learn. If whatever you're teaching appears not to have a use to them, they will readily forget or ignore it.
https://archive.org/details/joiningliteracyc0000smit