AI Brain Fry, Workslop and the Ironies of Automation
The Cognitive Price of AI at Work, and the Learning Science Case for Optimism
In 1983, a cognitive psychologist named Lisanne Bainbridge published a four-page paper in an engineering journal that almost nobody outside her field has ever read. It concerned the automation of industrial processes: nuclear plants, chemical refineries, flight decks. Its tone was measured, almost dry, but it would prove unerringly prophetic of the predicament we now find ourselves in.

Bainbridge’s central argument was this: the more sophisticated an automated system becomes, the more demanding, not less, the human role within it. She called this the “ironies of automation”, and the irony she had in mind was structural rather than incidental. Designers of automated systems, she observed, tend to view the human operator as the weak link: unreliable, inconsistent, prone to fatigue and error. The aim of automation is therefore to eliminate the human wherever possible.
But the designer who tries to eliminate the operator still leaves the operator to do the tasks which the designer cannot think how to automate. What remains after automation is not a simplified role but an arbitrary residue of the most demanding, most ambiguous, and least supported work in the entire system. The human is not replaced. In other words, the human is paradoxically left with the hardest parts, and given almost no preparation for them.
Forty years on, Bainbridge’s paper reads less like a historical document than a prophecy. It describes, with an accuracy that might trouble us, exactly what is now happening to knowledge workers across every sector in which AI has taken hold. And it raises, with particular urgency, a question that education has barely begun to confront.
Work Doesn’t Shrink. It Expands.
Two recent studies, published within weeks of each other in the Harvard Business Review, have now supplied the empirical substance that Bainbridge anticipated over forty years ago. The first, by Aruna Ranganathan and Xingqi Maggie Ye of UC Berkeley, followed around two hundred employees at a technology company over eight months, tracking in granular detail how the introduction of generative AI tools changed the texture of their working lives. The results were not what the productivity optimists had promised. AI did not reduce work. It intensified it, consistently and across every role the researchers examined.
Workers expanded their remit because AI made previously inaccessible tasks feel suddenly tractable. Parkinson’s Law holds that work expands to fill the time available; what this study suggests is a more unsettling corollary: work also expands to fill the capacity available. Workers also blurred the boundaries between work and rest, because prompting an AI tool felt less like working than like chatting, and so slipped imperceptibly into lunch breaks, evenings, and the margins of meetings. They took on more tasks simultaneously because AI gave them a sense of momentum, of always having a capable partner at hand. And as a result, they found themselves doing more at once, feeling more pressure, and experiencing less genuine recovery than before they had adopted the tools that were supposed to liberate them. As one engineer in the study put it with disarming honesty: the expectation had been that greater productivity would mean working less. What actually happened was working the same amount, or more.
The Cognitive Governors
Part of what drives this dynamic is the disappearance of friction. Before AI, the effort required to begin a task served as a natural governor on the number of tasks undertaken. The blank page, the research required before drafting, the time needed to think through a problem from scratch: all of these imposed a rhythm on cognitive work, structuring attention and creating the stopping points that allowed genuine recovery. When AI lowers the cost of beginning almost anything, those governors are removed. The scope of work expands not because anyone demands it but because the tools make expansion feel possible, even effortless. What disappears is not effort but the friction that once gave effort its shape.
This is Bainbridge’s first irony, translated from the control room to the open-plan office. The system absorbs the tractable parts of the job and returns to the human a workload that is larger in volume, broader in scope, and denser in the kind of high-stakes, high-attention judgement that cannot be delegated. The work does not diminish. It concentrates.
The Workslop Problem: Why Oversight Is Harder Than It Looks
There is a particular kind of modern frustration that many of us have come to recognise since 2023 and that had no name until recently. It looks like this: you receive a report. It is long, it is formatted, it has an executive summary and a set of recommendations and a conclusion that circles back to the introduction with apparent tidiness. You read it. You realise, slowly, that it has told you nothing. Someone has produced it with AI without thinking about it, and now the thinking has been transferred to you.
The phenomenon underpinning this is explored in the second study, conducted by a research team at Boston Consulting Group and published in early March 2026. Surveying nearly fifteen hundred full-time workers across industries, roles, and seniority levels, the researchers found that intensive oversight of AI tools was the single most mentally taxing form of engagement their participants described. Workers required to monitor AI agents closely reported fourteen percent more mental effort, twelve percent more mental fatigue, and nineteen percent greater information overload than those whose AI engagement was less demanding.
A distinct phenomenon emerged from the data, which the researchers called “AI brain fry”: a state of acute cognitive exhaustion characterised by difficulty focusing, slower decision-making, and what participants variously described as mental fog, a buzzing sensation, or the feeling of having a dozen browser tabs open in one’s head simultaneously. Fourteen percent of those using AI at work reported experiencing it; among those in marketing and operations roles, the figure rose to over a quarter.

The consequences were not merely personal. Workers experiencing AI brain fry made major errors thirty-nine percent more frequently than those who did not. They were thirty-nine percent more likely to be actively seeking to leave their jobs. The cognitive tax was, in the most precise sense of the phrase, a business cost.
What makes this finding particularly interesting, I think, is the nature of the work that produces the fatigue. AI output does not arrive as raw material awaiting construction: it arrives looking finished. This is what Niederhoffer, Robichaux and Hancock at Stanford have termed workslop: AI-generated content that masquerades as good work but lacks the substance to meaningfully advance a given task. The particular cruelty of workslop is that it doesn’t announce its own inadequacy; it arrives fluent, confident, and formatted, offloading the cognitive labour of detecting its failures entirely onto the recipient. In other words, the person who did the least thinking ends up doing the least work, and the person who receives it ends up doing the most.
I think what makes this burden so difficult to manage is what Ethan Mollick has called the jagged frontier of AI capability: the fact that AI is immensely powerful on some tasks and fails completely or subtly on others, and that these failures bear no reliable relationship to how difficult or straightforward the task appears to be. The frontier is invisible, which means the human overseer cannot calibrate their vigilance to the actual risk; they must attend to everything with equal suspicion. For example, AI can produce a technically accurate summary of a research paper it has never read, and a technically plausible summary of a research paper that does not exist, and the two outputs will be indistinguishable to anyone who does not already know the answer.

A generated text is fluent and confident; generated code compiles and may even pass tests; a synthesised argument has the surface coherence of careful reasoning. This is precisely the condition under which what Parasuraman and Riley identified as automation bias takes hold: the tendency to over-trust automated output, following it when it is wrong and failing to notice when it has failed.
Parasuraman’s subsequent synthesis with Manzey, published in 2010, showed that this bias manifests in two distinct forms: omission errors, where the human fails to notice the system has gone wrong, and commission errors, where the human actively follows the system’s incorrect recommendation. Critically, neither form can be overcome by training or experience alone; automation complacency and bias are found in expert and novice users alike, because the underlying mechanism is attentional rather than epistemic. It is not that people lack the knowledge to catch errors; it is that the conditions of oversight suppress the vigilance required to deploy that knowledge.
The evaluative burden this places on the human is therefore heavier than it might initially appear. The user is not correcting an obvious gap but assessing something that presents itself with the authority of a completed product, searching for the subtle error, the misplaced assumption, the quietly wrong inference that the system’s fluency has carefully concealed. This is cognitively exhausting in a way that is quite distinct from the effort of original production; and it is, in the most uncomfortable sense, a task that demands precisely the expertise that the system was supposed to make less necessary. AI, in other words, assumes the very competence it risks undermining.
This is Bainbridge’s monitoring irony in its contemporary form. She had identified, with uncomfortable clarity, that asking humans to oversee automated systems is a near-impossible task: vigilance decays rapidly, the system’s internal logic is opaque to the overseer, and the human is held responsible for catching failures that the machine cannot flag and that the human has no reliable means of detecting. The BCG study confirms that this impossibility is not confined to nuclear operators watching instrument panels. It applies equally to the marketing manager reviewing AI-generated copy, the engineer checking code produced by a large language model, the teacher evaluating the output of an AI tutoring platform. The work looks like oversight. What it actually requires is a form of sustained, expert, hypervigilant attention that human cognition was simply not designed to sustain.
The Exhaustion of Evaluation
But there is a lived texture to all of this that the research papers, for all their rigour, cannot quite capture. Siddhant Khare, a software engineer who builds the infrastructure that powers AI agents in production systems, published an account of his own experience in February 2026 that functions as an inadvertent case study in Bainbridge’s thesis. He had shipped more code in the previous quarter than in any quarter of his career. He had also felt more drained than at any previous point in his working life. These two facts, he noted at the outset, were not unrelated.
What Khare described was not the exhaustion of creation but the exhaustion of evaluation: the relentless cognitive labour of reviewing outputs he had not produced, from a system whose reasoning he could not trace, whose errors were often subtle rather than obvious, and whose behaviour was fundamentally probabilistic rather than deterministic. Before AI, his job had been to think about a problem and build a solution. After AI, his job had become to prompt, wait, read, evaluate, decide, correct, and repeat; a quality inspector on an assembly line that never stopped. He found himself unable, by midweek, to make simple decisions. His working memory was not full of code: it was full of judgements about code.
The most unsettling passage in Khare’s account concerns what he calls “thinking atrophy”. During a design review meeting, asked to reason through a concurrency problem at a whiteboard without his laptop, without AI, he struggled. Not because he lacked the relevant knowledge but because he had not exercised that particular cognitive muscle in months. The capacity for unassisted first-draft thinking had quietly degraded, in direct proportion to how consistently he had outsourced it. This is precisely what Parasuraman and Riley described as the out-of-the-loop performance problem: operators removed from active engagement with a system lose not just practice but situation awareness, the accumulated, dynamic sense of what the system is doing and why, which is the very thing required to intervene effectively when it fails. He reached, without naming her, precisely Bainbridge’s deskilling insight: the skills that atrophy during smooth automated operation are exactly the skills required when the automation fails. The former expert, in the moment of crisis, has become the novice.
The Final Irony
Bainbridge concluded her 1983 paper with a characteristically dry observation that has become, I think, even sharper since it was written. The final irony of automation, she wrote, is that the most successful automated systems, those with the rarest need for human intervention, are precisely the systems that require the greatest investment in human skill. The longer the machine runs without incident, the more degraded the human backup; and yet it is in those rare, high-stakes moments of failure that the human is most needed and least prepared.

The BCG study offers an unexpected confirmation of this from a completely different direction. When AI was used to eliminate genuinely repetitive, low-value tasks, workers in their sample reported lower burnout scores, more engagement, and more time for the kind of human connection and creative work that the technology was supposed to enable. The problem was not AI itself. The problem was the way AI had been deployed: not to remove toil but to expand scope, not to free attention but to multiply the demands upon it, not to simplify the human role but to intensify it while stripping away the natural governors, the pace constraints, the breathing room, that had previously made it sustainable.
Ranganathan and Ye put this plainly: without intention, AI makes it easier to do more but harder to stop. The natural tendency of AI-assisted work is not contraction but intensification, and the costs of that intensification accumulate quietly, in the degradation of judgement, the erosion of skill, the slow compression of recovery time, and the rising frequency of errors made by people who are producing more than ever and thinking less carefully than they know.
The Positive Case for AI in Education
However, the case against careless AI adoption is not the same as the case against AI itself, and the distinction matters enormously. The BCG data already contains, embedded within its warnings, the outline of a genuine positive case. When AI was used to eliminate genuinely repetitive, low-value tasks rather than to expand the scope of work indefinitely, burnout fell, engagement rose, and workers reported more time for the kind of human connection and creative thinking that gives professional life its meaning. The technology is not the problem. The absence of intentional design is.
Consider first the signal-to-noise problem in education. What do we actually know about student learning? In my experience, assessment in most schools operates like a kind of faulty compass: it produces readings, generates confidence, and often points consistently in the wrong direction. The granular, timely, diagnostic information that would actually improve instruction is precisely what our current systems cannot supply. Instead we have built vast administrative infrastructures around measurements that tell us remarkably little about what students actually know or can do.
The spreadsheet of predicted grades is one of education’s great epistemic embarrassments: a document that looks like knowledge but tells the teacher almost nothing. Grades are entered, targets are set, progress is RAG-rated, and the resulting data is presented with a confidence entirely disproportionate to its explanatory power. AI’s most under-appreciated contribution to education may not be instruction at all but measurement: the possibility of replacing these crude, lagging, heavily interpreted signals with something finer-grained, more timely, and more honest about what it does and does not know about the learner in front of it.
The second argument concerns scale. Teaching quality is the single most important in-school factor in student outcomes, and it varies enormously; a student in one classroom may receive instruction that is transformatively better or worse than a student in the classroom next door, for reasons that have nothing to do with the student and everything to do with the professional lottery of which teacher they happened to be assigned.
The children who most need the best teachers are, by every measure we have, often the least likely to encounter them: disadvantaged students are disproportionately taught by less experienced practitioners, in schools with higher turnover, in systems that have never found a credible answer to the distribution problem. That problem is not a resource problem or a political problem in the first instance; it is a human problem. Expertise of the kind that makes a great teacher cannot be manufactured on demand, franchised, or guaranteed to persist under the conditions of a real school. AI does not solve this problem. But it may be the first tool that meaningfully addresses it: not by replicating the best teachers, which it cannot do, but by raising the floor of curriculum design, instructional quality and assessment in the classrooms where that floor is lowest.
I first heard this scaling argument from Joe Liemandt, when I met him in Austin in December last year; I found it uncomfortable in the way that only true things which challenge long-held beliefs are uncomfortable. If the system is dependent on those 1 in 20 superhero teachers who burn out after five years, we should at least be honest that what we are running is not a scalable model of education but a moral subsidy paid for by the most dedicated people in the profession.
How AI Can Further The Science of Learning
There is a third argument, less discussed but in some ways the most exciting of all (for me at least), in terms of the science of learning. AI does not merely offer a new delivery mechanism for existing instructional content; it offers, for the first time, the computational infrastructure to take the science of learning seriously at the level of the individual learner. Consider spaced repetition, one of the most robust findings in memory research: the evidence that distributing practice across time, rather than massing it, produces dramatically stronger long-term retention. The principle has been known since the nineteenth century; it has been replicated hundreds of times; and it has been almost entirely ignored by faculties of education and academics, in favour of revisionist readings of figures like Vygotsky or Piaget.
But strong as the evidence is, the reality is that we have decades of research from laboratories and postgraduates but, as Tom Perry showed in 2021, almost nothing from real teachers with real students in real classrooms. One reason we struggle to implement retrieval and spacing in schools is simply that calculating the optimal review interval for each piece of knowledge, for each learner, at each moment in time, is computationally impossible for a human teacher managing a class of thirty.
However, projects like FSRS4Anki, an open-source scheduler that uses machine learning to model the precise dynamics of an individual's memory and predict the exact moment a piece of knowledge is about to be forgotten, represent something genuinely new: not the replacement of learning science by technology, but its realisation through it. For decades, the gap between what cognitive science knew about memory and what classrooms could actually implement has been almost impossible to close at scale. AI may be the first tool capable of addressing it.
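To make the scheduling problem concrete, here is a minimal sketch in Python of what "calculating the review interval for one fact and one learner" involves. It is not FSRS's actual model: it assumes a deliberately simple exponential forgetting curve, and the function names, the starting stability, and the growth and lapse multipliers are all hypothetical values chosen for illustration.

```python
import math


def recall_probability(days_elapsed: float, stability: float) -> float:
    """Predicted chance of recall `days_elapsed` days after the last review.

    Assumes a simple exponential forgetting curve, R(t) = exp(-t / stability).
    This is an illustrative assumption, not the curve FSRS fits.
    """
    return math.exp(-days_elapsed / stability)


def next_interval(stability: float, target_retention: float = 0.9) -> float:
    """Days to wait until predicted recall falls to `target_retention`."""
    return -stability * math.log(target_retention)


def update_stability(stability: float, recalled: bool,
                     growth: float = 2.5, lapse: float = 0.5) -> float:
    """Hypothetical update rule: a success strengthens the memory, a lapse weakens it."""
    return stability * (growth if recalled else lapse)


# One learner, one fact: three successful reviews followed by a lapse.
stability = 2.0  # assumed starting value, in days
schedule = []
for recalled in (True, True, True, False):
    schedule.append(round(next_interval(stability), 2))
    stability = update_stability(stability, recalled)
```

Even in this toy version, the interval depends on the learner's own history with each item. Multiply that by hundreds of facts and thirty students and it becomes clear why no teacher could do this by hand, and why a scheduler that fits such parameters to real review data is doing something a timetable cannot.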
That positive case becomes considerably more compelling when set against the historical baseline. Although the most cited tutoring study in the edtech world, Bloom’s 2 Sigma, actually rests on really poor evidence (see here for more on this), more recent work provides a better foundation. In a landmark 2011 meta-analysis, Kurt VanLehn reviewed decades of research comparing human tutoring, intelligent tutoring systems, and conventional classroom instruction. His findings were interesting: human one-to-one tutoring produced an effect size of around 0.76 over whole-class teaching, a substantial and consistent advantage that held across domains, age groups, and study designs. Crucially, intelligent tutoring systems achieved an effect size of around 0.40: roughly half the benefit of human tutoring, but still a meaningful and reliable improvement over conventional instruction. Even if measurement error reduces the true effect considerably, the convergence across studies and contexts suggests something real is being captured, not manufactured.
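For readers who find effect sizes abstract, the short Python sketch below converts them into percentiles. It takes the two figures quoted above at face value and assumes, as the standard interpretation of a standardised effect size does, that scores in both groups are normally distributed with equal spread. The function name is hypothetical.

```python
import math


def percentile_of_average_treated_student(effect_size: float) -> float:
    """Share of the comparison group scoring below the average treated student.

    Uses the standard normal CDF, assuming normal scores with equal variance.
    """
    return 0.5 * (1.0 + math.erf(effect_size / math.sqrt(2.0)))


human_tutoring = percentile_of_average_treated_student(0.76)
intelligent_tutoring = percentile_of_average_treated_student(0.40)

print(f"Human tutoring: {human_tutoring:.1%}")
print(f"Intelligent tutoring systems: {intelligent_tutoring:.1%}")
```

On those assumptions, an effect size of 0.76 moves the average student from the 50th to roughly the 78th percentile of the comparison group, and 0.40 moves them to roughly the 66th.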
For most of human history that kind of individualised instructional support has been available only to the privileged few, a luxury of wealth and social capital that no amount of pedagogical goodwill could redistribute at scale. AI tutoring systems powered by the science of learning, even with their real limitations and the well-documented voltage drop between controlled trials and classroom deployment, represent a genuine attempt to offer something approaching that experience to every learner regardless of background. The equity argument for well-designed AI in education, cautiously and precisely stated, is genuine; and it is, for those who care about educational justice, a powerful one. To dismiss the technology entirely is to dismiss the possibility it carries for the children who have historically been most failed by the systems we already have.
The other thing which is, again, counterintuitive is that a model like this does not mean less human interaction but more. Dylan Wiliam has spent decades arguing that the most powerful intervention in education is not any particular curriculum or technology but the quality of the feedback loop between teacher and learner: the moment a teacher notices where understanding has broken down and responds in real time with the precisely calibrated question or explanation that moves the student forward. That moment requires something AI cannot replicate: genuine knowledge of the individual, accumulated over time, in a relationship of trust. What AI can do is remove the hundred smaller tasks that currently prevent teachers from arriving at that moment with sufficient attention and energy to make it count.
The positive case for teachers is equally compelling, provided the crucial condition is met. If AI absorbs the most mechanical elements of formative assessment (flagging routine errors, generating retrieval practice at appropriate intervals, tracking the accumulation of misconceptions across a class over time), it could in principle return to teachers the cognitive bandwidth required for the work that only they can do. The relational, diagnostic, and contextual dimensions of teaching; the capacity to read a room, to detect the student whose quiet compliance masks genuine confusion, to repair a fractured relationship with a subject or with learning itself; these are not peripheral features of good teaching. They are its irreducible core. A teacher freed from the drudgery of marking thirty near-identical practice responses is, in principle, a teacher with more of herself available for the interactions that matter most.
The condition in all of these cases is the same word: deliberately. The same technology that deskills when deployed carelessly is the technology that augments when deployed with intention. The distinction between AI that replaces human cognition and AI that strengthens it is not a feature of the technology itself; it is a design choice, made or unmade by the people who commission, build, and implement these systems. Spell-checkers did not inevitably stop people learning to write. Calculators did not inevitably stop people learning to think mathematically. The question in each case was always the same: is this tool being used to bypass the cognitive work through which understanding is built, or to support the learner in doing that work more effectively? AI poses the same question, at greater scale and with higher stakes than any tool that has come before it.
What This Means for Education
The educational implications of all this are not difficult to see, even if they remain largely unexamined. Students now have access to systems capable of producing fluent explanations, essays, and problem solutions in seconds. These systems remove many of the visible barriers to completing academic work. But to evaluate whether an AI-generated answer is correct requires domain expertise; to detect a subtle error requires conceptual understanding; to judge whether an argument is coherent requires genuine familiarity with the underlying ideas. As I’ve said before, you cannot connect the dots if you don’t have any.
Without that knowledge, the student becomes a supervisor of information they don’t have the understanding to properly assess. The technology, in short, assumes the very expertise it risks undermining: and nowhere is that irony more consequential than in the domain whose entire purpose is the development of that expertise in the first place.
What all of this points toward is not the rejection of AI but the deliberate design of the human role within it. Automation never removes the human from the system; it recasts the system as an inescapable human-machine team, which must then be designed explicitly rather than inherited by default. That design question, of how much cognitive work to delegate, how to preserve the skills that delegation erodes, and how to structure oversight so that it does not simply exhaust the people performing it, is one that every organisation deploying AI at scale now faces. Including, with particular urgency, schools.
Bainbridge said all of this in four pages, in 1983, about nuclear power plants. It has taken us forty years and several peer-reviewed studies to confirm that she may have been presciently right, and that the domain she was describing was not industry but the future of work itself. The question now is not whether AI intensifies the cognitive burden on the humans who use it. The evidence on that point is, at this stage, fairly clear. The question is whether the organisations deploying these tools at scale are willing to take that finding seriously: to design human roles that are genuinely sustainable, to invest in the skills that AI cannot replicate, and to resist the seductive arithmetic that mistakes higher output for better thinking.
The ironies of automation do not resolve themselves. They compound.
Another irony about AI is that most of the best writing about it seems to come from the 50s, 60s and 70s. See earlier posts on Herbert Simon and W. Ross Ashby.



Can we just quit saying that woo-woo marketing phrase "AI" and just say "automation" or "education technology," please? None of this is particularly new or transformational. It's all an exercise in settling, and every plea that we automate education is a demand that we settle for less.
Re: "Bainbridge concluded her 1983 paper with a characteristically dry observation that has become I think, even sharper since it was written. The final irony of automation, she wrote, is that the most successful automated systems, those with the rarest need for human intervention, are precisely the systems that require the greatest investment in human skill. The longer the machine runs without incident, the more degraded the human backup; and yet it is in those rare, high-stakes moments of failure that the human is most needed and least prepared."
This calls to mind Cory Doctorow's explanation of a centaur versus a reverse centaur. A person who figures out how to use technology to increase the quality or efficiency of their work is a centaur; a person who is told to supervise one or more machines is a reverse centaur. A reverse centaur does not exist to ensure quality, efficiency, or even safety; a reverse centaur exists as a moral accountability sink, someone to blame when the automation all goes sideways. And the automation is there because the bosses don't really give a crap about the product (in this case, educating poor people).
https://pluralistic.net/2026/03/11/modal-dialog-a-palooza/#autoplay-videos
It's also addressed in the book Data Driven: Truckers, Technology, and the New Workplace Surveillance (2023), by Karen Levy. She uses it to explain how long-haul truckers can't just supervise a "self-driving" truck.
Re: "In my experience, assessment in most schools operates like a kind of faulty compass: it produces readings, generates confidence, and often points consistently in the wrong direction. The granular, timely, diagnostic information that would actually improve instruction is precisely what our current systems cannot supply. Instead we have built vast administrative infrastructures around measurements that tell us remarkably little about what students actually know or can do."
This is the standardized test in a nutshell. And it's why we need to quit micromanaging teachers and overemphasizing test scores. You CAN get granular, timely diagnostic information in teacher-created formative assessment, but you can't measure anything of value with a multiple choice standardized test. What standardized test scores do correlate well with, though, is the socioeconomic status of the students being tested.
Re: "AI’s most under-appreciated contribution to education may not be instruction at all but measurement: the possibility of replacing these crude, lagging, heavily interpreted signals with something finer-grained, more timely, and more honest about what it does and does not know about the learner in front of it."
So surveillance? More data collection? Or more multiple-choice questions, which we already know don't tell us much?
Re: "For most of human history that kind of individualised instructional support has been available only to the privileged few, a luxury of wealth and social capital that no amount of pedagogical goodwill could redistribute at scale. AI tutoring systems powered by the science of learning, even with their real limitations and the well-documented voltage drop between controlled trials and classroom deployment, represent a genuine attempt to offer something approaching that experience to every learner regardless of background. The equity argument for well-designed AI in education, cautiously and precisely stated, is genuine; and it is, for those who care about educational justice, a powerful one. To dismiss the technology entirely is to dismiss the possibility it carries for the children who have historically been most failed by the systems we already have."
This is bunk. You don't automate a system you care about. The wealthy have NEVER cared about educating and elevating "the poors." We have always been asked to do education on the cheap, despite the fact that we have had the resources to fully fund public education. And if the wealthy were somehow forced to send their progeny to public schools, public education would soon have all the resources it needs. Wealthy people hate paying taxes for public schools, and they hate uppity workers (teachers unions), and automation is just another excuse to do away with both.
As a nuclear technician and then later a nuclear engineer and Navy submarine officer, I resonated with this piece very strongly -- I can say from long experience that it is incredibly difficult to maintain the mental sharpness necessary to evaluate quickly and then respond properly when things go suddenly awry after weeks or months of incredibly boring steady state operations which are, of course, the most desirable kind! Andrew Evans' comment about truckers is on point as well. If you're along for the ride, there just to respond in those moments when "the system" failed, your ability to respond effectively is really degraded. Great post.