The OECD’s Verdict on AI in Schools: It Works—But Only When Teachers Lead It

The most authoritative review of generative AI in education published to date has a clear, evidence-based conclusion: AI can transform learning. Without pedagogical guidance, however, it produces a mirage: better outputs, no deeper understanding. Here is what the research actually says, and what schools and governments should do about it.

37% of teachers already use GenAI at work (OECD TALIS 2024)
57% say AI helps them write or improve lesson plans
72% express concern about AI enabling academic dishonesty
−17% performance drop when AI is removed after AI-assisted study
+48% higher task success for students using GenAI, with gains collapsing once AI access is removed
−31% lesson planning time for secondary science teachers in England who used AI support
+9pp student pass rates when low-experience tutors used AI support, closing the expertise gap
15% of people in low-income communities have stable internet; the “second digital divide,” the OECD warns, is widening

For years, every educational technology wave followed the same script: launch with extraordinary promise, deploy with insufficient preparation, disappoint at scale, and leave teachers to pick up the pieces. Interactive whiteboards. MOOCs. Tablet programmes. The pattern was so consistent that the education research community developed a shorthand for it, the “EdTech hype cycle,” and most experienced educators learned to wait it out. Generative AI, the OECD says in its most comprehensive assessment of the field to date, is different: not because it automatically works, but because it can work under specific, identifiable conditions that governments and schools can actually create.

The OECD Digital Education Outlook 2026, published in January and already the most cited education policy report of the year, is the result of a systematic synthesis of emerging international research, policy analysis, and design experiments across dozens of countries. Its core message is neither the uncritical enthusiasm of the EdTech marketing sector nor the reflexive scepticism of commentators who distrust anything algorithmic in a classroom. It is more useful than either: a careful, evidence-based verdict that GenAI has genuine educational potential, that this potential is being realised in specific contexts, and that how AI is designed and used matters more than whether it is used at all.

The Mirage of False Mastery: When AI Helps Without Teaching

The most striking and consequential finding in the Outlook is also the simplest to state: completing a task with GenAI does not mean a student has learned anything. The OECD synthesises multiple studies showing that students who use general-purpose GenAI tools (standard chatbots, not purpose-built educational systems) consistently produce higher-quality outputs than those working without AI. Essays are more polished. Code runs. Problems are “solved.” But when the AI is removed, whether during an examination, a practical exercise, or an assessment conducted without technological support, the advantages disappear. In some cases, they reverse.

Key finding: the AI crutch effect

Students perform better with AI and worse without it

+48% Task success rate for students using AI assistance, compared to peers without it
−17% Performance drop when AI assistance is subsequently removed, e.g., in a traditional exam setting

The OECD describes this as producing a “mirage of false mastery.” Students and teachers both see better outputs and can mistake them for evidence of learning. The AI has improved the product without improving the producer. This is not an argument against AI in education. It is an argument that using general-purpose AI without a pedagogical structure is not education — it is outsourcing.
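To make the two headline figures concrete, here is a minimal arithmetic sketch in Python. It rests on loud assumptions: the control score of 50 is purely hypothetical, and both percentages are treated as relative changes against a control group that never used AI, measured on one common scale. The figures as reported here do not specify their baselines, so this is an illustration of how the gap opens up, not the OECD's method.

```python
def apply_relative_change(baseline: float, pct_change: float) -> float:
    """Return the baseline adjusted by a relative percentage change."""
    return baseline * (1 + pct_change / 100)


# Hypothetical control-group score on a 0-100 scale (not taken from the report).
CONTROL_SCORE = 50.0

# Assumption: +48% applies to AI-assisted work, -17% to later unassisted work.
with_ai = apply_relative_change(CONTROL_SCORE, 48)
without_ai = apply_relative_change(CONTROL_SCORE, -17)

# Distance between what the assisted output suggests and what the
# student can do alone: the "false mastery" gap under these assumptions.
false_mastery_gap = with_ai - without_ai

print(f"with AI: {with_ai:.1f}")
print(f"after AI is removed: {without_ai:.1f}")
print(f"false-mastery gap: {false_mastery_gap:.1f}")
```

Under these assumptions, a grade based only on the assisted product would sit well above what the same student achieves unaided, which is the distortion the report warns about.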

The research reviewed by the OECD is clear about why this happens. Cognitive load theory and decades of learning science establish that genuine understanding requires students to engage in the effortful mental process of connecting new information to existing knowledge — retrieving, elaborating, and applying it. When AI performs those steps on behalf of a student, the neural encoding that constitutes learning either weakens or does not occur. The output is impressive. The learning is absent. This distinction is invisible in a standard assessment of final products, which is one reason it has been missed in many early AI-in-education evaluations.

“When designed or used without pedagogical support, outsourcing tasks to GenAI will only enhance student performance without leading to real learning gains.”

OECD Digital Education Outlook 2026 — core finding, page summary

What Works: GenAI as Tutor, Partner, and Assistant

The Outlook is not a critique of AI in education. It is a guide to what forms of AI use produce genuine learning gains — and the evidence base here is genuinely encouraging. The OECD identifies three productive roles for GenAI in educational settings, each grounded in different evidence streams and each requiring different levels of teacher design and oversight.

AI as a Tutor

Purpose-built intelligent tutoring systems using Socratic questioning and adaptive dialogue to guide students through reasoning, not supplying answers but prompting reflection and self-correction.

Evidence: prototype systems show promise; unlike rigid dialogue trees, GenAI tutors adapt language and approach dynamically to individual learner needs.

AI as a Partner

Collaborative learning support where AI plays a defined role in group tasks — as a devil’s advocate, a knowledge source to be questioned, or a shared writing partner whose contributions students must evaluate and revise.

Evidence: four collaborative AI roles identified — challenger, resource, co-author, and audience. Structured collaboration produces more critical engagement than unstructured AI use.

AI as a Teacher’s Assistant

AI supporting teacher productivity: lesson planning, formative feedback drafts, administrative workflow, resource tagging, and assessment design — freeing teacher time for the high-value human interactions AI cannot replace.

Evidence: −31% time on lesson planning for science teachers in England; +9pp student pass rates when novice tutors used AI pedagogical support.

Across all three roles, the OECD’s synthesis identifies a common design principle: hybrid systems that combine GenAI with explicit pedagogical models show more promise than general-purpose chatbots. An AI tutor that applies structured tutoring strategies and evidence-centred assessment design outperforms a standard chatbot given the same task. A teacher using AI to design formative feedback based on a clear understanding of student misconceptions outperforms a teacher deploying AI without that diagnostic framing. The human expertise is not replaced — it is amplified.

The Teacher is Not Optional: Why Human Agency is the Non-negotiable Variable

A consistent theme throughout the Outlook — and one that the OECD has built a dedicated conceptual framework around — is the centrality of teacher agency. This is not a diplomatic gesture toward a profession that might feel threatened by AI. It is an evidence-based finding about what determines whether AI produces learning or produces the mirage of learning.

The report’s framework for “teacher-AI teaming” distinguishes between two types of AI deployment in education. In the first, AI is handed to students or classrooms as a general-purpose tool with minimal pedagogical framing — the approach that produces the crutch effect. In the second, a teacher designs the context in which AI is used: specifying what role the AI plays, what cognitive work the student must still do independently, how AI outputs will be interrogated rather than accepted, and how assessment will be structured to value process over product.

The difference is not in the technology. The same AI model can produce the crutch effect in one classroom and genuine learning gains in another, depending entirely on how the teacher has structured the encounter. This means that teacher professional development — specifically, development in AI pedagogy, not just AI tool familiarity — is the highest-leverage investment any school system can make right now.

The Second Digital Divide

The OECD’s equity analysis introduces a distinction that will become increasingly important as AI use expands in education. The “first digital divide” was about access — who had devices and connectivity and who did not. Policies to close that divide, like Japan’s GIGA School programme, have had real success in many countries. The “second digital divide” is different: it is about effective pedagogical use. Students and teachers in well-resourced schools, with strong professional development and curriculum alignment, extract genuine learning value from AI. Those in under-resourced settings, without that pedagogical scaffolding, extract the crutch effect — better outputs, no deeper learning — or simply do not use AI at all.

The OECD also flags a connectivity infrastructure problem: only around 15% of people in low-income communities have access to stable internet, compared to near-universal mobile phone access. The report highlights small language models running offline on mobile devices as a promising avenue for AI to bridge these divides — and a large-scale experiment in rural Brazil showed that even with intermittent connectivity and minimal equipment, AI could provide meaningful feedback and guidance when pedagogically designed to do so.

Process Over Product: How Assessment has to Change

One of the most practically significant implications of the OECD’s findings concerns assessment. The traditional model — grade the final essay, the completed problem set, the finished project — is designed to evaluate products. When AI can generate high-quality versions of those products in seconds, grading them no longer reliably measures what a student knows or can do. The assessment is still measuring something. It is no longer measuring learning.

The Outlook advocates for a shift toward what it calls process-oriented assessment: evaluation that focuses not on what a student produced but on how they produced it. How did the student interact with AI? Did they accept its output or interrogate it? How did they revise AI-generated drafts? How did they explain their reasoning when questioned? What decisions did they make, and why? This shift is not simply a response to AI: it reflects a learning-science principle that metacognitive engagement (thinking about your own thinking) is one of the strongest predictors of genuine understanding. AI has made ignoring that principle more expensive.

Implementing process-oriented assessment at scale is technically and logistically challenging. It requires different tools, different teacher training, and, in many systems, different examination frameworks that may require policy changes at the national level. The OECD acknowledges that this is not a short-term transition. But it is also not optional: as general-purpose AI becomes more capable, assessment systems that do not adapt will increasingly measure AI performance rather than student learning.

What Governments Should Actually Do: The OECD’s Four-point Call to Action

The Outlook concludes with specific policy recommendations, structured as four action areas for governments and education systems. These are not aspirational statements. They are derived from the evidence of what conditions produce genuine educational benefit from AI, and translated into the kinds of decisions that ministries, school authorities, and institutions can make.

OECD Digital Education Outlook 2026

Foster human-centred teaching and learning with GenAI: prioritise foundational skills and independent thinking. Introduce AI in a structured sequence: first without AI, then with purpose-built educational AI, then with general-purpose AI. The sequence matters: students need foundational capability before AI augmentation produces learning gains rather than crutch dependency.
Invest in educational GenAI research and development: support tools grounded in learning science, co-created with teachers and learners, and rigorously evaluated for effectiveness. Governments should fund the development of purpose-built educational AI systems rather than simply endorsing the deployment of general-purpose commercial tools in classroom settings without pedagogical design.
Shape a trustworthy AI policy environment by setting clear expectations on privacy, safety, bias, transparency, age-appropriateness, and alignment with educational goals. Current governance frameworks in most countries were not designed for AI in education and are insufficiently specific. AI governance in schools cannot wait for general AI regulation to mature.
Ensure equitable digital infrastructure and professional learning by providing devices, connectivity, curriculum-aligned resources, and, critically, sustained professional learning that develops teachers’ AI pedagogical skills, not just tool familiarity. Without the professional learning component, infrastructure investment produces a second digital divide rather than closing it.

Are We Building Learners or Better Output Generators?

The OECD’s findings land at a moment when the pace of AI capability development is outrunning the pace of educational adaptation. ChatGPT users doubled between 2024 and 2025. General-purpose AI tools are already more capable than the purpose-built educational systems the Outlook recommends. Students are using them anyway, largely outside institutional control, because they are free, intuitive, and enormously effective at producing the outputs that traditional assessment rewards. The question of whether those outputs reflect genuine learning is one that most assessment systems are not yet equipped to answer.

The OECD’s verdict is not pessimistic. The evidence that educational AI genuinely works in well-designed intelligent tutoring systems, in collaborative learning contexts, and in support of teacher productivity is real and growing. A 9-percentage-point improvement in student pass rates when novice tutors received AI pedagogical support is not a trivial finding. A 31% reduction in lesson planning time, if it translates into better lesson quality rather than just faster production, represents a genuine shift in what teaching can look like. The potential is there. The conditions to realise it require deliberate investment.

The harder truth, implicit throughout the Outlook, is that education systems have spent decades struggling to implement even straightforward evidence-based practices (formative feedback, retrieval practice, spaced repetition) at scale. GenAI does not make that implementation challenge simpler. It raises the stakes. Systems that invest in teacher AI pedagogy, process-oriented assessment, and purpose-built educational tools will see genuine learning gains. Those that deploy general-purpose AI without a pedagogical structure will produce a generation of students who are better at generating content and no better at understanding it. The OECD has given governments the evidence they need to choose. The choice now is theirs.

Key Takeaways

GenAI can support genuine learning — but only when embedded in pedagogically grounded designs that require students to do effortful cognitive work. General-purpose AI without that structure produces the “crutch effect”: better outputs, no deeper learning.
Students using unguided AI are 48% more successful at tasks — but perform 17% worse when AI is removed. This gap is the measure of false mastery, and it is invisible to traditional product-based assessment.
The three productive roles for AI in education are: tutor (Socratic dialogue, adaptive explanation), partner (structured collaborative tasks), and teacher’s assistant (lesson design, feedback, administration). Each requires teacher design and oversight.
Teacher agency is the non-negotiable variable. The same AI model produces crutch effects or genuine learning gains depending entirely on how the teacher has structured the encounter. AI pedagogy training, not just tool training, is the highest-leverage investment.
Assessment must shift from product to process: evaluating how students interact with AI, critique its outputs, and explain their reasoning — not just what they produced.
A “second digital divide” is emerging — not between those with devices and those without, but between those with pedagogical scaffolding for AI use and those without. Equity policy must address both layers.
The OECD’s four action areas for governments: human-centred AI pedagogy; investment in purpose-built educational AI research; trustworthy AI governance frameworks; equitable infrastructure and sustained professional learning.
