For something called machine learning, we spend surprisingly little time asking whether machines actually learn.

That isn’t a throwaway line. It’s the central tension behind a recent paper, and it cuts closer to home than much of the field might like.

Because if you look carefully, today’s AI systems do not learn in the way we intuitively mean. They train. They optimize. They scale.

Learning, however, sits in a different category...


The training illusion

Modern AI systems excel at one thing: extracting patterns from large datasets.

Give a model enough data, compute, and time, and it will:

  • Predict the next token
  • Classify images
  • Generate code
  • Pass exams that most of us would prefer to avoid

This represents real progress. It underpins most of the breakthroughs of the last decade.

There is, however, a structural catch.

💡
These systems learn almost entirely in a predefined training phase, often on carefully curated datasets. Once deployed, they operate largely as fixed functions. Updates require retraining, fine-tuning, or human intervention.

In other words, they operate without ongoing learning.

The paper frames this as a fundamental limitation: current systems lack autonomous, continuous learning, the ability to adapt dynamically in open environments.

This gap marks the difference between a system that performs well in benchmarks and one that behaves robustly in the real world.
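The train-then-freeze pattern described above can be made concrete with a deliberately tiny sketch (everything here is illustrative, not from the paper): training produces a fixed function, and nothing that happens after deployment changes it.

```python
def train(dataset):
    # Toy "training phase": fit a single parameter once, offline.
    mean = sum(dataset) / len(dataset)
    # The deployed artifact is a fixed function of its input.
    return lambda _x: mean

model = train([1.0, 2.0, 3.0])

# The world can drift after deployment, but without retraining,
# fine-tuning, or human intervention, the model's behavior stays
# frozen at whatever training produced.
assert model(0.0) == model(100.0) == 2.0
```

However sophisticated the real systems are, the structural point is the same: learning stops when training stops.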

Why this matters for AI professionals

At first glance, this might feel like academic hair-splitting. After all, if the model works, does the learning process matter?

In practice, it matters quite a lot.

The cracks show up in familiar places:

  • Systems struggle with out-of-distribution scenarios
  • Long-horizon tasks break down after a few steps
  • Context fades faster than anyone would like
  • Real-world interaction remains brittle

These patterns appear consistently, and they reflect structural consequences of how learning is currently defined.

The paper compares today’s paradigm to an assembly line: data is collected, models are trained, and outputs are produced. The system itself does not evolve through experience.

That approach works well for static tasks. It performs less effectively in a dynamic world.


Learning, according to biology

To understand what’s missing, the authors take an unusual route for an AI paper and look to cognitive science.

Biological systems do not separate learning into neat phases. They combine multiple modes of learning, continuously:

  • Passive observation
  • Active interaction
  • Internal control over what to learn and when

The paper formalizes this into three components:

1. System A: Learning from observation

This aligns closely with what current models do: learning patterns from data, often through self-supervision.

2. System B: Learning from action

Here, learning happens through interaction with the environment, through trial, error, feedback, and adaptation.

3. System M: Meta-control

This is the interesting layer. A system that decides how to learn, when to observe, when to act, and how to allocate resources.

Current AI systems lean heavily toward System A. The other two appear only in partial form.

Which explains quite a lot.
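The three components can be sketched as a toy agent. The names System A, B, and M come from the paper; the mechanics below (a running mean, a feedback-driven correction, a crude switching rule) are purely illustrative assumptions.

```python
class Agent:
    def __init__(self):
        self.estimate = 0.0  # internal model of one environment quantity
        self.n = 0           # number of observations so far

    def observe(self, sample):
        # System A: learn passively from observed data (running mean).
        self.n += 1
        self.estimate += (sample - self.estimate) / self.n

    def act(self, env):
        # System B: learn from interaction; the environment returns
        # an error signal for the action just tried.
        feedback = env(self.estimate)
        self.estimate += 0.1 * feedback

    def step(self, stream, env):
        # System M: meta-control decides HOW to learn at this moment:
        # observe while data is scarce, switch to acting afterwards.
        if self.n < 5:
            self.observe(next(stream))
        else:
            self.act(env)

target = 10.0
stream = iter([9.0, 11.0, 10.5, 9.5, 10.0])  # observations to learn from
env = lambda guess: target - guess           # interaction feedback

agent = Agent()
for _ in range(30):
    agent.step(stream, env)

assert abs(agent.estimate - target) < 0.5
```

Current systems, in this framing, are almost all `observe`: the `act` and `step` layers are rudimentary or absent.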

The missing ingredient: Autonomy

The paper’s core claim is simple and slightly uncomfortable:

Today’s AI systems function as non-autonomous learners.

They rely on:

  • Curated datasets
  • Predefined objectives
  • External supervision

They lack the ability to:

  • Generate their own learning signals
  • Adapt continuously to new environments
  • Build internal models that evolve over time

💡
This explains why scaling alone shows diminishing returns. Performance improves, yet the learning paradigm stays the same. It resembles upgrading a car’s engine while leaving the steering system untouched.

What a different architecture might look like

The authors move beyond critique and outline a path forward, blending ideas from reinforcement learning, self-supervision, and cognitive science.

At a high level, future systems would:

  • Learn from both observation and interaction, rather than static data alone
  • Continuously update their internal representations
  • Use meta-control mechanisms to guide learning dynamically
  • Operate in open-ended environments instead of fixed datasets

In this framing, learning becomes an ongoing process rather than a one-time event.

Training becomes the starting point.
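A minimal sketch of that shift, under toy assumptions of my own: an offline phase produces the initial model, and simple online updates keep adapting it after deployment rather than freezing it.

```python
def pretrain(data):
    # Training as the starting point: fit an initial estimate offline.
    return sum(data) / len(data)

def online_update(estimate, observation, lr=0.2):
    # Learning as an ongoing process: keep updating from experience
    # after deployment instead of treating the model as fixed.
    return estimate + lr * (observation - estimate)

estimate = pretrain([1.0, 2.0, 3.0])        # starts at 2.0
for obs in [8.0, 9.0, 10.0, 10.0, 10.0]:    # the world has drifted upward
    estimate = online_update(estimate, obs)

assert estimate > 5.0  # the deployed system has adapted toward the drift
```

The contrast with the earlier assembly-line picture is the loop: experience flows back into the model instead of ending at the output.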


Two shifts this implies

If you take the paper seriously, it points to two broader shifts for the field:

1. From datasets to environments

The center of gravity moves away from static corpora toward interactive, evolving environments.

Think less “pre-training on the internet,” more “learning through experience.”

2. From optimization to adaptation

Performance metrics shift from accuracy on benchmarks to adaptability over time.

The question changes from “how well does it perform?” to “how quickly can it improve?”
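One hypothetical way to operationalize that change of question (these metrics are my illustration, not the paper's): score a static benchmark once, versus tracking how the score trends across repeated evaluations of an adapting system.

```python
# Static view: a single score on a fixed benchmark.
def benchmark_score(predict, test_set):
    return sum(predict(x) == y for x, y in test_set) / len(test_set)

# Adaptive view: the average per-checkpoint gain from re-evaluating
# the same system over time as it keeps learning.
def improvement_rate(scores):
    return (scores[-1] - scores[0]) / (len(scores) - 1)

# Hypothetical scores from an adapting system at four checkpoints.
scores = [0.60, 0.68, 0.75, 0.81]
assert improvement_rate(scores) > 0.05  # improving, not merely performing
```

Under this framing, two systems with identical benchmark scores can look very different: one is plateaued, the other is still climbing.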


A quick reality check

Before declaring a major shift in machine learning, it helps to stay grounded.

Current approaches:

  • Work extremely well in many domains
  • Scale predictably
  • Deliver real economic value

All of that remains true.

What this paper highlights is not failure, but incompleteness.

The field has built systems that excel at extracting patterns from data. Systems that interact with data continuously, adapt in real time, and evolve through experience still sit on the horizon.

So… does AI learn?

Yes, within a narrow and well-defined frame.

It learns during training. It generalizes within bounds. It performs impressively under the right conditions.

Continuous, interactive, self-directed learning remains out of reach.

That distinction matters.

Because the next wave of AI progress likely depends less on scaling the same paradigm and more on expanding what “learning” actually means.


Final thought

The field has achieved remarkable progress through pattern recognition.

Now it’s encountering the limits of that success.

The question has shifted.

It’s no longer about whether machines can learn from data. It’s now about whether they can learn from the world.

And that is a much harder problem, though it does come with better benchmarks than ImageNet.