Most AI agent systems are built once and then repeat the same patterns indefinitely, like an employee who insists every problem can be solved with the same spreadsheet.
New research from Peking University proposes a more effective model: agents that accumulate working solutions as executable code, refine them with each new task, and become more capable over time.
The static agent problem
Most agent systems today follow a simple pattern. An agent receives a task, follows a prompt-defined process, produces output, and moves on.
When it encounters a similar task later, it often starts again from scratch, apparently having learned absolutely nothing from the experience.
Frameworks like AgentFactory, developed by researchers at Peking University and the Beijing Academy of Artificial Intelligence, take a different approach.
Instead of storing successful solutions as prompt tweaks or textual reflections, this kind of framework saves them as executable Python subagents: reusable pieces of code with standardized documentation that can be retrieved, adapted, and redeployed for future tasks.
The result is a system that improves over time:
- Its library of tools expands
- Repeated tasks require less effort
- Performance improves through reuse and refinement
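To make the core idea concrete, here is a minimal sketch of what a stored subagent record might look like. The schema, field names, and example entry are all hypothetical; the paper only specifies that subagents are executable Python with standardized documentation.

```python
from dataclasses import dataclass, field

@dataclass
class Subagent:
    """A stored, executable capability (hypothetical schema)."""
    name: str
    description: str               # standardized documentation, used for retrieval
    code: str                      # executable Python source
    version: int = 1               # bumped each time feedback refines the code
    tags: list = field(default_factory=list)

# The library is simply a growing collection of these records.
library = [
    Subagent(
        name="csv_summarizer",
        description="Load a CSV file and report per-column statistics.",
        code="def run(path): ...",
        tags=["data", "csv"],
    )
]
```

The key design point is that the documentation lives alongside the code, so the same record serves both retrieval and execution.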
Without further ado, here are seven signs your current setup would benefit from this type of architecture...

1. Your agents solve the same problems repeatedly from scratch
If your agent handles similar tasks across sessions and rebuilds its approach every time from a blank prompt, you are spending compute on work that already exists.
Modern architectures address this directly. When a new task arrives without a relevant subagent, the system builds one and saves it for future use.
The next time a similar task appears, the groundwork already exists.
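That retrieve-or-build loop can be sketched in a few lines. Everything here is illustrative: the function names are invented, and the keyword-overlap retrieval is a stand-in for whatever similarity mechanism a real system would use.

```python
def retrieve(task, library):
    """Naive keyword overlap against each subagent's documentation (placeholder)."""
    words = set(task.lower().split())
    for agent in library:
        if words & set(agent["description"].lower().split()):
            return agent
    return None

def handle_task(task, library, build):
    """Reuse a stored subagent when one matches; otherwise build and save one."""
    agent = retrieve(task, library)
    if agent is None:
        agent = build(task)    # in practice, an LLM would generate this code
        library.append(agent)  # persist it so future tasks find the groundwork
    return agent

library = []
build = lambda t: {"description": t, "code": "def run(): pass"}
handle_task("summarize a csv file", library, build)  # first call builds and saves
handle_task("summarize a csv file", library, build)  # second call reuses
```

After both calls the library holds one subagent, not two; the second task paid only the retrieval cost.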
2. Successful runs leave nothing behind
In many frameworks, a successful execution becomes a one-time event. The output is produced, logged, and forgotten.
If successful runs are not creating reusable capability, accumulated knowledge disappears after every session.
3. Your agents get better prompts but not better tools
Prompt engineering remains the default response when agent performance needs improvement:
- Adjust the instructions
- Add more context
- Refine the examples
- Repeat until morale improves
That can help, but it improves the reasoning layer rather than the execution layer.
Frameworks like AgentFactory modify the tools themselves: using execution feedback, they improve existing subagents over time, making them more robust and reusable.
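A feedback-driven refinement step might look like the sketch below. The `patch` callable stands in for an LLM repair pass; the record fields and the whole flow are assumptions, not the paper's actual mechanism.

```python
def refine(agent, error_message, patch):
    """Fold execution feedback back into a stored subagent (illustrative only)."""
    agent["code"] = patch(agent["code"], error_message)  # e.g. an LLM repair step
    agent["version"] = agent.get("version", 1) + 1       # keep a revision trail
    agent.setdefault("history", []).append(error_message)
    return agent

agent = {"name": "csv_summarizer", "code": "def run(path): ...", "version": 1}
patch = lambda code, err: code + "  # patched: " + err
refine(agent, "FileNotFoundError on missing path", patch)
```

The point of the version counter and history is that a refined tool carries its own record of what went wrong before, which later repair passes can consult.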

4. Re-running a task type requires rebuilding context every time
One of the hidden costs of static agent systems is context overhead: every run must re-establish the task framing, constraints, and approach from scratch.
For example, AgentFactory reduces this overhead by retrieving saved subagents for similar tasks, cutting down on repeated reasoning and setup work.
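A slightly more realistic retrieval step might score each subagent's documentation against the incoming task and only return matches above a threshold. This sketch uses the standard library's `difflib` as a cheap similarity proxy; a real system would likely use embeddings, and the threshold value is arbitrary.

```python
from difflib import SequenceMatcher

def best_match(task, library, threshold=0.3):
    """Return the stored subagent whose documentation best matches the task."""
    scored = [
        (SequenceMatcher(None, task.lower(), a["description"].lower()).ratio(), a)
        for a in library
    ]
    score, agent = max(scored, default=(0.0, None), key=lambda pair: pair[0])
    return agent if score >= threshold else None

library = [
    {"description": "summarize a csv file"},
    {"description": "scrape a web page"},
]
match = best_match("summarize csv data", library)
```

When a match clears the threshold, the agent skips straight to adapting existing code instead of rebuilding its approach from a blank prompt.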
5. Your agent’s capabilities are tied to one platform
If your agent’s tools are tightly coupled to one framework or runtime, the work invested in building them stays trapped there.
Because AgentFactory's subagents are plain, documented Python, they can be exported and run in any Python-capable environment.
Portable, documented code turns a tool into a long-term asset.
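Exporting such a subagent can be as simple as writing it out as a standalone module. The record layout here is the same hypothetical one used above, not anything specified by the paper.

```python
import tempfile
from pathlib import Path

def export_subagent(agent, directory):
    """Write a stored subagent to disk as a standalone Python module."""
    path = Path(directory) / f"{agent['name']}.py"
    docstring = f'"""{agent["description"]}"""\n\n'  # keep the docs with the code
    path.write_text(docstring + agent["code"])
    return path

agent = {
    "name": "csv_summarizer",
    "description": "Load a CSV file and report per-column statistics.",
    "code": "def run(path):\n    return path\n",
}
out = export_subagent(agent, tempfile.mkdtemp())
```

The exported file has no dependency on the framework that created it, which is what makes the tool a long-term asset rather than a platform feature.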
6. Performance plateaus after deployment
A well-calibrated static agent often performs strongly at launch and then levels off, much like a New Year’s gym membership.
Without a feedback loop that converts new tasks into improved tooling, capability growth slows quickly.
If your deployment curve flattens after the first few weeks, the absence of a self-improvement loop is probably the reason.
7. Scaling task volume means scaling manual maintenance
In many agent architectures, handling more task types requires:
- More prompt engineering
- More configuration
- More human oversight
A self-evolving architecture reverses that relationship.
As task volume increases, the subagent library should expand alongside it. A growing share of tasks can then be handled by retrieving and adapting existing subagents.
The system scales its own capabilities.
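A toy simulation makes the scaling dynamic visible. Exact string membership stands in for real similarity retrieval, and appending a task stands in for building a subagent; the numbers are purely illustrative.

```python
def run_batch(tasks, library):
    """Count how many tasks reuse a stored subagent vs. require a new build."""
    reused = built = 0
    for task in tasks:
        if task in library:    # stand-in for similarity-based retrieval
            reused += 1
        else:
            library.add(task)  # stand-in for building and saving a subagent
            built += 1
    return reused, built

library = set()
early = run_batch(["a", "b", "c", "a"], library)  # mostly new task types
later = run_batch(["a", "b", "c", "d"], library)  # mostly repeats
```

Early batches are dominated by builds; later batches are dominated by reuse, so marginal cost per task falls as the library grows.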

What this means in practice
This shift is not incremental. It introduces a different mental model for successful agent systems. Instead of a pipeline that runs tasks, you need an infrastructure that learns from them.
For teams building agentic AI systems, the practical takeaway is straightforward:
- Static agents are a starting point
- Reusable capability compounds value over time
- Systems improve fastest when every run produces something reusable
The most capable agent systems will not simply complete tasks. They will build a growing library of proven, portable capabilities that improve with every task they encounter.
Of course, self-improving agent systems come with tradeoffs. Persisting and modifying executable tools introduces questions around verification, security, version control, and failure propagation.
A system that can improve its own tooling can also reinforce bad patterns if the feedback loop is poorly designed.
Final thought
Static agents eventually hit a ceiling. Systems that create reusable tools, refine them through feedback, and carry those improvements forward operate very differently.
Every completed task becomes part of the system’s future capability instead of disappearing into the void five seconds later.
The long-term winners in AI may not be the systems that sound the smartest in a demo. They may be the ones that quietly get better every week while nobody is looking.
Source
Zhang, Z., Lu, S., Qian, H., He, D. and Liu, Z. (2026), “AgentFactory: A Self-Evolving Framework Through Executable Subagent Accumulation and Reuse”, arXiv:2603.18000. Published 18 March 2026.