When are sub-agents a good idea?
Building a new AI agent is easy. Everything is neat and clean, it does what it should.
However, as you add tool after tool and expand its domain knowledge, you hit a core problem: how do you manage its ever-growing context? The more you stuff into the system prompt, the less any individual instruction really sticks.
There are two ways to solve this:
- One expert sub-agent per domain
- A really solid context engine
Option 1: one expert sub-agent per domain
This is what I intuitively reached for. Divide and conquer. Software engineers love this pattern. It brings back memories of amazing recursive algorithms and clean data structures.
It seems easy at first. Then comes the moment when you realize the context handoff between agents is not easy. Details of the user’s goal get lost or misinterpreted at every boundary.
A concrete example. Imagine you have:
- Agent A: plans a new ad campaign and landing page. Knows the user, their goals, previous work, performance data.
- Agent B: implements the landing page in Preact. Knows tracking, domain-specific best practices, how to place logos, how to build lead forms.
- Agent C: generates ad creatives. Knows image generation, decorating existing images, image-to-video pipelines.
The user talks to A, which spawns B and C as workers. B and C know little about the user, but when prompted correctly they deliver great output.
The problem: to prompt B and C correctly, A has to know exactly what they know and don’t know. Otherwise the context it provides at spawn time will be lacking, missing branding information, project goals, user preferences, performance data from previous campaigns, available tools, and so on.
I usually fixed this two ways:
- Shared context via database records. A persists a “project” record; B and C read from it. Implicit, but works as long as everyone agrees on the schema.
- Explicit prompting instructions. Give A detailed instructions on how to prompt B and C correctly for every case.
Both approaches are vulnerable to context drift. Update how B works and you also have to make sure A still calls it correctly. Improve landing pages end-to-end and you are optimizing multiple agents instead of one. If you iterate fast, this pain compounds quickly.
There is a subtler tell, too: if A already needs to know so much about B and C to prompt them correctly, that is a good indicator that maybe A should just be B and C.
Sub-agents solve 90% of the problem easily. However not the last 10%.
Option 2: a really solid context engine
Have a sophisticated system in place that decides what is relevant for the agent each turn and builds the context accordingly.
This started with good old RAG, but required many more paradigm shifts to really come to fruition. The latest of which is 🦄 Peter Steinberger’s OpenClaw.
I looked at the OpenClaw source code and what I found is that its core idea is a simple one:
Give an agent a sense of self (identity, memory, soul) and let it own a workspace of files and skills, and get rid of the massive multi-domain system prompt.
What was previously baked into the system prompt and the function call
array gets broken up into .md files in the workspace.
The agent reads the files it deems relevant each turn. Only the
relevant fraction ends up in context.
The fat system prompt problem
The workspace approach
Together with function call pruning and regular session history compaction, this makes for a very solid context engine:
(read on demand)
pruning
compaction
context engine
Obviously this also has its issues and is not perfect. But it makes for damn good agents.
So when ARE sub-agents a good idea?
Sub-agents don’t solve the core problem of context management. They break it up. Initially that seems like a simplification. In reality it makes the problem harder.
In the true fashion of divide and conquer, you first have to solve the smallest unit of the problem: a single, well-behaved AI agent that manages its own context properly.
Sub-agents are a good idea after solving the context management problem. Do not treat them as a solution to it.
With that baseline in place, here are three cases where sub-agents genuinely make sense:
Parallelizing background work
Running long tasks in the background without blocking the main chat. If Agent A is user-facing and you don’t want to freeze the conversation while it implements a website, spawning a background worker makes sense.
But wait. Isn’t that exactly the context handoff problem we just described?
Not necessarily.
Don’t spawn a context-constrained sub-agent. Clone your main agent.
A clone carries the full chat history. It knows exactly what the user is working on because it was there for the whole conversation. Context drift is zero. The seam between main agent and worker disappears.
Multiple business cases
In theory, a single OpenClaw-style agent could handle everything. In practice, your business model might not want that. You may need to bill agents separately, enforce permission boundaries, meet compliance requirements, or isolate failure domains. These are legitimate reasons to run separate agents, and they have nothing to do with context management.