2026.05.25

When are sub-agents a good idea?

Part 1 of a series Exploring AI context engineering: how agents manage what they know, when they know it, and what gets lost along the way.

Building a new AI agent is easy. Everything is neat and clean, it does what it should.

However, as you add tool after tool and expand its domain knowledge, you hit a core problem: how do you manage its ever-growing context? The more you stuff into the system prompt, the less any individual instruction really sticks.

There are two ways to solve this:

One expert sub-agent per domain
A really solid context engine

Option 1: one expert sub-agent per domain

This is what I intuitively reached for. Divide and conquer. Software engineers love this pattern. It brings back memories of amazing recursive algorithms and clean data structures.

It seems easy at first. Then comes the moment when you realize the context handoff between agents is not easy. Details of the user’s goal get lost or misinterpreted at every boundary.

A concrete example. Imagine you have:

Agent A: plans a new ad campaign and landing page. Knows the user, their goals, previous work, performance data.
Agent B: implements the landing page in Preact. Knows tracking, domain-specific best practices, how to place logos, how to build lead forms.
Agent C: generates ad creatives. Knows image generation, decorating existing images, image-to-video pipelines.

The user talks to A, which spawns B and C as workers. B and C know little about the user, but when prompted correctly they deliver great output.

Multi-agent topology

Campaign Planner

user · goals · history · performance data

spawns

Landing Page

Preact · tracking · forms · brand

Ad Creative

image gen · decoration · video

The problem: to prompt B and C correctly, A has to know exactly what they know and don’t know. Otherwise the context it provides at spawn time will be lacking, missing branding information, project goals, user preferences, performance data from previous campaigns, available tools, and so on.

What Agent A sends to Agent B vs. what Agent B actually needs

Agent A sends to Agent B

✓ Campaign goal

✗ Brand colors

✗ User style prefs

✗ Performance history

✗ Available tools

✗ Previous designs

Agent B actually needs

✓ Campaign goal

! Brand colors

! User style prefs

! Performance history

! Available tools

! Previous designs

Each gap is a missing assumption Agent B has to fill in itself, often incorrectly.

I usually fixed this two ways:

Shared context via database records. A persists a “project” record; B and C read from it. Implicit, but works as long as everyone agrees on the schema.
Explicit prompting instructions. Give A detailed instructions on how to prompt B and C correctly for every case.

Both approaches are vulnerable to context drift. Update how B works and you also have to make sure A still calls it correctly. Improve landing pages end-to-end and you are optimizing multiple agents instead of one. If you iterate fast, this pain compounds quickly.

Context drift: the maintenance trap

Improve Agent B: new landing page format, updated branding logic

↓

Must update A’s prompting instructions for B (now out of date)

↓

Must verify A still calls B correctly across all cases

↓

Miss one → silent regression in production

Every end-to-end improvement touches multiple agents. Each boundary is a sync point that can go stale.

There is a subtler tell, too: if A already needs to know so much about B and C to prompt them correctly, that is a good indicator that maybe A should just be B and C.

Sub-agents solve 90% of the problem easily. However not the last 10%.

Option 2: a really solid context engine

Have a sophisticated system in place that decides what is relevant for the agent each turn and builds the context accordingly.

This started with good old RAG, but required many more paradigm shifts to really come to fruition. The latest of which is 🦄 Peter Steinberger’s OpenClaw.

I looked at the OpenClaw source code and what I found is that its core idea is a simple one:

Give an agent a sense of self (identity, memory, soul) and let it own a workspace of files and skills, and get rid of the massive multi-domain system prompt.

What was previously baked into the system prompt and the function call array gets broken up into .md files in the workspace. The agent reads the files it deems relevant each turn. Only the relevant fraction ends up in context.

The fat system prompt problem

Before: everything in context, always

System prompt

🪪 Identity & persona

📋 Domain knowledge: campaign planning

💻 Domain knowledge: landing page impl

🎨 Domain knowledge: ad creative gen

🔧 All tool definitions

📊 Performance data & history

🏷️ Brand guidelines

⚙️ User preferences & settings

↓ every single turn

🤖 Agent: drowning in irrelevant context

Every turn pays the full context cost. Instruction following degrades as prompt length grows.

The workspace approach

→ → → →

auto

Workspace

IDENTITY.md

SOUL.md

MEMORY.md

AGENTS.md

skills/landing-page.md

skills/ad-creative.md

skills/campaign-planner.md

skills/email-writer.md

🔸 get_customer_settings()

✓ { logo_url, brand_colors, font_family, tone }

In context this turn

Together with function call pruning and regular session history compaction, this makes for a very solid context engine:

Workspace files
(read on demand)

Function call
pruning

Session history
compaction

Very solid
context engine

Obviously this also has its issues and is not perfect. But it makes for damn good agents.

So when ARE sub-agents a good idea?

Sub-agents don’t solve the core problem of context management. They break it up. Initially that seems like a simplification. In reality it makes the problem harder.

In the true fashion of divide and conquer, you first have to solve the smallest unit of the problem: a single, well-behaved AI agent that manages its own context properly.

Sub-agents are a good idea after solving the context management problem. Do not treat them as a solution to it.

With that baseline in place, here are three cases where sub-agents genuinely make sense:

Parallelizing background work

Running long tasks in the background without blocking the main chat. If Agent A is user-facing and you don’t want to freeze the conversation while it implements a website, spawning a background worker makes sense.

But wait. Isn’t that exactly the context handoff problem we just described?

Not necessarily.

Don’t spawn a context-constrained sub-agent. Clone your main agent.

A clone carries the full chat history. It knows exactly what the user is working on because it was there for the whole conversation. Context drift is zero. The seam between main agent and worker disappears.

Multiple business cases

In theory, a single OpenClaw-style agent could handle everything. In practice, your business model might not want that. You may need to bill agents separately, enforce permission boundaries, meet compliance requirements, or isolate failure domains. These are legitimate reasons to run separate agents, and they have nothing to do with context management.