Sub-Agents & Parallel Execution
Delegate work to specialized agents. Run multiple agents concurrently and compress wall-clock time dramatically.
Why Sub-Agents?
A senior developer doesn't do everything themselves. They delegate: one person explores the codebase, another writes the plan, a third reviews the code. Harness agents work the same way.
Two key benefits drive the sub-agent model:
- Specialization — different agents optimized for different tasks (exploring vs planning vs reviewing)
- Parallelism — run multiple agents concurrently and get results in the wall-clock time of the slowest, not the sum of all
| Feature | Harness | Claude Code | Cursor | Aider | OpenHands | SWE-Agent |
|---|---|---|---|---|---|---|
| Sub-agents | ✓ (4 types) | ✓ (Task) | ✗ | ✗ | ✗ | ✗ |
| Parallel spawn | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ |
| Custom agent defs | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Agent types | 4 built-in | 1 generic | 0 | 0 | 0 | 0 |
Agent Types
Harness ships with four built-in agent types. Each has a predefined tool set and permission level.
| Agent Type | Default max_turns | Access |
|---|---|---|
| general | 50 | Read + Write (Read, Write, Edit, Bash, Glob, Grep) |
| explore | 20 | Read-only (Read, Glob, Grep) |
| plan | 30 | Read-only (Read, Glob, Grep) |
| review | 30 | Read-only (Read, Glob, Grep) |
Explore, plan, and review agents are read-only by default — they can search and read but never modify files. This makes them safe to run in parallel without risk of conflicting writes.
You can also define custom agent types using AgentDef:
```python
from harness.types.agents import AgentDef

# Define a custom agent type
security_agent = AgentDef(
    name="security",
    description="Security-focused code reviewer",
    model=None,  # None = inherit from parent
    tools=("Read", "Glob", "Grep"),
    system_prompt="You are a security expert. Identify vulnerabilities only.",
    max_turns=50,
    read_only=True,
)
```
```python
# From harness.types.agents
class AgentDef:
    name: str
    description: str
    model: str | None = None      # Override model, None = inherit
    tools: tuple[str, ...] = ()   # Allowed tools, empty = inherit
    system_prompt: str | None = None
    max_turns: int = 50
    read_only: bool = False
```
Spawning a Sub-Agent
The simplest way to use sub-agents is through harness.run(). When the agent decides to
delegate, it uses the built-in Task tool internally:
```python
import asyncio
import harness

async def main():
    # Watch for Task tool calls: each one is a sub-agent being spawned
    async for msg in harness.run(
        "Explore the src/ directory and summarize the architecture",
        provider="anthropic",
    ):
        if isinstance(msg, harness.ToolUse) and msg.name == "Task":
            print(f"Sub-agent spawned: {msg.args.get('prompt', '')[:50]}...")
        elif isinstance(msg, harness.Result):
            print(f"Done: {msg.text[:200]}")

asyncio.run(main())
```
For direct control, use AgentManager to spawn agents explicitly:
```python
from harness.agents.manager import AgentManager

# Create manager (provider and tools come from your setup)
manager = AgentManager(
    provider=my_provider,
    tools=my_tools,
    cwd="/path/to/project",
)

# Spawn a single sub-agent
result = await manager.spawn(
    "explore",
    "Find all API endpoint definitions in this codebase",
)
print(result)  # Returns the agent's response text
```
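Because spawn makes network calls, transient provider failures are possible. For pipelines, a small retry wrapper helps; this is a sketch, assuming transient errors surface as exceptions (narrow the except clause to your provider's real error types):

```python
import asyncio

async def spawn_with_retry(manager, agent_name: str, prompt: str,
                           *, retries: int = 3, base_delay: float = 1.0) -> str:
    """Call manager.spawn, retrying with exponential backoff on failure."""
    for attempt in range(retries):
        try:
            return await manager.spawn(agent_name, prompt)
        except Exception:  # assumption: transient errors raise; narrow this in real code
            if attempt == retries - 1:
                raise  # out of retries, surface the error
            await asyncio.sleep(base_delay * 2 ** attempt)
```

The backoff doubles after each failed attempt (1s, 2s, 4s by default), which is usually enough to ride out a rate-limit window.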
```python
# From harness.agents.manager
class AgentManager:
    def __init__(
        self,
        provider: ProviderAdapter,
        tools: dict,
        cwd: str | Path,
        context_window: int = 200_000,
    ): ...

    async def spawn(
        self,
        agent_name: str,
        prompt: str,
        *,
        model: str | None = None,
    ) -> str: ...

    async def spawn_parallel(
        self,
        tasks: list[tuple[str, str]],
        *,
        model: str | None = None,
    ) -> list[str]: ...
```
Sequential Workflow
The classic pattern is Explore → Plan → Implement. Each agent hands its output to the next as context:
```python
# team_sequential.py
from harness.agents.manager import AgentManager

async def sequential_workflow(manager: AgentManager):
    # Step 1: Explore the codebase
    print("Step 1: Exploring...")
    exploration = await manager.spawn(
        "explore",
        "Map out the authentication system: find all files, classes, and functions involved",
    )
    print(f"Found: {exploration[:100]}...")

    # Step 2: Plan changes based on exploration
    print("Step 2: Planning...")
    plan = await manager.spawn(
        "plan",
        f"Based on this exploration:\n{exploration}\n\nPlan how to add OAuth2 support",
    )
    print(f"Plan: {plan[:100]}...")

    # Step 3: Implement (uses general agent with write access)
    print("Step 3: Implementing...")
    implementation = await manager.spawn(
        "general",
        f"Implement this plan:\n{plan}",
    )
    print(f"Done: {implementation[:100]}...")
    return implementation
```
Each agent's full response is passed as context to the next. This gives later agents complete awareness of what earlier agents found, without inflating a single agent's context window with tool calls.
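When an exploration transcript runs long, passing it verbatim can still crowd the next agent's prompt. A minimal helper for capping the handoff (the 8,000-character default is an arbitrary assumption; tune it to your model's context window):

```python
def cap_handoff(text: str, limit: int = 8_000) -> str:
    """Truncate one agent's output before handing it to the next agent.

    Keeps the beginning, since agent summaries usually front-load
    the key findings.
    """
    if len(text) <= limit:
        return text
    return text[:limit] + "\n...[truncated for handoff]..."
```

You would then pass `cap_handoff(exploration)` instead of `exploration` into the planning prompt.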
Parallel Workflow
When tasks are independent, use spawn_parallel to run them simultaneously.
All agents start at the same time and results are returned once all complete:
```python
# team_parallel.py
from harness.agents.manager import AgentManager

async def parallel_exploration(manager: AgentManager):
    # Spawn 3 explore agents in parallel
    results = await manager.spawn_parallel([
        ("explore", "Find all database models and their relationships"),
        ("explore", "Find all API routes and their handlers"),
        ("explore", "Find all test files and their coverage"),
    ])
    db_models, api_routes, test_coverage = results

    print(f"DB Models: {db_models[:100]}...")
    print(f"API Routes: {api_routes[:100]}...")
    print(f"Test Coverage: {test_coverage[:100]}...")

    # Now plan based on all three findings
    combined = f"""
    Database Models: {db_models}
    API Routes: {api_routes}
    Test Coverage: {test_coverage}
    """
    plan = await manager.spawn(
        "plan",
        f"Based on this analysis:\n{combined}\n\nPlan a comprehensive refactoring",
    )
    return plan
```
Three explorations that each take 30 seconds run in 30 seconds total — not 90. The wall-clock time is bounded by the slowest agent, not the sum.
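The claim is easy to verify with plain asyncio: three simulated agents that sleep for different durations finish together in roughly the time of the slowest (the sleeps stand in for real API latency):

```python
import asyncio
import time

async def fake_agent(name: str, seconds: float) -> str:
    await asyncio.sleep(seconds)  # stands in for the agent's API round-trips
    return f"{name} done"

async def timed_run() -> float:
    start = time.monotonic()
    await asyncio.gather(
        fake_agent("models", 0.1),
        fake_agent("routes", 0.2),
        fake_agent("tests", 0.3),
    )
    return time.monotonic() - start

elapsed = asyncio.run(timed_run())
print(f"elapsed: {elapsed:.2f}s")  # close to 0.3s, not 0.6s
```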
Build a Team Workflow Script
Here is a complete, runnable three-phase team workflow: parallel exploration, then planning, then review — with timing and cost reporting at each stage:
```python
# team_workflow.py — full team workflow with parallel exploration
import asyncio
import time

import harness

async def team_workflow():
    start = time.time()

    # Phase 1: Parallel exploration (3 agents at once)
    print("=== Phase 1: Parallel Exploration ===")
    phase1_start = time.time()
    async for msg in harness.run(
        """Run three parallel explorations:
        1. Find all Python files with security vulnerabilities
        2. Find all files missing type hints
        3. Find all TODO/FIXME comments
        Use sub-agents to explore in parallel, then summarize all findings.""",
        provider="anthropic",
        model="claude-sonnet-4-20250514",
        permission_mode="plan",  # Read-only for safety
    ):
        if isinstance(msg, harness.TextMessage) and not msg.is_partial:
            print(msg.text)
        elif isinstance(msg, harness.Result):
            phase1_time = time.time() - phase1_start
            print(f"\nPhase 1 complete: {phase1_time:.1f}s")
            print(f"  Turns: {msg.turns}, Tools: {msg.tool_calls}")
            exploration_result = msg.text

    # Phase 2: Planning (single agent)
    print("\n=== Phase 2: Planning ===")
    async for msg in harness.run(
        f"Based on these findings:\n{exploration_result}\n\nCreate a prioritized action plan",
        provider="anthropic",
        permission_mode="plan",
    ):
        if isinstance(msg, harness.Result):
            plan_result = msg.text
            print(f"Plan created: {len(plan_result)} chars")

    # Phase 3: Review (single agent)
    print("\n=== Phase 3: Review ===")
    async for msg in harness.run(
        f"Review this plan for completeness and risks:\n{plan_result}",
        provider="anthropic",
        permission_mode="plan",
    ):
        if isinstance(msg, harness.Result):
            total_time = time.time() - start
            print(f"\nTotal time: {total_time:.1f}s")
            print(f"Total cost: ${msg.total_cost:.4f}")

asyncio.run(team_workflow())
```
Save this file and run `uv run team_workflow.py` from your project root. The agent spawns explore sub-agents in parallel during Phase 1.
Wall-Clock Time Savings
Parallelism yields real, measurable time savings. Because read-only agents never conflict, they can all run simultaneously:
| Approach | Sequential | Parallel (3 agents) | Savings |
|---|---|---|---|
| 3 explorations (30s each) | 90s | 30s | 67% |
| 5 file reviews (20s each) | 100s | 20s | 80% |
| Explore + Plan + Review | 60s | ~35s | 42% |
Parallel agents share the same provider and API key. Each agent gets its own context window and tool access. API rate limits are the only practical ceiling on parallelism.
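When rate limits become the ceiling, you can cap the number of in-flight agents with a semaphore rather than launching everything at once. Here is a sketch that wraps manager.spawn (the default limit of 3 is an assumption; set it to whatever your provider tier tolerates):

```python
import asyncio

async def spawn_capped(manager, tasks: list[tuple[str, str]],
                       *, limit: int = 3) -> list[str]:
    """Run (agent_name, prompt) tasks concurrently, at most `limit` in flight."""
    sem = asyncio.Semaphore(limit)

    async def one(agent_name: str, prompt: str) -> str:
        async with sem:  # blocks while `limit` agents are already running
            return await manager.spawn(agent_name, prompt)

    # gather preserves task order in its results
    return await asyncio.gather(*(one(n, p) for n, p in tasks))
```

Results come back in the same order as the input list, so unpacking works exactly as with spawn_parallel.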
spawn_parallel uses asyncio.gather to launch all agents concurrently.
Each agent runs its own async loop, makes independent API calls, and uses its own context.
Results are collected once all coroutines complete.
```python
# Conceptually, spawn_parallel does this:
results = await asyncio.gather(
    *(manager.spawn(agent_name, prompt) for agent_name, prompt in tasks)
)
# All coroutines run concurrently; gather returns when all complete
```
The /team REPL Command
You do not need to write a script to use sub-agents. The /team REPL command
automatically decomposes your request and parallelizes it:
```shell
# Start REPL
harness

# Use /team to decompose and parallelize
> /team Add comprehensive test coverage for the auth module

# Harness automatically:
# 1. Spawns an explore agent to find existing tests
# 2. Spawns an explore agent to find auth module files
# 3. Plans test cases
# 4. Implements tests
```
Use /team for one-off tasks in the REPL: it is the easiest way to use sub-agents, deciding on its own how to split the work, with no scripting required. Use AgentManager directly when building automated pipelines, CI integrations, or workflows that need fine-grained control over agent selection and prompt templating.
Next Steps
You now know how to spawn specialized sub-agents, chain them sequentially, and run them in parallel. The next tutorial takes this further — integrating Harness into your CI/CD pipeline so every pull request gets an automatic AI review.
Set up GitHub Actions to auto-review PRs and triage issues using Harness — with native webhook parsing and Check Runs reporting.