Sub-Agents & Parallel Execution

Delegate work to specialized agents. Run multiple agents concurrently and compress wall-clock time dramatically.

Intermediate · 20 min read
1. Why Sub-Agents?

A senior developer doesn't do everything themselves. They delegate: one person explores the codebase, another writes the plan, a third reviews the code. Harness agents work the same way.

Two key benefits drive the sub-agent model:

  1. Specialization — different agents optimized for different tasks (exploring vs planning vs reviewing)
  2. Parallelism — run multiple agents concurrently and get results in the wall-clock time of the slowest, not the sum of all
Harness vs the field: Sub-agent support

| Feature | Harness | Claude Code | Cursor | Aider | OpenHands | SWE-Agent |
|---|---|---|---|---|---|---|
| Sub-agents | ✓ (4 types) | ✓ (Task) | | | | |
| Parallel spawn | ✓ | | | | | |
| Custom agent defs | ✓ | | | | | |
| Agent types | 4 built-in | 1 generic | 0 | 0 | 0 | 0 |
2. Agent Types

Harness ships with four built-in agent types. Each has a predefined tool set and permission level.

- **general**: general-purpose tasks with full tool access (Read, Write, Edit, Bash, Glob, Grep); Read + Write
- **explore**: code exploration and research (Read, Glob, Grep); read-only
- **plan**: architecture and planning (Read, Glob, Grep); read-only
- **review**: code review and analysis (Read, Glob, Grep); read-only

| Agent type | Default max_turns | Access |
|---|---|---|
| general | 50 | Read + Write (Read, Write, Edit, Bash, Glob, Grep) |
| explore | 20 | Read-only (Read, Glob, Grep) |
| plan | 30 | Read-only (Read, Glob, Grep) |
| review | 30 | Read-only (Read, Glob, Grep) |
Read-only by default

Explore, plan, and review agents are read-only by default — they can search and read but never modify files. This makes them safe to run in parallel without risk of conflicting writes.
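Conceptually, the read-only guarantee amounts to filtering the tool set before the agent ever sees it. A minimal sketch of that idea (illustrative only; `READ_ONLY_TOOLS` and `tools_for` are not part of the Harness API):

```python
# Hypothetical sketch: restrict read-only agent types to search/read tools.
READ_ONLY_TOOLS = {"Read", "Glob", "Grep"}

def tools_for(agent_type: str, all_tools: dict) -> dict:
    """explore/plan/review agents only ever see the read-only tools."""
    if agent_type in ("explore", "plan", "review"):
        return {name: t for name, t in all_tools.items() if name in READ_ONLY_TOOLS}
    return dict(all_tools)

full = {"Read": ..., "Write": ..., "Edit": ..., "Bash": ..., "Glob": ..., "Grep": ...}
print(sorted(tools_for("explore", full)))   # ['Glob', 'Grep', 'Read']
print(sorted(tools_for("general", full)))
```

Because a read-only agent never receives Write, Edit, or Bash, there is no code path by which it can modify the filesystem, which is what makes parallel runs conflict-free.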

You can also define custom agent types using AgentDef:

python
from harness.types.agents import AgentDef

# Define a custom agent type
security_agent = AgentDef(
    name="security",
    description="Security-focused code reviewer",
    model=None,           # None = inherit from parent
    tools=("Read", "Glob", "Grep"),
    system_prompt="You are a security expert. Identify vulnerabilities only.",
    max_turns=50,
    read_only=True,
)
python
# From harness.types.agents
class AgentDef:
    name: str
    description: str
    model: str | None = None       # Override model, None = inherit
    tools: tuple[str, ...] = ()    # Allowed tools, empty = inherit
    system_prompt: str | None = None
    max_turns: int = 50
    read_only: bool = False
3. Spawning a Sub-Agent

The simplest way to use sub-agents is through harness.run(). When the agent decides to delegate, it uses the built-in Task tool internally:

python
import asyncio
import harness

async def main():
    # harness.run() spawns sub-agents internally via the Task tool;
    # watch the message stream to see when one is launched.
    async for msg in harness.run(
        "Explore the src/ directory and summarize the architecture",
        provider="anthropic",
    ):
        if isinstance(msg, harness.ToolUse) and msg.name == "Task":
            print(f"Sub-agent spawned: {msg.args.get('prompt', '')[:50]}...")
        elif isinstance(msg, harness.Result):
            print(f"Done: {msg.text[:200]}")

asyncio.run(main())

For direct control, use AgentManager to spawn agents explicitly:

python
from harness.agents.manager import AgentManager

# Create manager (provider and tools come from your setup)
manager = AgentManager(
    provider=my_provider,
    tools=my_tools,
    cwd="/path/to/project",
)

# Spawn a single sub-agent
result = await manager.spawn(
    "explore",
    "Find all API endpoint definitions in this codebase",
)
print(result)  # result is the agent's final response text
python
# From harness.agents.manager
class AgentManager:
    def __init__(
        self,
        provider: ProviderAdapter,
        tools: dict,
        cwd: str | Path,
        context_window: int = 200_000,
    ): ...

    async def spawn(
        self,
        agent_name: str,
        prompt: str,
        *,
        model: str | None = None,
    ) -> str: ...

    async def spawn_parallel(
        self,
        tasks: list[tuple[str, str]],
        *,
        model: str | None = None,
    ) -> list[str]: ...
4. Sequential Workflow

The classic pattern is Explore → Plan → Implement. Each agent hands its output to the next as context:

python team_sequential.py
# team_sequential.py
import asyncio
from harness.agents.manager import AgentManager

async def sequential_workflow(manager: AgentManager):
    # Step 1: Explore the codebase
    print("Step 1: Exploring...")
    exploration = await manager.spawn(
        "explore",
        "Map out the authentication system: find all files, classes, and functions involved",
    )
    print(f"Found: {exploration[:100]}...")

    # Step 2: Plan changes based on exploration
    print("Step 2: Planning...")
    plan = await manager.spawn(
        "plan",
        f"Based on this exploration:\n{exploration}\n\nPlan how to add OAuth2 support",
    )
    print(f"Plan: {plan[:100]}...")

    # Step 3: Implement (uses general agent with write access)
    print("Step 3: Implementing...")
    implementation = await manager.spawn(
        "general",
        f"Implement this plan:\n{plan}",
    )
    print(f"Done: {implementation[:100]}...")

    return implementation
Context flows forward

Each agent's full response is passed as context to the next. This gives later agents complete awareness of what earlier agents found, without inflating a single agent's context window with tool calls.
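When an exploration result is very long, you may still want to cap what flows forward so the next agent's context budget survives. A hypothetical helper (`forward_context` is not part of Harness) could truncate with an explicit marker:

```python
def forward_context(text: str, max_chars: int = 8000) -> str:
    """Cap context handed to the next agent, marking the cut explicitly."""
    if len(text) <= max_chars:
        return text
    return text[:max_chars] + "\n[...truncated...]"

# A 40,000-char exploration gets capped before it reaches the planner:
plan_prompt = f"Based on this exploration:\n{forward_context('finding ' * 5000)}\n\nPlan the change"
print(len(plan_prompt))
```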

5. Parallel Workflow

When tasks are independent, use spawn_parallel to run them simultaneously. All agents start at the same time and results are returned once all complete:

python team_parallel.py
# team_parallel.py
import asyncio
from harness.agents.manager import AgentManager

async def parallel_exploration(manager: AgentManager):
    # Spawn 3 explore agents in parallel
    results = await manager.spawn_parallel([
        ("explore", "Find all database models and their relationships"),
        ("explore", "Find all API routes and their handlers"),
        ("explore", "Find all test files and their coverage"),
    ])

    db_models, api_routes, test_coverage = results

    print(f"DB Models: {db_models[:100]}...")
    print(f"API Routes: {api_routes[:100]}...")
    print(f"Test Coverage: {test_coverage[:100]}...")

    # Now plan based on all three findings
    combined = f"""
    Database Models: {db_models}
    API Routes: {api_routes}
    Test Coverage: {test_coverage}
    """

    plan = await manager.spawn(
        "plan",
        f"Based on this analysis:\n{combined}\n\nPlan a comprehensive refactoring",
    )
    return plan
The key insight

Three explorations that each take 30 seconds run in 30 seconds total — not 90. The wall-clock time is bounded by the slowest agent, not the sum.
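The effect is easy to verify with plain `asyncio`, using stub sleeps in place of agent API calls:

```python
import asyncio
import time

async def fake_explore(name: str, seconds: float) -> str:
    await asyncio.sleep(seconds)      # stands in for the agent's API round-trips
    return f"{name} done"

async def main() -> float:
    start = time.monotonic()
    results = await asyncio.gather(
        fake_explore("models", 0.3),
        fake_explore("routes", 0.2),
        fake_explore("tests", 0.1),
    )
    elapsed = time.monotonic() - start
    print(results)
    print(f"elapsed: {elapsed:.2f}s")  # close to 0.3s (the slowest), not 0.6s (the sum)
    return elapsed

asyncio.run(main())
```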

6. Build a Team Workflow Script

Here is a complete, runnable three-phase team workflow: parallel exploration, then planning, then review — with timing and cost reporting at each stage:

python team_workflow.py
# team_workflow.py — full team workflow with parallel exploration
import asyncio
import time
import harness

async def team_workflow():
    start = time.time()

    # Phase 1: Parallel exploration (3 agents at once)
    print("=== Phase 1: Parallel Exploration ===")
    phase1_start = time.time()

    async for msg in harness.run(
        """Run three parallel explorations:
        1. Find all Python files with security vulnerabilities
        2. Find all files missing type hints
        3. Find all TODO/FIXME comments

        Use sub-agents to explore in parallel, then summarize all findings.""",
        provider="anthropic",
        model="claude-sonnet-4-20250514",
        permission_mode="plan",  # Read-only for safety
    ):
        if isinstance(msg, harness.TextMessage) and not msg.is_partial:
            print(msg.text)
        elif isinstance(msg, harness.Result):
            phase1_time = time.time() - phase1_start
            print(f"\nPhase 1 complete: {phase1_time:.1f}s")
            print(f"  Turns: {msg.turns}, Tools: {msg.tool_calls}")
            exploration_result = msg.text

    # Phase 2: Planning (single agent)
    print("\n=== Phase 2: Planning ===")
    async for msg in harness.run(
        f"Based on these findings:\n{exploration_result}\n\nCreate a prioritized action plan",
        provider="anthropic",
        permission_mode="plan",
    ):
        if isinstance(msg, harness.Result):
            plan_result = msg.text
            print(f"Plan created: {len(plan_result)} chars")

    # Phase 3: Review (single agent)
    print("\n=== Phase 3: Review ===")
    async for msg in harness.run(
        f"Review this plan for completeness and risks:\n{plan_result}",
        provider="anthropic",
        permission_mode="plan",
    ):
        if isinstance(msg, harness.Result):
            total_time = time.time() - start
            print(f"\nTotal time: {total_time:.1f}s")
            print(f"Total cost: ${msg.total_cost:.4f}")

asyncio.run(team_workflow())
Run it

Save this file and run uv run team_workflow.py from your project root. The agent will spawn explore sub-agents in parallel during Phase 1.

7. Wall-Clock Time Savings

Parallelism yields real, measurable time savings. Because read-only agents never conflict, they can all run simultaneously:

| Approach | Sequential | Parallel (3 agents) | Savings |
|---|---|---|---|
| 3 explorations (30s each) | 90s | 30s | 67% |
| 5 file reviews (20s each) | 100s | 20s | 80% |
| Explore + Plan + Review | 60s | ~35s | 42% |
Shared provider, separate contexts

Parallel agents share the same provider and API key. Each agent gets its own context window and tool access. API rate limits are the only practical ceiling on parallelism.
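When you do run into rate limits, a common pattern is to bound concurrency with a semaphore. This is plain `asyncio`, not a Harness API; `spawn_limited` and the stub are illustrative:

```python
import asyncio

async def spawn_limited(spawn, tasks, max_concurrent: int = 3):
    """Run (agent_name, prompt) tasks concurrently, at most max_concurrent at once."""
    sem = asyncio.Semaphore(max_concurrent)

    async def one(name: str, prompt: str):
        async with sem:               # wait for a free slot before spawning
            return await spawn(name, prompt)

    return await asyncio.gather(*(one(n, p) for n, p in tasks))

async def stub_spawn(name: str, prompt: str) -> str:
    await asyncio.sleep(0.01)         # pretend API latency
    return f"{name}: {prompt}"

out = asyncio.run(spawn_limited(stub_spawn, [("explore", "models"), ("explore", "routes")]))
print(out)
```

With a real `AgentManager`, `manager.spawn` could be passed in place of `stub_spawn`.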

spawn_parallel uses asyncio.gather to launch all agents concurrently. Each agent runs its own async loop, makes independent API calls, and uses its own context. Results are collected once all coroutines complete.

python
# Conceptually, spawn_parallel does this:
results = await asyncio.gather(
    manager.spawn("explore", tasks[0][1]),
    manager.spawn("explore", tasks[1][1]),
    manager.spawn("explore", tasks[2][1]),
)
# All three run concurrently; gather returns when all complete
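One caveat with `asyncio.gather`: by default, a single failing coroutine cancels the rest. If you replicate `spawn_parallel` yourself and want partial results when one agent fails, pass `return_exceptions=True` (plain asyncio; `flaky` is a stub):

```python
import asyncio

async def flaky(i: int) -> str:
    if i == 1:
        raise RuntimeError("agent 1 failed")
    await asyncio.sleep(0.01)
    return f"agent {i} ok"

async def main():
    # Exceptions come back as values instead of cancelling the other agents.
    results = await asyncio.gather(*(flaky(i) for i in range(3)),
                                   return_exceptions=True)
    for r in results:
        print("error:" if isinstance(r, Exception) else "ok:", r)
    return results

results = asyncio.run(main())
```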
8. The /team REPL Command

You do not need to write a script to use sub-agents. The /team REPL command automatically decomposes your request and parallelizes it:

bash
# Start REPL
harness

# Use /team to decompose and parallelize
> /team Add comprehensive test coverage for the auth module
# Harness automatically:
# 1. Spawns an explore agent to find existing tests
# 2. Spawns an explore agent to find auth module files
# 3. Plans test cases
# 4. Implements tests
Try it now

The /team command is the easiest way to use sub-agents. It automatically decides how to split the work — no scripting required.

When to use /team vs scripting

Use /team for one-off tasks in the REPL. Use AgentManager directly when building automated pipelines, CI integrations, or workflows that need fine-grained control over agent selection and prompt templating.

9. Next Steps

You now know how to spawn specialized sub-agents, chain them sequentially, and run them in parallel. The next tutorial takes this further — integrating Harness into your CI/CD pipeline so every pull request gets an automatic AI review.

Tutorial 9: CI/CD Integration

Set up GitHub Actions to auto-review PRs and triage issues using Harness — with native webhook parsing and Check Runs reporting.