Enterprise Production Deployment
The definitive guide to deploying Harness with all 14 enterprise features enabled simultaneously. This capstone tutorial brings together multi-provider routing, policy-as-code, audit logging, OpenTelemetry observability, Docker sandboxing, and cloud-native deployment into a single, production-ready configuration.
1. The Enterprise Checklist
Before deploying any AI coding agent in a regulated or enterprise environment, security, compliance, and operations teams ask the same 14 questions. Harness is the only coding agent that answers yes to all of them. The table below is the definitive comparison.
14-Feature Enterprise Comparison
Every feature that matters for production-grade coding agent deployment
| # | Enterprise Feature | Harness | Claude Code | Cursor | Aider | OpenHands | SWE-Agent |
|---|---|---|---|---|---|---|---|
| 1 | Multi-provider (5+) | ✓ | ✗ | ✗ | ✓ | ✓ | ✗ |
| 2 | Full async streaming SDK | ✓ | ✗ | ✗ | ✗ | ✓ | ✗ |
| 3 | 4-mode permission system | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| 4 | Token / cost budgets | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| 5 | Policy-as-code engine | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| 6 | SHA-256 audit hash chain | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| 7 | PII scanning | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| 8 | Dual sandbox modes | ✓ | ✗ | ✗ | ✗ | ✓ | ✓ |
| 9 | Sub-agent parallelism | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ |
| 10 | Model router + fallback | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| 11 | Native CI/CD integration | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| 12 | OpenTelemetry observability | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| 13 | Hooks system | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ |
| 14 | Interactive REPL | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ |
| | Total Score | 14 / 14 | 3 / 14 | 0 / 14 | 1 / 14 | 3 / 14 | 1 / 14 |
Claude Code and OpenHands tie for second place at 3/14. Claude Code has sub-agent parallelism, hooks, and an interactive REPL, but lacks budgets, policy-as-code, audit logging, PII scanning, observability, sandbox modes, and CI/CD integration. This tutorial shows you how to deploy all 14 features together.
2. Complete Configuration
The following configuration file enables every enterprise feature. Copy it to
.harness/config.toml in your project root (or ~/.harness/config.toml
for a user-level default) and fill in your provider API keys via environment variables.
# .harness/config.toml — Complete Enterprise Configuration
# === Provider Selection ===
# Provider and model are set via CLI flags (-p, -m),
# environment variables, or the router configuration below.
# Default provider: anthropic
# Default model: determined by provider
# === Router & Budget ===
[router]
strategy = "cost_optimized"
fallback_chain = ["anthropic", "openai", "google"]
max_cost_per_session = 5.00
max_tokens_per_session = 1000000
simple_task_model = "claude-haiku-4-5-20251001"
# === Permissions ===
[permissions]
mode = "accept_edits"
# === Policy ===
[policy]
policy_paths = [".harness/policy.yml", "~/.harness/policy.yml"]
simulation_mode = false
# === Audit ===
[audit]
enabled = true
scan_pii = true
retention_days = 365
retention_max_size_mb = 1000
log_tool_args = true
# === Sandbox ===
[sandbox]
enabled = true
mode = "docker"
max_memory_mb = 1024
max_cpu_seconds = 60
network_access = false
docker_image = "python:3.12-slim"
allowed_paths = ["/workspace"]
blocked_commands = ["rm -rf /", "curl", "wget", "nc"]
# === OpenTelemetry ===
# OpenTelemetry is configured via environment variables:
# export OTEL_SERVICE_NAME="harness-agent"
# export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317"
# export OTEL_TRACES_EXPORTER="otlp"
# export OTEL_METRICS_EXPORTER="otlp"
Harness merges configuration in this order: environment variables override
.harness/config.toml (project), which overrides
~/.harness/config.toml (user). API keys should always be set via environment
variables, never committed to config files.
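The precedence described above can be sketched as a small lookup helper. This is an illustration of the merge order, not Harness's actual loader; the environment-variable name is hypothetical.

```python
import os

def resolve_setting(key: str, env_var: str,
                    project_cfg: dict, user_cfg: dict, default=None):
    """Return a setting using the precedence described above:
    environment variable > project config > user config > default."""
    if env_var in os.environ:
        return os.environ[env_var]
    if key in project_cfg:
        return project_cfg[key]
    if key in user_cfg:
        return user_cfg[key]
    return default

# Example: the project-level config overrides the user-level default
user_cfg = {"permission_mode": "ask"}
project_cfg = {"permission_mode": "accept_edits"}
print(resolve_setting("permission_mode", "HARNESS_PERMISSION_MODE",
                      project_cfg, user_cfg))  # accept_edits
```

An exported `HARNESS_PERMISSION_MODE` (hypothetical name) would win over both files.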
3. Architecture Overview
The diagram below shows how all 14 enterprise features interact at runtime. Every task flows through the Model Router and Policy Engine before reaching the Agent Loop, then exits through the Sandbox and Audit Logger to OpenTelemetry.
No tool call can bypass the Policy Engine or Budget Tracker. Even sub-agents spawned during parallel execution inherit the parent session's policy and budget constraints. The audit hash chain covers every event from session start to completion.
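A toy sketch of that flow, with hypothetical names rather than Harness's internal API: every call passes the policy check and budget check, every attempt is audited, and a sub-agent shares the parent's budget and deny-list by reference so its spending counts against the same cap.

```python
def gate_tool_call(tool: str, policy_deny: set, budget: dict, audit_log: list) -> bool:
    """Toy gate: policy check, then budget check, then an audit record."""
    allowed = tool not in policy_deny and budget["cost_spent"] < budget["max_cost"]
    audit_log.append({"tool": tool, "allowed": allowed})
    return allowed

# Parent session state
deny, budget, log = {"WebFetch"}, {"cost_spent": 0.0, "max_cost": 5.0}, []

# A sub-agent receives the SAME policy and budget objects (shared by
# reference), so its spending draws down the parent's budget.
sub_budget = budget
sub_budget["cost_spent"] += 4.99
assert gate_tool_call("Read", deny, budget, log) is True   # still under cap
budget["cost_spent"] += 0.02                               # crosses the cap
assert gate_tool_call("Read", deny, budget, log) is False  # budget exhausted
assert gate_tool_call("WebFetch", deny, budget, log) is False  # policy deny
assert len(log) == 3  # every attempt is audited, allowed or not
```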
4. OpenTelemetry Observability
Harness emits OpenTelemetry traces and metrics for every agent session, tool call, and model request. Start the monitoring stack with a single Docker Compose command, then explore traces in Jaeger and dashboards in Grafana.
Telemetry Configuration
# OpenTelemetry is configured via environment variables:
export OTEL_SERVICE_NAME="harness-agent"
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317"
export OTEL_TRACES_EXPORTER="otlp"
export OTEL_METRICS_EXPORTER="otlp"
Monitoring Stack
version: "3.8"
services:
  jaeger:
    image: jaegertracing/all-in-one:1.52
    ports:
      - "16686:16686"  # Jaeger UI
      - "4317:4317"    # OTLP gRPC
    environment:
      - COLLECTOR_OTLP_ENABLED=true
  prometheus:
    image: prom/prometheus:v2.48.0
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
  grafana:
    image: grafana/grafana:10.2.0
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
    depends_on:
      - prometheus
      - jaeger
docker compose -f docker-compose.monitoring.yml up -d
Prometheus Scrape Config
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: "harness"
    static_configs:
      - targets: ["host.docker.internal:9464"]
After starting the stack, access Jaeger at
http://localhost:16686 to see distributed traces, Prometheus
at http://localhost:9090 for raw metrics, and Grafana at
http://localhost:3000 (admin / admin) for dashboards.
Key metrics emitted by Harness to the configured OTLP endpoint:
| Metric | Type | Description |
|---|---|---|
| harness_tokens | Counter | Total tokens consumed this session |
| harness_tool_calls | Counter | Total tool calls made this session |
| harness_cost | Gauge | Cumulative session cost in USD |
| harness_provider_latency | Histogram | Provider response latency distribution |
| harness_context_utilization | Gauge | Fraction of context window consumed |
| harness_audit_chain_valid | Gauge | 1 = chain intact, 0 = integrity broken |
5. Security Hardening
Follow these five steps to harden a Harness deployment before exposing it to production workloads. Each step takes under two minutes.
Set restrictive file permissions
Lock down the config directory so only your user account can read API keys and credentials.
chmod 600 ~/.harness/config.toml # Owner read/write only
chmod 600 ~/.harness/credentials # Protect API keys
chmod 700 ~/.harness/ # Protect directory
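To verify the lockdown, a short check (plain Python, nothing Harness-specific) can confirm that no group or other permission bits remain set:

```python
import os
import stat
import tempfile

def is_private(path: str) -> bool:
    """True if the path has no group/other permission bits (e.g. 600 or 700)."""
    return (os.stat(path).st_mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0

# Demo on a throwaway file (POSIX permission semantics assumed)
fd, path = tempfile.mkstemp()
os.close(fd)
os.chmod(path, 0o644)
print(is_private(path))  # False: group/other can read
os.chmod(path, 0o600)
print(is_private(path))  # True
os.remove(path)
```

Run the same check against `~/.harness/config.toml` and `~/.harness/` after applying the chmod commands above.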
Use environment variables for secrets
Never put API keys in config files. Use environment variables that are injected at runtime by your secrets manager or CI/CD system.
# Use env vars instead of config file for secrets
export ANTHROPIC_API_KEY="sk-ant-..."
export OPENAI_API_KEY="sk-..."
# Never commit API keys to git!
Enable Docker sandbox with network disabled
Docker mode provides filesystem isolation that the process sandbox cannot. Disabling network access prevents exfiltration even if the agent is tricked into running malicious commands.
[sandbox]
enabled = true
mode = "docker"
network_access = false
Enable PII scanning in the audit logger
PII scanning checks every tool argument and result for names, emails, SSNs, credit card numbers, and phone numbers before writing them to the audit log.
[audit]
scan_pii = true
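As an illustration of what such a scan involves, here is a minimal regex-based sketch for the PII classes listed above. These patterns are deliberately simple; a production scanner (including Harness's own) would use stricter validation, such as Luhn checks for card numbers.

```python
import re

# Illustrative patterns only — not Harness's actual scanner
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d{4}[-. ]?){3}\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-. ]?\d{3}[-. ]?\d{4}\b"),
}

def scan_pii(text: str) -> list[str]:
    """Return the PII categories detected in `text`."""
    return [kind for kind, pat in PII_PATTERNS.items() if pat.search(text)]

print(scan_pii("contact alice@example.com, SSN 123-45-6789"))  # ['email', 'ssn']
```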
Add a network exfiltration policy rule
Even with network_access = false in Docker, add an explicit policy rule as a defense-in-depth layer against any future sandbox escape.
# .harness/policy.yml
version: 1
rules:
  - tool: Bash
    decision: deny
    conditions:
      - type: command_matches
        pattern: "curl.*|wget.*|nc .*"
    description: "Block network exfiltration"
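You can check the deny pattern against sample commands with plain `re` before deploying it, assuming `command_matches` performs an unanchored regex search:

```python
import re

# The rule's pattern, copied from .harness/policy.yml above
pattern = re.compile(r"curl.*|wget.*|nc .*")

# These commands would all be denied...
for cmd in ["curl http://evil.example", "wget -q payload.sh", "nc -l 4444"]:
    assert pattern.search(cmd)

# ...while ordinary commands pass through to normal permission handling
assert not pattern.search("ls -la src/")
assert not pattern.search("git status")
```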
In production, always use Docker sandbox mode with network disabled. The process sandbox
(mode = "process") is lighter-weight and faster to start, but it does not
provide filesystem isolation — the agent can still read files outside your working
directory if not restricted by policy rules.
6. Multi-Team Policy Setup
Large organizations need three levels of policy: organization-wide rules that apply everywhere, team-level customizations, and project-specific overrides. Harness evaluates these in reverse order (project first) so more specific rules always take precedence.
Organization Policy
version: 1
defaults:
  decision: ask
rules:
  - tool: Bash
    decision: deny
    conditions:
      - type: command_matches
        pattern: "rm -rf /.*"
    description: "Org: Never delete root paths"
  - tool: Write
    decision: deny
    conditions:
      - type: path_matches
        pattern: "\\.(env|pem|key)$"
    description: "Org: Protect secrets"
Team Policy
version: 1
inherit_from: "~/.harness/policy.yml"
rules:
  - tool: Bash
    decision: deny
    conditions:
      - type: command_matches
        pattern: "docker push.*"
    description: "Team: No manual docker pushes"
Project Policy
version: 1
inherit_from: "../team-policy.yml"
rules:
  - tool: Read
    decision: allow
    description: "Project: Allow all reads"
  - tool: Edit
    decision: allow
    conditions:
      - type: path_matches
        pattern: "src/.*\\.py$"
    description: "Project: Allow editing Python source"
Use simulation mode to verify the full inheritance chain without enforcing anything.
Set simulation_mode = true in your config and run a test task. Then check
the audit log to confirm every tool call was evaluated against all three policy files.
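The project-first, first-match evaluation order can be sketched as follows. The structure is illustrative, not the real engine; it just shows why more specific rules take precedence.

```python
def evaluate(tool: str, chains: list[list[dict]], default: str = "ask") -> str:
    """Toy first-match evaluation: project rules first, then team,
    then org, falling back to the configured default decision."""
    for rules in chains:  # chains ordered [project, team, org]
        for rule in rules:
            if rule["tool"] == tool:
                return rule["decision"]
    return default

org     = [{"tool": "Write", "decision": "deny"}]
team    = [{"tool": "Bash",  "decision": "deny"}]
project = [{"tool": "Read",  "decision": "allow"}]

assert evaluate("Read",  [project, team, org]) == "allow"  # project rule wins
assert evaluate("Bash",  [project, team, org]) == "deny"   # team rule applies
assert evaluate("Write", [project, team, org]) == "deny"   # org rule applies
assert evaluate("Edit",  [project, team, org]) == "ask"    # default decision
```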
7. Monitoring & Alerting
Add these Prometheus alert rules to get notified about cost overruns, exhausted budgets,
audit chain integrity failures, and high error rates. Save them to alerts.yml
and reference them from your Prometheus configuration.
groups:
  - name: harness
    rules:
      - alert: HighCostSession
        expr: harness_cost > 5
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Harness session cost exceeds $5"
      - alert: HighErrorRate
        # Ratio of error tool calls to all tool calls over 5m
        expr: rate(harness_tool_calls{status="error"}[5m]) / rate(harness_tool_calls[5m]) > 0.1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Harness tool error rate above 10%"
      - alert: HighLatency
        # Histograms are queried via their _bucket series
        expr: histogram_quantile(0.99, rate(harness_provider_latency_bucket[5m])) > 30000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Harness P99 provider latency above 30s"
      - alert: AuditChainBroken
        expr: harness_audit_chain_valid == 0
        for: 0m
        labels:
          severity: critical
        annotations:
          summary: "Audit log chain integrity check failed"
A broken audit hash chain means one or more audit log entries have been tampered with or corrupted. This alert should page your on-call team immediately. Preserve the log file and open an incident before running any further agent tasks.
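To make the integrity property concrete, here is a minimal sketch of verifying a SHA-256 hash chain over JSONL entries. The field names (`prev_hash`, `event`) are assumptions for illustration; Harness's actual audit schema may differ.

```python
import hashlib
import json

def append_entry(lines: list[str], event: dict) -> None:
    """Append an entry whose prev_hash commits to the previous raw line."""
    prev = hashlib.sha256(lines[-1].encode()).hexdigest() if lines else "0" * 64
    lines.append(json.dumps({"prev_hash": prev, **event}, sort_keys=True))

def verify_chain(lines: list[str]) -> bool:
    """Recompute every link; any tampered entry breaks all later links."""
    prev = "0" * 64  # genesis value
    for raw in lines:
        if json.loads(raw)["prev_hash"] != prev:
            return False
        prev = hashlib.sha256(raw.encode()).hexdigest()
    return True

log: list[str] = []
append_entry(log, {"event": "session_start"})
append_entry(log, {"event": "tool_call", "tool": "Read"})
print(verify_chain(log))  # True

# Tampering with any earlier entry invalidates the chain
log[0] = log[0].replace("session_start", "session_begin")
print(verify_chain(log))  # False
```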
Create a Grafana dashboard with these panels for a complete operational view:
- Session Cost (USD) — `harness_cost` as a time-series
- Token Usage — `harness_tokens` as a counter time-series
- Context Utilization — `harness_context_utilization` as a gauge with warning threshold at 0.8
- Tool Calls/min — `rate(harness_tool_calls[1m])`
- Error Rate — `rate(harness_tool_calls{status="error"}[5m]) / rate(harness_tool_calls[5m])`
- Provider Latency P99 — `histogram_quantile(0.99, harness_provider_latency_bucket)`
- Audit Chain Status — `harness_audit_chain_valid` as a stat panel (green=1, red=0)
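The rate-based panels divide the per-second increase of monotonic counters. A quick sketch of that arithmetic, using hypothetical sample values five minutes apart:

```python
def counter_rate(earlier: float, later: float, window_s: float) -> float:
    """Per-second increase of a monotonic counter over the window,
    analogous to PromQL's rate() (counter resets ignored for brevity)."""
    return (later - earlier) / window_s

# Hypothetical counter samples taken 5 minutes (300 s) apart:
errors = counter_rate(40, 70, 300)     # harness_tool_calls{status="error"}
total  = counter_rate(900, 1200, 300)  # harness_tool_calls (all statuses)
print(f"error rate: {errors / total:.0%}")  # error rate: 10%
```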
8. Production Script
Use this script as the entry point for any production automation. It handles budget guarding, structured logging, and clean error exit codes — ready to use directly in a Kubernetes Job, GitHub Actions step, or cron task.
# production_agent.py — Enterprise-grade Harness deployment
import asyncio
import logging
import sys
from pathlib import Path

import harness
from harness.audit.logger import AuditLogger
from harness.providers.budget import TokenBudgetTracker, BudgetExhaustedError

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger("harness-prod")


async def run_production_task(prompt: str) -> dict:
    """Run a task with full enterprise controls."""
    # Budget guard
    # Note: This budget tracker monitors cost AFTER each run completes.
    # To enforce budgets mid-run, configure the [router] section in
    # .harness/config.toml with max_cost_per_session — the engine
    # checks the budget internally between turns.
    budget = TokenBudgetTracker(max_tokens=500_000, max_cost=5.00)

    result_data = {
        "success": False,
        "output": "",
        "tokens": 0,
        "cost": 0.0,
        "error": None,
    }

    try:
        async for msg in harness.run(
            prompt,
            provider="anthropic",
            model="claude-sonnet-4-20250514",
            permission_mode="accept_edits",
            sandbox_mode="docker",
            max_turns=50,
        ):
            match msg:
                case harness.TextMessage(text=t, is_partial=False):
                    logger.info(f"Agent: {t[:100]}...")
                case harness.ToolUse(name=name, args=args):
                    logger.info(f"Tool: {name}")
                case harness.Result() as r:
                    # Record final usage
                    budget.record_usage(
                        # Approximation: total_tokens doesn't split input/output
                        input_tokens=r.total_tokens // 2,
                        output_tokens=r.total_tokens // 2,
                        cost=r.total_cost,
                    )
                    # Check if we should stop future runs
                    budget.check_budget()

                    result_data.update({
                        "success": True,
                        "output": r.text,
                        "tokens": r.total_tokens,
                        "cost": r.total_cost,
                    })

                    snap = budget.snapshot()
                    logger.info(
                        f"Complete: {r.turns} turns, {r.tool_calls} tools, "
                        f"${r.total_cost:.4f}, {r.total_tokens:,} tokens"
                    )
                    logger.info(
                        f"Budget: ${snap.cost_remaining:.2f} remaining, "
                        f"{snap.tokens_remaining:,} tokens remaining"
                    )
    except BudgetExhaustedError as e:
        logger.error(f"Budget exhausted: {e}")
        result_data["error"] = str(e)
    except Exception as e:
        logger.error(f"Unexpected error: {e}")
        result_data["error"] = str(e)

    return result_data


async def main():
    if len(sys.argv) < 2:
        print("Usage: python production_agent.py 'Your task here'")
        sys.exit(1)

    prompt = sys.argv[1]
    logger.info(f"Starting production task: {prompt[:50]}...")

    result = await run_production_task(prompt)

    if result["success"]:
        logger.info(f"Task completed successfully. Cost: ${result['cost']:.4f}")
    else:
        logger.error(f"Task failed: {result['error']}")
        sys.exit(1)


if __name__ == "__main__":
    asyncio.run(main())
uv run python production_agent.py "Audit all TODO comments and open GitHub issues for each"
9. Capacity Planning
Use the table below to estimate monthly costs before enabling Harness for your team. The cost-optimized router automatically routes simple tasks to Haiku and complex tasks to Sonnet, so actual costs should come in at or below these estimates.
| Use Case | Tasks / Day | Avg Tokens / Task | Model | Daily Cost | Monthly Cost |
|---|---|---|---|---|---|
| PR Reviews | 10 | 5,000 | Sonnet 4 | $0.90 | $27 |
| Issue Triage | 20 | 2,000 | Haiku 4.5 | $0.12 | $3.60 |
| Code Generation | 5 | 50,000 | Sonnet 4 | $4.50 | $135 |
| Security Scans | 3 | 30,000 | Opus 4 | $13.50 | $405 |
| Total | 38 | — | — | $19.02 | $570.60 |
With strategy = "cost_optimized", simple tasks use Haiku
($0.80 / 1M input tokens) and complex tasks use Sonnet ($3 / 1M input tokens).
This typically reduces total costs by 40–60% compared to always using the
best available model.
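As a sanity check on the table's totals (the monthly column assumes a 30-day month):

```python
# Per-use-case daily costs from the capacity table above (USD)
daily_costs = {
    "pr_reviews": 0.90,
    "issue_triage": 0.12,
    "code_generation": 4.50,
    "security_scans": 13.50,
}

daily_total = sum(daily_costs.values())
monthly_total = daily_total * 30  # 30-day month, as in the table

print(f"Daily: ${daily_total:.2f}, Monthly: ${monthly_total:.2f}")
# Daily: $19.02, Monthly: $570.60
```

Swap in your own task volumes and per-task costs to re-derive the table for your team.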
Session Budget Recommendation
[router]
max_cost_per_session = 10.00 # Individual session cap
# Monthly budget tracked externally via audit logs + Prometheus
The audit log is in JSONL format. Aggregate this month's costs with a short Python snippet:
# Sum total cost from audit log entries this month
python3 -c "
import json
from pathlib import Path
from datetime import datetime

month = datetime.now().strftime('%Y-%m')
total = sum(
    e.get('cost', 0)
    for e in (json.loads(l) for l in open(Path('~/.harness/audit.jsonl').expanduser()))
    if e.get('timestamp', '').startswith(month)
)
print(f'Monthly cost: \${total:.4f}')
"
10. Cloud Deployment
Deploy Harness as a containerized workload on your cloud provider of choice. The example below targets AWS EKS; the same manifest pattern applies to other managed Kubernetes services with provider-specific registry and secret commands.
Deploy to Amazon Elastic Kubernetes Service. API keys are stored in Secrets Manager and injected via a Kubernetes Secret.
# AWS EKS deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: harness-agent
spec:
  replicas: 1
  selector:              # required by apps/v1
    matchLabels:
      app: harness-agent
  template:
    metadata:
      labels:
        app: harness-agent
    spec:
      containers:
        - name: harness
          image: your-ecr-repo/harness-agent:latest
          env:
            - name: ANTHROPIC_API_KEY
              valueFrom:
                secretKeyRef:
                  name: harness-secrets
                  key: anthropic-api-key
          resources:
            requests:           # adjust to your workload
              memory: "1Gi"
              cpu: "500m"
            limits:
              memory: "2Gi"
              cpu: "1"
# ECR + EKS setup
aws ecr create-repository --repository-name harness-agent
docker build -t harness-agent .
docker tag harness-agent:latest <account>.dkr.ecr.<region>.amazonaws.com/harness-agent:latest
docker push <account>.dkr.ecr.<region>.amazonaws.com/harness-agent:latest
kubectl apply -f k8s/deployment.yml
Always set both requests and limits for memory and CPU.
An unconstrained Harness container running a large code generation task with sub-agents
can consume several gigabytes of RAM during peak parallel execution.
11. Congratulations
You've completed the entire tutorial series
You now know how to deploy Harness with all 14 enterprise features in production
Over 10 tutorials you went from installing Harness to running a fully governed, observable, cost-controlled enterprise AI coding agent. Here is a summary of everything covered:
uv tool install harness-agent
harness --version