Chapter 11: Context Management in Production Systems

11.1 Extract Facts into a Separate Block

Instead of relying on conversation history (which degrades during summarization), extract key facts into a structured block:

=== CASE FACTS (updated whenever a new fact appears) ===
Customer ID: CUST-12345
Order ID: ORD-67890
Order Date: 2025-01-15
Order Amount: $89.99
Issue: Damaged item on delivery
Customer Request: Full refund
Status: Pending manager approval
===

Include this block in every prompt, regardless of how history is summarized.

11.2 Trimming Tool Results

If lookup_order returns 40+ fields but you only need 5 for the current task:

# PostToolUse hook: keep only relevant fields
@hook("PostToolUse", tool="lookup_order")
def trim_order_fields(result):
    return {
        "order_id": result["order_id"],
        "status": result["status"],
        "total": result["total"],
        "items": result["items"],
        "return_eligible": result["return_eligible"]
    }

This conserves context and reduces noise.

11.3 Position-aware Input

Place critical information with the lost-in-the-middle effect in mind:

[KEY FINDINGS — at the top]
Found 3 critical vulnerabilities...

[DETAILED RESULTS — middle]
=== File auth.ts ===
...
=== File database.ts ===
...

[ACTION ITEMS — at the end]
Priority: fix auth.ts vulnerabilities before merge.

11.4 Scratchpad Files

In long investigations, the agent can write key findings to a scratchpad file:

# investigation-scratchpad.md
## Key findings
- PaymentProcessor in src/payments/processor.ts inherits from BaseProcessor
- refund() is called from 3 places: OrderController, AdminPanel, CronJob
- External PaymentGateway API has a rate limit of 100 req/min
- Migration #47 added refund_reason (NOT NULL) — 2024-12-01

When context degrades (or in a new session), the agent can consult the scratchpad instead of re-running discovery.

11.5 Delegating to Subagents to Protect Context

Main agent: "Investigate dependencies of the payments module"
  -> Subagent (Explore): reads 15 files, traces imports
  -> Returns: "Payments depends on AuthService, OrderModel, and the external PaymentGateway API"

Main agent: keeps one line in context instead of 15 files

Separate context layer: In multi-agent systems, each subagent operates within a limited context budget—it receives only the information required for its task. The coordinator acts as a separate context layer: it aggregates subagent outputs, stores global state, and allocates context. This prevents "context leakage," where one agent consumes the window with information irrelevant to others.

Constrained context budgets for subagents:

Send minimal context: a specific task + necessary data
Instruct the subagent to return structured results, not raw data dumps
Use allowedTools to limit the subagent's toolset—fewer tools means fewer distractions and lower context cost

11.6 Structured State Persistence (for crash recovery)

Each agent exports its state to a known location:

// agent-state/web-search-agent.json
{
  "status": "completed",
  "queries_executed": ["AI music 2024", "AI music composition"],
  "results_count": 12,
  "key_findings": [...],
  "coverage": ["music composition", "music production"],
  "gaps": ["music distribution", "music licensing"]
}

The coordinator loads a manifest on resume:

// agent-state/manifest.json
{
  "web-search": "completed",
  "doc-analysis": "in_progress",
  "synthesis": "not_started"
}