Chapter 11: Context Management in Production Systems
11.1 Extract Facts into a Separate Block
Instead of relying on conversation history (which degrades during summarization), extract key facts into a structured block:
=== CASE FACTS (updated whenever a new fact appears) ===
Customer ID: CUST-12345
Order ID: ORD-67890
Order Date: 2025-01-15
Order Amount: $89.99
Issue: Damaged item on delivery
Customer Request: Full refund
Status: Pending manager approval
===
Include this block in every prompt, regardless of how history is summarized.
11.2 Trimming Tool Results
If lookup_order returns 40+ fields but you only need 5 for the current task:
# PostToolUse hook: keep only relevant fields
@hook("PostToolUse", tool="lookup_order")
def trim_order_fields(result):
return {
"order_id": result["order_id"],
"status": result["status"],
"total": result["total"],
"items": result["items"],
"return_eligible": result["return_eligible"]
}
This conserves context and reduces noise.
11.3 Position-aware Input
Place critical information with the lost-in-the-middle effect in mind:
[KEY FINDINGS — at the top]
Found 3 critical vulnerabilities...
[DETAILED RESULTS — middle]
=== File auth.ts ===
...
=== File database.ts ===
...
[ACTION ITEMS — at the end]
Priority: fix auth.ts vulnerabilities before merge.
11.4 Scratchpad Files
In long investigations, the agent can write key findings to a scratchpad file:
# investigation-scratchpad.md
## Key findings
- PaymentProcessor in src/payments/processor.ts inherits from BaseProcessor
- refund() is called from 3 places: OrderController, AdminPanel, CronJob
- External PaymentGateway API has a rate limit of 100 req/min
- Migration #47 added refund_reason (NOT NULL) — 2024-12-01
When context degrades (or in a new session), the agent can consult the scratchpad instead of re-running discovery.
11.5 Delegating to Subagents to Protect Context
Main agent: "Investigate dependencies of the payments module"
-> Subagent (Explore): reads 15 files, traces imports
-> Returns: "Payments depends on AuthService, OrderModel, and the external PaymentGateway API"
Main agent: keeps one line in context instead of 15 files
Separate context layer: In multi-agent systems, each subagent operates within a limited context budget—it receives only the information required for its task. The coordinator acts as a separate context layer: it aggregates subagent outputs, stores global state, and allocates context. This prevents "context leakage," where one agent consumes the window with information irrelevant to others.
Constrained context budgets for subagents:
- Send minimal context: a specific task + necessary data
- Instruct the subagent to return structured results, not raw data dumps
- Use
allowedToolsto limit the subagent's toolset—fewer tools means fewer distractions and lower context cost
11.6 Structured State Persistence (for crash recovery)
Each agent exports its state to a known location:
// agent-state/web-search-agent.json
{
"status": "completed",
"queries_executed": ["AI music 2024", "AI music composition"],
"results_count": 12,
"key_findings": [...],
"coverage": ["music composition", "music production"],
"gaps": ["music distribution", "music licensing"]
}
The coordinator loads a manifest on resume:
// agent-state/manifest.json
{
"web-search": "completed",
"doc-analysis": "in_progress",
"synthesis": "not_started"
}