Chapter 9: Escalation and Human-in-the-Loop
9.1 When to Escalate to a Human
Escalation triggers (clear rules):
| Situation | Action |
|---|---|
| The customer explicitly asks "get me a manager" | Escalate immediately; do not attempt to solve |
| Policy does not cover the request | Escalate (e.g., competitor price matching when policy is silent) |
| The agent cannot make progress | Escalate after a reasonable number of attempts |
| Financial operation above a threshold | Escalate (preferably enforced via a hook, not a prompt) |
| Multiple matches when searching for a customer | Ask for additional identifiers; do not guess |
What is NOT a reliable trigger:
| Unreliable method | Why it fails |
|---|---|
| Sentiment analysis | Customer mood does not correlate with case complexity |
| Model self-rated confidence (1–10) | The model can be confidently wrong; calibration is poor |
| An automatic classifier | Overengineering; may require training data you don't have |
9.2 Escalation Patterns
Immediate escalation:
Customer: "I want to speak to a manager"
Agent: [immediately calls escalate_to_human]
NOT: "I can help with your issue, let me..."
Escalation after an attempt to resolve:
Customer: "My refrigerator broke two days after purchase"
Agent: [checks the order, offers a warranty replacement]
If the customer is not satisfied -> escalate
Nuanced escalation (acknowledge → resolve → escalate on reiteration):
Customer: "This is outrageous, I'm very unhappy with the quality!"
Agent: [acknowledges frustration] "I understand your frustration."
[offers resolution] "I can offer a replacement or a refund."
Customer: "No, I want to talk to someone!"
Agent: [customer insists again -> immediate escalation]
Key principle: acknowledge emotion first, then propose a concrete solution, and only escalate if the customer reiterates the desire for a human. Do not escalate on the first expression of dissatisfaction (that is not the same as requesting a manager).
Escalation for a policy gap:
Customer: "Competitor X has this item 30% cheaper—give me a discount"
Policy: covers price adjustments only on your own site
Agent: [escalates — policy does not cover competitor price matching]
9.3 Structured Handoff Protocols
On escalation, the agent should pass a structured summary to a human:
{
"customer_id": "CUST-12345",
"customer_name": "Ivan Petrov",
"issue_summary": "Refund request for a damaged item",
"order_id": "ORD-67890",
"root_cause": "Item arrived damaged; photos attached",
"actions_taken": [
"Verified customer via get_customer",
"Confirmed order via lookup_order",
"Offered a standard replacement — customer insists on a refund"
],
"refund_amount": "$89.99",
"recommended_action": "Approve a full refund",
"escalation_reason": "Customer requested to speak with a manager"
}
The human operator does not have access to the full conversation transcript—they only see this summary. Therefore it must be complete and self-contained.
9.4 Confidence Calibration and Human Oversight
For data extraction systems:
- Field-level confidence scores: the model outputs a confidence score per extracted field
- Calibration: use labeled validation sets to tune thresholds
- Routing:
- High confidence + stable accuracy -> automated processing
- Low confidence or ambiguous sources -> human review
Stratified random sampling:
- Even for high-confidence extractions, regularly audit a sample
- An aggregate 97% accuracy can hide 40% errors for a particular document type
- Analyze accuracy by document type and by field, not only overall