Chapter 9: Escalation and Human-in-the-Loop

9.1 When to Escalate to a Human

Escalation triggers (clear rules):

Situation	Action
The customer explicitly asks "get me a manager"	Escalate immediately; do not attempt to solve
Policy does not cover the request	Escalate (e.g., competitor price matching when policy is silent)
The agent cannot make progress	Escalate after a reasonable number of attempts
Financial operation above a threshold	Escalate (preferably enforced via a hook, not a prompt)
Multiple matches when searching for a customer	Ask for additional identifiers; do not guess

What is NOT a reliable trigger:

Unreliable method	Why it fails
Sentiment analysis	Customer mood does not correlate with case complexity
Model self-rated confidence (1–10)	The model can be confidently wrong; calibration is poor
An automatic classifier	Overengineering; may require training data you don't have

9.2 Escalation Patterns

Immediate escalation:

Customer: "I want to speak to a manager"
Agent: [immediately calls escalate_to_human]
NOT: "I can help with your issue, let me..."

Escalation after an attempt to resolve:

Customer: "My refrigerator broke two days after purchase"
Agent: [checks the order, offers a warranty replacement]
If the customer is not satisfied -> escalate

Nuanced escalation (acknowledge → resolve → escalate on reiteration):

Customer: "This is outrageous, I'm very unhappy with the quality!"
Agent: [acknowledges frustration] "I understand your frustration."
       [offers resolution] "I can offer a replacement or a refund."
Customer: "No, I want to talk to someone!"
Agent: [customer insists again -> immediate escalation]

Key principle: acknowledge emotion first, then propose a concrete solution, and only escalate if the customer reiterates the desire for a human. Do not escalate on the first expression of dissatisfaction (that is not the same as requesting a manager).

Escalation for a policy gap:

Customer: "Competitor X has this item 30% cheaper—give me a discount"
Policy: covers price adjustments only on your own site
Agent: [escalates — policy does not cover competitor price matching]

9.3 Structured Handoff Protocols

On escalation, the agent should pass a structured summary to a human:

{
  "customer_id": "CUST-12345",
  "customer_name": "Ivan Petrov",
  "issue_summary": "Refund request for a damaged item",
  "order_id": "ORD-67890",
  "root_cause": "Item arrived damaged; photos attached",
  "actions_taken": [
    "Verified customer via get_customer",
    "Confirmed order via lookup_order",
    "Offered a standard replacement — customer insists on a refund"
  ],
  "refund_amount": "$89.99",
  "recommended_action": "Approve a full refund",
  "escalation_reason": "Customer requested to speak with a manager"
}

The human operator does not have access to the full conversation transcript—they only see this summary. Therefore it must be complete and self-contained.

9.4 Confidence Calibration and Human Oversight

For data extraction systems:

Field-level confidence scores: the model outputs a confidence score per extracted field
Calibration: use labeled validation sets to tune thresholds
Routing:
- High confidence + stable accuracy -> automated processing
- Low confidence or ambiguous sources -> human review

Stratified random sampling:

Even for high-confidence extractions, regularly audit a sample
An aggregate 97% accuracy can hide 40% errors for a particular document type
Analyze accuracy by document type and by field, not only overall