Chapter 6: Prompt Engineering — Advanced Techniques
Sources: Anthropic documentation on prompt engineering; Anthropic Cookbook
6.1 Few-shot Prompting
Few-shot prompting means including a small number of input/output examples (typically 2–4) in a prompt to demonstrate the expected behavior.
Why few-shot is more effective than textual descriptions:
- A vague instruction like "be more precise" can be interpreted in many ways
- An example unambiguously shows the expected format and decision logic
- The model generalizes the pattern to new cases (it does not just repeat the examples)
Types of few-shot examples and when to use them:
- Examples for ambiguous scenarios:
    Request: "My order is broken"
    Action: Call get_customer -> lookup_order -> check status.
    Rationale: "broken" may mean a damaged item; you need order details.
    Request: "Get me a manager"
    Action: Immediately call escalate_to_human.
    Rationale: The customer explicitly requests a human. Do not attempt to solve autonomously.
- Examples for output formatting:
    Finding example:
    {
      "location": "src/auth/login.ts:42",
      "issue": "SQL injection in the username parameter",
      "severity": "critical",
      "suggested_fix": "Use a parameterized query"
    }
- Examples to separate acceptable vs problematic code:
    // Acceptable (do not flag):
    const items = data.filter(x => x.active);
    // Problem (flag):
    const items = data.filter(x => x.active == true); // Use strict equality ===
- Examples for extraction from different document formats:
    Document with inline citations:
    "As shown in the study (Smith, 2023), the rate is 42%."
    -> {"value": "42%", "source": "Smith, 2023", "type": "inline_citation"}
    Document with bibliography references:
    "The rate is 42%. [1]"
    -> {"value": "42%", "source": "reference_1", "type": "bibliography"}
- Examples for informal measurements:
    Text: "about two handfuls of rice"
    -> {"amount": "~100g", "original_text": "two handfuls", "precision": "approximate"}
    Text: "a pinch of salt"
    -> {"amount": "~1g", "original_text": "a pinch", "precision": "approximate"}
Few-shot is especially effective for extracting informal and non-standard measurement units that are too diverse for purely rule-based instructions.
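As a concrete illustration, here is a minimal sketch of assembling such few-shot examples into an extraction prompt with the Anthropic Python SDK. The prompt wording and the model name are illustrative assumptions, not a canonical recipe:

    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    # Few-shot examples mirroring the informal-measurement pairs above.
    FEW_SHOT_EXAMPLES = """\
    Text: "about two handfuls of rice"
    -> {"amount": "~100g", "original_text": "two handfuls", "precision": "approximate"}

    Text: "a pinch of salt"
    -> {"amount": "~1g", "original_text": "a pinch", "precision": "approximate"}"""

    def extract_measurement(text: str) -> str:
        """Extract an informal measurement as JSON, following the few-shot pattern."""
        prompt = (
            "Extract the measurement from the text as JSON, "
            "following these examples:\n\n"
            f"{FEW_SHOT_EXAMPLES}\n\n"
            f'Text: "{text}"\n-> '
        )
        message = client.messages.create(
            model="claude-sonnet-4-5",  # illustrative model name
            max_tokens=200,
            messages=[{"role": "user", "content": prompt}],
        )
        return message.content[0].text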
Format normalization rules in prompts: When using strict JSON schemas for structured output, add normalization rules in the prompt:
Normalization:
- Dates: always ISO 8601 (YYYY-MM-DD); "yesterday" -> compute an absolute date
- Currency: numeric amount + currency code; "five bucks" -> {"amount": 5, "currency": "USD"}
- Percentages: decimal fraction; "half" -> 0.5
This prevents semantic errors where the JSON is syntactically valid but values are inconsistent.
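A small sketch of how these rules might be enforced in code after extraction; the field names (date, currency, percentage) are assumptions chosen to mirror the examples above:

    import re
    from datetime import date

    def normalization_errors(record: dict) -> list[str]:
        """Collect violations of the normalization rules in an extracted record."""
        errors = []
        # Dates must be absolute ISO 8601 strings ("yesterday" already resolved).
        if "date" in record:
            try:
                date.fromisoformat(record["date"])
            except ValueError:
                errors.append(f"date {record['date']!r} is not ISO 8601 (YYYY-MM-DD)")
        # Currency must pair a numeric amount with a 3-letter currency code.
        if "currency" in record and not re.fullmatch(r"[A-Z]{3}", record["currency"]):
            errors.append(f"currency {record['currency']!r} is not a 3-letter code")
        # Percentages must be decimal fractions, e.g. "half" -> 0.5.
        if "percentage" in record and not 0 <= record["percentage"] <= 1:
            errors.append(f"percentage {record['percentage']} is not in [0, 1]")
        return errors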
6.2 Explicit Criteria vs Vague Instructions
Bad (vague):
    Check code comments for accuracy.
    Be conservative; report only high-confidence findings.
Good (explicit criteria):
    Flag a comment as problematic ONLY if:
    1. The comment describes behavior that CONTRADICTS the actual code behavior
    2. The comment references a non-existent function or variable
    3. A TODO/FIXME comment refers to a bug that has already been fixed in code
    Do NOT flag:
    - Comments that are merely stylistically outdated
    - Comments with minor wording inaccuracies
    - Missing comments (that is a separate category)
Define severity criteria with examples:
    CRITICAL: Runtime failure for users
        Example: NullPointerException while processing a payment
    HIGH: Security vulnerability
        Example: SQL injection, XSS, missing authorization checks
    MEDIUM: Logic bug without immediate impact
        Example: Wrong sorting, off-by-one error
    LOW: Code quality
        Example: Duplication, suboptimal algorithm for small data
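One way (an assumption, not a prescribed pattern) to keep the severity rubric in the prompt aligned with the schema validated in code is a Pydantic model whose severity field only admits the four levels above:

    from typing import Literal
    from pydantic import BaseModel

    class Finding(BaseModel):
        """One code-review finding; fields mirror the finding example in 6.1."""
        location: str        # e.g. "src/auth/login.ts:42"
        issue: str
        severity: Literal["critical", "high", "medium", "low"]
        suggested_fix: str

Finding.model_json_schema() then yields a JSON Schema that can back a tool definition, so the rubric in the prompt and the schema checked in code share one source of truth.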
6.3 Prompt Chaining
Prompt chaining breaks a complex task into a sequence of focused steps:
Step 1: Analyze auth.ts (local issues only)
-> Output: list of issues in auth.ts
Step 2: Analyze database.ts (local issues only)
-> Output: list of issues in database.ts
Step 3: Integration pass (cross-file dependencies)
-> Output: issues at module boundaries
Why this matters:
- Avoids attention dilution—when the model receives too many files at once, it may miss bugs in some files while providing shallow commentary on others
- Ensures consistent analysis quality per file
- Allows separate analysis of cross-file interactions
When to use prompt chaining vs dynamic decomposition:
- Prompt chaining — predictable, repeatable tasks (code review, file migrations)
- Dynamic decomposition — open-ended investigations where subtasks become clear only during execution
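A minimal sketch of the three-step chain above using the Anthropic Python SDK; the file names, prompt wording, and model name are illustrative assumptions:

    import anthropic
    from pathlib import Path

    client = anthropic.Anthropic()

    def ask(prompt: str) -> str:
        """One focused request per step keeps each analysis narrow."""
        message = client.messages.create(
            model="claude-sonnet-4-5",  # illustrative model name
            max_tokens=1024,
            messages=[{"role": "user", "content": prompt}],
        )
        return message.content[0].text

    # Steps 1 and 2: one prompt per file, local issues only.
    per_file_issues = {}
    for path in ["auth.ts", "database.ts"]:
        source = Path(path).read_text()
        per_file_issues[path] = ask(
            f"Review ONLY {path} for local issues:\n\n{source}"
        )

    # Step 3: integration pass over the per-file findings.
    combined = "\n\n".join(f"{p}:\n{i}" for p, i in per_file_issues.items())
    cross_file = ask(
        "Given these per-file findings, report ONLY issues at module "
        f"boundaries and cross-file dependencies:\n\n{combined}"
    )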
6.4 The "Interview" Pattern
Before implementing a solution, Claude asks clarifying questions:
Claude: "Before implementing caching for the API, a few questions:
1. Which cache invalidation strategy do you prefer—TTL or event-based?
2. Is stale data acceptable when the cache is unavailable?
3. Should caching be per-user or global?
4. What is the expected data volume to cache?"
When this is useful:
- Unfamiliar domain (fintech, healthcare, legal systems)
- Tasks with non-obvious implications (cache strategies, failure modes)
- Multiple viable approaches where the best choice depends on context
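One way to elicit this behavior is a system prompt that requires clarifying questions before implementation. A minimal sketch; the wording is an assumption, not a quoted Anthropic prompt:

    INTERVIEW_SYSTEM_PROMPT = """\
    Before implementing any non-trivial solution:
    1. Identify the decisions the request leaves open (strategy, failure
       modes, scale, consistency requirements).
    2. Ask the user numbered clarifying questions about those decisions.
    3. Implement only after the user has answered.
    If the request is fully unambiguous, say so and proceed directly.
    """

Passing this string as the system parameter of client.messages.create makes the interview step precede any code generation.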
6.5 Validation and Retry-with-Feedback
When extracted data fails validation:
Step 1: Extract data from the document
Step 2: Validate (Pydantic, JSON Schema, business rules)
Step 3: If validation fails, retry with context:
- The original document
- The previous (incorrect) extraction
- The specific error: "Field 'total' = 150, but sum(line_items) = 145. Re-check values."
When retry will be effective:
- Format errors (date in the wrong format)
- Structural errors (a field placed in the wrong location)
- Arithmetic inconsistencies (the model can re-check)
When retry will NOT help:
- The information is absent from the source document
- The required context is external (the data is in another document not provided)
Pydantic as a validation tool: Pydantic is a Python library for schema-based data validation. For the exam, the key points are:
- Structural validation: types, required fields, and enum constraints, checked in code after receiving JSON from Claude
- Semantic validation: custom validators enforce business logic (sum of items equals total; start_date < end_date)
- Validate–retry loops: on Pydantic validation failure, construct an error message and re-prompt Claude with the error context
- JSON Schema generation: Pydantic models can generate JSON Schema for tool_use, providing a single source of truth
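Putting these points together, a hedged sketch of a validate-retry loop; the Invoice and LineItem models, field names, and prompt wording are illustrative assumptions:

    import json
    import anthropic
    from pydantic import BaseModel, ValidationError, model_validator

    class LineItem(BaseModel):
        name: str
        price: float

    class Invoice(BaseModel):
        # Structural validation (types, required fields) happens automatically.
        line_items: list[LineItem]
        total: float

        @model_validator(mode="after")
        def total_matches_items(self) -> "Invoice":
            # Semantic validation: a business rule, not just structure.
            calculated = sum(item.price for item in self.line_items)
            if abs(calculated - self.total) > 0.01:
                raise ValueError(
                    f"Field 'total' = {self.total}, but sum(line_items) = "
                    f"{calculated}. Re-check values."
                )
            return self

    client = anthropic.Anthropic()

    def extract_invoice(document: str, max_retries: int = 2) -> Invoice:
        prompt = (
            "Extract this invoice as JSON with fields "
            f"'line_items' (name, price) and 'total':\n\n{document}"
        )
        for _ in range(max_retries + 1):
            reply = client.messages.create(
                model="claude-sonnet-4-5",  # illustrative model name
                max_tokens=1024,
                messages=[{"role": "user", "content": prompt}],
            ).content[0].text
            try:
                return Invoice.model_validate(json.loads(reply))
            except (json.JSONDecodeError, ValidationError) as err:
                # Retry with the original document, the previous (incorrect)
                # extraction, and the specific validation error.
                prompt = (
                    f"Document:\n{document}\n\n"
                    f"Your previous extraction:\n{reply}\n\n"
                    f"It failed validation:\n{err}\n\n"
                    "Return the corrected JSON only."
                )
        raise RuntimeError("Extraction still failed validation after retries")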
6.6 Self-correction
A pattern for detecting internal contradictions:
{
  "stated_total": "$150.00",
  "calculated_total": "$145.00",
  "conflict_detected": true,
  "line_items": [
    {"name": "Widget A", "price": 75.00},
    {"name": "Widget B", "price": 70.00}
  ]
}
The model extracts both the stated total and a total computed from the line items; if they differ, the conflict_detected flag lets downstream code handle the discrepancy explicitly.
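A short sketch of that downstream check; the field names mirror the JSON above, and parsing stated_total by stripping a dollar sign is an assumption about the currency format:

    def verify_conflict_flag(extraction: dict) -> bool:
        """Independently re-check the model's conflict_detected flag."""
        stated = float(extraction["stated_total"].lstrip("$"))
        calculated = sum(item["price"] for item in extraction["line_items"])
        conflicts = abs(stated - calculated) > 0.01
        if conflicts != extraction["conflict_detected"]:
            # The model's self-check disagrees with ours; trust the arithmetic.
            raise ValueError(
                f"conflict_detected={extraction['conflict_detected']} but "
                f"stated={stated} vs calculated={calculated}"
            )
        return conflicts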