Chapter 7: Message Batches API
Documentation: Message Batches
7.1 Overview
The Message Batches API lets you submit batches of requests for asynchronous processing:
| Attribute | Value |
|---|---|
| Savings | 50% compared to synchronous calls |
| Processing window | Up to 24 hours (no latency guarantee) |
| Multi-turn tool calling | Not supported (one request = one response) |
| Correlation | custom_id field to link request and response |
7.2 When to Use Batch API vs Synchronous API
| Task | API | Why |
|---|---|---|
| Pre-merge PR check | Synchronous | The developer is waiting; 24 hours is unacceptable |
| Overnight tech-debt report | Batch | Result is needed by morning; 50% savings |
| Weekly security audit | Batch | Not urgent; 50% savings |
| Interactive code review | Synchronous | Immediate response required |
| Processing 10,000 documents | Batch | Bulk processing; savings are significant |
7.3 Using custom_id
{
  "custom_id": "doc-invoice-2024-001",
  "params": {
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Extract data from: ..."}]
  }
}
custom_id allows you to:
- Link the result to the original document
- On failure, re-submit only the failed documents
- Avoid re-processing successful documents
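As an illustration of the request shape above, the following sketch builds one batch entry per document and uses the document's own identifier as custom_id. The helper name build_batch_requests, the sample documents, and the prompt wording are assumptions made for this example; only the field layout (custom_id plus params) comes from the sample request. No API call is made here.

```python
import json

def build_batch_requests(documents, model="claude-sonnet-4-6", max_tokens=1024):
    """Build one batch entry per document (hypothetical helper).

    `documents` maps a document ID to its text. Using dict keys as custom_id
    keeps every custom_id unique within the batch.
    """
    batch_requests = []
    for doc_id, text in documents.items():
        batch_requests.append({
            "custom_id": doc_id,  # later used to match the result to this document
            "params": {
                "model": model,
                "max_tokens": max_tokens,
                "messages": [
                    {"role": "user", "content": f"Extract data from: {text}"}
                ],
            },
        })
    return batch_requests

# Sample input, invented for illustration only.
documents = {
    "doc-invoice-2024-001": "Invoice text A",
    "doc-invoice-2024-002": "Invoice text B",
}
batch_requests = build_batch_requests(documents)
print(json.dumps(batch_requests[0], indent=2))
```

Because the custom_id is derived from the document ID, a result can be traced back to its source document without relying on the order in which results are returned.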
7.4 Handling Failures in Batches
- Submit a batch of 100 documents
- 95 succeed; 5 fail (context limit exceeded)
- Identify failures by custom_id
- Modify strategy (e.g., split long documents into chunks)
- Re-submit only the 5 failed documents
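The steps above can be sketched as follows. This is a simplified illustration, not a definitive implementation: each result is modeled as a plain dict with a custom_id and a result type, the helper names (split_results, chunk_text, build_retry_inputs) are hypothetical, and character-based chunking stands in for whatever splitting strategy suits the documents.

```python
def split_results(results):
    """Separate succeeded results from the custom_ids that need another attempt."""
    succeeded, failed_ids = {}, []
    for entry in results:
        if entry["result"]["type"] == "succeeded":
            succeeded[entry["custom_id"]] = entry["result"]
        else:
            failed_ids.append(entry["custom_id"])
    return succeeded, failed_ids

def chunk_text(text, max_chars):
    """Split text into consecutive pieces of at most max_chars characters."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def build_retry_inputs(documents, failed_ids, max_chars):
    """Map new custom_ids (original ID plus part number) to document chunks."""
    retry_inputs = {}
    for doc_id in failed_ids:
        chunks = chunk_text(documents[doc_id], max_chars)
        for part, chunk in enumerate(chunks, start=1):
            retry_inputs[f"{doc_id}-part-{part}"] = chunk
    return retry_inputs

# Sample data, invented for illustration: one success, one failure.
documents = {"doc-001": "short", "doc-002": "x" * 25}
results = [
    {"custom_id": "doc-001", "result": {"type": "succeeded"}},
    {"custom_id": "doc-002", "result": {"type": "errored"}},
]

succeeded, failed_ids = split_results(results)
retry_inputs = build_retry_inputs(documents, failed_ids, max_chars=10)
print(failed_ids)          # only the failed document is retried
print(list(retry_inputs))  # chunked custom_ids for the re-submission
```

Keeping the original ID as a prefix of each chunk's custom_id means the partial results can be regrouped per document after the second batch completes.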
7.5 SLA Planning
If you need a result in 30 hours and the Batch API can take up to 24 hours:
- Submission window: 30 - 24 = 6 hours
- Batches must be submitted no later than 24 hours before the deadline
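- For recurring submissions, submit a batch every 4 hours: a request waits at most 4 hours for the next batch plus up to 24 hours of processing, 28 hours in total, which stays within the 30-hour limit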
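The arithmetic above can be checked with a short sketch. The function names and the assumption that a request waits at most one full submission interval before its batch is sent are illustrative choices, not part of the API.

```python
BATCH_PROCESSING_HOURS = 24  # upper bound on batch processing time

def submission_slack(sla_hours, processing_hours=BATCH_PROCESSING_HOURS):
    """Hours available between a request arriving and its batch being submitted."""
    slack = sla_hours - processing_hours
    if slack <= 0:
        raise ValueError("SLA is too short for batch processing")
    return slack

def worst_case_latency(interval_hours, processing_hours=BATCH_PROCESSING_HOURS):
    """Longest wait for a result when batches go out every interval_hours."""
    return interval_hours + processing_hours

slack = submission_slack(30)        # 30 - 24
worst_case = worst_case_latency(4)  # 4 + 24
print(slack, worst_case)
```

Any submission interval up to the slack value meets the deadline; a shorter interval such as 4 hours leaves a margin for delays in preparing or submitting a batch.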