Chapter 1: Claude API — Fundamentals of Model Interaction
Documentation: Messages API | Prompt Engineering
1.1 API Request Structure
The Claude API follows a request–response model. Each request to the Claude Messages API includes:
{
"model": "claude-sonnet-4-6",
"max_tokens": 1024,
"system": "You are a helpful assistant.",
"messages": [
{"role": "user", "content": "Hi!"},
{"role": "assistant", "content": "Hello!"},
{"role": "user", "content": "How are you?"}
],
"tools": [...],
"tool_choice": {"type": "auto"}
}
Key fields:
model— model selection (claude-opus-4-6,claude-sonnet-4-6,claude-haiku-4-5)max_tokens— maximum number of tokens in the responsesystem— the system prompt (defines model behavior)messages— conversation history (you must send the full history to maintain coherence)tools— definitions of available toolstool_choice— tool selection strategy
1.2 Message Roles
The messages array uses three roles:
user— user messagesassistant— model responses (included when sending history)tool— tool call results (the role is not explicitly set; this appears as atool_resultcontent block)
Critically important: on every API request you must send the full conversation history. The model does not persist state between requests—each call is independent.
1.3 The stop_reason Field in the Response
The Claude API response includes stop_reason, which indicates why the model stopped generating:
| Value | Description | Action |
|---|---|---|
"end_turn" |
The model finished its response | Show the result to the user |
"tool_use" |
The model wants to call a tool | Execute the tool and return the result |
"max_tokens" |
Token limit reached | The response is truncated; you may need to increase the limit |
"stop_sequence" |
A stop sequence was encountered | Handle based on your application logic |
For agentic systems, "tool_use" and "end_turn" are the most important—they control the agent loop.
1.4 System Prompt
The system prompt is a special instruction that defines context and behavioral rules. It:
- Is not part of the
messagesarray; it is passed separately in thesystemfield - Has priority over user messages
- Is loaded once and applies throughout the conversation
- Is used to define role, constraints, and output format
Important for the exam: system prompt wording can create unintended tool associations. For example, an instruction like "always verify the customer" can cause the model to overuse get_customer, even when it is unnecessary.
1.5 Context Window
The context window is the total amount of text (in tokens) the model can process at once. It includes:
- The system prompt
- The full message history
- Tool definitions
- Tool results
Key context-window problems:
Lost-in-the-middle effect: models reliably process information at the start and end of a long input but can miss details in the middle. Mitigation: place key information near the beginning or end.
Accumulation of tool results: every tool call adds output to the context. If a tool returns 40+ fields but only 5 matter, then most of the context is wasted.
Progressive summarization: when compressing history, numeric values, percentages, and dates often get lost and become vague ("about", "roughly", "a few").