Chapter 4: Context and Conversation Management

Part of the **Build Your Own Coding Agent** tutorial. One issue = one chapter (`chapters/04-context.md`) plus its `examples/04-context/` samples.



**Goal (1 sentence):** Build the stateful layer over the stateless Messages API - history, context-window limits, prompt caching, and per-user sessions.

### After this chapter you can
- Maintain a correctly alternating `messages` array across turns, count tokens before sending, and keep the conversation inside the model's context window.
- Set and tune a `system` prompt for persona and constraints, using a string or a block array with `cache_control`.
- Apply `cache_control` on stable prefixes to cut latency and cost, and verify cache hits via `usage.cache_read_input_tokens`.
- Isolate per-user sessions in memory and persist them to disk so history survives restarts.

### What to cover (ONE paragraph, not a list)
The Anthropic Messages API is stateless: every call to `client.messages.create` must supply the full conversation history as the `messages` array, with strictly alternating `role: "user"` and `role: "assistant"` turns, where each `content` field accepts a string or an array of content blocks (`text`, `tool_use`, `tool_result`). After each call you extract `response.content` and push it back as an `assistant` turn before the next user message; a `system` string or array of `text` blocks sets persona and stable background context without occupying a turn. Before sending, call `client.messages.countTokens` with the same `model`, `system`, `messages`, and `tools` payload, then branch on the result with one of three strategies: trim the oldest turns (always in pairs, to preserve alternation), keep a rolling window of the last N turns, or call `create` with a summarize instruction and replace the accumulated turns with a single injected `user`/`assistant` summary pair. Tie the threshold to your model via `client.models.retrieve` rather than hard-coding a number. Prompt caching adds `cache_control: { type: "ephemeral" }` to the final `text` block of the `system` array or to large, stable `user` turns; the prefix must exceed the model's minimum cacheable size (about 4096 tokens on Opus 4.x and Haiku 4.5, about 2048 on Sonnet 4.6) and must be byte-stable across requests, or caching silently no-ops. Finally, per-user sessions key an in-memory `Map` by Telegram `chat.id` and serialize to a JSON file so history survives bot restarts.
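
To make the alternation rule concrete, here is a minimal sketch of the history bookkeeping, written without the SDK so it runs on its own. The `Message` and `ContentBlock` types are simplified local stand-ins for the SDK's richer types, and `isAlternating` and `appendExchange` are hypothetical helper names, not part of `@anthropic-ai/sdk`.

```typescript
// Simplified local types (assumption: the real SDK types carry more fields).
type Role = "user" | "assistant";
type ContentBlock = { type: string; [key: string]: unknown };
type Message = { role: Role; content: string | ContentBlock[] };

// Hypothetical helper: true when the history starts with "user" and strictly alternates.
function isAlternating(messages: Message[]): boolean {
  return messages.every((m, i) => m.role === (i % 2 === 0 ? "user" : "assistant"));
}

// Hypothetical helper: append one completed exchange without mutating the input.
// `assistantContent` stands in for the `response.content` array returned by `create`.
function appendExchange(
  history: Message[],
  userText: string,
  assistantContent: ContentBlock[],
): Message[] {
  return [
    ...history,
    { role: "user", content: userText },
    { role: "assistant", content: assistantContent },
  ];
}
```

In a real sample, the array returned by `appendExchange` would be what gets passed as `messages` on the next `client.messages.create` call, after the new user message is added.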

### Going deeper (optional asides - keep OFF the main line)
- None

### Out of scope (defer - do NOT preview)
- None

### Code samples - examples/04-context/
- [ ] `multi-turn.ts` - append assistant `content` back into `messages`; print turn count.
- [ ] `system-prompt.ts` - `system` string vs block array; verify persona across turns.
- [ ] `token-counter.ts` - `countTokens` before `create`; rolling-window trim at a model-specific threshold.
- [ ] `summarize-history.ts` - summarize and replace oversized history.
- [ ] `prompt-cache.ts` - `cache_control` on a large, byte-stable system block; print `cache_read_input_tokens`.
- [ ] `telegram-sessions.ts` - `Map` by `chat.id`, persisted to `sessions.json`.
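
As a starting point for `telegram-sessions.ts`, the session store could look roughly like the sketch below. It covers only the in-memory `Map` and the JSON round trip; the helper names (`historyFor`, `serializeSessions`, `deserializeSessions`) and the simplified `Message` type are assumptions for illustration, and the file I/O is left out so the block runs anywhere.

```typescript
// Simplified message type (assumption: string-only content for brevity).
type Message = { role: "user" | "assistant"; content: string };
type Sessions = Map<number, Message[]>;

// Hypothetical helper: fetch the history for a chat, creating it on first contact.
function historyFor(sessions: Sessions, chatId: number): Message[] {
  let history = sessions.get(chatId);
  if (!history) {
    history = [];
    sessions.set(chatId, history);
  }
  return history;
}

// A Map does not survive JSON.stringify directly, so store it as [key, value] entries.
function serializeSessions(sessions: Sessions): string {
  return JSON.stringify(Array.from(sessions.entries()));
}

// Rebuild the Map from the entries array written by serializeSessions.
function deserializeSessions(json: string): Sessions {
  return new Map(JSON.parse(json) as [number, Message[]][]);
}
```

Under Bun, the serialized string could then be written with `Bun.write("sessions.json", ...)` and read back with `Bun.file("sessions.json").text()` at startup; when to flush to disk is a design choice the sample still has to make.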

### Must-keep for a beginner (floor - never cut for brevity)
- The run command for the first sample.
- "Never hardcode your key; it comes from the environment" (once, in prose).
- Bun auto-loads `.env`; no loader needed.
- Cache minimum sizes: ~4096 tokens on Opus 4.x and Haiku 4.5, ~2048 on Sonnet 4.6. The cached prefix must be byte-stable: a changing value inside it (e.g. a timestamp) invalidates the cache silently.
- Trimming must drop pairs (user + assistant together) to preserve alternation - never drop one side alone.
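
The pair-trimming rule can be sketched as follows. This is an illustration under stated assumptions, not the chapter's final code: `trimOldestPairs` is a hypothetical helper, and `estimateTokens` is an injected stand-in for the count you would actually get from `client.messages.countTokens`.

```typescript
// Simplified message type (assumption: string-only content for brevity).
type Message = { role: "user" | "assistant"; content: string };

// Hypothetical helper: drop the oldest user+assistant pair until the estimate fits.
// Removing two messages from the front keeps the history starting on a "user" turn.
function trimOldestPairs(
  messages: Message[],
  budget: number,
  estimateTokens: (m: Message[]) => number,
): Message[] {
  let trimmed = messages;
  // Stop once no complete pair precedes the final message: never drop one side alone.
  while (trimmed.length > 2 && estimateTokens(trimmed) > budget) {
    trimmed = trimmed.slice(2);
  }
  return trimmed;
}
```

If the history still exceeds the budget once only the newest turn remains, the function returns it as-is; handling that case (for example by summarizing) is left to the caller.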

### Friendliness floor (never cut - terse is not friendly)
- The chapter addresses the reader as "you", never "the user" or "one".
- The intro AND at least one section open with a warm, second-person sentence.

### Key APIs (flat list, reference only)
`client.messages.create`, `client.messages.countTokens`, `client.models.retrieve`, `messages`, `system`, `cache_control: { type: "ephemeral" }`, `usage.cache_read_input_tokens`, `usage.cache_creation_input_tokens`

### Prerequisites
Chapters 1-3. The Telegram session sample reuses the Chapter 3 bot token.

### Definition of done
- [ ] Chapter at `chapters/04-context.md`, <=120 lines, <=4 main-line H2s plus an optional "What's next" closer (paste `wc -l` AND `grep -c '^## '` in the PR).
- [ ] Every sample runnable with `bun run`, imported via `<<< @/examples/04-context/file.ts`, <=35 lines, comment:code <=0.30.
- [ ] One-home rule held: no prose sentence restates an inline code comment.
- [ ] Friendliness floor held: reader addressed as "you"; intro + >=1 section open warm.
- [ ] Samples use only real `@anthropic-ai/sdk` surface; ASCII punctuation only.
- [ ] Optional material lives in Going-deeper asides, not main-line H2s.
- [ ] Linked from `README.md` and the `.vitepress/config.ts` sidebar; `bun x vitepress build` passes.
- [ ] Caching sample actually shows a non-zero `cache_read_input_tokens`.
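
One way to check the last item is to classify the `usage` object of each response. The sketch below assumes only the two usage fields listed under Key APIs; `cachedSystem` and `describeCache` are hypothetical helper names, and the local types are simplified stand-ins for the SDK's.

```typescript
// Simplified local types (assumption: only the fields this check needs).
type TextBlock = { type: "text"; text: string; cache_control?: { type: "ephemeral" } };
type Usage = {
  cache_read_input_tokens?: number | null;
  cache_creation_input_tokens?: number | null;
};

// Hypothetical helper: wrap a large, byte-stable string as a cacheable `system` block array.
function cachedSystem(stableText: string): TextBlock[] {
  return [{ type: "text", text: stableText, cache_control: { type: "ephemeral" } }];
}

// Hypothetical helper: classify a response's usage as a cache hit, a cache write, or neither.
function describeCache(usage: Usage): "hit" | "write" | "none" {
  if ((usage.cache_read_input_tokens ?? 0) > 0) return "hit";
  if ((usage.cache_creation_input_tokens ?? 0) > 0) return "write";
  return "none";
}
```

A sample that sends the same request twice would typically expect `"write"` on the first call and `"hit"` on the second; seeing `"none"` both times suggests the prefix is below the minimum size or is not byte-stable.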
