Part of the Build Your Own Coding Agent tutorial. One issue = one chapter (chapters/10-real-world.md) plus its examples/10-real-world/ samples.
Goal (1 sentence): Apply everything to three agent archetypes (research, automation, support) plus human-in-the-loop gates and a small eval harness, all wired into Telegram.
After this chapter you can
- Compose the Chapter 6 runner with domain tool sets to build research, automation, and support agents that each carry their own messages history and share a single Telegram bot client.
- Implement a human-in-the-loop approval gate that intercepts a tool_use block, emits a Telegram inline-keyboard prompt, and resumes or cancels the tool cycle based on the reply.
- Integrate RAG retrieval as a tool call for the support agent, using a retrieve_docs tool that fetches top-k chunks and a cache_control block on the static knowledge-base system prompt to cut cost across repeated queries.
- Build a minimal eval harness that replays fixture conversations through client.messages.create, extracts structured assertions from the final text block, and logs pass/fail with usage metrics to track cost per eval.
- Wire all three agents into one Telegram bot, routing /research, /automate, and /support commands to separate agent instances.
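To make the approval-gate outcome above concrete, here is a minimal sketch of the three decisions the gate has to make: does this tool_use need a human, what does the Telegram prompt look like, and what goes back to the model on a cancel. The tool names in NEEDS_APPROVAL, the helper names, and the approve:/cancel: callback_data format are assumptions for illustration, not part of the tutorial's runner. The reply_markup.inline_keyboard shape follows the Telegram Bot API's sendMessage body.

```typescript
// Minimal local types mirroring the Messages API content blocks used here.
type ToolUseBlock = { type: "tool_use"; id: string; name: string; input: Record<string, unknown> };
type ToolResultBlock = { type: "tool_result"; tool_use_id: string; content: string; is_error?: boolean };
type GateOutcome = { action: "run" } | { action: "cancel"; result: ToolResultBlock };

// Hypothetical policy: tool names that a human must confirm before execution.
const NEEDS_APPROVAL = new Set(["delete_file", "send_email"]);

function needsApproval(block: ToolUseBlock): boolean {
  return NEEDS_APPROVAL.has(block.name);
}

// Body for Telegram's sendMessage with a two-button inline keyboard.
// callback_data carries the decision plus the tool_use id (Telegram caps it at 64 bytes).
function buildApprovalPrompt(chatId: number, block: ToolUseBlock) {
  return {
    chat_id: chatId,
    text: `Approve ${block.name}?\n${JSON.stringify(block.input)}`,
    reply_markup: {
      inline_keyboard: [[
        { text: "Approve", callback_data: `approve:${block.id}` },
        { text: "Cancel", callback_data: `cancel:${block.id}` },
      ]],
    },
  };
}

// Map the button press to an outcome. Only an exact approve for this block runs the tool;
// anything else cancels, and the model still receives a tool_result so the history stays valid.
function resolveApproval(callbackData: string, block: ToolUseBlock): GateOutcome {
  if (callbackData === `approve:${block.id}`) return { action: "run" };
  return {
    action: "cancel",
    result: { type: "tool_result", tool_use_id: block.id, content: "Cancelled by the operator.", is_error: true },
  };
}
```

On a "run" outcome the caller executes the tool as usual and appends its tool_result; on "cancel" it appends the returned block instead, so every tool_use still gets a matching tool_result.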
What to cover (ONE paragraph, not a list)
This chapter wires the Chapter 6 runner into three production archetypes. The research agent defines a web_search tool with input_schema, chains multiple tool_use -> tool_result turns, and controls depth with a step counter checked on each stop_reason === "tool_use" guard. The automation agent is triggered from a cron or webhook handler, passes task context through the system prompt, and forces an action on the first turn with tool_choice: { type: "any" }, then resets to auto on later turns so the loop can reach end_turn - because forced tool_choice blocks a final text reply. The support agent adds a retrieve_docs tool for top-k RAG retrieval, an escalate_to_human tool that fires a Telegram notification, and a cache_control block on the static knowledge-base system prompt to cut latency and cost across repeated queries. Safety checks inspect tool_use.name and tool_use.input before execution and short-circuit with an explanatory tool_result when a call looks destructive; token budgeting reads usage.input_tokens and usage.output_tokens after each turn and trims oldest non-system messages when the running total nears the context limit. The chapter closes by routing /research, /automate, and /support Telegram commands to their respective agent instances and measuring quality with the eval harness.
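The forced-first-turn rule and the step counter described above can be sketched together. This is an offline illustration under stated assumptions: callModel is a synchronous stand-in for client.messages.create (the real call is async and takes the full request), and toolChoiceFor, runLoop, and fakeModel are hypothetical names, not the Chapter 6 runner's API.

```typescript
type ToolChoice = { type: "auto" } | { type: "any" };
type Turn = { stop_reason: "tool_use" | "end_turn"; text: string };
// Stand-in for client.messages.create so the sketch runs without a network call.
type CallModel = (toolChoice: ToolChoice, step: number) => Turn;

// Force a tool call on the first turn only; later turns let the model decide.
function toolChoiceFor(step: number): ToolChoice {
  return step === 0 ? { type: "any" } : { type: "auto" };
}

// Bounded loop: stop on the first turn that is not tool_use, or when the step budget runs out.
function runLoop(callModel: CallModel, maxSteps: number): { steps: number; final: string | null } {
  for (let step = 0; step < maxSteps; step++) {
    const turn = callModel(toolChoiceFor(step), step);
    if (turn.stop_reason !== "tool_use") return { steps: step + 1, final: turn.text };
    // A real runner would execute the tool here and append the tool_result to messages.
  }
  return { steps: maxSteps, final: null }; // Budget exhausted without reaching end_turn.
}

// Fake model: a forced ("any") turn always calls a tool; under "auto" it finishes from step 2 on.
const fakeModel: CallModel = (choice, step) =>
  choice.type === "any" || step < 2
    ? { stop_reason: "tool_use", text: "" }
    : { stop_reason: "end_turn", text: "done" };

const ok = runLoop(fakeModel, 5); // any, auto, auto -> ends on the third call
const stuck = runLoop((_choice, step) => fakeModel({ type: "any" }, step), 4); // always forced
console.log(ok, stuck);
```

The second call shows why the reset matters: when every turn is forced, only the step budget stops the loop, and no final text is ever produced.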
Going deeper (optional asides - keep OFF the main line)
Out of scope (defer - do NOT preview)
Code samples - examples/10-real-world/
- research-agent.ts - bounded multi-step web-search + synthesis.
- automation-agent.ts - cron-triggered; forced first action then auto.
- support-agent.ts - retrieve_docs + escalate_to_human + cached KB prompt.
- approval-gate.ts - intercept destructive tool_use; Telegram confirm; resume/cancel.
- eval-harness.ts - fixture replay scoring tool selection + answer quality.
- telegram-all-agents.ts - route /research, /automate, and /support to the three agents.
Must-keep for a beginner (floor - never cut for brevity)
- The run command for the first sample.
- "Never hardcode your key; it comes from the environment" (once, in prose).
- Anything a beginner cannot infer from the code (e.g. Bun auto-loads .env, no loader needed).
- The one genuinely non-obvious gotcha: tool_choice: { type: "any" } prevents a final end_turn text reply - reset to auto on subsequent turns or the loop never terminates.
Friendliness floor (never cut - terse is not friendly)
- The chapter addresses the reader as "you", never "the user" or "one".
- The intro AND at least one section open with a warm, second-person sentence.
Key APIs (flat list, reference only - NOT a coverage checklist)
client.messages.create, client.messages.stream, stop_reason, tool_use, tool_result, tool_choice (auto/any/tool), input_schema, usage.input_tokens, usage.output_tokens, cache_control, system, tools, human-in-the-loop gate, eval harness
Prerequisites
Chapters 6 (reusable runner), 3 (Telegram raw fetch), 4 (conversation history), 9 (RAG/retrieval); a vector store for retrieve_docs.
Definition of done
- chapters/10-real-world.md, <=120 lines, <=4 main-line H2s plus an optional "What's next" closer (paste wc -l AND grep -c '^## ' in the PR).
- [...] bun run, imported via <<< @/examples/10-real-world/file.ts, <=35 lines, comment:code <=0.30.
- [...] @anthropic-ai/sdk surface; ASCII punctuation only.
- [...] bun run.
- [...] README.md and the .vitepress/config.ts sidebar; bun x vitepress build passes.