Chapter 10: Real-World Patterns: Research, Automation, and Support #10

@yagop

Part of the Build Your Own Coding Agent tutorial. One issue = one chapter (chapters/10-real-world.md) plus its examples/10-real-world/ samples.

Goal (1 sentence): Apply everything to three agent archetypes (research, automation, support) plus human-in-the-loop gates and a small eval harness, all wired into Telegram.

After this chapter you can

  • Compose the Chapter 6 runner with domain tool sets to build research, automation, and support agents that each carry their own messages history and share a single Telegram bot client.
  • Implement a human-in-the-loop approval gate that intercepts a tool_use block, emits a Telegram inline-keyboard prompt, and resumes or cancels the tool cycle based on the reply.
  • Integrate RAG retrieval as a tool call for the support agent, using a retrieve_docs tool that fetches top-k chunks and a cache_control block on the static knowledge-base system prompt to cut cost across repeated queries.
  • Build a minimal eval harness that replays fixture conversations through client.messages.create, extracts structured assertions from the final text block, and logs pass/fail with usage metrics to track cost per eval.
  • Wire all three agents into one Telegram bot, routing /research, /automate, and /support commands to separate agent instances.
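
The approval gate in the second bullet could be sketched as below. This is a minimal sketch, not the chapter's final approval-gate.ts: the block types are local stand-ins for the SDK's content-block shapes, and the NEEDS_APPROVAL set, the ok:/no: callback prefixes, and all function names are hypothetical. The reply_markup.inline_keyboard and callback_data fields are the Telegram Bot API's own.

```typescript
// Local stand-ins for the SDK's content-block shapes (assumption: kept
// minimal so the sketch runs without installing @anthropic-ai/sdk).
type ToolUseBlock = { type: "tool_use"; id: string; name: string; input: unknown };
type ToolResultBlock = {
  type: "tool_result";
  tool_use_id: string;
  content: string;
  is_error?: boolean;
};

// Hypothetical policy: tool names that a human must confirm first.
const NEEDS_APPROVAL = new Set(["delete_file", "send_email"]);

function needsApproval(block: ToolUseBlock): boolean {
  return NEEDS_APPROVAL.has(block.name);
}

// Body for Telegram's sendMessage with a two-button inline keyboard.
// Telegram caps callback_data at 64 bytes, so only a verdict prefix and
// the tool_use id travel in it.
function approvalPrompt(chatId: number, block: ToolUseBlock) {
  return {
    chat_id: chatId,
    text: `Run ${block.name} with ${JSON.stringify(block.input)}?`,
    reply_markup: {
      inline_keyboard: [[
        { text: "Approve", callback_data: `ok:${block.id}` },
        { text: "Cancel", callback_data: `no:${block.id}` },
      ]],
    },
  };
}

// Turn the pressed button's callback_data back into a decision.
function parseVerdict(data: string): { approved: boolean; toolUseId: string } | null {
  const sep = data.indexOf(":");
  if (sep < 0) return null;
  const verdict = data.slice(0, sep);
  if (verdict !== "ok" && verdict !== "no") return null;
  return { approved: verdict === "ok", toolUseId: data.slice(sep + 1) };
}

// On cancel the model still needs a tool_result for the pending id, so the
// gate answers with an explanatory error instead of running the tool.
function cancelledResult(block: ToolUseBlock): ToolResultBlock {
  return {
    type: "tool_result",
    tool_use_id: block.id,
    content: `The user declined to run ${block.name}.`,
    is_error: true,
  };
}
```

On approval the runner executes the tool as usual and appends its real tool_result; on cancel it appends cancelledResult(block). Either way the tool cycle resumes with a well-formed history.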

What to cover (ONE paragraph, not a list)

This chapter wires the Chapter 6 runner into three production archetypes. The research agent defines a web_search tool with an input_schema, chains multiple tool_use -> tool_result turns, and bounds depth with a step counter checked each time stop_reason === "tool_use". The automation agent is triggered from a cron or webhook handler, passes task context through the system prompt, and forces an action on the first turn with tool_choice: { type: "any" }; it then resets to auto on later turns so the loop can reach end_turn, because a forced tool_choice never produces a final text-only reply. The support agent adds a retrieve_docs tool for top-k RAG retrieval, an escalate_to_human tool that fires a Telegram notification, and a cache_control block on the static knowledge-base system prompt to cut latency and cost across repeated queries. Safety checks inspect tool_use.name and tool_use.input before execution and short-circuit with an explanatory tool_result when a call looks destructive. Token budgeting reads usage.input_tokens and usage.output_tokens after each turn and, when the running total nears the context limit, trims the oldest entries in messages (the system prompt is a separate parameter and is never trimmed) without splitting a tool_use from its tool_result. The chapter closes by routing /research, /automate, and /support Telegram commands to their respective agent instances and measuring quality with the eval harness.
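
The token-budgeting step is the easiest part of this paragraph to get subtly wrong, so here is one possible sketch of it. The types are local stand-ins for the SDK shapes, and estimateNextPrompt and trimOldest are hypothetical helper names. It assumes the latest turn's usage is a good estimate of the next prompt's size, since input_tokens for a request already covers the whole messages array; the two cache fields are added because cached tokens are reported separately from input_tokens.

```typescript
type Usage = {
  input_tokens: number;
  output_tokens: number;
  cache_read_input_tokens?: number | null;
  cache_creation_input_tokens?: number | null;
};
type Block = { type: string; [key: string]: unknown };
type Message = { role: "user" | "assistant"; content: string | Block[] };

// Rough size of the next prompt: everything sent this turn plus the reply
// that is about to be appended to the history.
function estimateNextPrompt(u: Usage): number {
  return (
    u.input_tokens +
    (u.cache_read_input_tokens ?? 0) +
    (u.cache_creation_input_tokens ?? 0) +
    u.output_tokens
  );
}

function hasToolResult(m: Message): boolean {
  return Array.isArray(m.content) && m.content.some((b) => b.type === "tool_result");
}

// Drop at least `dropAtLeast` of the oldest messages, then keep dropping
// until the history starts at a plain user message. That way no tool_result
// is left without the tool_use it answers, and the first message is a user turn.
function trimOldest(messages: Message[], dropAtLeast: number): Message[] {
  let start = Math.min(Math.max(dropAtLeast, 0), messages.length);
  while (start < messages.length) {
    const m = messages[start];
    if (m !== undefined && m.role === "user" && !hasToolResult(m)) break;
    start++;
  }
  // No safe cut point found: leave the history alone and let the caller decide.
  return start < messages.length ? messages.slice(start) : messages;
}
```

A caller might check estimateNextPrompt(response.usage) against a budget a little below the model's context window and call trimOldest(messages, 2) when it is exceeded; the budget and the drop count are tuning choices, not fixed values.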

Going deeper (optional asides - keep OFF the main line)

  • None

Out of scope (defer - do NOT preview)

  • None

Code samples - examples/10-real-world/

  • research-agent.ts - bounded multi-step web-search + synthesis.
  • automation-agent.ts - cron-triggered; forced first action then auto.
  • support-agent.ts - retrieve_docs + escalate_to_human + cached KB prompt.
  • approval-gate.ts - intercept destructive tool_use; Telegram confirm; resume/cancel.
  • eval-harness.ts - fixture replay scoring tool selection + answer quality.
  • telegram-all-agents.ts - route /research /automate /support to the three agents.
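
For telegram-all-agents.ts, the routing step could look roughly like the sketch below. It only parses the command and picks a per-chat, per-agent history; sending and receiving Telegram updates is left to the Chapter 3 code. The AgentName union, route, and historyFor are hypothetical names, and the history is typed as unknown[] as a placeholder for the runner's message type.

```typescript
type AgentName = "research" | "automation" | "support";

// A Map (rather than a plain object) so inherited keys can never match.
const COMMANDS = new Map<string, AgentName>([
  ["/research", "research"],
  ["/automate", "automation"],
  ["/support", "support"],
]);

// Split "/research@my_bot some question" into an agent and the prompt text.
function route(text: string): { agent: AgentName; prompt: string } | null {
  const trimmed = text.trim();
  const space = trimmed.search(/\s/);
  const head = space === -1 ? trimmed : trimmed.slice(0, space);
  // In group chats Telegram commands may carry an @botname suffix.
  const at = head.indexOf("@");
  const command = (at === -1 ? head : head.slice(0, at)).toLowerCase();
  const agent = COMMANDS.get(command);
  if (agent === undefined) return null;
  return { agent, prompt: space === -1 ? "" : trimmed.slice(space + 1).trim() };
}

// Each (chat, agent) pair gets its own history, so /support turns never
// leak into a /research conversation in the same chat.
const histories = new Map<string, unknown[]>();

function historyFor(chatId: number, agent: AgentName): unknown[] {
  const key = `${chatId}:${agent}`;
  let history = histories.get(key);
  if (history === undefined) {
    history = [];
    histories.set(key, history);
  }
  return history;
}
```

A message that does not start with one of the three commands returns null from route, which leaves the bot free to reply with a short help text.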

Must-keep for a beginner (floor - never cut for brevity)

  • The run command for the first sample.
  • "Never hardcode your key; it comes from the environment" (once, in prose).
  • Anything a beginner cannot infer from the code (e.g. Bun auto-loads .env, no loader needed).
  • The one genuinely non-obvious gotcha: tool_choice: { type: "any" } prevents a final end_turn text reply - reset to auto on subsequent turns or the loop never terminates.
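
The gotcha in the last bullet can be made concrete with a small sketch. toolChoiceForTurn and shouldContinue are hypothetical helper names; replay is a stand-in driver that feeds scripted stop_reason values through them instead of calling client.messages.create, so it runs without a key.

```typescript
type ToolChoice = { type: "auto" } | { type: "any" } | { type: "tool"; name: string };
type StopReason = "end_turn" | "tool_use" | "max_tokens" | "stop_sequence";

// Force a tool call on the first turn only; later turns go back to auto
// so the model is allowed to finish with a plain text reply.
function toolChoiceForTurn(turn: number): ToolChoice {
  return turn === 0 ? { type: "any" } : { type: "auto" };
}

// Keep looping only while the model asks for tools and the step cap holds.
function shouldContinue(stop: StopReason, stepsDone: number, maxSteps: number): boolean {
  return stop === "tool_use" && stepsDone < maxSteps;
}

// Stand-in driver: replays scripted stop reasons in place of real API calls
// and records which tool_choice each turn would have sent.
function replay(script: StopReason[], maxSteps: number): string[] {
  const log: string[] = [];
  let step = 0;
  let stop: StopReason;
  do {
    const choice = toolChoiceForTurn(step);
    stop = script[step] ?? "end_turn";
    log.push(`${choice.type}->${stop}`);
    step++;
  } while (shouldContinue(stop, step, maxSteps));
  return log;
}
```

If toolChoiceForTurn returned { type: "any" } on every turn, every response would stop with tool_use and only the step cap would end the loop. The same maxSteps guard doubles as the research agent's depth limit.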

Friendliness floor (never cut - terse is not friendly)

  • The chapter addresses the reader as "you", never "the user" or "one".
  • The intro AND at least one section open with a warm, second-person sentence.

Key APIs (flat list, reference only - NOT a coverage checklist)

client.messages.create, client.messages.stream, stop_reason, tool_use, tool_result, tool_choice (auto/any/tool), input_schema, usage.input_tokens, usage.output_tokens, cache_control, system, tools, human-in-the-loop gate, eval harness
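
As a reference for how several of these fields sit together in one request, here is a sketch of the params object the support agent might pass to client.messages.create. The shapes (system as an array of text blocks, cache_control: { type: "ephemeral" }, tools with input_schema) follow the Messages API; the local types, the supportRequest helper, the tool parameter names, and the prompt wording are assumptions for illustration. The model id is left as a parameter on purpose.

```typescript
type TextBlock = { type: "text"; text: string; cache_control?: { type: "ephemeral" } };
type Tool = {
  name: string;
  description: string;
  input_schema: { type: "object"; properties: Record<string, unknown>; required?: string[] };
};
type Message = { role: "user" | "assistant"; content: string };

// Hypothetical tool definitions; the parameter names are illustrative.
const retrieveDocs: Tool = {
  name: "retrieve_docs",
  description: "Fetch the top-k knowledge-base chunks for a query.",
  input_schema: {
    type: "object",
    properties: {
      query: { type: "string" },
      top_k: { type: "integer", minimum: 1 },
    },
    required: ["query"],
  },
};

const escalateToHuman: Tool = {
  name: "escalate_to_human",
  description: "Notify a human operator on Telegram.",
  input_schema: {
    type: "object",
    properties: { reason: { type: "string" } },
    required: ["reason"],
  },
};

// Build the request params. The cache_control marker goes on the last
// static system block so the stable prefix can be reused across queries.
function supportRequest(knowledgeBase: string, messages: Message[], model: string) {
  const system: TextBlock[] = [
    { type: "text", text: "You are a support agent. Answer from the knowledge base." },
    { type: "text", text: knowledgeBase, cache_control: { type: "ephemeral" } },
  ];
  return {
    model,
    max_tokens: 1024,
    system,
    tools: [retrieveDocs, escalateToHuman],
    messages,
  };
}
```

Caching only applies once the marked prefix is long enough for the chosen model, so a very short knowledge-base prompt may show no cache hits in usage.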

Prerequisites

Chapters 6 (reusable runner), 3 (Telegram raw fetch), 4 (conversation history), 9 (RAG/retrieval); a vector store for retrieve_docs.

Definition of done

  • Chapter at chapters/10-real-world.md, <=120 lines, <=4 main-line H2s plus an optional "What's next" closer (paste wc -l AND grep -c '^## ' in the PR).
  • Every sample runnable with bun run, imported via <<< @/examples/10-real-world/file.ts, <=35 lines, comment:code <=0.30.
  • One-home rule held: no prose sentence restates an inline code comment.
  • Friendliness floor held: reader addressed as "you"; intro + >=1 section open warm.
  • Samples use only real @anthropic-ai/sdk surface; ASCII punctuation only.
  • Optional material lives in Going-deeper asides, not main-line H2s.
  • All three agents + gate + eval harness implemented and runnable with bun run.
  • Unified Telegram entry point routes all three archetypes.
  • Linked from README.md and the .vitepress/config.ts sidebar; bun x vitepress build passes.

Labels: documentation
