docs(adr): ADR-0038 Build Verification Loop — the agent builds, verifies, and corrects itself by os-zhuang · Pull Request #1710 · objectstack-ai/framework

os-zhuang · 2026-06-11T04:51:24Z

Design center: never make correctness depend on a human looking. Humans won't review, and the magic moment auto-publishes before they could — the agent that builds an app must be the same loop that verifies and corrects it.

Grounded in this week's live data: six agent-authored defects shipped to staging in one day — every one passed schema validation, every one was found by a human manually browsing (dangling dataset refs, measure-name mismatches, seeds not materializing, queries returning 0 on populated objects, silent seed failures, views rendering as "Unknown component type"). Schema-valid ≠ renders ≠ returns data ≠ matches intent — four separate verification planes, of which only the first exists today.

Five layers, one issues[] contract consumed by the agent, the chat health-card, and the eval store:

L1 draft-time cross-artifact graph lint (same-turn feedback into the tool envelope)
L2 pre-publish renderability contract (translation + registry as data; dataset compilation)
L3 post-publish runtime probes (row counts, one real query per widget — generalizing seedApplied)
L4 bounded self-correction + machine-conditional publish gate (replaces the human approval gate for AI builds; HITL stays for destructive actions only) + an LLM intent-review pass
L5 golden-prompt eval suite on the existing-but-unused ai_eval_cases/ai_eval_runs (every incident becomes a permanent regression case)
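
As an illustration of how the L1 lint and the publish gate could share one issues[] contract, here is a minimal sketch. This PR text does not define the contract's fields, so every name below (`Issue`, `Draft`, `lintDraft`, `canPublish`, the issue codes) is a hypothetical stand-in, not the ADR's actual schema. It covers only two of the six defect classes listed above: dangling dataset refs and measure-name mismatches.

```typescript
// Hypothetical shapes: the ADR names an issues[] contract, but its fields are assumed here.
type Layer = "L1" | "L2" | "L3" | "L4" | "L5";
type Severity = "error" | "warning";

interface Issue {
  layer: Layer;       // which verification layer raised the issue
  severity: Severity;
  code: string;       // machine-readable code the agent can act on
  artifact: string;   // name of the artifact the issue is attached to
  message: string;    // human-readable text, e.g. for the chat health-card
}

// Assumed, simplified artifact model: widgets point at datasets and measures by name.
interface Dataset { name: string; measures: string[]; }
interface Widget { name: string; dataset: string; measure: string; }
interface Draft { datasets: Dataset[]; widgets: Widget[]; }

// L1-style cross-artifact lint: every widget must reference an existing
// dataset, and a measure that the dataset actually declares.
function lintDraft(draft: Draft): Issue[] {
  const datasetsByName = new Map<string, Dataset>();
  for (const d of draft.datasets) datasetsByName.set(d.name, d);

  const issues: Issue[] = [];
  for (const w of draft.widgets) {
    const ds = datasetsByName.get(w.dataset);
    if (ds === undefined) {
      issues.push({
        layer: "L1",
        severity: "error",
        code: "dangling_dataset_ref",
        artifact: w.name,
        message: `Widget "${w.name}" references unknown dataset "${w.dataset}".`,
      });
      continue; // the measure cannot be checked without its dataset
    }
    if (!ds.measures.includes(w.measure)) {
      issues.push({
        layer: "L1",
        severity: "error",
        code: "measure_name_mismatch",
        artifact: w.name,
        message: `Widget "${w.name}" uses measure "${w.measure}", not declared on "${ds.name}".`,
      });
    }
  }
  return issues;
}

// Machine-conditional gate: publish only when no error-severity issue remains.
function canPublish(issues: Issue[]): boolean {
  return issues.every((i) => i.severity !== "error");
}

// Example draft that is structurally valid but broken across artifacts.
const draft: Draft = {
  datasets: [{ name: "sales", measures: ["total_revenue"] }],
  widgets: [
    { name: "revenue_chart", dataset: "sales", measure: "revenue" },
    { name: "orders_table", dataset: "orders", measure: "count" },
  ],
};

const issues = lintDraft(draft);
console.log(issues.map((i) => `${i.artifact}: ${i.code}`), canPublish(issues));
```

Both widgets in this draft are well-formed on their own, which is why a per-artifact schema check would pass them. The lint only fails because it resolves names across artifacts. Returning the same `Issue[]` from every layer would let the agent, the health-card, and the eval store consume one shape.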

Phase 1 (~1 week, cloud) alone converts this week's discovery latency from human-hours to same-turn.

Status: Proposed — open for review; merging records the proposal.

🤖 Generated with Claude Code

…ies, corrects itself

Never make correctness depend on a human looking: six agent-authored defects shipped to staging in one day, every one schema-valid, every one found by a human manually browsing. Five layers: draft-time graph lint, pre-publish renderability contract, post-publish runtime probes, bounded self-correction with a machine-conditional publish gate, and a golden-prompt eval suite on the existing ai_eval_* skeleton. HITL remains only for destructive actions; quality review is the machine's job.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

vercel · 2026-06-11T04:51:26Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
spec	Ready	Preview, Comment	Jun 11, 2026 4:55am

github-actions Bot added the size/m and documentation labels Jun 11, 2026

vercel Bot deployed to Preview June 11, 2026 04:55

os-zhuang merged commit af72fb2 into main Jun 11, 2026
12 checks passed

os-zhuang deleted the adr-0038-build-verification-loop branch June 11, 2026 04:56
