Workflow boards: kanban state machines that drive coding agents #3032

Open
ccdwyer wants to merge 301 commits into pingdotgg:main from ccdwyer:ft/hyperion

ccdwyer commented Jun 10, 2026


Workflow Boards

Per-project kanban boards as event-sourced state machines that drive coding agents. Lanes hold pipelines of steps (agent / script / approval / merge); routing between lanes is decided by step outcomes, JSONLogic predicates over captured output, lane fallbacks, manual actions, or external webhook events. Every ticket gets its own git worktree, and every move is audited and explained.

All screenshots below are from a live run on a mock project ("Snackbase"): the board's agent steps run GPT-5.5 at different reasoning levels per lane (planning = low, implementation = medium with escalation to extra-high on retry, review = high ×3 reviewers), and the "Fix off-by-one" ticket was driven through the pipeline by real agents.

The board

Lanes with per-lane colors and WIP limits, tickets with status stripes, dependency badges ("waiting on 1 dependency"), token budgets ("0 tok / 250k"), and usage roll-ups.

Board overview

Creating a ticket: description, blocked-by dependencies, and an optional token budget that halts agent steps once spent.

New ticket

Intake: braindump → tickets

Paste a braindump, pick the agent (provider/model + reasoning effort), and it proposes structured tickets — including dependency edges ("After #1") — which you edit and approve before anything is created. These proposals came from a real GPT-5.5 run.

Intake braindump

Intake proposals

The workflow editor

Canvas view: lanes as cards, steps typed and colored, routing edges colored by outcome (success/failure/blocked), numbered transitions, dotted action edges, routing-precedence legend. Edits are drag-to-connect or via the inspector; explicit Save lints and writes the board file (.t3/boards/*.json is the source of truth).

Editor canvas

Selecting a lane dims every edge that doesn't touch it, so dense graphs stay readable:

Lane focus dimming

Agent steps, fully configurable

The implement step: GPT-5.5 · Medium reasoning, 2 retry attempts, and "Escalate on retry" to GPT-5.5 · Extra High — a failed attempt automatically reruns on the stronger configuration.

Implement step
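
To make the retry behaviour concrete, here is a minimal sketch of how a step might pick its configuration per attempt. The `AgentStepConfig` shape and `configForAttempt` are hypothetical names, not the board file's real schema, and the sketch assumes "2 retry attempts" means two reruns after the first attempt, all of them on the escalated configuration.

```typescript
// Hypothetical shapes: the PR does not show the real step schema.
interface AgentConfig {
  model: string;
  reasoning: "low" | "medium" | "high" | "extra-high";
}

interface AgentStepConfig {
  base: AgentConfig;
  retryAttempts: number; // assumption: reruns allowed after the first attempt
  escalateOnRetry?: AgentConfig; // stronger configuration used for reruns
}

// Pick the configuration for an attempt (0 = first run).
// Returns undefined once the retry budget is exhausted.
function configForAttempt(step: AgentStepConfig, attempt: number): AgentConfig | undefined {
  if (attempt < 0 || attempt > step.retryAttempts) return undefined;
  if (attempt === 0) return step.base;
  return step.escalateOnRetry ?? step.base;
}

const implement: AgentStepConfig = {
  base: { model: "gpt-5.5", reasoning: "medium" },
  retryAttempts: 2,
  escalateOnRetry: { model: "gpt-5.5", reasoning: "extra-high" },
};
```

Without `escalateOnRetry`, the sketch simply reruns on the base configuration.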

The review step: GPT-5.5 · High, captured output (the agent ends with a fenced JSON verdict that routing predicates can read), and a 3-reviewer panel — three independent sessions vote, strict majority wins.

Review step panel
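
The panel rule ("strict majority wins") can be sketched as a small pure function. The verdict strings and the `strictMajority` name are illustrative assumptions; the PR only says that three independent sessions vote.

```typescript
// Strict majority: a verdict wins only with more than half of all votes.
// Returns undefined when no verdict reaches that threshold (for example a tie).
function strictMajority<T extends string>(votes: readonly T[]): T | undefined {
  const counts: Record<string, number> = {};
  for (const v of votes) counts[v] = (counts[v] ?? 0) + 1;
  for (const v of votes) {
    if ((counts[v] ?? 0) * 2 > votes.length) return v;
  }
  return undefined;
}
```

With three reviewers a two-to-one split decides the step; an even panel that ties yields no verdict, and how the engine would route that case is not stated in the PR.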

Lane form: merge steps, routing, external events

The Land lane in form view: a merge step (commits the ticket worktree and merges it into the checked-out branch; conflicts block instead of failing), lane success/failure/blocked routes, and external event matchers — a ci.passed webhook with a payload predicate moves the ticket to Done.

Land lane form

Dry run

Simulate a hypothetical ticket through the definition you're editing (unsaved changes included) under all-succeed / all-fail / all-block scenarios. It mirrors the engine's exact routing semantics and explains every hop — here it correctly flags that the success path stalls in Review unless a verdict transition matches.

Dry run

Version history

Every save is snapshotted per board with diffs and non-destructive revert.

Version history

External events

Each board gets a webhook endpoint with a rotating token (shown exactly once) and a copyable curl example. CI, PR automation, or cron can move correlated tickets (by ticketId or workflow/<id> branch) through their lane's event matchers, with delivery dedupe.

Webhook config

The board reports to you

A digest of the last 24h: shipped/created counts, tokens spent, agent time, and which tickets are waiting on a human.

Board digest

Living with a ticket

The drawer: "Why is this ticket here?" route explainability (every hop with the rule that caused it), a discussion thread whose comments reach the next agent step as context, per-step status/duration/token usage, and one-click lane actions ("Retry build", "Back to backlog").

Ticket drawer

Script steps are gated by per-project trust — the first node --test run blocks until you allow it:

Trust gate

Every ticket has a case file (.t3/ticket/<id>/) the agents write into — here the PLAN.md the planning agent produced — plus the script output and reviewer sessions:

Artifacts

And any agent step's full session is one click away, read-only:

Agent session

Boards live in the sidebar with hover rename/delete (delete cascades tickets, events, versions, worktrees, and webhook tokens):

Sidebar

Not shown but included

Durable restart recovery (pipelines, retries, merges, approvals resume safely), WIP queueing with FIFO auto-admission, dependency auto-release, terminal-lane retention TTL with full state cleanup, aging badges and waiting-on-you toasts, ticket search, multi-environment boards, and an event-sourced audit trail under everything.
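
As an illustration of WIP queueing with FIFO auto-admission, the sketch below uses a hypothetical in-memory `Lane` class. The real engine is event-sourced over SQLite, so this only mirrors the admission rule, not the implementation.

```typescript
// Hypothetical in-memory lane; names and shape are illustrative only.
class Lane {
  private wipLimit: number;
  private active: string[] = [];
  private queue: string[] = [];

  constructor(wipLimit: number) {
    this.wipLimit = wipLimit;
  }

  // Admit immediately when under the limit, otherwise wait in FIFO order.
  enter(ticketId: string): "admitted" | "queued" {
    if (this.active.length < this.wipLimit) {
      this.active.push(ticketId);
      return "admitted";
    }
    this.queue.push(ticketId);
    return "queued";
  }

  // When a ticket leaves, the oldest queued ticket takes the free slot.
  // Returns the auto-admitted ticket id, if any.
  leave(ticketId: string): string | undefined {
    const i = this.active.indexOf(ticketId);
    if (i === -1) return undefined;
    this.active.splice(i, 1);
    const next = this.queue.shift();
    if (next !== undefined) this.active.push(next);
    return next;
  }
}
```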

Notes

  • The engine is a new bounded context under apps/server/src/workflow/** (Effect TS, event-sourced over SQLite) with contracts in packages/contracts/src/workflow.ts and the web UI under apps/web/src/components/board/**.
  • Every batch on this branch went through adversarial review (security findings like webhook token hashing, prototype-pollution sanitization, and stale-event fencing came out of those rounds) plus live Playwright verification.
  • docs/workflow-demo/ (these screenshots) is demo material and can be dropped before merge.

🤖 Generated with Claude Code

Note

Add workflow boards — a kanban-based state machine system for driving coding agents

  • Introduces a full event-sourced workflow engine (WorkflowEngine) with per-board kanban lanes, agent/script/merge step types, WIP limits, conditional routing via JSONLogic predicates, and ticket dependency tracking
  • Adds 22 database migrations (033–054) for event store, projection tables, worktree leases, dispatch outbox, setup runs, checkpoints, webhook delivery, and version history
  • Implements durable execution with a provider dispatch outbox, worktree leases with fencing tokens, crash recovery, and a terminal retention sweeper for expired tickets
  • Exposes a comprehensive WebSocket RPC surface (30+ methods) gated by new workflow:read and workflow:operate auth scopes, wired through the existing environment auth layer
  • Adds a full web UI: board view with drag-and-drop (BoardView), ticket drawer with step activity feed and diffs, an Intake dialog for proposing tickets from a braindump, a visual canvas editor with SVG edge routing, and version history with JSON diff preview
  • Adds per-board webhook ingestion (POST /hooks/workflow/:boardId) with token verification, payload sanitization, and delivery deduplication
  • Hides workflow-internal orchestration threads from user-facing snapshots and agent awareness relay by propagating a hidden flag through the thread schema and projection queries
  • Risk: large schema and runtime surface introduced in a single PR; recovery paths, retention sweeper, and worktree janitor run at startup and may interact with existing orchestration under concurrent workloads

Macroscope summarized 5002be8.
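
The summary mentions worktree leases with fencing tokens. As a generic sketch of that technique (not the PR's actual `WorkflowEngine` code; `LeaseTable` and its methods are hypothetical), each acquisition issues a strictly larger token and writes carrying an older token are rejected:

```typescript
// Hypothetical lease table keyed by worktree path.
interface Lease {
  holder: string;
  token: number;
}

class LeaseTable {
  private leases: Record<string, Lease> = {};
  private nextToken = 1;

  // Acquiring (or taking over) a lease always issues a larger token.
  acquire(worktree: string, holder: string): number {
    const token = this.nextToken++;
    this.leases[worktree] = { holder, token };
    return token;
  }

  // A write is accepted only if it carries the current token, so a holder
  // that lost the lease across a crash or restart cannot clobber newer work.
  write(worktree: string, token: number): boolean {
    const lease = this.leases[worktree];
    return lease !== undefined && lease.token === token;
  }
}
```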


Note

High Risk
Very large new subsystem touching auth scopes, webhooks, worktree leasing, dispatch/outbox recovery, and orchestration visibility; correctness and security depend on many interacting durable paths.

Overview
Adds per-project workflow boards (.t3/boards/*.json) backed by a new event-sourced engine under apps/server/src/workflow/**: lane pipelines (agent / script / approval / merge), ticket moves via step outcomes, JSONLogic routing (json-logic-js), WIP queues, dependencies, token budgets, worktree leases, a dispatch outbox into orchestration, crash recovery, terminal retention, and board webhooks (hashed tokens + delivery dedupe). 22 SQLite migrations (033–054) create workflow events, projections, and supporting tables.

The server wires WorkflowServerRuntimeLive into startup (recovery + retention sweeper), exposes workflowHooksRouteLayer, and extends OAuth token exchange with workflow:read / workflow:operate (tests/assertions now use shared AuthStandardClientScopes / AuthAdministrativeScopes).

Orchestration gains hidden threads for internal workflow agent runs: they are projected but stripped from public snapshots/lists, gated by isThreadHidden, and skipped by the agent-awareness relay. thread.create can mark hidden; terminal manager adds attachHistoryStream so script steps can stream persisted output without opening a shell.

Also ships a sample Standard delivery board definition and ignores .superpowers/ in git.

Reviewed by Cursor Bugbot for commit 5002be8.

ccdwyer and others added 30 commits June 7, 2026 12:55
Real workflow execution was green only through stubs; the live path needed durable recovery, provider-question waits, repo-root worktrees, and hard supersede handling to satisfy the v1 invariants.

Constraint: Fixes were driven by docs/superpowers/reviews/2026-06-07-workflow-boards-v1-adversarial-review.md and the v1 design invariants.

Rejected: Stub-only fixes | they would preserve the broken end-to-end runtime path.

Confidence: high

Scope-risk: broad

Directive: Keep future workflow changes covered by real-path tests with temp git repos and provider waits, not only stubbed unit tests.

Tested: pnpm exec vp run typecheck; pnpm exec vp check; cd apps/server && pnpm exec vp test run src/workflow; pnpm --filter @t3tools/contracts test

Not-tested: full non-workflow server/web suites beyond the required workflow and contracts gates
Recovery now interrupts stale pre-restart projection turns, restarts provider dispatches without rebinding to dead turns, and hands terminal results back to the engine so autonomous pipelines route after restart. Provider question waits now come from real user-input activity projection and the engine re-awaits terminal completion after answers before starting the next step.

Constraint: Fix reviewed residuals without adding v2 workflow features or weakening stub-free real-path tests.

Rejected: Stubbing TurnProjectionPort or pending approvals in real-path coverage | Those were the exact seams hiding the restart and user-input bugs.

Confidence: high

Scope-risk: moderate

Directive: Keep provider wait completion tied to dispatch terminal state; do not confirm a step merely because the user answered a provider question.

Tested: cd apps/server && pnpm exec vp test run src/workflow; pnpm exec vp run typecheck; pnpm exec vp check

Not-tested: Full application browser workflow; changes are server workflow runtime only.
Outer pipeline failures should leave a durable blocked-ticket record and an operator-visible warning instead of disappearing behind recovery behavior. The regression now asserts the real event store contains the error detail and the warning is emitted.

Constraint: Low-severity polish only; step-level failures continue to use StepFailed routing.

Rejected: Routing orchestration failures through step failure handling | There may be no started step when the pipeline wrapper fails.

Confidence: high

Scope-risk: narrow

Directive: Keep the outer pipeline catch interrupt-aware; manual supersede interrupts must not block tickets.

Tested: cd apps/server && pnpm exec vp test run src/workflow

Not-tested: Full repo typecheck/check will run after the polish pass.
Recovered provider monitors intentionally fork independently so workflow startup is not held open by provider terminal waits. The comment records that these restart-window continuations are not tracked as live pipeline fibers and therefore cannot be interrupted by manual moves.

Constraint: Optional low-risk polish only; do not destabilize passing restart recovery behavior.

Rejected: Reworking recovery continuations through live pipeline fiber tracking | That changes concurrency and interruption semantics beyond the requested low-risk pass.

Confidence: high

Scope-risk: narrow

Directive: Revisit this only with dedicated recovery concurrency tests.

Tested: cd apps/server && pnpm exec vp test run src/workflow

Not-tested: Full repo typecheck/check will run after the polish pass.
The TurnStateReader tests now include one focused path through ProjectionTurnRepositoryLive and TurnProjectionPortLive so running, completed, and error states are covered without the TurnProjectionPort stub seam.

Constraint: Optional low-risk test coverage only; no production behavior change.

Rejected: Broad real-path workflow expansion | The requested seam is covered by a focused projection test.

Confidence: high

Scope-risk: narrow

Directive: Keep state-mapping tests close to TurnStateReader rather than relying only on workflow runtime scenarios.

Tested: pnpm exec vp test run src/workflow/Layers/TurnStateReader.test.ts; cd apps/server && pnpm exec vp test run src/workflow

Not-tested: Full repo typecheck/check will run after the polish pass.
Make adding a board to a project as easy as a new chat: multiple named
file-backed boards listed in the sidebar (board icon), one-click "Add board"
that writes a templated .t3/boards/<slug>.json defaulting to the user's most
recent agent, and server-side discovery/registration of board files.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Server-resolved workspace root (drop client repoRoot/filePath; remove
trust-prone registerBoardFromFile); board list as a separate source from
the registry (invalid files surface as error entries); net-new per-project
board-list store slice; grouped-project member picker for "Add board";
modelSelection from full threads with availability-filtered providers;
loader path split; file-deletion unregister; exclusive-create writes;
BoardSnapshot.projectId; full RPC layer wiring enumerated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
14 TDD tasks: default template + slug helpers, BoardListEntry/createBoard
contracts, registry unregister + read-model list/delete, loader path split,
project workspace resolver, BoardDiscovery, listBoards/createBoard handlers
(server-resolved root; registerBoardFromFile removed), runtime+watcher wiring,
client RPC wiring, resolveRecentAgent, board-list store slice, sidebar board
rows + Add board, board route cleanup.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Task 5: keep repo compiling (shim the existing registerBoardFromFile caller
  to the new loader signature until Task 8 removes it).
- Task 3: BoardSnapshot.projectId is top-level; remove registerBoardFromFile
  everywhere (ipc.ts, client runtime, EnvironmentApi, scope, mocks).
- Task 6: resolver returns a dedicated tagged error; unwrap Option from
  getProjectShellById; don't mistype projection errors.
- Task 8: add WorkspaceFileSystem.createFileExclusive (wx); createBoard uses it.
- Task 9: discovery is on-demand via listBoards; server file-watcher/boardsChanged
  push explicitly deferred (no generic project-visible hook exists).
- Task 11: read recent agent from thread SHELLS; filter providers by installed.
- Task 12: refetch on mount + after createBoard (no push).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Constraint: Board Creation UX plan requires the default board template before create/discovery wiring.
Rejected: Hand-authoring ad hoc test fixtures | the default board should be generated by the same helper createBoard will use.
Confidence: high
Scope-risk: narrow
Directive: Keep this template pure; provider validation belongs to loader/discovery layers.
Tested: cd apps/server && pnpm exec vp test run src/workflow/defaultBoard.test.ts; pnpm exec vp run typecheck; pnpm exec vp check; cd apps/server && pnpm exec vp test run src/workflow
Not-tested: Full app manual create-board flow is not implemented yet.
Constraint: Board creation needs deterministic, collision-safe file slugs before createBoard writes files.
Rejected: Inline slug logic in RPC handlers | shared pure helpers keep discovery and createBoard aligned.
Confidence: high
Scope-risk: narrow
Directive: Keep slug generation deterministic; idempotent discovery depends on it.
Tested: cd apps/server && pnpm exec vp test run src/workflow/boardSlug.test.ts; pnpm exec vp run typecheck; pnpm exec vp check; cd apps/server && pnpm exec vp test run src/workflow
Not-tested: createBoard slug collision flow is not wired yet.
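
A deterministic, collision-safe slug helper could look roughly like the sketch below. The contents of the real `boardSlug` helpers are not shown in this PR text, so the function names, the `"board"` fallback, and the numeric-suffix collision rule are all assumptions.

```typescript
// Lowercase, replace runs of non-alphanumerics with "-", trim stray dashes.
function slugify(name: string): string {
  const slug = name
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  return slug === "" ? "board" : slug; // assumed fallback for empty results
}

// Deterministic collision handling: append the lowest free numeric suffix.
function uniqueSlug(name: string, taken: ReadonlySet<string>): string {
  const base = slugify(name);
  if (!taken.has(base)) return base;
  for (let n = 2; ; n++) {
    const candidate = `${base}-${n}`;
    if (!taken.has(candidate)) return candidate;
  }
}
```

Because the output depends only on the name and the set of taken slugs, repeated discovery over the same files yields the same result.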
…omFile

Constraint: New workflow RPC entries must be in WsRpcGroup now, but the plan keeps the legacy register handler alive until Task 8 for the loader-path transition.
Rejected: Removing registerBoardFromFile handlers in Task 3 | Task 5 and Task 8 explicitly sequence that removal later.
Confidence: medium
Scope-risk: moderate
Directive: Replace the temporary listBoards/createBoard handler failures with real BoardDiscovery/createBoard wiring in Task 8, then remove registerBoardFromFile everywhere.
Tested: pnpm --filter @t3tools/contracts test -- workflow.test.ts; pnpm exec vp run typecheck; pnpm exec vp check; cd apps/server && pnpm exec vp test run src/workflow; pnpm --filter @t3tools/contracts test
Not-tested: listBoards/createBoard runtime behavior is intentionally not implemented until later tasks.
Constraint: Discovery must unregister deleted board files and list boards per project from projection rows.
Rejected: Keeping stale projection rows on delete | deleted files must disappear from listBoards results.
Confidence: high
Scope-risk: narrow
Directive: Do not cascade into ticket runtime state from deleteBoard; this method removes board-list projection rows only.
Tested: cd apps/server && pnpm exec vp test run src/workflow/Layers/BoardRegistry.test.ts src/workflow/Layers/WorkflowReadModel.test.ts; pnpm exec vp run typecheck; cd apps/server && pnpm exec vp test run src/workflow; pnpm exec vp check
Not-tested: File-discovery delete flow is not wired until Task 7.
Constraint: create/list discovery must resolve board files under a project workspace but persist workspace-relative paths.
Rejected: Continuing to store absolute or client-provided paths | sidebar board rows need portable relative file paths and server-side root resolution.
Confidence: high
Scope-risk: moderate
Directive: Task 8 should remove the temporary registerBoardFromFile shim after createBoard/listBoards are real.
Tested: cd apps/server && pnpm exec vp test run src/workflow/Layers/WorkflowFileLoader.test.ts; cd apps/server && pnpm exec vp test run src/workflow/Layers/WorkflowRpcHandlers.test.ts; pnpm exec vp run typecheck; cd apps/server && pnpm exec vp test run src/workflow; pnpm exec vp check
Not-tested: BoardDiscovery scanning is not implemented until Task 7.
Constraint: Board creation must resolve workspace roots server-side from projectId, never from client paths.
Rejected: Passing repo paths from the web client | this preserves the invariant that server projections own workspace roots.
Confidence: high
Scope-risk: narrow
Directive: Use this port in BoardDiscovery and createBoard instead of accepting paths in RPC payloads.
Tested: cd apps/server && pnpm exec vp test run src/workflow/Layers/ProjectWorkspaceResolver.test.ts; pnpm exec vp run typecheck; cd apps/server && pnpm exec vp test run src/workflow; pnpm exec vp check
Not-tested: Real projection SQL lookup is covered by existing ProjectionSnapshotQuery tests, not duplicated here.
Constraint: listBoards must discover file-backed boards on demand and unregister rows for files that disappear.
Rejected: Server push/watchers for board changes | v1 explicitly defers watchers and uses on-demand list refresh.
Confidence: high
Scope-risk: moderate
Directive: Keep discovery idempotent and keep invalid files as error entries without registering them.
Tested: cd apps/server && pnpm exec vp test run src/workflow/Layers/BoardDiscovery.test.ts; pnpm exec vp run typecheck; cd apps/server && pnpm exec vp test run src/workflow; pnpm exec vp check
Not-tested: RPC list/create handlers are not wired until Task 8.
…oardFromFile

Constraint: Removing registerBoardFromFile everywhere required client/runtime/test-harness cleanup and workflow provider wiring in the same compile boundary.
Rejected: Leaving temporary registerBoardFromFile shims until later tasks | the plan invariant requires grep-clean removal in Task 8.
Confidence: high
Scope-risk: moderate
Directive: Keep board creation server-resolved; do not reintroduce client-supplied board file paths.
Tested: cd apps/server && pnpm exec vp test run src/workspace/Layers/WorkspaceFileSystem.test.ts; cd apps/server && pnpm exec vp test run src/workflow/Layers/WorkflowRpcHandlers.test.ts; cd apps/server && pnpm exec vp test run src/workflow; pnpm --filter @t3tools/contracts test; pnpm exec vp run typecheck; pnpm exec vp check; rg registerBoardFromFile packages/contracts apps/server/src apps/web/src packages/client-runtime/src -n
Not-tested: end-to-end browser sidebar creation flow is covered in later web tasks.
Constraint: Task 8's grep-clean handler removal made the runtime wiring necessary before that compile boundary, so this commit records the verified Task 9 boundary.
Rejected: Reintroducing a temporary registerBoardFromFile shim to defer runtime wiring | it violates the Task 8 removal invariant.
Confidence: high
Scope-risk: narrow
Directive: Keep discovery on-demand through listBoards; do not add a watcher or boardsChanged push for v1.
Tested: pnpm exec vp run typecheck; cd apps/server && pnpm exec vp test run src/workflow; pnpm exec vp check
Not-tested: no additional runtime code changed after Task 8.
Constraint: registerBoardFromFile client/runtime entries were removed at the Task 8 compile boundary; this task completes the web helper surface.
Rejected: Passing a board file path through the helper | createBoard is server-resolved by projectId/name/agent only.
Confidence: high
Scope-risk: narrow
Directive: Keep board creation callers using EnvironmentApi.workflow.createBoard with no client path field.
Tested: cd apps/web && pnpm exec vp test run src/workflow/boardRpc.test.ts; cd apps/web && pnpm exec vp test run src/workflow; pnpm exec vp run typecheck; pnpm exec vp check
Not-tested: sidebar create button flow is covered in later tasks.
Constraint: Board creation must default to the user's most recent available agent without reading full thread details.
Rejected: Reading sidebar summaries for modelSelection | summaries intentionally omit modelSelection, so the resolver uses thread shells.
Confidence: high
Scope-risk: narrow
Directive: Keep availability gated by enabled, installed, and isAvailable provider entries.
Tested: cd apps/web && pnpm exec vp test run src/workflow/resolveRecentAgent.test.ts; cd apps/web && pnpm exec vp test run src/workflow; pnpm exec vp run typecheck; pnpm exec vp check
Not-tested: sidebar create-button integration is covered in later tasks.
Constraint: Board discovery is on-demand in v1, so the store needs explicit per-project replacement rather than live push merging.
Rejected: Appending board list entries incrementally | listBoards is a snapshot and deleted files must disappear on the next fetch.
Confidence: high
Scope-risk: narrow
Directive: Treat setProjectBoards/applyBoardList as replacement semantics for each project.
Tested: cd apps/web && pnpm exec vp test run src/workflow/boardListState.test.ts; cd apps/web && pnpm exec vp test run src/workflow; pnpm exec vp run typecheck; pnpm exec vp check
Not-tested: sidebar fetch/refetch wiring is covered in Task 13.
Add the board creation action in the same project header flow as new threads, so discovered boards are visible and navigable from the sidebar without a separate registration step.

Constraint: Task 13 requires a one-click project Add board affordance, sidebar board rows, and a manual running-app verification path.

Rejected: Client-supplied board paths | createBoard is server-resolved by projectId and agent only.

Confidence: high

Scope-risk: moderate

Directive: Keep board creation coupled to listBoards refetches until a future watcher/push design is explicitly introduced.

Tested: cd apps/web && pnpm exec vp test run src/components/Sidebar.logic.test.ts; cd apps/web && pnpm exec vp test run src/workflow; pnpm exec vp run typecheck; pnpm exec vp check; Playwright against http://localhost:5734 created a Workflow board row and navigated to /board?boardId=...

Not-tested: Multi-project context-menu branch was not clicked manually; implementation mirrors the existing new-thread member picker.
Remove the manual board registration affordance now that board creation and discovery provide real board ids from the sidebar path.

Constraint: Task 14 requires the board route to open sidebar-provided boardId values without a Register step and to render an explicit missing-board state.

Rejected: Keeping a disabled Register button | the v1 creation path no longer has a manual registration workflow.

Confidence: high

Scope-risk: narrow

Directive: Keep board runtime actions routed through subscribeBoard/createTicket/moveTicket/resolveApproval/runLane; this commit only changes route presentation and board lookup state.

Tested: cd apps/web && pnpm exec vp test run src/components/board/BoardHeaderControls.test.tsx 'src/routes/-boardRouteState.test.ts'; cd apps/web && pnpm exec vp test run src/workflow; pnpm exec vp run typecheck; pnpm exec vp check; Playwright opened sidebar board with no Register button and showed Board not found for a deleted/missing board id.

Not-tested: Browser test did not create a ticket from the cleaned route; ticket flows were left unchanged and covered by existing workflow tests.
First v2 sub-project: first-class script steps. Incorporates GPT-5.5 design
review — generic ScriptCommandRunner over TerminalManager (subscribe-before-
write, wrapped exit, threadId+terminalId identity), new blocked step
semantics, shared prepared-worktree executor refactor, per-project trust
table+RPC, workflow_script_run model, step cancel path, restart recovery,
and a read-only drill-in terminal viewer.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Thread `blocked` through detection/recovery/lease/projection/read-model
  (+ blockedReason), not just the schema.
- Trust gates BEFORE setup (pre-setup guard), so an untrusted project never starts a shell.
- Cancellation is cooperative via a ScriptCancelRegistry + terminal.close
  (not pipeline-fiber interrupt, which commits nothing); full cancelStep
  RPC wiring enumerated.
- History-only terminal: dedicated server attach/read API + read-only UI
  (guarded TerminalViewport prop is insufficient; attach w/o cwd errors).
- timeout via Schema.DurationFromString; cwd containment (realpath under
  worktree); weaken process-group claim to PTY-shell kill; drop separate
  ScriptStepCancelled event.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ntics)

Combined checkpoint (two uncommitted bodies were intertwined at hunk level,
no interactive git available). All green: typecheck, contracts (163),
server workflow (82).

- board-creation polish: create-ticket dialog (title+description) +
  pipelineStepCount on board lanes + browser test.
- script-steps unit-1 (GPT-5.5/Codex, TDD): `blocked` step semantics
  threaded through StepOutcome/StepRunStatus/StepBlocked event/blockedReason,
  engine + recovery terminal-detection + lease-release, projection, read-model.
- process artifacts: codex build/fix prompts, review docs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ccdwyer and others added 15 commits June 9, 2026 22:17
The intake agent can now mark a proposal as depending on earlier ones
("build the API, then the UI on it") via zero-based dependsOn indices.
Backward references only — forward, self, and junk indices are dropped
during parsing, so a proposed set can never contain a cycle. The
dialog shows "After #N" on dependent proposals, approval remaps edges
onto the approved list (edges to excluded rows disappear), and tickets
are created sequentially so each dependent passes the real TicketIds
of its predecessors. A braindump with implied ordering becomes a
self-executing pipeline: dependents queue until their prerequisites
land, then start automatically.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
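
The backward-only rule and the approval remap described in this commit can be sketched as two pure functions. `Proposal`, `sanitizeDependsOn`, and `remapApproved` are hypothetical names; only the behaviour (drop forward, self, and junk indices; drop edges to excluded rows) comes from the commit message.

```typescript
interface Proposal {
  title: string;
  dependsOn: number[];
}

// Keep only integer indices that point at an earlier proposal. Forward, self,
// and junk references are dropped, so no cycle can survive parsing.
function sanitizeDependsOn(raw: unknown, ownIndex: number): number[] {
  if (!Array.isArray(raw)) return [];
  const kept: number[] = [];
  for (const value of raw) {
    if (typeof value === "number" && Number.isInteger(value) && value >= 0 && value < ownIndex) {
      kept.push(value);
    }
  }
  return kept;
}

// Remap edges onto the approved subset (original indices, in order).
// Edges that pointed at excluded rows disappear.
function remapApproved(proposals: readonly Proposal[], approved: readonly number[]): Proposal[] {
  const newIndex = new Map<number, number>();
  approved.forEach((oldIdx, i) => newIndex.set(oldIdx, i));
  return approved.map((oldIdx) => {
    const p = proposals[oldIdx];
    if (p === undefined) throw new RangeError(`no proposal at index ${oldIdx}`);
    const dependsOn = p.dependsOn
      .map((d) => newIndex.get(d))
      .filter((d): d is number => d !== undefined);
    return { title: p.title, dependsOn };
  });
}
```

Since every surviving edge points to a lower index, creating tickets in list order guarantees each dependent's predecessors already exist.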
Agent steps now offer "View agent session" in the drawer — a read-only
transcript of the hidden orchestration thread behind the step: the
exact instruction sent, every assistant reply, and a collapsible
activity log (tool calls, status changes). Available for completed
runs, not just live ones, via the existing subscribeThread snapshot
(by-id lookups intentionally resolve hidden threads).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The drawer gains an Artifacts section that lazily lists and renders the
scratch documents pipeline steps write under .t3/ticket/<id>/ (PLAN.md,
SPEC.md, REVIEW.md, ...) — each expandable, capped at 20 files and 64k
characters with a truncation marker. Reads go through the worktree
resolved for the ticket and the workspace filesystem's new listFiles,
which keeps the same realpath containment as every other workspace
read. Combined with route history and discussion, every ticket is now a
self-documenting record of how the work happened.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
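
The two caps (20 files, 64k characters) amount to a slice plus a per-file truncation. In the sketch below the constant names, the exact value used for "64k", and the marker text are assumptions rather than values taken from the code.

```typescript
const MAX_FILES = 20;
const MAX_CHARS = 64_000; // "64k characters"; the exact constant is an assumption
const TRUNCATION_MARKER = "\n[truncated]"; // hypothetical marker text

interface Artifact {
  path: string;
  content: string;
  truncated: boolean;
}

// Keep at most MAX_FILES artifacts and cut each one at MAX_CHARS characters,
// appending a marker so readers can tell the file was shortened.
function capArtifacts(files: ReadonlyArray<{ path: string; content: string }>): Artifact[] {
  return files.slice(0, MAX_FILES).map((f) => {
    const truncated = f.content.length > MAX_CHARS;
    return {
      path: f.path,
      content: truncated ? f.content.slice(0, MAX_CHARS) + TRUNCATION_MARKER : f.content,
      truncated,
    };
  });
}
```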
Boards can now react to the world. Lanes gain onEvent matchers
({name, when?, to}) whose predicates see only {event: {name, payload}}
(lint enforces the allowlist and validates targets). A per-board
webhook — POST /hooks/workflow/:boardId with an x-t3-webhook-token
header — correlates events to tickets by explicit ticketId XOR a
workflow/<ticketId> branch name, dedupes optional deliveryIds
race-free, bounds payloads with a JSON-aware sanitizer, and answers
404 identically for unknown boards and bad tokens.

Matching events move the ticket through the engine's new
ingestExternalEvent: matchers are evaluated against the current lane,
the move commits a TicketRouteDecided with source "external_event"
under the board admission lock with a stale-lane guard (a concurrent
move makes the event a no-op), supersedes in-flight work like a manual
move, and reports moved/queued/noop precisely (enterLane now returns
what it did). Route history explains these moves with the event name.

Tokens are stored hashed (sha256) with an 8-char prefix; the plaintext
appears only in the create/rotate response of the new
workflow.getWebhookConfig RPC. Cron, PR-mode merges, and broadcast
events stay out of scope per the design review.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
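
The "explicit ticketId XOR a workflow/<ticketId> branch name" rule can be sketched as follows. The `Correlation` field names and `correlateTicket` are hypothetical; the webhook's real request schema is not shown in this PR text.

```typescript
interface Correlation {
  ticketId?: string;
  branch?: string;
}

const BRANCH_PREFIX = "workflow/";

// Exactly one of ticketId / branch must be present (XOR). Returns the ticket
// id, or undefined when the event cannot be correlated.
function correlateTicket(c: Correlation): string | undefined {
  const hasId = typeof c.ticketId === "string" && c.ticketId.length > 0;
  const hasBranch = typeof c.branch === "string" && c.branch.length > 0;
  if (hasId === hasBranch) return undefined; // both or neither
  if (hasId) return c.ticketId;
  const branch = c.branch as string;
  if (branch.indexOf(BRANCH_PREFIX) !== 0) return undefined;
  const id = branch.slice(BRANCH_PREFIX.length);
  return id.length > 0 ? id : undefined;
}
```

Rejecting the "both" case avoids having to decide which source wins when the two disagree.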
- Supersede a ticket's running work only inside the admission lock, after
  the stale-lane guard and a matcher/target revalidation pass — stale
  events can no longer interrupt the current pipeline
- Webhook route: byte-accurate body cap (content-length precheck +
  byteLength), fail-closed 503 when delivery dedupe cannot be recorded,
  safe decodeURIComponent
- Sanitizer drops __proto__/prototype/constructor keys
- Board deletion revokes webhook tokens and delivery logs (RPC delete,
  recovery sweep, and discovery sweep paths)
- onEvent predicate allowlist no longer accepts event.payloadX typos

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
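
The key-dropping part of the sanitizer might look like the sketch below. It is a generic prototype-pollution guard under the assumption that payloads are plain JSON; the real sanitizer also bounds payload size, which is not reproduced here, and `sanitize` is an illustrative name.

```typescript
const FORBIDDEN_KEYS = ["__proto__", "prototype", "constructor"];

// Recursively copy JSON-like data, skipping keys that could reach the
// prototype chain when the result is later merged or indexed.
function sanitize(value: unknown): unknown {
  if (Array.isArray(value)) return value.map((item) => sanitize(item));
  if (value !== null && typeof value === "object") {
    // A null-prototype object has no inherited keys to collide with.
    const out: Record<string, unknown> = Object.create(null);
    for (const key of Object.keys(value)) {
      if (FORBIDDEN_KEYS.indexOf(key) !== -1) continue;
      out[key] = sanitize((value as Record<string, unknown>)[key]);
    }
    return out;
  }
  return value;
}
```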
- getBoardDigest RPC: created/shipped counts, token + agent-time totals,
  and tickets waiting on a human, over a clamped 1-168h window
- BoardTicketView.updatedAt threads through projections to the web store
- Ticket cards show a warn/alert aging badge once a ticket has been
  waiting_on_user/blocked for 30min/2h
- Digest dialog in the board header with a needs-attention count badge

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
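
The window clamp and the aging thresholds reduce to a few lines. This sketch reads "30min/2h" as warn at 30 minutes and alert at 2 hours, which is an interpretation; the `TicketStatus` members other than `waiting_on_user` and `blocked` are placeholders.

```typescript
// Clamp the digest window into 1..168 hours (assumes a finite number).
function clampWindowHours(hours: number): number {
  return Math.min(168, Math.max(1, hours));
}

// Only the two waiting states come from the PR; the others are placeholders.
type TicketStatus = "running" | "waiting_on_user" | "blocked" | "done";
type AgingBadge = "warn" | "alert" | undefined;

const WARN_MS = 30 * 60 * 1000;
const ALERT_MS = 2 * 60 * 60 * 1000;

// Badge for a ticket that has been waiting since updatedAt (epoch millis).
function agingBadge(status: TicketStatus, updatedAt: number, now: number): AgingBadge {
  if (status !== "waiting_on_user" && status !== "blocked") return undefined;
  const waited = now - updatedAt;
  if (waited >= ALERT_MS) return "alert";
  if (waited >= WARN_MS) return "warn";
  return undefined;
}
```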
Selecting a lane (or one of its steps/transitions) fades every edge that
neither leaves nor enters that lane to 15% opacity and renders the
connected edges last so they sit on top. Slot allocation is unchanged,
so edge geometry stays put while selecting.
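
One way to express this behavior, assuming edges are painted in array order (as in SVG, where later elements sit on top). The `Edge` shape and `layoutEdgesForSelection` are hypothetical names, not the canvas code.

```typescript
interface Edge {
  id: string;
  from: string; // source lane id
  to: string; // target lane id
}

interface RenderedEdge extends Edge {
  opacity: number;
}

const FADED_OPACITY = 0.15;

function layoutEdgesForSelection(edges: Edge[], selectedLane: string | null): RenderedEdge[] {
  if (selectedLane === null) return edges.map((edge) => ({ ...edge, opacity: 1 }));

  const isConnected = (edge: Edge) =>
    edge.from === selectedLane || edge.to === selectedLane;

  // Only opacity and paint order change; ids and endpoints are untouched,
  // so any slot or geometry computed from them stays the same.
  const faded = edges
    .filter((edge) => !isConnected(edge))
    .map((edge) => ({ ...edge, opacity: FADED_OPACITY }));
  const focused = edges
    .filter(isConnected)
    .map((edge) => ({ ...edge, opacity: 1 }));

  return [...faded, ...focused]; // connected edges last, so they render on top
}
```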

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- The digest dialog's load now fires from the controlled open click;
  onOpenChange only covers internal closes, so the fetch previously never ran
- In-flight digest fetches are invalidated on close so stale responses
  cannot repopulate the dialog
- Aging badges and the needs-attention count recompute on a 60s tick
  instead of waiting for an unrelated re-render
- getBoardDigest clamps windowHours into 1..168 instead of falling
  back to 24
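
The clamp itself is small; a sketch, assuming a finite numeric input (how the RPC treats missing or non-numeric values is not stated here, and `clampWindowHours` is a hypothetical name):

```typescript
const MIN_WINDOW_HOURS = 1;
const MAX_WINDOW_HOURS = 168; // one week

// Out-of-range values are pulled to the nearest bound instead of being replaced.
function clampWindowHours(windowHours: number): number {
  return Math.min(MAX_WINDOW_HOURS, Math.max(MIN_WINDOW_HOURS, windowHours));
}
```

With this shape a request for 500 hours yields a one-week digest rather than silently becoming a 24-hour one.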

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
simulateBoardRoute walks a definition with every step forced to a
chosen outcome (success/failure/blocked), mirroring the engine's route
precedence (step.on → transitions → lane.on). Transition predicates run
against a synthetic context, so lane.runCount loop bounds behave as
they would live. Ends classify as terminal / manual / no_route /
cycle_cap — surfacing dead ends and unbounded loops before an agent
burns tokens on them.

Exposed as workflow.dryRunBoard (read scope) over the definition
currently in the editor — unsaved changes included — with a "Dry run"
panel in the workflow editor explaining every hop.
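
To make the walk concrete, here is a simplified sketch of a forced-outcome dry run. It is not simulateBoardRoute: the `Lane`, `Step`, and `Transition` shapes, the `kind` field, the function-valued predicates (the real ones are JSONLogic), and the default hop cap are all assumptions.

```typescript
type Outcome = "success" | "failure" | "blocked";
type EndKind = "terminal" | "manual" | "no_route" | "cycle_cap";

interface Step {
  key: string;
  on?: Partial<Record<Outcome, string>>; // per-step routing, highest precedence
}

interface Transition {
  to: string;
  when: (ctx: { runCount: number }) => boolean; // stand-in for a JSONLogic predicate
}

interface Lane {
  id: string;
  kind: "auto" | "manual" | "terminal";
  steps: Step[];
  transitions?: Transition[];
  on?: Partial<Record<Outcome, string>>; // lane fallback, lowest precedence
}

// Precedence: step.on, then the first matching transition, then lane.on.
function pickTarget(lane: Lane, forced: Outcome, runCount: number): string | undefined {
  for (const step of lane.steps) {
    const direct = step.on?.[forced];
    if (direct !== undefined) return direct;
    if (forced !== "success") break; // a non-success outcome stops at the first step
  }
  const hit = (lane.transitions ?? []).find((t) => t.when({ runCount }));
  return hit?.to ?? lane.on?.[forced];
}

function dryRunSketch(
  lanes: Lane[],
  startId: string,
  forced: Outcome,
  maxHops = 25,
): { path: string[]; end: EndKind } {
  const byId = new Map(lanes.map((lane) => [lane.id, lane] as const));
  const path: string[] = [];
  let currentId = startId;
  let streakLane = "";
  let streak = 0;

  for (let hop = 0; hop < maxHops; hop++) {
    const lane = byId.get(currentId);
    if (lane === undefined) return { path, end: "no_route" };
    path.push(lane.id);
    if (lane.kind === "terminal") return { path, end: "terminal" };
    if (lane.kind === "manual") return { path, end: "manual" };

    // runCount is a consecutive streak for the lane currently running.
    streak = streakLane === lane.id ? streak + 1 : 1;
    streakLane = lane.id;

    const next = pickTarget(lane, forced, streak);
    if (next === undefined) return { path, end: "no_route" };
    currentId = next;
  }
  return { path, end: "cycle_cap" }; // hop budget exhausted: likely an unbounded loop
}
```

A lane that routes back to itself on failure ends as `cycle_cap` unless one of its transitions bounds the loop on `runCount`, which is the kind of dead end the dry run is meant to expose.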

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- "Webhook" board-header dialog: endpoint URL, token shown exactly once
  (on first provision or rotate, prefix-only afterwards, cleared from
  memory on close), copyable curl example built from the page origin
- "External events" section in the lane routing editor: name, optional
  predicate JSON over event.name/event.payload.*, and target lane,
  backed by add/update/remove editorModel mutations

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The intake dialog now has the same provider/model + effort pickers as
agent steps, defaulting to the recent-agent heuristic that previously
chose silently. The selection (including effort options) rides the
existing intakeTickets AgentSelection; proposing is disabled when no
provider is available.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Review findings on the dry-run batch:
- lane.runCount now mirrors countLanePipelineRuns: a consecutive streak
  that resets when another lane runs a pipeline — alternating loops now
  correctly dry-run as unbounded instead of bounded
- Empty auto lanes no longer route (the engine returns before starting
  a pipeline with no steps); the dry run ends with an explanatory note
- Synthetic ticket status matches what the routing-context builder
  would read (running / blocked), with a note whenever a predicate
  reads status so the approximation is visible
- A predicate evaluation error now stops the walk (live routing fails at
  the same point) instead of falling through to later transitions
- dryRunBoard rejects oversized definitions (256k chars, 200 lanes,
  100 steps/transitions/events per lane) — it is read-scoped and takes
  caller-supplied definitions
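
The runCount fix in the first item is easiest to see in isolation. The sketch below counts the trailing streak of pipeline runs for one lane; `laneRunCountSketch` and the flat history array are hypothetical stand-ins for countLanePipelineRuns and its real inputs.

```typescript
// history lists the lane id of each pipeline run, oldest first.
function laneRunCountSketch(history: string[], lane: string): number {
  let count = 0;
  // Walk backwards and stop at the first run that belongs to another lane.
  for (let i = history.length - 1; i >= 0; i--) {
    if (history[i] !== lane) break;
    count++;
  }
  return count;
}
```

With a history of build, review, build, review, build, the count for build is 1, not 3. A guard such as "runCount >= 3" therefore never fires for an alternating loop, which is why the dry run now reports such loops as unbounded.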

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
# Conflicts:
#	apps/web/src/components/Sidebar.tsx
Captured from a live run on a mock project (Snackbase): a delivery
board whose agent steps run GPT-5.5 at low/medium/high/xhigh reasoning,
with a real ticket driven through plan → build → review.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 10, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 3c9e9787-59ad-4b9a-aca6-09293aedb682

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


@github-actions github-actions Bot added the vouch:unvouched (PR author is not yet trusted in the VOUCHED list) and size:XXL (1,000+ changed lines, additions + deletions) labels Jun 10, 2026
Comment thread apps/server/src/workflow/Layers/TurnStateReader.ts Outdated
Comment thread apps/server/src/workflow/workflowFile.ts
Comment thread apps/web/src/components/board/TicketCard.tsx Outdated
Comment thread apps/server/src/workflow/Layers/ApprovalGate.ts
Comment thread apps/web/src/components/board/editor/canvas/CanvasView.tsx
Live-run notes:
- Captured output now falls back to earlier assistant messages in the
  same turn (newest first) when the final message has no fenced json
  block — multi-message agents (skill-driven review formats, progress
  notes) no longer read as "no vote"
- The auto-appended captureOutput suffix states it overrides any
  skill/workflow output format
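
A sketch of the newest-first fallback, assuming the turn's assistant messages are available as plain strings in chronological order. `captureOutputSketch` is a hypothetical name, and skipping a fenced block that fails to parse is an assumption about behavior the notes do not spell out.

```typescript
// Matches a fenced block tagged "json" and captures its body.
const JSON_FENCE = /`{3}json[ \t]*\r?\n([\s\S]*?)`{3}/;

function captureOutputSketch(assistantMessages: string[]): unknown {
  // Newest first: the final message wins when it has a block,
  // otherwise earlier messages in the same turn are tried.
  for (let i = assistantMessages.length - 1; i >= 0; i--) {
    const match = JSON_FENCE.exec(assistantMessages[i]);
    if (match === null) continue;
    try {
      return JSON.parse(match[1]);
    } catch {
      continue; // assumption: an unparseable block is skipped, not fatal
    }
  }
  return undefined; // no message in the turn carried a usable block
}
```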

Macroscope findings on the PR:
- ApprovalGate.getOrCreate registers the deferred atomically via
  Ref.modify — concurrent callers can no longer wait on an orphaned
  deferred
- TurnProjectionPort treats interrupted turns as completed, matching
  toTurnState's terminal classification
- Predicate path lint rejects steps.<key>.status.<extra> nesting
- Failed ticket badge says "failed", not "blocked"
- Canvas lane-height equality checks key presence, not just value
- Version history preview guards against stale async responses
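
The lane-height equality item is a common pitfall worth spelling out: comparing with a `?? 0` default treats a missing key and an explicit 0 as the same. A sketch of a presence-aware comparison, with `laneHeightsEqual` as a hypothetical name:

```typescript
function laneHeightsEqual(
  a: Record<string, number>,
  b: Record<string, number>,
): boolean {
  const aKeys = Object.keys(a);
  if (aKeys.length !== Object.keys(b).length) return false;
  // Require the key to exist in b; comparing (b[key] ?? 0) would let
  // { x: 0 } and { y: 0 } pass as equal.
  return aKeys.every(
    (key) => Object.prototype.hasOwnProperty.call(b, key) && a[key] === b[key],
  );
}
```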

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@ccdwyer

ccdwyer commented Jun 10, 2026

Author

Addressed all 6 Macroscope findings in 87db8cc:

  • ApprovalGate race (Medium): getOrCreate now registers the deferred atomically via Ref.modify — concurrent callers always share one deferred.
  • TurnStateReader interrupted (Medium): completed now includes interrupted, matching toTurnState's terminal classification.
  • workflowFile status path (Low): steps.<key>.status.<extra> is now rejected, same as exitCode.
  • TicketCard failed badge (Low): label fixed to "failed".
  • CanvasView record equality (Low): key presence checked explicitly instead of ?? 0 masking.
  • VersionHistoryPanel race (Low): stale async version loads are invalidated by a request counter.
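
The request-counter pattern named in the last item can be sketched as follows. `createRequestCounter` and `loadLatest` are hypothetical names; the panel's actual code is not shown in this thread.

```typescript
function createRequestCounter() {
  let current = 0;
  return {
    // Each new request takes the next id and becomes the only current one.
    begin(): number {
      return ++current;
    },
    isCurrent(id: number): boolean {
      return id === current;
    },
    // Call on close or unmount so in-flight responses are ignored.
    invalidate(): void {
      current++;
    },
  };
}

// Usage sketch: apply the result only if no newer request started meanwhile.
async function loadLatest<T>(
  counter: ReturnType<typeof createRequestCounter>,
  load: () => Promise<T>,
  apply: (value: T) => void,
): Promise<void> {
  const id = counter.begin();
  const value = await load();
  if (counter.isCurrent(id)) apply(value);
}
```

The same guard fits the digest dialog fix above, where closing the dialog invalidates in-flight fetches.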

Same commit also hardens review-panel verdict capture (found during the live demo run): captured output now falls back to earlier assistant messages in the turn when the final message lacks the fenced json block, and the auto-appended captureOutput suffix explicitly overrides skill-driven output formats.

🤖 Generated with Claude Code

@ccdwyer ccdwyer marked this pull request as ready for review June 10, 2026 21:44
Comment thread apps/server/src/ws.ts
Comment thread apps/server/src/terminal/Layers/Manager.ts
@macroscopeapp

macroscopeapp Bot commented Jun 10, 2026

Contributor

Approvability

Verdict: Needs human review

Diff is too large for automated approval analysis. A human reviewer should evaluate this PR.


ccdwyer and others added 4 commits June 10, 2026 18:24
Ticket cards communicated status three ways at once (colored left
border, status dot, icon pill) — for idle/queued/done the dot was pure
gray decoration. Status is now a single colored word in a quiet meta
footer, aging folds into it instead of stacking a second badge, and
dependencies render as plain text. Only running tickets get a live
pulsing indicator (motion-safe gated).

Lanes lose the bordered box and "manual / 2" chips: header is name +
count in type, auto-entry lanes get a small AUTO tag, and the card area
is a soft surface that highlights as a drop target while dragging.
Webhook/Digest toolbar triggers demote to ghost so the header reads
primary → secondary → tertiary.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Fresh live run on the Snackbase demo project replacing all 18 PR
screenshots: a new "Add stock count endpoint" ticket driven through
plan → build (trust gate on the script step, then trusted and rerun) →
three-reviewer panel → Needs Attention, plus a live GPT-5.5 intake
proposal run. Captures now show the redesigned cards/lanes (single
status word, live running indicator, chip-less lane headers) and are
free of the provider-update toast that polluted the previous set.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>