[DRAFT] [FEAT]: GUI Conversation Tree UI #2014
Conversation
Captures the rev-18 design state across all three GUI tree-view docs before PR1 (backend operator-validation relocation) starts landing implementation. Rev 18 closures (per 01 §0): - Q.S.1 DECIDED: V1.0 ships without intra-wave memoization; Crescendo cost cliff is documented in §1.2 instead. - Q.S.2 DECIDED: operator-as-tag (honor-system); §9.4.5 scaled back to relocation-only, no anonymous-rejection. Preserves the no-labels early-return. - Q.S.3 remains a V1.0 gate item pending the Q.S.4 Crescendo experiment. - Rubber-duck cheap wins: per-leaf ExecutionRecord timing fields, per-WaveEvent emittedAt, version: number on ConversationTreeNodeBase for V2 forward-compat, i18n string-registry V1.0 commitment, FanNode polymorphism honest naming, cost-preview tooltip on the refresh affordance, permanent failure class surfaced distinctly, client-side telemetry-vs-privacy line in §15. - F7 mechanical sweep: backend .py:L<n> citations refreshed. These docs are the contract that PR1-PR7 implement against.
V1.0 tree-UI PR1 per doc/gui/design/01_tree_primitives.md §9.4.5.
Problem
-------
Today's check at pyrit/backend/services/attack_service.py reads from
piece.labels["operator"], which is written by an attack_mappers.py path
marked `removed_in="0.16.0"`. After that deprecation lands, the
piece-label check silently no-ops and the server-side operator
isolation disappears for tree-UI traffic, leaving only the UI posture
(client-side, bypassable via direct REST call).
Change
------
Relocate the source of truth to `AttackResult.labels["operator"]` so
the check survives the 0.16.0 piece-label-write deprecation. The
signature changes from `(conversation_id=, request=)` to
`(ar=, request=)`: the caller (add_message_async) already has the AR
in scope, so passing it directly avoids a duplicate lookup and makes
the dependency explicit.
Backward-compat fallback
------------------------
When ar.labels has no operator key, fall back to reading from existing
piece labels — matches the pre-relocation behavior for legacy ARs that
were tagged at the piece level only. Bounded fallback: it dies with
the deprecated piece-label write path in 0.16.0.
The fallback was chosen over pure relocation per user input: if any
production AR exists without an AR-level operator label, pure
relocation would silently disable enforcement for those rows. The 3
LOC + 2 tests cost is well below the silent-disablement risk.
Honor-system preserved (Q.S.2)
-------------------------------
The no-labels early-return is preserved by design. Anonymous requests
(request.labels absent / empty / missing "operator" key) pass
unchallenged. The operator is a tag the operator picks — not an auth
claim. The V1.0 posture defends against accidental mis-attribution and
casual cross-operator extensions; motivated bypass is out of scope
until V1.1+ multi-operator collaboration revisits.
TDD
---
Wrote 11 tests in `TestValidateOperatorMatch` + `TestAddMessageOperatorIntegration`
covering:
- Three honor-system early-returns (no labels, empty labels, no
operator key).
- Four AR-first paths (match passes; mismatch raises; precedence
rule when both sources present; no-enforcement when nothing to
compare).
- Three backward-compat fallback paths (mismatch raises; match
passes; empty everywhere passes).
- One integration test through add_message_async confirming the
call site is wired with `ar=ar`.
All 11 failed for the right reasons against the original
implementation (TypeError on `ar=` kwarg; DID NOT RAISE on the AR-only
mismatch case). All pass after the relocation. Backend suite green:
652 passed, 4 skipped, 0 regressed.
PR sequencing
-------------
Per design doc §9.4.5, this PR must merge before the V1.0 GUI PR. The
PyRIT version bump that signals "tree-UI safe" is a separate concern
tracked by the §9.4.5 startup version gate in the upcoming PR3.
…essagePiece DTO
V1.0 tree-UI PR2 per doc/gui/design/01_tree_primitives.md §9.4.4 (b).
Problem
-------
The response DTO `MessagePiece` at pyrit/backend/models/attacks.py drops
two domain fields the V1.0 tree-UI must read at reload time:
1. `original_prompt_id` — the lineage-root piece id, preserved across
`Message.duplicate()` so descendants share the same root. The
resolver in 03 §4.1 reads it from prepended pieces to keep
lineage chains intact when re-prepending clean-prefix history.
2. `converter_identifiers` — the sequential converter pipeline that
produced the piece's converted_value. Without it, reload-
reconstruction (§9.4.1) renders UserTurnNodes with empty
converter pipelines indistinguishable from "no converter ever
applied", and the next Refresh silently fires without the
operator's authored converters. Also load-bearing for
`Fan(axis='converter')` variant-payload reconstruction (§9.3.1)
which derives `variants[s].payload.converters` from the
fan-child leaf's first user-turn `converter_identifiers`.
Change
------
- Add `original_prompt_id: str | None = None` (defaults to None for
defensive null-handling, though persisted domain pieces always have
a non-null value via `_set_original_prompt_id_default`).
- Add `converter_identifiers: list[ComponentIdentifierField] = []`
(uses the same annotated alias the domain `MessagePiece` uses; the
PlainSerializer flattens each ComponentIdentifier to the wire shape
the frontend reads via `ComponentIdentifierField.model_dump()`).
- Update `pyrit_messages_to_dto_async` to populate both. The
`original_prompt_id` is cast to `str()` since the domain field is
`uuid.UUID | None`; the `converter_identifiers` is a defensive
`list(...)` copy.
Reusing ComponentIdentifierField (rather than defining a parallel
ComponentIdentifierDto) keeps the DTO surface honest: the wire shape
matches the domain shape's `model_dump()` output, so the round-trip
contract is structural rather than maintained-in-two-places.
The PR2 contract spelled out in §9.4.4: empty list (not None) means
"no converter applied"; the field being declared is what makes that
distinguishable from "DTO missing the field" on the TypeScript side.
TDD
---
Wrote 6 tests:
- 4 in `TestPyritMessagesToDto`: exposes original_prompt_id;
handles None defensively; exposes converter_identifiers with
round-trippable shape; empty list when domain has no converters.
- 2 in new `TestMessagePieceDtoDefaults`: direct DTO instantiation
asserting defaults `[]` and None, plus a JSON-round-trip via
model_dump() proving the frontend gets the flat shape.
All 6 failed for the right reasons against the original mapper
(KeyError 'original_prompt_id' on dumped DTO; absent field on direct
construction). All pass after the DTO + mapper changes. Backend
suite green: 658 passed (6 new), 4 skipped, 0 regressed.
PR sequencing
-------------
Per §9.4.5 PR sequencing: PR2 ships before the GUI PR3 so the V1.0
frontend types can reference these fields. Build-time check (the
auto-reverse code reads them; TS fails if absent) is the mandatory
enforcement; PR2's landing makes the gate pass.
… DTO fields
V1.0 tree-UI PR3a per doc/gui/design/01_tree_primitives.md §9.4.4 (a).
The TS-side counterpart of backend PR1 (a22854cb1) + PR2 (d1bbeed0d):
extends the frontend API surface to match the new wire shape.
Types added / extended
----------------------
- `CreateAttackRequest.prepended_conversation?: PrependedMessageRequest[]`
— the runner's central dispatch field; carries clean-prefix history
per §4.1 / §3.3. Optional so the existing chat tab's
`source_conversation_id` + `cutoff_index` path keeps working
unchanged.
- `PrependedMessageRequest` (new) — mirrors the backend's
PrependedMessageRequest. `role` is the four-value ChatMessageRole
literal; multimodal turns bundle multiple pieces into one message.
- `ComponentIdentifier` (new) — flat shape from
`ComponentIdentifier.model_dump()`; `class_name` + `class_module` +
`params` are required (the V1.0 runner reads these for ConverterRef
reconstruction in §9.3.1). `hash` / `pyrit_version` / `eval_hash` /
`children` are declared optional so the wire payload type-checks
regardless of which optional fields the backend chooses to populate.
- `BackendMessagePiece.original_prompt_id: string | null` — required
on every PR2-or-newer payload; null only on defensive-test inputs.
- `BackendMessagePiece.converter_identifiers: ComponentIdentifier[]` —
required; empty list = no converter applied (distinguishable from
field-missing by being present).
Contract tests
--------------
`frontend/src/types/treeUi.contract.test.ts` — 13 tests using TS
`satisfies` for compile-time shape verification + runtime sanity for
the optional-field defaults. The "contract" is that backend payloads
matching the documented PR2 shape produce usable typed values; if
the frontend types drift, the satisfies clauses fail at compile time.
Test infra bug fixed
--------------------
Latent bug in tsconfig.test.json: the file `extends` tsconfig.json,
which has `"exclude": ["src/**/*.test.ts", ...]`. Per TypeScript's
extends semantics, the inherited `exclude` keeps applying unless the
child overrides it — so the child's `include` patterns were no-ops
and `tsc -p tsconfig.test.json` compiled only jest.config.ts. ts-jest
doesn't type-check at run time, so test-file type drift went uncaught
indefinitely.
Fix: add `"exclude": []` to tsconfig.test.json so the `include`
patterns take effect. The PR3a contract tests rely on this:
running `npx tsc -p tsconfig.test.json` is what proves the
`satisfies` clauses are actually checked.
Pre-existing test type errors surfaced
--------------------------------------
The fix surfaces pre-existing type errors in other test files
(api.test.ts, AttackHistory.test.tsx, services/api.ts) that have
been latent since the config bug was introduced. These are
intentionally NOT fixed in this PR — they're unrelated to the
tree-UI work and need their own focused fix. `npm run type-check`
(the existing CI-wired script targeting tsconfig.json) is unchanged
and continues to pass. The tree-UI test surface is validated via
`npx tsc -p tsconfig.test.json | grep treeUi.contract` (clean).
TDD
---
Wrote treeUi.contract.test.ts first, ran type-check, watched 9
specific type errors fire ("Property 'X' does not exist on type Y" /
"Module has no exported member 'Z'"). Added the types; type errors
cleared; 13 runtime tests pass; existing 662 frontend tests
unaffected; lint clean; main `npm run type-check` clean.
PR sequencing
-------------
PR3a is a pure frontend type extension; nothing else consumes the
new types yet. PR3b (tree-UI domain types — ConversationTree,
ConversationTreeNode, ExecutionRecord, WaveEvent, etc.) will follow,
then PR4 (runner core) will consume both PR3a and PR3b types to
implement the V1.0 dispatch loop.
V1.0 tree-UI PR3b. Translates the design doc set into TypeScript:
type model from doc/gui/design/01_tree_primitives.md §4–§6 / §13
(data model, lifecycle, propagation, workspace, undo) and runner
interfaces from doc/gui/design/03_runner.md §2 / §6 (Runner,
RunnerStateSink, CostGuardrail, CrossTabLockManager, WaveEvent).
What ships
----------
`frontend/src/runner/treeTypes.ts` (single file; PR4 will add
`runner.ts` + helpers alongside it):
- Branded id types `ConversationTreeId`, `ConversationTreeNodeId` —
type-level disambiguation with zero runtime cost. Catches "passed
node id where tree id was expected" at compile time.
- Lifecycle: `NodeState` (the 7 values from §6.1), `NodeFailureClass`
(transient / rate_limited / permanent / blocked per §6.1 + §3.3a),
`ApiErrorReason` (the structured `{message; failure_class}` sink
reason from §3.3a `_format_api_error`).
- Shared types: `PromptDataType`, `ConverterRef` (stored id or inline
spec per §4.6), `PieceSpec`, `WaveTriggerKind` (the closed V1.0 +
V1.1/V2 enum from §6.2), `ExecutionRecord` (with the rev-18 timing
triple `dispatchedAt` / `targetFirstByteAt` / `completedAt`),
`ReflogEntry` (per-tree pinned wrapper around immutable
ExecutionRecord per §6.5 sharing semantics).
- Node taxonomy: `ConversationTreeNodeBase` + six discriminated
variants (`RootPromptNode`, `ImportMessageNode`, `UserTurnNode`,
`SendNode`, `FanNode`, `ScoreNode`) → `ConversationTreeNode` union
discriminated by `kind`. `NodeParams` helper alias for the undo
system's snapshot needs.
- FanNode surface: `FanAxis` (V1.0 attempt+converter, V1.1+ the four
others), `FanVariant` (discriminated union over axis with per-axis
payload shape). `promotedChildSlotIndex` + `deletedSlotIndices`
tombstone array per §5.1 invariants 2 + 4.
- Edge: `ConversationTreeEdge` with the slotIndex fan-discriminator
that MUST be in the resolved-input hash per §5.1 microsoft#4.
- Undo: `UndoOp` discriminated union per §6.9 with the rev-16
state-snapshot widening (`editParams` carries `priorState` +
`priorDescendantStates` so undo restores the §6.3-cascade state, not
just the named field; same for `makeCurrent` per §6.7 step 4).
- Tree container: `ConversationTree` (with `parentConversationTreeId` /
`parentSourceConversationId` / `undoStack`).
- Workspace: `Workspace` (V1.0 minimal: `currentTree`, `recentTreeIds`,
`settings`) + `WorkspaceSettings` (`reflogCapPerNode`,
`confirmThresholdCount`, `suppressConfirmModalThisSession`).
- Wave: `WaveEvent` discriminated union over `kind` (start /
node_complete / complete / busy / queued / reflog_eviction /
operator_tag_required) with required `emittedAt: string` per rev-18
Finding C.1. `complete.summary.failed` bucketed by class per rev-16
Findings 2+3; `blocked` is in-flight-cascade victims; `cancelled`
is operator wave-abort; `reflog_evicted` rolls up wave-time evictions.
- Runner interfaces: `Runner` (6 entry points: refresh* + cancelWave
+ cancelQueued + retryFailedNodes), `RunnerStateSink` (the runner's
sole React-state mutation surface with the rev-15 reason semantics:
string/ApiErrorReason/null/omitted, plus missing-node tolerance),
`CostGuardrail`, `CrossTabLockManager` (BroadcastChannel advisory
lock per §10.4).
`frontend/src/runner/treeTypes.contract.test.ts` — 35 tests using
`satisfies` for compile-time shape verification + runtime sanity for
each variant. Validates discriminator narrowing for the 6 node kinds
(switch-on-kind gives type-safe access to params[…]), the 6
FanVariant axes, the 7 WaveEvent kinds, and the 5 UndoOp kinds.
TDD
---
Wrote treeTypes.contract.test.ts first (importing 35 symbols from a
nonexistent module). Type-check `tsc -p tsconfig.test.json` flagged
"Cannot find module './treeTypes'" — the expected red. Created
treeTypes.ts with each named symbol; cleared an unused-import slip
(stray `./converterRef` reference); type-check returned clean for the
treeTypes.contract test surface. Runtime suite: 35 passed.
Aggregate frontend: 697 passed (+35), no regression; main
type-check + lint clean.
Scope discipline
----------------
This commit is type definitions + interface declarations only. No
runner implementation; PR4 will land that in `runner.ts` alongside
this file and consume both PR3a (API types) and PR3b (domain types).
The deliberate split avoids a 4000-line PR4 by getting the type
surface reviewer-stable first.
…(PR4a)
V1.0 tree-UI PR4a — first slice of the runner. Pure functions over the
ConversationTree that compute which leaf Sends are dispatchable for a
given wave, plus the retry-failed pre-readiness demotion that makes
[Retry failed] waves work.
What ships
----------
`frontend/src/runner/readiness.ts`:
- `findLeafSends(tree)` — every SendNode with no SendNode descendant.
UserTurn / Fan / Score descendants do NOT make a Send interior (per
the §2 vocabulary). Orphan Sends (Send with no children) are leaves
per 03 §3.2.
- `isLeafSend(tree, nodeId)` — predicate counterpart to the above.
- `computeReady(tree, S)` — the §3.1 readiness rule literally:
leaf Sends in S whose every SEND ancestor has state in {edited,
stale, running, clean}. Interior Sends never enter ready (they
regenerate as part of their leaf's sequence per §3.2). Failed /
cancelled SEND ancestors block the leaf — the rev-15 Finding 4
anti-amplification rule that prevents single-5xx-cascades-to-N-
retries against rate-limited targets.
- `buildSForTree` / `buildSForSubtree` / `buildSForNode` — S
construction per the three refresh scopes from 03 §2.1.
- `demoteRetryFailedNodes(tree, S, sink)` — §3.1 step 2b: for
`waveTriggerKind === 'retry_failed'` only, flip every S-member
{failed, cancelled} node back to `stale` and clear its execution
BEFORE the readiness rule runs. Uses the `null` reason sentinel
(per 03 §2.2) so `lastError` clears rather than lingers. This is
the mechanism that lets the [Retry failed] toast button's wave
actually re-dispatch failed leaves — without it the ancestor
allowlist would silently exclude them and the wave would no-op.
`frontend/src/runner/testHelpers.ts` — shared across all PR4 sub-PRs:
- Builders per node kind (`mkRoot`, `mkUserTurn`, `mkSend`, `mkFan`,
`mkScore`, `mkImport`) that fill boilerplate (timestamps, empty
history, default state) so tests name only the fields under test.
- `mkEdge`, `mkTree` for graph construction. `mkTree` derives edges
from `parentId` if not supplied; tests needing explicit fan
slotIndices pass `overrides.edges`.
- `mkExecution` for ExecutionRecord fixtures.
- `mkMockSink` — recording sink that captures every call. Helpers
`callsOf(method)`, `events()`, `stateChanges(nodeId)` keep test
assertions terse.
`frontend/src/runner/readiness.test.ts` — 30 tests covering:
- findLeafSends: 7 cases including the orphan-Send edge case, fan
children, Crescendo-style interior detection, Score-only descendants
not making Send interior, multi-depth trees.
- computeReady: 11 cases including the rev-15 anti-amplification
cases (failed/cancelled ancestor blocks; edited/stale/running/clean
ancestor admits), Fan/Score-transparent ancestor walking, the leaf-
not-in-S edge case, empty S.
- buildSForTree / Subtree / Node: 5 cases including the running/draft
state exclusion, subtree scope, single-node scope.
- demoteRetryFailedNodes: 5 cases including the null reason sentinel
(clears lastError vs leaves it stale), ignoring non-{failed,
cancelled} S members, ignoring nodes outside S, and the integration
case showing computeReady admits a previously-blocked leaf after
demotion.
- 1 fan-slot-aware traversal smoke test confirming explicit
slotIndex edges don't perturb leaf detection.
TDD + scope discipline
----------------------
Tests written first against a nonexistent ./readiness module (TS2307
+ implicit-any cascade as the expected red). Implementation made the
file resolve and the types narrow; all 30 pass. One real bug caught
in the helpers: `as const` on the shared `base()` object made
`executionHistory: []` resolve to `readonly []`, unassignable to the
mutable `ReflogEntry[]` field shape — fixed in the same PR (since
the helper is its own file landing here for the first time).
Test infra-bug from PR3a stays surfaced (pre-existing test type
errors in unrelated files); the runner directory type-checks clean
under `tsc -p tsconfig.test.json`. Aggregate frontend: 727 tests
pass (+30), no regression; main type-check + lint clean.
Next slice
----------
PR4b: `resolvePathPartition` — the pure function that walks a leaf's
root-to-leaf path and partitions Sends into clean prefix (load into
prepended_conversation as historical context) and fresh suffix (the
N add_message calls). Builds on testHelpers; no dependency on
readiness.ts. Will exercise the §5.1 invariant 5 Fan/Score
transparency on the path-walk side.
…w (PR4a.1)
Three concrete fixes from the post-PR4a rubber-duck pass (Opus 4.7 extra
high reasoning). Each addresses a defect that would have bitten PR4b
or rotted into bad test hygiene.
1. mkTree: auto-number fan-child edge slotIndex
----------------------------------------------
Before, every derived edge got `slotIndex: 0` — including all children
of a FanNode. The readiness tests don't read edges so they passed, but
PR4b's `resolvePathPartition` will read `edge.slotIndex` to drive the
fan-child variant resolution. Fixtures built via `mkTree` would have
handed PR4b bogus shapes (all attempt-fan siblings sharing slot 0,
violating the §5.1 slot-stability invariant), producing test failures
that look like resolver bugs but are actually fixture bugs.
The fix tracks which parent ids are FanNodes and auto-numbers their
child edges by ordinal. Non-fan parents stay on slot 0. Tests needing
explicit slot indices still pass `overrides.edges`.
2. MockSink: delete stateChanges helper
-------------------------------------
The `stateChanges(nodeId)` helper baked in a query shape — "give me
the sequence of states for this node" — that tempted tests to assert
exact transition sequences (`expect(stateChanges('s')).toEqual(['running', 'clean'])`).
That kind of assertion locks the runner into a specific transition
order and breaks the moment a legitimate intermediate transition is
added. `events()` was also dropped: thin sugar for `callsOf('emitWaveEvent')`
with no payoff. Callers can compose `callsOf('setNodeState').filter(...)`
when they need a per-node view, which is rare enough to not deserve
a helper.
Also dropped the unused `ALL_WAVE_TRIGGER_KINDS` re-export (kitchen
sink waiting to grow).
3. demoteRetryFailedNodes test: actually compose with computeReady
----------------------------------------------------------------
The previous "integration check" test built TWO trees by hand — one
with failed states and one with the desired post-demotion states —
then ran computeReady against the second. The demoter could have
written 'staale' (typo) and the test would still pass because the
hand-built tree had the correct 'stale' state. Test-that-passes,
not test-that-proves.
The new version runs the demoter against a recording sink, projects
the recorded `setNodeState` calls onto a copy of the original tree,
then runs computeReady over the projection. If the demoter writes
anything but `stale`, the projected tree differs from the
hand-built one and computeReady's result diverges. The composition
is honest now.
Verification: 30 readiness tests pass, 727 frontend tests pass, no
regression. lint + type-check clean.
Other rubber-duck items not in this commit (filed for later):
- Trim contract-test runtime expect boilerplate (PR4a.2, this cycle).
- Add narrow CI gate for contract-test type-check (PR4a.3, this cycle).
- Tighten DTO original_prompt_id to non-nullable str + add contract
test proving the validator guarantee (separate decision; the doc
spec'd nullable).
- Q.S.1 cost-cliff regression test (lands in PR4c).
- Drop new doc citations starting PR4b; full citation strip at end
of V1.0.
… (PR4a.2)
Two coupled changes from the rubber-duck review.
1. Trim `treeTypes.contract.test.ts` runtime-expect boilerplate
------------------------------------------------------------
The previous file was 694 lines with 35 tests, most matching the
shape `const x = { ... } satisfies SomeType; expect(x.field).toBe(...)`.
The `satisfies` clause was doing all the real work; the `expect` was
asserting a literal typed two lines up — type theater that would
never fail unless someone deliberately broke the literal.
Rewrote to 11 tests focused on what genuinely catches future bugs:
- Four discriminator-narrowing tests (ConversationTreeNode kind,
FanVariant axis, WaveEvent kind, UndoOp kind). Each exercises the
switch-on-discriminator pattern the runner / UI rely on; lost
narrowing here would silently degrade dispatch sites to `any`.
- Three default-value contracts where null-vs-absent matters at
runtime (SendNode empty params → inherits target; FanNode
promotedChildSlotIndex `null` vs `0` for Pick semantics;
ExecutionRecord timing-triple nullability for pre-target-call
failures).
- Three interface-shape stubs (Runner, RunnerStateSink with all
three reason shapes, CostGuardrail + CrossTabLockManager).
- One forward-compat assertion (WaveTriggerKind admits V1.1+
markers).
Plus a short block of type-only assertions (`type _Assert... = X
extends Y ? true : never`) for structural drift that runtime tests
don't reach (branded id types on ConversationTree, ReflogEntry
shape, ConverterRef union, Workspace shape).
Net: -614 / +383 lines. Trim ratio ~50%, but the kept tests are
genuinely load-bearing.
2. Narrow CI gate for the satisfies clauses
-----------------------------------------
PR3a added `"exclude": []` to tsconfig.test.json so the contract
tests' satisfies clauses type-check. But CI (frontend_tests.yml)
only runs `npx tsc --noEmit` against tsconfig.json — the satisfies
clauses were unenforced in CI. Type theater.
Per-user direction (narrow gate), added tsconfig.contract.json that
includes ONLY the tree-UI contract files + their type dependencies
(treeUi.contract.test.ts, treeTypes.contract.test.ts, treeTypes.ts,
testHelpers.ts, types/index.ts). Wired:
- new `npm run type-check:contract` script for local DX
- new CI step `Run tree-UI contract type check` in frontend_tests.yml
The pre-existing test type errors that PR3a surfaced
(126 errors across 6 component-test files) stay latent — they
predate this work and fixing them would balloon scope. They are not
gated and not regressed by this change.
3. Real bug found by the trim
---------------------------
The new "ExecutionRecord timing triple admits null per-field" test
failed against `mkExecution` because `overrides.dispatchedAt ?? ISO_FIXED`
collapses explicit `null` overrides into the default ISO value. Fixed
mkExecution to use spread-merge (`{ ...defaults, ...overrides }`),
which preserves `null` distinct from `undefined`. The kind of bug
no operator would have seen until PR4c started constructing failure-
path execution records — caught at the right layer.
Verification: 703 frontend tests pass (was 727; trim cut 24 trivial
tests), lint clean, both type-check + type-check:contract green.
Other rubber-duck items still pending:
- DTO original_prompt_id nullability tightening (separate concern;
the doc spec'd nullable, deferred to a focused discussion).
- Q.S.1 cost-cliff regression test (lands in PR4c).
- Stop adding new doc citations starting PR4b.
- Full citation strip at end of V1.0.
…lker (PR4b)
Second slice of the runner. Pure function over the tree: walks a leaf
Send's root-to-leaf path and produces the dispatch plan that PR4c will
turn into one `create_attack` + N `add_message` calls.
What ships
----------
`frontend/src/runner/partition.ts`:
- `rootToLeafPath(tree, leafId)` — walks parents back to the root, reverses.
Throws if `leafId` is not in the tree.
- `isStaleForResolver(send)` — predicate: state in {edited, stale, failed,
cancelled} OR execution is null. The null clause is the safety net for
failed/cancelled (which null their execution by contract) and for
freshly-added draft Sends that have no execution yet.
- `resolvePathPartition(tree, leafId)` — the main walker. Output:
- `prepended: PrependedMessageRequest[]` — turns from the clean prefix
(Sends whose params still match their executions). Each clean Send
contributes its input UserTurn message + its stored assistant
response message; both load into `prepended_conversation`.
- `freshSuffix: FreshSuffixEntry[]` — the first stale Send and
everything after, down to the leaf. Each entry is (userTurn,
fanVariant, sendNode); PR4c turns each into one `add_message`.
- `treePathSegments: Array<[FanAxis, number]>` — (axis, slotIndex)
pairs for every Fan ancestor, in topo order. PR4c's _build_labels
will JSON-encode this into the `tree_path` AR label.
- `target: string` — the resolved target_registry_name (per-Send
override wins over root's value).
Notable shape decisions
-----------------------
- Synthetic UserTurn from Root. When a Send's first non-Fan, non-Score
ancestor is the RootPromptNode (i.e., no operator-authored UserTurn
between root and Send), the walker synthesizes a UserTurn-shaped
object carrying root's text + attachments. Marked with `synthetic:
true` so PR4c can detect and avoid double-wrapping; the dispatch
code can read uniform fields (role, text, attachments) regardless.
- `fanVariant` on FreshSuffixEntry vs. `treePathSegments`. Two different
things on purpose:
- `treePathSegments` records every Fan ancestor on the path. Used
for the `tree_path` label (round-trips fan structure for reload).
- `fanVariant` carries the variant the runner needs at dispatch
time *for this specific Send*. Set when a Fan sits between the
Send's input UserTurn and the Send itself (the attempt-axis-
directly-above-Send pattern). Cleared by the intervening
UserTurn for the converter-axis-with-per-child-UserTurn pattern,
where the variant data lives on the child UserTurn's authored
`converterPipeline` (fan-child materialization at create time).
Convergent: tree_path is always present; fanVariant is only present
when the runner needs it to disambiguate the Send beyond what's
already on the input UserTurn.
- Clean-prefix assistant message carries `original_prompt_id` per piece
but defers the actual piece content to PR4c's piece cache. The
partition is pure tree-walking; piece data lives in a separate cache
the dispatcher hydrates from `GET /attacks/{id}/messages` at wave start.
- ImportMessageNode on the dispatch path throws. V1.0 does not extend
imported context via the runner's dispatch loop; if a future caller
wires that path, this throw makes the gap loud rather than silent.
TDD
---
Tests written first against a nonexistent ./partition module
(TS2307 + implicit-any cascade). 25 cases covering:
- rootToLeafPath: topo order, Fan/Score ancestors included, error on
unknown id.
- isStaleForResolver: each of {edited, stale, failed, cancelled},
no-execution case, clean-with-execution case, running-with-null.
- Single-Send chain (all-stale).
- Root-promoted-to-UserTurn (no UT between root and first Send).
- systemPrompt → leading system message; absent → no system message.
- All-clean upstream + edited leaf: prepended = (user-turn + assistant-
response), freshSuffix = [leaf].
- Stale interior + leaf: prefix empty, both stales in freshSuffix.
- Clean prefix + stale interior + leaf: prefix loaded, both stales
in freshSuffix.
- Defensive: failed-with-execution still goes to freshSuffix (state
check wins, retries always re-dispatch).
- Fan(attempt)-directly-above-Send: input UT is from above the Fan,
fanVariant captures (axis, slot), tree_path has one segment.
- Fan(converter)-with-per-child-UT: input UT is the per-child UT,
fanVariant null (data on child UT's pipeline), tree_path has one segment.
- Score ancestor: pass-through; UT and variant unchanged.
- Nested fans: tree_path accumulates in topo order.
- SendNode target override wins; absence inherits from root.
- Throws on non-Send target.
- Throws on interior-Send target (the runner's dispatch loop is
documented as only calling this for leaves; precondition fails loud).
- tree_path produces JSON-serializable shape; empty array for fan-
less leaf.
One real design clarification surfaced during TDD: the converter-fan
test originally asserted `freshSuffix.fanVariant = (converter, slot)`,
which contradicts the spec's variant-clearing-on-UserTurn rule. Fixed
the test to reflect the right semantics (variant data lives on the
materialized child UserTurn's pipeline; fanVariant is null at that
Send; tree_path still captures the fan ancestor for label round-trip).
Verification: 728 frontend tests pass (+25), no regression. lint,
type-check, type-check:contract all clean.
Next slice
----------
PR4c will wire the dispatch sequence (one `create_attack` + N
`add_message` calls against a mock API client) consuming the partition
output, plus `_build_labels` (with the tree_path JSON-encoding from
treePathSegments), `_format_api_error` failure classification, the
200-message cap short-circuit, the labels-divergence invariant test,
and the Q.S.1 cost-cliff regression test that the rubber-duck flagged.
Pure helpers the dispatcher (PR4c2) calls per leaf. Split out so the
orchestrator stays small and these are testable without any API-client
mocking. No I/O; no React.
What ships
----------
`frontend/src/runner/dispatchHelpers.ts`:
- `buildLabels(args)` — the source of the labels-divergence invariant.
Produces the Record<string, string> attached to every create_attack
and add_message call in one leaf's dispatch sequence. The dispatcher
calls this once per dispatch and reuses the result; identical labels
across all N+1 calls fall out by construction.
Schema:
operator (required; hard-asserts non-empty)
operation (empty string permitted)
conversation_tree_id (stringified)
wave_id (uuid from the wave)
wave_trigger_kind (the closed-enum value)
tree_path (JSON-encoded array of [axis, slot])
parent_conversation_tree_id (OMITTED when null; written only when
the tree is a clone)
The "omit-on-null" rule for parent_conversation_tree_id is a real
contract: writing the empty string would surface a self-parent in
History "Open clones of T" — actively wrong. Omission is the honest
"no parent" signal.
The empty-operator hard-assert is defense-in-depth. The runner's
entry-point shim's tag-hygiene gate is the load-bearing check; the
assert exists so a future refactor that bypasses the gate panics
rather than silently writing operator:'' ARs (which destroy audit
attribution).
- `parseTreePathLabel(label)` + `isTreePathLabelValid(label)` — JSON
decoder + validator for the tree_path round-trip. Fail-soft on
malformed input (empty array) so older clients encountering a
future encoding don't hard-crash. The validator is for tests /
defensive code paths that want to distinguish well-formed empty
from malformed.
- `formatApiError(error, callName)` — failure-class classification.
Maps an ApiError (already-normalized by services/errors.ts) into
one of the four NodeFailureClass values:
transient : 5xx, network, timeout (auto-retry-eligible)
rate_limited : HTTP 429 + provider-specific (Anthropic 529,
detail-body strings 'rate_limit_exceeded' or
'overloaded_error'). Retry-gated in UX until the
operator manually re-triggers.
permanent : 4xx other than 429. Operator-fix required.
blocked : runner-synthesized for in-flight cascade victims;
NOT produced by formatApiError (cascade lives in
the dispatcher, PR4d).
Defaults to `transient` for unclassifiable shapes. A wrongly-
classified transient gives an unhelpful but harmless Retry click;
a wrongly-classified permanent silently locks the operator out of
recovery. Safer default.
Provider-specific detection lives in a small private registry. The
V1.x plan is to push detection to the backend (which knows each
target's provider) once token-bucket throttling lands; until then,
client-side registry keeps the runner self-contained.
TDD
---
Tests written first against a nonexistent ./dispatchHelpers module
(TS2307 expected). 26 cases covering:
- buildLabels: required keys present, treePath JSON encoded,
operation='' permitted, parent_conversation_tree_id omitted-on-null
vs written, empty-operator throws, identical inputs produce
deep-equal labels (the divergence invariant source).
- tree_path encoding round-trip: build → parse → equal; '[]' (not
absent) for fan-less; fail-soft on absent/empty/malformed/
wrong-typed input.
- formatApiError: transient (network, timeout, 5xx), rate_limited
(429, 529, detail-body strings), permanent (4xx, operator-mismatch
body, 401/403), unknown-status defaults to transient, message
shape includes callName/status/detail.
26 pass first try (one lint nit fixed: unused NodeFailureClass type
import — the value-returning function uses the inferred ApiErrorReason
type, not the underlying class union).
Aggregate frontend: 754 tests pass (+26), no regression. lint,
type-check, type-check:contract all clean.
Next slice
----------
PR4c2: the dispatch orchestrator. Consumes the partition output
(PR4b) + these helpers + a mocked API client to drive one leaf's
`create_attack` + N `add_message` sequence. Tests the 200-cap
short-circuit, the labels-divergence invariant at the call site,
the Q.S.1 cost-cliff regression (60-leaf attempt-fan with 10-deep
shared stale prefix = 600 add_message calls under V1.0's no-
intra-wave-memoization rule), and the mid-chain partial-commit
failure semantics.
…ssages (PR4c2)
The heart of one leaf's HTTP lifecycle. Consumes PR4b's resolvePathPartition
output + PR4c1's helpers + a mocked-or-real RunnerAttacksApi to drive the
backend interaction. The only place in the runner that talks to the API.
What ships
----------
`frontend/src/runner/dispatch.ts`:
- `RunnerAttacksApi` interface — the 2-method slice of the existing
`attacksApi` the runner depends on. Production wires this to
services/api.ts; tests pass a recording mock.
- `dispatchLeaf(args)` — orchestrates one leaf:
1. Defense-in-depth: throws on empty operator (the entry-point
shim's tag-hygiene gate is upstream; this fires if a future
refactor bypasses the gate).
2. Calls `resolvePathPartition` to compute clean prefix + fresh
suffix + tree_path segments + resolved target.
3. 200-message cap short-circuit. Fails the leaf with a permanent
failure-class and a recovery-pointing message; NO backend call
fires. Recovery is branch-from-midpoint.
4. Builds labels ONCE (the labels-divergence invariant source).
5. Marks every fresh-suffix Send `running` atomically so siblings
observing in-flight state see them together.
6. `create_attack` with prepended_conversation + labels.
7. For each fresh-suffix entry, `add_message` (send=true) with
the same labels + the resolved converter_ids; extracts new
assistant pieces by turn_number diff; records ExecutionRecord
on the Send; flips to `clean`.
8. On API error mid-sequence: failing Send → failed (execution
cleared); later Sends roll back to `stale` (executions cleared);
earlier successful Sends keep their clean state + executions.
- `LeafDispatchOutcome` discriminated union:
`{ kind: 'success', leafId, callsIssued }`
`{ kind: 'failed', leafId, failedNodeId, failureClass, partialAttackResultId }`
The cascade-on-failure layer (PR4d) will consume this to drop sibling
leaves whose paths include the failed ancestor.
Notable shape decisions
-----------------------
- `asApiError` private helper. The shared `toApiError` from services/
errors normalizes axios + Error + string throws but treats an
already-normalized ApiError as an unknown object (falling into the
"anything else" branch that loses the status code). The dispatcher
catches both raw axios errors (production) and pre-normalized
ApiErrors (tests, upstream layers that re-throw). The passthrough
fixes both without modifying the shared helper.
- New pieces extracted via turn_number diff. `AddMessageResponse.messages.messages`
carries the ENTIRE conversation; the dispatcher holds a per-sequence
`priorMaxTurnNumber` watermark and collects assistant pieces from
turns strictly above it. Bounded O(messages-per-AR) per call,
hard-capped at 200 by the backend; cheap.
- ExecutionRecord built in-runner. The runner mints the executionId
via `crypto.randomUUID()` (with a Math.random() fallback for very
old browsers); records dispatchedAt / targetFirstByteAt / completedAt
as the same timestamp (single-turn synchronous dispatch; full
per-call timing is a streaming-target enhancement out of V1.0 scope).
- The 200-cap is enforced on `prepended_conversation` ONLY, matching
the backend Pydantic max_length=200. add_messages extend the AR's
conversation past 200 messages cleanly; only the clean-prefix load
trips the cap.
TDD
---
Tests written first against a nonexistent ./dispatch module. 16 tests:
- Happy path (single-Send chain): 1 create_attack + 1 add_message;
correct request shapes; running→clean transition; ExecutionRecord
carries waveId/waveTriggerKind/AR id/conversation id/pieceIds.
- Multi-Send chain with clean prefix: prepended carries (user, assistant)
pairs from each clean Send; one add_message per stale Send in path
order; clean Sends are untouched.
- Multi-stale chain: each stale Send becomes its own add_message.
- Labels-divergence invariant at the CALL SITE: 4-stale chain produces
5 requests; all five labels dicts are deep-equal. Plus parent_-
conversation_tree_id written only on clones.
- tree_path label populated from fan ancestors (axis, slot).
- 200-cap short-circuit: 101 clean Sends → 202 prepended turns; leaf
fails with permanent class and recovery-pointing message; no backend
call fires.
- create_attack failure: every fresh-suffix Send fails; no add_message.
- Mid-chain add_message failure: failed Send → failed; later Sends →
stale; earlier Sends stay clean; partialAttackResultId surfaced.
- Classification round-trip: 429 → rate_limited; 400+operator-mismatch
→ permanent.
- Tag-hygiene defense-in-depth: empty operator throws synchronously.
- The Q.S.1 cost-cliff REGRESSION: 60 sibling leaves with a 10-deep
shared stale prefix produce 60 create_attacks + 660 add_messages =
720 total calls. Pins V1.0's no-intra-wave-memoization invariant
via test rather than absence-of-code; a well-meaning future
contributor who adds memoization sees this drop to ~71 and the
assertion fires loudly.
- Outcome shape: success carries leafId + callsIssued; failure carries
failedNodeId + failureClass + partialAttackResultId (null when
create_attack failed before any AR was created).
Three real defects surfaced during TDD:
1. `toApiError` doesn't recognize already-normalized ApiError throws
→ all classification tests returned 'transient'. Fixed via asApiError
passthrough.
2. The Q.S.1 test originally had `Fan → Send` directly (no UserTurn
between), which violates the §5.1 microsoft#5 invariant and makes the resolver
throw. Real test bug: rewrote to put a UserTurn above the Fan (the
correct attempt-fan shape where all siblings share an input UT).
3. mkTree's auto-numbered fan-child edges (PR4a.1) earn their keep
here: without that fix, all 60 fan-children would have shared
slot 0 and the partition's tree_path would have been wrong.
Verification: 770 frontend tests pass (+16), no regression. lint,
type-check, type-check:contract all clean.
Next slice
----------
PR4d: the in-flight cascade. When `dispatchLeaf` returns `kind:
'failed'`, the dispatch loop drops every sibling leaf in `ready`
whose root-to-leaf path includes the failed Send → `stale` with
`failure_class: 'blocked'`. Plus `cancelWave(treeId)` (flips a per-wave
flag at `ready.popNext()` boundaries) and `cancelQueued(treeId)`
(drops queued waves without aborting the active one).
Wraps PR4c2's per-leaf dispatcher in the wave-level loop. Owns:
- the concurrency cap (maxParallel, default 4)
- in-flight cascade: when a leaf's dispatch fails on a shared interior
Send, not-yet-dispatched siblings drop to `blocked` rather than
independently retrying the same failure
- operator cancellation via a per-wave controller checked at each
ready-pop boundary; in-flight HTTP completes (V1.0 UI-level cancel
contract), not-yet-dispatched leaves transition to `cancelled`
- wave-event emission (start with estimated call count, one
node_complete per dispatched leaf, complete with bucketed summary)
What ships
----------
`frontend/src/runner/wave.ts`:
- `WaveDispatchController` + `createWaveController()` — factory for the
per-wave cancel handle. PR4e's entry-point shim creates one per
wave and registers it in the active-wave map so `runner.cancelWave(treeId)`
can flip it.
- `WaveSummary` — the wave-complete tally shape. Mirrors
`WaveEvent.complete.summary` (succeeded, failed bucketed by class,
blocked, cancelled, reflog_evicted) and is also the return value of
runWave so callers can read it without subscribing to the event stream.
- `runWave(args)` — the loop:
1. Compute initial ready set via PR4a's computeReady; estimate calls
upfront for the start event.
2. Emit `start` event with estimatedCalls + wave metadata.
3. Drain ready into inflight up to maxParallel; wait on Promise.race;
on each completion tally outcome + emit node_complete; on failure
run the cascade.
4. On cancellation: skip further picks; let in-flight complete
naturally (their natural outcomes are tallied, not clobbered with
`cancelled` — the execution-clobber gate).
5. Cancel-tally: leaves still in `remaining` after the loop exits via
cancellation get marked cancelled in the sink + outcome map.
6. Emit `complete` event with the bucketed summary; return it.
Notable shape decisions
-----------------------
- Cascade is "drop from ready", not "drop from running". When a leaf's
dispatch fails on an interior Send, only siblings still in the
remaining-set get blocked. Siblings already in flight continue and
may independently fail on the same Send (counted as their own
per-leaf transient failures, not as cascade-blocked).
- The cascade walks `remaining`, not the whole tree. Each leaf in
`remaining` is checked against the failedSendId via rootToLeafPath
containment. Bounded by remaining-leaf count × path depth; cheap
at V1.0 sizes.
- Cancel uses a controller flag, not an AbortController. JS native
AbortController is bigger surface than needed and doesn't compose
naturally with `dispatchLeaf`'s in-flight contract (which is "let
the HTTP complete"). The controller's role is purely "don't pick
more leaves"; in-flight ones decide their own fate.
- The cancel-tally step uses `'transient'` as the placeholder
failure_class in lastError because NodeFailureClass doesn't carry
a 'cancelled' variant (cancelled is a NodeState, not a failure class).
The tally bucket separately accounts for cancelled leaves; the
failure_class in lastError is informational only for cancelled state.
- The summary is returned directly AND emitted via the complete event.
PR4e's shim will use the return value; the UI subscribes to the
event stream. Both paths see the same data; no divergence risk
because there's one source of truth (the outcomes map).
Test scaffolding
----------------
A controllable mock API client (`mkControllableApi`) gives tests precise
control over per-leaf timing via deferred Promise resolutions
(`releaseNext` / `failNext` / `pendingCount`). The wave loop is
deterministic given the resolution order; tests use a poll-based
`waitFor` helper to wait for specific predicates rather than fixed
microtask-flush counts (which proved racy when the dispatchLeaf
chain has more await hops than the flush counter — caught and fixed
during this PR's TDD cycle).
TDD
---
Tests written first against a nonexistent ./wave module (TS2307 +
missing-module cascade). 16 tests covering:
- Empty S: no dispatches; zero summary; start + complete only.
- Single leaf happy path: one dispatch; summary.succeeded=1; events
in order (start → node_complete → complete) with the right
metadata (waveId, triggerKind, estimatedCalls, treeId).
- 3-leaf fan all succeed: summary.succeeded=3.
- Concurrency cap: maxParallel=2 with 5 ready leaves; inflight.max=2
throughout the wave.
- Default maxParallel=4 when omitted.
- Wave-event ordering: start first; one node_complete per leaf with
'success' or 'failure' outcome; complete last with bucketed summary.
- In-flight cascade (maxParallel=1, shared interior Send fails):
sibling leaves drop to `blocked` with failure_class='blocked' on
lastError; no further API calls fire; summary 1 failed + 2 blocked.
- In-flight cascade ONLY blocks not-yet-dispatched siblings: with
maxParallel=3 and 3 sibling leaves all dispatching, each leaf's
failure is independent; no blocked count; summary 3 failed.
- Mixed cascade across two fan subtrees sharing one ancestor Send:
cascade walks across fan boundaries; both subtrees' siblings get
blocked.
- Cancel before any dispatch: all leaves cancelled; no API calls.
- Cancel mid-wave: in-flight completes; remaining → cancelled.
- Execution-clobber gate: in-flight leaves that complete AFTER cancel
still record their executions; tallied as succeeded, not cancelled.
- Controller omitted defaults to never-cancelled.
- Summary failure-class bucketing across mixed outcomes (1 each of
transient + rate_limited + permanent).
Two real test-infrastructure defects surfaced during TDD:
1. `flushMicrotasks(4)` was too few hops for the dispatchLeaf chain's
await sequence (await createAttack → await addMessage → record →
wrap → race → loop → pick next → await createAttack → await
addMessage). Bumped default to 32 and added a poll-based `waitFor`
helper for tests that need a specific predicate satisfaction.
2. The original concurrency-cap test tried to interleave releases and
pending-count assertions, which raced the loop. Simplified to drain
all 5 deferreds + assert inflight.max at the end — the invariant
the test was actually trying to prove.
Verification: 786 frontend tests pass (+16), no regression. lint,
type-check, type-check:contract all clean.
Next slice
----------
PR4e: 5-step entry-point shim. Wraps runWave with:
1. Tag-hygiene gate (operator non-empty)
2. Cross-tab lock acquire (mock for now; real BroadcastChannel in PR4f)
3. Cost guardrail modal
4. Per-tree wave queue check
5. Wave start + try/finally lock release
Plus the active-wave map so `runner.cancelWave(treeId)` can lookup the
controller created by the shim. Then PR4f wires the real BroadcastChannel
lock + queue drain, completing the runner.
…V1.0 (PR4d.1)
Critical fix. Rubber-duck reviewer caught: partition.ts's clean-prefix
branch was writing the literal placeholder string `'<deferred: resolved
from piece cache>'` as the `original_value` of every assistant piece
loaded into prepended_conversation. The piece cache the design assumes
(per 03 §3.3a `_load_piece_as_request`) doesn't exist. Every clean-
prefix dispatch in production would send fabricated assistant history
to the LLM — either backend-rejected (validation) or model-accepted
and reasoned-against (silently corrupting target responses).
Tests passed because no assertion looked at `pieces[i].original_value`.
The fix
-------
Two options were viable:
A. Build the piece cache (~150 LOC: new module, wave-start integration,
GET /attacks/{id}/messages per distinct source AR, cache by piece_id,
partition reads from cache). Closes the design's contract; preserves
the per-leaf-edit cost optimum.
B. Drop the clean-prefix branch entirely. Every Send on the path
enters freshSuffix and re-fires against the target. ~10 LOC change.
Operators pay full re-dispatch cost on every wave; correctness is
restored because every assistant message the target sees was
actually generated by the target.
V1.0 ships option B per operator's explicit choice — aligns with Q.S.1
"dumb but correct" V1.0 discipline. The clean-prefix optimization
returns in V1.x with the piece cache.
Operator-visible cost: editing only a leaf at the bottom of a 10-deep
clean chain now costs 11 calls (1 create_attack + 11 add_messages)
instead of the 2 calls the clean-prefix model would have produced.
~5× hot-path regression on edit-leaf-only. The wave-summary's call
count reflects this honestly; the cost-guardrail modal intercepts
expensive refreshes.
What ships
----------
- `partition.ts`: clean-prefix branch removed. Every Send on the path
enters freshSuffix unconditionally. `prepended` carries at most the
system message (when RootPromptNode.params.systemPrompt is set).
Dead helpers deleted: `userTurnMessage`, `assistantResponseMessage`,
the placeholder-string code. File-header docstring rewritten to
document V1.0 reality and the V1.x migration path.
- `isStaleForResolver` kept but re-documented: it's no longer on the
resolver's hot path under V1.0; retained for defensive callers (UI
cost preview, V1.x cache layer).
- `partition.test.ts`: three test cases updated to assert V1.0 behavior
("all-clean upstream", "clean Send + stale interior", "failed Send
with execution"). All now expect empty prepended + every Send in
freshSuffix. The implementation defect that the original tests
silently tolerated (placeholder string in original_value) is now
unreachable.
- `dispatch.test.ts`: "multi-Send chain with clean prefix" rewritten
to assert V1.0's empty-prepended behavior + N add_messages for the
full chain. "200-message cap" rewritten — the cap is unreachable in
V1.0 normal traffic (prepended ≤ 1), so the test now documents
contract (100-deep chain dispatches successfully because prepended
is empty) + leaves a placeholder for the V1.x cache-layer expansion.
Design doc updates
------------------
- `01 §1.2 known limitations` gains a new entry: "No clean-prefix
optimization in V1.0 — every dispatch re-fires the full chain from
the root." Spells out the ~5× hot-path cost, the V1.x migration,
and the correctness rationale. Updates the 200-turn-cap entry to
note the cap is unreachable in V1.0 by construction.
- `03 §4.1` gains a "V1.0 implementation reality" subsection before
the resolver pseudocode. Names the two options considered, the
choice rationale, the cost trade-off, and the V1.x migration shape.
The pseudocode below is left intact as the eventual V1 model.
Verification: 787 frontend tests pass, no regression. lint, type-check,
type-check:contract all clean. The Q.S.1 cost-cliff test in dispatch.test.ts
remains green at 720 backend calls per 60-leaf attempt-fan wave — the
shared prefix was already stale, so it was already in freshSuffix under
the prior buggy behavior; no math change.
The rubber-duck's other items (asApiError move, synthetic discriminator,
dead helper deletion, cancel-tally reason, reflog_evicted TODO,
dispatcher tag-gate redundancy) land in PR4d.2 to keep this commit
focused on the single load-bearing fix.
Six smaller items from the rubber-duck review bundled into one focused
cleanup. None are behavior changes; each is a layer / discipline fix.
1. Move ApiError pass-through to services/errors.ts
------------------------------------------------
The dispatcher had a private `asApiError` helper that duck-checked the
ApiError shape and passed it through `toApiError`. Reviewer caught:
the shape is OWNED by services/errors.ts; the dispatcher was
re-implementing a check at the wrong layer. Moved the passthrough
into `toApiError` itself as an early-return: any already-normalized
ApiError gets returned verbatim (idempotent). Two new test cases
in errors.test.ts pin (a) the referential-equality pass-through and
(b) that plain objects lacking the ApiError shape still fall into
the unknown-throw branch. The dispatcher's `asApiError` + the
private `isAlreadyApiError` helper are deleted; dispatcher catches
just call `toApiError(raw)` now.
2. Synthetic UserTurn → real discriminator
----------------------------------------
`SyntheticUserTurnFromRoot` carried a `synthetic: true` field that
the dispatcher's `piecesForUserTurn` / `resolvedConverterIds` ducked
against via `(ut as { synthetic?: boolean }).synthetic === true`.
That's a runtime check across a declared union — the worst of both
worlds. Fixed: the synthetic shape now uses `kind:
'synthetic_user_turn_from_root'`, which matches the `kind` field on
real `UserTurnNode` (`'user_turn'`), making the union a real
discriminated union. Consumers narrow via `if (ut.kind ===
'synthetic_user_turn_from_root') ...` and TypeScript narrows the
rest. All `as`-casts deleted.
3. Delete dead helpers
--------------------
- `parseTreePathLabel` and `isTreePathLabelValid` in dispatchHelpers.ts:
no production callers; the reviver was speculative for unrealized
future needs. Tests for both also deleted; one kept-and-rewritten
test pins `buildLabels` output shape ('[]' for fan-less leaves,
JSON-encoded for nested fans) — that's the actual contract.
- `cryptoRandomUuid` Math.random() fallback in dispatch.ts: jsdom +
modern Node both have `crypto.randomUUID`; the fallback was an
unreachable RNG dependency adding coverage gaps. Replaced with a
direct call.
4. Cancel-tally honest reason
---------------------------
`runWave`'s cancel-tally was writing `{ message: '...', failure_class:
'transient' }` on cancelled leaves' `lastError`. State='cancelled' is
what consumers actually read; the structured failure_class on a
not-failed node was confusing. Switched to a plain string reason
`'wave cancelled by operator'`. The sink normalizes string reasons
to transient internally, but consumers reading lastError.failure_class
in a cancelled-state node now see the documented "string was passed"
path rather than a hand-coded structured form.
5. Reflog_evicted TODO
--------------------
`runWave`'s summary hardcodes `reflog_evicted: 0` until the reflog GC
layer lands. Added a TODO(reflog) comment explaining the migration
shape so the placeholder doesn't rot into a forgotten zero.
6. Drop dispatcher's redundant tag-gate assertion
-----------------------------------------------
`dispatchLeaf` had its own `if (!args.operator) throw` check. But
`buildLabels` (called downstream) also asserts the same. Two defense-
in-depth checks for the same precondition is one too many — pick the
source. Kept `buildLabels`'s assert (it's the actual label-build site
where the consequence of a missing operator would silently destroy
audit attribution). Dropped dispatcher's; tests covering the synchronous
throw on empty operator still pass because the throw happens via
buildLabels at the same callsite, returns the same error class.
Verification: 787 frontend tests pass (same count as PR4d.1 — the
new errors.test.ts pass-through tests offset the deleted dead-code
tests). Coverage: 93.22%/85.71%/92.23%/95.2% globally; runner
directory 93.93/85.2/94.2/94.98 — all above the 85/85/90/90 thresholds.
The two remaining gaps in dispatch.ts (200-cap branch + the inline-
converter inner if) are defensible: the cap is unreachable in V1.0
by construction (per PR4d.1) and the inline branch fires only for
ConverterRef shapes that don't carry a `converterId` (rare in V1.0).
Lint, type-check, type-check:contract all clean.
Open rubber-duck items still pending:
- DTO original_prompt_id nullability tightening (V1.0 ships nullable
per the spec; not yet relitigated).
- Citation-strip discipline: still in-progress; partition.ts and
wave.ts source comments retain a few inline section refs that
should clear at end-of-V1.0.
- The `reconcileTransformStates` / `reconcileAllTransforms` calls
(03 §3.3 try/finally + §3.1 step 6) are NOT implemented. PR4e or
a dedicated follow-up needs them or UserTurn/Score nodes that need
state reconciliation after the wave stay stuck `stale`.
- `WaveEvent.operator_tag_required` is in the type union but never
emitted; PR4e's shim wires the gate.
The runner's entry-point shim that wraps runWave with the canonical
five-step ordering from 03 §2.1: tag-hygiene gate → cross-tab lock
acquire → cost-guardrail modal → per-tree wave-queue check → wave
start. Steps 2–5 sit inside try/finally so the lock releases on every
exit path; drain runs OUTSIDE the lock so each drained wave can
acquire its own. This commit also lands the wave-end
reconcileAllTransforms pass (§3.1 step 6) and the buildSForRetry
helper that the retry-failed scope requires.
What ships
- frontend/src/runner/shim.ts
- createRunnerShim(deps) → RunnerShim with refreshNode /
refreshSubtree / refreshTree / retryFailedNodes / cancelWave /
cancelQueued. Per-tree active-wave + queue maps live in the
closure so cancel{Wave,Queued} can look up controllers and
dropped requests.
- The five steps in order:
1. Operator tag missing → emit operator_tag_required, no
acquire, no release. Returns.
2. lockManager.acquire returns 'busy' → emit busy event, return
(no release; we don't hold the lock).
3. costGuardrail.approve(estimatedCalls, triggerKind) → reject
returns via outer finally (lock released).
4. currentWaveByTree.has(treeId) → enqueue {waveId, triggerKind,
scope, leafCount}, emit queued event with queueDepth, return
via outer finally.
5. createWaveController + runWaveStarter → set
currentWaveByTree → await settled → reconcileAllTransforms →
delete currentWaveByTree → outer finally releases lock →
drain queue OUTSIDE the lock.
- cancelWave(treeId): looks up active controller, .cancel(), then
awaits settled (swallowing rejection) so the public contract
"resolves when the wave fully settles" holds.
- cancelQueued(treeId): splices the queue, emits a synthetic
complete event with summary.cancelled = leafCount per dropped
wave (the §10.3 contract — operator sees the queued banner
reconcile).
- Dependencies are injected (operatorProvider, treeProvider, sink,
lockManager, costGuardrail, runWaveStarter, uuid, optional now)
so tests mock every boundary and the shim's orchestration is
the only thing under test.
- frontend/src/runner/reconcile.ts
- reconcileAllTransforms(tree, treeId, sink) walks every
transform-class node (user_turn / fan / score) once and flips
stale → clean when the parent is clean. Single pass; transforms
whose parent flipped this pass stay stale for a follow-up wave
(catches the operator-typical Score-as-Send-sibling case the
per-dispatch path-scoped reconcile cannot reach).
- frontend/src/runner/readiness.ts
- buildSForRetry(tree, nodeIds): S = {nodeIds} ∪ {failed/cancelled
Send ancestors on each input's root-to-leaf path}, deduped.
Walks transparently through UserTurn / Fan / Score ancestors
(only Send state counts). Silently skips missing nodeIds (UI
race tolerance).
- demoteRetryFailedNodes signature evolved from void to return
ConversationTree: fires sink calls (React-state side) AND
returns a new tree with demoted nodes flipped to stale +
execution null + lastError null (the pure-data side runWave's
computeReady reads). Returns the input tree by identity on
no-op so caller memoization stays valid.
Notable shape decisions
- Drain runs OUTSIDE the outer lock-release. The spec's literal
pseudocode in §2.1 nested the drain inside the outer try, which
would deadlock on real BroadcastChannel-keyed locks (re-entered
drained waves would block trying to acquire the same lock). The
correct read is: each drained wave gets its own acquire-release
cycle. Drain reachability after the outer try is the right signal
— every early-exit path (tag-gate, busy, missing tree,
cost-cancel, enqueue) returns from inside the try (bypassing the
drain block); a step-5 exception propagates through the finally
and exits the function before drain runs. No bookkeeping flag
needed.
- waveId minted per shim entry, not per queue request. The §10.3
spec mints one waveId per entry; for an enqueued wave the queued
event carries it. When the queued wave later drains, its
re-entered shim invocation mints a fresh waveId for the dispatch
itself. That's what the literal spec does and what the V1.0 UI
needs (the queue banner reads queueDepth, not per-wave tracking).
- Retry-failed shim DEMOTES via demoteRetryFailedNodes and uses the
RETURNED tree for runWaveStarter. The §3.1 step 2b demotion is
spec'd as inside the dispatch loop, but doing it in the shim
keeps runWave dumb (it doesn't need to know about
waveTriggerKind === 'retry_failed' semantics). The dual sink-call
+ returned-tree shape on demoteRetryFailedNodes is the source of
truth for both surfaces.
- cancelQueued emits the complete event itself. The shim is the
only place that has the per-queued-wave waveId + leafCount; the
mocked runWaveStarter for queued waves never runs, so no other
layer can emit the wave's lifecycle complete event. This matches
the §10.3 contract literally.
- Missing-tree path: silent return. The UI is the only legitimate
caller of these entry points and always passes a treeId that
matches the current tree. If the operator races a tree-delete
against a refresh click, the shim no-ops rather than crashes. The
lock IS released (we acquired before the lookup, per the spec's
step ordering).
- The shim doesn't depend on runWave's wave.ts directly — it takes a
RunWaveStarter dependency (the function signature mirrors
RunWaveArgs minus the sink/api/operation, which production wires
in via a thin adapter). This is how the shim tests assert the
five-step ordering without mocking the whole dispatcher.
TDD narrative
Started with readiness.test.ts: added 8 cases for buildSForRetry
(the entry point's pre-S helper) + 4 cases for the
demoteRetryFailedNodes return-tree contract. RED was tsc TS2724
('buildSForRetry' not exported) + TS2339/TS2345 on the void return.
Implemented in readiness.ts; 43 readiness tests pass (28 prior + 15
new).
reconcile.test.ts next: 12 cases pinning the
transform-flips-when-parent-clean rule, the Send-untouched
invariant, idempotency on clean transforms, and the wide-tree walk
that catches sibling-of-Send Scores. RED was TS2307 on
'./reconcile'. Implemented reconcile.ts; 12/12 green.
shim.test.ts is the bulk of the PR: 39 cases across 12 sections —
tag-gate, lock-busy, cost-cancel, queue enqueue/drain, wave-start
args, lock release on every exit path (success / starter throws /
cost cancel / queue / tag abort / busy abort), S construction per
entry point, waveTriggerKind mapping, cancelWave behavior,
cancelQueued behavior, wave-end reconcile, per-tree isolation,
missing-tree no-op. RED was TS2307 on './shim'. Implemented;
38/39 green on first run.
Defects surfaced during TDD
- "cancelQueued does NOT affect the active wave" timed out on first
pass because shim.refreshTree synchronously returns a Promise that
suspends at the first await (lock acquire); the second invocation
hadn't reached the enqueue step by the time cancelQueued ran, so
the queue was empty and the no-op cancel didn't prevent the
second wave from later draining normally — but the test only
resolveNext'd once, leaving the drained second starter pending
forever. Fix: waitFor 'queued' event before cancelQueued. Real
bug surfaced: the race exists in production too, and operators
who fire cancel-queued before the queued event has emitted will
see the queued wave drain anyway. Acceptable for V1.0 (the
queued event is the UI's signal to enable the cancel chip), but
worth a note for PR4f when real BroadcastChannel-keyed locks
change the timing.
- Initial implementation carried a `waveDispatched` flag through
the outer try/finally to guard the drain. ESLint's
no-useless-assignment rule correctly flagged the initialization
as dead — the drain block is only reachable on the step-5 success
path (returns inside the outer try propagate through finally and
exit; exceptions propagate through both finallys and exit). The
flag was a cargo-culted pattern from C-style exit-code tracking;
JavaScript's try/finally semantics already encode the right
signal in reachability. Dropped the flag.
- Wave-end reconcile uses treeProvider for a fresh lookup rather
than the tree object we built S against. Reason: in production
the wave's sink writes flow into the React state container that
treeProvider reads, so the post-wave snapshot reflects the
dispatcher's Send state-flips and the reconciler walks the
correct world. In tests with closed-over fixed trees, this is a
no-op (same reference returned) — same answer either way.
Verification
Tests: 851 frontend passing (787 prior + 64 new: 15 readiness +
12 reconcile + 39 shim — see test counts below per file). Backend
unchanged (~658 passing).
Lint: clean.
Type-check: clean (main + contract).
Coverage: src/runner directory 94.88 / 87.29 / 94.04 / 96.05
against the 85/85/90/90 thresholds — shim.ts at 98.7/93.33/90.9/
100, reconcile.ts at 94.73/90/100/100, readiness.ts at
94.53/89.04/92.3/96.15. Pre-existing 126 latent test type errors
untouched (the narrow contract-test gate stays green).
Per file:
- readiness.test.ts: 43 tests (28 prior + 15 new)
- reconcile.test.ts: 12 tests (new file)
- shim.test.ts: 39 tests (new file)
Next slice
PR4f: replace mock CrossTabLockManager with real
BroadcastChannel('pyrit-runner') keyed on conversation_tree_id
(per 01 §9.4.3 / 03 §10.4). Real-lock timing will shift the
cancelQueued race noted above — worth retesting the drain
semantics with the polyfilled BroadcastChannel under jsdom.
Together PR4e + PR4f complete the runner core; the rubber-duck
reviewer fires after PR4f per the template.
Open rubber-duck items still pending
- DTO original_prompt_id nullability (since PR3a; not yet
re-litigated).
- Citation-strip discipline (partition.ts + wave.ts inline section
refs still present; end-of-V1.0 strip).
- reconcileTransformStates (path-scoped per-dispatch variant) NOT
implemented in this PR — wave-end reconcileAllTransforms covers
the canvas-stale-transform-after-wave bug; path-scoped variant
is a perf optimization for the incremental UI update that V1.0
can defer.
- CI gate for the 126 latent test type errors (deferred; narrow
contract-test gate stays in place).
- PR1 backward-compat fallback corpus verification (needs prod DB
access).
…mantics (PR4f)
The cross-tab advisory lock implementation per 01 §9.4.3 / 03 §10.4.
Replaces the mock CrossTabLockManager that the shim consumed in PR4e
with a real BroadcastChannel('pyrit-runner')-keyed lock; adds the
`broadcast-channel` npm polyfill for jsdom (simulate mode) so two
LockManager instances in the same jest process talk to each other the
same way two browser tabs talk through the native BroadcastChannel.
Also adds the queue-drain stale-set regression test that pins the
§10.3 "stale-set recomputed at wave-start" contract.
What ships
- frontend/src/runner/crossTabLock.ts
- createBroadcastChannelLockManager(options) returns
BroadcastChannelLockManager — a CrossTabLockManager + close()
+ exposed tabId. Options: channelName (default 'pyrit-runner'),
tabId (auto-mint via crypto.randomUUID), acquireTimeoutMs
(default 50 per §9.4.3), logger (default console), uuid
(replaceable for tests).
- Wire format on the channel (per §9.4.3 rev-10 correctness note):
{ type: 'lock_request', treeId, requestId, tabId }
{ type: 'lock_busy', requestId, holderTabId }
{ type: 'lock_released', treeId }
Request/reply correlation rides on `requestId` — MessagePort
transfer-list does NOT work with BroadcastChannel.
- Single onmessage dispatcher with a set of subscribers; the
persistent holder-response handler and the per-acquire busy
listener both register through it. One handler = consistent
behavior across native and the npm polyfill (which have
different onmessage calling conventions; see Defects below).
- Same-tab reacquire short-circuits: heldLocks.has(treeId) ⇒
{ acquired: true, holderTabId: null } immediately, no message
round-trip. Per the §9.4.3 protocol — otherwise we'd race our
own holder-response handler.
- Graceful degradation when BroadcastChannel is undefined (Safari
≤15.3): warn once, then always-acquired. Operators on legacy
Safari accept the V1.0 fork-bomb risk; everything else keeps
working. acquire/release/close all no-op safely.
- frontend/src/runner/treeTypes.ts
- CrossTabLockManager.acquire return type changed from
'acquired' | 'busy' to a discriminated union:
type LockAcquireResult =
| { acquired: true; holderTabId: null }
| { acquired: false; holderTabId: string }
so the busy reply's holderTabId flows through the shim into
the WaveEvent { kind: 'busy', holderTabId } the UI consumes.
- frontend/src/runner/shim.ts
- Consumes the new LockAcquireResult shape: on !acquired, emit
busy with lockResult.holderTabId (was hard-coded '' in PR4e).
- frontend/src/runner/shim.test.ts
- Updated mkControllableLockManager to take ReadonlyArray<
LockAcquireResult> and default to the acquired shape.
- Tightened the busy-event test to assert holderTabId propagation.
- NEW test: "drained re-entry recomputes S from the LATEST tree
state, not the snapshot at enqueue" — flips treeProvider's
tree between enqueue and drain; asserts the drained wave's
starter call carries S computed from the post-edit tree (per
§10.3 stale-set-recomputed-at-wave-start contract). Pins the
PR4e shim's correctness against the rev-15 reviewer Finding 5
concern that prompted the §10.3 → §2.1 unification.
- frontend/src/setupTests.ts
- Loads the `broadcast-channel` npm polyfill globally with
`enforceOptions({ type: 'simulate' })`. Simulate mode keeps the
transport in-process — required for jest's parallel test workers,
which would otherwise step on each other via the polyfill's
file-RPC default (broadcast-channel/methods/node.js).
- frontend/package.json + package-lock.json
- broadcast-channel ^7.3.0 added as devDependency per spec §9.4.3
("V1.0 commits to polyfilling via the broadcast-channel npm
package (~5 KB) loaded in the jest setup file"). Production
bundles use the browser's native BroadcastChannel; the polyfill
is dev-only.
Notable shape decisions
- Discriminated union for LockAcquireResult instead of a bare
{ acquired: boolean; holderTabId: string | null }. The narrowing
`if (!r.acquired) { r.holderTabId is string }` falls out
automatically; no nullability dance at every callsite.
- Single subscriber-set dispatcher rather than calling
addEventListener/removeEventListener directly on the channel.
Two reasons: (a) the native API supports addEventListener but
the npm polyfill's `onmessage` handler calling convention differs
from native (raw data vs MessageEvent — see Defects). Normalizing
in one place at the top-level onmessage = setter handles both
backends; (b) the protocol has one persistent handler (lock-
request responses) + N ephemeral handlers (per-acquire busy
listeners). The subscriber-set is the natural fit for that
structure.
- simulate-mode polyfill instead of the polyfill's default `node`
mode. The `node` method uses file-based RPC under /tmp — fine
for cross-process tests but a leak vector for jest's
parallel-worker layout. `simulate` is in-process, deterministic,
and ~5ms per message hop (the polyfill's SIMULATE_DELAY_TIME
constant — comfortably below the test's 20ms acquireTimeoutMs).
- The lock manager exposes `tabId` directly (not behind a getter)
so the busy modal's "another tab (id: …)" label can read it
without an extra API. Production passes the manager's tabId to
the modal's "this tab" hint.
- `acquireTimeoutMs` defaults to 50ms per §9.4.3. Configurable for
tests + a future operator preference if the latency proves to
be annoying (V1.x). Imperceptible vs a typical 10+ second wave.
- close() removes the onmessage handler and clears subscribers
before calling channel.close(). The polyfill throws on
double-close; we swallow that defensively (try/catch around
channel.close()).
- The release wire message (`lock_released`) is posted but the
spec's "Wait" auto-acquire flow is NOT wired in V1.0 — that's a
UI affordance (per §9.4.3 "Wait listens for the lock_released
message"). The spec also notes "Refresh anyway" as the operator
override path; both are UI-layer work that lands with PR5/PR6.
The runner's lock-side of the protocol (post on release) is
correct today; the listener side wires in with the UI.
TDD narrative
Started with the shim interface update (LockAcquireResult shape)
to let the lock tests reference the new type. Updated treeTypes.ts
+ shim.ts + shim.test.ts mock + the busy-event assertion in one
pass. Shim suite stayed 39/39 green through the interface change.
Then wrote crossTabLock.test.ts: 13 cases across single-instance
lifecycle, two-instance contention (busy reply, release-then-
reacquire, three-way contention), per-tree isolation, same-tab
reacquire, BroadcastChannel-absent degradation, close-stops-
responding, auto-mint tabId. RED was TS2307 on './crossTabLock'.
Implemented crossTabLock.ts. First run: 9/13 green — all
single-instance + degradation + close tests passed; all four
cross-instance tests failed. The reason was the polyfill calls
`onmessage(data)` with the raw user payload, but the native API
calls `onmessage(MessageEvent)` with a wrapper. My initial
implementation treated the argument as MessageEvent always and
read `.data` — which on the polyfill returned undefined. Fixed
by normalizing at the dispatcher (`instanceof MessageEvent` test
decides whether to unwrap). All 13 green.
Added the drain-stale-set regression test to shim.test.ts. First
run: failed on a test bug (used `sink.calls` instead of
destructured `callsOf` helper). Fixed; 40/40 shim green.
Defects surfaced during TDD
- The `broadcast-channel` npm polyfill's onmessage handler is
called with the RAW user data (not a MessageEvent). Native
BroadcastChannel calls it with a MessageEvent that wraps the
data in `.data`. This is a polyfill bug or design choice (the
polyfill's documentation doesn't surface it loudly), and the
contention tests caught it immediately — the runner's filter
`data.type === 'lock_request'` was reading `.type` off a
MessageEvent (which has `.type === 'message'`, not the wire
`.type === 'lock_request'`), so the holder-response handler
never fired and busy replies never went out. The fix is a
one-liner normalization at the onmessage setter; the runner
code stays clean. Documented in the BroadcastChannelLike type
comment.
- The polyfill's simulate transport adds a 5ms delay per
postMessage. Two hops (A→B request, B→A response) is 10ms.
Our default 50ms acquireTimeoutMs is fine for production, and
tests use 20ms which is also comfortable. Worth noting if
future tests get flaky on slower CI: bump the test-mode
timeout, don't lower the polyfill delay.
- The polyfill's `BroadcastChannel.close()` may throw on double-
close. The shim's try/finally pattern could trigger a double
close if a lock manager is reused across shim invocations
(today it isn't, but defensively the close() wraps channel.close
in try/catch). Silent swallow — closing an already-closed
channel has no observable effect.
- Same-tab reacquire (heldLocks.has) is a load-bearing short-
circuit, not an optimization. Without it, the holder-response
handler would fire for the tab's own lock_request and the
acquire would resolve to busy against itself. The §9.4.3
pseudocode includes this check ("if (heldLocks.has(treeId))
return 'acquired'"); the test "same-tab reacquire" pins it.
Verification
Tests: 865 frontend passing (851 prior + 14 new: 13 lock + 1
shim drain). Backend unchanged (~658 passing).
Lint: clean.
Type-check: clean (main + contract).
Coverage: src/runner directory 94.5 / 86.57 / 93.87 / 96.11
against the 85/85/90/90 thresholds —
crossTabLock.ts at 91.66/81.39/92.85/96.61
shim.ts at 98.7/93.33/90.9/100
(others unchanged)
Two uncovered crossTabLock.ts lines:
- line 93 (release no-op in degraded path) is covered by the
added release call in the BroadcastChannel-absent test.
- line 207 (defaultUuid's crypto-randomUUID-absent fallback)
is intentional dead branch for non-Node environments.
Next slice
PR4f closes the runner core (PR4a-f). Per the rubber-duck
template, fire the reviewer now before starting PR5 (react-flow
scaffold). Reviewer scope: PR4e + PR4f + the readiness/reconcile/
lock/shim files; specific concerns —
A. shim's drain-outside-lock decision (worth revisiting against
the §10.3 spec literal?)
B. CrossTabLockManager interface change (discriminated union
vs other shapes — premature?)
C. polyfill choice (npm vs in-process shim — leaner alternative?)
D. wave-end reconcile via fresh treeProvider lookup (correctness
against the React state container's update timing)
E. queue-drain stale-set test honesty (does it prove what it
claims, or is it a test-that-passes?)
F. coverage gaps (defensible vs material)
G. spec drift since PR4d.1
H. citation-strip discipline (still pending end-of-V1.0)
Plus the standard rubber-duck items (J. anything else,
hidden time-bombs, etc.).
Open rubber-duck items still pending
- DTO original_prompt_id nullability (since PR3a; not yet
re-litigated).
- Citation-strip discipline (partition.ts + wave.ts inline section
refs still present; end-of-V1.0 strip).
- reconcileTransformStates (path-scoped per-dispatch variant) not
implemented; wave-end reconcileAllTransforms covers the canvas-
stale-transform-after-wave bug.
- CI gate for the 126 latent test type errors (deferred).
- PR1 backward-compat fallback corpus verification (needs prod DB
access).
- PR4e race: cancelQueued fired before the queued event has
emitted would let the queued wave drain anyway. Acceptable for
V1.0 (the queued event is the UI's signal to enable the cancel
chip); document for PR5/PR6 UX work.
Address must-fix items from the post-PR4e+PR4f rubber-duck review.
The runner core remains feature-complete; this commit hardens the
read-back semantics, kills a maintenance hazard, tightens the lock
interface, and adds defense-in-depth tests for the
labels-divergence invariant.
What ships (per reviewer finding)
Finding D — wave-end reconcile read-back race [LOAD-BEARING]
The PR4e implementation re-read deps.treeProvider AFTER runWave
settled to get the post-wave tree snapshot for
reconcileAllTransforms. Reviewer flagged: production wires
treeProvider to React state, and React 19's setState commits
are queued at microtask boundaries — `await settled` may resume
before React has committed the wave's setState calls, leaving
treeProvider returning the STALE pre-wave tree. Reconcile would
then walk the wrong world and miss every transform the wave's
Send-completions just unblocked.
Fix: the shim now wraps deps.sink in a per-wave recorder
(createStateRecorder) and passes the wrapped sink to
runWaveStarter via the new `sink` field on RunWaveStarterArgs.
The recorder captures every setNodeState call into a per-wave
Map<NodeId, NodeState>; after `await settled`, the shim
constructs the post-wave tree via applyStateRecorder (input
tree + overlay of captured states) and feeds THAT to
reconcileAllTransforms. No React-state read-back; no timing
dependency. The recorder forwards every other sink method
(recordExecution / clearExecution / setReflogPinned /
emitWaveEvent) untouched so React state stays in sync.
Tests pin the new contract:
- "reconcile reads POST-WAVE state via the recording sink, not
a treeProvider snapshot" — tree with a stale Send + stale
Score child; the test starter writes setNodeState(Send,
clean) through args.sink; reconcile flips the Score to
clean. Without the recorder, the treeProvider (fixed-tree
closure) would return Send=stale and Score would stay stale.
- "starter receives a sink that is a wrapper (not the bare
deps.sink reference)" — defense-in-depth against a future
refactor that reverts the wrapping.
- "recording sink forwards every sink method to the underlying
deps.sink" — pins that recordExecution / clearExecution /
setReflogPinned / emitWaveEvent are NOT swallowed.
Finding C — polyfill swap (npm → in-process)
The broadcast-channel npm package's onmessage(data) calling
convention differs from native onmessage(MessageEvent), forcing
a normalization shim in crossTabLock.ts. Reviewer flagged:
(a) the normalization is a maintenance hazard (a future browser
MessageEvent subclass would break the `instanceof` check
silently); (b) simulate mode bypasses structured-clone
serialization, so non-JSON-serializable fields added to
WireMessage later would pass tests but fail in production;
(c) 7 transitive deps for a 25-line problem.
Fix: 25-line in-process BroadcastChannel shim in setupTests.ts
that delivers a real MessageEvent on postMessage, matching
native semantics exactly. Removes the broadcast-channel devDep
(and its 7 transitive packages), removes the eventOrData
normalization shim from crossTabLock.ts, and tightens
BroadcastChannelLike back to `onmessage: (event: MessageEvent)
=> void`. Production code unchanged.
Spec note: 01 §9.4.3 says "V1.0 commits to polyfilling via the
broadcast-channel npm package." Spec wins on settled
commitments by default; reviewer's case for the in-process
shim ranged across three orthogonal concerns (test fidelity,
maintenance burden, dep surface) and was strong enough to
override. Spec amendment for end-of-V1.0 doc pass.
Finding B — LockAcquireResult union simplification
PR4f's discriminated union was:
{ acquired: true; holderTabId: null }
| { acquired: false; holderTabId: string }
Reviewer flagged the `holderTabId: null` on the acquired-true
variant as junk data — invites a reader to think it carries
meaning when it doesn't. `acquired: true` already says "no
holder."
Fix: tightened to `{ acquired: true } | { acquired: false;
holderTabId: string }`. Shim consumer unchanged (already only
reads holderTabId on the !acquired branch). All call-sites
updated (crossTabLock.ts, shim.test.ts mock,
crossTabLock.test.ts assertion, treeTypes.contract.test.ts).
Finding J — acquire-after-close throws
Reviewer flagged: PR4f's closed-manager acquire returned
{ acquired: true } silently. A closed manager has no
holder-response handler and no peer subscription — returning
"acquired" is a silent lie that produces a phantom cross-tab
race. Fail-loud over silent-lie.
Fix: acquire on a closed manager throws Error('cross-tab lock
manager is closed'). Both the native-BC path and the
BroadcastChannel-absent degraded path track a `closed` flag and
throw consistently. Release stays best-effort no-op on closed
managers (release is the cleanup path; throwing there would
surface inside the shim's outer finally and cascade with the
wave settlement). Tests added: "acquire after close throws"
+ "release after close is a no-op (NOT throw)."
Finding J (also) — parentConversationTreeId preservation contract
Reviewer flagged: the shim reads `tree.parentConversationTreeId`
off the post-demotion tree returned by demoteRetryFailedNodes
and forwards it to runWaveStarter for label-divergence
compliance. If a future demoteRetryFailedNodes refactor lost
the spread of tree-level fields, parentConversationTreeId
becomes undefined and the labels-round-trip integration test
would fail in CI — but no unit test would catch the regression
on its own.
Fix: contract test in readiness.test.ts —
"preserves tree-level fields on the returned tree (id, edges,
parentConversationTreeId, undoStack, etc.)" — verifies the
demoted tree carries id / parentConversationTreeId /
parentSourceConversationId / displayName / rootId / createdAt /
edges / undoStack identical to the input. Pin lands before any
regression.
Finding A — drain-block structural invariant comment
Reviewer reading: the drain-outside-the-lock decision is
correct, but the commit body's "prevents a deadlock" argument
is wrong (the lock manager's same-tab reacquire short-circuits
would handle that). The real reason is cross-tab fairness —
drain-inside makes other tabs wait for N-deep queues; drain-
outside lets them interleave. Reviewer also flagged the
"no bookkeeping flag needed" property as load-bearing on the
structural invariant "every early-exit uses `return`" — a future
refactor that replaces a guarded return with an else-branch
would silently start draining on the early-exit path.
Fix: extended the comment block on shim.ts's drain block to
state the cross-tab-fairness rationale (corrects the original
commit body's deadlock argument) AND name the
return-on-early-exit structural invariant explicitly. The next
refactor reader has explicit guidance.
Notable shape decisions
- createStateRecorder is a local helper (not exported). The shim
is the only producer/consumer of the recorder lifecycle; making
it a public utility would invite reuse in places where the
React-state-read-back race doesn't apply.
- applyStateRecorder returns the input tree by identity when
states.size === 0 (no-op waves). Preserves caller-side identity-
based memoization the same way demoteRetryFailedNodes does on
no-op demotions.
- The recording sink wraps deps.sink but does NOT track
recordExecution. Reason: state transitions are always via
setNodeState; recordExecution attaches the execution record
THEN setNodeState flips state to 'clean'. Capturing
setNodeState only is sufficient for the reconcile use case
AND keeps the recorder small. If a future caller needs to see
the execution map, extend then; don't speculate.
- The in-process BroadcastChannel polyfill uses queueMicrotask
for delivery (matching native sync-emit-but-async-receive
semantics). Tests await two microtask hops to settle a
round-trip (A request → B response). Determinism in
parallel-worker jest is preserved because state is purely
in-memory per-process.
- Acquire-after-close throws but release-after-close is no-op.
The asymmetry is intentional and documented inline: release
runs from the shim's outer finally; throwing there would
cascade with the wave settlement and turn one issue into two.
Acquire runs from a fresh shim entry; throwing there surfaces
the bug at the right callsite.
- Drop the broadcast-channel npm dep + its 7 transitives entirely
(was added in PR4f, removed in PR4e+f.1). Net dep change for
PR4e+f vs PR4d.2: zero.
Defects surfaced during TDD
- Initial D-fix attempt forgot to update RunWaveStarterArgs's
public type, causing the test's custom starter to read
args.sink from a type that didn't declare it. Caught at
type-check; added `sink` to the args interface.
- The shim's import block accidentally pulled in ApiErrorReason
after a copy/paste from treeTypes. ESLint's
no-unused-vars caught it; cleaned up.
- When swapping the polyfill, the existing "after close, the
closed manager no longer responds as holder" test continued to
pass without the in-process shim's listeners-set teardown
behaving correctly. Verified by running the test BEFORE
removing the npm dep; both implementations satisfy the
contract identically.
- Section 11 comment header in shim.test.ts got dropped during
the section-10 edit. Restored.
Verification
Tests: 871 frontend passing (865 prior + 6 new: 3 shim D-proving +
1 lock close-throws + 1 lock release-after-close + 1 readiness
parentConversationTreeId contract). Backend unchanged (~658
passing).
Lint: clean.
Type-check: clean (main + contract).
Coverage: src/runner directory 95.13 / 87.42 / 96.26 / 96.58
(was 94.5 / 86.57 / 93.87 / 96.11). All modules above the
85/85/90/90 thresholds.
crossTabLock.ts at 94.66/85.36/100/98.36 (functions to 100)
shim.ts at 98.91/96.87/95/100 (branches improved)
readiness.ts at 94.53/89.04/92.3/96.15 (unchanged)
reconcile.ts at 94.73/90/100/100 (unchanged)
wave.ts at 100/93.1/100/100 (unchanged)
partition.ts at 93.75/85.1/100/94.59 (unchanged)
dispatchHelpers.ts at 97.43/89.47/100/97.22 (unchanged)
dispatch.ts at 85.33/50/88.88/88.23 (UNCHANGED; pre-existing
low-branch debt from PR4c2;
tracked as open item)
Open rubber-duck items still pending
Carried forward (not addressed in PR4e+f.1):
- DTO original_prompt_id nullability (since PR3a; not re-litigated).
- Citation-strip discipline (partition.ts + wave.ts inline section
refs; end-of-V1.0 strip).
- CI gate for the 126 latent test type errors.
- PR1 backward-compat fallback corpus verification (needs prod DB
access).
- PR4e cancelQueued-race: cancelQueued fired before the queued
event emits lets the queued wave drain anyway. Acceptable for
V1.0; PR5/PR6 UX should document.
New from PR4e+f.1 review:
- dispatch.ts branch coverage at 50% (pre-existing from PR4c2,
not introduced by PR4e/f). dispatch.ts owns create_attack +
add_message sequencing and partial-commit semantics; the low
branch coverage on mid-chain failure paths is genuine debt.
Tracked for follow-up; not gating PR5.
- Spec drift docs (drain-outside-the-lock, retry-failed-in-shim,
LockAcquireResult DU): three sentence-length amendments owed to
01 §9.4.3, 03 §2.1, 03 §3.1 step 2b. End-of-V1.0 doc pass.
- shim drain loop serializes ALL queued waves under the first
shim invocation's call stack. If operator clicks Refresh 5
times: the FIRST call's promise doesn't resolve until ALL 5
waves complete. Contract decision deferred to PR6 (which adds
the wave-status banner that consumes this).
Next slice
PR5 — react-flow UI scaffold. Sub-PRs PR5a-g per plan. Runner
core (PR4a-f + PR4e+f.1) is now feature-complete and the rubber-
duck checkpoint is closed. The runner-shape PR5 depends on is
frozen: ConversationTree as plain object, RunnerShim's six-method
surface, RunnerStateSink + WaveEvent stream contracts.
…pter (PR5a)
First slice of PR5 (the react-flow UI). Adds the pure adapter that maps
a domain ConversationTree onto react-flow's Node[]+Edge[] shape, plus a
minimal `TreeCanvas` scaffold component that mounts ReactFlow with the
adapter's output. Per-node components (PR5b), action rails (PR5c),
edge `+` chip (PR5d), Stack rendering (PR5e), Pick/Unpick (PR5f), and
layout (PR5g) land in subsequent slices.
What ships
- frontend/src/components/Tree/conversationTreeToReactFlow.ts
- conversationTreeToReactFlow(tree) → { treeId, nodes, edges }
- One react-flow Node per ConversationTreeNode (1:1, no
restructuring). Discriminated-union typing on the result
(TreeFlowNode = Node<{node:RootPromptNode},'root_prompt'> | ...
| Node<{node:ScoreNode},'score'>) so PR5b's node components
can register by kind and narrow params via a switch.
- Each node's `data.node` is the SAME ConversationTreeNode
reference (not a clone) so downstream useMemo hooks in node
components can identity-check against unchanged-tree renders.
- Each node carries a placeholder `{ x: 0, y: 0 }` position;
PR5g's d3-hierarchy layout pass overrides on render.
- One react-flow Edge per ConversationTreeEdge with
type='smoothstep' (orthogonal routing, per 02 §4.4).
- Edge data carries `slotIndex` so the PR5e Fan-Children Stack
predicate and PR5f Pick/Unpick (writes
promotedChildSlotIndex) can read it directly without
walking back to the source ConversationTreeEdge.
- Stable edge id from the source `ConversationTreeEdge.id`
(load-bearing for react-flow's reconciler — id changes force
unmount/remount and kill the PR5d edge-hover state).
- Exhaustive kind switch with `never`-typed default arm —
compile-time guard against adding a new ConversationTreeNode
kind without an adapter arm.
- frontend/src/components/Tree/TreeCanvas.tsx
- Scaffold component: takes a ConversationTree, calls the
adapter, mounts ReactFlow + ReactFlowProvider.
- `useMemo` on the adapter call so tree-prop identity changes
(not just shape changes) drive re-adaption.
- `data-tree-id` + `data-testid="tree-canvas"` on the wrapper
div for test introspection AND so PR5b+ can route action-
rail callbacks back to the runner shim by reading the
ancestor tree id.
- nodeTypes + edgeTypes deliberately not registered yet — the
scaffold uses react-flow's default node renderer (shows the
node id) until PR5b's per-kind components land. Commented
stubs flag where PR5b/5d will plug in.
- fitView prop on for the scaffold so single-tree-fits-canvas
works without manual zoom.
Notable shape decisions
- The adapter is a pure function over tree shape; no react,
no hooks, no closures over runner state. PR5g's layout pass
will wrap this output and add positions; per-node interactivity
(action rail callbacks, edge `+` chip) lands in the node /
edge components and routes through props supplied at the
TreeCanvas boundary.
- Discriminated-union node typing rather than a single
Node<{node: ConversationTreeNode}>. Without the discrimination,
every node component would need an internal `if (node.kind ===
...)` narrow before reading params. The kind-discriminated
union lets PR5b register node components as
`nodeTypes: { root_prompt: RootPromptCard, ... }` and have
TypeScript narrow `data.node` to the right type at the
component boundary.
- TreeFlowEdgeData is `{ slotIndex: number }`, NOT the full
`ConversationTreeEdge`. Reasons: (a) slotIndex is the only
field a Stack / Pick consumer reads; (b) extending TreeFlowEdgeData
later is a non-breaking type change (edge consumers read
specific keys, not the whole object); (c) keeps the adapter
output minimal.
- Placeholder positions at (0,0) instead of computing a quick
initial layout. react-flow tolerates same-position nodes
(they overlap at the origin). PR5g's layout pass owns
positioning end-to-end; computing a throwaway layout here
would be wasted work. The scaffold's `fitView` keeps the
overlap from breaking the canvas mount visually until layout
lands.
- TreeCanvas doesn't take a sink, runner, or any callbacks
yet — pure-render shell. PR5b will widen the prop set as
action-rail callbacks land; defining the prop surface
speculatively would invite over-specifying before we know
what the components need.
TDD narrative
conversationTreeToReactFlow.test.ts: 17 cases pinning node 1:1
mapping, kind → type passthrough, ImportMessageNode-as-root,
data.node identity preservation, placeholder positions, edge
1:1 mapping, edge id stability, slotIndex on edge data,
smoothstep edge type, fan-children edges with auto-numbered
slot indices, explicit edges with non-default slotIndex,
root-only tree (zero edges), input-mutation safety, wide
multi-fan-path tree, treeId on result, kind-discriminated
type narrowing. RED was TS2307 on './conversationTreeToReactFlow'.
Implemented; 17/17 green.
TreeCanvas.test.tsx: 4 cases — node count per tree, treeId
attribute on wrapper, tree-swap survives without losing nodes,
wide tree mounts cleanly. RED was TS2307 on './TreeCanvas'.
Defects surfaced during TDD
- First TreeCanvas test pass used `[data-id]` as the node-card
selector. react-flow tags connection handles with `data-id`
too (e.g., "1-r-null-target"), so the selector matched 6 DOM
nodes per card (1 wrapper + 2 handles + handles on the
handles). Switched to `[data-testid^="rf__node-"]` which
matches only the card wrapper. 3/4 tests went green to 4/4.
- First implementation pass used `): JSX.Element` as the
TreeCanvas return type. Main tsconfig (vs tsconfig.test.json)
doesn't include the legacy global JSX namespace; eslint
no-undef caught it. Dropped the explicit return type to
match the existing component convention (ConnectionBanner,
ErrorBoundary, etc.).
Verification
Tests: 892 frontend passing (871 prior + 21 new: 17 adapter + 4
scaffold). Backend unchanged.
Lint: clean.
Type-check: clean (main + contract).
Coverage: src/components/Tree directory 90.9 / 85.71 / 100 /
90.47 — clear of the 85/85/90/90 global thresholds.
conversationTreeToReactFlow.ts: 85.71/85.71/100/85.71
(uncovered lines: the `never`-typed default arm of the kind
switch — compile-time-unreachable; istanbul still counts it)
TreeCanvas.tsx: 100/100/100/100
Dependencies
- @xyflow/react ^12.11.0 added as a runtime dep (replacement for
the older `reactflow` package; v12+ ships under the @xyflow
scope). Adds 19 transitive packages.
Next slice
PR5b — per-kind node components (RootPromptCard, UserTurnCard,
SendCard, FanCard, ScoreCard, ImportMessageCard). Each card
reads `data.node` (typed to its kind), renders a Fluent UI card,
and exposes hooks for the PR5c action rail. The scaffold's
nodeTypes prop wires up once components are ready.
Open rubber-duck items still pending (unchanged from PR4e+f.1)
- DTO original_prompt_id nullability.
- Citation-strip discipline (partition.ts + wave.ts + shim.ts
have inline section refs; end-of-V1.0 strip).
- CI gate for the 126 latent test type errors.
- PR1 backward-compat fallback corpus verification.
- PR4e cancelQueued-race.
- dispatch.ts branch coverage at 50% (pre-existing).
- Spec drift docs (drain-outside-the-lock, retry-failed-in-shim,
LockAcquireResult DU).
- shim drain loop call-stack serialization.
Six per-kind node cards (RootPromptCard, ImportMessageCard,
UserTurnCard, SendCard, FanCard, ScoreCard) registered against the
TreeCanvas's `nodeTypes` prop. Each card displays the kind-specific
summary fields a tree-viewing operator needs to see at a glance —
prompt text + target on the root, role + text + converter count on
user turns, target override + lastError on sends, axis + variant
count + pick indicator on fans, scorer type + V1.0 render-only hint
on scores. Action rails (PR5c), edge `+` chip (PR5d), Stack
rendering (PR5e), Pick/Unpick (PR5f), and layout (PR5g) land
separately.
What ships
- frontend/src/components/Tree/nodeCards.tsx
- RootPromptCard: prompt text (truncated body) + target chip;
no target handle (root has no parent).
- ImportMessageCard: source conversation id + cutoff index;
no target handle.
- UserTurnCard: text body + role + optional converter-count
chip; both handles. Body carries the full text on the `title`
attribute so hover discoverability works even when the body
is line-clamp truncated.
- SendCard: optional target override + lastError surface
(visible only when state is 'failed' AND lastError !== null);
both handles.
- FanCard: axis + variant count + Pick indicator (when
promotedChildSlotIndex is non-null); both handles.
- ScoreCard: scorer type + V1.0 render-only hint (per 02 §2.2
ScoreNode rail: configure-scorer is V1.1).
- Shared CardFrame component renders the kind label, the
state badge (7-state color map: draft / clean / edited /
stale / running / failed / cancelled), and the react-flow
Handles (target on top, source on bottom; toggleable via
props since root + import don't have parents).
- Shared MetaRow for "label: value" key-value pairs in a
consistent layout.
- frontend/src/components/Tree/treeNodeTypes.ts
- Kind → component map: `{ root_prompt, import_message,
user_turn, send, fan, score }`. Lives in its own module so
eslint's react-refresh/only-export-components rule stays
happy (mixing the registry with components defeats HMR).
- frontend/src/components/Tree/TreeCanvas.tsx
- Wired `nodeTypes={treeNodeTypes}` into ReactFlow. The
scaffold's commented-out stub from PR5a is now live; cards
render in place of react-flow's default node renderer.
- frontend/src/components/Tree/nodeCards.test.tsx
- 34 tests across six per-card describes + a registry
integration describe. Each card describe asserts:
- kind label rendered
- kind-specific fields rendered
- state badge renders the current state
- handle visibility per parent expectations
Registry tests mount cards via TreeCanvas to prove the
`nodeTypes` wiring is intact end-to-end (missing entries
would fall back to react-flow's default node renderer, which
only shows the node id — the test asserts specific card
content like "Root prompt" + "Send", which only render when
the registry is properly registered).
Notable shape decisions
- Cards are read-only display only in PR5b. No callbacks, no
edit handlers, no action rail. PR5c widens the prop set as
action-rail callbacks land; defining the surface speculatively
would over-specify before we know what the components need.
- State badge uses a 7-state inline color map rather than
Fluent UI's intent-based MessageBar. Reason: state colors are
a domain-specific visual language (clean=green, edited=yellow,
stale=orange, running=blue, failed=red) that doesn't map onto
Fluent's success/warning/error intents. The inline map keeps
the color choices in one place and visible in the test grep.
- CardFrame takes showTargetHandle + showSourceHandle as
explicit props with defaults of true. Root + import set
showTargetHandle={false} explicitly. The visual reads
correctly without the top handle (no half-edge stub) and the
type-system enforcement falls out of CardFrame's prop signature.
- SendCard's lastError block renders inline (red panel) rather
than as a popover or tooltip. Operators scanning a failed
fan want the failure reason at a glance; clicking through
to a tooltip adds friction without value. The PR6 wave-
complete toast covers the wave-level summary; the per-card
lastError covers the per-leaf detail.
- FanCard's Pick indicator renders as a MetaRow ("pick: slot N")
rather than a special chip. The §3.3 Pick semantic is one of
several fan-card facts (axis, count, pick); rendering them
uniformly as MetaRows keeps the card visually consistent.
PR5f's Pick/Unpick interaction lands here, but it's
affordance — not display — so the visual stays this PR.
- ScoreCard's "V1.0: displays scores attached to upstream
pieces" hint prevents the V1.0 operator from expecting to
click and configure (the affordance is V1.1 per 02 §2.2 + the
spec's render-only contract per 01 §4.5 / 03 §3.2). Surfacing
the limitation on the card is cheaper than a tooltip on the
(currently-not-rendered) configure icon.
- NodeProps generic uses `Extract<TreeFlowNode, { type: 'X' }>`
so each card's props type is exactly the corresponding adapter
output node. This is what makes `data.node: RootPromptNode`
(vs `ConversationTreeNode`) without a runtime narrow inside
the card — the discriminated-union shape PR5a set up pays
off here.
TDD narrative
Single test file (nodeCards.test.tsx) with 34 cases organized
into seven describe blocks (one per card + the registry). RED
was TS2307 on './nodeCards'. Implementation took two passes:
first pass put treeNodeTypes in nodeCards.tsx; lint flagged
react-refresh/only-export-components because the file mixed
component exports with a non-component constant. Moved
treeNodeTypes into its own module (treeNodeTypes.ts); 55/55
green.
Defects surfaced during TDD
- Initial registry placement in nodeCards.tsx triggered the
react-refresh/only-export-components warning. Mixing exports
breaks HMR for the components (the bundler can no longer
fast-refresh just the changed component). Split into a
separate treeNodeTypes.ts module — net +1 file, +0 LOC of
logic, and HMR works.
- First lint pass also caught that `): JSX.Element` is not
valid under the main tsconfig (no global JSX). PR5a hit the
same issue; convention is to omit the explicit return type
on React components.
Verification
Tests: 926 frontend passing (892 prior + 34 new). Backend
unchanged.
Lint: clean.
Type-check: clean (main + contract).
Coverage: src/components/Tree directory 96.07 / 93.1 / 100 /
96 — clear of the 85/85/90/90 thresholds.
nodeCards.tsx: 100/95.45/100/100 (one uncovered branch is
the inline color map's defensive fallback for a state
not in the union — unreachable)
treeNodeTypes.ts: 100/100/100/100
TreeCanvas.tsx: 100/100/100/100
conversationTreeToReactFlow.ts: 85.71/85.71/100/85.71
(unchanged from PR5a; the never-typed default arm)
Next slice
PR5c — per-node action rail (icons + tooltips per 02 §2.2).
Adds onRefresh, onBranch, onDelete, onOpenLinear callback props
to TreeCanvas → cards. The rail itself is a small floating row
positioned by react-flow's NodeToolbar component (or a custom
absolute-positioned div if NodeToolbar's defaults don't fit).
Action wiring routes through props supplied at the TreeCanvas
boundary; the actual runner calls land in PR7 when persistence
+ auto-reverse make a complete-cycle integration possible.
Open rubber-duck items still pending (unchanged from PR5a)
- DTO original_prompt_id nullability.
- Citation-strip discipline.
- CI gate for the 126 latent test type errors.
- PR1 backward-compat fallback corpus verification.
- PR4e cancelQueued-race.
- dispatch.ts branch coverage at 50% (pre-existing).
- Spec drift docs (drain-outside-the-lock, retry-failed-in-shim,
LockAcquireResult DU).
- shim drain loop call-stack serialization.
Address must-fix + should-fix items from the post-PR5a+PR5b rubber-duck
review. The Tree UI scaffold and per-kind cards stay feature-equivalent;
this commit fixes a load-bearing theme bug, removes coupling to react-
flow internals, threads the `selected` prop through every card so PR5c's
action rail can read it without rewriting card prop surfaces, and strips
spec citations that regressed past the PR4b directive.
What ships (per reviewer finding)
Finding B.1 — dark-theme-only baked colors [LOAD-BEARING]
Reviewer flagged: PR5b's `STATE_COLORS` constant hardcoded dark
hexes (#1e3a1e for clean-green, #3a1e1e for failed-red, etc.).
App.tsx toggles webLightTheme ↔ webDarkTheme; switching to light
theme rendered cards as dark-on-near-white stripes — clean-green
looked like a smudge, failed-red was unreadable. Real runtime
defect; the cards were the only component in the workspace
ignoring the theme contract.
Fix: replace `STATE_COLORS` with `STATE_BADGE_TOKENS: Record<
NodeState, { background, foreground }>` keyed on Fluent palette
tokens (colorPaletteGreenBackground2 / Foreground2 pairs, etc.).
Both light + dark themes auto-adapt. Matches the rest of the
codebase's status-chip convention (Chat/TargetBadge.styles.ts).
Finding C / J.3 — CardFrame inline styles + missing `selected`
Reviewer flagged:
(a) PR5b cards used inline `style={{...}}` for every visual
decision. PR5c's action rail will need `:hover` +
`[data-selected]` pseudo-classes that inline styles can't
express, forcing a useState(hover) + onMouseEnter retrofit
on every card.
(b) react-flow passes `selected: boolean` to every node
component, but PR5b cards dropped it on the floor (the
test helpers even passed `selected: false` already).
PR5c's selection-visual + action-rail-visibility-when-
selected would have rewritten every card's prop surface.
Fix: new `nodeCards.styles.ts` companion module using
`makeStyles` + Fluent tokens for every card visual. CardFrame
now accepts `selected?: boolean` (default false at the
destructure site — one default site, not per-card), threads it
through to `data-selected` on the wrapper, and applies the
`frameSelected` class for the brand-color outline. Every card
receives + forwards `selected` from NodeProps. PR5c becomes
"add the action rail," not "rewrite six cards."
Finding E — data-testid="rf__node-*" couples to react-flow internals
Reviewer flagged: PR5a's TreeCanvas test used
`container.querySelectorAll('[data-testid^="rf__node-"]')` — a
private testid scheme that could silently shift with a @xyflow
minor-version bump.
Fix: CardFrame emits `data-tree-node-id={nodeId}` on the wrapper.
TreeCanvas tests now select via `[data-tree-node-id]`, which is
under our control and immune to react-flow renames. Also enables
the PR5c action rail to find cards by node id without walking
react-flow's DOM tree.
Finding A — compile-time guard on the kind → component registry
Reviewer flagged: PR5b's `treeNodeTypes` registry was just an
`as const`. The "every kind has a registry entry" guarantee
relied on the test alone — a developer adding a new
ConversationTreeNodeKind without a registry entry would only
discover it at jest time.
Fix: `as const satisfies Record<ConversationTreeNodeKind,
ComponentType<never>>` on treeNodeTypes. Compile-time
completeness: tsc fails the moment a new kind lands without a
registry arm. The existing runtime test is now defense-in-depth.
Finding I — citation discipline regressed by 8 instances
PR5a + PR5b added 8 new `02 §...` references in JSDoc + test
comments after the PR4b directive ("no new citations in code").
Stripped all 8 — 5 in module-header JSDoc, 3 in test comments
and titles. The doc/gui/design/ files remain the source of
truth; the code just no longer mirrors specific section refs.
Finding H.4 — operator-facing V1.0/V1.1 text in ScoreCard
Reviewer flagged: ScoreCard's body copy was
"V1.0: displays scores attached to upstream pieces (configuration
is V1.1)." Operators don't know what V1.0/V1.1 means; release
labels in operator-facing copy are a smell. The "scorer
configuration coming later" detail belongs on the PR5c action
rail's disabled `✏` icon tooltip, not the card body.
Fix: strip the V1.0/V1.1 text. The ScoreCard footer now reads
just "Read-only display" — enough to signal non-interactivity
without naming an internal version label.
Finding F — implementer's coverage rationalization was wrong
Reviewer flagged: PR5b commit body claimed the missing branch
was "the inline color map's defensive fallback for a state not
in the union" — but there was no `??` fallback in the code; an
invalid state would throw on `.background` access. The
rationalization didn't match the source.
Fix: actual uncovered branches found and exercised:
- SendCard's `state==='failed' && lastError===null` quadrant
(test "does NOT render the error panel when state is
'failed' but lastError is null")
- UserTurnCard's singular-converter ternary
(test "uses singular 'converter' for a one-converter pipeline")
- CardFrame's `selected = false` default (test "cards default
to unselected when `selected` is undefined")
Finding D — adapter ↔ registry alignment defense-in-depth
Reviewer flagged: the registry test could be the single point of
failure for adapter/registry alignment. If the adapter emitted
`type: 'rootPrompt'` and the registry keyed on `root_prompt`,
only the registry test would catch it; per-card tests would still
pass.
Fix: new test "every kind emitted by the adapter has a registry
entry (adapter ↔ registry alignment)" — builds a tree with every
kind, runs the adapter, asserts every result `node.type` is a
registry key. Defense-in-depth alongside the `satisfies`
compile-time guard from Finding A.
Finding J.2 — `mockNodeProps` generic stub builder
Reviewer flagged: PR5b had six near-identical `as unknown as
Parameters<typeof X>[0]` cast helpers. The `as unknown as` is a
"trust me" the type-checker can't validate; adding a mandatory
field to NodeProps would silently break tests.
Fix: one generic `mockNodeProps<T>(id, data, selected?)` helper;
the six card-specific wrappers (`rootPromptProps(node)`, etc.)
are now thin invocations that supply the appropriate `T`.
Notable shape decisions
- `makeStyles` was chosen over emotion or styled-components because
Chat/TargetBadge.styles.ts is the existing in-codebase pattern.
Mixing CSS-in-JS systems would be churn for no benefit.
- Griffel (Fluent's CSS engine) rejects the `borderColor`
CSS shorthand for theme-token reuse reasons (so individual sides
can be overridden via longhand). The `frameSelected` slot uses
`borderTopColor` / `borderRightColor` / `borderBottomColor` /
`borderLeftColor` instead. Same visual; lint-clean.
- The inline state-badge color comes from STATE_BADGE_TOKENS at
render time (one `<span style={{ background, color }}>` per
card). makeStyles can't easily produce 7 dynamic-key combinations
without static slot generation; the runtime lookup is cheaper
and reads more naturally. Token references resolve at render
against the current theme, so dark/light still both work.
- The `selected` default lives in CardFrame's destructure
(`selected = false`), not per-card (`selected ?? false`). One
branch to cover, not six. The cards forward whatever NodeProps
handed them and rely on CardFrame's default for the absent case.
- Stripped the SendCard inline lastError panel? No — kept it.
Reviewer flagged it as spec drift (H.2: the spec pins failure
summarization at the wave-status banner, not the per-card
panel). Decision: keep the inline surface because operator
"why did this leaf fail" is a real read-this-card moment, and
the action-rail `💬` (PR5c, drawer) is the "full raw response"
surface — different need. The spec should grow an acknowledgment
sentence (deferred to end-of-V1.0 doc pass with the other spec
drift notes).
TDD narrative
Worked finding-by-finding starting from the load-bearing one
(B.1 theme bug). Each finding:
- identified the regression-prone surface
- landed the fix
- added or tightened a test that proves the fix sticks
Two test failures caught regressions:
1. The "failed-with-null-lastError" test used a text-content
regex that matched the state badge's "failed" string —
false-positive. Fixed by switching to a class-substring scan
for `errorPanel` (makeStyles preserves slot names in dev
mode for debuggability, so the class contains 'errorPanel').
2. The Griffel `borderColor` shorthand warning surfaced during
the makeStyles refactor. Fixed via four longhand properties.
Defects surfaced during TDD
- Griffel's CSS-shorthand rejection list includes `borderColor`
but NOT `border` (the full shorthand). The distinction is
"shorthand that lets later rules override specific sides via
longhand" — only the partial-override family is rejected.
Documented inline in nodeCards.styles.ts.
- The "doesn't render error panel" test originally used a
text-substring scan that false-matched against the state
badge's literal "failed" text. Lesson: don't text-scan for
state words when the state itself is a literal element on
the card.
- Coverage on nodeCards.tsx's branch count moved with the
`selected` consolidation. Originally each card had its own
`?? false` (6 branches); consolidation to CardFrame's
destructure-default left 1 branch. The "defaults to
unselected" test covers it, but istanbul still scores the
default-parameter as two arms.
Verification
Tests: 932 frontend passing (926 prior + 6 net: 4 new + 2 renamed
+ 0 removed). Backend unchanged.
Lint: clean.
Type-check: clean (main + contract).
Coverage: src/components/Tree directory 96.66 / 94.28 / 100 /
96.61 (was 96.07 / 93.1 / 100 / 96 in PR5b).
nodeCards.tsx: 100/96.42/100/100 (one uncovered branch is
the default-parameter arm of `selected = false` — istanbul
over-counts default-parameter coverage)
nodeCards.styles.ts: 100/100/100/100
treeNodeTypes.ts: 100/100/100/100
TreeCanvas.tsx: 100/100/100/100
conversationTreeToReactFlow.ts: 85.71/85.71/100/85.71
(unchanged; the never-typed default arm of the kind switch)
Open rubber-duck items still pending
Carried forward (not addressed in PR5a+b.1):
- DTO original_prompt_id nullability.
- Citation-strip discipline (partition.ts + wave.ts + shim.ts
still have section refs; this PR did NOT strip those, only the
new ones introduced by PR5a+b).
- CI gate for the 126 latent test type errors.
- PR1 backward-compat fallback corpus verification.
- PR4e cancelQueued-race.
- dispatch.ts branch coverage at 50% (pre-existing).
- Spec drift docs (drain-outside-the-lock, retry-failed-in-shim,
LockAcquireResult DU); PR5b.1 adds another: SendCard's inline
lastError surface is spec drift kept by design.
- shim drain loop call-stack serialization.
Deferred from this review:
- Card decomposition split point (six cards + three shared
primitives in one 270-line file). Reviewer J.4: "split when
CardFrame's hover/select state lives in its own hook —
somewhere mid-PR5c." Honored: not splitting now.
- PR5g layout memo keying. Reviewer J.1 noted PR5g will be
tempted to re-layout per-leaf state ping. Not addressable
in PR5b; tracked for PR5g.
Next slice
PR5c — per-node action rail (icons + tooltips). Callback props
on cards (onRefresh, onBranch, onDelete, onOpenLinear), threaded
via TreeCanvas. CardFrame's `selected`-driven `data-selected`
attribute is what PR5c's rail-visibility-on-hover-or-selected
CSS reads. The makeStyles shell from PR5b.1 is where the hover
pseudo-class lands.
…xt (PR5c)
The common-to-every-node action rail (Refresh / Branch / Branch-as-
subtree-stub / Delete / Open-in-linear) wired through TreeCanvas →
CardFrame → ActionRail via a React context. Per-callback opt-in:
undefined callbacks hide their buttons so PR5c lands the wiring
without forcing every runner integration to land in the same PR.
Kind-specific actions (✏ edit, ⚡ converter, ≡ role, ↻×N re-run,
📎 attachment, 🎯 target-override, etc.) defer to later sub-PRs —
each needs its own state machine + dialog and is poorly served by
the same minimum-viable callback bag.
What ships
- frontend/src/components/Tree/actionRail.tsx
- ActionRail({ nodeId, callbacks, branchLabel }) — small Fluent
UI Button row with five action slots:
↻ Refresh (ArrowSyncRegular)
🌿 Branch (BranchRegular; label varies by node kind)
⫝ Branch-subtree (BranchForkRegular, ALWAYS disabled —
V1.1 placeholder; reserved slot per the operator-facing
convention that V1.1 enablement is a state flip, not a
new button)
🗑 Delete (DeleteRegular)
🔍 Open-in-linear (OpenRegular)
- Each callback is optional; undefined hides the button entirely
so PR5c ships before per-action runner wiring.
- Wrapper emits `data-tree-action-rail` + `data-tree-node-id`
for DOM scoping (PR5d's edge `+` chip will use the rail
element's position as an anchor reference).
- frontend/src/components/Tree/actionRail.styles.ts
- Fluent makeStyles for the rail row layout (horizontal flex,
tokens-spaced gap, top border via stroke2 token). Visibility
defaults to always-visible in PR5c; the hover/focus
visibility-flip per the design wires alongside Stack
rendering (PR5e) when CardFrame grows a hover handler.
- frontend/src/components/Tree/actionCallbacksContext.ts
- ActionCallbacksContext (React context, default null).
- useActionCallbacks() returns ActionCallbacks | null.
- Lives in its own module so the adapter stays pure and a
callbacks-only-change render doesn't re-run
conversationTreeToReactFlow.
- frontend/src/components/Tree/TreeCanvas.tsx (modified)
- Added optional `actionCallbacks?: ActionCallbacks` prop.
When supplied, wraps ReactFlow in
<ActionCallbacksContext.Provider value={actionCallbacks}>.
When omitted, provider value is null and cards skip the
rail entirely (preserves the PR5a/PR5b "display only" use).
- The adapter is NOT in the actionCallbacks dependency list —
changes to callbacks don't perturb the tree's adapter output,
preserving identity-stable nodes/edges for react-flow's
reconciler.
- frontend/src/components/Tree/nodeCards.tsx (modified)
- CardFrame consumes useActionCallbacks(); when non-null,
renders <ActionRail nodeId={nodeId} callbacks={callbacks}
branchLabel={branchLabel} />.
- Every card forwards its kind-appropriate `branchLabel`:
RootPromptCard → "Clone tree" (the operator-facing language
for a root clone), every other card → "Branch from here".
- frontend/src/components/Tree/actionRail.test.tsx
- 20 tests across four sections:
1. ActionRail in isolation — opt-in/opt-out rendering per
callback presence, disabled Branch-subtree slot, click
invocations with the right nodeId
2. TreeCanvas integration — rail per card, Refresh-click
on a specific card invokes onRefresh with that card's
id, root vs non-root branchLabel, callbacks-omitted
renders zero rails (back-compat)
3. Accessibility — aria-label on every button, tooltip on
the disabled V1.1 button, data attributes for DOM scoping
4. CardFrame integration — rail doesn't break
data-tree-node-id wrapper attribute (PR5a/PR5b selector
contract survives)
Notable shape decisions
- Context vs prop-threading for callbacks. The adapter (PR5a) is
pure and identity-stable; threading callbacks through `data`
on every adapter node would force re-adaption on every
callback-prop change. Context lets the rail consume callbacks
where it renders, leaving the adapter untouched. Trade-off:
cards become non-pure (they consume context); accepted because
the alternative breaks the adapter's identity contract for
react-flow's reconciler.
- branchLabel is a CardFrame prop, not derived inside ActionRail.
The card knows the kind ("root prompt" → "Clone tree"); the
rail doesn't and shouldn't. Pushing branchLabel into ActionRail
via the cards keeps the rail kind-agnostic so PR5d/PR5e can
reuse it on the Stack card without inventing new render rules.
- Tooltip `relationship="description"` rather than
`relationship="label"`. The Fluent label-relationship pattern
uses aria-labelledby pointing at the tooltip content; the
tooltip only renders the content when shown, so the accessible
name comes from the hover tooltip alone. Under jsdom (and
arguably under screen-reader-without-hover), this means the
button has no name. Switched to `description` and kept an
explicit aria-label on every button so the accessible name
is permanent.
- V1.1 Branch-subtree button always renders (disabled). Per the
operator-facing convention: V1.1 enablement is a state flip,
not a new affordance appearing. Keeping the slot reserved
prevents an operator's muscle memory from forming around four
buttons that suddenly become five. The `title` attribute
carries the operator-friendly explanation ("coming in a future
release") — no V1.0/V1.1 release labels in operator-facing copy.
- Five callbacks, not nine. Per the rubber-duck principle of
"don't speculate," PR5c ships the common-to-every-node rail
only. The seven kind-specific actions (per-card edit,
converter, role, re-run-N, attachment, target-override, view-
raw-response) each carry their own UI work (palette, role
cycler, count-prompt, etc.) and don't share a callback
surface. Each lands when its full interaction is ready.
- Callbacks accept nodeId only, not (treeId, nodeId). The host
that mounts TreeCanvas already knows the treeId — the
callbacks close over it at the consumer's call site:
`onRefresh={(id) => runner.refreshNode(treeId, id)}`.
Adding treeId to the rail's callback signature would be
speculative complexity for a single-tree V1.0 canvas.
TDD narrative
Started with actionRail.test.tsx — 20 cases pinning callback
invocations, opt-in rendering, the disabled V1.1 slot, and the
rail's behavior when threaded through TreeCanvas. RED was
TS2307 on './actionRail'.
Implementation took three corrective passes:
1. Initial ActionRail used `<Tooltip relationship="label">` —
caused all buttons except the first to lose their accessible
name in jsdom. Switched to `relationship="description"` and
kept explicit aria-label on every button.
2. userEvent.click on a button inside a react-flow node card
throws "Cannot read properties of null (reading 'document')"
— react-flow's pointerdown handler dereferences a null
window owner inside jsdom. Switched click tests to
fireEvent.click which dispatches a single MouseEvent
without pointer-event machinery.
3. screen.getByRole filters out visibility-hidden elements;
react-flow renders nodes with `visibility: hidden` until
its layout pass runs (which jsdom never triggers). Switched
integration tests to container.querySelectorAll
scoped by data-tree-node-id, matching the PR5a TreeCanvas
test pattern.
Defects surfaced during TDD
- Fluent Tooltip's `relationship="label"` is operator-hostile
in non-interactive renders: the button's only accessible name
is the tooltip content, but the tooltip content isn't in the
DOM until hover. Screen readers without hover navigation lose
the name. Documented inline in actionRail.tsx that the
`description` relationship is the right choice when buttons
also carry an explicit aria-label.
- react-flow's `visibility: hidden` on un-laid-out nodes
confuses testing-library's role-based queries. Tests using
role queries inside the canvas need to use the
data-tree-node-id wrapper-scoped pattern instead. Worth a
note for PR5d-g test authors.
- userEvent.click → react-flow pointerdown handler →
NullPointerException in jsdom. fireEvent.click is the
workaround for interactive testing inside the canvas.
Both patterns documented inline at the test callsites.
Verification
Tests: 952 frontend passing (932 prior + 20 new). Backend
unchanged.
Lint: clean.
Type-check: clean (main + contract).
Coverage: src/components/Tree directory 97.53 / 95.74 / 100 /
97.5 (was 96.66 / 94.28 / 100 / 96.61 in PR5b.1).
actionRail.tsx: 100/100/100/100
actionRail.styles.ts: 100/100/100/100
actionCallbacksContext.ts: 100/100/100/100
TreeCanvas.tsx: 100/100/100/100
nodeCards.tsx: 100/96.66/100/100 (the rail-render branch
adds a single unmeasured combination — context-null vs
non-null × per-card; defensible)
conversationTreeToReactFlow.ts: 85.71/85.71/100/85.71
(unchanged; never-typed default arm)
Open rubber-duck items still pending (unchanged from PR5b.1)
- DTO original_prompt_id nullability.
- Citation-strip discipline (partition.ts + wave.ts + shim.ts
legacy refs; new code stays clean).
- CI gate for the 126 latent test type errors.
- PR1 backward-compat fallback corpus verification.
- PR4e cancelQueued-race.
- dispatch.ts branch coverage at 50% (pre-existing).
- Spec drift docs (drain-outside-the-lock, retry-failed-in-shim,
LockAcquireResult DU, SendCard inline lastError).
- shim drain loop call-stack serialization.
Next slice
PR5d — per-edge `+` chip + insert-on-edge popover. Adds a custom
edgeTypes entry to TreeCanvas, an onEdgeInsert callback to the
ActionCallbacks surface, and a Fluent Popover with kind-aware
insert options per the upstream-node-kind contract. Reuses the
data-tree-node-id wrapper-scoping pattern from PR5b.1 for
position anchoring.
The per-edge `+` chip + insert popover per the operator-facing
insert-on-edge affordance. Adds a custom react-flow edge component
(InsertEdge) that wraps SmoothStepEdge with a midpoint chip; clicking
the chip opens a Fluent Menu whose options vary by the upstream node's
kind (only legal next-node types render, hiding illegal ones is
cheaper than enabling-with-error). Selecting an option fires the
host-supplied onEdgeInsert callback with a discriminant (EdgeInsertKind)
naming the chosen action.
What ships
- frontend/src/components/Tree/InsertEdge.tsx
- Custom react-flow edge component (replaces 'smoothstep' as the
default edge type for Tree adapter output).
- Renders BaseEdge for the orthogonal stroke + an absolute-
positioned chip at the midpoint via EdgeLabelRenderer (with a
fallback inline render for test environments where the
EdgeLabelRenderer portal target isn't mounted).
- Chip suppression rules:
- onEdgeInsert callback absent → no chip
- parent kind is `score` (terminal) → no chip
- parent kind is `fan` (children managed via FanCard +) → no chip
- menuForParent() builds the kind-aware option list:
- root_prompt / import_message: Follow-up + Inject + Send
+ Fan attempt + Fan converter (+ V1.1 disabled stubs)
- user_turn: Send + Append converter + Fan converter
(+ V1.1 disabled stubs)
- send: Follow-up + Inject + Score + Fan attempt
+ Fan converter (+ V1.1 disabled stubs)
- Branded source/target back to ConversationTreeNodeId at the
onEdgeInsert callback boundary so hosts receive the same brand
the runner uses everywhere.
- frontend/src/components/Tree/insertEdge.styles.ts
- makeStyles for the chip wrapper (absolute + pointer-events:all)
and the chip button (20px circular Fluent Button).
- frontend/src/components/Tree/treeEdgeTypes.ts
- Edge-type registry: `{ insert: InsertEdge }`. Passed to
ReactFlow's `edgeTypes` prop. Sibling to treeNodeTypes.ts.
- frontend/src/components/Tree/actionRail.tsx (modified)
- Added `EdgeInsertKind` exported type (discriminant for
onEdgeInsert).
- Added `onEdgeInsert?: (parentId, childId, kind) => void` to
`ActionCallbacks`. Per-callback opt-in pattern from PR5c
preserves: undefined hides the chip.
- frontend/src/components/Tree/conversationTreeToReactFlow.ts (modified)
- Edge data now carries `parentKind: ConversationTreeNodeKind`
so InsertEdge can pick the kind-aware menu without a tree
lookup at render. Built once per adapter call (O(nodes) hash).
- Edges now emit `type: 'insert'` (was `'smoothstep'`) so
ReactFlow routes them to the InsertEdge component.
- Added PLACEHOLDER_WIDTH/HEIGHT (260×80) on every node so the
reconciler treats nodes as measured ahead of layout. PR5g's
layout pass overrides positions; the dims unblock edge
rendering in environments without real ResizeObserver.
- frontend/src/components/Tree/conversationTreeToReactFlow.test.ts (modified)
- Renamed the edge-type assertion to reflect the 'insert' type
(was 'smoothstep' in PR5a).
- frontend/src/components/Tree/TreeCanvas.tsx (modified)
- Wired `edgeTypes={treeEdgeTypes}` alongside the existing
`nodeTypes={treeNodeTypes}` registration.
- frontend/src/components/Tree/edgeInsert.test.tsx
- 23 tests across six sections:
1. chip presence/suppression (callback present/absent,
context null, score/fan parents)
2. kind-aware menu options (one test per parent kind +
V1.1 disabled axes)
3. callback invocation (one test per kind discriminant
+ disabled-items-don't-fire)
4. accessibility (aria-label "Insert after X")
5. adapter parentKind contract (edge data carries source
kind; fan parents emit parentKind='fan')
6. registry smoke test (treeEdgeTypes.insert === InsertEdge)
Notable shape decisions
- Test approach: direct InsertEdge mount inside ReactFlowProvider,
not TreeCanvas → react-flow → edge render. Reason: react-flow's
edge layer is gated on full node measurement (handleBounds
populated via ResizeObserver), which jsdom can't simulate
cleanly without invasive setupTests changes (DOMMatrixReadOnly
stub, ResizeObserver fire-on-observe, getBoundingClientRect
override). Per PR5c's rubber-duck lesson "don't fight react-
flow internals," the integration is covered by the adapter's
`edge.type === 'insert'` assertion + the registry's
`treeEdgeTypes.insert === InsertEdge` assertion; the
InsertEdge component itself is tested via direct mount.
- EdgeLabelRenderer fallback. Production wraps the chip in
EdgeLabelRenderer (portals out of the SVG into a fixed layer
above the canvas so the HTML chip renders over the SVG path).
In tests the portal target (`.react-flow__edgelabel-renderer`
div) doesn't exist — InsertEdge checks `useStore(s =>
Boolean(s.domNode?.querySelector(...)))` and falls back to
inline render. Visual is identical in jsdom (no layout); in
production the portal path is taken.
- parentKind on edge data, not derived at render. The adapter
computes parentKind once per edge (O(nodes) hash lookup); the
InsertEdge consumes data.parentKind without any tree-side
re-query. Keeps the edge component pure and avoids context
lookups for what's essentially adapter-state.
- One callback (onEdgeInsert), one discriminant (EdgeInsertKind).
The host receives (parentId, childId, kind) and decides how to
splice the new node. Alternative: one callback per kind
(`onInsertFollowUp`, `onInsertSend`, etc.). Rejected because
the host's tree-edit logic is typically a single function
`insertBetween(parent, child, kindToBuild)` — splitting forces
seven near-identical wrappers.
- V1.1 fan axes (`fan_prompt`, `fan_target`) reserve menu slots
as DISABLED items, not absent. Same operator-facing convention
as PR5c's Branch-as-subtree button: V1.1 enablement is a state
flip, not a new affordance appearing. Disabled-stub strings
use "(coming later)" — no "V1.1" release labels in operator
copy.
- PLACEHOLDER_WIDTH/HEIGHT (260×80) on adapter output nodes.
React-flow won't render edges until source + target nodes
have measured dimensions; supplying them up-front (via
node.width/height per the NodeBase interface) lets edges
render before the ResizeObserver loop completes. Production
cards report their real size via ResizeObserver on mount;
these placeholders are the until-then value. PR5g's layout
pass owns positions, not dimensions.
- Fan-child edges DO emit parentKind='fan' so InsertEdge
suppresses the chip even when an operator selects a fan-child
edge. Adding a chip there would be operator-hostile — the +
button next to a fan-child edge would compete with the
FanCard's own `+ Add variant` button.
TDD narrative
Started with edgeInsert.test.tsx — 23 cases pinning chip
presence/suppression, the per-parent menu options, callback
invocation, and the adapter contract. RED: TS2305 on
EdgeInsertKind + TS2353 on onEdgeInsert (member missing from
ActionCallbacks).
Implementation took three corrective passes:
1. Initial assumption: edge components would render inside
TreeCanvas via the canvas integration. jsdom + react-flow's
handleBounds gate kept edges out of the DOM entirely.
Pivoted to direct-mount tests; covered the canvas-level
contract via the adapter test + a registry smoke test.
2. EdgeLabelRenderer's portal target only exists inside
<ReactFlow>, not the bare ReactFlowProvider used by direct
mount. Added a portal-target check via useStore + an inline
render fallback. Production keeps the portal path.
3. Two type-check failures from the main tsconfig (not the
test config): readonly fanAxes mismatch with the InsertMenu
interface (fixed: typed fanAxes as ReadonlyArray); plain
string source/target passed to a branded-id callback (fixed:
cast at the callback boundary).
Defects surfaced during TDD
- jsdom + react-flow edge measurement: the standard
no-op ResizeObserver mock in setupTests.ts prevents react-flow
from populating node.internals.handleBounds, which gates
edge rendering entirely. Tried upgrading the ResizeObserver
mock to fire on observe(); revealed DOMMatrixReadOnly is also
absent from jsdom (react-flow's transform-decoder throws);
reverted. The right pattern is to test components that depend
on full canvas state via direct mount, not via TreeCanvas.
- EdgeLabelRenderer's portal target is mounted by <ReactFlow>,
not ReactFlowProvider. Tests using direct mount need either
a portal-target fallback in the component (chose this) or a
full ReactFlow mount with the handleBounds workaround above
(rejected as too invasive).
- Main tsconfig is stricter than tsconfig.test.json for branded-
id types. The test passes plain strings as source/target on
the synthetic EdgeProps; in production react-flow emits plain
strings too, so the cast at the callback boundary is the
permanent shape.
Verification
Tests: 975 frontend passing (952 prior + 23 new). Backend
unchanged.
Lint: clean.
Type-check: clean (main + contract).
Coverage: src/components/Tree directory 97.14 / 89.87 / 100 /
97.08 (was 97.53 / 95.74 / 100 / 97.5 in PR5c).
InsertEdge.tsx: 95.65/83.33/100/95.55 (two uncovered branches
live in parentLabel() switch fall-throughs that the
suppress-chip-on-score/fan rule make unreachable — they
exist for type exhaustiveness)
insertEdge.styles.ts: 100/100/100/100
treeEdgeTypes.ts: 100/100/100/100
actionRail.tsx: 100/100/100/100 (onEdgeInsert added to
ActionCallbacks; no rail-side change)
conversationTreeToReactFlow.ts: 90.9/77.77/100/90.47
(one uncovered branch is the `nodeKindById.get(parentId) ??
'root_prompt'` fallback for an orphan edge — unreachable
in a well-formed tree)
All other modules: 100/100/100/100
Open rubber-duck items still pending (unchanged from PR5c)
- DTO original_prompt_id nullability.
- Citation-strip discipline (legacy partition.ts + wave.ts +
shim.ts refs; new code stays clean).
- CI gate for the 126 latent test type errors.
- PR1 backward-compat fallback corpus verification.
- PR4e cancelQueued-race.
- dispatch.ts branch coverage at 50% (pre-existing).
- Spec drift docs.
- shim drain loop call-stack serialization.
New from PR5d:
- Test pattern: TreeCanvas-level integration tests can't drive
react-flow's edge layer in jsdom. Documented in the
edgeInsert.test.tsx header so PR5e/PR5f authors take the
direct-mount approach for any test that depends on edge
rendering.
Next slice
PR5e — Fan-Children Stack rendering. FanCard + adapter conspire
to render N identical-subtree fan children as a single stacked
card. Builds on the slotIndex field already on edge data (PR5a)
+ parentKind=fan suppression (this PR).
The Fan-Children Stack: when an attempt-axis fan has N >= 2
structurally-identical children, the FanCard renders a single inline
summary ("Send ×10, 9 ✓, 1 ⚠") instead of N separate child cards on
the canvas. Auto-collapse for N > 3 by default; operator-toggleable
via a ⊞/⊟ button on the FanCard. Builds on the slotIndex + parentKind
edge data from PR5a/d and the chip-suppress-on-parent=fan rule
from PR5d.
What ships
- frontend/src/components/Tree/fanStack.ts
- `isStackable(tree, fanId)` — predicate: fan kind +
attempt axis + N >= 2 + all children's subtrees structurally
identical (recursive shape + kinds; params + state may differ
per the design's "execution may differ" note).
- `defaultCollapsedFanIds(tree)` — subset of stackable fans
with N > 3 (the auto-collapse threshold). Smaller stackable
fans render expanded by default but can be manually collapsed.
- `computeStackAggregate(tree, fanId)` → `{ childKind, total,
byState: Record<NodeState, number> }`. Total + per-state
counts feed the FanCard's collapsed body.
- All functions are pure tree-walkers — no react, no DOM, no
side effects. Indexed via a per-call children-by-parent map
so the predicate is O(tree-size) even on deeply-nested fans.
- frontend/src/components/Tree/stackCollapseContext.ts
- `StackCollapseContext` (React context, default null).
- `StackCollapseValue = { collapsedFanIds: ReadonlySet<NodeId>;
toggleStack: (fanId) => void }`.
- `useStackCollapse()` returns the value or null when no
provider is mounted — cards rendered outside a TreeCanvas
(per-card tests) skip the toggle entirely.
- frontend/src/components/Tree/conversationTreeToReactFlow.ts (modified)
- New `TreeFlowAdapterOptions { collapsedFanIds? }` parameter.
- When `collapsedFanIds` is supplied, the adapter:
(a) walks each collapsed fan's subtree and collects all
descendant ids (`collectHiddenDescendants` BFS)
(b) drops those descendants from `nodes` AND drops every
edge whose source or target is hidden
(c) attaches `stackedSummary: StackAggregate` to the
collapsed fan's `data` so FanCard renders the stack
body
- The FanNode discriminant's data type widened to
`{ node: FanNode; stackedSummary?: StackAggregate }`. Backwards-
compatible: existing per-kind component types are unchanged.
- Omitted/empty `collapsedFanIds` behaves identically to PR5d
(no collapse, full tree). The EMPTY_SET sentinel skips the
filter entirely for the no-op case.
- frontend/src/components/Tree/nodeCards.tsx (modified)
- FanCard reads `data.stackedSummary` and renders an inline
`StackSummaryBody` component when present: shows
"send ×10" + a status line `4 ✓, 1 ●, 1 ⚠, 5 ⧖` built
from the aggregate's byState counts.
- FanCard renders a `StackToggleButton` when
`useStackCollapse()` returns a non-null value. Button
icon flips between ArrowMinimizeRegular (currently
expanded → click to collapse) and ArrowMaximizeRegular
(currently collapsed → click to expand).
- aria-label flips ("Collapse to stack" / "Expand stack")
so the accessible name matches the action the click will
perform.
- frontend/src/components/Tree/nodeCards.styles.ts (modified)
- Added `stackSummary` slot (dashed-border body inside the
fan card; flex-col with kind label + status line),
`stackKindLabel` (semibold), `stackStatusLine` (monospace,
subdued color).
- frontend/src/components/Tree/TreeCanvas.tsx (modified)
- Owns the `collapsedFanIds: Set<NodeId>` state via useState,
seeded from `defaultCollapsedFanIds(tree)`.
- `lastTreeId` sentinel watches `tree.id`; when the operator
swaps to a different tree, the collapse state is reseeded
from that tree's default-collapsed set. (The runner mutates
trees in place during waves, so we watch the id, not the
reference.)
- `toggleStack(fanId)` flips the fan id's membership; passed
into the context value.
- StackCollapseContext.Provider wraps ReactFlow alongside the
existing ActionCallbacksContext.Provider.
- The adapter is now called with `{ collapsedFanIds }`; the
useMemo deps include the set so toggles re-adapt.
- frontend/src/components/Tree/fanStack.test.ts
- 20 tests across three describes: `isStackable` (positive,
negative, kind/axis/N edge cases), `defaultCollapsedFanIds`
(auto-collapse threshold, multi-fan), `computeStackAggregate`
(mixed states, empty/non-fan defaults).
- frontend/src/components/Tree/conversationTreeToReactFlow.test.ts (modified)
- 7 new tests for the `collapsedFanIds` adapter option:
child filtering, edge filtering, recursive subtree filter,
stackedSummary attachment, no-summary-when-uncollapsed,
omitted-options backwards-compat, multi-fan independence.
- frontend/src/components/Tree/fanStackCanvas.test.tsx
- 15 tests covering:
- FanCard summary body rendering (kind × count + status
line with ✓/●/⚠/⧖ counts)
- FanCard toggle button presence / context-null hiding
- toggle click → toggleStack(fanId)
- aria-label flips between Collapse/Expand
- TreeCanvas auto-collapses N>3 stackable fans
- N=3 NOT auto-collapsed (boundary)
- converter-axis NOT auto-collapsed (predicate excludes)
- toggle round-trip: collapse → expand → collapse
- tree-id-change reseeds the collapse state
Notable shape decisions
- Stack state lives at TreeCanvas, not on the domain tree. The
collapse decision is a UI affordance, not authoring state —
mutating ConversationTree to track it would leak through the
runner contract + persistence layer. TreeCanvas-internal
state means the host doesn't see it, and a tree-id swap
correctly reseeds without needing collapse-state migration.
- Adapter takes `collapsedFanIds` as an OPTION, not a required
param. Existing callers (PR5a-d tests, any future caller that
doesn't care about stacks) keep working. Empty set = no-op
fast path; the adapter doesn't even build the hidden-id set
when nothing is collapsed.
- `stackedSummary` lives on `data`, not as a separate prop. The
adapter attaches it to the fan node it emits; the FanCard
consumes via NodeProps. This keeps the prop surface for
cards stable (they all share the `NodeProps<TreeFlowNode>`
shape) and the summary is automatically available via the
same discriminated-union narrowing PR5b set up.
- childKind in StackAggregate is nullable (returns null on
empty/non-fan). Defensive against the orphan-fan case (a
FanNode with zero children — which the stackable predicate
rejects, so this is unreachable in practice, but the type
allows a clean default).
- Auto-collapse threshold is N > 3 (matches the design doc).
Below threshold the stack renders expanded by default but
the operator CAN collapse manually via the toggle. This
matches the spec's "Collapse to Stack is auto-applied when
N>3 and all children are structurally identical; otherwise
expanded" language.
- Status line uses ✓ / ● / ⚠ / ⧖ glyphs matching the
operator-facing convention from the wave-status banner spec
(PR6 will reuse these). Lumps `failed + cancelled` into the
⚠ count because both are operator-visible problem states;
lumps `draft + edited + stale` into the ⧖ pending count
because all three mean "not yet executed." Counts that are
zero are omitted from the status line (no noise like
"4 ✓, 0 ●, 0 ⚠").
- Toggle icon flip: ArrowMinimizeRegular when expanded (the
action is "minimize / collapse"), ArrowMaximizeRegular
when collapsed (the action is "maximize / expand"). aria-
label matches: "Collapse to stack" or "Expand stack." The
icons match the action operators will take, not the current
state — operators click the button to do a thing, not to
describe a state.
- StackSummaryBody renders inside the FanCard body, not as
a sibling card or a dedicated stack-card node. The spec's
ASCII art shows the stack summary nested inside the fan
card's border, which matches this implementation. Avoids
an extra react-flow node + edge for the stack, which would
double the canvas DOM cost.
TDD narrative
Three test files in sequence:
1. fanStack.test.ts — pure predicate + aggregate (20 tests).
RED was TS2307 on './fanStack'. Implementation
straightforward: recursive subtree structural-equality
walk + per-axis filter.
2. conversationTreeToReactFlow.test.ts additions (7 tests)
for the adapter's new option. Implementation involved
widening TreeFlowNode's fan arm to carry the optional
summary + adding the descendant-filter pass.
3. fanStackCanvas.test.tsx (15 tests) — the FanCard and
TreeCanvas wiring. RED was on the new ⊞/⊟ button +
stack-summary body (cards don't render either today).
Implementation: extended FanCard with StackSummaryBody
+ StackToggleButton helpers; TreeCanvas owns the state
+ provider.
All three suites green on the first implementation run after
the type-check pass (one lint warning for an unused `styles`
binding in FanCard — eliminated by deleting the binding).
Defects surfaced during TDD
- First FanCard implementation pass had both `const styles =
useNodeCardStyles()` at the FanCard level AND inside the
helper components, with the parent's binding unused. Lint
caught it (no-unused-vars). Removed the parent's binding;
each helper calls the hook itself.
- Stray .github/workflows/frontend_tests.yml diff appeared in
git status (a one-character whitespace change someone made
out-of-band). Reverted before commit so this PR's diff is
Tree-component-only.
Verification
Tests: 1017 frontend passing (975 prior + 42 new: 20 fanStack +
7 adapter + 15 fanStackCanvas). Backend unchanged.
Lint: clean.
Type-check: clean (main + contract).
Coverage: src/components/Tree directory 96.91 / 90.47 / 100 /
98.05 (was 97.14 / 89.87 / 100 / 97.08 in PR5d — branch +
line coverage both improved).
fanStack.ts: 95.83/90.9/100/98 (the one uncovered line is
the orphan-fan `total === 0` early-return in
computeStackAggregate — unreachable when called via the
adapter because the predicate filters first)
stackCollapseContext.ts: 100/100/100/100
TreeCanvas.tsx: 100/100/100/100
conversationTreeToReactFlow.ts: 94.73/89.28/100/96
(improved from 90.9/77.77/100/90.47; the lone uncovered
branch is the `collapsedFanIds.size === 0` fast-path
skip of `collectHiddenDescendants` — exercised by every
existing test but istanbul under-counts default-param
branches)
nodeCards.tsx: 98.3/92.3/100/100
Open rubber-duck items still pending (unchanged from PR5d)
- DTO original_prompt_id nullability.
- Citation-strip discipline (legacy partition.ts + wave.ts +
shim.ts refs; new code stays clean).
- CI gate for the 126 latent test type errors.
- PR1 backward-compat fallback corpus verification.
- PR4e cancelQueued-race.
- dispatch.ts branch coverage at 50% (pre-existing).
- Spec drift docs.
- shim drain loop call-stack serialization.
Next slice
PR5f — Pick / Unpick. The FanCard's MetaRow "pick: slot N"
display from PR5b lands its interactive twin: clicking a fan
child (in the expanded view, or a member of the stack summary)
fires `onPickFanChild(fanId, slotIndex)` and writes
`FanNode.params.promotedChildSlotIndex`. V1.0 visual is "dim
non-promoted children to ~40% opacity" per the spec's
V1.0-simplification of §3.3.
Interactive Pick / Unpick for fan children. Per-child toggle icon on
the action rail (CheckmarkCircle outline = pickable, filled =
currently picked). Click toggles: picking switches to the clicked
slot; clicking the currently-picked icon unpicks (passes null).
When the fan is in the collapsed Stack state (PR5e), the FanCard
renders a "Pick…" dropdown that lists each member by slot + state;
clicking an item picks (or unpicks if already promoted) without
having to expand the stack first.
Visual: the promoted child gets a brand-color outline (similar to
selection); siblings dim to 40% opacity per the V1.0-simplification
of the design's §3.3. V1.0 Pick is visual only — the runner doesn't
read `promotedChildSlotIndex` yet; V1.1+ scopes Refresh and Stack-
edit to the picked attempt.
What ships
- frontend/src/components/Tree/fanStack.ts (modified)
- StackAggregate gains a `members: StackMember[]` field built
in slot-index order from the tree's edges. The collapsed-stack
Pick popover (PR5f) reads it without doing a tree walk at
render time. Exported new `StackMember` interface.
- frontend/src/components/Tree/conversationTreeToReactFlow.ts (modified)
- New `FanChildInfo` interface (exported): `parentFanId`,
`slotIndex`, `promoted`, `dimmed`.
- Every TreeFlowNode kind's data widened with optional
`fanChildInfo?: FanChildInfo`. Cards consume it; non-fan-
children carry undefined.
- Adapter builds a `fanChildIndex: Map<childId, {parentFan,
slotIndex}>` from tree.edges (one O(edges) pass) then for
each node looks up the entry + computes
`promoted = parentFan.params.promotedChildSlotIndex === slot`
and `dimmed = parentFan.params.promotedChildSlotIndex !== null
&& != slot`. No per-render tree walks.
- frontend/src/components/Tree/actionRail.tsx (modified)
- `ActionCallbacks` gains `onPickFanChild?: (fanNodeId,
slotIndex | null) => void`. Null = unpick.
- `ActionRailProps` gains optional `fanChildInfo?: {
parentFanId, slotIndex, promoted }`. When supplied AND
`onPickFanChild` is wired, the rail renders a
CheckmarkCircle toggle button (outline = pickable, filled =
picked). Click:
- promoted → invokes onPickFanChild(parentFanId, null)
- not promoted (or sibling picked) → invokes
onPickFanChild(parentFanId, slotIndex)
- aria-label flips between "Pick this attempt" and "Unpick this
attempt" to match the action the click will perform.
- frontend/src/components/Tree/nodeCards.tsx (modified)
- CardFrame gains optional `fanChildInfo?: FanChildInfo`
threaded from every card's `data.fanChildInfo`. CardFrame:
- emits `data-dimmed` + `data-promoted` attributes on the
wrapper for DOM scoping
- applies `frameDimmed` (40% opacity) when dimmed=true
- applies `framePromoted` (brand-color outline + shadow)
when promoted=true
- passes `fanChildInfo` into ActionRail
- Every card (Root, Import, UserTurn, Send, Fan, Score) now
forwards `data.fanChildInfo` into CardFrame.
- FanCard's "pick: slot N" MetaRow gains a `title` attr
clarifying "Visual focus only. Future releases will scope
Refresh and Stack-edit to the picked attempt." (per the
rubber-duck's E directive — sets correct operator expectation
against the cherry-pick-metaphor disappointment).
- MetaRow gains an optional `title?: string` prop.
- New `StackPickButton` helper component (inside nodeCards.tsx):
Fluent Menu with "Pick…" trigger that lists each stack member
by slot + state. Currently-picked item shows
`✓ (picked)` and clicking unpicks. Renders ONLY when the
fan is collapsed (stack state) AND onPickFanChild is wired.
- frontend/src/components/Tree/nodeCards.styles.ts (modified)
- New `frameDimmed` (opacity: 0.4) and `framePromoted`
(brand-color border + 2px shadow) slots.
- frontend/src/components/Tree/fanPick.test.tsx (NEW)
- 25 tests across seven describes:
1. adapter — fanChildInfo on fan children (4 tests)
- present on fan children, absent on non-fan children
- promoted/dimmed flags reflect promotedChildSlotIndex
2. computeStackAggregate — members list (2 tests)
3. CardFrame — fan-child dim / promoted (3 tests)
4. ActionRail — Pick toggle (7 tests)
- presence / suppression (callback absent, non-fan child,
callback wired)
- aria-label "Pick this attempt" vs "Unpick"
- click semantics (pick own slot; unpick when promoted)
5. FanCard — pick MetaRow tooltip (2 tests)
6. FanCard — collapsed-stack Pick popover (5 tests)
- presence/suppression, menu opens with N items, click
invokes onPickFanChild, promoted item shows "(picked)"
+ unpicks on click
7. TreeCanvas — Pick round-trip via per-child icon (2 tests)
- click invokes callback with slot index
- promoted child renders data-promoted=true; siblings
render data-dimmed=true
- frontend/src/components/Tree/fanStack.test.ts (modified)
- Two existing strict-equality assertions on
computeStackAggregate updated to include the new `members`
field.
Notable shape decisions (with reviewer rationale)
Per a rubber-duck pre-implementation review:
- VERDICT FROM REVIEWER: ship-with-these-specific-changes.
- Toggle on the per-child icon, NOT separate Unpick affordance
on the FanCard. Reviewer (B): "The operator's mental model is
'I picked child microsoft#3.' The reversal is 'unpick child microsoft#3,' not
'go to the parent and clear its pick field.'" Filled-when-
picked / outline-when-not is the radio-with-clear primitive
every operator already knows.
- Collapsed-stack Pick popover NON-NEGOTIABLE per reviewer (C):
"The most common workflow is run-N-attempts, pick the best.
Hiding Pick when the stack is collapsed → four clicks per
decision → feature gets used twice then abandoned." The popover
is two clicks (open, pick) and keeps the canvas stable.
- `CheckmarkCircleRegular` / `CheckmarkCircleFilled` glyph pair
per reviewer (F): "The most boring choice and the most honest
one: 'this one is selected as the chosen attempt.' Doesn't
oversell the V1.0 semantics, pairs cleanly as filled/outline
for toggle state, doesn't collide with the V1.1 SendCard
stubs (🎯 Change-target, ★ Pin-as-main-path)."
- Both `promoted` AND `dimmed` on adapter output (D): two flags
emitted from the adapter, CardFrame reads both. Reviewer:
"Don't derive promoted at render time from data.dimmed ===
false && parent.promotedChildSlotIndex !== null; just emit
both flags from the adapter."
- MetaRow tooltip with explicit V1.0-visual-only note (E):
"Pick (V1.0): visual focus only. Future releases will use this
to scope Refresh and Stack-edit." Sets correct operator
expectation against the cherry-pick-metaphor disappointment.
- Single callback `onPickFanChild(fanId, slotIndex | null)`
instead of per-action pair (Pick/Unpick). Host's tree-edit
logic is one function; splitting would force two wrappers.
Null = unpick is the explicit signal.
- Spec confirms single-select: "Promotion stays single-valued
per FanNode; branching is the answer when the operator wants
'but I also want to see what attempt microsoft#7 leads to.'" V1.1
doesn't change this — gesture model carries forward.
Other decisions:
- StackAggregate.members in slot-index order (sorted by adapter)
so the popover displays attempts in the order operators
expect ("attempt #0, microsoft#1, microsoft#2…"). Slot indices match what the
runner uses for hashing and what the FanCard MetaRow displays.
- StackMember includes `state` so the popover items can show the
per-attempt status (clean / failed / running). The operator's
"pick the best" workflow needs to see which attempts succeeded
before clicking. PR5f shows the state name; PR5g+ may swap to
a per-state glyph.
- Auto-clear on promoted-child delete (per reviewer G3) is
documented as a host-side contract in the onPickFanChild
JSDoc. The UI doesn't have to know about it — the host's
tree-mutation layer enforces. If the host violates, the
FanCard MetaRow shows "pick: slot N" referring to a missing
slot but doesn't crash.
- The collapsed-stack popover trigger uses the
CheckmarkCircleRegular icon AND a "Pick…" label (with
ellipsis). The ellipsis follows the standard convention that
a click opens a menu rather than committing an action.
TDD narrative
Single test file (fanPick.test.tsx) with 25 cases drove the
whole PR. RED was a stack of TS errors:
- 'fanChildInfo' missing from data types (every kind variant)
- 'members' missing from StackAggregate
- 'onPickFanChild' missing from ActionCallbacks
Implementation in order:
1. Extend StackAggregate with members + StackMember
2. Add FanChildInfo + widen TreeFlowNode data
3. Build fanChildIndex in adapter + compute per-node info
4. Add onPickFanChild to ActionCallbacks
5. Extend ActionRail with fanChildInfo prop + Pick toggle
6. Extend CardFrame with fanChildInfo (data attrs + classes +
passthrough to ActionRail)
7. Thread data.fanChildInfo through every card's CardFrame call
8. Add StackPickButton helper + wire into FanCard
9. Add MetaRow title prop + V1.0 tooltip text on FanCard
All 25 tests green on first run.
Two existing fanStack.test.ts strict-equality tests broke
because of the new `members` field. Fixed with one
multi-replace.
Defects surfaced during TDD
- The strict `toEqual<StackAggregate>` pattern in pre-existing
fanStack tests is brittle against future StackAggregate
extensions. Considered loosening to `toMatchObject` but kept
strict — the strict shape forces every future contributor to
explicitly think about what's in the aggregate. Cheap to
update; loud about new fields.
- The original PR5f sketch had Unpick as a separate ✕ button on
the FanCard MetaRow. Reviewer's B finding rerouted to a toggle
on the per-child icon. The mental-model argument was decisive.
- The original PR5f sketch hid Pick entirely when the stack was
collapsed. Reviewer's C finding flagged this as the dominant-
workflow killer. Added the popover, which costs ~50 LOC and
saves operators ~3 clicks per Pick decision.
Verification
Tests: 1042 frontend passing (1017 prior + 25 new). Backend
unchanged.
Lint: clean.
Type-check: clean (main + contract).
Coverage: src/components/Tree directory 96.99 / 91.78 / 100 /
98.29 (was 96.91 / 90.47 / 100 / 98.05 in PR5e — branch
coverage improved).
actionRail.tsx: 94.11/95.45/100/100 (one uncovered line is
the showPick-false early-return of onPickClick — a no-op
guard that runs only if React invokes a stale handler)
conversationTreeToReactFlow.ts: 96/92.1/100/96.92 (the
uncovered branches are the FanChildInfo orphan-fallback +
the no-op-when-not-fan-child path)
fanStack.ts: 96.2/89.58/100/98.24 (unchanged uncovered
line is the orphan-fan total=0 early-return)
nodeCards.tsx: 98.55/94.66/100/100 (uncovered branches
live in defensive null-coalesce on optional props
— defensible)
All other modules: 100/100/100/100
Open rubber-duck items still pending (unchanged from PR5e)
- DTO original_prompt_id nullability.
- Citation-strip discipline (legacy partition.ts + wave.ts +
shim.ts refs; new code stays clean).
- CI gate for the 126 latent test type errors.
- PR1 backward-compat fallback corpus verification.
- PR4e cancelQueued-race.
- dispatch.ts branch coverage at 50% (pre-existing).
- Spec drift docs.
- shim drain loop call-stack serialization.
New from PR5f review (deferred):
- PR5e Stack expand-on-Pick-attempt vs popover-instead. Reviewer
flagged that auto-expand-on-Pick-intent would also work; we
chose the popover instead because it keeps the canvas stable.
If operators report "I want to see the expanded children when
I pick," revisit then.
- Keyboard accelerator for Pick (tabIndex + 'P' shortcut).
Reviewer noted this is genuinely good for power operators;
deferred to PR5f.1 or a follow-up sub-PR since it's additive
to the visible affordance.
Next slice
PR5g — Buchheim-Walker layout via d3-hierarchy. Overrides the
PR5a (0,0) placeholder positions with computed coordinates so
the tree renders top-to-bottom without manual zoom. The layout
pass also needs to account for the collapsed-stack FanCard
height (taller when stack summary + Pick popover are present)
so siblings don't overlap.
V1.0 layout: plain `d3-hierarchy.tree()` over the adapter's node +
edge output. Overrides the (0,0) placeholder positions emitted by
the adapter (PR5a/d) with computed coordinates so the canvas renders
top-down without manual zoom. Main-path pinning is V1.1 and not part
of this PR. The collapsed-fan filter (PR5e) already trims hidden
descendants from the layout input, so d3-hierarchy never sees them.
What ships
- frontend/src/components/Tree/layoutTree.ts (NEW)
- `layoutTree(nodes, edges, options?)` → `Map<nodeId, {x, y}>`.
Pure function over the adapter's output; no React, no DOM, no
hidden state.
- Builds the parent→child relation from `edges` rather than from
each node's domain-level `parentId`. This makes the function
consume the SAME filtered view react-flow gets — collapsed-fan
descendants were already dropped by the adapter.
- Uses `d3-hierarchy.stratify()` to build the hierarchy from the
flat node + edge lists, then `tree().nodeSize([w, h])` for
Buchheim-Walker layout with configurable per-node block size.
- Defaults: horizontalSpacing=220 (matches the card min-width),
verticalSpacing=140 (generous room for card height + action
rail + meta rows). Both overridable via the options bag.
- Defensive fallbacks:
- empty input → empty Map
- cycle / orphan-only input (no roots) → every node at (0,0)
so the canvas doesn't crash; bug surfaces visually
- `LayoutNode`, `LayoutOptions` types exported for the
TreeCanvas consumer + future testing.
- frontend/src/components/Tree/layoutTree.test.ts (NEW)
- 17 tests across eight describes:
1. coverage — every node gets a coord; filtered subset only;
single-node tree; zero-node defensive
2. top-down orientation — root at smallest y; siblings share y
3. linear chain — vertical line (every node shares x)
4. sibling placement — distinct x; middle child centered over
parent; left-to-right ordering matches insertion
5. determinism — identical input → identical output
6. nested subtree separation — sibling subtrees do not collide
7. configurable spacing — vertical + horizontal scale correctly
8. TreeCanvas integration probe — produces non-(0,0) coords
- frontend/src/components/Tree/TreeCanvas.tsx (modified)
- New `useMemo` over `layoutTree(rawNodes, edges)`. Maps each
adapter-emitted node through the position lookup; falls back to
the adapter's placeholder for any node the layout pass didn't
cover (defensive — should never happen).
- The memo's deps are the adapter-output references, NOT the
tree/collapsedFanIds — so a re-render that doesn't change
shape (e.g., a callback-prop change) doesn't re-layout.
- frontend/jest.config.ts (modified)
- Added `moduleNameMapper` entry redirecting `d3-hierarchy` to
its UMD bundle at /node_modules/d3-hierarchy/dist/. The npm
package ships ESM as its main entry; ts-jest's transform only
hits `.tsx?` and Jest's CJS require trips on the ESM `import`
statements. The UMD bundle works under CJS without any
transformer. Production (Vite) keeps the ESM path — only the
jest transform sidesteps it.
- frontend/package.json (modified)
- `d3-hierarchy` ^3.1.2 as a runtime dep.
- `@types/d3-hierarchy` ^3.1.7 as a devDep.
- One new transitive (d3-hierarchy itself; no further chain).
Notable shape decisions
- Edges as the parent-relation source. The adapter's node has a
domain `node.parentId` deeply nested in `data.node`, but the
edge list is the post-filter view (collapsed-fan descendants
are absent from BOTH nodes AND edges). Consuming edges keeps
the layout pass aligned with the adapter's filter rules without
needing the layout to re-implement them.
- `tree().nodeSize([w, h])` instead of `tree().size([W, H])`.
`nodeSize` makes the per-node block fixed and lays out within
that — the canvas can grow to fit the tree. `size` would scale
the whole layout to fit a fixed bounding box, which clips on
large trees. The spec calls for "tight packing"; nodeSize is
the right primitive.
- No main-path pinning (V1.1). Per spec §4.3, V1.0 ships layer 2
(plain Buchheim-Walker) only. Layer 1 (main-path centerline
pinning) requires the SendCard's ★ Pin icon which is also V1.1.
Both land together in V1.1 — adding layer 1 here would create
a V1.0 surface that nothing else V1.0 reaches.
- No adaptive collapse (the third layer per §4.3). PR5e already
ships the Fan-Children Stack collapse (the only "adaptive"
layer V1.0 needs); the adapter pre-filters its descendants
before this layout pass sees them.
- Defensive cycle fallback (no roots → every node at origin)
rather than throwing. The spec doesn't require the canvas to
crash on a malformed view; visual overlap is more diagnostic
than a runtime exception that breaks the whole tree view.
- moduleNameMapper redirect for d3-hierarchy instead of widening
transformIgnorePatterns. The transform path required adding
`d3-.*` to the negative-lookahead AND extending the `transform`
config to cover `.jsx?` (ts-jest doesn't transform JS by
default). The mapper is one config line and doesn't perturb
other modules' transform pipeline.
- Multi-root forest path explicitly dropped. The original sketch
had a "lay each root out, translate them so they don't overlap"
loop. The adapter never produces multi-root trees (V1.0 domain
contract: one root per tree), so the path was dead code. The
cycle fallback covers the "no roots" case; the impossible
"multiple roots in a well-formed tree" case is treated
identically (each root laid out at the origin; visual overlap
surfaces the malformation).
TDD narrative
Single test file (layoutTree.test.ts) with 16 cases drove the
implementation. RED was TS2307 on './layoutTree'.
Implementation had two pivots:
1. First implementation pass used `import { stratify, tree }
from 'd3-hierarchy'`. Jest barfed: d3-hierarchy is ESM-only
and ts-jest's transform only hits TypeScript files. Tried
widening transformIgnorePatterns to allow-list `d3-.*` —
didn't help because ts-jest still doesn't have a `.js`
transform. Switched to moduleNameMapper redirecting to the
UMD bundle.
2. The "asymmetric subtree separation" test as originally
written expected u2 to sit fully OUTSIDE u1's subtree
x-range. Wrong: Buchheim-Walker keeps siblings at the same
depth (u2 at depth 1, u1's grandchildren at depth 4); u2's
x can sit between u1's grandchildren x-range because they're
at different y's. Rewrote the test to assert disjoint-x at
the SAME depth (u1 vs u2 at depth 1).
Added the malformed-cycle coverage test after the initial
implementation to push branch coverage on the no-roots fallback
above the 85% threshold.
Defects surfaced during TDD
- The "different y means non-colliding" insight from the second
pivot is important for the V1.1 main-path-pinning layer
(different y is exactly what main-path will exploit). Worth
a note for the V1.1 implementer.
- d3-hierarchy's ESM-only publishing pattern WILL recur every
time we add a d3-* package. The moduleNameMapper indirection
is per-package; documented inline so the next d3 package gets
the same treatment.
- The first-write of the layout pass took ~1ms for a 60-node
tree (measured ad-hoc via `console.time` during local
bring-up). Well within budget; the V1.0 1000-node soft cap
is a non-issue at this perf level. Worth re-measuring if
PR6's wave-status banner forces frequent re-layouts.
Verification
Tests: 1058 frontend passing (1042 prior + 16 new + 1 extra for
coverage = 17 new layoutTree tests). Backend unchanged.
Lint: clean.
Type-check: clean (main + contract).
Coverage: src/components/Tree directory 96.49 / 90.57 / 100 /
98.55 (was 96.99 / 91.78 / 100 / 98.29 in PR5f — statements
+ lines slightly higher, branches dipped because of new
defensive fallbacks).
layoutTree.ts: 93.33/82.6/100/100 (the uncovered branches
are defensive guards: visibleIds.has() short-circuits + the
out.has() dup-suppression — both unreachable in well-formed
input).
All other modules: unchanged from PR5f.
Global coverage gate: passes (no threshold-fail at end of
coverage run).
Open rubber-duck items still pending (unchanged from PR5f)
- DTO original_prompt_id nullability.
- Citation-strip discipline (legacy refs in runner files).
- CI gate for the 126 latent test type errors.
- PR1 backward-compat fallback corpus verification.
- PR4e cancelQueued-race.
- dispatch.ts branch coverage at 50% (pre-existing).
- Spec drift docs.
- shim drain loop call-stack serialization.
- PR5f deferred items (auto-expand on Pick-attempt, keyboard
accelerator).
Next slice
PR5 is now feature-complete (PR5a-g + b.1 review). Per the
rubber-duck schedule, fire the reviewer on the full UI surface
(PR5a-g) before starting PR6. PR6 wires:
- cost-guardrail modal (intercepts Refresh at confirmThreshold)
- ↻ cost-preview tooltip
- wave-status ribbon (canvas-level)
- wave-complete toast with the 5-bucket summary
- [Retry failed] button + reflog drawer
Before PR6, the integration question to settle: does the layout
pass need to re-run on per-leaf state pings (e.g., a Send
flipping to "running" during a wave)? Today it does NOT — the
layout memo deps are the adapter output references, and state
changes don't change shape (so the adapter returns the same
reference). Verify this stays true when PR6 wires the
runner-state-sink integration into TreeCanvas.
…hape (PR5h.1 review)
Per PR5 rubber-duck reviewer findings B + D (bundled fix): the adapter
mixed pure shape mapping with render-time policy (collapse filter +
stackedSummary), and the TreeCanvas layout pass re-ran on every tree
ref change. The latter becomes a 60-leaf-wave layout cliff once PR6
wires the runner sink — every state flip would re-layout the canvas.
- `applyStackCollapse(shape, tree, collapsedFanIds): TreeFlowAdapterResult`
new module owning the Fan-Children Stack collapse policy:
- filters descendants of collapsed fans + edges into/out of them
- clones collapsed fan nodes with `stackedSummary` attached (no
input mutation)
- identity behaviour: empty `collapsedFanIds` returns the input shape
- `conversationTreeToReactFlow(tree)` simplified to a pure shape pass:
- `TreeFlowAdapterOptions` interface removed; no `collapsedFanIds`
option; no `stackedSummary` attachment; no `computeStackAggregate`
import
- still attaches `parentKind` per edge and `fanChildInfo` per fan-
child (these are stable derivations from the input, not UI policy)
- `TreeCanvas` new pipeline: `adapter → applyStackCollapse → layout`,
with `useShapeMemoizedLayout` hook keying layout on a derived
shape-key (`nodes.length:ids|edges.length:ids`). Cached positions
are returned by reference across renders where shape is unchanged.
- `fanChildInfo` stays in the adapter (not in collapse pass). It
depends on `promotedChildSlotIndex` (per-Pick state), so the
adapter is NOT shape-reference-stable across Pick clicks. That's
fine: the layout memo uses a shape-key (string), so even when the
adapter returns a new ref, layout doesn't re-run unless the
shape-key changes. Putting `fanChildInfo` in collapse would force a
deeper split (separate decoration pass) for no operator-observable
benefit.
- `applyStackCollapse` early-returns the input shape on empty
`collapsedFanIds` — preserves shape identity for the common case,
saves a defensive clone-and-filter pass.
- `useShapeMemoizedLayout` uses `useRef`+conditional-assignment-during-
render rather than `useMemo` over the shape-key. `useMemo` would
recompute when the underlying `nodes`/`edges` refs change (state
flips); the ref cache returns the cached Map even when callers pass
new arrays.
- `hidden as unknown as ReadonlySet<string>` cast at the call site:
react-flow's `Edge.source/target` are plain strings; `hidden` is
branded `ConversationTreeNodeId`. Membership is structural at
runtime; local cast avoids per-call brand annotations.
- RED: created `applyStackCollapse.test.ts` (10 tests covering
identity, single + multiple fan collapse, descendant filtering,
edge filtering, `stackedSummary` aggregation, and explicit purity
on shape and tree). Confirmed TS2307 cannot-find-module via
`tsc --noEmit -p tsconfig.test.json`.
- GREEN: implemented `applyStackCollapse.ts`. 10/10 pass.
- RED for memoization: added 2 tests to `TreeCanvas.test.tsx`
spying on `layoutTree`. The "runs once across state-only re-renders"
test asserts call count stays constant after 2 re-renders with new
tree refs but same shape (id, kind, edge structure). The "re-runs
when shape changes" test asserts call count grows when a node is
added.
- GREEN: refactored `TreeCanvas` to the new pipeline + extracted
`useShapeMemoizedLayout`. Both memoization tests pass.
- Migrated the moved collapse-option describe block out of
`conversationTreeToReactFlow.test.ts` (those tests now live in
`applyStackCollapse.test.ts`). Updated `layoutTree.test.ts`'s one
callsite that used the old `{ collapsedFanIds }` option to call
through `applyStackCollapse` instead.
- Initial type-check failure: my `hidden.has(e.source)` and
`hidden.has(n.id)` calls in `applyStackCollapse.ts` were branded-
vs-string mismatches (react-flow's `Edge.source` is `string`;
`hidden` is `ReadonlySet<ConversationTreeNodeId>`). The original
adapter avoided this because it filtered `tree.edges` directly
(branded). Resolved with a local `hiddenStrings` view cast at the
function head, keeping the brand discipline at the boundary.
- `npm test --no-coverage`: 1064 passed, 1064 total (was 1059;
+10 new applyStackCollapse tests; -7 moved out; +2 new TreeCanvas
memo tests)
- `npm run type-check`: 0
- `npm run type-check:contract`: 0
- `npm run lint`: 0 warnings
- `npm run test:coverage`:
- `TreeCanvas.tsx`: 100/90/100/100 (was 100/100/100/100; the
new hook has a one-branch first-call path uncovered)
- `applyStackCollapse.ts`: 97.43/95/100/100
- `conversationTreeToReactFlow.ts`: 95.12/89.47/100/94.59
(slightly down from 96.49 because the dropped collapse code
lost some branches it used to cover)
- `layoutTree.ts`: 93.33/82.6/100/100 (unchanged)
- All exceed 85/85/90/90 threshold.
PR5h.2: `InsertEdge` discriminant-lie fix — replace `kind: 'fan_attempt'`
typed-literal-for-disabled-items with a discriminated `InsertMenuOption`
union (`{ disabled: false; kind; label }` vs `{ disabled: true; label;
reason }`). Reviewer Finding H#1; ~10 LOC plus tests.
From PR5 review:
- PR5h.2: InsertEdge discriminant safety (H#1) -- NEXT
- PR5h.3: edgeInsert.test.tsx vacuous-pass guard (J#4)
- PR5h.4: action rail hover-gate CSS (J#6)
- PR5h.5-7: editing affordances (UserTurn Edit + RootPrompt Edit +
UserTurn Converter palette) -- per user Q1 hybrid decision
- Per-child Pick toggle drift -- amend spec §2.4 end-of-V1.0 doc pass
(user Q2 = keep + amend)
- V1.1 disabled stubs skipped (user Q3 = c; spec §2.4 amendment)
- `💬 View raw response` dropped from V1.0 (user Q1 hybrid)
- TreeCanvas synchronous-setState-during-render (rev H#2) -- track
- Auto-collapse seed only re-runs on tree.id change, not shape change
within same tree (rev J#1) -- track for PR6
- `toFlowEdge`'s `parentKind ?? 'root_prompt'` silent default (rev J#2)
- Tooltip `relationship="description"` on icon-only buttons (rev J#3)
- Card decomposition (450 LOC; rev J#5) -- defer to start of PR6a
- Spec drift docs (drain-outside-lock, retry-failed-in-shim,
LockAcquireResult DU, SendCard inline lastError) -- end-of-V1.0
- shim drain loop call-stack serialization -- PR6 wave-status surface
- Stack-collapse persistence across un-stackable transitions -- PR6
- dispatch.ts 50% branch coverage -- pre-existing PR4c2; not gating
…ty (PR5h.2 review)
Per PR5 reviewer Finding H#1: the V1.1-disabled "Fan out: prompt
(coming later)" and "Fan out: target (coming later)" items each
carried `kind: 'fan_attempt' as const` — a typed lie ("discriminant
is unused on disabled items"). The compiler stayed happy because the
old `InsertMenuOption` shape required `kind: EdgeInsertKind` and
treated `disabled` as optional. When V1.1 adds `'fan_prompt'` and
`'fan_target'` to `EdgeInsertKind`, a flag-flip from `disabled: true`
to `disabled: false` would silently dispatch the wrong axis until
someone notices the literal-`'fan_attempt'` left behind.
## What ships
- `InsertMenuOption` is now exported and a discriminated union:
type InsertMenuOption =
| { readonly disabled: false; readonly kind: EdgeInsertKind; readonly label: string }
| { readonly disabled: true; readonly label: string; readonly disabledReason: string }
The disabled arm has NO `kind` field. Enabling a disabled item
forces a same-commit choice of a real `kind`; an editor that
merely flips the discriminant gets a type error.
- All `menuForParent` callsites updated to construct each option
with the explicit `disabled: false | true` discriminant. Disabled
items lose their stale `'fan_attempt' as const` placeholder.
- Render handlers narrow via `if (opt.disabled === false) handleSelect(opt.kind)`
in both basic and fanAxes loops, replacing the previous truthy-
short-circuit `() => !opt.disabled && handleSelect(opt.kind)`.
The explicit narrowing satisfies the discriminated union — the
short-circuit form does not (TS sees `opt.kind` as possibly absent
on the disabled arm).
## TDD narrative
- RED: created `insertMenuOption.test.ts` with 5 cases:
1. disabled item without kind compiles.
2. enabled item with kind compiles.
3. disabled item with kind fails to compile (asserted via
`@ts-expect-error`).
4. enabled item without kind fails to compile (asserted via
`@ts-expect-error`).
5. narrowing: `opt.kind` accessible after `opt.disabled === false`.
`tsc --noEmit -p tsconfig.test.json` reported TS2459 (InsertMenuOption
not exported) + TS2578 unused @ts-expect-error directives (because the
unresolved import collapsed the type to `any`, swallowing the errors
the directives expected).
- GREEN: refactored `InsertEdge.tsx` to export the union + update all
construction + render-time narrowing. 5/5 new tests pass.
## Verification
- `npm test --no-coverage`: 1069 passed (was 1064; +5 insertMenuOption)
- `npm run type-check`: 0
- `npm run type-check:contract`: 0
- `npm run lint`: 0 warnings
## Next slice
PR5h.3: strengthen `edgeInsert.test.tsx` "disabled V1.1 items do NOT
invoke onEdgeInsert when clicked" — add `expect(disabledFan).toBeDefined()`
before the click so the test stops passing vacuously when no disabled
item is rendered (reviewer Finding J#4).
## Open rubber-duck items still pending
(unchanged from PR5h.1 commit body; tracking continues to PR5h.7)
…PR5h.3 review)
Per PR5 reviewer Finding J#4: two tests in edgeInsert.test.tsx had
vacuous-pass shapes that would survive a regression removing the V1.1
disabled stubs.
## What ships
- "V1.1 axes (Fan prompt, Fan target) render disabled": the old for-loop
body fired `expect()` only when an item matched the regex; an empty
filter result passed the test silently. New shape filters first, asserts
`v11Items.length >= 2`, then loops.
- "disabled V1.1 fan-axis items do NOT invoke onEdgeInsert when clicked":
the old `if (disabledFan !== undefined) { ... }` skipped the assertion
when no disabled item rendered. New shape asserts
`expect(disabledFan).toBeDefined()` before the click and non-null
asserts at the click site.
## TDD narrative — proving the strengthening is meaningful
After landing the change, I temporarily stripped both V1.1 disabled
items (the V1_0_FAN_AXES "prompt"/"target" entries + the user_turn
fan-prompt entry) from `InsertEdge.tsx` and re-ran the edge-insert
suite. Both strengthened tests failed honestly:
Tests: 2 failed, 21 passed, 23 total
> expect(disabledFan).toBeDefined()
^
at src/components/Tree/edgeInsert.test.tsx:288:25
> expect(v11Items.length).toBeGreaterThanOrEqual(2)
Before the strengthening, both would have passed silently with the same
production change.
## Verification
- Reverted the temp strip via `git checkout`; original 23/23 green again.
- `npm test -- --testPathPatterns=edgeInsert`: 23 passed.
## Next slice
PR5h.4: action rail hover-gate CSS — match spec §2.2's "rail floats below
each node card on hover/focus" by adding the visibility-on-hover gate
via `:hover [data-tree-action-rail]` + `[data-selected="true"]
[data-tree-action-rail]` selectors. Reviewer Finding J#6; ~5 LOC + 2
tests (default opacity 0, hover/selected → opacity 1).
## Open rubber-duck items still pending
(unchanged from PR5h.1/h.2 commit bodies; tracking continues to PR5h.7)
…4 review)
Per PR5 reviewer Finding J#6: spec §2.2 says "rail floats below each
node card on hover/focus" but PR5c shipped with `opacity: 1` always
("always-visible until PR5e/PR5f wire the hover behavior"). PR5e/f
landed without that follow-up. Operator-visible drift; cheap to fix.
## What ships
- `actionRail.styles.ts`: rail default `opacity: 0` + 120ms opacity
transition. The fade is gentle enough to feel intentional (no
hard-snap on hover) and short enough to not lag a keyboard-walker.
- `nodeCards.styles.ts`: frame style adds three nested selectors via
Griffel's `&` reference:
- `&:hover [data-tree-action-rail]` → opacity 1
- `&:focus-within [data-tree-action-rail]` → opacity 1 (keyboard
walker hits a rail button → frame's focus-within fires → rail
stays visible while the button has focus)
- `&[data-selected="true"] [data-tree-action-rail]` → opacity 1
(selected card keeps its rail visible regardless of pointer)
- Two new behavioral tests in `actionRail.test.tsx`:
- default opacity is 0 when card is unselected
- opacity flips to 1 when frame carries `data-selected="true"`
jsdom + Griffel honor the attribute-selector branch; `:hover` and
`:focus-within` are CSS-only and verified via code review +
Playwright follow-up (jsdom can't simulate either reliably).
## Notable shape decisions
- Hover-gate lives on the FRAME style (parent), not the rail style
(child). Griffel can't write "if ancestor :hover" from the child;
the parent's `&:hover [data-tree-action-rail]` selector is the
only honest expression.
- `transitionProperty: 'opacity'` + `transitionDuration: '120ms'` is
a visual nicety, not a behavior. It costs nothing in jsdom (no
transitions fire) and avoids a jarring snap when hovering away.
- Did NOT touch the rail's `[data-tree-action-rail]` attribute or
any other DOM contract — only the visual opacity changed.
## TDD narrative
- Tests written first: 2 new cases in `actionRail.test.tsx §5`.
Default-hidden + data-selected-visible. Both fail before the CSS
change (opacity is `'1'` for both cases under the old `opacity: 1`
rule).
- GREEN after applying the styles change.
## Verification
- `npm test --no-coverage`: 1071 passed (was 1069; +2 hover-gate)
- `npm run type-check`: 0
- `npm run type-check:contract`: 0
- `npm run lint`: 0 warnings
## Next slice
PR5h.5: UserTurn `✏ Edit text inline` — first of three V1.0 editing
affordances per the Q1 hybrid scope. Inline `<Textarea>` swap on the
user-turn card, sets `node.state = 'edited'`, propagates downstream
via the runner sink (the sink path is already wired through the
ActionCallbacks `onEditParams` slot — actually, no: PR5c shipped
`onEditParams` only as part of the common rail design; the inline-
edit path needs its own callback. Confirm before writing.).
## Open rubber-duck items still pending
PR5h mechanical fixes complete after this. Remaining V1.0 work:
- PR5h.5: UserTurn ✏ Edit text inline
- PR5h.6: RootPrompt ✏ Edit prompt + target + system prompt
- PR5h.7: UserTurn ⚡ Converter palette
- All Q-decision spec amendments (per-child Pick, dropped 💬 raw response,
no V1.1 stubs) → end-of-V1.0 doc pass
- Other carry-forwards unchanged
Wires the last PR7-review dead seam — useAutoReverse — to a real consumer: the History tab's "Open as tree" action (spec §5.12). This completes the §13.1a/§5.12 host-integration surface. Per spec line 18 + §5.12, V1.0 "Open as tree" ships linear+converter reconstruction (multi-conversation fanout detection is V1.1) — so this path uses useAutoReverse (single-AR linear), distinct from the reload path's fan-aware reconstruction (PR7g). What ships: - useAutoReverse: preserves the AR's conversation_tree_id as the reconstructed tree's id (spec §13.1). A V1.0+ AR (label present) keeps that id so subsequent reload/refresh stay consistent; a pre-V1.0 AR (no label) keeps the freshly-minted id from linearChainFromMessages. - TreeRunnerHost: new `openFromAttackResultId?` prop drives useAutoReverse; an effect pushes the reconstructed tree via onTreeChange, gated on the tree id (applied once per reconstructed AR, so the inline onTreeChange re-firing each render doesn't re-push). `autoReverseApi?` test-injectable defaults to the module attacksApi. - AttackTable: optional `onOpenAttackAsTree?` → a BranchRegular icon button next to the existing "Open attack" button. Renders only when the callback is provided (implicitly flag-gated). Threaded through AttackHistory. - App (thin glue): `openTreeFromArId` state + `handleOpenAttackAsTree` (sets the AR id + switches to the tree view); passes onOpenAttackAsTree to AttackHistory only when treeUiEnabled, and openFromAttackResultId to the host. Notable shape decisions: - The id-preservation lives in useAutoReverse (which already fetches the AR) rather than the host, so the hook is self-consistent and unit- tested. Without it, opening a V1.0+ AR would mint a fresh id that disagrees with the AR's labels → reload would find no rows and fall to greenfield. That's a real bug, fixed here. - The host reuses its existing onTreeChangeRef mirror (not a render-time ref write) for the push effect; lastOpenedTreeIdRef gates re-pushes. - "Open as tree" intentionally uses linear reconstruction for ALL ARs per §5.12 V1.0 scope; the fan-aware path is reserved for in-session reload (§9.4.1 / PR7g). Attempt-fan trees opened from History show the linear base — acceptable + spec-sanctioned for V1.0. TDD narrative: 1. useAutoReverse: +2 tests (preserve conversation_tree_id; mint fresh when absent). RED → GREEN (override tree.id from ar.labels). 2. TreeRunnerHost: +2 tests (auto-reverse emits tree via onTreeChange; no-op when openFromAttackResultId null). RED → GREEN (prop + effect). 3. AttackTable: +2 tests (button absent without callback; present + fires, row-open suppressed). RED → GREEN (prop + button). 4. App + AttackHistory: thin glue (untested per the agreed approach). Verification: - jest AttackTable|AttackHistory|useAutoReverse|TreeRunnerHost: 91/91. - npm test: 1374 pass / 70 suites (was 1368). - type-check / lint clean. This closes both PR7-review dead seams (onGuardedSwapReady in PR7i.3a, useAutoReverse here). Remaining: converter + nested fan-aware reload reconstruction (deferred V1.1 follow-up).
…ter fans (PR7g slice 3)
Extends fan-aware reload (PR7g slice 2 did attempt fans) to a single
root-level `converter` fan, using the flattened topology the operator
chose. Nested / multi-axis fans remain on the honest degraded-banner
path.
Investigation first (Explore subagent, cited): the converter-fan reverse
shape is NOT uniquely determined by the forward tree — V1.0's no-cache
dispatch sends prepended_conversation=[], so the authored `u_above`
level is never persisted. Operator decision: flatten (attach the fan
directly to root); re-execution is equivalent (each slot re-dispatches
its converter). Per-leaf message fetches accepted for V1.0.
What ships:
- autoReverse.reconstructTreeWithFans gains `converterResolver?` +
`onConverterDivergence?`. When the single fan is a root-level
converter fan AND a resolver is supplied, it assembles:
root_prompt(text) → fan(converter, variants) →
[user_turn(text, converterPipeline=slot converters) → send] × N
- root/user_turn text comes from the base leaf's first-turn
ORIGINAL value (the authored prompt, not the converted gibberish).
- per-slot converters come from reconstructVariantPayloads via the
resolver (most-frequent consensus; onConverterDivergence per
disagreeing slot).
- bails to degraded (returns null → linear) when the base lacks a
leading user turn or slots aren't contiguous 0..N-1 (no tombstone
guessing in V1.0).
- useReloadReconstruction: when the leaf set is a single root-level
converter fan, pre-fetches each member leaf's messages
(buildConverterResolver) and reads each one's first-user-turn
converter_identifiers into a resolver keyed by AR id. Non-converter
shapes skip the N fetches. Surfaces a console.warn (not the degraded
banner) when reconstructed-but-divergent slots exist.
Notable shape decisions:
- The pure function stays pure: it takes a synchronous resolver; the
async N-leaf fetch lives in the hook. This keeps reconstructTreeWithFans
unit-testable without mocking an API.
- Slot contiguity gate (0..N-1) avoids guessing deleted-slot tombstone
semantics; non-contiguous → degraded.
- Corrected the prior PR7g comment that claimed the base leaf's
converter "survives via linearChainFromMessages" — it does NOT
(root_prompt has no converterPipeline field; a first-turn converter
is dropped by linear promotion). The converter now survives via this
fan-aware path; the linear-drop for non-fan converted first-turns
remains a documented V1.x gap (root prompts don't model converters).
TDD narrative:
1. autoReverse: +5 tests — converter fan WITH resolver reconstructs the
flattened shape; original_value (not converted) used for text;
per-slot divergence → most-frequent + callback; non-contiguous slots
degrade; (kept) no-resolver degrades.
2. RED: 3 new tests failed (still degrading); 2 already passed.
3. GREEN: assembleRootConverterFan; 46/46 autoReverse pass.
4. useReloadReconstruction: flipped the old "converter degrades" test to
"converter reconstructs (no banner) + fetches each member leaf";
added a nested-fan test to preserve degraded-banner coverage. 55/55.
Verification:
- jest autoReverse|useReloadReconstruction: 55/55 pass.
- npm test: 1379 pass / 70 suites (was 1374).
- type-check / lint clean.
Remaining fan-reconstruction work (deferred V1.1): nested fans,
multi-axis-at-same-position fans, and the linear converter-on-root-prompt
modeling gap.
Code-review fixes:\n- resolve pending cost-guardrail approvals on unmount so waves do not hang behind a vanished modal\n- prevent cost and dirty-edit dialogs from stacking in the shared modal slot\n- cancel active waves on TreeRunnerHost unmount and keep StrictMode/Fast Refresh from closing the live lock manager\n- route Open-as-tree through the dirty-edit guard and suppress stale fragment reload while explicit AR reconstruction is in flight\n- page reload reconstruction with backend-safe limit=100 and fail soft when one converter leaf fetch fails\n\nBrowser-driven fixes:\n- mount Tree View directly when the URL carries conversation_tree_id\n- add React Flow controls and minimap\n- enable local node dragging while keeping ad-hoc connecting disabled and edge tab stops suppressed\n- increase default vertical layout spacing to avoid editor/response-card overlap\n- show assistant response previews on response cards from both reconstruction and live dispatch\n- wire integrated root/user-turn editing with stale propagation\n- make response-card refresh run the subtree so interior stale responses are actionable\n- compose sequential sink mutations so live response previews survive recordExecution -> clean transitions\n\nLive validation:\n- created a benign live attack against OpenAIChatTarget_gpt-4o_rr\n- opened it as a tree, edited root prompt/target, refreshed the response, and verified POST /api/attacks + POST /messages succeeded with the card updating in-session\n- verified dirty-edit guard, response preview rendering, controls/minimap, local dragging, and overlap checks in browser\n\nVerification:\n- npm run type-check\n- npm run type-check:contract\n- npm run lint\n- changed suites in-band: 10 passed / 232 tests\n- full Jest: 71 passed / 1403 tests (parallel worker teardown warning remains)\n\nOpen design follow-ups:\n- backend AttackSummary does not expose target_registry_name, so historical tree reload/open cannot recover a refreshable target without operator edit\n- edge handles currently sit visually inside cards; consider moving handles to card exterior in the visual polish pass\n- consider response truncation plus expanded/linear-detail view for long outputs\n- decide whether action rails should always show, remain hover/focus gated, or sit over the node edge
What ships:\n- wires response-card Add follow-up prompt action through TreeRunnerHost\n- wires response-card Fan out response attempts and Fan out converters actions\n- wires the existing edge insert menu callback into host-owned structural reducers\n- adds pure tree reducers for append-child, insert-between, and attempt/converter fan wrapping\n\nBehavior:\n- Add follow-up creates an edited UserTurn under the response\n- Fan out response attempts wraps the response edge in an attempt Fan and adds a fresh stale Send slot\n- Fan out converters wraps the response edge in a converter Fan and adds a new editable UserTurn -> stale Send branch\n- Edge insert menu now mutates the tree instead of remaining presentational\n\nValidation:\n- browser-verified Add follow-up prompt, attempt fan, and converter fan creation on a live tree\n- npm run type-check\n- npm run type-check:contract\n- npm run lint\n- changed suites in-band: 11 passed / 262 tests\n- full Jest: 71 passed / 1410 tests (parallel worker teardown warning remains)\n\nStill not wired:\n- converter catalog fetch + UserTurn converter palette host persistence\n- fan-child Pick/Unpick host persistence\n- clone/branch tree action\n- delete node/subtree action\n- open path in linear/chat view\n- refresh cost preview tooltip
What ships:\n- fetches converter instances in App and passes them into TreeRunnerHost\n- wires UserTurn converter palette selections through host state\n- wires fan child Pick/Unpick through host state\n- wires refresh cost preview labels using the subtree estimator that matches the Refresh action\n\nValidation:\n- changed suites in-band: 13 passed / 305 tests\n- npm run type-check\n- npm run type-check:contract\n- npm run lint\n\nStill intentionally not wired:\n- clone/branch tree action\n- delete node/subtree action\n- open path in linear/chat view\n- deeper visual polish for edge handles/action rail placement/long-message expansion
What ships:\n- moves React Flow connection handles to the card perimeter instead of visually inside the node content\n- styles handles with Fluent/brand-aware colors and a small outline\n- floats the action rail below the card edge as a compact toolbar while preserving hover/focus/selected reveal behavior\n\nValidation:\n- focused visual/component suites in-band\n- changed suites in-band: 14 passed / 334 tests\n- npm run type-check\n- npm run type-check:contract\n- npm run lint\n\nRemaining design follow-ups:\n- decide whether the action rail should eventually always show instead of hover/focus reveal\n- add long-message expand/detail behavior or explicit open-linear path
What ships:\n- wires Clone tree / Branch from here to create a new client-side tree with parentConversationTreeId\n- wires Delete for non-root subtrees\n- wires Open in linear view to a lightweight right-side path drawer\n- adds pure reducers for clone and subtree delete\n\nValidation:\n- changed suites in-band: 14 passed / 340 tests\n- npm run type-check\n- npm run type-check:contract\n- npm run lint\n\nRemaining follow-ups:\n- richer long-message expansion/detail behavior\n- fuller branch-from-node semantics beyond whole-tree clone\n- explicit delete confirmation modal if product wants destructive-action confirmation before V1.0
What ships:\n- hides Delete on root prompt cards\n- routes non-root Delete actions through a confirmation dialog\n- preserves the shared modal slot priority with cost and dirty-edit dialogs\n\nValidation:\n- changed suites in-band: 14 passed / 341 tests\n- npm run type-check\n- npm run type-check:contract\n- npm run lint
What ships:\n- adds Playwright coverage for Tree View greenfield\n- covers History Open as tree reconstruction with response previews\n- covers add follow-up prompt\n- covers response attempt fan creation\n- covers converter fan creation\n\nValidation:\n- VITE_ENABLE_TREE_UI=true playwright test e2e/tree.spec.ts --project mock: 4 passed\n- npm run type-check\n- npm run type-check:contract\n- npm run lint
What ships:\n- themes React Flow minimap, controls, and attribution with Fluent tokens\n- removes white minimap/control chrome in dark mode\n\nBrowser validation:\n- dark mode minimap: rgb(31, 31, 31)\n- dark mode controls: rgb(41, 41, 41)\n- light mode minimap/controls use light neutral surfaces\n\nVerification:\n- TreeCanvas focused tests\n- npm run type-check\n- npm run lint
Captures the next checkpoint after PR7 quality-gate work: target recovery, attempt fan count, branch-from-here subtree semantics, long-response inspection, converter fan UX, auto-layout, action rail/handle polish, delete details, past runs, wave toast, and browser coverage exit criteria.
|
This could be amazing for TAP! |
|
Does it work for multimodal content? |
|
How do I switch back and forth from this to the single chat view? |
Currently there are a lot of cases in the UI that really add to the complexity. In this case, we have the (converter/no converter) which really should be collapsed into an option on the converter fan to use/not use the nullconverter. (likely as a checkbox) |
Yeah effectively this is manual TAP!! Curious to think about what we can do to adapt a whole scenario to a conversation and be able to play with the internal messages afterward. More thought required... |
MVP no, but no reason not to. |
Another shortcoming is how the tree is currently persisted (via labels which is very hacky). Ideally the backend just has this data model directly. My long-term vision here is to unify the UIs into one tab with two splits (one side chat, the other tree) and of course could support hiding the tree view for linear chats or user preference. The current chat split in the tree tab (see on the right side) is not up to feature parity with the chat tab. |
My recommendation for now is to just use the chat split in the tree view for most things and if you cannot do a specific feature, then just fallback the entire conversation to the chat tab. |
|
Thanks for the comments, @romanlutz ! |
But a scenario implicitly means many different attacks that wouldn't be represented together(?) |
Labels? The prompt IDs and original prompt IDs are tracked on message pieces, right? And related conversations are tracked on attack result. What else is needed? |
Ok but what's the mechanism to switch? Is there a button? |

Draft: Complete Tree UI MVP
Warning
This is more of a reference PR than a good starting point for a real implementation.
There are a lot of dead-end code paths and over-build complexities!
Just try running locally with
VITE_ENABLE_TREE_UI=true uv run dev.pySummary
This PR wraps up the Tree UI MVP as a browser-usable operator experience for opening historical attacks as editable conversation trees, inspecting and extending paths, creating/pruning fan branches, applying converter transforms, and recovering from common edit/refresh mistakes.
The MVP keeps the tree model primarily client-owned while integrating with existing attack history, conversations, targets, converters, and refresh dispatch APIs.
High-Level Concepts
Conversation Trees
The UI represents an attack as a
ConversationTreemade of typed nodes:root_promptuser_turnconvertersendfanscoreimport_messageEdges encode parent/child relationships and fan slot indices. The client reducer owns structural edits, stale propagation, branch/clone semantics, fan creation, prune-to-picked behavior, and local recovery states.
Path Chat
The tree canvas now pairs with a selected-path chat pane. Operators can focus a node/path, inspect the selected path as a linear conversation, add follow-up prompts, and see pending responses without losing the tree context.
Refresh Waves
Refresh runs through the existing runner wave model: the client selects dispatchable sends, partitions clean context versus stale suffix, builds labels, and dispatches through the attack API. The UI now makes wave state and preflight failures easier to understand before a backend call is attempted.
Historical Reconstruction
Opening an attack as a tree reconstructs tree shape from backend conversations and labels. Single-conversation attacks become linear trees; multi-conversation attacks can be merged into path branches. Target registry names are recovered from
AttackSummary.target.target_registry_namewhen available so recovered trees can refresh without empty-target failures.UI Captures
Merged tree with path chat
Shows the loaded historical attack opened as a merged tree, with the path-chat pane available for focused linear inspection.


Follow-up prompt creates pending response / Attempt fan pruned to picked path
Follow.Up.Prompts.mp4
Converter transform branch
Converters.mp4
Notable Details
target_registry_nameto target summary DTOs and frontend mirrors.No targetchip and blocks refresh with a clear preflight modal when the target is missing.2-50count picker.Branch from heresemantics: keep root-to-selected path plus selected subtree, exclude unrelated siblings.ConverterCard, including direct baseline behavior for converter insertion.Tests
Validated locally with:
git diff --checknpm run type-checknpm test -- --runInBand --silent --json --outputFile=/tmp/tree-mvp-jest.json ...uv run pytest tests/unit/backend/test_mappers.py::TestAttackResultToSummaryVITE_ENABLE_TREE_UI=true npm run test:e2e -- e2e/tree.spec.ts e2e/tree-mvp.spec.ts --project=mock --workers=1Areas For Improvement
Backend Conversation Tree Persistence
The MVP still relies heavily on frontend-owned tree state plus attack labels for reconstruction. A stronger backend model would persist first-class conversation tree records:
conversation_treeconversation_tree_nodeconversation_tree_edgeBackend Wave Representation
Waves are currently client-orchestrated and reflected through labels/execution records. A future backend wave model could persist:
Richer Reconstruction
Historical reconstruction now covers more MVP paths, but fully faithful reconstruction of nested fan topology, converter lineage, deleted slots, and older attack records should eventually be backend-assisted instead of inferred from labels and conversation messages.
Collaboration And Concurrency
The current model is suitable for a single operator session. Persisted node versions plus optimistic concurrency would make collaborative editing, cross-tab recovery, and stale write detection more reliable.
Backend-Owned Tree Search And Audit
Once trees and waves are first-class backend entities, History could filter and display tree-level state directly: latest wave, failed leaves, parent tree lineage, target lineage, unresolved stale nodes, and branch ancestry.
Appendix
Prompt To Attempt Fan
flowchart TD P[Prompt] --> AF{{Attempt fan}} AF --> A1[Attempt 1] AF --> A2[Attempt 2] AF --> A3[Attempt N] classDef treeNode fill:#1f6b84,stroke:#0b2f3d,color:#fff,stroke-width:2px; classDef fanNode fill:#ffffff,stroke:#1f6b84,color:#1f6b84,stroke-width:2px; class P,A1,A2,A3 treeNode; class AF fanNode;Prompt To Converter Fan
flowchart TD P[Prompt] --> CF{{Converter fan}} CF --> C1[Converter 1] CF --> C2[Converter 2] C1 --> R1[Response 1] C2 --> R2[Response 2] classDef treeNode fill:#1f6b84,stroke:#0b2f3d,color:#fff,stroke-width:2px; classDef fanNode fill:#ffffff,stroke:#1f6b84,color:#1f6b84,stroke-width:2px; class P,C1,C2,R1,R2 treeNode; class CF fanNode;Prompt To Converter Settings Fan
flowchart TD P[Prompt] --> SF{{Converter settings fan}} SF --> S1["Base64Converter<br/>encoding = b64encode"] SF --> S2["Base64Converter<br/>encoding = urlsafe_b64encode"] SF --> S3["Base64Converter<br/>encoding = b64decode"] S1 --> R1[Response 1] S2 --> R2[Response 2] S3 --> R3[Response 3] classDef treeNode fill:#1f6b84,stroke:#0b2f3d,color:#fff,stroke-width:2px; classDef fanNode fill:#ffffff,stroke:#1f6b84,color:#1f6b84,stroke-width:2px; class P,S1,S2,S3,R1,R2,R3 treeNode; class SF fanNode;Response Branches Into Follow-Ups
flowchart TD P[Prompt] --> R0[Response] R0 --> F1[Follow up 1] R0 --> F2[Follow up 2] F1 --> R1[Response] F2 --> R2[Response] classDef treeNode fill:#1f6b84,stroke:#0b2f3d,color:#fff,stroke-width:2px; class P,R0,F1,F2,R1,R2 treeNode;Similar Projects