feat(lifecycle): artifact registry + marginal-lift ablation (phase 1)#361

Merged: drewstone merged 1 commit into main from feat/artifact-lifecycle-foundation on Jun 22, 2026
@drewstone

What

rt#267 phase 1 — the artifact-lifecycle foundation. Two primitives the rest of the lifecycle hangs off, exposed as a new ./lifecycle subpath.

1. ArtifactRegistry — typed catalog of profile artifacts with stable ids

A pure in-memory store of ProfileArtifacts. An artifact is a discrete, individually-promotable piece of an AgentProfile — one of skill | tool | mcp | hook | subagent | prompt (the §1.5 profile surface, one-to-one with an AgentProfile field). API:

  • register(input) — assigns a stable id (<kind>-<n>, or honors a caller-supplied id idempotently; re-registering replaces under the same id). Fails loud on a malformed explicit id.
  • get(id) / list({ kind?, status? }) / promote(id) (idempotent; fails loud on unknown id).
  • compose(base, ids?) — applies a set of artifacts onto a baseline profile (no ids = all promoted; explicit ids = exactly those, in order, fail-loud on unknown).
  • applyArtifact / applyArtifacts — the one bridge from an ArtifactKind to an AgentProfile field (shallow-immutable; never mutates the base). Both the registry's compose and the marginal-lift ablation share it, so there is a single source of truth for how a kind lands on the profile.
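
The API described above could look roughly like the following. This is a minimal, self-contained sketch under stated assumptions, not the code shipped in this PR: the `Profile` shape (a stand-in for `AgentProfile`), the string-valued artifacts, the id-validation regex, the append-vs-overwrite policy per kind, and the local `ValidationError` class are all hypothetical.

```typescript
// Sketch only: all names and shapes here are assumptions, not the package's real API.
type ArtifactKind = "skill" | "tool" | "mcp" | "hook" | "subagent" | "prompt";
type ListKind = Exclude<ArtifactKind, "prompt">;

// Hypothetical stand-in for AgentProfile: one field per artifact kind.
type Profile = { prompt: string } & Record<ListKind, string[]>;

interface ProfileArtifact {
  id: string;
  kind: ArtifactKind;
  value: string;
  status: "candidate" | "promoted";
}

class ValidationError extends Error {}

// The single bridge from a kind to a profile field. Assumed policy: `prompt`
// overwrites, every other kind appends. The base profile is never mutated.
function applyArtifact(base: Profile, artifact: Pick<ProfileArtifact, "kind" | "value">): Profile {
  const kind = artifact.kind;
  if (kind === "prompt") return { ...base, prompt: artifact.value };
  const next: Profile = { ...base };
  next[kind] = [...base[kind], artifact.value];
  return next;
}

class ArtifactRegistry {
  private readonly items = new Map<string, ProfileArtifact>();
  private readonly counters = new Map<ArtifactKind, number>();

  register(input: { kind: ArtifactKind; value: string; id?: string }): ProfileArtifact {
    let id = input.id;
    if (id === undefined) {
      // Auto ids look like <kind>-<n>; skip any id a caller already claimed.
      let n = this.counters.get(input.kind) ?? 0;
      do {
        n += 1;
        id = `${input.kind}-${n}`;
      } while (this.items.has(id));
      this.counters.set(input.kind, n);
    } else if (!/^[a-z0-9][a-z0-9-]*$/.test(id)) {
      throw new ValidationError(`malformed artifact id: "${id}"`);
    }
    // Re-registering an existing id replaces the entry under that same id.
    const artifact: ProfileArtifact = { id, kind: input.kind, value: input.value, status: "candidate" };
    this.items.set(id, artifact);
    return artifact;
  }

  get(id: string): ProfileArtifact | undefined {
    return this.items.get(id);
  }

  list(filter: { kind?: ArtifactKind; status?: ProfileArtifact["status"] } = {}): ProfileArtifact[] {
    return [...this.items.values()].filter(
      (a) => (!filter.kind || a.kind === filter.kind) && (!filter.status || a.status === filter.status),
    );
  }

  // Idempotent: promoting an already-promoted artifact is a no-op.
  promote(id: string): ProfileArtifact {
    const found = this.items.get(id);
    if (!found) throw new ValidationError(`unknown artifact id: "${id}"`);
    const promoted: ProfileArtifact = { ...found, status: "promoted" };
    this.items.set(id, promoted);
    return promoted;
  }

  // No ids: apply every promoted artifact. Explicit ids: exactly those, in order.
  compose(base: Profile, ids?: string[]): Profile {
    const selected = ids
      ? ids.map((id) => {
          const a = this.items.get(id);
          if (!a) throw new ValidationError(`unknown artifact id: "${id}"`);
          return a;
        })
      : this.list({ status: "promoted" });
    return selected.reduce(applyArtifact, base);
  }
}

const registry = new ArtifactRegistry();
const skill = registry.register({ kind: "skill", value: "summarize-diff" });
const style = registry.register({ kind: "prompt", value: "Be terse.", id: "house-style" });
registry.promote(skill.id);

const base: Profile = { prompt: "You are helpful.", skill: [], tool: [], mcp: [], hook: [], subagent: [] };
const promotedOnly = registry.compose(base); // only the promoted skill is applied
const explicit = registry.compose(base, [style.id, skill.id]); // exactly these, in order
```

In this sketch `compose` reduces over `applyArtifact`, which is one way to keep a single source of truth for how a kind lands on the profile.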

2. measureMarginalLift — the with-vs-without ablation

Given a baseline profile + a candidate artifact + an eval runner, returns the marginal score/cost delta the artifact adds.

Runs the caller's EvalRunner on baseline (the "without" arm) and on applyArtifact(baseline, candidate) (the "with" arm), and subtracts: scoreDelta = with.composite − without.composite, costDelta = with.costUsd − without.costUsd. Selector-agnostic — it quantifies the ablation, it does not judge or gate (selector≠judge). Supports a pre-computed baselineResult to skip the baseline arm when ranking several candidates against one baseline, and forwards an AbortSignal.
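
The two-arm subtraction can be illustrated with a small self-contained sketch. It is an approximation of the behavior described above, not the shipped implementation: the argument-object shape, the `apply` parameter (standing in for `applyArtifact`), the `MarginalLift` result type, and the toy runner are all assumptions made for illustration.

```typescript
// Sketch only: these types mirror the description above, not the package's real exports.
interface EvalResult {
  composite: number; // quality score
  costUsd: number; // spend for the run
}
type EvalRunner<P> = (profile: P, opts: { signal?: AbortSignal }) => Promise<EvalResult>;

interface MarginalLift {
  without: EvalResult;
  with: EvalResult;
  scoreDelta: number;
  costDelta: number;
}

async function measureMarginalLift<P, A>(args: {
  baseline: P;
  candidate: A;
  apply: (base: P, artifact: A) => P; // hypothetical stand-in for applyArtifact
  runner: EvalRunner<P>;
  baselineResult?: EvalResult;
  signal?: AbortSignal;
}): Promise<MarginalLift> {
  const opts = { signal: args.signal }; // forwarded untouched to both arms
  // "Without" arm: reuse a pre-computed result when the caller supplies one.
  const without = args.baselineResult ?? (await args.runner(args.baseline, opts));
  // "With" arm: the baseline plus exactly one candidate artifact.
  const withArm = await args.runner(args.apply(args.baseline, args.candidate), opts);
  return {
    without,
    with: withArm,
    scoreDelta: withArm.composite - without.composite,
    costDelta: withArm.costUsd - without.costUsd,
  };
}

// Toy usage: a fake runner that scores a profile by how many skills it has.
type ToyProfile = { skills: string[] };
let runnerCalls = 0;
const toyRunner: EvalRunner<ToyProfile> = async (profile) => {
  runnerCalls += 1;
  return { composite: 0.5 + 0.25 * profile.skills.length, costUsd: 1 + 0.5 * profile.skills.length };
};

async function demo(): Promise<MarginalLift[]> {
  const baseline: ToyProfile = { skills: [] };
  const addSkill = (base: ToyProfile, skill: string): ToyProfile => ({ skills: [...base.skills, skill] });
  // Evaluate the baseline once, then rank several candidates against it.
  const baselineResult = await toyRunner(baseline, {});
  return Promise.all(
    ["lint-fix", "test-writer"].map((candidate) =>
      measureMarginalLift({ baseline, candidate, apply: addSkill, runner: toyRunner, baselineResult }),
    ),
  );
}
```

The function only reports the deltas. Deciding whether a given delta justifies promotion is left to whatever selector or gate consumes the result.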

EvalResult mirrors the project's score/cost convention (composite from OutcomeMeasurement, costUsd from LoopResult), so a caller wires a thin wrapper over runLoop / runBenchmark / runAgentEval.
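
A thin wrapper of that kind might look like the sketch below. The `HarnessOutput` shape and the `Harness` type are hypothetical placeholders for whatever `runLoop` / `runBenchmark` / `runAgentEval` actually return (their real signatures are not shown in this PR), and `toEvalRunner` is an illustrative name, not an export of the package.

```typescript
// Sketch only: HarnessOutput, Harness, and toEvalRunner are hypothetical names.
interface EvalResult {
  composite: number;
  costUsd: number;
}
type EvalRunner<P> = (profile: P, opts: { signal?: AbortSignal }) => Promise<EvalResult>;

// Assumed shape of an existing eval entry point's output.
interface HarnessOutput {
  measurement: { composite: number };
  costUsd: number;
}
type Harness<P> = (profile: P, signal?: AbortSignal) => Promise<HarnessOutput>;

// Adapt a harness to the EvalRunner shape, failing loud instead of
// silently passing along a missing or non-finite score or cost.
function toEvalRunner<P>(harness: Harness<P>): EvalRunner<P> {
  return async (profile, opts) => {
    const out = await harness(profile, opts.signal);
    const result: EvalResult = { composite: out.measurement.composite, costUsd: out.costUsd };
    for (const [name, value] of Object.entries(result)) {
      if (!Number.isFinite(value)) {
        throw new TypeError(`${name} must be a finite number, got ${String(value)}`);
      }
    }
    return result;
  };
}

// Toy usage with a fake harness that returns fixed numbers.
const fakeHarness: Harness<{ name: string }> = async () => ({
  measurement: { composite: 0.8 },
  costUsd: 0.12,
});
const wrappedRunner = toEvalRunner(fakeHarness);
```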

Why this shape

  • Sits directly on the existing AgentProfile contract (@tangle-network/agent-interface) and the §1.5 "an agent IS its profile" law — no new profile model, no parallel surface type.
  • Reuses the established score/cost conventions and the typed-error taxonomy (ValidationError) — fails loud, no silent zeros.
  • Additive only: new ./lifecycle subpath + tsup entry + typedoc entryPoint. No existing surface changed.

Tests

src/lifecycle/lifecycle.test.ts — 21 tests: applyArtifact for all six kinds (immutability, key resolution, append-vs-overwrite), the registry (stable/explicit ids, idempotent re-register, fail-loud guards, filtered list, promote, compose with/without ids), and measureMarginalLift (delta math, exactly-two-arm runs, baseline-skip, negative-delta drop signal, signal forwarding, and an end-to-end rank-then-promote-then-compose flow).

Gates (all green, run locally)

  • pnpm run build OK (emits dist/lifecycle.{js,d.ts})
  • pnpm run typecheck OK (incl. typecheck:examples)
  • pnpm test: 1066 passed, 1 skipped
  • pnpm run lint (Biome) OK
  • pnpm run docs:check OK (regenerated docs/api/lifecycle.md + index, freshness gate green)

Deferred to later phases (NOT in this PR)

  • Per-surface lifecycles — the propose → measure → gate → ship loop wired per ArtifactKind (e.g. how a skill candidate is generated/validated vs an mcp candidate).
  • BuildableSurface author contract — the typed interface a profile author implements to enumerate which artifacts a surface can produce, so the registry can be populated by an author rather than hand-registered.
  • Promotion-gate wiring — connecting measureMarginalLift output into defaultProductionGate / heldOutGate so promotion is gated on held-out significance, not a raw point delta. Phase 1 deliberately stops at "produce the number"; the gate consumes it later.

This was the largest of the five; it is fully green, so opening as a normal (non-draft) PR.

…hase 1)

Foundation for the artifact-lifecycle. Two primitives the rest hangs off:

- ArtifactRegistry — a typed catalog of profile artifacts
  (skill|tool|mcp|hook|subagent|prompt) with stable ids and
  register/list/get/promote/compose. applyArtifact is the one bridge
  from an ArtifactKind to an AgentProfile field (§1.5 profile law).
- measureMarginalLift — the with-vs-without ablation: given a baseline
  profile + a candidate artifact + an EvalRunner, returns the score/cost
  delta the artifact adds on its own.

Exposed as the new ./lifecycle subpath. Per-surface lifecycles and the
buildable-surface author contract are deferred to later phases.

@tangletools left a comment

✅ Auto-approved PR — f4ad2a17

Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.

tangletools · auto-approval · reason: blanket_auto_approve · 2026-06-22T13:35:32Z

@drewstone merged commit c67ad94 into main on Jun 22, 2026 (1 check passed)
@drewstone mentioned this pull request on Jun 22, 2026
drewstone added a commit that referenced this pull request on Jun 22, 2026:
…provenance + artifact lifecycle (#363)

Bundles the build-phase PRs landed since 0.72.0:
- #360 max-live-workers concurrency cap + explicit worker metering opt-in
- #362 mounted-resource manifest + caller selection receipts on LoopResult
- #361 artifact registry + marginal-lift ablation (rt#267 phase 1, ./lifecycle)
- #359 preserve partial events on abort via typed SandboxRunAbortError

Bumps package.json to 0.73.0, pins docs/canonical-api.md to 0.73.0, and adds
decision-table rows for run provenance (result.provenance.mounts/selectionReceipts)
and the artifact-lifecycle ablation (measureMarginalLift / ArtifactRegistry, ./lifecycle).