feat(skills): flat skill versioning + UCB1 adaptive-bandit routing (Phase 4) by QodeXcli · Pull Request #24 · QodeXcli/QodeX

QodeXcli · 2026-06-25T14:56:26Z

Phase 4 of the proposed evolution architecture — flat, OS-agnostic skill versioning with UCB1 adaptive-bandit A/B routing.

Storage (no symlinks, no split-brain)

~/.qodex/skills/<id>/
  manifest.json        ← single source of truth (versions, champion/challenger, per-version stats)
  SKILL.v1.md  SKILL.v2.md …

The layout is identical on Windows/macOS/Linux and needs no admin rights. Legacy single-file SKILL.md skills keep working unchanged (routedSkillBody falls back to the legacy SKILL.md).
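To make the layout above concrete, here is a minimal sketch of what a flat manifest could look like. The PR names the types VersionStats, VersionDetail and SkillManifest but does not list their fields, so every field name below, along with the helper `initManifestSketch`, is an assumption for illustration only.

```typescript
// Hypothetical shapes: only the three type names come from the PR text.
// All fields are assumed for illustration.
interface VersionStats {
  runs: number;        // executions recorded for this version
  successes: number;   // executions judged successful
  totalTokens: number; // tokens spent across all runs
}

interface VersionDetail {
  tag: number;                  // the N in SKILL.v{N}.md
  file: string;                 // file name inside the skill directory
  parent: number | null;        // tag this version was derived from
  status: "active" | "retired"; // retired versions stay in history
  stats: VersionStats;
}

interface SkillManifest {
  id: string;
  champion: number;          // tag of the current champion
  challenger: number | null; // tag of the version under test, if any
  versions: VersionDetail[];
}

// Hypothetical helper: seed a manifest holding a single v1 champion.
function initManifestSketch(id: string): SkillManifest {
  return {
    id,
    champion: 1,
    challenger: null,
    versions: [
      {
        tag: 1,
        file: "SKILL.v1.md",
        parent: null,
        status: "active",
        stats: { runs: 0, successes: 0, totalTokens: 0 },
      },
    ],
  };
}
```

Keeping champion, challenger and per-version stats in one JSON file is what removes the need for symlinks: the "current" version is a field, not a filesystem pointer.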

The algorithm (pure, tested)

src/skills/learning/skill-versioning.ts — the exact spec'd types + functions:

routeSkillVersion — UCB1: an unsampled version is explored first; otherwise it balances exploit (champion) vs explore (challenger); a challenger whose success rate collapses loses traffic automatically. Conservative on ties.
createNextVersion — adds a challenger (parent = champion) without touching any other version's state.
recordVersionExecution / decideChampion — converge the test: promote a clearly-better challenger, retire a worse one.
Robustness over the sketch: next tag is max+1 (not count+1), and a retired version is kept in history (marked retired) so tags never collide and its file stays referenced.
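The routing and tagging rules described above can be sketched as follows. This is not the code from skill-versioning.ts: the `Arm` shape, the function names `routeSketch` and `nextTagSketch`, and the textbook exploration bonus `sqrt(2 ln N / n)` are assumptions chosen to illustrate plain UCB1 with a champion-wins-ties rule.

```typescript
// Hypothetical arm shape; the real types live in skill-versioning.ts.
interface Arm {
  tag: number;
  runs: number;
  successes: number;
  retired?: boolean;
}

// Textbook UCB1: mean reward plus an exploration bonus.
// An unsampled arm scores Infinity, so it is explored before any sampled arm.
function ucb1Score(arm: Arm, totalRuns: number): number {
  if (arm.runs === 0) return Infinity;
  const mean = arm.successes / arm.runs;
  const bonus = Math.sqrt((2 * Math.log(totalRuns)) / arm.runs);
  return mean + bonus;
}

// Pick the tag to route to. The champion is evaluated first and only a
// strictly higher score displaces it, which makes ties conservative.
function routeSketch(arms: Arm[], championTag: number): number {
  const live = arms.filter((a) => !a.retired);
  if (live.length === 0) throw new Error("no routable version");
  const ordered = [...live].sort(
    (a, b) => Number(b.tag === championTag) - Number(a.tag === championTag),
  );
  const total = live.reduce((sum, a) => sum + a.runs, 0);
  let best = ordered[0];
  for (const arm of ordered.slice(1)) {
    if (ucb1Score(arm, total) > ucb1Score(best, total)) best = arm;
  }
  return best.tag;
}

// Next tag is max+1, not count+1: if a tag is ever missing from the list,
// count+1 could reuse an existing tag, while max+1 cannot.
function nextTagSketch(tags: number[]): number {
  return tags.length === 0 ? 1 : Math.max(...tags) + 1;
}
```

With a champion at 45/50 successes and a challenger at 5/50, both arms get the same exploration bonus, so the champion's higher mean wins and the challenger loses traffic without any explicit rule.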

I/O + CLI

versioned-store.ts (atomic manifest writes): routedSkillBody, ensureVersioned, addChallenger, recordOutcomeAndConverge.
qodex skill versions <name> — shows champion/challenger, per-version success rate + tokens, and the routed pick.
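One common way to get atomic manifest writes is write-to-temp-then-rename, sketched below with Node's built-in `fs` module. Whether versioned-store.ts uses this exact approach is not stated in the PR; `writeJsonAtomic` and the temp-file naming are hypothetical. Rename within one directory is atomic on POSIX file systems; behaviour on Windows (for example when another process holds the target open) is worth verifying separately.

```typescript
import { mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: write the full JSON to a sibling temp file, then
// rename it over the target so readers never observe a half-written file.
function writeJsonAtomic(path: string, value: unknown): void {
  const tmp = `${path}.${process.pid}.tmp`;
  writeFileSync(tmp, JSON.stringify(value, null, 2) + "\n", "utf8");
  renameSync(tmp, path); // replaces any existing manifest in one step
}

// Usage: two successive writes against a throwaway directory.
const dir = mkdtempSync(join(tmpdir(), "qodex-sketch-"));
const manifestPath = join(dir, "manifest.json");
writeJsonAtomic(manifestPath, { champion: 1, challenger: null });
writeJsonAtomic(manifestPath, { champion: 1, challenger: 2 });
const reread = JSON.parse(readFileSync(manifestPath, "utf8"));
```

The temp file lives next to the target on purpose: a rename across file systems is a copy followed by a delete, which would lose the atomicity.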

Live-verified

Seeded v1 → added a v2 challenger → fed the bandit a strong champion + weak challenger → it auto-retired the bad challenger and kept routing the champion.
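The outcome of that run (retire the weak challenger, keep the champion) can be illustrated with a small convergence rule. The sketch below is an assumption, not the PR's decideChampion: the minimum-run floor, the success-rate margin, and the names `Tally` and `decideSketch` are all hypothetical, picked only to show the promote / retire / keep-testing split.

```typescript
type Verdict = "promote" | "retire" | "keep-testing";

// Hypothetical per-version tally.
interface Tally {
  runs: number;
  successes: number;
}

// Assumed rule: wait until both versions have minRuns samples, then act
// only when the success rates differ by at least `margin`.
function decideSketch(
  champion: Tally,
  challenger: Tally,
  minRuns = 20,
  margin = 0.1,
): Verdict {
  if (champion.runs < minRuns || challenger.runs < minRuns) {
    return "keep-testing";
  }
  const diff =
    challenger.successes / challenger.runs -
    champion.successes / champion.runs;
  if (diff >= margin) return "promote"; // challenger is clearly better
  if (diff <= -margin) return "retire"; // challenger is clearly worse
  return "keep-testing";                // too close to call
}
```

Under these assumed thresholds, a champion at 27/30 against a challenger at 9/30 yields "retire", which matches the behaviour described in the live run.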

Tests

14 unit tests covering UCB1 explore/exploit, tag robustness after retire, stat updates, and promote/retire/keep-testing convergence. ✅ typecheck · ✅ full suite (1212) · ✅ build.

Note: Phase 5 (escalating-cascade judge) is a separate PR. Wiring the per-run execution-stat recording into the live agent loop is the remaining integration step on top of this storage+algorithm layer.

…g (Phase 4)

Per the proposed architecture: a skill keeps its full history in ONE directory behind a flat manifest (no symlinks, no split-brain, identical on Windows/macOS/Linux, no admin rights).

~/.qodex/skills/<id>/
  manifest.json        ← single source of truth (versions, champion/challenger, stats)
  SKILL.v1.md  SKILL.v2.md …

- src/skills/learning/skill-versioning.ts (PURE, unit-tested): the exact types (VersionStats / VersionDetail / SkillManifest) + initManifest, createNextVersion (adds a challenger, never touches other versions), routeSkillVersion (UCB1: explores an unsampled arm first, then balances exploit/explore, conservative on ties), recordVersionExecution, and decideChampion (promote a clearly-better challenger, RETIRE a worse one). Robustness over the sketch: next tag is max+1 (not count+1) and a retired version is KEPT in history (marked retired) so tags never collide and its SKILL.v{N}.md stays referenced.
- src/skills/learning/versioned-store.ts (thin, atomic I/O): routedSkillBody resolves the UCB1-routed version, FALLING BACK to legacy single-file SKILL.md so nothing breaks; ensureVersioned / addChallenger / recordOutcomeAndConverge.
- CLI: `qodex skill versions <name>` shows champion/challenger, per-version success rate, tokens, and the routed pick.

Live-verified end-to-end: seeded v1, added a v2 challenger, fed the bandit a strong champion + a weak challenger → it auto-RETIRED the bad challenger and kept routing the champion.

Tests: 14 (UCB1 explore/exploit, tag robustness after retire, stat updates, promote/retire/keep-testing convergence). typecheck + full suite (1212) + build green.

QodeXcli mentioned this pull request Jun 25, 2026

feat(skills): escalating-cascade judge + alignment-drift self-improvement (Phase 5) #25

Merged

QodeXcli merged commit 3b3521f into main Jun 25, 2026
2 checks passed

QodeXcli deleted the feat/skill-versioning branch June 25, 2026 15:03

QodeXcli mentioned this pull request Jun 26, 2026

feat(skills): strengthen UCB1 — composite reward, tunable c, trial floor, scores, off-switch #27

Merged

QodeXcli merged 1 commit into main from feat/skill-versioning
