[STG-2278] feat(cli): did-you-mean + telemetry for unknown commands by shrey150 · Pull Request #2249 · browserbase/stagehand

shrey150 · 2026-06-12T19:36:15Z

Summary

Linear: https://linear.app/browserbase/issue/STG-2278/add-did-you-mean-suggestions-and-telemetry-for-unknown-browse-commands

Adds a command_not_found oclif hook to the browse CLI that prints a did-you-mean suggestion for unknown commands and emits a new cli.command_not_found telemetry event, while preserving oclif's standard "command not found" error and exit code 2.

Impact if merged

Unknown commands (browse sessions / search / contexts / auth status — the old Commander-era syntax that agents were trained on, plus plain typos) currently exit 2 with no suggestion and emit NO telemetry event, so this failure class is invisible by construction. Old-binary telemetry shows the pattern is real (1,310 commander-error events from sessions.list alone in 30d; 115 from search). It's a failed-first-command class, and a failed first command cuts 7-day retention 12.4x (0.42% vs 5.21%). Did-you-mean turns an agent guess-loop into a one-turn recovery, and the new cli.command_not_found event finally lets us size and rank the dead ends. No new dependency — deliberately avoids @oclif/plugin-not-found, which prompts interactively (agent-hostile).

Implementation notes

- New hook src/hooks/command-not-found.ts, registered in the oclif.hooks config. Suggestion order: explicit alias table first (old-CLI syntax → current tree, e.g. sessions → cloud sessions list, auth status → doctor, search → cloud search), then nearest match by Levenshtein over config.commandIDs with a distance threshold (the did-you-mean clause is omitted when nothing decent matches). Alias targets are validated against the live command tree at runtime and against oclif.manifest.json in tests, so they can't silently drift.
- Privacy: id + suggestion only, never argv. oclif's spaced-topic parsing glues unknown leading argv tokens into the attempted id (e.g. browse opne https://example.com arrives as opne:https://example.com), so the hook sanitizes down to leading command-shaped tokens and reports only the matched prefix (or the first token when nothing matches). The telemetry payload carries exactly attempted_command and suggested_command, so URLs, selectors, queries, and secrets never leave the machine. Covered by a dedicated test asserting argv values are absent from captured payloads.
- Exit semantics preserved. A command_not_found hook that returns normally makes oclif treat the invocation as handled (exit 0), silently swallowing the failure. The hook therefore re-throws oclif's standard CLIError("command <id> not found") after printing the suggestion, keeping stderr output and exit code 2 byte-identical to current behavior.
- Telemetry can't hang or get lost. The event reuses the existing PostHog transport (400ms abort timeout, best-effort catch) and is awaited inside the hook before the error is thrown, so it is delivered before process exit but cannot delay it beyond the transport timeout. The finally-hook completion path early-returns for unknown commands (prerun never fires), so there is no double counting.
- No new runtime dependency; ~140 LOC of source plus tests.
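
A rough sketch of the hook flow described in the notes above (print the suggestion, await best-effort telemetry, then re-throw) may help reviewers. This is not the code in this PR. `NotFoundError` is a hypothetical stand-in for oclif's `CLIError`, `HookDeps` and `commandNotFound` are illustrative names, and the wording of the no-suggestion message is an assumption. The injected `capture` is assumed to enforce its own timeout, as the existing transport is described as doing.

```typescript
// Hypothetical stand-in for oclif's CLIError; the real hook re-throws oclif's own error.
class NotFoundError extends Error {
  readonly exitCode = 2;
  constructor(id: string) {
    super(`command ${id} not found`);
  }
}

// Illustrative dependency bundle so the flow can run without oclif or a network.
interface HookDeps {
  suggest(id: string): { attempted: string; suggestion: string | null };
  capture(attempted: string, suggestion: string | null): Promise<void>;
  writeStderr(line: string): void;
}

// Render a colon-joined command id the way a user would type it.
function shown(commandId: string): string {
  return `browse ${commandId.split(":").join(" ")}`;
}

async function commandNotFound(id: string, deps: HookDeps): Promise<never> {
  const { attempted, suggestion } = deps.suggest(id);
  const hint = suggestion ? ` Did you mean "${shown(suggestion)}"?` : "";
  deps.writeStderr(
    `"${shown(attempted)}" is not a browse command.${hint} Run browse --help for all commands.`,
  );

  try {
    // Awaited so the event is sent before the process exits.
    await deps.capture(attempted, suggestion);
  } catch {
    // Best-effort: a telemetry failure must never change CLI behavior.
  }

  // Returning normally would make the invocation look handled (exit 0), so always throw.
  throw new NotFoundError(id);
}
```

The function returns `Promise<never>` to make the "never returns normally" contract visible in the type.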

E2E Test Matrix

| Command / flow | Observed output | Confidence / sufficiency |
| --- | --- | --- |
| `node bin/run.js sessions` (local build) | stderr: `"browse sessions" is not a browse command. Did you mean "browse cloud sessions list"? Run browse --help for all commands.` then `Error: command sessions not found`; `echo $?` → `2` | Proves alias suggestion + preserved exit code on the highest-volume old-syntax pattern |
| `node bin/run.js auth status` | stderr: `"browse auth status" is not a browse command. Did you mean "browse doctor"? ...`; exit `2` | Proves multi-token alias matching (`auth:status` → `doctor`) |
| `node bin/run.js search "test"` | stderr: `"browse search" is not a browse command. Did you mean "browse cloud search"? ...`; exit `2`; the query token is not shown as part of the attempted command | Proves alias prefix matching strips trailing user args from messaging |
| `node bin/run.js opne https://example.com` | stderr: `"browse opne" is not a browse command. Did you mean "browse open"? ...`; exit `2` | Proves Levenshtein typo fallback; URL excluded from the attempted command |
| `node bin/run.js open https://example.com --local` | JSON result `{"mode": "managed-local", ..., "title": "Example Domain", "url": "https://example.com/"}`; exit `0`; no suggestion output | Proves valid commands are completely unaffected (hook never fires) |
| Live telemetry capture: `BROWSERBASE_TELEMETRY_HOST=<local capture server>` + `node bin/run.js auth status` | Capture server logged `POST /i/v0/e/` with `"event": "cli.command_not_found"`, `"attempted_command": "auth.status"`, `"suggested_command": "doctor"` plus standard env/version props; payload received before CLI exit `2`; no argv content in payload | Proves the event actually sends, flushes before process exit, and carries only id + suggestion |
| `pnpm test` (builds then vitest) | `Test Files 16 passed (16), Tests 229 passed (229)`; includes 13 new unit/integration tests (alias table validity vs manifest, Levenshtein, thresholds, token sanitization, built-CLI suggestion/exit-code/telemetry/privacy) | Full regression sweep; existing telemetry suite still green |
| `pnpm lint` | prettier + eslint + `tsc --noEmit` all pass | Supporting evidence only |

🤖 Generated with Claude Code

Summary by cubic

Adds did-you-mean suggestions and privacy-safe telemetry for unknown browse CLI commands, while keeping the standard error output and exit code 2. Addresses Linear STG-2278 by helping users recover from old syntax and typos; typo matching now uses fastest-levenshtein.

New Features
- Added a command_not_found hook that prints a suggestion using an alias table for old syntax, with segment-aligned Levenshtein fallback for typos; the suggestion is omitted when there is no good match.
- Sends cli.command_not_found telemetry with strict privacy: only the sanitized attempted command id and the suggested command, never raw argv.
- Preserves default behavior by rethrowing the not-found error (stderr unchanged, exit code 2) and avoids @oclif/plugin-not-found.
- Removed misleading auth/login → doctor suggestions.

Written for commit bcee6ed. Summary will update on new commits.

Unknown commands previously exited 2 with no suggestion and emitted no telemetry (prerun never fires, so command_completed early-returns). This adds a command_not_found oclif hook that prints a did-you-mean line (explicit alias table for old Commander-era syntax, Levenshtein fallback for typos) and emits a new cli.command_not_found event with only the sanitized attempted command id and the suggestion - never raw argv. oclif's standard error and exit code 2 are preserved, and no new runtime dependency is added (deliberately not @oclif/plugin-not-found, which prompts interactively in TTYs).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

changeset-bot · 2026-06-12T19:36:25Z

🦋 Changeset detected

Latest commit: bcee6ed

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 0 packages

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

cubic-dev-ai

1 issue found across 6 files

Confidence score: 3/5

In packages/cli/src/lib/command-suggestions.ts, fuzzy matching can leave trailing user tokens in attempted, which then gets emitted as attempted_command telemetry; merging as-is risks leaking argv-derived input and undermines the file’s sanitization contract. Ensure attempted is fully sanitized (or omitted from telemetry) before merge, and add a regression test covering trailing-token cases.
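
To make the sanitization contract under discussion concrete, here is a minimal sketch under stated assumptions; it is not the code in command-suggestions.ts. It assumes a "command-shaped" token is a lowercase word with optional digits or hyphens, it covers only the alias path (no fuzzy matching), and `resolveAttempted` is a hypothetical helper name. The two alias entries are taken from the PR description.

```typescript
// Old-CLI syntax mapped to the current command tree (entries from the PR description).
const ALIASES = new Map<string, string>([
  ["sessions", "cloud:sessions:list"],
  ["search", "cloud:search"],
]);

// Assumption: command-shaped means a lowercase word with optional digits or hyphens.
const COMMAND_TOKEN = /^[a-z][a-z0-9-]*$/;

// Keep only the leading run of command-shaped tokens from the colon-joined id.
function extractCommandTokens(id: string): string[] {
  const tokens: string[] = [];
  for (const token of id.split(":")) {
    if (!COMMAND_TOKEN.test(token)) break;
    tokens.push(token);
  }
  return tokens;
}

interface Resolution {
  attempted: string;
  suggestion: string | null;
}

// Report the longest alias-matched prefix, or only the first token when nothing matches.
function resolveAttempted(id: string): Resolution {
  const tokens = extractCommandTokens(id);
  for (let n = tokens.length; n > 0; n--) {
    const prefix = tokens.slice(0, n).join(":");
    const target = ALIASES.get(prefix);
    if (target !== undefined) return { attempted: prefix, suggestion: target };
  }
  return { attempted: tokens[0] ?? "", suggestion: null };
}
```

Note that `opne:https://example.com` splits into `opne`, `https`, and `//example.com`, so `https` survives token extraction. Falling back to the first token alone is what keeps it out of the reported command.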

Architecture diagram

sequenceDiagram
    participant CLI as CLI Process
    participant Hook as command_not_found Hook
    participant Suggest as suggestCommand()
    participant Telemetry as captureCommandNotFound()
    participant PostHog as PostHog Transport
    participant Oclif as oclif Core

    Note over CLI,Oclif: Unknown command path: "browse sessions" / "browse opne https://..."

    CLI->>Oclif: dispatch unknown command id
    Oclif->>Oclif: identify command_not_found hook
    Oclif->>Hook: invoke hook({ config, id })

    Note over Hook,Suggest: id = "sessions" or "opne:https://example.com"

    Hook->>Suggest: suggestCommand(id, config.commandIDs)
    
    Note over Suggest: extractCommandTokens() strips<br/>argument-like tokens (URLs, flags)<br/>Returns only command-shaped prefix

    alt Explicit alias match
        Suggest->>Suggest: check aliasSuggestions map
        Note over Suggest: "sessions" -> "cloud:sessions:list"<br/>"auth:status" -> "doctor"
    else Levenshtein fallback
        Suggest->>Suggest: compute edit distance vs all command IDs
        Note over Suggest: threshold = max(2, floor(len/3)), cap 5<br/>"opne" -> "open" (distance 2)
    else No decent match
        Suggest->>Suggest: return suggestion = null
    end

    Suggest-->>Hook: { attempted, suggestion }

    alt Suggestion exists
        Hook->>Hook: validate suggestion exists in config.findCommand()
        Note over Hook: Guard against alias target drift
        Hook->>CLI: stderr: "Did you mean ...?"
    else No suggestion
        Hook->>CLI: stderr: "... Run --help for all commands."
    end

    Note over Hook,PostHog: Privacy: only attempted + suggested command ids, never argv

    Hook->>Telemetry: captureCommandNotFound(version, attempted, suggestion)
    Telemetry->>PostHog: POST /capture - event: cli.command_not_found
    Note over Telemetry,PostHog: Payload: { attempted_command, suggested_command }<br/>Timeout: 400ms, best-effort catch
    PostHog-->>Telemetry: 200 OK (or timeout/error, silently swallowed)

    Note over Hook: Await telemetry flush<br/>before throwing error

    Hook->>Oclif: throw CLIError("command {id} not found")

    Oclif->>CLI: stderr: "Error: command {id} not found"
    Oclif->>CLI: exit code 2
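
The "guard against alias target drift" step in the diagram could look roughly like the sketch below. It is illustrative only. `findDriftedAliases` and `safeSuggestion` are hypothetical names, the command id list is passed in as a plain array rather than read from an oclif config, and the alias entries mirror examples from the PR description.

```typescript
// Alias table keyed by old-CLI syntax; entries mirror examples from the PR description.
const aliasSuggestions = new Map<string, string>([
  ["sessions", "cloud:sessions:list"],
  ["search", "cloud:search"],
]);

// Test-time check: list alias keys whose targets are missing from the known command ids.
function findDriftedAliases(aliases: Map<string, string>, commandIDs: string[]): string[] {
  const known = new Set(commandIDs);
  return Array.from(aliases.entries())
    .filter(([, target]) => !known.has(target))
    .map(([from]) => from);
}

// Runtime guard: only surface a suggestion that still resolves to a real command.
function safeSuggestion(
  attempted: string,
  aliases: Map<string, string>,
  commandIDs: string[],
): string | null {
  const target = aliases.get(attempted);
  return target !== undefined && commandIDs.includes(target) ? target : null;
}
```

A test could assert that `findDriftedAliases` returns an empty array for the ids in the manifest, while the runtime guard simply drops a stale suggestion.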

…r reach telemetry

Cubic flagged that whole-string fuzzy scoring could retain a trailing user-provided token in the attempted command (e.g. 'browse stat s' -> attempted_command 'stat.s'). Fuzzy matching now aligns token prefixes per command-id segment: only ids with the same segment count are considered and each token must be within its own edit-distance threshold of the aligned segment, so a token can only be retained when it itself looks like a typo of a real command word.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
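
A minimal sketch of segment-aligned matching as described in this commit message, for illustration only. It uses a hand-rolled edit distance so the example is self-contained (the PR later swaps in fastest-levenshtein). Applying the diagram's threshold formula, max(2, floor(len/3)) capped at 5, to each token is an assumption, and `suggestAligned` and the command ids in the usage note are hypothetical.

```typescript
// Plain dynamic-programming Levenshtein distance, kept inline for self-containment.
function editDistance(a: string, b: string): number {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Assumption: the diagram's threshold formula is applied per token.
function threshold(token: string): number {
  return Math.min(5, Math.max(2, Math.floor(token.length / 3)));
}

interface Match {
  attempted: string;
  suggestion: string | null;
}

// Try the longest token prefix first; a prefix matches an id only when the segment
// counts are equal and every token is within its own threshold of the aligned segment.
function suggestAligned(tokens: string[], commandIDs: string[]): Match {
  for (let n = tokens.length; n > 0; n--) {
    const prefix = tokens.slice(0, n);
    let best: { id: string; total: number } | null = null;
    for (const id of commandIDs) {
      const segments = id.split(":");
      if (segments.length !== n) continue;
      let total = 0;
      let withinThreshold = true;
      for (let k = 0; k < n; k++) {
        const d = editDistance(prefix[k], segments[k]);
        if (d > threshold(prefix[k])) {
          withinThreshold = false;
          break;
        }
        total += d;
      }
      if (withinThreshold && (best === null || total < best.total)) best = { id, total };
    }
    if (best !== null) return { attempted: prefix.join(":"), suggestion: best.id };
  }
  return { attempted: tokens[0] ?? "", suggestion: null };
}
```

With a hypothetical id list of `open`, `status`, `cloud:search`, and `cloud:sessions:list`, the tokens `stat` and `s` find no two-segment match, so only `stat` is retained and matched against `status`. The trailing `s` never reaches the attempted command.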

shrey150 · 2026-06-12T22:22:27Z

Re: the auth/login alias comment — agreed and addressed in e8520c4a0: dropped all three (auth, auth:status, login → doctor). They now fall through to the plain not-found message (verified live; exit 2 unchanged), and cli.command_not_found still records the attempts — which doubles as demand signal for a real browse login. Regression test pins that auth/login never map to doctor.

Also pre-addressed the two pending draft comments on the hand-rolled edit distance ("there has to be a util function for this") in 37ae41c74: swapped to fastest-levenshtein (zero-dep, 21KB, the same lib @oclif/plugin-not-found uses, already in the monorepo pnpm store). The segment-aligned per-token threshold logic stays ours — no util provides that. −26 LOC, 230/230 tests green.

(Replying here instead of in-thread: GitHub blocks threaded replies while your review draft is pending.)

…tegration tests

Addresses review: extractCommandTokens + suggestCommand cases become it.each tables; the built-CLI/dummy-server tests (real command_not_found hook + telemetry transport) move to cli-command-not-found.integration.test.ts, leaving the pure-function unit tests in cli-command-not-found.test.ts.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

cubic-dev-ai Bot reviewed Jun 12, 2026

Comment thread packages/cli/src/lib/command-suggestions.ts Outdated

shrey150 commented Jun 12, 2026

Comment thread packages/cli/src/lib/command-suggestions.ts Outdated

shrey150 and others added 2 commits June 12, 2026 15:18

fix(cli): drop auth/login -> doctor alias suggestions

e8520c4

auth and login express authentication intent, not diagnostics; an unknown auth command now falls through to the plain not-found message (telemetry still records the attempt).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

refactor(cli): use fastest-levenshtein instead of hand-rolled edit distance

37ae41c

Zero-dep micro-lib (same one @oclif/plugin-not-found uses), already in the monorepo pnpm store. The segment-aligned threshold logic stays ours.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

shrey150 commented Jun 12, 2026

Comment thread packages/cli/src/lib/command-suggestions.ts Outdated

Comment thread packages/cli/src/lib/command-suggestions.ts

ajmcquilkin approved these changes Jun 16, 2026

Comment thread packages/cli/tests/cli-command-not-found.test.ts Outdated

Comment thread packages/cli/tests/cli-command-not-found.test.ts Outdated

Comment thread packages/cli/tests/cli-command-not-found.test.ts Outdated

shrey150 wants to merge 5 commits into main from shrey/command-not-found-hook
