[FEATURE] Telemetry, file watcher & worktree detection — observability + reliability #66

Description

@Wolfvin

Summary

Add 4 reliability/observability features: anonymous opt-in telemetry, native file watcher (FSEvents/inotify), per-file staleness banner, and git worktree mismatch detection.

Worker consensus (1 report — all from CodeGraph)

| Worker | Source | Contribution |
| --- | --- | --- |
| CodeGraph | update!/CodeLens_CodeGraph_Upgrade_Analysis.md #24 | Anonymous telemetry: opt-in, 4 invariants: zero hot-path cost, zero stdout, off is off, fail silent. Fields: event_type, event_name, language, agent, file_count_bucket, schema_version, machine_id, timestamp. NEVER: code, paths, file/symbol names, queries, IP. Off-switches: codelens telemetry off, CODELENS_TELEMETRY=0, DO_NOT_TRACK=1. |
| CodeGraph | same file #2 | Auto-sync native file watcher: replace watchdog with FSEvents (macOS) / inotify (Linux) / ReadDirectoryChangesW (Windows). Single recursive fs.watch (O(1) descriptors). Per-directory debounce (default 2000ms). CODELENS_NO_WATCH=1, CODELENS_FORCE_WATCH=1, CODELENS_WATCH_DEBOUNCE_MS env vars. Auto-disable on WSL2 /mnt/*. |
| CodeGraph | same file #3 | Per-file staleness banner + connect-time catch-up: prepend ⚠️ Some files referenced below were edited since the last index sync… banner. On MCP reconnect, run (size, mtime) + content-hash reconciliation vs working tree. Block first query until catch-up finishes (or 5s timeout, then proceed with stale + banner). |
| CodeGraph | same file #23 | Worktree mismatch detection: detect git worktree nested inside main checkout. Walk-up resolves .codelens/ to parent main checkout (different branch). Warning in codelens status + banner on every MCP read tool response. Suggest codelens init -i in worktree. |

Proposed scope (P3, 4-6 weeks total — all optional)

Phase 1 — Per-file staleness banner (P2, 1 week, quick win)

  • In-memory Dict[str, float] (path → edit_timestamp), thread-safe via threading.Lock
  • Walk indexed file list with os.stat(path) to compare (st_size, st_mtime_ns)
  • Re-compute content-hash only when size/mtime changed
  • Prepend ⚠️ Some files referenced below were edited since the last index sync… banner (with file names + edit age) to MCP responses
  • Surface non-referenced pending files as small footer
  • New file: scripts/sync/pending.py
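
A minimal sketch of how the Phase 1 tracker might look, assuming the index stores a `(st_size, st_mtime_ns, sha256)` fingerprint per path. The `PendingFiles` class, the `Fingerprint` tuple, and the `content_hash` helper are hypothetical names for illustration, not existing CodeLens code. The footer for non-referenced pending files is left out for brevity.

```python
import hashlib
import os
import threading
import time
from typing import Dict, List, Optional, Tuple

# (st_size, st_mtime_ns, sha256 hex) captured at the last index sync.
# This tuple layout is an assumption made for the sketch.
Fingerprint = Tuple[int, int, str]

BANNER = "⚠️ Some files referenced below were edited since the last index sync…"


def content_hash(path: str) -> str:
    """SHA-256 of the file contents, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


class PendingFiles:
    """Thread-safe map of path -> edit timestamp for files newer than the index."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._pending: Dict[str, float] = {}

    def scan(self, indexed: Dict[str, Fingerprint]) -> None:
        for path, (size, mtime_ns, digest) in indexed.items():
            edited_at: Optional[float] = None
            try:
                st = os.stat(path)
                # Hash only when the cheap (size, mtime) comparison fails.
                if (st.st_size, st.st_mtime_ns) != (size, mtime_ns):
                    if content_hash(path) != digest:
                        edited_at = st.st_mtime
            except OSError:
                edited_at = time.time()  # deleted or unreadable: stale as of now
            with self._lock:
                if edited_at is None:
                    self._pending.pop(path, None)
                else:
                    self._pending[path] = edited_at

    def banner(self, referenced: List[str]) -> str:
        """Return the banner for referenced stale files, or '' if none are stale."""
        with self._lock:
            stale = [(p, self._pending[p]) for p in referenced if p in self._pending]
        if not stale:
            return ""
        now = time.time()
        lines = [BANNER]
        for path, edited_at in stale:
            age_s = max(0, int(now - edited_at))
            lines.append(f"  {os.path.basename(path)} (edited {age_s}s ago)")
        return "\n".join(lines)
```

An MCP tool handler would call `scan()` before answering and prepend `banner(paths_in_response)` when it is non-empty.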

Phase 2 — Connect-time catch-up (P2, 1 week, depends on Phase 1)

  • On MCP (re)connect, run (size, mtime) + content-hash (SHA-256) reconciliation vs working tree
  • Absorb edits made while no MCP server was running
  • Block first query until catch-up finishes (or 5s timeout, then proceed with stale + banner)
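
The blocking rule above could be sketched as follows. `wait_for_catch_up` and the `reconcile` callable are hypothetical names; only the 5s timeout comes from the proposal. The function returns `False` when reconciliation timed out or failed, in which case the caller would serve stale results with the banner.

```python
import threading
from typing import Callable

CATCH_UP_TIMEOUT_S = 5.0  # timeout stated in the proposal


def wait_for_catch_up(
    reconcile: Callable[[], None],
    timeout_s: float = CATCH_UP_TIMEOUT_S,
) -> bool:
    """Run reconcile() in the background and wait up to timeout_s for it.

    Returns True only if reconcile() finished without raising in time.
    """
    done = threading.Event()
    state = {"ok": False}

    def _run() -> None:
        try:
            reconcile()
            state["ok"] = True
        except Exception:
            pass  # caller falls back to stale results plus banner
        finally:
            done.set()

    # Daemon thread: a slow reconciliation never blocks process exit.
    threading.Thread(target=_run, name="catch-up", daemon=True).start()
    done.wait(timeout_s)
    return state["ok"]
```

On timeout the background thread keeps running, so a later query can still benefit from the finished catch-up.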

Phase 3 — Native file watcher (P2, 2-3 weeks, depends on daemon in performance issue)

  • Replace Python watchdog with native OS watchers
  • macOS: single recursive fs.watch FSEvents stream
  • Windows: single ReadDirectoryChangesW handle
  • Linux: per-directory inotify
  • Per-directory debounce (default 2000ms, clamp [100ms, 60s])
  • CODELENS_NO_WATCH=1, CODELENS_FORCE_WATCH=1, CODELENS_WATCH_DEBOUNCE_MS env vars
  • Auto-disable on WSL2 /mnt/*, offer git hooks fallback
  • Lock contention retry: MAX_LOCK_RETRIES=5 with exponential backoff
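
The configuration rules in this phase can be sketched without touching any OS watcher API. Several details below are assumptions, not part of the proposal: CODELENS_NO_WATCH wins over CODELENS_FORCE_WATCH when both are set, WSL2 is detected by a `wsl2` substring in the kernel release string (a heuristic that may miss custom kernels), MAX_LOCK_RETRIES counts total attempts, and lock contention surfaces as `BlockingIOError`. All function names are hypothetical.

```python
import time
from typing import Callable, Mapping, TypeVar

T = TypeVar("T")

DEFAULT_DEBOUNCE_MS = 2000
MIN_DEBOUNCE_MS, MAX_DEBOUNCE_MS = 100, 60_000  # clamp [100ms, 60s]
MAX_LOCK_RETRIES = 5


def debounce_ms(env: Mapping[str, str]) -> int:
    """Read CODELENS_WATCH_DEBOUNCE_MS, fall back to the default, then clamp."""
    try:
        value = int(env.get("CODELENS_WATCH_DEBOUNCE_MS", ""))
    except ValueError:
        return DEFAULT_DEBOUNCE_MS
    return max(MIN_DEBOUNCE_MS, min(MAX_DEBOUNCE_MS, value))


def is_wsl2_windows_mount(path: str, kernel_release: str) -> bool:
    """Heuristic (assumed): WSL2 kernel release contains 'WSL2'; drives live under /mnt/."""
    return "wsl2" in kernel_release.lower() and path.startswith("/mnt/")


def should_watch(env: Mapping[str, str], path: str, kernel_release: str) -> bool:
    if env.get("CODELENS_NO_WATCH") == "1":
        return False  # assumed precedence: the off-switch always wins
    if env.get("CODELENS_FORCE_WATCH") == "1":
        return True
    return not is_wsl2_windows_mount(path, kernel_release)


def with_lock_retry(
    action: Callable[[], T],
    base_delay_s: float = 0.05,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call action(), retrying on BlockingIOError with doubling delays."""
    for attempt in range(MAX_LOCK_RETRIES):
        try:
            return action()
        except BlockingIOError:
            if attempt == MAX_LOCK_RETRIES - 1:
                raise  # out of attempts: surface the contention
            sleep(base_delay_s * 2 ** attempt)
    raise AssertionError("unreachable")
```

In practice `kernel_release` would come from `platform.release()` and `env` from `os.environ`; passing them in keeps the functions easy to test.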

Phase 4 — Worktree mismatch detection (P2, 1 week)

  • gitWorktreeRoot(dir) via git rev-parse --show-toplevel
  • detectWorktreeIndexMismatch(project_root) returning {worktree_root, main_checkout_root, index_root}
  • Warning in codelens status output
  • Banner prepend on every MCP read tool response
  • Suggest codelens init -i in worktree to build its own index
  • New file: scripts/sync/worktree.py
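
One possible shape for the two helpers, written in Python naming style. This is a sketch under assumptions: the walk-up looks for a `.codelens/` directory, a mismatch means the index lives strictly above the worktree root, and `main_checkout_root` is taken to be the directory holding that index. The optional `worktree_root` parameter is a hypothetical addition so the logic can be exercised without invoking git.

```python
import os
import subprocess
from typing import Dict, Optional


def git_worktree_root(directory: str) -> Optional[str]:
    """Root of the git working tree containing directory, or None."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "--show-toplevel"],
            cwd=directory, capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None  # git missing, or not inside a repository
    return os.path.realpath(out.stdout.strip())


def find_index_root(start: str) -> Optional[str]:
    """Walk up from start to the nearest directory containing .codelens/."""
    current = os.path.realpath(start)
    while True:
        if os.path.isdir(os.path.join(current, ".codelens")):
            return current
        parent = os.path.dirname(current)
        if parent == current:
            return None  # reached the filesystem root
        current = parent


def detect_worktree_index_mismatch(
    project_root: str, worktree_root: Optional[str] = None
) -> Optional[Dict[str, str]]:
    """Return mismatch details when the index resolves above the worktree."""
    worktree_root = worktree_root or git_worktree_root(project_root)
    index_root = find_index_root(project_root)
    if worktree_root is None or index_root is None:
        return None
    worktree_root = os.path.realpath(worktree_root)
    # Mismatch only if the worktree sits strictly inside the index directory.
    if not worktree_root.startswith(index_root.rstrip(os.sep) + os.sep):
        return None
    return {
        "worktree_root": worktree_root,
        # Assumption: the directory owning .codelens/ is the main checkout.
        "main_checkout_root": index_root,
        "index_root": index_root,
    }
```

A non-`None` result would drive both the `codelens status` warning and the banner on MCP read tool responses.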

Phase 5 — Anonymous telemetry (P3, 2 weeks, opt-in only)

  • 4 invariants: zero hot-path cost, zero stdout, off is off, fail silent
  • Fields: event_type (mcp_tool/cli_command/lifecycle), event_name, language, agent, file_count_bucket (<100/100-1k/1k-10k/10k+), schema_version, machine_id (random UUID), timestamp (UTC date, only completed days sent)
  • NEVER: code, paths, file/symbol names, queries, IP
  • Off-switches: codelens telemetry off, CODELENS_TELEMETRY=0, DO_NOT_TRACK=1
  • Per-day rollup: aggregate count per (event_type, event_name, language, agent) per UTC day
  • Network: urllib.request POST JSON, fire-and-forget via threading.Thread daemon
  • Machine ID: uuid.uuid4() stored in ~/.codelens/machine_id
  • Buffer file: ~/.codelens/telemetry-buffer.json (max 256KB, rotate if exceeded)
  • Ingest endpoint: Cloudflare Worker or simple FastAPI, public code for audit
  • New file: scripts/telemetry/__init__.py
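
The off-switches, bucketing, and per-day rollup could be sketched like this. The function names are hypothetical, the `opted_in` flag stands in for whatever `codelens telemetry on` persists, and the exact bucket boundaries (for example, 1000 files falling into `1k-10k`) are an assumption. Events here carry a `utc_day` string, which is also an assumed field name.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping, Tuple

# (utc_day, event_type, event_name, language, agent)
RollupKey = Tuple[str, str, str, str, str]


def telemetry_enabled(env: Mapping[str, str], opted_in: bool) -> bool:
    """Default OFF; any off-switch overrides a stored opt-in."""
    if env.get("DO_NOT_TRACK") == "1":
        return False
    if env.get("CODELENS_TELEMETRY") == "0":
        return False
    return opted_in


def file_count_bucket(file_count: int) -> str:
    """Coarse size bucket so the exact file count never leaves the machine."""
    if file_count < 100:
        return "<100"
    if file_count < 1_000:
        return "100-1k"
    if file_count < 10_000:
        return "1k-10k"
    return "10k+"


def rollup(events: Iterable[Mapping[str, str]]) -> Dict[RollupKey, int]:
    """Count events per (day, event_type, event_name, language, agent)."""
    counts: Counter = Counter()
    for event in events:
        key = (
            event["utc_day"],
            event["event_type"],
            event["event_name"],
            event["language"],
            event["agent"],
        )
        counts[key] += 1
    return dict(counts)
```

Checking `telemetry_enabled` before any buffering or network code runs is one way to keep the "off is off" invariant easy to audit.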

Acceptance criteria

  • Phase 1: staleness banner appears when a file is edited after the last scan
  • Phase 3: native watcher uses O(1) descriptors on macOS
  • Phase 4: worktree mismatch warning appears in codelens status
  • Phase 5: telemetry is OFF by default, opt-in via codelens telemetry on
  • Phase 5: no telemetry sent when DO_NOT_TRACK=1 is set
  • Phase 5: public audit of telemetry schema + ingest endpoint code

Privacy note

Telemetry MUST be opt-in (default OFF). It must respect DO_NOT_TRACK=1 (a widely used convention among CLI tools). It must NEVER collect code, paths, file names, queries, or IP addresses. Public audit of the ingest endpoint code is required for trust.
