[FEATURE] ast-grep optional accelerator — bundle binary for ~3x rule matching speedup #68

Description

@Wolfvin

Summary

Bundle the ast-grep binary (auto-provisioned per platform, SHA-256 verified) as an optional accelerator for rule pattern matching. If ast-grep is available, route certain rule patterns to it for a ~3x speedup. If it is unavailable, fall back to the native Semgrep-YAML matcher (from the rule pattern engine issue).

Worker source

Worker Source Contribution
UBS update!/CodeLens_UBS_Upgrade_Analysis.md #14 Bundle ast-grep binary (auto-provisioning per-platform, SHA-256 verified, cached at ~/.codelens/ast-grep/<version>/<platform>/, graceful fallback to tree-sitter ad-hoc if unavailable). Support ast-grep rule YAML format in plugin rule_pack. Implement ancestor-aware matching via stopBy: end directive. 50+ ast-grep rules ported from UBS builtin pack. Decision point: Option B (integrate ast-grep) recommended over Option A (build pattern matcher from scratch — 2-3 sprints vs 1 sprint).

Proposed scope (P3, 1 sprint / 2-3 weeks)

Phase 1 — Binary auto-provisioning (P3, 1 week)

  • Download ast-grep binary from GitHub releases per platform (linux-x64, darwin-x64, darwin-arm64, win32-x64)
  • SHA-256 verify
  • Cache at ~/.codelens/ast-grep/<version>/<platform>/
  • Graceful fallback if download fails or platform unsupported
  • New file: scripts/astgrep_runner.py
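
The provisioning steps above could be sketched roughly as follows. This is only an illustration of platform detection, the cache layout, and the SHA-256 check; the function names, the `_PLATFORMS` table, and the way the expected digest is supplied are assumptions, not the planned contents of scripts/astgrep_runner.py. The download step itself is left out.

```python
import hashlib
import platform
from pathlib import Path
from typing import Optional

# Hypothetical mapping from (system, machine) to the four platform labels
# listed in this issue. Anything else is treated as unsupported.
_PLATFORMS = {
    ("Linux", "x86_64"): "linux-x64",
    ("Darwin", "x86_64"): "darwin-x64",
    ("Darwin", "arm64"): "darwin-arm64",
    ("Windows", "AMD64"): "win32-x64",
}


def detect_platform(system: Optional[str] = None,
                    machine: Optional[str] = None) -> Optional[str]:
    """Return the platform label, or None when the platform is unsupported."""
    system = system or platform.system()
    machine = machine or platform.machine()
    return _PLATFORMS.get((system, machine))


def cache_dir(version: str, plat: str, home: Optional[Path] = None) -> Path:
    """Build ~/.codelens/ast-grep/<version>/<platform>/ (home is overridable)."""
    home = home or Path.home()
    return home / ".codelens" / "ast-grep" / version / plat


def sha256_of(path: Path) -> str:
    """Hash the file in chunks so large binaries are not loaded at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verified_binary(path: Path, expected_sha256: str) -> Optional[Path]:
    """Return the path only if the file exists and its digest matches.

    Returning None signals the caller to fall back to the native matcher.
    """
    if not path.is_file():
        return None
    if sha256_of(path) != expected_sha256.lower():
        return None
    return path
```

Returning `None` instead of raising keeps the fallback path simple: the caller treats a missing platform, a missing file, and a digest mismatch the same way.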

Phase 2 — Rule format bridge (P3, 1 week, depends on rule pattern engine issue)

  • Support ast-grep rule YAML format in plugin rule_pack:
    • pattern, any, all, not, inside, has, precedes, follows, kind, regex
    • Metavariables: $X, $$$ARGS, $NAME
    • constraints, utils
  • Implement ancestor-aware matching via stopBy: end directive (traverse entire ancestor tree, not just immediate parent)
  • Converter: translate CodeLens Semgrep-YAML rules → ast-grep rules where possible
  • --rules=DIR flag to scan/taint/secrets
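
As a rough illustration of the "where possible" converter, the sketch below translates a small subset of Semgrep-YAML rules (already parsed into dicts) into ast-grep rule dicts and returns `None` for anything it does not understand, so those rules stay on the native matcher. The function names and the exact key mapping are assumptions made for this example. The two engines do not have identical semantics, so a real converter would need parity tests for every rule it claims to support.

```python
from typing import Any, Dict, Optional

# Semgrep severities mapped to ast-grep's lowercase severity names.
_SEVERITY = {"INFO": "info", "WARNING": "warning", "ERROR": "error"}


def _convert_pattern_text(text: str) -> str:
    # Assumption: Semgrep's `...` ellipsis is approximated by ast-grep's
    # `$$$` multi-node wildcard. This is not an exact equivalence.
    return text.replace("...", "$$$")


def _convert_clause(clause: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Convert one Semgrep clause, or return None if it is unsupported."""
    if len(clause) != 1:
        return None
    key, value = next(iter(clause.items()))
    if key == "pattern":
        return {"pattern": _convert_pattern_text(value)}
    if key == "pattern-not":
        return {"not": {"pattern": _convert_pattern_text(value)}}
    if key == "pattern-inside":
        # stopBy: end makes ast-grep search the whole ancestor chain,
        # not just the immediate parent.
        return {"inside": {"pattern": _convert_pattern_text(value),
                           "stopBy": "end"}}
    if key in ("patterns", "pattern-either"):
        children = [_convert_clause(child) for child in value]
        if any(child is None for child in children):
            return None
        return {"all" if key == "patterns" else "any": children}
    return None  # e.g. pattern-regex, metavariable-* stay on the native matcher


def semgrep_to_astgrep(rule: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Translate a single-language Semgrep rule, or return None."""
    languages = rule.get("languages", [])
    if len(languages) != 1:
        return None
    body_keys = [k for k in rule if k.startswith("pattern")]
    if len(body_keys) != 1:
        return None
    converted = _convert_clause({body_keys[0]: rule[body_keys[0]]})
    if converted is None:
        return None
    return {
        "id": rule["id"],
        "language": languages[0],  # assumes the language names line up
        "severity": _SEVERITY.get(rule.get("severity", "WARNING"), "warning"),
        "message": rule.get("message", ""),
        "rule": converted,
    }
```

Under these assumptions, a rule whose `patterns` list contains `eval(...)` and a `pattern-inside` of `def $F(...): ...` becomes an `all` rule with `eval($$$)` and an `inside` clause carrying `stopBy: end`.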

Phase 3 — Rule pack porting (P3, 1 week, optional)

  • Port 50+ ast-grep rules from UBS builtin pack (MIT license compatible)
  • Document at references/ast-grep-rule-syntax.md with 30+ examples

Acceptance criteria

  • ast-grep binary auto-provisions on first run on all 4 platforms
  • SHA-256 verification prevents tampered binary execution
  • Rule pattern engine (Semgrep-YAML) rules can be routed to ast-grep for ~3x speedup
  • Graceful fallback to native matcher if ast-grep unavailable
  • 50+ ported rules pass parity tests vs native matcher
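
The routing and fallback criteria above might look something like the sketch below. The `run_rule` name and the `native_matcher` callable are hypothetical stand-ins for whatever the rule pattern engine exposes. The `scan --rule ... --json=compact` invocation is an assumption about the ast-grep CLI and should be checked against the pinned version before relying on it.

```python
import json
import subprocess
from pathlib import Path
from typing import Callable, List, Optional

Matcher = Callable[[Path, Path], List[dict]]


def run_rule(rule_file: Path, target: Path, binary: Optional[Path],
             native_matcher: Matcher, timeout: float = 60.0) -> List[dict]:
    """Try ast-grep first, then fall back to the native matcher.

    `binary` is the verified ast-grep path, or None when provisioning
    failed or the platform is unsupported.
    """
    if binary is not None and binary.is_file():
        try:
            proc = subprocess.run(
                # Assumed CLI shape; verify against the bundled version.
                [str(binary), "scan", "--rule", str(rule_file),
                 "--json=compact", str(target)],
                capture_output=True, text=True, timeout=timeout, check=False,
            )
            # The exit code is not checked here because a scan may exit
            # non-zero when findings exist; unparseable output triggers
            # the fallback instead.
            return json.loads(proc.stdout)
        except (OSError, subprocess.TimeoutExpired, json.JSONDecodeError):
            pass  # any failure degrades to the native matcher
    return native_matcher(rule_file, target)
```

Passing the native matcher in as a callable keeps this function testable without an ast-grep binary, which is also the shape a parity test would need: run both engines on the same rule and target, then compare findings.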

Decision rationale

UBS recommends Option B (integrate ast-grep, estimated at 1 sprint) over Option A (build a pattern matcher from scratch, estimated at 2-3 sprints). CodeLens already tracks Option A as the rule pattern engine issue (Semgrep-YAML, Layer 2 of the #43 hybrid). This issue is complementary: ast-grep accelerates certain patterns, and the native matcher handles the rest.

Relationship to other issues

License note

ast-grep is MIT-licensed — binary can be redistributed. UBS rule pack is MIT — rules can be ported directly.
