Before an AI writes a new class/id/function, CodeLens must be checked. This is not optional.
CodeLens is an AI-native code intelligence platform that gives AI agents full visibility into a codebase before they write any code. It prevents collision, overwrite of existing logic, security vulnerabilities, and dead code through 57 CLI commands, an MCP server with 55 tools (50 static + 5 dynamic), AST-based taint analysis, live CVE/OSV scanning, a plugin system with OWASP Top 10 + Compliance rule packs, a true graph data model (nodes + edges) for structural code queries, and token-efficient --format compact output for high-volume agent workflows (issue #17).
- 57 CLI Commands — From basic scan/query to AST taint analysis, CVE scanning, plugin management, auto-fix, dashboards, CI/CD quality gates, and
graph-schemafor cheap graph-shape introspection - MCP Server (55 Tools) — Native AI agent integration via Model Context Protocol (JSON-RPC over stdio), 50 statically-defined tools + 5 dynamically discovered, every tool accepts a
formatparameter (json/markdown/ai/sarif/compact) - Token-Efficient Compact Output (v8.2, issue #17) —
--format compactproduces single-char-key JSON with abbreviated types, omitted null fields, and relative paths — ~50% smaller thanjsonon real trace output. Combined with--limit/--offsetpagination, 5 structural queries now cost <5k tokens (down from 30-80k) - AST Taint Engine — Tree-sitter based taint analysis with return-value propagation, scope hierarchy, and branch condition refinement
- Live CVE/OSV Scanning — Real-time vulnerability data from OSV.dev API with SQLite cache, 9 ecosystems (PyPI, npm, crates.io, Go, Maven, NuGet, RubyGems, Pub, Hex)
- Cross-File Call Graph — Workspace-wide call graph with import resolution and bidirectional taint propagation
- Graph Data Model (v8.2) — True node + edge graph (
graph_nodes+graph_edgesSQLite tables) for structural queries: callers, callees, blast radius, circular chains. Populated during scan;traceengine migrated to use it by default with--use-graph/--no-graphflag for A/B testing - Plugin System — 4 plugin types (rule_pack/engine/formatter/command), 3-tier discovery (local → user → built-in), OWASP Top 10 (36 rules) + Compliance (53 rules: PCI-DSS v4.0 + HIPAA)
- VS Code Extension — Diagnostics Provider, Code Actions, Guard hooks, Health status bar
- CI/CD Integration — GitHub Actions workflows, SARIF v2.1.0 output, PR decoration,
checkquality-gate command - Guard Command — Pre/post-write verification designed for AI agent workflows
- Tree-sitter Powered — Accurate AST-based parsing for HTML, CSS, JS, TS/TSX, Rust, Python, Vue, Svelte, Blade
- Regex Fallback Parsers — 28 additional languages supported via regex-based parsers (C, C++, Go, Java, Kotlin, Swift, Ruby, PHP, Scala, Dart, Elixir, Lua, R, Haskell, Nim, Objective-C, GDScript, Shell, Vim, Zig, and more)
- Framework Auto-Detection — React/Next.js, Vue, Svelte, Tailwind CSS, Express, Fastify, Koa, Hono, Django, Flask, FastAPI, Tauri, and more
- Incremental Scanning — Only re-parse changed files for speed, with SQLite persistent registry storage
- Git-Aware Re-Index (v8.2) —
scan --incrementalusesgit diff <last-indexed-sha> --name-onlyto enumerate exactly the files git knows changed (mtime fallback when git unavailable).git-statusreports the HEAD/last-indexed SHA + branch + changed-files count + re-scan recommendation in one call.diff --git-awareshows changed files + symbols + downstream caller impact.watch --git-modepollsgit diff --name-onlyinstead of watchdog file events. All features gracefully degrade when git is unavailable - Workspace Auto-Detect — No need to specify workspace path if you're already in the project
- AI-Optimized Output —
--format ai(normalized schema) and--format compact(token-efficient single-char keys) flags for AI agent consumption - Auto-Fix Engine — Confidence-scored auto-fixes with dry-run-by-default safety
- HTML Dashboard — Generate visual dashboards with historical trend tracking
- Hybrid LSP Engine — Optional LSP-enhanced deep analysis (
--deepflag) when language servers are available - Security Auditing — Detect hardcoded secrets, data flow taint analysis, CVE scanning, ReDoS regex auditing
- Quality Scoring — Code smells, complexity metrics, dead code detection
- CSS Deep Analysis — Unused variables, orphan keyframes, specificity wars, z-index abuse
- Performance Hints — N+1 queries, sync blocking, memory leaks, expensive renders
# Install dependencies (tree-sitter grammars + watchdog)
bash setup.sh
# Initialize workspace (auto-detects frameworks)
python3 scripts/codelens.py init /path/to/workspace
# Scan workspace and build registry
python3 scripts/codelens.py scan /path/to/workspace
# Check if "btn-primary" already exists before creating it
python3 scripts/codelens.py query "btn-primary" /path/to/workspace --domain frontend
# List all dead code
python3 scripts/codelens.py list /path/to/workspace --domain all --filter dead
# Detect frameworks
python3 scripts/codelens.py detect /path/to/workspaceIf you omit the workspace path, CodeLens auto-detects it:
python3 scripts/codelens.py scan # Auto-detect workspace
python3 scripts/codelens.py query "btn-primary" # Auto-detect workspace
python3 scripts/codelens.py smell # Auto-detect workspaceIf no .codelens/ registry exists, any analysis command auto-runs init + scan (capped at 3000 files to prevent timeout):
export CODELENS_AI_MODE=1 # --format ai becomes default
python3 scripts/codelens.py query "myFunction" --lite
# → Auto-init + auto-scan → then query → {found, action}| Command | Description |
|---|---|
init [workspace] |
Initialize .codelens config with auto-detected frameworks |
scan [workspace] [--incremental] [--full] [--max-files N] |
Scan workspace and build registry |
validate [workspace] |
Validate registry vs file system |
detect [workspace] |
Detect frameworks and show recommended config |
watch [workspace] [--git-mode] [--interval SECS] |
Start file watcher (default: watchdog; --git-mode polls git diff --name-only) |
git-status [workspace] |
Show git-aware scan state: HEAD SHA, last-indexed SHA, changed files, re-scan recommendation |
migrate [workspace] |
Migrate JSON registry to SQLite persistent database |
serve |
Start MCP server for AI agent integration (JSON-RPC over stdio) |
lsp-status |
Check which LSP servers are available for --deep analysis |
| Command | Description |
|---|---|
query "name" [workspace] [--domain] [--file] [--fuzzy] |
Pre-write check: does this name already exist? |
impact "name" [workspace] [--action modify|delete] |
Change impact analysis |
refactor-safe "name" [workspace] [--action rename|move] |
Pre-flight rename/move safety check |
guard [workspace] (--pre|--post) --file PATH |
Pre/post-write verification for AI agents |
check [workspace] [--severity ...] [--max-findings N] |
CI/CD quality gate — exits non-zero on failure |
| Command | Description |
|---|---|
search "pattern" [workspace] [--type] [--context] [--limit N] [--offset N] |
Regex search across workspace (paginated, default limit=20) |
symbols "name" [workspace] [--fuzzy] [--limit N] [--offset N] |
Search symbol in registry (paginated) |
trace "name" [workspace] [--direction up|down|both] [--depth N] [--limit N] [--offset N] |
Deep call chain tracing (paginated) |
context "name" [workspace] |
Rich symbol context (definition, callers, callees) |
outline [workspace] [--file] [--all] [--limit N] [--offset N] |
File structure outline (paginated) |
missing-refs [workspace] |
Detect CSS/HTML mismatches |
dependents "file" [workspace] |
Module-level import tracking |
list [workspace] [--domain] [--filter] [--limit N] [--offset N] |
List entries with filter (paginated, default limit=20) |
graph-schema [workspace] |
Return graph shape: node/edge counts, type distribution, indexes (issue #17) |
ask "question" |
Ask a question in natural language (auto-dispatches to relevant commands) |
summary [workspace] [--focus ...] [--detail ...] |
Auto-summary with prioritized findings (anti-overload) |
| Command | Description |
|---|---|
secrets [workspace] [--severity ...] |
Detect hardcoded API keys, passwords, tokens |
vuln-scan [workspace] [--severity ...] [--offline] [--osv-ttl N] [--refresh] [--max-age Nh] |
Scan dependencies for known CVEs (OSV.dev + native audit). --refresh bypasses the OSV cache and forces fresh API calls; --max-age Nh treats cache entries older than N hours as stale for this run only (issue #30). Output includes a cache_info block (last_refresh, age_hours, ttl_hours, is_stale, stale_packages) so agents can decide whether to trust the cached CVE data. |
taint [workspace] |
Run AST-based taint analysis for vulnerability detection |
dataflow [workspace] [--source] [--sink] |
Data flow taint analysis with cross-file call graph |
env-check [workspace] [--var NAME] |
Audit environment variables |
smell [workspace] [--categories ...] [--severity ...] |
Code smell detection with health score |
complexity [workspace] [--name FN] [--threshold N] |
Cyclomatic/cognitive complexity scoring |
dead-code [workspace] [--categories ...] |
Enhanced dead code detection |
debug-leak [workspace] [--category ...] |
Detect leftover debug code |
fix [workspace] [--apply] |
Auto-fix issues with confidence scoring (dry-run by default) |
| Command | Description |
|---|---|
architecture [workspace] [--lite] [--no-cache] |
Single-call codebase overview for AI agents (languages, frameworks, entry points, packages, routes, hotspots, total symbols). --lite omits routes/packages/hotspots for <1k token orientation (issue #19) |
entrypoints [workspace] |
Map execution entry points |
api-map [workspace] |
Map REST/GraphQL/gRPC routes to handlers |
state-map [workspace] |
Track global state management |
diff [workspace] |
Compare registry snapshots |
circular [workspace] |
Detect circular dependencies |
graph-schema [workspace] |
Cheap graph-shape introspection: node/edge counts, type distribution, indexes (issue #17) |
resolve-types [workspace] |
Manually trigger hybrid type resolution (import-aware CALLS edge refinement, issue #13) |
handbook [workspace] |
Generate project handbook for AI agents |
dashboard [workspace] |
Generate HTML visualization dashboard |
history [workspace] |
Show historical trend data and charts |
| Command | Description |
|---|---|
side-effect [workspace] [--name FN] |
Pure vs impure function analysis |
stack-trace "name" [workspace] |
Error propagation simulation |
test-map [workspace] |
Test coverage mapping |
config-drift [workspace] |
Dependency drift detection |
type-infer [workspace] |
Lightweight type inference |
ownership [workspace] |
Git blame code ownership |
regex-audit [workspace] |
ReDoS-vulnerable regex auditing |
a11y [workspace] |
Accessibility auditing (WCAG 2.1) |
perf-hint [workspace] [--severity ...] [--category ...] |
Performance anti-pattern detection |
css-deep [workspace] |
Deep CSS analysis (vars, keyframes, specificity) |
| Command | Description |
|---|---|
analyze [workspace] [--focus ...] [--timeout SECS] |
Full repo analysis: init + scan + all engines in one command |
binary-scan [workspace] |
Scan for binary/compiled artifacts with Tauri/Electron RE analysis |
artifact-scan [workspace] [--deep] |
Scan for compiled/built artifacts (reverse engineering mode) |
benchmark [workspace] |
Run accuracy and performance benchmarks |
plugin <subcommand> |
Manage plugins: install, list, search, update, info, validate |
| Query Result | Action |
|---|---|
found: false |
SAFE — create new |
found: true + status: active |
EXTEND — don't overwrite |
found: true + status: dead |
ASK user — reuse or delete? |
found: true + status: duplicate_ref |
LIST all referrers first |
found: true + status: collision |
STOP — active bug, fix first |
| Risk Level | Action |
|---|---|
critical |
DO NOT change. Report to user. |
high |
Warning. List all affected first. |
medium |
Caution. Run tests. |
low |
Safe, proceed. |
Tree-sitter parsed (AST-level): HTML, CSS, SCSS, JavaScript, TypeScript, TSX/JSX, Rust, Python, Vue SFC, Svelte, Blade
Regex fallback parsed (28+ languages): C, C++, C#, Go, Java, Kotlin, Swift, Ruby, PHP, Scala, Dart, Elixir, Lua, R, Haskell, Nim, Objective-C, GDScript, Shell/Bash, Vim, Zig, and more
Frameworks: React/Next.js, Vue/Nuxt, Svelte/SvelteKit, Tailwind CSS, Express, Fastify, Koa, Hono, Django, Flask, FastAPI, Tauri, Drupal, pytest, poetry, setuptools, tox, sphinx, nox, hatch
Package Managers: npm, yarn, pnpm, bun, cargo, pip, pipenv, poetry, go modules
codelens/
├── SKILL.md # Full documentation for AI agents
├── SKILL-QUICK.md # Quick reference (concise)
├── README.md # This file
├── CHANGELOG.md # Version history (top-level)
├── CONTRIBUTING.md # Contribution guidelines
├── SECURITY.md # Security policy
├── CODE_OF_CONDUCT.md # Code of Conduct
├── LICENSE.txt # MIT License
├── setup.sh # Dependency installer
├── pyproject.toml # Python package metadata
├── skill.json # Skill metadata
├── mcp_config.json # MCP server config templates (Claude, Cursor, VS Code, Continue, Cline)
├── pytest.ini # Pytest configuration
├── references/ # Detailed reference docs
│ ├── parser-rules.md # Parsing rules per language
│ ├── query-examples.md # Query usage examples
│ ├── status-codes.md # Status & flag reference
│ ├── changelog.md # Older changelog (per-version highlights)
│ └── agent-integration.md # AI agent integration guide
├── scripts/
│ ├── codelens.py # CLI entry point (56 commands registered)
│ ├── mcp_server.py # MCP JSON-RPC server (54 tools)
│ ├── registry.py # Registry read/write/build
│ ├── persistent_registry.py # SQLite persistent storage (WAL mode)
│ ├── base_parser.py # Base tree-sitter parser
│ ├── base_engine.py # Base analysis engine
│ ├── grammar_loader.py # Lazy tree-sitter grammar loader
│ ├── framework_detect.py # Framework auto-detection
│ ├── incremental.py # Incremental scan support
│ ├── edge_resolver.py # Cross-file edge resolution
│ ├── graph_model.py # Graph data model (nodes + edges) — issue #8
│ ├── git_aware.py # Git-diff aware incremental re-index — issue #14
│ ├── search_engine.py # Regex code search
│ ├── trace_engine.py # Call chain tracing
│ ├── impact_engine.py # Change impact analysis
│ ├── outline_engine.py # File structure outline
│ ├── missing_refs.py # CSS/HTML mismatch detection
│ ├── diff_engine.py # Registry diff/snapshots
│ ├── circular_engine.py # Circular dependency detection
│ ├── context_engine.py # Rich symbol context
│ ├── dependents_engine.py # Module import tracking
│ ├── validate_engine.py # Registry validation
│ ├── dataflow_engine.py # Data flow taint analysis
│ ├── ast_taint_engine.py # AST-based taint analysis (tree-sitter)
│ ├── crossfile_taint_engine.py # Cross-file taint propagation
│ ├── callgraph_engine.py # Workspace-wide call graph
│ ├── smell_engine.py # Code smell detection
│ ├── sideeffect_engine.py # Side-effect analysis
│ ├── refactor_safe_engine.py # Refactoring safety check
│ ├── deadcode_engine.py # Enhanced dead code detection
│ ├── stacktrace_engine.py # Error propagation simulation
│ ├── testmap_engine.py # Test coverage mapping
│ ├── configdrift_engine.py # Dependency drift detection
│ ├── typeinfer_engine.py # Lightweight type inference
│ ├── ownership_engine.py # Git blame ownership
│ ├── secrets_engine.py # Hardcoded secret detection
│ ├── entrypoints_engine.py # Entry point mapping
│ ├── apimap_engine.py # API route mapping
│ ├── statemap_engine.py # State management tracking
│ ├── envcheck_engine.py # Environment variable audit
│ ├── debugleak_engine.py # Debug code leak detection
│ ├── complexity_engine.py # Complexity scoring
│ ├── regexaudit_engine.py # Regex auditing (ReDoS)
│ ├── a11y_engine.py # Accessibility auditing (WCAG 2.1)
│ ├── vulnscan_engine.py # Vulnerability scanning
│ ├── osv_client.py # OSV.dev API client (9 ecosystems)
│ ├── perfhint_engine.py # Performance hints
│ ├── cssdeep_engine.py # Deep CSS analysis
│ ├── autofix_engine.py # Auto-fix with confidence scoring
│ ├── dashboard_engine.py # HTML dashboard generation
│ ├── history_engine.py # Historical trend tracking
│ ├── semantic_engine.py # Semantic rules engine
│ ├── hybrid_engine.py # LSP-enhanced hybrid analysis
│ ├── lsp_client.py # LSP client wrapper
│ ├── convention_engine.py # Naming convention checking
│ ├── plugin_system.py # Plugin system & marketplace
│ ├── pre_commit_hook.py # Git pre-commit hook integration
│ ├── utils.py # Shared utilities (version, helpers)
│ ├── commands/ # One file per CLI command (auto-registered, 57 commands incl. graph-schema)
│ ├── formatters/ # Output formatters (markdown, sarif, compact)
│ ├── parsers/ # Tree-sitter + fallback parsers
│ │ ├── html_parser.py, css_parser.py, js_frontend_parser.py, js_backend_parser.py
│ │ ├── rust_parser.py, python_parser.py, tsx_parser.py, ts_backend_parser.py
│ │ ├── vue_parser.py, svelte_parser.py, tailwind_detector.py, blade_parser.py
│ │ └── fallback_*.py # 28 regex-based fallback parsers (C, C++, Go, Java, ...)
│ ├── rules/ # Built-in YAML rule packs
│ │ ├── javascript_security.yaml
│ │ └── python_security.yaml
│ └── plugins/ # Built-in plugin packs
│ ├── owasp_top10/rules/owasp_top10.yaml (36 rules, A01-A10)
│ └── compliance/rules/{hipaa.yaml, pci_dss.yaml} (53 rules)
├── benchmarks/ # Benchmark suite & fixtures (clean_app + vulnerable_app)
├── tests/ # Pytest test suite
└── vscode-codelens/ # VS Code extension source
- Python 3.8+
- tree-sitter + language grammars (auto-installed by
setup.sh) - watchdog (optional, for file watching)
- git (optional, for ownership analysis)
- Language server (optional, for
--deepLSP-enhanced analysis)
# Clone the repository
git clone https://github.com/Wolfvin/CodeLens.git
cd CodeLens
# Run setup
bash setup.sh
# Verify
python3 scripts/codelens.py --helpCodeLens is designed to be used by AI coding agents. The full integration guide is in references/agent-integration.md.
Key principle: Before an AI writes any new class, id, or function, it MUST query CodeLens first to check for collisions, overwrites, and dead code.
CodeLens ships with a native MCP server (55 tools) for direct AI agent integration:
# Start MCP server (JSON-RPC over stdio)
python3 scripts/codelens.py serveEvery MCP tool accepts a format parameter with the enum [json, markdown, ai, sarif, compact].
For high-volume agent workflows, pass format: "compact" to cut token usage ~50%. Example:
// tools/call with format=compact
{"name": "codelens_graph_schema", "arguments": {"workspace": "/path/to/proj", "format": "compact"}}
// → {"s":"ok","n":1234,"e":5678,"nts":{"function":1000,"class":234},"ets":{"CALLS":5678},"ix":6}The new codelens_graph_schema tool (issue #17) returns the graph shape in one cheap call —
use it first to decide whether structural queries (callers/callees/blast-radius) will return
meaningful results before paying tokens for them.
See mcp_config.json for Claude Desktop, Cursor, VS Code Copilot, Continue.dev, and Cline configuration templates.
# Pre-write safety check
python3 scripts/codelens.py guard /path/to/workspace --pre --file src/new_module.py
# Post-write verification
python3 scripts/codelens.py guard /path/to/workspace --post --file src/new_module.py# Quality gate — exits non-zero on failure (use in CI/CD pipelines)
python3 scripts/codelens.py check /path/to/workspace --severity high --max-findings 50
# SARIF output for GitHub Advanced Security / VS Code
python3 scripts/codelens.py check /path/to/workspace --format sarif > codelens.sarif# List installed plugins
python3 scripts/codelens.py plugin list
# Built-in plugins (already shipped):
# - owasp_top10 (36 OWASP Top 10 rules, A01-A10)
# - compliance (53 rules: PCI-DSS v4.0 + HIPAA Security Rule)
# Search registry (future marketplace)
python3 scripts/codelens.py plugin search "sql injection"CodeLens excels in AI-native code intelligence — a niche where MCP integration, guard hooks, and AI-optimized output matter most. Here is an honest assessment vs established tools:
| Dimension | CodeLens | SonarQube | CodeQL | Semgrep |
|---|---|---|---|---|
| AI Agent Integration | 8 | 4 | 3 | 5 |
| Frontend Breadth | 8 | 6 | 3 | 5 |
| MCP / AI-Native Design | 9 | 2 | 2 | 3 |
| Taint Analysis Depth | 5 | 7 | 10 | 7 |
| CI/CD & SARIF | 5 | 10 | 7 | 8 |
| Plugin/Rule Ecosystem | 2 | 10 | 5 | 8 |
| IDE Integration | 4 | 9 | 8 | 9 |
| Community & Maturity | 1 | 10 | 8 | 7 |
| Live CVE Scanning | 7 | 9 | 3 | 8 |
| Cross-File Analysis | 6 | 8 | 10 | 7 |
Our genuine strengths: AI-native design, frontend analysis breadth, MCP integration, guard for AI workflows.
Where we lag: Community ecosystem, IDE marketplace presence, deep abstract interpretation (CodeQL's domain), enterprise CI/CD integrations.
Our goal: Be the best code intelligence tool for AI agent workflows, not a SonarQube replacement.
See CONTRIBUTING.md for guidelines.
See SECURITY.md for reporting vulnerabilities.
MIT License — see LICENSE.txt
See CHANGELOG.md for version history (top-level) and references/changelog.md for older per-version highlights.